This tutorial demonstrates fine-tuning Google's Gemma 2 model on a patient-doctor conversation dataset and deploying it for offline use. We'll cover model preparation, fine-tuning with LoRA, model merging, quantization, and local deployment with the Jan application.
Understanding Gemma 2
Gemma 2, Google's latest open-source large language model (LLM), offers 9B and 27B parameter versions under a permissive license. Its improved architecture delivers faster inference across a range of hardware, and it integrates seamlessly with Hugging Face Transformers, JAX, PyTorch, and TensorFlow. Enhanced safety features and tools for ethical AI deployment are also included.
Accessing and Running Gemma 2
This section walks through downloading the model and running inference with 4-bit quantization, which is necessary to fit the model in memory on consumer hardware.
- Install packages: Install bitsandbytes, transformers, and accelerate.
- Hugging Face Authentication: Use a Hugging Face token (obtained from your Hugging Face account) to authenticate.
- Load Model and Tokenizer: Load the google/gemma-2-9b-it model using 4-bit quantization and appropriate device mapping.
- Inference: Create a prompt, tokenize it, generate a response, and decode it (see the sketch after this list).
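The listing below is a minimal sketch of the load-and-generate steps, assuming you have already installed the packages and authenticated with Hugging Face; the example prompt, compute dtype, and generation settings are illustrative choices, not values fixed by the tutorial.

```python
# pip install bitsandbytes transformers accelerate
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-2-9b-it"

# 4-bit NF4 quantization keeps the 9B model within consumer GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s) automatically
)

# Example prompt; any instruction-style question works here.
prompt = "What are common causes of persistent headaches?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```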
Fine-tuning Gemma 2 with LoRA
This section guides you through fine-tuning Gemma 2 on a healthcare dataset using LoRA (Low-Rank Adaptation) for efficient training.
- Setup: Install the required packages (transformers, datasets, accelerate, peft, trl, bitsandbytes, wandb). Authenticate with Hugging Face and Weights & Biases.
- Model and Tokenizer Loading: Load Gemma 2 (9B-It) with 4-bit quantization, adjusting the data type and attention implementation based on your GPU capabilities. Configure the LoRA parameters.
- Dataset Loading: Load and preprocess the lavita/ChatDoctor-HealthCareMagic-100k dataset, converting it to a chat format suitable for the model.
- Training: Set the training arguments (adjust hyperparameters as needed) and train the model with the SFTTrainer, monitoring progress in Weights & Biases (a condensed sketch follows this list).
- Evaluation: Finish the Weights & Biases run to generate an evaluation report.
- Saving the Model: Save the fine-tuned LoRA adapter locally and push it to the Hugging Face Hub.
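A condensed sketch of the LoRA and training setup is shown below. It assumes the quantized model and tokenizer from the previous section are already loaded, uses an older trl API in which SFTTrainer accepts dataset_text_field and max_seq_length directly (newer releases move these into SFTConfig), and the dataset column names, hyperparameters, and repository names are assumptions to adjust for your own run.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# LoRA trains small low-rank update matrices instead of all 9B weights.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Small slice of the dataset for a quick experiment; use the full split for real training.
dataset = load_dataset("lavita/ChatDoctor-HealthCareMagic-100k", split="train[:1000]")

def to_chat_text(row):
    # Wrap each patient question and doctor answer in Gemma's chat template
    # (column names assumed from the dataset card; verify before running).
    messages = [
        {"role": "user", "content": row["input"]},
        {"role": "assistant", "content": row["output"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_chat_text)

training_args = TrainingArguments(
    output_dir="gemma-2-9b-chatdoctor",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=10,
    report_to="wandb",  # stream metrics to Weights & Biases
)

trainer = SFTTrainer(
    model=model,                  # the 4-bit base model loaded earlier
    tokenizer=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,      # SFTTrainer wraps the model with the LoRA adapter
    dataset_text_field="text",
    max_seq_length=512,
    args=training_args,
)
trainer.train()

# Save the adapter locally and push it to the Hub (repo name is hypothetical).
trainer.model.save_pretrained("gemma-2-9b-chatdoctor-adapter")
trainer.model.push_to_hub("your-username/gemma-2-9b-chatdoctor-adapter")
```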
Merging the Adapter and Base Model
This step merges the fine-tuned LoRA adapter into the base Gemma 2 model to produce a single, deployable model. The merge is performed on a CPU to stay within memory constraints.
- Setup: Create a new CPU-based notebook, install the necessary packages, and authenticate with Hugging Face.
- Load and Merge: Load the base model and the saved adapter, then merge them using PeftModel.merge_and_unload() (see the sketch after this list).
- Save and Push: Save the merged model and tokenizer locally and push them to the Hugging Face Hub.
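A sketch of the CPU merge might look like the following; the adapter and output repository names are hypothetical placeholders that should match what you pushed in the previous section.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-2-9b-it"
adapter_id = "your-username/gemma-2-9b-chatdoctor-adapter"  # hypothetical Hub repo
merged_dir = "gemma-2-9b-chatdoctor-merged"

# Load the base model in half precision on the CPU; merging cannot be done on 4-bit weights.
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach the LoRA adapter, then fold its weights into the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
merged_model = model.merge_and_unload()

# Save locally, then push the merged model and tokenizer to the Hub.
merged_model.save_pretrained(merged_dir)
tokenizer.save_pretrained(merged_dir)
merged_model.push_to_hub("your-username/" + merged_dir)
tokenizer.push_to_hub("your-username/" + merged_dir)
```

The merged repository produced here is what you point the GGUF conversion Space at in the next step.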
Quantizing with Hugging Face Space
Use the GGUF My Repo Hugging Face Space to easily convert and quantize the model to the GGUF format for optimal local deployment.
Using the Fine-tuned Model Locally with Jan
- Download and install the Jan application.
- Download the quantized model from the Hugging Face Hub.
- Load the model in Jan, adjust parameters (stop sequences, penalties, max tokens, instructions), and interact with the fine-tuned model.
Conclusion
This tutorial provides a comprehensive guide to fine-tuning and deploying Gemma 2. Remember to adjust hyperparameters and settings based on your hardware and dataset. Consider exploring Keras 3 for potentially faster training and inference.