Fine-Tuning Gemma 2 and Using it Locally

This tutorial demonstrates fine-tuning Google's Gemma 2 model on a patient-doctor conversation dataset and deploying it for offline use. We'll cover model preparation, fine-tuning with LoRA, model merging, quantization, and local deployment with the Jan application.

Understanding Gemma 2

Gemma 2, Google's latest open large language model (LLM), is available in 9B and 27B parameter versions under a commercially friendly license. Its improved architecture delivers faster inference on a wide range of hardware and integrates with Hugging Face Transformers, JAX, PyTorch, and TensorFlow. Enhanced safety features and tooling for ethical AI deployment are also included.

Accessing and Running Gemma 2

This section details downloading the model and running inference with 4-bit quantization, which is necessary for memory efficiency on consumer hardware; a minimal code sketch follows the list.

  1. Install packages: bitsandbytes, transformers, and accelerate.

  2. Hugging Face Authentication: Use a Hugging Face token (obtained from your Hugging Face account) to authenticate.

  3. Load Model and Tokenizer: Load the google/gemma-2-9b-it model using 4-bit quantization and appropriate device mapping.

  4. Inference: Create a prompt, tokenize it, generate a response, and decode it.
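
The snippet below is a minimal sketch of these four steps. It assumes a CUDA GPU, that you have accepted the Gemma license on Hugging Face, and that your access token is stored in the HF_TOKEN environment variable; the example prompt is illustrative.

```python
# pip install -U bitsandbytes transformers accelerate

import os

import torch
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Authenticate with Hugging Face (token assumed to be in the HF_TOKEN env variable).
login(token=os.environ["HF_TOKEN"])

model_id = "google/gemma-2-9b-it"

# 4-bit quantization keeps the 9B model within consumer-GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Build a chat-formatted prompt, generate a response, and decode only the new tokens.
messages = [{"role": "user", "content": "I have had a dry cough for two weeks. What should I do?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```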

Fine-tuning Gemma 2 with LoRA

This section guides you through fine-tuning Gemma 2 on a healthcare dataset using LoRA (Low-Rank Adaptation) for efficient training; a condensed end-to-end sketch follows the list.

  1. Setup: Install required packages (transformers, datasets, accelerate, peft, trl, bitsandbytes, wandb). Authenticate with Hugging Face and Weights & Biases.

  2. Model and Tokenizer Loading: Load Gemma 2 (9B-It) with 4-bit quantization, adjusting data type and attention implementation based on your GPU capabilities. Configure LoRA parameters.

  3. Dataset Loading: Load and preprocess the lavita/ChatDoctor-HealthCareMagic-100k dataset, creating a chat format suitable for the model.

  4. Training: Set training arguments (adjust hyperparameters as needed) and train the model using the SFTTrainer. Monitor training progress with Weights & Biases.

  5. Evaluation: Finish the Weights & Biases run to generate an evaluation report.

  6. Saving the Model: Save the fine-tuned LoRA adapter locally and push it to the Hugging Face Hub.
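
The sketch below condenses the steps above into one script. The LoRA ranks, training hyperparameters, dataset column names ("input" and "output"), and output directory are illustrative assumptions — check the dataset card and adjust to your hardware — and the trl API shown targets a recent release where SFTConfig is available.

```python
# pip install -U transformers datasets accelerate peft trl bitsandbytes wandb

import torch
import wandb
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

model_id = "google/gemma-2-9b-it"

# Load the base model in 4-bit, as in the inference example above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# LoRA configuration; rank, alpha, and target modules are illustrative defaults.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Load a slice of the healthcare dataset and render each row with the chat template.
# Column names are assumed; verify them on the dataset card.
dataset = load_dataset("lavita/ChatDoctor-HealthCareMagic-100k", split="train[:3000]")

def to_chat_text(row):
    messages = [
        {"role": "user", "content": row["input"]},
        {"role": "assistant", "content": row["output"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_chat_text)

# Training arguments: small, demonstration-scale settings.
training_args = SFTConfig(
    output_dir="gemma-2-9b-it-chatdoctor",
    dataset_text_field="text",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    learning_rate=2e-4,
    logging_steps=10,
    report_to="wandb",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()

# Close the Weights & Biases run, then save the LoRA adapter locally.
wandb.finish()
trainer.save_model("gemma-2-9b-it-chatdoctor")
# trainer.push_to_hub() can then publish the adapter to the Hugging Face Hub.
```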

Merging the Adapter and Base Model

This step merges the fine-tuned LoRA adapter with the base Gemma 2 model to produce a single, deployable model. The merge is performed on a CPU to avoid GPU memory constraints; a short sketch follows the list.

  1. Setup: Create a new notebook (CPU-based), install necessary packages, and authenticate with Hugging Face.

  2. Load and Merge: Load the base model and the saved adapter, then merge them using PeftModel.merge_and_unload().

  3. Save and Push: Save the merged model and tokenizer locally and push them to the Hugging Face Hub.
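
A minimal sketch of the merge step is shown below. The adapter repository name and output directory are illustrative placeholders; substitute the names you used when pushing the adapter.

```python
# pip install -U transformers peft accelerate

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "google/gemma-2-9b-it"
adapter_id = "your-username/gemma-2-9b-it-chatdoctor"  # illustrative adapter repo name

# Load the full-precision base model on CPU to stay within memory limits.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="cpu",
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Attach the LoRA adapter and fold its weights into the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
merged_model = model.merge_and_unload()

# Save the standalone merged model and push it to the Hub.
merged_model.save_pretrained("gemma-2-9b-it-chatdoctor-merged")
tokenizer.save_pretrained("gemma-2-9b-it-chatdoctor-merged")
merged_model.push_to_hub("gemma-2-9b-it-chatdoctor-merged")
tokenizer.push_to_hub("gemma-2-9b-it-chatdoctor-merged")
```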

Quantizing with Hugging Face Space

Use the GGUF My Repo Hugging Face Space to convert the merged model to the GGUF format and quantize it for efficient local deployment.

Using the Fine-tuned Model Locally with Jan

  1. Download and install the Jan application.

  2. Download the quantized GGUF model file from the Hugging Face Hub (see the sketch after this list).


  3. Load the model in Jan, adjust parameters (stop sequences, penalties, max tokens, instructions), and interact with the fine-tuned model.
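
For step 2, the GGUF file can also be fetched programmatically with huggingface_hub; the repository and file names below are illustrative placeholders for whatever the GGUF My Repo Space produced.

```python
from huggingface_hub import hf_hub_download

# Repo and filename are illustrative; use the names created by the GGUF My Repo Space.
gguf_path = hf_hub_download(
    repo_id="your-username/gemma-2-9b-it-chatdoctor-merged-Q4_K_M-GGUF",
    filename="gemma-2-9b-it-chatdoctor-merged-q4_k_m.gguf",
    local_dir="jan-models",
)
print(f"GGUF model saved to: {gguf_path}")
```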

Conclusion

This tutorial provides a comprehensive guide to fine-tuning and deploying Gemma 2. Remember to adjust hyperparameters and settings based on your hardware and dataset. Consider exploring Keras 3 for potentially faster training and inference.
