
Mistral 7B Tutorial: A Step-by-Step Guide to Using and Fine-Tuning Mistral 7B


This tutorial provides a comprehensive guide to using and fine-tuning the Mistral 7B language model for natural language processing tasks. You'll learn to leverage Kaggle for model access, perform inference, apply quantization techniques, fine-tune the model, merge adapters, and deploy to the Hugging Face Hub.

Accessing Mistral 7B

Mistral 7B is accessible via various platforms including Hugging Face, Vertex AI, Replicate, SageMaker JumpStart, and Baseten. This tutorial focuses on utilizing Kaggle's "Models" feature for streamlined access, eliminating the need for manual downloads.

This section demonstrates loading the model from Kaggle and performing inference. Updating the key libraries first helps prevent version-related errors:

<code>!pip install -q -U transformers
!pip install -q -U accelerate
!pip install -q -U bitsandbytes</code>

4-bit quantization with the NF4 configuration via bitsandbytes speeds up model loading and reduces memory usage:

<code>from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)</code>

Adding the Mistral 7B model to your Kaggle notebook involves these steps:

  1. Click "Add Models" in the right panel.
  2. Search for "Mistral 7B", select "7b-v0.1-hf", and add it.
  3. Note the directory path.


Model and tokenizer loading uses the transformers library:

<code>model_name = "/kaggle/input/mistral/pytorch/7b-v0.1-hf/1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,  # 4-bit NF4 config defined above
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)</code>

Inference is simplified using the pipeline function:

<code>pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)</code>

Prompting the model and setting parameters:

<code>prompt = "As a data scientist, can you explain the concept of regularization in machine learning?"

sequences = pipe(
    prompt,
    do_sample=True,
    max_new_tokens=100, 
    temperature=0.7, 
    top_k=50, 
    top_p=0.95,
    num_return_sequences=1,
)
print(sequences[0]['generated_text'])</code>

Mistral 7B Fine-tuning

This section guides you through fine-tuning Mistral 7B on the guanaco-llama2-1k dataset, utilizing techniques like PEFT, 4-bit quantization, and QLoRA. The tutorial also references a guide on Fine-Tuning LLaMA 2 for further context.

Setup

Necessary libraries are installed:

<code>%%capture
%pip install -U bitsandbytes
%pip install -U transformers
%pip install -U peft
%pip install -U accelerate
%pip install -U trl</code>

Relevant modules are imported:

<code>from transformers import (
    AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
    HfArgumentParser, TrainingArguments, pipeline, logging
)
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
import os, torch, wandb
from datasets import load_dataset
from trl import SFTTrainer</code>

API keys are securely managed using Kaggle Secrets:

<code>from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
secret_hf = user_secrets.get_secret("HUGGINGFACE_TOKEN")
secret_wandb = user_secrets.get_secret("wandb")</code>

Hugging Face and Weights & Biases are configured:

<code>!huggingface-cli login --token $secret_hf
wandb.login(key = secret_wandb)
run = wandb.init(
    project='Fine tuning mistral 7B', 
    job_type="training", 
    anonymous="allow"
)</code>

Base model, dataset, and new model name are defined:

<code>base_model = "/kaggle/input/mistral/pytorch/7b-v0.1-hf/1"
dataset_name = "mlabonne/guanaco-llama2-1k"
new_model = "mistral_7b_guanaco"</code>

Data Loading

The dataset is loaded and a sample is displayed:

<code>dataset = load_dataset(dataset_name, split="train")
dataset["text"][100]</code>


Loading Mistral 7B

The model is loaded with 4-bit precision:

<code>bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,  # 4-bit NF4 quantization
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.config.use_cache = False         # disable the KV cache during training
model.config.pretraining_tp = 1
model.gradient_checkpointing_enable()  # trade compute for lower memory use</code>

Loading the Tokenizer

The tokenizer is loaded and configured:

<code>tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.padding_side = 'right'
tokenizer.pad_token = tokenizer.eos_token
tokenizer.add_eos_token = True
tokenizer.add_bos_token, tokenizer.add_eos_token  # display the current BOS/EOS token settings</code>

Adding the Adapter

A LoRA adapter is added for efficient fine-tuning:

<code>model = prepare_model_for_kbit_training(model)
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj","gate_proj"]
)
model = get_peft_model(model, peft_config)</code>
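
To confirm that only the adapter weights will be updated, you can print the trainable parameter count (a quick sanity check on the `model` object returned by get_peft_model above):

<code>model.print_trainable_parameters()  # prints trainable vs. total parameters and the trainable percentage</code>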

Hyperparameters

Training arguments are defined:

<code>training_arguments = TrainingArguments(
    output_dir="./results",  # directory for training checkpoints and logs
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=25,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
    report_to="wandb"
)</code>

SFT Training

The SFTTrainer from the trl library brings together the model, dataset, LoRA configuration, tokenizer, and training arguments to run supervised fine-tuning.

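A minimal sketch of this step, assuming the `model`, `dataset`, `peft_config`, `tokenizer`, and `training_arguments` objects defined above; argument names follow the trl SFTTrainer API current when this tutorial was written and may differ in newer trl releases:

<code>trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",   # column in guanaco-llama2-1k holding the formatted prompts
    max_seq_length=None,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
trainer.train()</code>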


Saving and Pushing the Model

After training, the fine-tuned adapter is saved locally and pushed to the Hugging Face Hub.

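A minimal sketch of this step, assuming the `trainer`, `model`, and `new_model` objects defined earlier and the Hugging Face login from the setup section:

<code>wandb.finish()                   # close the Weights & Biases run
model.config.use_cache = True    # re-enable the KV cache for inference

trainer.model.save_pretrained(new_model)                  # save the LoRA adapter locally
trainer.model.push_to_hub(new_model, use_temp_dir=False)  # upload the adapter to the Hub</code>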

Model Evaluation

Training metrics are tracked in Weights & Biases; the fine-tuned model can then be sanity-checked by running a few inference prompts.
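
A minimal inference sketch, assuming the fine-tuned `model` and `tokenizer` from this notebook and the llama-2 style prompt format used by the guanaco-llama2-1k dataset:

<code>model.eval()

prompt = "<s>[INST] Explain the concept of regularization in machine learning. [/INST]"
# the prompt already contains the BOS token, so skip automatic special tokens
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))</code>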

Merging the Adapter

The adapter is merged with the base model, and the resulting model is pushed to Hugging Face.
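
A sketch of the merge step, assuming the `base_model` path and `new_model` adapter name defined earlier; for a clean merge the base model is reloaded without 4-bit quantization, which may require restarting the kernel to free GPU memory:

<code>from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# reload the base model in half precision (no 4-bit quantization)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
# attach the trained LoRA adapter and fold it into the base weights
merged_model = PeftModel.from_pretrained(base, new_model)
merged_model = merged_model.merge_and_unload()

# push the merged weights and tokenizer to the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(base_model)
merged_model.push_to_hub(new_model, use_temp_dir=False)
tokenizer.push_to_hub(new_model, use_temp_dir=False)</code>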

Accessing the Fine-tuned Model

The merged model is loaded from Hugging Face and inference is demonstrated.
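
A sketch of loading the merged model back from the Hub for inference; the repository name below is a placeholder for your own account and model name:

<code>from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

repo_id = "your-username/mistral_7b_guanaco"  # placeholder repository name

model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = pipe("<s>[INST] What is overfitting and how can it be prevented? [/INST]", max_new_tokens=120)
print(result[0]["generated_text"])</code>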

Conclusion

The tutorial concludes with a summary of Mistral 7B's capabilities and a recap of the steps involved in accessing, fine-tuning, and deploying the model. Resources and FAQs are also included. The emphasis is on providing a practical guide for users to work with this powerful language model.

