Mistral 7B Tutorial: A Step-by-Step Guide to Using and Fine-Tuning Mistral 7B
This tutorial provides a comprehensive guide to using and fine-tuning the Mistral 7B language model for natural language processing tasks. You'll learn to leverage Kaggle for model access, perform inference, apply quantization techniques, fine-tune the model, merge adapters, and deploy to the Hugging Face Hub.
Mistral 7B is accessible via various platforms including Hugging Face, Vertex AI, Replicate, SageMaker JumpStart, and Baseten. This tutorial focuses on utilizing Kaggle's "Models" feature for streamlined access, eliminating the need for manual downloads.
This section demonstrates loading the model from Kaggle and performing inference. Updating the core libraries first helps prevent version-related errors:
<code>!pip install -q -U transformers
!pip install -q -U accelerate
!pip install -q -U bitsandbytes</code>
4-bit quantization with NF4 configuration using BitsAndBytes enhances loading speed and reduces memory usage:
<code>from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)</code>
Next, add the Mistral 7B model to your Kaggle notebook via the "Models" feature; once attached, it is available under the /kaggle/input/ path used below.
Model and tokenizer loading uses the transformers library:
<code>model_name = "/kaggle/input/mistral/pytorch/7b-v0.1-hf/1"

tokenizer = AutoTokenizer.from_pretrained(model_name)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_4bit=True,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)</code>
Inference is simplified using the pipeline function:
<code>pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)</code>
Prompting the model and setting parameters:
<code>prompt = "As a data scientist, can you explain the concept of regularization in machine learning?"

sequences = pipe(
    prompt,
    do_sample=True,
    max_new_tokens=100,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
    num_return_sequences=1,
)
print(sequences[0]['generated_text'])</code>
This section guides you through fine-tuning Mistral 7B on the guanaco-llama2-1k dataset, utilizing techniques like PEFT, 4-bit quantization, and QLoRA. The tutorial also references a guide on Fine-Tuning LLaMA 2 for further context.
Necessary libraries are installed:
<code>%%capture
%pip install -U bitsandbytes
%pip install -U transformers
%pip install -U peft
%pip install -U accelerate
%pip install -U trl</code>
Relevant modules are imported:
<code>from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, HfArgumentParser, TrainingArguments, pipeline, logging
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
import os, torch, wandb
from datasets import load_dataset
from trl import SFTTrainer</code>
API keys are securely managed using Kaggle Secrets:
<code>from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
secret_hf = user_secrets.get_secret("HUGGINGFACE_TOKEN")
secret_wandb = user_secrets.get_secret("wandb")</code>
Hugging Face and Weights & Biases are configured:
<code>!huggingface-cli login --token $secret_hf

wandb.login(key=secret_wandb)
run = wandb.init(
    project='Fine tuning mistral 7B',
    job_type="training",
    anonymous="allow",
)</code>
Base model, dataset, and new model name are defined:
<code>base_model = "/kaggle/input/mistral/pytorch/7b-v0.1-hf/1"
dataset_name = "mlabonne/guanaco-llama2-1k"
new_model = "mistral_7b_guanaco"</code>
The dataset is loaded and a sample is displayed:
<code>dataset = load_dataset(dataset_name, split="train")
dataset["text"][100]</code>
The model is loaded with 4-bit precision:
<code>bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_4bit=True,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.config.use_cache = False
model.config.pretraining_tp = 1
model.gradient_checkpointing_enable()</code>
The tokenizer is loaded and configured:
<code>tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.padding_side = 'right'
tokenizer.pad_token = tokenizer.eos_token
tokenizer.add_eos_token = True
tokenizer.add_bos_token, tokenizer.add_eos_token  # display the current BOS/EOS token settings
</code>
A LoRA adapter is added for efficient fine-tuning:
<code>model = prepare_model_for_kbit_training(model)

peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj"],
)
model = get_peft_model(model, peft_config)</code>
Training arguments are defined:
<code>training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=25,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
    report_to="wandb",
)</code>
The SFTTrainer is then configured and training is initiated.
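A minimal sketch of this step, assuming the trl SFTTrainer API that was current when this tutorial was written; max_seq_length=None, dataset_text_field="text" (the column shown earlier), and packing=False are typical settings rather than values confirmed by the tutorial:
<code>trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    max_seq_length=None,            # let the trainer fall back to its default
    dataset_text_field="text",      # column containing the formatted prompts
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
trainer.train()</code>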
The fine-tuned model is then saved and pushed to the Hugging Face Hub.
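A minimal sketch of saving and publishing the LoRA adapter, assuming the trainer and new_model name defined above; closing the W&B run and re-enabling the KV cache are conventional cleanup steps, not details confirmed by the tutorial:
<code># finish logging and restore the cache setting changed before training
wandb.finish()
model.config.use_cache = True

# save the adapter locally, then push it to the Hub under your account
trainer.model.save_pretrained(new_model)
trainer.model.push_to_hub(new_model, use_temp_dir=False)</code>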
Training metrics such as loss can be reviewed in the Weights & Biases dashboard, and the fine-tuned model can be spot-checked with a few inference prompts.
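A minimal inference sketch against the fine-tuned model, reusing the pipeline pattern from the first section; the prompt and the Llama-2-style [INST] ... [/INST] template (matching the guanaco-llama2-1k format) are illustrative assumptions:
<code># silence verbose generation warnings
logging.set_verbosity(logging.CRITICAL)

prompt = "How do I find true love?"  # illustrative prompt
pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=200,
)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])</code>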
The adapter is merged with the base model, and the resulting model is pushed to Hugging Face.
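A sketch of the merge step, assuming the standard PEFT workflow: the base model is reloaded in 16-bit precision, the saved adapter is attached with PeftModel, and merge_and_unload() folds the LoRA weights into the base weights. The Hub repository name is illustrative:
<code>from peft import PeftModel

# reload the base model in fp16 (merging cannot be done on 4-bit weights)
base_model_reload = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# attach the saved LoRA adapter and merge it into the base weights
merged_model = PeftModel.from_pretrained(base_model_reload, new_model)
merged_model = merged_model.merge_and_unload()

# push the merged model and tokenizer to the Hub (repository name is illustrative)
merged_model.push_to_hub("mistral_7b_guanaco")
tokenizer.push_to_hub("mistral_7b_guanaco")</code>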
The merged model is loaded from Hugging Face and inference is demonstrated.
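A corresponding sketch for pulling the merged model back from the Hub and running inference; the repository id with a placeholder user name is an assumption:
<code>model = AutoModelForCausalLM.from_pretrained(
    "your-username/mistral_7b_guanaco",  # illustrative repo id, use your own account
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("your-username/mistral_7b_guanaco")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)
print(pipe("<s>[INST] What is regularization in machine learning? [/INST]")[0]["generated_text"])</code>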
The tutorial concludes with a summary of Mistral 7B's capabilities and a recap of the steps involved in accessing, fine-tuning, and deploying the model. Resources and FAQs are also included. The emphasis is on providing a practical guide for users to work with this powerful language model.