DeepSeek的開創性AI模型挑戰Openai的主導地位。 這些先進的推理模型是免費的,可以使獲得強大AI的訪問民主化。 了解如何通過我們的視頻教程微調DeepSeek:
該教程微調使用擁抱臉部醫療鏈數據集使用DeepSeek-r1-Distill-lalama-8b型號。 這種蒸餾型型號衍生自Llama 3.1 8b,提供了與原始DeepSeek-R1相當的推理能力。 LLM和微調的新手? 考慮我們在Python課程中對LLM的介紹。
>由作者 圖像
介紹DeepSeek R1模型> deepSeek-r1-Zero
> deepSeek-r1
DeepSeek-R1解決DeepSeek-R1-Zero的局限性,在RL之前包含了冷啟動數據。這種多階段的訓練可實現最先進的性能,匹配OpenAI-O1,同時提高輸出清晰度。
來源:DeepSeek-ai/deepSeek-r1
>在我們的博客文章中了解更多有關DeepSeek-R1的功能,開發,蒸餾模型,訪問,定價和OpenAi O1比較的信息:“ DeepSeek-R1:功能,O1比較,蒸發模型及更多”。
>微調DeepSeek R1:實用指南
按照以下步驟微調您的DeepSeek R1型號:
>我們利用Kaggle的免費GPU訪問權限。創建一個Kaggle筆記本電腦,將您的擁抱臉和偏見令牌添加為秘密。安裝
<code>%%capture !pip install unsloth !pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git</code>
>用擁抱的面部CLI和重量和偏見(WANDB)進行身份驗證:
<code>from huggingface_hub import login from kaggle_secrets import UserSecretsClient user_secrets = UserSecretsClient() hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN") login(hf_token) import wandb wb_token = user_secrets.get_secret("wandb") wandb.login(key=wb_token) run = wandb.init( project='Fine-tune-DeepSeek-R1-Distill-Llama-8B on Medical COT Dataset', job_type="training", anonymous="allow" )</code>2。加載模型和令牌
3。預先調節推理
<code>from unsloth import FastLanguageModel max_seq_length = 2048 dtype = None load_in_4bit = True model, tokenizer = FastLanguageModel.from_pretrained( model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B", max_seq_length = max_seq_length, dtype = dtype, load_in_4bit = load_in_4bit, token = hf_token, )</code>
用樣本醫學問題測試模型:
<code>prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response. ### Instruction: You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. Please answer the following medical question. ### Question: {} ### Response: <think>{}"""</think></code>
>觀察模型的預先調整推理,並通過微調來確定改進的領域。
<code>question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?" FastLanguageModel.for_inference(model) inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda") outputs = model.generate( input_ids=inputs.input_ids, attention_mask=inputs.attention_mask, max_new_tokens=1200, use_cache=True, ) response = tokenizer.batch_decode(outputs) print(response[0].split("### Response:")[1])</code>
4。加載和處理數據集
創建一個函數以格式化數據集:
<code>train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response. ### Instruction: You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. Please answer the following medical question. ### Question: {} ### Response: <think> {} </think> {}"""</code>
加載並處理數據集:
<code>EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN def formatting_prompts_func(examples): inputs = examples["Question"] cots = examples["Complex_CoT"] outputs = examples["Response"] texts = [] for input, cot, output in zip(inputs, cots, outputs): text = train_prompt_style.format(input, cot, output) + EOS_TOKEN texts.append(text) return { "text": texts, }</code>
5。設置模型
<code>from datasets import load_dataset dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", split = "train[0:500]",trust_remote_code=True) dataset = dataset.map(formatting_prompts_func, batched = True,) dataset["text"][0]</code>使用lora配置模型:
設置教練:
<code>model = FastLanguageModel.get_peft_model( model, r=16, target_modules=[ "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", ], lora_alpha=16, lora_dropout=0, bias="none", use_gradient_checkpointing="unsloth", # True or "unsloth" for very long context random_state=3407, use_rslora=False, loftq_config=None, )</code>
6。模型培訓
<code>from trl import SFTTrainer from transformers import TrainingArguments from unsloth import is_bfloat16_supported trainer = SFTTrainer( model=model, tokenizer=tokenizer, train_dataset=dataset, dataset_text_field="text", max_seq_length=max_seq_length, dataset_num_proc=2, args=TrainingArguments( per_device_train_batch_size=2, gradient_accumulation_steps=4, # Use num_train_epochs = 1, warmup_ratio for full training runs! warmup_steps=5, max_steps=60, learning_rate=2e-4, fp16=not is_bfloat16_supported(), bf16=is_bfloat16_supported(), logging_steps=10, optim="adamw_8bit", weight_decay=0.01, lr_scheduler_type="linear", seed=3407, output_, ), )</code>訓練模型:
(注意:原始響應包括訓練進度的圖像;此處省略了這些圖像,因為不可能進行圖像複製。
7。郵政調節推理<code>trainer_stats = trainer.train()</code>
通過與以前相同的問題查詢微調模型來比較結果。 觀察推理和響應簡潔性的改善。
>在本地保存模型,然後將其推到擁抱的臉部集線器:
(注意:原始響應包括顯示成功的模型保存和推動的圖像;此處省略了這些。)
>教程結束時,建議使用Bentoml或本地轉換為GGEF格式提出部署選項。 它強調了開源LLM的重要性,並強調了O3和操作員AI的OpenAI櫃檯。 保留了這些資源的鏈接。
<code>new_model_local = "DeepSeek-R1-Medical-COT" model.save_pretrained(new_model_local) tokenizer.save_pretrained(new_model_local) model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",) new_model_online = "kingabzpro/DeepSeek-R1-Medical-COT" model.push_to_hub(new_model_online) tokenizer.push_to_hub(new_model_online) model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")</code>
>重寫的響應在簡化結構並刪除不必要的重複時維護核心信息。 保留代碼塊以進行完整。 圖像被引用但不復制。
>以上是微調DeepSeek R1(推理模型)的詳細內容。更多資訊請關注PHP中文網其他相關文章!