DeepSeek's breakthrough AI models are challenging OpenAI's dominance. These advanced reasoning models are freely available, democratizing access to powerful AI. Video tutorial: learn how to fine-tune DeepSeek.
This tutorial fine-tunes the DeepSeek-R1-Distill-Llama-8B model on a medical chain-of-thought dataset from Hugging Face. This distilled model, derived from Llama 3.1 8B, offers reasoning capabilities comparable to the original DeepSeek-R1. New to LLMs and fine-tuning? Consider the Introduction to LLMs in Python course.
Install the required Python packages. For more detail, see "Unsloth Guide: Optimize and Speed Up LLM Fine-Tuning".
%%capture
!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
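As an optional sanity check (not part of the original steps), you can confirm that Unsloth imports cleanly and that a CUDA GPU is visible before moving on:

import torch
from unsloth import FastLanguageModel  # should import without errors after the install above

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())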
2. Loading the model and tokenizer
from huggingface_hub import login
from kaggle_secrets import UserSecretsClient

# Read the Hugging Face token from Kaggle secrets and authenticate
user_secrets = UserSecretsClient()
hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN")
login(hf_token)

# Log in to Weights & Biases and start a tracking run for this fine-tune
import wandb

wb_token = user_secrets.get_secret("wandb")
wandb.login(key=wb_token)
run = wandb.init(
    project='Fine-tune-DeepSeek-R1-Distill-Llama-8B on Medical COT Dataset',
    job_type="training",
    anonymous="allow"
)
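The snippet above assumes a Kaggle notebook, since kaggle_secrets only exists there. Outside Kaggle, a rough equivalent, assuming you have exported the tokens as environment variables yourself, would be:

import os
import wandb
from huggingface_hub import login

# Hypothetical variable names; export your own tokens before running this
login(os.environ["HUGGINGFACE_TOKEN"])
wandb.login(key=os.environ["WANDB_API_KEY"])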
Load the model and tokenizer:
from unsloth import FastLanguageModel

max_seq_length = 2048
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = hf_token,
)
Define the prompt template used to query the model before fine-tuning:

prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.
### Question:
{}
### Response:
<think>{}"""
Run a test medical question through the base model to see how it reasons before fine-tuning:

question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"

FastLanguageModel.for_inference(model)
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])
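Because prompt_style asks the model to reason inside <think> tags, the decoded reply contains a reasoning trace followed by the final answer. As a small illustrative helper (not from the original article), you can split the two:

answer_section = response[0].split("### Response:")[1]
if "</think>" in answer_section:
    chain_of_thought, final_answer = answer_section.split("</think>", 1)
    print(final_answer.strip())   # only the answer that follows the reasoning trace
else:
    print(answer_section.strip())  # the model did not emit a closing </think> tag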
6. Model training

Define the training prompt template, which adds a placeholder for the chain-of-thought field between the <think> tags:

train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.
### Question:
{}
### Response:
<think>
{}
</think>
{}"""
EOS_TOKEN = tokenizer.eos_token  # Must add EOS_TOKEN

def formatting_prompts_func(examples):
    inputs = examples["Question"]
    cots = examples["Complex_CoT"]
    outputs = examples["Response"]
    texts = []
    for input, cot, output in zip(inputs, cots, outputs):
        text = train_prompt_style.format(input, cot, output) + EOS_TOKEN
        texts.append(text)
    return {
        "text": texts,
    }
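The excerpt stops before this function is actually applied. As a minimal sketch of the usual next step, assuming the medical chain-of-thought dataset is FreedomIntelligence/medical-o1-reasoning-SFT (its Question, Complex_CoT, and Response columns match the fields used above; substitute the dataset you actually use), you would load a slice of it and map the formatter to build the "text" column:

from datasets import load_dataset

# Assumption: the dataset name and "en" config are illustrative, not confirmed by the excerpt above
dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT",
    "en",
    split="train[0:500]",  # a small slice keeps the demo fine-tune fast
)

# Add the "text" column containing the fully formatted training prompts
dataset = dataset.map(formatting_prompts_func, batched=True)
print(dataset["text"][0])

In the full workflow, this "text" column is what the supervised fine-tuning trainer consumes (for example, TRL's SFTTrainer combined with Unsloth's LoRA setup); those later steps are not included in this excerpt.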
That concludes this detailed look at fine-tuning DeepSeek R1 (a reasoning model).