
Deploying DeepSeek R1 on Databricks: A Step-by-Step Guide

Jennifer Aniston
2025-02-28 16:33:10


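Databricks, a popular data engineering platform, is increasingly used for AI and machine learning tasks. This tutorial walks you through deploying a distilled DeepSeek R1 model on Databricks. DeepSeek R1 is a powerful large language model that is often preferred for on-premises deployment, because running it yourself avoids sending data to external servers. For a deeper look at DeepSeek R1's features and how it compares with other models, see the "DeepSeek-R1: Features, Comparison, Distilled Models & More" blog.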

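This guide covers account setup, model registration using the UI, and access through the Playground and a local curl command. New to Databricks? The Introduction to Databricks course provides a comprehensive overview of the Databricks Lakehouse platform and its data management capabilities. For a deeper understanding of data management in Databricks, consider the Data Management in Databricks course.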

Registering the DeepSeek R1 Model

  1. Launch a notebook: After creating your Databricks workspace, click "New" and select Notebook.


  2. Install packages: Install the required Python libraries:
<code class="language-python">%%capture
!pip install torch transformers mlflow accelerate torchvision
%restart_python</code>
  3. Load the model and tokenizer: Load the distilled DeepSeek R1 model and its tokenizer from Hugging Face:
<code class="language-python">import pandas as pd
import mlflow
import mlflow.transformers
import torch
from mlflow.models.signature import infer_signature
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig, pipeline

model_name = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
config = AutoConfig.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, config=config, torch_dtype=torch.float16)</code>
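
Loading in `torch.float16` stores each weight in two bytes, which roughly halves the memory needed compared with 32-bit floats. As a back-of-the-envelope sketch, assuming a nominal 8 billion parameters (taken from the "8B" in the model name, not measured) and ignoring activations and the KV cache, the weight footprint can be estimated as follows. `estimate_weight_memory_gb` is a hypothetical helper, not part of any library:

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights only, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


NOMINAL_PARAMS = 8e9  # assumption: read off the "8B" in the model name, not an exact count

fp16_gb = estimate_weight_memory_gb(NOMINAL_PARAMS, 2)  # float16: 2 bytes per weight
fp32_gb = estimate_weight_memory_gb(NOMINAL_PARAMS, 4)  # float32: 4 bytes per weight
print(f"float16: ~{fp16_gb:.0f} GB, float32: ~{fp32_gb:.0f} GB")
```

This is only a lower bound for choosing a compute size; actual usage during generation will be higher.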


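  4. Test the model: Run a sample prompt and generate the signature needed for model registration:
<code class="language-python"># Wrap the loaded model and tokenizer in a text-generation pipeline
text_generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
example_prompt = "How does a computer work?"
example_inputs = pd.DataFrame({"inputs": [example_prompt]})
example_outputs = text_generator(example_prompt, max_length=200)
# Infer the input/output schema that MLflow stores alongside the model
signature = infer_signature(example_inputs, example_outputs)
print(example_outputs)</code>

Expected output (may vary slightly):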
<code>[{'generated_text': "How does a computer work? What is the computer? What is the computer used for? What is the computer used for in real life?\n\nI need to answer this question, but I need to do it step by step. I need to start with the very basic level and build up from there. I need to make sure I understand each concept before moving on. I need to use a lot of examples to explain each idea. I need to write my thoughts as if I'm explaining them to someone else, but I need to make sure I understand how to structure the answer properly.\n\nOkay, let's start with the basic level. What is a computer? It's an electronic device, right? And it has a central processing unit (CPU) that does the processing. But I think the central processing unit is more efficient, so maybe it's the CPU. Then, it has memory and storage. I remember that memory is like RAM and storage is like ROM. But wait, I think"}]</code>
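
The pipeline returns a list with one dictionary per generated sequence, and, as the output above shows, `generated_text` begins with the prompt itself. Here is a minimal sketch for pulling out only the continuation, assuming that output shape. `extract_completion` is a hypothetical helper, and the sample data is a shortened stand-in for real model output:

```python
from typing import Dict, List


def extract_completion(outputs: List[Dict[str, str]], prompt: str) -> str:
    """Return the first generated sequence without the echoed prompt."""
    text = outputs[0]["generated_text"]
    # Drop the prompt only when it really is a prefix of the generated text.
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


# Shortened stand-in for the structure printed by the pipeline above.
sample_outputs = [{"generated_text": "How does a computer work? What is the computer?"}]
completion = extract_completion(sample_outputs, "How does a computer work?")
print(completion)
```

Note that `max_length=200` counts the prompt tokens as well as the generated ones, so the sample output stops mid-sentence once that budget is used up.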
  5. Define a conda environment: List the dependencies that should be installed alongside the model:
<code class="language-python">conda_env = {
    "name": "mlflow-env",
    "channels": ["defaults", "conda-forge"],
    "dependencies": [
        "python=3.11",
        "pip",
        {"pip": ["mlflow", "transformers", "accelerate", "torch", "torchvision"]}
    ]
}</code>
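
The pip entries above are unpinned, so the serving environment may resolve newer versions than the notebook used. One possible approach, offered here as an assumption rather than something the original guide does, is to pin each package to the version installed in the current environment using the standard library's `importlib.metadata`. `pinned_requirements` is a hypothetical helper:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import List


def pinned_requirements(packages: List[str]) -> List[str]:
    """Return 'name==x.y.z' for installed packages, bare names otherwise."""
    pinned = []
    for name in packages:
        try:
            pinned.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed here: leave it unpinned instead of guessing a version.
            pinned.append(name)
    return pinned


pip_deps = pinned_requirements(["mlflow", "transformers", "accelerate", "torch", "torchvision"])
conda_env = {
    "name": "mlflow-env",
    "channels": ["defaults", "conda-forge"],
    "dependencies": ["python=3.11", "pip", {"pip": pip_deps}],
}
print(conda_env["dependencies"][2]["pip"])
```

Versions with a local build suffix (for example a CUDA tag on `torch`) may need trimming before they can be resolved from a package index.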
  6. Register the model: Log and register the model with mlflow.transformers.log_model:
<code class="language-python">with mlflow.start_run() as run:
    mlflow.transformers.log_model(
        transformers_model=text_generator,
        artifact_path="deepseek_model",
        signature=signature,
        input_example=example_inputs,
        registered_model_name="deepseek_r1_llama_8b",
        conda_env=conda_env
    )</code>
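
Once the run finishes, the registered model can be loaded back by URI to confirm that registration worked. The sketch below assumes the first registered version (version `1`, an assumption) and the `models:/<name>/<version>` URI scheme. `registered_model_uri` and `load_and_predict` are hypothetical helpers, and `load_and_predict` needs MLflow, pandas, and enough memory to hold the model:

```python
def registered_model_uri(name: str, version: int) -> str:
    """Build an MLflow model-registry URI of the form models:/<name>/<version>."""
    return f"models:/{name}/{version}"


def load_and_predict(name: str, version: int, prompt: str):
    """Load a registered model as a generic pyfunc and run one prompt through it."""
    # Imported lazily so the URI helper works without MLflow installed.
    import mlflow.pyfunc
    import pandas as pd

    model = mlflow.pyfunc.load_model(registered_model_uri(name, version))
    return model.predict(pd.DataFrame({"inputs": [prompt]}))


uri = registered_model_uri("deepseek_r1_llama_8b", 1)  # version 1 is assumed
print(uri)
```

In the notebook, calling `load_and_predict("deepseek_r1_llama_8b", 1, "How does a computer work?")` would exercise the same input format that the serving endpoint later receives.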


Deploying DeepSeek R1

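  1. Navigate to Models: In the Databricks dashboard, go to the "Models" tab.
  2. Serve the model: Select your model and click "Serve this model".
  3. Configure the endpoint: Name your endpoint, select the compute options, set the concurrency, and click "Create".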
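
The same endpoint settings can also be expressed as a request body for the Databricks serving-endpoints REST API (`POST /api/2.0/serving-endpoints`). The sketch below only builds that body and does not send it. The endpoint name, model version, and `workload_size` value are assumptions for illustration, `build_endpoint_config` is a hypothetical helper, and the field names should be checked against the current Databricks API reference before use:

```python
import json
from typing import Any, Dict, Optional


def build_endpoint_config(
    endpoint_name: str,
    model_name: str,
    model_version: str,
    workload_size: str = "Small",
    workload_type: Optional[str] = None,
    scale_to_zero: bool = True,
) -> Dict[str, Any]:
    """Assemble a request body describing one served model version."""
    entity: Dict[str, Any] = {
        "entity_name": model_name,
        "entity_version": model_version,
        "workload_size": workload_size,  # controls provisioned concurrency
        "scale_to_zero_enabled": scale_to_zero,
    }
    if workload_type is not None:
        # Only sent when given explicitly, e.g. to request GPU compute.
        entity["workload_type"] = workload_type
    return {"name": endpoint_name, "config": {"served_entities": [entity]}}


# Hypothetical endpoint name; model version "1" is assumed.
config = build_endpoint_config("deepseek-r1-endpoint", "deepseek_r1_llama_8b", "1")
print(json.dumps(config, indent=2))
```

Keeping the configuration in code makes it easier to recreate the endpoint later with the same settings than clicking through the UI again.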


For fine-tuning on a custom dataset, see the Fine-Tuning DeepSeek R1 tutorial.

Accessing the Deployed Model

  1. Databricks Playground: Test the endpoint directly in the Databricks Playground.


  2. curl command: Generate a Databricks API key (Settings > Developer), set it as an environment variable ($DATABRICKS_TOKEN), then query the endpoint with curl.
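
As a rough Python equivalent of the curl call described above, the sketch below builds a request for the endpoint's `invocations` URL and authenticates with the token from `DATABRICKS_TOKEN`. The workspace host and endpoint name are hypothetical placeholders, `build_invocation_request` is a hypothetical helper, and the `dataframe_records` payload assumes the model was registered with the single-column `inputs` signature used during registration:

```python
import json
import os
import urllib.request


def build_invocation_request(
    host: str, endpoint_name: str, prompt: str, token: str
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to a model serving endpoint."""
    url = f"https://{host}/serving-endpoints/{endpoint_name}/invocations"
    # One record per prompt, keyed by the "inputs" column from the signature.
    payload = {"dataframe_records": [{"inputs": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_invocation_request(
    host="my-workspace.cloud.databricks.com",  # hypothetical workspace host
    endpoint_name="deepseek-r1-endpoint",      # hypothetical endpoint name
    prompt="How does a computer work?",
    token=os.environ.get("DATABRICKS_TOKEN", ""),
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response body.
```

Separating request construction from sending makes it possible to inspect the URL and payload before any network call is made.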


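For a comparison of DeepSeek R1 and V3, see the DeepSeek R1 vs V3 blog. New to LLMs? The Introduction to LLMs in Python course is a great starting point. Keep in mind that although CPU deployment is possible, it is likely to be slow.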
