
Deploy large language models locally on 2GB DAYU200

WBOY
2024-02-07 18:20:26


Implementation ideas and steps

Port the lightweight LLM inference framework InferLLM to the OpenHarmony standard system and compile it into a binary that runs on OpenHarmony. InferLLM is a simple and efficient CPU inference framework for LLMs that can deploy quantized LLM models locally.

Use the OpenHarmony NDK to compile the InferLLM executable for OpenHarmony. Specifically, use the OpenHarmony lycium cross-compilation framework together with a few build scripts, which are then stored in the tpc_c_cplusplus SIG repository.

Deploy the large language model locally on DAYU200

Compile InferLLM and obtain the compiled third-party library artifacts

Download the OpenHarmony SDK from:

http://ci.openharmony.cn/workbench/cicd/dailybuild/dailyList

Clone this repository

git clone https://gitee.com/openharmony-sig/tpc_c_cplusplus.git --depth=1
# Set the environment variable
export OHOS_SDK=your_extraction_directory/ohos-sdk/linux  # replace with your own extraction directory
cd lycium
./build.sh InferLLM
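
Because the build depends on OHOS_SDK pointing at the unpacked SDK, it may help to check the variable before running build.sh. The following is a minimal sketch: the function name check_ohos_sdk is hypothetical, and the presence of a native subdirectory under ohos-sdk/linux is an assumption about the SDK layout that the article does not state.

```shell
# Hypothetical helper: confirm that OHOS_SDK is set and looks like an
# unpacked SDK. The "native" subdirectory test is an assumption about
# the SDK layout, not something taken from the article.
check_ohos_sdk() {
    if [ -z "${OHOS_SDK:-}" ]; then
        echo "OHOS_SDK is not set" >&2
        return 1
    fi
    if [ ! -d "$OHOS_SDK/native" ]; then
        echo "no native/ directory under $OHOS_SDK" >&2
        return 1
    fi
    echo "using SDK at $OHOS_SDK"
}
```

It could be chained in front of the build, for example check_ohos_sdk && ./build.sh InferLLM, so that a wrong path fails early instead of partway through the cross-compilation.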

Get the InferLLM third-party library header files and the generated libraries

The InferLLM-405d866e4c11b884a8072b4b30659c63555be41d directory is generated under tpc_c_cplusplus/thirdparty/InferLLM/. It contains the compiled 32-bit and 64-bit third-party libraries. (The build results are not packaged into the usr directory under the lycium directory.)

InferLLM-405d866e4c11b884a8072b4b30659c63555be41d/arm64-v8a-build
InferLLM-405d866e4c11b884a8072b4b30659c63555be41d/armeabi-v7a-build
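
The article does not say where the llama executable sits inside each build directory, so one way to locate it is to search both ABI directories. This is only a sketch: the function name list_llama_builds is hypothetical, and the file name llama is taken from the run step later in the article.

```shell
# Hypothetical helper: print the path of every file named "llama" found
# under the two ABI build directories of the given source directory.
# Where the executable lives inside each build tree is not stated in the
# article, which is why this searches instead of hardcoding a path.
list_llama_builds() {
    src_dir=$1
    for abi in arm64-v8a armeabi-v7a; do
        # Ignore a missing build directory instead of failing.
        find "$src_dir/$abi-build" -type f -name llama 2>/dev/null
    done
    return 0
}
```

Calling list_llama_builds InferLLM-405d866e4c11b884a8072b4b30659c63555be41d from tpc_c_cplusplus/thirdparty/InferLLM/ would print one path per ABI that was built.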

Push the compiled artifacts and the model file to the development board and run them

  • Download the model file: https://huggingface.co/kewin4933/InferLLM-Model/tree/main
  • Package the llama executable produced by compiling InferLLM, the libc++_shared.so from the OpenHarmony SDK, and the downloaded model file chinese-alpaca-7b-q4.bin into a folder named llama_file
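
The packaging step can be sketched as a small helper that copies the three files into one folder and stops if any of them is missing. The function name stage_llama_files is hypothetical, and the source paths passed to it are placeholders, since the article does not give the exact locations of the executable or the shared library.

```shell
# Hypothetical helper: stage_llama_files DEST FILE...
# Copies each FILE into DEST and stops at the first one that is missing.
stage_llama_files() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        cp "$f" "$dest/" || return 1
    done
}
```

A call might look like stage_llama_files llama_file path/to/llama path/to/libc++_shared.so chinese-alpaca-7b-q4.bin, where the two path/to prefixes stand in for wherever those files are on your machine.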
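# Send the llama_file folder to the data directory on the development board
hdc file send llama_file /data
# Enter the development board with hdc shell, then run:
cd data/llama_file
# Add swap space on the 2GB DAYU200
# Create an empty file named ram_ohos
touch /data/ram_ohos
# Create the file used for swap space (an 8GB swap file)
fallocate -l 8G /data/ram_ohos
# Set the file permissions so that all users can read and write the file:
chmod 777 /data/ram_ohos
# Set the file up as swap space:
mkswap /data/ram_ohos
# Enable the swap space:
swapon /data/ram_ohos
# Set the library search path
export LD_LIBRARY_PATH=/data/llama_file:$LD_LIBRARY_PATH
# Raise the rk3568 CPU frequency
# Check the current CPU frequency
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
# Check the available CPU frequencies (the list differs between platforms)
cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_frequencies
# Switch the CPU frequency governor to userspace mode, which lets user programs set the CPU frequency manually instead of leaving it to the system. This gives more flexibility and control, but the frequency should be chosen sensibly to keep the system stable.
echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
# Set the rk3568 CPU frequency to 1.9GHz
echo 1992000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed
# Run the large language model
chmod 777 llama
./llama -m chinese-alpaca-7b-q4.bin -t 4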
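
Since the available frequencies differ between platforms, the value written to scaling_setspeed could be derived from the list instead of being hardcoded, and the swap size could be checked against the model size before launching. The two helpers below are a sketch of that idea: the names max_freq and fits_in_memory are hypothetical, and whether the shell on the board supports these constructs is an assumption.

```shell
# Hypothetical helper: print the largest value in a whitespace-separated
# list of frequencies in kHz, such as the contents of
# scaling_available_frequencies.
max_freq() {
    best=0
    for f in $1; do
        if [ "$f" -gt "$best" ]; then
            best=$f
        fi
    done
    echo "$best"
}

# Hypothetical helper: fits_in_memory MODEL_KB MEM_KB SWAP_KB
# Succeeds when RAM plus swap is at least as large as the model file.
# All sizes are in kB (the unit used by /proc/meminfo), which keeps the
# sum small enough for shells with 32-bit arithmetic.
fits_in_memory() {
    model_kb=$1
    mem_kb=$2
    swap_kb=$3
    [ $(( mem_kb + swap_kb )) -ge "$model_kb" ]
}
```

On the board, max_freq "$(cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_frequencies)" would print the highest listed frequency, and the MEM_KB and SWAP_KB arguments could be read from the MemTotal and SwapTotal lines of /proc/meminfo.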

By porting the InferLLM third-party library, a large language model is deployed on the OpenHarmony device rk3568 to realize human-computer dialogue. The final result runs rather slowly, and the human-machine dialogue prompt also takes a while to appear, so please wait patiently.


Statement:
This article is reproduced from 51cto.com.