
Deploy large language models locally in OpenHarmony

王林
Original · 2024-06-07 10:02:23


This article open-sources the results of "Local Deployment of Large Language Models in OpenHarmony", which was demonstrated at the 2nd OpenHarmony Technology Conference. Open-source address: https://gitee.com/openharmony-sig/tpc_c_cplusplus/blob/master/thirdparty/InferLLM/docs/hap_integrate.md

Implementation approach and steps

Port the lightweight LLM inference framework InferLLM to the OpenHarmony standard system, and compile a binary that can run on OpenHarmony.

InferLLM is a simple and efficient CPU inference framework for LLMs that can deploy quantized models locally.

Use the OpenHarmony NDK to compile the InferLLM executable for OpenHarmony.

Specifically, the build uses the OpenHarmony lycium cross-compilation framework together with a few build scripts, which are stored in the tpc_c_cplusplus SIG repository.

Steps of local deployment of large language model

Compile InferLLM and obtain the third-party library build artifacts

Download the OpenHarmony SDK from: http://ci.openharmony.cn/workbench/cicd/dailybuild/dailyList2

Clone this repository.

git clone https://gitee.com/openharmony-sig/tpc_c_cplusplus.git --depth=1
# Set the environment variable (replace the placeholder with your own extraction directory)
export OHOS_SDK=<extraction directory>/ohos-sdk/linux
cd lycium
./build.sh InferLLM
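Before starting the build, it can help to confirm that OHOS_SDK points at a usable SDK. The following is a minimal sketch and not part of lycium: the function name check_ohos_sdk is hypothetical, and it assumes the unpacked SDK contains a native/llvm/bin directory, which you should verify against your SDK version.

```shell
#!/bin/sh
# Hypothetical helper (not part of lycium): sanity-check the SDK path
# before running ./build.sh.
# Assumption: the unpacked SDK contains native/llvm/bin.
check_ohos_sdk() {
    sdk_dir="$1"
    if [ -z "$sdk_dir" ]; then
        echo "OHOS_SDK is not set" >&2
        return 1
    fi
    if [ ! -d "$sdk_dir/native/llvm/bin" ]; then
        echo "no native/llvm/bin directory under $sdk_dir" >&2
        return 1
    fi
    echo "OHOS_SDK looks usable: $sdk_dir"
}

# Example use, from inside the cloned repository:
#   check_ohos_sdk "$OHOS_SDK" && cd lycium && ./build.sh InferLLM
```

The function only inspects the directory layout; it does not verify the SDK version or that the toolchain actually runs.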

Obtain the InferLLM third-party library headers and the generated libraries. The build creates the InferLLM-405d866e4c11b884a8072b4b30659c63555be41d directory under tpc_c_cplusplus/thirdparty/InferLLM/, which contains the compiled 32-bit and 64-bit third-party libraries. (These build results are not packaged into the usr directory under the lycium directory.)

InferLLM-405d866e4c11b884a8072b4b30659c63555be41d/arm64-v8a-build
InferLLM-405d866e4c11b884a8072b4b30659c63555be41d/armeabi-v7a-build
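To check which architectures actually produced an executable, a small loop over the build directories is enough. This is a sketch under assumptions: the helper name find_llama_builds is hypothetical, and it assumes the executable is named llama and sits at the top level of each <arch>-build directory.

```shell
#!/bin/sh
# Hypothetical helper: print the llama binary found in each
# per-architecture build directory under the InferLLM source tree.
# Assumption: the executable is named "llama" and sits at the top
# of each <arch>-build directory.
find_llama_builds() {
    src_dir="$1"
    for build_dir in "$src_dir"/*-build; do
        if [ -f "$build_dir/llama" ]; then
            echo "$build_dir/llama"
        fi
    done
}

# Example use, from the tpc_c_cplusplus/thirdparty/InferLLM directory:
#   find_llama_builds InferLLM-405d866e4c11b884a8072b4b30659c63555be41d
```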

Push the build artifacts and model file to the development board and run

Download the model file: https://huggingface.co/kewin4933/InferLLM-Model/tree/main

Place the llama executable produced by the InferLLM build, libc++_shared.so from the OpenHarmony SDK, and the downloaded model file chinese-alpaca-7b-q4.bin together in a folder named llama_file.
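The packaging step can be sketched as a small function that copies the three files into one folder. The name make_llama_file is hypothetical, and all source paths are supplied by the caller, because the location of libc++_shared.so inside the SDK depends on the SDK version and target architecture.

```shell
#!/bin/sh
# Hypothetical helper: collect the executable, the C++ runtime library,
# and the model file into one folder (default name: llama_file).
# All source paths are passed in by the caller; none are assumed here.
make_llama_file() {
    llama_bin="$1"
    libcxx="$2"
    model="$3"
    out_dir="${4:-llama_file}"
    for f in "$llama_bin" "$libcxx" "$model"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done
    mkdir -p "$out_dir" &&
    cp "$llama_bin" "$out_dir/llama" &&
    cp "$libcxx" "$out_dir/libc++_shared.so" &&
    cp "$model" "$out_dir/$(basename "$model")"
}
```

One way to locate the library is `find "$OHOS_SDK" -name 'libc++_shared.so'`; choose the copy that matches the architecture of the llama binary you are packaging.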

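# Send the llama_file folder to the data directory on the development board
hdc file send llama_file /data

# Run hdc shell to enter the development board, then execute:
cd data/llama_file

# Add swap space on the 2 GB dayu200
# Create an empty ram_ohos file
touch ram_ohos
# Create the file used for swap space (an 8 GB swap file)
fallocate -l 8G /data/ram_ohos
# Set the file permissions so that all users can read and write the file
chmod 777 /data/ram_ohos
# Set the file up as swap space
mkswap /data/ram_ohos
# Enable the swap space
swapon /data/ram_ohos

# Set the library search path
export LD_LIBRARY_PATH=/data/llama_file:$LD_LIBRARY_PATH

# Raise the rk3568 CPU frequency
# Check the current CPU frequency
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
# Check the available CPU frequencies (these differ between platforms)
cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_frequencies
# Switch the CPU governor to userspace mode, so that the frequency is set manually
# instead of being managed by the system. This gives more control, but choose the
# frequency carefully to keep the system stable.
echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
# Set the rk3568 CPU frequency to 1992000 kHz (about 1.99 GHz)
echo 1992000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed

# Run the large language model
chmod 777 llama
./llama -m chinese-alpaca-7b-q4.bin -t 4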
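Because the available frequencies differ between platforms, the value written to scaling_setspeed can be derived from scaling_available_frequencies instead of being hard-coded. The following is a minimal sketch: the function name max_freq is hypothetical, and it assumes that tr, sort, and tail are available in the shell on the device.

```shell
#!/bin/sh
# Hypothetical helper: pick the highest value from a whitespace-separated
# list of frequencies, such as the content of scaling_available_frequencies.
# Assumption: tr, sort, and tail are available in the device shell.
max_freq() {
    echo "$1" | tr -s '[:space:]' '\n' | sort -n | tail -n 1
}

# Example use on the board, after switching the governor to userspace:
#   policy=/sys/devices/system/cpu/cpufreq/policy0
#   max_freq "$(cat "$policy/scaling_available_frequencies")" > "$policy/scaling_setspeed"
```

Sorting numerically rather than taking the last field avoids relying on the order in which the kernel lists the frequencies.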

With the InferLLM third-party library ported, the large language model runs on the OpenHarmony rk3568 device and supports human-computer dialogue. Inference on this hardware is slow, and the dialogue prompt also takes a while to appear, so please wait patiently.
