Beyond GPT-4, the Stanford team's large model that can run on mobile phones becomes popular, with over 2k downloads overnight
In the deployment of large models, on-device AI is a very important direction.
Recently, Octopus v2, released by researchers at Stanford University, has attracted considerable attention from the developer community, with the model downloaded over 2k times overnight.
The 2-billion-parameter Octopus v2 can run on smartphones, cars, PCs, and other devices, surpasses GPT-4 in both accuracy and latency, and reduces context length by 95%. Furthermore, Octopus v2 is 36 times faster than the Llama-7B + RAG scheme.
Paper: Octopus v2: On-device language model for super agent
Paper address: https://arxiv.org/abs/2404.01744
Model homepage: https://huggingface.co/NexaAIDev/Octopus-v2
Model Overview
Octopus-V2-2B is an open source language model with 2 billion parameters, tailored for the Android API. It runs seamlessly on Android devices and extends its utility to a variety of applications ranging from Android system management to orchestration of multiple devices.
Typically, Retrieval Augmented Generation (RAG) methods require detailed descriptions of potential function parameters (sometimes requiring up to tens of thousands of input tokens). Based on this, Octopus-V2-2B introduces a unique function token strategy in the training and inference phases, which not only enables it to achieve a performance level comparable to GPT-4, but also significantly improves the inference speed, surpassing RAG-based methods. This makes it particularly beneficial for edge computing devices.
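The core idea behind this functional token strategy is that each callable function is represented by a single dedicated token learned during fine-tuning, so the prompt no longer has to carry full function descriptions the way a RAG pipeline does. The following is a minimal sketch of that idea using standard Hugging Face APIs; the token names and function set here are illustrative, not those of the released model.

from transformers import AutoTokenizer, GemmaForCausalLM

# Illustrative function set; the released model covers 20 Android APIs.
functions = ["get_trending_news", "take_a_photo", "send_text_message"]
function_tokens = [f"<func_{i}>" for i in range(len(functions))]  # one dedicated token per function

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = GemmaForCausalLM.from_pretrained("google/gemma-2b")

# Register the new tokens and grow the embedding matrix so they can be learned
# during fine-tuning; at inference the model emits a function token plus its
# arguments instead of reading long retrieved descriptions.
tokenizer.add_special_tokens({"additional_special_tokens": function_tokens})
model.resize_token_embeddings(len(tokenizer))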
Octopus-V2-2B is capable of generating individual, nested and parallel function calls in a variety of complex scenarios.
Dataset
To obtain high-quality datasets for the training, validation, and testing phases, and in particular to enable efficient training, the research team created the dataset in three key stages:
Generate relevant queries and their associated function call parameters;
Generate unrelated queries from the appropriate function components;
Binary verification of each pair via Google Gemini (a sketch of this step follows the list).
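The article does not show how the Gemini-based verification was implemented; as one possible illustration, the binary check could be scripted with the google-generativeai Python client along these lines (the prompt wording and model name are assumptions):

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
verifier = genai.GenerativeModel("gemini-pro")

def verify_pair(query: str, function_call: str) -> bool:
    """Ask Gemini for a yes/no judgment on whether the generated function call
    actually satisfies the query (illustrative prompt only)."""
    prompt = (
        "Does the following function call correctly satisfy the user query? "
        "Answer with exactly 'yes' or 'no'.\n"
        f"Query: {query}\nFunction call: {function_call}"
    )
    answer = verifier.generate_content(prompt).text.strip().lower()
    return answer.startswith("yes")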
The research team wrote 20 Android API descriptions for training the model. The following is an example of Android API description:
def get_trending_news(category=None, region='US', language='en', max_results=5):
    """Fetches trending news articles based on category, region, and language.

    Parameters:
    - category (str, optional): News category to filter by; by default None for all categories. Optional to provide.
    - region (str, optional): ISO 3166-1 alpha-2 country code for region-specific news; by default 'US'. Optional to provide.
    - language (str, optional): ISO 639-1 language code for article language; by default 'en'. Optional to provide.
    - max_results (int, optional): Maximum number of articles to return; by default 5. Optional to provide.

    Returns:
    - list[str]: A list of strings, each representing an article. Each string contains the article's heading and URL.
    """
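For illustration, a query generated for this API and its associated function-call parameters (stage 1 of the pipeline above) might be paired as follows; the exact dataset format is not shown in the article, so both fields below are hypothetical:

# Hypothetical (query, target function call) pair for the API above;
# the real dataset format used by the authors may differ.
training_example = {
    "query": "Show me the top three technology headlines for the UK",
    "target_call": "get_trending_news(category='technology', region='GB', max_results=3)",
}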
Model development and training
This research uses Google's Gemma-2B model as the pretrained base model in the framework and trains it with two different methods: full model training and LoRA training.
For full model training, the study uses the AdamW optimizer with the learning rate set to 5e-5, the number of warm-up steps set to 10, and a linear learning rate scheduler.
LoRA training uses the same optimizer and learning-rate configuration as full model training, with the LoRA rank set to 16 and LoRA applied to the following modules: q_proj, k_proj, v_proj, o_proj, up_proj, and down_proj. The LoRA alpha parameter is set to 32.
For both training methods, the number of epochs is set to 3.
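Putting the reported hyperparameters together, the LoRA variant of the fine-tuning setup could be expressed with Hugging Face transformers and peft roughly as follows; the actual training script is not included with the article, so this is only a sketch built from the numbers above:

from transformers import TrainingArguments
from peft import LoraConfig, get_peft_model

# Reported settings: AdamW, learning rate 5e-5, 10 warm-up steps, linear schedule, 3 epochs.
training_args = TrainingArguments(
    output_dir="octopus-v2-finetune",
    optim="adamw_torch",
    learning_rate=5e-5,
    warmup_steps=10,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)

# Reported LoRA settings: rank 16, alpha 32, applied to the listed projection modules.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# model = get_peft_model(model, lora_config)  # wrap a loaded Gemma-2B model before training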
Using the following code, you can run the Octopus-V2-2B model on a single GPU.
from transformers import AutoTokenizer, GemmaForCausalLM
import torch
import time

def inference(input_text):
    start_time = time.time()
    # Tokenize the query and record its length so only newly generated tokens are decoded.
    input_ids = tokenizer(input_text, return_tensors="pt").to(model.device)
    input_length = input_ids["input_ids"].shape[1]
    # Greedy decoding, capped at 1024 tokens.
    outputs = model.generate(
        input_ids=input_ids["input_ids"],
        max_length=1024,
        do_sample=False,
    )
    generated_sequence = outputs[:, input_length:].tolist()
    res = tokenizer.decode(generated_sequence[0])
    end_time = time.time()
    return {"output": res, "latency": end_time - start_time}

model_id = "NexaAIDev/Octopus-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = GemmaForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

input_text = "Take a selfie for me with front camera"
nexa_query = f"Below is the query from the users, please call the correct function and generate the parameters to call the function.\n\nQuery: {input_text} \n\nResponse:"

start_time = time.time()
print("nexa model result:\n", inference(nexa_query))
print("latency:", time.time() - start_time, "s")
Evaluation
In benchmark tests, Octopus-V2-2B demonstrated superior inference speed, running 36 times faster than the Llama-7B RAG solution on a single A100 GPU. Additionally, Octopus-V2-2B is 168% faster than GPT-4-turbo, which relies on clusters of A100/H100 GPUs. This efficiency breakthrough is attributed to Octopus-V2-2B's functional token design.
Octopus-V2-2B excels not only in speed but also in accuracy, surpassing the Llama-7B RAG solution in function-call accuracy by 31%, and achieving function-calling accuracy comparable to GPT-4 and RAG + GPT-3.5.
Interested readers can read the original text of the paper to learn more about the research content.