Beyond GPT-4, the Stanford team's large model that can be run on mobile phones became popular, with over 2k downloads overnight

On-device AI is an important direction for putting large models into practical use.

Recently, Octopus v2, released by researchers at Stanford University, has drawn considerable attention from the developer community: the model was downloaded more than 2k times overnight.

The 2-billion-parameter Octopus v2 can run on smartphones, cars, PCs, and similar devices. It surpasses GPT-4 in accuracy and latency while reducing context length by 95%. Furthermore, Octopus v2 is 36 times faster than a Llama7B + RAG scheme.

Many users remarked: the era of on-device AI agents has arrived!

  • Paper: Octopus v2: On-device language model for super agent

  • Paper address: https://arxiv.org/abs/2404.01744

  • Model homepage: https://huggingface.co/NexaAIDev/Octopus-v2

Model Overview

Octopus-V2-2B is an open source language model with 2 billion parameters, tailored for the Android API. It runs seamlessly on Android devices and extends its utility to a variety of applications ranging from Android system management to orchestration of multiple devices.

Typically, Retrieval Augmented Generation (RAG) methods require detailed descriptions of the candidate functions and their parameters, which can take up to tens of thousands of input tokens. To avoid this cost, Octopus-V2-2B introduces a function token strategy in both the training and inference phases. This strategy not only lets the model reach a performance level comparable to GPT-4 but also significantly improves inference speed, surpassing RAG-based methods. This makes it particularly beneficial for edge computing devices.
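
The function token idea can be illustrated with a minimal sketch. The token names (`<nexa_0>`, `<nexa_1>`, and so on) and the function names below are assumptions made for illustration and are not taken from this article. The point is only that each function gets one dedicated token, so the model can name a function with a single token instead of reading long function descriptions at inference time.

```python
# Hypothetical function names; the article does not list the actual APIs.
FUNCTION_NAMES = ["take_a_photo", "get_trending_news", "send_text_message"]


def build_functional_tokens(names):
    """Assign one dedicated token string to each function name."""
    return {f"<nexa_{i}>": name for i, name in enumerate(names)}


def resolve(token, token_to_function):
    """Map a functional token emitted by the model back to a function name."""
    if token not in token_to_function:
        raise KeyError(f"unknown functional token: {token}")
    return token_to_function[token]


token_to_function = build_functional_tokens(FUNCTION_NAMES)
print(resolve("<nexa_1>", token_to_function))
```

In a real system these token strings would be added to the tokenizer vocabulary and learned during fine-tuning; the dictionary here only shows the lookup side of that design.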

Octopus-V2-2B is capable of generating individual, nested and parallel function calls in a variety of complex scenarios.

Dataset

To obtain a high-quality dataset for the training, validation, and testing phases, and especially to make training efficient, the research team built the dataset in three key stages:

  • Generate relevant queries and their associated function call parameters;

  • Generate unrelated queries from the appropriate function components;

  • Binary verification support via Google Gemini.
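
The three stages above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the team's pipeline: `make_query` and `verify` are hypothetical callbacks standing in for LLM-based generation and for the binary check (performed with Google Gemini in the article), and reusing queries written for other functions as unrelated queries is only one possible reading of the second stage.

```python
import random


def build_dataset(api_names, make_query, verify, negatives_per_api=1, seed=0):
    """Collect verified positive samples and unrelated (negative) samples per API.

    make_query(api_name) -> (query, call): stand-in for LLM-based generation.
    verify(query, call) -> bool: stand-in for the binary pass/fail check.
    """
    rng = random.Random(seed)
    samples = []
    for name in api_names:
        # Stage 1: a relevant query and its function call parameters.
        query, call = make_query(name)
        # Stage 3: keep the positive sample only if the verifier accepts it.
        if verify(query, call):
            samples.append({"target": name, "query": query, "call": call, "relevant": True})
        # Stage 2: queries written for other functions act as unrelated queries.
        others = [n for n in api_names if n != name]
        for other in rng.sample(others, min(negatives_per_api, len(others))):
            neg_query, _ = make_query(other)
            samples.append({"target": name, "query": neg_query, "call": None, "relevant": False})
    return samples


# Toy stand-ins so the sketch runs without any model access.
def toy_make_query(name):
    return f"please run {name}", f"{name}()"


def toy_verify(query, call):
    return call.split("(")[0] in query


data = build_dataset(["take_a_photo", "get_trending_news"], toy_make_query, toy_verify)
print(len(data))
```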

The research team wrote 20 Android API descriptions for training the model. The following is an example of an Android API description:

def get_trending_news(category=None, region='US', language='en', max_results=5):
    """
    Fetches trending news articles based on category, region, and language.

    Parameters:
    - category (str, optional): News category to filter by, by default use None for all categories. Optional to provide.
    - region (str, optional): ISO 3166-1 alpha-2 country code for region-specific news, by default, uses 'US'. Optional to provide.
    - language (str, optional): ISO 639-1 language code for article language, by default uses 'en'. Optional to provide.
    - max_results (int, optional): Maximum number of articles to return, by default, uses 5. Optional to provide.

    Returns:
    - list[str]: A list of strings, each representing an article. Each string contains the article's heading and URL.
    """
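
A description written in this form can also be read programmatically, for example when assembling training prompts. The sketch below uses Python's standard `inspect` module; the `describe` helper and the empty placeholder body are assumptions for illustration, since the article gives only the description itself.

```python
import inspect


def get_trending_news(category=None, region="US", language="en", max_results=5):
    """Fetches trending news articles based on category, region, and language."""
    # Placeholder body for illustration; the article provides no implementation.
    return []


def describe(func):
    """Return the function name, its default arguments, and its docstring."""
    sig = inspect.signature(func)
    defaults = {
        name: p.default
        for name, p in sig.parameters.items()
        if p.default is not inspect.Parameter.empty
    }
    return {"name": func.__name__, "defaults": defaults, "doc": inspect.getdoc(func)}


info = describe(get_trending_news)
print(info["name"], info["defaults"])
```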

Model development and training

This research uses the Google Gemma-2B model as the pretrained model in the framework. The model is trained using two different methods: full model training and LoRA model training.

For full model training, the study uses the AdamW optimizer with a learning rate of 5e-5, 10 warm-up steps, and a linear learning rate scheduler.
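
To make the schedule concrete, here is a minimal sketch of a linear learning rate schedule with warm-up, using the learning rate (5e-5) and warm-up steps (10) stated above. The total number of training steps is not given in the article, so the value of 100 used below is purely hypothetical, and the exact behavior of the scheduler the team used may differ in details.

```python
def linear_schedule_lr(step, total_steps, base_lr=5e-5, warmup_steps=10):
    """Learning rate at a given step: linear warm-up, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warm-up steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr (at the end of warm-up) to 0 (at total_steps).
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 100  # hypothetical; not stated in the article
for step in (0, 5, 10, 55, 100):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```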

LoRA model training uses the same optimizer and learning rate configuration as full model training. The LoRA rank is set to 16, the LoRA alpha parameter is set to 32, and LoRA is applied to the following modules: q_proj, k_proj, v_proj, o_proj, up_proj, down_proj.
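
As a rough illustration of what these LoRA settings imply, the sketch below computes the adapter scaling factor (alpha divided by rank) and the number of trainable adapter parameters per layer. The projection shapes are an assumption about a Gemma-2B-like layer and are not figures from this article; only the rank, alpha, and module names come from the text.

```python
LORA_RANK = 16
LORA_ALPHA = 32

# (input_dim, output_dim) per adapted projection.
# Assumed shapes for illustration only; not taken from the article.
ASSUMED_SHAPES = {
    "q_proj": (2048, 2048),
    "k_proj": (2048, 256),
    "v_proj": (2048, 256),
    "o_proj": (2048, 2048),
    "up_proj": (2048, 16384),
    "down_proj": (16384, 2048),
}


def lora_params(in_dim, out_dim, rank=LORA_RANK):
    """A rank-r adapter adds matrices A (rank x in_dim) and B (out_dim x rank)."""
    return rank * (in_dim + out_dim)


# The low-rank update B @ A is multiplied by alpha / rank before being added.
scaling = LORA_ALPHA / LORA_RANK
per_layer = sum(lora_params(i, o) for i, o in ASSUMED_SHAPES.values())
print("scaling:", scaling)
print("adapter parameters per layer:", per_layer)
```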

For both training methods, the number of epochs is set to 3.

Using the following code, you can run the Octopus-V2-2B model on a single GPU.

from transformers import AutoTokenizer, GemmaForCausalLM
import torch
import time

def inference(input_text):
    start_time = time.time()
    input_ids = tokenizer(input_text, return_tensors="pt").to(model.device)
    input_length = input_ids["input_ids"].shape[1]
    outputs = model.generate(
        input_ids=input_ids["input_ids"],
        max_length=1024,
        do_sample=False)
    generated_sequence = outputs[:, input_length:].tolist()
    res = tokenizer.decode(generated_sequence[0])
    end_time = time.time()
    return {"output": res, "latency": end_time - start_time}

model_id = "NexaAIDev/Octopus-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = GemmaForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

input_text = "Take a selfie for me with front camera"
nexa_query = f"Below is the query from the users, please call the correct function and generate the parameters to call the function.\n\nQuery: {input_text} \n\nResponse:"
start_time = time.time()
print("nexa model result:\n", inference(nexa_query))
print("latency:", time.time() - start_time, "s")
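
The generated text then has to be turned into an actual function call. The sketch below assumes the model emits calls in the form `<nexa_N>(args)<nexa_end>`; this output format is an assumption and is not shown in the article. The parser handles only positional literal arguments and would need to be extended for keyword arguments or nested calls.

```python
import ast
import re

# Assumed output format: a functional token, arguments in parentheses, an end token.
CALL_PATTERN = re.compile(r"(<nexa_\d+>)\((.*?)\)<nexa_end>", re.DOTALL)


def parse_calls(text):
    """Extract (functional_token, positional_args) pairs from generated text."""
    calls = []
    for token, arg_text in CALL_PATTERN.findall(text):
        # Wrap in a tuple so that one or many literal arguments parse uniformly.
        args = ast.literal_eval(f"({arg_text},)") if arg_text.strip() else ()
        calls.append((token, args))
    return calls


single = "<nexa_2>('front')<nexa_end>"
parallel = "<nexa_0>()<nexa_end><nexa_1>(3, 'en')<nexa_end>"
print(parse_calls(single))
print(parse_calls(parallel))
```

Using `ast.literal_eval` rather than `eval` keeps the parser from executing arbitrary code that might appear in model output.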

Evaluation

Octopus-V2-2B demonstrated superior inference speed in benchmark tests: on a single A100 GPU, it is 36 times faster than the "Llama7B + RAG" solution. Additionally, Octopus-V2-2B is 168% faster than GPT-4-turbo, which relies on clustered A100/H100 GPUs. This efficiency gain is attributed to the functional token design of Octopus-V2-2B.

Octopus-V2-2B performs well not only in speed but also in accuracy: its function calling accuracy exceeds that of the "Llama7B + RAG" solution by 31% and is comparable to that of GPT-4 and of GPT-3.5 with RAG.

Interested readers can read the original text of the paper to learn more about the research content.

Statement
This article is reproduced from 机器之心.