Introduction
This article introduces vector streaming in EmbedAnything, a feature designed to optimize large-scale document embedding. By chunking and embedding asynchronously with Rust’s concurrency, it reduces memory usage and speeds up the process. I will also show how to integrate it with the Weaviate vector database for seamless image embedding and search.
In my previous article, Supercharge Your Embeddings Pipeline with EmbedAnything, I discussed the idea behind EmbedAnything and how it makes creating embeddings from multiple modalities easy. In this article, I want to introduce a new feature of EmbedAnything called vector streaming and see how it works with Weaviate Vector Database.
Overview
- Vector streaming in EmbedAnything optimizes embedding large-scale documents using asynchronous chunking with Rust’s concurrency.
- It solves memory and efficiency issues in traditional embedding methods by processing chunks in parallel.
- Integration with Weaviate enables seamless embedding and searching in a vector database.
- Implementing vector streaming involves creating a database adapter, initiating an embedding model, and embedding data.
- This approach offers a more efficient, scalable, and flexible solution for large-scale document embedding.
Table of contents
- What is the problem?
- Our Solution to the Problem
- Example Use Case with EmbedAnything
- Step 1: Create the Adapter
- Step 2: Create the Embedding Model
- Step 3: Embed the Directory
- Step 4: Embed the Query
- Step 5: Query the Vector Database
- Output
- Conclusion
- Frequently Asked Questions
What is the problem?
First, let’s examine the current problem with creating embeddings, especially for large-scale documents. Current embedding frameworks operate in a two-step process: chunking and embedding. First, the text is extracted from all the files and chunks/nodes are created. Then, these chunks are fed to an embedding model with a specific batch size to compute the embeddings. While this happens, both the chunks and the embeddings stay in system memory.
This is not a problem when the files and embedding dimensions are small, but it becomes one when there are many files and you are working with large models and, even worse, multi-vector embeddings. A lot of RAM is then required to hold everything while the embeddings are processed. Also, if this is done synchronously, time is wasted while the chunks are being created, since chunking is not a compute-heavy operation. It would be far more efficient to pass the chunks to the embedding model as they are being made.
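Schematically, the conventional pipeline looks like this; nothing is persisted until every chunk and every embedding is sitting in RAM (`file_paths`, `extract_text`, `split_into_chunks`, `model`, and `vector_db` are hypothetical placeholders, not EmbedAnything APIs):

```python
# Conventional two-step pipeline (schematic).
chunks = []
for path in file_paths:
    chunks.extend(split_into_chunks(extract_text(path)))  # all chunks held in RAM

embeddings = []
for i in range(0, len(chunks), batch_size):
    embeddings.extend(model.embed(chunks[i : i + batch_size]))  # all vectors held in RAM too

vector_db.upsert(embeddings)  # memory is freed only after the full upload
```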
Our Solution to the Problem
The solution is to run chunking and embedding as asynchronous tasks. Using Rust’s concurrency patterns and thread safety, we can spawn dedicated threads for each. This is done with Rust’s MPSC (multi-producer, single-consumer) channel module, which passes messages between threads: chunks are streamed into the embedding thread through a buffer. Once the buffer is full, the thread embeds the chunks and sends the embeddings back to the main thread, which forwards them to the vector database. This ensures no time is wasted on a single operation and there are no bottlenecks. Moreover, the system holds only the buffered chunks and embeddings in memory, and they are erased once they are moved to the vector database.
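To make the pattern concrete, here is a minimal Python sketch of the same producer-consumer idea. EmbedAnything implements this in Rust with an mpsc channel, so treat this only as an illustration; `file_paths`, `extract_text`, `split_into_chunks`, `model`, and `vector_db` are hypothetical placeholders:

```python
import threading
from queue import Queue

def chunk_producer(file_paths, chunk_queue):
    """Producer: chunking is cheap, so it runs ahead of the embedder."""
    for path in file_paths:
        for chunk in split_into_chunks(extract_text(path)):  # hypothetical helpers
            chunk_queue.put(chunk)
    chunk_queue.put(None)  # sentinel: no more chunks

def embed_consumer(chunk_queue, buffer_size):
    """Consumer: embed a full buffer, flush it to the DB, then drop it from memory."""
    buffer = []
    while (chunk := chunk_queue.get()) is not None:
        buffer.append(chunk)
        if len(buffer) == buffer_size:
            vector_db.upsert(model.embed(buffer))  # hypothetical calls
            buffer.clear()  # free memory as soon as the vectors are persisted
    if buffer:  # flush the final, partially filled buffer
        vector_db.upsert(model.embed(buffer))

# A bounded queue plays the role of the stream buffer: if the embedder falls
# behind, the producer blocks instead of piling unprocessed chunks up in RAM.
chunk_queue = Queue(maxsize=1000)
threading.Thread(target=chunk_producer, args=(file_paths, chunk_queue)).start()
embed_consumer(chunk_queue, buffer_size=100)
```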
Example Use Case with EmbedAnything
Now, let’s see this feature in action:
With EmbedAnything, streaming the vectors from a directory of files to the vector database is a simple three-step process.
- Create an adapter for your vector database: This is a wrapper around the database’s functions that lets you create an index, convert metadata from EmbedAnything’s format to the format the database requires, and insert the embeddings into the index. Adapters for the prominent databases have already been created and are available here.
- Initiate an embedding model of your choice: You can choose from different local models or even cloud models. You can also set the configuration, including the chunk size and the buffer size, which controls how many embeddings are streamed at once. Ideally, the buffer should be as large as possible, but it is limited by the system’s RAM.
- Call the embedding function from EmbedAnything: Just pass the directory path to be embedded, the embedding model, the adapter, and the configuration.
In this example, we will embed a directory of images and send the embeddings to the vector database.
Step 1: Create the Adapter
In EmbedAnything, the adapters live outside the library so that the library stays lightweight and you can choose which database to work with. Here is a simple adapter for Weaviate:
```python
from typing import List

import weaviate
import weaviate.classes as wvc

from embed_anything import EmbedData
from embed_anything.vectordb import Adapter


class WeaviateAdapter(Adapter):
    def __init__(self, api_key, url):
        super().__init__(api_key)
        self.client = weaviate.connect_to_weaviate_cloud(
            cluster_url=url, auth_credentials=wvc.init.Auth.api_key(api_key)
        )
        if self.client.is_ready():
            print("Weaviate is ready")

    def create_index(self, index_name: str):
        self.index_name = index_name
        self.collection = self.client.collections.create(
            index_name, vectorizer_config=wvc.config.Configure.Vectorizer.none()
        )
        return self.collection

    def convert(self, embeddings: List[EmbedData]):
        # Map EmbedAnything's EmbedData to Weaviate DataObjects.
        data = []
        for embedding in embeddings:
            properties = embedding.metadata
            properties["text"] = embedding.text
            data.append(
                wvc.data.DataObject(properties=properties, vector=embedding.embedding)
            )
        return data

    def upsert(self, embeddings):
        data = self.convert(embeddings)
        self.client.collections.get(self.index_name).data.insert_many(data)

    def delete_index(self, index_name: str):
        self.client.collections.delete(index_name)


# Start the client and create the index
URL = "your-weaviate-url"
API_KEY = "your-weaviate-api-key"
weaviate_adapter = WeaviateAdapter(API_KEY, URL)

index_name = "Test_index"
if index_name in weaviate_adapter.client.collections.list_all():
    weaviate_adapter.delete_index(index_name)
weaviate_adapter.create_index(index_name)
```
Step 2: Create the Embedding Model
Here, since we are embedding images, we can use the CLIP model (the checkpoint id in the snippet below is an example; swap in the one you want to use):
```python
import embed_anything
from embed_anything import WhichModel

# "openai/clip-vit-base-patch16" is an example CLIP checkpoint.
model = embed_anything.EmbeddingModel.from_pretrained_cloud(
    WhichModel.Clip, model_id="openai/clip-vit-base-patch16"
)
```
Step 3: Embed the Directory
```python
data = embed_anything.embed_image_directory(
    "image_directory",  # path to the folder of images to embed
    embeder=model,  # note: "embeder" is the parameter's spelling in EmbedAnything
    adapter=weaviate_adapter,
    config=embed_anything.ImageEmbedConfig(buffer_size=100),
)
```
Step 4: Embed the Query
```python
query_vector = embed_anything.embed_query(
    ["image of a cat"], embeder=model
)[0].embedding
```
Step 5: Query the Vector Database
```python
response = weaviate_adapter.collection.query.near_vector(
    near_vector=query_vector,
    limit=2,
    return_metadata=wvc.query.MetadataQuery(certainty=True),
)
```
Check the response:
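For instance, you can print each match’s text property and certainty score. A minimal sketch, assuming the weaviate-client v4 response shape and the `text` property set by the adapter above:

```python
for obj in response.objects:
    # "text" was attached to each object by WeaviateAdapter.convert()
    print(obj.properties["text"], obj.metadata.certainty)
```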
Output
Using the CLIP model, we vectorized a whole directory containing pictures of cats, dogs, and monkeys. With the simple query “image of a cat,” we were able to search all the files for images of cats.
Check out the notebook with the full code here on Colab.
Conclusion
I think vector streaming is a feature that will let many engineers opt for a more optimized solution with less tech debt. Instead of using bulky frameworks in the cloud, you can use a lightweight streaming option.
Check out the GitHub repo over here: EmbedAnything Repo.
Frequently Asked Questions
Q1. What is vector streaming in EmbedAnything?
Ans. Vector streaming is a feature that optimizes large-scale document embedding by using Rust’s concurrency for asynchronous chunking and embedding, reducing memory usage and speeding up the process.
Q2. What problem does vector streaming solve?
Ans. It addresses high memory usage and inefficiency in traditional embedding methods by processing chunks asynchronously, reducing bottlenecks and optimizing resource use.
Q3. How does vector streaming work with Weaviate?
Ans. It uses an adapter to connect EmbedAnything with the Weaviate Vector Database, allowing seamless embedding and querying of data.
Q4. What are the steps for using vector streaming?
Ans. Here are the steps:
1. Create a database adapter.
2. Initiate an embedding model.
3. Embed the directory.
4. Query the vector database.
Q5. What are the benefits of vector streaming?
Ans. It offers better efficiency, reduced memory usage, scalability, and flexibility compared to traditional embedding methods.