Getting Started with Vector Search (Part 2)
In Part 1, we set up PostgreSQL with pgvector. Now, let's see how vector search actually works.
An embedding is a list of numbers (a vector) that acts as a compact summary of a piece of content. The distance between two embeddings indicates how similar the underlying content is: a small distance means the items are closely related, and a large distance means they are less related.
Book A: Web Development (Distance: 0.2) ⬅️ Very Similar!
Book B: JavaScript 101 (Distance: 0.3) ⬅️ Similar!
Book C: Cooking Recipes (Distance: 0.9) ❌ Not Similar
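To make the idea of distance concrete, here is a minimal Python sketch. The three vectors are made up for illustration and have only 3 dimensions; real embeddings have hundreds or thousands, and these are not values from the database.

```python
import math

# Toy 3-dimensional "embeddings" (made up for illustration).
web_dev = [0.9, 0.1, 0.0]
javascript = [0.8, 0.3, 0.1]
cooking = [0.0, 0.2, 0.9]


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


near = euclidean_distance(web_dev, javascript)
far = euclidean_distance(web_dev, cooking)
print(f"web_dev vs javascript: {near:.2f}")
print(f"web_dev vs cooking:    {far:.2f}")
```

The two programming topics end up closer to each other than either is to the cooking vector, which is the property a similarity search relies on.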
Now, let's populate our database with some data. We'll extend the Part 1 project with a .env file for API keys and a scripts/ directory for data loading:
pgvector-setup/             # From Part 1
├── compose.yml
├── postgres/
│   └── schema.sql
├── .env                    # New: for API keys
└── scripts/                # New: for data loading
    ├── requirements.txt
    ├── Dockerfile
    └── load_data.py
Let's start with scripts/load_data.py, a script that loads data from external APIs.
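The core of such a loader can be sketched as below. This is not the original script: load_books, to_vector_literal, the shape of the book dicts, and the embed callable are all assumptions made for illustration. The column names (name, metadata, embedding) are the ones used by the queries later in this post.

```python
import json


def to_vector_literal(values):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def load_books(books, embed, cursor):
    """Embed each book and insert it into the items table.

    books  -- iterable of dicts with at least a 'title' key (assumed shape)
    embed  -- callable: text -> list of floats (e.g. a wrapper around an
              embeddings API; hypothetical, supplied by the caller)
    cursor -- DB-API cursor (e.g. from psycopg2) for the Part 1 database
    """
    inserted = 0
    for book in books:
        # Embed the title plus the description when one is available.
        text = book["title"]
        if book.get("description"):
            text += ": " + book["description"]
        cursor.execute(
            "INSERT INTO items (name, metadata, embedding) "
            "VALUES (%s, %s, %s::vector)",
            (book["title"], json.dumps(book), to_vector_literal(embed(text))),
        )
        inserted += 1
    return inserted
```

Passing embed and cursor in as arguments keeps the sketch independent of any particular API client or database driver, and makes it easy to try out with fakes before touching a real database.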
Add your OpenAI API key to .env:
OPENAI_API_KEY=your_openai_api_key
Add a data_loader service to compose.yml:
services:
  # ... existing db service from Part 1
  data_loader:
    build:
      context: ./scripts
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/example_db
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - db
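Inside the container, the loader can pick these two values up from its environment. A minimal sketch, assuming the script reads them through os.environ; read_settings is a hypothetical helper, not a function from the original script.

```python
import os


def read_settings(env=os.environ):
    """Return the two settings the compose file passes to the container."""
    required = ("DATABASE_URL", "OPENAI_API_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        # Fail early with a clear message instead of a confusing error later.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "database_url": env["DATABASE_URL"],
        "openai_api_key": env["OPENAI_API_KEY"],
    }
```

Checking for both variables at startup means a forgotten .env entry shows up immediately rather than halfway through the load.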
Run the loader:
docker compose up data_loader
You should see 10 programming books with their metadata.
Connect to your database:
docker exec -it pgvector-db psql -U postgres -d example_db
Let's peek at what embeddings actually look like:
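-- View first 5 dimensions of an embedding
-- (pgvector casts a vector to real[] directly; the text form '[...]'
--  is not a valid array literal, so a cast via text would fail)
SELECT
    name,
    (embedding::real[])[1:5] AS first_5_dimensions
FROM items
LIMIT 1;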
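If you read an embedding into Python as text, pgvector's bracketed form (for example [0.1,0.2,0.3]) is easy to parse by hand. The sketch below assumes the value arrives as a plain string; parse_vector_text is a hypothetical helper and the sample values are made up.

```python
def parse_vector_text(text):
    """Turn pgvector's text form '[0.1,0.2,...]' into a list of floats."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError(f"not a vector literal: {text!r}")
    inner = body[1:-1].strip()
    return [float(part) for part in inner.split(",")] if inner else []


sample = "[0.012,-0.034,0.056,0.078,-0.09,0.11]"  # made-up values
first_5 = parse_vector_text(sample)[:5]
print(first_5)
```

Note the indexing difference: the SQL slice [1:5] is 1-based and inclusive, while the Python slice [:5] is 0-based and excludes the end index. Both select the first five dimensions.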
Try a simple similarity search:
-- Find 3 books similar to any book about Web
SELECT name, metadata
FROM items
ORDER BY embedding <-> (
    SELECT embedding
    FROM items
    WHERE metadata->>'title' LIKE '%Web%'
    LIMIT 1
)
LIMIT 3;
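What this query does can be mimicked in plain Python: pick a reference vector, compute the distance from every row to it, and keep the closest few. The rows below are made-up stand-ins for the items table, and nearest is a hypothetical helper; the database does the same ranking, only over real embeddings.

```python
import heapq
import math

# Made-up rows standing in for the items table: (name, embedding).
items = [
    ("Web Development Basics", [0.9, 0.1, 0.0]),
    ("JavaScript 101", [0.8, 0.3, 0.1]),
    ("Python Cookbook", [0.6, 0.5, 0.2]),
    ("Cooking Recipes", [0.0, 0.2, 0.9]),
]


def nearest(rows, reference, k=3):
    """Return the k names with the smallest L2 distance to reference."""
    ranked = heapq.nsmallest(k, rows, key=lambda row: math.dist(row[1], reference))
    return [name for name, _ in ranked]


# Stand-in for the subquery: take the first row whose name contains "Web".
reference = next(vec for name, vec in items if "Web" in name)
print(nearest(items, reference))
```

The reference book itself comes back first because its distance to itself is 0. The SQL query behaves the same way, so exclude the reference row if you only want other books.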
Let's break down the operators used in vector search queries:
->> (JSON text operator): extracts the value of a JSON field as text.
Example:
-- If metadata = {"title": "ABC"}, it returns "ABC"
SELECT metadata->>'title' FROM items;
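A rough Python stand-in for this behaviour is sketched below. It assumes the JSON document is an object at the top level; json_text_field is a hypothetical helper that only approximates what PostgreSQL does.

```python
import json


def json_text_field(document, key):
    """Rough Python stand-in for PostgreSQL's  document ->> key."""
    value = json.loads(document).get(key)
    if value is None:
        return None            # missing key or JSON null: SQL NULL
    if isinstance(value, str):
        return value           # strings come back without JSON quotes
    return json.dumps(value)   # numbers, booleans, arrays, objects: JSON text


doc = '{"title": "ABC", "pages": 320}'
print(json_text_field(doc, "title"))  # ABC
print(json_text_field(doc, "pages"))  # 320 (as a string, not a number)
```

The point to remember is that the result is always text, so a numeric field needs a cast before you can compare or sort it as a number.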
<-> (distance operator): returns the Euclidean (L2) distance between two vectors. Smaller values mean the vectors are more similar.
Example:
-- Find similar books
SELECT name, embedding <-> query_embedding AS distance
FROM items
ORDER BY distance
LIMIT 3;
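In the query above, query_embedding is a placeholder for the vector you are searching with. From Python, one way to supply it is as a query parameter cast to vector. The sketch below only builds the SQL and its parameters; build_similarity_query is a hypothetical helper, and the %s placeholder style assumes a driver such as psycopg2.

```python
def build_similarity_query(query_embedding, limit=3):
    """Return (sql, params) for a nearest-neighbour search on items."""
    # pgvector accepts a vector in its bracketed text form, e.g. '[0.1,0.2]'.
    vector_literal = "[" + ",".join(str(float(v)) for v in query_embedding) + "]"
    sql = (
        "SELECT name, embedding <-> %s::vector AS distance "
        "FROM items ORDER BY distance LIMIT %s"
    )
    return sql, (vector_literal, int(limit))
```

With a psycopg2 cursor this could be run as cursor.execute(*build_similarity_query(vec)). Keeping the vector as a parameter rather than pasting it into the SQL string avoids quoting mistakes.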
Stay tuned for Part 3: "Building a Vector Search API"!
Feel free to drop a comment below!