
What Are Vector Databases and Why They Matter

2 weeks ago · 12 min read


Vector databases are databases optimized for storing high-dimensional numerical vectors (called embeddings) and finding the vectors most similar to a query vector. They sound abstract, but their importance is concrete. Most modern AI applications that retrieve relevant context (search, recommendation, retrieval-augmented generation, semantic similarity, deduplication, classification) rely on vector similarity search, and at any meaningful scale that search runs on a vector database. As AI applications have multiplied over the past few years, vector databases have grown from a research curiosity into a meaningful new category of infrastructure that most software organizations now interact with directly or indirectly.

This post walks through what vector databases actually are, why they emerged as a distinct category from traditional databases, how they work at a high level, the major products in the market, the trade-offs between dedicated vector databases and vector extensions to existing databases, and how to think about whether your application needs one.

What "vector" means in this context

A vector, in the AI sense, is a list of numbers that represents the meaning of some piece of content. The list is typically long: hundreds or thousands of numbers that together encode the content’s semantic meaning (individual positions are rarely interpretable on their own). The numbers come from an embedding model: feed text (or images, or audio) into the model, get a vector out.

The key property of embedding vectors: pieces of content that are similar in meaning produce vectors that are mathematically close together in the high-dimensional space. "Dog" and "puppy" produce similar vectors. A sentence about a bank (the financial institution) and a sentence about a bank (the side of a river) produce different vectors, because the embedding model takes the surrounding context into account. Search for "documents about machine learning" and the embedding of that query is close in vector space to embeddings of documents that actually discuss machine learning, even if they don’t use those exact words.

The math underneath is linear algebra: vectors in a high-dimensional space, similarity computed as cosine of the angle between vectors (or related distance metrics). The math has been standard for decades; what’s new is that modern AI models produce vectors that capture genuinely useful semantic meaning, making vector similarity a practical tool for retrieval.
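As a minimal sketch of the cosine measure mentioned above, the function below computes the dot product of two vectors divided by the product of their lengths. The three-number vectors are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", invented for this example.
dog = [0.9, 0.8, 0.1]
puppy = [0.85, 0.75, 0.2]
invoice = [0.1, 0.2, 0.95]

print(cosine_similarity(dog, puppy))    # close to 1: similar direction
print(cosine_similarity(dog, invoice))  # much lower: different direction
```

A value near 1 means the vectors point in nearly the same direction, a value near 0 means they are unrelated, and the length of the vectors does not affect the result.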

Why vector databases became their own category

Traditional databases (relational like Postgres, MySQL, SQL Server; document like MongoDB; key-value like Redis) aren’t designed for high-dimensional vector similarity search. You can store vectors in any of them, but searching for the most-similar vector across millions of records requires comparing against every vector, which gets impractical fast.
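To make the "comparing against every vector" cost concrete, here is a rough sketch of exact (brute-force) search in plain Python. The function name and the sample data are assumptions for illustration. Every query touches all N stored vectors, so the work grows linearly with both the collection size and the number of dimensions.

```python
import heapq
import math


def normalize(v):
    """Scale v to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def brute_force_top_k(query, vectors, k):
    """Exact search: score the query against every stored vector, keep the best k."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(v))), idx)
        for idx, v in enumerate(vectors)
    )
    # Returns (similarity, index) pairs, most similar first.
    return heapq.nlargest(k, scored)


# Hypothetical stored vectors; assumed non-zero.
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
for score, idx in brute_force_top_k([1.0, 0.0], vectors, k=2):
    print(idx, round(score, 3))
```

This is perfectly adequate for small collections, which is why the scale of the data matters so much when deciding whether a dedicated index is needed.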

Vector databases use specialized algorithms (approximate nearest neighbor search) and data structures (HNSW graphs, IVF indexes, product quantization, others) that make similarity search feasible at scale. The trade-off is that approximate nearest neighbor search returns "very close to the most-similar vectors" rather than provably exact answers, but the approximation is close enough for almost every real-world use case and the speed gain on large collections is enormous (often three or four orders of magnitude faster than exact search).
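The idea behind an IVF (inverted file) index can be sketched in a few dozen lines: cluster the vectors, store each vector in the list of its nearest centroid, and at query time scan only the few lists whose centroids are closest to the query. The class below is a toy written for this post, not how any particular product implements it; the names `ToyIVFIndex`, `n_lists`, and `n_probe` are assumptions, and the clustering is a few rounds of plain k-means.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


class ToyIVFIndex:
    def __init__(self, vectors, n_lists, n_iters=5, seed=0):
        rng = random.Random(seed)
        self.vectors = vectors
        # Start from randomly chosen vectors, then refine with k-means rounds.
        self.centroids = rng.sample(vectors, n_lists)
        for _ in range(n_iters):
            self.lists = self._assign()
            self.centroids = [
                mean([vectors[i] for i in members]) if members else centroid
                for centroid, members in zip(self.centroids, self.lists)
            ]
        self.lists = self._assign()

    def _nearest_centroids(self, v, n):
        order = sorted(
            range(len(self.centroids)),
            key=lambda c: sq_dist(v, self.centroids[c]),
        )
        return order[:n]

    def _assign(self):
        """Put each vector's index into the list of its nearest centroid."""
        lists = [[] for _ in self.centroids]
        for i, v in enumerate(self.vectors):
            lists[self._nearest_centroids(v, 1)[0]].append(i)
        return lists

    def search(self, query, k, n_probe):
        """Scan only the n_probe lists whose centroids are closest to the query."""
        candidates = [
            i
            for c in self._nearest_centroids(query, n_probe)
            for i in self.lists[c]
        ]
        candidates.sort(key=lambda i: sq_dist(query, self.vectors[i]))
        return candidates[:k]


# Hypothetical random data: 200 vectors of 8 dimensions.
rng = random.Random(1)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
index = ToyIVFIndex(data, n_lists=10)
print(index.search(data[0], k=5, n_probe=2))
```

With a small `n_probe` the search scans only a fraction of the data and may miss a true neighbor that landed in an unprobed list, which is the approximation described above. Setting `n_probe` equal to `n_lists` scans everything and gives the exact answer again.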

The other capabilities a serious vector database offers:

Hybrid search combining vector similarity with traditional keyword search. Pure vector search misses things; pure keyword search misses things; the combination usually outperforms either alone.
Metadata filtering: each vector is associated with metadata (document ID, author, date, access control labels), and queries can filter on metadata while doing similarity search. Useful for access-controlled or scope-restricted retrieval.
Real-time updates: new vectors get added and become searchable immediately, without rebuilding indexes from scratch.
Scalability: handling billions of vectors, sometimes across distributed clusters.
Multi-tenancy: separating different tenants’ data in shared infrastructure.

These capabilities aren’t impossible to bolt onto traditional databases, but a purpose-built vector database typically does them faster and more cleanly than a general-purpose database with vector capability added.
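Two of the capabilities listed above, metadata filtering and hybrid search, can be illustrated together. The sketch below is one possible approach, not any specific product's method: it filters records on a hypothetical `team` metadata field, ranks the survivors once by vector similarity and once by a naive keyword-overlap score, and merges the two rankings with reciprocal rank fusion (a common fusion technique; the constant 60 is a widely used default). Production systems typically use BM25 or similar for the keyword side.

```python
import math


def cosine(a, b):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def keyword_score(query, text):
    """Naive keyword match: number of query words that appear in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked ID lists: each list contributes 1 / (k + rank) per item."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


def hybrid_search(query_text, query_vec, records, allowed_team, top_k=3):
    # Metadata filter first: only records the caller may see are scored.
    visible = [r for r in records if r["team"] == allowed_team]
    by_vector = sorted(visible, key=lambda r: cosine(query_vec, r["vec"]), reverse=True)
    by_keyword = sorted(
        visible, key=lambda r: keyword_score(query_text, r["text"]), reverse=True
    )
    fused = reciprocal_rank_fusion(
        [[r["id"] for r in by_vector], [r["id"] for r in by_keyword]]
    )
    return fused[:top_k]


# Hypothetical records with made-up two-dimensional vectors.
records = [
    {"id": "a", "team": "support", "text": "reset your password", "vec": [1.0, 0.0]},
    {"id": "b", "team": "support", "text": "billing and invoices", "vec": [0.0, 1.0]},
    {"id": "c", "team": "finance", "text": "reset your password policy", "vec": [1.0, 0.0]},
]
print(hybrid_search("password reset", [0.9, 0.1], records, allowed_team="support"))
```

Record "c" matches the query well but never appears in the results, because the metadata filter removes it before any scoring happens.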

How vector databases work at a high level

The end-to-end flow when you use a vector database in a RAG or search application:

Indexing: documents (or other content) get chunked, fed through an embedding model, and the resulting vectors plus metadata get stored in the vector database.
Index building: behind the scenes, the database builds an approximate-nearest-neighbor index over the vectors. The index makes similarity search fast at query time at the cost of extra memory and index-construction time.
Querying: when a query comes in, the query gets embedded into a vector using the same embedding model, the vector database searches for the most-similar stored vectors, and the matching chunks (with metadata) get returned.
Re-ranking (optional): the top N retrieved chunks may go through a second-pass scoring with a more accurate but slower model to refine the ranking.
Usage: the retrieved chunks get used by the calling application, typically as context for an LLM in a RAG system.

The vector database is the middle layer that makes "find the most relevant chunks" fast. Without it, the same retrieval would be impractical at production scale.
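The indexing and querying steps above can be sketched end to end with an in-memory stand-in. Everything here is an assumption made for illustration: `toy_embed` counts words from a tiny fixed vocabulary and only captures word overlap, whereas a real embedding model captures meaning; `InMemoryStore` does a full scan where a real vector database would use an approximate-nearest-neighbor index.

```python
import math

# Hypothetical fixed vocabulary; a real embedding model needs no such list.
VOCAB = ["password", "reset", "settings", "security", "orders", "ship", "days", "warehouse"]
INDEX = {word: i for i, word in enumerate(VOCAB)}


def toy_embed(text):
    """Stand-in for an embedding model: normalized counts of known words."""
    vec = [0.0] * len(VOCAB)
    for word in text.lower().split():
        i = INDEX.get(word.strip(".,!?"))
        if i is not None:
            vec[i] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def chunk(text, max_words=5):
    """Split text into consecutive pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class InMemoryStore:
    def __init__(self):
        self.rows = []

    def add(self, doc_id, text):
        # Indexing: chunk, embed, store the vector alongside its metadata.
        for n, piece in enumerate(chunk(text)):
            self.rows.append(
                {"doc_id": doc_id, "chunk": n, "text": piece, "vec": toy_embed(piece)}
            )

    def query(self, text, top_k=2):
        # Querying: embed with the same function, rank stored chunks by similarity.
        q = toy_embed(text)
        ranked = sorted(
            self.rows,
            key=lambda r: sum(a * b for a, b in zip(q, r["vec"])),
            reverse=True,
        )
        return ranked[:top_k]


store = InMemoryStore()
store.add("pw", "To reset your password open settings and choose security")
store.add("ship", "Orders ship within two business days from our warehouse")

hits = store.query("how do I reset my password", top_k=1)
print(hits[0]["doc_id"], "|", hits[0]["text"])
```

In a RAG system, the text of the returned chunks would then be placed into the LLM prompt as context.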

The major vector database products

The market has matured over the past few years. The current major options:

Pinecone. Fully managed vector database, one of the oldest commercial products in the category. Cloud-only. Strong in production deployments where managed simplicity is the priority. Common in enterprise RAG and AI applications.

Weaviate. Open-source vector database with cloud and self-hosted options. Strong in production deployments where flexibility and customizability matter.

Chroma. Open-source, developer-friendly, designed for AI application developers. Common in starter projects and for development environments. Less battle-tested at very high scale than the older options.

Qdrant. Open-source vector database with cloud and self-hosted options, written in Rust. Strong performance characteristics; growing adoption.

Milvus / Zilliz. Open-source (Milvus) and managed cloud version (Zilliz Cloud). Strong in large-scale deployments.

LanceDB. Embeddable vector database, particularly suited to local or edge deployments.

pgvector. A Postgres extension that adds vector similarity search to standard Postgres. Not a dedicated vector database but increasingly the right answer for organizations already running Postgres at modest vector-data scale. Postgres + pgvector handles many real applications without needing a separate vector database product.

Elasticsearch / OpenSearch with vector search. Both Elasticsearch and OpenSearch have added vector search capabilities. For organizations already running these for full-text search, the integrated vector capability is often the path of least resistance.

Cloud-provider native vector databases. Amazon OpenSearch Service’s vector engine, Azure AI Search vector capability, Google Cloud Vertex AI Vector Search. Provider-native options for organizations standardized on a specific cloud.

The market segmentation, simplified:

Pure-play managed (Pinecone, Zilliz Cloud) for teams who want zero infrastructure work.
Open-source self-hosted (Weaviate, Qdrant, Milvus, Chroma) for teams who want control.
General-database extensions (pgvector, Elasticsearch) for teams who want to consolidate.
Cloud-native services (OpenSearch Vector Engine, Azure AI Search, Vertex AI Vector Search) for teams committed to a specific cloud.

Dedicated vector database vs. extension to existing database

A live debate in 2026 is whether organizations need a dedicated vector database or whether vector extensions to general-purpose databases (pgvector for Postgres, vector capability in Elasticsearch, vector indexes in MongoDB Atlas) are sufficient.

The case for dedicated vector databases:

Better performance at scale, especially for very large vector collections.
Purpose-built features (hybrid search, advanced filtering, sophisticated re-ranking integration).
Easier multi-tenancy and access control patterns.
Specialized operational tooling.

The case for extensions to existing databases:

One fewer system to operate.
Existing database expertise transfers.
Transactional consistency between vector data and other application data.
Lower cost at modest scale.

The realistic answer for most organizations: pgvector or a similar extension is sufficient for many production applications. Move to a dedicated vector database when specific limitations of the extension approach become binding (typically at multi-million-vector scale, or when specialized features like sophisticated hybrid search become important).
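A quick back-of-the-envelope calculation helps when judging where a collection sits on that scale. The sketch below estimates raw vector storage only, assuming 4-byte float32 values and an illustrative 768 dimensions (both are assumptions; check your embedding model). Index structures and metadata add overhead on top of this figure.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone, excluding index and metadata overhead."""
    return num_vectors * dimensions * bytes_per_value


# Illustrative collection sizes; 768 dimensions is an assumed example value.
for n in (100_000, 5_000_000, 100_000_000):
    gib = raw_vector_bytes(n, 768) / 2**30
    print(f"{n:>11,} vectors x 768 dims ~ {gib:,.1f} GiB")
```

Whether a given footprint fits comfortably inside an existing Postgres instance or calls for a dedicated, possibly distributed system depends on your hardware and latency targets.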

When you need a vector database

Vector databases are the right tool when:

Your application does semantic retrieval at scale: finding “the most relevant” content based on meaning rather than exact keyword matches.
You’re building a RAG system (covered in our RAG explainer).
You’re doing recommendation systems based on content similarity (similar products, similar articles, similar users).
You’re building semantic search that handles paraphrased or differently-worded queries gracefully.
You’re doing deduplication or clustering on large content collections.
Your content scale is past the threshold where in-memory exact similarity search is impractical (typically more than a few thousand items).

Vector databases are less likely to be the right tool when:

Exact keyword search is sufficient for your use case. Traditional search engines (Elasticsearch, Algolia, even Postgres full-text search) may handle this more naturally and more cheaply.
Your data scale is small enough that simple approaches (in-memory vector store, brute-force similarity search) work fine.
Your retrieval requirements are met by existing database query patterns (filtering by metadata, joins, exact matches) without needing semantic similarity.

The mental model: vector databases handle the "find by meaning" problem. If your application has a find-by-meaning problem, a vector database is appropriate infrastructure. If it doesn’t, a vector database is unnecessary complexity.

How to start using a vector database

For a team new to vector databases, the realistic path:

Start with the simplest workable option. For small-to-mid scale, pgvector on Postgres (if you’re already running Postgres) or Chroma (for development simplicity) gets you running quickly without committing to a specific dedicated platform.
Use a managed service if you want zero operational overhead. Pinecone is the most mature managed option; Weaviate Cloud and Zilliz Cloud are also credible.
Pick a use case with a defined scope: a customer-support knowledge base, an internal-documentation search, a product-similarity recommender. Constrained scope makes the first deployment achievable.
Measure quality, not just speed. Vector retrieval can be fast and bad. Set up an evaluation pipeline that measures whether retrieved results are actually relevant to the queries.
Scale carefully. The vector database that works for 100K vectors may not work the same way for 100M. Plan for migration if you expect significant growth.
Tune chunking and embedding choices with the application. The vector database is the substrate; the chunking strategy and embedding model are usually where quality gains come from.
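For the "measure quality, not just speed" step, one simple starting metric is recall@k: of the chunks a human labeled as relevant for a query, what fraction appear in the top k results? The sketch below assumes you have such labels; the function names and the document IDs are hypothetical.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k retrieved results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("each query needs at least one labeled relevant item")
    hits = len(relevant & set(retrieved_ids[:k]))
    return hits / len(relevant)


def mean_recall_at_k(runs, k):
    """Average recall@k over (retrieved_ids, relevant_ids) pairs, one per query."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs]
    return sum(scores) / len(scores)


# Hypothetical evaluation set: what the system returned vs. what was labeled relevant.
runs = [
    (["doc3", "doc7", "doc1"], {"doc3", "doc1"}),
    (["doc9", "doc2", "doc4"], {"doc4"}),
    (["doc5", "doc8", "doc6"], {"doc0"}),
]
print(mean_recall_at_k(runs, k=2))
print(mean_recall_at_k(runs, k=3))
```

Tracking a number like this while changing chunk sizes, embedding models, or index settings shows whether a change actually improved retrieval rather than just making it faster.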

For more context, our RAG piece covers the broader pattern that vector databases enable, and our AI Agents practitioner’s guide covers the agent layer that often uses vector retrieval as a tool.

Frequently Asked Questions

Are vector databases replacing traditional databases?

No. Vector databases are a complementary category, not a replacement. Traditional databases (relational, document, key-value) handle structured data, transactions, and exact queries; vector databases handle semantic similarity search on embedding vectors. Modern applications often use both: a traditional database for the application’s structured data, a vector database (or a vector extension to the existing database) for semantic retrieval needs.

Do I need a vector database for a small RAG application?

For very small applications (under a few thousand documents), no. In-memory vector storage with brute-force similarity search works fine and is simpler. For applications at moderate scale (tens of thousands to millions of documents), a vector database becomes increasingly important. For very large applications (tens of millions or billions of vectors), a vector database is essentially required.

Is pgvector good enough for production, or do I need a dedicated vector database?

For many production applications at modest scale, pgvector is good enough and is often the right choice because it consolidates infrastructure (one fewer system to operate). Dedicated vector databases become important when you hit specific limitations: very large vector collections (multi-million vectors with low-latency requirements), advanced features (sophisticated hybrid search, complex re-ranking), or multi-tenant patterns that require purpose-built infrastructure. The honest recommendation: start with pgvector if you’re already on Postgres; migrate to a dedicated vector database when specific pain points emerge.

What’s the difference between an embedding model and a vector database?

The embedding model converts content (text, images, audio) into vectors. The vector database stores those vectors and provides fast similarity search. The two work together but are different layers: the embedding model is typically a machine learning model (often a smaller specialized model, sometimes a frontier LLM with embedding capability), and the vector database is data infrastructure. Many AI applications use commodity embedding models from OpenAI, Cohere, or open-source options paired with a vector database of choice.

How much do vector databases cost?

Self-hosted open-source options (Weaviate, Qdrant, Milvus, Chroma) are free as software; you pay for the infrastructure you run them on. Managed services charge based on data volume, query volume, and infrastructure tier. Pinecone, for example, has a free tier suitable for development and starter projects, with production pricing typically ranging from tens to thousands of dollars per month depending on scale. pgvector running on existing Postgres infrastructure is essentially free at modest scale. Cloud-provider services (Azure AI Search, Vertex AI Vector Search) follow their respective providers’ pricing models. The right cost analysis is application-specific; ballpark figures should be sanity-checked against your actual data and query volumes.
