A company with 500,000 document fragments and an existing Postgres faces the question we regularly ask ourselves: should we just add pgvector and wrap it up in two days, or deploy Qdrant right away? The answer depends on several criteria rarely found together in one article. Below, we’ve compiled them in one place.
How pgvector and a dedicated vector database differ
pgvector is a PostgreSQL extension that adds a vector data type and similarity operators (L2, cosine, dot product). Vectors reside in the same database as the rest of the transactional data (alongside customer, order, and document tables). This is a major operational advantage: one backup, one set of permissions, one monitoring system.
A dedicated vector database (Qdrant, Weaviate, Milvus) is a separate process designed solely for storing and searching vectors. Its indexing engine (typically HNSW) is optimized for approximate nearest neighbor (ANN) search with very low latency, even on tens of millions of vectors.
The difference isn’t binary; it’s a spectrum of trade-offs. Since version 0.5.0, pgvector supports the HNSW index (previously only IVFFlat), significantly narrowing the performance gap. With 100,000-200,000 vectors and a few queries per second, this gap is negligible today.
Five criteria that really matter
Corpus scale. pgvector with an HNSW index performs well up to about 1-2 million vectors on an average server (16 GB RAM, 8 cores). Beyond this threshold, index build time and memory usage start competing with other database workloads. Qdrant manages memory per collection and can keep part of the index on disk (mmap), which lets it scale to hundreds of millions of vectors without a proportional increase in RAM.
Metadata filtering (payload filtering). This is where pgvector most often loses in practice. In pgvector, a condition such as `WHERE category = 'hr' AND status = 'active'` is either applied after the ANN index scan (post-filtering, which can return fewer rows than requested) or forces an exact scan over the filtered rows that bypasses the vector index; both are costly at scale. Qdrant can build payload indexes on these fields and applies the filter during HNSW traversal, so a condition such as `must: [{ key: "category", match: { value: "hr" } }]` adds little overhead to the query. If your RAG needs to isolate fragments by department, client, or access level, payload filtering is a critical criterion.
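To make the post-filter problem concrete, here is a minimal sketch in plain Python. It uses an exact cosine search as a stand-in for an ANN index, and the rows, field names, and helper functions are all hypothetical. It only illustrates why filtering after the search can return fewer results than requested; it does not model how pgvector or Qdrant are implemented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, rows, k):
    """Exact nearest neighbours by cosine similarity (stand-in for an ANN index)."""
    return sorted(rows, key=lambda r: cosine(query, r["vec"]), reverse=True)[:k]

def post_filter(query, rows, k, predicate):
    # Search first, filter afterwards: matches ranked below the top k are lost.
    return [r for r in top_k(query, rows, k) if predicate(r)]

def pre_filter(query, rows, k, predicate):
    # Filter first, then search only among the matching rows.
    return top_k(query, [r for r in rows if predicate(r)], k)

# Hypothetical toy corpus: two "sales" fragments sit closest to the query.
rows = [
    {"id": 1, "category": "sales", "vec": [1.0, 0.0]},
    {"id": 2, "category": "sales", "vec": [0.9, 0.1]},
    {"id": 3, "category": "hr", "vec": [0.8, 0.2]},
    {"id": 4, "category": "hr", "vec": [0.1, 0.9]},
]

def is_hr(row):
    return row["category"] == "hr"

query = [1.0, 0.0]
post_ids = [r["id"] for r in post_filter(query, rows, 2, is_hr)]
pre_ids = [r["id"] for r in pre_filter(query, rows, 2, is_hr)]
print(post_ids)  # [] : both top-2 hits were "sales" and got filtered out
print(pre_ids)   # [3, 4]
```

With a selective filter, the post-filter variant has to over-fetch (ask for many more than k candidates) to compensate, which is where the extra cost comes from.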
Hybrid queries (vector + full-text). pgvector lacks built-in BM25; you must combine it with Postgres’ tsvector and merge results manually (RRF or custom logic). Since version 1.10, Qdrant has native support for hybrid search with reranking in a single API call. For systems where users enter both natural language questions and document numbers or codes, hybrid search matters. We detail this in the article on hybrid search with BM25 and vectors.
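The manual merge step mentioned above is usually Reciprocal Rank Fusion (RRF). Below is a minimal sketch of it, assuming you already have two ranked lists of document IDs, one from the vector query and one from the `tsvector` query. The function name, the sample IDs, and the constant `k = 60` (a commonly used default) are illustrative choices, not something prescribed by pgvector or Postgres.

```python
from collections import defaultdict

def rrf_merge(rankings, k=60, limit=10):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], str(d)))[:limit]

# Hypothetical results of the two separate queries.
vector_hits = ["doc7", "doc2", "doc9"]
fulltext_hits = ["doc2", "doc4", "doc7"]

merged = rrf_merge([vector_hits, fulltext_hits], limit=3)
print(merged)  # ['doc2', 'doc7', 'doc4']
```

RRF only needs ranks, not scores, which is why it works for combining cosine distances with full-text ranks that live on different scales.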
Self-hosting vs SaaS and GDPR requirements. Both options can be deployed on your own server. pgvector is part of Postgres, so if you already have an on-prem database and retention policies, adding the extension doesn’t change the threat model. Qdrant Cloud (SaaS) requires a data processing agreement (Article 28 GDPR) and a DPO assessment of data residency. Embeddings on their own are generally not treated as PII, but if the payload contains customer content, the storage region matters. Qdrant Community Edition deployed locally has the same GDPR profile as pgvector on your own server. More on GDPR-compliant infrastructure in the article on self-hosted LLM and GDPR.
Operational complexity and TCO. pgvector adds zero new components to the stack: one service, one backup, one team. Qdrant is a separate service (container or process), with its own metrics and updates. In a small team without DevOps, this is a real difference: the cost is not just licensing but also maintenance time. On the other hand, Qdrant exposes ready-made Prometheus endpoints and collection dashboards, simplifying observability at scale.
Table: pgvector vs dedicated vector database
| Criterion | pgvector | Dedicated database (e.g., Qdrant) |
|---|---|---|
| Scale up to 1-2M vectors | good (HNSW, 0.5+) | good |
| Scale beyond 2M vectors | limited (RAM/build-time) | good (mmap, disk segments) |
| Payload filtering | post-ANN or scan | native (pre-ANN) |
| Hybrid search | manual composition (tsvector+RRF) | native (since Qdrant 1.10) |
| New infrastructure components | none (Postgres extension) | separate service |
| Self-hosting GDPR | yes (like all Postgres) | yes (Community Edition) |
| SaaS (cloud) | yes (e.g., Supabase, Neon) | yes (Qdrant Cloud, Pinecone) |
| Backup and migrations | shared with Postgres | separate (snapshots API) |
| Multi-tenant isolation | schemas / RLS | collections per tenant |
| Typical startup cost | 0 (already have Postgres) | deployment time ~1-3 days |
When pgvector is the right choice
pgvector makes sense when you already use Postgres, your corpus stays within 500,000 to 1 million fragments, queries are simple (no complex payload filtering), and you don’t want to add a new component to your infrastructure. Deployment boils down to running `CREATE EXTENSION vector`, adding a vector column, and building an HNSW index. A team can do this in one day. At this scale, query latency is typically 20-80 ms, which is sufficient for most internal knowledge assistants.
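The three deployment steps can be sketched as SQL generated from Python. The table name `document_chunks`, the column name `embedding`, and the 1024 dimensions are assumptions for illustration; `m = 16` and `ef_construction = 64` are pgvector's documented defaults, and `<=>` is its cosine distance operator. The helpers only build strings, so you would still run them through your own Postgres driver.

```python
def pgvector_setup_sql(table="document_chunks", column="embedding",
                       dim=1024, m=16, ef_construction=64):
    """Return the statements for enabling pgvector on an existing table.

    Identifiers are interpolated directly, so pass trusted constants only.
    """
    return [
        # Step 1: enable the extension (once per database).
        "CREATE EXTENSION IF NOT EXISTS vector;",
        # Step 2: add a fixed-dimension vector column.
        f"ALTER TABLE {table} ADD COLUMN {column} vector({dim});",
        # Step 3: build an HNSW index for cosine distance.
        f"CREATE INDEX {table}_{column}_hnsw ON {table} "
        f"USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});",
    ]

def pgvector_search_sql(table="document_chunks", column="embedding", k=5):
    """Return a k-nearest-neighbour query with one driver placeholder (%s)."""
    return (
        f"SELECT id FROM {table} "
        f"ORDER BY {column} <=> %s::vector LIMIT {k};"
    )

for statement in pgvector_setup_sql():
    print(statement)
print(pgvector_search_sql(k=5))
```

The `%s` placeholder follows the psycopg parameter style and would receive the query embedding as a string such as `'[0.1,0.2,...]'`; other drivers use different placeholder syntax.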
pgvector is also a natural choice for semantic search when personal data cannot leave the transactional database: embeddings and payload stay in the same engine with the same permission model.
When to switch to a dedicated database
Signs it’s time for Qdrant or another dedicated database:
- The corpus exceeds 2 million fragments, and HNSW index build time in pgvector exceeds the maintenance window.
- You need payload filtering before ANN search, e.g., client isolation in a multi-tenant system, document status filters, or access level filters.
- You’re deploying hybrid search (vector + BM25) and don’t want to maintain separate result-merging logic.
- You want to isolate collections per product or per client without impacting other collections’ query performance.
- You plan many concurrent write operations (incremental reindexing hourly) and don’t want them to burden the main transactional database.
At scales beyond 5-10 million fragments, a dedicated database is practically the only viable option. We cover this in more detail in the article on preparing corporate data for AI.
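The signs listed above can be written down as a small checklist function. This is a sketch only: the `Requirements` fields and function names are hypothetical, the 2 million threshold is the one quoted in this article, and treating any single sign as sufficient is a simplification you may want to weigh differently.

```python
from dataclasses import dataclass

@dataclass
class Requirements:
    """Hypothetical description of a RAG workload."""
    fragments: int
    pre_ann_payload_filtering: bool = False
    hybrid_search: bool = False
    per_collection_isolation: bool = False
    heavy_concurrent_writes: bool = False

def signs_for_dedicated_db(req: Requirements) -> list[str]:
    """Collect every sign from the list above that applies to this workload."""
    signs = []
    if req.fragments > 2_000_000:
        signs.append("corpus exceeds 2 million fragments")
    if req.pre_ann_payload_filtering:
        signs.append("payload filtering needed before ANN search")
    if req.hybrid_search:
        signs.append("hybrid search without custom merging logic")
    if req.per_collection_isolation:
        signs.append("collections isolated per product or client")
    if req.heavy_concurrent_writes:
        signs.append("frequent concurrent writes or reindexing")
    return signs

def recommend(req: Requirements) -> str:
    # Simplification: any single sign tips the recommendation.
    return "dedicated vector database" if signs_for_dedicated_db(req) else "pgvector"

small = Requirements(fragments=500_000)
multi_tenant = Requirements(fragments=800_000, pre_ann_payload_filtering=True)
print(recommend(small))         # pgvector
print(recommend(multi_tenant))  # dedicated vector database
```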
Migration pattern: from pgvector to Qdrant
If you start with pgvector and grow beyond the threshold after a few months, migration is simpler than it seems. Vectors are floating-point numbers. Exporting from pgvector to JSON format and reimporting into Qdrant is a matter of one ETL script. You map the payload schema once. The entire process for 2 million vectors takes 2-4 hours. The dual-index pattern (old pgvector + new Qdrant running in parallel, traffic shifted gradually) minimizes risk, similar to migrating from API to your own AI model.
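The core of such an ETL script is the row-to-point mapping. The sketch below assumes rows arrive as dictionaries with an `id`, an `embedding` in pgvector's text form (`'[0.1,0.2]'`, which happens to be valid JSON), and any remaining columns treated as payload; the column names and batch size are hypothetical. It builds upsert bodies in the `{"points": [...]}` shape used by Qdrant's points API but deliberately stops short of sending them.

```python
import json
from itertools import islice

def parse_pgvector(text):
    """Parse pgvector's text output, e.g. '[0.1,0.2,0.3]', into floats."""
    return [float(x) for x in json.loads(text)]

def to_point(row):
    """Map one exported row to a Qdrant-style point dict.

    Everything except 'id' and 'embedding' becomes payload. Note that
    Qdrant point IDs must be unsigned integers or UUIDs.
    """
    payload = {k: v for k, v in row.items() if k not in ("id", "embedding")}
    return {"id": row["id"], "vector": parse_pgvector(row["embedding"]), "payload": payload}

def batched(iterable, size):
    """Yield lists of at most `size` items, so each upsert stays small."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

# Hypothetical rows as they might come out of a SELECT on the pgvector table.
rows = [
    {"id": 1, "embedding": "[0.1,0.2]", "category": "hr"},
    {"id": 2, "embedding": "[0.3,0.4]", "category": "sales"},
    {"id": 3, "embedding": "[0.5,0.6]", "category": "hr"},
]

bodies = [{"points": batch} for batch in batched(map(to_point, rows), 2)]
print(len(bodies))             # 2
print(bodies[0]["points"][0])
```

Streaming rows through a generator keeps memory flat, which matters more than raw speed when the export covers millions of vectors.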
FAQ
Is pgvector suitable for production RAG?
Yes, for datasets up to 1-2 million fragments and with an HNSW index (pgvector 0.5+), it’s a fully production-ready solution. Many companies deploy RAG on pgvector because it doesn’t require a new infrastructure component. Limitations appear with advanced payload filtering and large scale; in these cases, Qdrant is worth evaluating.
What are the latency differences between pgvector and Qdrant for large datasets?
At 500,000 vectors, the difference is negligible (both fit within 20-60 ms on a typical server). At 5 million vectors, Qdrant with disk segments (mmap) maintains 30-80 ms, while pgvector may reach 200-500 ms if RAM is insufficient. This strongly depends on server configuration and HNSW parameters (m and ef_construction at build time, ef_search at query time).
Is Qdrant Community Edition GDPR-compliant?
Yes, Qdrant Community Edition run on-premises has the same GDPR profile as pgvector on your own server: data doesn’t leave the infrastructure, and Qdrant’s anonymized usage telemetry can be disabled in the configuration. Qdrant Cloud (SaaS) requires a separate data-residency assessment and a data processing agreement. The key point is that the collection payload shouldn’t contain unencrypted personal data, regardless of the chosen database.
How much does deploying Qdrant in a company cost?
Qdrant Community Edition is open-source software with no licensing fees. Deployment costs include team time (1-3 days for configuration, API integration, and testing), server (for 2 million BGE-M3 1024-dim vectors, about 8-12 GB RAM is needed for the HNSW index), and maintenance. For collections beyond 10 million vectors, consider Qdrant Cloud or a dedicated machine. A total cost of ownership estimate is available in the ROI calculator.
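The 8-12 GB figure can be sanity-checked with simple arithmetic. The sketch below assumes 4-byte float32 values and a 1.5x multiplier for index and metadata overhead; that multiplier is a rule of thumb rather than a measured value, and the function names are illustrative.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Bytes needed for the vectors alone (float32 = 4 bytes per value)."""
    return num_vectors * dim * bytes_per_value

def estimated_ram_gb(num_vectors, dim, overhead=1.5):
    """Rough RAM estimate in GB; `overhead` is an assumed rule-of-thumb factor."""
    return raw_vector_bytes(num_vectors, dim) * overhead / 1e9

raw_gb = raw_vector_bytes(2_000_000, 1024) / 1e9
with_overhead_gb = estimated_ram_gb(2_000_000, 1024)
print(f"{raw_gb:.2f} GB raw")                      # 8.19 GB raw
print(f"{with_overhead_gb:.2f} GB with overhead")  # 12.29 GB with overhead
```

The two results bracket the range quoted above: roughly 8 GB for the raw vectors and about 12 GB once overhead is included. Quantization or on-disk storage would lower the RAM requirement.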
Can pgvector and Qdrant be used simultaneously?
Yes, and it makes sense in architectures where transactional data (short descriptions, metadata) is in pgvector, while large document collections are in Qdrant. The dual-index pattern is also useful during migration: new fragments go to Qdrant, old ones stay in pgvector until full reimport. Both systems can be queried in parallel, and results merged via reranking.