Vector Database Pricing 2026: Pinecone vs Qdrant vs Supabase
A practical 2026 vector database cost comparison — Pinecone, Qdrant, Weaviate, Supabase pgvector, Turbopuffer, and more, with real RAG workload examples.
Vector database pricing in 2026 ranges from nearly free (self-hosted Postgres pgvector on a small VM) to $400+ per month for the same 1-million-vector RAG workload, depending on provider, query rate, and quantization choices. This guide breaks down nine providers across realistic RAG workloads (100k to 100M vectors) so you can pick the right one for your scale. For real-time comparison across your exact numbers, use our Vector DB Cost Estimator.
The vector DB is usually 10-25% of an AI app's total infrastructure bill — small enough to ignore at MVP scale, large enough to dominate decisions at production scale. The good news: the math is more predictable than LLM token costs, because it scales linearly with vectors, dimensions, and queries.
What does a vector database actually charge for?
Three line items appear on every vector DB bill:
- Storage — usually billed per GB-month of indexed data. Index overhead (HNSW typically 1.3-1.5×) means stored bytes are 30-50% larger than raw vectors.
- Reads — billed per million queries, or bundled into a node-hour rate. Hybrid search (vector + keyword) often costs 2× a pure vector query.
- Writes — billed per million upserts. Updating a document means deleting and re-inserting its vectors into the HNSW graph, so frequent updates can dominate the bill.
A fourth hidden item: plan minimums. Most managed providers have a $25-$200/month floor before per-usage billing even kicks in. For tiny experiments, that floor is the entire bill.
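In code, the whole model is a few lines. Here's a minimal sketch with hypothetical placeholder rates (not any specific provider's prices), just to show how the line items combine:

```python
def monthly_cost(
    vectors: int,
    dims: int,
    queries_per_day: int,
    writes_per_day: int,
    bytes_per_value: float = 4.0,      # float32
    index_overhead: float = 1.4,       # HNSW typically 1.3-1.5x
    storage_per_gb: float = 0.30,      # hypothetical $/GB-month
    reads_per_million: float = 8.00,   # hypothetical $/M queries
    writes_per_million: float = 4.00,  # hypothetical $/M upserts
    plan_minimum: float = 25.0,        # the monthly floor
) -> float:
    """Storage + reads + writes, with the plan minimum as a floor."""
    gb = vectors * dims * bytes_per_value * index_overhead / 1024**3
    storage = gb * storage_per_gb
    reads = queries_per_day * 30 / 1e6 * reads_per_million
    writes = writes_per_day * 30 / 1e6 * writes_per_million
    return max(plan_minimum, storage + reads + writes)
```

Note how `max(plan_minimum, ...)` captures the hidden item: at small scale, the usage terms barely matter and the floor is the bill.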
What is the cheapest vector DB at each scale?
The cheapest provider depends sharply on scale. Here's a breakdown across four common RAG workload sizes, using float32 1536-dimension OpenAI-style embeddings:
| Workload | Vectors | Queries/day | Cheapest provider | Approx. monthly |
|---|---|---|---|---|
| Small RAG (proof-of-concept) | 100k | 5,000 | Self-hosted pgvector | $20 (VM only) |
| Small RAG (managed) | 100k | 5,000 | Supabase pgvector | $25 |
| Medium RAG | 1M | 50,000 | Pinecone Serverless | $40-60 |
| Large RAG | 10M | 200,000 | Turbopuffer | $35-80 |
| Enterprise | 100M | 1M | Turbopuffer or self-host | $300-800 |
Turbopuffer is the surprise winner at large scale because its object-storage architecture trades cold-read latency (200-500ms vs 30-80ms warm) for radically cheaper storage. For RAG where queries can wait 500ms, that trade is almost always worth it.
How does Pinecone Serverless pricing actually work?
Pinecone Serverless bills three line items separately, then sums:
- Storage: $0.33 per GB-month of indexed data
- Reads: $8.25 per million read units (1 RU ≈ 1 query × 1KB result)
- Writes: $4.00 per million upserts
A worked example for 1M vectors at 1536 dim with 50k queries/day and 5k writes/day:
```
storage:  1M × 1536 × 4 bytes × 1.4 overhead / 1024³ = 8.0 GB
          8.0 GB × $0.33/GB          = $2.64 / month
reads:    50,000/day × 30            = 1.5M reads / month
          1.5M × $8.25/M             = $12.38 / month
writes:   5,000/day × 30             = 150k writes / month
          0.15M × $4.00/M            = $0.60 / month
total:    $2.64 + $12.38 + $0.60     = $15.62 / month
```
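Plugging the same numbers into the `monthly_cost` sketch from earlier (with Pinecone's published rates and no plan floor) reproduces the total:

```python
# 1M vectors, 1536 dims, 50k queries/day, 5k writes/day at Pinecone's rates
cost = monthly_cost(
    1_000_000, 1536, 50_000, 5_000,
    storage_per_gb=0.33, reads_per_million=8.25,
    writes_per_million=4.00, plan_minimum=0,
)
print(f"${cost:.2f}")  # → $15.62
```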
That's the bare minimum. In practice, metadata and tags add another 10-30% to storage. Still, Pinecone Serverless is genuinely cheap at this scale — the headline price chart looks expensive until you do the math.
The catch: read units scale with the amount of data a query scans, so above ~50M vectors the read pricing dominates. At 10M read units/month against a 50M-vector index, you'd pay $82.50 just for reads. Pod-based Pinecone (or migrating to Qdrant / Turbopuffer) becomes cheaper at that point.
Is Qdrant cheaper than Pinecone?
It depends entirely on query rate.
Qdrant Cloud charges per node-hour, not per query. Their starter Hybrid Cloud node (1GB RAM, 1 vCPU) runs $0.105/hour, about $76/month. You get unlimited queries up to the node's CPU capacity (roughly 50-100 QPS for vector search).
| Scenario | Pinecone Serverless | Qdrant Cloud |
|---|---|---|
| 1M vectors, 10k queries/day | $7 | $76 |
| 1M vectors, 100k queries/day | $40 | $76 |
| 1M vectors, 1M queries/day | $260 | $76 (likely 2 nodes = $152) |
| 10M vectors, 100k queries/day | $90 | $200 |
Pinecone wins on low-query-rate workloads (because storage is cheap). Qdrant wins on high-query-rate workloads (because predictable per-node pricing dominates per-query pricing past a certain threshold).
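You can solve for the crossover directly: find the monthly query volume where per-query billing equals the flat node price. A sketch using the rates quoted above (it ignores Pinecone's storage line and assumes one read unit per query, so treat the result as a rough threshold):

```python
READ_PRICE_PER_M = 8.25  # $/million read units (Pinecone Serverless)
NODE_PRICE = 76.0        # $/month for one Qdrant starter node

# Monthly query volume where per-query cost equals one flat-priced node
break_even = NODE_PRICE / READ_PRICE_PER_M * 1e6
print(f"break-even: {break_even / 30:,.0f} queries/day")  # ~307,000
```

Below roughly 300k queries/day, per-query billing wins; above it, the flat-priced node is cheaper, which matches the table.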
Pro tip: if you're already running Postgres, pgvector on Supabase or Neon is even cheaper than either Qdrant or Pinecone for under 10M vectors at moderate query rates. The trade-off is features (pgvector's HNSW is competitive on recall but lacks some advanced capabilities, like native sparse-dense hybrid search); the upside is operational simplicity (one DB to manage instead of two).
How much can quantization save?
A lot. Precision converts directly to storage cost:
| Precision | Bytes/value | Storage vs float32 | Recall hit |
|---|---|---|---|
| float32 | 4 | 100% | baseline |
| float16 | 2 | 50% | ~0.5% |
| int8 | 1 | 25% | ~5% |
| binary | 0.125 | 3% | ~15% (rerank required) |
For 100M float32 1536-dim vectors, raw storage is about 570GB. Drop to int8 and it's roughly 142GB: at Pinecone's $0.33/GB-month that's $47/month instead of $190/month, a four-figure saving annually.
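A quick sketch to reproduce those figures for any corpus size, using the $0.33/GB-month Pinecone storage rate quoted earlier (raw vector bytes only, before index overhead):

```python
BYTES_PER_VALUE = {"float32": 4, "float16": 2, "int8": 1, "binary": 1 / 8}

def raw_storage(vectors: int, dims: int, precision: str) -> float:
    """Raw vector storage in GB (no index overhead)."""
    return vectors * dims * BYTES_PER_VALUE[precision] / 1024**3

for p in BYTES_PER_VALUE:
    gb = raw_storage(100_000_000, 1536, p)
    print(f"{p:>8}: {gb:7.1f} GB  ${gb * 0.33:8,.2f}/month")
# float32: 572.2 GB $188.83/month ... binary: 17.9 GB $5.90/month
```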
Binary quantization is the most aggressive option but requires a reranking pass with the original float32 vectors (or with a cross-encoder) for production-quality recall. Hosted rerankers such as Cohere's Rerank API and Voyage AI's reranker make this practical.
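Here's a minimal numpy sketch of that two-stage shape: a Hamming-distance scan over packed sign bits to get a shortlist, then a rerank with the original float32 vectors. In production the first stage runs inside the vector DB; this just illustrates the mechanics on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.standard_normal((10_000, 1536)).astype(np.float32)
query = rng.standard_normal(1536).astype(np.float32)

# Stage 1: binary quantization (one sign bit per dimension, 32x smaller)
bits = np.packbits(corpus > 0, axis=1)   # (10_000, 192) uint8
qbits = np.packbits(query > 0)           # (192,) uint8

# Hamming distance = popcount(XOR); keep a generous shortlist
hamming = np.unpackbits(bits ^ qbits, axis=1).sum(axis=1)
shortlist = np.argsort(hamming)[:100]

# Stage 2: rerank the shortlist with the original float32 vectors
scores = corpus[shortlist] @ query
top10 = shortlist[np.argsort(-scores)[:10]]
```

The shortlist size (100 here for a top-10 result) is the knob: a bigger shortlist recovers more recall at the cost of more float32 work.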
When should you use Postgres pgvector instead?
The pgvector decision tree:
- Use pgvector if you have under 10M vectors, under 100 queries/sec, and already run Postgres. The operational simplicity beats any niche feature.
- Use a purpose-built vector DB if you have over 10M vectors, over 1,000 queries/sec, need sparse-dense hybrid search, or are doing serious metadata filtering with high cardinality.
- Use Turbopuffer if you're cost-bound and can tolerate 200-500ms cold reads. Object-storage backing is decisive at large scale.
- Use Weaviate / Qdrant if you need built-in modules (CLIP, multi-vector, multi-tenant ACL) without writing them yourself.
The pgvector ecosystem matured significantly in 2024-2025. Native HNSW indexing, IVFFlat for cold storage, half-precision (halfvec) support, and hybrid search via Postgres full-text make it competitive for most real-world RAG workloads. The Supabase team's pgvector v0.8 benchmarks are within 10-20% of dedicated vector DBs for under-10M-vector workloads.
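For a sense of how simple the setup is, here's a minimal sketch using psycopg 3 and the `pgvector` Python package (table and connection details are placeholders):

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect("dbname=rag")  # placeholder connection string
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(conn)

conn.execute("""
    CREATE TABLE IF NOT EXISTS chunks (
        id bigserial PRIMARY KEY,
        body text,
        embedding vector(1536)
    )
""")
# HNSW index; cosine distance is the usual choice for text embeddings
conn.execute("""
    CREATE INDEX IF NOT EXISTS chunks_embedding_idx
    ON chunks USING hnsw (embedding vector_cosine_ops)
""")
conn.commit()

# Nearest-neighbour query: <=> is pgvector's cosine-distance operator
q = np.random.rand(1536).astype(np.float32)  # stand-in for a real embedding
rows = conn.execute(
    "SELECT id, body FROM chunks ORDER BY embedding <=> %s LIMIT 10", (q,)
).fetchall()
```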
What about MongoDB Atlas Vector Search and Redis Vector?
Both are good "we already use this database" options:
- MongoDB Atlas Vector Search is bundled into Atlas pricing starting at M10 ($57/month). For teams already on MongoDB, the operational and querying integration is genuinely valuable — JSON metadata filtering with vector search in one query.
- Redis Vector is included in Redis Cloud pricing. Sub-millisecond query latency is the headline feature; it's the right choice for ad serving, recommendation, and other ultra-low-latency use cases.
Neither is the cheapest at any specific scale, but both can be the right choice when "consolidate vendors" is more valuable than "minimize line-item cost".
How do I actually pick?
Use this decision sequence:
- Estimate vector count and query rate for the next 12 months, not just MVP day-one. Vector DBs are sticky — migration is painful.
- Estimate quantization tolerance by running a small recall benchmark with int8 vs float32 against your actual reranker (see the sketch after this list). Most teams find ≤2% recall loss is acceptable.
- Pick on total monthly cost at your 12-month target, not headline price. Use our Vector DB Cost Estimator to plug in numbers across all 9 providers in one shot.
- Layer in the qualitative factors: do you need built-in CLIP / multi-tenancy / GDPR EU residency / hybrid search?
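On point 2, the benchmark is a few lines of numpy: quantize a sample of your real embeddings to int8, run top-k against both versions, and measure overlap. Brute force is fine at benchmark sizes (the data below is synthetic; substitute your own vectors and queries):

```python
import numpy as np

def topk(db: np.ndarray, q: np.ndarray, k: int = 10) -> np.ndarray:
    return np.argsort(-(db @ q))[:k]

def int8_recall(db: np.ndarray, queries: np.ndarray, k: int = 10) -> float:
    scale = np.abs(db).max() / 127.0               # symmetric scalar quantization
    db8 = np.round(db / scale).astype(np.int8)
    dequant = db8.astype(np.float32) * scale
    hits = sum(
        len(set(topk(db, q, k)) & set(topk(dequant, q, k))) for q in queries
    )
    return hits / (k * len(queries))

rng = np.random.default_rng(0)
db = rng.standard_normal((50_000, 1536)).astype(np.float32)
qs = rng.standard_normal((100, 1536)).astype(np.float32)
print(f"recall@10: {int8_recall(db, qs):.3f}")
```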
A common 2026 pattern is two-tier storage: hot tier on Pinecone or Qdrant for the past 30 days of content (high query rate), cold tier on Turbopuffer for older archives (rare queries, dirt-cheap storage). This split saves 40-60% on a real production RAG bill.
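A sketch of the routing side of that pattern; the tier clients and their `query` method are placeholders, not real SDK calls:

```python
def two_tier_search(query_vector, hot_tier, cold_tier, k: int = 10,
                    recent_only: bool = False):
    """Query the hot tier always; fan out to the cold archive only when needed.

    hot_tier / cold_tier are placeholder clients; substitute your
    Pinecone/Qdrant and Turbopuffer SDK calls and result types.
    """
    results = hot_tier.query(vector=query_vector, top_k=k)
    if not recent_only:
        # Cold tier can add 200-500ms on a cold read; merge by score
        results += cold_tier.query(vector=query_vector, top_k=k)
    return sorted(results, key=lambda m: m.score, reverse=True)[:k]
```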
Don't over-optimize at MVP scale. The total vector DB bill for a small AI app is probably under $50/month — engineer time spent shaving that bill is engineer time not spent improving retrieval quality, which is a much bigger lever for product success.