Rethinking Vector Search Infrastructure as Systems Grow

By - The Libzter
Posted on April 4, 2026
Posted in AI

I recently worked on a system that was initially built using pgvector, where vector embeddings were stored alongside their corresponding text data in a Postgres database. This approach has some clear advantages.

First, it simplifies data consistency. Inserts, updates, and deletes can be handled in a single transaction, making it straightforward to keep embeddings and source data in sync. Second, it keeps infrastructure minimal—there’s only one database to operate, monitor, and scale.
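To make the single-transaction point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs anywhere, and it stores the embedding as JSON text where pgvector would use a vector column. The table names, the `save_document` helper, and the 3-dimensional embedding are assumptions for illustration, not the schema of the system described here.

```python
import json
import sqlite3

EMBEDDING_DIM = 3  # hypothetical dimension, chosen only for this example


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with text and embeddings side by side."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE document_embeddings ("
        " doc_id TEXT PRIMARY KEY,"
        " embedding TEXT NOT NULL)"  # with pgvector this would be a vector column
    )
    return conn


def save_document(conn, doc_id, body, embedding):
    """Write the text and its embedding together, or write neither."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.execute(
            "INSERT INTO documents (id, body) VALUES (?, ?)", (doc_id, body)
        )
        if len(embedding) != EMBEDDING_DIM:
            # Stand-in for the dimension check a typed vector column performs.
            raise ValueError(f"expected {EMBEDDING_DIM} dimensions")
        conn.execute(
            "INSERT INTO document_embeddings (doc_id, embedding) VALUES (?, ?)",
            (doc_id, json.dumps(embedding)),
        )
```

If the embedding write fails, the text insert is rolled back with it, so the two can never drift apart. That guarantee is what has to be rebuilt by other means once the embeddings move to a separate system.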

However, as the system evolved, some limitations became apparent.

Using a single database to serve multiple workloads with different performance and access characteristics can introduce tension over time. In this case, vector search traffic grew significantly as users performed more queries over tenant data, while transactional workloads (such as account management) remained relatively steady. Despite their different scaling needs, both workloads competed for the same underlying resources (CPU, memory, and I/O). Scaling the database to absorb search traffic means provisioning capacity the transactional side does not need, and heavy search load can degrade performance for everything else on the instance.

The two workloads also differ in what they need from the query layer. pgvector works well for colocated data and straightforward similarity search, but more advanced retrieval patterns, such as hybrid search and integrated ranking pipelines, require additional application-level logic. These features can certainly be implemented on top of Postgres, but Postgres does not provide them as a unified, built-in retrieval system, so the amount of code that must be maintained grows over time.

To address this, I migrated the vector search layer to Pinecone. Managed vector databases provide capabilities like hybrid search (combining dense and sparse vectors) and metadata filtering as part of their core query interface. Offloading these concerns reduced the complexity of the application code and made the search pipeline less brittle, since these features are handled and maintained by the underlying system.

As the system continues to grow, this separation also allows each component to scale according to its own requirements. Postgres remains focused on transactional workloads, while Pinecone handles search-specific access patterns and performance characteristics.

To keep the two systems in sync, I implemented an outbox-based approach. Postgres maintains an outbound events table that records all mutations relevant to the vector index. A background worker continuously polls for unprocessed events (e.g., rows where completed_at IS NULL), processes them in order, and updates Pinecone accordingly. Upon success, the worker marks the event as completed by setting the completed_at timestamp.

This pattern provides durability: an event stays in the table with completed_at unset until the worker has successfully applied it, so updates are not lost even if Pinecone is temporarily unavailable. It also makes the consistency model explicit: the system is eventually consistent, with a clear and recoverable path for propagating changes.
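The outbox flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: sqlite3 stands in for Postgres, `FakeVectorIndex` is a hypothetical in-memory stand-in for the remote index (it is not the Pinecone client), and the column names other than completed_at are invented for the example.

```python
import json
import sqlite3
from datetime import datetime, timezone


class FakeVectorIndex:
    """Hypothetical in-memory stand-in for the remote vector index."""

    def __init__(self):
        self.vectors = {}
        self.available = True  # flip to False to simulate an outage

    def apply(self, op, doc_id, embedding):
        if not self.available:
            raise ConnectionError("vector index unavailable")
        if op == "upsert":
            self.vectors[doc_id] = embedding
        elif op == "delete":
            self.vectors.pop(doc_id, None)
        else:
            raise ValueError(f"unknown op: {op}")


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT NOT NULL);
        CREATE TABLE outbound_events (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            op TEXT NOT NULL,
            doc_id TEXT NOT NULL,
            payload TEXT,
            completed_at TEXT
        );
        """
    )
    return conn


def save_document(conn, doc_id, body, embedding):
    """Write the row and its outbox event in one transaction."""
    with conn:
        # SQLite syntax; Postgres would use INSERT ... ON CONFLICT instead.
        conn.execute(
            "INSERT OR REPLACE INTO documents (id, body) VALUES (?, ?)",
            (doc_id, body),
        )
        conn.execute(
            "INSERT INTO outbound_events (op, doc_id, payload) VALUES ('upsert', ?, ?)",
            (doc_id, json.dumps(embedding)),
        )


def delete_document(conn, doc_id):
    with conn:
        conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
        conn.execute(
            "INSERT INTO outbound_events (op, doc_id) VALUES ('delete', ?)",
            (doc_id,),
        )


def drain_outbox(conn, index, batch_size=100):
    """Apply pending events in id order; return how many were completed."""
    rows = conn.execute(
        "SELECT id, op, doc_id, payload FROM outbound_events "
        "WHERE completed_at IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    done = 0
    for event_id, op, doc_id, payload in rows:
        embedding = json.loads(payload) if payload is not None else None
        try:
            index.apply(op, doc_id, embedding)
        except ConnectionError:
            break  # stop so later events are never applied out of order
        with conn:
            conn.execute(
                "UPDATE outbound_events SET completed_at = ? WHERE id = ?",
                (datetime.now(timezone.utc).isoformat(), event_id),
            )
        done += 1
    return done
```

A real worker would call something like `drain_outbox` in a loop with a sleep or backoff between polls. Note that a crash between applying an event and marking it completed causes that event to be replayed, so this design delivers at least once. That is safe here because upserts and deletes keyed by document ID are idempotent.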

The primary reasons for choosing Pinecone came down to reducing operational and maintenance overhead.

Hybrid search is supported out of the box, allowing dense and sparse signals to be combined within a single query. Previously, this logic existed in the application layer on top of pgvector, which added complexity and ongoing maintenance cost. Moving this responsibility into the database layer simplified the system without sacrificing flexibility.
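For a sense of the kind of application-layer logic involved, here is a small sketch of hybrid scoring as a weighted sum of a dense dot product and a sparse dot product. The weighting scheme, the `alpha` parameter, and the function names are assumptions chosen for illustration. This is neither the code that ran on top of pgvector nor a description of how Pinecone scores results internally.

```python
from typing import Dict, List, Tuple

SparseVector = Dict[int, float]  # token index -> weight


def dot_dense(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def dot_sparse(a: SparseVector, b: SparseVector) -> float:
    # Only indices present in both vectors contribute to the score.
    return sum(weight * b.get(index, 0.0) for index, weight in a.items())


def hybrid_score(
    query_dense: List[float],
    query_sparse: SparseVector,
    doc_dense: List[float],
    doc_sparse: SparseVector,
    alpha: float,
) -> float:
    """alpha=1.0 uses only the dense signal, alpha=0.0 only the sparse one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    dense = dot_dense(query_dense, doc_dense)
    sparse = dot_sparse(query_sparse, doc_sparse)
    return alpha * dense + (1.0 - alpha) * sparse


def hybrid_search(
    query_dense: List[float],
    query_sparse: SparseVector,
    docs: Dict[str, Tuple[List[float], SparseVector]],
    alpha: float = 0.5,
    top_k: int = 3,
) -> List[str]:
    """Brute-force ranking of every document by its hybrid score."""
    scored = [
        (hybrid_score(query_dense, query_sparse, dense, sparse, alpha), doc_id)
        for doc_id, (dense, sparse) in docs.items()
    ]
    # Highest score first; break ties by document ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]
```

Even in this toy form, the application has to own the sparse representation, the weighting, and the ranking. That is the maintenance cost that moves into the database layer when hybrid search is part of the query interface.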

Similarly, metadata filtering is applied during retrieval rather than after. In the previous implementation, deterministic filters were applied post-retrieval, which could spend top-K slots on candidates that were then discarded, leaving fewer than K usable results. By pushing filtering into the query itself, only qualified candidates are considered during retrieval, improving both efficiency and result quality.
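The difference is easy to see in a toy example. The sketch below uses brute-force scoring over a handful of in-memory items and a hypothetical `tenant` field as the metadata filter. Real vector indexes apply filters inside an approximate nearest-neighbor search rather than by scanning, so treat this only as an illustration of the ordering problem.

```python
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k(query: List[float], items: List[Dict], k: int) -> List[Dict]:
    """Return the k items most similar to the query, best first."""
    ranked = sorted(items, key=lambda item: -dot(query, item["vector"]))
    return ranked[:k]


def post_filter_search(query, items, tenant, k):
    """Retrieve first, filter afterwards: may return fewer than k results."""
    return [item["id"] for item in top_k(query, items, k) if item["tenant"] == tenant]


def pre_filter_search(query, items, tenant, k):
    """Filter first, then retrieve: every top-k slot goes to an eligible item."""
    eligible = [item for item in items if item["tenant"] == tenant]
    return [item["id"] for item in top_k(query, eligible, k)]


ITEMS = [
    {"id": "x1", "tenant": "other", "vector": [1.0, 0.0]},
    {"id": "x2", "tenant": "other", "vector": [0.9, 0.0]},
    {"id": "t1", "tenant": "acme", "vector": [0.8, 0.0]},
    {"id": "t2", "tenant": "acme", "vector": [0.5, 0.0]},
    {"id": "t3", "tenant": "acme", "vector": [0.2, 0.0]},
]

if __name__ == "__main__":
    query = [1.0, 0.0]
    print(post_filter_search(query, ITEMS, "acme", k=3))  # ['t1']
    print(pre_filter_search(query, ITEMS, "acme", k=3))   # ['t1', 't2', 't3']
```

With post-filtering, two of the three slots go to another tenant's items and are thrown away. With the filter applied first, all three slots are filled with relevant candidates.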

Overall, pgvector remains a strong choice for simpler systems or when tight coupling between data and embeddings is desirable. But as retrieval requirements become more sophisticated and workloads diverge, a dedicated vector database can provide meaningful advantages in scalability, maintainability, and system clarity.

Libby Louis
