Libby Louis

Rambling thoughts from a curious engineer

Rethinking Vector Search Infrastructure as Systems Grow

I recently worked on a system that was initially built using pgvector, where vector embeddings were stored alongside their corresponding text data in a Postgres database. This approach has some clear advantages. First, it simplifies data consistency. Inserts, updates, and deletes can be handled in a single transaction, making it straightforward to keep embeddings and

Mitigating Bias in Semantic Search

Retrieval-augmented search sounds clean: retrieve, then generate from what you retrieved. In practice, both steps are sensitive to what was in the query string. If a user’s wording encodes race, gender, age, religion, or similar dimensions—and the system blindly embeds that string, filters on it, or asks an LLM to paraphrase it—you get two problems at once: outcomes can track attributes you never meant to

Designing Rate Limiting: API Gateway vs ALB

When building a new system, it’s easy to focus on core functionality first—APIs, data models, and business logic—and leave concerns like rate limiting for later. But rate limiting isn’t just a “nice to have.” It directly impacts system stability, cost control, and tenant isolation. While designing a recent system, I had to decide early on:

RAG: Relationship Graph and Traversal

I have recently been working on AI-native search software built on a RAG framework. The system treats each indexed JSON document as an entity with relationships that point at other entity items via unique IDs. Those links are normalized into a tenant-scoped graph: stored edges, reverse indexes for “who points at whom,” Redis caches

ECS Fargate vs. Kubernetes: Picking the Right Container Orchestrator

If you’re running containers in production, you’ve probably had the “should we just use Kubernetes?” conversation. Maybe you’re already on ECS Fargate and wondering if you’re missing out. Maybe you’re on Kubernetes and envious of teams that don’t have a platform team. Either way, the choice is less about which is “better” and more about

HIPAA on AWS: Building Compliance Into the Architecture

Healthcare-facing search sits in an awkward place: you might not be building an EHR, but queries and indexed content can still look like PHI — names in a search box, provider directories that resemble patient-adjacent workflows, snippets flowing to logs and third-party APIs. HIPAA’s technical safeguards are about access, audit, integrity, and transmission — not

Building a Real-time Generative UI Application

I recently built a real-time, AI-assisted web app that tailors responses and generates the UI based on what users are actually doing in the UI. It streams model output as it’s generated, and it folds in lightweight client telemetry (clicks, scroll velocity, component context) to keep the AI grounded in the moment.

The architecture – simplicity vs control

At a high level, the system is split into a Next.js frontend and an ASP.NET Core backend. The browser talks to the API over REST for requests that

From Edge to LLM: Designing a Serverless AI Pipeline

When building AI-powered applications, infrastructure decisions can have an outsized impact on speed, cost, and maintainability. For one recent project, I designed the system to be serverless-first: a static React frontend hosted on S3 behind CloudFront, and a thin API layer on API Gateway + Lambda. This approach keeps operational overhead minimal, scales automatically, and