Upload
POST /api/upload
Client guards the 5 MB cap; server re-verifies size, MIME, and the %PDF magic bytes. Upstash holds a per-IP fixed window — 3 uploads per day, no exceptions.
Architecture · Live demo
Behind the curtain
A live look at the pipeline you just used — the embeddings, the vector store, the cron that quietly throws everything out after a day. What you see here is what's actually running in production.
Active documents
—
Embedded chunks
—
Avg / document
—
Next cleanup
—
Pipeline
POST /api/upload
Client guards the 5 MB cap; server re-verifies size, MIME, and the %PDF magic bytes. Upstash holds a per-IP fixed window — 3 uploads per day, no exceptions.
pdf-parse · per page
pdf-parse walks the document one page at a time. Each page is normalised and sliced into ~500-token blocks with 50-token overlap, snapping to sentence boundaries when it can.
Vertex · text-embedding-004
Chunks pack into batches under Vertex's 20k-token limit, pessimistically estimated at chars/2. Retries on 429 with exponential backoff, plus a 250 ms breath between batches.
Supabase · pgvector
Vectors land in a dedicated rag schema with RLS enabled (anon role bounced at the door). An ivfflat cosine index handles approximate-nearest-neighbour search.
POST /api/chat
The question is embedded once, fed to the match_chunks RPC scoped to the active document. Top 5 passages return as context, each keeping its page number for citations.
Vertex · Gemini 2.5 Flash
A system prompt enforces 'cite [Source N] or refuse'. Tokens stream back as Server-Sent Events. The client renders citation footnotes inline as the text arrives.
Auto-cleanup
Schedule
—
Retention
—
cascade delete
Next run in
—
A Vercel cron hits /api/cleanup daily with a Bearer token. The endpoint calls rag.cleanup_old_documents('24 hours') — a single SQL function that drops anything older than the cutoff and lets the foreign-key cascade do the rest. No filename logs are kept.
Live data
Filenames are redacted — fellow visitors uploaded these, and that's their business.
| Label | ID | Pages | Chunks | Age | Deletes in |
|---|---|---|---|---|---|
| Loading… | |||||
Stack
For hire
One-time build. I scope it with you, build it on your GCP project, hand over the code and the keys. No retainer, no monthly fee, no “tier” you have to stay subscribed to. The infrastructure is yours from day one — I just got it shipped.
What's included
What I don't do
Scope a project
Send me the rough idea — document count, target users, deadline — and I'll come back with a fixed scope, fixed price, and a build window.