// case study
Ask Periyar
Bilingual Tamil + English RAG system answering grounded questions over a 65,000-chunk corpus — built by an AI Engineer in Chennai.
PythonPineconeGemini FlashFastAPIcustom transliteration dict
Problem
01
Build a question-answering system over 12 Tamil-language books that answers in Tamil or English, never hallucinates outside the source material, and cites where every answer comes from — at production latency and cost.
Architecture
02
architecture diagram
drop SVG/PNG here
What I built
03
- 01A retrieval pipeline over ~65,000 semantic chunks (≈64K Tamil, ≈2K English) using Pinecone hybrid search (BM25 + dense vectors).
- 02A deterministic, zero-LLM PreFilter that handles language detection, prompt-injection blocking, route classification, and Tanglish→Tamil transliteration before any model call — replacing a single 7-decision LLM router.
- 03A slim Flash LLM layer that owns only retrieval-query construction and generation config.
- 04Strict corpus-only grounding, inline source citations ("View Source"), and age-adaptive responses.
Results
04
68% → 100%
Routing accuracy after replacing LLM router with deterministic PreFilter
Zero
LLM calls on the routing / safety path — fully deterministic
Live
In production at askperiyar.ai
65K
Semantic chunks indexed and searchable
Stack
05
PythonPineconeGemini FlashFastAPIcustom transliteration dict
Links