A document QA agent is only useful when it can show its work
Building a RAG demo as a product problem: ingestion, chunking, retrieval, citations, bounded storage, and a mock answer path that reviewers can run without API keys.
The fastest way to make a document chatbot impressive is to make it sound confident.
The fastest way to make it useful is to make it accountable.
That is why the document QA agent is built citations-first. The goal was not only to answer questions about uploaded PDFs or text files, but to show which chunks each answer came from, keep the retrieval path understandable, and let someone run the project without paid API keys.
The pipeline
The app has a simple shape:
- upload a document
- split it into chunks
- store a bounded corpus in memory
- build a lightweight inverted index
- retrieve likely chunks for a question
- generate an answer with citations attached
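The middle of that list (chunking, bounded storage, indexing, retrieval) can be sketched in a few dozen lines. This is a minimal illustration rather than the project's actual code: the names `Chunk`, `BoundedCorpus`, `split_into_chunks`, and `tokenize` are hypothetical, as are the fixed word-count chunking, the oldest-first eviction policy, and the term-overlap scoring.

```python
import re
from collections import Counter, defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc: str    # name of the source document
    index: int  # position of the chunk within that document
    text: str


def tokenize(text):
    """Lowercase the text and return its alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_into_chunks(doc, text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        Chunk(doc, start // max_words, " ".join(words[start:start + max_words]))
        for start in range(0, len(words), max_words)
    ]


class BoundedCorpus:
    """In-memory store holding at most max_chunks chunks (assumed policy: evict oldest)."""

    def __init__(self, max_chunks=500):
        # A deque with maxlen silently drops the oldest item when it is full.
        self.chunks = deque(maxlen=max_chunks)

    def add(self, doc, text):
        self.chunks.extend(split_into_chunks(doc, text))

    def build_index(self):
        """Inverted index: term -> set of chunk positions containing that term."""
        index = defaultdict(set)
        for pos, chunk in enumerate(self.chunks):
            for term in tokenize(chunk.text):
                index[term].add(pos)
        return index

    def retrieve(self, question, top_k=3):
        """Rank chunks by how many distinct question terms they contain."""
        index = self.build_index()  # rebuilt per query to keep the sketch short
        scores = Counter()
        for term in set(tokenize(question)):
            for pos in index.get(term, ()):
                scores[pos] += 1
        # Sort by score, then by position, so ties resolve deterministically.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [self.chunks[pos] for pos, _ in ranked[:top_k]]
```

Each returned `Chunk` carries its document name and position, which is what later lets an answer point back to a specific source. A real index would likely be updated incrementally instead of rebuilt on every query.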
There is an OpenAI-backed path, but the default mode is a deterministic mock generator. Because the same question and the same corpus always produce the same answer, the demo is easier to review and its tests can assert on exact output.
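To make "deterministic mock generator" concrete, here is one way such a generator could look. It is a sketch under assumptions, not the project's implementation: `mock_answer`, `Citation`, the `"document#chunk"` source labels, and the 80-character excerpt limit are all hypothetical. The idea is that the mock only quotes retrieved evidence and tags each quote with a marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    marker: str   # marker that appears in the answer text, e.g. "[1]"
    source: str   # hypothetical "document#chunk" label
    excerpt: str  # the quoted evidence


def mock_answer(question, evidence):
    """Deterministic stand-in for a model call.

    `evidence` is a list of (source, text) pairs from retrieval. The answer
    is assembled only from that evidence, so no API key is needed and the
    same inputs always give the same output.
    """
    if not evidence:
        return {
            "question": question,
            "answer": "No supporting passage was found in the uploaded documents.",
            "citations": [],
        }
    citations = [
        Citation(f"[{number}]", source, text[:80])
        for number, (source, text) in enumerate(evidence, start=1)
    ]
    body = " ".join(f"{c.excerpt} {c.marker}" for c in citations)
    return {
        "question": question,
        "answer": f"Based on the retrieved passages: {body}",
        "citations": citations,
    }
```

Because the output shape matches what a model-backed path would need to return (answer text plus a citation list), the model can be swapped in later without changing the UI or the tests around it.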
Why mock mode is not a shortcut
Mock mode proves that the product surface, retrieval contract, citation builder, and tests work even when no external model is available.
A reviewer can see the important system behavior immediately: documents come in, chunks are created, retrieval returns evidence, and the answer points back to source material. The model can improve later. The accountability pattern should already be there.
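One way to keep that pattern honest is an automated check that every citation marker in an answer resolves to a chunk that actually exists in the corpus. The function below is a hypothetical sketch: `check_citations`, the `[n]` marker format, and the dictionary shapes are assumptions made for illustration.

```python
import re


def check_citations(answer, citations, chunks):
    """Return a list of problems; an empty list means every marker is backed.

    answer:    generated text containing markers such as "[1]"
    citations: maps a marker number (as a string) to a chunk id
    chunks:    maps a chunk id to the chunk text held in the corpus
    """
    problems = []
    markers = re.findall(r"\[(\d+)\]", answer)
    if not markers:
        problems.append("answer has no citation markers")
    for marker in markers:
        chunk_id = citations.get(marker)
        if chunk_id is None:
            problems.append(f"marker [{marker}] has no citation entry")
        elif chunk_id not in chunks:
            problems.append(f"marker [{marker}] points at unknown chunk {chunk_id!r}")
    return problems
```

A check like this runs the same way against mock output and model output, so it can sit in the test suite from the first commit.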
This is the part I keep coming back to with RAG systems. The retrieval quality and citation design determine whether anyone can trust the output. A confident hallucination with no citations is worse than a hedged answer with good source attribution. The system should make it easy to verify the answer, not just easy to accept it.