I Built a RAG System to Chat With Newton's Entire Wikipedia

Source: DEV Community
Most RAG tutorials just say "chunk your PDF and call OpenAI". I wanted to build something more real: a proper pipeline that ingests, cleans, embeds, and serves knowledge from Isaac Newton's Wikipedia page end to end. The result is Newton LLM. You can now ask things like "What are Newton's contributions to calculus?" and get proper answers with sources instead of made-up stuff. Here's how I actually built it and what I learned.

The Problem With Most RAG Demos

Every YouTube RAG tutorial follows the same boring steps: load a PDF, split it into chunks, put the chunks in a vector store, done. But nobody talks about the real issues:

- How do you keep the data fresh when the source changes?
- How do you clean messy web data before embedding?
- How do you separate the ingestion part from the serving part?
- How do you make the whole thing actually deployable?

Newton LLM tries to solve these. It's not just a notebook; it's a small system.

Architecture Overview

The system has two main layers: Data Ingestion Layer
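As a rough illustration of the "clean, then chunk" part of such a pipeline, here is a minimal sketch in Python. The function names and parameters (`clean_text`, `chunk_text`, the chunk size and overlap values) are my own assumptions for the example, not the actual Newton LLM code; real Wikipedia HTML would need heavier cleaning than this.

```python
import re

def clean_text(raw: str) -> str:
    """Strip citation markers like [1] and collapse runs of whitespace."""
    text = re.sub(r"\[\d+\]", "", raw)
    return re.sub(r"\s+", " ", text).strip()

def chunk_text(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

# Example: clean a snippet with a citation marker, then chunk it.
sample = "Newton [1] developed calculus independently of Leibniz. " * 20
cleaned = clean_text(sample)
chunks = chunk_text(cleaned, size=200, overlap=50)
```

Overlapping chunks help because a fact split across a chunk boundary still appears whole in at least one chunk, which makes retrieval more forgiving.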