Expert · ~25 hours

Mini Search Engine

Crawl, index, rank. The whole stack of a search engine in 500 lines.

Python · BeautifulSoup · TF-IDF · Whoosh

Build plan

  1. Crawler with depth limit, robots.txt, and dedup
  2. Tokenizer + stemmer + inverted index
  3. TF-IDF ranking with cosine similarity
  4. FastAPI search endpoint + simple HTML UI
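
Steps 2 and 3 of the plan can be prototyped together in plain Python before bringing in Whoosh. The sketch below is one possible shape, not a reference implementation: the function names (tokenize, build_index, idf, doc_norms, search) are made up for illustration, stemming is left out, and idf = ln(N / df) is just one common weighting variant.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumerics. Stemming is omitted in this sketch."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: index[term][doc_id] = raw term frequency."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def idf(index, term, n_docs):
    """Inverse document frequency, ln(N / df); 0.0 for unseen terms (an assumed variant)."""
    df = len(index.get(term, ()))
    return math.log(n_docs / df) if df else 0.0


def doc_norms(index, n_docs):
    """Precompute the Euclidean length of every document's TF-IDF vector."""
    squared = defaultdict(float)
    for term, postings in index.items():
        weight = idf(index, term, n_docs)
        for doc_id, tf in postings.items():
            squared[doc_id] += (tf * weight) ** 2
    return {doc_id: math.sqrt(total) for doc_id, total in squared.items()}


def search(query, index, norms, n_docs, k=5):
    """Return up to k (doc_id, cosine similarity) pairs, best match first."""
    q_weights = {
        term: tf * idf(index, term, n_docs)
        for term, tf in Counter(tokenize(query)).items()
    }
    q_norm = math.sqrt(sum(w * w for w in q_weights.values()))
    if q_norm == 0:
        return []  # no query term carries any weight

    # Accumulate dot products by walking only the postings of the query terms.
    dots = defaultdict(float)
    for term, q_weight in q_weights.items():
        if q_weight == 0:
            continue
        weight = idf(index, term, n_docs)
        for doc_id, tf in index.get(term, {}).items():
            dots[doc_id] += q_weight * tf * weight

    ranked = [
        (doc_id, dot / (q_norm * norms[doc_id]))
        for doc_id, dot in dots.items()
        if norms.get(doc_id, 0) > 0
    ]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy corpus standing in for crawled pages.
    docs = {
        "a": "the quick brown fox",
        "b": "the lazy dog",
        "c": "quick quick fox jumps",
    }
    index = build_index(docs)
    norms = doc_norms(index, len(docs))
    for doc_id, score in search("quick fox", index, norms, len(docs)):
        print(doc_id, round(score, 3))
```

Walking only the postings lists of the query terms is the point of the inverted index: documents that share no term with the query are never scored. A stemmer would slot into tokenize, and the crawler from step 1 would supply the docs mapping.
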
Stuck?

Open the playground and prototype the trickiest part first. Even 20 lines of working code beats a perfect plan that never runs.
