Build It Yourself
Clone the repo, run the pipeline, pass the ship gate.
The reading is over. Now you build.
The build environment for this track lives in github.com/portofcams/bluewave-school — a separate repository with the FastAPI backend, the baseline RAG primitives, the seed corpus, and the automated ship gate. You clone it, set up a local environment, and run it against the fixtures.
Why a separate repo? Because the build environment is a real Python app with a real RAG pipeline — Chroma, embeddings, Claude calls. It does not belong inside a Next.js marketing/school site. The repo is the thing you actually run; these lessons are the thing you actually read.
What You Will Do
Five commands. If your environment is ready, this takes twenty minutes end to end.
1. Clone
git clone git@github.com:portofcams/bluewave-school.git
cd bluewave-school
2. Add Your Keys
cp .env.example .env
Edit .env and fill in:
- ANTHROPIC_API_KEY — your Anthropic key for Claude Sonnet 4.6
- VOYAGE_API_KEY or OPENAI_API_KEY — at least one embedding provider
- ADMIN_PASSWORD — any string; this gates the /ask and /corpus pages
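If you want to confirm your keys are visible before running anything, a small pre-flight check can save a failed setup run. This is an illustrative sketch, not part of the repo; the key names come from the list above.

```python
import os

# Hypothetical pre-flight check (not part of the repo): verify the keys
# described in .env.example are visible to the current process.
REQUIRED = ["ANTHROPIC_API_KEY", "ADMIN_PASSWORD"]
EMBEDDING = ["VOYAGE_API_KEY", "OPENAI_API_KEY"]

def check_env(env: dict) -> list[str]:
    """Return a list of human-readable problems; empty means ready."""
    problems = [f"missing {k}" for k in REQUIRED if not env.get(k)]
    if not any(env.get(k) for k in EMBEDDING):
        problems.append("no embedding provider: set VOYAGE_API_KEY or OPENAI_API_KEY")
    return problems

if __name__ == "__main__":
    for p in check_env(dict(os.environ)):
        print("WARN:", p)
```

Run it once after editing .env; no warnings means you are clear to continue.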
3. Set Up
scripts/setup.sh
This creates a Python 3.12 virtualenv, installs all dependencies (chromadb, anthropic, voyageai, fastapi, etc.), and runs npm install for the Astro frontend.
4. Ingest the Seed Corpus
scripts/seed-ingest.sh
This ingests the bundled demo corpus — a handful of markdown files covering Anthropic docs, stack docs, and Hawaii prevailing wage rules. You will query against this first, before switching to your own documents.
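Conceptually, "ingest" means splitting each document into chunks, embedding the chunks, and storing them in Chroma. The repo's actual chunking strategy is not shown here (Module 2 revisits it); as a mental model, a naive fixed-size chunker with overlap looks like this:

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Naive fixed-size character chunker with overlap.
    Illustrative only; the repo's real chunking may differ."""
    chunks = []
    step = size - overlap  # each chunk repeats the last `overlap` chars of the previous one
    for start in range(0, max(len(text), 1), step):
        piece = text[start:start + size]
        if piece:
            chunks.append(piece)
    return chunks
```

The overlap exists so a fact straddling a chunk boundary still appears whole in at least one chunk — a tradeoff you will interrogate in Module 2.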
5. Run the Ship Gate
scripts/ship-gate.sh
This runs the Module 1 ship gate against five fixture questions. You should see five PASS lines. If any fail, read the error carefully — the gate is loud on purpose.
If the ship gate fails with "No embedding provider configured," your .env is not loaded. Check that source .env or set -a; source .env; set +a runs before the script — the provided scripts handle this automatically.
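If you are debugging env loading from Python rather than shell, the effect of `set -a; source .env; set +a` can be reproduced with a minimal parser. This is an illustrative stdlib-only sketch (the provided scripts already handle loading in shell):

```python
import os

def load_dotenv_file(path: str = ".env") -> dict[str, str]:
    """Minimal .env parser mirroring `set -a; source .env; set +a`.
    Skips blanks and comments; strips simple surrounding quotes."""
    loaded = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip().strip('"').strip("'")
    os.environ.update(loaded)
    return loaded
```

Call it before importing any module that reads keys at import time — environment variables set after import are invisible to code that already cached them.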
Browse It
Start the dev server:
scripts/dev.sh
The FastAPI app comes up on port 8010, the Astro frontend on port 4321.
- Open http://localhost:4321/login and enter the ADMIN_PASSWORD you set
- Navigate to /ask and ask: "What is the default TTL for prompt caching in the Anthropic API?"
- Inspect the answer, the cited sources, and the metrics (retrieval_ms, generation_ms, cost_usd)
You just ran a full RAG pipeline end to end. Congratulations — this is the worst version of it you will ever ship.
Ingest Your Own Corpus
The seed corpus is a demo. Your own documents are the real thing.
python -m app.rag.cli ingest /path/to/your/docs --corpus mine
Then query against corpus=mine in the /ask page or via CLI:
python -m app.rag.cli ask "your question here" --corpus mine
The Boss Challenge
Once the ship gate passes and you have ingested your own corpus, run the boss challenge:
python content/module-1-baseline/ship-gate.py \
--learner-module reference.module_1 \
  --fixtures content/module-1-baseline/fixtures-boss.jsonl
The boss has ten adversarial questions across five categories: cross-chunk synthesis, definitional drift, multi-hop, needle-in-haystack, and negation. Baseline RAG typically scores between 30% and 50%. Write down your number. It is the scoreboard.
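If you want to tally results yourself, the summary is just a pass rate plus a failure count per category. The record shape below ({"category": ..., "passed": ...}) is an assumption for illustration; the actual ship-gate output format may differ:

```python
import json
from collections import Counter

def summarize(results_jsonl: str) -> tuple[float, Counter]:
    """Score boss-challenge results from JSONL text. Assumes one JSON
    object per line with hypothetical fields "category" and "passed"."""
    passed = 0
    failures = Counter()
    lines = [l for l in results_jsonl.splitlines() if l.strip()]
    for line in lines:
        rec = json.loads(line)
        if rec["passed"]:
            passed += 1
        else:
            failures[rec["category"]] += 1  # tally misses by failure category
    return 100.0 * passed / len(lines), failures
```

The per-category breakdown matters as much as the headline number: it tells you which failure mode (synthesis, drift, multi-hop, needle, negation) each later module should move.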
Do not skip the boss challenge. The score is the only thing that makes future modules honest. When Module 3 claims reranking improves retrieval, you check it against the exact same boss on the exact same corpus. No baseline number, no comparison, no learning.
If You Get Stuck
- Ship gate output is the first thing to read — errors are structured
- The data/costs.jsonl file shows every API call and its cost
- Open a GitHub issue on the bluewave-school repo with the ship-gate output pasted in
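To get a running total of what your experiments are costing, you can fold the cost log into one number. This sketch assumes each line of data/costs.jsonl is a JSON object carrying a cost_usd field (the same field name the /ask metrics display); the real line format may include more:

```python
import json

def total_cost(costs_jsonl: str) -> float:
    """Sum the assumed cost_usd field across JSONL log lines."""
    return sum(
        json.loads(line)["cost_usd"]
        for line in costs_jsonl.splitlines()
        if line.strip()
    )
```

Checking this number after each ship-gate run builds the cost instinct the metrics panel is trying to teach.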
You Are Done With Module 1
Once your ship gate passes 5/5 and you have recorded a boss-challenge baseline, Module 1 is complete. Module 2 starts with: "your chunking is wrong in three specific ways."
Exercises
Why is the build environment in a separate repository instead of embedded in the school?
What does the Module 1 ship gate check?
After running the boss challenge against your own corpus, record your score and pick one of the ten questions the baseline got wrong. What category of failure was it (cross-chunk / drift / multi-hop / needle / negation) and why did the baseline fail?
Hint: Be specific. "Retrieval was bad" is not enough — was the right chunk missing from top-k, was it present but the LLM ignored it, did the query embed to the wrong region?