EdTech AI

EdTech AI agents need to stay aligned to course materials, so students get answers, explanations, and practice that match what’s actually taught (with citations back to the syllabus/content).

This example shows how to tune a retrieval-first RAG system over a course catalog so responses stay grounded and useful for advising and study support.

Dataset / Course Corpus

UCSB 2009–2010 General Catalog (490-page PDF) chunked into 1,304 document chunks containing course descriptions, prerequisites, degree requirements, and policies.

Use Case

An academic advising and course support assistant that answers natural-language student questions (e.g., about prerequisites or the writing requirement), so students do not have to search the catalog manually.

Agent

An “academic advisor” prompt that cites catalog sections, enabling course-aligned summaries and explanations (and extending naturally to generating practice questions and study aids tied to the retrieved sections).
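As a rough illustration of a citation-oriented advisor prompt, the sketch below numbers each retrieved section and asks the model to cite by number. The function name `build_advisor_prompt`, the `title`/`text` field names, and the prompt wording are all assumptions made for this example; they are not taken from the notebook.

```python
def build_advisor_prompt(question, sections):
    """Format retrieved catalog sections into a prompt that asks for citations.

    `sections` is a list of dicts with hypothetical keys "title" and "text".
    """
    # Number each section so the answer can cite it as [1], [2], ...
    context = "\n\n".join(
        f"[{i}] {section['title']}\n{section['text']}"
        for i, section in enumerate(sections, start=1)
    )
    return (
        "You are an academic advisor. Answer using only the catalog sections "
        "below and cite them by bracketed number, e.g. [1]. If the sections "
        "do not contain the answer, say so.\n\n"
        f"Catalog sections:\n{context}\n\n"
        f"Student question: {question}\n"
        "Answer:"
    )


# Placeholder content only; not real catalog text.
example_sections = [
    {"title": "EXAMPLE 101", "text": "Placeholder course description."},
    {"title": "EXAMPLE 102", "text": "Placeholder prerequisite statement."},
]
prompt = build_advisor_prompt("What do I need before EXAMPLE 102?", example_sections)
```

Numbering the sections in the prompt keeps citations checkable: each bracketed number in the answer maps back to exactly one retrieved chunk.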

Objectives

Retrieve the most relevant catalog sections with balanced precision + recall (primary metric: F1), plus strong NDCG@5 and MRR so the first relevant section appears quickly.
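For readers who want to see what these metrics measure, here is a minimal sketch using common textbook definitions with binary relevance for a single query. The notebook's exact definitions and averaging may differ; the function names and the sample chunk IDs are illustrative assumptions.

```python
import math


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved, relevant, k=5):
    """Binary-relevance NDCG over the top k results."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, chunk_id in enumerate(retrieved[:k], start=1)
        if chunk_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical chunk IDs for one query.
retrieved = ["c2", "c7", "c9"]
relevant = {"c7", "c4"}
p, r, f1 = precision_recall_f1(retrieved, relevant)
rr = reciprocal_rank(retrieved, relevant)
ndcg = ndcg_at_k(retrieved, relevant, k=5)
```

MRR is the mean of `reciprocal_rank` over all evaluation queries, which is why it rewards configurations that place the first relevant section near the top.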

Experiment knobs

Chunk size

128 vs 256 tokens (overlap fixed at 32).
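To make the chunk size and overlap knobs concrete, the following sketch splits a token list into fixed-size windows that share 32 tokens with their neighbor. It is an assumption-laden illustration: the function name `chunk_tokens` is hypothetical, and the integer list stands in for whatever tokenizer the actual pipeline uses.

```python
def chunk_tokens(tokens, chunk_size=256, overlap=32):
    """Split `tokens` into windows of `chunk_size` that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Stand-in for a tokenized document of 600 tokens.
tokens = list(range(600))
chunks_256 = chunk_tokens(tokens, chunk_size=256, overlap=32)
chunks_128 = chunk_tokens(tokens, chunk_size=128, overlap=32)
```

With the same overlap, halving the chunk size roughly doubles the number of chunks, so a single course entry is more likely to be split across chunk boundaries.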

Reranker top_n

2 vs 5 documents after initial retrieval.
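The two knobs combine into a small grid of four configurations. The sketch below enumerates that grid and shows what a reranker `top_n` cutoff does to a scored candidate list. The class and function names are hypothetical, the ordering of `top_n` within each chunk size is an assumption, and the scores are made up for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RetrievalConfig:
    chunk_size: int
    chunk_overlap: int
    reranker_top_n: int


def build_grid(chunk_sizes=(256, 128), top_ns=(2, 5), overlap=32):
    """Enumerate every chunk_size x top_n combination with a fixed overlap."""
    return [
        RetrievalConfig(chunk_size, overlap, top_n)
        for chunk_size, top_n in product(chunk_sizes, top_ns)
    ]


def keep_top_n(scored_chunks, top_n):
    """Keep the IDs of the `top_n` highest-scoring (chunk_id, score) pairs."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:top_n]]


grid = build_grid()
# Illustrative reranker scores, not real output.
kept = keep_top_n([("a", 0.2), ("b", 0.9), ("c", 0.5)], top_n=2)
```

Running every configuration against the same evaluation queries is what makes the side-by-side comparison meaningful.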

Key result

256-token chunks (with top_n=2 or 5) won, improving all retrieval metrics vs 128-token chunks: F1 0.434 vs 0.372 (+16.7%), MRR 0.632 vs 0.524 (+20.6%), plus gains in precision/recall/NDCG@5.

Insight

256-token chunks preserve complete course descriptions (course title + prerequisites + units) in one chunk, reducing fragmentation and improving retrieval quality.

Metric	Config 1/2 (256 tokens)	Config 3/4 (128 tokens)	Improvement
Precision	0.373	0.341	+9.4%
Recall	0.520	0.427	+21.8%
F1 Score	0.434	0.372	+16.7%
NDCG@5	0.125	0.105	+19.0%
MRR	0.632	0.524	+20.6%

Comparison of retrieval metrics across chunk sizes (128 vs. 256 tokens) for the UCSB course catalog. The 256-token configuration improved every metric.
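The Improvement column in the table above is consistent with the relative change over the 128-token baseline, (new - baseline) / baseline. A short check using the reported values, where the helper name `relative_improvement` is ours:

```python
def relative_improvement(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100


# (256-token value, 128-token value) as reported in the table.
reported = {
    "Precision": (0.373, 0.341),
    "Recall": (0.520, 0.427),
    "F1 Score": (0.434, 0.372),
    "NDCG@5": (0.125, 0.105),
    "MRR": (0.632, 0.524),
}

improvements = {
    metric: round(relative_improvement(new, baseline), 1)
    for metric, (new, baseline) in reported.items()
}

for metric, pct in improvements.items():
    print(f"{metric}: +{pct}%")
```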

How to Apply This to Your Data

This workflow demonstrates how to operationalize Outcome Engineering for your own EdTech content (syllabi, lecture notes, textbooks, assignment rubrics, LMS pages). By testing chunking and retrieval settings side-by-side and optimizing for F1/MRR, you can reliably:

Align answers to course text (with citations),

Generate practice questions from the top retrieved sections,

Provide step-by-step explanations and study aids grounded in the same material students are graded on.
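Once each configuration has been scored on your own evaluation questions, choosing a winner can be as simple as ranking by F1 and breaking ties with MRR. The sketch below is one possible selection rule, not the notebook's method. The config names are hypothetical, and the values reuse the figures reported above on the assumption that configurations sharing a chunk size scored identically.

```python
# Per-configuration results; names are hypothetical labels.
results = [
    {"name": "config-1", "chunk_size": 256, "f1": 0.434, "mrr": 0.632},
    {"name": "config-2", "chunk_size": 256, "f1": 0.434, "mrr": 0.632},
    {"name": "config-3", "chunk_size": 128, "f1": 0.372, "mrr": 0.524},
    {"name": "config-4", "chunk_size": 128, "f1": 0.372, "mrr": 0.524},
]


def pick_best(results):
    """Return the result with the highest F1, using MRR as a tie-breaker."""
    return max(results, key=lambda result: (result["f1"], result["mrr"]))


best = pick_best(results)
print(f"Best: {best['name']} (chunk_size={best['chunk_size']}, F1={best['f1']})")
```

When two configurations tie on both metrics, `max` keeps the first one listed, so it is worth ordering the list by a secondary preference such as lower cost or latency.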

Access the Notebook

Get it on GitHub
