Rafael COELHO

MARCH 2025

YouTube Content Search

Multi-mode GraphRAG platform that extracts entities and relationships from YouTube transcripts into a Neo4j knowledge graph for multi-hop reasoning — the research that became COELHO Nexus.

Outcome
Four search modes (Search · Video · Channel · Playlist) · multi-hop reasoning via Neo4j knowledge graph · multi-provider LLM routing across Groq, OpenAI, SambaNova, Scaleway.
LangChain · LangGraph · Neo4j · GraphRAG · Llama 3.1 · Gemma 2 · Groq · OpenAI · SambaNova · Scaleway · Streamlit · Python

Executive summary

YouTube Content Search is the research project that became COELHO Nexus. It builds a Knowledge Graph from YouTube video transcripts using AI Agents (LangChain + LangGraph), stores entities and relationships in Neo4j, and exposes four distinct retrieval modes — Search, Video, Channel, Playlist — enabling multi-hop reasoning that vanilla vector RAG can’t do.

LLM inference runs across multiple providers (Groq, OpenAI, SambaNova, Scaleway), giving the system cost-aware fallback and avoiding vendor lock-in. The interface is Streamlit; the agent orchestration is LangGraph.
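As a rough illustration of cost-aware fallback, the sketch below tries providers from cheapest to most expensive and moves on when one fails. It is not the project's actual router: the `Provider` class, the `route` function, the prices, and the stand-in clients are all hypothetical, chosen so the example runs without API keys.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Provider:
    name: str                        # label only, e.g. "groq"
    cost_per_1k_tokens: float        # hypothetical price, not a real quote
    complete: Callable[[str], str]   # sends a prompt, returns the model's text


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the list has errored."""


def route(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers cheapest-first; return (provider name, answer)."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.cost_per_1k_tokens):
        try:
            return provider.name, provider.complete(prompt)
        except Exception as exc:  # rate limit, timeout, outage, ...
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in clients so the sketch runs offline (assumptions, not real SDK calls).
def flaky_client(prompt: str) -> str:
    raise TimeoutError("simulated rate limit")


def echo_client(prompt: str) -> str:
    return f"answer to: {prompt}"


providers = [
    Provider("openai", 0.50, echo_client),
    Provider("groq", 0.05, flaky_client),
    Provider("sambanova", 0.10, echo_client),
]

used, answer = route("Summarize the video", providers)
print(used, "->", answer)
```

In this run the cheapest provider fails, so the request falls through to the next cheapest one. A real router would likely also weigh latency and per-model quality, which this sketch ignores.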

See it deployed

The platform requires API keys for multiple LLM providers and a running Neo4j instance, so you can’t trivially spin it up. These 20 slides are the verifiable record of the system running on real YouTube content: KG entity extraction, multi-hop queries answered, and the four search modes in operation. Navigate with arrows or open fullscreen for the full read.


Four retrieval modes

Mode | What it does | When it matters
Search | AI Agents autonomously locate videos matching context + filters supplied by user | Open-ended discovery: “find videos discussing X”
Video | Extract structured information from a single specified video | Deep-dive into one source
Channel | AI Agents traverse a YouTube channel’s full transcript corpus and reason across it | “What does this creator believe about Y?”
Playlist | Same pattern as Channel but scoped to a curated playlist | Curated topic exploration

Each mode exposes a follow-up question chat: an agent queries the Knowledge Graph and answers with multi-hop reasoning grounded in the graph traversal, rather than relying on a chunk-similarity lottery.

Why Knowledge Graphs beat vanilla vector RAG here

Vector RAG retrieves by semantic similarity (cosine over chunk embeddings). That works for “summarize this video,” but fails on questions like:

  • “Which AI researchers discuss safety in interviews with Lex Fridman?” → requires filtering by speaker AND topic AND venue
  • “What does Karpathy say about LLM evaluation across his last 10 talks?” → requires multi-hop reasoning over (speaker → talks → claims → topics)
  • “Who appears in two or more videos with topic Z?” → relational query, not similarity query

Knowledge Graphs encode these relationships explicitly: X works_for Y, Y located_in Z, A discussed B in C. Cypher queries walk the graph; the LLM grounds its answer in the relationship structure. The tradeoff is extraction quality (entity / relation extraction is harder than chunk embedding) — but the win on multi-hop correctness is substantial.
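To make the relational point concrete, here is a minimal in-memory sketch of the “two or more videos with topic Z” question. It uses plain Python triples rather than Neo4j, and every entity name, relation name (`APPEARS_IN`, `HAS_TOPIC`), and node label is an assumption made for illustration, not the project's actual schema. The Cypher string shows roughly how the same traversal might look against such a hypothetical schema.

```python
from collections import defaultdict

# (subject, relation, object) triples; all names are made up for illustration.
TRIPLES = [
    ("alice", "APPEARS_IN", "video_1"),
    ("alice", "APPEARS_IN", "video_2"),
    ("bob", "APPEARS_IN", "video_2"),
    ("bob", "APPEARS_IN", "video_3"),
    ("carol", "APPEARS_IN", "video_1"),
    ("video_1", "HAS_TOPIC", "ai_safety"),
    ("video_2", "HAS_TOPIC", "ai_safety"),
    ("video_3", "HAS_TOPIC", "robotics"),
]

# Roughly equivalent Cypher for a hypothetical (:Person)-[:APPEARS_IN]->(:Video)
# -[:HAS_TOPIC]->(:Topic) schema. Shown for reference only; not executed here.
CYPHER_EQUIVALENT = """
MATCH (p:Person)-[:APPEARS_IN]->(v:Video)-[:HAS_TOPIC]->(:Topic {name: $topic})
WITH p, count(DISTINCT v) AS n
WHERE n >= 2
RETURN p.name
"""


def speakers_in_many_videos(triples, topic, min_videos=2):
    """Return speakers appearing in at least `min_videos` videos tagged `topic`."""
    # Hop 1: topic -> videos that carry it.
    videos = {s for s, r, o in triples if r == "HAS_TOPIC" and o == topic}
    # Hop 2: those videos -> the speakers who appear in them.
    seen = defaultdict(set)
    for s, r, o in triples:
        if r == "APPEARS_IN" and o in videos:
            seen[s].add(o)
    return sorted(name for name, vids in seen.items() if len(vids) >= min_videos)


result = speakers_in_many_videos(TRIPLES, "ai_safety")
print(result)
```

The answer depends on counting distinct videos per speaker after two hops, which is a structural property of the graph; no amount of embedding similarity between a query and individual transcript chunks encodes that count.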

This proof-of-concept became the architectural backbone of COELHO Nexus, where the same GraphRAG pattern scales to a production agentic RAG platform with adaptive retrieval routing.

Stack

  • LangChain — agent and tool composition primitives
  • LangGraph — multi-agent orchestration with explicit state graphs and retry edges
  • Neo4j — Knowledge Graph storage + Cypher querying for multi-hop traversal
  • LLM providers — Groq (low-latency Llama 3.1, Gemma 2), OpenAI (fallback), SambaNova, Scaleway (cost diversity)
  • Streamlit — interactive interface
  • Python — implementation
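The “explicit state graphs and retry edges” idea from the LangGraph entry above can be sketched in plain Python. This deliberately does not use the LangGraph API: the node names, the `after_extract` routing function, the retry budget, and the fake extraction step are all hypothetical, and serve only to show a conditional edge that loops back to the same node until it succeeds or runs out of attempts.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
MAX_ATTEMPTS = 3  # assumed retry budget


def extract(state: State) -> State:
    """Pretend extraction step: returns entities only from the second attempt on."""
    attempts = state["attempts"] + 1
    entities = ["Neo4j", "LangGraph"] if attempts >= 2 else []
    return {**state, "attempts": attempts, "entities": entities}


def store(state: State) -> State:
    """Pretend storage step: record how many entities were written."""
    return {**state, "stored": len(state["entities"])}


def after_extract(state: State) -> str:
    """Conditional edge: move on, retry extraction, or give up."""
    if state["entities"]:
        return "store"
    if state["attempts"] < MAX_ATTEMPTS:
        return "extract"  # the retry edge loops back to the same node
    return "END"


NODES: Dict[str, Callable[[State], State]] = {"extract": extract, "store": store}
EDGES: Dict[str, Callable[[State], str]] = {
    "extract": after_extract,
    "store": lambda state: "END",
}


def run(entry: str, state: State) -> State:
    """Walk the graph from `entry` until an edge routes to END."""
    node = entry
    while node != "END":
        state = NODES[node](state)
        node = EDGES[node](state)
    return state


final = run("extract", {"attempts": 0, "entities": []})
print(final)
```

Keeping the routing decision in a separate function makes the retry policy visible in one place, which is the main appeal of modeling the agent as an explicit graph instead of nested try/except blocks.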

What this project proves

  • GraphRAG is more than a buzzword — explicit entity-relationship modeling beats vector-only retrieval on multi-hop questions
  • Multi-provider LLM routing was shipping-ready in early 2025 — switching across Groq / OpenAI / SambaNova / Scaleway for production cost control predates the rotator I now run in COELHO Nexus
  • Foundation for COELHO Nexus — every retrieval insight here was absorbed into the Nexus adaptive 3-mode agentic RAG architecture

Source on GitHub →