System Architecture
Understanding how NotebookLLM's frontend, backend, and AI services work together.
High-Level Architecture

Technology Stack
Backend
- FastAPI - Web framework
- PostgreSQL - Database
- Qdrant - Vector store
- Supabase - Auth & Storage
- LlamaIndex - RAG framework
- Procrastinate - Task queue
Frontend
- Next.js 14 - React framework
- TypeScript - Type safety
- Tailwind CSS - Styling
- shadcn/ui - Components
- React Query - Server state
- next-themes - Theming
AI/ML
- Google Gemini - LLM
- Cohere - Reranking
- Sentence Transformers - Embeddings
- Kokoro TTS - Audio
- Langfuse - Observability
Data Flow
Document Ingestion Flow
1. User uploads file via frontend
2. File stored in Supabase (private bucket)
3. Background worker processes document
4. Text extracted, chunked, and embedded
5. Vectors stored in Qdrant
6. Document status updated to COMPLETED
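The chunking step splits extracted text into overlapping windows before embedding. A minimal sketch of that splitter; the chunk size and overlap values here are illustrative, not NotebookLLM's actual settings:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size characters.

    The overlap keeps context across boundaries, so a sentence cut at a
    chunk edge is still fully present in at least one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# 1000 characters, 400-char chunks, 100-char overlap -> 3 chunks
chunks = chunk_text("".join(str(i % 10) for i in range(1000)),
                    chunk_size=400, overlap=100)
```

Each resulting chunk is then embedded and upserted into the vector store along with its document ID.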
Chat Query Flow
1. User sends message via chat interface
2. Backend retrieves relevant document chunks (hybrid search)
3. Chunks reranked using Cohere
4. LLM generates response with citations
5. Response streamed to frontend
6. Message and citations saved to database
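Hybrid search merges a semantic (vector) ranking with a keyword ranking. One common way to combine them, shown here as an illustration rather than LlamaIndex's internal method, is reciprocal rank fusion:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk IDs into one ordering.

    Each chunk scores 1 / (k + rank) per list it appears in; summing
    across lists rewards chunks that rank well under both search modes.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["c3", "c1", "c7"]  # vector-search order
keyword = ["c1", "c9", "c3"]   # keyword-search order
fused = reciprocal_rank_fusion([semantic, keyword])
```

The fused list (here `c1` first, since it ranks highly in both modes) is what gets passed to the Cohere reranker in step 3.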
Content Generation Flow
1. User requests content (podcast/quiz/flashcards)
2. LLM generates content using document context
3. For podcasts: TTS converts script to audio
4. Generated content stored in database
5. Audio files uploaded to public bucket
6. Content available in Studio panel
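Structured outputs keep generation reliable: the LLM is asked to emit JSON matching a fixed schema, which the backend validates before saving. A hedged sketch of what a quiz schema and its validation could look like (field names are illustrative, not NotebookLLM's actual schema):

```python
import json
from dataclasses import dataclass

@dataclass
class QuizQuestion:
    question: str
    choices: list[str]
    answer_index: int

    @classmethod
    def from_json(cls, raw: str) -> "QuizQuestion":
        """Parse one LLM-emitted question and reject malformed output."""
        data = json.loads(raw)
        q = cls(data["question"], data["choices"], data["answer_index"])
        if not 0 <= q.answer_index < len(q.choices):
            raise ValueError("answer_index out of range")
        return q

raw = ('{"question": "Which store holds the vectors?", '
       '"choices": ["PostgreSQL", "Qdrant"], "answer_index": 1}')
question = QuizQuestion.from_json(raw)
```

Validating before the database write means a malformed generation fails loudly instead of storing a broken quiz.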
Key Components
RAG Pipeline
Retrieval-Augmented Generation powered by LlamaIndex. Combines semantic and keyword search with reranking for accurate, cited responses.
Content Generation
AI-powered generation of podcasts, quizzes, flashcards, and mind maps using structured outputs for reliable results.
Background Processing
Procrastinate-based task queue handles long-running operations like document processing and content generation.
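In this pattern the web request only enqueues a job and returns; a separate worker process pulls the job and runs it. A minimal in-memory sketch of that defer-and-process pattern (not Procrastinate's real API, which stores jobs in PostgreSQL):

```python
import queue

tasks = {}            # registered task name -> function
jobs = queue.Queue()  # stands in for the PostgreSQL-backed job table

def task(fn):
    """Register a function so it can be deferred by name."""
    tasks[fn.__name__] = fn
    return fn

def defer(name: str, **kwargs) -> None:
    """Enqueue a job; the caller (the web request) returns immediately."""
    jobs.put((name, kwargs))

def run_worker_once():
    """One iteration of the worker loop: pull a job and execute it."""
    name, kwargs = jobs.get_nowait()
    return tasks[name](**kwargs)

@task
def process_document(document_id: str) -> str:
    # extract, chunk, embed ... then mark the row COMPLETED
    return f"{document_id}: COMPLETED"

defer("process_document", document_id="doc-42")
result = run_worker_once()
```

Because the queue lives in PostgreSQL in the real system, jobs survive restarts and can be retried if a worker dies mid-task.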
Authentication
Supabase Auth with JWT validation. Users provisioned automatically on first login.
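JWT validation can happen entirely server-side with the project's signing secret. The exact algorithm and claims checked depend on the Supabase project configuration; this self-contained HS256 sketch is only illustrative:

```python
import base64
import hashlib
import hmac
import json

def b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def verify_hs256_jwt(token: str, secret: bytes) -> dict:
    """Check an HS256 JWT signature and return its claims, or raise."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))

# Build a demo token with a throwaway secret, then verify it
secret = b"demo-secret"
header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({"sub": "user-1"}).encode())
sig = b64url_encode(hmac.new(secret, f"{header}.{payload}".encode(),
                             hashlib.sha256).digest())
claims = verify_hs256_jwt(f"{header}.{payload}.{sig}", secret)
```

Once the signature checks out, the `sub` claim identifies the user; if no matching row exists yet, the backend creates one, which is the "provisioned on first login" behavior described above.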