// LESSON 38

RAG Systems

B2

Retrieval-Augmented Generation (RAG)

RAG combines a retrieval system with an LLM to answer questions using up-to-date or private knowledge.

RAG Architecture

  1. Ingestion: Documents are split into chunks and embedded into vectors
  2. Storage: Vectors stored in a vector database (Pinecone, Weaviate, pgvector)
  3. Retrieval: User query is embedded; nearest vectors are retrieved
  4. Generation: Retrieved context + user query is sent to the LLM
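The four steps above can be sketched end to end. This is a minimal toy, not a production pipeline: the `embed` function here is just word counts standing in for a real embedding model, and the in-memory list stands in for a vector database; all names (`embed`, `retrieve`, `store`) are illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector.
    # Real systems use a neural embedding model instead.
    return Counter(text.lower().replace(".", " ").replace("?", " ").split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2. Ingestion + storage: embed chunks into an in-memory "store".
chunks = [
    "RAG combines retrieval with generation.",
    "Vector databases support similarity search.",
    "Chunking splits documents into pieces.",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

# Step 3. Retrieval: embed the query, rank stored chunks by similarity.
def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Step 4. Generation: retrieved context + query go to the LLM as one prompt.
query = "How does similarity search work?"
context = "\n".join(retrieve(query))
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

Note that the LLM call itself is left out: in step 4 the assembled `prompt` would simply be sent to whichever model the system uses.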

Key Terms

embedding: a vector representation of text capturing semantic meaning
vector database: a database optimized for similarity search on embeddings
chunking: splitting documents into smaller pieces for embedding
semantic search: finding documents by meaning, not just keyword match
grounding: providing the LLM with factual context to reduce hallucinations
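Chunking is often done with a fixed window and a small overlap, so that a sentence cut at a chunk boundary still appears whole in the next chunk. A minimal sketch, assuming word-based splitting (the `size` and `overlap` values are illustrative; real pipelines often split on sentences or tokens instead):

```python
def chunk_text(text: str, size: int = 20, overlap: int = 5) -> list[str]:
    # Slide a window of `size` words forward by (size - overlap) words,
    # so consecutive chunks share `overlap` words at the boundary.
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + size]
        if piece:
            chunks.append(" ".join(piece))
        if start + size >= len(words):
            break
    return chunks

# A 50-word document becomes three overlapping 20-word chunks.
doc = " ".join(f"word{i}" for i in range(50))
print(len(chunk_text(doc)))
```

The overlap trades a little extra storage for better retrieval: a fact that straddles a boundary is still retrievable from at least one chunk.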
// TERMINAL CHALLENGE

Test Yourself

Q1. What problem does RAG solve that a standalone LLM cannot?
Q2. What is an 'embedding' in a RAG system?
Q3. What is 'chunking' in the RAG ingestion pipeline?
Q4. What does 'grounding' an LLM mean?
Q5. Complete: 'The user query is ___ and compared against stored vectors to find the most relevant chunks.'