🔗 LangChain RAG Cheatsheet

📄 Load → ✂️ Split → 🧮 Embed → 🗄️ Store → 🔍 Retrieve → 📝 Prompt → 🤖 LLM → 💬 Answer

📦 INSTALL & SETUP

terminal
pip install langchain langchain-openai \
    langchain-text-splitters langchain-chroma \
    langchain-community chromadb

imports
import os
os.environ["OPENAI_API_KEY"] = "sk-..."

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_community.document_loaders import TextLoader
⚠ Old import paths such as langchain.schema and langchain.text_splitter are deprecated; import from langchain_core and langchain_text_splitters instead.

📄 DOCUMENT LOADERS

langchain_community.document_loaders
from langchain_community.document_loaders import TextLoader

loader = TextLoader("data/manual.txt", encoding="utf-8")
docs = loader.load()  # -> list[Document]

All Options

TextLoader               Plain .txt files
PyPDFLoader              PDF files, one Document per page
UnstructuredPDFLoader    Complex PDFs with tables / images
CSVLoader                CSV rows, one Document per row
JSONLoader               JSON with jq-style content extraction
WebBaseLoader            Scrape a URL (requires bs4)
DirectoryLoader          Glob a folder, auto-detect by extension
Docx2txtLoader           Microsoft Word .docx files
WikipediaLoader          Wikipedia articles by title

✂️ TEXT SPLITTERS

langchain_text_splitters
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    separators=["\n\n", "\n", ". ", " ", ""],
)
chunks = splitter.split_documents(docs)

All Options

RecursiveCharacterTextSplitter   Default choice: tries paragraph, then sentence, then word boundaries
CharacterTextSplitter            Simple fixed-size split on one delimiter
TokenTextSplitter                Split by exact token count (tiktoken)
MarkdownHeaderTextSplitter       Split .md by # headings
HTMLHeaderTextSplitter           Split HTML by h1-h4 tags
PythonCodeTextSplitter           Split by class / function
RecursiveJsonSplitter            Split large JSON objects
SemanticChunker                  Embedding-based boundaries (expensive; lives in langchain_experimental)
💡 Chunk sizes (in characters): 200-500 (FAQ-style lookups), 500-1000 (balanced), 1000-2000 (coarse, more context per chunk). Overlap: 10-20% of chunk_size.
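
To make the size/overlap trade-off concrete, here is a library-free sketch of fixed-size chunking with overlap. It is not how RecursiveCharacterTextSplitter works internally (that splitter also honours the separators list); sliding_chunks is a hypothetical helper that only shows how chunk_size and chunk_overlap determine the stride and the number of chunks.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    stride = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical 1200-character input made of repeating digits
text = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(text, chunk_size=500, chunk_overlap=50)
print(len(chunks), [len(c) for c in chunks])
```

A larger overlap shrinks the stride, so the same text yields more chunks and more tokens to embed; that is the cost side of the 10-20% guideline.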

🧮 EMBEDDINGS

langchain_openai
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",  # 1536 dims
    # dimensions=512,                # optional dimensionality reduction
)

Model                     Dims     $ / 1M tokens
text-embedding-3-small    1536     $0.02
text-embedding-3-large    3072     $0.13
text-embedding-ada-002    1536     $0.10 (legacy)
HuggingFace (local)       varies   Free
Cohere embed-v3           1024     $0.10
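
The price column turns into a one-off indexing cost with simple arithmetic. The sketch below uses the OpenAI prices from the table above; the corpus size and the ratio of 4 characters per token are assumptions for illustration (a rough rule of thumb for English text), so treat the result as an order-of-magnitude estimate rather than a bill.

```python
PRICE_PER_1M_TOKENS = {  # USD, copied from the table above
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "text-embedding-ada-002": 0.10,
}


def embedding_cost_usd(total_chars: int, model: str, chars_per_token: float = 4.0) -> float:
    """Estimate the cost of embedding total_chars characters once."""
    tokens = total_chars / chars_per_token  # assumed ratio, not a tokenizer count
    return tokens / 1_000_000 * PRICE_PER_1M_TOKENS[model]


corpus_chars = 40_000_000  # hypothetical corpus: roughly 40 MB of plain text
for model in PRICE_PER_1M_TOKENS:
    print(f"{model}: ${embedding_cost_usd(corpus_chars, model):.2f}")
```

Chunk overlap means some text is embedded more than once, so the real figure will sit somewhat above this estimate.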

🗄️ VECTOR STORES

langchain_chroma
from langchain_chroma import Chroma

vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db",
    collection_name="my_collection",
    collection_metadata={"hnsw:space": "cosine"},
)

All Options

Chroma                Local dev, easy setup, persistent
FAISS                 Ultra-fast, no server, large datasets
Pinecone              Managed cloud, auto-scaling
Qdrant                Cloud / self-hosted, rich filtering
Weaviate              Hybrid search built-in
PGVector              Postgres extension, reuses an existing Postgres instance
InMemoryVectorStore   Built into langchain-core, testing only
💡 Distance metrics (hnsw:space): l2 (Euclidean; Chroma's default when unset), cosine (set explicitly above; a good fit for normalised embeddings), ip (inner product)
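
A library-free sketch of the three metrics, showing why the choice matters less once embeddings are normalised to unit length: for unit vectors, squared L2 distance equals 2 - 2 * (inner product), so all three produce the same ranking. The vectors and helper names below are made up for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def l2_squared(a, b):
    # Squaring does not change the ordering of distances
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = normalise([1.0, 2.0, 0.5])
docs = {
    "a": normalise([1.0, 2.1, 0.4]),
    "b": normalise([2.0, 0.1, 0.0]),
    "c": normalise([0.2, 1.0, 1.5]),
}

by_cosine = sorted(docs, key=lambda k: cosine_distance(query, docs[k]))
by_l2 = sorted(docs, key=lambda k: l2_squared(query, docs[k]))
by_ip = sorted(docs, key=lambda k: -dot(query, docs[k]))  # larger product = closer
print(by_cosine, by_l2, by_ip)
```

With unnormalised vectors the rankings can diverge, because l2 and ip are then sensitive to vector length while cosine is not.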

🔍 RETRIEVERS

retriever = vectorstore.as_retriever(
    search_type="similarity",  # or "mmr"
    search_kwargs={"k": 5},
)

docs = retriever.invoke("my question")

Search Types

similarity                   Top-k nearest neighbours by the store's distance metric (default)
mmr                          Maximal Marginal Relevance: balances relevance and diversity
similarity_score_threshold   Only return docs above a score cutoff (score_threshold in search_kwargs)
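
MMR is easiest to read as a greedy loop: at each step, pick the candidate with the best trade-off between similarity to the query and similarity to what has already been picked. The sketch below follows that usual formulation in plain Python; mmr_select and the toy 2-D vectors are hypothetical, and lambda_mult is named after the search_kwargs key of the same name (1.0 means pure relevance, lower values favour diversity).

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def mmr_select(query, candidates, k, lambda_mult=0.5):
    """Greedily pick k candidate indices, trading relevance against redundancy."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            # Highest similarity to anything already chosen (0 if nothing is chosen yet)
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [
    [1.0, 0.05],  # 0: very relevant
    [1.0, 0.06],  # 1: near-duplicate of 0
    [0.7, 0.7],   # 2: less relevant, but different
]
relevance_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
diverse = mmr_select(query, candidates, k=2, lambda_mult=0.3)
print(relevance_only, diverse)
```

With lambda_mult=1.0 the near-duplicate is returned alongside the top hit; lowering it swaps the duplicate for the more distinct candidate.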

Advanced Retrievers

BM25Retriever                    Keyword-based, no embeddings needed
EnsembleRetriever                Combine BM25 + vector (hybrid)
MultiQueryRetriever              LLM generates query variants
ContextualCompressionRetriever   LLM re-ranks / filters results
SelfQueryRetriever               Auto-extracts metadata filters from query
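
Hybrid retrieval needs a way to merge ranked lists whose scores are not comparable (BM25 scores versus vector distances). One common answer is weighted Reciprocal Rank Fusion, which looks only at rank positions. The sketch below is a library-free illustration of that idea, not EnsembleRetriever's source code; rrf_merge, the document IDs, and the constant c=60 (a conventional choice) are assumptions.

```python
from collections import defaultdict


def rrf_merge(rankings, weights=None, c=60):
    """Fuse ranked lists of doc IDs with weighted Reciprocal Rank Fusion."""
    weights = weights or [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (c + rank)
    # Highest fused score first; ties broken by doc ID for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["doc_a", "doc_b", "doc_c"]    # hypothetical keyword ranking
vector_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical embedding ranking
merged = rrf_merge([bm25_hits, vector_hits])
print(merged)
```

Documents that appear near the top of both lists rise to the front, while a document found by only one retriever is still kept, just lower down.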

🤖 LLM (ChatOpenAI)

langchain_openai
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.2,
    # max_tokens=500,
    # streaming=True,
)

Model           $ / 1M input tokens   Best for
gpt-4o-mini     $0.15                 Fast, cheap RAG
gpt-4o          $2.50                 Complex reasoning
gpt-3.5-turbo   $0.50                 Deprecated

⛓️ RAG CHAIN (LCEL)

the modern way
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import (
    RunnablePassthrough, RunnableParallel
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer from context only.\n{context}"),
    ("human", "{question}"),
])

def format_docs(docs):
    # Join the retrieved chunks into one context string
    return "\n\n".join(d.page_content for d in docs)

# The RAG chain
rag_chain = (
    RunnableParallel(
        context=retriever | format_docs,
        question=RunnablePassthrough(),
    )
    | prompt
    | llm
    | StrOutputParser()
)

# Use it
answer = rag_chain.invoke("my question")

# Stream it
for chunk in rag_chain.stream("my question"):
    print(chunk, end="", flush=True)
⚠ LLMChain and ConversationalRetrievalChain are deprecated; build chains with LCEL pipes instead. AgentExecutor is deprecated too, with LangGraph-based agents as the suggested replacement.
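
For readers new to LCEL, the data flow of the chain above can be mimicked in plain Python. This is a toy model, not LangChain code: Step, parallel, fake_retriever, and fake_llm are hypothetical stand-ins, and the real Runnable classes add streaming, batching, and async on top of the same composition idea.

```python
class Step:
    """Toy stand-in for a Runnable: wraps a function and composes with |."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # a | b  ->  a new Step that runs a, then feeds its output to b
        return Step(lambda x: other.invoke(self.invoke(x)))

    def invoke(self, x):
        return self.fn(x)


def parallel(**branches):
    """Toy RunnableParallel: run every branch on the same input, return a dict."""
    return Step(lambda x: {name: step.invoke(x) for name, step in branches.items()})


# Hypothetical stand-ins for the retriever, prompt, and model
fake_retriever = Step(lambda q: ["Chroma persists to disk.", "FAISS runs in memory."])
format_docs = Step(lambda docs: "\n\n".join(docs))
prompt = Step(lambda d: f"Context:\n{d['context']}\n\nQuestion: {d['question']}")
fake_llm = Step(lambda text: f"(answer based on {len(text)} prompt characters)")

toy_chain = parallel(
    context=fake_retriever | format_docs,
    question=Step(lambda q: q),  # plays the role of RunnablePassthrough
) | prompt | fake_llm

result = toy_chain.invoke("Which store persists to disk?")
print(result)
```

The point to take away is that the question string enters once, fans out into the context and question branches, and the resulting dict is what fills the prompt's two placeholders.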

⚡ QUICK REFERENCE

Invoke / Stream / Batch

chain.invoke("question")          # single
chain.stream("question")          # token-by-token
chain.batch(["q1", "q2"])         # parallel
await chain.ainvoke("question")   # async

Reload Existing Vector Store

vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=embeddings,
    collection_name="my_collection",
)

Direct LLM Messages

from langchain_core.messages import (
    HumanMessage, SystemMessage, AIMessage
)
response = llm.invoke([
    SystemMessage(content="You are helpful."),
    HumanMessage(content="Hello!"),
])
LangChain v1.x · LCEL Pipes · langchain-openai · langchain-chroma · 2026