RAG Implementation Guide
Preparation Phase
1. Define Business Requirements
- Clearly outline the specific use cases and goals for the RAG system
- Identify key performance indicators (KPIs) to measure success
2. Gather Test Documents
- Pertinence - Documents must meet the business requirements
- Representativeness - Documents should be representative of all the types of documents that your solution will use
- Physical document quality - Documents need to be in usable condition, e.g. clearly legible scans
- Document content quality - Documents must have high content quality, e.g. free of misspellings and grammatical errors
Hints:
- Remember to redact PII from real documents
- Have at least two documents for each document type
- If using synthetic documents, ensure they are as close to real data as possible
- You can use LLMs to help evaluate the document quality
3. Gather Test Queries
- Create a diverse set of queries covering various use cases
- Include edge cases and potential challenging scenarios
Chunking Phase
1. Perform Document Analysis
- What information should be ignored or excluded?
- What information should be captured in chunks?
- How should the document be chunked? (e.g. by sentence or fixed size with overlap)
2. Choose Chunking Method
- Sentence-based parsing - Breaks text into chunks of complete sentences
- Fixed-size parsing (with overlap) - Breaks text into fixed-size chunks with overlap
- Custom code - Uses text parsing techniques like regex
- LLM augmentation - Generates textual representations of images or summaries of tables using an LLM
- Document layout analysis - Combines OCR with deep learning to extract document structure and text
- Prebuilt model - Uses models pre-trained for specific document types or genres
- Custom model - Uses custom models for structured documents where no prebuilt model exists
- Manual - Uses human-curated chunks
Method | Tool Examples | Effort | Processing Cost |
---|---|---|---|
Sentence-based parsing | | Low | Low |
Fixed-size parsing | | Low | Low |
Custom code | | Medium | Low |
LLM augmentation | | Medium | High |
Document layout analysis | | Medium | Medium |
Prebuilt model | | Low | Medium/High |
Custom model | | High | Medium/High |
Manual | | High | Low |
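The two lowest-effort methods, sentence-based and fixed-size parsing with overlap, can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production parser: the function names, the regex sentence splitter, and the default sizes are assumptions made for the example.

```python
import re


def chunk_by_sentence(text: str, sentences_per_chunk: int = 2) -> list[str]:
    """Group complete sentences into chunks (hypothetical helper)."""
    # Naive split after ., ! or ? followed by whitespace. Real documents may
    # need a proper sentence tokenizer to handle abbreviations such as "e.g.".
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def chunk_fixed_size(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into character windows of `size` sharing `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once a window reaches the end, so no chunk is pure overlap.
        if start + size >= len(text):
            break
        start += step
    return chunks


sentence_chunks = chunk_by_sentence("One. Two! Three? Four.")
window_chunks = chunk_fixed_size("abcdefghij", size=4, overlap=2)
```

The overlap means text that straddles a window boundary still appears whole in at least one chunk, at the cost of storing some characters twice.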
Enrichment Phase
1. Clean Chunks
- Lowercase text - Embeddings are usually case-sensitive, meaning "Cheetah" and "cheetah" produce different vectors despite having the same meaning
- Remove stop words - Removing stop words such as "a", "an", "the" and "is" reduces both "a cheetah is faster than a puma" and "the cheetah is faster than the puma" to the same text, "cheetah faster than puma", so both produce the same vector
- Fix spelling mistakes
- Remove unwanted Unicode characters (e.g. control characters and encoding debris)
- Normalisation (expand abbreviations, convert numbers to words, expand contractions)
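Several of the cleaning steps above can be combined into one small function. The sketch below is only illustrative: the stop-word set and contraction map are tiny hand-picked samples (assumptions for this example), and the ASCII-only tokenizer discards every non-ASCII character, which may be too aggressive for multilingual content.

```python
import re

# Illustrative subsets only; a real pipeline would use fuller lists.
STOP_WORDS = {"a", "an", "the", "is"}
CONTRACTIONS = {"isn't": "is not", "can't": "cannot", "it's": "it is"}


def clean_chunk(text: str) -> str:
    """Lowercase, expand contractions, and drop stop words (hypothetical helper)."""
    # Lowercase first, then keep runs of ASCII letters, digits and apostrophes.
    # Punctuation and non-ASCII characters are dropped by this tokenizer.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    expanded = []
    for token in tokens:
        # A contraction may expand to several words, so split the result.
        expanded.extend(CONTRACTIONS.get(token, token).split())
    return " ".join(t for t in expanded if t not in STOP_WORDS)


first = clean_chunk("A cheetah is faster than a puma")
second = clean_chunk("The Cheetah is faster than the puma.")
# Both inputs reduce to the same cleaned text, so they would embed identically.
same_text = first == second
```

Whether this kind of cleaning helps depends on the embedding model, so it is worth comparing retrieval quality with and without it on your test queries.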
2. Augment Chunks with Metadata
Metadata can help filter the chunks prior to the semantic search or be used as part of it.
Examples of metadata fields:
- Title
- Summary
- Keywords
- Source
- Language
- Questions that the chunk can answer
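As a rough sketch of how metadata can narrow the candidate set before semantic search, the example below attaches a few of the fields listed above to each chunk and filters on them. The `Chunk` class and `filter_chunks` function are hypothetical names for this illustration; most vector stores offer their own metadata filtering.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Chunk:
    """A chunk of text plus a few illustrative metadata fields."""
    text: str
    title: str = ""
    source: str = ""
    language: str = "en"
    keywords: list[str] = field(default_factory=list)


def filter_chunks(
    chunks: list[Chunk],
    language: Optional[str] = None,
    keyword: Optional[str] = None,
) -> list[Chunk]:
    """Keep only chunks matching every metadata condition that was supplied."""
    selected = []
    for chunk in chunks:
        if language is not None and chunk.language != language:
            continue
        if keyword is not None and keyword not in chunk.keywords:
            continue
        selected.append(chunk)
    return selected


chunks = [
    Chunk("Cheetahs are fast.", title="Big cats", language="en", keywords=["cheetah"]),
    Chunk("Les pumas sont agiles.", title="Félins", language="fr", keywords=["puma"]),
]
english_only = filter_chunks(chunks, language="en")
```

Filtering first keeps the semantic search from spending effort on chunks that could never be valid answers, such as documents in the wrong language.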
Embedding Phase
1. Choose Embedding Model
2. Evaluate Embedding Model
- Visualise your embeddings using tools such as t-SNE from Scikit-learn
- Calculate embedding distances using Euclidean or Manhattan distance
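The two distance measures mentioned above are simple to compute by hand, which is useful for spot-checking a handful of embeddings. The sketch below uses plain Python lists as stand-ins for real embedding vectors; the function names are chosen for this example.

```python
import math


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line (L2) distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: list[float], b: list[float]) -> float:
    """Sum of absolute coordinate differences (L1 distance)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy two-dimensional vectors; real embeddings have hundreds of dimensions.
origin = [0.0, 0.0]
point = [3.0, 4.0]
l2 = euclidean_distance(origin, point)
l1 = manhattan_distance(origin, point)
```

A reasonable sanity check is that chunks you consider similar sit closer together than chunks you consider unrelated.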
Persisting Phase
Vector storage examples:
- Pure vector databases: Pinecone, Weaviate
- Full-text search databases: Elasticsearch, OpenSearch
- Vector-capable SQL databases: PostgreSQL with pgvector
- Vector-capable NoSQL databases: MongoDB
Retrieval Phase
1. Implement Semantic Search
- Choose between approximate nearest neighbor (ANN) or exact nearest neighbor search based on performance requirements
- Implement hybrid search combining semantic and keyword-based approaches for improved results
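To make the two ideas above concrete, here is a minimal sketch of exact (brute-force) nearest neighbor search combined with a keyword score. The cosine similarity measure, the term-overlap keyword score, and the `alpha` weighting are assumptions for this example; production systems typically use an ANN index and a ranking function such as BM25 instead.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that also appear in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def hybrid_search(query, query_vector, docs, alpha=0.5, k=3):
    """Score every (text, vector) pair exactly and return the top k.

    alpha weights the semantic score; (1 - alpha) weights the keyword score.
    """
    scored = []
    for text, vector in docs:
        semantic = cosine_similarity(query_vector, vector)
        lexical = keyword_score(query, text)
        scored.append((alpha * semantic + (1 - alpha) * lexical, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings.
docs = [
    ("cheetah speed facts", [1.0, 0.0]),
    ("puma habitat", [0.0, 1.0]),
]
top = hybrid_search("cheetah speed", [1.0, 0.0], docs, k=1)
```

Scanning every vector gives exact results but scales linearly with the number of chunks, which is the cost that ANN indexes trade a little accuracy to avoid.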
2. Fine-tune Retrieval
- Experiment with different similarity thresholds
- Implement re-ranking strategies to improve relevance of retrieved chunks
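Both tuning steps can be prototyped on a list of `(score, text)` results before committing to a specific tool. In the sketch below, the threshold value is arbitrary and `term_overlap` is only a stand-in for a real re-ranking model such as a cross-encoder; all names are hypothetical.

```python
def apply_threshold(results, threshold):
    """Drop results whose similarity score falls below the threshold."""
    return [(score, text) for score, text in results if score >= threshold]


def term_overlap(query: str, text: str) -> int:
    """Toy relevance signal: number of query terms present in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def rerank(query, results, score_fn=term_overlap):
    """Reorder results by a second, usually more expensive, relevance score."""
    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(results, key=lambda r: score_fn(query, r[1]), reverse=True)


results = [
    (0.91, "puma habitat range"),
    (0.88, "cheetah sprint speed"),
    (0.40, "office parking rules"),
]
kept = apply_threshold(results, threshold=0.5)
reranked = rerank("cheetah speed", kept)
```

Running the same test queries at several thresholds shows how many relevant chunks are lost as the cutoff rises, which helps pick a value empirically.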
Conclusion
Implementing a successful RAG solution requires careful consideration of each phase, from preparation to retrieval. By following this guide, you can create a robust and effective RAG system tailored to your specific needs.