
RAG Implementation Guide

Preparation Phase

1. Define Business Requirements

  • Clearly outline the specific use cases and goals for the RAG system
  • Identify key performance indicators (KPIs) to measure success

2. Gather Test Documents

  • Relevance - Documents must match the business requirements
  • Representativeness - Documents should cover every type of document your solution will handle
  • Physical document quality - Documents need to be in usable shape, e.g. clear, legible scans
  • Document content quality - Documents must have high content quality, e.g. free of misspellings and grammatical errors

Hints:

  • Remember to redact PII from real documents
  • Have at least two documents for each document type
  • If using synthetic documents, ensure they are as close to real data as possible
  • You can use LLMs to help evaluate the document quality

3. Gather Test Queries

  • Create a diverse set of queries covering various use cases
  • Include edge cases and potential challenging scenarios

Chunking Phase

1. Perform Document Analysis

  • What information should be ignored or excluded?
  • What information should be captured in chunks?
  • How should the document be chunked? (e.g. by sentence or fixed size with overlap)

2. Choose Chunking Method

  • Sentence-based parsing - Breaks text into chunks of complete sentences
  • Fixed-size parsing (with overlap) - Breaks text into fixed-size chunks with overlap
  • Custom code - Uses text parsing techniques like regex
  • LLM augmentation - Generates textual representations of images or summaries of tables using an LLM
  • Document layout analysis - Combines OCR with deep learning to extract document structure and text
  • Prebuilt model - Uses models pre-trained for specific document types or genres
  • Custom model - Uses custom models for structured documents where no prebuilt model exists
  • Manual - Uses human-curated chunks
| Method | Tool Examples | Effort | Processing Cost |
| --- | --- | --- | --- |
| Sentence-based parsing | SpaCy sentence tokenizer, NLTK sentence tokenizer | Low | Low |
| Fixed-size parsing | LangChain recursive text splitter, Hugging Face chunk visualizer | Low | Low |
| Custom code | Python (re, regex, BeautifulSoup), R (stringr, xml2) | Medium | Low |
| LLM augmentation | Azure OpenAI, OpenAI | Medium | High |
| Document layout analysis | Azure AI Document Intelligence, Donut | Medium | Medium |
| Prebuilt model | Azure AI Document Intelligence, Power Automate | Low | Medium/High |
| Custom model | Azure AI Document Intelligence, Tesseract | High | Medium/High |
| Manual | Human reviewers, LabelStudio | High | Low |

Enrichment Phase

1. Clean Chunks

  • Lowercase text - Embeddings are usually case-sensitive, so "Cheetah" and "cheetah" produce different vectors despite having the same logical meaning
  • Remove stop words - Removing stop words like "a", "an" and "the" makes both "a cheetah is faster than a puma" and "the cheetah is faster than the puma" vectorially equal to "cheetah faster than puma"
  • Fix spelling mistakes
  • Remove unicode characters
  • Normalisation (expand abbreviations, convert numbers to words, expand contractions)
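Several of these cleaning steps can be combined into one small function. The sketch below uses only the Python standard library; the stop-word set is deliberately tiny and illustrative, and a real pipeline would use a fuller list (e.g. from NLTK or spaCy):

```python
import re
import unicodedata

# Minimal illustrative stop-word set; production lists are much larger
STOP_WORDS = {"a", "an", "the"}

def clean_chunk(text: str) -> str:
    """Lowercase, strip non-ASCII characters, and drop stop words."""
    # Decompose unicode (accents, smart quotes) and keep only the ASCII part
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Lowercase so "Cheetah" and "cheetah" embed identically
    text = text.lower()
    # Tokenize on word characters and drop stop words
    words = [w for w in re.findall(r"[a-z0-9']+", text) if w not in STOP_WORDS]
    return " ".join(words)
```

With this function, "A cheetah is faster than a puma" and "The Cheetah is faster than the Puma" both reduce to the same cleaned string.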

2. Augment Chunks with Metadata

Metadata can help filter the chunks prior to the semantic search or be used as part of it.

Examples of metadata fields:

  • Title
  • Summary
  • Keywords
  • Source
  • Language
  • Questions that the chunk can answer
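A chunk with these metadata fields attached might look like the dictionary below, together with a simple metadata pre-filter that narrows the candidate set before the (more expensive) semantic search runs. All field values here are invented for illustration:

```python
# A single enriched chunk; every value is a made-up example
chunk = {
    "id": "doc-001-chunk-003",
    "text": "cheetahs can reach speeds of up to 100 km/h in short bursts",
    "metadata": {
        "title": "Big Cat Speed Comparison",
        "summary": "Compares top speeds of large felines.",
        "keywords": ["cheetah", "speed"],
        "source": "big-cats.pdf",
        "language": "en",
        "answerable_questions": ["How fast can a cheetah run?"],
    },
}

def filter_chunks(chunks, language=None, keyword=None):
    """Pre-filter chunks on metadata before running semantic search."""
    results = []
    for c in chunks:
        meta = c["metadata"]
        if language and meta["language"] != language:
            continue
        if keyword and keyword not in meta["keywords"]:
            continue
        results.append(c)
    return results
```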

Embedding Phase

1. Choose Embedding Model

2. Evaluate Embedding Model
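One common way to compare candidate embedding models is recall@k over your test queries: for each query with a known relevant document, check whether that document lands in the top k results. The sketch below is model-agnostic; `embed` is any callable mapping text to a vector (for example a sentence-transformers model), and the evaluation itself needs only cosine similarity:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recall_at_k(embed, queries, corpus, relevant, k=3):
    """Fraction of queries whose known-relevant document appears in the top k.

    `embed` is any text -> vector callable; swap in each candidate model
    and compare scores on the same query set.
    `relevant[i]` is the corpus index of the document that should answer queries[i].
    """
    corpus_vecs = [embed(doc) for doc in corpus]
    hits = 0
    for query, rel_idx in zip(queries, relevant):
        qv = embed(query)
        ranked = sorted(range(len(corpus)),
                        key=lambda i: cosine(qv, corpus_vecs[i]),
                        reverse=True)
        if rel_idx in ranked[:k]:
            hits += 1
    return hits / len(queries)
```

Run this with the test queries gathered in the preparation phase so the comparison reflects your actual workload rather than a public benchmark.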

Persisting Phase

Vector storage examples:

- Pure vector databases: Pinecone, Weaviate
- Full-text search databases: Elasticsearch, OpenSearch
- Vector-capable SQL databases: PostgreSQL with pgvector
- Vector-capable NoSQL databases: MongoDB
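Whichever backend you pick, these stores expose roughly the same upsert/similarity-search interface. A minimal in-memory stand-in makes the shape of that interface concrete (purely illustrative, not a production store):

```python
import math

class InMemoryVectorStore:
    """Toy stand-in mimicking the upsert/search shape of a vector database."""

    def __init__(self):
        self.items = []  # list of (id, vector, metadata) tuples

    def upsert(self, item_id, vector, metadata=None):
        """Store a vector with an id and optional metadata."""
        self.items.append((item_id, vector, metadata or {}))

    def search(self, query_vector, top_k=3):
        """Return the top_k items ranked by cosine similarity to the query."""
        def cos(u, v):
            dot = sum(a * b for a, b in zip(u, v))
            nu = math.sqrt(sum(a * a for a in u))
            nv = math.sqrt(sum(b * b for b in v))
            return dot / (nu * nv) if nu and nv else 0.0

        scored = [(cos(query_vector, vec), item_id, meta)
                  for item_id, vec, meta in self.items]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:top_k]
```

Real stores add persistence, approximate-nearest-neighbor indexes, and metadata filtering on top of this basic contract.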

Retrieval Phase

1. Implement Search

  • Choose between approximate nearest neighbor (ANN) and exact nearest neighbor search based on performance requirements
  • Implement hybrid search combining semantic and keyword-based approaches for improved results
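A hybrid score can be as simple as a weighted blend of keyword overlap and vector similarity. The `alpha` blending below is one common convention (production systems often use reciprocal rank fusion or BM25 instead of raw term overlap):

```python
import math

def hybrid_score(query, doc, query_vec, doc_vec, alpha=0.5):
    """Blend keyword overlap with cosine similarity.

    alpha=1.0 is pure semantic search, alpha=0.0 is pure keyword search.
    """
    # Keyword component: fraction of query terms present in the document
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    keyword = len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

    # Semantic component: cosine similarity of the embeddings
    dot = sum(a * b for a, b in zip(query_vec, doc_vec))
    nq = math.sqrt(sum(a * a for a in query_vec))
    nd = math.sqrt(sum(b * b for b in doc_vec))
    semantic = dot / (nq * nd) if nq and nd else 0.0

    return alpha * semantic + (1 - alpha) * keyword
```

The keyword component catches exact matches (product codes, names) that embeddings can miss, while the semantic component handles paraphrases.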

2. Fine-tune Retrieval

  • Experiment with different similarity thresholds
  • Implement re-ranking strategies to improve relevance of retrieved chunks
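Both ideas fit in one small function: re-score the retrieved candidates with a stronger scorer, drop anything below the similarity threshold, and return the rest best-first. `score_fn` here stands in for whatever re-ranker you choose (typically a cross-encoder); any `(query, text) -> float` callable works:

```python
def rerank(query, candidates, score_fn, threshold=0.0):
    """Re-score retrieved chunks, filter by threshold, return best-first.

    score_fn stands in for a more expensive scorer (e.g. a cross-encoder)
    applied only to the small candidate set, not the whole corpus.
    """
    scored = [(score_fn(query, text), text) for text in candidates]
    kept = [(score, text) for score, text in scored if score >= threshold]
    kept.sort(key=lambda st: st[0], reverse=True)
    return [text for _, text in kept]
```

Because re-ranking runs only on the handful of retrieved chunks, you can afford a much slower, more accurate model here than in the first-stage search.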

Conclusion

Implementing a successful RAG solution requires careful consideration of each phase, from preparation to retrieval. By following this guide, you can create a robust and effective RAG system tailored to your specific needs.