Basic Retriever
Alternative Names
- Vector Retriever
- Naive Retriever
- Baseline RAG
- Basic RAG
- Typical RAG
Required Graph Shape
Context
It’s useful to chunk large documents into smaller pieces when creating embeddings. An embedding is a semantic representation of a text that captures what the text is about. If the text is long and covers too many diverse subjects, its embedding becomes less informative.
Description
The user question is embedded with the same embedder that was used to create the chunk embeddings. A vector similarity search is then executed on the chunk embeddings to retrieve the k most similar chunks, where k is configured in advance by the developer or user.
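The retrieval step can be illustrated with a minimal sketch in plain Python. It assumes a toy bag-of-words embedder (`toy_embed`) in place of a real embedding model and uses cosine similarity as the similarity measure; the function names and the in-memory index are hypothetical and are not part of any library.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedder (assumption): word counts, not a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Embed every chunk once, ahead of query time."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def retrieve(question: str, index: list[tuple[str, Counter]], k: int = 2) -> list[tuple[str, float]]:
    """Embed the question with the same embedder and return the k most similar chunks."""
    q = toy_embed(question)
    scored = [(chunk, cosine(q, emb)) for chunk, emb in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The question and the chunks go through the same `toy_embed` function, which mirrors the requirement that one embedder is used for both sides of the comparison.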
Usage
This pattern is useful when the user asks for specific information about a topic that is covered in one or a few chunks. The question should not require complex aggregations or knowledge of the whole dataset. Because the pattern consists only of a vector similarity search, it is easy to understand, implement, and get started with.
Required pre-processing
Split documents into chunks and use an embedding model to embed the text content of the chunks. See chunking.
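As a rough illustration of this pre-processing step, the sketch below splits a text into fixed-size word windows with a small overlap and then embeds each chunk. The window size, the overlap, and the `embed` callable are assumptions made for the example; a real pipeline would choose its own chunking strategy and pass in the embedding function of an actual model.

```python
from typing import Callable


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of up to `size` words, sharing `overlap` words between neighbours."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed_chunks(chunks: list[str], embed: Callable[[str], list[float]]) -> list[dict]:
    """Pair each chunk with its embedding; `embed` is a hypothetical embedding function."""
    return [{"text": chunk, "embedding": embed(chunk)} for chunk in chunks]
```

The overlap is a common way to avoid cutting a sentence's context exactly at a chunk boundary; it is a design choice of this sketch, not something the pattern prescribes.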
Retrieval Query
No additional query is necessary since the Neo4j Vector retriever retrieves similar chunks by default.
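For orientation, a plain vector index lookup in Neo4j can be expressed with the `db.index.vector.queryNodes` procedure. The sketch below only builds the Cypher text and its parameters; the index name, the `text` property on chunk nodes, and the helper function are assumptions for illustration, and the query is not claimed to be the exact statement any retriever library issues.

```python
# Cypher for a top-k vector index lookup. The `text` property is an assumed
# name for the chunk content; adjust it to the actual data model.
VECTOR_QUERY = """
CALL db.index.vector.queryNodes($index_name, $k, $embedding)
YIELD node, score
RETURN node.text AS text, score
"""


def build_vector_search(index_name: str, question_embedding: list[float], k: int = 5) -> tuple[str, dict]:
    """Return the Cypher text and parameter map for a top-k similarity lookup (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    params = {
        "index_name": index_name,
        "k": k,
        "embedding": list(question_embedding),
    }
    return VECTOR_QUERY, params
```

With the official Neo4j Python driver, the returned pair could be passed to `session.run(query, params)`; the question embedding must come from the same model that produced the chunk embeddings stored in the index.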
Further reading
- Advanced Retriever Techniques to Improve Your RAGs (Damian Gil, April 2024)
- Implementing advanced RAG strategies with Neo4j (November 2023)
Existing Implementations
- Neo4j GraphRAG - Vector Retriever
- LangChain Retrievers: Vector store-backed retriever
- LangChain: Neo4jVector