Basic Retriever

Alternative Names

  • Vector Retriever
  • Naive Retriever
  • Baseline RAG
  • Basic RAG
  • Typical RAG

Required Graph Shape

Lexical Graph

Context

It is useful to chunk large documents into smaller pieces before creating embeddings. An embedding is a semantic representation of a text that captures what the text is about. If the text is long and covers too many diverse subjects, its embedding becomes less informative.

Description

The user question is embedded with the same embedder that was used to create the chunk embeddings. A vector similarity search is then executed on the chunk embeddings to retrieve the k most similar chunks, where k is configured in advance by the developer or user.
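The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the Neo4j implementation: `toy_embed` is a hypothetical stand-in that uses plain word counts instead of a real embedding model, and `build_index` and `retrieve_top_k` are names invented for this example. The point it shows is that the question and the chunks go through the same embedder, and that the k highest-scoring chunks are returned.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in embedder: lowercase word counts.

    A real system would call an embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Embed every chunk once, ahead of query time."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def retrieve_top_k(question: str, index: list[tuple[str, Counter]], k: int = 2):
    """Embed the question with the same embedder and return the k best chunks."""
    query_vector = toy_embed(question)
    scored = [(chunk, cosine_similarity(query_vector, vector)) for chunk, vector in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


chunks = [
    "Neo4j is a graph database.",
    "Paris is the capital of France.",
    "Embeddings capture the meaning of text.",
]
index = build_index(chunks)
results = retrieve_top_k("What is the capital of France?", index, k=2)
```

In a production setup the linear scan over all chunks would typically be replaced by a vector index, but the contract stays the same: one question in, k chunks out.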

Usage

This pattern is useful if the user asks for specific information about a topic that exists in one or more (but not too many) chunks. The question should not require complex aggregations or knowledge about the whole dataset. Because the pattern consists of nothing more than a vector similarity search, it is easy to understand, implement, and get started with.

Required pre-processing

Split documents into chunks and use an embedding model to embed the text content of the chunks. See chunking.
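As a rough sketch of this pre-processing step, the snippet below splits a text into overlapping word-based chunks and attaches an embedding to each one. The chunk size, the overlap, the word-based splitting strategy, and the names `chunk_text` and `embed_chunks` are all assumptions made for illustration; the `embed` argument is a placeholder for whatever embedding model is actually used.

```python
from typing import Callable


def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words so that sentences cut at a
    boundary still appear intact in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed_chunks(chunks: list[str], embed: Callable[[str], list[float]]) -> list[dict]:
    """Pair each chunk's text with its embedding, ready to be stored."""
    return [{"text": chunk, "embedding": embed(chunk)} for chunk in chunks]
```

The same `embed` function has to be used later for the user question, otherwise the question vector and the chunk vectors are not comparable.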

Retrieval Query

No additional query is necessary since the Neo4j Vector retriever retrieves similar chunks by default.

Further reading

Existing Implementations

Example Implementations