Qdrant - Vector Database

Native BM25 and Full-Text Search

Running Elasticsearch for keyword search and Qdrant for semantic search? That means two databases, double the deployment work, and constant synchronization. With native BM25 in Qdrant, both kinds of search run in one service.

  • Replacing Elasticsearch + vector DB with a single Qdrant instance in a RAG system
  • Autocomplete with prefix tokenizer without a separate service
  • Multilingual search with Unicode-aware tokenization for 100+ languages

Prerequisites

  • Sparse Vectors: BM42 and SPLADE
  • Payload Indexes

Native BM25: Qdrant Computes It Itself

Qdrant v1.16+ introduced **native BM25**: Qdrant computes sparse BM25 vectors directly from the text payload, without external libraries such as FastEmbed and without a separate Elasticsearch cluster. Create a `text` index on a payload field, and the collection gains lexical search capability.

The difference between a **text filter** and a **BM25 sparse index**:

  • `match: {text: 'query'}`: a filter (boolean: the document contains the text or it does not), with no ranking
  • BM25 sparse vector: ranking by TF-IDF-style weights, with IDF computed across the whole collection

Both mechanisms use a text index, but they produce different results.
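The contrast between a boolean filter and BM25 ranking can be sketched in plain Python. This is a toy illustration, not Qdrant's implementation: the tokenizer, the `k1` and `b` defaults, and the function names are assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Toy tokenizer (assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def contains_filter(docs: list[str], query: str) -> list[int]:
    """Boolean filter: indexes of documents containing every query token."""
    wanted = set(tokenize(query))
    return [i for i, doc in enumerate(docs) if wanted <= set(tokenize(doc))]


def bm25_scores(docs: list[str], query: str,
                k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Classic BM25: IDF is computed over the whole document set."""
    tokenized = [tokenize(doc) for doc in docs]
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


docs = [
    "vector search with qdrant",
    "qdrant qdrant qdrant keyword search",
    "cooking pasta at home",
]
matches = contains_filter(docs, "qdrant")  # which documents pass the filter
scores = bm25_scores(docs, "qdrant")       # how strongly each document matches
print(matches, [round(s, 3) for s in scores])
```

The filter only says which documents qualify. The BM25 scores additionally order them, so the document that repeats the term ranks above the one that mentions it once.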

| Tokenizer | Description | When to use |
| --- | --- | --- |
| `word` | Split by words (spaces, punctuation) | Standard English text |
| `whitespace` | Split on spaces only, no punctuation splitting | Code, URLs, IDs |
| `prefix` | `word` + all prefixes of each token | Autocomplete, prefix search |
| `multilingual` | Unicode-aware tokenizer for multilingual text | Russian, Arabic, CJK languages |
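To make the difference between the `word` and `whitespace` rows concrete, here is a rough Python approximation. The regular expression is an assumption for illustration and does not reproduce Qdrant's exact splitting rules.

```python
import re


def word_tokenize(text: str) -> list[str]:
    """Approximate `word` tokenizer: split on spaces and punctuation."""
    return [token for token in re.split(r"\W+", text) if token]


def whitespace_tokenize(text: str) -> list[str]:
    """Approximate `whitespace` tokenizer: split on whitespace only."""
    return text.split()


sample = "GET https://example.com/v1/items?id=42"
word_tokens = word_tokenize(sample)
whitespace_tokens = whitespace_tokenize(sample)
print(word_tokens)
print(whitespace_tokens)
```

The URL survives as a single token under whitespace splitting, which is why that tokenizer suits code, URLs, and IDs, while word splitting breaks it into fragments.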

How does native BM25 in Qdrant differ from a text filter using `match: {text: 'query'}`?

Full-Text Search and Tokenization

After creating a text index, Qdrant supports two types of text search: **filter-based** (via payload filter) and **BM25 ranking** (via sparse vector query). Let's look at both and learn how to combine them.

**ASCII folding** and **stemming**: a text index with `lowercase: true` normalizes casing automatically. ASCII folding (é → e, ñ → n) and stemming are separate text index options (`ascii_folding`, `stemmer`) in recent Qdrant versions rather than a side effect of `lowercase`; the `multilingual` tokenizer covers Unicode-aware token splitting. The `prefix` tokenizer creates sub-tokens: the word `hello` becomes the tokens `h`, `he`, `hel`, `hell`, `hello`, which enables prefix search without wildcard queries.
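The normalization steps described above can be sketched in a few lines of standard-library Python. The helper names are hypothetical, and the folding shown here (Unicode decomposition plus removal of combining marks) is one common approach, not necessarily the one Qdrant uses internally.

```python
import unicodedata


def ascii_fold(text: str) -> str:
    """Strip diacritics: decompose characters, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def prefix_tokens(word: str, min_len: int = 1) -> list[str]:
    """All prefixes of a word, from min_len characters up to the full word."""
    return [word[:end] for end in range(min_len, len(word) + 1)]


folded = ascii_fold("Café Ñandú").lower()
prefixes = prefix_tokens("hello")
print(folded)
print(prefixes)
```

Raising `min_len` (the index has a comparable minimum token length setting) trims one-letter prefixes, which usually match too many documents to be useful.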

For autocomplete, use the `prefix` tokenizer with a filter rather than a BM25 query. A filter is faster and more precise for this use case because no ranking is computed: `{key: 'title', match: {text: 'mach'}}` finds all documents whose `title` contains a word starting with 'mach'.
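As a hedged sketch of how this looks over Qdrant's REST API, the snippet below only builds and prints the two request bodies: one that creates a `prefix` text index on `title`, and one that runs the autocomplete filter through the scroll endpoint. The collection name `articles`, the token length limits, and the helper name are assumptions for illustration; nothing is sent to a server.

```python
import json

COLLECTION = "articles"  # hypothetical collection name

# Body for: PUT /collections/{collection}/index
create_index_body = {
    "field_name": "title",
    "field_schema": {
        "type": "text",
        "tokenizer": "prefix",
        "min_token_len": 2,   # assumed limits; tune for your data
        "max_token_len": 20,
        "lowercase": True,
    },
}


def autocomplete_body(typed: str, limit: int = 10) -> dict:
    """Body for: POST /collections/{collection}/points/scroll"""
    return {
        "filter": {"must": [{"key": "title", "match": {"text": typed}}]},
        "limit": limit,
        "with_payload": True,
    }


scroll_body = autocomplete_body("mach")
print(f"PUT /collections/{COLLECTION}/index")
print(json.dumps(create_index_body, indent=2))
print(f"POST /collections/{COLLECTION}/points/scroll")
print(json.dumps(scroll_body, indent=2))
```

Because the request is a pure filter, results come back unranked; if ordering matters for suggestions, it has to come from elsewhere, for example a popularity field in the payload.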

Which tokenizer is best suited for autocomplete functionality?

Full Hybrid Search Without Elasticsearch

The traditional hybrid search architecture: **Elasticsearch** (lexical) + **vector DB** (semantic). With native BM25 in Qdrant, the entire system fits into one service. This simplifies deployment, reduces latency, and eliminates the data synchronization problem between two databases.
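Inside a single service, the lexical and semantic result lists still have to be merged into one ranking. A common choice is Reciprocal Rank Fusion (RRF), which Qdrant offers as a fusion method. The sketch below shows the idea in plain Python; the constant `k = 60` and the document IDs are assumptions for the example and may differ from what Qdrant uses internally.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


lexical = ["doc3", "doc1", "doc7"]   # hypothetical BM25 ranking
semantic = ["doc1", "doc5", "doc3"]  # hypothetical dense-vector ranking
fused = rrf_fuse([lexical, semantic])
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)
```

RRF uses only rank positions, so BM25 scores and cosine similarities never need to be put on a common scale before they are combined.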

**Native BM25 vs FastEmbed BM42**: with native BM25, no external embedder is needed for the sparse part, because Qdrant computes everything from the payload. FastEmbed BM42 generates attention-weighted sparse vectors on the client. For most cases, native BM25 is simpler and sufficiently accurate. BM42 may offer better ranking quality, but it requires running an embedding model on the client side.

Do you need to manually compute the BM25 sparse vector before upserting when using native BM25 in Qdrant v1.16+?

Summary

  • Native BM25 in Qdrant v1.16+: create a text payload index, and the sparse vector is computed automatically
  • Text filter (`match: {text}`) gives a boolean match; a BM25 query gives TF-IDF-style ranking
  • Four tokenizers: `word` (standard), `whitespace` (code), `prefix` (autocomplete), `multilingual` (Unicode)
  • Full hybrid search = dense (semantic) + BM25 (lexical) + RRF fusion, all in one Qdrant instance
  • Native BM25 is simpler to deploy than FastEmbed BM42; BM42 may give better quality at the cost of an external embedder

What's Next

Native BM25 covers the lexical side. Now let's add advanced exploration on top.

  • Discovery Search: exploration on top of hybrid search, with context-guided retrieval
  • FastEmbed BM42: an alternative to native BM25 based on attention-weighted sparse embeddings
  • Payload Index: the fundamentals of how payload indexes work in Qdrant

Questions for Reflection

  • In what scenario would you keep Elasticsearch alongside Qdrant rather than replacing it?
  • How would you implement multilingual autocomplete (EN + FR) with minimal resources?
  • What would change in your RAG system architecture if you switched from Elasticsearch + Qdrant to a single Qdrant with native BM25?

Related Lessons

  • alg-10-binary-search