## How it works
Pure semantic (dense) search misses exact-match queries like product codes, names, or technical terms. Pure keyword (sparse) search misses paraphrases and synonyms. MemWire combines both using Qdrant’s hybrid search with reciprocal rank fusion:
- **Dense vectors**: sentence embeddings from `TextEmbedding` (FastEmbed), capturing semantic meaning
- **Sparse vectors**: SPLADE token weights from `SparseTextEmbedding`, capturing exact lexical signal
Both vectors are computed for every stored memory and every query. Results are merged by Qdrant before being returned to MemWire.
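To make the merge step concrete, here is a minimal sketch of reciprocal rank fusion over two ranked id lists. Note that Qdrant performs this fusion server-side; the function below is illustrative only, and `k=60` is the constant commonly used in RRF, not a value taken from MemWire or Qdrant configuration.

```python
def rrf_merge(dense_ids, sparse_ids, k=60):
    """Merge two ranked lists of document ids by reciprocal rank fusion.

    Each document scores 1 / (k + rank) in every list it appears in,
    so items ranked highly by either retriever float to the top.
    """
    scores = {}
    for ranking in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["a", "b", "c"]   # semantic nearest neighbours
sparse = ["c", "a", "d"]  # exact-match / SPLADE hits
print(rrf_merge(dense, sparse))  # → ['a', 'c', 'b', 'd']
```

Document `a` wins because it ranks well in both lists, even though neither retriever placed it unambiguously first.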
## Enabling hybrid search
Hybrid search is on by default. No extra configuration needed:
```python
from memwire import MemWire, MemWireConfig

config = MemWireConfig(qdrant_path="./memwire_data")  # use_hybrid_search=True by default
memory = MemWire(config=config)
```
## Disabling hybrid search
If you want dense-only retrieval (faster, lower memory):
```python
config = MemWireConfig(
    qdrant_path="./memwire_data",
    use_hybrid_search=False,
)
```
Disabling hybrid search means the sparse model is never loaded. This saves ~200 MB of RAM but reduces retrieval quality for exact-match queries.
## Adding a cross-encoder reranker
For the highest retrieval quality, combine hybrid search with a cross-encoder reranker. The reranker re-scores the top candidates using full query-document attention:
```python
config = MemWireConfig(
    qdrant_path="./memwire_data",
    use_hybrid_search=True,
    use_reranking=True,
    reranker_model_name="Xenova/ms-marco-MiniLM-L-6-v2",  # default
)
memory = MemWire(config=config)

results = memory.search("deadline for the project", user_id="alice", top_k=5)
for record, score in results:
    print(f"[{score:.3f}] {record.content}")
```
The reranker is lazy-loaded — it is only downloaded and initialised on the first search() call.
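The lazy-loading behaviour can be sketched as follows. This is a generic pattern, not MemWire's actual internals: the class name and the stand-in scoring logic are illustrative, with the real model download replaced by a placeholder.

```python
class LazyReranker:
    """Sketch of lazy initialisation: the expensive model is only
    constructed on the first rerank() call, not when the object is made."""

    def __init__(self, model_name):
        self.model_name = model_name
        self._model = None  # nothing downloaded or initialised yet

    def _ensure_loaded(self):
        if self._model is None:
            # stand-in for downloading and initialising the cross-encoder
            self._model = f"loaded:{self.model_name}"
        return self._model

    def rerank(self, query, candidates):
        self._ensure_loaded()  # triggered on first use
        # stand-in scoring: rank by word overlap with the query;
        # a real cross-encoder scores each (query, document) pair jointly
        query_words = set(query.split())
        return sorted(candidates,
                      key=lambda c: -len(query_words & set(c.split())))
```

The payoff is that applications which never call `search()` pay no startup or memory cost for the reranker.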
## Models
| Role | Default model | Notes |
|---|---|---|
| Dense embedding | `sentence-transformers/all-MiniLM-L6-v2` | 384-dim, fast |
| Sparse embedding | `prithivida/Splade_PP_en_v1` | SPLADE++ |
| Reranker | `Xenova/ms-marco-MiniLM-L-6-v2` | Cross-encoder, optional |
Swap any model via `MemWireConfig`:

```python
config = MemWireConfig(
    model_name="BAAI/bge-small-en-v1.5",
    sparse_model_name="prithivida/Splade_PP_en_v1",
    reranker_model_name="Xenova/ms-marco-MiniLM-L-6-v2",
    embedding_dim=384,
)
```

When swapping the dense model, `embedding_dim` must match its output dimension (`bge-small-en-v1.5` also produces 384-dimensional vectors, so no change is needed here).
## Configuration reference
| Parameter | Default | Description |
|---|---|---|
| `use_hybrid_search` | `True` | Combine dense and sparse vectors for retrieval. |
| `use_reranking` | `False` | Apply a cross-encoder reranker to top results. |
| `model_name` | `sentence-transformers/all-MiniLM-L6-v2` | Dense embedding model. |
| `sparse_model_name` | `prithivida/Splade_PP_en_v1` | Sparse (SPLADE) embedding model. |
| `reranker_model_name` | `Xenova/ms-marco-MiniLM-L-6-v2` | Cross-encoder reranker model. |
| `embedding_dim` | `384` | Dimension of the dense embedding. |
| `embedding_cache_maxsize` | `10000` | LRU cache size for embedding vectors. |
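The behaviour implied by `embedding_cache_maxsize` can be sketched with a small LRU cache: repeated texts skip recomputation, and the least-recently-used entry is evicted once the cache is full. This is an illustrative stand-in, not MemWire's actual cache implementation.

```python
from collections import OrderedDict

class EmbeddingCache:
    """Sketch of an LRU cache for embedding vectors."""

    def __init__(self, maxsize=10000):
        self.maxsize = maxsize
        self._cache = OrderedDict()

    def get_or_compute(self, text, embed_fn):
        if text in self._cache:
            self._cache.move_to_end(text)  # mark as most recently used
            return self._cache[text]
        vec = embed_fn(text)  # cache miss: compute the embedding
        self._cache[text] = vec
        if len(self._cache) > self.maxsize:
            self._cache.popitem(last=False)  # evict least recently used
        return vec
```

Because queries in real workloads repeat heavily, a cache like this avoids re-running the embedding model for texts seen in the last `maxsize` lookups.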