FastEmbed is MemWire’s default embedding backend. Models run entirely on-device — no OpenAI key, no network calls, no per-token cost. Two model types are used:
| Type | Purpose |
| --- | --- |
| Dense model | Semantic similarity — powers memory search and recall |
| Sparse model | Keyword matching — used alongside dense vectors in hybrid search |
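The docs don't spell out how MemWire merges the two result lists, but a common fusion strategy for hybrid search is reciprocal rank fusion (RRF). The sketch below is illustrative only and is not taken from MemWire's internals:

```python
def reciprocal_rank_fusion(dense_hits, sparse_hits, k=60):
    """Merge two ranked lists of doc IDs into one hybrid ranking.

    RRF scores each document by 1 / (k + rank) in every list that
    contains it, then sorts by the summed score. Illustrative only;
    MemWire may use a different fusion strategy internally.
    """
    scores = {}
    for hits in (dense_hits, sparse_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["a", "b", "c"]   # ranked semantic matches
sparse = ["b", "d", "a"]  # ranked keyword matches
print(reciprocal_rank_fusion(dense, sparse))  # ['b', 'a', 'd', 'c']
```

Documents that appear high in both lists ("b" and "a" here) outrank documents found by only one retriever.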

## Default models

No configuration is needed to get started — MemWire ships with sensible defaults. The snippet below spells those defaults out explicitly:

```python
from memwire import MemWire, MemWireConfig

config = MemWireConfig(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    embedding_dim=384,
    sparse_model_name="prithivida/Splade_PP_en_v1",
)
memory = MemWire(config=config)
```
| Model | Type | Dimensions | Notes |
| --- | --- | --- | --- |
| sentence-transformers/all-MiniLM-L6-v2 | Dense | 384 | Fast, multilingual-friendly, good general-purpose baseline |
| prithivida/Splade_PP_en_v1 | Sparse | n/a | SPLADE sparse model for hybrid BM25+vector search |

## Changing the dense model

You can swap the dense model for any FastEmbed-compatible model. Make sure embedding_dim matches the model’s output dimension.
```python
# Smaller — faster inference, slightly lower accuracy
config = MemWireConfig(
    model_name="BAAI/bge-small-en-v1.5",
    embedding_dim=384,
)

# Balanced — good quality, moderate size
config = MemWireConfig(
    model_name="BAAI/bge-base-en-v1.5",
    embedding_dim=768,
)

# Larger — best quality, slower inference
config = MemWireConfig(
    model_name="BAAI/bge-large-en-v1.5",
    embedding_dim=1024,
)
```
If you change the embedding model after data has already been stored, existing vectors will be incompatible with new embeddings. Reset your Qdrant storage when switching models.

If you want dense-only retrieval and don’t need the sparse model to load:

```python
config = MemWireConfig(
    use_hybrid_search=False,
)
```

## Enabling reranking

A cross-encoder reranker re-scores the top candidates after initial retrieval. This improves precision at the cost of slightly higher latency.
```python
config = MemWireConfig(
    use_reranking=True,
    reranker_model_name="Xenova/ms-marco-MiniLM-L-6-v2",
)
```
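Conceptually, reranking is a second-pass sort of the retrieved candidates. The sketch below uses a toy word-overlap scorer as a stand-in for the cross-encoder; it is an illustration of the pattern, not MemWire's implementation:

```python
def rerank(query, candidates, score_fn, top_k=5):
    """Re-score retrieval candidates and keep the best top_k.

    score_fn(query, text) -> float stands in for the cross-encoder,
    which scores each (query, candidate) pair jointly. Illustrative
    sketch only, not MemWire's internal reranking code.
    """
    scored = [(score_fn(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]

# Toy scorer: number of words shared between query and candidate.
def overlap(query, text):
    return len(set(query.split()) & set(text.split()))

hits = ["the cat sat", "dogs bark loudly", "a cat and a dog"]
print(rerank("cat dog", hits, overlap, top_k=2))
# ['a cat and a dog', 'the cat sat']
```

The initial retriever optimizes for recall over the whole store; the reranker only has to order a handful of candidates, which is why a heavier pairwise model is affordable here.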

## Configuration reference

| Parameter | Default | Description |
| --- | --- | --- |
| `model_name` | `sentence-transformers/all-MiniLM-L6-v2` | Dense embedding model. Must be FastEmbed-compatible. |
| `embedding_dim` | `384` | Output dimension of the dense model. Must match the model. |
| `sparse_model_name` | `prithivida/Splade_PP_en_v1` | Sparse model for hybrid BM25+vector search. |
| `use_hybrid_search` | `True` | Combine dense and sparse vectors. Set to `False` for dense-only search. |
| `use_reranking` | `False` | Apply a cross-encoder reranker to re-score top candidates. |
| `reranker_model_name` | `Xenova/ms-marco-MiniLM-L-6-v2` | Cross-encoder reranker model (used when `use_reranking=True`). |
| `embedding_cache_maxsize` | `10000` | LRU cache size for embedding vectors. Increase for large-scale workloads. |
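The effect of `embedding_cache_maxsize` is that repeated texts are embedded once and then served from an LRU cache. The idea can be sketched with `functools.lru_cache` and a stub embedder (the stub is hypothetical; a real backend would run the dense model):

```python
from functools import lru_cache

CALLS = {"count": 0}

@lru_cache(maxsize=10_000)  # mirrors embedding_cache_maxsize
def embed(text: str) -> tuple:
    """Stub embedder; counts how often the 'model' actually runs."""
    CALLS["count"] += 1
    return (float(len(text)),)  # placeholder vector

embed("hello")
embed("hello")         # cache hit: the model is not re-run
print(CALLS["count"])  # 1
```

With long-running agents that revisit the same memories, a large cache trades memory for a substantial reduction in embedding calls.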