FastEmbed is MemWire’s default embedding backend. Models run entirely on-device — no OpenAI key, no network calls, no per-token cost.
Two model types are used:
| Type | Purpose |
|---|---|
| Dense model | Semantic similarity — powers memory search and recall |
| Sparse model | Keyword matching — used alongside dense vectors in hybrid search |
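To make the division of labor concrete, here is a minimal sketch of how a dense ranking and a sparse ranking can be merged into one result list. The fusion scheme shown (reciprocal rank fusion) is an illustrative assumption for explanation — MemWire's internal merging strategy may differ:

```python
# Illustrative reciprocal rank fusion (RRF) of two ranked result lists.
# This is NOT MemWire's internal algorithm; it is a common, simple way
# to combine dense (semantic) and sparse (keyword) rankings.

def rrf_fuse(dense_ranked, sparse_ranked, k=60):
    """Merge two ranked lists of document IDs into one fused ranking."""
    scores = {}
    for ranking in (dense_ranked, sparse_ranked):
        for rank, doc_id in enumerate(ranking):
            # Documents ranked highly in either list accumulate more score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["a", "b", "c"]   # order by semantic similarity
sparse = ["b", "d", "a"]  # order by keyword match
print(rrf_fuse(dense, sparse))  # → ['b', 'a', 'd', 'c']
```

Document `b` wins because it appears near the top of both lists, which is exactly the behavior hybrid search is meant to reward.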
## Default models
No configuration needed to get started — MemWire ships with sensible defaults.
```python
from memwire import MemWire, MemWireConfig

config = MemWireConfig(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    embedding_dim=384,
    sparse_model_name="prithivida/Splade_PP_en_v1",
)
memory = MemWire(config=config)
```
| Model | Type | Dimensions | Notes |
|---|---|---|---|
| `sentence-transformers/all-MiniLM-L6-v2` | Dense | 384 | Fast, multilingual-friendly, good general-purpose baseline |
| `prithivida/Splade_PP_en_v1` | Sparse | — | SPLADE sparse model for hybrid BM25+vector search |
## Changing the dense model

You can swap the dense model for any FastEmbed-compatible model. Make sure `embedding_dim` matches the model's output dimension.
```python
# Smaller — faster inference, slightly lower accuracy
config = MemWireConfig(
    model_name="BAAI/bge-small-en-v1.5",
    embedding_dim=384,
)

# Balanced — good quality, moderate size
config = MemWireConfig(
    model_name="BAAI/bge-base-en-v1.5",
    embedding_dim=768,
)

# Larger — best quality, slower inference
config = MemWireConfig(
    model_name="BAAI/bge-large-en-v1.5",
    embedding_dim=1024,
)
```
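A mismatched `embedding_dim` typically only surfaces later, at insert or search time. A small sanity check can catch it up front. The model-to-dimension lookup below uses the models listed on this page; the helper itself is illustrative and not part of MemWire's API:

```python
# Known output dimensions for the dense models mentioned above.
# Illustrative helper — not part of MemWire's API.
KNOWN_DIMS = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "BAAI/bge-small-en-v1.5": 384,
    "BAAI/bge-base-en-v1.5": 768,
    "BAAI/bge-large-en-v1.5": 1024,
}

def check_embedding_dim(model_name, embedding_dim):
    """Raise early if embedding_dim contradicts the model's known output size."""
    expected = KNOWN_DIMS.get(model_name)
    if expected is not None and expected != embedding_dim:
        raise ValueError(
            f"{model_name} outputs {expected}-dim vectors, "
            f"but embedding_dim={embedding_dim}"
        )

check_embedding_dim("BAAI/bge-base-en-v1.5", 768)  # OK — no exception
```

Unknown model names pass through silently, since FastEmbed supports more models than this short list.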
If you change the embedding model after data has already been stored, existing vectors will be incompatible with new embeddings. Reset your Qdrant storage when switching models.
## Disabling hybrid search
If you want dense-only retrieval and don’t need the sparse model to load:
```python
config = MemWireConfig(
    use_hybrid_search=False,
)
```
## Enabling reranking
A cross-encoder reranker re-scores the top candidates after initial retrieval. This improves precision at the cost of slightly higher latency.
```python
config = MemWireConfig(
    use_reranking=True,
    reranker_model_name="Xenova/ms-marco-MiniLM-L-6-v2",
)
```
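The retrieve-then-rerank flow works in two stages: the embedding model returns a broad candidate set cheaply, then the cross-encoder scores each (query, candidate) pair jointly and the best survivors are kept. MemWire does this internally when `use_reranking=True`; the sketch below only illustrates the shape of the second stage, with a toy word-overlap function standing in for a real cross-encoder:

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Re-score candidates with score_fn and keep the top_k best."""
    scored = [(score_fn(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]

# Toy stand-in for a cross-encoder: counts shared words.
# A real cross-encoder jointly encodes the query and candidate text.
def overlap_score(query, doc):
    return len(set(query.split()) & set(doc.split()))

hits = ["the cat sat", "dogs bark loudly", "a cat and a dog"]
print(rerank("cat dog", hits, overlap_score, top_k=2))
# → ['a cat and a dog', 'the cat sat']
```

The extra pass over the top candidates is what adds the latency mentioned above — the reranker runs once per candidate rather than once per query.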
## Configuration reference
| Parameter | Default | Description |
|---|---|---|
| `model_name` | `sentence-transformers/all-MiniLM-L6-v2` | Dense embedding model. Must be FastEmbed-compatible. |
| `embedding_dim` | `384` | Output dimension of the dense model. Must match the model. |
| `sparse_model_name` | `prithivida/Splade_PP_en_v1` | Sparse model for hybrid BM25+vector search. |
| `use_hybrid_search` | `True` | Combine dense and sparse vectors. Set to `False` for dense-only search. |
| `use_reranking` | `False` | Apply a cross-encoder reranker to re-score top candidates. |
| `reranker_model_name` | `Xenova/ms-marco-MiniLM-L-6-v2` | Cross-encoder reranker model (used when `use_reranking=True`). |
| `embedding_cache_maxsize` | `10000` | LRU cache size for embedding vectors. Increase for large-scale workloads. |