## Why HypEmbed
- Pure-Rust inference from tokenizer to pooling layer
- No Python, ONNX Runtime, libtorch, or hosted API dependency
- Supports BERT-style encoder models such as MiniLM and DistilBERT
- Stable numerics, typed errors, and a compact public API
HypEmbed loads local BERT-family weights, tokenizes text, runs the full encoder stack in Rust, and returns vectors ready for search, retrieval, and ranking pipelines.
```sh
cargo add hypembed
```
```rust
use hypembed::{Embedder, EmbeddingOptions, PoolingStrategy};

let model = Embedder::load("./model")?;
let options = EmbeddingOptions::default()
    .with_pooling(PoolingStrategy::Mean)
    .with_normalize(true);
let embeddings = model.embed(
    &["hello world", "rust embeddings"],
    &options,
)?;
```
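To make `PoolingStrategy::Mean` and `with_normalize(true)` concrete, here is a standalone sketch of what those two steps compute: averaging per-token hidden states into one sentence vector, then scaling it to unit length so cosine similarity between embeddings reduces to a dot product. This is illustrative only and does not use the crate's internals; the function names are hypothetical.

```rust
// Illustrative sketch (no hypembed dependency): mean pooling followed by
// L2 normalization. Function names are hypothetical, not the crate's API.

/// Average per-token hidden states into a single sentence vector.
fn mean_pool(token_vectors: &[Vec<f32>]) -> Vec<f32> {
    let dim = token_vectors[0].len();
    let mut pooled = vec![0.0f32; dim];
    for v in token_vectors {
        for (p, x) in pooled.iter_mut().zip(v) {
            *p += x;
        }
    }
    let n = token_vectors.len() as f32;
    pooled.iter_mut().for_each(|p| *p /= n);
    pooled
}

/// Scale a vector to unit length; cosine similarity of two unit vectors
/// is then just their dot product.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
}

fn main() {
    // Two fake 2-dimensional token vectors standing in for encoder output.
    let tokens = vec![vec![1.0, 3.0], vec![3.0, 1.0]];
    let mut pooled = mean_pool(&tokens); // -> [2.0, 2.0]
    l2_normalize(&mut pooled);
    println!("{:?}", pooled);
}
```

With normalization enabled, downstream ranking can score candidates with a plain dot product instead of a full cosine computation.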
- Overview, installation, and model directory conventions.
- **Architecture** (Internals): Tensor engine, tokenizer, model layout, and execution pipeline.
- **Roadmap** (Direction): Near-term performance work and longer-term model support plans.
- **API Docs** (Reference): Generated Rust API documentation published with every push.