HypEmbed

Local-first text-embedding inference in Rust

HypEmbed loads local BERT-family weights, tokenizes text, runs the full encoder stack in Rust, and returns vectors ready for search, retrieval, and ranking pipelines.

Install

cargo add hypembed

Why HypEmbed

  • Pure-Rust inference from tokenizer to pooling layer
  • No Python, ONNX Runtime, libtorch, or hosted API dependency
  • Supports BERT-style encoder models such as MiniLM and DistilBERT
  • Stable numerics, typed errors, and a compact public API

Current Scope

  • Load local `config.json`, `vocab.txt`, and `model.safetensors`
  • Mean pooling and CLS pooling
  • F32, F16, and BF16 weights converted to `f32` for inference
  • CPU-only execution today
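The Mean pooling option listed above averages the encoder's per-token vectors into a single sentence vector. A standalone sketch of that computation (the `mean_pool` function here is illustrative, not part of the HypEmbed API):

```rust
/// Average per-token vectors into one fixed-size sentence vector.
/// `token_vecs` holds one f32 vector per token; all must share a dimension.
fn mean_pool(token_vecs: &[Vec<f32>]) -> Vec<f32> {
    let dim = token_vecs[0].len();
    let mut out = vec![0.0f32; dim];
    // Sum element-wise across tokens.
    for v in token_vecs {
        for (o, x) in out.iter_mut().zip(v) {
            *o += x;
        }
    }
    // Divide by the token count to get the mean.
    let n = token_vecs.len() as f32;
    out.iter_mut().for_each(|o| *o /= n);
    out
}
```

CLS pooling, by contrast, simply takes the first token's vector and skips the averaging step.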
Quickstart

```rust
use hypembed::{Embedder, EmbeddingOptions, PoolingStrategy};

// Load config.json, vocab.txt, and model.safetensors from a local directory.
let model = Embedder::load("./model")?;

// Mean-pool token vectors and L2-normalize the result.
let options = EmbeddingOptions::default()
    .with_pooling(PoolingStrategy::Mean)
    .with_normalize(true);

// Returns one embedding vector per input string.
let embeddings = model.embed(
    &["hello world", "rust embeddings"],
    &options,
)?;
```
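With `with_normalize(true)`, embeddings come back unit-length, so similarity ranking reduces to a dot product. A hypothetical downstream helper (this `cosine` function is not part of HypEmbed; it assumes each embedding is an `f32` slice):

```rust
/// Cosine similarity between two embedding vectors.
/// For unit-length (normalized) vectors this equals the plain dot product.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}
```

Scores near 1.0 indicate near-identical direction; near 0.0, unrelated texts.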