Model2Vec-rs is a Rust crate providing an efficient implementation for inference with Model2Vec static embedding models.
It is roughly 1.7x faster than the Python version and is designed for high-performance inference in Rust applications.
Quickstart
You can use model2vec-rs in two ways:
- As a library in your Rust projects
- As a standalone Command-Line Interface (CLI) tool for quick inference from the terminal
Using model2vec-rs as a Library
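First, add model2vec-rs as a dependency:
cargo add model2vec-rs
Then, you can use it like this (the example also uses the anyhow crate for error handling, so run cargo add anyhow if it is not already a dependency):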
use anyhow::Result;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    // Load a model from the Hugging Face Hub or a local path.
    // Arguments: (repo_or_path, hf_token, normalize_embeddings, subfolder_in_repo)
    let model = StaticModel::from_pretrained(
        "minishlab/potion-base-8M", // Model ID from Hugging Face or local path to model directory
        None, // Optional: Hugging Face API token for private models
        None, // Optional: bool to override model's default normalization. `None` uses model's config.
        None, // Optional: subfolder if model files are not at the root of the repo/path
    )?;

    let sentences = vec![
        "Hello world".to_string(),
        "Rust is awesome".to_string(),
    ];

    // Generate embeddings using default parameters
    // (Default max_length: Some(512), Default batch_size: 1024)
    let embeddings = model.encode(&sentences);
    // `embeddings` is a Vec<Vec<f32>>
    println!("Generated {} embeddings.", embeddings.len());

    // To generate embeddings with custom arguments:
    let custom_embeddings = model.encode_with_args(
        &sentences,
        Some(256), // Optional: custom max token length for truncation
        512,       // Custom batch size for processing
    );
    println!("Generated {} custom embeddings.", custom_embeddings.len());

    Ok(())
}
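Since `encode` returns a `Vec<Vec<f32>>`, a common next step is comparing two embeddings. The following is a minimal sketch of cosine similarity using only the standard library. The `cosine_similarity` helper is hypothetical and not part of model2vec-rs, and returning 0.0 for mismatched lengths or zero vectors is an illustrative choice rather than anything the crate prescribes.

```rust
/// Hypothetical helper (not part of model2vec-rs): cosine similarity
/// between two embedding vectors.
/// Returns 0.0 when the lengths differ or either vector has zero magnitude;
/// that fallback is an assumption made for this sketch.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    // Dot product of the two vectors.
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    // Euclidean (L2) norms of each vector.
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}
```

With the `embeddings` from the example above, `cosine_similarity(&embeddings[0], &embeddings[1])` would score how similar the two sentences are. If the model produces normalized embeddings, the plain dot product gives the same value.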
Using the model2vec-rs CLI
Install model2vec-rs:
cargo install model2vec-rs
CLI Usage
# Encode a single sentence
model2vec-rs encode-single "Hello world" "minishlab/potion-base-8M"
# Encode multiple lines from a file and save to an output file
echo -e "This is the first sentence.\nThis is another sentence." > my_texts.txt
model2vec-rs encode my_texts.txt "minishlab/potion-base-8M" --output embeddings_output.json
Note: ensure ~/.cargo/bin/ is in your system’s PATH to run model2vec-rs from any directory.