
Model2Vec-rs

Model2Vec-rs is a Rust crate for efficient inference with Model2Vec static embedding models. It is roughly 1.7x faster than the Python version and is designed for high-performance inference in Rust applications.

Quickstart

You can use model2vec-rs in two ways:

  1. As a library in your Rust projects
  2. As a standalone command-line interface (CLI) tool for quick inference from the terminal

Using model2vec-rs as a Library

First, add model2vec-rs as a dependency:

cargo add model2vec-rs
# The example below also uses anyhow for error handling
cargo add anyhow

Then, you can use it like this:

use anyhow::Result;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    // Load a model from the Hugging Face Hub or a local path.
    // Arguments: (repo_or_path, hf_token, normalize_embeddings, subfolder_in_repo)
    let model = StaticModel::from_pretrained(
        "minishlab/potion-base-8M", // Model ID from Hugging Face or local path to model directory
        None, // Optional: Hugging Face API token for private models
        None, // Optional: bool to override model's default normalization. `None` uses model's config.
        None, // Optional: subfolder if model files are not at the root of the repo/path
    )?;

    let sentences = vec![
        "Hello world".to_string(),
        "Rust is awesome".to_string(),
    ];

    // Generate embeddings using default parameters
    // (Default max_length: Some(512), Default batch_size: 1024)
    let embeddings = model.encode(&sentences);
    // `embeddings` is a Vec<Vec<f32>>
    println!("Generated {} embeddings.", embeddings.len());

    // To generate embeddings with custom arguments:
    let custom_embeddings = model.encode_with_args(
        &sentences,
        Some(256), // Optional: custom max token length for truncation
        512,       // Custom batch size for processing
    );
    println!("Generated {} custom embeddings.", custom_embeddings.len());

    Ok(())
}
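
Because `encode` returns a `Vec<Vec<f32>>`, a common next step is comparing embeddings. The sketch below is not part of model2vec-rs: `cosine_similarity` and `most_similar` are hypothetical helper names introduced here for illustration, and returning 0.0 for a zero-norm vector is a convention chosen for this example. It uses only the Rust standard library.

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Hypothetical helper, not part of the model2vec-rs API.
/// Returns 0.0 if either vector has zero norm (a convention chosen for this sketch).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Index and score of the candidate most similar to `query`,
/// or `None` if `candidates` is empty. Also a hypothetical helper.
fn most_similar(query: &[f32], candidates: &[Vec<f32>]) -> Option<(usize, f32)> {
    candidates
        .iter()
        .enumerate()
        .map(|(i, c)| (i, cosine_similarity(query, c)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
}
```

With the `embeddings` value from the example above, `cosine_similarity(&embeddings[0], &embeddings[1])` would score the two sentences against each other. If the model produces normalized embeddings, the norms are approximately 1 and the result reduces to a plain dot product.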

Using the model2vec-rs CLI

Install model2vec-rs:

cargo install model2vec-rs

CLI Usage

# Encode a single sentence
model2vec-rs encode-single "Hello world" "minishlab/potion-base-8M"
# Encode multiple lines from a file and save to an output file
echo -e "This is the first sentence.\nThis is another sentence." > my_texts.txt
model2vec-rs encode my_texts.txt "minishlab/potion-base-8M" --output embeddings_output.json

Note: ensure ~/.cargo/bin/ is in your system’s PATH to run model2vec-rs from any directory.