Model2Vec-rs
Model2Vec-rs is a Rust crate providing an efficient implementation for inference with Model2Vec static embedding models. It’s ~1.7x faster than the Python version and is designed for high performance inference in Rust applications.
Quickstart
You can utilize model2vec-rs in two ways:
- As a library in your Rust projects
- As a standalone Command-Line Interface (CLI) tool for quick terminal-based inferencing
Using model2vec-rs as a Library
First, add model2vec-rs as a dependency:
cargo add model2vec-rsThen, you can use it like this:
use anyhow::Result;use model2vec_rs::model::StaticModel;
fn main() -> Result<()> { // Load a model from the Hugging Face Hub or a local path. // Arguments: (repo_or_path, hf_token, normalize_embeddings, subfolder_in_repo) let model = StaticModel::from_pretrained( "minishlab/potion-base-8M", // Model ID from Hugging Face or local path to model directory None, // Optional: Hugging Face API token for private models None, // Optional: bool to override model's default normalization. `None` uses model's config. None // Optional: subfolder if model files are not at the root of the repo/path )?;
let sentences = vec![ "Hello world".to_string(), "Rust is awesome".to_string(), ];
// Generate embeddings using default parameters // (Default max_length: Some(512), Default batch_size: 1024) let embeddings = model.encode(&sentences); // `embeddings` is a Vec<Vec<f32>> println!("Generated {} embeddings.", embeddings.len());
// To generate embeddings with custom arguments: let custom_embeddings = model.encode_with_args( &sentences, Some(256), // Optional: custom max token length for truncation 512, // Custom batch size for processing ); println!("Generated {} custom embeddings.", custom_embeddings.len());
Ok(())}Using the model2vec-rs CLI
Install model2vec-rs:
cargo install model2vec-rsCLI Usage
# Encode a single sentencemodel2vec-rs encode-single "Hello world" "minishlab/potion-base-8M"
# Encode multiple lines from a file and save to an output file:**echo -e "This is the first sentence.\nThis is another sentence." > my_texts.txtmodel2vec-rs encode my_texts.txt "minishlab/potion-base-8M" --output embeddings_output.jsonNote: ensure ~/.cargo/bin/ is in your system’s PATH to run model2vec-rs from any directory.