Inference

Loading a Model

To load a pre-trained model, you can use the StaticModel.from_pretrained method. This method allows you to load models from the Hugging Face Hub or from a local path.

```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-base-8M")
```
Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `path` | `PathLike` | required | The path to load your static model from. |
| `token` | `str \| None` | `None` | The Hugging Face token to use when loading a private model. If not provided, no token is used. |
| `normalize` | `bool \| None` | `None` | Whether to normalize the embeddings after loading. If `True`, embeddings are normalized; if `False`, they are not. If `None`, the model's default behavior applies. |
| `subfolder` | `str \| None` | `None` | The subfolder within the repository or local directory to load the model from. Leave as `None` to load from the root. |
| `quantize_to` | `str \| DType \| None` | `None` | The data type to quantize the model to (e.g., `"float16"` or a `torch.dtype`). If a string is passed, it is converted to the corresponding `DType`. Set to `None` for no quantization. |
| `dimensionality` | `int \| None` | `None` | The dimensionality to load the model at. If `None`, uses the model's inherent dimensionality. Useful when loading a model with reduced dimensions (e.g., trained via PCA or MRL). |
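Conceptually, `quantize_to` casts the stored embedding weights to a smaller dtype, and `dimensionality` keeps only the leading components of each vector. The following is a toy numpy sketch of that idea (random stand-in weights, not the library's actual internals):

```python
import numpy as np

# Toy embedding table: 5 tokens, 8 dimensions (stand-in for a real model's weights).
embeddings = np.random.default_rng(0).normal(size=(5, 8)).astype(np.float32)

# quantize_to="float16": cast the weights to a smaller dtype to save memory.
quantized = embeddings.astype(np.float16)

# dimensionality=4: keep only the first 4 components. This is only meaningful
# when the model was trained so that leading dimensions carry the most
# information (e.g., PCA or MRL), as the table above notes.
truncated = quantized[:, :4]

print(quantized.dtype, truncated.shape)  # float16 (5, 4)
```

In the real API, both effects are requested declaratively at load time, e.g. `StaticModel.from_pretrained("minishlab/potion-base-8M", quantize_to="float16")`.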

Creating Mean Embeddings

To create mean embeddings, you can use the encode method of the StaticModel class. This method allows you to encode a list of sentences into mean embeddings.

```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-base-8M")
embeddings = model.encode(["Hello world", "Static embeddings are great!"])
```
Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `sentences` | `Sequence[str]` | required | The list of sentences to encode. You can also pass a single sentence. |
| `show_progress_bar` | `bool` | `False` | Whether to show the progress bar. |
| `max_length` | `int \| None` | `512` | The maximum length of the sentences. Any tokens beyond this length are truncated. If `None`, no truncation is done. |
| `batch_size` | `int` | `1024` | The batch size to use. |
| `use_multiprocessing` | `bool` | `True` | Whether to use multiprocessing. By default, multiprocessing is enabled for inputs longer than `multiprocessing_threshold` sentences and disabled otherwise. |
| `multiprocessing_threshold` | `int` | `10000` | The threshold, in number of sentences, above which multiprocessing is used. |
| `kwargs` | `Any` | — | Any additional arguments. These are ignored. |
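A mean embedding is just the average of the per-token vectors looked up in the static embedding table. The following numpy sketch illustrates the pooling step with a toy table and hand-picked token ids (the real tokenization is handled for you by `encode`):

```python
import numpy as np

# Toy static embedding table: vocabulary of 4 tokens, 3 dimensions each.
table = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
], dtype=np.float32)

# A "sentence" tokenized to ids 0, 1, 3. Mean pooling looks up each
# token's vector and averages them into one sentence embedding.
token_ids = [0, 1, 3]
sentence_embedding = table[token_ids].mean(axis=0)
# sentence_embedding is a single 3-dimensional vector: (2/3, 2/3, 1/3)
```

Because the model is a lookup table plus a mean, encoding is fast and embarrassingly parallel, which is why `encode` batches inputs and can fan out over multiple processes for large inputs.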

Creating Sequence Embeddings

To create sequence embeddings, you can use the encode_as_sequence method of the StaticModel class. This method encodes a list of sentences into sequence embeddings, which are useful for tasks where you need one embedding per token rather than a single embedding per sentence.

```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-base-8M")
embeddings = model.encode_as_sequence(["Hello world", "Static embeddings are great!"])
```
Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `sentences` | `Sequence[str]` | required | The list of sentences to encode. You can also pass a single sentence. |
| `show_progress_bar` | `bool` | `False` | Whether to show the progress bar. |
| `max_length` | `int \| None` | `512` | The maximum length of the sentences. Any tokens beyond this length are truncated. If `None`, no truncation is done. |
| `batch_size` | `int` | `1024` | The batch size to use. |
| `use_multiprocessing` | `bool` | `True` | Whether to use multiprocessing. By default, multiprocessing is enabled for inputs longer than `multiprocessing_threshold` sentences and disabled otherwise. |
| `multiprocessing_threshold` | `int` | `10000` | The threshold, in number of sentences, above which multiprocessing is used. |
| `kwargs` | `Any` | — | Any additional arguments. These are ignored. |
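The difference between the two encoding modes is only the final pooling step: sequence encoding returns one vector per token, while mean encoding averages those same vectors into one vector per sentence. A toy numpy sketch of the relationship (random stand-in table, hand-picked token ids):

```python
import numpy as np

# Toy embedding table: vocabulary of 10 tokens, 4 dimensions (stand-in weights).
table = np.random.default_rng(0).normal(size=(10, 4)).astype(np.float32)
token_ids = [2, 5, 7]  # a "sentence" of three tokens

# encode_as_sequence keeps one vector per token...
sequence_embeddings = table[token_ids]             # shape (3, 4)

# ...while encode mean-pools those vectors into one sentence vector.
mean_embedding = sequence_embeddings.mean(axis=0)  # shape (4,)

print(sequence_embeddings.shape, mean_embedding.shape)  # (3, 4) (4,)
```

Note that the output of the real `encode_as_sequence` is ragged: each sentence yields an array whose first dimension is that sentence's token count, so the results cannot generally be stacked into one rectangular array.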