# Inference

## Loading a Model
To load a pre-trained model, you can use the StaticModel.from_pretrained method. This method allows you to load models from the Hugging Face Hub or from a local path.
```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-base-8M")
```

### Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | PathLike | required | The path to load your static model from. |
| token | str \| None | None | The Hugging Face token to use when loading a private model. If not provided, no token is used. |
| normalize | bool \| None | None | Whether to normalize the embeddings after loading. If True, embeddings are normalized; if False, they are not. If None, the model's default behavior applies. |
| subfolder | str \| None | None | The subfolder within the repository or local directory to load the model from. Leave as None to load from the root. |
| quantize_to | str \| DType \| None | None | The data type to quantize the model to (e.g., "float16" or a torch.dtype). If a string is passed, it is converted to the corresponding DType. Set to None for no quantization. |
| dimensionality | int \| None | None | The dimensionality to load the model at. If None, the model's inherent dimensionality is used. Useful when loading a model with reduced dimensions (e.g., trained via PCA or MRL). |
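To see why `quantize_to` is worth reaching for, here is a minimal NumPy sketch (not model2vec internals) of what float16 quantization does to a static embedding matrix: the memory footprint halves, while the round-trip error stays small. The matrix shape below is arbitrary.

```python
import numpy as np

# Illustration only: a made-up float32 embedding matrix.
embeddings = np.random.default_rng(0).normal(size=(1000, 256)).astype(np.float32)

# Quantize to float16: half the bytes per value.
quantized = embeddings.astype(np.float16)
print(embeddings.nbytes // quantized.nbytes)  # -> 2

# The round-trip error is small relative to the values themselves.
round_trip_error = np.abs(embeddings - quantized.astype(np.float32)).max()
print(float(round_trip_error) < 1e-2)  # -> True
```

For retrieval-style workloads this trade-off is usually invisible in downstream quality, which is why float16 is a common default choice for stored embeddings.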
## Creating Mean Embeddings
To create mean embeddings, you can use the encode method of the StaticModel class. This method allows you to encode a list of sentences into mean embeddings.
```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-base-8M")
embeddings = model.encode(["Hello world", "Static embeddings are great!"])
```

### Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| sentences | Sequence[str] | required | The list of sentences to encode. You can also pass a single sentence. |
| show_progress_bar | bool | False | Whether to show the progress bar. |
| max_length | int \| None | 512 | The maximum length of the sentences in tokens. Any tokens beyond this length are truncated. If None, no truncation is done. |
| batch_size | int | 1024 | The batch size to use. |
| use_multiprocessing | bool | True | Whether to use multiprocessing. By default, this is enabled for inputs with more than multiprocessing_threshold sentences and disabled otherwise. |
| multiprocessing_threshold | int | 10000 | The threshold, in number of sentences, above which multiprocessing is used. |
| **kwargs | Any | - | Any additional arguments. These are ignored. |
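Conceptually, a mean embedding averages the per-token static vectors of a sentence into one fixed-size vector, so the output shape no longer depends on sentence length. A minimal NumPy sketch of that pooling step (assumed mechanism, not model2vec internals):

```python
import numpy as np

# Made-up static vectors for a 3-token sentence, embedding dimension 2.
token_vectors = np.array([
    [1.0, 2.0],  # token 1
    [3.0, 4.0],  # token 2
    [5.0, 6.0],  # token 3
])

# Mean pooling: average over the token axis to get one sentence vector.
sentence_embedding = token_vectors.mean(axis=0)
print(sentence_embedding)  # -> [3. 4.]
```

Whatever the sentence length, the result is always a single vector of the model's dimensionality, which is what makes the output of `encode` directly usable for similarity search or classification.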
## Creating Sequence Embeddings
To create sequence embeddings, you can use the encode_as_sequence method of the StaticModel class. This method encodes a list of sentences into sequence embeddings, which give you one embedding per token rather than one per sentence. This is useful for token-level tasks.
```python
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-base-8M")
embeddings = model.encode_as_sequence(["Hello world", "Static embeddings are great!"])
```

### Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| sentences | Sequence[str] | required | The list of sentences to encode. You can also pass a single sentence. |
| show_progress_bar | bool | False | Whether to show the progress bar. |
| max_length | int \| None | 512 | The maximum length of the sentences in tokens. Any tokens beyond this length are truncated. If None, no truncation is done. |
| batch_size | int | 1024 | The batch size to use. |
| use_multiprocessing | bool | True | Whether to use multiprocessing. By default, this is enabled for inputs with more than multiprocessing_threshold sentences and disabled otherwise. |
| multiprocessing_threshold | int | 10000 | The threshold, in number of sentences, above which multiprocessing is used. |
| **kwargs | Any | - | Any additional arguments. These are ignored. |
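The key difference from `encode` is the output shape: a sequence embedding keeps one vector per token, so its length varies with the sentence. A small NumPy sketch with made-up shapes (hypothetical illustration, not model2vec output):

```python
import numpy as np

# Pretend sequence outputs for a 2-token and a 5-token sentence,
# embedding dimension 4. Shapes are made up for illustration.
rng = np.random.default_rng(0)
sequence_outputs = [rng.normal(size=(n, 4)) for n in (2, 5)]
print([seq.shape for seq in sequence_outputs])  # -> [(2, 4), (5, 4)]

# Averaging a sequence output over its tokens collapses it back to a
# single fixed-size vector, mirroring what a mean embedding looks like.
pooled = [seq.mean(axis=0) for seq in sequence_outputs]
print([vec.shape for vec in pooled])  # -> [(4,), (4,)]
```

Because lengths differ per sentence, sequence embeddings cannot be stacked into one rectangular array without padding; plan downstream code accordingly.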