Installation

To distill, make sure you install the `distill` extra.

Distilling a Model from a Sentence Transformer
To distill a model from a Sentence Transformer, you can use the `distill` function. It creates a lightweight static model from any Sentence Transformer, and runs on a CPU in a few minutes.
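Conceptually, distillation embeds every token in the vocabulary once, optionally re-weights and reduces the embeddings, and stores the result as a static lookup table. Below is a minimal, self-contained sketch of that idea using a toy random embedding matrix in place of a real Sentence Transformer; the function and argument names are illustrative only, not the library's API.

```python
import numpy as np

def toy_distill(embeddings, pca_dims=None, sif_coefficient=1e-4, token_probs=None):
    """Toy sketch of static-model distillation: SIF-weight per-token
    embeddings, then optionally reduce dimensionality with PCA."""
    vectors = embeddings.astype(np.float64)

    # SIF weighting: frequent tokens are down-weighted.
    if sif_coefficient is not None and token_probs is not None:
        weights = sif_coefficient / (sif_coefficient + token_probs)
        vectors = vectors * weights[:, None]

    # PCA via SVD of the centered matrix; None skips it entirely,
    # "auto" keeps the full dimensionality.
    if pca_dims is not None:
        dims = vectors.shape[1] if pca_dims == "auto" else pca_dims
        centered = vectors - vectors.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        vectors = centered @ vt[:dims].T

    return vectors

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 32))     # 100 tokens, 32-dim toy "model"
probs = rng.dirichlet(np.ones(100))  # toy token frequencies
static = toy_distill(emb, pca_dims=16, sif_coefficient=1e-4, token_probs=probs)
print(static.shape)  # (100, 16)
```

The real function additionally handles tokenization, device placement, and quantization, but the weight-then-reduce pipeline above is the core of it.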
Parameters
- The model name to use. Any SentenceTransformer-compatible model works.
- The vocabulary to use. If `None`, uses the model's built-in vocabulary.
- The device on which to run distillation (e.g., `"cpu"`, `"cuda"`). If `None`, defaults to the library's device selection logic.
- The number of PCA components to retain. If `None`, PCA is skipped; if `"auto"`, PCA is still applied without reducing dimensionality.
- Deprecated. Previously controlled whether Zipf weighting is applied; this is now controlled via `sif_coefficient`. If `None`, no Zipf weighting is applied.
- The SIF coefficient to use for weighting. Must be ≥ 0 and < 1. If `None`, no weighting is applied.
- A regex pattern. Tokens matching this pattern are removed from the vocabulary before distillation.
- Whether to trust remote code when loading components. If `False`, only components from `transformers` are loaded; if `True`, all remote code is trusted.
- The data type to quantize the distilled model to (e.g., `DType.Float16` or its string equivalent). Defaults to float16 quantization.
- Deprecated. If not `None`, a warning is shown. Does not affect current behavior.