Overview
Overview of the packages we are currently working on
- Model2Vec (docs, repo): Create state-of-the-art static embedding models by distilling Sentence Transformers.
- SemHash (docs, repo): Fast semantic text deduplication, outlier detection, and representative sampling.
- Vicinity (docs, repo): A lightweight library for efficient nearest neighbor search that supports various backends.
- Tokenlearn (docs, repo): Our method to pre-train static embedding models.
- Model2Vec-rs (docs, repo): Rust-native implementation of Model2Vec for high performance.