- Model2Vec (docs, repo): Create state-of-the-art static embedding models by distilling Sentence Transformers.
- SemHash (docs, repo): Fast semantic text deduplication, outlier detection, and representative sampling.
- Vicinity (docs, repo): A lightweight library for efficient nearest neighbor search that supports various backends.
- Tokenlearn (docs, repo): Our method for pre-training static embedding models.
- Model2Vec-rs (docs, repo): A Rust-native implementation of Model2Vec, built for high performance.