About
Who we are
We’re a two-person lab: pringled and stephantul.
We focus on building small, efficient, and effective models for natural language processing. Our goal is to make state-of-the-art NLP accessible to everyone, regardless of hardware budget or deployment constraints.
What you can do with Minish
- Embed at speed: embed the entire English Wikipedia in minutes on commodity hardware.
- Classify on CPU: run document classification at tens of thousands of texts per second.
- Curate large datasets: deduplicate and clean large datasets quickly with semantic tooling.
- Build lightweight retrieval: power fast retrieval systems without large embedding stacks.
- Compare ANN backends: evaluate which nearest-neighbor approach works best for your own data.
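To make the dataset-curation point concrete, here is a minimal sketch of semantic deduplication: embed each text, then keep only texts whose cosine similarity to every previously kept text stays below a threshold. The `embed` function below is a deliberately tiny stand-in (a deterministic hashing bag-of-words), not one of our models; with a real static embedding model the dedup loop stays the same.

```python
import zlib
import numpy as np

def embed(texts, dim=64):
    """Toy deterministic embedder: hash each token into a fixed-size
    count vector, then L2-normalize. A placeholder for a real
    sentence embedder, used here only to keep the sketch self-contained."""
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            vecs[i, zlib.crc32(tok.encode()) % dim] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.clip(norms, 1e-9, None)

def deduplicate(texts, threshold=0.9):
    """Greedy semantic dedup: keep the first occurrence, drop any later
    text whose cosine similarity to a kept text exceeds the threshold.
    Because vectors are unit-normalized, the dot product is the cosine."""
    vecs = embed(texts)
    kept = []
    for i, v in enumerate(vecs):
        if all(v @ vecs[j] < threshold for j in kept):
            kept.append(i)
    return [texts[i] for i in kept]

docs = [
    "the cat sat on the mat",
    "the cat sat on the mat",   # exact duplicate, dropped
    "a dog chased the ball",    # dissimilar, kept
]
unique = deduplicate(docs)
```

The greedy pairwise loop is quadratic in the number of kept texts; at scale you would replace the inner `all(...)` check with an approximate nearest-neighbor index, which is exactly where comparing ANN backends on your own data pays off.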