About

Who we are

We’re a two-person lab: pringled and stephantul.

We focus on building small, efficient, and effective models for natural language processing. Our goal is to make state-of-the-art NLP accessible, regardless of hardware budget or deployment constraints.

What you can do with Minish

  • Embed at speed: embed the entire English Wikipedia in minutes on commodity hardware.
  • Classify on CPU: run document classification at tens of thousands of texts per second.
  • Curate large datasets: deduplicate and clean large datasets quickly with semantic tooling.
  • Build lightweight retrieval: power fast retrieval systems without large embedding stacks.
  • Compare ANN backends: evaluate which nearest-neighbor approach works best for your own data.