small models. working together. on your iPhone.
on-device AI orchestration — several small open-weight models collaborate locally, and nothing you type ever leaves your phone.
coming to the App Store: free, iPhone, iOS 17+
private by design
Zero data collection — no account, no analytics, no tracking. Conversations are stored only on your device. The network is used for one thing: downloading model weights from Hugging Face.
a team, not a single model
A tiny router reads each request and hands it to the right specialist — general chat, coding, or writing. Relay mode chains models into pipelines like draft → polish; council mode has two models answer while a third merges the best of both.
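As a rough illustration of the three modes described above, here is a minimal sketch in Python. It is not the app's actual code: the stub models, the keyword-based `route` function, and the names `answer`, `relay`, and `council` are all hypothetical. In particular, the real router is a small model, not a keyword list.

```python
from typing import Callable, Dict, List

# Assumption: a "model" is any function from a prompt to generated text.
Model = Callable[[str], str]


def make_stub(name: str) -> Model:
    """Stand-in for an on-device model; it only tags the prompt with its name."""
    return lambda prompt: f"[{name}] {prompt}"


# Hypothetical specialists matching the three roles named in the text.
SPECIALISTS: Dict[str, Model] = {
    "chat": make_stub("chat"),
    "coding": make_stub("coding"),
    "writing": make_stub("writing"),
}


def route(request: str) -> str:
    """Toy keyword router that picks a specialist name for a request."""
    text = request.lower()
    if any(word in text for word in ("code", "bug", "function")):
        return "coding"
    if any(word in text for word in ("essay", "email", "rewrite")):
        return "writing"
    return "chat"


def answer(request: str) -> str:
    """Router mode: one specialist handles the whole request."""
    return SPECIALISTS[route(request)](request)


def relay(request: str, stages: List[Model]) -> str:
    """Relay mode: each stage receives the previous stage's output."""
    text = request
    for stage in stages:
        text = stage(text)
    return text


def council(request: str, first: Model, second: Model, merger: Model) -> str:
    """Council mode: two models answer, a third merges both answers."""
    candidates = f"{first(request)}\n{second(request)}"
    return merger(candidates)
```

With these stubs, `relay("x", [make_stub("draft"), make_stub("polish")])` passes the draft stage's output into the polish stage, which mirrors the draft → polish pipeline mentioned above.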
yours, offline
After a model is downloaded, it works in airplane mode. Models are free, open-weight, and quantized to 4-bit for Apple silicon — small enough to live on your phone, quick enough to stream.
The starter pack (Qwen3 0.6B as router and Llama 3.2 1B for everyday questions) is about 1.1 GB. Gemma 3, Qwen2.5 Coder, and larger Llama models are one tap away.
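For a sense of where a figure like 1.1 GB comes from, here is a back-of-the-envelope estimate in Python. It assumes the parameter counts implied by the model names (0.6 billion and 1 billion) and exactly 4 bits per weight. Both are simplifications, and `estimate_gb` is a hypothetical helper, not part of the app.

```python
def estimate_gb(param_counts, bits_per_weight=4):
    """Lower-bound download size in GB: parameters * bits / 8, in units of 10^9 bytes."""
    total_bytes = sum(param_counts) * bits_per_weight / 8
    return total_bytes / 1e9


# Nominal sizes taken from the model names; real parameter counts may differ.
starter_pack = [0.6e9, 1.0e9]
print(f"{estimate_gb(starter_pack):.1f} GB")
```

The estimate is a floor, not the real download size. Quantized files also carry per-group scale values, tokenizer files, and sometimes tensors kept at higher precision, which is consistent with the starter pack being about 1.1 GB.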