noni — small models. working together. on your iPhone.

on-device AI orchestration — several small open-weight models collaborate locally, and nothing you type ever leaves your phone.

coming to the App Store — free, iPhone, iOS 17+

private by design

Zero data collection — no account, no analytics, no tracking. Conversations are stored only on your device. The network is used for one thing: downloading model weights from Hugging Face.

a team, not a single model

A tiny router reads each request and hands it to the right specialist — general chat, coding, or writing. Relay mode chains models into pipelines like draft → polish; council mode has two models answer while a third merges the best of both.
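The three patterns above can be sketched in a few lines of Python. This is only an illustration of the idea, not noni's actual code: every name here (`route`, `relay`, `council`, `keyword_classifier`, the `specialists` table) is hypothetical, and the "models" are stand-in functions that tag their input instead of running real weights.

```python
from typing import Callable, Dict, List

# Assumption: a "model" is any function from prompt text to reply text.
Model = Callable[[str], str]


def route(request: str, specialists: Dict[str, Model],
          classify: Callable[[str], str]) -> str:
    """Router: a small classifier picks one specialist, which then answers."""
    label = classify(request)
    # Fall back to general chat if the classifier returns an unknown label.
    model = specialists.get(label, specialists["chat"])
    return model(request)


def relay(request: str, stages: List[Model]) -> str:
    """Relay mode: each stage receives the previous stage's output."""
    text = request
    for stage in stages:
        text = stage(text)
    return text


def council(request: str, first: Model, second: Model, merger: Model) -> str:
    """Council mode: two models answer, a third merges both answers."""
    answer_a = first(request)
    answer_b = second(request)
    return merger(f"Question: {request}\nAnswer A: {answer_a}\nAnswer B: {answer_b}")


# Stand-in classifier (hypothetical): a real router would be a small model.
def keyword_classifier(request: str) -> str:
    lowered = request.lower()
    if "code" in lowered or "function" in lowered:
        return "coding"
    if "essay" in lowered or "email" in lowered:
        return "writing"
    return "chat"


# Stand-in specialists that just tag the prompt with their role.
specialists: Dict[str, Model] = {
    "chat": lambda p: f"[chat] {p}",
    "coding": lambda p: f"[coding] {p}",
    "writing": lambda p: f"[writing] {p}",
}

draft: Model = lambda p: f"draft({p})"
polish: Model = lambda p: f"polish({p})"
```

With these stand-ins, `route("write a function to reverse a list", specialists, keyword_classifier)` goes to the coding specialist, and `relay("hi", [draft, polish])` applies draft first and polish second. Treating every model as a plain text-to-text function is a design choice of this sketch: it lets the same model serve as a specialist, a relay stage, or a council member.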

yours, offline

After a model is downloaded, it works in airplane mode. Models are free, open-weight, and quantized to 4-bit for Apple silicon — small enough to live on your phone, quick enough to stream.

a noni chat answering a coding question with a copyable code card — the coding specialist answering, entirely on-device

the starter pack — qwen3 0.6b as router and llama 3.2 1b for everyday questions — is about 1.1 GB. gemma 3, qwen2.5 coder and larger llama models are one tap away.