Xiaomi: MiMo-V2-Omni

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

Providers: 1
Released: Mar 18, 2026
Input Modalities: audio, image, text, video
Output Modalities: text
Tarsk Use: coding

Available Providers (1)

Provider      Model ID             Input Cost  Output Cost  Context  Max Output
Kilo Gateway  xiaomi/mimo-v2-omni  $0.40/MTok  $2.00/MTok   262.1K   65.5K
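
As a rough illustration of how the listed Kilo Gateway prices translate into per-request cost, here is a minimal Python sketch. It assumes "MTok" means one million tokens and that billing is linear in token count; the function name `estimate_cost_usd` and the constant names are hypothetical and not part of any provider API.

```python
from decimal import Decimal

# Prices copied from the provider listing (assumed to be USD per 1,000,000 tokens).
INPUT_USD_PER_MTOK = Decimal("0.40")
OUTPUT_USD_PER_MTOK = Decimal("2.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming simple linear pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # 100,000 input tokens plus 10,000 output tokens comes to 0.06 USD.
    print(estimate_cost_usd(100_000, 10_000))
```

Output tokens cost five times as much as input tokens at these rates, so long generations dominate the bill even when the prompt is large.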

Capabilities

Reasoning
Tool Calling
Attachments
Open Weights
Structured Output