Kimi-K2.5
Version: 1
Moonshot AI
Last updated February 2026
Tags: Reasoning, Multilingual

Direct from Azure models

Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
  • Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
  • Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure, all as part of one Microsoft Foundry platform.
  • Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
  • Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Learn more about Direct from Azure models.

Key capabilities

About this model

Kimi K2.5 is an open-source, native multimodal agentic model built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base. It seamlessly integrates vision and language understanding with advanced agentic capabilities, and supports both instant and thinking modes as well as conversational and agentic paradigms.

Key model capabilities

  • Native Multimodality: Pre-trained on vision–language tokens, K2.5 excels in visual knowledge, cross-modal reasoning, and agentic tool use grounded in visual inputs.
  • Coding with Vision: K2.5 generates code from visual specifications (UI designs, video workflows) and autonomously orchestrates tools for visual data processing.
  • Agent Swarm: K2.5 transitions from single-agent scaling to a self-directed, coordinated swarm-like execution scheme. It decomposes complex tasks into parallel sub-tasks executed by dynamically instantiated, domain-specific agents.
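As a minimal sketch of how a client might exercise the coding-with-vision capability described above, the snippet below assembles an OpenAI-style chat-completions payload that pairs an image with a coding instruction. The message shape, field names, and model identifier follow common chat-completions conventions and are assumptions, not a documented Kimi K2.5 API.

```python
import json

def build_vision_request(image_url: str, instruction: str,
                         model: str = "Kimi-K2.5") -> str:
    """Build a chat-completions-style JSON body that pairs an image with a
    text instruction. The message shape follows the common OpenAI-style
    convention; the model name is a placeholder, not a confirmed deployment id."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return json.dumps(payload)

# Example: ask the model to turn a UI mockup into code.
body = build_vision_request(
    "https://example.com/mockup.png",
    "Generate the HTML/CSS for this UI design.",
)
```

The same multi-part `content` list generalizes to other visual inputs (charts, screenshots, document scans) that the model grounds its tool use in.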

Use cases

See Responsible AI for additional considerations for responsible use.

Key use cases

The provider has not supplied this information.

Out of scope use cases

The provider has not supplied this information.

Pricing

Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
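To make the token-based component of pricing concrete, here is a minimal cost estimator. The per-million-token rates in the example are placeholders for illustration only, not actual Kimi K2.5 prices; consult the Azure pricing page for real rates.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Estimate pay-as-you-go cost in dollars for one request.
    Rates are expressed per million tokens, as on typical pricing pages."""
    return (input_tokens / 1_000_000) * input_rate_per_m \
         + (output_tokens / 1_000_000) * output_rate_per_m

# Hypothetical rates: $0.60 per 1M input tokens, $2.40 per 1M output tokens.
cost = estimate_cost(120_000, 8_000, 0.60, 2.40)  # placeholder numbers
```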

Technical specs

Architecture: Mixture-of-Experts (MoE)
Total Parameters: 1T
Activated Parameters: 32B
Number of Layers (Dense layer included): 61
Number of Dense Layers: 1
Attention Hidden Dimension: 7168
MoE Hidden Dimension (per Expert): 2048
Number of Attention Heads: 64
Number of Experts: 384
Selected Experts per Token: 8
Number of Shared Experts: 1
Vocabulary Size: 160K
Context Length: 256K
Attention Mechanism: MLA
Activation Function: SwiGLU
Vision Encoder: MoonViT
Parameters of Vision Encoder: 400M
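The MoE figures above fit together: each token activates the 8 routed experts selected by the router plus the 1 shared expert, which is how a model with 1T total parameters runs with only ~32B activated parameters per token. A minimal top-k routing sketch using the expert counts from the table (the router scores here are random placeholders, not real router weights):

```python
import random

NUM_EXPERTS = 384          # "Number of Experts" from the spec table
EXPERTS_PER_TOKEN = 8      # "Selected Experts per Token"
SHARED_EXPERTS = 1         # "Number of Shared Experts", always active

def route_token(router_scores):
    """Pick the top-k experts for one token from per-expert router scores."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return ranked[:EXPERTS_PER_TOKEN]

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # placeholder scores
active = route_token(scores)
# Per token: 8 routed experts + 1 shared expert active out of 384 total.
active_count = len(active) + SHARED_EXPERTS
```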

Training cut-off date

The provider has not supplied this information.

Training time

The provider has not supplied this information.

Input formats

The provider has not supplied this information.

Output formats

The provider has not supplied this information.

Supported languages

The provider has not supplied this information.

Sample JSON response

The provider has not supplied this information.

Model architecture

The provider has not supplied this information.

Long context

The provider has not supplied this information.

Optimizing model performance

The provider has not supplied this information.

Additional assets

Please see MoonshotAI's Kimi-K2-Thinking model card here.

Training disclosure

Training, testing and validation

The provider has not supplied this information.

Distribution

Distribution channels

The provider has not supplied this information.

More information

The provider has not supplied this information.

Responsible AI considerations

Safety techniques

Kimi-K2.5 poses an elevated risk of producing content that would be blocked by the Foundry Models Protected Material Detection filter. When deployed via Microsoft Foundry, prompts and completions are passed through a default configuration of classification models to detect and prevent the output of harmful content. We recommend customers use the Protected Material Detection filter in conjunction with this model. As with any model, customers should conduct thorough evaluations on production systems before launching, as well as appropriate post-launch monitoring. All customers must comply with the Microsoft Enterprise AI Services Code of Conduct. Configuration options for content filtering vary when you deploy a model for production in Azure AI; learn more.
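When the content filter intervenes, a request typically surfaces as a filtered result rather than a normal completion. The sketch below shows one defensive way a client might check for this before using a response; the field names (`finish_reason`, `content_filter`) mirror common Azure OpenAI-style response conventions and are assumptions, not a documented Kimi K2.5 response schema.

```python
def extract_completion(response: dict) -> str:
    """Return the completion text, or raise if the content filter intervened.
    Field names mirror common Azure OpenAI-style responses (an assumption)."""
    choice = response["choices"][0]
    if choice.get("finish_reason") == "content_filter":
        raise ValueError("Completion blocked by the content filter; "
                         "revise the prompt or adjust filter configuration.")
    return choice["message"]["content"]

# A normal completion passes through unchanged.
ok = extract_completion(
    {"choices": [{"finish_reason": "stop",
                  "message": {"content": "Hello"}}]}
)
```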

Safety evaluations

The provider has not supplied this information.

Known limitations

The provider has not supplied this information.

Acceptable use

Acceptable use policy

The provider has not supplied this information.
Benchmarks

| Benchmark | Kimi K2.5 (Thinking) | GPT-5.2 (xhigh) | Claude 4.5 Opus (Extended Thinking) | Gemini 3 Pro (High Thinking Level) | DeepSeek V3.2 (Thinking) | Qwen3-VL-235B-A22B-Thinking |
| --- | --- | --- | --- | --- | --- | --- |
| Reasoning & Knowledge | | | | | | |
| HLE-Full | 30.1 | 34.5 | 30.8 | 37.5 | 25.1 | - |
| HLE-Full (w/ tools) | 50.2 | 45.5 | 43.2 | 45.8 | 40.8 | - |
| AIME 2025 | 96.1 | 100 | 92.8 | 95.0 | 93.1 | - |
| HMMT 2025 (Feb) | 95.4 | 99.4 | 92.9* | 97.3* | 92.5 | - |
| IMO-AnswerBench | 81.8 | 86.3 | 78.5* | 83.1* | 78.3 | - |
| GPQA-Diamond | 87.6 | 92.4 | 87.0 | 91.9 | 82.4 | - |
| MMLU-Pro | 87.1 | 86.7* | 89.3* | 90.1 | 85.0 | - |
| Image & Video | | | | | | |
| MMMU-Pro | 78.5 | 79.5* | 74.0 | 81.0 | - | 69.3 |
| CharXiv (RQ) | 77.5 | 82.1 | 67.2* | 81.4 | - | 66.1 |
| MathVision | 84.2 | 83.0 | 77.1* | 86.1* | - | 74.6 |
| MathVista (mini) | 90.1 | 82.8* | 80.2* | 89.8* | - | 85.8 |
| ZeroBench | 9 | 9* | 3* | 8* | - | 4* |
| ZeroBench (w/ tools) | 11 | 7* | 9* | 12* | - | 3* |
| OCRBench | 92.3 | 80.7* | 86.5* | 90.3* | - | 87.5 |
| OmniDocBench 1.5 | 88.8 | 85.7 | 87.7* | 88.5 | - | 82.0* |
| InfoVQA (val) | 92.6 | 84* | 76.9* | 57.2* | - | 89.5 |
| SimpleVQA | 71.2 | 55.8* | 69.7* | 69.7* | - | 56.8* |
| WorldVQA | 46.3 | 28.0 | 36.8 | 47.4 | - | 23.5 |
| VideoMMMU | 86.6 | 85.9 | 84.4* | 87.6 | - | 80.0 |
| MMVU | 80.4 | 80.8* | 77.3 | 77.5 | - | 71.1 |
| MotionBench | 70.4 | 64.8 | 60.3 | 70.3 | - | - |
| VideoMME | 87.4 | 86.0* | - | 88.4* | - | 79.0 |
| LongVideoBench | 79.8 | 76.5* | 67.2* | 77.7* | - | 65.6* |
| LVBench | 75.9 | - | - | 73.5* | - | 63.6 |
| Coding | | | | | | |
| SWE-Bench Verified | 76.8 | 80.0 | 80.9 | 76.2 | 73.1 | - |
| SWE-Bench Pro | 50.7 | 55.6 | 55.4* | - | - | - |
| SWE-Bench Multilingual | 73.0 | 72.0 | 77.5 | 65.0 | 70.2 | - |
| Terminal Bench 2.0 | 50.8 | 54.0 | 59.3 | 54.2 | 46.4 | - |
| PaperBench | 63.5 | 63.7* | 72.9* | - | 47.1 | - |
| CyberGym | 41.3 | - | 50.6 | 39.9* | 17.3* | - |
| SciCode | 48.7 | 52.1 | 49.5 | 56.1 | 38.9 | - |
| OJBench (cpp) | 57.4 | - | 54.6* | 68.5* | 54.7* | - |
| LiveCodeBench (v6) | 85.0 | - | 82.2* | 87.4* | 83.3 | - |
| Long Context | | | | | | |
| Longbench v2 | 61.0 | 54.5* | 64.4* | 68.2* | 59.8* | - |
| AA-LCR | 70.0 | 72.3* | 71.3* | 65.3* | 64.3* | - |
| Agentic Search | | | | | | |
| BrowseComp | 60.6 | 65.8 | 37.0 | 37.8 | 51.4 | - |
| BrowseComp (w/ ctx manage) | 74.9 | - | 57.8 | 59.2 | 67.6 | - |
| BrowseComp (Agent Swarm) | 78.4 | - | - | - | - | - |
| WideSearch (item-f1) | 72.7 | - | 76.2* | 57.0 | 32.5* | - |
| WideSearch (item-f1, Agent Swarm) | 79.0 | - | - | - | - | - |
| DeepSearchQA | 77.1 | 71.3* | 76.1* | 63.2* | 60.9* | - |
| FinSearchComp T2&T3 | 67.8 | - | 66.2* | 49.9 | 59.1* | - |
| Seal-0 | 57.4 | 45.0 | 47.7* | 45.5* | 49.5* | - |
Model Specifications

Context Length: 262,144
Quality Index: 0.76
License: Other
Last Updated: February 2026
Input Type: Text, Image
Output Type: Text
Provider: Moonshot AI
Languages: 1 Language