
We introduce Kimi K2, a Mixture-of-Experts large language model with 32 billion activated parameters and 1 trillion total parameters. We propose the MuonClip optimizer, which enhances Muon with a novel QK-clip technique to address training instability; with MuonClip, the model was pre-trained on 15.5 trillion tokens with zero loss spikes. We further develop a multi-stage post-training process featuring a large-scale agentic data synthesis pipeline and a joint reinforcement learning stage in which the model improves through interactions with real and synthetic environments. Evaluations show that Kimi K2 achieves state-of-the-art results, including 66.1 on Tau2-Bench, 76.5 on ACEBench (En), 65.8 on SWE-Bench Verified, 47.3 on SWE-Bench Multilingual, 53.7 on LiveCodeBench v6, 49.5 on AIME 2025, 75.1 on GPQA-Diamond, and 27.1 on OJBench, all without extended thinking. These results establish Kimi K2 as one of the most capable open-source large language models to date.
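The abstract only names QK-clip; as a rough illustration of how such a logit-capping safeguard could operate, the sketch below assumes the clip rescales a head's query and key projection weights whenever the maximum observed pre-softmax attention logit exceeds a threshold. The function name `qk_clip` and parameters `tau` and `max_logit` are illustrative, not the paper's exact formulation.

```python
# Minimal sketch of a QK-clip-style logit cap (assumed mechanism, not the
# paper's exact formulation): if the max pre-softmax attention logit for a
# head exceeds tau, rescale W_q and W_k by sqrt(tau / max_logit) so their
# inner product, and hence the logit, is pulled back under the cap.
import torch
import torch.nn as nn


@torch.no_grad()
def qk_clip(w_q: nn.Linear, w_k: nn.Linear, max_logit: float, tau: float = 100.0) -> None:
    """Rescale query/key projections when the observed max attention logit
    exceeds tau, splitting the correction evenly between W_q and W_k."""
    if max_logit > tau:
        scale = (tau / max_logit) ** 0.5
        w_q.weight.mul_(scale)
        w_k.weight.mul_(scale)


# Usage: track the per-head max logit during the forward pass, then apply
# the clip after the optimizer step. A fake forward pass for demonstration:
d_model, d_head = 64, 16
w_q = nn.Linear(d_model, d_head, bias=False)
w_k = nn.Linear(d_model, d_head, bias=False)
x = torch.randn(8, d_model)
logits = (w_q(x) @ w_k(x).T) / d_head ** 0.5  # pre-softmax attention logits
qk_clip(w_q, w_k, logits.max().item())
```

One appeal of acting on the projection weights rather than the logits themselves is that the forward pass is left untouched: the cap is enforced as a post-step weight adjustment, which is consistent with the abstract's framing of MuonClip as an optimizer-level fix for training instability.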