January 20 – DeepSeek releases DeepSeek-R1, a large language model based on DeepSeek-V3 that uses chain-of-thought reasoning. The company states that it achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.[1] DeepSeek-R1 is open-source.
January 23 – Humanity's Last Exam, a benchmark for large language models, is published. The dataset consists of 3,000 challenging questions spanning more than a hundred subjects.[4]
January 27
Nvidia's stock falls by 17–18% following the release of DeepSeek-R1.[5]
DeepSeek's app surpasses ChatGPT as the most-downloaded free app on the iOS App Store in the United States.[6]
February 3 – OpenAI releases ChatGPT Deep Research, an artificial intelligence system integrated into ChatGPT,[7] which generates cited reports on a user-specified topic by autonomously browsing the web for 5 to 30 minutes.[8]
February 6 – Mistral AI releases Le Chat, an AI assistant able to generate responses at up to 1,000 words per second.[9]
Google launches AI Mode, a feature of its search engine that uses the Gemini model.[16]
Google DeepMind announces Veo 3, a new state-of-the-art video generation model.[17] The company also boosts the performance of Gemini 2.5 Pro, its flagship AI model.[18]
May 22 – Anthropic releases Claude 4, with two models: Claude Opus 4 and Claude Sonnet 4. According to Anthropic, Claude 4 can function on its own for hours.[19]