Google just gave us something new to talk about: an experimental version of its latest model, Gemini 2.5 Pro, which Google describes as its most intelligent AI model yet. Plus, Google is now providing free access to this experimental version for all users, allowing more people to experience its newest AI capabilities firsthand. Let’s take a closer look at how Gemini 2.5 Pro works and why it’s worth paying attention to.
So, what’s different here? Google’s Gemini 2.5 family introduces the idea of “thinking models”. The experimental Pro version is the first to use this approach, designed to better handle complex problems by pausing to “think” before responding. Essentially, it breaks tasks into smaller steps and applies logical reasoning, with the aim of producing more accurate and well-considered answers.
Google believes this method improves performance across various tasks, and, moving forward, it’s a capability we’ll see in all future models in the Gemini 2.5 series.
Organizations looking to harness the full potential of advanced AI models like Gemini 2.5 Pro for real business challenges can benefit from Revolgy’s AI Program, which offers expert guidance and tailored AI solutions across a range of industries.
Gemini 2.5 Pro Experimental shows strong results in reasoning tasks, achieving high scores on several standard AI benchmarks. Notably, it currently holds the top spot on the LMArena leaderboard – which basically means human testers preferred its output quality. It also achieved an impressive 18.8% on Humanity’s Last Exam without using external tools. That particular benchmark is tough, designed by experts to test advanced knowledge and reasoning, so scoring well there highlights the model’s sophisticated understanding.
Beyond reasoning, Google also put significant effort into improving coding capabilities compared to Gemini 2.0. This new model is quite skilled at generating code for web apps and tasks needing agent-like behavior (what they call agentic code).
It also shows better performance in transforming and editing existing code. We’ve seen demos where Gemini 2.5 Pro generated functional code for complex tasks like interactive 3D simulations, solving a Rubik’s cube, or building games from a single prompt, which makes it a potentially very powerful tool for developers who want to work faster and try new creative approaches.
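As a rough illustration of that single-prompt workflow, here is a minimal sketch using the google-generativeai Python SDK. The model ID, prompt, and output file name are assumptions made for this example (the experimental identifier shown may differ from what your account exposes), not an official recipe.

```python
import google.generativeai as genai

# Authenticate with an API key generated in Google AI Studio.
genai.configure(api_key="YOUR_API_KEY")

# Experimental model ID assumed for illustration; check AI Studio for the current name.
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")

# Single-prompt code generation: ask for a small, self-contained web game.
prompt = (
    "Generate a single-file HTML/JavaScript page implementing a simple "
    "endless-runner game with keyboard controls and an on-screen score."
)
response = model.generate_content(prompt)

# The response text should contain the generated page (possibly wrapped in
# markdown fences); save it to disk to try it out in a browser.
with open("game.html", "w") as f:
    f.write(response.text)
```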
One of the most talked-about features is its massive 1 million token context window. To put this into perspective, a million tokens translates to roughly 750,000 words, more text than the entire “Lord of the Rings” trilogy! This huge capacity allows the model to process and understand really large amounts of information in one go, whether that’s long documents, big codebases, or even hours of audio or video.
Plus, Google plans to double this to 2 million tokens soon, which opens up even more possibilities for addressing complex problems that need insights from vast amounts of data.
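In practice, the main question with very long inputs is simply whether they fit. The hedged sketch below counts tokens for a hypothetical dump of a codebase before sending it in a single request; again, the model ID and file name are illustrative assumptions, not something prescribed by Google.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")  # assumed experimental ID

# Load a large document (hypothetical file) and check how many tokens it uses.
with open("entire_codebase_dump.txt") as f:
    document = f.read()

token_count = model.count_tokens(document).total_tokens
print(f"Document size: {token_count:,} tokens")

# If it fits within the advertised 1 million token window, send it in one go.
if token_count < 1_000_000:
    response = model.generate_content(
        ["Summarize the key modules and their dependencies:", document]
    )
    print(response.text)
```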
Read also: our free ebook, Gemini Prompting Guide 101, with best practices for prompting Gemini.
Google obviously isn’t the only one building large language models, so how does Gemini 2.5 Pro compare to competitors in the space? Against models from other major players, it holds up well on the published benchmarks.
While benchmark results always need context, here’s where Gemini 2.5 Pro stands on specific tests: it leads the LMArena leaderboard with a score of 1443, holds the best recorded score (18.8%) on Humanity’s Last Exam without external tools, reportedly leads math and science benchmarks such as GPQA and AIME 2025 without costly test-time techniques, and in coding scores 63.8% on SWE-Bench Verified (behind Claude 3.7 Sonnet’s 70.3%) while topping the Aider Polyglot benchmark at 68.6%.
Perhaps the best part for many readers is that Google has now made the experimental version of Gemini 2.5 Pro available to all users for free. Access used to be more limited, mainly for Gemini Advanced subscribers or via Google AI Studio.
Now, you can test its capabilities directly by accessing it through Google AI Studio or by selecting it as the model within the Gemini app. This wider access is great because it lets more people experiment, provide feedback, and maybe even generate new ideas. For businesses looking to build on this, Gemini 2.5 Pro is also planned to arrive on Vertex AI, Google Cloud’s machine learning platform, soon.
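If you’d rather script against the model than use the app UI, an API key generated in Google AI Studio is all you need. The sketch below uses the google-generativeai Python SDK to list which Gemini models your key can call; treat it as a rough illustration, since the exact experimental model identifier may change over time.

```python
import google.generativeai as genai

# An API key created in Google AI Studio is the only credential needed.
genai.configure(api_key="YOUR_API_KEY")

# List the models exposed to this key and look for the 2.5 Pro experimental entry.
# The exact identifier is subject to change while the model is experimental.
for m in genai.list_models():
    if "2.5-pro" in m.name and "generateContent" in m.supported_generation_methods:
        print(m.name, "-", m.display_name)
```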
So, Google’s experimental Gemini 2.5 Pro is definitely an important update in the AI landscape. Better reasoning, strong coding skills, and that huge context window make it a very capable model. The fact that it’s now freely accessible is a great opportunity for anyone interested in AI to see what it can do. It’s worth checking out via Google AI Studio or the Gemini app to understand where Google is heading with its AI technology.
Revolgy is a Premier Partner of Google, helping you adopt and take advantage of the latest AI tools and solutions in your cloud. Contact us today for a free consultation with our experts.
Q1: What is Gemini 2.5 Pro?
Gemini 2.5 Pro is an experimental version of Google’s latest AI model, which Google considers its most intelligent model yet. It is the first in the Gemini 2.5 family to use a “thinking models” approach.
Q2: What is the “thinking models” approach used by Gemini 2.5 Pro?
The “thinking models” approach allows the AI to better handle complex problems by pausing to “think” — breaking tasks into smaller steps and using logical reasoning — before responding, aiming for more accurate answers. This approach is planned for all future models in the Gemini 2.5 series.
Q3: How does the experimental Gemini 2.5 Pro perform in reasoning tasks?
The experimental version shows strong reasoning results. It holds the top spot on the LMArena leaderboard, meaning human testers preferred its output quality. It also scored 18.8% on Humanity’s Last Exam without using external tools, which is noted as the best score recorded so far for this reasoning test under those conditions. Additionally, it reportedly leads on math and science benchmarks like GPQA and AIME 2025 without needing costly test-time techniques and scores well on MMLU and MRCR (long-context evaluation).
Q4: What are the coding capabilities of the experimental Gemini 2.5 Pro?
Its coding capabilities are described as a “big leap” compared to Gemini 2.0. The model is skilled at generating code for web apps and agent-like tasks (“agentic code”), as well as transforming and editing existing code. Demos showed it generating functional code for interactive 3D simulations and games from single prompts. On the SWE-Bench Verified benchmark (agentic code), it scored 63.8% (lower than Claude 3.7 Sonnet’s 70.3% in that test), but it achieved a dominant 68.6% on the Aider Polyglot coding benchmark, outperforming models from OpenAI, Anthropic, and DeepSeek.
Q5: What is the context window size of the experimental Gemini 2.5 Pro?
It features a massive 1 million token context window, equivalent to roughly 750,000 words. This allows it to process very large amounts of information, like long documents or entire codebases, at once. Google plans to increase this to 2 million tokens soon.
Q6: How does the experimental Gemini 2.5 Pro compare to competitors based on the benchmarks mentioned in the text?
According to the benchmarks cited: It leads the LMArena leaderboard (score 1443), indicating high human preference for its output. It achieved the highest score recorded (18.8%) on Humanity’s Last Exam without using external tools, surpassing models like o3-mini and DeepSeek R1. It reportedly leads on GPQA and AIME 2025 math/science tests. In coding, while its SWE-Bench Verified score (63.8%) was behind Claude 3.7 Sonnet (70.3%), it significantly outperformed competitors on the Aider Polyglot benchmark (68.6%).
Q7: How can users access the experimental Gemini 2.5 Pro?
Google has made the experimental version of Gemini 2.5 Pro available for free to all users. It can be accessed through Google AI Studio or by selecting it as the model within the Gemini app. Availability on Vertex AI is planned soon.
Q8: Can Revolgy assist businesses with adopting Gemini 2.5 Pro?
Yes, as a Premier Partner of Google, Revolgy helps businesses adopt and leverage the latest AI tools and solutions, such as Gemini 2.5 Pro, within their cloud infrastructure. A free consultation with experts is available.