Anthropic's Claude Code has a High Ceiling

Anthropic's Hybrid Reasoning Update is Substantial: Claude Code is going to rule.

Michael Spencer

Feb 25, 2025

Good Morning,

I’ve been thinking about Anthropic’s release of late February, 2025 and it’s going to impact a lot of new startups.

Claude Code
Claude Extended Thinking
Claude 3.7 Sonnet

3.7 Benchmarks are Impressive

Claude Code is extremely new. Claude Code is available as a limited research preview, and enables developers to delegate substantial engineering tasks to Claude directly from their terminal.

That is, Claude Code is as now a beta product in research preview to learn directly from developers about their experiences collaborating with AI agents.

Claude 3.7 Sonnet is reported to significantly outperform its predecessors and competing AI coding tools.

It’s just a massive update for how a new class of startups are going to be able to scale faster and scale leaner which changes how new Startups out-compete less agile companies and older startups. You might have come across this infographic before, and we can include Together AI in it as well.

So I always knew 2025 would be a big year for AI coding startups and their funding but also a big year for Anthropic, which basically is the tech behind Cursor and others.

What is Claude Code?

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster through natural language commands. By integrating directly with your development environment, Claude Code streamlines your workflow without requiring additional servers or complex setup.

Key Features

Code Understanding: Claude Code can understand your codebase, enabling it to assist with complex tasks such as editing files, fixing bugs, and answering questions related to the architecture and logic of the code.
Automation of Routine Tasks: The tool automates many routine programming tasks. For example, it can execute tests, lint code, and manage version control operations like searching through Git history, resolving merge conflicts, and creating pull requests without additional servers or complex setups.
Collaborative Development: Claude Code is designed to be an active collaborator in the software development process, making it easier for developers to delegate engineering tasks directly to Claude right from their terminal. This capability has shown impressive efficiency, reportedly reducing development time significantly for tasks that typically require much manual effort.
Security Features: It is built with security and privacy considerations in mind, ensuring that your code queries go directly to Anthropic’s API without passing through intermediate servers, thus maintaining greater control over your data.
Easy Integration: Users can install Claude Code with basic command-line instructions, and it runs smoothly on multiple operating systems, including macOS, Ubuntu, and Windows via WSL. Authentication with an Anthropic account is required.

Claude was already the best-in-class and trusted model for Coding, but now it’s only going to get even better.

This means startups and companies will be spending less time and money on Engineering making their teams radically more agile.

Companies like Anysphere (Cursor), Poolside and Codeium are going to be able to raise huge amounts to push the limits of AI at the intersection of code in the next few years. As Claude Code gets better, they will ALSO only get better. Obviously GitHub Copilot was only the beginning.

Key Points

Claude 3.7 Sonnet offers both instant responses and extended thinking modes within a single model, unlike competitors that separate these capabilities.
The model excels at coding tasks, achieving state-of-the-art results on real-world software benchmarks like SWE-bench Verified.
Alongside the model, Anthropic launched Claude Code, an agentic terminal tool that can read codebases, edit files, and even push to GitHub repositories.
Pricing remains unchanged at $3 per million input tokens and $15 per million output tokens, with thinking tokens included in the output cost.

Seriously we have to think about what Claude Code will become now in the next 5-10 years:

Claude 3.7 Sonnet achieves state-of-the-art performance on SWE-bench Verified, which evaluates AI models' ability to solve real-world software issues, and on TAU-bench, a framework testing AI agents on complex tasks with user and tool interactions.

U.S.- and Europe-based Poolside was founded in 2023 by CEO Jason Warner and Eiso Kant, both software engineers. Codeium has reached about $40 million in annualized recurring revenue (ARR), and the others I’ve mentioned are even further along. Anthropic’s models will be how countless startups and new companies scale revenue faster than incumbents and competitors.

As you likely know already, Claude AI is compatible with popular coding tools such as Cursor and Replit.

The advanced reasoning capabilities of Claude allow it to handle multi-step coding tasks.

All to say this hybrid reasoning announcement is substantial and makes me more bullish about Anthropic’s revenue generation in the 2025 to 2030 period.

What can Claude Code do?

Edit files and fix bugs across your codebase
Answer questions about your code's architecture and logic
Execute and fix tests, lint, and other commands
Search through git history, resolve merge conflicts, and create commits and PRs

As you might know already: Claude 3.7 Sonnet shows particularly strong improvements in coding and front-end web development.

Anthropic’s Economic index showed them how key coding is for the future of their company. It’s going to mean billions of future revenue for this company and dominance in Enterprise AI like we’ve rarely seen before. It’s going to be pretty epic and is much more sustainable a tool than ChatGPT is going to be.

Last month, Anysphere, the maker of AI-powered coding assistant Cursor, announced a new funding round of financing at a $2.5 billion valuation. So this is all beginning to compound in a weird way.

Claude Code not only facilitates basic coding tasks but also introduces advanced features such as generating tests, refactoring code, and managing Git operations like committing changes and pushing them to repositories on platforms like GitHub.

Anthropic’s Findings on Economic use of AI

The main findings from the Economic Index’s first paper are:

Today, usage is concentrated in software development and technical writing tasks. Over one-third of occupations (roughly 36%) see AI use in at least a quarter of their associated tasks, while approximately 4% of occupations use it across three-quarters of their associated tasks.
AI use leans more toward augmentation (57%), where AI collaborates with and enhances human capabilities, compared to automation (43%), where AI directly performs tasks.
AI use is more prevalent for tasks associated with mid-to-high wage occupations like computer programmers and data scientists, but is lower for both the lowest- and highest-paid roles. This likely reflects both the limits of current AI capabilities, as well as practical barriers to using the technology.

Anthropic’s entire future hangs on how good Claude Code becomes.

Claude 3.7 Sonnet supports outputs of up to 128,000 tokens, which is significantly longer than previous models, facilitating rich and detailed text generation, especially useful in coding contexts and extensive documentation.

Anthropic describes Claude 3.7 Sonnet as “both an ordinary LLM and a reasoning model in one.” Which is of course what OpenAI is slower to reveal with GPT-5 later this year. This means Anthropic is executing better on product I believe, than both Google and OpenAI.

As OpenAI, Google, Anthropic, Microsoft and others get hyper-focused on products and agentic AI, China will itself lead in other kinds of application layer tools like text-to-video. China is traditionally very good at go-to-market strategies of apps that go global and we can expect a few good ones from the Generative AI hype wave.

While many companies might think twice about working with Google or OpenAI, Anthropic is a more trustworthy brand.

Users now have the option to toggle an "extended thinking mode," allowing Claude to take additional time when faced with challenging questions. So it’s a better interface for most users. Anthropic’s intent to be customer-centric is obviously superior to Google and OpenAI.

The future of software engineering will certainly continue to change with the coding capabilities these models will soon be cable of. On February 18th, 2025 OpenAI released a new benchmark called SWE-Lancer benchmark.

This is interesting because SWE-Lancer, is a benchmark of over 1,400 freelance software engineering tasks from Upwork, valued at $1 million USD total in real-world payouts. SWE-Lancer encompasses both independent engineering tasks. OpenAI hilariously omitted o3 in their analysis:

This bar graph compares the pass@1 percentages of different AI models—GPT-4o, o1, and Claude 3.5 Sonnet—across various task variants and subsets for software engineering roles. (Captioned by AI)

How do you suppose Claude 3.7 Sonnet would perform?

We aren’t there yet though. The findings from the SWE-Lancer benchmarking have significant implications for the economic landscape, suggesting that while AI models show promise, they currently lack the ability to tackle many practical tasks effectively.

This is why I’m so bullish about Claude Code, they will allocate to its success appropriately. Which leads me to the kicker.

Anthropic, which makes the AI chatbot Claude, is finalizing a $3.5 billion fundraising round that values the company at $61.5 billion, according to The Wall Street Journal. Anthropic initially set out to raise $2 billion, but investors have now agreed to a larger tranche, per the WSJ.

MGX also gets involved here, the UAE fyi. Lightspeed Venture Partners, General Catalyst, Bessemer Venture Partners, and Abu Dhabi-based investment firm MGX are said to be in talks to participate in the coming round. Should it top out at $3.5 billion, it’d bring Anthropic’s total raised to around $18 billion. Since both Amazon and Google support Claude, it also has a network advantage over Microsoft backed OpenAI, that doesn’t do as well with B2B and Enterprise AI customers.

Machine Economy Press

Discussion about this post