OpenAI's Codex

The Coding agents are coming!

May 17, 2025

Srinivas Narayanan of OpenAI speaking at a summit.

Welcome Back!

Between Claude Code and OpenAI’s latest coding agent Codex, it’s a brave new world!

OpenAI is somewhat stealing the thunder of Google’s AlphaEvolve paper and its ramifications. But maybe the trends of coding agents and recursive self-improvement are more related than they seem? Or will be soon?

The AI hype is reaching peak extravagance for the Venture capital boosters. And what about those software development jobs?

While I don’t think Google is building the future of AI here (it’s losing its hold on Search in 2025), OpenAI and Anthropic are very well positioned, and not just for pesky Deep Research or research agents: the coding agents are coming.

“Will we look back at 2025 as the year engineering teams fundamentally changed how they work? OpenAI just released a research preview of Codex, a cloud-based coding agent.” - Azeem Azhar

OpenAI announced on Friday it’s launching a research preview of Codex, the company’s most capable AI coding agent yet.

Here is the Video by OpenAI about the tool: (20+ minutes)

Virtual Employees who can Code

With Codex and its new codex-1 model, OpenAI wants to be an engineer’s “virtual coworker.”

One has to wonder: will this reduce those pesky junior software developer roles even further?

Codex-1 Better than o3 for Software Engineering

Codex is powered by codex-1, a version of OpenAI o3 optimized for software engineering. It was trained using reinforcement learning on real-world coding tasks in a variety of environments to generate code that closely mirrors human style and PR preferences, adheres precisely to instructions, and can iteratively run tests until it receives a passing result. They are starting to roll out Codex to ChatGPT Pro, Enterprise, and Team users today, with support for Plus and Edu coming soon.

Will Coding be faster with AI?

General Impression of Codex

Matthew Brennan:

A Devin-esque Interface
A coding co-worker
OpenAI are hoping Codex can be the next major product at OpenAI
The Codex agent runs in a sandboxed, virtual computer in the cloud.
By connecting with GitHub, Codex’s environment can come preloaded with your code repositories. (so a lot of Microsoft integration here)

What is OpenAI’s Codex?

Codex, launched May 16th, 2025 as a research preview, brings the company's AI prowess to software engineering by running multiple coding tasks in parallel cloud environments.

The system is powered by codex-1, a version of OpenAI's reasoning-focused o3 model that's been specifically optimized to produce code that more closely mirrors human style. This is a direct competitor to Anthropic’s Claude Code.

OpenAI says the AI coding agent will take anywhere from one to 30 minutes to write simple features, fix bugs, answer questions about your codebase, and run tests, among other tasks.

The PR/Comms people at OpenAI (who control Sam Altman’s X account) are busy trying to make this seem like a big deal:

xAI is definitely going to get into coding agents as well, at this rate. What do you suppose it will mean for the next iterations of tools like Manus AI in open-weight country, where Qwen is mixed with Claude?

Many Virtual Coding Employees

Who needs software developers? Codex can handle multiple software engineering tasks simultaneously, says OpenAI, and it doesn’t limit users from accessing their computer and browser while it’s running.

So is this a virtual employee product that’s going to cost thousands of dollars soon? It would appear so.

After Vibe Coding Come Virtual Coworkers

OpenAI says users will have “generous access” to Codex to start, but in the coming weeks, the company will implement rate limits for the tool.

A remote software agent.

As Google’s models get better at coding, Gemini Code Assist will also get a lot better.

All that vibe coding has made the businesses behind AI coding platforms some of the fastest-growing in tech.

“There are lots of good AI systems out there,” said Srinivas Narayanan, vice president of engineering at OpenAI. “Competition is clearly there.”

OpenAI is also going up directly against more specialized startups here, like Magic.dev, hCompany and others. As these AI coding tools proliferate alongside mainstream agentic protocols, things will start to get a lot more interesting around this time in 2026.

How to Access Research Preview?

Users with access to Codex can find the tool in ChatGPT’s sidebar, and assign the agent new coding tasks by typing a prompt and clicking the “Code” button. Users can also ask questions about their codebase and click the “Ask” button. Below the prompting bar, users can see other tasks they’ve assigned Codex to do, and monitor their progress.

"Software engineering is changing, and by the end of 2025, it's going to look fundamentally different," said Greg Brockman, OpenAI's President and co-founder, during the announcement. "This is a step towards where we think software engineering is going."

The Rise of “Vibe Coworkers”

But how powerful will these agentic capabilities become? OpenAI wants the world to believe that these AI coding agents will act as “virtual teammates,” completing tasks autonomously that take human engineers “hours or even days” to accomplish. OpenAI claims it’s already using Codex internally to offload repetitive tasks, scaffold new features, and draft documentation.

It’s not clear if actual people who are “AI-first” will be able to keep up with the evolution of these various agents.

Coding agents are still very nascent. As you might know, a recent study from Microsoft found that industry-leading AI coding models, such as Claude 3.7 Sonnet and o3-mini, struggled to reliably debug software. Is OpenAI really ahead of others with Codex? It’s not very clear.

A smaller sibling of codex-1, based on o4-mini, is now the default model in Codex CLI, and is available in OpenAI’s API as codex-mini-latest for $1.50 per 1M input tokens (roughly 750,000 words, more than the entire Lord of the Rings book series) and $6 per 1M output tokens. In the OpenAI video the “AGI” term was also dropped multiple times. In recent months Anthropic and Google have caught up in LLMs that do coding well. It’s unclear if OpenAI has moved ahead now.
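To make the per-token pricing concrete, here is a rough back-of-the-envelope sketch in Python. The two rates are the ones quoted above; the function name and the token counts in the usage example are made up for illustration and are not measurements of real Codex tasks. It also ignores any discounts or other pricing details.

```python
# Rates quoted in the article, in US dollars per 1 million tokens.
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 6.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost of one request at the quoted rates.

    Hypothetical helper: it only multiplies token counts by the
    per-million rates and ignores discounts or other pricing rules.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Made-up task size: 200k tokens of repo context in, 20k tokens of code out.
    cost = estimate_cost_usd(input_tokens=200_000, output_tokens=20_000)
    print(f"Estimated cost: ${cost:.2f}")
```

At those assumed sizes a single task comes to roughly 42 cents, which is why the interesting question is less the per-token rate than how many tokens an agent burns while it iterates.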

Early adopters like Cisco, Temporal, Superhuman, and Kodiak have already been testing the system.

The task of reviewing code will increasingly become automated. Will this continually lead to less demand for actual graduating software developers? And how quickly will we see a virtual coworker for other sets of tasks, roles and industries? OpenAI is betting that it can get there first.

The Path to Coding Coworkers

OpenAI is also in talks to acquire AI coding startup Windsurf for $3 billion, according to a Bloomberg report. Windsurf only has a reported $40 million in annualized revenue, so that $3 billion price tag, roughly 75 times revenue, is quite a premium for a company like OpenAI that is already bleeding cash. Competition with Google and Anthropic will be fierce, since AI getting better at coding is one of the biggest things happening in LLMs in the mid 2020s.

Cheaper coding agents might lead to all sorts of new kinds of companies, startups, business ideas and products, even as they disrupt jobs and lead to layoffs in some sectors.

Actual Use Case Now in the Real World?

OpenAI wasn’t very good at illustrating what is so special about what Codex can do now. Apparently, Temporal reported using Codex to accelerate feature development, debug issues, write and execute tests, and refactor large codebases, while Superhuman found it useful for speeding up repetitive tasks and enabling product managers to contribute lightweight code changes without requiring an engineer except for review. Nothing we haven’t heard before.

OpenAI has also updated its Codex CLI, the lightweight open-source coding agent launched last month that runs in local terminals. By 2027 or 2028 we might have legit virtual employees that you can rent for a lot less than hiring actual coworkers. But if that happens, what might be the consequences?

How Fast is Codex?

It can take minutes, sometimes longer, for remote tasks to complete, and you’ll still want to eyeball every change before merging. Like so much else, this is really a product in beta.

Virtual employees like Codex are expected to take on meaningfully bigger roles in software dev teams in the months and years ahead. Supposedly, as models and infrastructure improve, developers will hand ever-larger chunks of work (debugging sessions, cross-repo refactors, even CI failure resolutions) to AI colleagues. That’s the hope, in theory.

Interacting with Code: Really an Augmented No-Code Interface

Ask or Code? Which will you choose? Codex is a unique interface (not to be confused with the Codex CLI tool introduced by OpenAI last month) that can be reached from the sidebar in the ChatGPT web app.

While OpenAI is the 800-pound gorilla when it comes to consumer-facing AI chatbots, it doesn’t have the same status in coding, and there’s no guarantee it will even be a major Enterprise AI player in this important niche. Indeed, Anthropic and Google might beat them, among others, including whoever emerges in China. It’s very early to say.

OpenAI made a big point of noting that, to make Codex more effective, developers can include an "AGENTS.md" file in the repo with custom instructions, for example to contextualize and explain the codebase or to communicate standards and style practices for the project. It is kind of a README.md, but for AI agents rather than humans.
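For a sense of what such a file might contain, here is a small Python sketch that renders and writes an AGENTS.md. Everything in it is an assumption for illustration: the section names, the bullet text, and the helper functions are hypothetical, and nothing here is a format OpenAI prescribes. The file is just free-form instructions.

```python
from pathlib import Path

# Hypothetical guidance for an imaginary project; adapt freely.
EXAMPLE_SECTIONS = {
    "Project overview": [
        "Python web service; application code lives in src/.",
    ],
    "Testing": [
        "Run the test suite before proposing any change.",
        "Add a regression test for every bug fix.",
    ],
    "Style": [
        "Follow the existing naming conventions in each module.",
        "Keep pull requests small and focused on one change.",
    ],
}


def render_agents_md(sections: dict[str, list[str]]) -> str:
    """Turn a {section title: [instructions]} mapping into Markdown text."""
    lines = ["# Instructions for AI agents", ""]
    for title, bullets in sections.items():
        lines.append(f"## {title}")
        lines.extend(f"- {bullet}" for bullet in bullets)
        lines.append("")  # blank line between sections
    return "\n".join(lines)


def write_agents_md(repo_root: Path, sections: dict[str, list[str]]) -> Path:
    """Write the rendered text to AGENTS.md at the root of the repo."""
    target = repo_root / "AGENTS.md"
    target.write_text(render_agents_md(sections), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Print the file instead of writing it, so the sketch has no side effects.
    print(render_agents_md(EXAMPLE_SECTIONS))
```

In practice most teams would simply write the file by hand; the point is only that it is plain text sitting next to the code, like any other README.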

“Codex feels like a co-worker with its own computer,” OpenAI’s Greg Brockman said in the launch demo. “You ask it to run tests or fix typos, and it just does it while you keep coding or grab lunch.”

While OpenAI’s desperation for more products is clear, how it will build agents that it can charge thousands of dollars for is less clear. A whole range of companies will be competing for the same use cases. That they are looking to acquire Windsurf is a major sign that they don’t currently have what it takes internally to compete. They couldn’t afford to acquire Cursor and misjudged how quickly that startup was evolving.

Is placing all of their new products inside ChatGPT really the best idea? It just supercharges adoption as they are like the Amazon Prime of chatbots with a huge lead in ChatGPT’s user growth and retention. Google has made fairly decent progress over the past year. Google has reached 150 million subscriptions (in mid May), as compared to 100 million in February.

OpenAI’s Codex is bold, if it works. Such a virtual coworker, it is hoped, could act as a hybrid of real-time pairing and asynchronous delegation that could redefine engineering workflows. Microsoft, too, tried to make GitHub Copilot feel like a pair programming AI back in the day. But what Codex cannot do is also striking.

As for aligning to human preference, Anthropic has been known for that for both writing and coding for quite some time. OpenAI claiming that it’s a priority for Codex is a bit puzzling. OpenAI claim that one of their primary goals while training codex-1 was to align outputs closely with human coding preferences and standards. Is that so?

OpenAI is trying to position Codex as a tool software developers will use to be more productive, already a lucrative proposition. Will that enable leaner software development and engineering teams? What happens when there’s a Codex for product management as well? PMs and engineers are among the most common and most expensive employees to pay at tech companies.

I certainly see Codex becoming yet another coding tool in the toolbox of many dev teams, including teams that are already using Cursor, Claude Code, GitHub Copilot and others.

A Coding AI Zoo

I’m not a software engineer, but the proliferation of coding AI tools is going to get intense. BigTech and AI startups at YC brag about how much of their code is automated in 2025.

AI-Powered Development Assistants:

Qodo
Codeium (Windsurf now)
AskCodi

Code Intelligence & Completion:

GitHub Copilot
Tabnine
IntelliCode

Security & Analysis:

DeepCode AI
Codiga
Amazon CodeWhisperer

Cross-Language & Translation:

CodeT5
Figstack
CodeGeeX

Educational & Learning Tools:

Replit
OpenAI Codex
SourceGraph Cody

There’s going to be no shortage of companies in 2026 claiming to be building the next Coworking employee or Virtual Vibe Worker. There’s even a sense of how OpenAI is cloning others here, rather than really innovating.

Claude 3.5 Opus will likely be better than Codex-1 at Code

Obviously Claude 3.5 Opus is going to be pretty good at code. Anthropic’s focus is more B2B and Enterprise AI, so we have to imagine they will get better at some AI agents than OpenAI, a B2C company, will be able to manage.

What happens as these develop, along with Qwen and DeepSeek in China, is that traditional BigTech will get left behind in these future markets. Outside of Google, there’s not much evidence that Microsoft, Amazon, Meta or Apple are able to keep up. That’s important because these are going to be significant, big markets.

All to say the announcement of OpenAI’s Codex is a bit superficial when you begin to dig a bit deeper. It’s more just another tool than anything actually resembling a Vibe coding virtual AI employee.

ChatGPT is stuffed with Sora, Operator, Deep Research and now Codex. But at the end of the day it’s just a chatbot. Deep Research is useful for some white collar professionals, but for most people Operator or Codex won’t be that useful. ChatGPT revenue growth is impressive, but the verdict on Agentic AI will take a few years to pan out.

Is OpenAI’s Codex going to convince existing subscribers to pay OpenAI more money for increased rate limits or lead to an actual Virtual employee that can make the firm thousands of dollars a month? I guess they are betting on it.

Machine Economy Press
