Why humans are AI's biggest bottleneck (and what's coming in 2026) | Alexander Embiricos (OpenAI)
Summary
In this episode, Lenny speaks with Alexander Embiricos, product lead for Codex, OpenAI's powerful coding agent. Alexander shares insights on how Codex is transforming software development, enabling unprecedented speed and productivity, and his vision for AI agents that function as true engineering teammates rather than just tools.
- Explosive growth: Codex has seen 20x growth since GPT-5's launch, with the models now serving trillions of tokens weekly and becoming OpenAI's most served coding model.
- Teammate metaphor: Codex is designed to be a software engineering teammate, not just a coding tool: "a bit like this really smart intern that refuses to read Slack, doesn't check Datadog unless you ask it to."
- Accelerating development: The Sora Android app was built in just 18 days and launched publicly 10 days later, becoming the #1 app in the App Store, a task that would have taken months without Codex.
- Validation bottleneck: The current limiting factor for AI productivity isn't model capability but human review speed: the time it takes to validate AI-generated work.
- Code review focus: The team is now prioritizing making code review easier, as reviewing AI-written code is often less enjoyable than writing code.
- Proactivity goal: A major objective is making Codex proactive rather than just responsive, helping users thousands of times daily without explicit prompting.
- Future vision: Alexander believes coding agents are the foundation for all AI agents, as the most effective way for models to use computers is by writing code.
Who it is for: Engineers, product builders, and anyone interested in how AI coding agents are reshaping software development and what this means for the future of work.
- Alexander frames Codex as a smart intern evolving into a proactive software-engineering teammate that helps by default without constant prompting.
- Embiricos says an AI agent needs a smart model, an API that understands agent concepts, and a tool harness, all tuned together.
- Allow agents to act directly from team chats and social signals so code ships without formal specs, mirroring self-driven teams.
- Embedding the assistant inside a browser lets it leverage full page context to surface in-flow actions instead of flooding users with push notifications.
- Human prompt-and-review speed now limits AI acceleration, so systems must let agents be default useful to unlock hockey-stick gains.
Transcript
Alexander Embiricos: I lead work on Codex.
Alexander Embiricos: Codex is OpenAI's coding agent. We think of Codex as just the beginning of a software engineering teammate. It's a bit like this really smart intern that refuses to read Slack, doesn't check Datadog unless you ask it to.
Lenny Rachitsky: I remember Karpathy tweeted about the gnarliest bugs that he runs into, that he just spends hours trying to figure out, that nothing else has solved. He gives them to Codex, lets it run for an hour, it solves them.
Alexander Embiricos: We're starting to see glimpses of the future where we're actually starting to have Codex be on call for its own training. Codex writes a lot of the code that helps manage its training runs, the key infrastructure, and so we have Codex code review. It's catching a lot of mistakes; it's actually caught some pretty interesting configuration mistakes. One of the most mind-blowing examples of acceleration: the Sora Android app, a fully new app. We built it in eighteen days and then, ten days later, so twenty-eight days total, we went to the public.
Lenny Rachitsky: How do you think you win in this space?
Alexander Embiricos: One of our major goals with Codex is to get to proactivity. If we're gonna build a super assistant, it has to be able to do things. One of the learnings over the past year is that for models to do stuff, they are much more effective when they can use a computer. It turns out the best way for models to use computers is simply to write code, and so we're kinda getting to this idea where if you wanna build any agent, maybe you should be building a coding agent.
Lenny Rachitsky: When you think about progress on Codex, I imagine you have a bunch of evals, and there are all these public benchmarks.
Alexander Embiricos: A few of us are constantly on Reddit. You know, there's praise up there and there are a lot of complaints. What we can do as a product team is just try to always think about how we're building a tool so that it feels like we're maximally accelerating people, rather than building a tool that makes it more unclear what you should do as the human.
Lenny Rachitsky: Being at OpenAI, I can't not ask how far you think we are from AGI.
Alexander Embiricos: The current underappreciated limiting factor is literally human typing speed, or human multitasking speed.
Lenny Rachitsky: Today my guest is Alexander Embiricos, product lead for Codex, OpenAI's incredibly popular and powerful coding agent. In the words of Nick Turley, head of ChatGPT and former podcast guest, "Alex is one of my all-time favorite humans I've ever worked with, and bringing him and his company into OpenAI ended up being one of the best decisions we've ever made." Similarly, Kevin Weil, OpenAI's CPO, said, "Alex is simply the best." In our conversation, we chat about what it's truly like to build product at OpenAI; how Codex allowed the Sora team to ship the Sora app, which became the number one app in the App Store, in under one month; the 20x growth Codex is seeing right now and what they did to make it so good at coding; why his team is now focused on making it easier to review code, not just write code; his AGI timelines; his thoughts on when AI agents will actually be really useful; and so much more. A huge thank you to Ed Baze, Nick Turley, and Dennis Yang for suggesting topics for this conversation. If you enjoy this podcast, don't forget to subscribe and follow it in your favorite podcasting app or YouTube, and if you become an annual subscriber of my newsletter, you get a year free of 19 incredible products, including a year free of Devin, Lovable, Replit, Bolt, n8n, Linear, Superhuman, Descript, Wispr Flow, Gamma, Perplexity, Warp, Granola, Magic Patterns, Raycast, ChatPRD, Mobbin, PostHog, and Stripe Atlas. Head on over to Lennysnewsletter.com and click Product Pass. With that, I bring you Alexander Embiricos. After a short word from our sponsors, here's a puzzle for you: what do OpenAI, Cursor, Perplexity, Vercel, Plaid, and hundreds of other winning companies have in common? The answer is they're all powered by today's sponsor, WorkOS. If you're building software for enterprises, you've probably felt the pain of integrating single sign-on, SCIM, RBAC, audit logs, and other features required by big customers. WorkOS turns those deal blockers into drop-in APIs with a modern developer platform built specifically for B2B SaaS. Whether you're a seed-stage startup trying to land your first enterprise customer or a unicorn expanding globally, WorkOS is the fastest path to becoming enterprise-ready and unlocking growth. They're essentially Stripe for enterprise features. Visit WorkOS.com to get started, or just set up their Slack support, where they have real engineers who answer your questions super fast. WorkOS allows you to build like the best with delightful APIs, comprehensive docs, and a smooth developer experience. Go to WorkOS.com to make your app enterprise-ready today. This episode is brought to you by Fin, the number one AI agent for customer service. If your customer support tickets are piling up, then you need Fin. Fin is the highest-performing AI agent on the market, with a 65% average resolution rate. Fin resolves even the most complex customer queries. No other AI agent performs better; in head-to-head bake-offs with competitors, Fin wins every time. Yes, switching to a new tool can be scary, but Fin works on any help desk with no migration needed, which means you don't have to overhaul your current system or deal with delays in service for your customers. And Fin is trusted by over 6,000 customer service leaders and top companies like Anthropic, Shutterstock, Synthesia, Clay, Vanta, Lovable, Monday.com, and more.
And because Fin is powered by the Fin AI engine, which is a continuously improving system that allows you to analyze, train, test, and deploy with ease, Fin can continuously improve your results too. So if you're ready to transform your customer service and scale your support, give Fin a try for only 99¢ per resolution plus Fin comes with a ninety-day money-back guarantee. Find out how Fin can work for your team at Fin.ai/Lenny. That's Fin.ai/Lenny.
Lenny Rachitsky: Alexander, thank you so much for being here, and welcome to the podcast.
Alexander Embiricos: Thank you so much. I've been following for ages and I'm excited to be here.
Lenny Rachitsky: I'm even more excited. I really appreciate that. I wanna start with your time at OpenAI. You joined OpenAI about a year ago. Before that, you had your own startup for about five years, and before that you were a product manager at Dropbox. I imagine OpenAI is very different from every other place you've worked. Let me just ask you this: what is most different about how OpenAI operates, and what's something you've learned there that you think you're gonna take with you wherever you go, assuming you ever leave?
Alexander Embiricos: By far, I would say the speed and ambition of working at OpenAI are just dramatically more than what I could imagine. And you know, I guess it's kind of an embarrassing thing to say, because everyone who's a startup founder thinks, yeah, my startup moves super fast, the talent bar is super high, we're super ambitious. But I have to say, working at OpenAI just kinda made me reimagine what that even means.
Lenny Rachitsky: We hear this a lot. It feels like with every AI company it's just, oh my god, I can't believe how fast they're moving. Is there an example of just, wow, that wouldn't have happened this quickly anywhere else?
Alexander Embiricos: The most obvious thing that comes to mind is just the explosive growth of Codex itself. I think it's been a while since we bumped our external number, but the 10xing of Codex's scale was just super fast, a matter of months, and it's well more since then. And once you've lived through that, or at least speaking for myself, having lived through that, I now feel like any time I'm going to spend building a tech product, there's that speed and scale that I need to meet. If I think of what I was doing in my startup, it moved way slower, and there's always this balance with startups of how much do you commit to an idea you have versus find out that it's not working and then pivot. But one thing I've realized at OpenAI is that the amount of impact we can have, and in fact need to have to do a good job, is so high that I have to be way more ruthless with how I spend my time now.
Lenny Rachitsky: Before we get to Codex, is there a way they've structured the org, or, I don't know, the way that OpenAI operates, that allows the team to move this quickly? Because everyone wants to move super fast. I imagine there's a structural approach to allowing this to happen.
Alexander Embiricos: I mean, one thing is just that the technology we're building with has transformed so many things, both how we build and what kinds of things we can enable for users. We spend most of our time talking about the improvements in the foundation models, but I believe that even if we had no more progress in models today, which is absolutely not the case, we are way behind on product. There's so much more product to build. So I think the moment is just ripe, if that makes sense.
Lenny Rachitsky:Mhmm
Alexander Embiricos: But I think there are a lot of counterintuitive things that surprised me when I arrived as far as how things are structured. One example that comes to mind: when I was working on my startup, and before that when I was at Dropbox, it was very important, especially as a PM, to always kinda rally the ship, make sure you're pointed in the right direction, and then you can accelerate in that direction. But here, because we don't exactly know what capabilities will even come up soon, and we don't know what's going to work technically, and we also don't know what's going to land even if it works technically, it's much more important for us to be very humble, learn a lot more empirically, and just try things quickly. And the org is set up that way, to be incredibly bottoms-up. This is again one of those things where, as you were saying, everyone wants to move fast, and everyone likes to say they're bottoms-up, or at least a lot of people do, but OpenAI is truly bottoms-up. That's been a learning experience for me. I don't think it'll ever even make sense to work at a non-AI company in the future, I don't even know what that means, but if I were to imagine it, or go back in time, I think I would run things totally differently.
Lenny Rachitsky: What I'm hearing is kind of this ready-fire-aim as their approach, more than ready-aim-fire. And as you process that, because that may not come across well: I actually have heard this a lot at AI companies, and Nick Turley shared, I think, the same sentiment. Because you don't know how people will use it, it doesn't make sense to spend a lot of time making it perfect. It's better to just get it out there in a primordial way, see how people use it, and then go big on that use case.
Alexander Embiricos: Yeah. Okay, to use this analogy a little bit: I feel like there is an aim component, but the aim component is much fuzzier, kind of, roughly what do we think can happen. Someone I've learned a ton from working here is a research lead, and he likes to say that at OpenAI we can have really good conversations about something that's a year-plus from now, where there's a lot of ambiguity in what will happen, but that's the right sort of timeline. And then we can have really good conversations about what's happening in a few months or weeks. But there's this awkward middle ground, as you start approaching a year but you're not at a year, where it's very difficult to reason about. So as far as aiming, I think we want to know, okay, what are some of the futures we're trying to build towards? And a lot of the problems we're dealing with in AI, such as alignment, are problems you need to be thinking about really far out into the future. So we're kind of aiming fuzzily there. But when it comes down to the more tactical, what product will we build and therefore how will people use that product, that's where we're much more, let's find out empirically.
Lenny Rachitsky: That's a good way of putting it. Something else: when people hear companies like yours saying, okay, we're gonna be bottoms-up, we're gonna try a bunch of stuff, we're not gonna have an exact plan of where it's going in the next few months, the key is that you all hire the best people in the world. And so that feels like a really key ingredient in order to be this successful at bottoms-up work.
Alexander Embiricos: It's just surprising, basically. I was, again, surprised or even shocked when I arrived at the level of individual drive and autonomy that everyone here has. So I think, the way that OpenAI runs, you can't just read this or listen to a podcast and be like, I'm just gonna deploy this to my company. Maybe this is a harsh thing to say, but very few companies have the talent caliber to be able to do that, so it might need to be adjusted if you were gonna implement this.
Lenny Rachitsky: Okay, so let's talk Codex. You lead work on Codex. How's Codex going? What numbers can you share, is there anything you can share there? Also, not everyone knows exactly what Codex is. Explain what Codex is.
Alexander Embiricos: Totally, yeah. So I have the very lucky job of living in the future and leading product on Codex, and Codex is OpenAI's coding agent. Super concretely, that means it's an IDE extension, like a VS Code extension you can install, or a terminal tool you can install. And when you do so, you can basically pair with Codex to answer questions about code, write code, run tests, execute code, and do a bunch of the work in that thick middle section of the software development life cycle, which is all about writing code that you're gonna get into production. More broadly, we think of what Codex currently is as just the beginning of a software engineering teammate. And when we use a big word like teammate, some of the things we're imagining are that it's not only able to write code, but it actually participates early on in the ideation and planning phases of writing software, and then further downstream in terms of validation, deploying, and maintaining code. To make that a little more fun, one thing I like to imagine is that Codex today is a bit like this really smart intern that refuses to read Slack and doesn't check Datadog or Sentry unless you ask it to. And no matter how smart it is, how much are you gonna trust it to write code without you also working with it? So that's how people mostly use it today: they pair with it. But we wanna get to the point where it can work just like a new intern you hire. You don't only ask them to write code, you ask them to participate across the cycle, so that even if they don't get something right on the first try, they're eventually gonna be able to iterate their way there.
Lenny Rachitsky: I thought the point about not reading Slack and Datadog was that it's just not distracted, it's just focused and always in flow. But I get what you're saying: it doesn't have all the context on everything that's going on.
Alexander Embiricos: And that's not only true when it's performing a task. Again, if you think of the best human teammates, you don't tell them what to do, right? Maybe when you first hire them you have a couple meetings, and you kind of learn, okay, these prompts work for this teammate, these prompts don't, this is how to communicate with this person. Then eventually you give them some starter tasks, you delegate a few tasks, but then eventually you just say, hey, great, you're working with this set of people in this area of the codebase, feel free to work with other people in other parts of the codebase too, you tell me what you think makes sense to be done. We think of this as proactivity, and one of our major goals with Codex is to get to proactivity. I think this is critically important to achieve the mission of OpenAI, which is to deliver the benefits of AGI to all humanity. I like to joke today, and it's a half joke, that AI products are actually really hard to use, because you have to be very thoughtful about when they could help you, and if you're not prompting a model to help you, it's probably not helping you at that time. If you think of how many times the average user is prompting AI today, it's probably tens of times. But if you think of how many times people could actually get benefit from a really intelligent entity, it's thousands of times per day. So a large part of our goal with Codex is to figure out what is the shape of an actual teammate agent that is helpful by default.
Lenny Rachitsky: When people think about Cursor, and even Claude Code, it's like an IDE that helps you code and kind of autocompletes code, and maybe does some agentic work. What I'm hearing here is the vision is different: it's a teammate, like a remote teammate building code for you, that you talk to and ask to do things, and it also does IDE autocomplete and things like that. Is that kind of a differentiator in the way you think about Codex?
Alexander Embiricos: It's basically this idea that if you're a developer and you're trying to get something done, we want you to just feel like you have superpowers and you're able to move much, much faster. But we don't think that in order for you to reap those benefits, you need to be sitting there constantly thinking about how you can invoke AI at this point to do this thing. We want you to be able to sort of plug it into the way that you work and have it just start to do stuff without you having to think about it.
Lenny Rachitsky: Okay, I have a lot of questions along those lines. But just, how's it going? Are there any stats, any numbers you can share about how Codex is doing?
Alexander Embiricos: Yeah, Codex has been growing absolutely explosively since the launch of GPT-5 back in August. There are definitely some interesting product insights to talk about as to how we unlocked that growth, if you're interested. But the last stat we shared was that we were well over 10x since August; in fact, it's been 20x since then. Also, the Codex models are serving many trillions of tokens a week now, and it's basically our most served coding model. One of the really cool things we've seen is that the way we decided to set up the Codex team was to build a really tightly integrated product and research team that iterate on the model and the harness together. It turns out that lets you do a lot more and try many more experiments as to how these things will work together. So we were training these models for use in our first-party harness that we were very opinionated about, and then what we've started to see more recently is that other major API coding customers are now starting to adopt these models as well. So we've reached the point where the Codex model is the most served coding model in the API as well.
Lenny Rachitsky: You hinted at this: what unlocked this growth? I am extremely interested in hearing that. It felt like before, and maybe this was before you joined the team, it just felt like Claude Code was killing it. Everyone was sitting on top of Claude Code; it was by far the best way to code. And then all of a sudden Codex comes around. I remember Karpathy tweeted that he's just never seen a model like this. I think the tweet was: the gnarliest bugs that he runs into, that he just spends hours trying to figure out, that nothing else has solved, he gives to Codex, lets it run for an hour, it solves them. What'd you guys do?
Alexander Embiricos: We have this strong mission here at OpenAI, basically to build AGI, and so we think a lot about how we can shape the product so that it can scale. Earlier I was mentioning, hey, if you're an engineer, you should be getting help from AI thousands of times per day. So we thought a lot about the primitives for that when we launched our first version of Codex, which was Codex cloud. That was basically a product that had its own computer. It lived in the cloud, you could delegate to it, and the coolest part was you could run many, many tasks in parallel. But some of the challenges we saw were that it's a little bit harder to set that up, both in terms of environment configuration, like giving the model the tools it needs to validate its changes, and in terms of learning how to prompt in that way. My analogy for this, going back to the teammate analogy: it's like if you hired a teammate but you're never allowed to get on a call with them, and you can only go back and forth asynchronously over time. That works for some teammates, and eventually that's actually how you want to spend most of your time, so that's still the future. We still have that vision of a teammate that you delegate to and that is proactive, and we're seeing that growing. The key unlock is that first you need to land with users in a way that's much more intuitive and trivial to get value from. The way the vast majority of users discover Codex today is they either download an IDE extension or run it in their CLI, and the agent works there with you on your computer, interactively. It works within a sandbox, which is actually a really cool piece of tech that helps it be safe and secure, but it has access to all those dependencies, so
if the agent needs to do something, like run a command, it can do so within the sandbox. We don't have to set up any environment, and if it's a command that doesn't work in the sandbox, it can just ask you. So you can get into this really strong feedback loop using the model, and then over time our team's job is to help turn that feedback loop into you, as a byproduct of using the product, configuring it so that you can then be delegating to it down the line. Again, the analogy, I keep going back to it: if you hire a teammate and you ask them to do work, but you just give them a fresh computer from the store, it's gonna be hard for them to do their job, right? But if, as you work with them side by side, you can say, oh, you don't have a password for this service we use, here's the password, or, don't worry, feel free to run this command, then it's much easier for them to go off and do work for hours without you.
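The sandboxed-shell feedback loop described above can be sketched roughly as follows. This is a toy illustration in Python, not Codex's actual sandbox (which relies on OS-level isolation); the minimal environment, the timeout, and the function name are all assumptions made for the sketch:

```python
import subprocess

def run_sandboxed(command, workdir, timeout=60):
    """Toy sketch of running an agent's shell command in a restricted way:
    a stripped-down environment (no inherited secrets), a confined working
    directory, and a timeout. A real sandbox would add OS-level isolation
    (filesystem and network restrictions), which this sketch does not do."""
    env = {"PATH": "/usr/bin:/bin"}  # minimal environment, nothing inherited
    try:
        result = subprocess.run(
            command,
            shell=True,
            cwd=workdir,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        # return everything so the agent (or the user) can decide what to do next
        return result.returncode, result.stdout, result.stderr
    except subprocess.TimeoutExpired:
        # mirrors the "it can just ask you" escape hatch from the conversation
        return -1, "", "command timed out; ask the user how to proceed"

# Example: the agent validates a change by running a command in the sandbox.
rc, out, err = run_sandboxed("echo hello", workdir=".")
```

The point of the design, as Alexander describes it, is that commands either succeed inside the restricted environment or fail in a way that can be surfaced back to the user, which is what keeps the feedback loop both safe and fast.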
Lenny Rachitsky: So what I'm hearing is the initial version of Codex was almost too far in the future. It's like a remote, in-the-cloud agent that's coding for you asynchronously. And what you did is, okay, let's actually come back a little bit, let's integrate into the way engineers already work, in IDEs and locally, and help them kind of on-ramp to this new world.
Alexander Embiricos: Totally. And this was quite interesting, because we dogfood products a ton at OpenAI, dogfood as in we use our own product, and so Codex has been accelerating OpenAI over the course of the entire year, and the cloud product was a massive accelerant to the company as well. It just turns out that this was one of those places where the signal we got from dogfooding is a little bit different from the signal you get from the general market, because at OpenAI we train reasoning models all day, so we're very used to this kind of prompting: think upfront, run things massively in parallel, let it take some time, and come back to it later asynchronously. So now when we build, we still get a ton of signal from dogfooding internally, but we're also very cognizant of the different ways that different audiences use the product.
Lenny Rachitsky: That's really funny. It's like, live in the future, but maybe not too far in the future. And I could see how everyone at OpenAI is living very far in the future, and sometimes that won't work for everyone. What about just intelligence, training data, I don't know, is there something else that helped Codex accelerate its ability to actually code? Is it better, cleaner data? Is it more just models advancing? Is there anything else that really helped accelerate?
Alexander Embiricos: Yeah, so there are a few components here. You were mentioning models, and the models have improved a ton. In fact, just last Wednesday we shipped GPT-5.1-Codex-Max, a very accurately named model that is awesome. It is awesome both because, for any given task you were using GPT-5.1-Codex for, it's roughly 30% faster at accomplishing it, and because it unlocks a ton of intelligence, so if you use it at our higher reasoning levels, it's just even smarter. And that tweet you mentioned, that Karpathy made, like, hey, give us your gnarliest bugs: obviously there's a ton going on in the market right now, but Codex-Max is definitely carrying that mantle of tackling the hardest bugs. So that is super cool. But I will say, how we're thinking about this is evolving a little bit, from, yeah, we're just gonna train the best model, to really thinking about what an agent actually is overall. I'm not gonna try to define agent exactly, but at least the stack we think of it as having is: you have this really smart reasoning model that knows how to do a specific kind of task really well, and we can talk about how we make that possible, but then we need to serve that model through an API into a harness, and both of those also have a really big role here. For instance, one of the things we're really proud of is that you can have GPT-5.1-Codex-Max work for really long periods of time. That's not normal, but you can set it up to do that, or it might just happen, and now routinely we'll hear about people saying, yeah, it ran overnight, or it ran for twenty-four hours. For a model to work continuously for that amount of time, it's going to exceed its context window, and so we have a solution for that, which we call compaction. But
compaction is actually a feature that uses all three layers of that stack. You need a model that has a concept of compaction and knows, okay, as I start to approach this context window, I might be asked to prepare to run in a new context window. At the API layer, you need an API that understands this concept and has an endpoint you can hit to do this change. And at the harness layer, you need a harness that can prepare the payload for this to be done. So shipping this compaction feature, which made this behavior possible for anyone using Codex, actually meant working across all three things, and I think that's increasingly gonna be true. Another maybe underappreciated version of this: if you think about all the different coding products out there, they all have very different tool harnesses with very different opinions on how the model should work. So if you wanna train a model to be good at all the different ways it could work (maybe you have a strong opinion that it should use semantic search, maybe that it should call bespoke tools, or maybe, as in our case, that it should just use the shell and work in the terminal), you can move much faster if you're just optimizing for one of those worlds. The way we built Codex is that it just uses the shell, but in order to make that safer and more secure, we have a sandbox that the model is used to operating in. So I think one of the biggest accelerants, to go all the way back to your actual question, is just that we're building all three things in parallel, tuning each one, and constantly experimenting with how they work together, with a tightly integrated product and research team.
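The compaction idea Alexander describes, folding older conversation history into a summary as the context window fills up, can be sketched in a few lines. Everything here is hypothetical: the function names, the 4-characters-per-token heuristic, and the string-truncation "summary" are stand-ins; a real implementation would use a proper tokenizer and have the model itself write the summary:

```python
def estimate_tokens(messages):
    """Crude token estimate: roughly 4 characters per token (an assumption;
    a real harness would use the model's actual tokenizer)."""
    return sum(len(m["content"]) for m in messages) // 4

def compact(messages, limit=100):
    """Toy sketch of compaction: when history approaches the context limit,
    fold older messages into a single summary message and keep only the
    most recent turns verbatim, so the agent can keep working in a fresh
    context window."""
    if estimate_tokens(messages) <= limit:
        return messages  # still fits, nothing to do
    recent = messages[-2:]   # keep the latest turns verbatim
    older = messages[:-2]
    # Placeholder for a model-written summary: truncate each older message.
    summary = "Summary of earlier work: " + "; ".join(
        m["content"][:30] for m in older
    )
    return [{"role": "system", "content": summary}] + recent
```

The three-layer point from the conversation maps onto even this toy version: the model must know how to produce a useful summary, the API must expose a way to trigger it, and the harness (the `compact` function here) must assemble the new, smaller payload.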
Lenny Rachitsky: How do you think you win in this space? Do you think it'll always be this kind of race, with other models constantly leapfrogging each other? Do you think there's a world where someone just runs away with it and no one else can ever catch up? Is there a path to, we just win?
Alexander Embiricos: Again, it comes back to this idea of building a teammate. Not just a teammate that participates in team planning and prioritization, not just a teammate that really tests its code and helps you maintain and deploy, but even a teammate that, like an engineering teammate, can also schedule a calendar invite or move standup or whatever. So in my mind, if we just imagine that every day or every week some crazy new capability is going to be deployed by a research lab, it's just impossible for us as humans to keep up and use all this technology. I think we need to get to a world where you just have an AI teammate, or a super assistant, that you talk to and it knows how to be helpful on its own. You don't have to be reading the latest tips for how to use it; you've plugged it in, and it just provides help. That's the shape of what I think we're building, and I think that will be a very sticky, winning product if we can do it. Maybe a fun topic is: is chat actually the right interface for AI? Chat is a very good interface when you don't know what you're supposed to use something for. In the same way that if I'm on MS Teams or in Slack with a teammate, chat is pretty good: I can ask for whatever I want. It's kind of the common denominator for everything, so you can chat with a super assistant about whatever topic you want, whether it's coding or not. And then if you're a functional expert in a specific domain such as coding, there's a GUI you can pull up to go really deep and look at the code and work with the code. So what we need to build as OpenAI is basically this: you have ChatGPT, a tool that's ubiquitously available to everyone. You start using it even outside of work, just to help you, and you become very comfortable with the idea of being accelerated by AI. Then you get to work, and you can naturally just say, "Yeah, I'm just gonna ask it for this." You don't need to know about all the connectors or all the different features; you just ask it for help, and it surfaces the best way it can help at this point in time, and maybe even chimes in when you didn't ask. In my mind, if we can get to that, that's how we really build the winning product.
Lenny Rachitsky: This is so interesting, because in my chat with Nick Turley, the head of ChatGPT, I think he shared that the original name for ChatGPT was "super assistant" or something like that.
Lenny Rachitsky: And it's interesting that there's that approach to the super assistant, and then there's this Codex approach. It's almost like the B2C version and the B2B version. What I'm hearing is the idea here is: okay, you start with coding and building, and then it's doing all this other stuff for you, scheduling meetings, probably posting in Slack, shipping designs, I don't know. Is the idea that this is the business version of ChatGPT, in a sense, or is there something else there?
Alexander Embiricos: Yeah.
Alexander Embiricos: Yeah, so, you know, we're getting to the one-year-time-horizon conversation. A lot of this might happen sooner, but in terms of fuzziness I think we're at the one year, so I'll give you a contention and a plausible way we get there, but as for how it happens, who knows. Basically, if we're going to build a super assistant, it has to be able to do things. So we're going to have a model, and it's going to be able to do stuff affecting your world. One of the learnings we've seen over the past year or so is that for models to do stuff, they're much more effective when they can use a computer. Okay, so now we need a super assistant that can use a computer, or many computers, and the question is: how should it use the computer? There are lots of ways to use a computer. You could try to hack the OS and use accessibility APIs. Maybe a bit easier, you could point and click; that's a little slow and unpredictable sometimes. And another way, which turns out to be the best way for models to use computers, is simply to write code. So we're getting to this idea that if you want to build any agent, maybe you should be building a coding agent. And maybe a non-technical user won't even know they're using a coding agent, the same way no one thinks about whether they're using the internet or not; it's more just "is Wi-Fi on?" So what we're doing with Codex is building a software engineering teammate, and as part of that, we're building an agent that can use a computer by writing code. We're already seeing some pull for this. It's quite early, but we're starting to see people using Codex for coding-adjacent purposes, and as that develops, I think we'll naturally see that we should just always have the agent write code if there's a coding way to solve a problem. Even if you're doing financial analysis, maybe write some code. So basically, those are the two ends of this product: the super assistant of ChatGPT, and, in my mind, coding as a core competency of any agent, including ChatGPT. What we think we're really building is that competency. And here's the really cool thing about agents writing code: you can import code. Code is composable and interoperable.
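The "agents use computers by writing code" loop can be sketched very simply: the model emits a script, and the harness runs it in a separate process and feeds the output back. This is a minimal illustrative sketch, not any real Codex or OpenAI API; in practice the script would run inside a sandbox with far tighter restrictions than a plain subprocess:

```python
# Sketch: run a model-generated Python script in a separate process
# and capture what it printed, so the agent can see the result.
import os
import subprocess
import sys
import tempfile

def run_generated_script(script: str, timeout: int = 10) -> str:
    """Write the agent's script to a temp file, execute it, return stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(script)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.stdout
    finally:
        os.remove(path)  # throwaway code: clean up after the run
```

This is also why "throwaway" code becomes so cheap: the script only has to live long enough to produce an answer.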
Alexander Embiricos: One very reductive view we could have of an agent is that it's just going to be given a computer and it's going to point and click and go around. But that is the future, and how we get there is difficult to chart, because a lot of the questions around building agents aren't "can the agent do it?" It's more: how can we help the agent understand the context it's working in? The team that's using it has a way they like to do things; they have guidelines; they probably want certain deterministic guarantees about what the agent can or cannot do, or they want to know the agent understands some detail. An example: if we're looking at a crash reporting tool and hitting a connector for it, every sub-team probably has a different meta-prompt for how they want their crashes analyzed. So we get to this thing where, yes, we have this agent sitting in front of a computer, but we need to make that configurable for the team or for the user. And the stuff the agent does often, we probably just want to build in as a competency the agent has. So I think we end up with this generalizable thing you were describing, an agent that can write its own scripts for whatever it wants to do. But the really key part is: can we make it so that everything the agent has to do often, or that it does well, we can remember and store, so the agent doesn't have to write a script for that again? Or maybe, if I just joined a team and you're already on that team, I can just use all the scripts the agents have already written.
Lenny Rachitsky: Yeah, if this is our teammate, it can share things it's learned from working with other people at the company. It just makes sense as a metaphor. It feels like you're in the camp of "agents today are not that great and mostly slop, and maybe in the future they'll be awesome." Does that resonate?
Alexander Embiricos: I think so. I think coding agents are pretty great, I think.
Lenny Rachitsky: That feels right there, yep.
Alexander Embiricos: A ton of value.
Alexander Embiricos: And then I think agents outside of coding, it's still very early. This is just my opinion, but I think they're gonna get a whole lot better once they can use coding in a composable way. It's kind of the fun part of building for software engineers. At my startup we were building for software engineers too, for a lot of that journey, and they're just such a fun audience to build for, because they also like building for themselves and are often even more creative than we are in thinking about how to use the technology. By building for software engineers, you get to observe a ton of emergent behaviors and things you should do and build into the product.
Lenny Rachitsky: I love how you say that, because a lot of people building for engineers get really annoyed, because engineers are always complaining about stuff: "that sucks, why'd you build it this way?" I love that you enjoy it, but I think it's probably because you're building such an amazing tool for engineers that can actually solve problems and, you know, code for them. Kind of along those lines: there's always this talk of what will happen with jobs, with engineers, with coding, whether you still have to learn to code, all these things. Clearly the way you're describing it, it's a teammate; it's gonna work with you and make you more superhuman, not replace you. What's the way you think about the impact on the field of engineering of having a super-intelligent engineering teammate?
Alexander Embiricos: I think there are two sides to it. The one we were just talking about is this idea that maybe every agent should actually use code and be a coding agent, and in my mind that's just a small part of the broader idea that as we make code even more ubiquitous (you could probably claim it's ubiquitous today, even pre-AI), it's going to be used for many more purposes. So there's just going to be a lot more need for humans with this competency. That's my view. This is quite a complex topic, so it's something we talk about a lot, and we have to see how it pans out. But what we can do as a product team building in this space is always think about how we're building a tool so that it feels like we're maximally accelerating people, rather than building a tool that makes it more unclear what you should do as the human. To give an example: right now, when you work with a coding agent, it writes a ton of code. But it turns out writing code is actually one of the most fun parts of software engineering for many software engineers, and so you end up reviewing AI code, which is often a less fun part of the job. We see this play out all the time in a ton of micro-decisions, so we as a product team are always thinking: okay, how do we make this more fun, how do we make you feel more empowered where it's not working? I would argue that reviewing agent-written code is a place that today is less fun. So what can we do about that? Well, we can ship a code review feature that helps you build confidence in the AI-written code. Another thing we can do is make the agent better able to validate its own work. And it gets all the way down into micro-decisions. If you're going to have an agent capability to validate work, and, I'm thinking of Codex web right now, you have a pane that reflects the work the agent did, what do you see first? The diff, or the image preview of what it built? If you're thinking about this from the perspective of "how do I empower the human, how do I make them feel as accelerated as possible," you obviously show the image first. You shouldn't be reviewing the code until you've seen the image, unless maybe it's been reviewed by an AI and now it's time for you to take a look.
Lenny Rachitsky: When I had Michael Truell, CEO of Cursor, on the podcast, he had this kind of vision of us moving to something beyond code, and I've seen the rise of something called spec-driven development, where you just write the spec and then the AI writes the code for you, so you start working at a higher abstraction level. Is that where you see us going: engineers not having to actually write code or look at code, with a higher level of abstraction we focus on instead?
Alexander Embiricos: Yeah, I mean, I think there are constantly these levels of abstraction, and they're actually already playing out today. Today, with coding agents, it's mostly prompt-to-patch. We're starting to see people doing spec-driven development, or plan-driven development. That's actually one of the answers when people ask, "Hey, how do you run Codex on a really long task?" Often it's: collaborate with it first to write a plan.md, a markdown file that's your plan, and once you're happy with that, ask it to go off and do the work. If that plan has verifiable steps, it'll work for much longer. So we're totally seeing that. I think spec-driven development is an interesting idea. It's not clear to me it'll work out that way, because a lot of people don't like writing specs either, but it seems plausible some people will work that way. A bit of a joke idea, though: if you think of the way many teams work today, they often don't necessarily have specs, but the team is just really self-driven, and so stuff just gets done. I'm coming up with this on the spot, so it's not a good name, but call it chatter-driven development: stuff is happening on social media and in your team communication tools, and as a result, code gets written and deployed. I'm a little more oriented that way. I don't necessarily want to have to write a spec; I will only if I like writing specs. Other times I might just want to say, "Hey, here's the customer service channel, tell me what's interesting to know, and if it's a small bug, just fix it." I don't want to have to write a spec for that. I have this hypothetical future I like to share sometimes as a provocation, which is: in a world where we have truly amazing agents, what does it look like to be a solopreneur? One terrible idea for how it could look is that there's a mobile app, and every idea the agent has is a vertical video on your phone. You swipe left if you think it's a bad idea, swipe right if it's a good idea, and you can press and hold and speak to your phone if you want to give feedback on the idea before you swipe. In this world, your job is basically to plug this app into every signal system and system of record, and then you just sit back and swipe. I don't know.
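The "plan with verifiable steps" idea has a natural shape in code: each step pairs an action with a check, and the agent loops until the check passes. This is an illustrative sketch only (the `Step` and `run_plan` names are made up, not a Codex feature):

```python
# Sketch of plan-driven agent work: each plan step carries its own
# verification, so the agent can keep working autonomously for longer.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    description: str
    execute: Callable[[], None]   # the agent's action for this step
    verify: Callable[[], bool]    # how we know the step actually worked

def run_plan(steps: List[Step], max_attempts: int = 3) -> List[str]:
    """Run each step in order, retrying until its verification passes."""
    completed = []
    for step in steps:
        for _ in range(max_attempts):
            step.execute()
            if step.verify():
                completed.append(step.description)
                break
        else:
            raise RuntimeError(f"could not verify: {step.description}")
    return completed
```

The point of the structure is exactly what Embiricos says: a plan whose steps are verifiable lets the agent run much longer without a human in the loop.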
Lenny Rachitsky: I love this. So this is Tinder meets TikTok meets Codex.
Alexander Embiricos: It's pretty terrible.
Lenny Rachitsky: No, this is great. So the idea here is this agent is watching and listening to you, paying attention to the market and your users, and it's like, "Cool, here's something I should do." It's like a proactive engineer: "Here, we should build this feature and fix this thing."
Alexander Embiricos: Exactly. I think it's really gonna be communicating with you in, like, the lowest... yeah.
Lenny Rachitsky: Yeah, like the modern way to communicate: swipe left and right, a vertical feed, and then the Sora video. Okay, so I see how this all connects now.
Alexander Embiricos: Yeah, to be clear, we're not building that. It's a fun idea. But in this example, one of the things it's doing is consuming external signals. I think the other really interesting thing is: if we think about it, what is the most successful AI product to date?
Alexander Embiricos: I would argue... it's funny, actually, not to confuse things, but the first time we used the Codex brand at OpenAI, it was the model powering GitHub Copilot, way back in the day, years ago. We decided to reuse that brand recently because it's just so good: Codex, code execution. But I think autocompletion in IDEs is one of the most successful AI products today, and part of what's so magical about it is that it can surface ideas for helping you really rapidly. When it's right, you're accelerated; when it's wrong, it's not that annoying. It can be annoying, but not that annoying. So you can create this next-suggestion system that's contextually responding to what you're attempting to do. In my mind, this is a really interesting thing for us at OpenAI as we're building. For instance, when I think about launching a browser, which we did with Atlas, one of the really interesting things we can then do is contextually surface ways we can help you as you're going about your day. We break out of "we're just looking at code" or "we're just in your terminal" into the idea that a real teammate is dealing with a lot more than just code; they're dealing with a lot of things that are web content. So how can we help you with that?
Lenny Rachitsky: Man, there's so much there, and I love this. Okay, so autocomplete on the web with the browser, that's so interesting: here are all the things we can help you with as you're browsing and going about your day. I want to talk about Atlas; I'll come back to that. "Codex: code execution," I did not know that, that's really clever, I get it now. And then this chatter-driven development, this is a really good idea, but it reminds me: I had Dhanji Prasanna, CTO of Block, on the podcast, and they have this product called Goose, which is their own internal agent. He talked about an engineer at Block who just has Goose watch his screen and listen to every meeting, and it proactively does work he'd probably want done: it ships a PR, sends an email, drafts a Slack message. So it's doing exactly what you're describing, in a very early way.
Alexander Embiricos: Yeah, that's super interesting, and I bet you... so if we went and asked them what the bottleneck to that productivity is, did they share what it is?
Lenny Rachitsky: Probably looking at it all and just making sure it's the right thing, yeah.
Alexander Embiricos: Yeah, so we see this now. We have a Slack integration for Codex, and people love it. If there's something you need to do quickly, people just mention Codex: "Why do you think this bug is happening?" It doesn't have to be an engineer, either; data scientists, it appears, are using Codex a ton to just answer questions, like "Why do you think this metric moved? What happened?" For questions, you get the answer right back in Slack. It's amazing, super useful. When it's writing code, though, you have to go back and look at the code. So the real bottleneck right now, I think, is validating that the code worked, and code review. In my mind, if we wanted to get to something like the world your friend was talking about, we really need to figure out how to get people to configure their coding agents to be much more autonomous on those later stages of the work. It makes sense.
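One way to picture that "autonomy on the later stages" is a gate: the agent's change only reaches a human reviewer once the agent's own validation (tests, lint, build) has passed, with the evidence attached. A minimal sketch, assuming hypothetical names throughout:

```python
# Sketch: only escalate agent-written changes to human review after the
# agent's own checks pass; otherwise send the change back for rework.

def ready_for_human_review(change, checks):
    """Run each named check against the change; if all pass, mark the
    change ready for a human with the results attached as evidence."""
    results = {name: check(change) for name, check in checks.items()}
    if all(results.values()):
        return {"status": "ready", "evidence": results}
    failed = [name for name, ok in results.items() if not ok]
    return {"status": "needs_rework", "failed": failed}
```

The human still reviews, but starts from "everything the agent could verify already passed," which is what shrinks the validation bottleneck.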
Lenny Rachitsky: Like you said, writing code... I was an engineer for ten years. Really fun to write code, really fun to get in the flow, build, architect, test. Not so fun to look at everyone else's code and have to go through it and be on the hook if it's doing something dumb that's gonna take down production. And now that building has become easier, what I've always heard from companies at the cutting edge of this is that the bottleneck is now figuring out what to build, and then at the end, "Okay, I have a hundred PRs to review; who's gonna go through all that?"
Alexander Embiricos: Right.
Lenny Rachitsky: Yeah. This episode is brought to you by Jira Product Discovery. The hardest part of building products isn't actually building products; it's everything else: proving that the work matters, managing stakeholders, trying to plan ahead. Most teams spend more time reacting than learning, chasing updates, justifying roadmaps, and constantly unblocking work to keep things moving. Jira Product Discovery puts you back in control. With Jira Product Discovery, you can capture insights and prioritize high-impact ideas. It's flexible, so it adapts to the way your team works, and helps you build a roadmap that drives alignment, not questions. And because it's built on Jira, you can track ideas from strategy to delivery, all in one place. Less chasing, more time to think, learn, and build the right thing. Get Jira Product Discovery for free at atlassian.com/lenny. That's atlassian.com/lenny. What has the impact of Codex been on the way you operate as a product person, as a PM? It's clear how engineering is impacted: code is written for you. What has it done to the way you operate, and the way PMs operate at OpenAI?
Alexander Embiricos: Yeah, I mean, mostly I just feel much more empowered. I've always been a more technical-leaning PM, and especially when I'm working on products for engineers, I feel it's necessary to dogfood the product. But even beyond that, I just feel like I can do much, much more as a PM. Scott Belsky talks about this idea of compressing the talent stack (I'm not sure I phrased that right), which is basically the idea that the boundaries between these roles are a little less needed than before, because people can just do much more, and every time someone can do more, you can skip one communication boundary and make the team that much more efficient. We see it in a bunch of functions now, but since you asked about product specifically: answering questions is much, much easier; you can just ask Codex for thoughts. A lot of PM-type work, like understanding what's changing, again, just ask Codex for help. Prototyping is often faster than writing specs; a lot of people have talked about that. Something slightly surprising, though I don't think it's super surprising: we're mostly building Codex to write code that's going to be deployed to production, but we actually see a lot of throwaway code written with Codex now. That goes back to the idea of ubiquitous code. You'll see someone who wants to do an analysis: just give Codex a bunch of data, but then ask it to build an interactive data viewer for that data. That was just too annoying to do in the past, but now it's totally worth the time of getting an agent to go do it. Similarly, I've seen some pretty cool prototypes on our design team. A designer wanted to build an animation (it's the coin animation in Codex), and normally it would be too annoying to program it, so they vibe-coded an animation editor, then used the animation editor to build the animation, which they then checked into the repo. Designers are seeing a ton of acceleration. And speaking of compressing the talent stack, I think our designers are very much PMs: they do a ton of product work, and they actually have an entire vibe-coded side prototype of the Codex app. So a lot of how we work is: we'll have a really quick jam, because there are 10,000 things going on, and then the designer will go think about how something should work. Instead of talking about it again, they'll vibe-code a prototype in their standalone prototype, we'll play with it, and if we like it, they'll vibe-code, or vibe-engineer, that prototype into an actual PR to land. Then, depending on their comfort with the code base (the Codex CLI is in Rust, which is a little harder), maybe they'll land it themselves, or they'll get close and an engineer helps them land the PR. We recently shipped the Sora Android app, and that was one of the most mind-blowing examples of acceleration. Usage of Codex internally at OpenAI is obviously really, really high, and it's been growing over the course of the year; now basically all technical staff use it, and even the intensity and know-how of making the most of coding agents has gone up by a ton. The Sora Android app, a fully new app: we built it in eighteen days, going from zero to launch to employees, and then ten days later, so twenty-eight days total, we went GA to the public. That was done with the help of Codex, so pretty insane velocity. I would say it was, I don't want to say easy mode, but there is one thing Codex is really good at: if you're a company building software on multiple platforms, so you've already figured out some of the underlying APIs or systems, asking Codex to port things over is really effective, because it has something to go look at. The engineers on that team were basically having Codex look at the iOS app, produce plans of the work that needed to be done, and then go implement those, kind of looking at iOS and Android at the same time. So basically it was two weeks to launch to employees, four weeks total. Insanely fast.
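The throwaway "interactive data viewer" pattern Embiricos describes can be as small as a script that turns rows into an HTML table for one-off inspection. This is an entirely illustrative sketch of the kind of disposable code an agent would generate, not anything Codex-specific:

```python
# Sketch of throwaway tooling: render tabular data as HTML so you can
# open it in a browser, look at it once, and delete it.
import html

def rows_to_html_table(rows, headers):
    """Render rows as an HTML table for quick, disposable inspection."""
    head = "".join(f"<th>{html.escape(h)}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return (
        f"<table><thead><tr>{head}</tr></thead>"
        f"<tbody>{body}</tbody></table>"
    )
```

A real agent-generated viewer would add sorting and filtering on top, but the economics are the point: code this disposable was never worth a human's time before, and now it is.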
Lenny Rachitsky: What makes that even more insane is that it became the number one app in the App Store.
Alexander Embiricos: I don't know.
Lenny Rachitsky: This just boggles the mind. Okay, so...
Alexander Embiricos: Yeah, so imagine doing...
Lenny Rachitsky: The number one...
Alexander Embiricos: ...app in the App Store with a handful of engineers, I think it was two or three, in a handful of weeks, yeah.
Lenny Rachitsky: This is absurd.
Alexander Embiricos: So yeah, that's a really fun example of acceleration. Atlas is the other one. I think Ben did a podcast with you, Lenny, on Atlas, sharing a little bit about how we built it. Atlas is a browser, and building a browser is really hard, so we had to build a lot of difficult systems to do it. We got to the point where that team has a ton of power users of Codex, and we were talking to them about it, because a lot of those engineers are people I used to work with before my startup. They'd say, "Before, this would have taken us two to three weeks for two to three engineers, and now it's one engineer, one week." So massive acceleration there as well. And what's quite cool is we shipped Atlas on Mac first, but now we're working on Windows, so the team is ramping up on Windows, and they're helping us make Codex better on Windows too, which is admittedly earlier. The model we shipped last week is the first model that natively understands PowerShell, PowerShell being the native shell language on Windows. So it's been really awesome to see the whole company getting accelerated by Codex: most obviously research, improving how quickly we train models and how well we do it, then design, as we talked about, and marketing. We're at the point now where my product marketer is often making string changes directly from Slack, or updating docs directly from Slack.
Lenny Rachitsky: These are amazing examples. You guys are living at the bleeding edge of what's possible, and this is how other companies are gonna work. Just shipping, again, what became the number one app in the App Store, beloved all over; it took over the world for at least a week, built in, you said, twenty-eight days, and eighteen days just to get the core of it working.
Alexander Embiricos: Yeah, at eighteen days we had a thing employees were playing with, and then ten days later we were out.
Lenny Rachitsky: And you said just a couple engineers, yeah, two or three. Okay. And then Atlas: you said it took a week to build?
Alexander Embiricos: No, no, no. Atlas was not one week for the whole thing; Atlas was a really meaty project. I was talking to one of the engineers on Atlas about what they use Codex for, and it's basically "we use Codex for absolutely everything." I asked, okay, how would you measure the acceleration? The answer I got back was that what previously would have taken two to three weeks for two to three engineers is now one engineer, one week.
Lenny Rachitsky: Do you think this eventually moves to non-engineers doing this sort of thing? Does it have to be an engineer building it, or could it have been built by, I don't know, a PM or a designer?
Alexander Embiricos: I think we will very much get to the point where the boundaries are a little bit blurred. You're going to want someone who understands the details of what they're building, but what those details are will evolve, kind of like how, if you're writing Swift now, you don't have to speak assembly. There's a handful of people in the world who speak assembly, and it's really important that they exist (maybe more than a handful), but that's a specialized function most companies don't need to have. So I think we're naturally going to see an increase in the layers of abstraction, and the cool thing is we're now entering the natural-language layer of abstraction. Natural language itself is really flexible: you could have engineers talking about a plan, or about a spec, or just about a product or an idea, so we can start moving up those layers of abstraction as well. But I think this is going to be gradual. I don't think it suddenly goes to nobody ever writing any code and it's all specs. It's going to be much more like: okay, we've set up our coding agent to be really good at previewing the build or running tests; maybe that's the first part most people set up, so it can execute the build and see the results of its own changes. But we haven't yet built a good integration harness, so, in the case of Atlas (I don't actually know how much of this they've done; I think a lot), maybe the next stage is enabling it to load a few sample pages to see how well those work, and then we set that up. I think for some time at least, we're going to have humans curating which of these connectors or systems or components the agent needs to be good at talking to. Then in the future there'll be an even greater unlock, where Codex tells you how to set it up, or maybe sets itself up in a repo.
Lenny Rachitsky: What a wild time to be alive. Wow. I'm curious about the second-order effects of this, of how quick it is to build stuff. Does that mean distribution becomes much, much more important? Does it mean ideas are worth a lot more? It's interesting to think about how that changes.
Alexander Embiricos:I'm curious what you think. I still don't think ideas are worth as much as maybe a lot of people think. I still think execution is really hard, right? You can build something fast, but you still need to execute well on it. It still needs to make sense and be a coherent thing overall. And distribution is massive.
Lenny Rachitsky:Yeah, it just feels like everything else is now more important, everything that isn't the building piece: coming up with an idea, getting to market, profit, all that kind of thing.
Alexander Embiricos:I think we might have been in this weird temporary phase where, for a while, it was so hard to build product that you mostly just had to be really good at building product, and it maybe didn't matter if you had an intimate understanding of a specific customer. But now I think we're getting to the point where, if I could only go in with one core competency, it would be a really meaningful understanding of the problems that a certain customer has. I think that's ultimately still what's going to matter most. If you're starting a new company today and you have a really good understanding of, and network among, customers who are currently underserved by AI tools, I think you're set. Whereas if you're good at building, you know, websites, but you don't have any specific customer to build for, I think you're in for a much harder time.
Lenny Rachitsky:Bullish on vertical AI startups is what I'm hearing. Yeah, I completely agree. There's the general thing that can solve a lot of problems, and then there's "we're going to solve presentations incredibly well, we're going to understand the presentation problem better than anyone, and we're going to plug into your workflows," and all these other things that matter for a very specific problem. Okay, incredible. When you think about progress on Codex, I imagine you have a bunch of evals, and there are all these public benchmarks. What's something you look at to tell you, okay, we're making really good progress? I imagine it's not going to be one thing, but what do you focus on? What's something you're trying to push, a KPI or two?
Alexander Embiricos:One of the things that I'm constantly reminding myself of is that a tool like Codex is naturally a tool that you become a power user of, so we can accidentally spend a lot of our time thinking about features that are very deep in the user adoption journey and end up over-solving for that. So I think it's critically important to go look at your D7 retention, to go try the product and sign up from scratch again. I have a few too many ChatGPT Pro accounts that I've signed up for on my Gmail in order to maximally correctly dogfood. They charge me like $200 a month; I need to expense those. But the feeling of being a user and the early retention stats are still super important for us, because as much as this category is taking off, I think we're still in the very early days of people using these tools. Another thing we do: I think we might be the most user-feedback, social-media-pilled team out there in this space. A few of us are constantly on Reddit and Twitter. There's praise up there, and there are a lot of complaints, but we take the complaints very seriously and look at them. Because you can use a coding agent for so many different things, it often is kind of broken in many ways for specific behaviors, so we monitor what the vibes are on social media pretty often. Twitter, X, is a little bit more hypey, and Reddit is a little more negative but real, actually. So I've started increasingly paying attention to how people are talking about using Codex on Reddit.
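For readers unfamiliar with the metric: D7 retention is just the share of a signup cohort that comes back on day 7. A minimal sketch, with hypothetical data shapes (a dict of signup dates and a set of per-day activity records), not anything tied to how the Codex team computes it:

```python
from datetime import date

def d7_retention(signups, activity):
    """Fraction of users active exactly 7 days after signing up.

    signups:  dict of user_id -> signup date
    activity: set of (user_id, date) pairs for days the user was active
    """
    if not signups:
        return 0.0
    retained = sum(
        1 for user, signed_up in signups.items()
        if (user, date.fromordinal(signed_up.toordinal() + 7)) in activity
    )
    return retained / len(signups)

signups = {"a": date(2025, 1, 1), "b": date(2025, 1, 1)}
activity = {("a", date(2025, 1, 8))}
print(d7_retention(signups, activity))  # 0.5
```

Real pipelines usually use rolling windows (active any time in days 1 to 7) rather than the exact-day variant sketched here, but the cohort-based shape of the calculation is the same.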
Lenny Rachitsky:This is important for people to know. Which subreddits do you check most? Is there like an r/Codex?
Alexander Embiricos:I mean, the algorithm is pretty good at surfacing stuff, but yeah, r/Codex is...
Lenny Rachitsky:...there, okay. Very interesting. And then if people tag you on Twitter, you still see that, but maybe it's not as powerful as seeing it on Reddit?
Alexander Embiricos:Well, yeah. The thing with Twitter is it's a little bit more one-to-one, even if it's in public, whereas with Reddit there are really good upvoting mechanics, and maybe most people are still not bots, unclear, so you get good signal on what matters and what other people think.
Lenny Rachitsky:So, interestingly, Atlas, I want to talk about that briefly. You guys launched Atlas. I tweeted, actually, that I tried Atlas and that I don't love the AI-only search experience. I was just like, I just want Google sometimes, or whatever, rather than waiting for AI to give me an answer, and there was no way to switch. I just tweeted, hey, I'm switching back, it's not great. And I feel like I made some PMs at OpenAI sad. Then I saw someone tweet, okay, we have this now, which I imagine was always part of the plan. It's probably an example of: we just ship, we've got to ship stuff, see how people use it, and then figure it out. So I guess one, I don't know, is there anything there? And two, I'm just curious, why are you guys building a web browser?
Alexander Embiricos:So I worked on Atlas for a bit; I don't work on it now. A bit of the narrative here for me, just to tell my story a bit: I was working on this sharing, pair-programming startup, and then we joined OpenAI, and the idea was really to build a contextual desktop assistant. The reason I believe that's so important is that it's really annoying to have to give all your context to an assistant and then figure out how it can help you. If it could just understand what you're trying to do, then it could maximally accelerate you. I still think of Codex as a contextual assistant from a little bit of a different angle, starting with coding tasks. But some of the thinking, at least for me personally, I can't speak for the whole product, was that a lot of work is done in the web, and if we could build a browser, then we could be contextual for you in a much more first-class way. We wouldn't be hacking other desktop software, which has very varied support for what content it renders to the accessibility tree. We wouldn't be relying on screenshots, which are a little slower and unreliable. Instead, we could be in the rendering engine and extract whatever we needed to help you. I also like to think of video games. I don't know if you've played, say, Halo: you walk up to an object, and this is true for many games, you press, man, it's been a long time, this is embarrassing, you press X, and it just does the right thing. I was one of those guys who always read the instruction manual for every video game I bought, and I remember the first time I read about a contextual action, I thought it was this really cool idea. The thing about a contextual action is: we need to know what you are attempting to do, we have a little bit of context, and then we can help. And I think this is critically important, because imagine the world we reach where agents are helping you thousands of times per day. Imagine if the only way we could tell you that we helped is a push notification, so you get a thousand push notifications a day of an AI saying, hey, I did this thing, do you like it? It would be super annoying. Whereas, going back to software engineering, imagine I was looking at a dashboard and noticed some key metric had gone down. At that point in time, an AI could go take a look and then surface the fact that it has an opinion on why this metric went down, and maybe a fix, right there when I'm looking at the dashboard. That would much more keep me in flow and enable the agents to take action on many more things. So in my mind, part of why I'm excited for us to have a browser is that we then have much more context around what we should help with, and users have much more control over what they want us to look at. If you want us to take action on something, you can open it in your AI browser; if you don't, you can open it in your other browser. Really clear control and boundaries. And then we have the ability to build UX that's mixed-initiative, so that we can surface contextual actions to you at the time they're helpful, as opposed to just randomly notifying you.
Lenny Rachitsky:Hearing the vision for Codex being the super assistant: it's not just there to code for you, it's trying to do a lot for you as a teammate, this kind of super teammate that makes you awesome at work. Speaking of that, are there other common non-engineering use cases for Codex? We talked about designers prototyping and building stuff. Are there any, I don't know, fun or unexpected ways people who aren't engineers are using Codex?
Alexander Embiricos:I mean, there are a load of unexpected ways, but most of what we're seeing real traction with is still, for now, very coding-adjacent or tech-oriented: places where there's a mature ecosystem, or maybe you're doing data analysis or something like that. I personally expect we're going to see a lot more beyond that over time, but for now we're keeping the team very focused on just coding, because there's so much more work to do.
Lenny Rachitsky:For people who are thinking about trying out Codex: does it work for all kinds of code bases? What code does it support? If you're at, I don't know, SAP, can you add Codex and start building things? What's the sweet spot, or where does it start to not be amazing yet?
Alexander Embiricos:I'm really glad you asked this, actually, because the best way to try Codex is to give it your hardest tasks, which is a little different from some of the other coding agents. With some tools you might think, okay, let me start easy, or just vibe-code something random and decide if I like the tool, whereas we're really building Codex to be the professional tool that you can give your hardest problems to, and that writes high-quality code in your enormous code base that is in fact not perfect right now. So if you're going to try Codex, you want to try it on a real task that you have, and not necessarily dumb that task down to something trivial. A good one would be: you have a hard bug, you don't know what's causing it, and you ask Codex to help figure that out, or implement the fix.
Lenny Rachitsky:I love that answer: just give it your hardest problem.
Alexander Embiricos:I will say, if you're like, hey, okay, well, the hardest problem I have is that I need to build a new unicorn business, obviously that's not going to work yet. So give it the hardest problem, but something that is still one question, or one task, to start. That's if you're testing it, and then over time you can learn how to use it for bigger things.
Lenny Rachitsky:Yeah. What languages does it support?
Alexander Embiricos:Basically, the way we've trained Codex, there's a distribution of languages that we support, and it's fairly aligned with the frequency of those languages in the world. So unless you're writing some very esoteric language, or some private language, it should do fine in your language.
Lenny Rachitsky:If someone was just getting started, is there a tip you could share to help them be successful? If you could whisper a little tip to someone setting up Codex for the first time to help them have a really good time, what would you whisper?
Alexander Embiricos:I might say try a few things in parallel. You could try giving it a hard task, ask it to understand the code base, formulate a plan with it around an idea that you have, and build your way up from there. The meta idea here is, again, that you're building trust with a new teammate. You wouldn't go to a new teammate and just say, hey, do this thing, here's zero context. You would start by making sure they understand the code base, then maybe align on a plan and approach, and then have them go off and do it bit by bit. If you use Codex in that way, you'll naturally start to understand the different ways of prompting it, because it is a super powerful agent and model, but it is a little bit different to prompt Codex than other models.
Lenny Rachitsky:Just a couple more questions. One, we touched on this a little bit: as AI does more and more coding, there's always this question of, should I learn to code, and why should I spend time doing this sort of thing? For people who are trying to figure out what to do with their career, especially if they're into software engineering and computer science, do you think there are specific elements of computer science that are more and more important to lean into, and maybe things they don't need to worry about? What should people be leaning into, skill-wise, as this becomes more and more of a thing in our workplace?
Alexander Embiricos:I think there are a couple of angles you could come at this from. The easiest one is just: be a doer of things. With coding agents getting better and better over time, what you can do, even as someone in college or a new grad, is just so much more than it was before, and you want to be taking advantage of that. When I'm looking at hiring folks who are earlier in their career, something I definitely think about is how productive they are using the latest tools. And if you think of it that way, they actually have less of a handicap versus a more senior person than before; the divide is getting smaller, because they've got these amazing coding agents now. So one piece of advice is: learn about whatever you want, but make sure you spend time doing things, not just fulfilling homework assignments. The other side of it, though, is that it's still deeply worth understanding what makes a good overall software system. Skills like really strong systems engineering, or really effective communication and collaboration with your team, are going to continue to matter for quite some time. I don't think it's going to be that all of a sudden the AI coding agents are just able to build perfect systems without your help. It's going to look much more gradual: okay, we have these AI coding agents, they're able to validate their work, but the human is still important. For example, I'm thinking of an engineer who was working on Atlas, since we were talking about it. He set up Codex so it could verify its own work, which is a little bit nontrivial because of the nature of the Atlas project. The way he did that was he actually prompted Codex, hey, why can't you verify your work? Fix it. And he did that on a loop. So at various phases you're still going to want a human in the loop to help configure the coding agent to be effective, and you want to be able to reason about that. So maybe it's less important that you can type really fast, or that you know how to write, not that anyone writes a for-each loop or something, or how to implement a specific algorithm. But you need to be able to reason about the different systems, and about what makes a software engineering team effective. That's the other really important thing. And then maybe the last angle is: if you're on the frontier of knowledge for a given thing, I still think that's deeply interesting to go down, partially because that knowledge is still going to be...
Alexander Embiricos:You know, agents aren't going to be as good at that. But also partially because, by trying to advance the frontier of a specific thing, you'll end up being forced to take advantage of coding agents, using them to accelerate your own workflow as you go.
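The "why can't you verify your work, fix it" loop described above can be sketched generically. This is a toy illustration, not OpenAI's actual tooling: `run_agent` and `verify` are hypothetical stand-ins for a coding-agent call and a project check command.

```python
def self_verification_loop(run_agent, verify, max_rounds=5):
    """Repeatedly ask an agent to fix whatever blocks verification.

    run_agent(prompt): hypothetical stand-in for a call to a coding agent.
    verify():          returns (passed, diagnostics) for the current state.
    Returns the round on which verification first passed, or None.
    """
    for attempt in range(1, max_rounds + 1):
        passed, diagnostics = verify()
        if passed:
            return attempt
        run_agent("Why can't you verify your work? Fix it.\n" + diagnostics)
    return None

# Toy stand-ins: each "agent" call resolves one of two blocking problems.
problems = ["tests can't run headless", "build script missing env var"]
fix_one = lambda prompt: problems.pop() if problems else None
check = lambda: (not problems, "; ".join(problems))
print(self_verification_loop(fix_one, check))  # 3
```

The point of the pattern is that the human configures the loop once (what "verify" means for this project), and the agent then iterates toward a state where it can check its own work.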
Lenny Rachitsky:What's an example of that, when you talk about being at the frontier of something?
Alexander Embiricos:Codex writes a lot of the code that helps manage its own training runs. The key infrastructure moves pretty fast, so we have Codex code-review it, and it's catching a lot of mistakes; it's actually caught some pretty interesting configuration mistakes. And we're starting to see glimpses of the future where we have Codex even be on call for its own training, which is pretty interesting. So there's lots there.
Lenny Rachitsky:Wait, what does that mean, to be on call for its own training? So it's running its training, and it's like, oh, something broke, someone needs to step in. Does it alert people, or is it like, here, I'm going to fix the problem and restart?
Alexander Embiricos:This is an early idea that we're still figuring out, but the basic idea is that during a training run there are a bunch of graphs that, today, humans are looking at, and it's really important to look at those. We call this babysitting.
Lenny Rachitsky:Because it's very expensive to train, I imagine, and very important to move fast.
Alexander Embiricos:Exactly, and there are a lot of systems underlying the training run. A system could go down, or an error could get introduced somewhere, and we might need to fix it, or pause things, or, I don't know, there are lots of actions we might need to take. So having Codex run on a loop to evaluate how those charts are moving over time is this idea we have for enabling us to train way more efficiently.
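To make the "babysitting" idea concrete: the simplest version of watching a training chart is an anomaly check over the loss curve. This is a toy sketch under invented thresholds, nothing to do with OpenAI's actual monitoring; a real setup would watch many metrics and trigger actions like paging or pausing the run.

```python
def babysit(metrics_history, loss_spike_factor=2.0):
    """Flag the first step whose loss jumps by more than loss_spike_factor
    over the previous step, the kind of anomaly a human babysitter (or an
    agent running on a loop) would investigate.

    metrics_history: list of (step, loss) tuples in step order.
    Returns the anomalous step, or None if the curve looks healthy.
    """
    for (_, prev_loss), (step, loss) in zip(metrics_history, metrics_history[1:]):
        if loss > prev_loss * loss_spike_factor:
            return step
    return None

history = [(0, 4.0), (100, 3.1), (200, 2.5), (300, 6.0), (400, 2.4)]
print(babysit(history))  # 300
```

An agent-based babysitter would run a check like this on a schedule and, when it fires, go read logs and surface a diagnosis rather than just alerting.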
Lenny Rachitsky:I love that, and this is very much along the lines of the future of agents: Codex isn't just for building code, it's a lot more than that. Okay, last question. Being at OpenAI, I can't not ask about your AGI timeline and how far you think we are from AGI. I know this isn't what you came for, but there are a lot of opinions, a lot of, I don't know, timelines. How far do you think we are from a human-level version of AI, whatever that means to you?
Alexander Embiricos:For me, it's a little bit about when we see the acceleration curves go like this, or, I don't know which way I'm mirrored here, when we see the hockey stick. There are many limiting factors, but I think a currently underappreciated one is literally human typing speed, or human multitasking speed on writing prompts. You were talking about how you can have an agent watch all the work you're doing, but if you don't have the agent also validating its work, then you're still bottlenecked on whether you can go review all that code. So my view is that we need to unblock those productivity loops, away from humans having to prompt and humans having to manually validate all the work. If we can rebuild systems to let the agent be default-useful, we'll start unlocking hockey sticks. Unfortunately, I don't think that's going to be binary; it's going to be very dependent on what you're building. I would imagine that next year, if you're a startup building something new, some new app or something, it'll be possible for you to set it up on a stack where agents are much more self-sufficient than not. But now let's say, I don't know, you work at SAP. They have many complex systems, and they're not going to be able to just get the agent to be self-sufficient across those systems overnight. They're going to have to slowly replace or update systems to allow the agent to handle more of the work end to end. So basically my long answer to your question, maybe a boring answer, is that starting next year we're going to see early adopters starting to hockey-stick their productivity, then over the years that follow we're going to see larger and larger companies hockey-stick their productivity, and somewhere in that fuzzy middle is when that hockey-sticking will be flowing back into the AI labs, and that's when we'll basically be at the AGI tier.
Lenny Rachitsky:I love this answer. It's very practical, and it's something that comes up a lot on this podcast: the time to review all the things AI is doing is really annoying and a big bottleneck. I love that you're working on this, because it's one thing to just make coding much more efficient for people, and it's another to take care of that final step of, okay, is this actually great? It's so interesting that your sense is that's the limiting factor. It comes back to your earlier point: even if AI did not advance anymore, we have so much more potential to unlock as we learn to use it more effectively. That is a really unique answer. I haven't heard that perspective on what the big unlock is: human typing speed to review, basically, what AI is doing for us.
Alexander Embiricos:Mhmm.
Lenny Rachitsky:So good. Okay, Alexander, we covered a lot of ground. Is there anything that we haven't covered, anything you wanted to share or maybe double down on, before we get to our very exciting lightning round?
Alexander Embiricos:One thing is that the Codex team is growing, and, as I was just saying, we're still somewhat limited by human thinking speed and human typing speed. We're working on it. So if you're an engineer, or a salesperson, or, I'm hiring for a product person, please hit us up. I'm not sure the best way to give contact info, but I guess you can go to our jobs page. Or, actually, do listeners have a contact for you?
Lenny Rachitsky:Before they send me, like, hey, I want to apply to Codex? No, I do have a contact form at lennyrachitsky.com. I'm afraid of all the amazing people that are going to ping me, but there we go, we could try that. Let's see how that goes.
Alexander Embiricos:Or, yeah, maybe an easier one, we can edit all that out, or it's up to you, but I would just say you can drop us a DM. For example, I'm embirico on Twitter; hit me up if you're interested in joining the team.
Lenny Rachitsky:What a dream job for so many people. What's a sign, I don't know, what's a way to filter people a little bit so they're not flooding your inbox?
Alexander Embiricos:So, specifically, if you want to join the Codex team, you need to be a technical person who uses these tools. And I would just ask yourself the question: hey, let's say I were to join OpenAI and work on Codex over the next six months and crush it. What does the life of a software engineer look like then? If you have an opinion on that, you should apply. If you don't have an opinion and have to think about it, then depending on how long you have to think about it, I guess that would be the filter. There are a lot of people thinking about this space, and we're very interested in folks who have already been thinking about what the future should look like with agents. We don't have to agree on where we're going, but we want people who are very passionate about the topic.
Lenny Rachitsky:It's very rare to be working on a product that has this much impact and is at such a bleeding edge of what's possible. What a cool role for the right person. So it's awesome that you have an opening, and this audience is potentially a really good fit for that role. I hope we find someone; that would be incredible. With that, we've reached our very exciting lightning round. I've got five questions for you, Alexander. Are you ready?
Alexander Embiricos:I don't know what these are, but I'm excited. Let's do it.
Lenny Rachitsky:They're the same questions I ask everyone, except for the last one, so probably not a surprise. I should probably make them more often a surprise. Okay, first question: what are a couple of books that you recommend most to other people, two or three books that come to mind?
Alexander Embiricos:I have been reading a lot of science fiction recently, and I'm sure this has been recommended before, but The Culture series, by Iain Banks. Part of why I love it is that it's relatively recent writing about a future with AI, but an optimistic future with AI. A lot of sci-fi is fairly dystopian, but this is, the joke, at least on The Culture subreddit, let's see if I can get this right, is that it's a gay space communist utopia. I just think it's really fun to use The Culture as a way to think about what kind of world we can usher in, and what decisions we can make today to help usher in that world.
Lenny Rachitsky:Wow, I don't think anyone's recommended that. I know you're reading Lord of the Rings right now; you mentioned that before we started recording. If you want another AI-ish sci-fi book, have you read A Fire Upon the Deep?
Alexander Embiricos:No, I haven't.
Lenny Rachitsky:Okay, it's incredibly good. It's a sci-fi space opera, sort of an epic tale with superintelligence. It's mostly not optimistic, but somewhat optimistic. Okay, next question: is there a favorite recent movie or TV show that you've really enjoyed?
Alexander Embiricos:Yeah, there's an anime called Jujutsu Kaisen, which I really like. Again, it's got a slightly dark topic, demons, but what I love about it is that the hero is really nice. I think there's this new wave of anime and cartoons where the protagonists are really friendly, people who care about the world. If you look at some older anime, there's Evangelion or Akira, and those protagonists are deeply flawed, quite unhappy. They didn't start the genre, but for a while there was a trend of poking fun at the idea that in these cartoons the protagonist was very young yet being given a ridiculous amount of responsibility to save the world, and so there was a wave of content critiquing this by having the character basically go through serious mental issues in the middle of the show. I'm not saying this is better, but it's quite fun to have these really positive protagonists who are just trying to help everyone around them.
Lenny Rachitsky:I love how much we're learning about your personality hearing these recommendations: nice protagonists, optimistic futures.
Alexander Embiricos:I think, you know, if you don't believe it, you can't will it into existence. So you need to balance.
Lenny Rachitsky:This is your training data. Is there a product you recently discovered that you really love? Could be an app, could be some clothing, a kitchen gadget, a tech gadget, a hat.
Alexander Embiricos:Yeah, so I have been quite into combustion engines and cars. Actually, the reason I came to America initially was because I wanted to work on US aircraft; now I work in software. For the longest time I basically only had quite old sports cars, old just because they were more affordable, and then recently we got a Tesla instead. And I have to say that I find the Tesla software quite inspiring. In particular, it has the self-driving feature, and, as I've mentioned a few times today, I think it's really interesting to think about how to build mixed-initiative software that makes you feel maximally empowered as a human, maximally in control, yet you're getting a lot of help. I think they did a really good job of enabling the car to drive itself while giving you all these different ways to adjust what it's doing without turning off the self-driving: you can accelerate and it'll listen to that, you can turn a knob to change its speed, you can steer slightly. It's actually a masterclass in building an agent that still leaves the human in control.
Lenny Rachitsky:This reminds me, Nick Turley's whole mantra was "are we maximally accelerated?" It feels like it's completely infiltrated everything at OpenAI, which makes sense. That tracks. Two more questions: do you have a life motto that you often think about and come back to, in work or in life, that's been helpful?
Alexander Embiricos:I don't know if I have a life motto, but maybe I can tell you the number one company value from my startup, which is still something that sticks with me: be kind and candid.
Lenny Rachitsky:That tracks: kind and candid.
Alexander Embiricos:We had to put that together because we as founders realized that we often would be nice when it wasn't actually the right thing to do. We would delay the difficult conversations; we were not candid. So every time, we would remind ourselves of this motto and become more candid, and then six months later we would realize that we were in fact not candid six months ago and needed to be even more candid. Then the question is, okay, how should we be candid? Well, think of being candid as an act of kindness, both in terms of doing it and willing ourselves to do it, and in terms of how we frame it to people.
Lenny Rachitsky:That is a beautiful way of summarizing how to lead well. What's the book about that, challenge directly but care personally? Radical Candor.
Alexander Embiricos:Oh yeah, right, yeah.
Lenny Rachitsky:So it's like another way of thinking about Radical Candor. Okay, last question. I was looking up your last name, just like, hey, what's the story here? So your last name is Embiricos, and I was talking to ChatGPT, and it told me the most famous individuals with the surname are the influential Greek poet and psychoanalyst Andreas Embiricos and his relative, the wealthy shipping magnate and art collector George Embiricos. So the question is: which of these two do you most identify with, the Greek poet and psychoanalyst or the wealthy shipping magnate and art collector?
Alexander Embiricos:I think it's going to have to be the poet, because he loved the island that our family's from.
Lenny Rachitsky:Wait, you know these people? Okay, this is not news to you.
Alexander Embiricos:Okay, well, I mean, it's an enormous family, but it's Greek, so, you know, in these big families everyone's your uncle. My mother's Malaysian, and everyone is my uncle or aunt in Malaysia too, if that makes sense. But yeah, he loved this island that the family originated from, I believe. I don't actually know where that shipping magnate lived; I think it was New York or something. Anyway, we all came from this island called Andros, which is a really beautiful place. There's more livestock there than humans, and not too many tourists go there. Part of what I think is really cool is that he published a lot, and a lot of his writing is about the beauty of that island, which I think is super cool.
Lenny Rachitsky:Wow, that was an amazing answer. Two more questions: where can folks find you if they want to follow you online and maybe reach out, and how can listeners be useful to you?
Alexander Embiricos:I'm one of those people who has social media only for work purposes; my phone turns black and white at 9 p.m. But yeah, Twitter, X: embirico. And if you post on r/Codex, I'll probably see it, so you can go there. How can listeners be useful? I would say: please try Codex, please share feedback, let us know what to improve. We pay a ton of attention to feedback. Honestly, the growth has been amazing, but it's still very early times, so we still pay a lot of attention, and hope to do so forever. And also, if you're interested in working on the future of coding agents, and then agents generally, please apply on our jobs site, or message me in those social media places.
Lenny Rachitsky:Alexander, this was awesome. I always love meeting people working on AI, because it always feels like this very, I don't know, sterile, scary, mysterious thing, and then you meet the people building these tools and they're always just so awesome, you especially. The examples you shared, the optimism and kindness: these are the kinds of people we want building the tools that are going to drive the future. So I'm really thankful that you did this, I'm grateful to have met you, and thank you so much for being here.
Alexander Embiricos:Yeah, thanks so much for having me. This was fun.
Lenny Rachitsky:Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review, as that really helps other listeners find the podcast. You can find all past episodes, or learn more about the show, at lennyspodcast.com. See you in the next episode.