Introduction
“Most engineers adapt to tools, while a few rewrite them out of dissatisfaction.”
Michael Bolin, the technical lead of OpenAI Codex, is a prime example of the latter. His career spans over a decade of pivotal advancements in software engineering infrastructure, from Google Calendar to Facebook’s Buck build system, and now to OpenAI’s Codex.
In a recent episode of The Developing Dev podcast, host Ryan engaged in a deep conversation with Michael Bolin about his 20 years of engineering practice. He reflected on his transformation from a “JavaScript engineer” to leading the development of tool systems, discussing misjudgments, boundaries of capability, and the costs of growth. More importantly, he sought to address a pressing question all engineers face today: in an era where AI is reshaping development practices, which skills are still worth holding onto, and which need to be redefined?
In his view, the real differentiator is not the speed of coding but the choice of problems to solve and how one defines a “better system.”
Key Insights
Several key points worth noting include:
- Many engineering breakthroughs stem from dissatisfaction with the status quo and rapid hands-on validation.
- An engineer’s influence ultimately depends on whether they solve problems that truly matter to the company.
- In the AI programming era, 80%-90% of code can be generated by models, but critical parts still require human oversight.
- The ability to ask the right questions is becoming more important than writing code.
- In the long run, the execution of coding agents will increasingly shift to the cloud rather than run locally.
- While AI seems capable of everything, understanding how systems work at a deeper level remains important at this stage.
From Engineer to Tool Creator: A Problem-Driven Growth Path
Ryan: I delved into your website and found a wealth of information. There’s a project you invested a lot of effort into, but now all the links are dead, and I can’t find any related materials. So, what exactly was Chickenfoot?
Michael: That’s a long story. It was my master’s thesis project, a Firefox extension, and one of the few graduation projects based on JavaScript for Firefox at the time. It was essentially a small programming tool embedded in the Firefox sidebar, like a real-time interpreter that users could call upon anytime, with the core idea being to enable web programming.
It included functions like “enter” and “click” that took string arguments and automatically located the right page element. For instance, “click search” would find and click the search button. The main workload of the project was building the underlying heuristics: given “enter first name,” it would extract the keywords in “first name,” locate the nearest text box, and fill it in via JS.
Looking back, it’s quite interesting. Much of the work we did back then is quite similar to the principles of current AI programming assistants—only now, natural language processing has been realized without relying on the JS alternatives we used back then.
Ryan: Interesting. So its role was to parse the front-end interface and convert user commands into corresponding actions through the interactive interface.
Michael: Exactly. We used accessibility labels and toggles to convert text into functionality, and this solution worked particularly well on Craigslist—after all, it was one of the simplest websites. I had a friend who used this tool to automate tasks and even made real money from the efficiency advantage, which was indeed fascinating.
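To make the idea concrete, here is a minimal sketch of the kind of label-matching heuristic Michael describes. This is purely illustrative Python with a hypothetical page model, not Chickenfoot’s actual JS implementation:

```python
# Illustrative sketch of a Chickenfoot-style "enter"/"click" heuristic:
# given a label like "first name", pick the form field whose nearby
# label text shares the most keywords with the query.

def score(label_words, query_words):
    """Count how many query keywords appear in a field's label."""
    return len(set(label_words) & set(query_words))

def find_textbox(fields, query):
    """fields: list of (field_id, label_text) pairs; return best match or None."""
    query_words = query.lower().split()
    best = max(fields, key=lambda f: score(f[1].lower().split(), query_words))
    if score(best[1].lower().split(), query_words) == 0:
        return None  # no label shares any keyword with the query
    return best[0]

fields = [("fname", "First Name"), ("lname", "Last Name"), ("q", "Search")]
print(find_textbox(fields, "first name"))  # -> fname
```

The real system layered on accessibility labels and fuzzier matching, but the core is the same: turn a human-readable label into a concrete DOM target.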
Ryan: You joined Google with great enthusiasm and participated in the Google Calendar project. What attracted you to join Google? What memories do you have from that experience?
Michael: I got into the internet in the 90s. I remember browsing the web and often having to switch between a bunch of search engines to find the information I needed. I still clearly remember in March 2000 when my roommate told me, “Hey, there’s a new search engine that looks good.” It was still under Stanford University’s domain.
I found that Google Search was indeed superior. After studying it closely, I realized its interface was worlds apart from other search engines—Yahoo’s interface was cluttered, while Google deliberately pursued a minimalist design that served its purpose far better. Later, as we all know, many companies began to emulate this style. Then people around me started working at Google, and I thought: great, they’re hiring top talent.
I really wanted to work with those people; they truly understood the web. In contrast, Microsoft at the time didn’t seem to understand it well and announced the halt of the IE project. I thought this was the gateway to the web, yet Microsoft planned to dismantle it. Google, on the other hand, was clearly more forward-looking regarding the web, and the engineering quality and impact of the projects made Google the place I most wanted to go after graduation.
Ryan: What was the cultural atmosphere at Google like back then? I remember you mentioned in your article that there were also divisions between product and infrastructure lines at Google.
Michael: Many companies, especially as they grow to a certain scale, tend to have a clear bias towards the business that the founding team was initially good at and successful in.
This was also typical at Google. Whether it was information retrieval or underlying infrastructure, these were core capabilities supporting the company’s growth and naturally held higher status internally.
What attracted me to Google at the time was largely due to products like Gmail. These products seemed more focused on user experience, with a relatively open direction and imaginative space. However, internally, their status could never compare to core businesses like search.
For example, the Google Calendar I participated in was primarily aimed at the consumer market, although there were also some enterprise sales scenarios. But from a business perspective, it wasn’t the core of the company’s revenue. In a sense, we were more like a “service support” product team rather than one that directly generated income. That was roughly the situation at the time.
Ryan: You eventually left Google. From your posts, it seems your time there was a mix of joy and frustration. What prompted your decision to leave?
Michael: I worked at Google for four years. Honestly, the key factor that led me to leave was personal planning. First, after four years, I had some savings, which opened up more options. More importantly, I realized I had a bad habit: I tended to pour all my energy into projects that were personally important to me but not necessarily important to Google. For instance, the calendar project I was responsible for fell into this category.
Later, I moved on to work on the efficiency tool Google Tasks, which was a small feature module under Calendar. The user base was two orders of magnitude smaller than Calendar, but I was still passionate about it. I was also fascinated by JS infrastructure and the Closure tool suite. These projects were certainly exciting, and I enjoyed the development process and felt proud of my contributions—I even wrote a book specifically for Closure, so I was quite motivated. But from a career development perspective, it wasn’t the wisest choice, was it?
I thought to myself, I was dealing with high-quality engineering problems, yet the recognition always seemed to go to others. What was the point of my hard work? Perhaps it’s true that choices always outweigh effort, so I realized it might be time to try a different path: either focus on what I was passionate about or dive into the areas the company valued most.
Ryan: Later, you returned to a tech giant and joined Facebook. I understand you were already an expert in JS, and your first major project at Meta was building a toolchain for the Android codebase. Can you share the behind-the-scenes story of your involvement in this project?
Michael: At that time, there was a very clear direction within the company: Facebook wanted to make phones.
Although there had been some failed attempts before, this time the atmosphere was distinctly different; everyone generally felt, “This time it’s really going to happen.” The plan was to collaborate with HTC to customize Android and try something new based on that.
For someone who had just joined the company, this was incredibly exciting. I had done quite a bit of Java-related work—though overall I was more focused on JavaScript, this project gave me the opportunity to engage more with Java.
At that time, there was also a direction called “Face Web,” which essentially aimed to bring HTML5 and Facebook’s web experience directly to mobile. But soon everyone realized that this path was unfeasible. Meanwhile, one thing became increasingly clear: the mobile end would be the key battleground for the company’s success.
It was around this time that a friend told me, “I know you really like JS, but it’s best to spend more energy studying Java or Objective-C, or you might end up transitioning to a product manager role.” Looking back, that was indeed a very important and timely suggestion.
I thought: I really don’t like Objective-C; Java is better. So, I joined the project. Our time was extremely tight because, unlike most other projects, this one had a hard deadline. While other projects could be released after normal completion, this one needed to submit results to HTC on time to ensure they had enough time to burn the code onto the phones around March 1.
So the whole process was a mad dash, and the initial Android codebase had actually come straight from outside contractors. Big companies are pretty similar: Facebook didn’t want to develop native applications itself, so it paid to outsource. When the application went live, the contractor washed their hands of it and just dumped the code on us. Honestly, that pile of junk should have been thrown away, but it was kept around—what we inherited was that pile of junk, and the demands on iteration speed were quite high.
Those who have done long-term web development are used to the process of editing first and refreshing later. But the Android build system was… particularly rough. Tools like Ant couldn’t be modularized at all; we had to find a way to forcibly split them into four or five modules.
Every development was painfully difficult. I thought: I must reorganize this build system. I had done quite a bit of work in Java; it shouldn’t be this slow, and it shouldn’t be so sluggish during iterative builds. Facebook had a hackathon culture, so I decided to organize a hackathon to create a new build system, aiming to do it neatly in the style of Google’s build system.
At that time, there was also a build system called FB Build, which was essentially a “knockoff” of Google’s build system. It was written in Python and only supported C. I thought: either fix this thing, or give up… or quit. After all, fixing old projects is the most frustrating kind of work.
Ryan: If you hadn’t fixed it, would you have quit?
Michael: At least I would have requested to switch projects or found a way to make myself happy, so I could come to work happily every day. I came to work to write code and do a good job, maximizing my abilities.
Interestingly, looking back, I have to admire the people around me at the time—almost everyone told me that what I was doing was a terrible idea. Basically, no one was optimistic, except for one person.
At that time, I was a senior Android engineer, and no one really stopped me. People would express opposition but wouldn’t directly say, “No.” This was different from my experience at Google—at Google, many things would be explicitly vetoed.
So I continued down this path. Soon, we produced a noticeably better version—roughly double the performance. Once the results came out, everyone’s attitude changed. Many began to realize, “Okay, this direction is indeed better, so let’s go with this.”
Ryan: I find that large companies have an inertia: no one wants to tackle the accumulated problems. Many other engineers notice the same issues, but as long as they feel there’s some workaround, they’re reluctant to start a new project. Moreover, Google had competing tools, and it wasn’t obvious you could even win. What made you confident that your project could outperform the alternatives and become the preferred option?
Michael: There are several points here. First, as I said, I had done other Java projects, so I felt that the original build tools shouldn’t be this slow. Or, from a software engineer’s perspective, this underlying implementation shouldn’t be so inefficient. In reality, most opposing opinions were based on rigid logic: if we deviate from the standard solution, we will lose standard support. Or, what if next week the standard performance improves by 100 times, and the new solution can’t inherit it? What then?
Thinking about it, it’s quite ridiculous, as Facebook engineers had also developed their own PHP virtual machine and language and had embraced innovation before. I don’t know why this time was an exception. What I want to say is that at that time, the entire mobile project was feeling its way forward, and I was filled with anxiety and severe time pressure.
The higher-ups seemed to want to treat the project as a scientific experiment, but we were facing hard deadlines—could we really afford to spend our time that way? Fortunately, we succeeded in the end. I also deliberately downplayed the project’s infrastructure ambitions, framing it as “we are building a build system for Android,” and I certainly didn’t dare expose too much ambition to expand its scope.
I didn’t plan to encroach on others’ projects, as that would definitely lead to more friction. So in the design, I considered making it support more teams and never forced it. It wasn’t until about a year later that the iOS team proactively came to ask, “Our build system is terrible; can we collaborate on Buck?” I responded on the spot, “Of course, let’s do it, friend.”
Ryan: This is interesting—you lacked credibility when you first joined, as newcomers always go through this process. Later, to advance your project, you had to gain more support from others. And everyone told you not to do it; you had to convince them that this was the right direction. How did you push for this change despite lacking credibility?
Michael: I actually took a shortcut. At that time, I had a colleague, John Perlow, who was also a senior Android engineer and similarly came from Google. He said, “Yes, you should do this; act quickly while no one is opposing you.” He also mentioned, “If you get it done, I’ll support you.” Having such an early supporter, along with a high-output programmer, really sped up my development cycle.
He was one of the first to affirm me and helped me a lot. But I also have to admit that I made a big mistake back then. You mentioned lacking credibility—when I just jumped from Google, I thought everyone here was from Bell Labs, pure top talent. But when I got to Facebook, I realized these people were just college graduates… what do they know about technology?
But as I made mistakes, I gradually developed respect for them. For example, when I talked about how things were done at Google, they often said, “We don’t care at all”—and in most cases, they were indeed right. What works in some places may not necessarily work in others.
Ryan: Finally, I want to ask about the performance of the tools you developed, which outperformed similar solutions by several times. How did you develop this technical intuition? What key designs did you implement to make it so efficient?
Michael: I think the most crucial point was that I sat down and thoroughly sorted out the implementation logic of Google’s tools: what exactly was it doing? Where were the problems?
I quickly discovered a significant issue: any change would cause it to rebuild everything from scratch. This was the fundamental reason for its slow speed, especially in incremental build scenarios, where performance was very poor.
So I began to break it down to a lower level: what things depend on what? Which steps don’t need to be repeated? There were indeed some complexities involved, such as Android’s resource handling, which was quite special and complicated. Because of this complexity, the system initially adopted a “simple and brutal” strategy: if there was any change, it would clear everything and start over.
But when I truly delved into it, I found that it could be optimized—if certain inputs hadn’t changed, the results of corresponding steps could be cached and didn’t need to be executed again. Once I introduced this caching mechanism, the overall speed significantly improved.
Another issue was modularization. At that time, the system basically only supported four modules; if someone wanted to add a new module, they had to write about 200 lines of XML for the Ant build script, and almost no one really understood these configurations. The result was that no one was willing to do module splitting—because once you did, you had to take responsibility for maintaining that complex configuration.
What Buck did was make the “addition of modules” very simple. As a result, people became more willing to split modules, and the number of modules increased accordingly. With more modules, builds could be executed in a more granular incremental manner, further enhancing overall efficiency.
So essentially, this was not just a technical optimization but a change in mindset.
Ryan: In short, it’s about reducing repetitive work.
Michael: Exactly.
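The “reducing repetitive work” idea—skip a build step whenever its inputs are unchanged—can be sketched as follows. This is an illustrative Python sketch, not Buck’s actual implementation; the cache key is simply a hash of the step’s inputs:

```python
import hashlib

# Minimal sketch of input-keyed build caching: a step reruns only
# when the hash of its inputs changes. Not Buck's real code.

cache = {}  # maps (step_name, input_hash) -> cached output

def input_hash(inputs):
    """Hash a dict of {path: content} deterministically."""
    h = hashlib.sha256()
    for path, content in sorted(inputs.items()):
        h.update(path.encode())
        h.update(content.encode())
    return h.hexdigest()

def run_step(name, inputs, build_fn):
    """Run build_fn only on a cache miss; return (output, was_cache_hit)."""
    key = (name, input_hash(inputs))
    if key in cache:
        return cache[key], True   # inputs unchanged: skip the work
    out = build_fn(inputs)
    cache[key] = out
    return out, False

inputs = {"A.java": "class A {}"}
out1, hit1 = run_step("compile", inputs, lambda i: "A.class")
out2, hit2 = run_step("compile", inputs, lambda i: "A.class")
print(hit1, hit2)  # -> False True
```

Finer-grained modules make this payoff compound: the more modules, the smaller the set of steps whose inputs actually changed on any edit.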
Choosing the Right Problems: Reconstructing IDE and Virtual File System
Ryan: After solving the Android build issues, you shifted your focus to other areas within the company. I noticed you began to participate more in IDE-related work. What problems did you see in the IDE field that prompted you to dive into this area?
Michael: After completing Buck, I briefly worked on the iOS Messenger development. At that time, I thought: since Android had been done, why not expand and try something else? Although, to be honest, I never really liked iOS development—and I still didn’t like it later.
Many people may not know that before ARC (Automatic Reference Counting), Objective-C developers had to manage reference counts by hand. Nowadays the compiler handles this automatically, but back then, every time an object was created or a reference was added, you had to write code to maintain the count. Many people today have never seen such code, but the iOS Messenger code we inherited was very old and still written that way, making maintenance quite painful.
If it were now, perhaps tools like Codex could help clean up this code, but back then, we had to tackle it head-on. Additionally, the experience of using Xcode itself didn’t suit me well, such as the separation of header files and implementation files—a design I never liked.
Moreover, from a broader perspective, whether on Android or iOS, Facebook’s application scale was always the largest. We basically stuffed all functionalities into one app for unified release. This is different from Google’s strategy—they would split into multiple apps like Drive and Sheets, and since they controlled the platform, they could pre-install a whole suite of applications.
The result of this difference was that Facebook always hit the scale bottleneck of mobile development tools earlier than others.
This was painful for development, but from the perspective of development tools, it was actually quite interesting—because we were forced to solve some problems that others hadn’t encountered yet, and these problems weren’t “research projects” but directly impacted the business. The issues with Xcode were similar. We communicated with Apple and provided feedback: “Xcode can’t support us at this scale.” But their response was, “Your project shouldn’t be that big; you should break it down.”
In this context, developing self-built tools became reasonable.
At that time, my thinking was quite straightforward: what is the essence of an IDE? It’s merely interacting with compilers (like Clang) and language services. Therefore, we could build a better “shell” on top of that.
Moreover, at that time, the company was also transitioning from Git to Mercurial for version control, and I realized that mainstream IDEs were unlikely to natively support these customized needs. Coupled with the fact that we were already using Buck as our build system, these were highly customized internal tools, and Xcode couldn’t support these “Facebook-specific” workflows well. So overall, investing effort into creating a development experience more suited to us made sense.
In contrast, I didn’t have similar motivation on the Android side because IntelliJ was already doing well, and we had found a way to support large-scale development effectively.
Ryan: So on one hand, Xcode was inadequate and didn’t meet your needs; on the other hand, there was actually another IDE within the company being developed by another team, right? I remember it was a Web IDE?
Michael: Yes, it was a Web IDE (laughs). I’m not laughing at the direction itself; it’s actually fine. The problem was that it was built on a Google open-source project that had already been abandoned, and that project was written using GWT (Google Web Toolkit)—which means you wrote code in Java and it automatically generated JavaScript.
I actually tried to continue working on their foundation and even tried to build some “credit” by optimizing their build speed.
But then I returned to a familiar judgment method: looking at iteration speed and the technology stack itself. I found that this project was based on an abandoned open-source project and used GWT technology. Meanwhile, our company was already the birthplace of React.
So the question arose: Why not build the development tools on the technology stack we are best at and most recognized for?
My first reaction was that they had chosen the wrong path. So I did something similar to what I did with Buck—I started a new project. However, my strategy was the same: no direct conflict, no attempt to “take over everything.” I simply said, “I’m making a new editor here, but let’s just focus on the iOS scenario for now.”
Ryan: You were deliberately avoiding friction. But that Web IDE already had many users, right?
Michael: Yes, there were about a thousand engineers using it.
Ryan: But in the end, the company chose your path (which later became Nuclide). You had almost no users; why did they ultimately select you?
Michael: I think there were mainly two reasons.
First, it was the technology path itself. I emphasized that we were building a desktop application (desktop IDE), not a Web IDE. Because if you really want to replace Xcode, developers would definitely want: to connect directly to the simulator, to debug on the phone, to operate in the local environment. Theoretically, the Web could do these things, but the costs and complexities would be much higher.
Second, there was “historical credit.” Buck had succeeded, so people were willing to bet again. Simply put, “You succeeded last time, so you can try again this time.”
Ryan: It seems this experience, along with your performance at work, later led to your promotion to E8—equivalent to what’s known in the industry as a Principal Engineer. How did you feel at that time?
Michael: I was naturally very excited, but more importantly, it was a sense of “alignment.” At Google, I always felt a bit out of sync—not just technically, but the things I was doing had some deviation from the directions the company truly valued.
At Facebook, this promotion was, for me, a confirmation: I had not only grown technically but also began to understand what kind of work is both important and aligns with the company’s direction. This understanding itself is as important as technical ability, if not more so.
Ryan: I remember Nuclide was open-source, and Buck was too, right? What was the original intention behind open-sourcing them?
Michael: That’s right. In comparison, Buck’s open-sourcing is the more representative case; Nuclide never became particularly popular externally.
I think companies benefit a lot from open-source, so there’s a notion: if this thing isn’t our “core moat,” we might as well share it. Many things I’ve done in my career, including Codex, follow a similar line of thought. Even if no one directly uses your tool, making the implementation methods public as a reference is valuable to others.
Of course, ideally, you can also gain external contributions, like someone submitting PRs or fixing bugs, which is very helpful. I remember Uber and Airbnb later used Buck.
In a sense, it’s quite natural—Facebook is one of the largest applications, so we often encounter problems first; then the next wave of companies starts facing these issues and comes to see how we solved them.
Another interesting point: internally, Google uses Blaze, while the external version is Bazel. We also wondered whether open-sourcing Buck could, to some extent, “force” them to open up more. They did open-source more later—not entirely because of us, but it did push them a bit.
Another practical factor is recruitment. Open-source also serves to showcase what we are doing and what we excel at. If you want to work on cutting-edge things in this field, this is where you should come.
Ryan: So was the decision to open-source bottom-up? Did the engineers propose it actively, or was it pushed by management after recognizing its value? Do you have specific internal policy documents?
Michael: It seems there aren’t such documents, but both situations you mentioned likely occurred. For example, typical successful cases like React and PyTorch have created enormous value for the company. But there are also some long-term projects that, when the economic situation is good, face no controversy, but as the macroeconomic environment worsens, managers complain that engineers are investing too much energy in open-source.
Overall, most open-source projects are driven from the bottom up and usually don’t encounter much resistance. You can also do a few technical shares and write a few blog posts; these contents actually have long-term value and are helpful for recruitment, and their lifespan is longer than many people imagine.
Ryan: Since you’ve been promoted to E8, I guess your expectations have also risen accordingly. Isn’t it time to tackle a challenge that matches the E8 level? After your promotion, what did you do?
Michael: Haha, at that time, I was a bit overconfident. I wanted to solve the web loading speed issue—this was indeed a big challenge, as the loading speed of facebook.com was not ideal, and the architecture was somewhat outdated. But this problem was too big, and I actually didn’t have much experience. Most members of the team responsible for web at Facebook had been deeply involved in this direction for many years, while I had mainly been working on mobile and development tools, so I wasn’t familiar with this field.
I remember my colleague and I sat down to try to compile the V8 engine based on the source code to see if we could optimize the JS generation mechanism to make it more compatible with V8. We blindly tried various methods, but in the end, all were fruitless.
Looking back, different people are suited for different types of problems. I’m better at projects that require writing a lot of code from scratch, while that problem leaned more towards data analysis, cross-team communication, and coordination—which is not my strong suit.
Ryan: You mentioned that this stage of your career was part of a “hero’s journey.” What does this term specifically refer to?
Michael: I’m a bit ashamed to say it refers to the fantasy some engineers harbor that “if only someone would solve this technical problem”—and that fantasy inflated my own ego. Many engineers feel some engineering problem has existed for too long: why hasn’t anyone fixed it? My thought was simple: it’s JS, and I know JS best. So I dove in.
But I couldn’t get it done and completely failed. Looking back, I think this is an important lesson, and at least I summarized it again: while I can indeed solve many problems, the things that truly make me enjoy and excel are actually few. I will certainly continue to try to gradually expand into other areas, but in terms of results, I should focus on what I truly love; I can’t excel at everything.
I think everyone can understand this principle and should accept their limitations calmly.
Ryan: So how did you find truly E8-level problems afterward? What happened next?
Michael: I think there’s a bit of luck involved. We organized a small engineer meeting to brainstorm potential bottlenecks for the future. I mentioned that the continuous expansion of the codebase would eventually lead to scalability issues, and the department manager at the time, who later became my supervisor, Bryan O’Sullivan, clearly took note of it.
He decided to gather people to develop a virtual file system to address this issue proactively. So I, Adam Simpkins, and Wes Furlong joined the team. These two were top engineers, and for a long time, I felt like I was the weakest member of the project.
Ryan: You mentioned the virtual file system. For a large company like Meta, what are the benefits of self-developing such systems?
Michael: Previously, we adopted a monorepo model—putting all code in one repository. But most people only need a subset of the repository and are only looking at a handful of files at any given time. So our core idea was to design the supporting tools around a virtual file system: when a user requests file content at a certain commit, the system dynamically generates the corresponding files, presenting a complete file layout.
This work involved two key aspects. The first was building the virtual file system itself: when the operating system requests file content, the system can retrieve the data on demand and present the complete file tree to the user. The second part, more in my wheelhouse, was dealing with the tools in the toolchain that expect to read every file—grep, for example, reads all content directly. I had to figure out how to adjust the development workflow and tool design to work with the virtual file system, because if the existing tools forced every file to be materialized, the new virtual file system would be pointless.
Ryan: Essentially, it’s about lazily loading a massive file system. Not only is it more efficient, but it also avoids processing all content in the initial phase.
Michael: Exactly.
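The lazy-loading idea can be sketched like this—a toy Python model, not EdenFS’s actual implementation, in which file content is fetched from the repository only on first read:

```python
# Sketch of on-demand file materialization: the full tree exists
# only as metadata; content is fetched the first time a file is read.

class LazyRepo:
    def __init__(self, commit, fetch):
        self.commit = commit       # commit being "checked out"
        self.fetch = fetch         # fetch(commit, path) -> file content
        self.materialized = {}     # paths whose content was actually read

    def read(self, path):
        if path not in self.materialized:
            # stands in for the filesystem hook: fetch on first access
            self.materialized[path] = self.fetch(self.commit, path)
        return self.materialized[path]

# Hypothetical backing store keyed by (commit, path).
store = {("abc123", "src/main.c"): "int main(void) { return 0; }"}
repo = LazyRepo("abc123", lambda c, p: store[(c, p)])

print(len(repo.materialized))      # -> 0 (nothing fetched yet)
print(repo.read("src/main.c"))     # first read triggers the fetch
print(len(repo.materialized))      # -> 1
```

The hard part Michael alludes to is everything around this: tools like grep must be taught to query the index instead of forcing every file on disk.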
Ryan: You just mentioned that you excel at integrating all functionalities into this foundational framework?
Michael: Yes. In fact, I realized this when collaborating with Hanson Wang (who is also a member of the Codex team). Traditional solutions for achieving ultra-fast file searches through an IDE or editor often begin by traversing the entire file system to search for files.
I thought this was clearly a big problem. So we began to think: how could we achieve file search without compromising the virtual file system’s advantages—something better than the status quo? Ultimately, we built a system called miles (short for “my files”).
It indexes all new commits under the main branch through cron jobs, tracking file additions and deletions—and only indexes file names rather than content, as the file name alone is sufficient. Hanson also proposed a clever scheme for maintaining the index, enabling fuzzy matching of files. This means the new system not only supports substring matching but can also accurately recognize files even if they are named in camel case by entering only uppercase letters or if there are spelling errors.
We devised a very interesting representation method to record all previously checked-out files, along with some markers to indicate: during a certain commit, whether the file existed at that time. When we send a query to the system—such as informing it of the current commit version and whether files were added or deleted locally—we can return the corresponding data.
I remember that when handling queries with over a million files, the response time was only about 10 to 20 milliseconds. This framework was significantly faster than Xcode or the default response of MDS, effectively solving the performance issue.
Because it was so fast, miles was opened up as an internal service and became widely used in all sorts of other scenarios. By the time I left, the miles service was running on at least 30 servers globally—a massive deployment serving needs far beyond file search.
Ryan: You just mentioned several implementation details—after all, most people never use this kind of LeetCode-style data-structure work in their day jobs. I thought about it for a while but still don't quite understand how your index structure was implemented. Did you use a trie?
Michael: A trie is indeed powerful. In our solution, we used two parallel arrays: one stores the file names, and the other, if I recall correctly, holds an integer index. Additionally, we kept a 64-bit mask covering the 26 lowercase letters, 26 uppercase letters, 10 digits, and the hyphen; if a character appeared in the file name, we set the corresponding bit.
This let us scan the list quickly, eliminate a large number of non-matches up front, and keep the design highly parallel. All the arrays used a parallel layout, which is excellent for cache efficiency: the CPU performs best when reading memory linearly.
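The 63 characters Michael lists (26 + 26 + 10 + hyphen) fit in a single 64-bit word, which enables a cheap prefilter: a name can only contain a query if it contains every character of the query. A hedged sketch of the idea (the array contents here are illustrative, not the actual code):

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"
BIT = {c: i for i, c in enumerate(ALPHABET)}  # 63 characters -> one bit each

def char_mask(s: str) -> int:
    # Set one bit per distinct indexed character appearing in `s`.
    m = 0
    for c in s:
        bit = BIT.get(c)
        if bit is not None:
            m |= 1 << bit
    return m

# Parallel arrays: names[i] and masks[i] describe the same file, so the
# scan walks memory linearly and stays cache-friendly.
names = ["MyFileSystem.ts", "build-config.json", "README"]
masks = [char_mask(n) for n in names]

def candidates(query: str) -> list:
    qm = char_mask(query)
    # One AND per entry rejects any name missing a query character;
    # survivors still need a real substring or fuzzy check afterwards.
    return [n for n, m in zip(names, masks) if qm & m == qm]
```

The prefilter never produces false negatives for exact substring queries, so the expensive matching logic only runs on the small surviving set.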
Ryan: I know your work on the Eden and Miles projects ultimately led to further promotions. But before the promotion, it seems you accumulated some experience in enhancing personal influence and handling opinion conflicts?
Michael: Yes, this was also a challenge I faced at the E8 level. I had previously been focused on writing code, but most colleagues at the same level or higher had essentially stopped coding. They were almost entirely focused on building influence and on cross-team collaboration—writing project documents, aligning opinions, and so on.
So as an E8, writing code alone wouldn't produce the expected level of impact. I had to spend time influencing others—which was good for me, and also what my management expected.
Of course, sometimes we can be overly insistent on our own opinions—at least I was back then—and my overly forceful attitude ultimately cost me dearly: my promotion was delayed, and I was told to adjust my approach.
Part of the background was that after Microsoft acquired GitHub, I was very anxious, because our Nuclide project was largely built on the GitHub ecosystem (Atom). I felt the project was doomed—VS Code would surely swallow Atom's independent standing, which is indeed what happened later. I was extremely anxious, feeling the project was in serious jeopardy. So I pushed everyone to change direction, without considering that many people on the team were actually satisfied with the current work and didn't want to be thrown into chaos by sudden changes.
Later, my supervisor called me in and gave me a stern talking-to. I accepted the guidance and specifically learned how to handle such situations better.
Ryan: What was the most important lesson you learned in this regard?
Michael: For me, I’m now clearer about which situations trigger my emotional reactions, such as certain technical decisions that can get me worked up. When I encounter such situations, I immediately react and remind myself: okay, don’t act impulsively. Or when I realize I can’t have a normal conversation or my emotional state is poor, I choose to communicate with the other party’s supervisor instead of directly barging into an engineer’s workspace shouting, “I have an idea; here’s what it is…”
Often, I would first say, “I’m a bit excited about this issue right now, and my expression might not be very good; could you help me see how to proceed more appropriately?” This approach turned out to be more effective.
Ryan: An interesting point is that you foresaw the rise of VS Code and that Nuclide's underlying architecture would be displaced, yet that judgment cost you a delayed promotion. A year later, it turned out you had been right all along. How did you feel when you realized that?
Michael: About a year later, I indeed discussed this with the relevant people to review the situation. After all, it had been quite awkward before. Fortunately, the outcome was good; we managed to resolve the issue.
Ryan: So the key is the handling method, not the stance itself.
Michael: Exactly, that’s indeed the case.
AI is Reshaping Development Practices: The Real Changes Brought by Codex
Ryan: You seem to have enjoyed your time at Meta, but you ultimately left. What attracted you to OpenAI?
Michael: There were multiple factors. I first interviewed with OpenAI at the end of 2023, while I was at Meta working on developer tools built on large models. I had even released a self-built, internally authorized assistant through Metamate—a lightweight tool similar to GitHub Copilot—and worked on related papers and presentations, such as the CodeCompose project.
But the reality was that we frequently received feedback asking why we didn’t choose to use GPT-4. We could only explain that we were using Llama 2, and the difference was quite obvious. I wasn’t doing model research; I was more interested in turning these capabilities into products and experiences. So it felt natural to me that if I wanted to do this, I should go to the place where the best models were.
The second point is that I felt OpenAI truly valued top talent, reminiscent of the early days at Google. The choices of these people surely wouldn’t be wrong. And indeed, it was true. At OpenAI, I had the opportunity to work with seasoned teams, including many level 8 and 9 experts from Meta, allowing me to unleash my value freely.
The third point is that the timing was particularly special for OpenAI itself. I’ve mentioned to many people that joining OpenAI at that time felt like joining Google in 2000. Note, not Google in 1998, but Google in 2000. At this point, the company had already established a foothold, and the product-market fit was beginning to show results, making this stage very attractive for individuals.
Additionally, on a more personal note, I chose Facebook initially because of its vast consumer market—after all, I had previously focused on consumer products, and the calendar tool I developed was well-received by users. But when I got to Facebook, I had no opportunity to engage with consumer business; the developer tools I worked on served the company’s internal engineers, around 20,000 people.
Later, wanting to go to OpenAI was to seize the opportunity to return to the consumer space—at least to have a large user base. Now, the Codex project I’m responsible for has surpassed one million weekly active users—specific numbers I can’t recall, but the growth curve is indeed very steep. The scale of this service far exceeds the influence range of 20,000 to 40,000 developers at Meta.
Ryan: Absolutely correct. Most development tools in the industry are at this scale.
Michael: Right.
Ryan: In my view, Meta seems more like an engineering-driven company, where engineers are core, and many things are pushed from the bottom up. On the other hand, many AI labs are more research-driven, with research being the top priority. This also makes sense, as the model itself is crucial. As an engineer who isn’t doing research, how do you view the differences between research-led culture and engineering-led culture?
Michael: This is indeed a change that requires adaptation. If someone claims they can switch smoothly between these two cultures, I think they're lying. However, one thing is certain: people who have spent time at large companies like FAANG develop a good habit of focusing on impact. That matters a great deal, and I genuinely pursue that kind of impact—just as I love working on Codex and its harness, which are meaningful and respected results. But if the model itself isn't excellent, no matter how much we optimize on the harness side, the ceiling is limited.
So after joining OpenAI, I felt fantastic; we collaborate closely with the research team, sitting very close together, and many things are pushed forward together. This was also a significant reason I left Meta to dive in—I wanted to work with colleagues who build models to create products and explore new technological boundaries together.
Perhaps a similar model could be achieved at Meta, but the actual effects ultimately can’t be compared.
Ryan: You just mentioned that you participated in the initial work of the Codex project when you joined. I heard that when Codex CLI was first released, the market response didn’t fully meet expectations, but later the project gradually got back on track. Can you share this journey?
Michael: Of course. This journey was full of twists and turns. In April 2025, we released Codex CLI: at the end of the launch livestream we did a live demonstration and open-sourced the project, and many people actively tried it out. Everyone was excited about this new programming assistant, and its performance was decent, but the release was indeed quite rushed.
Overall, this was quite helpful for drawing attention to the project—after open-sourcing, we received a lot of PRs. I remember the project gained about 10,000 to 20,000 stars within a week or two. That experience was fun, and we got a lot of genuinely warm feedback from users. But the problem was that the team didn't really have the staffing to drive the project forward, since the company needed to push multiple efforts simultaneously.
A month later, seven engineers and a few researchers (I can't recall the exact number) released the web version of Codex—letting users run Codex directly in containers and even start new projects from their phones. That was really cool. In short, staffing had improved, and I firmly believed the project's long-term vision was worth looking forward to. But in terms of results, it was a bit ahead of its users; they weren't ready yet.
In contrast, at that time, everyone was more inclined towards local programming agents, so while our web version gained a wave of growth, its stickiness was below expectations.
Throughout the summer, we continued to push both product lines. Through mid-summer, local agents still had the stronger product-market fit. But I personally always felt that local solutions were just a stopgap: agents need more machines to run on, and you can't rely on a laptop alone.
So in the summer, we made significant adjustments: we expanded the programming team and brought in more developers. At that time, GPT-5 was about to be released, and the market prospects were particularly bright. I was personally very excited because, apart from the CLI interface, I had also done several prototype tests, and this time we finally had enough manpower. We also simultaneously started developing the VS Code extension because I insisted that while terminals are suitable for many scenarios, they still have many limitations in interaction. Creating a beautiful user interface in a terminal requires many compromises, while in an IDE, it can be done more naturally and completely.
August was an explosive period: GPT-5 launched, and we released the new terminal interface. Meanwhile, the gpt-oss open-weight models were unveiled, and we supported them in the TUI as well—the design of the open-weight models and the open training stack was truly stunning. Later in August, the VS Code extension shipped, and we entered a new phase of rapid iteration.
It was the convergence of these factors that took us past that vertical growth inflection point. The journey was exhilarating, and you can see it in the codebase: contributor counts, commit counts, and other routine metrics all make the change plain. Looking back, it was quite an exciting ride.
Ryan: You just mentioned the local version of programming agents and the cloud version, and you seem to firmly believe that in the long run, the future lies not in local versions but in cloud deployment. Why do you make this judgment?
Michael: The scenarios that truly make people unable to live without it often look like this: whenever a new GitHub issue or Linear task comes in, the agent is automatically triggered to handle it. There are real cost concerns, and it could be abused, but inside a private internal repository this is a very natural use case.
In this case, the agent is more like a part of an automated pipeline rather than a tool that only interacts locally with you. This means these tasks can’t all run on your laptop.
From this perspective, as an individual developer, you might still spend more time interacting with the agent locally; but if you look at “the computational power actually consumed by the agent,” I believe the bulk will still occur in the cloud. Deploying these things initially might be a bit troublesome, but once set up, the experience is actually very good.
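The issue-triggered automation Michael describes can be sketched as a small event filter. The payload fields below follow the shape of GitHub's issues webhook; the returned task structure and prompt wording are hypothetical:

```python
def plan_agent_run(event: dict):
    """Decide whether a webhook event should trigger a cloud agent run."""
    # GitHub's issues webhook sets action == "opened" for new issues;
    # everything else (edits, closes, labels) is ignored here.
    if event.get("action") != "opened" or "issue" not in event:
        return None
    issue = event["issue"]
    return {
        "repo": event["repository"]["full_name"],
        "issue": issue["number"],
        # The prompt handed to the agent; the format is illustrative.
        "prompt": f"Investigate and fix: {issue['title']}",
    }
```

In a real pipeline this would sit behind a webhook endpoint, and a non-None result would enqueue a cloud agent run—exactly the kind of work that shouldn't have to block a developer's laptop.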