“Engineers are becoming sorcerers” | The future of software development with OpenAI's Sherwin Wu

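Chapter 1: The New Normal of AI Coding: 95% Adoption and Code Review

📝 Section summary

In this chapter, Sherwin Wu, who leads engineering for the OpenAI API, shares striking internal adoption figures: 95% of engineers use Codex daily, and 100% of pull requests (PRs) are reviewed by Codex. Engineers who use Codex more open 70% more PRs than those who use it less, and the gap is widening. Sherwin stresses that engineers are still adjusting but that trust keeps growing, echoing Kevin Weil's line that “this is the worst the models will ever be”: they will only get better from here.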

[原文] [Lenny]: sherwin thank you so much for being here and welcome to the podcast

[译文] [Lenny]: Sherwin,非常感谢你的到来,欢迎来到这个播客。

[原文] [Sherwin]: thank you thank you for having me

[译文] [Sherwin]: 谢谢,谢谢邀请我。

[原文] [Lenny]: i want to start with what's feeling like a barometer of progress in AI especially in engineering what percentage of your code if you even write code anymore and your team's code is written by AI at this point

[译文] [Lenny]: 我想先从一个能反映 AI 进展的晴雨表开始,特别是在工程领域。目前你(如果你还写代码的话)和你团队的代码中,有多少百分比是由 AI 编写的?

[原文] [Sherwin]: i do write code occasionally now still uh and I actually say for managers like myself it's way easier to use these AI tools uh than to manually code at this point and so I know for myself and some of the other engineering managers at OpenAI uh all of our code is written by by Codex uh at this point

[译文] [Sherwin]: 我现在偶尔还是会写代码,实际上我会说,对于像我这样的管理者而言,现阶段使用这些 AI 工具比手动写代码要容易得多。所以我知道对我自己以及 OpenAI 的其他一些工程经理来说,目前我们所有的代码都是由 Codex 编写的。

[原文] [Sherwin]: but more broadly there's just been this there's just so much energy there's like a tangible energy internally around just how far these tools have gotten how good Codex as a tool has gotten for us and uh it's it's a little hard for us to exactly measure how much of the code is is written because the vast majority of it I'd say like close to 100% is is usually generated by AI first

[译文] [Sherwin]: 但从更广泛的角度来看,内部确实充满了一种切实的能量,大家都在感叹这些工具已经发展到了什么程度,Codex 作为一个工具对我们来说已经变得多么好用。而且,有点难确切衡量到底有多少代码是 AI 写的,因为绝大多数代码——我想说接近 100%——通常都是先由 AI 生成的。

[原文] [Sherwin]: what we do track though is is you know at this point the vast majority of engineers use Codex on a daily basis so 95% of engineers um use Codex um 100% of our PRs are reviewed by Codex daily as well so basically any code that goes into production that's merged in Codex kind of has its eyes on and uh suggests improvements suggests changes uh uh in the PRs

[译文] [Sherwin]: 不过我们确实追踪了一些数据,目前绝大多数工程师每天都在使用 Codex。所以,95% 的工程师使用 Codex;同时,我们 100% 的 PR(代码合并请求)每天也都由 Codex 进行审查。所以基本上,任何进入生产环境、被合并的代码,Codex 都会“过目”,并在 PR 中提出改进建议或修改意见。

[原文] [Sherwin]: and so uh that's kind of what we're seeing internally but by and large the most exciting is just the energy that that there that there is

[译文] [Sherwin]: 这就是我们在内部看到的情况,但总的来说,最令人兴奋的还是那种氛围。

[原文] [Sherwin]: um another observation that we've had is uh engineers who tend to use Codex more uh open way more PRs so uh they're actually opening 70% more PRs uh and uh than than the engineers who aren't using Codex as much uh and the gap is widening

[译文] [Sherwin]: 我们观察到的另一个现象是,那些更倾向于使用 Codex 的工程师,开启的 PR 数量要多得多。实际上,他们开启的 PR 数量比那些不怎么使用 Codex 的工程师多出了 70%,而且这个差距正在扩大。

[原文] [Sherwin]: so I feel like you know the people who are opening more PRs um are starting to you know learn how to use the tool more and more get more efficient and that 70% gap keeps uh uh growing over time and so might have actually increased since I last looked at the at the number

[译文] [Sherwin]: 所以我觉得,那些开启更多 PR 的人正在开始——你知道的——越来越懂得如何使用这个工具,变得更高效,那个 70% 的差距随着时间的推移还在不断增长,实际上自上次我看这个数据以来,可能已经又增加了。

[原文] [Lenny]: okay so just to make sure we hear what you're saying you're saying all of the code of these 95% uh engineers at at OpenAI is written by AI it's written and then they review it yep yep

[译文] [Lenny]: 好的,为了确认我们没听错你的意思,你是说在 OpenAI,这 95% 的工程师的所有代码都是由 AI 写的?是 AI 写好,然后他们再进行审查?是的,是的。

[原文] [Lenny]: it's It's like crazy that that's almost like not crazy anymore that we're just like getting used to this

[译文] [Lenny]: 这太疯狂了,但好像又没那么疯狂了,好像我们已经开始习惯这种事了。

[原文] [Sherwin]: i think there's still some getting used to to be clear uh there's also I think some you know uh engineers who I think trust uh Codex a little bit less but um basically every day I talk to someone who who uh is blown away by something that it can do and and kind of like the their bar of of trust kind of uh or like how much they trust the model to do on its own goes up over and over uh over time

[译文] [Sherwin]: 我觉得大家还是在适应过程中,说实话。也确实有一些工程师对 Codex 的信任度稍低一些。但基本上每天我都会和某人交谈,他们都会被它能做的事情所震撼,而且他们信任的门槛——或者说他们信任模型能独立完成工作的程度——随着时间的推移在不断提高。

[原文] [Sherwin]: and there's a quote from Kevin Weil our our um VP of of science here and he likes saying this is the worst the models will ever be and so this is the worst that the models ever be for software engineering as well and so over time you just see people trusting it more and more and then we'll see the models get better and better as well

[译文] [Sherwin]: 我们负责科学的副总裁 Kevin Weil 有句名言,他喜欢说:“这是模型将会呈现的最差状态。”所以,这也是模型在软件工程方面将会呈现的最差状态。因此随着时间推移,你会看到人们越来越信任它,然后我们将看到模型也变得越来越好。

[原文] [Lenny]: yeah Kevin Weil former podcast guest uh he he said exactly that line on this podcast and a few times yeah

[译文] [Lenny]: 是的,Kevin Weil 也是这档播客的前嘉宾,他在节目里也确实说过这句名言,还不止一次。

[原文] [Lenny]: uh Peter the Clawdbot/Moltbot OpenClaw is what it's called now uh developer uh recently shared that he uses Codex for his work and he feels like anytime it does things he just trusts that it has done the right job and he's just like almost certain he could just commit it to master and it'll be great

[译文] [Lenny]: 那个开发了 Clawdbot/Moltbot(现在叫 OpenClaw)的开发者 Peter 最近分享说,他在工作中使用 Codex,他感觉无论它做什么,他都相信它做的是对的,他几乎确定可以直接提交到主分支(master),而且不会有问题。

[原文] [Sherwin]: yeah yeah he's a great um user of Codex i know he's in close touch with the team gives us great feedback um not surprised that he uses it i mean uh sorry it's called OpenClaw OpenClaw yeah OpenClaw is a great is a great product and then I saw that this I mean this is very recent but this morning I think Moltbook uh kind of like uh uh was shared as well and seeing all the uh AI agents talk to each other is pretty uh pretty surreal it's basically Her is happening in real life is what I'm hearing

[译文] [Sherwin]: 是的,是的,他是 Codex 的一名优秀用户。我知道他和团队保持着密切联系,给了我们非常棒的反馈。我不惊讶他在用它。我是说……抱歉,它叫 OpenClaw。OpenClaw 是个很棒的产品。而且我看到,这是最近的事,就在今天早上,我想 Moltbook 也被分享出来了。看着所有的 AI 智能体(Agents)互相交谈真是太超现实了。我听到的感觉基本上就是电影《Her》在现实生活中上演了。


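Chapter 2: The Evolving Engineer: From Coder to “Sorcerer”

📝 Section summary

In this section, Sherwin describes a major shift in the engineer's role: from traditional code writer to a “tech lead” or “sorcerer” who manages fleets of AI agents. He draws on a metaphor from the classic textbook Structure and Interpretation of Computer Programs (SICP), which likens programming to sorcery and code to incantations. With the rise of “vibe coding”, engineers resemble Mickey Mouse in Fantasia, the sorcerer's apprentice who puts on the sorcerer's hat: prompts act as spells that give enormous leverage, but the brooms (AI processes) can run out of control.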

[原文] [Lenny]: how do you imagine the role of an engineer and the job of a software engineer looks in the next couple years just like what is that job

[译文] [Lenny]: 你如何想象未来几年工程师的角色以及软件工程师的工作会是什么样子的?具体来说,那份工作会是什么样的?

[原文] [Sherwin]: yeah it's I mean it's honestly being really cool to see um uh and it's part of where the excitement is because uh like the job is likely going to change pretty significantly over the next one to two years it kind of feels like we're still figuring things out though and so there's like this excitement I know especially from some of the software engineers of like we're in this rare moment you know maybe over the next 12 to 24 months where we'll kind of get to figure things out ourselves and set our standards for ourselves

[译文] [Sherwin]: 是的,我是说,说实话,看到这些真的很酷,这也是令人兴奋的原因之一,因为这份工作很可能在未来一到两年内发生相当显著的变化。不过感觉我们还在摸索阶段,所以我知道,特别是在一些软件工程师中,有一种兴奋感,觉得我们正处于一个罕见的时刻,也许在接下来的 12 到 24 个月里,我们将有机会自己摸索清楚,并为自己设定标准。

[原文] [Sherwin]: in terms of where I see uh I see this moving so I think there's a common thing that everyone's saying which is uh you know people are generally like IC engineers are becoming tech leads they're basically like managers now they're managing fleets and fleets of agents

[译文] [Sherwin]: 关于我认为这会向哪个方向发展,我觉得大家都在说一个共同点,那就是——你知道的——通常来说,IC(独立贡献者)工程师正在变成 Tech Leads(技术主管)。他们现在基本上就像是管理者,正在管理着成群结队的 Agents(智能体)。

[原文] [Sherwin]: um I know many of the engineers on my team basically have like 10 to 20 uh threads kind of being pulled on at the same time obviously not active running Codex uh jobs but uh just a lot of parallel threads they're checking in on what they're doing they're steering the agents and Codex and and and and giving it feedback and so their job has kind of really changed from just writing the code itself into being almost like a manager

[译文] [Sherwin]: 我知道我团队里的很多工程师基本上同时处理着 10 到 20 个线程。显然不都是正在活跃运行的 Codex 任务,但就是有很多并行的线程,他们在检查这些线程在做什么,他们在引导这些 Agents 和 Codex,并给它们反馈。所以他们的工作已经从单纯的写代码,真正转变成了几乎像是一个管理者。

[原文] [Sherwin]: in terms of where I think this will go one to two years from now so one uh kind of metaphor that that I kind of always come back to here is actually from this uh is from this uh programming textbook uh that I read back in college called SICP i don't know if you've heard of it uh Structure and uh Interpretation of Computer Programs so SICP

[译文] [Sherwin]: 关于我认为一两年后这会变成什么样,我总是会回想起一个隐喻,实际上来自我在大学读过的一本编程教科书,叫 SICP。不知道你听没听说过,《计算机程序的构造和解释》(Structure and Interpretation of Computer Programs),简称 SICP。

[原文] [Sherwin]: um at at MIT it was really popular and and it was actually used as the uh uh introductory it was the textbook for the intro programming course for a very long time um and it kind of has this cult following um it teaches you programming uh it teaches you a dialect of Lisp called Scheme uh and so it like introduces you to like functional programming it's like very mind-opening in that way

[译文] [Sherwin]: 在麻省理工学院(MIT)它非常流行,而且实际上很长一段时间它都是编程入门课程的教科书。它有一种狂热的追随者群体。它教你编程,教你一种叫 Scheme 的 Lisp 方言,带你领略函数式编程,在这方面非常令人脑洞大开。

[原文] [Sherwin]: but the thing that was memorable for me about that book so I I I kind of read it in college um the very beginning of it kind of describes programming as a discipline and draws this metaphor to basically like sorcery like it says like software engineers are like wizards and you're like like programming languages are like incantations and you're like you know you're you're saying you're issuing these spells and these spells are kind of like going out and doing things for you and and the challenge is like what incantation do you have to say to make the the program do what you want

[译文] [Sherwin]: 但这本书让我印象最深刻的是——我在大学读的时候——它的开篇将编程描述为一门学科,并将其比喻为巫术(Sorcery)。书中说软件工程师就像巫师(Wizards),编程语言就像咒语(Incantations)。你就像是在念出这些咒语,而这些咒语会出去为你做事。挑战在于,你必须念什么咒语才能让程序做你想做的事。

[原文] [Sherwin]: and this book was written in 1980 so this is this is a while ago and I think that metaphor is actually like kind of persisted over time and I think it's actually playing out as we move into this uh new era of vibe coding or just like what software engineering will look like because programming languages were basically these incantations they've changed over time and the challenge has always and and the trend has been that these it's been easier and easier to kind of get them the the the computer to do what you want uh via programming

[译文] [Sherwin]: 这本书写于 1980 年,所以已经有一段时间了。但我认为这个隐喻实际上一直延续至今,而且我认为当我们进入这个“Vibe coding”(氛围编程)的新时代,或者说软件工程未来的样子时,这个隐喻正在变为现实。因为编程语言基本上就是这些咒语,它们随时间而变,挑战一直存在,趋势是通过编程让计算机做你想做的事变得越来越容易。

[原文] [Sherwin]: and I think the current wave of AI is is probably the next stage of that evolution it is now literally incantations because you can tell you know your uh you can tell Codex you can tell Cursor uh exactly what you want to do and then it'll all go do it for you

[译文] [Sherwin]: 我认为当前的 AI 浪潮可能是这一进化的下一阶段。现在它真的就是咒语了,因为你可以告诉 Codex,告诉 Cursor 你确切想要做什么,然后它就会去为你完成。

[原文] [Sherwin]: uh and I particularly like the wizard and like the the the sorcery analogy because uh I think our current state is is starting to move towards kind of like the the Sorcerer's Apprentice uh you know from Fantasia uh where Mickey Mouse is like you know he finds the sorcerer hat and he tries to do all these things and I actually think it's a really apt analogy because one uh it's just it's really powerful now these incantations you can do can is is extremely high leverage but you kind of have to know what you're doing right

[译文] [Sherwin]: 我特别喜欢这个巫师和巫术的比喻,因为我觉得我们现在的状态开始变得有点像《魔法师的学徒》(The Sorcerer's Apprentice)——你知道的,来自电影《幻想曲》(Fantasia)——米老鼠发现了魔法师的帽子,他试图做所有这些事情。我觉得这是一个非常贴切的比喻,首先因为现在的咒语真的很强大,杠杆率极高,但你得知道自己在做什么,对吧?

[原文] [Sherwin]: like in Sorcerer's Apprentice the whole plot is like Mickey goes wild the the brooms like go crazy and everything's flooding i think he literally sets the like sets the uh the brooms off on a task and then goes asleep uh and and so you know it's like vibe coding at it at its at its greatest and then eventually the the old sorcerer comes back and like cleans everything up

[译文] [Sherwin]: 就像在《魔法师的学徒》里,整个情节就是米老鼠失控了,扫帚们疯了,到处都在发大水。我觉得他是给扫帚们布置了一个任务,然后就去睡觉了——这简直就是“Vibe coding”的极致表现——然后最终老魔法师回来把一切都收拾好了。

[原文] [Sherwin]: and um you know when I see engineers kind of like doing these these these these 20 different uh Codex threads at a time there there is some skill and there's some seniority and like you know uh um a lot of thought that needs to go into this because you want to make sure that the the the models aren't going off the rails uh you definitely don't want to just like completely uh go away and and you know like ignore ignore the thing

[译文] [Sherwin]: 当我看到工程师们同时处理这 20 个不同的 Codex 线程时,这确实需要一些技巧,一些资历,以及——你知道的——大量的思考。因为你要确保模型不会脱轨(off the rails),你绝对不想完全走开,然后就不管它了。

[原文] [Sherwin]: but it's also extremely high leverage like you know a a very senior engineer who's who's really prol uh proficient with these tools uh can now just do way more things via what they're doing and and I think this is also what makes it fun like it literally feels like we're wizards now you know it feels like we're closer to to to to having uh uh to to making making it feel like this like magical experience where we're you know casting all these spells and having software do all these things for you

[译文] [Sherwin]: 但这样做的杠杆也极高。比如一个非常资深、对这些工具非常熟练的工程师,现在可以通过他们的操作做更多的事情。我觉得这也是有趣的地方,感觉我们现在真的就像巫师一样。感觉我们更接近于创造一种魔法般的体验,我们在施放所有这些咒语,让软件为你做所有这些事情。

[原文] [Lenny]: i was thinking of the Sorcerer's Apprentice exactly as the metaphor as you were describing that so I'm glad you went there uh a previous podcast guest described it as you have a genie that you can that grants you wishes and it's a useful frame because you have to be very clear about the wish you want like if you want to be big like how big it could be yeah or it might be like the monkey's paw type thing where you know it's like you got what you want but what are the side effects

[译文] [Lenny]: 当你描述的时候,我脑海里想的正是《魔法师的学徒》这个比喻,所以我很高兴你提到了它。之前的一位播客嘉宾把它描述为你拥有一个可以实现愿望的精灵(Genie)。这是一个很有用的框架,因为你必须非常清楚你想要的愿望是什么。比如你想变大,那要多大?是的,或者它可能像“猴爪”(Monkey's Paw,意指实现愿望但带来可怕后果)那样的东西,你得到了你想要的,但副作用是什么?

[原文] [Sherwin]: um yeah yeah i think that and the analogy is great and um yeah the crazy thing for me is just the staying power of that book SICP like it's called the wizard book you know people call it the wizard book because that is the metaphor that they kind of weave throughout the the book and um we're we've basically reached that point now which is which is which is really cool

[译文] [Sherwin]: 是的,是的,我觉得那个比喻很棒。对我来说最疯狂的是那本书 SICP 的持久影响力。它被称为“巫师书”(The Wizard Book),人们叫它巫师书就是因为那个贯穿全书的隐喻。而我们现在基本上已经到达了那个点,这真的很酷。


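Chapter 3: An Extreme Experiment: A Fully AI-Written Codebase and the “No Escape Hatch” Challenge

📝 Section summary

In this chapter, Sherwin describes a radical experiment under way inside OpenAI: one team maintains a codebase written 100% by Codex. Unlike normal practice, the team has no “escape hatch” of fixing code by hand. The experiment shows that when the AI cannot complete a task, the cause is usually missing context rather than missing capability. The engineer's core job therefore becomes turning the tribal knowledge in their head into documentation, code comments, or Markdown files in the repository so that the AI can understand and carry out the task. The process is surfacing best practices for agent-driven development.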

[原文] [Lenny]: there's two kind of threads I want to follow here one is I've been hearing more and more there's this like stress that people feel when their agents aren't working you fire off all these you know Codex agents and then you have to keep stay on top of them oh shit one's not working i'm wasting time uh do you do you feel that do you feel that across your team at all

[译文] [Lenny]: 这里有两条我想继续探讨的线索。其中一条是,我越来越多地听到人们在 Agent(智能体)不工作时会感到一种压力。你启动了所有这些 Codex Agents,然后你得一直盯着它们,“噢该死,有一个不工作了,我在浪费时间”。你有这种感觉吗?你在你的团队中感觉到这种情况了吗?

[原文] [Sherwin]: yeah yeah i mean it happens all the time and I actually think like this is where the interesting part of all of this lies right now because these models aren't perfect these tools aren't perfect and we're still trying to figure out how to best interact with these uh with with with Codex or with these AI agents to to get work done

[译文] [Sherwin]: 是的,是的,我是说这种情况经常发生。而且我实际上认为这正是目前所有这一切最有趣的地方,因为这些模型并不完美,这些工具也不完美,我们仍在试图弄清楚如何最好地与 Codex 或这些 AI Agents 交互来完成工作。

[原文] [Sherwin]: we see this come up all the time there's a particularly interesting team that we have internally so there's a team that that's actually doing an experiment right now uh within OpenAI where they are basically maintaining a 100% Codex-written codebase

[译文] [Sherwin]: 我们经常看到这种情况出现。我们在内部有一个特别有趣的团队,实际上有一个团队目前正在 OpenAI 内部做一个实验,他们基本上在维护一个 100% 由 Codex 编写的代码库。

[原文] [Sherwin]: uh so you know like you know uh uh some you know you'll have the AI write code but you'll obviously end up like rewriting a lot of it and and you might need to like double check and change things but this team is just fully Codex-pilled and just like leaning in entirely

[译文] [Sherwin]: 通常你知道的,你会让 AI 写代码,但你显然最终会重写很多部分,或者你可能需要仔细检查并修改东西。但这个团队完全是“服下了 Codex 药丸”(全信徒),完全全身心地投入其中。

[原文] [Sherwin]: uh and they run into the exact problems that you're describing which is like you know their challenge is you know uh you know I want to get this thing this feature built but I can't get the agent to do it and so usually there's an escape hatch where you know then you're like all right I'll roll up my sleeves and like figure it out and then instead of using Codex I might use like tab complete and and Cursor and things like that

[译文] [Sherwin]: 他们遇到了你描述的完全相同的问题,就是他们的挑战在于——比如“我想构建这个功能,但我无法让 Agent 去完成它”。通常情况下,你会有一个“逃生通道”(Escape Hatch),你会想“好吧,我会卷起袖子自己搞定它”,然后不再使用 Codex(自动生成),而是可能会使用 Tab 补全、Cursor 之类的工具辅助。

[原文] [Sherwin]: but this team uh uh for the experiment this team doesn't have that escape hatch uh and so then the challenge like how do I get the the the agent to to to do this

[译文] [Sherwin]: 但对于这个团队的实验来说,他们没有那个“逃生通道”。所以挑战就变成了:我该如何让 Agent 去完成这件事?

[原文] [Sherwin]: and um I actually think we're going to be publishing a blog post from some of our learnings here um but a lot of fascinating like paradigms and best practices are falling out of this

[译文] [Sherwin]: 我实际上认为我们会发布一篇博客文章来分享我们在这里学到的一些东西,但很多迷人的范式和最佳实践正在从中涌现。

[原文] [Sherwin]: um one interesting thing that we've noticed I I don't know if this is what you you kind of feel but we definitely feel it here is a lot of the time uh when the coding agent is not doing what you want it's usually a problem with context and just like information that you've given it it's just you've either underspecified or there's just not enough information around how to do something available to the agent available to Codex

[译文] [Sherwin]: 我们注意到的一个有趣现象(我不知道你是否有同感,但我们在这里确实感觉到了)是:很多时候,当编码 Agent 没有做你想做的事时,通常是上下文(Context)的问题,也就是你给它的信息有问题。要么是你描述得不够具体(Underspecified),要么就是对于如何做某件事,Agent 或 Codex 可用的信息根本不够。

[原文] [Sherwin]: uh and so uh when when you have to solve it through through that uh the challenge is then to to to add documentation and actually work around this this limitation and basically encode more tribal knowledge that's in your head somehow into the codebase either via you know code comments itself or code structure itself or via text files like you know MD files skills any type of additional resources within the repository so that the model can um uh can better do its task

[译文] [Sherwin]: 所以,当你必须通过这种方式解决问题时,挑战就变成了添加文档,实际上是绕过这个限制,基本上是将你脑海中的“隐性知识”(Tribal Knowledge)以某种方式编码进代码库。这可以通过代码注释本身,或者代码结构本身,或者通过文本文件——比如 Markdown 文件、技能(Skills)文件,或者代码仓库中的任何类型的额外资源——来实现,以便模型能更好地完成任务。

[原文] [Sherwin]: there's a whole bunch of other learnings from this uh this group which I think is fascinating uh to to explore but yeah kind of giving removing that escape hatch of of no longer using AI has allowed them to start piecing together a lot of the problems that we'll have to solve if we really want to lean into agents

[译文] [Sherwin]: 这个小组还有很多其他的经验,我认为非常值得探索。不过确实,移除“不再使用 AI”这一逃生通道,让他们得以逐步拼凑出许多问题,而这些正是我们真想全面依赖 Agents 时必须解决的问题。


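Chapter 4: Reshaping the Workflow: Automated PR Review and Removing Toil

📝 Section summary

As AI-assisted coding drives a surge in pull requests (PRs), human review becomes the new bottleneck. Sherwin says Codex now reviews 100% of PRs inside OpenAI, cutting a review that used to take 10 to 15 minutes to as little as 2 to 3 minutes. Codex acts as a “second pair of eyes” and has also taken over tedious CI and deployment steps such as lint fixes and test runs. There is a circularity risk when AI both writes and reviews the code, but by keeping human oversight where it matters (so the brooms do not run wild), the team has freed engineers from low-value repetitive work.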

[原文] [Lenny]: another uh issue people run into you talked about how people are shipping PRs like crazy a lot more PRs if they're working with AI uh obviously code review is becoming a bigger challenge is there anything you've figured out in your team to help speed that up to make that scale as and not just create this terrible job for people where they're just sitting there reviewing PRs all day

[译文] [Lenny]: 人们遇到的另一个问题是,你提到大家都在疯狂地提交 PR(代码合并请求),如果使用 AI 工作,PR 数量会多得多。显然,代码审查正在成为一个更大的挑战。你们团队有没有通过什么方法来加速这一过程,使其能够规模化,而不是让它变成一份糟糕的工作,让人整天坐在那里审查 PR?

[原文] [Sherwin]: yeah I mean one thing is Codex reviews 100% of all of our PRs at this point and so uh I actually think so one one really interesting thing that's happened is the things that tend to we hand we tend to hand to the models immediately tend to be the things that annoy us or like are the most boring parts of uh software engineering it's also why it's more fun now because we get to do more you know more of the fun things

[译文] [Sherwin]: 是的,我是说,有一点是 Codex 目前审查我们所有的 PR。实际上我认为发生的一件非常有趣的事情是,我们倾向于立即交给模型处理的事情,通常是那些让我们烦恼或者软件工程中最无聊的部分。这也是为什么现在工作更有趣了,因为我们可以做更多有趣的事情。

[原文] [Sherwin]: um for me um speaking more for myself I really hated code reviews it was like one of the worst things for me and then I remember in my first job uh out of college uh it was at it was at Quora um I owned I was working on the newsfeed and so I owned the code for the newsfeed and so I was a reviewer for Newsfeed and uh it was just like the central piece of code that everyone would touch and so I would just every morning I'd log in and be like like 20 to 30 code reviews i just like oh my goodness I got to like you know get through all these um I would procrastinate and then it grows to like 50

[译文] [Sherwin]: 对我个人而言,我真的讨厌代码审查,这对我来说是最糟糕的事情之一。我记得我大学毕业后的第一份工作是在 Quora。当时我在做 Newsfeed(信息流),所以我负责 Newsfeed 的代码,我也是 Newsfeed 的审查员。那基本上是每个人都会触及的核心代码块,所以我每天早上登录时,大概会有 20 到 30 个代码审查等着我。我就想:“天哪,我得把这些都处理完。”然后我会拖延,接着它就增加到 50 个。

[原文] [Sherwin]: and so there's like a a lot of code reviews Codex is really good at reviewing code uh so actually one thing that we've noticed that 5.2 in particular has gotten extremely strongly adept at is reviewing code and especially when you kind of steer it in the right direction

[译文] [Sherwin]: 所以有大量的代码审查。Codex 在审查代码方面真的非常出色。实际上我们注意到的一点是,特别是 5.2(应指 GPT-5.2)在代码审查方面变得极其熟练,尤其是当你给它正确的引导时。

[原文] [Sherwin]: and so uh for code reviews yeah we create a lot of PRs but Codex reviews all of them and it makes you know code reviews go from a you know I don't know 10 15 minute task to sometimes even just like a two to three minute task because you have a uh a bunch of suggestions uh already already baked in

[译文] [Sherwin]: 所以对于代码审查,是的,我们创建了很多 PR,但 Codex 会审查所有这些 PR。它让代码审查从一个——我不知道,比如 10 到 15 分钟的任务——变成了有时只需要两三分钟的任务,因为你已经有了一堆现成的建议在那里了。
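
To put the per-review figures above in perspective, here is a small back-of-the-envelope calculation. It is only an illustration: pairing the 20 to 30 review morning queue from the Quora anecdote with the OpenAI per-review times is an assumption made here, and `queue_minutes` is a hypothetical helper name.

```python
def queue_minutes(num_reviews: int, minutes_each: int) -> int:
    """Total minutes needed to clear a review queue at a fixed time per review."""
    return num_reviews * minutes_each


# Figures quoted in the conversation; combining them is an illustrative assumption.
QUEUE = (20, 30)    # reviews waiting in the morning (low, high)
BEFORE = (10, 15)   # minutes per review without a Codex pre-review (low, high)
AFTER = (2, 3)      # minutes per review with Codex suggestions already attached

before_range = (queue_minutes(QUEUE[0], BEFORE[0]), queue_minutes(QUEUE[1], BEFORE[1]))
after_range = (queue_minutes(QUEUE[0], AFTER[0]), queue_minutes(QUEUE[1], AFTER[1]))

print(f"before: {before_range[0]}-{before_range[1]} minutes")
print(f"after:  {after_range[0]}-{after_range[1]} minutes")
```

Under these assumptions the same queue drops from roughly 200 to 450 minutes of reviewing to roughly 40 to 90 minutes.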

[原文] [Sherwin]: uh a lot of the times people will uh especially for small PRs like you you actually don't even need people to review we kind of trust Codex in this way um the original author kind of looks at Codex it is you know the benefit of code review is to have a second pair of eyes to make sure that you're not doing anything dumb Codex is a pretty smart second pair of eyes at this point and so that's something that that we've heavily leaned into

[译文] [Sherwin]: 很多时候,特别是对于小的 PR,你甚至不需要人来审查,我们在某种程度上信任 Codex。原作者会看一下 Codex 的反馈。你知道,代码审查的好处是多一双眼睛来确保你没有做任何蠢事。现阶段 Codex 就是相当聪明的“第二双眼睛”,所以这是我们非常依赖的一点。

[原文] [Sherwin]: um the general CI process and like the post uh kind of push and like deployment process has also been heavily automated via Codex internally at this point if you talk to a lot of engineers the thing that annoys them the most is after you've written your beautiful code like how do you get it into production you know you got to you got to run through all these tests you got to like you know lint errors you code review

[译文] [Sherwin]: 目前在内部,通用的 CI(持续集成)流程以及推送后的部署流程也已经通过 Codex 实现了高度自动化。如果你和很多工程师交谈,最让他们烦恼的事情是,在你写完漂亮的代码之后,怎么把它弄到生产环境中去?你知道,你得跑完所有的测试,你得处理 Lint 错误,你得做代码审查。

[原文] [Sherwin]: um there's a lot of automated stuff you can do with Codex and so we've actually built some tools internally that that help automate that process automate the lint you know if there's like a lint error it's a very easy Codex fix uh and then just it could just patch it and then kind of restart the CI process

[译文] [Sherwin]: 有很多自动化的事情你可以用 Codex 做。所以我们实际上在内部建立了一些工具来帮助自动化这个过程,自动化 Lint 检查。你知道,如果有一个 Lint 错误,这对 Codex 来说是一个非常容易的修复,它可以直接打补丁,然后重新启动 CI 流程。
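
The loop described here (lint fails, the agent patches the code, CI restarts) might look roughly like the sketch below. It is not OpenAI's internal tooling, which the conversation does not describe in detail. The function `auto_fix_lint` and its three callables (`run_lint`, `apply_fix`, `restart_ci`) are hypothetical names, and the retry cap that hands control back to a human is an added assumption in the spirit of not letting the brooms run wild.

```python
from typing import Callable, List


def auto_fix_lint(
    run_lint: Callable[[], List[str]],
    apply_fix: Callable[[List[str]], None],
    restart_ci: Callable[[], None],
    max_attempts: int = 3,
) -> bool:
    """Patch lint errors until the linter is clean, then restart CI.

    Returns True when lint is clean (CI is restarted only if a patch was applied).
    Returns False when errors remain after max_attempts, so a human can step in.
    """
    errors = run_lint()
    attempts = 0
    while errors and attempts < max_attempts:
        apply_fix(errors)       # hypothetical: ask the coding agent for a patch
        attempts += 1
        errors = run_lint()     # re-check after each patch
    if errors:
        return False            # still failing: escalate instead of looping forever
    if attempts > 0:
        restart_ci()            # something changed, so rerun the pipeline
    return True


# Tiny in-memory stand-ins so the sketch runs without a real linter or CI system.
pending = ["F401 unused import", "E501 line too long"]
ci_restarts: List[str] = []


def fake_lint() -> List[str]:
    return list(pending)


def fake_fix(errors: List[str]) -> None:
    pending.remove(errors[0])   # pretend the agent fixes one error per patch


def fake_restart() -> None:
    ci_restarts.append("restart")


ok = auto_fix_lint(fake_lint, fake_fix, fake_restart)
print(ok, ci_restarts)
```

In a real pipeline the three callables would wrap the project's linter, the coding agent, and the CI system; keeping them injectable makes the control flow easy to test on its own.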

[原文] [Sherwin]: um so all of that is we're trying to collapse as as into as as little work for an engineer as possible which and and the byproduct of which is uh um uh they can they can now merge and push out a lot more PRs

[译文] [Sherwin]: 所以所有这些,我们都在尝试将其压缩,让工程师的工作量尽可能小,其副产品就是他们现在可以合并和推送更多的 PR。

[原文] [Lenny]: Codex writing the code Codex reviewing its own code i'm curious if you are open to using other models to review your models work is that is that a path or is it just it's good enough we don't need anything else

[译文] [Lenny]: Codex 写代码,Codex 审查它自己的代码。我很好奇你们是否愿意使用其他模型来审查你们模型的工作?这会是一条路吗?还是说现在的已经足够好了,不需要别的了?

[原文] [Sherwin]: so I will say there's there's definitely a circular thing here and like going back to Sorcerer's Apprentice like you want to make sure you're not letting the brooms go crazy here um and so you know we're very thoughtful I'd say around which PRs kind of are completely just Codex uh reviewed most people still obviously take a look at their PRs uh and so it's not like it's going to zero it's more like going from you know 100% attention to like 30% attention which which just helps things push through

[译文] [Sherwin]: 我得说这里肯定存在某种循环。回到《魔法师的学徒》那个比喻,你要确保这里没让“扫帚”发疯。所以我得说,我们在对待哪些 PR 可以完全只由 Codex 审查这方面是非常深思熟虑的。大多数人显然还是会看一眼他们的 PR,所以这并不是说人工关注度变成了零,更像是从 100% 的关注度降低到了 30%,这有助于推动事情的进展。


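Chapter 5: The Evolving Manager: Empowering Top Performers and Widening the Span of Control

📝 Section summary

In this chapter, Sherwin discusses how the manager's role is changing. The manager's workflow has changed less than the engineer's, but AI still brings significant leverage. He notes that AI supercharges the productivity of top performers, so his management philosophy is to spend the majority of his time with them and make sure they are unblocked. He also expects that tools such as ChatGPT connected to organizational context, which already help with tasks like performance reviews, will let managers widen their span of control beyond the traditional 6 to 8 people and manage larger teams effectively.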

[原文] [Lenny]: there's a lot of talk and we've been talking about kind of the IC role the work of an IC engineer there's less talk about the changing role of a manager especially an engineering manager how has your life as a manager changed with the rise of AI and just what do you where do you think managers what's the role of a manager in the future

[译文] [Lenny]: 有很多关于 IC(独立贡献者)角色的讨论,我们刚才也谈到了 IC 工程师的工作。但关于管理者——尤其是工程经理——角色变化的讨论较少。随着 AI 的兴起,你作为管理者的生活发生了什么变化?你认为未来管理者的角色会是什么样的?

[原文] [Sherwin]: it's definitely changed less than an engineer uh there's no you know Codex for managers just uh just yet however I use Codex quite a bit for for some of the um uh some of some of the like kind of more managery tasks that I do i'd say a couple things are are changing there like some trends so I don't think it's changed that much yet um but I see trends and I think if you play it out you can kind of see where where a lot of this is going

[译文] [Sherwin]: 肯定比工程师的变化要小,目前还没有针对管理者的 Codex。不过,我确实经常使用 Codex 来处理我的一些管理任务。我想说有一些事情正在改变,有一些趋势。我认为目前变化还不是很大,但我看到了趋势,如果你推演一下,就能看到这一切将走向何方。

[原文] [Sherwin]: one thing that that's becoming increasingly clear is Codex really empowers like top performers to to get a lot like to be a lot more productive and so it really like and I think this is maybe true for AI more broadly like across society which is like the people who really lean in or like the people who have high agency or like will get get good at these tools will kind of supercharge themselves

[译文] [Sherwin]: 变得越来越清晰的一点是,Codex 真正赋能了那些顶尖人才(Top Performers),让他们变得更加高产。我认为这在更广泛的 AI 领域乃至整个社会可能都是真的:那些真正全身心投入的人,或者那些具有高度能动性(High Agency)、擅长使用这些工具的人,将会让自己的能力“爆发式增强”(Supercharge themselves)。

[原文] [Sherwin]: uh and so I'm kind of noticing this now as well which is like the top performers kind of end up uh uh uh being a lot more a lot more productive uh and so you see a broader spread uh in in team productivity in this way

[译文] [Sherwin]: 所以我现在也注意到了这一点,就是顶尖人才最终会变得——变得极其高产。因此,你会看到团队生产力的差距以此方式进一步拉大。

[原文] [Sherwin]: one so one thing that I've always done as as a management philosophy is to spend uh actually the majority of my time with top performers just like make sure they're unblocked make sure they're happy make sure you know they're they feel productive and they feel heard i think this is even more true uh in an AI world where you know your top performers are going to just like really be shooting ahead uh using these tools

[译文] [Sherwin]: 所以,作为一种管理哲学,我一直坚持做的一件事是,实际上把我的大部分时间花在顶尖人才身上。只要确保他们没有被阻碍,确保他们开心,确保——你知道的——他们感觉自己富有成效,感觉被倾听。我认为在一个 AI 世界里这更加正确,因为你知道你的顶尖人才将会利用这些工具真正地一骑绝尘。

[原文] [Sherwin]: i think I think one example is is that the team that's you know maintaining a 100% Codex-generated codebase like just letting them kind of rip and and and see what's happening there is something that's that's paid dividends so I think that that's kind of one one trend that I'm seeing where where where um spending even more time with top performers for managers I think is is likely going to um uh continue

[译文] [Sherwin]: 我觉得一个例子就是那个维护 100% Codex 生成代码库的团队。就像让他们放手去干,看看会发生什么,这已经带来了回报。所以我认为这是我看到的一个趋势,管理者花更多时间在顶尖人才身上,这很可能会持续下去。

[原文] [Sherwin]: the other thing is I I so this is more uh an observation but my sense is with a lot of these AI tools available to managers so le less like writing code but just things like ChatGPT with organizational knowledge like being able to do research and understanding organizational context a lot better another good example is uh um we're doing performance reviews right now and it's actually really easy to use ChatGPT with internal knowledge hooked up to GitHub and like our Notion docs and Google Docs to give it get a really good sense of what this person has done over the last 12 12 uh months uh and writing a little you know deep research report for it

[译文] [Sherwin]: 另一件事——这更多是一个观察——我的感觉是,有了这些可供管理者使用的 AI 工具(不像写代码那么多,主要是像带有组织知识的 ChatGPT 这种),能够做研究并更好地理解组织背景。另一个很好的例子是,我们现在正在做绩效评估,利用连接了内部知识库(如 GitHub、Notion 文档和 Google 文档)的 ChatGPT,实际上非常容易就能很好地了解某个人在过去 12 个月里做了什么,并为此写一份深度研究报告。

[原文] [Sherwin]: my sense is I think managers will be able to manage much larger teams in this world kind of like how you know like software engineers are managing 20 to 30 Codexes um my sense that these tools will allow managers people managers to be higher leverage um and uh it will allow them to to to manage you know teams of way more than than the current best practice of I think it's like six to eight right for software engineering

[译文] [Sherwin]: 我的感觉是,在这个世界里,管理者将能够管理规模大得多的团队。就像软件工程师正在管理 20 到 30 个 Codex 线程一样,我的感觉是这些工具将允许管理者,也就是带人的管理者(People Managers),拥有更高的杠杆率,并允许他们管理远超当前最佳实践人数的团队。我想对于软件工程来说,目前的最佳实践大概是 6 到 8 人,对吧。

[原文] [Sherwin]: you kind of see this applied to you know like uh the non uh engineering domains like support or uh operations where it's like you know previously um uh where previously like the size of support team might be limited but like as you can pass off more things to agents you can actually do more work and also manage more people this way i think the same thing might happen for um people management as well especially in tech companies

[译文] [Sherwin]: 你可以看到这已经应用在非工程领域,比如客服或运营。以前客服团队的规模可能受限,但随着你可以把更多事情交给 Agents 处理,你实际上可以做更多工作,并通过这种方式管理更多人。我认为同样的事情也可能发生在人员管理(People Management)上,尤其是在科技公司。

[原文] [Lenny]: i love this advice that the way you described it is you've always leaned into top performers and spent more time with them unblock them make sure they're happy the way Marc Andreessen he was just on the podcast the way he phrased it is AI makes good people better and it makes great people exceptional

[译文] [Lenny]: 我喜欢这个建议,你描述的方式是你总是倾向于顶尖人才,花更多时间在他们身上,为他们消除障碍,确保他们开心。Marc Andreessen 之前刚上过这个播客,他的说法是:AI 让优秀的人变得更好,让杰出的人变得卓越。

[原文] [Sherwin]: yeah yeah and what you're saying here is just just doing this more and more is probably the right move spending more time with the best people on your team to unblock them make sure they have everything they need yeah

[译文] [Sherwin]: 是的,是的。你在这里说的就是,越来越多地这样做可能是正确的举措:花更多时间和你团队中最优秀的人在一起,为他们消除障碍,确保他们拥有一切所需。是的。


章节 6:未来商业格局:一人独角兽与 SaaS 的黄金时代

📝 本节摘要

本章探讨了备受瞩目的“一人十亿美金公司”概念。Sherwin 认为这并非空谈,但他更关注由此引发的连锁反应(二阶与三阶效应):为了支撑这种超级个体,可能会诞生数百家提供定制化 AI 服务的小型初创公司,从而开启 B2B SaaS 的黄金时代。这种趋势可能改变风险投资的逻辑——虽然传统的百倍回报机会可能减少,但会涌现大量估值 1000-5000 万美金的“小而美”企业,这对个人创业者是极大的利好。针对 Lenny 关于客服成本难以缩减的质疑,Sherwin 预言未来企业将通过外包给专门的 AI 代理服务商来维持极简规模。

[原文] [Lenny]: people just like have a sense this is big ai is changing so much the world is changing uh it's going to be a huge deal what do you think people aren't pricing in yet into what will change into where things are heading just like what's an example of something you think are like okay we're not realizing this yet

[译文] [Lenny]: 人们确实感觉到了这是一件大事,AI 正在改变这么多东西,世界正在改变,这将会是一件巨型的大事。你认为人们还没有“计入考量”(pricing in)的变化是什么?或者事情将走向何方?能不能举个例子,说明什么是你觉得“好吧,我们还没意识到这点”的事情?

[原文] [Sherwin]: so one of my favorite kind of uh uh like phrases or like things that have come out of this whole AI wave is is the idea of the one person billion dollar startup i think I actually think Sam may have ke or like uh Sam Sam may have been the first one to say it but it's fascinating to think about right

[译文] [Sherwin]: 这一波 AI 浪潮中涌现出的我最喜欢的短语或概念之一,就是“一人十亿美金初创公司”的想法。我觉得实际上可能是 Sam(Sam Altman)……或者是 Sam 最先说出来的。但这真的很值得思考,对吧?

[原文] [Sherwin]: it's like yeah if you know if people are so high leverage at some point there will likely be um a one-person billion dollar startup um and while I think that's really really cool I think people aren't really pricing in the second or third order effects of this

[译文] [Sherwin]: 就好像,如果人们的杠杆率如此之高,在某个时刻很可能会出现一家只有一人的十亿美金初创公司。虽然我觉得这非常酷,但我认为人们并没有真正考虑到其背后的二阶或三阶效应。

[原文] [Sherwin]: because what the one person billion dollar startup implies is that there's you know one person can just have so much more agency and so much more leverage using one of these tools um that it is just super easy for them to get everything done that they need to for for their business to you know ultimately create something that's a billion dollars

[译文] [Sherwin]: 因为一人十亿美金初创公司意味着,一个人利用这些工具可以拥有大得多的能动性和杠杆率,使得他们能够超级轻松地完成业务所需的一切,最终创造出价值十亿美金的东西。

[原文] [Sherwin]: but I think there are a couple other implications of this one of them is uh uh if it's easy for a person to create a one person bill or if it's possible for a person to create a one person billion dollar startup it also means it's way easier for people to just create startups in general

[译文] [Sherwin]: 但我认为这还有其他几个含义。其中一个是,如果一个人很容易创造,或者说一个人有可能创造一家一人十亿美金的初创公司,这也意味着人们创建一般的初创公司也变得容易多了。

[原文] [Sherwin]: like I actually think this will like one second order effect to this is I think there's going to be a huge like startup boom and like small like SMB style boom um where anyone can build software for anything right

[译文] [Sherwin]: 实际上我认为这会带来一个二阶效应,就是会出现巨大的初创公司繁荣,类似那种中小企业(SMB)风格的繁荣,任何人都可以为任何事情开发软件,对吧?

[原文] [Sherwin]: you're kind of starting see starting to see this play out in the AI startup scene where software's became a lot more vertical oriented where

章节 7:管理哲学深潜:工程师即“外科医生”

📝 本节摘要

在本章中,Sherwin 分享了他深受经典著作《人月神话》(The Mythical Man-Month)影响的管理哲学。他将顶尖软件工程师比作“外科医生”:在手术室中,所有其他角色(如护士、麻醉师等)的存在都是为了支持主刀医生高效工作。作为管理者,他的核心任务是“多看一步”(Look around corners),提前为团队扫清组织障碍,就像在医生伸手前就递上手术刀一样,确保工程师能心无旁骛地借助 AI 工具高频产出代码。两人甚至现场构思了一个新点子:用 AI 分析 Slack 和文档,自动预测团队可能遇到的阻碍。

[原文] [Lenny]: okay uh I wanted to come back actually to your management stuff so I really loved your insight about spending more time with top performers has been really successful to you just thinking about you as a manager of a team that is building the platform that powers basically the entire AI economy like every AI startup is building on your API uh clearly you're doing a great job what other kind of core management lessons have you learned what do you find is really important and and and key to your success as a manager of engineers and just people

[译文] [Lenny]: 好的,我想回过头来谈谈你的管理内容。我非常喜欢你关于“花更多时间在顶尖人才身上对你非常成功”的见解。想想看,作为一个团队的管理者,你正在构建支撑整个 AI 经济的平台——就像每个 AI 初创公司都在基于你们的 API 构建一样,显然你做得非常棒。你还学到了哪些其他的核心管理经验?作为一个工程师和人员的管理者,你觉得什么是真正重要的,什么是你成功的关键?

[原文] [Sherwin]: yeah um I I think a lot of the lessons that I've learned here I don't know how specific it is to the OpenAI API or or some of our enterprise products in particular i think my my management philosophy has obviously changed over time but I think it it's uh probably stayed the same more than it's changed uh over time

[译文] [Sherwin]: 是的,我想我在这里学到的很多经验,我不知道这是否专门针对 OpenAI API 或我们的一些企业产品。我想我的管理哲学显然随着时间推移发生了变化,但我认为它保持不变的部分可能比变化的部分要多。

[原文] [Sherwin]: uh one of these principles is is kind of what I talked to you about before which is you know spending a lot of time with with top performers like actually spending and like to be very concrete like it's like more than 50% of your time with your top performers with maybe your top like 10% uh performers and really really trying your best to empower them

[译文] [Sherwin]: 其中一个原则就是我之前跟你谈过的,那就是花大量时间在顶尖人才身上。具体来说,就是把你超过 50% 的时间花在你的顶尖人才身上,也许是你前 10% 的表现者身上,并且真的、真的尽你最大的努力去赋能他们。

[原文] [Sherwin]: the way that I think about it is um is is is kind of come back to this analogy of software engineer as as as a surgeon um which comes from The Mythical Man-Month book so it's actually it's funny so I I pull it from the book but in the book they actually described this world where um I think they were like predicting the future cuz cuz I think the book was written like in the 70s or something

[译文] [Sherwin]: 我思考这个问题的方式,是回到“软件工程师即外科医生”这个类比上。这出自《人月神话》(The Mythical Man-Month)这本书。实际上这很有趣,我引用了书里的概念,但在书中他们实际上描述了一个世界——我觉得他们像是在预测未来,因为那本书大概是 70 年代写的——

[原文] [Sherwin]: um they said that software engineering might end up moving into a world where that software engineers are like surgeons or like in a surgery room there's like one person doing the work um and you know there's one person like cutting or whatever and like doing all the surgery and everyone else in the room is there to just support them right as like the nurse and like the and the resident and the fellow

[译文] [Sherwin]: 他们说软件工程最终可能会进入这样一个世界:软件工程师就像外科医生。就像在手术室里,有一个人在做工作,你知道,有一个人在动刀或者做所有的手术操作,而房间里的其他所有人都在那里支持他,对吧?比如护士、住院医师和专科进修医师(Fellow)。

[原文] [Sherwin]: and then the surgeon's like I need a scalpel and they give them scalpel and then uh they're like I need you know this tool and this machine and they'll bring it over everyone's there to just like you know support the one uh surgeon and so The Mythical Man-Month actually predicted that that is kind of the direction that software engineers going to go

[译文] [Sherwin]: 然后外科医生说“我需要一把手术刀”,他们就递给他手术刀;然后他说“我需要这个工具和这台机器”,他们就会拿过来。每个人都在那里,只是为了支持那一位外科医生。所以《人月神话》实际上预测了那是软件工程师将会发展的方向。

[原文] [Sherwin]: i don't think that's exactly played out where like you know it's much more collaborative and like it's not only one person doing the work but I've always really liked that analogy and and and and uh that analogy is actually what I strive to uh uh kind of like emulate in my own management philosophy which is um software engineering isn't really like surgery where it's not just one person doing work but the way in which I like treating the people on my team and the way that I act as a manager is I want to uh empower them make them feel like they're a surgeon

[译文] [Sherwin]: 我认为现实并没有完全那样发展,你知道现在工作更具协作性,不仅仅是一个人在做工作。但我一直非常喜欢那个类比,而且那个类比实际上是我在自己的管理哲学中努力去模仿的。虽然软件工程并不真的像手术那样只有一个人在工作,但我对待团队成员的方式,以及我作为管理者的行事方式,是我想赋能他们,让他们感觉自己像个外科医生。

[原文] [Sherwin]: um and in in so far at like as like making sure that I'm supporting them and making sure they have everything that they need to to do their work and it feels like they have an army of people kind of supporting them um and looking around corners and giving them everything that they need when it's really just me as the as the manager

[译文] [Sherwin]: 在某种程度上,我要确保我在支持他们,确保他们拥有工作所需的一切,让他们感觉有一支军队在支持他们,在帮他们“多看一步”(looking around corners),给他们一切所需,哪怕实际上只有我这个经理在做这些。

[原文] [Sherwin]: and so like the example that I give is is looking around corners and unblocking people especially from an organizational perspective is extremely extremely useful and again going back to the AI conversations even more important nowadays right like uh if if people are just like cranking PR after PR the main thing bottlenecking uh progress and and you know shipping something tends to be organizational or like process-oriented

[译文] [Sherwin]: 我举的例子是“多看一步”和为人们消除障碍,特别是从组织角度来看,这是极其、极其有用的。再回到 AI 的话题,这在今天甚至更加重要,对吧?如果人们只是在不断地提交一个又一个 PR,阻碍进度和发布的主要因素往往是组织层面的,或者是流程导向的。

[原文] [Sherwin]: and if you as a manager can kind of look around corners and kind of unblock the team if you can you know like if if the surgeon needs scalpel but you know the manager kind of already has a scalpel ready for them that that's the best case scenario that's kind of the the way that I approach uh u um management and and especially uh engineering management and so that's something that that's really really um stuck with me over time and even though you know software engineers aren't exactly surgeons that metaphor has always kind of stayed in my mind as of as of uh uh for the rest of my career

[译文] [Sherwin]: 如果你作为管理者能够“多看一步”,为团队消除障碍,比如外科医生需要手术刀时,管理者已经把手术刀准备好了,那就是最好的情况。这就是我处理管理,特别是工程管理的方式。这一点随着时间的推移一直深深印在我的脑海里,即使软件工程师不完全是外科医生,但那个隐喻在我职业生涯的余下时间里一直留在我的脑海中。

[原文] [Lenny]: i love that and I I feel like I wonder if that's something AI can help with is look around corners and predict here this engineer is going to be blocked by this decision we need to figure this out we need to get

[译文] [Lenny]: 我喜欢这个观点。我在想,这是不是 AI 可以帮忙的事情?去“多看一步”并预测:这里,这个工程师将会被这个决策阻碍,我们需要解决这个问题,我们需要……

[原文] [Sherwin]: Yeah that's actually a really good uh point i haven't tried this yet but I wonder what would happen if I ask uh ChatGPT hooked up to company knowledge you know like what are the active blockers uh look through all the Notion docs what are maybe Slack messages you know it's probably in Slack somewhere what are the active blockers on my team and is there something I can do to to help

[译文] [Sherwin]: 是的,这实际上是一个非常好的点子。我还没试过,但我很好奇,如果我去问连接了公司知识库的 ChatGPT,让它查阅所有的 Notion 文档,或者 Slack 消息(问题可能就藏在 Slack 的某个地方),问它“我团队目前的活跃阻碍是什么?有什么我可以帮忙的吗?”,会发生什么。

[原文] [Sherwin]: um now very I have not thought about that but you're right you just had an insight right here yeah yeah yeah uh and it's I think even more interestingly what do you anticipate will be a blocker for this engineer or this team in the in the coming months or Yeah you asked the you asked the model well you asked the AI to do the second and third order things anticipate that man anticipate what the blockers will be next month too uh I think we've got a we've got a good idea right here yeah yeah

[译文] [Sherwin]: 嗯,我还没想到过这一点,但你是对的,你刚刚就在这儿产生了一个洞察。是的,是的,是的。而且我认为更有趣的是:你预计在接下来的几个月里,这个工程师或这个团队会遇到什么阻碍?是的,你问模型……你要让 AI 做二阶和三阶的事情,预测它,伙计,预测下个月的阻碍是什么。我觉得我们在这儿找到了个好主意。是的,是的。


章节 8:企业落地指南:为何“自下而上”才是正解

📝 本节摘要

针对许多企业 AI 部署出现“负 ROI(投资回报率)”的现象,Sherwin 指出核心症结在于盲目的“自上而下”(Top-down)行政命令。他认为,真正的成功来自于“自下而上”(Bottom-up)的探索——让最了解一线业务流程的员工去挖掘 AI 的潜力。他建议企业组建一支全职的“特种部队”(Tiger Team),成员不一定是昂贵的软件工程师,而是那些“技术邻近型”人才(如擅长 Excel 的运营主管),由他们作为内部布道者,发现场景并推广最佳实践。

[原文] [Lenny]: okay I'm going to shift to talking about the API and the platform that you all build some So you work with a lot of companies implementing your API your platform building on on your on your tools you told me that you find that a lot of companies actually have negative ROI on their AI deployments which uh I think is what a lot of people read about and feel and think and it's interesting you're actually seeing that what what's going on there what are they doing wrong what do you what what's happening in the world of AI and deployments in ROI

[译文] [Lenny]: 好的,我要转而谈谈你们构建的 API 和平台。你和很多正在实施你们 API、在你们平台上构建工具的公司合作。你告诉我,你发现很多公司的 AI 部署实际上是负 ROI(投资回报率),我觉得这也是很多人读到、感觉到和认为的情况。有趣的是你确实看到了这一点。那里发生了什么?他们做错了什么?在 AI 部署和 ROI 的世界里到底发生了什么?

[原文] [Sherwin]: yeah so so to be clear I I I don't like explicitly see quantitative numbers around this uh you know uh it's actually really hard to measure these things but especially from observing some companies kind of trying to do AI I would not be surprised if a lot of AI deployments are actually you know negative ROI

[译文] [Sherwin]: 是的,所以澄清一下,我并没有明确看到这方面的量化数据。你知道,这实际上很难衡量。但特别通过观察一些尝试做 AI 的公司,如果很多 AI 部署实际上是负 ROI,我也不会感到惊讶。

[原文] [Sherwin]: i mean part of this too is I think there's also general sentiment um from uh folks uh around the country u like basically outside of tech that AI is being forced onto them um and I think part of this is is is uh uh uh probably a symptom of some negative ROI uh AI deployments

[译文] [Sherwin]: 我的意思是,这部分原因也是我认为全国各地的人们——基本上是科技圈以外的人——有一种普遍的情绪,觉得 AI 是被强加给他们的。我认为这部分可能就是一些负 ROI 的 AI 部署所表现出的症状。

[原文] [Sherwin]: one thing is and I think I I come back to this again and again like I think we in Silicon Valley just forget that we live in a bubble like we are so like Twitter is a bubble sorry X is a bubble um Silicon Valley is a bubble software engineering is a bubble most people uh in the world most people in the US are not software engineers are not very AI-pilled um are not following every single model release

[译文] [Sherwin]: 有一件事——我想我一次又一次地回到这一点——就是我认为我们在硅谷的人忘记了我们生活在一个泡沫里。Twitter 是个泡沫——抱歉,是 X 是个泡沫——硅谷是个泡沫,软件工程是个泡沫。世界上大多数人,美国大多数人,不是软件工程师,没有特别深陷 AI 之中(AI pilled),也没有关注每一个模型的发布。

[原文] [Sherwin]: and so we're just like highly out of the loop on how to use this technology... when I talk to some of these companies and I and I talk to the the actual employees using these it's like the most basic thing that they're trying to do and they like have very little understanding of exactly how this technology works and so that that's that's kind of like one big observation for me which is like they're asking very simple questions of these things they're really not not pushing it just yet

[译文] [Sherwin]: 所以我们在如何使用这项技术上其实是非常脱节的……当我与其中一些公司交谈,与实际使用这些技术的员工交谈时,他们尝试做的都是最基本的事情,而且他们对这项技术究竟如何运作知之甚少。所以这是我的一个大观察,就是他们问的都是非常简单的问题,他们真的还没开始深入挖掘它。

[原文] [Sherwin]: the companies where I think it's it started to work really well have a combination of both top down buy-in so it's like the C-suite it's like you know we're we're we want to become an AI AI first company so there's buy-in they buy the tools they have you know exec support but it also has bottoms up adoption and buy-in

[译文] [Sherwin]: 我认为那些开始运作得非常好的公司,结合了自上而下的认同(Top-down buy-in)——就像 C-Level 高管说“我们要成为一家 AI 优先的公司”,所以有认同,他们购买工具,有高管支持——但同时也有自下而上的采用和认同(Bottoms-up adoption)。

[原文] [Sherwin]: and so what I mean by that is it has like actual employees doing the work who are really excited about the technology and are willing to learn evangelize build best practices and kind of like knowledge share within the organization

[译文] [Sherwin]: 我这话的意思是,要有实际做工作的员工对这项技术感到非常兴奋,并且愿意去学习、去布道、建立最佳实践,并在组织内部进行知识分享。

[原文] [Sherwin]: we've we've seen this a lot internally so like obviously OpenAI has always wanted to be uh a very AI-centric company but where when it really started taking off was when was with the introduction of Codex and these tools where like people like actual employees themselves could start applying it to their work

[译文] [Sherwin]: 我们在内部经常看到这种情况。显然 OpenAI 一直想成为一家非常以 AI 为中心的公司,但真正的腾飞是在引入 Codex 和这些工具之后,也就是像实际员工自己可以开始将其应用到他们的工作中时。

[原文] [Sherwin]: uh and I think you really need this because at the end of the day everyone's work is like very different it's like very unique uh software engineering is different than finance is different than operations different than go to market and sales uh and so there's like a lot of these like last mile intricacies of work that needs to really be done in a bottoms up fashion

[译文] [Sherwin]: 我认为你真的需要这样,因为归根结底,每个人的工作都非常不同,非常独特。软件工程不同于财务,不同于运营,也不同于市场和销售。所以有很多这种工作的“最后一公里”的错综复杂之处,真的需要以自下而上的方式来完成。

[原文] [Sherwin]: and so my sense is a lot of these these AI deployments don't have like don't have bottoms up adoption like it was like an exec mandate and it's extremely top down and is very divorced from what the actual work looks like and as an end result you end up with a giant workforce that doesn't really understand the technology is like I know I'm supposed to use this and maybe it's like on my performance review too but um I'm not sure what to do and they look around no one else is doing it there's no one else to learn from

[译文] [Sherwin]: 所以我的感觉是,很多这些 AI 部署并没有自下而上的采用,就像是一个行政命令,极其自上而下,并且与实际工作严重脱节。最终结果是,你拥有庞大的员工队伍,他们并不真正理解这项技术,就像是“我知道我应该用这个,也许这还关系到我的绩效考核,但我不知道该做什么”。他们环顾四周,没有其他人在做,也没有人可以学习。

[原文] [Sherwin]: uh and so my my you know my recommendation for companies kind of pushing this is is find or maybe even staff a full-time team internally that is this kind of tiger team internally that can um explore the full extent of the capabilities apply to specific workflows do the knowledge sharing uh create excitement uh within folks uh who might want to use this technology

[译文] [Sherwin]: 所以,我对推动这件事的公司的建议是,找到甚至可能在内部组建一个全职团队,这就是一种内部的“特种部队”(Tiger Team)。他们可以探索能力的全部范围,将其应用到特定的工作流中,进行知识分享,在那些可能想使用这项技术的人群中创造兴奋感。

[原文] [Lenny]: and who who would you put on this tiger team is it like engineer-led do you find in your experience is it cross functional sort of team

[译文] [Lenny]: 那么你会把谁放在这个特种部队里?是工程师主导的吗?根据你的经验,这是一个跨职能的团队吗?

[原文] [Sherwin]: yeah it's it's interesting so um also a lot of companies don't have software engineers uh and so the the pattern I've seen is it tends to be these like software engineering adjacent like basically technical people but are not software engineers i think those are the ones who get tend to get most excited uh around this

[译文] [Sherwin]: 是的,这很有趣。很多公司其实没有软件工程师。所以我看到的模式是,这往往是那些“软件工程邻近”(Software engineering adjacent)的人,基本上是技术型人才,但不是软件工程师。我认为这些人往往是对这件事最感到兴奋的。

[原文] [Sherwin]: it's like maybe the like you know support team operations lead who doesn't code but loves using these tools and you know is like an Excel wizard or something and so it's like technical adjacent or like coding adjacent and like you know pretty technical those are the times like those are the kinds of people I've seen in these companies who just like really light up and get excited around this

[译文] [Sherwin]: 比如可能是客服团队的运营主管,他不写代码,但喜欢使用这些工具,你知道就像是个 Excel 巫师(Excel Wizard)之类的人。所以是技术邻近或编码邻近的,并且相当懂技术。这些就是我在这些公司里看到的真正眼前一亮并为此感到兴奋的人。

[原文] [Lenny]: and the advice is find the people that are most excited and instead of kind of having them spread out through the organization you're what you find works is create a little AI kind of evangelist team that finds ways to use it and kind of spreads it across the work

[译文] [Lenny]: 所以建议是找到那些最兴奋的人,与其让他们分散在组织各处,你发现有效的方法是建立一个小型的 AI 布道团队,找到使用它的方法,并将其传播到工作中去。

[原文] [Sherwin]: yeah i mean another it's kind of like hearing you you play back to me another way to think about it kind of tying back to my own management philosophies is find the high performers in AI adoption and empower them you know let them build hackathons let them you know hold seminars do knowledge sharing kind of create the seeds of uh of excitement internally

[译文] [Sherwin]: 是的,听你复述,另一种思考方式——结合回我自己的管理哲学——就是找到 AI 采纳方面的“高绩效者”并赋能他们。让他们举办黑客马拉松,让他们举办研讨会,做知识分享,在内部播下兴奋的种子。


章节 9:反直觉洞察:模型终将“吃掉”脚手架

📝 本节摘要

在本章中,Sherwin 提出了一个极具挑战性的观点:在 AI 领域,盲目听从客户反馈可能会误导产品方向。这是因为技术迭代速度远超客户认知,客户往往受限于当前的“局部最优解”(Local Maximum)。他引用了 Nicholas (FinTool 创始人) 的名言:“模型会把你的脚手架当早餐吃掉”。许多曾被视为必备的复杂工程结构(如复杂的向量数据库、早期的 Agent 框架),随着模型原生能力的提升(如长上下文、内置搜索),正逐渐变得多余。这印证了 AI 界的“苦涩教训”(The Bitter Lesson):不要试图用复杂的人工逻辑去对抗算力的扩张。他的核心建议是:与其针对当下的模型能力开发,不如针对模型未来的演进方向“提前”构建。

[原文] [Lenny]: okay amazing there's a couple hot takes I want to hear uh from you something that I've seen you talk about and share one is um you've shared that talking to customers and listening to customers is not always the right strategy in AI and it might often lead you astray

[译文] [Lenny]: 好的,太棒了。我想听听你的几个“暴论”(Hot takes),我看到你讨论和分享过一些。其中一个是,你曾分享说,在 AI 领域,与客户交谈和倾听客户并不总是正确的策略,这往往可能会把你引入歧途。

[原文] [Sherwin]: i don't know if it's that hot of a take i think the main thing here is so obviously you should talk to your customers like it's it's like useful to talk to customers i just think the AI field um especially what I've seen over the last kind of like three years um uh working on the API and and seeing kind of all that evolve is the field and the models themselves are just changing so so quickly they tend to like disrupt themselves especially around the like tooling and the scaffolding space

[译文] [Sherwin]: 我不知道这算不算非常激进的观点。我认为这里的重点是——显然你应该和客户交谈,和客户交谈是有用的。我只是觉得,在 AI 领域,特别是我过去三年在 API 团队工作并目睹这一切演变的过程中看到的,这个领域和模型本身的变化速度太快了,它们倾向于自我颠覆,特别是在工具和“脚手架”(Scaffolding)领域。

[原文] [Sherwin]: so uh there there's this quote that I read actually earlier this week from a it's from an X article uh by this guy named Nicholas who's who's the founder of a a startup called FinTool uh where uh I think he was he was sharing a lot of the best practices that he has learned through building AI agents for financial services I think at a at a startup FinTool um and he had this phrase that I thought was really good which is uh the models will eat your scaffolding for breakfast

[译文] [Sherwin]: 实际上我这周早些时候读到一句话,来自一篇 X(推特)上的文章,作者叫 Nicholas,他是 FinTool 这家初创公司的创始人。他在分享为金融服务构建 AI Agents 的最佳实践时,说了这句我觉得非常棒的话:“模型会把你的脚手架当早餐吃掉。”(The models will eat your scaffolding for breakfast.)

[原文] [Sherwin]: like if you look if you rewind back to 2022 right when ChatGPT launched um these models were pretty raw and there was like all this product scaffolding and and things especially in the developer space to basically try and steer the model and build a scaffolding around it to get it to do what you want like agent frameworks there's like like vector stores I think was like really popular back then uh and just like a whole smattering of tools here

[译文] [Sherwin]: 如果你回溯到 2022 年 ChatGPT 刚发布的时候,那些模型还很原始。当时有各种各样的产品脚手架,特别是在开发者领域,基本上都是为了试图引导模型,在它周围搭建脚手架让它做你想做的事。比如 Agent 框架,还有当时非常流行的向量存储(Vector Stores),以及一大堆这类工具。

[原文] [Sherwin]: and as you've kind of seen the field play out that the models have just changed so much uh that uh and and gotten so much better that they ended up yeah literally eating some of some of the scaffolding

[译文] [Sherwin]: 正如你所看到的领域发展,模型变化如此之大,变得如此之好,以至于它们最终确实“吃掉”了部分的脚手架。

[原文] [Sherwin]: um and I think this is even true today so I think the the article from Nicholas um actually you know the the current scaffolding which is uh fashionable is skills files based context management i could see a world where at some point you know that's no longer useful uh where the model can actually you know manage all that themselves or like you know uh uh or or or there might be you know it's hard to predict but like might move on to some new paradigm where you know you don't need this file based like skills skills type thing

[译文] [Sherwin]: 我认为即便在今天这也是事实。Nicholas 的文章提到,目前流行的脚手架是基于文件(Files-based)的上下文管理,也就是“技能”(Skills)。我可以预见在某个时刻,这也不再有用了,模型实际上可以自己管理所有这些,或者我们可能会进入某种新的范式,不再需要这种基于文件的所谓“技能”了。

[原文] [Sherwin]: you have literally seen this play out right like the agent frameworks I think are a little less useful now um there was a period of time like 2023 where we thought vector stores and is is going to be like the main way for you to you know bring organizational context into the models and you need to you know uh vector and embed every bit of your corpuses and then you need to do all this work to like figure out the vector search to like optimize that to fill out the right information the right time all of that is scaffolding because the model you know was not good enough

[译文] [Sherwin]: 你真的看到这一切发生了,对吧?比如 Agent 框架,我觉得现在没那么有用了。在 2023 年有一段时间,我们以为向量存储将是你把组织上下文带入模型的主要方式。你需要把语料库的每一部分都向量化并嵌入(Embed),然后你需要做大量工作来搞定向量搜索,优化它以便在正确的时间填充正确的信息。所有这些都是脚手架,因为那时的模型还不够好。

[原文] [Sherwin]: and turns out you know in this case it turns out as the models get better a better approach is actually to take out a lot of that logic and trust the model and give it a set of tools for search it doesn't need to be a vector store you could actually just hook it up to any type of search it could literally be files on a file system like skills uh and AGENTS.md uh to kind of steer it uh as well

[译文] [Sherwin]: 结果证明,随着模型变得更好,更好的方法实际上是拿掉很多那样的逻辑,信任模型,给它一套搜索工具。它不需要是向量存储,你实际上可以把它连接到任何类型的搜索,甚至可以是文件系统上的文件,比如 Skills 和 agents.md,以此来引导它。
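
The "any type of search" idea above can be sketched in a few lines. This is a minimal illustration, not how OpenAI or Codex implements search: a plain substring scan over Markdown files that a model could call as a tool in place of a vector store. The function name `search_files`, the `SEARCH_TOOL` description, and the choice of `*.md` files are all assumptions made for the example.

```python
from pathlib import Path


def search_files(root: str, query: str, max_hits: int = 5) -> list[dict]:
    """Case-insensitive substring search over Markdown files under `root`.

    Returns up to `max_hits` matches with file name, line number, and text.
    """
    needle = query.lower()
    hits: list[dict] = []
    # sorted() keeps the result order deterministic across platforms.
    for path in sorted(Path(root).rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for line_no, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append({"file": path.name, "line": line_no, "text": line.strip()})
                if len(hits) >= max_hits:
                    return hits
    return hits


# Hypothetical tool description a model would be shown. The field names follow
# common JSON-Schema-style conventions and are not a specific vendor's format.
SEARCH_TOOL = {
    "name": "search_files",
    "description": "Find lines in local Markdown notes that contain a query string.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}
```

The design point is the one Sherwin makes: there is no embedding step and no index to tune. The retrieval logic stays trivial and the model decides what to search for and when.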

[原文] [Sherwin]: obviously there's still a place for vector stores i know a lot of companies are still using it but the that the entire scaffolding around that and building an entire ecosystem around that and assuming that's the only scaffolding that you need has has really changed

[译文] [Sherwin]: 显然向量存储仍有一席之地,我知道很多公司还在用,但围绕它构建整个脚手架和生态系统,并假设那是你唯一需要的脚手架,这种观念真的已经变了。

[原文] [Sherwin]: and so tying this back to like you know uh it's you know you don't always have to listen to your customers because the field is changing so much at any point in time you know a lot of people are kind of in this local local maximum and if you just blindly listen to your customers they'll be like yeah I want a better vector store like I want a better uh I want a better you know agent framework for this

[译文] [Sherwin]: 所以回到这个话题,你不必总是听从客户,因为这个领域无时无刻不在剧变。很多人处于一种“局部极大值”(Local Maximum)的状态。如果你只是盲目地听从客户,他们会说:“是的,我想要一个更好的向量存储”,“我想要一个更好的 Agent 框架”。

[原文] [Sherwin]: and uh if you had just kind of only chased down that path it actually would have led you to you know build something that again is the local maxima whereas as the models get better we've had to reinvent and kind of rethink the right right uh uh abstractions and the right tools and frameworks to to to to build around these models

[译文] [Sherwin]: 如果你只是沿着那条路追下去,实际上会导致你构建出某种同样局限于“局部极大值”的东西。而随着模型变好,我们不得不重新发明,重新思考围绕这些模型构建的正确抽象、正确工具和框架。

[原文] [Lenny]: it's interesting how this is um the bitter lesson is uh you know this big lesson that AI and ML folks learned which is just like uh don't the less you over complicate the less logic you add to to machine learning to AI the more it'll be able to scale and grow and just like take it all away and let it just just compute basically just give it more power to to get smarter on its own

[译文] [Lenny]: 这很有趣,这就像是“苦涩的教训”(The Bitter Lesson),那是 AI 和 ML 从业者学到的一个大教训:不要把事情搞得太复杂,你给机器学习和 AI 添加的人工逻辑越少,它就越能扩展和增长。就像把那些都拿掉,让它去计算,基本上就是给它更多算力让它自己变聪明。

[原文] [Sherwin]: yeah there's literally a version of the bitter lesson applied to like building with AI where you know we were trying to architect all this stuff around and turns out the models are just kind of you know eat it all away and and and and honestly like OpenAI API team has like been guilty of this uh where we kind of like took some you know left and right turns when we shouldn't have um but uh yeah the models still end up models get better and uh we're all learning the bitter lesson day in and day out

[译文] [Sherwin]: 是的,这简直就是“苦涩的教训”在 AI 应用开发上的翻版。我们试图在周围架构所有这些东西,结果证明模型只是把它们都吃掉了。老实说,OpenAI API 团队也犯过这种错,我们在不该转弯的地方左转右转了。但是的,模型最终会变得更好,我们日复一日地都在学习这个苦涩的教训。

[原文] [Lenny]: so what would be the the key takeaway for folks building on say the API or just building agents and you know having to build a little bit of this around for now is it just yeah what would be the advice

[译文] [Lenny]: 那么对于那些基于 API 构建应用,或者构建 Agents 的人来说,关键的建议是什么?因为他们现在不得不构建一些这种东西。你的建议是什么?

[原文] [Sherwin]: my general advice and I've been giving this to people for a while and I think still true today is make sure you're building for where the models are going and not where they are today

[译文] [Sherwin]: 我的一般建议——这建议我已经给别人一段时间了,而且我认为今天仍然适用——是:确保你是为了模型将要去的地方构建,而不是为了它们今天所在的地方构建。

[原文] [Sherwin]: um uh you know the the it's it's clearly a moving target and I think a lot of the companies that I've seen startups that I've seen really really do well is they build a product for an ideal like type of capability that is like maybe 80% of the way there today and it like they end up you know having a product that like kind of works but it's like just almost there but then as the models get better you know suddenly it might click and then their product now is incredible because it works you like uh uh like maybe with like o3 at some point it suddenly works with 5.1 5.2 suddenly it unlocks it

[译文] [Sherwin]: 这是一个移动的目标。我看过很多做得非常好的初创公司,他们是为一种理想的能力构建产品,这种能力今天可能只实现了 80%。他们最终拥有的产品可能勉强能用,或者就差那么一点点。但随着模型变好,突然之间它就“通”了(Click),然后他们的产品就变得不可思议了。也许在 o3 模型下不行,但到了 5.1、5.2 版本,突然就解锁了。

[原文] [Sherwin]: but they're building these products with the like the model capability improvements in mind and with that you end up creating an exper experience that's way better than if you had assumed that it's it's static in the first place

[译文] [Sherwin]: 他们构建产品时心里想着模型能力的提升。这样你最终创造出的体验,要比你一开始假设模型是静态的情况下好得多。


章节 10:技术前瞻:长时程 Agent 与多模态语音交互

📝 本节摘要

在展望未来 12 到 18 个月的发展时,Sherwin 预测模型将迎来两大关键突破:一是任务执行时长的飞跃,从目前擅长 10 分钟的交互式任务,进化到能独立执行 6 小时甚至更久的长时程任务(Long-horizon tasks);二是 原生多模态(Native Multimodality) 能力的爆发,特别是语音(Audio)。他认为语音在商业和企业场景中被严重低估,未来“语音到语音”的模型将彻底改变业务流程的自动化方式。

[原文] [Lenny]: so to follow that thread where are like in the next six to 12 months where is the API heading where's the platform heading where are the models heading as much as you can share I know there's a lot of secrets here that maybe you're most excited about or do you think that people should start to prepare for and however much you can share

[译文] [Lenny]: 顺着这个思路,在接下来的 6 到 12 个月里,API 会向哪里发展?平台会向哪里发展?模型会向哪里发展?在你能分享的范围内——我知道这里有很多秘密——也许是你最兴奋的,或者你认为人们应该开始准备的事情是什么?

[原文] [Sherwin]: i mean so the obvious one is um how long of a task uh these models can do coherently um so there's like the the METR benchmark that that I think tracks software engineering tasks and how long you know like how long of a task can these models do uh 50% of the time 80% of the time

[译文] [Sherwin]: 我觉得最明显的一个是,这些模型能连贯地执行多长时间的任务。有一个 METR 的基准测试(Benchmark)追踪软件工程任务,看这些模型在 50% 或 80% 的情况下能执行多长时间的任务。

[原文] [Sherwin]: uh I think we're at something like multi-hour tasks being able to be done by uh software engineering tasks being able to be done by um uh these frontier models uh 50% of the time and then I think 80% is something like just under an hour but the the the sobering thing about that that chart is they plot all the uh previous models on this chart as well so you can really see the trend of this

[译文] [Sherwin]: 我想我们现在的情况大概是,这些前沿模型在 50% 的情况下能完成长达数小时的软件工程任务;而在 80% 的情况下,大概是不到一小时。但那张图表最令人清醒(Sobering)的地方在于,他们把所有之前的模型也都画在了图上,所以你可以真正看到这个趋势。
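
One way to reason about the trend on that chart is to treat the task horizon as doubling at a fixed interval. That exponential shape is an assumption for illustration, and the transcript gives no growth rate, so `doubling_months` below is a hypothetical input rather than a measured value. The sketch only shows the arithmetic: horizon after t months = current horizon × 2^(t / doubling period).

```python
import math


def projected_horizon_hours(current_hours: float, months_ahead: float,
                            doubling_months: float) -> float:
    """Project a task horizon forward, assuming it doubles every `doubling_months`."""
    if current_hours <= 0 or doubling_months <= 0:
        raise ValueError("current_hours and doubling_months must be positive")
    return current_hours * 2 ** (months_ahead / doubling_months)


def months_to_reach(current_hours: float, target_hours: float,
                    doubling_months: float) -> float:
    """Months until the horizon reaches `target_hours` under the same assumption."""
    if current_hours <= 0 or target_hours <= 0 or doubling_months <= 0:
        raise ValueError("all arguments must be positive")
    return doubling_months * math.log2(target_hours / current_hours)
```

Under these assumptions, a shorter doubling period pulls a target such as a six-hour task much closer, which is why the slope of the chart matters more than any single point on it.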

[原文] [Sherwin]: that's something that I'm really excited about which is you know I actually think products today really optimize for tasks that the model can do for like minutes at a time like even Codex and like the coding tools I'd say like you know it's in the CLI you're kind of like seeing it be interactive it's really you know quite optimized well for like maybe at most 10-minute types

[译文] [Sherwin]: 这就是我真正兴奋的地方。实际上我认为今天的产品真正优化的是模型一次只能做几分钟的任务。即使是 Codex 和那些编程工具,我说就像是在命令行界面(CLI)里,你看着它交互,它真的只针对大概最多 10 分钟类型的任务进行了很好的优化。

[原文] [Sherwin]: i have seen people push Codex to the limit and do like multi-hour long uh tasks uh but again I I think that that's more of the exception but I uh if you follow this trend like I think like in the next 12 to 18 months we could see models that could do multi-hour long tasks very very coherently

[译文] [Sherwin]: 我见过有人把 Codex 推向极限,做长达数小时的任务,但这更多是例外。但如果你跟随这个趋势,我认为在接下来的 12 到 18 个月里,我们可以看到模型能非常非常连贯地执行长达数小时的任务。

[原文] [Sherwin]: at some point it might reach like you know 6 hours a day-long task where you kind of like dispatch it and have it do you know do things on uh on its own for a while the types of products you build around that will look very different you want to give the model feedback you obviously don't want it to completely run wild for a day maybe you do but but you probably don't um and and then the the universe of things you can have the model do really expand so that's something that I'm really um really excited about seeing

[译文] [Sherwin]: 在某个时刻,它可能会达到像 6 小时甚至一整天的任务时长,你把它派发出去,让它自己做一会儿事情。围绕这种情况构建的产品类型将会看起来非常不同。你会想要给模型反馈,你显然不想让它完全失控跑一整天——也许你想,但可能不想。那时你可以让模型做的事情的范围将真正扩大。这是我非常期待看到的。

[原文] [Sherwin]: another uh thing over the next 12 to 18 months I think be really cool is improvements in our in the multimodal models so uh and and actually by by multimodality I'm mostly thinking about audio here where uh the models are pretty good at audio i think they're going to get a lot better um at audio over the next 6 to 12 months especially the like you know the um native multimodal models the speech to speech ones

[译文] [Sherwin]: 未来 12 到 18 个月另一件我觉得会非常酷的事情,是我们多模态模型的改进。实际上说到多模态,我这里主要想的是语音(Audio)。模型在语音方面已经相当不错了,我认为在接下来的 6 到 12 个月里,它们在语音方面会变得好得多,特别是那些原生多模态模型,即“语音到语音”(Speech-to-speech)的模型。

[原文] [Sherwin]: I think there's also interesting work uh being done around um new types of models and architectures on the multimodal audio side uh as well but uh audio especially in the enterprise and in a business setting I think is a hugely underrated uh domain still like everyone talks about coding it's all text uh but uh we're talking in audio A lot of the world's business is done via audio uh a lot of services and operations are done via uh talking and audio and so uh I think that that area is going to look very exciting in the next 12 to 18 months and I think there will be uh even more unlock for uh what we can do uh with with audio models uh there as well

[译文] [Sherwin]: 我认为在多模态语音方面,围绕新类型的模型和架构也有一些有趣的工作正在进行。但我认为,语音,特别是在企业和商业环境中,仍然是一个被严重低估的领域。每个人都在谈论编程,那都是文本。但我们正在用语音交谈。世界上很多生意是通过语音完成的,很多服务和运营是通过交谈和语音完成的。所以我认为那个领域在接下来的 12 到 18 个月里会看起来非常令人兴奋,而且我认为那里会有更多关于我们能用语音模型做什么的解锁。

[原文] [Lenny]: amazing so quick summary uh expect agents and uh AI tools to run longer to that that trajectory to continue to increase and then audio and speech becoming a bigger deal more first party and and native and better and core to the experience extremely cool

[译文] [Lenny]: 太棒了。快速总结一下:预期 Agents 和 AI 工具运行时间更长,那个轨迹会继续增长;然后音频和语音会成为一件更大的事,更加第一方(First-party)、原生、更好,并且成为体验的核心。非常酷。


章节 11:被忽视的蓝海:业务流程自动化 (BPA)

📝 本节摘要

在本章中,Sherwin 指出了硅谷视野之外的巨大机遇:业务流程自动化(Business Process Automation)。他区分了软件工程(开放式、非重复的知识工作)与绝大多数非科技行业的工作(基于标准作业程序 SOP 的重复性操作)。无论是客服热线还是公用事业公司的流程,都具有“高确定性”和“高度可重复性”,这正是 AI 代理大显身手的地方。尽管这在推特(X)上讨论热度不高,但其实际市场规模极其庞大,远比讨论热度所显示的要大;至于它是否超过软件工程本身的自动化机会,Sherwin 表示自己并不确定。

[原文] [Lenny]: okay I want to go back to one of your hot takes another hot take that I've seen you discuss you're big uh you're very bullish on business process automation as an opportunity in the world of AI talk about that

[译文] [Lenny]: 好的,我想回过头来谈谈你的另一个“暴论”,我看到你讨论过。你非常看好“业务流程自动化”(Business Process Automation)作为 AI 世界的一个机会。谈谈那个吧。

[原文] [Sherwin]: yeah this go this goes back to the thing that I said previously which is um we we we live in a bubble in Silicon Valley and um a lot of the work that we do that we're used to software engineering you know product management building products uh is very differently shaped than the work that goes on um that runs our entire economy and I see this in and out when I talk to customers

[译文] [Sherwin]: 是的,这回到了我之前说过的一点,那就是我们生活在硅谷的泡沫里。我们习惯做的很多工作——软件工程、产品管理、构建产品——其形态与那些维持我们整个经济运行的工作非常不同。当我与客户交谈时,我反反复复地看到这一点。

[原文] [Sherwin]: uh if you if you talk to any like you know company that's not based in it's not a tech company um there's a lot of business processes and so what what I mean by this is is you know I generally delineate it as you know there's like uh like software engineering is kind of like open-ended knowledge work right

[译文] [Sherwin]: 如果你和任何非科技类的公司交谈,你会发现那里有大量的业务流程。我的意思是,通常我将其划分为两类:软件工程有点像是一种“开放式的知识工作”(Open-ended knowledge work),对吧?

[原文] [Sherwin]: it's and this is why I think tools like Codex tend to be quite quite good because it's exploring and and you're giving it these like open-ended things but software engineering is fundamentally like pretty open-ended uh and it's not very repeatable right so like you build a feature you're not trying to build the exact same feature over and over again

[译文] [Sherwin]: 这也是为什么我认为像 Codex 这样的工具往往相当不错,因为它是在探索,你给它的是这些开放式的东西。但软件工程本质上是相当开放的,而且不是非常可重复的,对吧?就像你构建一个功能,你不会试图一遍又一遍地构建完全相同的功能。

[原文] [Sherwin]: and a lot of like tech jobs are in the space I think like data science is kind of in the space as well even some of the like strategic finance stuff but as you move further and further away from software engineering and like what what is core in tech a lot of jobs are just business processes they're like repeatable things uh repeatable operations um that you know some manager at a company has kind of like iterated on

[译文] [Sherwin]: 我认为很多科技工作都属于这一类,像数据科学也在这个领域,甚至一些战略财务工作也是。但当你离软件工程和科技核心越来越远时,很多工作仅仅就是业务流程。它们是可重复的事情,可重复的操作,是公司里的某个经理迭代出来的。

[原文] [Sherwin]: um there's usually a standard operating procedure that people want to do uh and you don't want to deviate from it that much you know this like in software engineering the ingenuity is isn't isn't deviating but a lot of a lot of the the the work being done in the world is actually just um running through these procedures and operations

[译文] [Sherwin]: 通常会有一个人们想要执行的标准作业程序(SOP),而且你不想偏离它太多。这不像软件工程,软件工程的独创性恰恰在于偏离常规。但世界上大量的工作,实际上只是在按部就班地执行这些程序和操作。

[原文] [Sherwin]: like if I you know if I call um a support line they're running through one of these if I call my utility company there's a bunch of processes and things that they can and cannot do um for me

[译文] [Sherwin]: 比如如果我打一个客服热线,他们就在执行其中一个流程;如果我打给我的公用事业公司,那里有一堆流程,规定了他们能为我做什么,不能做什么。

[原文] [Sherwin]: uh and so I'm I'm just extremely bullish on this general category of like and and and I think it's underrated because it's so different from what we think about in Silicon Valley people tend to not think about it

[译文] [Sherwin]: 所以我只是极度看好这个大类。我认为它被低估了,因为它与我们在硅谷所想的太不同了,人们往往不会去想它。

[原文] [Sherwin]: but how can we apply um AI uh and and some of the tools and frameworks that we have towards this business process automation towards automated automating and making easier um repeatable business processes with high determinism um that is fully integrated with business uh data and business decisions and and and different systems within an enterprise

[译文] [Sherwin]: 但是,我们如何将 AI 以及我们拥有的一些工具和框架应用于这种业务流程自动化?应用于自动化并简化那些具有“高确定性”(High determinism)的可重复业务流程?这些流程与业务数据、业务决策以及企业内的不同系统完全集成。
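One way to read "repeatable business processes with high determinism" is that the procedure itself, not the model, decides which actions are possible at each step. The Python sketch below illustrates that idea under loud assumptions: the `SOP` table, its state and action names, and `fake_model` are all invented for illustration and do not describe any real company's process or any OpenAI tool.

```python
from typing import Callable, Dict, List

# A hypothetical SOP for a utility-company support call: each state lists
# the only actions allowed from it and the state each action leads to.
SOP: Dict[str, Dict[str, str]] = {
    "start": {"verify_identity": "verified"},
    "verified": {"look_up_bill": "bill_shown", "escalate": "done"},
    "bill_shown": {"offer_payment_plan": "done", "escalate": "done"},
}


def run_sop(choose: Callable[[str, List[str]], str], state: str = "start") -> List[str]:
    """Walk the SOP, letting `choose` pick among the allowed actions only."""
    trail: List[str] = []
    while state != "done":
        allowed = sorted(SOP[state])
        action = choose(state, allowed)
        if action not in allowed:
            # Out-of-procedure suggestions are rejected, not executed.
            action = "escalate" if "escalate" in allowed else allowed[0]
        trail.append(action)
        state = SOP[state][action]
    return trail


# Stand-in for a model: it proposes one off-script action along the way.
def fake_model(state: str, allowed: List[str]) -> str:
    if state == "verified":
        return "waive_all_fees"  # not in the SOP
    return allowed[0]


trail = run_sop(fake_model)
print(trail)
```

Keeping the procedure as explicit data means the model can add judgment inside each step while the set of things it "can and cannot do" for a customer stays fixed and auditable.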

[原文] [Sherwin]: um and how can we actually make that that process better uh because I actually think there's a lot of opportunity and a lot of work to be done uh in that area and we just we just don't talk about it because it's it's a little bit less uh uh in our wheelhouse

[译文] [Sherwin]: 以及我们如何实际上让那个流程变得更好?因为我真的认为在那个领域有很多机会和很多工作要做,我们只是不谈论它,因为它不太在我们的擅长范围内(Wheelhouse)。

[原文] [Lenny]: so your take here just to make sure I fully understand it is you think there's a much uh bigger opportunity outside of engineering for AI to impact uh productivity of companies and also jobs of these folks that are doing these kind of repetitive easily automated tasks

[译文] [Lenny]: 所以你的观点——为了确保我完全理解——是你认为在工程领域之外,AI 影响公司生产力以及那些从事这类重复性、易自动化任务的人的工作方面,存在着大得多的机会?

[原文] [Sherwin]: impact jobs and also just impact how work is done like so much of work is done in this way like you think about you know like what a basically we I I talk to customers all the time big enterprises like like how how will AI transform my company like how will it run in in in a world uh with AI in like 20 years

[译文] [Sherwin]: 会影响就业岗位,也会影响工作完成的方式本身。太多的工作是以这种方式完成的。我一直和客户交谈,那些大企业总在问:“AI 将如何改变我的公司?”“20 年后,在有了 AI 的世界里,公司将如何运转?”

[原文] [Sherwin]: um and and you know software engineering is part of the story but there's so much more on the business process side and I actually think it might look even more different on the business process side and and the work there is is pretty substantial

[译文] [Sherwin]: 软件工程是故事的一部分,但在业务流程方面有更多的内容。实际上我认为在业务流程方面,情况可能会看起来更加不同,而且那里的工作量相当可观。

[原文] [Sherwin]: it's actually interesting i don't know like from an absolute percentage or absolute basis I don't know if it's bigger or smaller than software engineering like software is pretty huge and pretty extensive uh as well but it is pretty massive and it's definitely bigger than you know uh uh uh it's bigger than you would think it is based off of how how people talk about it or don't talk about it on X or Twitter

[译文] [Sherwin]: 这其实很有趣。从占比或绝对规模来看,我不知道它比软件工程更大还是更小,毕竟软件业本身也相当庞大且广泛。但它确实非常庞大,而且绝对比你根据人们在 X(Twitter)上谈论或不谈论它的程度所想象的要大。


章节 12:生态共赢、开发工具栈与快问快答

📝 本节摘要

在最后一章,Sherwin 回应了创业者最担心的问题:OpenAI 会不会“碾压”初创公司?他强调市场极其巨大,OpenAI 的核心定位是平台,坚信“水涨船高”(Rising tide lifts all boats)。接着,他详细拆解了 OpenAI 的开发工具栈,从底层的 Responses API,到编排层的 Agents SDK,再到 UI 层的 Agent Kit。最后进入精彩的“快问快答”环节,Sherwin 分享了他的书单(包括科幻小说《There Is No Antimemetics Division》)、对动漫《咒术回战》的热爱、生活信条(“永远不要自怨自艾”),以及他在 Opendoor 时期关于房价模型的有趣洞察(高压线和户型图对房价的影响被严重低估)。

[原文] [Lenny]: the biggest question on people's minds is always just uh how do I not have OpenAI squash my idea and build their own thing and then you know destroy this this market I created what's the general policy what's the general philosophy of how startups should think about where OpenAI is unlikely to go

[译文] [Lenny]: 人们心中最大的疑问总是:我该如何避免 OpenAI “碾压”我的想法,建立他们自己的东西,然后摧毁我创造的这个市场?关于初创公司应该如何思考 OpenAI 不太可能涉足的领域,你们的一般政策和一般哲学是什么?

[原文] [Sherwin]: my general answer here is is um the market is so big and so massive like I actually think you know startups should just not overly think about where OpenAI or these labs are going

[译文] [Sherwin]: 我的一般回答是,市场是如此之大,如此巨大。我实际上认为初创公司不应该过度思考 OpenAI 或这些实验室要去哪里。

[原文] [Sherwin]: i've talked to a lot of startups you know that have you know not worked out startups that are doing really well every startup that I've seen that has kind of fizzled out is not because OpenAI or you know big lab or Google or something has has come to squash them it's because they built something and it like really didn't resonate with with the customers

[译文] [Sherwin]: 我和很多初创公司谈过,有的失败了,有的做得很好。我见过的每一家最终失败的初创公司,都不是因为 OpenAI 或者你知道的某个大实验室、或者 Google 之类的来碾压他们,而是因为他们构建的东西并没有真正与客户产生共鸣。

[原文] [Sherwin]: whereas the ones that take off like even in very competitive spaces like coding like Cursor is huge at this point and it's because they built something that people really love and so my general advice is like don't you know don't overly stress about this just build something that people like and you will you will have a space in this

[译文] [Sherwin]: 相反,那些腾飞的公司——即使是在竞争非常激烈的领域,比如编程领域的 Cursor,现在已经很大了——是因为他们构建了人们真正热爱的东西。所以我的一般建议是:别过度为此焦虑,只要构建人们喜欢的东西,你就会在这个市场拥有一席之地。

[原文] [Sherwin]: one thing that that that we've always held very near and dear which both Sam and Greg helped you know reinforce from the top as well is we actually view ourselves fundamentally as a like ecosystem platform company the API was our first product we think it's really important for us to foster this ecosystem and continue to you know uh support it and and not squash it

[译文] [Sherwin]: 有一件事我们一直非常珍视,Sam 和 Greg 也从高层帮助加强了这一点,那就是我们实际上从根本上将自己视为一家生态系统平台公司。API 是我们的第一个产品,我们要培育这个生态系统并继续支持它,而不是碾压它,我们认为这真的很重要。

[原文] [Sherwin]: the general like thinking about this is like you know a rising tide like lifts all boats and you know we might be a aircraft carrier we're like pretty big at this point but we think it's important to raise the tide uh because everyone kind of uh benefits and I think we'll benefit as well

[译文] [Sherwin]: 这背后的基本想法就是“水涨船高”(A rising tide lifts all boats)。你知道,我们可能是一艘航空母舰,我们现在已经相当大了,但我们认为把水位抬高很重要,因为每个人都会受益,而且我认为我们也会受益。

[原文] [Lenny]: one last question just for folks that are thinking about building on the API or just like oh wait I could do cool stuff with open models and APIs what what does your API and and platform allow people to do like I know you can build agents on top of the platform just talk about what you allow

[译文] [Lenny]: 最后一个问题,对于那些考虑在 API 上构建应用的人,或者那些觉得“等等,我可以用开放模型和 API 做很酷的事情”的人来说,你们的 API 和平台允许人们做什么?我知道你可以在平台之上构建 Agents,能谈谈你们允许做什么吗?

[原文] [Sherwin]: so fundamentally the API offers a bunch of developer endpoints uh and and uh and these developer endpoints basically let you sample from our models the most popular one that we have right now is one called Responses API uh and so this is an endpoint and it's optimized for building long-running agents

[译文] [Sherwin]: 从根本上说,API 提供了一堆开发者端点(Endpoints),这些端点基本上让你能够从我们的模型中采样。我们目前最受欢迎的一个叫 Responses API,这是一个端点,专门为构建长时运行的 Agents 进行了优化。

[原文] [Sherwin]: next layer up we have this thing called the agents SDK which has also gotten extremely extremely popular um this allows you to use you know the responses API or some other API endpoints that we have to build what you might more traditionally think of as an agent like a you know an AI kind of working in an infinite loop it might have sub agents that it delegates to

[译文] [Sherwin]: 再上一层,我们有一个叫做 Agents SDK 的东西,这也变得极其、极其受欢迎。这允许你使用 Responses API 或我们拥有的其他 API 端点来构建你传统观念中的 Agent,比如一个在无限循环中工作的 AI,它可能有它可以委派任务的子 Agent(Sub-agents)。
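The "AI working in a loop that may delegate to sub-agents" idea can be sketched without any SDK at all. The plain-Python example below deliberately does not use the Agents SDK or the Responses API; `run_agent`, `scripted_model`, and the two sub-agents are hypothetical stand-ins, and the model is a scripted function rather than a real model call.

```python
from typing import Callable, Dict, List, Tuple

# A "model" here is any function that maps the conversation so far to a
# decision: ("delegate", sub_agent_name, task) or ("final", answer, "").
Decision = Tuple[str, str, str]
Model = Callable[[List[str]], Decision]


def run_agent(model: Model, sub_agents: Dict[str, Callable[[str], str]],
              task: str, max_turns: int = 10) -> str:
    """Loop: ask the model, run any delegated sub-agent, feed the result back."""
    history: List[str] = [f"task: {task}"]
    for _ in range(max_turns):  # bounded, so a confused model cannot spin forever
        kind, a, b = model(history)
        if kind == "final":
            return a
        result = sub_agents[a](b)
        history.append(f"{a} -> {result}")
    return "gave up: turn limit reached"


# Stand-ins: a scripted model and two trivial sub-agents.
def scripted_model(history: List[str]) -> Decision:
    if len(history) == 1:
        return ("delegate", "researcher", "find the refund policy")
    if len(history) == 2:
        return ("delegate", "writer", history[-1])
    return ("final", history[-1].split(" -> ", 1)[1], "")


sub_agents = {
    "researcher": lambda t: "refunds within 30 days",
    "writer": lambda t: "You can get a refund within 30 days.",
}

answer = run_agent(scripted_model, sub_agents, "answer a refund question")
print(answer)
```

The loop is capped with `max_turns` rather than being literally infinite; that is a choice made for this sketch so it always terminates, not a claim about how the SDK behaves.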

[原文] [Sherwin]: and then above that uh we've now started building tools to help also with kind of like the meta level of deploying an agent uh so we have this product called uh um agent kit uh uh uh and widgets uh which are basically a bunch of UI components that you can use to very easily um build a very beautiful UI um on top of uh uh either our API or agents SDK

[译文] [Sherwin]: 然后在那之上,我们现在开始构建工具来帮助处理部署 Agent 的元层级工作。所以我们有这个叫 Agent Kit 的产品以及 Widgets(组件),基本上是一堆 UI 组件,你可以用它们非常容易地在我们的 API 或 Agents SDK 之上构建一个非常漂亮的 UI。

[原文] [Lenny]: amazing sherwin with that we've reached our very exciting lightning round i've got five questions for you are you ready

[译文] [Lenny]: 太棒了,Sherwin。说到这里,我们就来到了非常激动人心的快问快答环节。我为你准备了五个问题,准备好了吗?

[原文] [Sherwin]: yeah yeah absolutely

[译文] [Sherwin]: 是的,是的,当然。

[原文] [Lenny]: first question what are two or three books that you find yourself recommending most to other people

[译文] [Lenny]: 第一个问题,你发现自己最常向别人推荐的两三本书是什么?

[原文] [Sherwin]: oh I'll I'll talk about one non-fiction one and one fiction book uh the fiction book was I just finished reading it i I I it was really I I really recommend it it it's uh uh There Is No Antimemetics Division by qntm uh it's a uh I think it's like an online author but I saw it being shared on X uh this this uh it's like a science fictiony kind of book um and it was I basically devoured it in like two days

[译文] [Sherwin]: 噢,我会说一本非虚构类和一本虚构类书。虚构类那本我刚读完,我真的非常推荐,叫《There Is No Antimemetics Division》(暂译:无反模因部门),作者是 qntm。我想他是个网络作家,但我看到有人在 X 上分享。这是一本科幻类书籍,我基本上两天就一口气读完了。

[原文] [Sherwin]: non-fiction so I'm going to cheat and I'm going to recommend two of them so in the last year I've been reading a lot more about China and kind of like the US China relations... first one is the Dan Wang book breakneck that one was really really good... and the other one is the Patrick McGee book on Apple and China was super super interesting

[译文] [Sherwin]: 非虚构类的,我要作弊一下,推荐两本。过去一年我读了很多关于中国以及中美关系的书……第一本是 Dan Wang 的《Breakneck》(疾驰),那本真的非常好……另一本是 Patrick McGee 关于苹果和中国的书,非常非常有趣。

[原文] [Lenny]: favorite recent movie or TV show you have really enjoyed

[译文] [Lenny]: 最近你真正喜欢的电影或电视剧?

[原文] [Sherwin]: i'm actually a big anime guy and so uh I I watched a couple episodes there's a new season of this anime called Jujutsu Kaisen uh that's out uh so season 3 of JJK uh was was was really good

[译文] [Sherwin]: 我实际上是个动漫迷,所以我看了几集。有一部叫《咒术回战》(Jujutsu Kaisen)的动漫出了新一季,JJK 第三季,真的很好看。

[原文] [Lenny]: favorite product you recently discovered that you really love

[译文] [Lenny]: 最近发现并真正喜爱的产品?

[原文] [Sherwin]: I recently uh had to set up Wi-Fi and like home networking and I went all in on Ubiquiti uh routers... it is just such a well-built product... it's basically like the Apple of like home networking

[译文] [Sherwin]: 我最近必须设置 Wi-Fi 和家庭网络,所以我全面投入了 Ubiquiti 的路由器……它真的是一个构建得非常好的产品……基本上就像是家庭网络界的苹果。

[原文] [Lenny]: do you have a favorite life motto that you find yourself coming back to in work or in life

[译文] [Lenny]: 你有没有一个在工作或生活中经常回想起来的人生座右铭?

[原文] [Sherwin]: yeah uh the one that I always you know repeat to myself is uh uh never feel sorry for yourself... reminding yourself to never feel sorry and that you always have a sense of agency to kind of pull yourselves up is something that I've had to tell myself a lot

[译文] [Sherwin]: 是的,我总是对自己重复的一句话是:“永远不要自怨自艾”(Never feel sorry for yourself)。提醒自己永远不要自怜,你始终有能动性把自己拉起来,这是我必须经常告诉自己的话。

[原文] [Lenny]: last question so in your previous life you worked at Opendoor where you led work on basically figuring out how much to uh pay for houses... What's like a variable in the price of a house that you didn't expect is really important and impacts the price of a house

[译文] [Lenny]: 最后一个问题,在你之前的职业生涯中,你在 Opendoor 工作,负责研究该为房子出价多少……有什么变量是你没预料到会很重要,但实际上对房价影响很大的?

[原文] [Sherwin]: power lines and like uh high voltage power lines like are super super uh actually impact your price quite a lot... and then the other one which which was something that uh was always something really difficult for us to uh quantify uh was floor plans

[译文] [Sherwin]: 电力线,像高压线那种,超级、超级——实际上对价格影响很大……另一个对我们来说总是很难量化的东西是户型图(Floor plans)。

[原文] [Sherwin]: i remember floor plan was a big one because like we'd have a home that like wouldn't sell and then our uh ops team would go in and be like yeah it's a floor plan issue so like how do you how could you tell it's like you go inside you just feel it

[译文] [Sherwin]: 我记得户型图是个大问题,因为我们会有卖不出去的房子,然后我们的运营团队进去后会说:“是的,这是户型图的问题。”你会问:“你怎么知道的?”这就像是你走进去,你就能感觉到它。