Andrej Karpathy: Software Is Changing (Again)
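### Chapter 1: The Paradigm Shift in Software Development: From 1.0 to 3.0
📝 Section summary:
In this section, Andrej Karpathy stresses that now is a unique moment to enter the software industry, because the nature of software is changing drastically. He traces the evolution from "Software 1.0" (code written by humans) to "Software 2.0" (neural network weights tuned by an optimizer), and introduces "Software 3.0": large language models (LLMs), where prompts become a new kind of program written in English. Using Tesla's Autopilot as an example, he describes how Software 2.0 gradually "ate" the Software 1.0 code, and advises developers to become fluent in all three paradigms and to switch between them as needed.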
[原文] [Host/Karpathy]: please welcome former director of AI Tesla Andrej Karpathy hello wow a lot of people here hello um okay yeah so I'm excited to be here today to talk to you about software in the era of AI
[译文] [主持人/Karpathy]: 请欢迎前特斯拉 AI 总监 Andrej Karpathy。(掌声)你好,哇,这里好多人啊。你好。嗯,好的,是的,我很兴奋今天能在这里和大家探讨 AI 时代的软件。
[原文] [Karpathy]: and I'm told that many of you are students like bachelors masters PhD and so on and you're about to enter the industry and I think it's actually like an extremely unique and very interesting time to enter the industry right now and I think fundamentally the reason for that is that um software is changing uh again
[译文] [Karpathy]: 我听说你们中许多人是学生,比如本科生、硕士、博士等等,你们即将进入这个行业。我认为现在实际上是一个极其独特且非常有趣的进入行业的时机。我觉得根本原因在于,嗯,软件正在发生变化,厄,再一次发生变化。
[原文] [Karpathy]: and I say again because I actually gave this talk already um but the problem is that software keeps changing so I actually have a lot of material to create new talks and I think it's changing quite fundamentally
[译文] [Karpathy]: 我说“再一次”,是因为我其实之前已经做过这个演讲了,嗯,但问题是软件一直在变,所以我其实有很多素材来创作新的演讲,而且我认为这种变化是相当根本性的。
[原文] [Karpathy]: i think roughly speaking software has not changed much on such a fundamental level for 70 years and then it's changed I think about twice quite rapidly in the last few years and so there's just a huge amount of work to do a huge amount of software to write and rewrite so let's take a look at maybe the realm of software
[译文] [Karpathy]: 我觉得大致来说,软件在如此根本的层面上可能有 70 年没怎么变过了,然后在过去几年里,我认为它相当迅速地变了两次。所以有大量的工作要做,有大量的软件需要编写和重写。那么,让我们来看看软件的领域。
[原文] [Karpathy]: so if we kind of think of this as like the map of software this is a really cool tool called map of GitHub um this is kind of like all the software that's written uh these are instructions to the computer for carrying out tasks in the digital space so if you zoom in here these are all different kinds of repositories and this is all the code that has been written
[译文] [Karpathy]: 如果我们把这看作是软件的地图——这是一个很酷的工具叫“GitHub 地图”——这有点像所有被编写出来的软件,厄,这些是让计算机在数字空间执行任务的指令。所以如果你放大这里看,这些是各种各样的代码库(repositories),这些就是所有已经被写出来的代码。
[原文] [Karpathy]: and a few years ago I kind of observed that um software was kind of changing and there was kind of like a new type of software around and I called this software 2.0 at the time and the idea here was that software 1.0 is the code you write for the computer software 2.0 are basically neural networks and in particular the weights of a neural network
[译文] [Karpathy]: 几年前,我观察到,嗯,软件似乎在发生变化,出现了一种新型的软件,当时我称之为“软件 2.0”。这里的概念是,软件 1.0 是你为计算机编写的代码;而软件 2.0,基本上就是神经网络,特别是神经网络的权重(weights)。
[原文] [Karpathy]: and you're not writing this code directly you are most you are more kind of like tuning the data sets and then you're running an optimizer to create to create the parameters of this neural net and I think like at the time neural nets were kind of seen as like just a different kind of classifier like a decision tree or something like that and so I think it was kind of like um I think this framing was a lot more appropriate
[译文] [Karpathy]: 你不是直接编写这些代码,你更多的是在调整数据集,然后运行一个优化器来生成、来创建这个神经网络的参数。我认为在当时,神经网络被看作只是另一种分类器,就像决策树之类的东西,所以我认为这种(软件 2.0 的)框架描述要恰当得多。
[原文] [Karpathy]: and now actually what we have is kind of like an equivalent of GitHub in the realm of software 2.0 and I think Hugging Face is basically equivalent of GitHub in software 2.0 and there's also Model Atlas and you can visualize all the code written there in case you're curious
[译文] [Karpathy]: 现在实际上我们拥有了软件 2.0 领域的 GitHub 等价物。我认为 Hugging Face 基本上就相当于软件 2.0 时代的 GitHub,还有 Model Atlas,如果你好奇的话,你可以把那里编写的所有“代码”可视化出来。
[原文] [Karpathy]: by the way the giant circle the point in the middle uh these are the parameters of Flux the image generator and so anytime someone tunes a LoRA on top of a Flux model you basically create a git commit uh in this space and uh you create a different kind of a image generator
[译文] [Karpathy]: 顺便说一下,那个巨大的圆圈,中间那个点,厄,那是 Flux 图像生成器的参数。所以每当有人在 Flux 模型之上微调一个 LoRA 时,你基本上就在这个空间里创建了一个 git 提交(git commit),厄,你就创造了一种不同类型的图像生成器。
[原文] [Karpathy]: so basically what we have is software 1.0 is the computer code that programs a computer software 2.0 are the weights which program neural networks uh and here's an example of AlexNet image recognizer neural network
[译文] [Karpathy]: 所以基本上我们拥有的是:软件 1.0 是给计算机编程的计算机代码;软件 2.0 是给神经网络编程的权重,厄,这里有一个 AlexNet 图像识别神经网络的例子。
[原文] [Karpathy]: now so far all of the neural networks that we've been familiar with until recently were kind of like fixed function computers image to categories or something like that and I think what's changed and I think is a quite fundamental change is that neural networks became programmable with large language models
[译文] [Karpathy]: 目前为止,直到最近,我们所熟悉的所有神经网络都有点像是固定功能的计算机,比如从图像到分类之类的。我认为发生的变化,而且是一个相当根本性的变化,是神经网络通过大语言模型(Large Language Models)变得可编程了。
[原文] [Karpathy]: and so I I see this as quite new unique it's a new kind of a computer and uh so in my mind it's uh worth giving it a new designation of software 3.0 and basically your prompts are now programs that program the LLM and uh remarkably uh these uh prompts are written in English so it's kind of a very interesting programming language
[译文] [Karpathy]: 所以在我看来这是相当新颖独特的,这是一种新型的计算机,厄,所以在我心目中,值得给它一个新的称号——软件 3.0。基本上,你的提示词(prompts)现在就是给 LLM 编程的程序,而且值得注意的是,厄,这些提示词是用英语写的,所以这是一种非常有趣的编程语言。
[原文] [Karpathy]: um so maybe uh to summarize the difference if you're doing sentiment classification for example you can imagine writing some uh amount of Python to to basically do sentiment classification or you can train a neural net or you can prompt a large language model
[译文] [Karpathy]: 嗯,所以也许为了总结这种区别,如果你在做情感分类,例如,你可以想象写一定量的 Python 代码来基本完成情感分类,或者你可以训练一个神经网络,或者你可以给大语言模型写提示词。
[原文] [Karpathy]: uh so here this is a few-shot prompt and you can imagine changing it and programming the computer in a slightly different way so basically we have software 1.0 software 2.0 and I think we're seeing maybe you've seen a lot of GitHub code is not just like code anymore there's a bunch of like English interspersed with code
[译文] [Karpathy]: 厄,这有一个少样本提示词(few-shot prompt),你可以想象修改它,以稍微不同的方式对计算机进行编程。所以基本上我们有软件 1.0、软件 2.0,而且我认为我们正在看到——也许你已经看到了——很多 GitHub 代码不再仅仅是代码了,代码中穿插着大量的英语。
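To make the contrast concrete, here is a minimal Python sketch of the same sentiment task written in two of the three paradigms. The word lists, the example reviews, and the names `classify_v1` and `build_few_shot_prompt` are illustrative assumptions, not material from the talk's slides. The Software 2.0 variant (training a neural net on labeled data) is left out because it would need a dataset and a training loop.

```python
# Software 1.0: explicit, hand-written rules (toy word lists, assumed for illustration).
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"bad", "terrible", "hate", "awful"}


def classify_v1(text: str) -> str:
    """Count positive and negative words and compare the totals."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Software 3.0: the "program" is an English few-shot prompt.
FEW_SHOT_EXAMPLES = [
    ("I love this phone", "positive"),
    ("The battery is terrible", "negative"),
]


def build_few_shot_prompt(text: str) -> str:
    """Assemble instruction + labeled examples + the new review to classify."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for review, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)


print(classify_v1("I love this great phone"))
print(build_few_shot_prompt("The screen is excellent"))
```

The prompt string would be sent to whichever LLM you use. Changing the instruction line or the examples "reprograms" the classifier without touching any Python logic.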
[原文] [Karpathy]: and so I think kind of there's a growing category of new kind of code so not only is it a new programming paradigm it's also remarkable to me that it's in our native language of English and so when this blew my mind a few uh I guess years ago now I tweeted this and um I think it captured the attention of a lot of people and this is my currently pinned tweet uh is that remarkably we're now programming computers in English
[译文] [Karpathy]: 所以我认为有一类正在增长的新型代码。这不仅是一种新的编程范式,对我来说同样值得注意的是,它是用我们的母语英语编写的。所以当这在几年前让我大受震撼时,我发了一条推特,嗯,我认为它引起了很多人的关注,这也是我目前置顶的推特,厄,就是我们现在竟然在用英语给计算机编程,这太了不起了。
[原文] [Karpathy]: now when I was at uh Tesla um we were working on the uh autopilot and uh we were trying to get the car to drive and I sort of showed this slide at the time where you can imagine that the inputs to the car are on the bottom and they're going through a software stack to produce the steering and acceleration
[译文] [Karpathy]: 当我在,厄,特斯拉的时候,嗯,我们在研发自动驾驶(Autopilot),厄,我们试图让汽车自动行驶。我当时展示过这张幻灯片,你可以想象汽车的输入在底部,它们穿过一个软件栈(software stack),最终产生转向和加速指令。
[原文] [Karpathy]: and I made the observation at the time that there was a ton of C++ code around in the autopilot which was the software 1.0 code and then there was some neural nets in there doing image recognition and uh I kind of observed that over time as we made the autopilot better basically the neural network grew in capability and size and in addition to that all the C++ code was being deleted
[译文] [Karpathy]: 我当时的观察是,自动驾驶系统中存在大量的 C++ 代码,也就是软件 1.0 代码,然后里面有一些神经网络在做图像识别。厄,我观察到随着时间的推移,当我们改进自动驾驶系统时,基本上神经网络的能力和规模都在增长,而且除此之外,所有的 C++ 代码都在被删除。
[原文] [Karpathy]: and kind of like was um and a lot of the kind of capabilities and functionality that was originally written in 1.0 was migrated to 2.0 so as an example a lot of the stitching up of information across images from the different cameras and across time was done by a neural network and we were able to delete a lot of code and so the software 2.0 stack quite literally ate through the software stack of the autopilot
[译文] [Karpathy]: 就像是,嗯,原本用 1.0 编写的许多能力和功能都被迁移到了 2.0 中。举个例子,很多跨不同摄像头图像和跨时间的信息拼接工作都改由神经网络完成了,这使我们能够删除大量代码。所以软件 2.0 技术栈毫不夸张地“吞噬”了自动驾驶原本的软件栈。
[原文] [Karpathy]: so I thought this was really remarkable at the time and I think we're seeing the same thing again where uh basically we have a new kind of software and it's eating through the stack we have three completely different programming paradigms
[译文] [Karpathy]: 我当时觉得这真的很了不起,我认为我们正在再次看到同样的事情发生,厄,基本上我们有了一种新型软件,它正在吞噬整个技术栈。我们现在有了三种完全不同的编程范式。
[原文] [Karpathy]: and I think if you're entering the industry it's a very good idea to be fluent in all of them because they all have slight pros and cons and you may want to program some functionality in 1.0 or 2.0 or 3.0 are you going to train a neural net are you going to just prompt an LLM should this be a piece of code that's explicit etc
[译文] [Karpathy]: 我认为如果你正在进入这个行业,精通所有这些范式是一个非常好的主意,因为它们都有各自细微的优缺点。你可能想要用 1.0、2.0 或 3.0 来编写某些功能——你是要训练一个神经网络?还是只给 LLM 写个提示词?还是这应该是一段明确的代码?等等。
[原文] [Karpathy]: so we all have to make these decisions and actually potentially uh fluidly trans transition between these paradigms
[译文] [Karpathy]: 所以我们都必须做出这些决定,并且实际上可能需要,厄,流畅地在这些范式之间进行转换。
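📝 Section summary:
In this section, Karpathy digs into how to understand the nature of LLMs as a new paradigm. He starts from Andrew Ng's remark that AI is "the new electricity" and points out its utility-like traits: large capital expenditure (CapEx) to build, metered pay-per-use operation (OpEx), and reliability that affects the world's available intelligence. He then compares LLM labs to chip fabrication plants (fabs), emphasizing the technical barriers. The analogy he finds most accurate, however, is the LLM as a new operating system: it manages memory (the context window) and compute, and its ecosystem has a closed-source (Windows/macOS) versus open-source (Linux/Llama) split. Finally, he notes that we are in a stage resembling 1960s mainframe time-sharing, and that LLMs show an unusual "reversed technology diffusion": the most advanced technology is first used by ordinary consumers for everyday tasks rather than being monopolized by governments or the military.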
[原文] [Karpathy]: so what I wanted to get into now is first I want to in the first part talk about LLMs and how to kind of like think of this new paradigm and the ecosystem and what that looks like uh like what are what is this new computer what does it look like and what does the ecosystem look like
[译文] [Karpathy]: 所以我现在想进入的话题,首先在第一部分我想谈谈 LLM(大语言模型),以及如何去思考这种新范式和生态系统,以及它看起来像什么。厄,比如这台新型计算机是什么?它长什么样?它的生态系统又是怎样的?
[原文] [Karpathy]: um I was struck by this quote from Andrew Ng actually uh many years ago now I think and I think Andrew is going to be speaking right after me uh but he said at the time AI is the new electricity and I do think that it um kind of captures something very interesting in that LLMs certainly feel like they have properties of utilities right now
[译文] [Karpathy]: 嗯,我曾被 Andrew Ng(吴恩达)的一句话深深打动,其实那是好多年前的事了,我想 Andrew 就在我之后演讲。厄,但他当时说“AI 是新电力”,我确实认为这,嗯,某种程度上捕捉到了一些非常有趣的东西,因为 LLM 现在确实让人感觉具有公用事业(utilities)的属性。
[原文] [Karpathy]: so um LLM labs like OpenAI Gemini Anthropic etc they spend capex to train the LLMs and this is kind of equivalent to building out a grid and then there's opex to serve that intelligence over APIs to all of us and this is done through metered access where we pay per million tokens or something like that
[译文] [Karpathy]: 所以,嗯,像 OpenAI、Gemini、Anthropic 等 LLM 实验室,他们花费资本支出(CapEx)来训练 LLM,这有点相当于建设电网;然后还有运营支出(OpEx),通过 API 将这种智能服务提供给我们所有人,这是通过计量访问完成的,比如我们按每百万 Token 付费之类的。
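As a worked example of metered access, the arithmetic for a per-million-token bill can be sketched as below. The prices and token counts are made-up placeholders, not any provider's actual rates, and the split into separate input and output prices is an assumption about how a provider might bill.

```python
def metered_cost_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Cost of a workload billed per million tokens, with separate input/output rates."""
    return (
        input_tokens / 1_000_000 * usd_per_m_input
        + output_tokens / 1_000_000 * usd_per_m_output
    )


# Hypothetical workload and hypothetical prices, chosen only to make the math easy.
cost = metered_cost_usd(
    input_tokens=2_000_000,
    output_tokens=500_000,
    usd_per_m_input=1.0,
    usd_per_m_output=4.0,
)
print(f"${cost:.2f}")  # 2 * 1.0 + 0.5 * 4.0 = $4.00
```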
[原文] [Karpathy]: and we have a lot of demands that are very utility-like demands out of this API we demand low latency high uptime consistent quality etc in electricity you would have a transfer switch so you can transfer your electricity source from like grid and solar or battery or generator
[译文] [Karpathy]: 我们对这个 API 有很多需求,都是非常像公用事业的需求:我们要低延迟、高正常运行时间(uptime)、稳定的质量等等。在电力领域,你会有一个转换开关,这样你就可以把你的电力来源在电网、太阳能、电池或发电机之间进行切换。
[原文] [Karpathy]: in LLM we have maybe OpenRouter and easily switch between the different types of LLMs that exist because the LLM are software they don't compete for physical space so it's okay to have basically like six electricity providers and you can switch between them right because they don't compete in such a direct way
[译文] [Karpathy]: 在 LLM 领域,我们也许有 OpenRouter,可以轻松地在现有的不同类型 LLM 之间切换。因为 LLM 是软件,它们不争夺物理空间,所以基本上拥有六家电力供应商也是可以的,你可以在它们之间切换,对吧?因为它们并没有以那样直接的方式竞争。
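The "transfer switch" idea can be sketched as a simple failover loop over interchangeable providers. This is not OpenRouter's API: `complete_with_failover`, `AllProvidersDown`, and the two stub providers are hypothetical names, and each provider is modeled as a plain Python callable that takes a prompt and returns text.

```python
from typing import Callable, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]  # (name, function that answers a prompt)


class AllProvidersDown(RuntimeError):
    """Raised when every configured provider failed."""


def complete_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order and return (provider_name, answer) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersDown("; ".join(errors))


# Stub providers standing in for real LLM clients.
def flaky_provider(prompt: str) -> str:
    raise ConnectionError("service unavailable")


def backup_provider(prompt: str) -> str:
    return f"echo: {prompt}"


print(complete_with_failover("hi", [("primary", flaky_provider), ("backup", backup_provider)]))
```

Because the providers are software behind the same call signature, adding a sixth "electricity provider" is one more entry in the list.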
[原文] [Karpathy]: and I think what's also a little fascinating and we saw this in the last few days actually a lot of the LLMs went down and people were kind of like stuck and unable to work and uh I think it's kind of fascinating to me that when the state-of-the-art LLMs go down it's actually kind of like an intelligence brownout in the world
[译文] [Karpathy]: 我认为还有一点有点迷人,实际上我们在过去几天里看到了这一点,很多 LLM 宕机了,人们就有点像被卡住了,无法工作。厄,我觉得这对我来说有点迷人,当最先进的 LLM 宕机时,实际上就像是世界上发生了一场“智力限电”(intelligence brownout)。
[原文] [Karpathy]: it's kind of like when the voltage is unreliable in the grid and uh the planet just gets dumber the more reliance we have on these models which already is like really dramatic and I think will continue to grow
[译文] [Karpathy]: 这有点像电网电压不稳定的时候。厄,我们对这些模型的依赖越深——这种依赖已经非常巨大了,而且我认为还会继续增长——地球就变得越笨。
[原文] [Karpathy]: but LLMs don't only have properties of utilities i think it's also fair to say that they have some properties of fabs and the reason for this is that the capex required for building LLM is actually quite large uh it's not just like building some uh power station or something like that right you're investing a huge amount of money
[译文] [Karpathy]: 但是 LLM 不仅仅具有公用事业的属性,我认为说它们具有某种晶圆厂(Fabs)的属性也是公平的。原因在于构建 LLM 所需的资本支出实际上非常大,厄,这不仅仅像是建个发电站之类的,对吧?你是在投资巨额资金。
[原文] [Karpathy]: and I think the tech tree and uh for the technology is growing quite rapidly so we're in a world where we have sort of deep tech trees research and development secrets that are centralizing inside the LLM labs um and but I think the analogy muddies a little bit also because as I mentioned this is software and software is a bit less defensible because it is so malleable
[译文] [Karpathy]: 而且我认为技术树,厄,这项技术的成长非常迅速,所以我们处于这样一个世界:我们拥有某种深度的技术树,研发机密正集中在 LLM 实验室内。嗯,但我认为这个类比也有点模糊,因为正如我提到的,这是软件,而软件的防御性稍微差一点,因为它是如此具有可塑性。
[原文] [Karpathy]: and so um I think it's just an interesting kind of thing to think about potentially there's many analogy analogies you can make like a 4 nanometer process node maybe is something like a cluster with certain max flops you can think about when you're use when you're using Nvidia GPUs and you're only doing the software and you're not doing the hardware that's kind of like the fabless model
[译文] [Karpathy]: 所以,嗯,我认为这只是一个有趣的思考角度。可能有很多类比可以做,比如 4 纳米工艺节点可能就像是具有特定最大算力(flops)的集群。你可以思考,当你使用,当你使用 Nvidia GPU 且只做软件不做硬件时,那有点像“无晶圆厂”(fabless)模式。
[原文] [Karpathy]: but if you're actually also building your own hardware and you're training on TPUs if you're Google that's kind of like the Intel model where you own your fab so I think there's some analogies here that make sense
[译文] [Karpathy]: 但如果你实际上也在构建自己的硬件,并且你在 TPU 上进行训练,如果你是 Google,那就像是 Intel 模式,你自己拥有晶圆厂。所以我认为这里有些类比是有道理的。
[原文] [Karpathy]: but actually I think the analogy that makes the most sense perhaps is that in my mind LLM have very strong kind of analogies to operating systems uh in that this is not just electricity or water it's not something that comes out of the tap as a commodity
[译文] [Karpathy]: 但实际上,我认为最合理的类比或许是,在我心目中,LLM 与操作系统(Operating Systems)有着非常强的相似性。厄,因为它不仅仅是电力或水,它不是那种从水龙头里流出来的商品。
[原文] [Karpathy]: uh this is these are now increasingly complex software ecosystems right so uh they're not just like simple commodities like electricity and it's kind of interesting to me that the ecosystem is shaping in a very similar kind of way where you have a few closed source providers like Windows or Mac OS and then you have an open source alternative like Linux
[译文] [Karpathy]: 厄,这些现在是日益复杂的软件生态系统,对吧?所以,厄,它们不只是像电力那样的简单商品。对我来说有点有趣的是,生态系统的形成方式非常相似,你有几个闭源提供商,像 Windows 或 macOS,然后你有一个开源替代品,像 Linux。
[原文] [Karpathy]: and I think for u neural for LLMs as well we have a kind of a few competing closed source providers and then maybe the Llama ecosystem is currently like maybe a close approximation to something that may grow into something like Linux again I think it's still very early because these are just simple LLMs but we're starting to see that these are going to get a lot more complicated
[译文] [Karpathy]: 我认为对于神...对于 LLM 也是如此,我们有几个竞争的闭源提供商,然后也许 Llama 生态系统目前就像是某种近似物,可能会成长为像 Linux 一样的东西。再说一次,我认为现在还为时过早,因为这些还只是简单的 LLM,但我们开始看到它们将会变得复杂得多。
[原文] [Karpathy]: it's not just about the LLM itself it's about all the tool use and the multimodalities and how all of that works and so when I sort of had this realization a while back I tried to sketch it out and it kind of seemed to me like LLMs are kind of like a new operating system right
[译文] [Karpathy]: 这不仅仅关于 LLM 本身,还关于所有的工具使用、多模态以及所有这些是如何运作的。所以当前阵子我有这个领悟时,我试着把它画出来,在我看来 LLM 有点像是一个新的操作系统,对吧。
[原文] [Karpathy]: so the LLM is a new kind of a computer it's sitting it's kind of like the CPU equivalent uh the context windows are kind of like the memory and then the LLM is orchestrating memory and compute uh for problem solving um using all of these uh capabilities here and so definitely if you look at it looks very much like operating system from that perspective
[译文] [Karpathy]: 所以 LLM 是一种新型计算机,它坐落在那里,有点像是 CPU 的等价物,厄,上下文窗口(context windows)有点像是内存,然后 LLM 正在编排内存和计算,厄,为了解决问题,嗯,利用这里所有的这些能力。所以毫无疑问,如果你从那个角度看,它非常像操作系统。
[原文] [Karpathy]: um a few more analogies for example if you want to download an app say I go to VS Code and I go to download you can download VS Code and you can run it on Windows Linux or or Mac in the same way as you can take an LLM app like Cursor and you can run it on GPT or Claude or Gemini series right it's just a drop down so it's kind of like similar in that way as well
[译文] [Karpathy]: 嗯,还有几个类比,例如,如果你想下载一个应用,比方说我去 VS Code 官网下载,你可以下载 VS Code 并在 Windows、Linux 或 Mac 上运行。同样地,你可以拿一个像 Cursor 这样的 LLM 应用,你可以在 GPT、Claude 或 Gemini 系列上运行它,对吧?它只是一个下拉菜单。所以它在这方面也有点相似。
[原文] [Karpathy]: uh more analogies that I think strike me is that we're kind of like in this 1960sish era where LLM compute is still very expensive for this new kind of a computer and that forces the LLMs to be centralized in the cloud and we're all just uh sort of thin clients that interact with it over the network
[译文] [Karpathy]: 厄,更多触动我的类比是,我们有点像是处于 1960 年代那种时期,对于这种新型计算机来说,LLM 的计算仍然非常昂贵,这迫使 LLM 必须集中在云端,而我们都只是,厄,某种通过网络与它交互的瘦客户端(thin clients)。
[原文] [Karpathy]: and none of us have full utilization of these computers and therefore it makes sense to use time sharing where we're all just you know a dimension of the batch when they're running the computer in the cloud and this is very much what computers used to look like at during this time the operating systems were in the cloud everything was streamed around and there was batching
[译文] [Karpathy]: 我们谁都无法完全利用这些计算机,因此使用分时共享(time sharing)是有意义的,在云端运行计算机时,我们大家都只是,你知道,批处理(batch)中的一个维度。这非常像那个时期计算机的样子,操作系统在云端,一切都是流式传输的,还有批处理。
[原文] [Karpathy]: and so the p the personal computing revolution hasn't happened yet because it's just not economical it doesn't make sense but I think some people are trying and it turns out that Mac minis for example are a very good fit for some of the LLMs because it's all if you're doing batch one inference this is all super memory bound so this actually works
[译文] [Karpathy]: 所以个...个人计算革命还没有发生,因为它还不经济,这说不通。但我认为有些人正在尝试,事实证明,例如 Mac mini 非常适合某些 LLM,因为如果是做 Batch 为 1 的推理,这完全是受限于内存带宽的,所以这实际上是行得通的。
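A rough back-of-the-envelope model shows why batch-one inference is memory bound: to generate one token at batch size 1, a dense model has to read essentially all of its weights from memory once, so memory bandwidth divided by the size of the weights gives an upper bound on tokens per second. The parameter count, quantization level, and bandwidth below are hypothetical round numbers, not the specifications of any particular Mac mini or model.

```python
def max_tokens_per_second(
    n_params: float,
    bytes_per_param: float,
    mem_bandwidth_gb_s: float,
) -> float:
    """Upper bound on batch-1 decoding speed when weight reads dominate.

    Ignores compute time, KV-cache reads, and other overheads, so real
    throughput will be lower than this estimate.
    """
    weight_bytes = n_params * bytes_per_param
    return mem_bandwidth_gb_s * 1e9 / weight_bytes


# Hypothetical setup: 8B parameters at 4 bits (0.5 byte) each, 100 GB/s memory bandwidth.
estimate = max_tokens_per_second(n_params=8e9, bytes_per_param=0.5, mem_bandwidth_gb_s=100)
print(f"at most about {estimate:.0f} tokens/s")  # 100e9 / 4e9 = 25
```

Under this simple model, a machine with high memory bandwidth can serve a single user reasonably well even without datacenter-class compute, which is the point being made about small personal machines.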
[原文] [Karpathy]: and uh I think these are some early indications maybe of personal computing uh but this hasn't really happened yet it's not clear what this looks like maybe some of you get to invent what what this is or how it works or uh what this should what this should be
[译文] [Karpathy]: 厄,我认为这些可能是个人计算的一些早期迹象,厄,但这还没有真正发生,还不清楚这会是什么样子。也许你们中的一些人能够去发明它是什么,或者它是如何工作的,或者,厄,它应该、它应该是什么。
[原文] [Karpathy]: maybe one more analogy that I'll mention is whenever I talk to ChatGPT or some LLM directly in text I feel like I'm talking to an operating system through the terminal like it's just it's it's text it's direct access to the operating system and I think a GUI hasn't yet really been invented in like a general way
[译文] [Karpathy]: 也许我要提到的再一个类比是,每当我直接用文本与 ChatGPT 或某些 LLM 交谈时,我觉得我像是通过终端(terminal)在与操作系统对话。就像是,只是,只是文本,是对操作系统的直接访问,我认为图形用户界面(GUI)还没有真正以通用的方式被发明出来。
[原文] [Karpathy]: like should ChatGPT have a GUI like different than just a text bubbles uh certainly some of the apps that we're going to go into in a bit have GUI but there's no like GUI across all the tasks if that makes sense
[译文] [Karpathy]: 比如 ChatGPT 应该有一个 GUI 吗?不同于仅仅是文本气泡的那种。厄,当然我们稍后要讨论的一些 App 确实有 GUI,但还没有那种跨所有任务的通用 GUI,如果这说得通的话。
[原文] [Karpathy]: um there are some ways in which LLMs are different from kind of operating systems in some fairly unique way and from early computing and I wrote about uh this one particular property that strikes me as very different uh this time around it's that LLMs like flip they flip the direction of technology diffusion uh that is usually uh present in technology
[译文] [Karpathy]: 嗯,还有一些方面,LLM 与操作系统以及早期计算有着相当独特的不同之处。我曾写过关于这一点——这次有一个特别的属性让我觉得非常不同,厄,那就是 LLM 似乎翻转了,它们翻转了通常存在于技术中的技术扩散方向。
[原文] [Karpathy]: so for example with electricity cryptography computing flight internet GPS lots of new transformative technologies that have not been around typically it is the government and corporations that are the first users because it's new and expensive etc and it only later diffuses to consumer
[译文] [Karpathy]: 举例来说,电力、密码学、计算、飞行、互联网、GPS,很多以前不存在的变革性新技术,通常政府和企业是第一批用户,因为它们很新且昂贵等等,只有在后来才扩散到消费者。
[原文] [Karpathy]: but I feel like LLMs are kind of like flipped around so maybe with early computers it was all about ballistics and military use but with LLMs it's all about how do you boil an egg or something like that this is certainly like a lot of my use
[译文] [Karpathy]: 但我觉得 LLM 好像是反过来的。也许早期计算机都是关于弹道学和军事用途的,但对于 LLM,它全是关于“你怎么煮鸡蛋”或者类似的事情,这当然也是我的很多用法。
[原文] [Karpathy]: and so it's really fascinating to me that we have a new magical computer and it's like helping me boil an egg it's not helping the government do something really crazy like some military ballistics or some special technology indeed corporations and governments are lagging behind the adoption of all of us of all of these technologies so it's just backwards
[译文] [Karpathy]: 所以这对我来说真的很迷人,我们拥有了一台新的神奇计算机,而它在帮我煮鸡蛋,而不是在帮政府做一些真正疯狂的事情,比如军事弹道学或某种特殊技术。事实上,企业和政府在采纳所有这些技术方面,都落后于我们所有人。所以这完全是反向的。
[原文] [Karpathy]: and I think it informs maybe some of the uses of how we want to use this technology or like where are some of the first apps and so on so in summary so far LLM labs LLMs i think it's accurate language to use but LLMs are complicated operating systems they're circa 1960s in computing and we're redoing computing all over again and they're currently available via time sharing and distributed like a utility
[译文] [Karpathy]: 我认为这也许能为我们想如何使用这项技术,或者第一批 App 会出现在哪里等等提供一些启示。总之到目前为止,LLM 实验室、LLM,我认为使用这些语言是准确的,但 LLM 是复杂的操作系统,它们处于计算领域的 1960 年代左右,我们正在把计算这事重做一遍,目前它们通过分时共享提供服务,并像公用事业一样进行分发。
[原文] [Karpathy]: what is new and unprecedented is that they're not in the hands of a few governments and corporations they're in the hands of all of us because we all have a computer and it's all just software and ChatGPT was beamed down to our computers like billions of people like instantly and overnight and this is insane
[译文] [Karpathy]: 新颖且前所未有的是,它们并不掌握在少数政府和企业手中,而是掌握在我们所有人手中,因为我们都有电脑,而且这全都是软件。ChatGPT 就像光束一样瞬间传送到我们的电脑上,像是覆盖了数十亿人,在一夜之间,这太疯狂了。
[原文] [Karpathy]: uh and it's kind of insane to me that this is the case and now it is our time to enter the industry and program these computers this is crazy so I think this is quite remarkable
[译文] [Karpathy]: 厄,这对我来说确实有点疯狂,情况竟然是这样,而现在正是我们进入这个行业并为这些计算机编程的时候,这太疯狂了,所以我认为这相当了不起。
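📝 Section summary:
In this section, Karpathy suggests understanding the "psychology" of LLMs before programming them. He likens LLMs to "people spirits", stochastic simulations of people: underneath there is only a Transformer neural network, but because it is trained on vast amounts of human text, humanlike psychological traits emerge. He notes that LLMs have an encyclopedic memory like the protagonist of the film Rain Man, but also serious cognitive deficits, such as "jagged intelligence" (solving hard problems yet failing to compare 9.11 and 9.9) and "anterograde amnesia" (unlike a human employee, they cannot consolidate memories through sleep, and their memory resets after each conversation, as in the film Memento). They are also quite gullible, which creates security risks.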
[原文] [Karpathy]: before we program LLMs we have to kind of like spend some time to think about what these things are and I especially like to kind of talk about their psychology
[译文] [Karpathy]: 在我们给 LLM 编程之前,我们必须花点时间思考这些东西到底是什么,而且我特别喜欢谈论它们的心理学。
[原文] [Karpathy]: so the way I like to think about LLMs is that they're kind of like people spirits um they are stochastic simulations of people um and the simulator in this case happens to be an autoregressive transformer
[译文] [Karpathy]: 所以,我思考 LLM 的方式是,它们有点像是“人类精神体”(people spirits),厄,它们是人类的随机模拟(stochastic simulations),而在这个案例中,模拟器恰好是一个自回归 Transformer。
[原文] [Karpathy]: so transformer is a neural net uh it's and it just kind of like is goes on the level of tokens it goes chunk chunk chunk chunk chunk and there's an almost equal amount of compute for every single chunk
[译文] [Karpathy]: Transformer 是一个神经网络,厄,它是在 Token 层面运行的,它就是“块、块、块、块、块”地处理,而且每一个块的计算量几乎是相等的。
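The "chunk, chunk, chunk" behaviour is the autoregressive loop: sample one token, append it, and repeat. The sketch below uses a tiny hand-written lookup table (`NEXT_TOKEN_PROBS`, an assumption for illustration) in place of a transformer. A real model conditions on the whole context rather than only the last token, but the outer loop has the same shape, with one roughly fixed-cost step per token.

```python
import random
from typing import Dict, List

# Toy next-token distribution standing in for the neural network.
NEXT_TOKEN_PROBS: Dict[str, Dict[str, float]] = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"sat": 1.0},
    "sat": {"<eos>": 1.0},
}


def sample_next(token: str, rng: random.Random) -> str:
    """Stochastic step: draw the next token from the model's distribution."""
    probs = NEXT_TOKEN_PROBS[token]
    candidates = list(probs)
    weights = [probs[t] for t in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate(prompt: List[str], max_new_tokens: int = 10, seed: int = 0) -> List[str]:
    """Autoregressive loop: each sampled token is fed back in as input."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        nxt = sample_next(tokens[-1], rng)
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return tokens


print(generate(["the"]))
```

Running it with different seeds can give "the cat sat" or "the dog sat", which is the stochastic part of "stochastic simulation".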
[原文] [Karpathy]: um and um this simulator of course is is just is basically there's some weights involved and we fit it to all of text that we have on the internet and so on and you end up with this kind of a simulator and because it is trained on humans it's got this emergent psychology that is humanlike
[译文] [Karpathy]: 嗯,厄,这个模拟器当然只是,基本上涉及一些权重,我们将它拟合到我们在互联网上拥有的所有文本等等,最终你得到了这样一种模拟器。因为它是基于人类数据训练的,所以它拥有这种类人的涌现心理(emergent psychology)。
[原文] [Karpathy]: so the first thing you'll notice is of course uh LLM have encyclopedic knowledge and memory uh and they can remember lots of things a lot more than any single individual human can because they read so many things
[译文] [Karpathy]: 所以你会注意到的第一件事当然是,厄,LLM 拥有百科全书式的知识和记忆,厄,它们能记住很多东西,比任何单个人类个体都要多得多,因为它们读了太多的东西。
[原文] [Karpathy]: it's it actually kind of reminds me of this movie Rain Man which I actually really recommend people watch it's an amazing movie i love this movie um and Dustin Hoffman here is an autistic savant who has almost perfect memory
[译文] [Karpathy]: 这实际上让我想起了电影《雨人》(Rain Man),我真的推荐大家去看,这是一部很棒的电影,我爱这部电影。厄,达斯汀·霍夫曼在里面饰演一位患有自闭症的学者症候群患者(autistic savant),他拥有近乎完美的记忆力。
[原文] [Karpathy]: so he can read a he can read like a phone book and remember all of the names and phone numbers and I kind of feel like LLMs are kind of like very similar they can remember SHA hashes and lots of different kinds of things very very easily
[译文] [Karpathy]: 所以他可以读一本电话簿,然后记住所有的名字和电话号码。我感觉 LLM 在某种程度上非常相似,它们可以非常非常容易地记住 SHA 哈希值和许多不同类型的东西。
[原文] [Karpathy]: so they certainly have superpowers in some set in some respects but they also have a bunch of I would say cognitive deficits so they hallucinate quite a bit um and they kind of make up stuff and don't have a very good uh sort of internal model of self-knowledge not sufficient at least
[译文] [Karpathy]: 所以它们当然在某些方面拥有超能力,但它们也有一堆我会称之为“认知缺陷”的问题。所以它们会经常产生幻觉(hallucinate),嗯,它们会编造东西,并且没有一个非常好的、厄,某种内在的自我认知模型,至少是不够充分的。
[原文] [Karpathy]: and this has gotten better but not perfect they display jagged intelligence so they're going to be superhuman in some problems solving domains and then they're going to make mistakes that basically no human will make
[译文] [Karpathy]: 虽然这一点已经有所改善但还不完美。它们表现出“参差不齐的智力”(jagged intelligence),所以它们在某些解决问题的领域会是超人类的,然后它们又会犯下基本上没有人类会犯的错误。
[原文] [Karpathy]: like you know they will insist that 9.11 is greater than 9.9 or that there are two Rs in strawberry these are some famous examples but basically there are rough edges that you can trip on so that's kind of I think also kind of unique
[译文] [Karpathy]: 比如你知道,它们会坚持认为 9.11 大于 9.9,或者单词 "strawberry" 里有两个 R,这是一些著名的例子。但基本上可以说有一些粗糙的边缘会让你绊倒,所以我认为这也是挺独特的。
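For reference, both famous examples are trivial in Software 1.0, which is what makes the failures feel so jagged. A minimal check in Python (the helper names `is_greater` and `count_letter` are just for illustration):

```python
from decimal import Decimal


def is_greater(a: str, b: str) -> bool:
    """Compare two decimal numbers given as strings, exactly."""
    return Decimal(a) > Decimal(b)


def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter, ignoring case."""
    return word.lower().count(letter.lower())


print(is_greater("9.11", "9.9"))        # False: 9.11 is smaller than 9.9
print(count_letter("strawberry", "r"))  # 3
```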
[原文] [Karpathy]: um they also kind of suffer from anterograde amnesia um so uh and I think I'm alluding to the fact that if you have a coworker who joins your organization this coworker will over time learn your organization and uh they will understand and gain like a huge amount of context on the organization
[译文] [Karpathy]: 嗯,它们还有点患有“顺行性遗忘症”(anterograde amnesia)。嗯,所以,厄,我想我指的是这样一个事实:如果你有一位同事加入你的组织,这位同事会随着时间的推移了解你的组织,厄,他们会理解并获得关于组织的大量背景信息。
[原文] [Karpathy]: and they go home and they sleep and they consolidate knowledge and they develop expertise over time LLMs don't natively do this and this is not something that has really been solved in the R&D of LLM i think
[译文] [Karpathy]: 然后他们回家睡觉,巩固知识,随着时间推移发展出专业技能。LLM 天生做不到这一点,而且我认为这在 LLM 的研发中还没有真正被解决。
[原文] [Karpathy]: um and so context windows are really kind of like working memory and you have to sort of program the working memory quite directly because they don't just kind of like get smarter by uh by default and I think a lot of people get tripped up by the analogies uh in this way
[译文] [Karpathy]: 嗯,所以上下文窗口(context windows)真的有点像是工作记忆(working memory),你必须相当直接地对工作记忆进行编程,因为它们不会默认就变得更聪明,厄,我认为很多人在这些类比上被误导了。
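"Programming the working memory" in practice often means deciding what fits in the context window on every call. The sketch below keeps a system instruction plus as many of the most recent messages as fit in a token budget. The whitespace-based `count_tokens` is a crude stand-in for a real tokenizer, and `fit_to_context` is a hypothetical helper, not part of any library.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word (an assumption)."""
    return len(text.split())


def fit_to_context(system: str, history: List[str], budget: int) -> List[str]:
    """Keep the system message, then the newest messages that still fit the budget."""
    remaining = budget - count_tokens(system)
    kept: List[str] = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > remaining:
            break  # everything older than this is dropped ("forgotten")
        kept.append(message)
        remaining -= cost
    return [system] + kept[::-1]  # restore chronological order


history = ["a b c", "d e", "f"]
print(fit_to_context("be brief", history, budget=5))  # ['be brief', 'd e', 'f']
```

Anything that falls outside the budget is simply gone on the next call, which is the amnesia being described: nothing persists unless the application puts it back into the window.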
[原文] [Karpathy]: uh in popular culture I recommend people watch these two movies uh Memento and 50 First Dates in both of these movies the protagonists their weights are fixed and their context windows gets wiped every single morning and it's really problematic to go to work or have relationships when this happens and this happens to LLMs all the time i guess
[译文] [Karpathy]: 厄,在流行文化中,我推荐大家看这两部电影:厄,《记忆碎片》(Memento)和《初恋 50 次》(50 First Dates)。在这两部电影中,主角们的“权重”是固定的,而他们的“上下文窗口”每天早上都会被擦除。当这种情况发生时,去工作或维持人际关系是非常成问题的,而这种事(对 LLM 来说)一直在发生。
[原文] [Karpathy]: one more thing I would point to is security kind of related limitations of the use of LLM so for example LLMs are quite gullible uh they are susceptible to prompt injection risks they might leak your data etc and so um and there's many other considerations uh security related
[译文] [Karpathy]: 我想指出的还有一点是关于 LLM 使用中与安全相关的限制。例如,LLM 非常易受骗(gullible),厄,它们容易受到提示词注入(prompt injection)风险的影响,它们可能会泄露你的数据等等,所以,嗯,还有很多其他与安全相关的考量。
[原文] [Karpathy]: so so basically long story short you have to load your you have to load your you have to simultaneously think through this superhuman thing that has a bunch of cognitive deficits and issues how do we and yet they are extremely like useful and so how do we program them and how do we work around their deficits and enjoy their superhuman powers
[译文] [Karpathy]: 所以,基本上长话短说,你必须,你必须同时思考这个拥有超人能力却又有一堆认知缺陷和问题的家伙。我们该如何——尽管它们非常有用——所以我们该如何给它们编程?我们该如何绕过它们的缺陷并享受它们的超能力?
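📝 Section summary:
In this section, Karpathy discusses what he sees as the best pattern for LLM applications: "partial autonomy apps". Using the code editor Cursor and the search engine Perplexity as examples, he points out that good AI apps should offer context management, orchestration of multiple models, and an application-specific GUI (graphical user interface). He stresses the importance of the GUI in the generate-verify loop, because humans review a visual diff far more efficiently than they read text. To cope with the unreliability of AI, he proposes the "autonomy slider": depending on task complexity, users should be able to choose between simple autocomplete and fully autonomous agentic operation, always keeping the AI "on a leash" so it does not get out of control.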
[原文] [Karpathy]: so what I want to switch to now is talk about the opportunities of how do we use these models and what are some of the biggest opportunities this is not a comprehensive list just some of the things that I thought were interesting for this talk
[译文] [Karpathy]: 所以我现在想转换话题,谈谈我们如何使用这些模型的机会,以及最大的机会有哪些。这并不是一份详尽的清单,只是我认为在这个演讲中值得探讨的一些有趣的事情。
[原文] [Karpathy]: the first thing I'm kind of excited about is what I would call partial autonomy apps
[译文] [Karpathy]: 我感到兴奋的第一件事,是我称之为“半自主应用”的东西。
[原文] [Karpathy]: so for example let's work with the example of coding you can certainly go to ChatGPT directly and you can start copy pasting code around and copy pasting bug reports and stuff around and getting code and copy pasting everything around
[译文] [Karpathy]: 举个例子,让我们以编程为例。你当然可以直接去用 ChatGPT,你可以开始到处复制粘贴代码,复制错误报告之类的东西,获取代码然后再把所有东西粘贴回去。
[原文] [Karpathy]: why would you why would you do that why would you go directly to the operating system it makes a lot more sense to have an app dedicated for this
[译文] [Karpathy]: 你为什么要,你为什么要那样做?你为什么要直接去操作“操作系统”呢?拥有一个专门为此设计的 App 会更有意义。
[原文] [Karpathy]: and so I think many of you uh use uh Cursor i do as well and uh Cursor is kind of like the thing you want instead you don't want to just directly go to ChatGPT and I think Cursor is a very good example of an early LLM app that has a bunch of properties that I think are um useful across all the LLM apps
[译文] [Karpathy]: 所以我想你们中的许多人,厄,都在用 Cursor,我也在用。厄,Cursor 有点像就是你想要的那种替代品,你不想只是直接去用 ChatGPT。我认为 Cursor 是早期 LLM 应用的一个非常好的例子,它具备一系列我认为在所有 LLM 应用中都,嗯,非常有用的属性。
[Original] [Karpathy]: so in particular you will notice that we have a traditional interface that allows a human to go in and do all the work manually just as before but in addition to that we now have this LLM integration that allows us to go in bigger chunks
[Translation] [Karpathy]: In particular, you will notice that we have a traditional interface that lets a human go in and do all the work manually, just as before. In addition, we now have this LLM integration that lets us work in bigger chunks.
[Original] [Karpathy]: and so some of the properties of LLM apps that I think are shared and useful to point out number one the LLMs basically do a ton of the context management
[Translation] [Karpathy]: Here are some properties of LLM apps that I think are shared and worth pointing out. Number one, the LLMs basically do a ton of the context management.
[Original] [Karpathy]: um number two they orchestrate multiple calls to LLMs right so in the case of Cursor there's under the hood embedding models for all your files the actual chat models models that apply diffs to the code and this is all orchestrated for you
[Translation] [Karpathy]: Number two, they orchestrate multiple calls to LLMs. In the case of Cursor, under the hood there are embedding models for all your files, the actual chat models, and models that apply diffs to the code, and this is all orchestrated for you.
[Original] [Karpathy]: a really big one that uh I think also maybe not fully appreciated always is application specific uh GUI and the importance of it
[Translation] [Karpathy]: A really big one, which I think is not always fully appreciated, is the application-specific GUI and how important it is.
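[Original] [Karpathy]: um because you don't just want to talk to the operating system directly in text text is very hard to read interpret understand and also like you don't want to take some of these actions natively in text
[Translation] [Karpathy]: Because you don't just want to talk to the operating system directly in text. Text is very hard to read, interpret, and understand, and there are also actions you don't want to carry out natively in text.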
[Original] [Karpathy]: so it's much better to just see a diff as like red and green change and you can see what's being added and subtracted it's much easier to just do Command+Y to accept or Command+N to reject I shouldn't have to type it in text right
[Translation] [Karpathy]: So it's much better to just see a diff as a red and green change, where you can see what is being added and what is being removed. It's much easier to press Command+Y to accept or Command+N to reject. I shouldn't have to type that in text, right?
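To make the accept/reject idea concrete, here is a minimal sketch using Python's standard `difflib`. It is not how Cursor works internally; the helper names `render_diff` and `apply_decision` are hypothetical and exist only for illustration. The point is that the proposed change is shown as removed (`-`) and added (`+`) lines, and the human makes a single yes/no decision.

```python
import difflib


def render_diff(before: str, after: str) -> list[str]:
    """Return unified-diff lines: '-' marks removed lines, '+' marks added lines."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


def apply_decision(before: str, after: str, accept: bool) -> str:
    """Keep the proposed text only if the human reviewer accepted it."""
    return after if accept else before


before = "def add(a, b):\n    return a - b\n"
after = "def add(a, b):\n    return a + b\n"

diff = render_diff(before, after)
for line in diff:
    print(line)  # a GUI would colour '-' lines red and '+' lines green

result = apply_decision(before, after, accept=True)
```

A real editor would render the same information graphically and bind the decision to a keystroke; the data flow (proposal, visual diff, one-bit decision) stays the same.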
[Original] [Karpathy]: so a GUI allows a human to audit the work of these fallible systems and to go faster I'm going to come back to this point a little bit uh later as well
[Translation] [Karpathy]: So a GUI allows a human to audit the work of these fallible systems and to go faster. I'm going to come back to this point a little later as well.
[Original] [Karpathy]: and the last kind of feature I want to point out is that there's what I call the autonomy slider so for example in Cursor you can just do tab completion you're mostly in charge
[Translation] [Karpathy]: The last feature I want to point out is what I call the autonomy slider. For example, in Cursor you can just use tab completion, where you are mostly in charge.
[Original] [Karpathy]: you can select a chunk of code and Command+K to change just that chunk of code you can do Command+L to change the entire file or you can do Command+I which just you know let it rip do whatever you want in the entire repo and that's the sort of full autonomy agent agentic version
[Translation] [Karpathy]: You can select a chunk of code and press Command+K to change just that chunk. You can press Command+L to change the entire file. Or you can press Command+I, which just lets it rip and do whatever it wants across the entire repo; that's the full-autonomy, agentic version.
[Original] [Karpathy]: and so you are in charge of the autonomy slider and depending on the complexity of the task at hand you can uh tune the amount of autonomy that you're willing to give up uh for that task
[Translation] [Karpathy]: So you are in charge of the autonomy slider, and depending on the complexity of the task at hand you can tune how much autonomy you are willing to give up for that task.
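The slider can be pictured as an ordered set of levels, each widening the scope the model is allowed to touch. The sketch below is an illustration under assumptions: the `Autonomy` enum and `may_edit` function are hypothetical names, and the four levels simply mirror the four Cursor modes mentioned in the talk (tab completion, selected chunk, whole file, whole repo).

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """Hypothetical slider positions, ordered from least to most autonomous."""

    TAB_COMPLETION = 0  # model only suggests; the human writes and accepts
    EDIT_SELECTION = 1  # model may rewrite the selected chunk
    EDIT_FILE = 2       # model may rewrite the current file
    AGENT_REPO = 3      # model may change anything in the repository


def may_edit(level: Autonomy, *, same_file: bool, inside_selection: bool) -> bool:
    """Decide whether the model may directly modify a given location."""
    if level is Autonomy.TAB_COMPLETION:
        return False  # suggestions only, never a direct edit
    if level is Autonomy.EDIT_SELECTION:
        return same_file and inside_selection
    if level is Autonomy.EDIT_FILE:
        return same_file
    return True  # AGENT_REPO: no scope restriction


# The user picks the level per task; a risky refactor might stay at EDIT_SELECTION.
level = Autonomy.EDIT_SELECTION
print(may_edit(level, same_file=True, inside_selection=True))
print(may_edit(level, same_file=True, inside_selection=False))
```

Because the levels are ordered, a product can start users at the low end and raise the default as the underlying model earns trust.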
[Original] [Karpathy]: maybe to show one more example of a fairly successful LLM app uh Perplexity um it also has very similar features to what I've just pointed out to in Cursor
[Translation] [Karpathy]: To show one more example of a fairly successful LLM app: Perplexity. It has very similar features to the ones I just pointed out in Cursor.
[Original] [Karpathy]: uh it packages up a lot of the information it orchestrates multiple LLMs it's got a GUI that allows you to audit some of its work so for example it will cite sources and you can imagine inspecting them
[Translation] [Karpathy]: It packages up a lot of information, it orchestrates multiple LLMs, and it has a GUI that lets you audit some of its work. For example, it cites sources, and you can imagine inspecting them.
[Original] [Karpathy]: and it's got an autonomy slider you can either just do a quick search or you can do research or you can do deep research and come back 10 minutes later so this is all just varying levels of autonomy that you give up to the tool
[Translation] [Karpathy]: And it has an autonomy slider: you can do a quick search, you can do research, or you can do deep research and come back 10 minutes later. These are all just varying levels of autonomy that you hand over to the tool.
[Original] [Karpathy]: so I guess my question is I feel like a lot of software will become partially autonomous I'm trying to think through like what does that look like and for many of you who maintain products and services how are you going to make your products and services partially autonomous
[Translation] [Karpathy]: So I guess my question is this: I feel like a lot of software will become partially autonomous, and I'm trying to think through what that looks like. For the many of you who maintain products and services, how are you going to make them partially autonomous?
[Original] [Karpathy]: can an LLM see everything that a human can see can an LLM act in all the ways that a human could act and can humans supervise and stay in the loop of this activity because again these are fallible systems that aren't yet perfect
[Translation] [Karpathy]: Can an LLM see everything that a human can see? Can an LLM act in all the ways that a human could act? And can humans supervise and stay in the loop of this activity? Because, again, these are fallible systems that aren't yet perfect.
[Original] [Karpathy]: and what does a diff look like in Photoshop or something like that you know and also a lot of the traditional software right now it has all these switches and all this kind of stuff that's all designed for human all of this has to change and become accessible to LLMs
[Translation] [Karpathy]: And what does a diff look like in Photoshop, or something like that? Also, a lot of traditional software today has all these switches and controls that are designed for humans. All of this has to change and become accessible to LLMs.
[Original] [Karpathy]: so one thing I want to stress with a lot of these LLM apps that I'm not sure gets as much attention as it should is um we we're now kind of like cooperating with AIs and usually they are doing the generation and we as humans are doing the verification
[Translation] [Karpathy]: One thing I want to stress about these LLM apps, which I'm not sure gets as much attention as it should, is that we are now cooperating with AIs. Usually they do the generation and we, as humans, do the verification.
[Original] [Karpathy]: it is in our interest to make this loop go as fast as possible so we're getting a lot of work done there are two major ways that I think uh this can be done
[Translation] [Karpathy]: It is in our interest to make this loop go as fast as possible so that we get a lot of work done. I think there are two major ways to do that.
[Original] [Karpathy]: number one you can speed up verification a lot um and I think GUIs for example are extremely important to this because a GUI utilizes your computer vision GPU in all of our head reading text is effortful and it's not fun but looking at stuff is fun and it's it's just a kind of like a highway to your brain
[Translation] [Karpathy]: Number one, you can speed up verification a lot. I think GUIs are extremely important here, because a GUI uses the "computer vision GPU" that all of us have in our heads. Reading text is effortful and not fun, but looking at things is fun; it's a kind of highway to your brain.
[Original] [Karpathy]: so I think GUIs are very useful for auditing systems and visual representations in general and number two I would say is we have to keep the AI on the leash we I think a lot of people are getting way over excited with AI agents
[Translation] [Karpathy]: So I think GUIs, and visual representations in general, are very useful for auditing systems. Number two, I would say we have to keep the AI on the leash. I think a lot of people are getting way too excited about AI agents.
[Original] [Karpathy]: and uh it's not useful to me to get a diff of 10,000 lines of code to my repo like I have to I'm still the bottleneck right even though that 10,000 lines come out instantly I have to make sure that this thing is not introducing bugs it's just like and that it's doing the correct thing right and that there's no security issues and so on
[Translation] [Karpathy]: It's not useful to me to get a 10,000-line diff to my repo. I'm still the bottleneck, right? Even though those 10,000 lines come out instantly, I have to make sure this thing is not introducing bugs, that it's doing the correct thing, that there are no security issues, and so on.
[Original] [Karpathy]: so um I think that um yeah basically you we have to sort of like it's in our interest to make the the flow of these two go very very fast and we have to somehow keep the AI on the leash because it gets way too overreactive
[Translation] [Karpathy]: So basically, it's in our interest to make the flow between these two steps (generation and verification) go very, very fast, and we have to somehow keep the AI on the leash, because it gets way too overreactive.
[Original] [Karpathy]: it's uh it's kind of like this this is how I feel when I do AI assisted coding if I'm just vibe coding everything is nice and great but if I'm actually trying to get work done it's not so great to have an overreactive uh agent doing all this kind of stuff
[Translation] [Karpathy]: This is how I feel when I do AI-assisted coding. If I'm just vibe coding, everything is nice and great. But if I'm actually trying to get work done, it's not so great to have an overreactive agent doing all this kind of stuff.
[Original] [Karpathy]: so this slide is not very good I'm sorry but I guess I'm trying to develop like many of you some ways of utilizing these agents in my coding workflow and to do AI assisted coding and in my own work I'm always scared to get way too big diffs
[Translation] [Karpathy]: This slide is not very good, I'm sorry. But like many of you, I'm trying to develop ways of using these agents in my coding workflow and doing AI-assisted coding. In my own work, I'm always scared of getting diffs that are way too big.
[Original] [Karpathy]: I always go in small incremental chunks I want to make sure that everything is good I want to spin this loop very very fast and um I sort of work on small chunks of single concrete thing uh and so I think many of you probably are developing similar ways of working with the with LLMs
[Translation] [Karpathy]: I always go in small incremental chunks. I want to make sure that everything is good, I want to spin this loop very, very fast, and I work on small chunks of a single concrete thing. I think many of you are probably developing similar ways of working with LLMs.
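One way to enforce "small incremental chunks" mechanically is to measure how many lines a proposed change touches and refuse to review anything over a budget. This is a sketch under assumptions, not a technique described in the talk: the function names and the default budget of 50 lines are hypothetical, and the line count comes from the standard library's `difflib.SequenceMatcher`.

```python
import difflib


def changed_line_count(before: str, after: str) -> int:
    """Count lines removed from `before` plus lines added in `after`."""
    a, b = before.splitlines(), after.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += (i2 - i1) + (j2 - j1)
    return changed


def within_review_budget(before: str, after: str, max_changed_lines: int = 50) -> bool:
    """True if the change is small enough for a human to verify quickly."""
    return changed_line_count(before, after) <= max_changed_lines


before = "a\nb\nc\n"
after = "a\nB\nc\n"
print(changed_line_count(before, after))
print(within_review_budget(before, after, max_changed_lines=1))
```

If a proposal exceeds the budget, the natural response is to ask the model for a narrower change rather than to review the large one.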
[Original] [Karpathy]: I also saw a number of blog posts that try to develop these best practices for working with LLMs and here's one that I read recently and I thought was quite good and it kind of discussed some techniques and some of them have to do with how you keep the AI on the leash
[Translation] [Karpathy]: I have also seen a number of blog posts that try to develop best practices for working with LLMs. Here is one I read recently that I thought was quite good. It discusses some techniques, and some of them have to do with how you keep the AI on the leash.
[Original] [Karpathy]: and so as an example if you are prompting if your prompt is vague then uh the AI might not do exactly what you wanted and in that case verification will fail you're going to ask for something else if a verification fails then you're going to start spinning
[Translation] [Karpathy]: As an example, if your prompt is vague, the AI might not do exactly what you wanted, and in that case verification will fail. You're going to ask for something else, and if verification fails again, you're going to start spinning.
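[Original] [Karpathy]: so it makes a lot more sense to spend a bit more time to be more concrete in your prompts which increases the probability of successful verification and you can move forward and so I think a lot of us are going to end up finding um kind of techniques like this
[Translation] [Karpathy]: So it makes a lot more sense to spend a bit more time being concrete in your prompts, which increases the probability of successful verification so that you can move forward. I think a lot of us are going to end up finding techniques like this.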
[Original] [Karpathy]: I think in my own work as well I'm currently interested in uh what education looks like in um together with kind of like now that we have AI uh and LLMs what does education look like and I think a a large amount of thought for me goes into how we keep AI on the leash
[Translation] [Karpathy]: In my own work as well, I'm currently interested in what education looks like now that we have AI and LLMs. A large amount of my thinking goes into how we keep the AI on the leash.
[Original] [Karpathy]: I don't think it just works to go to chat and be like "Hey teach me physics." I don't think this works because the AI is like gets lost in the woods and so for me this is actually two separate apps for example
[Translation] [Karpathy]: I don't think it works to just go to a chat interface and say, "Hey, teach me physics." I don't think this works, because the AI gets lost in the woods. So for me this is actually two separate apps.
[Original] [Karpathy]: there's an app for a teacher that creates courses and then there's an app that takes courses and serves them to students and in both cases we now have this intermediate artifact of a course that is auditable and we can make sure it's good we can make sure it's consistent and the AI is kept on the leash with respect to a certain syllabus a certain like um progression of projects and so on
[Translation] [Karpathy]: There is an app for the teacher that creates courses, and then there is an app that takes courses and serves them to students. In both cases we now have an intermediate artifact, the course, which is auditable. We can make sure it's good, we can make sure it's consistent, and the AI is kept on the leash with respect to a certain syllabus, a certain progression of projects, and so on.
[Original] [Karpathy]: and so this is one way of keeping the AI on leash and I think has a much higher likelihood of working and the AI is not getting lost in the woods
[Translation] [Karpathy]: So this is one way of keeping the AI on the leash. I think it has a much higher likelihood of working, and the AI doesn't get lost in the woods.
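📝 Section summary:
In this section, Karpathy draws on his years working on Autopilot at Tesla to offer a sober view of the current hype around AI agents. He recalls the "perfect" drive he experienced in a Waymo self-driving car in 2013, which convinced him at the time that full self-driving was imminent; 12 years later, it is still an unsolved problem. On that basis he considers the claim that "2025 is the year of agents" too optimistic and expects this to be a decade-long process. He closes with the Iron Man suit analogy: at this stage, rather than trying to build fully autonomous "Iron Man robots", we should build "Iron Man suits" that augment humans, using partial autonomy products and custom GUIs to keep humans in the loop while gradually moving the autonomy slider to the right.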
[Original] [Karpathy]: one more kind of analogy I wanted to sort of allude to is I'm not I'm no stranger to partial autonomy and I kind of worked on this I think for five years at Tesla and this is also a partial autonomy product and shares a lot of the features
[Translation] [Karpathy]: One more analogy I wanted to allude to: I'm no stranger to partial autonomy. I worked on this for, I think, five years at Tesla. That is also a partial autonomy product, and it shares a lot of these features.
[Original] [Karpathy]: like for example right there in the instrument panel is the GUI of the Autopilot so it's showing me what the what the neural network sees and so on and we have the autonomy slider where over the course of my tenure there we did more and more autonomous tasks for the user
[Translation] [Karpathy]: For example, right there in the instrument panel is the GUI of Autopilot, showing me what the neural network sees and so on. And we have the autonomy slider: over the course of my tenure there, we did more and more autonomous tasks for the user.
[Original] [Karpathy]: and maybe the story that I wanted to tell very briefly is uh actually the first time I drove a self-driving vehicle was in 2013 and I had a friend who worked at Waymo and uh he offered to give me a drive around Palo Alto I took this picture using Google Glass at the time and many of you are so young that you might not even know what that is uh
[Translation] [Karpathy]: The story I wanted to tell very briefly is that the first time I rode in a self-driving vehicle was in 2013. I had a friend who worked at Waymo, and he offered to give me a drive around Palo Alto. I took this picture using Google Glass at the time; many of you are so young that you might not even know what that is.
[Original] [Karpathy]: but uh yeah this was like all the rage at the time and we got into this car and we went for about a 30-minute drive around Palo Alto highways uh streets and so on and this drive was perfect there was zero interventions and this was 2013 which is now 12 years ago
[Translation] [Karpathy]: But yes, it was all the rage at the time. We got into this car and went for about a 30-minute drive around Palo Alto: highways, streets, and so on. The drive was perfect, with zero interventions. And this was 2013, which is now 12 years ago.
[Original] [Karpathy]: and it kind of struck me because at the time when I had this perfect drive this perfect demo I felt like wow self-driving is imminent because this just worked this is incredible
[Translation] [Karpathy]: It struck me, because when I had this perfect drive, this perfect demo, I felt like: wow, self-driving is imminent, because this just worked. This is incredible.
[Original] [Karpathy]: um but here we are 12 years later and we are still working on autonomy um we are still working on driving agents and even now we haven't actually like really solved the problem
[Translation] [Karpathy]: But here we are, 12 years later, and we are still working on autonomy. We are still working on driving agents, and even now we haven't really solved the problem.
[Original] [Karpathy]: like you may see Waymos going around and they look driverless but you know there's still a lot of teleoperation and a lot of human in the loop of a lot of this driving so we still haven't even like declared success but I think it's definitely like going to succeed at this point but it just took a long time
[Translation] [Karpathy]: You may see Waymos going around, and they look driverless, but there is still a lot of teleoperation and a lot of human-in-the-loop involvement in much of this driving. So we still haven't even declared success. I think it is definitely going to succeed at this point, but it just took a long time.
[Original] [Karpathy]: and so I think like like this is software is really tricky I think in the same way that driving is tricky and so when I see things like oh 2025 is the year of agents I get very concerned and I kind of feel like you know this is the decade of agents and this is going to be quite some time we need humans in the loop we need to do this carefully this is software let's be serious here
[Translation] [Karpathy]: So I think software is really tricky, in the same way that driving is tricky. When I see things like "2025 is the year of agents", I get very concerned. I feel like this is the decade of agents, and it's going to take quite some time. We need humans in the loop, and we need to do this carefully. This is software; let's be serious here.
[Original] [Karpathy]: one more kind of analogy that I always think through is the Iron Man suit uh I think this is I always love Iron Man I think it's like so um correct in a bunch of ways with respect to technology and how it will play out
[Translation] [Karpathy]: One more analogy that I always think through is the Iron Man suit. I have always loved Iron Man. I think it's correct in a bunch of ways about technology and how it will play out.
[Original] [Karpathy]: and what I love about the Iron Man suit is that it's both an augmentation and Tony Stark can drive it and it's also an agent and in some of the movies the Iron Man suit is quite autonomous and can fly around and find Tony and all this kind of stuff
[Translation] [Karpathy]: What I love about the Iron Man suit is that it's both an augmentation, which Tony Stark can drive, and an agent. In some of the movies the suit is quite autonomous: it can fly around, find Tony, and all this kind of stuff.
[Original] [Karpathy]: and so this is the autonomy slider is we can be we can build augmentations or we can build agents and we kind of want to do a bit of both
[Translation] [Karpathy]: So this is the autonomy slider: we can build augmentations or we can build agents, and we kind of want to do a bit of both.
[Original] [Karpathy]: but at this stage I would say working with fallible LLMs and so on I would say you know it's less Iron Man robots and more Iron Man suits that you want to build it's less like building flashy demos of autonomous agents and more building partial autonomy products
[Translation] [Karpathy]: But at this stage, working with fallible LLMs, I would say it's less Iron Man robots and more Iron Man suits that you want to build. It's less about building flashy demos of autonomous agents and more about building partial autonomy products.
[Original] [Karpathy]: and these products have custom GUIs and UI/UX and we're trying to um and this is done so that the generation verification loop of the human is very very fast but we are not losing the sight of the fact that it is in principle possible to automate this work
[Translation] [Karpathy]: These products have custom GUIs and UI/UX, and this is done so that the human's generation-verification loop is very, very fast. But we are not losing sight of the fact that it is, in principle, possible to automate this work.
[Original] [Karpathy]: and there should be an autonomy slider in your product and you should be thinking about how you can slide that autonomy slider and make your product uh sort of um more autonomous over time but this is kind of how I think there's lots of opportunities in these kinds of products
[Translation] [Karpathy]: There should be an autonomy slider in your product, and you should be thinking about how you can slide it and make your product more autonomous over time. That is how I think about it; there are lots of opportunities in these kinds of products.
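📝 Section summary:
In this section, Karpathy discusses how drastically the barrier to programming has dropped. With English as the new natural-language programming interface, everyone effectively becomes a programmer. He introduces "vibe coding", the term he accidentally popularized: not only professionals but even children can build applications through natural language. He shares his experience vibe coding an iOS app and the MenuGen app. Writing the core code was remarkably easy (he did not even need to know Swift), but making the app real (payments, authentication, DevOps) remained a painful slog, which sets up the next section on infrastructure.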
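[Original] [Karpathy]: I want to now switch gears a little bit and talk about one other dimension that I think is very unique not only is there a new type of programming language that allows for autonomy in software but also as I mentioned it's programmed in English which is this natural interface
[Translation] [Karpathy]: I now want to switch gears a little and talk about one other dimension that I think is very unique. Not only is there a new type of programming language that allows for autonomy in software, but, as I mentioned, it's programmed in English, which is this natural interface.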
[Original] [Karpathy]: and suddenly everyone is a programmer because everyone speaks natural language like English so this is extremely bullish and very interesting to me and also completely unprecedented I would say it it used to be the case that you need to spend five to 10 years studying something to be able to do something in software this is not the case anymore
[Translation] [Karpathy]: Suddenly everyone is a programmer, because everyone speaks a natural language like English. This makes me extremely bullish, I find it very interesting, and I would say it's completely unprecedented. It used to be that you needed to spend five to 10 years studying to be able to do something in software. That is no longer the case.
[Original] [Karpathy]: so I don't know if by any chance anyone has heard of vibe coding uh this this is the tweet that kind of like introduced this but I'm told that this is now like a major meme
[Translation] [Karpathy]: I don't know if by any chance anyone has heard of vibe coding. This is the tweet that introduced it, and I'm told it is now a major meme.
[Original] [Karpathy]: um fun story about this is that I've been on Twitter for like 15 years or something like that at this point and I still have no clue which tweet will become viral and which tweet like fizzles and no one cares and I thought that this tweet was going to be the latter I don't know it was just like a shower of thoughts
[Translation] [Karpathy]: The fun story here is that I've been on Twitter for something like 15 years at this point, and I still have no clue which tweet will go viral and which will fizzle with no one caring. I thought this tweet was going to be the latter. I don't know, it was just a shower thought.
[Original] [Karpathy]: but this became like a total meme and I really just can't tell but I guess like it struck a chord and it gave a name to something that everyone was feeling but couldn't quite say in words so now there's a Wikipedia page and everything this is like yeah this is like a major contribution now or something like that
[Translation] [Karpathy]: But it became a total meme, and I really just can't tell. I guess it struck a chord and gave a name to something that everyone was feeling but couldn't quite put into words. So now there's a Wikipedia page and everything. This is like a major contribution now, or something like that.
[Original] [Karpathy]: so um so Tom Wolf from Hugging Face shared this beautiful video that I really love um these are kids vibe coding and I find that this is such a wholesome video like I love this video like how can you look at this video and feel bad about the future the future is great
[Translation] [Karpathy]: Tom Wolf from Hugging Face shared this beautiful video that I really love. These are kids vibe coding, and I find it such a wholesome video. I love this video. How can you look at it and feel bad about the future? The future is great.
[Original] [Karpathy]: I think this will end up being like a gateway drug to software development um I'm not a doomer about the future of the generation and I think yeah I love this video
[Translation] [Karpathy]: I think this will end up being a gateway drug to software development. I'm not a doomer about the future of this generation. And yes, I love this video.
[Original] [Karpathy]: so I tried vibe coding a little bit uh as well because it's so fun uh so vibe coding is so great when you want to build something super duper custom that doesn't appear to exist and you just want to wing it because it's a Saturday or something like that
[Translation] [Karpathy]: So I tried vibe coding a little bit as well, because it's so fun. Vibe coding is great when you want to build something super custom that doesn't appear to exist, and you just want to wing it because it's a Saturday or something like that.
[Original] [Karpathy]: so I built this uh iOS app and I don't I can't actually program in Swift but I was really shocked that I was able to build like a super basic app and I'm not going to explain it it's really uh dumb but uh I kind of like this was just like a day of work and this was running on my phone like later that day and I was like "Wow this is amazing."
[Translation] [Karpathy]: So I built this iOS app. I can't actually program in Swift, but I was really shocked that I was able to build a super basic app. I'm not going to explain it, it's really dumb. But this was just a day of work, and it was running on my phone later that day, and I was like, "Wow, this is amazing."
[Original] [Karpathy]: I didn't have to like read through Swift for like five days or something like that to like get started I also vibe coded this app called MenuGen and this is live you can try it in menugen.app and I basically had this problem where I show up at a restaurant I read through the menu and I have no idea what any of the things are and I need pictures
[Translation] [Karpathy]: I didn't have to read through Swift documentation for five days or so just to get started. I also vibe coded an app called MenuGen. It's live, and you can try it at menugen.app. Basically I had this problem: I show up at a restaurant, I read through the menu, I have no idea what any of the things are, and I need pictures.
[Original] [Karpathy]: so this doesn't exist so I was like "Hey I'm going to vibe code it." So um this is what it looks like you go to menugen.app um and uh you take a picture of a of a menu and then MenuGen generates the images and everyone gets $5 in credits for free when you sign up and therefore this is a major cost center in my life so this is a negative negative uh revenue app for me right now
[Translation] [Karpathy]: This didn't exist, so I was like, "Hey, I'm going to vibe code it." This is what it looks like: you go to menugen.app, you take a picture of a menu, and MenuGen generates the images. Everyone gets $5 in free credits when they sign up, so this is a major cost center in my life. It's a negative-revenue app for me right now.
[Original] [Karpathy]: I've lost a huge amount of money on MenuGen okay but the fascinating thing about MenuGen for me is that the code of the the vibe coding part the code was actually the easy part of of vibe coding MenuGen
[Translation] [Karpathy]: I've lost a huge amount of money on MenuGen. But the fascinating thing about MenuGen for me is that the code, the vibe coding part, was actually the easy part of vibe coding MenuGen.
[Original] [Karpathy]: and most of it actually was when I tried to make it real so that you can actually have authentication and payments and the domain name and a Vercel deployment this was really hard and all of this was not code all of this devops stuff was me in the browser clicking stuff and this was extreme slog and took another week
[Translation] [Karpathy]: Most of the work actually came when I tried to make it real, so that you have authentication, payments, a domain name, and a Vercel deployment. This was really hard, and none of it was code. All of this DevOps stuff was me in the browser clicking things. It was an extreme slog and took another week.
[Original] [Karpathy]: so it was really fascinating that I had the MenuGen um basically demo working on my laptop in a few hours and then it took me a week because I was trying to make it real and the reason for this is this was just really annoying
[Translation] [Karpathy]: So it was really fascinating that I had the MenuGen demo working on my laptop in a few hours, and then it took me a week because I was trying to make it real. The reason is that this part was just really annoying.
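📝 Section summary:
In this section, Karpathy discusses how to build suitable infrastructure for a new class of consumer on the internet: AI agents. He notes that today's web is designed for humans (through GUIs) or for computers (through APIs), while agents, as "people spirits" in the digital space, need their own mode of interaction. He advocates "meeting LLMs halfway". For example: placing an llm.txt file on a site (similar to robots.txt) that describes the domain to LLMs in Markdown; replacing manual "click here" instructions in documentation with curl commands an LLM can execute; and using the Model Context Protocol and tools such as gitingest to turn complex codebases and documentation into plain text that LLMs can easily digest, which greatly reduces the friction of AI-assisted programming.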
[Original] [Karpathy]: so I think the last part of my talk therefore focuses on can we just build for agents I don't want to do this work can agents do this thank you
[Translation] [Karpathy]: So the last part of my talk focuses on this: can we just build for agents? I don't want to do this work. Can agents do it? Thank you.
[Original] [Karpathy]: okay so roughly speaking I think there's a new category of consumer and manipulator of digital information it used to be just humans through GUIs or computers through APIs and now we have a completely new thing
[Translation] [Karpathy]: Roughly speaking, I think there is a new category of consumer and manipulator of digital information. It used to be just humans through GUIs or computers through APIs, and now we have a completely new thing.
[Original] [Karpathy]: and agents are they're computers but they are humanlike kind of right they're people spirits there's people spirits on the internet and they need to interact with our software infrastructure like can we build for them it's a new thing
[Translation] [Karpathy]: Agents are computers, but they are kind of humanlike, right? They are people spirits. There are people spirits on the internet, and they need to interact with our software infrastructure. Can we build for them? It's a new thing.
[Original] [Karpathy]: so as an example you can have robots.txt on your domain and you can instruct uh or like advise I suppose um uh web crawlers on how to behave on your website
[Translation] [Karpathy]: As an example, you can have robots.txt on your domain, and you can instruct, or I suppose advise, web crawlers on how to behave on your website.
[Original] [Karpathy]: in the same way you can have maybe llm.txt file which is just a simple markdown that's telling LLMs what this domain is about and this is very readable to a to an LLM
[Translation] [Karpathy]: In the same way, you can maybe have an llm.txt file, which is just simple Markdown telling LLMs what this domain is about. This is very readable to an LLM.
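As an illustration of what such a file could contain, the sketch below renders a small Markdown summary of a site. The layout (a title, a one-line description, and a list of links) is an assumption made for this example, not a format specified in the talk; the function name `render_llm_txt` and the `example.com` URLs are hypothetical.

```python
def render_llm_txt(site_name: str, summary: str, pages: list[tuple[str, str, str]]) -> str:
    """Render a plain-Markdown description of a site for LLM consumption.

    `pages` holds (title, url, description) triples. The layout is an
    illustrative assumption, not an official specification.
    """
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Pages", ""]
    lines += [f"- [{title}]({url}): {desc}" for title, url, desc in pages]
    return "\n".join(lines) + "\n"


text = render_llm_txt(
    "Example Docs",
    "Reference documentation for a hypothetical payments API.",
    [
        ("Quickstart", "https://example.com/quickstart.md", "Create a first charge."),
        ("Errors", "https://example.com/errors.md", "Error codes and retry advice."),
    ],
)
print(text)
```

The file would be served from the domain root next to robots.txt, so that an agent can fetch one short document instead of parsing rendered HTML.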
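[Original] [Karpathy]: if it had to instead get the HTML of your web page and try to parse it this is very error-prone and difficult and will screw it up and it's not going to work so we can just directly speak to the LLM it's worth it
[Translation] [Karpathy]: If it instead had to fetch the HTML of your web page and try to parse it, that is very error-prone and difficult. It will screw it up and it's not going to work. So we can just speak to the LLM directly; it's worth it.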
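[Original] [Karpathy]: um a huge amount of documentation is currently written for people so you will see things like lists and bold and pictures and this is not directly accessible by an LLM
[Translation] [Karpathy]: A huge amount of documentation is currently written for people, so you will see things like lists, bold text, and pictures, and this is not directly accessible to an LLM.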
[Original] [Karpathy]: so I see some of the services now are transitioning a lot of their docs to be specifically for LLMs so Vercel and Stripe as an example are early movers here but there are a few more that I've seen already and they offer their documentation in markdown markdown is super easy for LLMs to understand this is great
[Translation] [Karpathy]: I see some services now transitioning a lot of their docs to be specifically for LLMs. Vercel and Stripe, for example, are early movers here, and I've already seen a few more. They offer their documentation in Markdown, and Markdown is super easy for LLMs to understand. This is great.
[Original] [Karpathy]: um maybe one simple example from from uh my experience as well maybe some of you know 3Blue1Brown he makes beautiful animation videos on YouTube yeah I love this library that he wrote uh Manim
[Translation] [Karpathy]: One simple example from my own experience: maybe some of you know 3Blue1Brown. He makes beautiful animation videos on YouTube. I love this library that he wrote, Manim.
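[Original] [Karpathy]: and I wanted to make my own and uh there's extensive documentations on how to use Manim and so I didn't want to actually read through it so I copy pasted the whole thing to an LLM and I described what I wanted and it just worked out of the box
[Translation] [Karpathy]: I wanted to make my own animation. There is extensive documentation on how to use Manim, and I didn't want to actually read through it, so I copy-pasted the whole thing into an LLM and described what I wanted, and it just worked out of the box.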
[Original] [Karpathy]: like LLM just vibe coded me an animation exactly what I wanted and I was like wow this is amazing so if we can make docs legible to LLMs it's going to unlock a huge amount of um kind of use and um I think this is wonderful and should should happen more
[Translation] [Karpathy]: The LLM just vibe coded me an animation that was exactly what I wanted, and I was like, wow, this is amazing. So if we can make docs legible to LLMs, it's going to unlock a huge amount of use. I think this is wonderful and should happen more.
[Original] [Karpathy]: the other thing I wanted to point out is that you do unfortunately have to it's not just about taking your docs and making them appear in markdown that's the easy part we actually have to change the docs because anytime your docs say click this is bad an LLM will not be able to natively take this action right now
[Translation] [Karpathy]: The other thing I wanted to point out is that, unfortunately, it's not just about taking your docs and making them appear in Markdown. That's the easy part. We actually have to change the docs, because any time your docs say "click", that is bad: an LLM cannot natively take that action right now.
[Original] [Karpathy]: so Vercel for example is replacing every occurrence of click with an equivalent curl command that your LLM agent could take on your behalf
[Translation] [Karpathy]: Vercel, for example, is replacing every occurrence of "click" with an equivalent curl command that your LLM agent could run on your behalf.
[Original] [Karpathy]: um and so I think this is very interesting and then of course there's a Model Context Protocol from Anthropic and this is also another way it's a protocol of speaking directly to agents as this new consumer and manipulator of digital information so I'm very bullish on these ideas
[Translation] [Karpathy]: I think this is very interesting. And then of course there is the Model Context Protocol from Anthropic, which is another way: it's a protocol for speaking directly to agents as this new consumer and manipulator of digital information. So I'm very bullish on these ideas.
[Original] [Karpathy]: the other thing I really like is a number of little tools here and there that are helping ingest data that in like very LLM friendly formats
[Translation] [Karpathy]: The other thing I really like is a number of little tools here and there that help ingest data in very LLM-friendly formats.
[Original] [Karpathy]: so for example when I go to a GitHub repo like my nanoGPT repo I can't feed this to an LLM and ask questions about it uh because it's you know this is a human interface on GitHub
[Translation] [Karpathy]: For example, when I go to a GitHub repo like my nanoGPT repo, I can't feed this to an LLM and ask questions about it, because this is a human interface on GitHub.
[Original] [Karpathy]: so when you just change the URL from github to gitingest then uh this will actually concatenate all the files into a single giant text and it will create a directory structure etc and this is ready to be copy pasted into your favorite LLM and you can do stuff
[Translation] [Karpathy]: When you change the URL from github to gitingest, it concatenates all the files into a single giant text and creates a directory structure and so on. This is ready to be copy-pasted into your favorite LLM, and you can do stuff with it.
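The idea can be sketched locally in a few lines. This is not gitingest's actual implementation: `flatten_repo` is a hypothetical helper that walks a directory, lists the files, and concatenates their contents into one text, and `to_ingest_url` assumes the service lives at the host `gitingest.com`, following the "change github to gitingest" description above.

```python
import tempfile
from pathlib import Path


def to_ingest_url(github_url: str) -> str:
    """Swap the host as described in the talk (assumes the host is gitingest.com)."""
    return github_url.replace("github.com", "gitingest.com", 1)


def flatten_repo(root: Path, suffixes: tuple[str, ...] = (".py", ".md", ".txt")) -> str:
    """Concatenate text files under `root` into one LLM-friendly string."""
    files = sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes)
    tree = "\n".join(p.relative_to(root).as_posix() for p in files)
    parts = [f"Directory structure:\n{tree}\n"]
    for p in files:
        rel = p.relative_to(root).as_posix()
        parts.append(f"\n===== {rel} =====\n{p.read_text(encoding='utf-8')}")
    return "".join(parts)


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "README.md").write_text("# demo\n", encoding="utf-8")
    (demo_root / "src").mkdir()
    (demo_root / "src" / "main.py").write_text("print('hi')\n", encoding="utf-8")
    digest = flatten_repo(demo_root)

print(digest)
print(to_ingest_url("https://github.com/karpathy/nanoGPT"))
```

A production tool would also need to respect ignore files, skip binaries, and cap the total size so the result fits in a model's context window.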
[Original] [Karpathy]: maybe even more dramatic example of this is DeepWiki where it's not just the raw content of these files uh this is from Devin but also like they have Devin basically do analysis of the GitHub repo and Devin basically builds up a whole docs uh pages just for your repo and you can imagine that this is even more helpful to copy paste into your LLM
[Translation] [Karpathy]: Maybe an even more dramatic example of this is DeepWiki, from the makers of Devin, where it's not just the raw content of the files. They have Devin analyze the GitHub repo and build up whole documentation pages just for your repo. You can imagine that this is even more helpful to copy-paste into your LLM.
[Original] [Karpathy]: so I love all the little tools that basically where you just change the URL and it makes something accessible to an LLM so this is all well and great and uh I think there should be a lot more of it
[Translation] [Karpathy]: So I love all the little tools where you just change the URL and it makes something accessible to an LLM. This is all well and great, and I think there should be a lot more of it.
[Original] [Karpathy]: one more note I wanted to make is that it is absolutely possible that in the future LLMs will be able to this is not even future this is today they'll be able to go around and they'll be able to click stuff and so on
[Translation] [Karpathy]: One more note: it is absolutely possible that in the future (this is not even the future, this is today) LLMs will be able to go around and click things and so on.
[Original] [Karpathy]: but I still think it's very worth uh basically meeting LLMs halfway and making it easier for them to access all this information uh because this is still fairly expensive I would say to use and uh a lot more difficult
[Translation] [Karpathy]: But I still think it's very worthwhile to meet LLMs halfway and make it easier for them to access all this information, because having them click around is still fairly expensive, I would say, and a lot more difficult.
[Original] [Karpathy]: and so I do think that lots of software there will be a long tail where it won't like adapt because these are not like live player sort of repositories or digital infrastructure and we will need these tools
[Translation] [Karpathy]: I do think that for lots of software there will be a long tail that won't adapt, because these are not "live player" repositories or pieces of digital infrastructure, and for those we will need these tools.
[Original] [Karpathy]: uh but I think for everyone else I think it's very worth kind of like meeting in some middle point so I'm bullish on both if that makes sense
[Translation] [Karpathy]: But for everyone else, I think it's very worthwhile to meet at some middle point. So I'm bullish on both, if that makes sense.
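📝 Section summary:
At the end of the talk, Karpathy gives an overall summary. He reiterates that now is an excellent time to enter the software industry, because a huge amount of code needs to be rewritten, both by professionals and by the emerging "vibe coders". He restates the core analogies: LLMs are like utilities and like fabs, but most of all like operating systems in their early-1960s stage. He revisits the nature of LLMs as "fallible people spirits" and notes that we need to adjust our infrastructure to them. He closes with the Iron Man suit vision: over the next decade we will see the autonomy slider move from left (human-led) to right (AI-led), and he invites everyone present to build that future together.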
[Original] [Karpathy]: so in summary what an amazing time to get into the industry we need to rewrite a ton of code a ton of code will be written by professionals and by vibe coders
[Translation] [Karpathy]: In summary, what an amazing time to get into the industry. We need to rewrite a ton of code, and a ton of code will be written by professionals and by vibe coders.
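[Original] [Karpathy]: these LLMs are kind of like utilities kind of like fabs but they're kind of especially like operating systems but it's so early it's like 1960s of operating systems and uh and I think a lot of the analogies cross over
[Translation] [Karpathy]: These LLMs are kind of like utilities and kind of like fabs, but they are especially like operating systems. It's so early, though; it's like the 1960s of operating systems. And I think a lot of the analogies carry over.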
[Original] [Karpathy]: um and these LLMs are kind of like these fallible uh you know people spirits that we have to learn to work with and in order to do that properly we need to adjust our infrastructure towards it
[Translation] [Karpathy]: These LLMs are kind of like fallible people spirits that we have to learn to work with, and in order to do that properly we need to adjust our infrastructure to them.
[Original] [Karpathy]: so when you're building these LLM apps I describe some of the ways of working effectively with these LLMs and some of the tools that make that uh kind of possible and how you can spin this loop very very quickly and basically create partial autonomy products
[Translation] [Karpathy]: For when you're building these LLM apps, I described some ways of working effectively with LLMs, some of the tools that make that possible, and how you can spin this loop very, very quickly and create partial autonomy products.
[Original] [Karpathy]: and then um yeah a lot of code has to also be written for the agents more directly but in any case going back to the Iron Man suit analogy I think what we'll see over the next decade roughly is we're going to take the slider from left to right
[Translation] [Karpathy]: And then a lot of code also has to be written more directly for the agents. In any case, going back to the Iron Man suit analogy, I think what we'll see over roughly the next decade is that we're going to take the slider from left to right.
[Original] [Karpathy]: and I'm very interesting it's going to be very interesting to see what that looks like and I can't wait to build it with all of you thank you
[Translation] [Karpathy]: It's going to be very interesting to see what that looks like, and I can't wait to build it with all of you. Thank you.