a16z AI Agent Dev School: Shaw's OKai Deep Dive
- YouTube SD: https://www.youtube.com/watch?v=X1aFEOaGcYE
- Transcript is based on the SD version above
- YouTube HD:
- Part 1: https://www.youtube.com/watch?v=ArptLpQiKfI
- Part 2: https://www.youtube.com/watch?v=AC3h_KzLARo
- Much higher quality, easier to see code. Split into 2 videos and missing a chunk in the middle.
Timestamps​
0:13:40 - JavaScript, Node.js, and the V8 Engine: Origins of JavaScript and how it evolved to run on servers.
0:17:16 - Why TypeScript is Used for OKai: Explanation of type systems and the benefits of using TypeScript over standard JavaScript.
0:25:32 - NPM, PNPM, and the JavaScript Ecosystem: Overview of package managers, the vastness of the JavaScript package ecosystem, and the team's preference for PNPM.
0:37:28 - Installing Node.js and PNPM: Practical steps for setting up a development environment.
0:42:12 - WSL2 (Windows Subsystem for Linux): Benefits of using WSL2 for development on Windows.
0:44:30 - Git, GitHub, and the OKai Repo: Introduction to version control, how to clone a repo, work with branches, and submit pull requests.
1:08:44 - OKai Starter Kit: Simplified way to build agents without modifying the core OKai codebase.
1:17:54 - Creating a Character File: In-depth explanation of character file structure and the various fields for defining an agent's personality.
1:53:15 - Running a Character and Adding it to Discord: Demonstration of loading a character, running it, and integrating it with a Discord server.
2:19:42 - Q&A - General Agent Development Questions: Addressing viewer questions about development environment, character development, and adding knowledge to agents.
2:28:50 - OKai's Core Abstractions (Providers, Actions, Evaluators): Explanation and examples of each core abstraction.
2:47:23 - Deep Dive into Providers: Detailed examples of providers in action, including wallet and trust score providers.
2:55:50 - Deep Dive into Actions: Examples of actions, including the PumpFun action for minting and buying tokens on Solana.
3:01:31 - Actions vs. Tools: Comparison of OKai's "actions" to the "tool" approach used by other agent frameworks.
3:03:35 - Wrap-up, Q&A, and Future Session Topics: Answering final questions, discussing future development plans, and announcing the next session's focus on building an agent that evaluates users and responds accordingly.
Summary​
This is the first live session of a16z AI Agent Dev School hosted by Shaw on Discord. It seems to be geared towards developers of all levels, with the first hour focused on the basics of development for beginners. The session covered a range of topics including:
Part 1: Development Basics​
- JavaScript & Node.js: Shaw provided a historical overview of JavaScript and explained the relationship between JavaScript, Node.js, and the V8 engine.
- TypeScript: Shaw explained the importance of types in programming, why TypeScript is beneficial for JavaScript development, and how it compares to Python's type system.
- Package Managers (NPM & PNPM): Shaw discussed the concepts of package managers, the benefits of the JavaScript package ecosystem, and why the team chose to use PNPM for the OKai project.
- WSL2 (Windows Subsystem for Linux): Shaw recommended using WSL2 for developers using Windows, citing the advantages of a Linux environment for development.
- Git & GitHub: Shaw provided a history and explanation of Git and GitHub, emphasizing the importance of learning Git for developers and demonstrating how to clone the OKai repository and work with branches and pull requests.
- OKai Starter Kit: Shaw introduced a starter kit repository designed to simplify building agents without needing to modify the core OKai codebase.
Part 2: Agent Concepts​
- Character Files: Shaw explained the structure of character files in the OKai framework, detailing the various fields used for defining an agent's persona, including bio, lore, knowledge, message examples, and style.
- Agent Runtime: Shaw discussed the concept of an agent runtime and how it holds all the necessary information for running an agent.
- Clients: Shaw explained the different clients available in OKai, which allow agents to connect to external services like Discord, Telegram, and Twitter.
- Core Abstractions: Shaw introduced the three core abstractions of the OKai framework: providers, actions, and evaluators. He provided examples of each using the Solana plugin and the marketplace of trust feature.
General Notes​
- Shaw strongly emphasized the importance of self-learning for developers, recommending resources like YouTube, Andre Karpathy's Neural Networks Zero to Hero playlist, and Google Machine Learning courses.
- The session was highly interactive, with Shaw frequently addressing questions from participants in the Discord stage and encouraging them to contribute to the project through pull requests and discussion.
- The next session will involve building a bot that evaluates how much it likes a user based on their interaction and responds accordingly, incorporating the use of providers, actions, and evaluators.
Overall, the session was a comprehensive introduction to the OKai framework and agent development, catering to both beginner and more experienced developers. It highlighted the project's collaborative nature and emphasized the importance of continuous learning and exploration within the agent development space.
Hot Takes​
Here are 5 of the hottest takes from the recording, sure to spark some controversy:
-
"I think OpenAI's models are unusable. I don't know about anybody else." (0:30:03-0:30:05) Shaw boldly declares OpenAI's models, like ChatGPT, to be completely unusable, especially for character development, claiming they are too "cringe." This directly contradicts the popular opinion that OpenAI is leading the pack in language model innovation.
-
"Unless you want to have a soul-sucking job, you're never ever going to see Java. You're pretty much going to use Python and JavaScript and stuff like that, or Rust or something." (0:24:43-0:24:51) Shaw dismisses Java as a relevant programming language for aspiring developers, arguing that it primarily leads to undesirable jobs. This statement is bound to stir debate among Java enthusiasts and those who believe it remains a vital language in many industries.
-
"I really recommend like Grok's great. Anthropic, which is Claude, is great...Llama's great. All these options are good and basically lets you run, you know, whatever. I think Gemini kind of sucks." (1:32:50-1:33:02) Shaw unapologetically ranks various language models, favoring Grok, Claude, and Llama, while expressing a clear dislike for Google's Gemini. This candid assessment challenges the perceived dominance of Google in the AI landscape and provides a stark contrast to their heavily marketed Gemini model.
-
"I don't really like tools...I don't think the general agent is really good at like stringing together things to do...The action is more like it makes sure that the entire action happens." (3:00:25 - 3:00:35) Shaw expresses a preference for OKai's "actions" over the "tool" approach used by other agent frameworks like LangChain. He argues that agents struggle with using tools effectively and that actions provide a more robust and streamlined way to execute tasks. This critique of the widely adopted tool-based approach is likely to generate discussion about the optimal methods for agent task execution.
-
"I find it [GPT-4] unusable, unusable, literally the worst possible model. Like TV three was better." (3:10:05-3:10:08) Shaw doubles down on his harsh criticism of OpenAI, this time targeting GPT-4 specifically and claiming that its predecessor, GPT-3, was superior. This strong statement flies in the face of the general excitement surrounding GPT-4's advanced capabilities and is sure to provoke reactions from those who believe it represents a significant leap forward in language models.