AI Agent Dev School Part 2

Building Complex AI Agents with Actions, Providers, & Evaluators

Date: 2024-12-03 YouTube Link: https://www.youtube.com/watch?v=XenGeAcPAQo

Timestamps

00:03:33 - Shift in focus from characters (Dev School Part 1) to agent capabilities

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=213

00:07:09 - Deep dive into providers, actions, and evaluators, the core building blocks of OKai

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=429

00:07:28 - Discussion about actions vs. tools, favoring decoupled intent and action execution

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=448

00:18:02 - Explanation of providers and their function as information sources for agents

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=1082

00:20:15 - Introduction to evaluators and their role in agent reflection and state analysis

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=1215

00:29:22 - Brief overview of clients as connectors to external platforms

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=1762

00:31:02 - Description of adapters and their function in database interactions

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=1862

00:34:02 - Discussion about plugins as bundles of core components, examples, and recommendations

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=2042

00:40:31 - Live Coding Demo begins: Creating a new plugin from scratch (DevSchoolExamplePlugin)

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=2431

00:47:54 - Implementing the simple HelloWorldAction

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=2791

01:00:26 - Implementing the CurrentNewsAction (fetching and formatting news data)

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=3626

01:22:09 - Demonstrating the OKai Client for interacting with agents locally

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=4929

01:23:54 - Q&A: Plugin usage in character files, installation, OKai vs. OKai Starter

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=5034

01:36:17 - Saving agent responses as memories in the database

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=5777

01:43:06 - Using prompts for data extraction within actions

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=6186

01:51:54 - Importance of deleting the database during development to avoid context issues

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=6714

01:57:04 - Viewing agent context via console logs to understand model inputs

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=7024

02:07:07 - Explanation of memory management with knowledge, facts, and lore

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=7627

02:16:53 - Q&A: Prompt engineering opportunities, knowledge chunking and retrieval

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=8213

02:22:57 - Call for contributions: Encouraging viewers to create their own actions and plugins

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=8577

02:26:31 - Closing remarks and future DevSchool session announcements

Link: https://www.youtube.com/watch?v=XenGeAcPAQo&t=8791

Summary

AI Agent Dev School Part 2, Electric Boogaloo

The session focuses on building complex AI agents, with Shaw diving into core abstractions: plugins, providers, actions, and evaluators.

Actions are defined as capabilities that agents can execute, ranging from simple tasks to complex workflows. Providers serve as information sources for agents, similar to context providers in React. Evaluators run after actions, enabling agents to reflect on their state and decisions.

The live coding portion demonstrates creating a "DevSchool" plugin from scratch, starting with a simple "Hello World" action and progressing to a more complex "Current News" action that fetches and formats news articles. Shaw shows how to extract data from conversations using prompts, making actions dynamic.

The session covers memory management, explaining how agents store and recall information through different types of memory:

Knowledge: Information retrievable through search
Lore: Random facts that add variety to responses
Conversation history: Recent interactions and context

Shaw emphasizes the importance of prompt engineering, demonstrating how the structure and order of information significantly impacts agent responses. He shows how to view agent context through console logs to understand model inputs and improve agent behavior.

The session concludes with discussions about knowledge management, retrieval augmented generation (RAG), and future developments in AI agent capabilities, including the possibility of dynamically generating character files.

Hot Takes

OpenAI models are "dumb" due to RLHF and "wokeness" (02:03:00-02:04:07)

"But basically, I've also made them sort of useless by RLHFing. Like, very basic capability, like a haystack test out of them. ... I'm against killing the capability and making models way worse than they are for someone's political agenda. I just don't think that's the world we want to live in."

Shaw here expresses frustration with OpenAI's approach to alignment, arguing that RLHF has diminished the capabilities of their models and that this is due to a "woke" agenda. This take is controversial because it attributes technical limitations to political motivations and ignores the complexities of aligning powerful AI systems.

OpenAI models shouldn't be "telling" developers what they can and can't do (02:03:29-02:03:50)

"OpenAI, if you're listening, please fucking stop telling me how to run models. You don't know as well as I do. I do this every day. You're a fucking engineer who has to go train, like, an LLM. I actually have to use the LLM."

This rant criticizes OpenAI's models for "telling" developers what they can and can't do, arguing that the models are not as knowledgeable as the developers who are actually using them. This take could be seen as dismissive of the role of AI systems in providing helpful feedback and limitations.

Prompt engineering is the "most easy improvement" for AI agents (02:06:09-02:06:27)

"Huge amount of research would go into that... That's where we'll see like the most easy improvement in our agents."

Shaw argues that prompt engineering holds the key to significant improvements in AI agents, stating that it's the "most easy improvement." This take is controversial because it downplays the importance of other areas like model architecture, training data, and algorithm development.

Character files could be generated at runtime, making existing character files obsolete (02:22:05-02:22:53)

"The entire character file could be generated at runtime... The agent's like, I have no idea who I am. And you're like, oh, your name is OKai, and you like berries. OK, cool. I guess I like berries."

This take suggests that character files could be generated at runtime, rendering current character files obsolete. This idea is controversial because it could lead to a more dynamic and unpredictable agent behavior, which could raise concerns about control and reliability.

A "badge" system will reward developers who create custom actions, evaluators, and providers (02:24:45-02:25:49)

"If you want that badge, what I'd like you to do is come to the AI Agent Dev School, make an action, have your agent do something. Those are the kinds of people that I really think we'll want to, you know, keep in our ecosystem and keep busy."

This take suggests a "badge" system to recognize developers who go beyond the basics and create custom components for AI agents. This could be seen as elitist or exclusionary, potentially creating a hierarchy within the AI agent development community.

Timestamps​

Summary​

Hot Takes​

Timestamps

Summary

Hot Takes