Our Takeaways from the Agentic AI Summit
Ananda Rajagopal, Co-founder and Chief Product Officer, Ciroos
With agentic AI among the major technological innovations of 2025 ("the year of agentic AI," in the words of Jensen Huang), it was fitting that a summit on the topic was held at UC Berkeley on Saturday, August 2, 2025. Such conferences provide a venue for information exchange between academia and industry, as well as across different sectors. This blog captures the major themes from the summit that resonated most with us.
Interest in agents is driving innovation at every level of the stack. For example, while chain-of-thought (CoT) and tree-of-thought (ToT) methods already demand high token rates, agents may require even higher rates. Likewise, disaggregated inference is increasing the demand for more memory bandwidth and lower execution latency at the GPU layer.
Ion Stoica, co-founder of Databricks, Anyscale, and LMArena, emphasized the importance of reliability as AI systems evolve, drawing parallels with traditional software engineering. For example, productizing a feature often requires 10 to 50 times more effort than prototyping it. The primary objective during productization is to make the system more robust and reliable. We couldn't agree more! Amidst the hoopla of "vibe everything," it's important to remember that deploying production systems is serious business, and enterprises have a high bar.
In multi-agentic systems, special attention needs to be given to potential failure modes. The Multi-Agent System Failure Taxonomy (MAST) is the first empirically grounded framework designed to systematically understand failures in such systems. It classifies failure modes into three overarching categories (specification issues, inter-agent misalignment, and task verification), providing a structured approach that helps builders improve the reliability of multi-agentic systems.
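In practice, a taxonomy like MAST is useful because it turns trace reviews into structured data. The sketch below is our own illustration, not MAST tooling: only the three category names come from the taxonomy, and the annotated example trace is hypothetical.

```python
from collections import Counter
from enum import Enum

# The three overarching MAST categories named above; this enum and the
# tally helper are an illustrative sketch, not the paper's tooling.
class FailureCategory(Enum):
    SPECIFICATION = "specification issues"
    INTER_AGENT_MISALIGNMENT = "inter-agent misalignment"
    TASK_VERIFICATION = "task verification"

def tally_failures(observed: list[FailureCategory]) -> Counter:
    """Aggregate failures observed in a multi-agent run by MAST category."""
    return Counter(cat.value for cat in observed)

# Hypothetical annotations from reviewing one multi-agent trace.
trace = [
    FailureCategory.SPECIFICATION,             # role constraint ignored
    FailureCategory.INTER_AGENT_MISALIGNMENT,  # two agents duplicated work
    FailureCategory.SPECIFICATION,             # output format under-specified
]
```

Tallies like `tally_failures(trace)` make it easy to see which category dominates, which is where reliability work should start.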
A topic of deep industry interest is how to teach AI new tasks. This is especially relevant in enterprise workflows, where ambiguity and human input are common. For example, site reliability engineers (SREs) deal with tribal knowledge, custom workflows, and expertise built over years, much of which exists only in human minds. Traditional methods involve weight updates using gradient descent, such as pretraining, supervised fine-tuning (SFT), and reinforcement learning with verifiable rewards (RLVR). While effective, these methods are expensive due to the large number of examples required. Genetic-Pareto (GEPA) is a prompt optimizer that incorporates natural language reflection to learn high-level rules through trial and error. The core idea behind GEPA is that the interpretability of natural language provides a richer learning medium for LLMs compared to reward-based policy gradients.
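GEPA itself maintains a Pareto frontier of candidate prompts; the toy loop below compresses the idea to a single reflect-evaluate-keep cycle under stated assumptions: `reflect` is a stand-in for an LLM call that rewrites the prompt in natural language after inspecting failures, `evaluate` is a toy scoring proxy, and none of these names are GEPA's actual API.

```python
def evaluate(prompt: str, required_rules: list[str]) -> float:
    # Toy proxy for task success: fraction of needed rules the prompt states.
    return sum(rule in prompt for rule in required_rules) / len(required_rules)

def reflect(prompt: str, required_rules: list[str]) -> str:
    # Stand-in for an LLM reflection step: patch the prompt with one
    # natural-language rule. A real reflector would read execution traces.
    for rule in required_rules:
        if rule not in prompt:
            return prompt + " " + rule
    return prompt

def gepa_like_loop(prompt: str, required_rules: list[str], steps: int = 10) -> str:
    """Repeatedly reflect on the prompt, keeping only improving candidates."""
    best, best_score = prompt, evaluate(prompt, required_rules)
    for _ in range(steps):
        candidate = reflect(best, required_rules)
        score = evaluate(candidate, required_rules)
        if score > best_score:
            best, best_score = candidate, score
    return best

rules = ["Cite your sources.", "Answer in one sentence."]
optimized = gepa_like_loop("You are a helpful assistant.", rules)
```

The point of the sketch is the learning medium: the "update" is a readable sentence appended to the prompt, not a gradient step, which is why far fewer examples are needed.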
One of the key discussions centered on what's needed to deploy AI agents ubiquitously. Critical enablers include agents that can gather information from the outside world, adapt quickly to new environments, and are easy to create and customize, all while being cost-effective. Chi Wang, who co-authored the AutoGen (now AG2) framework in 2022, shared a vision of an "AI Agent That Grows with You." At Ciroos, we strongly align with this perspective. In fact, we believe the world is better served by an interoperable ecosystem of agentic AI systems. That's why we built our platform to be extensible and interoperable with other agents via the Model Context Protocol (MCP), the Agent2Agent (A2A) protocol, and the AGNTCY initiative.
Agentic AI products span a spectrum of flexibility, from simple chatbots (least flexible) to dynamic, autonomous agents (most flexible). These autonomous agents can ingest a multitude of data sources, operate with a broad action space (including write-oriented actions), leverage a dynamic set of tools, and offer rich, interactive interfaces with users. With this flexibility comes an expanded attack surface. In her talk, Professor Dawn Song emphasized that any agentic AI system should meet not only the traditional "CIA" triad of cybersecurity (Confidentiality, Integrity, Availability) but also ensure contextual integrity: tool calls must align with the user's intent. AgentBeats is an open platform for evaluating and assessing risks in agentic systems. At Ciroos, we have built strong guardrails from the ground up to safeguard against misuse. As a best practice, we recommend that enterprises adopting agentic AI systems begin with minimal access levels to discover value safely, and then progress toward broader trust over time.
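One way to operationalize both contextual integrity and the minimal-access recommendation is a pre-execution check that every tool call fits the scope the user actually granted. The sketch below is our own illustration, not AgentBeats or any product API; the `ToolCall` shape, the scope table, and the tool names are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    action: str  # "read" or "write"

# Least-privilege default: agents start with mostly read-only scopes and
# earn broader write access over time. Hypothetical tools for illustration.
GRANTED_SCOPES = {
    "metrics_db": {"read"},
    "ticketing": {"read", "write"},
}

def check_contextual_integrity(call: ToolCall) -> bool:
    """Allow a tool call only if its action falls within the granted scope."""
    return call.action in GRANTED_SCOPES.get(call.tool, set())
```

A guardrail layer would run this check before dispatching each call, so a write-oriented action against a read-only source is rejected rather than executed.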
An exciting research area is training AI to optimize for goal-conditioned value functions: predicting the next action most likely to achieve a specific goal. These capabilities may be particularly compelling in human-AI interactions that involve persuasion. While the specific example discussed at the conference involved an AI-human interaction aimed at persuading someone to donate to a charity, I found myself wondering about the broader societal implications of this research.
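Concretely, a goal-conditioned policy selects the action with the highest estimated value for a given goal, i.e. argmax over actions of Q(state, action, goal). The table-based sketch below uses made-up states, actions, goals, and values purely to show the selection rule; real systems learn Q from data rather than enumerating it.

```python
# Toy goal-conditioned value table: (state, action, goal) -> estimated value.
# All entries are invented for illustration.
Q = {
    ("greeting", "ask_about_cause", "donation"): 0.8,
    ("greeting", "small_talk", "donation"): 0.3,
    ("greeting", "ask_about_cause", "signup"): 0.2,
    ("greeting", "small_talk", "signup"): 0.6,
}

def best_action(state: str, goal: str) -> str:
    """Pick the action maximizing Q(state, action, goal)."""
    candidates = {a: v for (s, a, g), v in Q.items() if s == state and g == goal}
    return max(candidates, key=candidates.get)
```

Note how the same state yields different actions under different goals; that goal-dependence is exactly what makes persuasion-oriented uses of this machinery worth scrutinizing.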
It's always inspiring at conferences like this to hear from luminaries such as Vinod Khosla. In his fireside chat with Professor Dawn Song, he remarked, "If you have less resources within bounds, you're more likely to come up with more invention in innovation than if you have lots of resources." Scarcity drives focus, which is why necessity is the mother of invention. While this has always been true for startups, it is especially relevant today given the unprecedented pace of progress in AI. This environment enables fleet-footed, focused startups with a differentiated point of view and clear customer value to thrive. Khosla also referenced a Stanford study showing that AI in the medical field already achieves 88% accuracy in complex diagnoses, compared to 73% for human doctors. The automate-augment-autonomous operations paradigm is applicable across industries to drive better outcomes.
We appreciated the framing offered by Arvind Jain, founder and CEO of Glean, who observed that AI-native employees will drive AI-native businesses. Research indicates that the length of tasks AI can complete with 50% reliability has been doubling every seven months for the past six years, a trend that may have accelerated in 2024. In our view, it's no longer a question of if, but when AI will meet enterprise benchmarks for a wide range of tasks, including those highly valued by site reliability engineers. As with all technological shifts, transformation is driven by curious and innovative employees who seek a better world for themselves and their teams, becoming the change agents within their enterprises! These are the superheroes who propel human progress.
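The doubling claim implies exponential growth in task horizon: after m months, horizon = h0 × 2^(m/7). The helper below just encodes that arithmetic; the starting horizon of 30 minutes is an illustrative number of ours, not a figure from the cited research.

```python
def task_horizon(h0_minutes: float, months: float,
                 doubling_months: float = 7.0) -> float:
    """Task length completable at 50% reliability, assuming a fixed
    doubling period (7 months per the trend cited above)."""
    return h0_minutes * 2 ** (months / doubling_months)

# Illustrative: a 30-minute horizon grows 8x over 21 months (3 doublings).
horizon_21mo = task_horizon(30, 21)  # 240 minutes
```

Compounding is what makes the "when, not if" framing concrete: three doublings turn a half-hour task horizon into a half-day one.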