On-demand

Watch the sessions

Catch up on all the sessions from Fully Connected

Fully Connected 2025 kickoff: The rise (and the challenges) of the agentic era

Join Robin Bordoli, VP of Revenue at Weights & Biases, as he kicks off Fully Connected London 2025. After looking back at how far AI has come in a few short years, Robin leads us to today’s agentic landscape, covering the explosion of agents and reinforcement learning, while highlighting that some of the oldest challenges in AI remain pervasive even as models become vastly more performant.

Fully Connected keynote: Building tools for agents at Weights & Biases

Weights & Biases Co-founder Lukas Biewald and CoreWeave VP of Engineering Camille Fournier deliver a morning keynote at Fully Connected London 2025. They discuss why roadmaps in AI are hard and what Weights & Biases and CoreWeave have been building over the past half year, then walk through how we build and think about agents and the tools they need to succeed.

Physical AI: Unlocking business value with foundation models and digital twins

GSK is both a global biopharma company that impacts more than a billion people worldwide and an AI pioneer. In this talk, Sander Timmer, PhD, Senior Director of AI/ML and Decision Sciences, Enterprise AI at GSK, details how deeply AI permeates the company, from agents that help optimize its global supply chain to models that help discover new drugs and keep its labs clean and secure.

Understanding the new AI tech stack: Infrastructure, models, agents

Join Robin Bordoli, Tim Rocktäschel, Mike Mattacola, and Akshita Gupta for a wide-ranging panel discussion on the entire AI stack. Covering everything from infra and sovereign AI to where agents are making a difference now to the next frontier in AI research, this is a full-stack panel that touches on every part of AI.

Scalable autonomy through embodied AI

Rather than focusing on a single car or manufacturer, Wayve is training models to drive virtually any car, with any sensors, in any city. In this talk, Silvius Rus, SVP of Engineering, walks through how the company arrived at this approach, how their proprietary AV2.0 model works, and how they’re approaching autonomous driving differently.

Automating knowledge work with AI agents

Most human knowledge remains locked in complex documents and file types like PDFs, tables, and content with irregular layouts that still hold valuable context. Diego Kiedanski, Founding AI Engineer at LlamaIndex, explores parsing and extraction technologies that work alongside AI agents to make truly automated knowledge work a reality. He also introduces LlamaAgents, a new framework that allows you to serve and deploy these assistants at scale.

RAG is dead, long live RAG: Retrieval in the age of agents

In 2023, RAG was the solution to everything. By 2024, it had been declared dead multiple times—killed by exploding context windows, agentic frameworks, and code tools using grep. So is RAG actually dead? In this talk from Fully Connected London, Amélie Chatelain, Head of Knowledge & Search at LightOn, traces RAG's evolution from default pipeline to conditional tool that agents call when it makes sense. She explores why more context doesn't fix issues with cost, latency, or accuracy, and why grep fails with visual content.

Building the next generation of computer use agents

In this talk, Kai Yuan, Agentic AI Research Team Lead at H Company, introduces Surfer 2, a next-generation computer-use agent that moves AI from models that "know" to agents that "do." He presents Surfer 2's unified architecture that operates across desktop, web, and mobile by separating strategic planning from tactical execution. Integrating third-party frontier models with H's Holo1.5, Surfer 2 has achieved state-of-the-art results on four major benchmarks and exceeds human performance on desktop and mobile.

The power of context: From traditional RAG to multi-agent retrieval

Whether you're building a traditional RAG system or designing a multi-agent architecture, one constant remains: context. In this session, Rajiv Shah, Chief Evangelist at Contextual AI, explores how advances in context engineering (smarter retrieval, filtering, and orchestration) have paved the way for agentic RAG systems that use iterative, multi-step refinement to improve responses. He then looks ahead to the next frontier: multi-agent systems that collaborate using shared contextual knowledge to solve more complex tasks.

Rapid prototyping for reinforcement learning in industry

While RL has been extensively researched and applied in digital products, manufacturing introduces specific obstacles including diverse devices, machine variability, prolonged deployment periods, and long-term support requirements. In this talk, Shahram Eivazi, GenAI Technology Expert at Festo, focuses on rapid prototyping for RL in resource-constrained industrial settings, highlighting how careful prototyping with W&B tools can address industry-specific challenges and improve productivity.

Beyond the vibes: Learnosity’s journey to a robust LLM evaluation framework

Sean McCrossan, Data Scientist at Learnosity, shares how his team moved beyond intuition-driven "prompt engineering" to a structured, data-driven framework for evaluating large language models. He explains how Learnosity designs, tests, and validates AI educational products for quality and reliability, using techniques like LLM-as-a-Judge, synthetic data generation, and W&B Weave-based evaluation pipelines to ensure accuracy, fairness, and trust in real-world learning applications.

Defining factors for enterprise AI agents

Vadim Briliantov, Technical Lead at JetBrains, dives into what has emerged as the defining factors for enterprise AI agent success: fault tolerance, predictability, and cost-efficiency. He introduces the JetBrains Koog framework as a tool for building enterprise AI agents and explains how the combination of Koog and W&B Weave has delivered unprecedented predictability and scalability for enterprises building their own suites of AI agents.

End-to-end driving with safety guardrails

Mahshid Majd, Unit Lead for Deep Learning at Zenseact, dives into how their tech stack is set up to help them scale and achieve their mission of eliminating traffic fatalities via autonomous driving technology. She spotlights how Zenseact's OnePilot system, deployed in production vehicles, uses end-to-end sensor fusion and trajectory planning within a single deep neural network. She also discusses the importance of using Weights & Biases as their single source of truth, and how it has helped them scale past bottlenecks of collaboration, model management, and data labeling.

Sandbox breakout evals with Inspect

Harry Coppock, Research Scientist at the AI Security Institute, explores a critical question rarely asked: how good are AI agents at breaking out of sandboxes? As agents are increasingly deployed in everyday workflows with real-world impact through coding and orchestrating complex tools, sandboxing becomes vital to limit damage from compromise or misalignment. Harry reveals how AISI safely evaluates sandbox escapes, what they've learned about frontier models' escape abilities, and whether we should be concerned.

Multi-domain large language model adaptation using synthetic data generation

Injy Sarhan (NLP Researcher) and Avanindra Singh (Senior Researcher) from Shell explore how they're unlocking institutional knowledge that becomes trapped when experts retire. They detail their domain ingestion pipeline using NVIDIA NeMo Curator and W&B Weave, covering data preprocessing, domain adaptation, instruction tuning, and evaluation. The team demonstrates how domain-adapted LLMs achieve domain-specific reasoning and enhance factual accuracy, highlighting how W&B Weave's LLM-as-judge and feedback loops enabled alignment between manual and auto-generated benchmarks.

Multi-agent applications in production

As AI agents move from experimental notebooks to production, the core challenge shifts from building to operating them reliably. Heiko Hotz, Generative AI Global Blackbelt at Google, draws from two real-world customer projects to present a production-centric framework for managing Multi-Agent Systems. Learn best-practice architectural patterns for robust orchestration, practical approaches to analyzing user-agent interactions for actionable metrics, and how to build security guardrails against prompt injections.

Using MCP to analyse your experiments & create reports in natural language

Ina Koleva, Product Manager at Mistral AI, explores how the Model Context Protocol (MCP) introduces standardization for connecting AI applications with external platforms, avoiding vendor lock-in and eliminating repetitive development. She demonstrates how Mistral's suite—including Le Chat and Mistral Code—uses MCP to enable advanced automation and natural language technical tasks. Ina also shares how they use W&B Models and W&B Weave to evaluate and monitor their agents and applications, including a live demo of Le Chat with MCP integration.

Optimizing agentic AI workflows: Metrics-driven evaluation with W&B Weave and NVIDIA

Rita Fernandes Neves, Sr. Solutions Architect at NVIDIA, explores how to build, debug, and optimize agentic workflows—where large language models drive multi-step reasoning and tool use. She demonstrates how W&B Weave and the NVIDIA NeMo Agent Toolkit enable transparent, metrics-focused development of agentic applications. Rita showcases practical strategies for workflow instrumentation, evaluation, and visualization across structured operations and open-ended research assistants.
