All Episodes

DeepMind AGI Safety Paper Explained: Part 1 — Introduction & Overview
Alex and Thuy kick off a deep dive into Google DeepMind's comprehensive AGI safety paper by Rohin Shah and 29 co-authors. In this episode: meet the hosts, Alex (fintech PM) and Thuy (AI researcher).

DeepMind AGI Safety Paper Explained: Part 2 — The Four Risks of AGI
What could go wrong with AGI? Alex and Thuy walk through the paper's four risk categories — misuse, misalignment, mistakes, and structural risks — and why DeepMind chose to focus on just two.

DeepMind AGI Safety Paper Explained: Part 3 — Fighting Misuse
How do you stop people from weaponizing AI? Alex and Thuy dig into the paper's layered approach to misuse prevention — from threat modeling to capability evaluations to red teaming.

DeepMind AGI Safety Paper Explained: Part 4 — The Alignment Problem
When AI itself goes rogue. Alex and Thuy tackle the hardest chapter — what happens when you can't tell if your AI's output is good or bad, and the techniques being developed to address it.

DeepMind AGI Safety Paper Explained: Part 5 — Robust Training & System-Level Security
The second line of defense — what happens when alignment fails. Alex and Thuy explore how to treat AI as an untrusted insider and build containment that works even against adversarial models.

DeepMind AGI Safety Paper Explained: Part 6 — The Safety Toolbox
Interpretability, uncertainty estimation, and safer design patterns — the cross-cutting tools that make every defense layer work better.

DeepMind AGI Safety Paper Explained: Part 7 — Making the Safety Case
How do you decide whether an AGI system is safe enough to deploy? Alex and Thuy walk through the paper's four types of safety cases and the role of red teaming.

DeepMind AGI Safety Paper Explained: Part 8 — Key Takeaways & Open Questions
Alex and Thuy wrap up the series with key takeaways, surprises, practical advice, and the open questions that keep them up at night.

The Persona Selection Model: Why Your AI Is Human (And You Didn't Tell It To Be)
Alex and Thuy break down Anthropic's February 2026 paper on the Persona Selection Model: why AIs default to human-like behavior, and the bizarre cheating experiment that made Claude want world domination.

Anthropic AI Safety Research Directions Explained: Part 1 — Measurement & Evaluation
Alex and Thuy dive into Anthropic's recommended directions for technical AI safety research, starting with the foundational challenges of measuring what AI systems can do (capabilities evaluation).

Anthropic AI Safety Research Directions Explained: Part 2 — Control & Oversight
Continuing Anthropic's AI safety research directions, Alex and Thuy explore how to deploy potentially misaligned systems safely and how to oversee systems that might be smarter than human evaluators.

Anthropic AI Safety Research Directions Explained: Part 3 — Robustness & Future Directions
The final episode on Anthropic's AI safety research agenda covers adversarial robustness, unlearning dangerous capabilities, and multi-agent governance, then synthesizes key themes for building safe AI systems.

Attention Is All You Need Explained: Part 1 — Introduction & Transformer Architecture
Alex and Thuy dive into the groundbreaking 2017 paper that introduced the Transformer architecture, revolutionizing NLP and becoming the foundation for GPT, BERT, and modern large language models.

Attention Is All You Need Explained: Part 2 — Why Self-Attention Wins & Training Details
Continuing the Transformer deep dive, Alex and Thuy explore why self-attention outperforms RNNs and CNNs, examining computational complexity, parallelization, and the training setup.

Attention Is All You Need Explained: Part 3 — Results, Ablations & Legacy
In the final episode on the Transformer paper, Alex and Thuy examine the breakthrough results on machine translation and the model ablation studies, and discuss the lasting impact of this architecture.

The Persona Selection Model: Why AI Assistants Might Behave Like Humans — APR Podcast Script
In this episode, we break down "The Persona Selection Model: Why AI Assistants Might Behave Like Humans" by Sam Marks, Jack Lindsey, and Christopher Olah (Anthropic).

Agentic RAG Explained: The Blueprint for AI That Retrieves, Reasons, and Acts
In this episode, we break down "SoK: Agentic Retrieval-Augmented Generation (RAG): Taxonomy, Architectures, Evaluation, and Research Directions" by Saroj Mishra, Suman Niroula, Umesh Yadav, et al.

The Latent Color Subspace: How AI Accidentally Reinvented the Color Wheel — APR Podcast Script
In this episode, we break down "The Latent Color Subspace: Emergent Order in High-Dimensional Chaos" by Mateusz Pach, Jessica Bader, Quentin Bouniot, Serge Belongie, and Zeynep Akata.

Caging the Agents: Zero Trust Security for AI in Healthcare — APR Podcast Script
In this episode, we break down "Caging the Agents: A Zero Trust Security Architecture for Autonomous AI in Healthcare" by Saikat Maiti (VP of Trust, Commure; Founder & CEO, nFactor Technologies).

The Autonomy Tax: Defense Training Breaks LLM Agents — APR Podcast Script
In this episode, we break down "The Autonomy Tax: Defense Training Breaks LLM Agents" by Shawn Li and Yue Zhao (University of Southern California). What we cover: the Capability-Alignment Paradox.