Technical AI Safety Podcast
80 minutes | May 15, 2021
4 - Multi-Agent Reinforcement Learning in Sequential Social Dilemmas
with Joel Z. Leibo

Feedback form | Request an episode

Multi-agent Reinforcement Learning in Sequential Social Dilemmas
by Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel

Abstract: "Matrix games like Prisoner's Dilemma have guided research on social dilemmas for decades. However, they necessarily treat the choice to cooperate or defect as an atomic action. In real-world social dilemmas these choices are temporally extended. Cooperativeness is a property that applies to policies, not elementary actions. We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions. We analyze the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games we introduce here: (1) a fruit Gathering game and (2) a Wolfpack hunting game. We characterize how learned behavior in each domain changes as a function of environmental factors including resource abundance. Our experiments show how conflict can emerge from competition over shared resources and shed light on how the sequential nature of real-world social dilemmas affects cooperation."

Links: Open Problems in Cooperative AI
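To make the setup concrete, here is a minimal sketch of the structure the abstract describes: self-interested, independent learners whose "cooperate" vs. "defect" behavior is a property of learned policies, not a single atomic action. Everything here is an illustrative assumption for this demo: the toy one-apple environment, the payoff values, and the RESPAWN_P abundance knob. The paper's agents are deep Q-networks on 2-D Gathering and Wolfpack gridworlds, not tabular learners.

```python
# Illustrative sketch only: a tabular stand-in for the paper's DQN agents.
# Two self-interested, independent learners share one respawning "apple".
# Actions: 0 = wait, 1 = grab. Grabbing alone yields the full reward;
# grabbing simultaneously yields a smaller contested reward (a defect-like
# outcome). All payoff values and RESPAWN_P are assumptions for this demo.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 2
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
RESPAWN_P = 0.1                    # resource abundance knob
Q = [np.zeros((2, N_ACTIONS)) for _ in range(2)]  # per-agent Q over {no apple, apple}

def env_step(apple, a0, a1):
    """Joint transition of the toy Markov game."""
    r = [0.0, 0.0]
    if apple:
        if a0 == 1 and a1 == 1:
            r, apple = [0.25, 0.25], 0   # contested grab: inefficient split
        elif a0 == 1:
            r, apple = [1.0, 0.0], 0
        elif a1 == 1:
            r, apple = [0.0, 1.0], 0
    if not apple and rng.random() < RESPAWN_P:
        apple = 1
    return apple, r

apple = 1
for _ in range(50_000):
    # Each agent picks epsilon-greedily from its own Q-table, independently.
    acts = [int(rng.integers(N_ACTIONS)) if rng.random() < EPS
            else int(Q[i][apple].argmax()) for i in range(2)]
    nxt, rewards = env_step(apple, acts[0], acts[1])
    for i in range(2):
        td = rewards[i] + GAMMA * Q[i][nxt].max() - Q[i][apple, acts[i]]
        Q[i][apple, acts[i]] += ALPHA * td
    apple = nxt

print("Greedy action with apple present, per agent:",
      [int(Q[i][1].argmax()) for i in range(2)])
```

Sweeping RESPAWN_P up and down is the tabular analogue of the resource-abundance manipulation in the paper's Gathering experiments.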
80 minutes | Mar 11, 2021
3 - Optimal Policies Tend to Seek Power
with Alex Turner

Feedback form | Request an episode

Optimal Policies Tend to Seek Power
by Alexander Matt Turner, Logan Smith, Rohin Shah, Andrew Critch, Prasad Tadepalli

Abstract: "Some researchers have speculated that capable reinforcement learning agents are often incentivized to seek resources and power in pursuit of their objectives. While seeking power in order to optimize a misspecified objective, agents might be incentivized to behave in undesirable ways, including rationally preventing deactivation and correction. Others have voiced skepticism: human power-seeking instincts seem idiosyncratic, and these urges need not be present in reinforcement learning agents. We formalize a notion of power within the context of Markov decision processes. With respect to a class of neutral reward function distributions, we provide sufficient conditions for when optimal policies tend to seek power over the environment."

Links:
What Counts as Defection?
Non-Obstruction
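A rough way to see the formalization: up to a normalization factor the paper applies, the power of a state is the expected optimal value of that state over a distribution of reward functions, so states that keep more options open tend to score higher. The sketch below estimates this by Monte Carlo on a hypothetical five-state MDP; the transition structure and the uniform reward distribution are illustrative assumptions, and the paper's exact normalization is omitted.

```python
# Hedged sketch: estimate each state's "power" as the average optimal
# value over randomly sampled reward functions (paper's normalization
# omitted). The MDP below is an invented example, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 0.9

# Toy deterministic MDP: transitions[s] lists states reachable in one step.
transitions = {
    0: [1, 2],   # branch point: keeps options open
    1: [3, 4],   # still has choices
    2: [2],      # dead end: only a self-loop
    3: [3],      # absorbing
    4: [4],      # absorbing
}

def optimal_value(reward):
    """Value iteration for optimal state values under one reward function."""
    V = np.zeros(len(transitions))
    for _ in range(200):
        V = np.array([max(reward[s2] + GAMMA * V[s2] for s2 in transitions[s])
                      for s in transitions])
    return V

# Monte Carlo estimate: average optimal value over i.i.d. uniform rewards.
n_samples = 1_000
power_est = np.zeros(len(transitions))
for _ in range(n_samples):
    power_est += optimal_value(rng.uniform(0, 1, size=len(transitions)))
power_est /= n_samples

print(power_est)  # expect power_est[0] > power_est[2]: state 0 keeps options open
```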
52 minutes | Feb 1, 2021
2 - Neurosymbolic RL with Formally Verified Exploration
with Greg Anderson

Feedback form | Request an episode

Neurosymbolic Reinforcement Learning with Formally Verified Exploration
by Greg Anderson, Abhinav Verma, Isil Dillig, Swarat Chaudhuri

Abstract: "We present Revel, a partially neural reinforcement learning (RL) framework for provably safe exploration in continuous state and action spaces. A key challenge for provably safe deep RL is that repeatedly verifying neural networks within a learning loop is computationally infeasible. We address this challenge using two policy classes: a general, neurosymbolic class with approximate gradients and a more restricted class of symbolic policies that allows efficient verification. Our learning algorithm is a mirror descent over policies: in each iteration, it safely lifts a symbolic policy into the neurosymbolic space, performs safe gradient updates to the resulting policy, and projects the updated policy into the safe symbolic subset, all without requiring explicit verification of neural networks. Our empirical results show that Revel enforces safe exploration in many scenarios in which Constrained Policy Optimization does not, and that it can discover policies that outperform those learned through prior approaches to verified exploration."
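The loop structure the abstract names (lift, safe gradient update, project) can be sketched in a few lines. To be clear, everything below is an invented stand-in meant only to show the shape of the algorithm: verify_safe, lift, safe_gradient_update, project, and the clipped linear "symbolic" class are assumptions of this sketch, not Revel's actual policy classes, verifier, or imitation-based projection.

```python
# Hedged structural sketch of a lift -> update -> project mirror-descent
# loop. All names and internals here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def verify_safe(symbolic_params):
    """Stand-in for formal verification: a trivial bound check. The real
    system verifies symbolic policies, which is tractable precisely
    because that class is restricted."""
    return bool(np.all(np.abs(symbolic_params) <= 1.0))

def lift(symbolic_params):
    """Lift a verified symbolic policy into a 'neurosymbolic' class:
    symbolic part plus an initially-zero learned correction."""
    return {"symbolic": symbolic_params.copy(),
            "neural": np.zeros_like(symbolic_params)}

def safe_gradient_update(policy, grad_estimate, lr=0.1):
    """Approximate gradient step on the lifted policy."""
    policy["neural"] -= lr * grad_estimate
    return policy

def project(policy):
    """Project back into the verifiable symbolic subset (the paper uses an
    imitation-learning-style projection; we just fold and clip here)."""
    candidate = np.clip(policy["symbolic"] + policy["neural"], -1.0, 1.0)
    assert verify_safe(candidate)
    return candidate

# Key property being illustrated: no neural-network verification inside
# the loop; only the restricted symbolic class is ever verified.
theta = np.zeros(4)                       # verified symbolic policy parameters
for _ in range(10):
    pi = lift(theta)
    fake_grad = 0.1 * rng.normal(size=4)  # placeholder policy-gradient estimate
    pi = safe_gradient_update(pi, fake_grad)
    theta = project(pi)
print("final symbolic policy params:", theta)
```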
53 minutes | Jan 5, 2021
1 - Safe Reinforcement Learning via Shielding
with Bettina Könighofer and Rüdiger Ehlers

Feedback form | Request an episode

Safe Reinforcement Learning via Shielding
by Mohammed Alshiekh, Roderick Bloem, Rüdiger Ehlers, Bettina Könighofer, Scott Niekum, Ufuk Topcu

Abstract: "Reinforcement learning algorithms discover policies that maximize reward, but do not necessarily guarantee safety during learning or execution phases. We introduce a new approach to learn optimal policies while enforcing properties expressed in temporal logic. To this end, given the temporal logic specification that is to be obeyed by the learning system, we propose to synthesize a reactive system called a shield. The shield is introduced into the traditional learning process in one of two ways, depending on where it is implemented. In the first, the shield acts each time the learning agent is about to make a decision and provides a list of safe actions. In the second, the shield is introduced after the learning agent: it monitors the actions chosen by the learner and corrects them only if the chosen action would violate the specification. We discuss which requirements a shield must meet to preserve the convergence guarantees of the learner. Finally, we demonstrate the versatility of our approach on several challenging reinforcement learning scenarios."

Continued work:
Stefan Pranger, Bettina Könighofer, Martin Tappler, Martin Deixelberger, Nils Jansen, Roderick Bloem: Adaptive Shielding under Uncertainty. CoRR abs/2010.03842 (2020)
Nils Jansen, Bettina Könighofer, Sebastian Junges, Alex Serban, Roderick Bloem: Safe Reinforcement Learning Using Probabilistic Shields (invited paper). CONCUR 2020: 3:1-3:16
Bettina Könighofer, Julian Rudolf, Alexander Palmisano, Martin Tappler, Roderick Bloem: Online Shielding for Stochastic Systems. CoRR abs/2012.09539 (2020)
Bettina Könighofer, Florian Lorber, Nils Jansen, Roderick Bloem: Shield Synthesis for Reinforcement Learning. ISoLA (1) 2020: 290-306
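As a toy illustration of the second (post-posed) variant: a shield sits between learner and environment, passing the learner's chosen action through unless it would violate the specification. The 1-D track, the hand-written is_safe predicate, and the choice to update the learner with the executed action are illustrative assumptions of this sketch; in the paper, the shield is synthesized from a temporal-logic specification, and the update choice is one of the options discussed for preserving convergence guarantees.

```python
# Hedged sketch of post-posed shielding: override the learner's action
# only when it would violate the spec ("never enter the hazard cell").
import numpy as np

rng = np.random.default_rng(0)
N, HAZARD, GOAL = 6, 0, 5      # 1-D track: cell 0 is unsafe, cell 5 is the goal
ACTIONS = [-1, +1]             # step left / step right

def is_safe(state, action):
    """Stand-in for the synthesized shield's safety check."""
    return state + ACTIONS[action] != HAZARD

def shielded(state, action):
    """Pass the action through unless unsafe; then substitute a safe one."""
    if is_safe(state, action):
        return action
    return next(a for a in range(len(ACTIONS)) if is_safe(state, a))

Q = np.zeros((N, len(ACTIONS)))
state = 1
for _ in range(5_000):
    proposed = int(rng.integers(2)) if rng.random() < 0.1 else int(Q[state].argmax())
    action = shielded(state, proposed)              # correction happens here
    nxt = min(max(state + ACTIONS[action], 0), N - 1)
    reward = 1.0 if nxt == GOAL else 0.0
    # Update with the *executed* action (one option the paper discusses).
    Q[state, action] += 0.1 * (reward + 0.95 * Q[nxt].max() - Q[state, action])
    state = 1 if nxt == GOAL else nxt

print("Greedy action per state:", Q.argmax(axis=1))  # hazard is never entered
```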
5 minutes | Dec 7, 2020
0 - Announcement
Feedback form: https://forms.gle/4YFCJ83seNwsoLnH6
Request an episode: https://forms.gle/AA3J7SeDsmADLkgK9

The Technical AI Safety Podcast is supported by the Center for Enabling Effective Altruist Learning and Research, or CEEALAR. CEEALAR, known to some as the EA Hotel, is a nonprofit focused on alleviating bottlenecks to desk work in the effective altruist community. Learn more at ceealar.org.

Hello, and welcome to the Technical AI Safety Podcast. Episode 0: announcement. This episode briefly outlines who I am, what you can expect from me, and why I'm doing this.

First, a little about me. My name is Quinn Dougherty. I'm no one in particular: not a grad student, not a high-karma contributor on LessWrong, nor even really an independent researcher. I only began studying math and CS in 2016, and I haven't even been laser-focused on AI safety for most of the time since. However, I eventually came to think there's a reasonable chance AGI poses an existential threat to the flourishing of sentient life, and I think it's nearly guaranteed to pose a global catastrophic threat to the flourishing of sentient life. I recently quit my job and decided to focus my efforts in this area. My favorite area of computer science is formal verification, but I think I'm literate enough in machine learning to get away with a project like this. We'll have to see; ultimately, you the listeners will be the judge of that.

Second, what can you expect from me? My plan is to read the Alignment Newsletter (produced by Rohin Shah) every week, cold-email authors of papers I find interesting, and ask them to do interviews about their papers. I'm forecasting one to two episodes per month, each interview running 45 to 120 minutes. There's already a Google Form you can use to request episodes (just link me to a paper you're interested in), as well as a general feedback form; both are in the show notes.

Finally, why am I doing this? You might ask: don't 80,000 Hours and the Future of Life Institute already cover AI safety in their podcasts? My claim is: not exactly. While 80k and FLI produce a mean podcast, they're interdisciplinary. As I see it, theirs are podcasts where computer scientists come together with policy wonks and philosophers. As far as I know, there's a gap in the podcast market: there isn't yet a podcast just for computer scientists on the topic of AI safety. That's the gap I'm hoping to fill. So with me, you can expect jargon and a modest barrier to entry, so that we can go on deep dives into the papers we cover. We will not be discussing the broader context of why AI safety is important, we will not cover the distinction between existential and catastrophic threats, and we will not look at policy or philosophy; if that's what you want, you can find plenty of it elsewhere. But we will, only on occasion, make explicit the potential for the results we cover to solve a piece of the AI safety puzzle.