Last week’s episode on artificial intelligence gets a huge payoff this week: we’ll explore a couple of wonderful papers about all the ways that artificial intelligence can go wrong. Malevolent actors? You bet. Collateral damage? Of course. Reward hacking? Naturally! It’s fun to think about, and the discussion starting now will have reverberations for decades to come.
Relevant links:
- How to create a malevolent artificial intelligence
- Unethical research: how to create a malevolent artificial intelligence
- Concrete problems in AI safety