Chennai Mathematical Institute

Seminars




AI Safety Talk
Date: Thursday, 20 April 2023.
Time: 3:30 PM
Venue: Virtual Mode
A High-level Dive into Open Problems in AI Alignment

Arjun Jose
Alignment Research, ARC Evals.


Abstract

Making AI go well is pretty hard. One framing that articulates the crux of the alignment problem is ontology identification. High-level interpretability is an approach in this vein: it attempts to formalize the type signature of an objective so that the objective can be mapped onto the internals of a neural network.