AI Safety Talk

Date: Thursday, 20 April 2023
Time: 3:30 PM
Venue: Virtual Mode

A High-level Dive into Open Problems in AI Alignment

Arjun Jose
Alignment Research, ARC Evals

Abstract

Making AI go well is pretty hard. One framing that verbalizes the crux of the alignment problem is ontology identification. High-level interpretability is an approach in this vein that attempts to formalize the type signature of an objective, to map it onto the internals of a neural network.