Institute for AI Policy and Strategy (IAPS)’s Post

What research are AI companies doing into safe AI development? What research might they do in the future? To answer these questions, Oscar Delaney, Oliver Guest, and Zoe Williams reviewed papers published by AI companies, along with these companies' incentives. They found that enhancing human feedback, mechanistic interpretability, robustness, and safety evaluations are key focuses of recently published research. They also identified several topics with few or no publications, and where AI companies may have weak incentives to do research in the future: model organisms of misalignment, multiagent safety, and safety by design. (This report is an updated version that includes some papers omitted from the initial publication on September 12th.) https://lnkd.in/g_6DnqgX

[Figure: bar chart]

