AE Studio’s Post


Our commitment to AI Safety is part of AE's broader prioritization of Human Agency. From our research to our Same Day Skunkworks tools, and of course our client projects, we build technology to improve people's lives and elevate businesses in a sustainable way.

Reinforcement Learning from Human Feedback (RLHF) is the leading technique for fine-tuning large language models to be helpful, harmless, and honest. But RLHF is complicated and typically requires a distributed workforce of data labelers. To address this, the AE Studio AI Safety Research Team built a scaled-back, single-user implementation of RLHF. It makes it easy for students and researchers to run RLHF on a single laptop and gain hands-on experience with the technique in simple environments. For more details, see our write-up here: https://hubs.ly/Q02G7-nQ0
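The write-up has the full details; as a rough illustration of the general RLHF recipe (collect human preference comparisons, fit a reward model to them, then optimize the policy against that learned reward), here is a minimal, hypothetical sketch on a toy one-step problem. It is not AE Studio's code: the toy action space, the simulated "human", and the REINFORCE-style update are stand-ins chosen only to keep the example self-contained.

```python
# Minimal RLHF sketch on a toy problem (hypothetical, for illustration only).
# A policy picks one of N discrete actions, a simulated "human" ranks pairs of
# actions, a reward model is fit to those preferences with a pairwise
# (Bradley-Terry) loss, and the policy is then trained against the learned reward.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_ACTIONS = 8
torch.manual_seed(0)

# --- 1. Collect pairwise preferences from a (simulated) single user. ---
# Stand-in preference: higher-indexed actions are better. In the single-user
# setting this label would come from the person running the tool.
def human_prefers(a: int, b: int) -> int:
    return 0 if a > b else 1          # 0 means "a is preferred", 1 means "b"

pairs = [(a, b) for a in range(N_ACTIONS) for b in range(N_ACTIONS) if a != b]
labels = torch.tensor([human_prefers(a, b) for a, b in pairs])

# --- 2. Fit a reward model to the preferences. ---
reward_model = nn.Embedding(N_ACTIONS, 1)     # one scalar reward per action
rm_opt = torch.optim.Adam(reward_model.parameters(), lr=0.1)
idx = torch.tensor(pairs)                     # shape (num_pairs, 2)
for _ in range(200):
    r = reward_model(idx).squeeze(-1)         # rewards for both options
    loss = F.cross_entropy(r, labels)         # which option did the human pick?
    rm_opt.zero_grad(); loss.backward(); rm_opt.step()

# --- 3. Fine-tune the policy against the learned reward (REINFORCE). ---
policy_logits = nn.Parameter(torch.zeros(N_ACTIONS))
pi_opt = torch.optim.Adam([policy_logits], lr=0.05)
for _ in range(300):
    dist = torch.distributions.Categorical(logits=policy_logits)
    actions = dist.sample((64,))
    with torch.no_grad():
        rewards = reward_model(actions).squeeze(-1)
    # Increase log-probability of actions the learned reward model favors.
    pg_loss = -(dist.log_prob(actions) * (rewards - rewards.mean())).mean()
    pi_opt.zero_grad(); pg_loss.backward(); pi_opt.step()

print("Policy now favors action:", policy_logits.argmax().item())
```

In a real RLHF run the preference labels come from a person, the policy is a language model generating text, and the policy-optimization step is typically PPO rather than plain REINFORCE; the structure of the loop, though, is the same.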

DIY RLHF: A simple implementation for hands on experience — LessWrong (lesswrong.com)
