OpenAI just released O1, but is it really worth the hype? Here’s why I don’t think so and why it doesn’t get us closer to AGI.
GPT-3 was revolutionary. A large enough language model, combined with some novel fine-tuning techniques, produced something genuinely different: a model you could talk to, and that could answer questions pretty accurately.
We’ve seen many better models since - GPT-4, Sonnet, Gemini and others. But all were incremental updates: they took a GPT-3-like model and built around it to make it more sophisticated.
And O1 is no different. OpenAI calls it a “reasoning” model. And it’s indeed impressive - with PhD-level performance on math and chemistry benchmarks and better coding capabilities. But to me, it looks like the same underlying architecture with sophisticated prompt engineering techniques built right in.
So what’s happening under the hood? Though OpenAI hasn’t released details, it seems they’ve embedded the “chain of thought” technique, which instructs the model to work through a solution step by step.
It’s like when your math teacher made you show your full solution, not just the final answer. Working through intermediate steps lets the model handle complex problems, and it’s probably why the model is slower than previous versions.
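The idea can be sketched as a prompt transformation, assuming a standard chat-messages format. The exact mechanism O1 uses is not public; this only illustrates the difference between a direct prompt and a chain-of-thought prompt:

```python
# Minimal sketch of "chain of thought" prompting: instead of asking for the
# answer directly, the prompt instructs the model to reason step by step.
# The message format is the common chat-completions shape; the system text
# is an illustrative assumption, not O1's actual internal prompt.

def plain_prompt(question: str) -> list[dict]:
    """A direct prompt: the model may jump straight to a final answer."""
    return [{"role": "user", "content": question}]

def chain_of_thought_prompt(question: str) -> list[dict]:
    """A CoT prompt: the instructions force intermediate reasoning steps."""
    system = (
        "Work through the problem step by step. "
        "Show each intermediate step before stating the final answer."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

question = "If a train travels 60 km in 45 minutes, what is its speed in km/h?"
messages = chain_of_thought_prompt(question)
# These messages would then be sent to a chat API of your choice.
```

The extra reasoning tokens the model generates before answering are exactly where the added latency comes from.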
Is this AGI? No. Is this bringing us closer to AGI? Probably not. This is an incremental change that is valuable to those of us who build applications on top of OpenAI. AGI will probably come from a whole different direction, with a new model architecture beyond the seven-year-old Transformer.
What do you think?