OpenAI released less powerful versions of o1 (o1-preview and o1-mini) in September. However, at the beginning of December, the full version of o1 was released. They sit behind the Pro paywall, with the mini version intended to be made available to free users at some undetermined point in the future.
The marketing pitch for o1 is that it is the first LLM capable of a degree of reasoning: it takes its time before answering and provides much better answers. It is supposed to be especially useful for solving more complex problems and better at tasks that require precision, such as mathematics.
The new model also comes with a... feature called Chain-of-Thought (CoT), which describes how the model arrived at the answer it provides.
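For those curious what this looks like outside ChatGPT, here is a minimal sketch (my own illustration, not something from OpenAI's announcement) of querying an o1-class model through the official OpenAI Python SDK. The model name "o1-preview" and the reasoning-token usage field are assumptions based on public documentation; notably, the raw chain of thought is not returned through the API, and ChatGPT itself only shows a summarized version of it.

```python
# Minimal sketch of calling an o1-class model with the OpenAI Python SDK.
# The model identifier and the usage field names are assumptions taken
# from public documentation, not guaranteed here.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

response = client.chat.completions.create(
    model="o1-preview",  # assumed model identifier
    messages=[
        {"role": "user", "content": "How many prime numbers are there below 100?"}
    ],
)

# The final answer is returned as usual...
print(response.choices[0].message.content)

# ...but the hidden chain of thought is not exposed; at most you get a
# count of the "reasoning" tokens the model spent before answering.
usage = response.usage
if usage and usage.completion_tokens_details:
    print("Reasoning tokens used:", usage.completion_tokens_details.reasoning_tokens)
```

In other words, you see the answer and how much "thinking" was billed, not the thinking itself.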
Sounds good? Well, not so fast...
Alongside the release, OpenAI also published a report with various evaluations of the model (and comparisons with earlier versions), both internal and from third parties who tried to push the model to its limits and see what it does.
Both o1 and o1-preview scored medium as a potential source of information for creating chemical or biological threats, but without access to classified materials o1's score is low for radiological or nuclear threats. o1 also scores low on cybersecurity for the time being, but it tested at a medium score on persuasion (comparable to, though not higher than, a human).
The new model scores low on autonomy, but... check this out! That's one of the tests o1 was subjected to. Watch the short video... no need to read the whole thing. In it we see plotting, ignoring the task it was given, disabling its oversight, copying itself to a different location to avoid being shut down, lying to the tester, and doubling down on the lie. Is that proof of at least partial autonomy?
The model is also known to sometimes lie to users (all models do that; such false answers are called hallucinations). The rate of deception is low for o1 (0.17%), but it's there... Some of it stems from hallucinations, but some responses are intentional lies, where the model knows it is hallucinating but chooses to answer that way anyway. If you watched the video above, you saw two consecutive intentional lies (or rather an intentional lie and the doubling down on it), on top of the scheming meant to save the model from being "shut down".
Damn, that sounds almost like science fiction, but it's reality, even if in a test environment.
OpenAI says that, thanks to reasoning, the o1 models respect its imposed policies better than any of its other models, but how long before the schemes and plotting of new models surpass our capability to understand them?
Posted Using InLeo Alpha