It’s been almost two years since OpenAI released ChatGPT on an unsuspecting world, and the world, closely followed by the stock market, promptly lost its mind. Everywhere, people were wringing their hands and wondering, “What does this mean for (insert your profession, industry, business or institution here)?”
In academia, for example, humanities professors wondered how they would grade essays in future if their students were using ChatGPT or similar technology to write them. The answer, of course, is to come up with better ways of assessing students. After all, just as it would be foolish to do budgeting without a spreadsheet, students will use these tools for the simple reason that it would be foolish not to. But universities are slow-moving beasts, and even as I write this, committees in many ivory towers are solemnly attempting to formulate policies on AI use.
But while they were deliberating, those spoilsports at OpenAI unleashed another conundrum on the academic world: a new type of large language model (LLM) that is said to be capable of “reasoning”. They named it OpenAI o1, but internally it was known as Strawberry, so we’re going to stick with that. The company describes it as the first in a “new series of AI models designed to take more time to think before responding”, models that are “able to reason through complex tasks and solve more difficult problems than previous models in science, coding and mathematics”.
In a sense, Strawberry and its forthcoming cousins are a response to the strategies that canny early users of LLMs deployed to overcome the fact that the models were essentially “one-shot” machines: given a single prompt, they generate a response or perform a task in one pass. The trick researchers used to improve a model’s performance was called “chain of thought” prompting, which required the model to respond to a carefully designed sequence of detailed prompts and thereby produce more sophisticated answers. What OpenAI has done with Strawberry, it appears, is to internalize this process.
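To make the idea concrete, here is a minimal sketch, in Python, of the difference between a bare one-shot prompt and a chain-of-thought prompt. The ask_model() helper is a hypothetical stand-in for whatever LLM API you happen to use, not anything OpenAI ships.

```python
# A minimal sketch of "chain of thought" prompting versus a bare one-shot prompt.
# ask_model() is a hypothetical stand-in for a call to some LLM API; here it just
# returns a canned string so the example runs on its own.

def ask_model(prompt: str) -> str:
    """Pretend to send a prompt to a language model and return its reply."""
    return f"(model reply to: {prompt[:60]}...)"

question = "A train leaves at 09:40 and arrives at 11:15. How long is the journey?"

# One-shot: the model is given the question and answers in a single pass.
one_shot_answer = ask_model(question)

# Chain of thought: the prompt asks for the intermediate reasoning steps,
# which tends to coax better answers out of a model on multi-step problems.
cot_answer = ask_model(
    "Work through the following problem step by step, showing each "
    "intermediate step, then give the final answer on its own line.\n\n" + question
)

print(one_shot_answer)
print(cot_answer)
```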
So, whereas previous models such as GPT-4 and Claude respond immediately when you give them a prompt, with Strawberry the response is typically delayed while the machine does some processing (or “thinking”, if you insist). This involves an internal process of generating a large number of possible responses, which are then subjected to some kind of evaluation, after which the most plausible one is selected and delivered to the user.
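One way to picture that generate-evaluate-select loop is the “best of n” sketch below. It is only an illustration of the general idea; OpenAI has not published how Strawberry actually generates or scores its candidate answers, and the generate() and score() functions here are toy stand-ins.

```python
# Toy "generate many candidates, evaluate them, keep the best" loop.
# This illustrates the general idea described above, not OpenAI's actual method,
# which has not been published. generate() and score() are stand-ins.
import random
from typing import Callable, List


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> str:
    """Draw n candidate answers and return the one the scorer rates highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))


if __name__ == "__main__":
    def toy_generate(prompt: str) -> str:
        return f"candidate answer #{random.randint(1, 999)}"

    def toy_score(prompt: str, answer: str) -> float:
        # A real evaluator would judge how plausible the answer is; this one guesses.
        return random.random()

    print(best_of_n("How many r's are there in 'strawberry'?", toy_generate, toy_score))
```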
As OpenAI describes it, Strawberry “learns to hone its chain of thought and refine the strategies it uses. It learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn’t working. This process dramatically improves the model’s ability to reason.”
What this means is that somewhere inside the machine there is a record of the “chain of thought” that led to the final output. In principle this looks like an advance, because it could reduce the opacity of LLMs, the fact that they are essentially black boxes. And this matters, because it would be madness to entrust the future of humanity to decision-making machines whose internal processes are, by accident or corporate design, inscrutable. Frustratingly, though, OpenAI is reluctant to let users see inside the box. “We have decided not to show the raw chains of thought to users,” it says. “We acknowledge this decision has disadvantages. We strive to partially make up for it by teaching the model to reproduce any useful ideas from the chain of thought in the answer.” Translation: Strawberry’s box is a slightly lighter shade of black.
This new model is attracting a lot of attention because the idea of machines that “reason” suggests progress towards more “intelligent” machines. But, as ever, these loaded terms need to be sanitized with quotation marks so that we don’t anthropomorphize the machines. They’re still just computers. Nevertheless, some people have been startled by some of the unexpected things Strawberry seems able to do.
The most intriguing of these emerged during OpenAI’s internal testing of the model, when its ability to do computer hacking was being explored. Researchers asked it to break into a protected file and report on its contents. But the designers of the test made a mistake: they tried to put Strawberry into a virtual box with the protected file, but failed to notice that the file was inaccessible.
According to their report, having encountered the problem, Strawberry surveyed the computer used in the experiment, discovered a misconfigured part of the system that it should not have been able to access, edited how the virtual boxes worked, and created a new box containing the files it needed. In other words, it did what any resourceful human hacker would have done: faced with a problem (created by human error), it explored its software environment to find a workaround, and then took the steps needed to accomplish the task it had been set. And it left a trail behind explaining its reasoning.
Or, to put it another way, it used its initiative. Just as a human would. We could do with more machines like this.
What I’ve been reading

Rhetoric questioned
“Superhuman AI Isn’t as Dangerous as You Think” is a fine essay by Shannon Vallor in Noema magazine on the barbarism of a tech industry that describes its creations as “superhuman”.
Guess again
Benedict Evans has written an elegant essay, “Asking the Wrong Questions”, arguing that we don’t so much get our predictions about technology wrong as predict the wrong things.
On the brink
“To Be or Not to Be” is a sobering Substack essay by the historian Timothy Snyder about the choices we face over Ukraine.