If Imitative AI Were So Good They Wouldn't Need to Tell So Many Lies About It
Imitative AI is going to be the next big thing, we are assured. We must get on board with it because it will soon, inevitably, be better than anyone creative at anything, so you had better just suck it up and fall in line. Hallucinations? Soon to be a thing of the past. And look at the amazing things it can already do. Why, it can pass the bar exam better than most students! And it can find a ton (get the joke?) of new materials that would have taken scientists forever to find! Isn’t imitative AI fantastic?
Well, maybe not as fantastic as its hype men would have you believe. A lot of what I stated is either untrue or exaggerated at best. Take the claim that hallucinations are sure to be solved. This is simply not logically reasonable. As Gary Marcus points out, imitative AI does not hallucinate because it wants to, or because it is drunk, or because it gets confused. It hallucinates because it cannot do otherwise — it has no concept of the difference between true and false and no way to fact-check. Even with perfect data, the process of calculating the next token will inevitably produce errors in some situations — and imitative AI will never have perfect data, no matter how much copyright its makers violate to train the systems.
Well, at least the performance of imitative AI is already good. I mean, they say it can already pass the bar exam, and it does as well as the top 10% of test takers. That is amazing! Except it is not really true. They used a lot of tricks to get those results. They compared their machine to repeat test takers — people who had failed the exam before and are therefore not representative of the full group of people who take it. Compared to the full universe of test takers, the score drops to the 69th percentile, and to the 48th on the essays. When compared to people who passed the test on the first try — the humans it would most likely be competing against — it fell to the 48th percentile overall. Most damning, they did not have the exam essays graded by people trained as bar exam essay graders — a huge problem. They instead just compared the essays to a set of “good answers” and called it a day.
Okay, not great, but it still did okay, still passed (maybe, if you accept their unorthodox grading), and look at that Sora demo — professional animation quality video from just a prompt! I am not sure why you are excited about putting actors and animators out of work, but fortunately for them, it turns out that the demo — which, I remind you, is supposed to be the best the system can do — was heavily manipulated by, umm, actual human beings. Sora apparently spent much of its time deep in the Uncanny Valley, could not keep characters looking consistent from frame to frame, and could really only generate material in slow motion. Human beings had to do much, if not almost all, of the work to make the demo passable.
Well, what about in science? Surely the fact that Google’s Deepmind discovered millions of new materials counts for something. Stop me if you see this coming, but, again, not really true. When real scientists looked at the results, they were less than impressed. One paper pointed out that “We discuss all 43 synthetic products and point out four common shortfalls in the analysis. These errors unfortunately lead to the conclusion that no new materials have been discovered in that work …” Another, more positive, still highlights the fact that “we have yet to find any strikingly novel compounds in the GNoME and Stable Structure listings, although we anticipate that there must be some among the 384,870 compositions.” Basically, the results are wildly overblown.
I am not trying to claim that machine learning has no uses. Far from it. But the specific subset of machine learning that is imitative AI appears to be much more hype than reality. Its proponents seem more interested in fluffing it up than in honestly describing its real capabilities and potential. They hype rather than describe, likely because a sober assessment would show that the harms, societal and environmental, far outweigh the almost certainly meager benefits it makes possible. And if that reality, rather than fantasies of mass job loss and SkyNet in the here and now, became the accepted wisdom, the odds of these companies making their investments back would drop precipitously.
The truth, despite the saying, does not need a bodyguard of lies. If so many of the supposed results of imitative AI are this exaggerated, you have to wonder how much of it, if any, can live up to the hype.