Genre Grapevine on Machine Learning’s Problem with Bad Stories and Bad Intentions
Note: This column is available free to the public. If you like my writings on genre issues, consider backing my Patreon.
Previous Genre Grapevine AI columns from 2023:
Part 1: What AI Generated Art and Writing Might Mean for Artists and Authors
Part 2: The Sudowrite Controversy and the Increasing Pushback Against AI
Part 3: Why They Want Us to Call it AI: The Deceptive Language Around Machine Learning
Part 4: Even More Examples of Deceptive Language Around Machine Learning
As a writer, I thrill at reading bad novels most people want to throw across the room. I devour movies that stink like actual rotten tomatoes. I enjoy watching horrific TV shows even after they’re canceled halfway through the season.
Of course my true loves are stories, films and shows that are the opposite of bad. As a reader I want great stories with dazzling wordcraft, plots, and characters. As a viewer I want films and shows that visually transport me into well-crafted worlds with characters I relate to.
But as a writer, I learn much more from bad stories than from the good. Every bad story I’ve read or watched has had one or two good ideas that stuck with me. As I watch a real stinker of a film, I think “Wow, that minor plot device could have been turned into a great movie.” When I read a bad book, I pay attention to how and why the story failed to grab me.
Bad stories inspire me to reach ever higher with my own writing. Bad stories teach me the ways any story can be improved and fixed.
In recent months I’ve been thinking about why bad stories – or at least the ability to improve on bad stories – are one of the main reasons artificial intelligence programs have a long way to go before they truly threaten human creativity.
The Year of AI Pushback
For the writing community, 2023 has been the year of AI. While machine learning has been around for a while, this was the year when ChatGPT and similar programs began to deliver truly impressive results. Artists experienced a similar situation the year before with programs such as DALL-E, Stable Diffusion and Midjourney.
At first it was fun to play with the systems. But it was also scary, with many writers and artists worried machine learning might quickly take their place. Considering the marginal pay and precarious existence many writers and artists already endure, there were legitimate fears at the start of 2023 that machine learning could decimate the livelihoods of creative people around the world.
However, the feeling at the end of 2023 is much more optimistic. Yes, machine learning will likely be a tool used by many creative people, and yes writers and artists still need to be aware of how these systems could impact them in the years to come (as detailed in my earlier report). However, some of the wind has come out of the sails of pending AI dominance.
Part of this has to do with how creatives pushed back against the corporations trying to use AI to replace them. First, the Writers Guild of America won their strike against the Alliance of Motion Picture and Television Producers. While the threat of Hollywood studios using machine learning to replace writers wasn’t the strike’s sole focus – the studios have been underpaying and threatening the livelihoods of their writers for decades – the strike still came at a critical time in the development of AI arts and writing programs. Hollywood writers could clearly see that the studios would eagerly use these programs to replace them, so they expanded their strike to also focus on this threat.
The result was that Hollywood writers won the right to use AI as a tool if they want, but studios can’t force them to use it and can’t replace them with the programs. And Hollywood actors also won similar safeguards against AI in their own strike.
The timing of these strikes was critical. If the labor agreements the writers’ and actors’ unions had with the studios had expired a few years earlier, machine learning wouldn’t have been on anyone’s radar and the studios could have eventually implemented these programs without much pushback. And if those same agreements had run until a few years from now, it might have been too late to do anything.
Another thing that has dulled fears of AI are the many lawsuits taking aim at how machine learning companies trained these programs on copyrighted works. A number of authors including John Grisham, George R.R. Martin and Jodi Picoult are suing OpenAI over using their work without permission to train ChatGPT, with other lawsuits by writers continually popping up. And major institutions are now filing similar lawsuits, with The New York Times recently suing OpenAI and Microsoft for copyright infringement and contending “that millions of articles published by The Times were used to train automated chatbots that now compete with the news outlet as a source of reliable information.”
The ultimate outcome of these lawsuits is uncertain, with two judges so far dismissing some of the counts in the lawsuits. In particular, one judge rejected the claim that “Meta’s AI system is itself an infringing derivative work made possible only by information extracted from copyrighted material.” However, other counts in these lawsuits are still going forward. And the new suit filed by The New York Times seems particularly strong, since The Times is claiming their copyrighted materials were directly used to create new works that compete with their company’s products.
No matter the ultimate outcome, these suits are obviously worrying to the companies creating machine learning. As stated in a humorous headline in The Byte, AI investors are horrified at the suggestion that they may have to pay artists for using their copyrighted work. But the punch line is that this isn’t a joke. The famed venture capital firm Andreessen Horowitz, which has invested in scores of AI companies, has publicly stated that "Imposing the cost of actual or potential copyright liability on the creators of AI models will either kill or significantly hamper their development."
As Lincoln Michel wrote in The New Republic, “The legal questions will be settled in court, and the discourse tends to get bogged down in semantic debates about ‘plagiarism’ and ‘originality,’ but the essential truth of A.I. is clear: The largest corporations on earth ripped off generations of artists without permission or compensation to produce programs meant to rip us off even more.”
Artists have also shown there are other ways to fight back against their works being used without their consent. Using tools like Nightshade, artists are able to “contaminate and confuse the AI systems themselves” by inserting false information alongside the images those systems are trained on. Other poison pill programs have also been created, such as Glaze, which subtly changes “the pixels in an artwork to make it hard for an AI model to mimic a specific artist's style.”
So far, artists approve of this tactic. As illustrator Corey Brickley said, "‘Data poisoning’ is such a good term. Ruin a tech bro's day and bottom line. You can't just steal labor and jobs from people and expect them to let you."
And as Paloma McClain added, “AI bros seem to be misunderstanding the point of Nightshade. No, we aren’t trying to somehow stuff poison in your AI model datasets ourselves. We poison our own work. And if YOU choose to scrape our work, YOU poison your own dataset. It’s a retaliation to you offending first.”
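For readers curious about the mechanics, here’s a minimal Python sketch of the underlying idea: nudging an image’s pixels in a way humans barely notice but that changes the data a scraper collects. To be clear, this is a toy illustration under my own assumptions, not Nightshade’s or Glaze’s actual algorithms, and the file names are placeholders:

```python
# Toy illustration of imperceptible image perturbation. This is NOT how
# Nightshade or Glaze actually work; both compute targeted, model-aware
# perturbations rather than the simple noise used here.
import numpy as np
from PIL import Image

def perturb(path_in: str, path_out: str, strength: float = 3.0) -> None:
    """Add a faint noise pattern to an image's pixels."""
    img = np.asarray(Image.open(path_in).convert("RGB"), dtype=np.float32)

    # Deterministic noise, so the same image always gets the same pattern.
    rng = np.random.default_rng(seed=42)
    noise = rng.uniform(-strength, strength, size=img.shape)

    # Shift each pixel by at most `strength` levels out of 255: invisible
    # to most human viewers, but it alters the data any model trained on
    # a scraped copy would learn from.
    poisoned = np.clip(img + noise, 0, 255).astype(np.uint8)
    Image.fromarray(poisoned).save(path_out)

perturb("artwork.png", "artwork_poisoned.png")  # placeholder file names
```

The real tools go much further, optimizing the perturbation so a model misreads the image’s style or content while a human viewer sees no change at all.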
The net effect of all this has been to massively slow the supposedly inevitable march of machine learning systems.
“We Say It’s AI, So It Must Be AI! And We Know What We’re Doing! Trust Us!”
One of the best science fiction works to plumb the topic of artificial intelligence is the manga Ghost in the Shell by Masamune Shirow, first published more than 30 years ago. While Ghost in the Shell is perhaps better known today for the successful films and TV shows it inspired, the original manga is well worth reading. Of particular interest in the manga are Shirow’s many footnotes discussing where he believes technology will go in the future.
In one of the footnotes, he wrote: "The definition of a human is very vague, so when an AI superior to a human (definition unclear) is developed, the question is: Will humans really be able to recognize it?”
This is a very valid question. And it also helps explain why so many people are convinced that ChatGPT, Midjourney and similar programs are actually a form of artificial intelligence, instead of machine learning systems using algorithms crafted from training data. But calling these programs “machine learning” isn’t as sexy and trendy as “artificial intelligence,” so guess which term these companies keep using.
As I’ve previously written, there’s a ton of hype and deceptive language used to promote and describe machine learning systems. I believe this is deliberate, both because it helps secure funding for these companies and because it creates a sense of inevitability around this new technology. And since it’s so hard for humans to truly recognize an actual artificial intelligence – which I believe will be created at some point, even if this hasn’t happened so far – these companies’ hype and deception are simply accepted as truth.
Unfortunately, it’s difficult to debunk this hype because we can’t peek under the proprietary hoods of these programs. But even if we can’t examine the algorithms and technology behind machine learning programs, we can examine the words and actions of the people creating them.
OpenAI is the world’s leading machine learning company, having created cutting-edge programs such as ChatGPT and DALL-E. So what kind of “intelligence” is OpenAI aiming to create? As quoted in this Reuters article, “OpenAI defines (artificial general intelligence) as autonomous systems that surpass humans in most economically valuable tasks.”
So the leading company in the world creating so-called artificial intelligence is focused on intelligence related to the “most economically valuable tasks.” Human intelligence doesn’t merely focus on economically valuable tasks, nor does the intelligence of any other animal species on Earth. Does that sound like they’re trying to create a true intelligence, or merely a disguised way to profit from disrupting the livelihoods of as many people as possible?
This “economically valuable” point is also vitally important if you examine the recent leadership turmoil at OpenAI. In November, the company’s board of directors fired Sam Altman, their superfamous CEO, only for him to be rehired a few days later, with most of the OpenAI board removed instead. While OpenAI has released no detailed public statement on what happened, according to Vox “It appears there were fundamental differences between the (now former) board’s vision for AI, which included carrying out that mission of safety and transparency, and Altman’s vision, which, apparently, was not that.”
In particular, Reuters reported that it appears OpenAI’s board fired Altman because they were worried about a new "powerful artificial intelligence discovery that they said could threaten humanity" along with "concerns over commercializing advances before understanding the consequences." If Altman was not being open about those issues, then yes, that’s absolutely when the board should step in and fire him.
But instead, Altman is back in power and those OpenAI board members worried about these issues are gone. This happened after 738 employees at OpenAI – equal to 95% of their workforce – signed a letter saying they’d leave if Altman wasn’t rehired. That is an astounding percentage of people essentially saying they care more about ignoring the risks of what they’re creating than about undertaking actual due diligence.
To update that famous Upton Sinclair quote, “It is difficult to get an OpenAI employee to understand something when their stock options depend on their not understanding it.”
I have zero trust in the people creating machine learning precisely because, as the recent turmoil at OpenAI showcases, they have not proven themselves to be trustworthy. The same can be said of the entire tech industry over the last few decades, with its relentless focus on disrupting the lives of countless people merely so an elite few can profit.
A new lawsuit has uncovered evidence of where all this is likely going: “UnitedHealthcare, the largest health insurance company in the US, is allegedly using a deeply flawed AI algorithm to override doctors' judgments and wrongfully deny critical health coverage to elderly patients.”
According to this lawsuit, over 90 percent of these denials were wrong and eventually reversed. Which makes you wonder if that wasn’t the intent of the people who created UnitedHealthcare’s machine learning system. While a 90% error rate sounds horrific to most people, especially when you consider how many elderly patients had to fight to overturn these unjust denials of medical treatment, there were still likely a number of rejected claims where people may not have felt like fighting back, or where they passed away before the denial could be overturned. That means from UnitedHealthcare’s cold-hearted point of view, their AI system still earned them a profit.
If a human was wrong 90% of the time, they’d be fired from every job on the planet. But for UnitedHealthcare, this wasn’t an error at all. Instead, their algorithms allowed them to avoid paying for the medical treatments of a number of patients, increasing their profits for minimal effort. And they did this without paying actual humans – you know, people who might have some sympathy and kindness – to consider the medical conditions of those denied treatment.
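To see just how cold that arithmetic is, here’s a toy back-of-envelope calculation in Python. Every number in it is an illustrative assumption of mine – the appeal rate and average claim cost are invented for the sketch, not figures from the lawsuit:

```python
# Toy model of why wrongful denials can still pay off for an insurer.
# All numbers below are illustrative assumptions, not lawsuit figures.
claims_denied = 1_000       # claims the algorithm rejects
wrongful_rate = 0.90        # share of denials that are wrong (per the suit)
appeal_rate = 0.20          # assumed share of patients who fight back
avg_claim_cost = 10_000     # assumed average cost of a covered treatment

wrongful = claims_denied * wrongful_rate
overturned = wrongful * appeal_rate       # the insurer pays these after appeal
never_challenged = wrongful - overturned  # the insurer never pays these

savings = never_challenged * avg_claim_cost
print(f"Wrongful denials: {wrongful:.0f}")
print(f"Overturned on appeal: {overturned:.0f}")
print(f"Never challenged: {never_challenged:.0f}")
print(f"Kept from unchallenged wrongful denials: ${savings:,.0f}")
# With these made-up numbers: 720 wrongful denials go unchallenged,
# and $7,200,000 is kept on claims that should have been paid.
```

Swap in whatever numbers you like; as long as some share of wrongful denials goes unchallenged, the algorithm “pays for itself.”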
Keep this in mind whenever someone promoting machine learning says “trust me.”
AI’s Bad Story and Bad Art Problem
Ironically, writers and artists may be in a better place to survive machine learning than many professions because of the creativity at the heart of what we do. The dirty little secret of machine learning systems is that their so-called “creativity” is an illusion. Yes, machine learning is able to create art and writing based on the data it has trained on. However, these systems only produce good results when they have continual access to imaginative works created by actual humans.
Earlier this year, BBC's Science Focus published a must-read article on how AI art’s hidden echo chamber is about to implode. The article quoted Ahmed Elgammal, a professor of computer science at Rutgers University, as saying, “If you get into the cycle of feeding (machine learning) what is on the internet, which right now is mostly AI, that will lead to a stagnation where it is looking at the same thing, the same art style, over and over again.”
According to Elgammal, this will result in machine learning programs converging “on anything that is popular. More of what is popular will get you stuck to certain art styles that are popular on the internet and it will become biased to that. Whatever is dominant right now, that is what the models will learn to push even more.” And this will then result in AI art being expressed in a “limited set of styles, causing generations of art to look very similar, lacking the imagination most people aim to get out of these generators.”
When you understand machine learning can get trapped in this cycle of mediocrity, you also understand why machine learning companies are so desperate to keep training their programs on art and writing created by actual humans. The alternative results in machine learning programs that are far more limited and far less profitable.
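Elgammal’s feedback loop is simple enough to simulate. The sketch below is my own toy illustration, not anything from the Science Focus article: it compresses “style” down to a single number, assumes each new generation of a model trains mostly on the previous generation’s most popular outputs, and measures how quickly the variety collapses:

```python
# Toy simulation of a model repeatedly trained on its own popular output.
# "Style" here is a single number; real model collapse is analogous but
# plays out in vastly higher dimensions.
import numpy as np

rng = np.random.default_rng(seed=0)

# Generation 0: human-made works, with a wide spread of styles.
works = rng.normal(loc=0.0, scale=1.0, size=10_000)

for gen in range(1, 9):
    # The feedback loop: works closest to the current average style are
    # the "popular" ones that get shared, liked, and scraped.
    distance = np.abs(works - works.mean())
    popular = works[distance < distance.mean()]

    # "Train" on the popular subset, then "generate" the next corpus.
    works = rng.normal(loc=popular.mean(), scale=popular.std(), size=10_000)
    print(f"gen {gen}: stylistic spread = {works.std():.3f}")
```

Run it and the printed spread shrinks every generation, collapsing within a handful of cycles to a sliver of the original human-made variety – the stagnation Elgammal describes.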
So just as machine learning has a problem with the bad motivations and intentions of those creating these systems, it also has a bad story and bad art problem that will likely result in a tidal wave of stale works.
Unfortunately, many people around the world will be perfectly satisfied with mediocre results from machine learning programs. We’ve already seen this with many of the films produced by Hollywood in recent decades, with the flood of similar books released by the publishing industry, and with the legion of same-sounding songs released by the music industry.
Many people are satisfied with receiving the same thing over and over. And machine learning will massively speed up the process of serving these people the same old same old.
The reason for this is perfectly illustrated by what Jason Farago wrote in an excellent essay in The New York Times:
"A.I. cannot innovate. All it can produce are prompt-driven approximations and reconstitutions of preexisting materials. If you believe that culture is an imaginative human endeavor, then there should be nothing to fear, except that — what do you know? — a lot of humans have not been imagining anything more substantial. … I remain profoundly relaxed about machines passing themselves off as humans; they are terrible at it. Humans acting like machines — that is a much likelier peril, and one that culture, as the supposed guardian of (human?) virtues and values, has failed to combat these last few years. Every year, our art and entertainment has resigned itself further to recommendation engines and ratings structures. Every year our museums and theaters and studios have further internalized the tech industry’s reduction of human consciousness into simple sequences of numbers."
This is also my fear.
Artists and Writers Must Lead by Creating New Works and Dreams
Machine learning is not artificial intelligence. But machine learning is poised to dramatically change how many people work and live.
Make no mistake – the people creating these machine learning systems want to conquer more than just writing and the arts. For the tech industry, writers and artists were merely the canary at the bottom of a technological mineshaft, with tech bros experimenting to see how long we take to suffocate. But their programs will threaten the livelihoods of a great many more people in the coming years.
None of this means that machine learning won’t be a tool used by writers and artists – it absolutely will be. But these programs must be a tool we use on our own terms. We must be the ones to set the rules for how these programs are used, not people like Sam Altman. Equally important, we must demand that machine learning benefit the many, not an elite few.
The good news is that writers and artists are in a unique place to help lead this resistance. This year proved that writers and artists don’t have to sit passively by and wait for machine learning companies to simply screw us over. We can fight back.
The true threat posed by machine learning systems is in the motivations and intentions of the people and companies creating them, and in the dreams of the people using the programs. And by an amazing coincidence, exposing the motivations and intentions of people doing wrong – along with creating new dreams – is what writers and artists have been doing for millennia.
The flaw in machine learning programs is they can only approximate and rework what has already been created. They have no imagination. They have no true creativity. At the start of this essay I mentioned how as a writer I learn from bad stories. Humanity has long had an ability to improve on the bad. To build greatness out of the mediocre. To find a path forward even while walking through endless fields of crap.
Machine learning can’t improve on the bad. But writers and artists can. We can help create new art and new stories. We can work to ensure that our creations are not used to train AI systems – perhaps by using data poisoning tools like Nightshade, perhaps by dreaming up new ways of fighting back.
The fight won’t be easy. There will always be the temptation to take the easier path. To create what we’re expected to create. To dream what we’re expected to dream.
But to again quote Jason Farago’s essay in The New York Times, “Rather than worry about whether bots can do what humans do, we would do much better to raise our cultural expectations of humans: to expect and demand that art — even and especially art made with the help of new technologies — testify to the full extent of human powers and human aspirations.”
That’s what I’m aiming to do with my own writing in the coming years. I urge other writers and artists to do the same.