You have almost certainly seen the new AI chatbot and the new AI image generators. (I have written about them here and here.) Perhaps you have been blown away. Perhaps you have wondered what all the fuss is about. In my experience, people tend either to be totally fascinated by AI or to find it unbelievably boring. I count myself in the “fascinated” camp, and I cannot help but be stunned when, for example, I can make a robot give a credible explanation of systemic racism using a pirate voice (and have it be less offensive and more illuminating than I expected), or have an image generator produce fake album covers by bands that never existed. But I understand why some other people aren’t quite as fascinated, and see obsessive chatter about AI as similar to people who obsessively talk about cars. (A guy unfollowed me on Twitter a while back, announcing to the world that while he liked Nathan Robinson’s writing, he couldn’t take any more of the stupid AI shit. I would like him to know that I have since dialed back the AI posting, if he would care to give me another chance.)
One reason I think people are unimpressed is that much of what has been produced with new AI tools is not very innovative. I am a member of some “AI art” groups on Facebook, mainly to keep tabs on what is being made, and they are absolutely flooded with images of big-breasted hentai girls made by horny dorks, or portraits of a muscular Elon Musk as a Marvel superhero (made by the same horny dorks). One reason artists have come to hate “AI art” is that, in addition to the familiar ethical problems, so much of it is so bad. Likewise, when Google announced a music-generating model comparable to the image and text generators, I listened to the resulting music and found it deeply underwhelming. My first reaction was: “Wow, how many millions of dollars did you spend to create something that can sound like the preprogrammed backing tracks on a 1980s Casio keyboard?”
And yet, to understand why this stuff is interesting and worth thinking about, we don’t just have to look at what is being done now. We also have to extrapolate into the future, to think about what the likely next steps are and what they mean. For the moment, Google’s music model can take a piece of text like “fusion of reggaeton and electronic dance music, with a spacey, otherworldly sound” and create something that sounds, well, rather like that. But I don’t think it will be too long before we can enter text like “John Lennon’s voice singing WAP” and have something that sounds exactly like that. At which point, there will be some huge consequences for the music industry. After all, if I can just make a new Taylor Swift song by asking ChatGPT to process all of Taylor Swift’s lyrics and write the sort of song she might write, and a music generator to compose and perform the song as her, of what use is the real Taylor Swift? (Oh, and perhaps an image generator can even make her album covers, an experiment I have already tried, with mixed success.) There are, of course, a whole host of major copyright issues that are going to arise, and one thing AI does promise to do is generate a whole lot of jobs for intellectual property lawyers (until they too are replaced with AI).
One of the most unsettling demonstrations of the capacities of “generative AI” (not a great term, but the one we have) has come from Google, which showcased examples of “speech continuation.” Given a few seconds of someone’s voice, like the beginning of a sentence, Google’s AudioLM could produce a continuation of the speech that would seamlessly blend with the original. So, if I put in an audio recording of myself saying “Current Affairs is a magazine you should subscribe to because it’s full of…” the model could produce a continuation of the sentence in my own voice that would make sense. Someone has already used more rudimentary audio generation tools to produce a never-ending conversation between film director Werner Herzog and cultural theorist Slavoj Žižek (the artificially generated Žižek is, if anything, more coherent than his human counterpart). There is also a bizarre AI-generated parody of Seinfeld that is programmed to run continuously until the end of time and has captivated many viewers with its uncomfortable absurdity. The Seinfeld parody, however, is deliberately bad, and was clearly put together by people who wanted to showcase the “uncanny valley” aspect of AI. More unsettling to me is the possibility of someday having AI generate a fake Seinfeld episode that is indistinguishable from a real Seinfeld episode because it is trained on videos of the original series. Will we soon be able to create a new film starring Cary Grant, just by training it on old films of Cary Grant? If you assume we won’t, I don’t think you’ve looked closely enough into what is being done already.
The really shocking implications of AI come when we think about what it will be like when the technologies that have just debuted are combined. If ChatGPT can realistically generate interactive text, and AudioLM can realistically read text in anyone’s voice, and some third model can adjust ChatGPT to simulate a given person’s writing style (say, by being trained on all of my own articles, tweets, and texts), plus the powerful new video generation tools being developed, isn’t it possible that quite soon there could be a tool that allows me to have a video call with a quite realistic simulation of myself? Or that one could build a mirror in which your reflection talks to you? Weirder still, it is quite conceivable, given what the tools that now exist can do, that a model could be built to simulate the experience of speaking with a deceased loved one, by being trained on their voice, photos, and emails. (I have no idea whether anyone would want this, but I can imagine an industry preying on grieving people and offering them the chance to “keep” the person they have lost. In a world where you can already rent a family, I wouldn’t put anything past capitalism.) [UPDATE: I should have known that this would already exist; such a company has now sprung up. Of course it has. And of course the simulated dead people will only get more and more realistic over time.]
Anyone who has interacted with ChatGPT knows that it is different in a very noticeable way from previous chatbots, which were always dysfunctional and easily exposed as non-human. ChatGPT can converse naturally, and while its creators have been very responsible in making sure the model constantly insists that it is not conscious and is just spitting out text, other, less responsible developers might build models designed to seem as human as possible.
Based on what has already debuted, I genuinely do not think it’s science fiction to think we are quite close to having the ability to produce fairly realistic simulations of the dead. It’s already possible to produce fairly convincing “deepfake” videos—like this one of a fake Morgan Freeman—so “quite close” might be an understatement.
Deepfakes creep people out, because the sinister implications are obvious. As AI engineer Ryan Metz wrote for this magazine in a sobering analysis four years ago, “so-called ‘deepfake’ videos will make Donald Trump’s claims of ‘fake news’ that much more plausible and difficult to counter.” It will be a cinch to create realistic faked footage of anything you like, and when we live so much of our lives in a world of online images, it will be very easy to end up inhabiting a world of complete delusion without realizing it. Entire informational ecosystems may arise in which nothing is true but everything is convincing and indistinguishable from the truth.
Countering this requires trustworthy institutions that can verify what we see. It requires journalists whose job it is to figure out what’s fact and what is mere simulation. That’s why it’s so distressing that these technologies are coming about precisely at the time when people are losing trust in their institutions and the financial resources for deep journalistic investigations are drying up. Deepfakes would be much less of a problem if we had institutions that could reliably authenticate information by tracking down its origins, and those institutions were trusted by the public to sort illusions from reality. This is yet another reason why building reliable independent media institutions is one of the most critical projects anyone can support in our present age. If you thought the problem of misinformation was bad during the COVID pandemic, the tech is about to make it a whole lot easier to create convincing forgeries. It is already perfectly technologically feasible to create, say, an audio recording of me in which I confess to some terrible crime. What we need are those who can quickly and reliably answer the question of what is real and what isn’t.
Unfortunately, there are some very irresponsible people in positions of power. Elon Musk, for instance, has expanded Twitter’s blue verification badges so that anyone can buy one, even deranged anti-Semites. The fact that someone had a “blue checkmark” was never a very good guide to whether they knew what they were talking about, and it gave undue credibility to those in the D.C. pundit class, but instead of allowing even malicious actors to buy “verification,” it would be better to work on some more reliable way of helping people differentiate between “bullshit spewed by any random person” and “evidence-based information.” We have to be careful here because everyone’s definition of “fake news” tends to be “things my political opponents think” (I’ve been highly critical of fact-checkers who hide their biases). But some things are true and some are false and it’s going to be more important than ever to have ways of helping people tell the difference as they’re barraged with information. How will they know who to trust, if everything looks equally credible?
Fortunately, the developers of ChatGPT are strongly devoted to “AI safety” and seem to have done a pretty good job correcting for the biases that often plague AI. ChatGPT, if developed responsibly, could be precisely the kind of trustworthy technology that people can turn to if they want to know what’s real and what isn’t. So can Wikipedia. But we have to be careful that ChatGPT’s corporate ownership doesn’t slowly turn it evil, as happened with Google. When the profit motive meets disruptive technology, the results can be socially toxic.
Let us return to the “creepy” side of what we face, though. One of the warnings that has been issued about deepfakes is that, like almost any other technology, once it exists it will be used to make porn. When realistic simulations of people can be made with only a few images and some snippets of their voice, the possibilities for humiliating and hurting people are vastly expanded, even if we know the difference between what is real and fake.
AI also threatens to turn dystopian because of the loneliness crisis. Just recently, the Wall Street Journal ran an article on how “tech can help” with the loneliness of seniors, by giving them wonderful things to do on virtual reality headsets. Again, when we think of how the various technologies we already have combine (speech generation, image generation, language generation), we can see how easy it would be to create a simulation of yourself that can hang out with grandma so that you don’t have to yourself. I think it’s perfectly plausible that the world depicted in the 2013 film Her (in which a man develops a relationship with a Siri-like AI virtual assistant) will soon come into being. Siri is still very rudimentary and frustrating, but in a short time the generative capacities of ChatGPT will surely be integrated into all of the virtual assistants, making them much better and more eerily lifelike (and of course, they will learn from their mistakes and improve quickly).
In his 2018 article for this magazine, Metz warned that the much-discussed risk of AI “superintelligence” taking over was quite low, but there is a much more serious risk that human beings’ isolation will increase and their relationships with machines will displace their relationships with one another. Robots, of course, don’t have human flaws, and many people may find they prefer the company of a robot programmed to realistically simulate love and companionship. I am certain that many of the Lost Lonely Men who follow Jordan Peterson, and who complain about their lack of luck with women, will see much to like about the idea of a simulated woman who is entirely compliant and does not have to be respected or listened to in any way. The internet has already made us angrier and less inclined to spend time with one another. When simulated realities become even more realistic, what will that do to our ability to work together and love one another?
I have not even discussed the labor implications of new technology, which we all know about. Watching the recent Avatar film, I was naturally struck by the sheer length of the credits, the thousand-plus people who had to work to create the stunning visual effects that are in every frame. But looking at image generators, and realizing we’re only just beginning to develop them, I couldn’t help but wonder how many animators are going to be put out of work in the very near future.
Metz also mentions military applications of AI, which I don’t even want to think about. I’ve written before about the horror of “autonomous drone swarms,” which can select targets without human intervention, and which the Department of Defense is openly investing in. Thanks to the irresponsibility of leaders in the U.S., Russia, and China, we are entering an insane new arms race that, coupled with the unpredictable and stunning power of AI, may quite easily end in Armageddon if the leaders of the world’s most powerful countries cannot work together for peace.
AI is going to change our world a lot in the next decades. It opens up some possibilities that are merely creepy, such as:
- Reanimating a dead actor in a film
- Allowing people to communicate with a simulation of a spouse after their death
- People having more relationships with simulated people than with real people
- Being able to make new Beatles songs that sound exactly like they might have been real Beatles songs
- Chatting with a simulation of yourself that is hard to tell from the real you
Others are truly horrifying, such as:
- Deepfakes used to ruin lives or destabilize society
- People disconnecting entirely from reality and living in a bizarre simulation
- The development of even deadlier weapons by superpowers that are inching closer and closer to war with every passing year
- Mass AI-induced unemployment and the opportunities for reactionary totalitarian demagogues that social unrest provides
I don’t know that we are prepared to deal with this, but I think all of it might be heading our way quite soon. The somewhat more uplifting news is twofold: first, the technology is not all creepy and horrible. It can also do things that are cool, like automating tedious tasks and democratizing learning. (And personally, I have a lot of fun making weird images with the generators.) Second, it may force positive social changes, by making the alternative impossible. If we don’t rein in the arms race, it is clearly going to kill us all. If we don’t provide a basic standard of living for all, there is going to be colossal disruption from tech-created unemployment. If we don’t build trustworthy media institutions, huge swaths of society will become entranced by dangerous delusions. If the tech is coming, we have no option but to do what is necessary to make sure that the merely creepy doesn’t lead to the outright horrifying. We have to democratically control both what technologies are developed and how they are integrated into society. I have confidence it can be done. But everyone owes it to themselves and to those around them to understand what could happen in our lifetimes if we don’t control the effects of new technologies and simply let the most catastrophic consequences unfold.