The Only Way To Win

A more opinionated take on the question of why there haven’t been any standout AI games


Last week, Frank Lantz published a post called “Why No AI Games?” noting how the prominence of increasingly sophisticated LLMs hasn’t resulted in any standout AI-based games or new forms of gameplay. He offers a few theories as to why, eventually concluding that there’s just nothing inherently fun in this kind of unpredictable, “soft logic” computational model.

It’s interesting, and I encourage anybody interested in the topic to check out the essay and the discussion going on in the comments. Since I’m somebody whose answer to the question is, “There aren’t any because it’s impossible because generative AI is stupid and bad,” it’s probably a good opportunity for me to take a step back and consider the question a little more thoughtfully.

I don’t fundamentally disagree with Lantz’s conclusion, such as it is. There isn’t anything inherently fun in engaging with an unpredictable, anonymous system, once you quickly get past the initial novelty.

Even if you have such a loose and generous definition of “game” that it includes slot machines, the engagement is empty without that extrinsic reward of getting a big payout. Whatever fun there is to be had from seeing a black box generate unexpected outputs, it’s short-lived if you don’t have a meaningful way to interact with it. You’re just poking at it to see what happens.

Just Let It Cook (The Planet)

I qualified “conclusion” with “such as it is” because Lantz seems insistent on refusing to come to a definitive conclusion. I think I understand why; if you’re taking a detached, academic, and theoretical approach to the question, you have to be careful not to claim that you can prove a negative. “Just because we can’t think of a fun application of this technology, surely that doesn’t mean no one can.”

But that’s frustrating, because I think we’re already past the point of being able to say “let’s sit back and see how all of this plays out.” There are too many fundamental issues with generative AI, each of them inseparable from the others, and they don’t just magically disappear when you choose to look at it through one specific lens.

One of Lantz’s theories as to why AI in games hasn’t taken off is labeled “Culture Wars.” It’s the idea that there’s currently a backlash against generative AI that as of now is discouraging people from pursuing ideas that take full advantage of the potential of LLMs.

I want to be clear that I’m not interested in calling anybody out, or assuming intent from an essay that wasn’t intended to be a value judgment on gen AI beyond this specific purpose. But I still don’t love how the whole conversation sounds. I’ve heard too many AI proponents insist that the only reason people are so vehemently opposed to it is because they haven’t really used it, or they don’t understand how it works.

So even labeling it as a “culture war” assumes a level of distance that comes across as condescending: let’s allow people to get this tantrum out of their system, so we can go back to an honest and objective evaluation of this groundbreaking technology. It assumes that people are yelling for the sake of yelling, instead of raising genuine, fundamental issues.

And one of the most fundamental issues is one that Lantz touches on, with his theory labeled “Business Models”:

“It’s very hard to build a real game around core functionality that you are paying a third party to supply. […] This dynamic also discourages developers doing small experiments and releasing them for free, hoping to go viral. The incentives are all wrong. Developers are highly motivated to hit the model as little as possible, to use cached, pre-generated responses or find other workarounds. I’ve also built game prototypes where the whole experience changed dramatically, for the worse, because the model I was building around changed in ways I couldn’t understand or control.”

The business model is such an inseparably toxic part of generative AI that I believe any discussion that doesn’t include it — not as a problem to be inevitably solved, but as an inherently fatal flaw — becomes so vague as to be all but irrelevant.

It makes it difficult if not impossible to claim that it’s a pure thought experiment, along the lines of “what kind of gameplay would evolve if you had a non-deterministic thinking machine trained on all the world’s knowledge?” Because in reality, it’s more like asking “what kind of fun emergent gameplay would result from giving every player a fighter jet or an F1 car?”

It’s especially frustrating when combined with Lantz’s closing paragraph: “I’m sure that, even as I type these words, there is a clever teenager somewhere proving me wrong.” That’s the same kind of hand-waving that infuriated me so much in Good Luck Have Fun Don’t Die, because it perpetuates this fantasy that groundbreaking new technology is coming from scrappy young geniuses hacking away in garages or bedrooms somewhere. Instead of the reality, which is that it’s been the result of years of very rich people pouring ludicrous amounts of money, labor, and resources into developing systems that they control.

Unless that clever teenager has a few hundred billion dollars in a trust fund, and teams of underpaid people tagging content, and teams of lawyers coming up with new ways to circumvent laws around copyright and hoarding resources, it ain’t happening.

An essential part of the hype around AI is based on the idea that, like almost everything else in tech, it’ll just keep on improving indefinitely. It’s not optimistic; it’s deceptive. It distracts from the fact that it’s taking an unsustainable amount of resources to keep providing diminishing returns. I haven’t ever seen any indication that these systems scale efficiently at all.

It just can’t be stressed enough: they’re not giving you tools to create your own stuff. They’re trying to centralize everything under the control of as few people as possible, including the act of creation itself.

Ignore All Previous Instructions

Okay, okay, but what if that weren’t true? What if you could actually separate all of the legitimate concerns about the technology from the technology itself? What if you had an ethically-trained model that could run locally and efficiently, but still somehow have the fidelity of the current versions of “Open”AI’s and Anthropic’s models?

Even with that fantasy scenario, the outlook isn’t good. I can’t see the point or promise of using the tech at any stage of development.

Split it into two broad categories: one where interacting with an LLM is the gameplay, and one where it augments a game.

Lantz mentions a few existing experiments, with the LLM either offering gameplay prompts, or presenting chat bots that you interact with. And it’s pretty easy to come up with formats that feel like they might have potential:

  • AI as dungeon master
  • A werewolf/mafia type game that’s a Turing test where you have to identify which player is the AI
  • A Dixit type game where the AI generates text or visual prompts that the players need to make sense of
  • Improv games where you’re playing out a scene against (or driven by) the AI
  • Improv games where you have a specific goal to achieve by communicating with an AI-driven character (like the Suck Up! game Lantz mentions)
  • A telephone game where you give an LLM-powered chat bot instructions and another player has to deduce what those instructions were (e.g. “respond to everything in the style of a Victorian genius detective”)
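For concreteness, the last format above can be sketched in a few lines of Python. This is purely hypothetical: the model call is stubbed out with canned styles, and the names (`chat`, `play_round`) are invented for illustration, not any real API. A real version would send the hidden instruction as a system prompt to an actual LLM.

```python
# Sketch of the "telephone game": one player sets a hidden instruction,
# the model answers prompts under it, the other player deduces the instruction.
# The LLM call is stubbed out with canned transformations so this runs offline.

STYLES = {
    "pirate": lambda text: f"Arr, {text.lower()}, matey!",
    "victorian detective": lambda text: f"Elementary: {text}, I deduce.",
}

def chat(hidden_style: str, message: str) -> str:
    """Stand-in for an LLM call made under a hidden system instruction."""
    return STYLES[hidden_style](message)

def play_round(hidden_style: str, probes: list[str], guess: str) -> bool:
    """The guesser sees only the bot's replies to their probes, then names
    the hidden instruction."""
    for probe in probes:
        print(chat(hidden_style, probe))
    return guess == hidden_style

won = play_round("pirate", ["Where is the treasure?"], guess="pirate")
print("correct!" if won else "wrong")
```

Even in this toy form, the structural question is visible: the entire game lives in the hidden instruction and the guess, and the model is just a noisy channel between them.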

As I see it, the fatal issue with all of these comes down to two things: coherence, and fidelity.

First up: coherence, or the ability of one of these systems to retain information across iterations. Obviously, I’m biased against generative AI, so I haven’t been interested in keeping up with the “state of the art.” But at the time I stopped giving a damn, it was one of the key issues that people were trying to solve.

It’s easiest to see in generated video, and it’s why — as I understand it — video clips can’t be longer than a few seconds before collapsing into nonsense, characters will regularly go “off model,” objects horrifyingly morph into different ones, attempts to recreate DOOM or extraction shooter levels from prompts will quickly turn into Overlook Hotel geometry or worse, etc.

Presumably, it’s easier with text, since there’s simply less data involved. And considering the number of reports of troubled individuals who’ve become convinced that OpenAI built them a computer partner with a distinct personality, it’s presumably gotten better at faking long-term memory, at least.

But I think anything you can reasonably define as a “game,” no matter how simple it might be, requires a “game state” of some sort. And maintaining that state doesn’t scale well at all, even for systems that seem to have solved the problem of indexing unfathomably complex data. Each new variable you introduce adds a huge layer of complexity, because the model isn’t actually doing any reasoning, but manipulating data sets. Again, as I understand it, you’re inevitably going to reach a point where the system starts contradicting itself or simply spitting out nonsense.
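In practice, the usual workaround is to keep the authoritative game state in ordinary deterministic code and restate it in every prompt, rather than trusting the model to remember anything across turns. A minimal, hypothetical sketch (the model call itself is omitted; all names here are invented for illustration):

```python
# The authoritative game state lives in plain code; the model only ever sees
# a per-turn summary of it, because nothing it "remembers" can be relied on.

from dataclasses import dataclass, field

@dataclass
class GameState:
    location: str = "cellar"
    inventory: list[str] = field(default_factory=list)

    def summary(self) -> str:
        items = ", ".join(self.inventory) or "nothing"
        return f"The player is in the {self.location}, carrying {items}."

def build_prompt(state: GameState, player_input: str) -> str:
    # Every turn restates the full relevant state before the player's input;
    # the reply to this prompt would come from an LLM call, not shown here.
    return f"{state.summary()}\nPlayer says: {player_input}\nNarrator:"

state = GameState(inventory=["lantern"])
print(build_prompt(state, "I light the lantern."))
```

Note what this buys and what it doesn’t: the state itself stays consistent because the code owns it, but every variable you add is another thing that has to be serialized into the prompt each turn, which is exactly the scaling problem described above.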

You Were More Fun When You Were Stupider

That might seem like another problem that will inevitably be solved. After all, it wasn’t that long ago that the idea of a natural language chat bot that could give any kind of coherent response to a prompt was pure fantasy.

But that’s also why I think “fidelity” is the fatal flaw: the closer these systems get to approximating general intelligence, the less interesting they get for anything creative. Making them “better” for utility makes them worse for fun.

We’ve already seen this play out on Janelle Shane’s AI Weirdness blog. The more “advanced” these systems get, the less interesting they become. They’re deliberately eliminating every weird output that we could interpret as creative or even amusing.

Interacting with the simpler models can’t really be described as a “game,” since the workings are mostly impenetrable and unpredictable. You can’t really keep prodding at the model to figure out what it’s going to do next; you can only be mildly surprised when it spits out something different.

Interacting with the more sophisticated models will inevitably approach natural language Google searches, or at best, replacing a player with the dullest person you know: someone who, in place of imagination, has a quirk of confidently and inexplicably shouting out random nonsense. Improv with an extremely well-read but unstable weirdo.

At either extreme, you don’t get that meaningful spark of the unexpected. It’s either completely predictable in a way that adds nothing, or it’s completely unpredictable in a way that’s more chaotic than creative.

Maybe you could make a game with an AI agent who’s a mutineer? Where you have to keep accomplishing tasks before the one crew member has an episode that brings everything crashing down?

In any case, I’m a lot more confident that we won’t see anyone come up with the breakthrough that makes interacting with a large language model game-like, or even fun. I’m more optimistic that the underlying machine learning technology could be used to analyze complex game states and result in better AI players — the Dominion mobile games use this approach, and by all accounts are excellent — but generative content is a dead end. Too chaotic in its current state, too boring in the future.

Usually it’d be a bad idea to come up with a list of obvious possibilities for something like a style of game, reject them, and then extrapolate from that to say it simply can’t be done. But I say if it can’t even handle the obvious cases, why should I expect anything from the edge cases that no one’s thought of yet? It would be like expecting someone to come up with Tempest without anybody first demonstrating how fun Pong is.

It’s bad at retaining information, and it’s getting increasingly worse at meaningful or interesting improvisation. Occasionally, it inadvertently comes up with a funny bit of wordplay, like with the question “Is Marlon Brando in Heat?” but you can’t rely on accidents like that for anything resembling long-term fun.

Supply, Demand, and an Arrow to the Knee

The question I keep going back to is “what exactly are you trying to accomplish?” or, more accurately, “who are you trying to replace?” What is it that keeps this from being trillions of dollars’ worth of solution in search of a problem?

Conveniently for someone like me, who’s against all of it, that leads directly into the topic of using generative AI to develop a game. (Something that Lantz asserts is already happening everywhere, as if its utility were patently obvious.)

The alleged promise is games that are more perfectly reactive and responsive. In-game agents that can engage in natural language conversations with the player, giving reasonable responses to whatever they might say. Or a system that can create game levels on the fly, giving players an infinite number of spaces to explore, and maybe even throwing in obstacles or rewards in real time, in response to what they’re doing.

Again, being as charitable as possible, assuming that everything laughably wrong with the technology now is merely early steps, and we will some day be able to see the over-hyped promises become reality: so what?

There’s a reason that the term “slop” has stuck, and it’s not just a bunch of nay-sayers being overly negative. It’s because the idea of supply and demand can extend to areas outside of economics. The more “content” you have, the less valuable each piece of it becomes.

To be clear, I do genuinely believe in all of the eloquent arguments that have been made about how real, human-made art is inherently superior to generative AI, because the art is in the process of making it as much as it’s in the end result. But it’s just as important to emphasize how much the cut corners are evident in the end result.

Even in my own work, even if I don’t remember the process of writing something, I can immediately tell the difference between when I struggled to word something exactly the right way, vs when I was slapping something together to get it done.

And I’ve long been critical of BuzzFeed as being the worst offenders of the Age of Disposable Content, even though there’s often not anything egregiously wrong with the writing itself. It’s usually not badly written, and it’s often a lot more accessible and focused than the stuff I write. But it always has the unmistakable feel of being written to fulfill a quota, instead of being a sincere expression on the part of the writer.

One of the tenets of being a programmer is “the less code required to accomplish a task, the better,” which is why using AI to generate more code has always seemed like a huge mistake. That extends to writing, as well: the secret to good writing is never “lots and lots of it.” It’s the skill that I’m still constantly trying and failing to perfect: distilling everything down to the perfect wording, being memorable and saying exactly what it needs to.

Even if you somehow had an in-game agent that could be trained on all of your game’s lore and history, could be trained on exactly the character’s life story and specific voice, and could reliably respond correctly and convincingly to anything the player asked, what are the odds it’s going to come up with something that’s more than functional? That perfectly evocative turn of phrase that sticks with you for years afterwards?

If the character is just supposed to be functional, then the longer you engage with them, the more the conversation will obfuscate their function. Whatever idea(s) they were trying to get across will be lost in the sea of words.

If they’re supposed to have actual personality — for instance, to be evasive — then the unpredictable nature of an LLM is a liability, not an asset. If what they say can change each time, there’s no guarantee that they’re not saying something misleading, wasting the player’s time.

If they’re just there for “immersion” as part of the background, then what’s to be gained by giving them a complete (and expensive) LLM, instead of a finite set of succinct and evocative lines of dialogue?

The more capable they are of engaging with the player, and the longer they spend doing so, the more they get elevated from background to major character. Outside of a game like the Fable series (and it remains to be seen whether its promise of “every single character is alive!” actually delivers), adding more stuff to each of the characters doesn’t actually create an immersive world. It just becomes noise.

It’s become a mantra whenever generative AI is discussed: if you didn’t care enough to write this, why should I care to read it? And it’s more than just a quick dismissal; it’s asking everybody to take a step back and consider what exactly we’re actually trying to accomplish.

The answer is always some form of “stuff I don’t want to do, and lots of it.”

And it’s a drag, because not only does it threaten to bury players under slop that the developers don’t care about, but it’s prone to drowning out the stuff the developers do care about. You can see the economic incentive to reduce creativity to “content” and delegate it to someone else’s computer, but it’s discouraging to see the developers themselves buy into it.

Just one seemingly innocuous example sums it all up: you often see games on Steam that use a gen-AI image as their store thumbnail. Even if it’s not anywhere near the style of the game itself, it attracts attention and (if you’re not looking closely) gives the game a more “professional” sheen. But it means that whatever creative or original work might be contained in the game, you’ve chosen to give the player this unrelated image, created by someone or something else, to represent everything inside. You’ve abdicated your voice in the one thing that could have been the strongest distillation of it.

Not Exactly What I Had In Mind

The appeal of using generative AI, whether for creating static assets or for running live in the game itself, comes down to money, time, or skill.

I’ve got some degree of sympathy for the money argument; for projects with very limited budgets, it can be tempting to go with a slick piece of cover art, or a fake voice actor, or some AI-generated music. But again, I’ll point out that every time I’ve seen a game with AI-generated cover art, it’s felt like such a bait and switch that it’s killed any interest in the game itself. Not even the old Atari 2600-covers style bait and switch, either, where the cover painting was so evocative, even if it had nothing to do with the actual game. The art that I always see associated with game storefronts is invariably the slickest, blandest take possible on whatever might have made the game itself unique. If you can’t afford to hire someone to collaborate with, I’d always prefer an amateurish but sincere take.

And even on projects I’ve worked on that did have the budget for actual talent, it’s always been the case, without fail, that a real musician, voice-over artist, or character designer has delivered something better than I’d imagined. The promise of generative AI is that, at some point in the always-near future, you can describe what’s in your head and the computer will deliver exactly that. Even if that were possible, that’s such a depressingly low bar. We don’t need trillions of dollars of investment when we have people who can deliver something better than the version that’s in your head.

I absolutely understand how writing can feel tedious, especially when it’s something that feels completely functional. There were many times during the Sam & Max games when I cursed the decision to have every interactive dialog include four choices, because it so often felt like scraping the insides of my skull to try and come up with a fourth option that was funny at all. The correct answer would’ve been to just drop that requirement, but even the push to come up with something often forced me to come up with a gag or an idea that made the exchange come to life and feel less like it was purely functional.

I’m not going to pretend that there’s a special kind of magic that gets unlocked by toiling over a bunch of random barks and lovingly hand-crafting each one. But occasionally you strike gold with a good one, and that becomes a stand-out memory of the game.

And even if that’s not the case, any time you’re writing something that feels functional and tedious, and tempted to go to the generative AI tools, it’s a perfect opportunity to consider how much of it is even necessary. Again, if it’s tedious for you to write, how is it not going to be tedious for the player to listen to?

The writing is what I’m calling out, because it’s what I’ve got the most experience with, but I think the idea applies to everything. Including any “live” content like AI agent NPCs or AI-created (as opposed to procedural, which is significantly different) levels. Is generative AI making things better, or is it just making more? And are you wasting the player’s time, giving them more stuff that you care so little about that you’re content to relegate it to a semi-predictable black box you don’t own?

Ultimately, generative AI is a land rich in contrasts. It’s here and inevitable, whether you want it or not; but somehow also demands that you get on board right now before it’s too late. It is both all-powerful and yet always in a state of becoming. It’s improving at a breakneck pace and yet still has laughably, unmistakably unusable results as often as not. It’s a tool to enable workers that also promises executives and managers that workers can be laid off in record numbers. It is simultaneously so sophisticated that it might even be conscious — who can say, really?! — and yet it demands that you use it the correct way for specific cases that are within its still-unspecified problem domain. It has access to all the world’s knowledge and has rendered Google all but unusable. It is the future that’s already here. It is infinitely scalable and yet requires buying up literally all of the planet’s RAM supplies. It is absolutely not a scam.

When you’re dealing with something that is Everything And Nothing, it can be easy to overlook the most basic questions. In terms of games and game development, I think the basic question is what are you trying to accomplish? And is this really a useful tool that can help you express your own voice? Or is it actually supplanting creative work in order to churn out a supply of stuff that drowns out your own voice?
