Chatting With Nvidia’s Generative AI Characters Felt Like Next Level D&D

During March’s Game Developers Conference (GDC) in San Francisco, Nvidia set up a hotel room to show off demos of its generative AI-powered technologies. Some are available to try in beta while the more complex ones aren’t ready to implement in games just yet. They all harness generative AI in different ways, ranging from dynamically creating dialogue and responses for non-player characters (NPCs) to modernizing graphics in old games. Nvidia’s experiments explore how generative AI might augment tomorrow’s gaming experiences. 

Unlike general-purpose generative AI tools such as ChatGPT for text and Midjourney for images, Nvidia’s generative AI technologies are aimed at in-game experiences, meant to expand a developer’s toolbox rather than produce standalone content.

Read more: Final Fantasy VII Rebirth sets the curve for revisiting old games

Nvidia’s experiments come at a time of excitement and uncertainty for generative AI. Talk about AI’s impact on the games industry spilled out from scheduled GDC sessions to roundtable discussions to casual conversations. Developers worry about how AI will affect their labor. When I asked Nvidia whether its NPC chat generator could take work away from the writers who craft dialogue, a spokesperson pointed out that writers still have to create the entire backstories the generated dialogue draws from, and developers would have the option to install guardrails for how NPCs act.

It’s hard to predict the impact of Nvidia’s AI solutions, and that’s why the company was at GDC: to gauge developer interest. Nvidia’s experiments offer developers a different way to build their NPCs, assuming the technology matures and works as promised. At this point, we can only speculate which game genres would benefit most from such tech, and how it might change gameplay in the future.

Hands-on…or rather, voice-on with generative AI

When I walked into the Nvidia demo space, I saw half a dozen stations, each with a monitor and PC running a separate experience ready to test.

The first was ChatRTX, which lets users personalize a chatbot with their own content. It was revealed at CES 2024 in January and released free to the public in February. It’s meant to search only your own files, so an Nvidia spokesperson showed ChatRTX summoning photos of them hiking. Game developers could use it to surface files from asset libraries. Since searches happen locally on the computer, search and document history remain private. The spokesperson suggested other potential uses for players, like generating trash talk in Rocket League on the fly while you play (he was clearly joking).

My next demo showcased NPC technology. In a sample game scenario, I played a private investigator tasked with entering the hotel room of a CEO. I walked up to the bellboy in the lobby. I didn’t enter prompts or sample dialogue. Instead, I talked to the bellboy through a microphone, asking real questions. He answered me with responses generated by Nvidia’s NPC AI tech.

It was a surreal experience, and frankly, I felt put on the spot to physically talk my way through a game. I bluntly asked for my target’s room number; no nuance, no plan. The bellboy responded stiffly, if politely, and told me to inquire with the hotel worker at the desk. When she proved equally buttoned up, I waltzed over to charm the CEO’s right-hand executive who also happened to be in the lobby.

The same Nvidia NEO NPC demo, but note the laptop on the left, which is running the background information that generative AI draws on to form the non-player character’s responses.

David Lumb/CNET

Since I had a short time with the demo, the Nvidia spokesperson suggested ways my private investigator could get what he needed in the game. There was a spare security badge lying around that I could use to impersonate an employee, and a note written on a napkin behind the bar could have gotten me into the bellboy’s good graces.

While these felt like familiar ways to bypass an objective in a Deus Ex game, I was still stonewalled in my conversations. It took a lot of improv on my part and more effort than just picking prewritten dialogue options, and I wasn’t prepared to think my way out of these situations. That challenge did connect me more to the character I was playing, so it’s one way to further immerse players in a game’s world, but it takes a level of effort some gamers may not want to spend on a relaxing pastime. Process aside, the game did generate cleverly apt responses to what I said. It almost felt like the characters were listening.

After I successfully (albeit clumsily) got the CEO’s room number, the Nvidia spokesperson broke down how the AI NPCs worked. The tech behind the characters’ responses is called NEO NPC, which Nvidia created in partnership with AI engine creator InWorld and game publisher Ubisoft. They also used another Nvidia technology, called Avatar Cloud Engine (ACE), to match the characters’ mouth movements to the generated dialogue.

InWorld created a back end for each character that referenced an extensive background dossier. It’s the same kind of open-ended back-and-forth that Dungeons and Dragons and other tabletop RPG systems have between players and a game-running dungeon master, but this demo showed how video games could get that functionality without a person supervising the experience.

Nvidia’s spokesperson, addressing the concern over AI in gaming, pointed to the demo as evidence that AI isn’t replacing narrative designers and writers, who still do the work of hashing out an NPC’s history, hopes, fears and desires for the player to discover and creatively exploit. Designers and writers can also tinker with NPC conversational styles to give them guardrails and define the topics they’re willing to talk about.
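Neither Nvidia nor InWorld detailed exactly how NEO NPC structures that dossier, so the Python below is a hypothetical sketch rather than the product’s actual format: the character name (Marco), the backstory and the off-limits topics are illustrative stand-ins, folded into the instructions a dialogue model would receive before the player says a word.

```python
from dataclasses import dataclass, field

@dataclass
class CharacterDossier:
    """Hypothetical per-NPC background sheet a writer authors up front."""
    name: str
    role: str
    backstory: str
    fears: list[str] = field(default_factory=list)
    off_limits: list[str] = field(default_factory=list)  # guardrail topics

def build_system_prompt(d: CharacterDossier) -> str:
    """Fold the dossier into instructions for a dialogue model (illustrative only)."""
    return (
        f"You are {d.name}, a {d.role}. Backstory: {d.backstory} "
        f"You privately fear: {', '.join(d.fears)}. "
        f"Never reveal or discuss: {', '.join(d.off_limits)}. "
        "Stay in character and keep answers to one or two spoken sentences."
    )

# Example: the kind of dossier a writer might hand the hotel bellboy character.
bellboy = CharacterDossier(
    name="Marco",
    role="hotel bellboy",
    backstory="Has worked the lobby for 12 years and knows every regular guest's habits.",
    fears=["losing his job", "upsetting hotel security"],
    off_limits=["guests' room numbers", "security schedules"],
)

print(build_system_prompt(bellboy))
```

The writer-authored dossier, not the model, is doing the narrative heavy lifting here, which is the point the spokesperson kept returning to.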

Ubisoft’s research and development team used NEO NPC technology to experiment with emotional responses to conversations players have with non-player characters.

David Lumb/CNET

More emotions in your AI NPC

The second NPC demo was hosted by Ubisoft and showed how the NEO NPC project could add emotion to characters. I played a freedom fighter-to-be, starting with a conversation opposite a beanie-wearing, grizzled-but-earnest believer vetting my commitment to the cause of overturning a corporate dystopia.

Once again, I leaned into the microphone and talked to the NPC believer who asked me if I was ready for questions. Now with more practice, I embraced the improv aspect of all this and said “It’s cool with me, daddy-o.” To which the believer, unfazed by my lameness, responded “Wow, you really are a cool cat. Anyways, what do you want to talk about?”

After a few interactions, I saw my relationship with the believer improve via a progress bar. Ubisoft’s technology analyzed the emotional charge of what I said; if it aligned with the believer’s values and temperament, it added progress to the relationship bar. If I pleased my NPC friend, perhaps by bragging about my accomplishments or complimenting him, I could reach a higher level and unlock more interesting info, which, in real-time conversation terms, means he’d open up about sensitive topics. Winning favor with one person might also improve my standing with others in his faction.
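Ubisoft didn’t explain the mechanics of that emotional analysis, so here’s a hypothetical sketch of how such a relationship meter could work, with a toy keyword scorer standing in for whatever sentiment model the studio actually uses; the word lists, point values and threshold are all invented for illustration.

```python
import re

# Toy stand-in for a sentiment model: score the emotional charge of a line
# and nudge a 0-100 relationship meter accordingly (values are invented).
FRIENDLY = {"respect", "thanks", "impressive", "love", "brave"}
HOSTILE = {"boring", "useless", "hate", "corporate"}

def score_line(line: str) -> int:
    """Crude emotional charge: +1 per friendly word, -1 per hostile word."""
    words = set(re.findall(r"[a-z']+", line.lower()))
    return len(words & FRIENDLY) - len(words & HOSTILE)

def update_relationship(meter: int, line: str) -> int:
    """Nudge and clamp the meter; higher tiers unlock more sensitive topics."""
    return max(0, min(100, meter + 10 * score_line(line)))

meter = 50
for line in ("I really respect what you're building here.",
             "Honestly, this resistance stuff sounds boring."):
    meter = update_relationship(meter, line)
    tier = "opens up about sensitive topics" if meter >= 60 else "stays guarded"
    print(f"meter={meter} -> NPC {tier}")
```

The clamp and the tier threshold mirror what I saw in the demo: earn enough goodwill and the NPC starts volunteering sensitive information; lose it and he stays buttoned up.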

While Ubisoft worked with InWorld’s AI models and used Nvidia’s ACE tech to match lip movements to the generated audio for the demo, it was the game publisher’s own research that added more emotion to the NPCs.

“We have defined our own emotions, the character personalities, and we want to tweak the body language of each character so it’s a real character,” said Ubisoft senior data scientist Mélanie Lopez Malet, who led the demo.

For now, Ubisoft’s tech only takes into account what you say, not your tone. Ultimately, the team doesn’t want to punish a shy person for not speaking more confidently, nor demand more performative energy from a player gaming at the end of a long day.

The demo, which Malet insisted was a proof-of-concept and not an eventual player-facing NPC AI experience, had three sections. After the first, I skipped the second scenario and jumped to the third, which saw my nascent freedom fighter planning a mission with an NPC resistance officer against a dystopian corporation. My goal was to make it past guards and cameras to extract crucial data. How I went about surmounting these obstacles was up to me. 

At Malet’s suggestion, I explored the freedom fighter’s HQ to take stock of assets, which inspired me to consider new methods to start the heist. Perhaps instead of ladders to get into the window, we’d use grappling hooks. Then to deal with the guards, we could use tranquilizer darts…and so on. 

There were hiccups in my conversations, as befits a proof-of-concept. At one point, I jumped ahead a couple of steps in the plan and had to be yanked back to finalize the method for the task at hand. The NPC resistance officer didn’t remember what solutions I’d previously pitched. Malet also recalled other demo attendees failing to convince the resistance officer to accept suggestions because their phrasing wasn’t confident or direct enough — what are called “emergent behaviors” in which the NPC interprets something differently than a human would based on subtle phrasing choices.

The left display shows Half-Life 2 running normally, while the right display shows Nvidia’s RTX Remix tech at work. Note the differences in tree branches, textures and lighting.

David Lumb/CNET

Upscaling old games with generative AI

Another generative AI application Nvidia demoed was RTX Remix, which essentially remasters a game for modern graphics using ray tracing and Nvidia’s DLSS.

The example Nvidia demoed was the classic game Half-Life 2, which had bleeding-edge graphics when it was released in 2004. Two decades later, the seams certainly show: angular trees, flat objects and low-resolution textures. With Remix, the game was upscaled to look like it came out just a few years ago.

Half-Life 2 with Remix won’t hold a candle to a modern ray-traced game, but it still looks leagues beyond the 2004 original. What’s most fascinating is how Remix intelligently adds not just textures but lighting depth. A tree may have the same simple cylindrical geometry it originally did, but the texture gains knots and bark divots that the lighting ingeniously curves around, simulating depth on a flat surface.

Likewise, brickwork gains shadows from light sources despite the texture still being flat. An Nvidia engineer even popped into the console settings and tweaked the brick wall’s mortar to appear deeper, and shadows filled the gaps between individual bricks. It was impressive to see a game I remember playing half my lifetime ago get effectively remastered before my eyes.

Nvidia’s demo space outside GDC 2024 showed half a dozen different generative AI-powered enhancements of the gaming experience.

David Lumb/CNET

So what does this promise for games?

Nvidia has a few irons in the fire when it comes to using generative AI in gaming, but the most potentially revolutionary is figuring out the next generation of NPCs. Wouldn’t you want NPCs that dynamically respond to players for a more organic conversation? Wouldn’t it be better to give players more control over how a scene progresses?

Playing with — really, conversing with — NPCs that dynamically responded to my questions and picked up on my sass was pretty fun. In my Ubisoft demo, I was initially sympathetic to the NPC believer’s anti-corporate past and gained his favor, but then I proclaimed a love for corporations and received a frosty response. Likewise, in the heist scenario, I kept suggesting we scrap the careful plan, blow up the wall and jump out to escape on gliders. “This isn’t a video game,” the NPC resistance officer crisply responded, telling me to suggest something more practical.

Physically talking my way through a scene certainly took more effort than clicking dialogue options. It wasn’t just the act of speaking, but burning brainpower to figure out the best way to ask a question and craft it to fit the “role” of the character I was playing. Nvidia shrewdly compared the approach to playing tabletop RPGs, but those are typically group experiences where players get breaks between standing in the spotlight and dictating their actions and questions. Do video game players really want to have full conversations with every NPC they come across?

It’s empowering to have the freedom to needle any NPC about their hopes, fears and desires, but letting players dive deep into conversation with every unimportant character they meet could turn NPCs into distraction traps that obscure the game’s main storyline. And letting players string out interactions until they eventually unearth the info they need means the game has no control over how that information is delivered, as Aftermath pointed out, robbing it of crafted moments like the controlled cinematic experiences that have made games such as The Last of Us and Uncharted so beloved.

Those moments in award-winning older games were skillfully crafted by writers and narrative designers, and given the concern about generative AI, it’s worth being wary of how implementing this tech changes how games are written and designed. The same goes for the voice actors who provided voice samples used to auto-generate the NPCs’ dialogue in these demos — Nvidia confirmed those voice actors were compensated accordingly, but it’s worth wondering how voice acting could change with this technology.

Looming over all these possibilities are the specs required to use generative AI in games. Every demo in the room was running on at least an Nvidia RTX 4080 GPU, which costs at least $999, and others were powered by the RTX 4090, one of the best consumer GPUs the company offers, which starts at $1,599. While Nvidia said the features demoed in the room could run on the previous 3000-series GPUs, none of the demo PCs had one to compare, and even those are still pretty pricey.

These features, even upscaling a game as old as Half-Life 2, require serious performance. AI text generation speed is measured in tokens per second (a token is roughly a word or word fragment), and models need to churn out around 20 tokens per second for responses to feel seamless. The 4090 manages probably 100 tokens per second, an Nvidia spokesperson told me. It’s hard to imagine a 2000-series GPU or lower managing enough speed to use these features.
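As back-of-envelope arithmetic based on those quoted figures (the reply length is my own assumption, not a number Nvidia provided), dividing a reply’s token count by a GPU’s throughput gives a rough sense of how long a player waits for a full response.

```python
# Rough reply-latency estimate; the 40-token reply length is an assumption.
REPLY_TOKENS = 40  # a short spoken NPC reply

for tokens_per_sec in (20, 100):  # "seamless" threshold vs. the quoted RTX 4090 figure
    print(f"{tokens_per_sec} tokens/s -> ~{REPLY_TOKENS / tokens_per_sec:.1f} s per reply")
```

At the quoted 100 tokens per second, a short reply finishes in well under half a second; at the 20-token-per-second threshold, it stretches to about two seconds, roughly the edge of what still feels conversational.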

Most or all of the features Nvidia had in its demo room are works in progress, and they’ll likely look a lot different once they’re ready to be implemented. It’s intriguing to see what games could be capable of with generative AI, and although it’s clear that these applications won’t work for every game genre (who needs an endlessly-talking NPC in a PVP team shooter, or a puzzle game?), the more technical and back-end uses will likely give developers more tools for years to come.


David Lumb
