Your next best friend may be 100% AI w/ Purnendu Mukherjee (Transcript)

The TED AI Show
June 18, 2024

[00:00:00] Bilawal Sidhu:

It is the 26th century, and you're moving through a world as a cybernetically enhanced super soldier on the planet of Reach. You have a big mission in front of you. As Master Chief, you are the last hope for humanity against a hostile alliance of alien forces. Only you can stop the Covenant. Well, only you with the help of your trusty AI sidekick.

A hologram with a cropped haircut and shimmering ocean blue skin. Cortana.

Actually, it was 2001, and I had just hooked up my Xbox to play the first game in the series, Halo: Combat Evolved. I was 11 at the time, and my brother, our friends, and I would pile into our Punjabi living room to play a campaign in split-screen mode. I was totally sucked into the game, invested in the story, and Cortana was key to that with her humor and emotional depth. Even though she's just a blue hologram, ironically, she was the key to the game's humanity.

She was my trusty co-pilot guiding me through these alien worlds, and throughout the game, I felt a real bond starting to emerge. It seemed totally novel at the time, but I was forming a friendship with a non-player character.

I am Bilawal Sidhu, and this is The TED AI Show where we figure out how to live and thrive in a world where AI is changing everything.

Now, non-player characters, or NPCs for short, have always been a big part of video games. In single-player text-based computer games, players could interact with NPCs in a very limited way, just putting in a very specific command and having the NPC respond from a predetermined script. Think the original King's Quest, where players typed directly into a text box to interact with NPCs who were there only to move the plot along.

This evolved into single-player games like Halo, where NPCs like Cortana had more dynamic personalities and quirks written for them, though they were still limited by the design of the game and therefore static. Now, if you think about multiplayer games, they allow for human-to-human interactions. Think Second Life, where you're interacting with other players through their avatars.

What's so immersive about that is you're engaging with real people who are responding dynamically to you, and you're responding dynamically to them, human to human. NPCs exist in this world, and while they can be interesting, you would never confuse these NPCs with fellow gamers. NPCs just haven't been real enough to be mistaken for actual people.

Up until now, these NPCs have been relatively passive. But now, with generative AI, those NPCs no longer have to choose between a limited number of responses. They can script in real time, just like a human player would, reacting to what's happening inside of the game world. Suddenly, video game worlds can be infinitely more immersive and interactive.

As these virtual characters become more integrated into our daily lives and maybe even become our friends, are we gonna start spending all our time in these virtual worlds? Will our interactions with NPCs start to become indistinguishable from our interactions with humans? And what do we gain or lose if this happens? This is the domain of Convai, spelled Con-V-A-I, a platform that enables developers to create NPCs with human-like conversational abilities.

Convai says their goal is to help developers create virtual characters that can converse in the present moment and build long-term human relationships with human players. There are over a thousand projects being actively built on their platform by creators ranging from AAA studios to indie developers.

And what makes Convai so interesting is that they're making technology that enables game developers to create NPCs that are so lifelike, it won't matter who or what is on the other end.

Purnendu Mukherjee is the CEO of Convai and our guest on the show today. He has a lot to say about our evolving relationship with NPCs, or as he likes to call them, AI characters. And like me, he was a total gaming nerd. Purnendu, welcome. I'd love to start talking about the origins of Convai. I get that you were a gamer before you worked at NVIDIA, but obviously there's a lot of areas in game development you could have branched into.

Why were you drawn to the notion of AI NPCs specifically?

[00:04:35] Purnendu Mukherjee: Before I started at NVIDIA, when I was doing my thesis work, I literally saw this language model wave coming. I wrote this: “While language models are going to get bigger and better and will potentially even have, um, abilities to have human-like conversation, it is still not going to have the same level of understanding as we humans do, because we humans don't think from text in, text out, or text to text.”

We think from a 3D world around us. Right. Since we are born, we first understand locomotion, like moving around the 3D world, and then we attach words to these objects. We assign meaning, right? So, so basically we are multimodal creatures.

Where do we find such a multimodal environment where we could potentially have these AIs live, train, and, and, and iterate themselves? Virtual worlds. What kind of virtual worlds would have, uh, people that can provide feedback to this AI? Uh, heavily populated worlds are games, right? So all those connected together, like if we have to create this human-like mind within a virtual world, NPCs, non-player characters, or let's call it AI characters, embodied in a way, are one of the best vehicles to do that.

[00:06:03] Bilawal Sidhu:

It's almost like, you know, we've got these rich environments where you can embody this sort of AI agent and have it experience very similar stimuli to what we might experience in the real world, which is a perfect segue into the evolution of AI NPCs. Right? This is a non-playable character. This is sort of set dressing, the side thing, in this, like, you know, kind of like the side dish to go with the main course, which is the game itself.

[00:06:29] Purnendu Mukherjee:

Right.

[00:06:29] Bilawal Sidhu:

And so I'm kind of curious, um, what's your, what's your historical perspective on how AI NPCs have evolved from their earliest stages to the complex entities that we see today in games and virtual experiences?

[00:06:43] Purnendu Mukherjee:

To talk about the history of games, I mean, um, there are these pioneering, genre-defining games, all the way from Half-Life, that, you know, like single-handedly defined the first-person shooter genre.

And of course, Counter-Strike like the multiplayer aspect of it, that you could have many people in the same world, right. In, in, in various ways of gameplay, basically revolutionized gaming, you know, they could play with each other, so you don't need new gameplay as long as people can involve with each other.

So, so like that has evolved. NPCs have, of course, gotten better, mostly, uh, on the visuals or, you know, animation front, uh, but not on the intelligence front as much. So what we're seeing, it's almost like a Cambrian explosion of, uh, characters and, and AI agents that can be very human-like in terms of interaction with the players.

Just like players played with players, now AI can also play with players, both as friends or enemies, you know, cooperatively or competitively.

[00:07:51] Bilawal Sidhu:

I, I think it's this magical moment where now we've of course got these behemoth, large language models, but you also have the, you know, kind of multimodal models hitting, hitting the scene where it's not just that they can understand text, these models can understand audio, can understand imagery, can understand even video, right?

And so I'm curious to dig into how that affects human-AI relationships. So how do you see AI NPCs sort of changing the nature of player engagement and emotional investment in both games and experiences?

[00:08:24] Purnendu Mukherjee:

Firstly, the way these NPCs are becoming very human-like, there is going to be a large set of people in the world that will big time benefit from it, mainly because there is a big, big chunk of players who don't like engaging with real people or are, or are nervous or afraid to do that.

Uh, they feel much more comfortable if they know it's not a real person, and that'll help them open up, that'll help them socialize. In terms of people that, let's say, are playing single-player games or multiplayer games, now they can engage with this, uh, with, with the set of AI characters and have a more engaging time, ideally.

And lastly, let's say if it's in a multiplayer environment, people will still enjoy engaging with people, but now they have another reason to have fun together with other people. And then from a relationship standpoint, basically, I think it is important for, for companies like ourselves to look ahead in terms of the positives as well as the dangers.

It is gonna quickly fill in the gaps where a human doesn't exist today, right? Whether it's, uh, uh, you know, like just being friends or from a romantic angle or, or maybe someone that is a mentor or a guide, and not just like ChatGPT, like text in, text out, but a very much gamified, immersive environment that can, um, that can reach them, and they can, they can effectively have this mentor of an, of an AI.

So I think overall, I definitely am an optimist and I see the positive sides. There could be, um, potential darker, you know, sides, dystopian sides, that need to be, uh, addressed and understood and informed.

[00:10:09] Bilawal Sidhu:

I mean, I, I think it's, it's very fascinating. Like, it's one thing to talk to your ChatGPT app, right?

And you see a voice emanating from your phone. It's another thing entirely to, let's say, be talking to your mentor, you know, and it's embodied as like a humanoid character that like has the same sort of expressiveness that you do. It suddenly becomes this sort of more lifelike experience, right?

And so as NPCs become more lifelike, what are those ethical considerations that come into play? Espes, especially regarding player relationships and, you know, AI behaviors.

[00:10:44] Purnendu Mukherjee:

The number one thing that I think AI needs, and this is a bit controversial, but like if you, if you think deeply enough, right? The biggest, uh, fear for AI is centralization and a few entities responsible for these relationships.

When a kid grows up, uh, with an AI teacher and mentor, and that's their all, you know, uh, that's a level of relationship that no company on the Earth should own. It's theirs, absolutely, wherever they want to take it. And, um, what can enable this? I think decentralized blockchain technology can provide true ownership.

That is gonna be very, very essential, along with confidential computing, uh, that can, uh, help ensure that, that their data remains theirs, their memories and relationships remain theirs.

[00:11:43] Bilawal Sidhu:

Yeah, I mean, you brought up a bunch of very interesting points, right? It's, it's like, um, if you do have this future where, let's say we have an oligopoly of companies that sort of own the models that mediate your relationship with these digital characters, right?

And especially if this is like kids who are like growing up talking to these, you know, NPCs or AI agents, whatever you want to call them, yeah, you are building, like, a very rich history of sort of their hopes, wishes, anxieties, worries, desires, and how those evolve over time. Right? And you know, these agents are getting to a place where it's not just like, “Oh yes, you say this and I will say this.”

It's like they remember the context and glean insights from your previous conversations. And you know, maybe the right way to solve that is with decentralized AI.

[00:12:30] Purnendu Mukherjee:

Right.

[00:12:36] Bilawal Sidhu:

And you alluded to confidential computing as well. It's like, can we keep this sort of very rich data, um, you know, as close to the user as possible? Or, you know, and even if you are going to learn and improve models off of it, you do it in this like privacy-preserving way, which, you know, itself sounds like a tall order.

I'm really keen to dig into sort of the experiential and sort of emotional impacts of these type of AI agents. Right? And so like one example that I keep going back to is, “Hey, even if you're playing a solo, single-player game, you could have this, like, AI J.A.R.V.I.S. or Cortana that sort of understands you, your context, and is with you, helping you navigate this sort of game world.”

Um, how close are we to having that sort of companion in games?

[00:13:23] Purnendu Mukherjee:

The primary aspect that is missing, I mean, to some extent, right, is the multimodal aspect of it, for that to happen at scale. Like, very much similar to J.A.R.V.I.S. of Iron Man, uh, it is aware of the entire context. That means every nook and cranny of the room that you are in, you know, like they're able to see that, process that, along with your, uh, digital presence.

So, we are not far away. Like, uh, it may not be at a hundred percent of Iron Man's J.A.R.V.I.S. capability, but like if you play the game, it'll feel like that.

[00:14:02] Bilawal Sidhu:

How does all of this change the way these experiences are authored? Right. The, the analogy that keeps coming to my mind is sort of the narrative division of Westworld.

[00:14:11] Purnendu Mukherjee:

Mm-Hmm.

[00:14:11] Bilawal Sidhu:

Is that the right way to think about how you're authoring these type of experiences with these rich characters?

[00:14:17] Purnendu Mukherjee:

The emergent nature of large language models is very interesting. The narrative designers and writers are still the better storytellers, right? We have to have a right mix of both controlled evolution of characters, controlled evolution of stories, as well as the open-ended and emergent behaviors, uh, of these generative AI models, which is where the balance and challenge is, and we are providing all the necessary tools for developers and designers to do that.

To come back to the Westworld analogy, you know, not only did Westworld have these crafted characters that, um, that were some of our favorites, they evolved, right? They evolved into something else entirely than what was originally written, right?

And, and evolved in a very meaningful manner. Remember what it was in Westworld that started the whole thing about these characters evolving? It's memories. Once you have that, their personalities evolve. They remember things. And it's not enough to just give them long-term memory; you have to keep them up all the time.

They go about their day, make decisions, interact with other real people or other AI, and those interactions will change their decisions and their pathways. And some will be very high-intensity experiences, and that will shape their personalities. And this will be ideally best put in servers that can have up to like 250, 500 people.

These, these AI NPCs will always be there living their life. You know, like, uh, their experiences will, will shape them, and they will start making decisions that will be quite bewildering. We already see that, you know, like we did this NVIDIA demo where we had now two characters, and both of them were chatting and, uh, like, we just let them talk to each other, right? Like just to see what they chat about.

And while chatting, Nova made an order: “Hey, can you bring me a drink?” And Jin was like, “Sure, let me get that for you.” And he went ahead and brought a drink and gave it to her. And uh, it was so wild that they are not only just talking with each other, they're like carrying out actions and they're giving commands to each other.

[00:16:27] Bilawal Sidhu:

The demo that you're referring to, which went stupid viral on the internet, I think it got everyone excited to just imagine where games are going next. Right. And so I'd love for you to talk a little bit about the types of experiences and characters people are building with Convai that are exciting to you, both for gaming and non-gaming purposes.

[00:16:44] Purnendu Mukherjee:

The non-gaming examples are primarily in learning and training, education-related areas. And then there's brand ambassadors, which could be in the likeness or digital human of a celebrity or a nondescript model, um, who is AI-powered and knows all about the brand, can guide people how to use the product and whatnot.

These AI characters can be even location-aware, like you scan a particular QR code anywhere while walking the street and they spin up right there and tell you what the directions are. It's not a far future, okay, where we start seeing these embodied AI characters literally everywhere, and very much in public spaces. All the way from, take your favorite mall, to your favorite airport, you will see large screens with these AI characters standing there welcoming you.

But now you can talk to them and ask them which way to go. Um, you know, like, uh, “This is my airport ticket. Which way do I go? Where, where is the security check?” Any kind of information dispensing or transactional engagement, these characters would be, uh, would be perfect for. These are real-world use cases that we are already seeing. Now, finally, coming to gaming.

Well, games are going to be effectively the Matrix, right? So, uh, what do I mean by that? It's gonna be so real that you would prefer living there. All right, so, which is, which is a dystopian future that we have to be aware of. It'll be pretty darn engaging, where instead of the machines taking over, we would be willingly submitting ourselves to these game worlds, which will be extremely exciting, with all of these technologies converging, all the way from your VR devices, uh, VR, augmented reality, and all of that, to, um, extremely high-speed internet.

The cloud computing aspect of it, where these extremely high-definition worlds can be literally rendered on the cloud and streamed to yourself, along with these AI NPCs in these worlds, in these metaverse-like worlds, it'll become a much easier way for people to put themselves in, let's call it the Matrix or the Metaverse.

Alright, so, uh, where, where you can be there, live there almost, uh, to engage with your friends and, you know, learn certain things and, and, and engage with these AI NPCs. And that is, uh, the future, not too far away, that people will start doing that. People already do that, by the way, in a major way in a lot of the social virtual worlds.

Okay. If you think.

[00:19:20] Bilawal Sidhu:

Especially the younger generation, right?

[00:19:22] Purnendu Mukherjee:

Yeah.

[00:19:22] Bilawal Sidhu:

Like it's growing up almost in these worlds.

[00:19:25] Purnendu Mukherjee:

That is true, but also there is a very small minority who have been doing it for a while. There's a large audience, a concurrent daily active user base, for something like Second Life, where we recently launched.

[00:19:39] Bilawal Sidhu:

Hmm.

[00:19:39] Purnendu Mukherjee:

Right. And uh, these people come back daily, live their life, talk to their friends for many years now. There are new, newer platforms like VRChat, uh, you know, like I know people that regularly go there. They party on the weekends. Once the challenges of onboarding and the challenges of, uh, the friction is reduced to get in those worlds, a lot of people will start going there, right?

[00:20:04] Bilawal Sidhu:

Let's, let's take that VRChat example. I think that one's fascinating. It's like, yeah, people are buying expensive, full body tracking setups with, you know, expensive headsets and computers to have this high fidelity embodied experience in virtual worlds. But right now, on the other end are other humans that they meet.

Right. How do you feel about, you know, there being a time, let's say in a couple years, where you're talking to somebody in VRChat and you literally cannot tell if they're human or not? Does it matter at that point? Because like, I don't know, maybe these AI agents would be far more thoughtful and nice to you than perhaps real, flawed humans.

So, I'm curious how you think about that, especially in the context of this Matrix analogy that you're making.

[00:20:49] Purnendu Mukherjee:

You literally reminded me of this movie called Transcendence, and Johnny Depp was the lead character there. Literally that's the case there. Basically, Johnny Depp dies and, uh, before dying, he actually uploads himself like full, full neuron scan of his brain and uploads himself into the internet.

[00:21:03] Bilawal Sidhu:

Hmm.

[00:21:04] Purnendu Mukherjee:

And um, uh, and when he comes back after his actual biological body has passed, people would often ask him, “Are you real? Are you aware?” And his answer would be, “If you cannot tell, does it matter?” So, so that is gonna happen, no doubt.

It's already there from a text standpoint, text in, text out, it'll be hard to tell. And um, today.

[00:21:30] Bilawal Sidhu:

Totally. No.

[00:21:30] Purnendu Mukherjee:

Right. And the other, other aspects would be the visuals of it, the animations of it, and things like that. But also, what might be a giveaway is if they are always of a particular, certain, certain personality type. They need to have a wide array of personalities and even, uh, eccentric AI characters, you know, that are kind of awkward.

And, and some of them are mean, and some of them are super nice and, uh, you know, like, uh.

[00:22:00] Bilawal Sidhu:

You get the full high school experience.

[00:22:02] Purnendu Mukherjee:

Yes, exactly. Those are gonna be necessary for, for, for people to engage. It cannot just be, like, very assistant-like. So the more the variability, the better, you know, like people would like and, and engage with them, and people will find their own type.

Some people are drawn and attracted to toxic people, right? So, so basically they, they will have all kinds of AI basically in, in these worlds. And you'll, you'll choose, pick what you want.

[00:22:29] Bilawal Sidhu:

It reminds me of the conversation with the Architect in The Matrix, where the Architect outlines that, “Hey, you know, like we made a utopian simulation but nobody bought it. It just felt too artificial, and almost introducing the flaws of our humanity kind of is what made it a sticky experience.”

[00:22:47] Purnendu Mukherjee:

Yes.

[00:22:47] Bilawal Sidhu:

Um, which, which I think is very fascinating, right? 'Cause these environments and experiences need to mirror the full range of emotions, um, that we experience.

[00:22:57] Purnendu Mukherjee:

Right, right. So that, that's what we are noticing.

You know, like we did this demo room and people enjoy talking with the meanest character.

[00:23:08] Bilawal Sidhu:

Wow, okay.

[00:23:09] Purnendu Mukherjee:

Like they would want to walk away, and that, that mean character will say something provocative that would draw them back, uh, to talking to that character. That's something that we have to be, uh, conscious about as well. Like that is literally the, one of the reasons that Facebook and TikTok became so popular, because the newsfeed was programmed for, uh, a maximum emotional turbulence. Like, the content that was the most provocative drew people there. That we don't wanna do with our stuff.

So, so like, uh, the right balance between engagement and what's good for the people is something that we plan to do.

[00:23:43] Bilawal Sidhu:

Now there are many negatives to outweigh the positives here, but Purnendu has spoken with me about being a lonely kid growing up, and I had to ask him about the way these experiences can benefit people.

You alluded to, you know, growing up as a kid, you felt isolated and different than other kids. And I just imagine, you know, I, I relate to that experience. I was so like deeply into computer graphics and visual effects and all this other stuff that, like, nobody else cared about at the time, and I obviously, I found my escape through the internet and, you know, OG IRC forums and PHP forums.

[00:24:17] Purnendu Mukherjee:

Hmm.

[00:24:17] Bilawal Sidhu:

I'm kind of curious, what does this do for people, you know, uh, who may feel lonely today? Like what role can these AI characters play in sort of enriching their lives?

[00:24:30] Purnendu Mukherjee:

Big time. You know, big time. There are, of course, online communities where we could meet those like-minded people, but like, they may not be available, uh, at the same time.

But you have this AI character who could effectively have all the right interests. Like think of your best friend that you connected with the most, right? That understands you before you even say it, right? So, these AI characters can potentially be that for them, you know, like, uh, it is risky, but that is where we are evolving to, right? Like undoubtedly.

[00:25:02] Bilawal Sidhu:

And, and do you find in that situation, like let's take, go back to a younger you or me. Do, do you imagine the experience being that these systems can sort of infer what you need, or would, would I be like curating the combination of my like three best friends?

Plus a little sprinkle of like, I don't know, John Gaeta and, you know, some other VFX supervisors that I really like. And lemme throw in a sprinkle of Alan Watts and a sprinkle of like, you know, Einstein in there. How would people define sort of the, their best friend, if you will, in this, in this space?

[00:25:37] Purnendu Mukherjee:

Yeah, that's a very hard one to answer because we don't choose. Uh, I mean, we kind of choose our best friends, uh, eventually, but, uh, we don't choose.

[00:25:47] Bilawal Sidhu:

Based on vibe. Right?

[00:25:48] Purnendu Mukherjee:

Based on vibe. Right, exactly. So, uh, and, and common interests and, and things like that is how, how it evolves. But we don't exactly choose their eccentric things, what they are interested, other things that they are interested in and whatnot.

Right? So, maybe it'll be a multifarious world with lots of different AIs. If these characters have to evolve, if you keep changing them, you will not see their character evolution, right? So you basically go and socialize and you start with your one, and maybe they will adapt to your interests and things like that.

And eventually they will have their own unique experiences too that will shape them effectively, like, um, how it shapes you. Imagine friends that grew up together, uh, maybe this AI can also go out and have, have its experience when, when you are not there, right? So, they have stories to tell what happened today.

[00:26:43] Bilawal Sidhu:

It's kind of mind blowing. What you're saying almost makes me feel like we've been at this phase of technology and the internet where we can organize the world's information and make it universally accessible and useful to use the, the Google mission statement. But now we're heading into a world where we can make the world's people and personalities universally accessible and useful.

[00:27:02] Purnendu Mukherjee:

Yes. And a lot of the technology we are creating, you know, all the way from facial expressions and hand gestures and, and emotional voices, basically powering the mind of these non-player characters, is gonna be the same, um, technology that may be used for a lot of these social robots.

[00:27:21] Bilawal Sidhu:

There's obviously both utilitarian and delightful experiences, uh, your customers are building.

What are you looking forward to?

[00:27:28] Purnendu Mukherjee:

We have set the vision, we have created the tools, uh, and people are developing. In terms of the immediate, uh, not, not just immediate, like medium term, three-to-five-year plan, it is basically to ensure that we have redefined gaming in a very positive way. We have enabled these learning and training experiences at scale, you know, that, that are changing lives of people in a very positive manner.

Uh, brand experiences, product information, uh, real world embodied characters.

[00:28:06] Bilawal Sidhu:

I love the creator centric approach. I think it's so important that we don't forget that creators are going to author these experiences. Thank you so much for your time. Thank you so much. Yeah, it was, it was great chatting.

Around the time Purnendu and I had this conversation, he invited me to the Convai headquarters, where I got to see their NPC innovations firsthand, and let me tell you, it was pretty wild. When I walked in, they had this massive monitor with an AI anime character on it. And the people at Convai told me to just have a conversation with it.

They told me I could talk to it in Hindi, which I was psyched about. Then I tried to press a little deeper. I think there's this really human urge to try and push the boundaries of an AI system to prove that it's intelligent and human enough to exist beyond the constraints of corporate language. Most of my efforts were in vain though because as soon as the AI came back with these canned responses, it kind of ruined some of the effect.

Even though it was giving me unscripted responses, it still had parameters. The AI allowed me to go off script, and even though the model I was talking to recognized what I was saying, it was still responding based on its parameters, the guardrails set by the company. While an AI like this might be great at helping you fight a lethal force of aliens, it's hard to know if it will ever reach the messier more human parts of how we relate to one another.

The TED AI Show is a part of the TED Audio Collective and is produced by TED with Cosmic Standard. Our producers are Elah Feder and Sarah McCrea. Our editors are Banban Cheng and Alejandra Salazar. Our showrunner is Ivana Tucker, and our associate producer is Ben Montoya. Our engineer is Aja Pilar Simpson.

Our technical director is Jacob Winik and our executive is Eliza Smith. Our fact checker is Julia Dickerson and I'm your host Bilawal Sidhu. See y'all in the next one.