🎙No Priors: Noam Brown, Research Scientist at Meta (TRANSCRIPT)
EPISODE TITLE: The bot Cicero can collaborate, scheme and build trust with humans. What does this mean for the next frontier of AI? With Noam Brown, Research Scientist at Meta
EPISODE DESCRIPTION: AI can beat top players in chess, poker, and, now, Diplomacy. In November 2022, a bot named Cicero demonstrated mastery in this game, which requires natural language negotiation and cooperation with humans. In short, Cicero can lie, scheme, build trust, pass as human, and ally with humans. So what does that mean for the future of AGI?
This week’s guest is research scientist Noam Brown. He co-created Cicero on the Meta Fundamental AI Research Team, and is considered one of the smartest engineers and researchers working in AI today.
TRANSCRIPT (Unedited)
SARAH:
This week on the podcast, we're welcoming Noam Brown, a research scientist on the Meta Fundamental AI Research team. Noam co-created the first AI to defeat top humans in two different types of poker. He also recently did an important project called Cicero. In this podcast, we'll dig into how this AI works, what makes for great AI research and engineering, and how AI in games ties into AGI. Noam is considered one of the smartest engineers and researchers in AI. His work has deep implications for how humanity and AI co-evolve. The new bot Cicero can lie, can scheme, it can read a human's intentions and build trust. Cicero demonstrates these skills by performing better than the average human at a classic game called Diplomacy. Noam, welcome to No Priors.
NOAM:
Oh, thank you for having me. Yeah,
ELAD:
Thanks all for joining. So, you know, I think, uh, in the world today when a lot of people think about AI, they think about it as basically you put a couple words into a prompt and then you get out an image, or you have ChatGPT summarize James Burnham's professional managerial class for you in a rhyming essay in the voice of a cat or something. And I think you've pushed in really interesting directions that are very different in some ways from what a lot of people have been focused on. You've been more focused on game-theoretic actors interacting with humans and with each other. And in parallel, you're known, as Sarah mentioned, as sort of one of these true 10x engineers and researchers pushing the boundaries in AI. So I'm sort of curious what first sparked your interest in games and researching AI to defeat games like poker and Diplomacy?
NOAM:
Well, I think, you know, my journey is a bit non-traditional. I started out in finance actually, toward the end of my undergrad career, and right after undergrad I worked in algorithmic trading for a couple of years. I kind of realized that while it's fun and it's, you know, exciting, it's kind of like a game, you've got a score at the end of the day, which is how much money you've made or lost, it's not really the most fulfilling thing that I wanted to do with my life. And so I decided that I wanted to do research, and it wasn't really clear to me in what area. I was originally planning to do economics actually, and so I went to the Federal Reserve and worked there for two years. Honestly, I wanted to figure out how to structure financial markets better, to encourage more pro-social behavior.
In the process, I became interested in game theory, and I thought I wanted to pursue a PhD in economics focused on game theory. And two things happened. First of all, I became a bit jaded with the pace of progress in economics, because if you come up with an idea, you have to get it passed through legislation and it's a very long process. Computer science is much more exciting in that way because you can just build something; you don't really need permission to do it. And the other thing I figured out was that a lot of the most exciting work in game theory was actually happening in computer science, not in economics. So I applied to grad schools with the intention of studying algorithmic game theory in a computer science department.
And when I got to grad school, there was conveniently a professor that was looking for somebody to do research on AI for poker. And I thought this was like the perfect intersection of everything that I wanted to do. I was interested in game theory, I was interested in AI, I was interested in, you know, making something. I had played poker when I was in high school and college, never for high stakes, but I was always just kind of interested in the strategy of the game. I actually tried to make a poker bot when I was an undergrad and it did terribly, but it was a lot of fun. And so to be able to do that, you know, for research in grad school, I thought this was the perfect thing for me to work on. And also I felt like there was an opportunity here because it felt doable, and I kind of recognized that if you succeed in making an AI that can play poker, you're going to learn really valuable things along the way, and that could have major implications for the future.
ELAD:
That's really cool. And did you have a specific end goal of your work when you started it? Or was it just interest? In other words, you know, you talk to a lot of people in the field now and they say, oh, our end goal is AGI, and it always has been, and I think sometimes that's sort of invented later as an interesting story for what they're doing. Did you view this as just doing primary research out of personal interest? Did you view it as, there's a path leading to agents that function on behalf of people? Or was there some other sort of driving motivator?
NOAM:
Well, so I started grad school in 2012 and it was a very different time. In 2012, the idea of AGI was really science fiction. There were some people that were serious about it, but very few; the majority opinion was that AI was, if anything, kind of a dead field. I actually remember emailing a professor and having this conversation where I was like, look, I'm really interested in AI, but I'm kind of worried to pursue a PhD in this because I get the impression that it's just a dead field, and I'm worried about whether I'll be able to get a job afterwards. Conveniently, a couple of years into grad school, things changed pretty drastically, and I happened to be in the right place at the right time. I think I was really fortunate in that respect. So the original intention wasn't to pursue AGI. The original intention was, you know, you learn interesting things about AI and game theory and you build slowly, and it was really only a couple of years into grad school that it became clear that the pace of progress was quite dramatic.
ELAD:
Was there a specific moment that really drove that home for you? I know for some people they mention, oh, AlexNet came out, or, oh, you know, some of the early GAN work felt like a wake-up call. I'm just sort of curious if there's a specific technology or paper or something else that came out, or was it just kind of a continuum?
NOAM:
Uh, I think it was a slow drip. I mean, for me especially, it was the AlphaGo moment. You know, when you see that, it's just very clear. I mean, AlexNet too. Before I started grad school I actually took a computer vision class and they were talking about, like, you know, SIFT and all this stuff, and then you get something like AlexNet and it just throws all that out the window. It's just mind-boggling how effective that could be.
SARAH:
Can you explain actually why AlphaGo was so important, like just the size of the search space, and how you might contrast that to previous games?
NOAM:
Yeah, so a big milestone in AI was Deep Blue beating Garry Kasparov at chess in 1997. And that was a big deal. It's kind of downplayed today, I think, by a lot of machine learning researchers, but we learned a lot from that. We learned that scale really does work. In that case it wasn't scaling, you know, training and neural nets, it was scaling search. But the techniques that were used in Deep Blue didn't work in a game like Go, because the pattern matching was just not there. A big challenge in Go was figuring out how do you even evaluate the state of a board? How do you tell who's winning? In chess it's difficult, but you can handcraft a function to estimate that, right? Like, you calculate, oh, each piece is worth this many points, and you add it together and you can kind of get a sense of who's winning and who's losing. In Go, that's just almost impossible to do by hand.
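To make that contrast concrete, here's a rough sketch, not from the episode, of the kind of handcrafted chess evaluation Noam is describing: assign each piece a fixed point value and sum them up. The board representation here is a made-up minimal one for illustration.

```python
# Minimal sketch of a handcrafted chess evaluation: sum fixed point values
# for each side's pieces. Board is a hypothetical {square: piece} mapping,
# uppercase = White, lowercase = Black.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material_score(board: dict[str, str]) -> int:
    """Positive means White is ahead in material, negative means Black is."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUES[piece.lower()]
        score += value if piece.isupper() else -value
    return score

# Example: White has an extra rook.
print(material_score({"e1": "K", "e8": "k", "a1": "R"}))  # -> 5
```

There is no comparably simple handcrafted function for Go, which is the gap AlphaGo's learned evaluation filled.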
SARAH:
It's essentially too big to do that.
NOAM:
It's too big, it's too subtle, it's just too complicated and there's too much nuance. And the difference is also, if you asked a human, you know, who's winning, they could tell you who's winning, but they couldn't tell you why. And so, you know, one of the things that people assumed was that humans are just better at pattern matching, and to have an AI come along and demonstrate that it can do this pattern matching better than a human can, even if it's in this constrained game, that was a big deal. And I think that was a wake-up call to a lot of people, not just me, but across the world.
SARAH:
I remember, as a former Go nerd, just trying to understand the moves that AlphaGo made to try to figure out how to play better. Because it was such a mind-blowing moment.
NOAM:
Yeah. And you know, if any of your listeners haven't seen the AlphaGo documentary, I highly recommend watching it. I think it's on Netflix or YouTube, and you can see just how significant this was to a lot of the world when you watch that.
ELAD:
How did you end up choosing Diplomacy as the next thing to work on after poker? There's obviously a wide space of a variety of different types of games. So what drove your selection criteria there, and how did you think about choosing that as the next interesting research problem?
NOAM:
So basically what happened is we succeeded in poker, and when we were trying to pick the next direction, it became clear that AI was progressing very quickly, much quicker than I think a lot of people appreciated. And there were a lot of conversations about what the next benchmark should be. A lot of people were throwing around these games; Hanabi was one, somebody was talking about Werewolf or Settlers of Catan, these kinds of things. And I just felt like, you know, this was 2019, and in 2019 you had GPT-2 come out, which was just mind-blowing. And then you also had DeepMind beating grandmasters in StarCraft II, you had OpenAI beating human experts in Dota 2. And that was just after a couple of years of research and engineering work. To then go to a game like Settlers of Catan just felt too easy.
Like, you could just take a team of five people, spend a year on that, and you'd have it cracked. And so we wanted to pick something that would be truly impressive, that would require fundamentally new techniques in order to succeed, not just scaling up something that already exists. And we were trying to think of what would be the hardest game to make an AI for, and we landed on Diplomacy. The idea that you could have an AI that negotiates in natural language with humans and strategizes with them really just felt like science fiction. And even in 2019, knowing all this success that was happening in AI, it still felt like science fiction. And so that's why we aimed for it. And I think that was the right call. I'm really glad that we aimed high. At that point I was a little afraid to do that, to be honest. It's a high-risk thing to aim for, but all research is high risk, high reward, or at least it should be.
ELAD:
Do you wanna give a quick minute overview of Diplomacy so people can understand what it is and why the research was such a breakthrough?
NOAM:
Yeah. Diplomacy is this game that was developed in the fifties. It was actually developed by this guy who saw what happened in World War I and kind of viewed it as a diplomatic failure. And so he wanted to create this game that would teach people how to be better diplomats, essentially. It takes place at the onset of World War I. There are seven powers that you can play as: England, France, Germany, Italy, Russia, Turkey, and Austria-Hungary. And you engage in these complex negotiations every turn, and your goal is to try to control as much of the map as possible. The way you win is by controlling a majority of the map. It's kind of like The Hunger Games, where even though only one person can win at the end of the day, there's still this incentive to be able to work together, especially early on, because you can both benefit and have a better chance of winning in the end if you work together.
And so you have these really complex negotiations that happen, and all the communication is done in private. So unlike a game like Risk, for example, or Settlers of Catan, where all the negotiation is done in front of everybody else, in Diplomacy you will actually pull somebody aside, go to a corner, scheme about who you're going to attack together this turn, who's gonna support whom. And then after you've negotiated with everybody, you write down what your moves are for the turn. All the moves are read off at the same time, and you can see if people actually followed through on their promises about helping you, or maybe they lied to you and they're just gonna attack you this turn. So it has some elements of Risk, poker, and Survivor, because there's this big trust component, and that's really the essence of the game.
Can you build trust with others? Because the only way to succeed in this game is by working together, even though you always have an incentive to attack somebody and grow at their expense. So yeah, that's the game. It's been around for a long time, like I said, since the fifties. It was JFK and Kissinger's favorite game. There's research on this game from an AI angle going back to the eighties. But the idea that you could play this game in natural language with humans and beat them was just complete science fiction until a few years ago. Like, it was still science fiction, but we at least thought it was worth pursuing. And research really took off in 2019, when researchers started using deep learning to make bots for this game that could play the non-language version. So there's no communication, you just write down your moves, and you kind of have to communicate non-verbally through the actions that you take. We were doing research on this, DeepMind was doing research on this, and then also the University of Montreal and a couple of other places as well. There was a lot of interest and progress, but we decided to take the risky bet of just jumping to the endpoint and, instead of taking an incremental approach, aiming for full natural language Diplomacy. And I'm glad that we aimed for that.
ELAD:
It seems like one of the pretty amazing things about what you all did is you basically created bots that humans thought were other people, and therefore they had to learn how to collaborate with each other, how to sometimes lie or deceive, how to sometimes, um, think through multiple moves from a game-theoretic perspective. And so it's a radically different thing than playing chess or Go against another person and just having an almost probabilistic tree of moves or something.
NOAM:
Yeah. You run into this human element; you really have to understand the human element. And what's really interesting about Diplomacy, aside from just the natural language component, is that it really is the first major game AI breakthrough in a game that involves cooperation. And that's really important because, you know, at the end of the day, when we make these AIs to play chess and Go, we're not developing them with the purpose of beating humans at games. We want to, you know, have them be useful in the real world. And if you want these AIs to be useful in the real world, then they have to understand how to cooperate with humans as well.
SARAH:
Elad and I were talking about centaur play and whether or not that would persist as an idea at all, given we've accepted that AIs are gonna win games at this point. But I think, you know, the idea that AIs are going to take action by cooperating with humans, and that this needs to be a core capability, seems obvious. Perhaps this is the making-myself-feel-better story, but I am hopeful that that is a human skill that remains quite important: being able to cooperate with AIs.
NOAM:
Well, from what I hear, with centaur play, AIs have gotten so strong in games like chess that it's not clear if the human is really adding that much these days.
ELAD:
That's what I told Sarah too.
NOAM:
<laugh>. Yeah, it's kind of a depressing thought.
SARAH:
I know. Yeah, I'm crying, I get it, I get it, I accept it. Yeah,
NOAM:
I think the humans are still useful in a game like Go, because the AIs are super strong, but they will also sometimes, a few times in each game, make these really weird blunders. And in Diplomacy, I think it's super helpful to have an experienced human in addition to the AI. Though, you know, eventually I'd imagine that these systems become so strong that it kind of goes the way of chess, where the human is just adding a marginal difference at the end.
SARAH:
Yeah, I'm actually just, you know, wondering how long that window is for humans and centaurs playing the game of life. Right. But it's okay, it's okay, I got it. Elad was right.
NOAM:
Uh, hopefully, yeah, hopefully forever, but, you know, we'll see.
ELAD:
Yeah. So, uh, do you mind explaining the work that you've done in poker and some of the breakthroughs that you made there as well?
NOAM:
Yeah, my PhD research was really focused on how do you get an AI to beat top humans in the game of no-limit Texas hold'em poker. Specifically, during my PhD it was heads-up no-limit Texas hold'em, that's two-player poker. And this was a longstanding challenge problem. Actually, if you go back to the original papers written on game theory by John Nash, the only application that's discussed is poker. He actually analyzes this simple three-player poker game in the paper and works out the Nash equilibrium by hand. And then at the end he says, oh yeah, it'd be really interesting to analyze a much more complex poker game using this approach. So I'm glad we finally got a chance to do that, you know, 60 years later.
And it's interesting, I think especially after AlphaGo, this became a very popular problem, because after AlphaGo there was a big question of, okay, well, AI can now beat humans at chess, they can beat humans at Go, what can't they do? And the big thing that they couldn't do was reason about hidden information, be able to understand that, okay, this other player knows things that I don't know and I know things that they don't know. Being able to overcome that problem in a strategic setting was a big unanswered question. And so that was the focus of my research for basically my whole grad school experience. There were a few different research labs that were working on this, and what would happen is every year we would all make a poker bot and we would play them against each other in this competition called the Annual Computer Poker Competition.
And we eventually won. So basically what happened is, when I started my PhD, there had already been some progress in AI for poker, and so the competition really turned into a competition of scaling. There are about 2.5 billion different hands that you could have on the river, the last round of poker in Texas hold'em. What we would do is cluster those hands together using k-means clustering and treat similar hands identically, and that allows you to compute a policy for poker, because now instead of having to worry about 2.5 billion hands and having to come up with a policy for each one of those, you can bucket them together and now you have like 5,000 buckets or something, and you can actually compute a policy for that many buckets. And this was before neural nets; that's why we were doing this k-means clustering thing instead of, uh, deep neural nets.
But you can kind of think of it as the number of buckets that you have is kind of like the number of parameters that you have in your network. And so in grad school it kind of turned into a competition of scaling: how many buckets could you have in your bot? The first year it was like 5,000 buckets, then we got up to 30,000 buckets, and then 90,000 buckets. Every year we would have these bigger and bigger models, we would train them for longer, parallelize them, and they would always beat the previous year's model. And in 2014 we actually won the Annual Computer Poker Competition, and after that we decided to take our bot and play it against expert human players. And so this was the first of what was called the Brains versus AI poker competitions, where we invited these top heads-up no-limit Texas hold'em poker pros.
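For readers who want to see the bucketing idea concretely, here's a toy sketch, not the team's actual code: each hand is summarized by a couple of made-up features, k-means groups similar hands into buckets, and the poker policy is then stored per bucket instead of per hand.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy stand-in: each "hand" is summarized by two features, e.g. its estimated
# equity against a random hand and how polarized that equity is. Real systems
# computed richer features and used tens of thousands of buckets.
rng = np.random.default_rng(0)
hand_features = rng.random((20_000, 2))   # 20k hands, 2 features each

n_buckets = 500
kmeans = KMeans(n_clusters=n_buckets, n_init=10, random_state=0).fit(hand_features)

# Every hand now maps to one of 500 buckets; the strategy is computed and
# stored per bucket, shrinking billions of distinct hands to a small table.
bucket_of_hand = kmeans.labels_
print(bucket_of_hand[:10])
```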
And we had them play 80,000 hands of poker against our bot. The bot actually lost by a pretty sizable margin. And it occurred to me during this competition that the way the humans were approaching the game was actually very different from how our bot was approaching it. We would train our bot for like two months leading up to this competition, you know, on a thousand CPUs, but then when it came time to actually play the game, it would act instantly. And the humans would do something different. Like, you know, obviously they would practice ahead of time, they would develop an intuition for the game, but when they were playing the game against the bot and they were in a difficult spot, they would sit there and they would think, and sometimes it was five seconds, sometimes it was a minute, but they would think, and that would allow them to come up with a better solution.
And it occurred to me that this might be something that we were missing from our bot. So I did this analysis after the competition to figure out, okay, if we were to add this search, this planning algorithm that would come up with a better strategy when it's actually in the hand, how much better could it do? And the answer was it improved the performance by about a hundred thousand x. It was the equivalent of scaling the model, like scaling the number of parameters, scaling the training, by a hundred thousand x. Now, in the three years of my PhD up to that point, I had managed to scale things by about a hundred x, and you know, that's quite good, I was very proud of that. But when I saw that result, it made me appreciate that everything I had done in my PhD up until that point was just a footnote compared to adding search and scaling search.
And so for the next year I just worked basically nonstop, like hundred-hour weeks, trying to scale up search, throwing as much computation at the problem at inference time as possible. And then we did another competition in January 2017, where we played against four top expert poker players, again with $200,000 in prize money to incentivize them to play their best. And this time we completely crushed them. Poker players were literally telling us they did not think it was possible to beat expert poker players by that kind of margin. So that's the story of my grad school experience working on poker AI; that was for two-player poker. After that we ended up working on multiplayer poker, on six-player poker. Again, the big breakthrough there was that we developed a more scalable search technique, so instead of always having to search to the end of the game, it could search just a couple of moves ahead. And what was really interesting there is we did another competition, the bot won, and that bot cost under $150 to train if you were to run it on a cloud computing service. I think that shows this wasn't just a matter of scaling compute; it really was an algorithmic breakthrough, and this kind of result would've been doable 20 years ago if people knew the approach to take.
ELAD:
And if you look at a lot of other games, those sorts of big shifts in performance from a bot relative to people then shift how people play, right? They learn from the bot, or they adapt their game from watching games that the bots play. How did that play out in terms of poker?
NOAM:
Yeah, that's a great question. So, you know, the competition was really interesting because, kind of as a last-minute thing, we added this ability. The way the bot works, we give it different bet sizes that it can use. The game that we were playing, there's 20,000 chips, $100/$200 blinds, $50/$100 blinds actually. And so it can bet any amount it wants from like a hundred dollars up to $20,000. There's not much value in being able to bet both $5,000 and $5,001, and so we would discretize that action space to constrain it to only considering a few different options. So there's a question of, okay, well, what sizes do you give it the choice between? And, you know, towards the end when we were developing the bot, we just had room for extra computation, and so we just threw in some extra sizes like four x the pot, 10 x the pot. It doesn't cost that much more, so why not just give it the option?
I didn't think it would actually use those sizes, and then during the competition it ended up using those sizes a lot, and it would sometimes bet, you know, $20,000 into a $100 pot, which was completely unheard of in professional poker play. And you know, I was a little worried about this because I thought it was a mistake at first, and I think the players that we were playing against also thought it was a mistake at first. But then they found that they kept ending up in these really tricky situations, and they would just really struggle with whether to call or fold. And that's how you know you're playing good poker: if you see the other person really struggling with the decision, that is a sign that you're doing something right. And at the end they told us, yeah, that's the one thing that we're gonna try to incorporate into our own play.
Adding these, what are called overbets, into our strategy. Typically the strategy was, oh, you bet between a quarter of the size of the pot and one times the pot, and now in professional poker play it's actually, I wouldn't say common, but it's a part of the strategy to sometimes bet five x the pot, 10 x the pot. If you can pull it off in the right way, it can be a very powerful strategy. And I should also say, the way professional poker players train now, they all use bots to assist them. It's a lot like chess, where you play the game and then you have a bot analyze your play afterwards and see, okay, did you make mistakes? Where did you make mistakes? How could you do better next time? The game really has been demystified and become a lot like chess. I kind of describe poker as essentially high-dimensional chess. It's like chess where you have to reason about a probability distribution over actions instead of just discrete actions.
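Here's a rough sketch, with made-up fractions, of the action abstraction Noam is describing: instead of allowing every bet from $100 to $20,000, the bot only ever considers a handful of pot-relative sizes, including the large overbets that surprised the pros.

```python
# Sketch of discretizing the bet sizes a poker bot considers, including
# the "overbet" sizes (4x, 10x pot) discussed in the episode. The specific
# fractions and limits here are illustrative, not the actual system's.
POT_FRACTIONS = [0.25, 0.5, 1.0, 2.0, 4.0, 10.0]

def candidate_bets(pot: int, min_bet: int = 100, stack: int = 20_000) -> list[int]:
    """Return the discrete bet sizes the bot will actually consider."""
    sizes = {max(min_bet, min(stack, round(pot * f))) for f in POT_FRACTIONS}
    sizes.add(stack)  # always allow all-in
    return sorted(sizes)

print(candidate_bets(pot=100))    # includes the $1,000 (10x pot) overbet
print(candidate_bets(pot=5_000))  # larger sizes get capped at the stack
```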
SARAH:
Yeah, it's really interesting, because I don't think people really believed there was fully optimal play in poker before. Like, they understood the probability distribution, but if you're playing live poker there are social cues, right? And social play. And that has clearly been swept out, not as an activity of enjoyment, but in terms of a strategy that actually wins.
NOAM:
Yeah, I think that's surprising to a lot of people, this idea that there is an optimal way to play poker. You know, there's this thing called the Nash equilibrium, where if you're playing that strategy, you'll never lose. It guarantees that in the long run you will not lose in expectation. And the reason for that is because if you're playing against somebody else that's also playing the Nash equilibrium, obviously you can't both win; one of you is going to lose or you're gonna tie. So in expectation, if you're playing against somebody else that's playing the Nash equilibrium, you're gonna end up tying. But in practice, what ends up happening is, if you're playing the Nash equilibrium in a complicated game like poker, the other person is gonna make these small mistakes over time. And every mistake that they make is money into your pocket.
And so you just play the Nash equilibrium, wait for them to make mistakes, and you end up winning. And that is now the conventional wisdom among poker players: you start by playing the Nash equilibrium. If you're really good, you can look at the other players, see how they're deviating from the Nash equilibrium and playing suboptimally, and maybe you can deviate yourself to capitalize on those mistakes. But really the safe thing to do is play the Nash equilibrium, let them make mistakes, and every mistake that they make costs them money and puts money in your pocket.
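This isn't the poker bot's actual code, but a tiny self-play illustration of regret matching, the family of techniques behind computing equilibrium strategies in poker, converging to the rock-paper-scissors Nash equilibrium of playing each action one third of the time.

```python
import numpy as np

ACTIONS = 3  # rock, paper, scissors
# PAYOFF[a, b] = payoff to player 1 for playing a against player 2's b.
PAYOFF = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]], dtype=float)

def regret_matching(regrets):
    """Play actions in proportion to their positive regret."""
    positive = np.maximum(regrets, 0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(ACTIONS, 1 / ACTIONS)

regrets = [np.zeros(ACTIONS), np.zeros(ACTIONS)]
strategy_sums = [np.zeros(ACTIONS), np.zeros(ACTIONS)]
rng = np.random.default_rng(0)

for _ in range(20_000):
    strategies = [regret_matching(r) for r in regrets]
    for p in range(2):
        strategy_sums[p] += strategies[p]
    a0 = rng.choice(ACTIONS, p=strategies[0])
    a1 = rng.choice(ACTIONS, p=strategies[1])
    # Each player regrets not having played the alternative actions instead.
    regrets[0] += PAYOFF[:, a1] - PAYOFF[a0, a1]
    regrets[1] += PAYOFF[a0, a1] - PAYOFF[a0, :]

# The *average* strategies converge to the Nash equilibrium (~1/3 each).
print(strategy_sums[0] / strategy_sums[0].sum())
print(strategy_sums[1] / strategy_sums[1].sum())
```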
ELAD:
What was the most unexpected thing to come out of working on Diplomacy, in terms of, you know, what Cicero could do?
NOAM:
I mean, I think the most unexpected thing was honestly just how it didn't get detected as a bot. We were really worried about this leading into the human competitions, because first of all, there's no way to really test this ahead of time. Like, we can play with the bot, but we know that it's a bot, and we can't really gather a bunch of people together and stick them in a game and, you know, have them play with a bot without them realizing that something's up, right? Like, if this company is hiring them to play a game, and they know that we're working on Diplomacy, then clearly they're gonna be playing with a bot. And when people know that they're playing with a bot, they behave very differently. We didn't want to turn this into a Turing test, and so we had to enter the bot into these games where players did not know that there was a bot in the mix.
That was the only way that we could get meaningful results. And just to be clear, the reason for this is that Diplomacy is a natural language negotiation game, so you're having these really complicated, long conversations with these people, and it's kind of hard to get away with that as a bot and not be detected. Our big concern was we stick the bot in a game and within five games, maybe even two games, they figure out it's a bot, word gets out, and the Diplomacy community is pretty small, so they all talk to each other. And then in all the future games everybody's asking, you know, Turing-test questions, trying to figure out who the bot is, and our experiments are just meaningless. So we figured, okay, maybe we get lucky and we manage to get ten games in before they figure this out, but at least we'd have ten games' worth of data.
But surprisingly, we managed to go the full 40 games without being detected as a bot. And that was surprising to me. I think that's a testament to the progress of language models in the past couple of years especially, and also that maybe humans aren't as good at talking <laugh> as we might think. Also appreciate that, you know, if somebody's saying something a little weird, because the bot does say weird things every once in a while, their first instinct is not gonna be, oh, I'm talking to a bot. Their first instinct is gonna be, oh, this person is dumb, or distracted, or, you know, they're drunk or something. And then way down on the list is, oh, this person is a bot. So I think we got pretty lucky in that respect. But also, the bot did manage to actually go these 40 games without being detected, and I think that is a testament to the quality of the language model.
SARAH:
I think Meta is actually planning to release the data, which is gonna be so interesting. But can you just describe an interaction from the bot that you thought was interesting in these negotiations?
NOAM:
Oh yeah, I think one of the messages that was, honestly, kind of scary to me was when it was talking to another player, and the player was saying, hey, you know, I'm really nervous about your units near my border. And the bot honestly was not planning to attack the player, it was planning to go in the other direction, and it sent the player this really empathetic message where it was like, look, I totally understand where you're coming from, I can assure you 100% I'm not planning to attack you, I'm planning to go in the other direction, you have my word. And it really felt like a very human-like message, and I would've never expected that to come from a bot. When you see stuff like that, it makes you appreciate, yeah, there's something really powerful here.
ELAD:
How do you think about the Turing test in the context of all this? What's your updated model of whether the test is still relevant, or how to think about it?
NOAM:
So there was actually a New York Times article that came out from Cade Metz on the Turing test and what it means, and he actually talks about Cicero in the article. Basically his view is that the Turing test is kind of dead, and I kind of agree with that. I think the Turing test is no longer really a useful measure in the way that it was intended to be, just because we have bots that, I wouldn't say they can pass the Turing test, but they're getting close enough that it's no longer that useful of a measure. It doesn't mean that we have general intelligence; I think there's still a long way to go on that. There are a lot of things that these bots can't do well. But yeah, my view now is that the Turing test is not that useful of a measure anymore. That doesn't necessarily mean that it was always a useless measure; I think it just shows how much progress we've made. We're not a hundred percent there, but the progress really has been staggering, especially in the past few years.
ELAD:
What measure or measures do you think make sense to use? And then also, what do you think is missing on sort of the road to general intelligence?
NOAM:
I think there are a few things that are missing. The big thing that I'm interested in in particular is reasoning capabilities. You have these bots and they're all doing next-word prediction, right? Cicero is a bit different, actually, in that it's conditioning its dialogue generation on a plan, and I think that's one of the really interesting things that distinguishes Cicero from a lot of the work that's happening in language models today. But a lot of the research that is happening is using next-word prediction, and when it's trying to do something that's more sophisticated in terms of reasoning capabilities, it's a lot of chain of thought, where it's just rolling out, you know, the kind of reasoning that it's observed humans do in its training data and seeing where that leads. So I think there's a general recognition among AI researchers that this is a big weakness in the bots today, and that if we want truly general artificial intelligence, then this needs to be addressed. Now, there's a big question about how to address it, and that's actually why I really like this direction, because it's still an open question how to actually fix this problem. There's been some progress, but I think there's a lot of room for improvement.
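Cicero's actual architecture is more involved, but a minimal sketch of "conditioning dialogue generation on a plan" is just to put the intent chosen by the planner into the prompt the language model sees, so the message it writes is grounded in the move it actually intends to make. The prompt format and names below are made up for illustration.

```python
# Toy sketch (not Cicero's real interface) of plan-conditioned dialogue:
# a planner picks intended moves first, and the language model is prompted
# with both the game state and that plan, so the message reflects real intent.

def build_dialogue_prompt(game_state: str, plan: list[str], recipient: str) -> str:
    return (
        f"Game state:\n{game_state}\n\n"
        f"My planned orders this turn: {', '.join(plan)}\n\n"
        f"Write a short in-game message to {recipient} consistent with this plan:"
    )

plan = ["A Vienna -> Galicia", "F Trieste holds"]            # chosen by a planner
prompt = build_dialogue_prompt("Spring 1901, playing Austria", plan, "Russia")
# message = language_model.generate(prompt)                  # hypothetical LM call
print(prompt)
```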
SARAH:
What do you think are the most promising possible directions?
NOAM:
Uh, that is the trillion-dollar question.
SARAH:
You're the trillion-dollar man.
NOAM:
<laugh>. I think there are clear baselines. I mean, first of all, chain of thought really was a big step, and it's kind of shocking just how effective that was given how simple of an idea it is.
SARAH:
I just tell myself every day when I wake up now: let's think step by step. Yeah.
NOAM:
Yeah. So for those of you that don't know, it's just, you add to the prompt something like, oh, let's think through this step by step, and then the AI will actually generate a longer thought process about how it reaches its conclusion, and that actually leads to better conclusions. But you can kind of see that as just rolling out the thought process that it's observed in human data. And so there's a question of, okay, well, instead of just rolling that out, could you actually improve it as it's going through each step? I think things like that; I'm kind of keeping it very abstract because, you know, it's an important question and also I think there's not a clear answer yet, so I don't wanna speculate too much, but I think there is room for improvement in this direction.
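As a concrete illustration of what Noam just described, here's a minimal sketch of chain-of-thought prompting; the only change is the instruction appended to the question, and the completion call is a placeholder for whatever language model API you use.

```python
# Minimal sketch of chain-of-thought prompting: appending a "think step by
# step" instruction makes the model write out intermediate reasoning before
# its final answer. The LM call itself is left as a hypothetical placeholder.
question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than "
    "the ball. How much does the ball cost?"
)

plain_prompt = question
cot_prompt = question + "\n\nLet's think through this step by step."

# answer = language_model.complete(cot_prompt)   # hypothetical LM call
print(cot_prompt)
```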
ELAD:
What was the actual data set that was necessary for the training here? And, uh, maybe to take a step back, you know, I've been having a series of conversations with people about data, and sort of, when do we run out of data that's easily available, and when do we have to start creating either large-scale synthetic data or RLHF data, or, you know, do you literally pay people to just record themselves all day so you can start collecting interesting data off of them <laugh> to do different things with over time, right? As these models scale to a certain point where, you know, you've used up the internet and you start using up all the video content and you start running out of stuff. I'm just sort of curious how you thought about data in this context, and what's necessary to really take things to the next level from a, you know, self-driven agent perspective like this?
NOAM:
Yeah, it's not clear that data really is the bottleneck on performance here. I've talked to AI researchers about this, and I think there isn't as much of a worry about this as people might think. Partly that's because there's a lot more data out there than people might realize and than people are using right now, and also it's because I think there are gonna be improvements to sample efficiency as the research progresses. So I think we'll be able to stretch the data more.
ELAD:
What do you think is the bottleneck, then?
NOAM:
So I think the bottleneck is going to be scaling. I mean, you look at the models that exist today; they probably cost $50 million to train. You can probably easily 10x that; I wouldn't be surprised if there's a $500 million model that's trained in the next year or two. You can maybe even go another order of magnitude and train a $5 billion model if you're, like, the US government or a really big tech company. But what do you do beyond that? Do you train a $100 billion model? You'll probably see some improvement, but at some point it just becomes not realistic anymore. And so that's gonna be the bottleneck. We maybe get two orders of magnitude more scaling, and then we have a big problem. And people are focused on, okay, how do we make this more efficient?
How do we train this cheaper, more parallelized? But you can only squeeze so much out of that, and I think we've squeezed a lot already. This is why I'm interested in the reasoning direction, because I think there's this whole other dimension that people are not scaling right now, which is the amount of compute at inference time. You know, you can spend $50 million pre-training this model ahead of time, and then when it comes to actual inference, it costs like a penny. What happens if, instead of returning an answer in a second, it returns an answer in an hour, or even five seconds or ten seconds? You know, sometimes if people want to give a better answer, they'll sit there and they'll think a bit, and that leads to a better outcome. And I think that's one of the things that's missing from these models. So I think that's one of the ways to overcome this scaling challenge, and that's partly why I'm interested in working on that.
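One simple way to spend more compute at inference time, in the spirit of what Noam describes here (though not his specific proposal), is best-of-n sampling: generate several candidate answers and keep the one a scoring function likes best. The generate and score functions below are placeholders, not a real model.

```python
import random

# Sketch of trading inference-time compute for quality via best-of-n:
# draw n candidates instead of one and keep the highest-scoring one.
# generate() and score() stand in for a real LM and a real verifier.
def generate(prompt: str, seed: int) -> str:
    return f"candidate answer #{seed}"           # placeholder "model"

def score(prompt: str, answer: str) -> float:
    return random.random()                       # placeholder verifier

def best_of_n(prompt: str, n: int) -> str:
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=lambda a: score(prompt, a))

random.seed(0)
# n=1 is the usual "answer in a second"; n=100 spends ~100x the inference
# compute in exchange for a better shot at a good answer.
print(best_of_n("Prove the claim...", n=100))
```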
SARAH:
Going back to what Elad said, the Diplomacy problem specifically didn't have, you know, internet-scale data, right? As you mentioned, it's a relatively small community. Can you talk about what you guys did in terms of self-play and the data that actually was involved?
NOAM:
So the Diplomacy problem was interesting because, yes, there's actually not a ton of data out there. I mean, we had a relatively good data set of about 50,000 games with dialogue.
SARAH:
This is from, like, webDiplomacy?
NOAM:
Yeah, this is from a site called webDiplomacy.net. It's been around for almost 20 years, where people play Diplomacy casually. We were very lucky to get this data set. Honestly, I was scouring the internet trying to find all the sites that have available data, and this was basically the only site that had a meaningful amount of data. There was another popular site, but they periodically deleted their data, which was, you know, just mind-boggling to me. You're sitting on a goldmine here and you're just deleting it because it's taking up server space; I guess they didn't appreciate that AI researchers would one day be interested in it. And then other sites just refused to hand over their data. So I'm really glad that we managed to work out a deal with webDiplomacy.net, because otherwise the project would've just never happened.
Now, that's about 50,000 games of Diplomacy, about 13 million messages, and that is a good-size data set, but it's not enough to train a bot from scratch. Fortunately, we're able to leverage, you know, a wider data set from the internet. So you have a pre-trained language model and then you fine-tune it on the Diplomacy data, and you get a bot that can actually communicate pretty well in the game of Diplomacy. Now, that helps with the dialogue, but there's still a problem, which is that the strategy isn't gonna be up to par. And that's partly because you can't do that well with just supervised learning; you can't learn a really, really good strategy in these kinds of games just with supervised learning. And it's also because the people that are playing these games are not very good at the game.
Most of the data set is from fairly weak players. You know, that's just a reality: you have a bell curve, and the actual strong players are a relatively small fraction of any data set that you have. And I should say this is not limited to Diplomacy; we also found this in chess and Go, we actually ran this experiment. If you do just pure supervised learning on a giant data set of human chess and Go games, the bot that you get out from that is not an expert chess or Go player. Even if it's conditioned to behave like a chess grandmaster, it's not going to be able to match that kind of performance, because it's not doing any planning. That's really what's missing. And so in order to get a strategy that was able to go beyond just average human performance, or even, you know, strong human performance, to something that's much better, we had to do self-play.
And this is how all these previous game AIs have been trained, right? You look at AlphaGo, you look at especially AlphaZero, the latest version of AlphaGo, and you look at, you know, the Dota 2 bot; the way they're trained is by playing against themselves for millions or billions of trajectories. That's also how our poker bot was trained, for two-player and six-player poker. Now, the difference is, when you go from those games to Diplomacy, suddenly there is this cooperative aspect to the game. You can't just assume that everybody else is going to behave like a machine, identically to the way you're gonna behave. And so in order to overcome that, we had to combine self-play with a recognition that humans are gonna behave a lot like how our data suggests. So using the data set that we have, we're able to build up a rough model of how humans behave, and then we can improve on that using self-play.
And so we're figuring out a good strategy, but basically a strategy that's compatible with how humans are playing the game. To give some intuition for this, because it's not obvious why this changes when you go from a two-player zero-sum game like chess to a cooperative game like Diplomacy; and I should say Diplomacy is both cooperative and competitive, but there is a big cooperative component. Let's say you're trying to develop a bot that negotiates. If you train that bot from scratch with no human data, it could learn to negotiate, but it could learn to negotiate in a language that's not English. It could learn to negotiate in some gibberish robot language. And then when you stick it into a game with six humans, in a negotiation task like Diplomacy, it's not going to be able to communicate with them, and they're just gonna all work with each other instead of with the bot.
That same dynamic happens even in the strategy part of the game, the moves in the game, the nonverbal communication aspect. The bot will develop these norms and expectations around what its ally should be doing this turn: I'm going to support my ally into this territory because I'm expecting them to go into this territory, and I don't even have to talk to them about this because it's just so obvious that they should be doing this. But the humans have their own metagame, where, oh, it's actually really obvious that I should be supporting you into this territory, and if you don't understand the human norms and conventions, then you're not gonna be able to cooperate well with humans, and they're just gonna not work with you and work with somebody else instead. So that's what we really had to overcome in Cicero, and we managed to do that by using the human data to build this model of how humans behave, and then adding self-play on top of that as kind of a modifier to the human data set.
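Cicero's actual training is more sophisticated, but the core idea Noam describes here, improving a strategy with self-play while keeping it close to how humans actually play, can be sketched as regularized policy improvement: pick actions that maximize expected value minus a penalty for straying from the human imitation policy. The action names, values, and human policy below are made-up numbers for illustration.

```python
import numpy as np

# Sketch of "self-play anchored to human play": choose a policy that trades
# off expected value against staying close (in KL) to a policy learned from
# human games. lam controls how human-like the result stays.
actions        = ["support ally into Galicia", "attack ally", "hold"]
expected_value = np.array([1.0, 1.4, 0.2])   # e.g. from self-play rollouts (made up)
human_policy   = np.array([0.7, 0.1, 0.2])   # e.g. from supervised learning (made up)

def anchored_policy(values, human, lam):
    # Closed-form maximizer of  E[value] - lam * KL(pi || human):
    # pi is proportional to human * exp(value / lam).
    logits = np.log(human) + values / lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

for lam in [0.1, 1.0, 10.0]:
    print(lam, np.round(anchored_policy(expected_value, human_policy, lam), 2))
# Small lam -> mostly value-maximizing (drifts toward the "attack ally" move);
# large lam -> stays near the human prior and keeps the cooperative move likely.
```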
SARAH:
That actually has some really interesting implications, right? Like, if you believe that in the long term we are going to have bots that take action in the real world, interacting with humans, and humans are perhaps not very good at optimal play in the game of life, it just brings home the point of how important reasoning could be versus just learned pattern recognition.
NOAM:
Yeah, I think you're absolutely right that this matters a lot if you want to make AIs that interact with humans in the real world, right? Like, if you have a car driving on the road, a self-driving car, you don't want it to assume that all the other drivers are machines that are gonna act perfectly optimally every step of the way. You want the self-driving car to recognize that these other drivers are humans, and humans make mistakes, and somebody could swerve into my lane. And also, you know, just day-to-day interactions, understanding the non-verbal cues of humans, or even the verbal cues, and what they mean; these are things that an AI has to be able to cope with if it's going to really be useful to humans in the real world and not just beat them at chess.
ELAD:
Games have been used for a while now as a way to measure AI progress, and you've worked on poker variants and Diplomacy variants, and you mentioned before other work people have done in terms of chess and Go and things like that. What do you think is the next frontier in terms of games and research on them through the lens of AI?
NOAM:
Yeah, so there's a long history of games as benchmarks for AI, and this goes all the way back to the very foundations of AI back in the fifties. Chess in particular was held up as this grand challenge for AI, because if we could make an AI that was as smart as a human chess grandmaster, then imagine all the other smart things we could do. Of course, that turned out to be kind of a false promise, right? You get an AI that plays chess and it turns out it doesn't really do anything else. But we've learned a lot along the way, and games are useful as a benchmark because you can compare very objectively to top human performance. It becomes very clear when you're surpassing human ability in a domain, even if it's a restricted domain. And you also have a benchmark that existed before the AI researchers came along.
For AI researchers, it's really easy to come up with a benchmark once they have the technique already created. You know, you come up with a technique and then you say, okay, well, now it's really easy to come up with a benchmark that this technique will work for. And you don't want that; you want the problem to come first, and games give you that. But I think we're reaching a point now where individual recreational games are just no longer that interesting of a challenge. You know, I said earlier we chose Diplomacy because we thought it would be the hardest game to make an AI for, and I think that's true. I can't think of any other game out there where, if somebody made an AI that could play that game, I would be like, wow, that's super impressive.
And I did not think that that was possible. So I think going forward, the field needs to move beyond looking at individual games and start to look at, first of all, going beyond games, but also looking at generality. The approach that we used in Diplomacy is very different from what we previously did in poker and what others have done in chess and Go and StarCraft. And now there's a question of, okay, well, if we really want a general system, a general AI, can we have it play all of these games at a superhuman level, and also be able to do things like, you know, image generation and question answering and all these tasks? If we could accomplish that, then that becomes incredibly impressive. So I think games will continue to serve as a benchmark, but instead of serving as a benchmark that the research kind of overfits to, my hope is that it will serve as a benchmark that we use alongside other benchmarks outside of games, like, you know, image generation benchmarks and language Q&A benchmarks and these kinds of things.
SARAH:
Given that AI has already won in these restricted domains that are challenging in specific ways, how do you think about the domains that are going to remain human-skill dominant? Are there going to be domains like that?
NOAM:
Well, certainly anything in the physical world, humans still dominate. When it comes to actually doing, you know, manipulation tasks, these kinds of things, robotics is really lagging behind, and I'm trying to avoid doing anything in the physical world for that reason; software is just so much nicer to work with. I think in reasoning there are still things that humans are definitely better at, even in restricted domains. You look at something like writing a novel; I don't think you can get an AI to output the next Harry Potter just yet. That might not be that far off, maybe it's five years away or something, but I don't think it's happening just yet. It's kind of scary that I'm really struggling to come up with domains where I'm like, oh yeah, AI is not gonna be able to surpass humans in this.
ELAD:
<laugh>. Yeah, I was about to say, I feel like people often talk about areas where humans will always have an advantage just because they're humans and they want to feel good about the future, versus because there's necessarily something that shouldn't be tractable, at least from a sheer logical perspective, right?
NOAM:
Yeah. Uh, it certainly is. I mean, I think the big advantage that humans have, and it's not clear when AI will surpass humans in this, is generality: the ability to learn from a small number of samples, to be able to, you know, be useful across a wide variety of domains.
ELAD:
But isn't that generality overstated? Because I feel like in the examples that you mentioned, you said everything from image gen to Diplomacy in a single architecture or AI or something. And often it seems like, you know, if you look at the average person, if they're very good at one thing, they're usually not good at everything, right? And so I kind of feel like the bar that we're using in terms of generality for AI sometimes is higher than the bar we'd use for generality for people, in some sense. Or is that not a true statement?
NOAM:
I think it's not just about generality, it's really about sample efficiency. Like, how many games does it take for a human to become a good chess player, or a good Diplomacy player, or a good artist? The answer is orders of magnitude less than it takes for an AI. And that is going to pose a problem when you're in domains where there isn't much data. Now, that seems like a problem that could be overcome; I'm just saying it's a problem that hasn't been overcome yet. And I think that's one of the clear advantages that humans have over AI today.
ELAD:
When do you think we'll see the emergence of AI in financial or economic systems? Obviously we have algorithmic trading and other things like that, and then we have things like crypto, where you effectively have programmatic approaches to money wrapped as code, right? And the ability to interact with those things in reasonably rich ways through smart contracts. You know, do you think there's any sort of near-term horizon of people experimenting with that, or interesting research being done in terms of the actual interaction of a bot with a financial system?
NOAM:
I think it's already being done. If you look at financial markets, I'm sure there's tons of trading powered by deep learning. I've actually talked to a lot of finance companies about this. I used to work in finance, and also a lot of finance companies love poker, and so I've given a few talks at various places on AI for poker, and I've talked to a few places about whether reinforcement learning is actually useful for financial markets, for trading. And the answer I get is usually no. I think the major challenge with using things like reinforcement learning for trading is that it's a non-stationary environment. You can have all this historical data, but it's not a stationary system; the markets respond to world events, these kinds of things. So you need a technique, ideally, that really understands the world, not one that just treats everything like a black box.
ELAD:
But could that at all feed into what you're saying about spending more compute on inference versus training? In other words, incorporating realtime signals at the point of decision making? Or did you mean something else by that in terms of model architecture that would enable you to update weights in certain ways or things like that over time?
NOAM:
Well, I think it goes back to the sample efficiency problem: humans are pretty good at adapting to novel situations, and you run into novel situations pretty frequently in financial markets. I think it's also a problem of generality, that you need to understand so much about the world to really succeed. Now, that said, I think AIs are successful in financial markets in fairly limited ways, certainly if you wanna break up big orders, these kinds of things. Also, I should say I'm not an expert in this; this is kind of outdated knowledge from me, because I'm sure there's a lot of cutting-edge stuff happening that people are not telling me about because it's making money. But I can tell you that this is kind of the perspective as of maybe five years ago: I think it's being used in limited ways, but I don't think it's fully replacing humans yet.
SARAH:
Do you think we're going to get bots that negotiate with humans soon? Or, I guess, let me take it as a premise that we are eventually going to get them. What do you think the timeline is, or the use case?
NOAM:
That seems doable. I think it depends on how constrained the domain is. If you look at constrained domains, certain negotiation tasks, I think AIs could probably do better than humans at that today. I'm trying to think of specific examples, but things like negotiating over the price of a good, an AI could probably do better than a human in a lot of those situations. Things like salary negotiations, it might do better than humans at that too. I think it depends on how much you need to know about the world. Contract negotiations, for example, would still be difficult, because there's so much subtlety and nuance to every contract, and it's not going to replace a professional negotiator for that kind of task just yet. But for the things that are more constrained and don't require as much outside knowledge about the world, I think AIs are probably up to the task already.
ELAD:
So a friend of mine who used to work with you says that one of the things you're really exceptional at is that you tend to pick a neglected research domain with lots of promise, commit to it long term, and then become the best at it. Many people in the world get attracted to shiny things instead and get distracted by whatever's in vogue, but then it turns out to be less interesting research. What are you thinking about working on next? Or what interests you as the next wave of stuff to do?
NOAM:
I think the big thing I'm interested in is the reasoning problem. And this is kind of motivated by my experience in the game space. You look at things like AlphaZero, the latest version of AlphaGo, and AlphaGo in particular is held up as this big milestone in deep learning. And to some extent it is; it was not doable without deep learning. But it wasn't deep learning alone that enabled it. If you take out the planning that's being done in AlphaGo and just use the raw policy network, the raw neural network, it's actually substantially below top human performance. So with just raw neural nets we have all these things that are incredibly powerful, like chatbots and image generation software, but the raw neural net itself still can't play Go at that level. It requires this extra planning algorithm on top of it to achieve top human performance.
And that planning algorithm that's used in AlphaGo, Monte Carlo tree search, is very domain specific. I think people don't appreciate just how domain specific it is, because it works in chess, it works in Go, and these have been the classic domains that people have cared about for investigating these kinds of techniques. It doesn't work in poker, it doesn't work in diplomacy. Because I've worked in those domains, I recognize that this is a major weakness of these kinds of algorithms. So I think there's a big question of: how do we get these models to do these complex reasoning and planning tasks with a more general system that can work across a wide variety of domains? And if you can do that, if you could succeed in that task, then it enables a lot of really powerful things.
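To make the raw-policy-versus-search distinction concrete, here is a minimal sketch. It is not AlphaGo's algorithm (that is a full Monte Carlo tree search guided by learned policy and value networks); the game, the random stand-in policy, and the rollout counts below are all illustrative assumptions. The point is only that the same weak policy, given extra compute at decision time to simulate outcomes, plays far better than the policy alone.

```python
# Minimal sketch: a weak "policy" alone vs. the same policy plus decision-time search.
# Tic-tac-toe and a uniform-random policy are illustrative stand-ins, not AlphaGo.
import random

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    """Return 'X', 'O', 'draw', or None if the game is still in progress."""
    for i, j, k in LINES:
        if board[i] != ' ' and board[i] == board[j] == board[k]:
            return board[i]
    return 'draw' if ' ' not in board else None

def legal_moves(board):
    return [i for i, c in enumerate(board) if c == ' ']

def policy_move(board, player):
    """Stand-in for a weak raw policy: picks a legal move uniformly at random."""
    return random.choice(legal_moves(board))

def rollout(board, player):
    """Finish the game using the raw policy for both sides; return the result."""
    board = board[:]
    while winner(board) is None:
        board[policy_move(board, player)] = player
        player = 'O' if player == 'X' else 'X'
    return winner(board)

def search_move(board, player, sims=50):
    """Spend extra compute at decision time: score each move by Monte Carlo rollouts."""
    opponent = 'O' if player == 'X' else 'X'
    best_move, best_score = None, -1.0
    for move in legal_moves(board):
        child = board[:]
        child[move] = player
        score = 0.0
        for _ in range(sims):
            result = rollout(child, opponent)
            score += 1.0 if result == player else 0.5 if result == 'draw' else 0.0
        if score / sims > best_score:
            best_move, best_score = move, score / sims
    return best_move

def play(x_strategy, o_strategy):
    board, player = [' '] * 9, 'X'
    while winner(board) is None:
        strategy = x_strategy if player == 'X' else o_strategy
        board[strategy(board, player)] = player
        player = 'O' if player == 'X' else 'X'
    return winner(board)

if __name__ == "__main__":
    random.seed(0)
    games = 100
    results = [play(search_move, policy_move) for _ in range(games)]
    print(f"policy + search as X: {results.count('X')} wins, "
          f"{results.count('draw')} draws, {results.count('O')} losses out of {games}")
```

With search turned on only for one side, the results should be lopsided even though both sides share the same underlying policy, which is roughly the shape of the point Noam is making about AlphaGo's planning layer.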
One of the domains I'm thinking about is theorem proving. It doesn't seem crazy to me that you could have a model that can prove the Riemann hypothesis within the next five years, if you can solve the reasoning problem in a truly general way. And maybe the inference cost is huge, maybe it costs a million dollars per token to generate that proof, but that seems totally worth it if you can pull it off, and maybe you can do other things with it too. Maybe that's the model that allows you to write the next prize-winning novel. Maybe it enables you to come up with lifesaving drugs. But I think...
Just for context, the Riemann hypothesis is considered maybe the most important unsolved problem in math, where, I don't know, the first however many solutions have been checked, but we don't know for sure yet.
NOAM:
Yeah, and I think the key thing I'm really interested in is the generality. We can solve this problem in domain-specific ways, but then it always ends up kind of overfit to that domain. So I think what we need is something as general as what we're seeing with transformers, where you just throw it at any sort of problem and it works surprisingly well.
SARAH:
And I guess you're implying that there are ways to frame the problem that are more general but still really interesting for making progress on reasoning, and that could be around math or possibly code. Is that the right understanding?
NOAM:
My hope is that the techniques are general. I think it's important to also look at a wide variety of domains in order to prevent you from overfitting. And one of the domains that I think would also be a good fit is code generation, because to write good code, next-token prediction is getting you surprisingly far, but I don't think it's going to get you all the way there, to replacing engineers at big companies.
SARAH:
Yeah, maybe one piece of context for listeners is that Copilot is amazing, right? But what we're doing with code generation today is very local and context-specific.
NOAM:
Yeah. And so if you want to plan out a whole product, that doesn't seem doable with existing technology. And I think the perspective of a lot of people when they hear me say this is, well, you just scale it. You scale up the models, you scale up the training, and that's always worked in the past. The example I like to give is AlphaGo: yes, in theory you could scale up the training, scale up the model capacity, and then you don't need planning. You just run this reinforcement learning algorithm for a really long time, you have this really big network, and it will eventually learn, in theory at least, how to beat expert humans at Go. But there's a question of, okay, how much would you have to scale it up?
How much would you have to scale up this raw neural net, the capacity and the training, in order to match the performance it achieves with Monte Carlo tree search? If you crunch the numbers, it ends up being something like a hundred thousand x. These models are already costing something like $50 million, so clearly you're not going to be able to scale them by a hundred thousand x. So then there's the question of, okay, what do you do instead? And the answer in AlphaGo is: instead of having all that computation happen during training, you also spend, say, 30 seconds figuring out what move to make next when it's actually playing the game. And that shifts the cost burden from having to pre-compute everything to being able to think on the fly. And so that's why I think that avenue seems like the piece that's missing.
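Here is a back-of-envelope version of that tradeoff. Every figure in it is an assumption taken from the rough numbers in this conversation (the $50 million training run, the hundred-thousand-x factor, the 30 seconds per move), not a measurement.

```python
# Back-of-envelope sketch of train-time vs. inference-time compute.
# All figures are illustrative assumptions from the conversation, not measurements.

TRAINING_COST_TODAY = 50e6      # dollars for a large training run today (assumed)
SCALE_UP_FACTOR = 100_000       # rough factor to match search-level play by training alone

cost_training_only = TRAINING_COST_TODAY * SCALE_UP_FACTOR
print(f"Scaling training alone: ~${cost_training_only:,.0f}")   # on the order of $5 trillion

# The AlphaGo-style alternative: keep training fixed and pay per decision instead.
SECONDS_PER_MOVE = 30           # thinking time spent searching at inference (assumed)
DOLLARS_PER_SECOND = 0.002      # illustrative cost of the hardware running the search
MOVES_PER_GAME = 200            # a typical Go game, roughly

cost_per_game = SECONDS_PER_MOVE * DOLLARS_PER_SECOND * MOVES_PER_GAME
print(f"Extra inference cost per game with search: ~${cost_per_game:.2f}")
```

The exact dollar figures don't matter; the point is that moving some compute from a one-time training bill to a small per-decision cost changes the economics by many orders of magnitude.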
ELAD:
A really random question: if you look at the human brain, you have these very specialized modules with very specific functions, right? You have the visual cortex for visual processing, you have different modules for emotion. There are specific parts of the brain that, if you ablate them, remove certain emotive or other capabilities. There have been accidents where poles have gone through people's heads and ablated very specific bits, and the people have survived. So you see this very specific loss of function through the ablation of specific modules. Why is it the correct assumption that there should be one generalizable architecture, versus a bunch of submodels all running together that collectively enable a wide range of behavior, which is effectively what we see in the brain?
NOAM:
That's a good question. I don't think we need to be tied to a specific technique, and the answer might be that we need more specialized systems instead of one truly general architecture. But what I'm thinking about is more the goal than the approach. We want something that's able to succeed across a wide variety of domains. Having to come up with a unique approach for every single domain gets you part of the way, but I think that will eventually be superseded by something that is truly general.
ELAD:
Yeah, that makes sense. And I guess one big domain is just reasoning, right? So I didn't mean to imply that different subtypes of reasoning will require different approaches, but more that there may be really big things that fundamentally function in a very different way. And again, that may be incorrect. The brain is an evolved system, which means it has enormous limitations in terms of where it came from and how it got created, and you often end up with these local maxima when you evolve a system. I was sort of curious how you thought about that.
NOAM:
Yeah, there's certainly always a risk with research that you could end up in a local minimum, and it's easy for people to overfit to that. I think machine learning was actually an example of this, with deep learning. Not many people were focused on it because they kind of assumed it was a dead end, and there were only a few people out in, you know, the Canadian wilderness working on it, and that ended up being tremendously successful. So there's value in diversity of approaches, and I think it does help to try to think outside the box and do something that's a little bit different from what everybody else is doing.
SARAH:
Noam, you are going to go work on this really interesting area, and I'm sure there are other problems you think are interesting, especially given the practical limits of how much money we're willing to spend on scaling up beyond another order of magnitude or two. What do you think other researchers or teams should be working on that they're not paying enough attention to?
NOAM:
Well, I think we're in an interesting place in AI, where given where things are at now, there's already an opportunity to build products that can have a big impact on the world. So it's great to see that there are people going in that direction, trying to bring this research into the real world, have a big impact there, and make people's lives better.
SARAH:
For what it's worth, both Elad and I got emails from multiple people telling us that they're building price negotiation agents as we speak.
NOAM:
Well, that's <laugh>, like I said, I think it's doable, so I think it's the right call. On the research side, there are still a lot of interesting questions, like how do we make these things more efficient, and are there better architectures we can use? There are just so many questions across the board that are interesting. The big thing I would recommend to researchers is not about which area to focus on, but the style of research. I think there's a tendency to play it safe and not take big risks, and it's important to recognize that research is an inherently risky field. There's a high probability that what you're working on is not going to be useful in the long term, and you have to accept that and be willing to take that risk anyway. This happened to me: the early research in my PhD, in the grand scheme of things, really wasn't that useful. It didn't make as much impact in the long term as I would have hoped. And that's okay, because I had one thing that ended up being quite impactful. So I think it's important to be able to take those risks, to go into the field recognizing that you're already taking a risk by going into research at all.
SARAH:
You heard it here first: be like Noam, work on things that make you nervous. I think that's all we have time for. Thank you so much for joining us on the podcast.
NOAM:
Yeah. Thank you very much for having me.
SARAH:
Thank you for listening to this week's episode of No Priors.
ELAD:
Follow No Priors for new guests each week, and let us know online what you think and who in AI you want to hear from.
SARAH:
You can keep in touch with me and Conviction by following @saranormous.
ELAD:
You can follow me on Twitter @EladGil. Thanks for listening.
SARAH:
No Priors is produced by Conviction in partnership with Pod People. Special thanks to our team: Cynthia Gildea and Pranav Reddy; and the production team at Pod People: Alex Vikmanis, Matt Sav, Aimee Machado, Ashton Carter, Danielle Roth, Carter Wogahn, and Billy Libby. Also our parents, our children, the Academy, and Open Google Soft AI, the future employer of all of mankind.