
March 30, 2023

No Priors 🎙️110: Alex Graveley, Chief Architect GitHub Copilot, Founder/CEO Minion.AI: What's Beyond GitHub Copilot (TRANSCRIPT)

EPISODE DESCRIPTION: Everyone talks about the future impact of AI, but there's already an AI product that has revolutionized a profession. Alex Graveley was the principal engineer and Chief Architect behind GitHub Copilot, a sort of pair-programmer that auto-completes your code as you type. It has rapidly become a product that developers won't live without, and the most leaned-upon analogy for every new AI startup: Copilot for Finance, Sales, Marketing, Support, Writing, Decision-Making.

Alex is a longtime hacker and tinkerer, open source contributor, repeat founder, and creator of products that millions of people use, such as Dropbox Paper. He has a new project in stealth, Minion AI. In this episode, we talk about the uncertain process of shipping Copilot, how code improves chain of thought for LLMs, how they improved the product and its performance, how people are using it, AI agents that can do work for us, stress-testing society's resilience to waves of new technology, and his new startup named Minion.

Show Notes

[1:50] - How Alex got started in technology
[2:28] - Alex’s earlier projects with HackPad and Dropbox Paper
[7:32] - Why Alex always wanted to make bots that did stuff for people
[11:56] - How Alex started working at Github and Copilot
[27:11] - What is Minion AI
[30:30] - What's possible on the horizon of AI

Sign up for new podcasts every week. Email feedback to show@no-priors.com

-----------------------------------------------------------------------

SARAH:

Everyone talks about the future impact of AI, but this week on No Priors, we're talking with the architect of arguably the first AI product that has already revolutionized a profession: GitHub Copilot. Alex Graveley was the principal engineer and Chief Architect behind this product, a sort of pair programmer that auto-completes your code as you type. It's rapidly become a product that developers won't live without, and the most leaned-upon analogy for every new startup: Copilot for finance, for sales, marketing, support, writing, decision making, everything. Alex is a longtime hacker and tinkerer, open source contributor, repeat founder, and creator of products that millions of people use, such as Dropbox Paper. I have huge respect for the range of work he's done, ranging from hardware-level security and virtualization to real-time collaborative, web-native, and AI-driven UI. He has a new project in stealth, Minion. Just as a heads up, we recorded this podcast episode in person, and you'll notice we don't sound as crisp as we usually do. Still, I think you'll enjoy listening to this conversation. Let's get into the interview.

SARAH:

Welcome to the podcast, Alex. Hi. So let's start with some background. How did you end up working in tech or AI?

ALEX:

Tech was earlier. I started really young, and I got really into Linux when I was 14 or so. It was right around the time when the web was a new thing and you had to work to get on the web. The idea of helping in the open and making things freely available, so other people could learn from them like I was learning from them, seemed great. So I went and spent many years just working on open source stuff. Spent many hours compiling kernels and hacking on stuff.

SARAH:

And what was the thought process behind starting, like, HackPad?

ALEX:

Oh, HackPad. Yeah. So I had just finished four or five years at VMware, and I wanted to get into startups; I knew that much. So I left VMware and I started working on an education startup, like many of us do. Many, many founders start with the idea of an education startup. It's like a rite of passage. So I spent, I don't know, nine months working on that. Turns out education's very hard. So after nine months I was like, all right, this isn't going anywhere; I don't know if there's a value prop here. I mean, the value was that I learned you have to make something that is both achievable and that people wanna pay for or spend their time on. So then I was kind of fishing around. I was living in a warehouse in San Francisco with a bunch of Burning Man people, and we were having trouble organizing large-scale Burning Man projects. So I forked EtherPad and started hacking on it, recruited a friend in the community to start working on it with me, and it just grew from there. Before we did YC, we had many of the large Burning Man camps using it to organize their builds.

SARAH:

Can you describe the product experience?

ALEX:

Oh yeah. It was a real-time text editor, kind of like Google Docs; Google Docs was the only other one that did that at the time. It was kind of nice because it would highlight who said what, so if somebody had made a contribution you could track them down and be like, oh, what did you mean? Or, I heard you say something about this. That's very useful in large-scale anonymous or pseudo-anonymous groups where you don't really know who has ownership over what, like Burning Man camps. And then we did YC, and many of those people used it. My hack was to go do YC and try and get all the companies doing YC to use my product, which many of them did. I also took very extensive notes on all the YC presentations using the product, which everyone would then look at. And we were able to get Stripe, and Stripe used us for many years actually; I think we were their first knowledge base. They used it for a long time. A bunch of other big companies as well.

SARAH:

And post Dropbox acquisition, you worked on what became Paper. How did you think about like what you wanted to go work on next?

ALEX:

I spent the next few years kind of poking around at stuff. I knew that I wanted to make a robot that does stuff for you. So I went and worked at this company called Magic. They were doing these text-based personal assistants. Mm-hmm. <affirmative>

SARAH:

Uh, do you remember this one?

ELAD:

I think, just like everybody starts an education startup, Magic is one of those names that keeps cycling; there's a really cool AI company right now called Magic as well. So I feel like there are these names that persist from generation to generation, which I think is really cool.

SARAH:

I'm sure you know, it's codegen.

ALEX:

Yeah, yeah. I haven't seen a demo yet, but it sounds like they're doing the right thing. There's a few people doing the kind of whole-repository changes, so it seems like a great direction. Anyway, Magic was all ops-based, super ops-heavy. They had teams of people, it was 24 hours, and they would cycle in, and they would lose context, and they were all busy because they're trying to deal with lots of people with lots of requests all the time. So really it was a crash course in human behavior, right? What do people do under stress? How do they act? What do they say? What can you train, what can't you train? Can you bucket stuff? And the answer is no; humans are complicated, especially in text. One of the beauties of the web and traditional UIs is that you fill them in, and if it doesn't do what you want, then you make the decision to either go forward or not go forward. Right? Text has this annoying property where you can be all the way at the 99% mark and then change the goal entirely. So it's complicated. It's hard to understand what people want. It's hard to understand the complexity involved, especially when you're dealing with the real world. Flights get delayed, passwords get lost.

ELAD:

Do you think we're the last generation to deal with that? In other words, it feels like we're about to hit a transition point at which agents can actually start doing some of these things for us for real, when before, all these products really started with these operations-heavy approaches. I remember there was a really early personalized search engine called Aardvark. Similarly, if you looked behind the hood, it was a lot of ops people and a little bit of algorithm, right?

ALEX:

That was the one where you would sort of describe what you're good at or something, and then they would try and send questions to you to answer?

ELAD:

Yeah, exactly. They'd kind of route things. I think there were actually people doing some of the routing at least, or I can't exactly remember. But I think a lot of people wanted to build these really complex bots or agents that were doing really rich things, and the technology just wasn't there. It feels like for a period of time there were also some startups; I remember one company that I got involved with that was trying to do virtual scheduling and assistance, you know, mm-hmm <affirmative>, six, seven, eight years ago. And again, it felt like it was a little bit early. Really cool, though.

SARAH:

Clara Labs,

ALEX:

Clara, that's the one. Yeah,

SARAH:

Yeah, yeah, yeah. I remember this era. And then we had Operator, do you remember Operator? Yeah, I remember Operator. There were like three or four. And then on the question-answering stuff, we had Jelly. So there's a whole series of these.

ALEX:

Yeah, yeah. What we were able to do at Magic, as an aside there: I started working there because I wanted to work on the AI part. I think somewhere in there Facebook M started as well, mm-hmm <affirmative>, and it was a fun place to try and learn everything I could about solving those problems. So yeah, this was before Transformers; sequence-to-sequence was kind of the previous iteration. We were able to take all the histories of the chats between assistants and people and train a little model on it, little by today's metrics, and run it, and it would show some gray text in the little text bar, and people could edit it and hit enter.

SARAH:

The operators?

ALEX:

Yeah, yeah, the operators. And we would measure how much time they spent typing with it and without it, and it'd save like half an hour, across a hundred people per day, across an eight-hour shift. So yeah, that was my first shipped AI product.

SARAH:

And then how'd you end up going to um, Microsoft?

ALEX:

Oh yeah, there's a bunch of other stuff along the way. So after that I got into crypto. My friend was doing hCaptcha, which was sort of a captcha marketplace, and is now something like the number one or number two captcha service in the world, which is crazy. So I kind of launched that; that was fun. Annoyed people the world over for many, many man-hours in aggregate. And then I left that to work with Moxie on MobileCoin, a cryptocurrency for Signal. That was really fun. Complicated, and it all worked in a few seconds; we were shooting for Venmo quality.

ELAD:

When you think about crypto in the context of AI, people talk about it in a few different contexts, right? One is you have programmatic money; it's code running. And so that could create all sorts of really interesting things from an agent-driven perspective. But then the other piece of it is identity. Some people think, I mean, Worldcoin would be one example, but there are other examples of effectively trying to secure identity cryptographically on the blockchain in an open way, and then using that identity in the future to differentiate between AI-driven agents and people. Do you think that's gonna be important, or does that stuff not really matter, in terms of the identity portion of not only crypto, but just how we think about the future of agents?

ALEX:

It's a good question. The honest answer is I think we're gonna go through a many-year period of extreme discomfort, where AIs pretend to be things, or confuse people, or extract money from your grandparents, or drain people's life savings in ways that are scary. And you know, OpenAI is trying to do their best, but for some reason the focus has been on OpenAI doing everything, instead of: we should go build the systems that prevent that, we should go pass the legislation that drops the hammer on people doing that stuff. Unfortunately, it seems like we're gonna need some really bad things to happen before we align correctly. Mm-hmm. <affirmative>. I'm not really scared about AIs killing us, although I'm very grateful that there are people thinking about it. I'm more worried about bad people using new technology to hurt us.

ELAD:

Yeah. Illia from NEAR has some really interesting thoughts on this, because he was one of the main authors, the last author, on the Transformer paper before he started NEAR, mm-hmm <affirmative>. And he's brought up these concepts of, how do you stress-test society relative to the coming wave of AI? Which I think is an interesting concept.

ALEX:

Yeah, it's a great way to look at it. It's not as bad as it could be, right? If you think about it, most of the things that you'd want to spam either have a spam blocker or are somewhat difficult to create an account on, mm-hmm <affirmative>. So doing a better job of sock-puppet account filtering is gonna be really important going forward. You know, I like what Cloudflare is doing with their kind of fingerprinting instead of visual captchas, which are not good enough anymore. One thing that is kind of a saving grace here is that many of the things that you would want to do cost money, mm-hmm <affirmative>. So calling everybody costs money; texting everyone should hopefully be illegal soon, but also costs money. Maybe not enough money to prevent these things, but...

SARAH:

Probably not enough, as agents can make money, right? They can just look at the trade-offs of cost.

ALEX:

Yeah, I think it's interesting. I guess I would say it's somewhere in the middle. You know, North Korea has been trying to do this to us for a long time. Now imagine there's a North Korea that has more resources, or is more distributed, or whatever. We have some mitigations; we need more. We need to be thinking about it a lot more.

ELAD:

How did you end up at GitHub, and how'd you end up working on Copilot?

ALEX:

While I was working on MobileCoin, my dad's kidneys failed, and I tried to donate a kidney, and they found a lump in my chest as part of the scans they do. I had to have most of my right lung removed in 2018. So that was a big deal, and it took some time. It's weird; healing from internal injuries takes a lot longer than you think. Anyway, the happy story is that I haven't had cancer now for over four years, and my dad got a kidney transplant too. So things are good. I guess I was recovering for quite a while, and then I went and begged my friend for a job. I figured I should start working again. That was the transition, I guess. So I worked on some random stuff at first. I converted GitHub to using their own product to build GitHub, which was kind of fun. I think people still use Codespaces now to build GitHub, which is pretty cool. But then this opportunity to work with OpenAI came up, and because I had been tracking AI in the past and was pretty aware of what was going on, I jumped on it.

ELAD:

Was that proposed by OpenAI or by GitHub? Who kind of initiated it all?

ALEX:

So I don't know the exact beginnings. I know that OpenAI and Microsoft were working on a deal for supercomputers; they wanted to build a big cluster for training, and there was a big deal being worked out, with some software provisions thrown in, I think Office and Bing probably. And GitHub was like, oh, okay, well maybe there's something GitHub can do here. I think OpenAI threw a small fine-tune over the fence and was like, here's a small model trained on some code, see if this is interesting. So we played around with it. What's small in those days? I have to remember now; this was before I knew very much. It was definitely not a Davinci-size model, that's for sure. I don't know what size it was.

Yeah. And so, I learned later that it was basically a training artifact. They had wanted to see what introducing code into their base models would do. I think it had positive effects on chain-of-thought reasoning; code is kind of linear, so you can imagine that you kind of do stuff one after another, and the things before have an impact. And yeah, it was not that good. It was very bad. It was, like I said, just an artifact, trained on a small sample of GitHub data that they had crawled. And this was actually before I joined; me and this guy Albert Ziegler were the first two after Oege. Oege got ahold of this model and started playing with it, and he was able to say, well, you know, it doesn't work most of the time, but here it is doing something.

You know, here it is, and it was only Python at that time, here it is generating something useful. We didn't really understand anything. So that was enough to say, okay, well, go fetch a few people and start working, see if there's anything there. We didn't really know what we had. So the first task was to go test it out, see what it did. We crowdsourced a bunch of Python problems in-house, stuff that we knew wouldn't be in the training set. And then we started work on fetching repositories and finding the tests in them, so that we could basically generate functions that were being tested and see if the tests still passed. There had been a brand-new pytest feature introduced recently that allowed you to see which functions were called by the test.

So you'd find that function, zero out its body, ask the model to generate it, and then rerun the test and see if it passed. And I think it was less than 10%, something like that, of those that passed. The dimensions are kind of: how many chances do you give it to solve something, and then how do you test whether it's worked or not? Right? So for the standalone tests, we had people write test functions, and then we would try to generate the body, and if the test passed, then you know it works. And in the in-the-wild test harness, we would download a repository, run all the tests, look at the ones that passed, find the functions that they called, make sure that they weren't trivial, generate the bodies for them, rerun the tests, and see what passes. You get your percentage.
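The harness Alex describes, blank out a tested function's body, ask the model to regenerate it, and rerun the test, can be sketched roughly like this. This is a toy reconstruction, not GitHub's actual code; `generate_body` stands in for the model call, and each candidate is checked by running the function's test in a fresh interpreter.

```python
import subprocess
import sys
import textwrap

def run_test_with_body(fn, body):
    """Splice the generated body under the function signature and run the
    function's test in a subprocess; a zero exit code means the test passed."""
    source = fn["signature"] + "\n" + textwrap.indent(body, "    ") + "\n" + fn["test"]
    result = subprocess.run([sys.executable, "-c", source], capture_output=True)
    return result.returncode == 0

def evaluate(repo_functions, generate_body, attempts=5):
    """For each function whose test originally passed, give the model a few
    chances to regenerate the body; return the fraction of functions solved."""
    solved = 0
    for fn in repo_functions:
        for _ in range(attempts):  # "how many chances do you give it"
            candidate = generate_body(fn["signature"], fn.get("context", ""))
            if run_test_with_body(fn, candidate):
                solved += 1
                break
    return solved / len(repo_functions)
```

Per the conversation, a harness like this scored under 10% early on and eventually passed 60%.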

Yeah, I mean, it was some very, very low percentage up front, but we knew that there was a lot more juice to squeeze, like getting all of GitHub's code into the model, and then a bunch of other tricks that we hadn't even thought of at that time. And eventually it went from less than 10% on the in-the-wild test to over 60%. So that's, like, one in two tests it can just generate code for, which is insane. Right? Somewhere along the way there was 10%, to 20%, to 35%, to 45%, these kinds of improvements. Somewhere along the way we did more prompting work, so the prompts got better. Somewhere along the way they used all the versions of the code, as opposed to just the most recent version; they used diffs so that it could understand small changes. It just got better. But when we first started, we were just trying to figure it out. At the time, they were thinking in terms of, maybe you can replace Stack Overflow or something, do a Stack Overflow competitor.

SARAH:

Was that the first product idea you guys had for it?

ALEX:

I don't know that we had that idea. I think that was kind of like a...

SARAH:

An on-high idea.

ALEX:

Yeah, yeah. That was more like, it'd be nice if you made something to compete with Stack Overflow, because we have all this code; wouldn't it be nice to leverage it? Yeah. And so we made some UIs, but early on, <laugh> early on it was bad. You'd watch it run and most of the generations would be bad. There'd be like one success, and you'd be like, oh sweet, I got a success. But I had to wait some number of seconds for it.

SARAH:

Was the test user group just the, like, six of you, or some larger group?

ALEX:

The first iteration was just an internal tool that helped people write these tests. Mm-hmm <affirmative>. And then we wanted to see if maybe we could turn that into some UI that people would use, <laugh> if there was some way to cover up the fact that one in ten things passed. Right. So we tried a few UI things there, and then it was actually OpenAI who was like, we're testing these model fine-tunes, and it'd be nice if we could test them more quickly; what about doing a VS Code extension, just do autocomplete? And I was like, all right, sure, why not? So we did autocomplete, and that was kind of a big jump, you know, because they were still thinking in terms of Stack Overflow. I didn't have any ideas, basically; I didn't know how to beat Stack Overflow with this thing, but we could play with some stuff in VS Code that was maybe closer to the code.

At first we did autocomplete, and that was kind of fun. It was useful. It would show this little pop-up box like autocomplete does, and you could pick some strings. So with that format, the usage was fine, but it wasn't the right metaphor exactly. You've got this generated code mixed in with the specific terms that are in the code, and it's not exactly the same thing. We tried things like adding a little button over top of empty functions so it would go generate them, or you could hit a control key and it would create a big list on the side that you could choose from, or there's a little pop-up thing. So basically we tried every single UI we could think of in VS Code.

SARAH:

And multiple generations? Like the list, that didn't work?

ALEX:

Yeah, none of them really worked. I think with lists, maybe you'd get one generation per person per day. And this was just a small sample, just a few people that were interested at GitHub, language nerds, or people that had written tests for us, and OpenAI people. So very early on I had this idea that it should work like Gmail's gray-text autocomplete, mm-hmm <affirmative>. I was enamored with that product. It was the first, quote unquote, large language model deployment in the wild. It was fast, it was cool. The paper's great; they give you all sorts of details on how they do it, all the workarounds they had to do. So that was always in the back of my head. It was bad also, you know, those completions are not good, but it seemed like the right thing anyway. And somewhere along the way, after we tried all the UIs, I'd sort of come up with this idea, but VS Code didn't support it. So I tried to hack it in, finally came up with a way to hack it in, enough to make a little demo video.

SARAH:

Was there support to like build real support for it within the organization?

ALEX:

It's a little complicated. I guess we were pretty much a skunkworks project. No one knew about us, so we would go to the VS Code people and be like, hey, we need you to go implement this very complex feature, and they'd be like, I don't even know who you are; what are you talking about? There was definitely some politicking that happened to get the VS Code people to dedicate some resources to that on a short timeframe. We were moving really fast; it was less than a year from beginning to public launch.

SARAH:

Was there a certain metric where you were like, this is good enough, like we need to actually put it in the public product?

ALEX:

Uh, which specifically? The completions, or the UI?

SARAH:

Yeah, the completions.

ALEX:

Yeah. I mean, we had a nice long window of public access before GA, where it was free and you could use it, and we did a bunch of optimizing for different groups of people: okay, well, do we want more experienced people? Do we want more new people? Do we want people from this area or this area? And that gave us a bunch of really good stats. So we were able to learn, for instance, that speed is the only thing that matters. There's some crazy stat like, every 10 milliseconds of latency is 1% fewer completions that people accept, and that adds up; 10 milliseconds is pretty fast. We learned that because somewhere in our first few months of public release, we noticed that completions in India were really low. For whatever reason they were just significantly lower than Europe.

SARAH:

Network latency to India. Yeah.

ALEX:

Yeah. And it turns out it was because OpenAI only had one data center, so it was all in Texas. If you can imagine, you're typing, and that request goes from India through Europe, over the water, down to Texas, and back. And if by then you've typed something that doesn't match the thing that you requested, then the response is useless, right? So you don't get a completion.
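The failure mode he's describing amounts to a prefix check on the client: a completion requested against an older buffer state is only usable if the keystrokes typed during the round trip match the start of what came back. A minimal sketch of that idea (the function name is illustrative, not Copilot's actual code):

```python
def completion_still_valid(requested_text, current_text, completion):
    """A completion was requested when the editor buffer held `requested_text`.
    By the time the response arrives, the buffer holds `current_text`.
    Keep the completion only if everything typed in between is a prefix of it."""
    if not current_text.startswith(requested_text):
        return False  # the user deleted or edited; the old context is gone
    typed_since = current_text[len(requested_text):]
    return completion.startswith(typed_since)
```

So if the server suggests `range(10):` for the buffer `for i in ` and the user has since typed `ra`, the completion survives; if they typed anything else, it's thrown away, which is why every extra bit of round-trip latency costs completions.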

ELAD:

By the way, I know that's obvious, but that's happened on every single product I've ever worked on. When I was at Google, I worked on a variety of mobile products; same thing with page load times. And obviously search in general: a hundred-millisecond difference is a big deal and makes a big shift in market share. So yeah, it's kind of nuts how much speed matters.

ALEX:

Yeah. And so once we figured that out, we knew that we had something awesome, right? People that were close to Texas were like, this is freaking great. We had a Slack channel where people were posting all the time, and the most fun stuff was these people that would pop up and be like, I don't program, but I just learned how to write this hundred-line script that does this thing that I need. It's like, oh my God.

SARAH:

I definitely feel like I now speak different languages that I don't actually know the syntax of. Yeah. Which is very exciting.

ALEX:

Yeah. It turns out these models are really great at finding patterns. So once we had the UI mechanism that worked, we knew we were on to something. And then it was just squeezing as much performance as we could get out of it. We basically never found the bottom; we made it as fast as we could, and it was still improving on completions. And then we got impatient. It was like, okay, I know there's this plan for Azure to run OpenAI in six months; we need you to do that in the next month, so let's figure out how to make this happen. Because we wanted to run a bunch of GPUs in Europe so we could hit Asia. At the time we could run them in Europe, on the West Coast, or in Texas; there was no other place. And Microsoft stepped up there, we got it running, and then pretty much after that we launched.

ELAD:

Were you surprised by the uptake post-launch?

ALEX:

No, no. I mean, our retention rate was 50%. Months later, it was still above 50% by weekly cohort, which is insane. Yeah, right. And we didn't know if people would pay for it. That was one thing; I lobbied pretty hard for going cheap and capturing the market.

SARAH:

How did you guys think about inference cost for this thing at the beginning?

ALEX:

Oh yeah, our estimates were wildly off. Wildly off. So we got estimates that were like, it'll be 30 bucks a month per user on average, right? And then, once Microsoft was able to do their Azure infrastructure, we were able to fork off little bits so we could do more accurate projections. And there were a bunch of moments of, how much is it gonna cost, where you'd wait for these results. The first big one was 10 bucks a month: it'll cost 10 bucks a month. And I was like, oh my God, it's so much cheaper than we expected, and then we could optimize it. And we hadn't even optimized on price yet, right? Then we optimized on price a bunch, and now it's less than that. So it was very fortuitous. We were thinking, okay, well, maybe it's enterprise-only, because those are the only people who are gonna be willing to pay for this. 30 bucks a month is a lot. And that's with no margins, right?

SARAH:

For 40% of your code, it's not a lot.

ALEX:

Yeah, that's the thing. We know that now.

ELAD:

Where do you extrapolate all this, like, three to five years out? Are there basically gonna be just agents writing code for us in certain ways? Is it that 95% of code is written by copilots and humans are kind of directing it? What do you view the world evolving into in the next couple years? I mean, like three to five, not 20.

ALEX:

Yeah, it's hard. I don't know. <laugh> I think it's hard for me to imagine what that world looks like, because it's such a shift: from, I have my hands on something and I know that it's right; to where we are now, which is, I mostly know what's right, or I have a sense that it's right, but I have to test it and see it run to know that it's right; to, just write this and I'll trust that it's gonna work. Those are pretty crazy transitions, right? Yeah. But they exist, you know.

ELAD:

Sure. But you could also imagine certain ways to do, like, a code review after some chunk, or some other sort of quality check.

ALEX:

Yeah, if that's the goal we want to get to, I think every barrier to it is achievable. Mm-hmm. <affirmative>. So we can code-review only the dangerous parts or only the confusing parts. Or we can do things like train a model on functions before and after changes, to say, okay, a more polished version of this function would be this. Or we can start with the very basic main loop and then add everything piece by piece with tests, so you know what works and what doesn't, and then just have it keep generating the logical next feature. All these things will get figured out. So if that's what we want to do, that's what's gonna happen.
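The "basic main loop, then add everything piece by piece with tests" idea can be pictured as a loop that only accepts a generated implementation when its own test and every previously accepted test still pass. Everything here is a hypothetical toy, with `propose_implementation` standing in for the model:

```python
def build_incrementally(feature_tests, propose_implementation, max_tries=3):
    """feature_tests: list of (feature_name, test) pairs, where each test takes
    the dict of implementations built so far and returns True/False.
    A candidate is accepted only if no previously accepted feature regresses."""
    accepted = {}
    for feature, _ in feature_tests:
        for _ in range(max_tries):
            candidate = propose_implementation(feature, accepted)
            trial = {**accepted, feature: candidate}
            # run the new feature's test plus every earlier accepted test
            if all(test(trial) for name, test in feature_tests if name in trial):
                accepted[feature] = candidate
                break
    return accepted
```

The regression check is the point: the program only grows in states where everything built so far still works, which is one way a model could "keep generating the logical next feature" safely.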

SARAH:

So what's the idea behind Minion?

ALEX:

Yeah. I think I mentioned making bots that do stuff for you. It's a broad topic, and I think that's where we see the next few years in AI going: taking action. Not just answering questions or writing copy, but actually helping us in our daily lives. Things like organizing my schedule, booking flights, finding a trip for me to take, doing my taxes, or telling me which contacts I haven't talked to in a long time and should reach out to. There's a lot we can do by giving AIs access to information and letting them act on that information in a controlled way that checks to make sure we're aligned. And yeah, I think that'll be a really fun future. You can almost imagine Copilot applied to everyday activities, right? Copilot gives you a little bit of help, so I want Minion to give you a little bit of help outside of your code editor.

ELAD:

How did you decide to work on Minion specifically?

ALEX:

Minion specifically? So basically, I stopped working at Magic. I quit because I couldn't figure out how to hook up the AI to data. I was like, in order to improve the quality, I need a PhD in math; I don't know what to do now. And then the models got good enough that that specific problem seems solvable. Yeah, the tech got better. That specific problem is where interacting with the real world broke down in the past. You know, like I said, flights get delayed. Prices change, not just a little bit, all the time. They might not have the seat you want at the concert you want to go to. And so, you know, AIs are this kind of compression of everything in their training set, but they're not a real-time mechanism anyway.

So that was the idea. I was like, okay, I think we can work on this old problem of how to make a bot do stuff for people. That's what we want; let's go make it. I don't know, maybe I can use the excitement from Copilot to launch into something which is incredibly hard, but which I believe the technology is around for, if we can figure it out. So yeah, that's it. That's the only reason I've ever gotten into startups, actually: I want to do a startup so that I can do a harder startup, or I want to do a project so I can do a harder project. Is this one sufficiently hard? This one's hard. Yeah, it's good and hard. And part of it is these fun things that you learn along the way that keep you engaged, right?

Like, the thing with code, with Copilot, is it turns out code is pretty special, right? You can run it. So if an AI generates some code and it runs, then you know something about that code that you wouldn't necessarily know with text. And doing agent-like work on the internet, or in apps, has a lot of these similar properties, where it's like, oh yeah, you can learn something here. We can learn from what people do, or we can see if it's a success or a failure and learn from that. Or we can optimize what works based on what doesn't, or figure out what's annoying and try to improve it. So I think these things all compound in some really interesting way. You know, I think the goal is straight out of sci-fi, right?

You want to make a thing where you say, "Hey computer, file my taxes," and it does the right thing, you know? And I think we can get there, certainly in the next few years. But it's also fun to think about how to break these things down, and it turns out breaking down tasks works the same way humans do it: you take a complex task, you write a list of things, you go through it. The same kind of thing works for AIs as well. So you break down complex tasks, you figure out if there's any information you need. Maybe you have to write some code, maybe you have to ask some questions, maybe you have to query some data. Same thing a human would do. Humans don't usually write code to do their taxes, but sometimes it's kind of the same thing: you go through a list of your pay stubs; that's executing a for loop. Also, I like data sets that don't exist. People clicking on stuff on the web is a gigantic data set which is currently unowned. So I think that's pretty exciting. I think there's a lot of stuff to learn from that.
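The "pay stubs as a for loop" idea above can be sketched in a few lines. This is purely an illustration of the analogy; the record fields and dollar amounts are invented, not anything from an actual tax workflow:

```python
# Sketch of "going through a list of your pay stubs is executing a for loop":
# a complex task ("do my taxes") reduced to iterating over records and
# accumulating totals. Field names and figures are invented for illustration.

def total_wages(pay_stubs):
    """Sum gross pay and tax withheld across a year of pay stubs."""
    gross = 0.0
    withheld = 0.0
    for stub in pay_stubs:  # the "for loop" over pay stubs
        gross += stub["gross_pay"]
        withheld += stub["tax_withheld"]
    return gross, withheld

stubs = [
    {"gross_pay": 2500.0, "tax_withheld": 400.0},
    {"gross_pay": 2500.0, "tax_withheld": 400.0},
    {"gross_pay": 2600.0, "tax_withheld": 420.0},
]
print(total_wages(stubs))  # → (7600.0, 1220.0)
```

The point of the analogy is that an agent, like a person, can decompose a vague goal into a list of mechanical steps like this one.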

SARAH:

So you're hiring. And one thing we've talked about is that it's kind of a funny thing to hire for building products with this new set of technologies, and working in machine learning for the last decade may not help you that much. How do you think about the people you need?

ALEX:

Yeah, it's been strange. It seems very bimodal: very senior people get it and very young people get it. And I don't know that the middle has really caught up yet or realizes that.

SARAH:

Are we on the other side? Are we in the middle? <laugh>

ALEX:

No, I just mean the middle in terms of, like...

SARAH:

I'm very young.

ALEX:

No, no, it's not an age thing. It's more like: you tell someone who's very naive, and someone who's very experienced, "I can make magic. Here's how I can do it." The very naive person says, "That's awesome," and the very experienced person says, "Ah, maybe." Yeah. But the response from almost everybody else is "That's not possible," mm-hmm <affirmative>, or "I don't see it."

ELAD:

Yeah. And a number of companies I know are trying to build that intuition internally now. Often, this is the first time in a really long time that I've seen certain founders come back and start coding again. You know, they'll have a multi-hundred-person company and get so enamored by what's happening that they'll just dive in. A number of them have told me that they feel like some of their team members just don't have that intuition for what this can actually do, so they don't even know what to build, or where to start, or how interesting or important it is. It's almost like this founder mindset is needed, to your point: this is a really interesting set of capabilities, how do I learn what they are, and then how do I apply them? And I think people lack natural intuition for what this does right now.

ALEX:

It's definitely not intuitive. Yeah. I mean, I would say I have some intuition now, and people on the ends of that experience spectrum have some intuition, but it's not intuitive. There are many kinds of learners and many kinds of programmers, right? It takes a certain kind of programmer to be comfortable with this idea of, "I don't know what's going to work, let me try some stuff." It turns out those are similar attributes to the web, or to product making: being able to test stuff, look at the right metrics, find the right metrics. Yeah, those are useful skills, I think, that are reusable. But the intuition and the tenacity it takes when, you know, "this doesn't work at all, what do I do?" That's still rare, I think, in the field, because there's so much uncertainty that it's easy to go, "I tried 10 things and it didn't work, so this is impossible." Like, okay.

SARAH:

I think the natural reaction of somebody looking at something that has 10% performance at the beginning is, why even bother? We're not going to get there. Right.

ALEX:

That's a great point. The crazy thing is that these things work at all, right? And not only that, but they scale, and they improve with scale. Most things break at scale. Almost everything breaks at scale, and that's what you hire people to deal with: every doubling, usually something breaks and you've got to fix it. It's a pretty crazy thing to think what kind of emergent properties might still be out there if we can get these things 10 times bigger. So I'm always in favor of people pushing those limits, you know? Even the best people can't explain what's going to happen when you go bigger. We built the Large Hadron Collider for that reason too, you know: we think maybe we'll find some stuff. These endeavors are valuable, and yeah, I'm grateful to be an actor during this time.

SARAH:

That's a great note to end on. Thank you for joining us for the podcast.