
February 09, 2023

🎙No Priors, Episode 102: Cris Valenzuela, Founder/CEO RunwayML

Listen on Apple Podcasts

Listen on Spotify Podcasts

Full Transcript Below (lightly edited for clarity):

What does AI-powered content creation look like? with Runway ML’s Cristobal Valenzuela

For a long time, AI-generated images and video felt like a fun toy. Cool, but not something that would bring value to professional content creators. But now we are at the exciting moment where machine learning tools have the power to unlock more creative ideas.

This week on the podcast, Sarah Guo and Elad Gil talk to Cristobal Valenzuela, a technologist, artist and software developer. He’s also the CEO and co-founder of Runway, a web-based tool that allows creatives to use machine learning to generate and edit video. You've probably already seen Runway's work in action on the Late Show with Stephen Colbert and in the feature film Everything Everywhere All at Once.

SARAH: 

We're thrilled to have Cristobal Valenzuela on today's episode of No Priors. He's the CEO and co-founder of Runway ML. Runway's a web-based tool that allows creatives to use machine learning to generate and edit video. You've probably seen Runway's work in action: visual effects editors have used Runway to create visuals on the Late Show with Stephen Colbert and in the movie Everything Everywhere All at Once. Cris, welcome to the podcast.

CRISTOBAL:

Thank you for having me here. Super excited to chat with you.

SARAH:

So can we start all the way back? I think you are the only person I know with degrees in economics, business, and design, who then also went to art school. How'd that happen, and then how'd you stick an interest in ML in there that became very real at some point?

CRISTOBAL:

Yeah, that's an interesting question. I've always been very curious about things in general, so I've been trying to find ways of channeling that curiosity. I'm originally from Chile, and in Chile I studied a combination of business and econ, and then went into design, and it was a very particular design program. I spent a lot of time with physical computing, which is working with hardware, with electronics, mostly applied to design and art. And while I was doing that I was also consulting. So for a moment I thought I had two lives: I was doing art on one end with Arduinos and electronics, and on the other side I was consulting for these banks, which is very different. But I loved it. They're perspectives and worldviews that are very opposite, and at the same time you gain from being in both. Long story short, I fell in love with, or started experimenting with, early computer vision models around 2015, 2016, then went down the rabbit hole, applied for and got a scholarship at NYU, and then spent two years in art school. ITP, that's the name of the program, is a very unique program, and for me it was a very fundamental piece of my career in understanding how to bridge business, design, art and technology in a cohesive way.

SARAH:

Amazing. And now you have one life that combines those. But on the art side, how should I picture electronic art?

CRISTOBAL:

Media arts is probably the best way of describing it. I think for me, media arts is a way of expressing a worldview using technology. Like any other form of art, you're just experimenting and reflecting and expressing a worldview using a tool, and in this case it happens to be that we like to express it via computers and software. Writing software is a form of art, and making hardware is also a form of art. And one thing I remember, earlier in my career when I was doubling between art and business, I met this very famous Chilean artist and photographer who was mentoring me, and we were chatting and he said to me, "Cris, this is the same world. We all live in the same world. It's the same; we just build silos and arbitrary definitions of what is." That really stuck with me, and I think that's how I like to look at the world. It's just the same world. You can apply different points of view and perspectives to it.

We build arbitrary definitions of this is art and that's design and that's econ and that's business. But I think true creativity and curiosity come from just looking at it as a whole and taking things that weren't supposed to be part of one thing and then adapting them. And sometimes it's hard, because you need to learn things that you've never done before and it's uncomfortable, and perhaps you feel like an imposter, like you shouldn't be doing this. I've learned not to care, to be honest. I'm just driven by curiosity; you'll figure something out, and I really like that.

ELAD:

That's super cool. Yeah, it seems like a lot of the history of Silicon Valley actually ties in really closely with art and the art scene. So if you go back to the Stewart Brand world of the seventies, or some of the early things that were being done on the Mac, or you even look at some of the people in technology where the art side of them is understated. Paul Graham obviously wrote a whole book on this, Hackers and Painters, and is a painter himself. But there are people like Sep Kamvar, who started a company that Google bought and has done a lot of crypto-related things. He was a co-founder of Celo, and he's exhibited digital art at MoMA as well. So it just feels like this overlap between technology and art and the two scenes is almost under-discussed now, except for occasionally when people go to Burning Man or something and bring it up. Other than that, it seems like it's very under-focused on.

CRISTOBAL:

Yeah, I agree. And for me, to be honest, it's been new. I've been in New York six years, and Runway is now going to turn four years. And I was also new to the tech world and SF; I had never been to SF until three years ago. So I'm relatively new to the space, but I think how we approached it was with that same level of curiosity: I'm going to figure it out, I'm going to learn about it. And I think there are two sides to that. One is that it takes time for you to adapt, because it's just new. Like everything else, you need to understand it, you need to understand the patterns of that subject, that domain, that area. But at the same time, I'm looking at it with fresh eyes, at things that perhaps the ecosystem itself has considered norms.

I don't consider them norms. I want to try new things. And I think that opens the door again to do new things and experiment with new things. And that has, I think, been a consistent path both in my career and in Runway as a whole: we try to look at things with very fresh eyes and pretty much with a first-principles mentality. It's like, okay, why are we doing this? But really, why? You go to the basic aspects of it and then innovate. I think a lot of innovation comes basically from that way of looking at the world.

SARAH:

Runway is, I think, a very creative shape of product. It's not the kind of product you can come up with if you're just casting around for a good idea; it obviously comes from creativity and discovery. Maybe what you could do for our listeners, who really should go try it, is explain how Runway works as a tool and what people do with it, to set context?

CRISTOBAL:

Yeah, totally. I'm also happy to set a bit of context about the company itself, because I think that helps contextualize the product. The best way of describing Runway, I would say, is to think about it as an applied AI research company. We do core fundamental research on neural networks for content creation and video automation, and on generative models. We then transfer those models into an infrastructure, a system to deploy those algorithms in safe ways and in ways that let us build products that are useful for people. And those products can take different shapes and forms. We have around 35 different what we call AI-powered tools, or magic tools. And those tools serve a wide spectrum of creative tasks, because traditional editing, whether of video, audio or images, has been a very expensive, time-consuming and sophisticated process.

And so we build systems that help you do that. We have tools like green screen, for example, which a lot of broadcasting companies, film studios and post-production companies use to reduce the time of rotoscoping, which, if you ever speak with a filmmaker, is the one thing no one wants to do, but you have to do it. And so we basically just help you reduce the time. We also have tools that help you ideate and design and craft, and we have a set of suites for generative image editing and generative video editing. So perhaps the best way to think about it is as a collection of creative tools and systems that just help you augment your creativity in any way you want.

SARAH:

From an origins perspective, you had this thesis project, which was all of these creative tools, and I remember watching the presentation. It was around accessibility of the increasing number of algorithms that could help people in this creation and editing process for different modalities. Then, when we met in 2019, you framed it quite differently, as this desktop app store for ML models. Can you talk about the iterations from that collection of algorithms you were experimenting with, to the app store idea, to where Runway is today?

CRISTOBAL:

Yeah, totally. A lot has happened, I would say, over the last decade or so. When I started building Runway, it was perhaps the AlexNet, ImageNet moment. Image classification was the big thing and the breakthrough, and a lot of interesting applications were coming out of that time, but it was still very early. TensorFlow was just perhaps a year old. PyTorch might not even have been released at the time; I think PyTorch was 2016. GANs were at a very early, early inception time. But what I kept seeing was this neural aesthetic, these neural capabilities, impacting not just the visual world but also, I don't know, industries and markets like self-driving cars that were using a lot of these technologies and hardware, and the outputs were very interesting from a visual perspective. There seemed to be a correlation and approximation towards the visual domain.

And so I started just experimenting with what that actually means. How do you experiment with these sophisticated algorithms that were very early, that had all these obscure CUDA dependencies and C++ libraries, that were just very research-centric because they basically were core research? I was just fascinated by the outputs of that research. And at the same time, everything we would consider a baseline today wasn't really there yet. Things have progressed radically, the space has been growing exponentially, but the systems and software and abstractions to tap into that potential weren't really there. So our first intuition and our first product experimentation was: let's build a thin layer. Basically, let's take this set of research models, because the number of models coming out was just so interesting.

Let's add a thin layer of accessibility to those models, specifically targeted and aimed at creatives. So if you're a designer, a filmmaker, an art director, a copywriter, you might want to tap into some of these things. You want to experiment with them, but they're just very hard to get started with. So we built what at the time was, as you were describing, a model directory, an app store of models. At some point we had around 400 different models. It was one of the first, I would say, model hubs; I think there are a few out there now that you can tap into and use, but this was very, very early. And we built a whole system around it. We built an SDK, we built systems for deploying those models into real-time applications. So we built RESTful API systems where you could use a model, train a model, and then deploy that model.

And so people were building web apps and interactive things; this was around GPT-2. Someone was training and fine-tuning a GPT-2 model on a specific corpus of data and then creating an API to build a text generation app. And we had all these very interesting layers of applications that, to be honest, for us were just a way of learning a lot about the space, about what was feasible, what was possible, and who was interested in building more of this. And from there on we've been continuously iterating. We learned a lot from that model registry or model hub. We still use a lot of that in the infrastructure parts of the app, but we also gathered a lot of insights on how to build these systems in scalable ways.
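To make that workflow concrete, here is a minimal sketch of the kind of thing he's describing: wrapping a pretrained (or fine-tuned) GPT-2 checkpoint behind a small REST endpoint to power a text generation app. This is an illustrative example using the Hugging Face transformers library and Flask, not Runway's actual SDK or API; the route name and parameters are hypothetical.

```python
# Minimal sketch: serving GPT-2 text generation behind a REST API.
# Illustrative only; not Runway's SDK. Assumes `pip install transformers torch flask`.
from flask import Flask, request, jsonify
from transformers import GPT2LMHeadModel, GPT2Tokenizer

app = Flask(__name__)

# Load a pretrained checkpoint; a fine-tuned model directory could be passed instead.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

@app.route("/generate", methods=["POST"])  # hypothetical endpoint name
def generate():
    prompt = request.get_json().get("prompt", "")
    inputs = tokenizer(prompt, return_tensors="pt")
    # Sample a continuation of the prompt.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=80,
        do_sample=True,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return jsonify({"text": text})

if __name__ == "__main__":
    app.run(port=5000)
```

A client would then POST a prompt to `/generate` and receive the generated continuation as JSON.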

ELAD:

How did your technology stack or the approaches you took transition over time? Because when I look at the evolution of the area, to your point, a lot of people were doing CNN- and RNN-based things and GANs and all the early things in neural networks. And the analogy may be, I know a lot of people who started companies right before AWS launched, and their whole infrastructure stack got stuck on the past set of approaches. Then later a subset of them transitioned onto AWS and a subset just continued with their own private clouds. I'm just curious how you thought about it, since obviously diffusion models I think were invented around 2015, transformers in 2017, but it took a couple of years for all this stuff to catch on. When did you start transitioning architectures, or how have you thought about this whole evolution of the field relative to the tools that you provide and reinventing them over time and everything else?

CRISTOBAL:

No, that's a great question. It's actually something we think a lot about when it comes to product sequencing and roadmap, which is, I would say, one of the most important aspects of product building: how do you sequence everything you have to do, specifically in infrastructure, what makes the most sense, and how do you spend time? Every single day means a lot in a startup. For us there were a few realizations, to be honest. One is that the moment something gets released, let's say transformers or a particular piece of technology that you think could be interesting or worth experimenting with, it takes a collective set of months, like 12, 24 months sometimes, to understand the implications of it. And we've seen this with language models: GPT-3 has been around for some time, but it took a collective 24 months of just tinkering and experimenting to truly understand, okay, where can you go, and what can you build, and what's possible?

So we embedded that, and we always keep it in mind. The second thing I would say is that things are changing really fast. And so if you're thinking about building a long-term business and a long-term product, which we are, you always have to decide, okay, what are long-term bets versus short-term bets? And I think a lot of building and software engineering and developing products is just saying no to a lot of things. There are customers who might ask you to build something, and it could sound good, it could actually bring revenue and some growth, but it might move you away from a more consistent long-term plan. For us it was a decision between those things. And then the third component of how we think about the stack is really understanding our users.

Who are we building for? Early on it was a more technical product, so you had to know CUDA and Docker containers and how to manage your NVIDIA GPUs, and you had all this sophistication, which I think is in some part natural when things are so early, because it's just the only way of making sense of it, and you also have to build more things. But we've always been thinking about artists and filmmakers and creatives at heart, and for them those things don't really matter that much. What matters is your idea and how you execute that idea. So from the stack perspective, we've iterated a lot on the backend side of things, but from a user perspective we've iterated even more on how to present those things and what abstractions and metaphors you need to build to really solve the things you want to solve. But yeah, it's a fast-growing space, so there are a lot of things that are changing.

SARAH:

In an area where the research... nobody can keep up with the papers, the progress is mind-blowing, and has been. You've referred to Runway as, I think, an applied research lab. Is that the right term?

CRISTOBAL:

Yeah.

SARAH:

Where do you decide, given the progress in the community, when you need to do in-house research and push the state of the art versus exploit what's out there?

CRISTOBAL:

Yeah, going back to those early learnings, I think one thing we realized is that models on their own are not products. A model is a research component, and taking a model and productionizing it is a different problem than building one single model for one single task or problem, or improving a metric in a specific direction. There are a lot of nuances in how that model will get deployed, how it will get built, how users will interact with it. The unit economics of running these systems are also very important. So there are all these complexities, and as we started leveraging open source solutions at the time, or trying to build our own, we quickly realized that having control is key. You need to be sure that you understand your stack and that you know how to fix your stack.

Because if things are changing really fast and you're thinking about going in one particular direction, but it then happens that there's a breakthrough somewhere else, you need to react really fast and you need to be able to incorporate that. And if you're just relying on third parties or other solutions, that might be very hard. So for us, it was a survival realization that if we really wanted to move the standard of creative tools in the direction and vision that we had, we had to own our stack. And so we started building this research team, and this research team has a very deep understanding, knowledge and perspective on how to build models. We've done this, and we've collaborated and contributed to breakthrough moments in the creative AI space, but most importantly, we have these researchers working really closely with creatives.

Half of our team have arts backgrounds, which is very unique, and we put a lot of emphasis on finding those very unique, very hard-to-find folks who can speak to both worlds; I go back to the worlds analogy. So at one single table, you can have a PhD scientist who's been contributing to fundamental research in the space working really closely with someone who's been working in video for 20 years, who's been editing and post-producing films or content. And the things they learn from each other, it's just so unique, it's so radically different, and it helps inform how we build products.

And so we don't treat research as a standalone department that comes along every six months with "here's a paper, just do something with it." We see it as an applied thing. It's at the core of who we are and how we drive the product forward, and it helps drive the product in a different way. I think the one thing I've learned is that building that muscle takes time. It's not something where you can just say, I'm going to hire a bunch of creatives and a bunch of researchers, put them in a room, and they'll figure something out. It's a lot of learning and a lot of processes and frameworks for how you make decisions, how you understand what's worth it, what's really possible versus what's feasible. There are a lot of nuances in how to do that.

ELAD:

Yeah, it seems like there are a lot of founders now who come from the research community in the AI and ML world, and you've navigated that extremely well in terms of saying, okay, let's be very product-centric and yet still capture the best of what new technology and new research have to offer. What do you think are common pitfalls that research-centric founders should avoid, or things that they should think about more as they start their own companies?

CRISTOBAL:

Yeah, I think it's just phenomenal to see that progression of more researchers who have been, perhaps, in academia for too long moving into the operational world and building products. I think there's a great realization there: you're working on something for six, eight months, a year, but then you see someone out in the world using something very similar to what you just built and impacting the world in very meaningful ways. It's great to see more people transitioning; I think we need more of that. I still think there's a lot to be learned around the difference between a model and a product. And again, there's a lot of back and forth in how you embed models into usable products.

And so training a model, or improving some quality benchmark in some particular way, even if you have a very cool demo, there's still a long way to go to actually build a business and a reliable system that you continuously iterate on. So having more of a product perspective is always good, and releasing and working with real people as fast as you can, I think that's just key. A lot of researchers just assume how people work and how creatives work, and it's like, "Oh, we'll just do that." But the realities might be very different. And so having tools being used by real people is, I think, the best way of learning how to develop products.

ELAD:

Are there specific areas of research that you're especially excited about when it comes to video or images right now?

CRISTOBAL:

Yeah, for sure. I mean, everything we've seen in the explosion of [inaudible 00:24:17] is so exciting to see. I'm particularly excited about multimodality, about combining different inputs or outputs in ways that are yet to be explored. I think we're moving away from very siloed domains, where someone would be an NLP researcher or a computer vision researcher; I think we're starting to see them gradually converge and mix. And so building a diverse team that can understand those multiple domains is really interesting, and I'm excited to see how that's going to play out in video and in images. And I'd also like to think about how you translate that, again, I go back to product, I'm a bit product-obsessed, into products that are useful. I think there's a common, natural evolution of the creative stack, of the creative software solutions out there.

They tend to be very specific to domains of content. So you have a tool that specializes in image editing, and then you have a tool that specializes in vector graphics, and a tool that specializes in motion graphics, which is different from video editing, which is different from compositing, and you have all these very sophisticated software stacks. And the very interesting aspect of what I would like to see, and what we'll probably see more of with multimodal systems, is that you're able to merge all of those.

And what I really find interesting about that is that's how we humans think. You don't go to a movie and watch the video first, and then stop and hear the audio, and then stop and read the subtitles. It's a combination of all of those things. And an art director thinks in all of those things at the same time as well. So having systems that can translate ideas and text descriptions into videos, and then having a conversation about what audio goes with those videos, I think that's the creative side of tools that I'm really excited to discover and build.

ELAD:

Cool. And I guess, how do you organize your product work? Because I think, to your point, you have a really unique approach in terms of effectively turning research into products, or being product-centric in terms of what you're asking from the research organization. Is there a specific structure? For example, at one of the companies I started, Color, we basically would embed somebody with a very deep bioinformatics background with the systems team, so that they informed that team about their needs, and then the rest of the team would build it. And it sounds like in your case you have people who are in both worlds. Is there a specific structure, where you always put three full-stack engineers with a researcher and a product person, or the researcher is the product person, or how do you approach all that?

CRISTOBAL:

Yeah, we're a small team, we've consistently, historically been a small team, and until two weeks ago we didn't have a product person. Product was led by a combination of research, design and engineering. And I think that drives a lot of the fundamentals of truly understanding the things that need to be explored. We've iterated a lot on building squads, or building teams, or having more autonomy. I think it really depends. You tend to have a different company every four, five, six months if you're successfully building stuff; it's a continuous process.

And the thing that worked when we were five people sitting at a table is not really going to work when you're like 20 and you have new technologies and existing things available. So I don't think there's one answer in particular. With how we think about product, we iterate a lot. Right now we're working a lot with squads, and we've come to a point where the organization can have a bit more domain expertise, so instead of having very generalized engineers, we tend to specialize a little more. You can still jump in and collaborate, but you tend to have a bit of a focus area, and we're iterating with that and seeing how it works.

SARAH:

Maybe we can talk about an example of what that iteration looks like. So you mentioned rotoscoping and green screening as one of the magic tools that Runway creates. When you were building that feature, what was hard? What were the iteration processes like?

CRISTOBAL:

Yeah, I think green screen is a great example of how to build and deploy useful AI products at scale. When we were building that model directory, and we were just at the early stages of understanding limitations and capacities and directions, we quickly realized there was a type of user coming for segmentation models. At the time we didn't have a green screen tool, it was just a segmentation model, and those folks were coming from a specific domain and applying a model that was image-based to a video task. So they were exporting frames themselves with FFmpeg, creating these sequences of images, to then render them back into video. And we were like, "Why are you doing that? What's going on?" The thing is, image models don't really work well with video. So we started interviewing them, and we got to a point of, wait, this seems like something we could improve, and we're a research team, so we started iterating more on that.

But no one ever asked for a click solution for green screen. If you asked people what they wanted, they wanted a better alternative to their current stack that was faster at creating masks. They were probably using something like Roto Brush 2, so: "What I would really like would be a better brush to just brush over my frames." Customers and people are really good at telling you what their problems are; they have a really hard time verbalizing solutions. So you aggregate that amount of data, you see what's possible from a research standpoint, you chat more with people, and you start prototyping a lot. And then we came to the decision that we could build, and had the expertise to build, a system that would help you automate that.
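As an aside, the frame-by-frame workaround he describes is easy to picture in code. Here is a minimal sketch, not Runway's implementation, of applying an image-only segmentation model to video by exporting frames with FFmpeg, processing each frame, and re-encoding the result; `segment_frame` is a hypothetical placeholder for whatever image segmentation model those users were calling.

```python
# Minimal sketch of the FFmpeg frame-export workaround (illustrative, not Runway's code).
# Assumes the `ffmpeg` binary is installed and on PATH.
import shutil
import subprocess
from pathlib import Path

def extract_frames(video: str, frames_dir: str, fps: int = 24) -> None:
    """Split a video into numbered PNG frames."""
    Path(frames_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["ffmpeg", "-y", "-i", video, "-vf", f"fps={fps}", f"{frames_dir}/%05d.png"],
        check=True,
    )

def segment_frame(frame_path: Path, out_path: Path) -> None:
    """Hypothetical stand-in: a real pipeline would run an image segmentation
    model here and write the resulting mask; we just copy the frame through."""
    shutil.copy(frame_path, out_path)

def reassemble(frames_dir: str, out_video: str, fps: int = 24) -> None:
    """Re-encode processed frames back into a video."""
    subprocess.run(
        ["ffmpeg", "-y", "-framerate", str(fps), "-i", f"{frames_dir}/%05d.png",
         "-c:v", "libx264", "-pix_fmt", "yuv420p", out_video],
        check=True,
    )

if __name__ == "__main__":
    extract_frames("input.mp4", "frames")
    Path("masks").mkdir(exist_ok=True)
    for frame in sorted(Path("frames").glob("*.png")):
        segment_frame(frame, Path("masks") / frame.name)
    reassemble("masks", "masks.mp4")
```

Because each frame is segmented independently, the resulting masks tend to flicker from frame to frame, which is one reason image models, as he notes, don't really work well with video.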

Most of the literature around video object segmentation, which in filmmaking is basically known as rotoscoping or green screen, was around fully automated systems: you feed in a video and the system automatically understands the subjects and then rotoscopes or segments one specific central object, or two, let's say. But after just a few minutes of chatting with a professional filmmaker, you'll probably discover that that's rarely the case, because the shots, the scenes, the compositions and the camera angles really depend. If you have a shot of 10 people, you might want to rotoscope the one on the left, depending on your idea. It's a creative tool, so it should be general. So what we did, instead of relying on fully automatic systems, was embed a human-in-the-loop component in it. We thought it would be great if, before the model starts doing that, you could guide it: you can tell it which selections or areas of the video you want, and you can zoom in and define them.

And that actually helped us train the model, because we built a realistic model of simulated human clicks on a mask, and the model was trained on that knowledge from the very beginning. And that helped the product itself, because people were using the model in that particular way. The decision was, I would say, a combination of different things. It was research, some research knowledge and understanding of what was feasible: can you build that segmentation model? What datasets and what do you need to do it? Who would be using it? How are we going to test if it works?
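He doesn't spell out how the clicks were simulated, but a common approach in the interactive-segmentation literature, shown here as a minimal sketch under that assumption rather than as Runway's actual training code, is to sample positive clicks from inside a ground-truth object mask and negative clicks from the background, and feed those points to the model alongside the frame.

```python
# Minimal sketch: simulating user clicks from a ground-truth mask (illustrative only).
import numpy as np

def sample_clicks(mask: np.ndarray, n_pos: int = 3, n_neg: int = 3, seed=None):
    """Return (positive, negative) click coordinates sampled from a binary mask.

    Positive clicks land inside the object (mask > 0), negative clicks on the
    background (mask == 0), mimicking how a user would guide the model.
    """
    rng = np.random.default_rng(seed)
    pos_pixels = np.argwhere(mask > 0)    # (row, col) pairs inside the object
    neg_pixels = np.argwhere(mask == 0)   # (row, col) pairs on the background
    pos_idx = rng.choice(len(pos_pixels), size=min(n_pos, len(pos_pixels)), replace=False)
    neg_idx = rng.choice(len(neg_pixels), size=min(n_neg, len(neg_pixels)), replace=False)
    return pos_pixels[pos_idx], neg_pixels[neg_idx]

# During training, each (frame, mask) pair would be augmented with sampled clicks
# so the model learns to refine its prediction around that user guidance.
```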

And the first version of green screen worked at four frames per second. It was incredibly slow, and it was not as good as the one we have now, which is incredible, but it didn't matter. It was significantly better than anything else out there at the time, and people were scrambling to use it just because it proved to be some percentage better than anything out there, and people were hacking things and trying to incorporate it. It's like, great, that means we're onto something, and then we started iterating a lot. And we keep iterating a lot on it, but the fundamental piece of how we build product is still pretty much similar to that.

SARAH:

Very cool. Let's zoom out and talk about Runway as a business. So you said you're now very intent on building a long-term, durable business. Who uses and pays for Runway today?

CRISTOBAL:

We're devoted to storytelling and creative exploration and ideation, and that's a wide spectrum of people who you could consider to work in the storytelling business. On one end, you have really professional people who have been doing this for years: folks working in post-production agencies, VFX agencies, broadcasting companies that create video as their main business. This is basically what they do. It's entertainment, sometimes sports.

SARAH:

Now, that's kind of counterintuitive, because one of the beliefs of many people who look at the research, which is progressing fast, is that you can't get the quality level for the highest-production-value assets with today's research. So it's really interesting that you're talking about VFX studios and that type of content.

CRISTOBAL:

Yeah, I think the realization for us is about what the goal is. If you're trying to automate the entire process, the whole end-to-end system of making a movie, we're not there. We're very far from that. There are a lot of things that have to be developed, researched, understood and tested. But going back to green screen: if you look into the processes and nuances of how video is created, and you look at the inefficiencies of how people are doing it right now, and you offer these people a hundred percent, or even 10% or 20% or whatever percentage of speed and cost reduction, it's just so radically better, and it's radically better for two reasons. Of course, it helps reduce cost, and you can do things faster, so it's just easier. At the same time, you can explore more creatively, and this happens a lot.

I was speaking with this director who was working on a film and using Runway, and he came up with this idea when he was chatting with his editor: "We should just Runway that. Just Runway the thing that you're going to do." Before just Runway-ing something, they had to marry themselves to, or lock in, one specific idea, because if they tried two other things, it was going to take too much time and they just couldn't afford that. Every creative is always on a deadline.

SARAH:

It's very waterfall-era. You must choose a direction and do the whole thing.

CRISTOBAL:

Yeah, exactly. And now, he was telling me, I can do all three. I can just see the three and pick the one that I like the most. I'm not constrained by the time and the cost; I'm constrained by whatever idea I think works best, and that's just phenomenal. So I think our goal still is not to build these autonomous systems that don't engage in any relationship with humans or with creatives. On the contrary, you have humans coming up with great ideas and wanting to express those ideas. How do you build systems that will help them get there really quickly?

And sometimes what you need is to get 80% there, 90% there, and in research, going from 80% to a hundred percent is really hard. I think you're seeing that in autonomous vehicles, where it's always two years away and it's always 80%, but that last 10%, 20% is just really hard. And it's really hard in that domain because if you have a 1% failure, someone might die. In creative domains, that's not the case. Even if you're only 80% there, the remaining 20%, sure, I can worry about it, I can improve it, I can find ways to work with that. But you've made incredible progress from that perspective.

SARAH:

I think that's actually an interesting filter for what domains are interesting for applied research today. Areas where there's built in tolerance for lower levels of accuracy is one way to look at it.

CRISTOBAL:

And you can always integrate; there are ways of combining existing tools. So for rotoscoping, for example, you can get 80% there, and then, if you're a professional filmmaker working in Nuke or Flame, you can do the 20% in that stack, but you still save yourself days of work. So it's still better than anything you were using before. It depends a lot, and I think over time more models will get to higher numbers and we'll have better outputs, but there's a lot yet to be developed, and I think we're still scratching the surface of what's coming.

ELAD:

What was the moment... you mentioned there's been an evolution both in terms of the number of tools you provide and their relative quality, in terms of 80% versus more or less, and things like that. Was there a specific moment where you really felt that you had product-market fit, or where you felt, okay, this is something a lot of people want and want to use? Was it immediate? Was it after a specific tool came out? When was that moment for you?

CRISTOBAL:

Yeah, I like to think of product-market fit as a spectrum: you have either really strong product-market fit or weak product-market fit. And as you build new products and new research, you're always seeking to be very much on the strong side of things, of course. I think for us there were a few factors where we realized that what we were building was beyond just a niche, because I think we started with a very niche audience, and everyone dismissed what we were doing a little bit as toys. "You're just art students building some toys." And I think you shouldn't dismiss toys. Toys are very interesting, you learn a lot from them, and I've learned that over time. But at the time you're building those, of course, you're focusing on the output, and they're glitchy and they're abstract and it's just weird: "I can't make sense of it."

SARAH:

It's only 128 by 128 pixels.

CRISTOBAL:

Exactly, exactly. I remember we had a version of a fairly early GAN system that did text-to-image translation. We actually still have them online, and the outputs were, exactly as you're saying, these 128-pixel images that were just blurry. They looked like abstract paintings. You'd type, I don't know, "a blue ocean," and you'd get a blue form with something. So you could see it, if you close your eyes, 10 meters away, maybe-

SARAH:

And you saw beauty or the future.

CRISTOBAL:

I really liked it, but at the same time, I remember showing it to advertisers. I went to this exciting meeting at this top agency in New York and I was like, "Here guys, here's the thing you will be using for work." And they were like, "Cris, this is a toy. Great, fascinating technology, whatever, but we have work to do. Come on, move on."

And I think the main mistake, for me, was looking at that technology at a singular moment in time; you should really be looking at the rate of progress. The fact that I can type a word and get an image wasn't feasible a year ago. It just didn't exist. Right now we have this, so just compound that and imagine where we'll be in four, five, six years. But the thing is, it's really hard, because you can't imagine it. And I remember people at the time, when I showed some of those demos, specifically for [inaudible 00:40:06] models, asking me, "Hey Cris, how are you collaging these images? You're taking existing images and pasting them together?" And it's like, no, you're generating them. The model has learned patterns from a dataset, for sure, and you're then generating them on the fly, but these images don't really exist. They just don't exist.

So there are a lot of mental models, I think, that need to be adjusted to really understand it, and we've been adjusting those mental models. From a product perspective, and from a product-market fit perspective, I think there's a right moment for the market to use a technology, and I think that moment has matured. We've seen it as more people have been exposed to generative models and their potential. For us, there's still a lot to build and develop and improve. But a few of the realizations came when people started using Runway as a verb: you just Runway that. That means something.

Then you start seeing people creating tutorials and speaking about the product online, and for a long time we never had a marketing team or a content strategy team. Everything was basically people making things and then sharing them online. I think that really drives, I would say, the realization: okay, we're onto something, people are using this. Every day they're coming, they're sharing it with their friends, they're thinking about it every day. I remember an artist, an early Runway adopter, who just fell so in love with the product. He painted a picture and sent it to my home: "Here, I just want you to have the first piece I ever made with AI." And that was like 2018.

SARAH:

What was it? A cat?

CRISTOBAL:

No, it was an abstract painting. He generated something that was very abstract, then painted it on canvas, and then used mixed techniques to improve some parts and change some colors. It was very new and novel at the time. It was like, wow, that's just, I don't know, interesting and fascinating.

ELAD:

One thing I'd love to get your perspective on, simply because you have such a unique mix of background and skills and customers and everything else: there's this emerging debate in the art world about the role of AI in art. And I think if you go back through art history, there have always been ongoing questions and contention, not just around technology and art, but around the role of an artist relative to the art they create. I think the old-school canonical example was Marcel Duchamp signing a urinal with "R. Mutt"; I think it was called Fountain or something. It was a piece that he submitted, and it got refused, and it created a bunch of scandal at the time. Or Andy Warhol had The Factory, and other people would actually assemble a lot of the art with him overseeing it.

And so it seems like there's been a long history of different approaches to art that at the time seemed very controversial and now you're just like, yeah, of course that's how you do things or how things were done. What do you think about the debates right now in terms of art and AI, and what do you think are the important threads that people are talking about, and what do you think are the areas that in 10 or 20 years people look back and say, yes, it was just part of this art history debate, but in hindsight wasn't really that important?

CRISTOBAL:

Yeah, I like to think a lot about, I guess, previous moments in history and time, as you were referring to before, that have taught us something about how to both understand art and look at the tools that we use for art. For me, art is a way of looking at the world and expressing that view of the world in a particular way. And an artist's role, I think, should be to explore and experiment with different mediums that allow you to express that in the best way you think possible. And so people experiment with different techniques and different systems and different structures and pigments, and with the tools themselves. And even before Duchamp, and even before Warhol, you had previous moments in time where technical revolutions enabled people to look at their world in very different ways and then express those views of the world in ways that weren't feasible or possible before.

An example I go back to often is this idea that in the 1700s, before painting was even something you could do in any condition, situation or location, painting was the realm of these very sophisticated painters who were painting in studios. Painting was the realm of people who could afford it and who were able to understand and master the techniques of the masters. And more importantly, from a tools perspective, it was really hard to get pigments. It's a very practical thing: ready-made pigments didn't exist. You couldn't just go to a store, get red, white and yellow and a canvas, and say, I'll paint something. The way you mixed pigments was this very sophisticated thing where you had to hire a master who knew these obscure techniques, and you were mixing them and then storing them in these sophisticated bladders and sealing them, and it was an incredibly complex and expensive process.

And then someone said, "Hey, we should just build a tube, and then you have this and can carry it around, and maybe it's easier." And it was. It was a very radical innovation, very simple at the time; it seems very simple to us now, but what it allowed was for a whole new generation of artists to look at art and be like, "Great, I'm going to take this paint, and there's some mountain that I really like there, [inaudible 00:47:54] the canvas. I'm going to paint en plein air," which is a thing: you paint in plein air, you're painting in the open air, in the wild, and you're able to look at the world and the sky and quickly brush the light. And being outside of the studio was just not physically possible before that. And that gave birth to Impressionism, and Impressionism was a whole revolution.

Impressionism was not really well received, because it was like, "Hey, this is not art. These are just brushes of things. They're not... I mean, no." And then Impressionism really started to pick up. People started to really understand the medium, and then it evolved. It continuously evolved and evolved, and you find similar moments in time where the paint tube metaphor becomes relevant. Photography, for me, was a very similar one. Then cinema, for sure, and then the digital world, the transition from film to digital, is another one. And at every single step of the way, you have artists experimenting with the technology and using it to put a perspective on the world. With what we're seeing right now with AI, I like to think there have been two AI art waves. There was 2015 to 2022, with VQGAN and the early GAN experimenters, when a lot of artists were experimenting with it, and now the diffusion and transformers kind of world has enabled a whole new wave of people to experiment with it.

But both in the first wave and now in this particular wave, I think we're in the paint tube moment, where people are taking it and using it to express something they think about the world, and then typing that and generating something. I think the artist still remains pretty much at the center, because that's what art is really about, and these are just tools. It's hard to understand them at first because they're just new, like every new piece of reality is. And I guess, to your point, what are we going to be asking ourselves in 10, 20, 30 years? I think it's the realization that we'll look back and see this moment as, yeah, it was a natural transition and we needed it. It allowed us to do so many things that we just couldn't have thought of before. Great that we had it. And I think we're still early in realizing that.

ELAD:

It seems like an extremely exciting time from an arts perspective. And I remember in 2018 the first GAN-based artwork sold at auction; it was at Christie's. And then it almost felt like everybody got really excited and then there was silence until this next wave of diffusion-based models and everything else. Is there anything that you think is needed to encourage that art scene, or do you think it's literally just time now, because we have the tools and we have really interesting things happening? Or do you need to be able to print the art a certain way? I'm just curious: what are the obstacles to this becoming a bona fide fine arts moment or movement?

CRISTOBAL:

I think it's convenience; it needs to be accessible and usable and understandable by people. In the analogy of the paint tubes, we're not yet at the stage where you can just buy a paint tube and use it. We're still at the stage of transitioning from these sophisticated pigments to some sort of paint tube. With early GANs and Robbie Barrat, who I think is the artist you mentioned behind some of the early works in that auction in, I think it was 2017, 2018, it was very hard to just get started with a model. It was a very sophisticated process, and now you can do it from your phone. We're coming closer to that; we're putting the cap on the paint tube, almost there. I think it's just the rate of progress and the expectation that it will become easier and better.

And I would say two things. These models and systems need to become really expressive and controllable, which is somehow the way to think about alignment: you have an intention and you want to express that intention in a very controllable way. These models are not yet controllable, not exactly as we would like them to be. And the reason why is that we're super early. There's a lot that has to be invented to control them and have them be very expressive and work in the way that you really want them to work.

SARAH:

So the art movements that you mentioned are art movements, but fundamentally they're also cultural movements. We've talked about the tools, because you're a toolmaker and we've got to have the paint tube, but if you take Impressionism or Futurism or something, it also had an aesthetic, it had tools, but it was also very Italian at a certain point in time, and it was about optimism, about urbanism and cars and everything. Are there schools or philosophies or scenes that you think are worth paying attention to right now?

CRISTOBAL:

Yeah, I'm biased, because I guess I'm part of this particular scene in New York, the media art scene, which I think has heavily influenced a lot of these learnings at this stage. To your point, every art movement sits in a particular cultural and historical context, and Futurism was a particular moment in time about technology, and fascism was also around, and there were a lot of things; you just look at the world in this particular way and express it in this particular way, and there's an aesthetic and a line and a system that, if you look back, it's like, oh, of course.

And cinema was the same. Movies early on were a way of perceiving the world and expressing it, because it was a very contextual, historical moment in time. For me, if you apply that same principle now, I would tend to look a lot at the weirdos of tech. People who are at the fringe, people who've always been considered like, oh, you're just toying around, this is just an experiment. There are a lot of creative coding communities and people experimenting with code as art. There are a lot of conferences and communities of people, like Babycastles in New York and others, and you have all of this very, very highly creative and niche art. I think those folks will define a lot of what we'll see next in tech.

SARAH:

Yeah. Well, New York or otherwise, weirdos are a pretty good bet in general, for people who come from the technology world.

CRISTOBAL:

Yeah. Yeah. 

SARAH:

Cris, this has been amazing. That's all we have time for today. We're looking forward to the creativity unlocked by the next paint tubes. Thank you so much for joining us on the podcast.

CRISTOBAL:

Of course. Thank you for having me here. It was great.

ELAD:

Thanks a ton.