Thinking (creatively) with the help of machines
“An armchair in the shape of an avocado” is a weird thing to have shake the world.
The release of the second-generation model, DALL-E 2**, which generates imagery from a textual description, is striking a chord. The visuals it generates are inspiring and scary-good. It’s higher-resolution, lower-latency and more capable (edits too!) than its predecessor. Dare we say it’s more creative? Here are a few of my favorites so far:
"Teddy bears shopping for groceries in ancient Egypt."
“A rabbit detective sitting on a park bench and reading a newspaper in a victorian setting”
"pastel hopeful and dreamy illustration of a girl blooming out of a lotus flower"
With models like DALL-E 2, everyone can now create visuals of high expressive quality at much lower cost. Visuals make communication more impactful. Whether you think it is art or not, DALL-E is about to make us all more powerful communicators.
The Age of Assistive, Creative AI
We are entering the age of assisted knowledge work and play. After I first wrote about the conversational economy and the potential of AI agents, there was a period of disillusionment that lasted more than five years. Bots deployed in the wild, for example to replace human support agents, didn’t live up to expectations.
But we have since seen a flurry of AI applications succeed in counterintuitive ways. Instead of fully automating repetitive work, they’ve excelled at higher-level creative, nuanced and empathetic tasks as varied as art generation, language translation, and personalized mental health coaching (try Woebot).
I’m particularly interested in ways AI can augment humans attempting different modes of creative knowledge work, where we don’t need to see superhuman performance for deployment in the wild, and some level of unreliability as models improve isn’t fatal.
There have been several breakthroughs besides DALL-E 2:
- Code generation (with OpenAI Codex / GitHub Copilot)
- Gameplay with DeepMind’s AlphaGo and AlphaZero. One of DeepMind’s most profound discoveries was this early proof that AI will surpass us in creativity, initially in seemingly trivial ways. There is a fun book, Game Changer, on the new chess strategies discovered by AI.
- Promising early takes on AI-powered features for easing creative tasks in other mediums. Try this GAN-based photo-restoration mini-app from Baseten; Reduct and RunwayML for video editing; and Descript for audio editing.
Our Future Writing Partners
Writing is the way we share knowledge, or create knowledge in the first place. We cannot have fully formed thoughts without organizing them in language. But writing is also incredibly hard.
Existing AI that helps us write comes in the form of either narrow utilities, such as rule and style checkers (Grammarly), auto-summarizers (see QuillBot) and short-form suggestions (see Google’s Smart Reply), or more specialized support (see Cresta, for helping call center reps chat effectively). But we don’t yet have a general-purpose writing assistant, the way Codex is a general-purpose programming assistant. A writing partner is really a thought partner. This will be a huge unlock for humankind.
What does the path to that look like? A few skills we can picture for this partner:
- Keep track of and structure our ideas
- Generate new ideas
- Contextualize our ideas in the search space of all existing work
- Search for and suggest new, related knowledge while we are thinking
- Summarize that existing, related work
- Search for and suggest people we should discuss our ideas with
- Summarize the structure and storyline of what we have written for review
- Rate components for quality, suggest revisions
How might we think and write better if our first reader is someone who has all the knowledge of the internet? An agent that has read everything in a particular field?
In the above list, the second concept, generating new ideas, might be the least obvious. However, this is already within reach. In the field of purely creative writing, I used GPT-3 to help me create “lore” at scale (think fanfiction, but it can actually become part of the canon in web3) for a cyberpunk NFT project I love, Chain Runners, by feeding it structured prompts. I was inspired by Crypto Covens, which did something similar: read from keridwen about the process with some examples of her lovely writing.
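For the curious, here is a minimal sketch of what a structured prompt for lore generation could look like. It is an illustration under assumptions, not the exact prompts used for the project: the function name, the trait fields and the example character are all hypothetical, and the resulting string would be handed to whatever text-completion model you have access to.

```python
def build_lore_prompt(name, traits, examples):
    """Assemble a few-shot prompt: an instruction, worked examples, then the new character.

    All field names here are hypothetical; adapt them to your own project.
    """
    lines = ["Write a short piece of cyberpunk lore for each character.", ""]
    # Each worked example shows the model the format we want it to continue.
    for ex in examples:
        lines.append(f"Character: {ex['name']}")
        lines.append(f"Traits: {', '.join(ex['traits'])}")
        lines.append(f"Lore: {ex['lore']}")
        lines.append("")
    # The final entry is left open so the model completes the lore.
    lines.append(f"Character: {name}")
    lines.append(f"Traits: {', '.join(traits)}")
    lines.append("Lore:")
    return "\n".join(lines)


# Hypothetical example data, invented purely for illustration.
examples = [
    {
        "name": "Example Runner A",
        "traits": ["neon visor", "courier"],
        "lore": "She knows every back alley in the city and charges by the shortcut.",
    }
]
prompt = build_lore_prompt("Example Runner B", ["cybernetic arm", "hacker"], examples)
```

Because every character is described with the same fields, the same template can be reused across a whole collection, which is what makes generating lore “at scale” practical.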
If an AI agent can coach us to write, it’s a small step to picture coaching for other types of communication, from the mundane (“don’t text when you’re angry”) to encouragement to use specific language (“how do you think about this?”). These agents can be thought partners if we want to test out a specific framing or practice a difficult conversation. They can give us guidance for cross-cultural interactions.
AI is on an exponential curve, which is always hard for humans to picture. OpenAI’s advancement of DALL-E has made it easier to see the future. I believe the next surprising advancement will be in language, though advancements will come in all modalities.
If wordcels rule the world, then we should all accept any help we can get from AIs to become our best wordcel selves. And if you’re an entrepreneur, don’t be afraid to be ambitious with what you assume of AI over the coming years. Plan to be surprised.
**“DALL-E” is meant to be a combination of “WALL-E” and “Salvador Dalí”