The Conversational Economy, Part 2

Accessible AI

June 26, 2016

The Conversational Economy — 5 Reasons Mobile Apps May Still Rule

Part 2 in a series on the Conversational Economy. Read Part 1 here

If these powerful forces and players are fueling the rapid rise of the Conversational Economy, why aren’t we there yet?

1. Fragmentation stunts developer ecosystem development.

The new platform landscape is fragmented, and it’s unclear whether we are headed for consolidation anytime soon. While we live in a dual-ecosystem mobile world with Android on the rise, for the past nine years everyone in the developed world built mobile apps for the iPhone first. iPhone users represent more engagement and spend, and Android fragmentation and less mature tooling drive higher development cost.

Announced MAU numbers except for Snapchat and iMessage. Also note Kik is missing from this analysis. *Not a direct user number for iMessage; the RBC analyst estimate from June 2016 of 470M active iPhones is used as a proxy.

Today, it remains unclear if any messaging player will ever reach WeChat-like dominance in the English-speaking world (in the way iOS led the mobile revolution). By sheer force of numbers, Facebook looks like the early leader. However, pure user-base comparison is an oversimplification of the landscape. There could remain space for others with:

Dominance in a particular geography — Line in Southeast Asia (220M users), WeChat in China (720M users) and Telegram (a somewhat globally distributed base, but 20M users in Iran)
Significant user bases — Snapchat, with the under-30 crowd? Or even Kik, with teens?
Ancillary user base and deep war chest — perhaps the five confused legs of the Google communications strategy (Gmail-Hangouts-Allo-Duo-Messenger) or Microsoft-owned Skype (a favorite amongst voice-first baby boomers, and still everyone’s favorite international voice/video fallback)
Momentum in different use cases. Slack, which in relative terms remains small (likely under 10M MAUs), has magical momentum and customer love, and very strong engagement in a particularly valuable environment — at work. Discord is becoming the preferred communications tool for the gaming community (disclosure: Greylock is an investor in Discord), and even for some big non-gaming communities like React

2. A lot of software simply shouldn’t be conversational.

Not every application lends itself to conversational design or bot-ification.

Five years ago, many believed in the promise of the “mobile enterprise” — that mobile would rapidly eat desktop client-based enterprise applications. This hasn’t happened, partly for temporary reasons (stickiness of legacy systems, learning how to design for mobile, etc.) and partly because of fundamental UX limitations. Many tasks are more efficiently done with a large screen and keyboard, and people spend a lot of time sitting at a desk. Coding, long-form writing, design, working with structured data — none of these have (really) come to mobile. One 451 Research analyst writes, “…in the end, we still see a strong bias and interest in employee surveys for laptops and desktop [computers] for most work.”

Similarly, some people are predicting the demise of the app as we know it, but some things will just work better as a visual interaction. The current “bot bubble” of developer interest is in many cases not leading to great experiences — in Poncho’s case, an experience described as “frustrating and useless” and “the slowest way to use the internet.” This harshness is undeserved, but it shows how hard it is to do this not just well, but better than all other options.

Just as consumers still prefer to look at and navigate tabular data on desktop vs. mobile, consumers will prefer many native app UIs to conversational ones unless those new experiences are more efficient or more natural. Two swipes and a tap (or a Google search with a card result) can be better than a hundred characters of iteration back-and-forth.

Some early examples of more efficient, or more natural conversational experiences:

Multiuser information-gathering. Howdy’s simple scripting, which allows you to “schedule a bot that gathers data from your team,” replaces, for example, chasing down progress updates before a weekly standup meeting (though the prodding, like that of a persistent human, can also be quite annoying)
Asynchronous complex tasks, and queries densely expressed with natural language. Facebook M, with its ability to get a “res for hot new dinner place in SF for 4ppl 2nite,” saves a user research and workflow time, as does Operator, with its expert network helping you decide on the perfect Father’s Day present
Personalized, hortatory workflows and coaching systems. Begin reduces the cognitive load of endless to-do lists
Engaging, self-paced content. Purple makes keeping up with the election consumable and fun
Hands-free voice platforms. Many people have had an “echo moment.” They’ll ask (while busy making coffee) “Alexa, what’s the weather/traffic like today?” or (when with friends who all want to know) “Alexa, who is winning the Packers game?”

3. New messaging ‘platforms’ are still in their infancy.

Over the past nine years, the infrastructure of the mobile economy has grown robust and rich. There’s a well-trodden path to setting up a new mobile service — so well-trodden that a core problem is competing with the 2 million other teams who are also doing this. To rehydrate the infrastructure of iOS and Android in new messengers will take time. Core components in approximate order of importance include Distribution/Discovery, Richer UI Support, Context, Payments, Developer Support, User Analytics and Tooling. No player besides WeChat has a cohesive story around this right now; below I will lay out the call to action for the would-be-platforms (as well as independent players building picks and shovels).

Distribution and Discovery. First and foremost, developers are balancing potential gains (addressable market and monetization) against cost (development and distribution). There are obvious channels of distribution, such as stores, directories, search, and web buttons, as well as channels that require native mechanics.

Existing Messaging Service Discovery for Kik, Line, Slack and Telegram

While most messengers are mobile-first experiences today, their discovery experiences are web-only and/or limited in scope. For example, Line’s store supports only stickers, themes and games — and Snapchat Discover is a one-way, content-only stream for well-heeled advertising partners today. Ratings and community-driven validation of messaging services remain sparse. Product Hunt is actually a richer discovery experience than native offerings.

Access to the platform’s “graph” and native sharing that enables virality within the platform (for example, group messages that allow one to pull in both a service and other users) has yet to arrive. However, if the incentivized invites of Zynga/Farmville are any indicator, enabling social distribution is tricky: without careful management, developers will incentivize users to help their services grow in ways that may hurt the platform user experience. The platforms are rightly cautious about getting this right.

Richer UI Support and Context. People often confuse messaging as the environment and messaging as the medium. Dan Grover describes the most successful messaging ecosystem, WeChat, as a better home screen (or more accurately, central inbox) through which businesses can reach customers, not a place where they’ll interact via messaging. Most brands and businesses do not have “conversations” within WeChat — they just have lightweight mini-apps or one-way feeds. Force-fitting all interactions into free-form text will only frustrate users and developers.

Companies and developers shouldn’t have to use AI and support a full, generalized NLP interface just to interact within messenger environments. At the very least they need to be able to display information, broadcast media, restrict the input that users give them (à la tabbed keyword menus), and enable common and repeated user actions (e.g. message buttons for Slack, announced recently).
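To make the idea of restricted input concrete, here is a minimal sketch of a keyword-menu bot in Python. It is an illustration only: the `MENU` contents and the function names `menu_prompt` and `handle_message` are hypothetical and are not taken from any messaging platform's API. The point is that routing on a fixed set of keywords needs no NLP at all.

```python
# Hypothetical keyword-menu bot: no NLP, just a fixed set of allowed inputs.
# All names and replies here are illustrative assumptions.
MENU = {
    "weather": "Today's forecast: (forecast goes here)",
    "traffic": "Current traffic: (traffic report goes here)",
    "help": "Pick one of the options below.",
}


def menu_prompt():
    """Return the options a user may tap or type, in a stable order."""
    return "Options: " + " | ".join(sorted(MENU))


def handle_message(text):
    """Route a message by exact keyword; anything else re-shows the menu."""
    keyword = text.strip().lower()
    if keyword in MENU:
        return MENU[keyword]
    return "Sorry, I only understand the menu. " + menu_prompt()
```

A message such as `" Weather "` is normalized and matched, while free-form text like `"what's up"` simply brings the menu back, which keeps the user on paths the service can actually handle.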

User Context. Putting aside the question of AI/NLP for later — for developers to build mobile experiences that are even on par with native mobile apps, they need data on par with native mobile — sensor data, profile data, geolocation, other phone data such as contacts and calendar. Without this data, these services that are expected to be “smart” cannot be. However, platforms releasing this data to third-party services requires infrastructure, and another user-facing layer of privacy and permissions management.

Economics. Businesses and app developers will only continue to invest in new platforms if they see a return on that investment through 1) the ability to transact directly for goods, services, and digital services, 2) the ability to show ads, or, at the very least, 3) the ability to drive users or trackable “leads” to another point-of-transaction (a native app, a real pizza shop, a web e-commerce store).

There’s a catch-22 in messaging: it’s hard to grow a userbase for a new third-party service, but if you already have a userbase on native mobile, or in your own SaaS application, why would you invest in what is fundamentally a layer of disintermediation? You must either be investing in the future ability to distribute and monetize (if you are a new company), or you are disrupting yourself (if concerned about the shift of attention away from non-daily use native apps or enterprise applications).

Developer Support and Tooling. Roadmap visibility, clear approval processes, structured partner/promotion programs and developer-specific support channels are still nascent. Slack is applauded for its relatively transparent platform roadmap — but every platform is under pressure from developers to deliver faster. Many companies building within conversational platforms are building the same core infrastructure — natural language capabilities, user analytics, feedback mechanisms.

Slack “Open Source” Platform Roadmap

4. Unsolved technology challenges remain.

At the opening of the Facebook AI Research Center last year, Director Yann LeCun said: “The next big step for Deep Learning is natural language understanding (NLU), which aims to give machines the power to understand not just individual words but entire sentences and paragraphs.”

And yet, in June of 2016, the Loebner gold medal has never been awarded — it is promised to a chatbot that passes a Turing Test, with responses indistinguishable from a human’s. While a conversational experience can be great without any conceit of being “human” (or even having sophisticated natural-language capabilities), many services are aiming for that kind of magic. But where are we really with NLU, and why is it so hard?

Language is hard to model (and program) because it is so ambiguous. Similar sentences can have very different meanings, and seemingly different sentences can have the same meaning. Humans are strange, unruly, unconscious and inconsistent in their communication, but make up for that by being so flexible in their ability to understand imperfect, ambiguous communications from others — based on context. Through experience, we unconsciously build sophisticated models of what words can mean in different contexts, and then draw on and compose those models together.

A lot of the current excitement around conversational interfaces goes back to this idea of natural language processing having reached a good enough threshold recently. The intuition goes something like this — because we don’t consciously understand language ourselves in a structured way, newish statistical methods for reasoning from large-scale, unlabeled data (e.g. deep learning) seem well-suited for NLU. They have improved our ability to compute on language without explicitly coding how it works. As discussed ad nauseam, this is happening now because of 1) more data, 2) more processing and 3) new/better algorithms.

It turns out that even if deep learning is a serious step forward in NLU (and I believe it is), our natural language “problem” isn’t close to being solved. Different applications (question answering, sentiment analysis, machine translation, part-of-speech tagging) have different model architectures competing for state-of-the-art status: strongly supervised Memory Neural Networks, Tree Long Short-Term Memory (LSTM) networks, bi-directional LSTM-Conditional Random Fields (CRF), Dynamic Memory Networks and others. Without digressing too much, even if we have some promising new ideas in research, the design and engineering of generalized, scalable conversational systems that maintain complex state, composed from those ideas, is far from commoditized. Artificial intelligence talent is extremely concentrated in the platform companies (as is, in many cases, the data needed to train AI systems). The nonprofit OpenAI, started in part to make sure AI capabilities are not so concentrated in these few (economically motivated) internet giants, just announced that one of its first four technical goals is to “Build an agent with useful natural language understanding.”

This challenging technical backdrop and the high consumer expectations associated with conversational services are at odds with the low cost to “start” a conversational service. Many developers are excited by the ease of starting (just create a Facebook page, download BotKit, create an API.AI or Twilio account, etc.) only to be quickly disillusioned by how difficult it is to meet misaligned consumer expectations.

5. Mobile is becoming a less silo’d experience.

The winners of the desktop, web and mobile platforms are by no means keen to see the Conversational Economy eclipse them. The six most relevant technology companies of right now — Alphabet, Amazon, Apple, Facebook, Microsoft and Tencent — are all competing on all fronts. That includes maintaining and improving the status quo where they have ball control. New (likely conversational) platforms such as new computing hardware in your home and car, VR/AR, and a messaging OS-inside-your-OS must all compete for consumer attention with the existing seven hours we spend per day on smartphones.

After several years of building frustration in the ecosystem, it feels like the dam has broken and mobile is evolving again, with deeplinking, instant apps, app extensions / custom keyboards, subscriptions and paid app store ads, and integration services.

Deeplinking dramatically reduces the time-to-value for users, by bringing them to specific places and actions in your app. This reduction of friction improves re-engagement, and also suggests a tantalizing opportunity to improve app search. People first used deeplinks to take users from web to mobile app content, and then to take them from marketing (emails) to app content, and finally to improve onboarding. Now the battle to make it work for app-to-app linking is playing out, and companies like Branch and Button are playing for a piece of the pie.
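As a rough illustration of what a deeplink does, the Python sketch below parses a link and maps it to a destination inside an app. Everything specific here is assumed for the example: the `exampleapp://` scheme, the `route_deeplink` function, and the shape of the returned dictionary are hypothetical, and real mobile platforms handle this through their own link-handling mechanisms rather than code like this.

```python
from urllib.parse import urlparse, parse_qs


def route_deeplink(url):
    """Split a deeplink into a target screen, an optional item id, and parameters.

    Assumes links shaped like exampleapp://<screen>/<id>?<key>=<value>,
    a made-up convention used only for this sketch.
    """
    parts = urlparse(url)
    screen = parts.netloc                    # e.g. "product"
    item_id = parts.path.strip("/") or None  # e.g. "42", or None if absent
    # Keep only the first value for each query parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"screen": screen, "id": item_id, "params": params}
```

With this convention, a link in a marketing email such as `exampleapp://product/42?ref=email` resolves straight to a specific product screen and carries the referral source along with it, instead of dropping the user on a generic home screen.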

On the Android side, instant apps (announced in May) are a much bigger leap. These are small, modular native experiences that can load independently of the rest of the mobile app, digitally signed and living within Android’s app sandbox for safety, but with access to native capabilities like the camera. For the user, these are immediately executable via a deeplink. Adoption of this new capability is still TBD — developers will have to put some effort into “atomizing” their native apps, the experience depends on fast downloads, and it is a Google Play-only feature rather than part of OSS Android — but if successful it will dramatically reduce the friction involved in getting rich native capabilities into user hands.

iOS Application Extensions

In the release of iOS 8 in late 2014, Apple’s new app extensions allowed for new kinds of third party functionality that could interact with other mobile apps (albeit through Apple’s system frameworks). This included sharing, notification screen widgets, custom keyboards and custom actions. While tightly restricted (e.g. with Apple-imposed memory limitations), extensions are a big and important step toward applications that actually talk to one another. The popularity of keyboards like Sunrise and Gif Keyboard demonstrates how important this de-silo’ing is to users.

Subscriptions beyond content and paid search ads both expand economic viability for apps — enabling more predictable, ongoing revenue streams for developers and a new way to re-engage users and drive discovery. Whereas a massive user base was previously a prerequisite for building a significant mobile company, Apple’s latest moves may enable more valuable apps aimed at devoted niches.

Finally, third-party integration services such as Azuqua, IFTTT, newly acquired Wand, Workflow and Zapier all enable end users to connect silo’d SaaS or mobile apps together.

What Comes Next Is Conversation, Apps and Integration

Clearly, the Conversational Economy is coming. These more natural interfaces and environments, the places where we spend time and communicate with other humans (and now with services) are a new platform battleground. But while many are hyping the demise of the app, this seems unlikely to me. The realities are more complex than that.

The emerging ecosystem of new hardware, voice interfaces, and messaging is still extremely immature. The technology challenges are far from solved in state-of-the-art research, much less commoditized, and the existing dominant tech giants have advantages in achieving many good artificially intelligent experiences before startups do. These players are deeply resourced, far from asleep at the wheel, and have already staked their claim in this fight. At the same time, to me it feels like our beloved smartphones are likely here to stay — and that we’ll likely get richer, better integrated apps, apps on subscription models, conversational apps, and also more than just “apps” — services that span computing platforms and reach us where we are.

Perhaps most interesting is not what the platforms are doing, but what the big mobile companies are doing independent of those platforms, and how much they are investing in their APIs. Uber, privately valued at >$60B, is an entirely mobile company. Their only web experience (besides driver and rider reporting) is m.uber.com. Even though Uber has both the engagement model to support a native mobile application, and an economic model unrestricted by the app store, they are distributing Uber’s capabilities everywhere, in Messenger, in native maps apps, with travel partners.

While this article covers many challenges, I’m actively looking to invest in the Conversational Economy — in startups that design their distribution strategy, growth and roadmap to face those challenges, and harness the wave. But investors, large companies and small teams alike should go in eyes wide open as to where we are in this ecosystem shift. Early days to be going, as April would say, “botsh*t crazy.”

First published on Medium.