The MoP-Up #4: Why Release Half-Baked AI Products?
"I'd love to be a fly on the wall for that postmortem."
Erica
Hello Bassey! Long time, no MoP Up. Mostly my fault; blame Yahoo and highs over 60. Nobody tells you when you start a blog to consider how little you'll want to be inside in the summer.
Since we've been gone, Google has both launched and un-launched their universal AI Overviews search feature. I'd been using the experimental version for months and found it moderately useful, often slow and always dull. Is it worrisome that I can no longer read a content-rich paragraph of clear, direct sentences without losing concentration? Yes, but it's where we are. Either tell me the answer now or entertain me. AI Overviews does neither, and also periodically encourages users to eat glue.
We're all upset that tech companies can monetize information retrieval products such as search and chat without compensating the people who initially created the information. Facts are expensive to extract from the melange of rumors, misunderstandings and blatant lies swirling around; shouldn't the people who do the hard work of harvesting them be compensated?
For sure, and yet no free-market business model has emerged to do so. Consumers will go fact-free rather than pay for facts. For whatever reason, facts don't feel expensive. The leverage the media has over Google and OpenAI is the ability to produce quality writing. Clever, witty, wry, surprising, emotional. Quotes that make people want to scream or cry or laugh. I want my facts fast, in a feed. But I want the content that I sink into to be written by humans and I want it to be written well. Let the tech giants figure out how to insert ads into repackaged fact bundles and head up-market. Walmart is very profitable, but so is LVMH.
Bassey
It has been far too long for us at MoP! I can only hope our dear readers can forgive. As for me, I've been kinda sorta figuring out what to do with my life. Fortunately, I've got a lot of options. Unfortunately for the AI industry, it's becoming clear our universe offers just one option, apparently: The dumbest available timeline.
I too opted into the Google Search AI Overview test just about the moment it launched. I certainly didn't think a few short months later, they would launch it by default for every search query and for all users – mostly because beyond the implied simplicity of: do a search and get an answer, it was actually surprisingly difficult to use well. The mental model is:
1. Read through an answer, analyze the text to confirm the AI is talking about the same thing and in the same context.
2. Consider whether the answer has the feel of clickbait written in a click factory or a real website someone cares about.
3. Confirm that the answer you received is logically consistent with itself and did not jump into a tightly-related topic.
That said, for most searches (and ignoring the intrinsic self-defeat of de-incentivising answer production), this is faster and more efficient than working through a ranked list, but it hardly gives your customers confidence that they need your paid AI gizmos.
A question you can hopefully answer: Isn't there a way Google's AI Overviews could have been globally launched more conservatively, across a smaller, but expanding set of well-tested queries? Or is there something about the technical and business reality of current AI tools that says: For your second trick, you must attempt to leap over a canyon?
Erica
Short answer, yes. I'd love to be a fly on the wall for that postmortem. I imagine the list of "learnings" includes limiting the product to common "head" queries instead of exposing their toddler of a search product to the firehose of shit people ask about on the internet. More adversarial testing is another easy answer. If I ran the project, I would have recruited some very online internet teenagers to bang at the thing. Young, social media native minds have a way of finding edge cases that Google employees do not.
On the other hand, I have empathy for the Overviews team or any product team with a gun to their head to launch a consumer product fused to an LLM in 2024. These things are not ready for prime time and should only be used for (1) demos (2) products that don't need to work well or (3) applications that consumers never see. Use LLMs to categorize customer service requests. Run a genAI hackweek. Improve a janky auto-suggest feature. All great ideas! Replace the most trusted consumer product in the world with fledgling technology that is well known to make up nonsense? Not a good strategy in 2024.
There are many bad arguments for launching LLM products that don't work. Most involve stock prices or delusions that consumers want to chat with a robot on your homepage. The only one I'm partial to is the argument that it's important to build bad AI products as a way to train employees to build any AI products. If AI becomes good enough to support good products, you want to be running the company that has practice using this technology. Overviews could shut down tomorrow and still be worth the public relations debacle if it shook Google out of its innovation lethargy. None of this is helpful for consumers; we'll need to put up with the sparkles next to "smart" text boxes for years to come, as we wish for a simpler time when regular search worked.
I'm curious, Bassey, what you think about that argument as a manager? Manipulative? Brilliant? Brilliant but manipulative?
Bassey
Mostly, the strategy strikes me as a bit dated. Just Ship It!™ was a fine philosophy when the fruit of useful innovation hung from lower branches. It always made sense to release a new tool that attempted some drastic improvement to a product category.
I get the sense that now, especially given the proliferation of subscription products, we're in a loyalty era. If, in practice, most products are chugging along with incremental improvements, does it matter so much that something out there is better than what you're subscribing to? Not really – so long as the product you trust seems to understand what you're trying to do, you'll also just trust that they'll catch up soon.
The more that any company, Google included, tries to push you into using products in ways that aren't productive for you, the more you start to think that you'd be better off paying someone else every month.
I'm also sympathetic to the idea that shipping products breaks tech orgs out of lethargy, but they would be better off making small (possibly non-revenue) improvements. Then, they could develop a better instinct for what drives loyalty, as the sum total of those changes makes using clunky old products like search feel smooth as butter.
Why can't Google (or Bing, et al.) just use all of their categorization magic to tell me why I should click on one link versus another, or build a more precise search query? Try that first, and then maybe, one day, we can talk about repurposing the sum total of human knowledge for your own profit.
[Note: Prior to publication of this newsletter, Apple announced its AI effort “Intelligence” at WWDC, which is pretty aligned with my suggestions above. Erica’s reaction: “I’m partial to this take about the feature choices: They seem lightly useful, and I appreciate they didn’t overhype them.”]
Erica
Stock prices! And egos! The CEO of Zoom believes that in “a few years” there will be AI avatars of our work selves that can take our meetings while our embodied selves go to the beach with our families. How fast until someone gets the HR bot to say something racist in a bot town hall and the humans have to schedule a human-only postmortem?
I could go on, but I think we should wrap things up.
Please don’t launch bad AI products, dear readers. And look for MoP to return to its regularly scheduled programming. As always, you can email us at questions@machinesonpaper.com.
2 Comments
There's one other argument I sometimes wonder about when it comes to "launching bad AI products": the quality of many AI products is closely tied to the feedback signal you get from usage, which results in a "first-mover" advantage. OpenAI's ChatGPT instruction tuning looks like an example of this (by being the first chatbot, you get most initial use, and most use gets the most signal about what people ask, which means you are in the best position to optimize responses to what people ask, which helps ensure that most people use you, and helps maintain technological supremacy in the area). Google Search is another example (what people click on is an important signal for quality, so the more clicks you get, the better your product). These feedback loop effects lead to a strong incentive to be "fast to market" with a bad product so you have the signal to make it better. There's a balance, however: if your product is not good enough, people don't easily forgive you. So you need to be good enough to start with. That's a hard call to make.
As always, I absolutely love reading your blog. Thank you, and write again soon. I'd be particularly curious what you both think of the growing wave of voice-based AI interaction technology (GPT-4o et al.).
I see a lot of similarities in the current frenzy towards “AI” and early online advertising (the internet’s original sin). Today’s crowded field of “unique AI product offerings” looks a lot like the one for Display Ads in the 90s-10s (https://shorturl.at/Y4ju9). Throw in Live, CTV, Video, Podcasts, [some other vertical] and the crowd becomes a mob, each specializing in some obscure segment or “edge” and fighting for a piece of the ~270B advertising dollars floating around today.
“delusions that consumers want to chat with a robot on your homepage” is such an apt way to describe what’s happening. It’s almost a push to squeeze whatever remaining ounce of patience we all have left to contribute before our generation officially throws away our keyboards and the only people answering survey questions or posting on social media are brand ambassadors and that one person who actively posts on the Facebook Buy Nothing group trying to give away their half-used can of pinto beans because they tried an ethnic recipe for the first time and realized they never want to do it again.