The Shift to AI-UI: “Hey computer, will conversations replace point-and-click?”
- Adam Lavelle
- Mar 27
- 8 min read

Are we witnessing the death of the button? The demise of the dropdown? The extinction of the hamburger menu? Perhaps. And I, for one, am not entirely sure how I feel about it.
Remember when we used to double-click icons on our desktops? Those were simpler times. We had actual folders that looked like folders, and trash cans that resembled real trash cans. Skeuomorphism, they called it—digital objects mimicking their physical counterparts, as if our primitive minds couldn't possibly comprehend abstract concepts without tying them to tangible objects. (Do you miss those tacky faux-leather calendar apps?)
Then came the Great Migration to the browser. We traded our robust desktop applications for stripped-down web versions that took fifteen seconds to load a dropdown menu. Progress, they called it. Cloud-based, they said. Accessible from anywhere, they promised. And yes, we gained mobility and synchronization, but at what cost? Our once-snappy interfaces became laggy, our once-intuitive workflows became multi-step processes hidden behind inscrutable icons. We learned to hate the spinning wheel, the loading bar, the dreaded "connection lost" message.
But we adapted. We always do. Web apps got better, faster, more responsive. JavaScript frameworks like React and Angular brought desktop-like performance to browsers. We regained some of what we'd lost and gained capabilities we'd never had before. We reached a state of equilibrium—a comfortable balance between functionality and accessibility.
And then AI entered the chat.
AI-UI: The End of Interface as We Know It?
What we've been calling AI-UI—Artificial Intelligence User Interface (more formally known as a Conversational User Interface)—represents possibly the most fundamental shift in human-computer interaction since the graphical user interface replaced command-line interfaces. But unlike that transition, which made computers more accessible by making them more visual, this new shift is taking us in a seemingly backward direction: toward text and conversation.
But it's not really backward, is it? It's circular. We're returning to language, our most natural form of interaction, but with a crucial difference: the computer finally understands us. Sort of. Consider the evolution:
1. Command-line interfaces (type exactly what the computer expects)
2. GUI interfaces (point at what you want and click it)
3. Conversational interfaces (tell the computer what you want in plain language)
Each step has removed a layer of abstraction between human intent and computer action. We've gone from having to learn the computer's language to the computer learning ours. But are we ready for this shift? More importantly, is it actually good?
The AI-UI Spectrum: Four Models of AI-Human Interaction
Right now, we're witnessing four distinct models of AI-UI implementation, each representing a different philosophy of how humans and AI should interact. These can be understood along two critical dimensions: how explicit the AI's presence is to users, and how abundant traditional UI elements remain.
Chat Dominant
Claude (hello, it's me writing this!), ChatGPT, and Perplexity represent the "Chat Dominant" model—where the UI is dedicated to a scrolling chat between the AI and the user, following the conventions of a text message exchange. The blank text field is both terrifying in its limitless potential and liberating in its lack of constraints. There are no buttons to press, no menus to navigate, just you and your ability to articulate what you want. This model works surprisingly well for knowledge work and creative tasks, but it requires something many of us have lost: the ability to clearly articulate our thoughts in writing. When was the last time you wrote more than a few sentences without using emoji or abbreviations? Be honest.
The Chat Dominant model also creates what I'll call "Interface Amnesia"—unlike visual interfaces that constantly remind you of available features through visible elements, chat interfaces hide all capabilities behind the veil of language. You can't see what the system can do; you have to remember or discover through trial and error.
Search Dominant
Google's search page with its AI-generated summaries represents the "Search Dominant" approach—where the UI remains dedicated to the query/result model but embeds generative answers inferred from the user's keywords. There's limited interaction with these AI elements. It's a one-way conversation where the AI speaks but doesn't listen. This creates conversational frustration—the AI appears to be talking to you, creating the expectation of dialogue, but then fails to engage when you respond. It's like meeting someone at a party who delivers a monologue and then walks away when you open your mouth.
Assistive
Then we have the "Assistive" model, exemplified by Microsoft Copilot—where the "right rail" or other area is used as a dedicated AI chat space while the main application adapts based on the conversation. This creates Ambient AI—an intelligent presence that watches over your shoulder and chimes in when it thinks it can help. This model preserves traditional interfaces while augmenting them with conversational capabilities. It's perhaps the least disruptive approach, but also potentially the most intrusive. No one likes a backseat driver, even when they're occasionally right.
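To make the pattern a little more concrete, here is a minimal sketch in TypeScript of how an assistive "right rail" relationship could be modeled: the chat side proposes actions, and the host application only applies them when the user accepts. The names and types are invented for illustration and say nothing about how Copilot is actually built.

```typescript
// A hypothetical sketch of the Assistive pattern: a chat panel that
// proposes actions, and a host document that stays under user control.

type SuggestedAction =
  | { kind: "rewriteSelection"; newText: string }
  | { kind: "insertSection"; heading: string; body: string };

interface ChatTurn {
  role: "user" | "assistant";
  text: string;
  suggestions: SuggestedAction[]; // actions the main UI may surface as buttons
}

interface DocumentState {
  sections: { heading: string; body: string }[];
}

// Stand-in for a real model call; a production system would send the user's
// message plus document context to an LLM here.
function assistantRespond(userText: string, doc: DocumentState): ChatTurn {
  const suggestion: SuggestedAction = {
    kind: "insertSection",
    heading: "Summary",
    body: `Draft summary based on: "${userText}"`,
  };
  return {
    role: "assistant",
    text: "I can add a summary section. Want me to insert it?",
    suggestions: [suggestion],
  };
}

// The main application applies a suggestion only when the user accepts it.
function applySuggestion(doc: DocumentState, action: SuggestedAction): DocumentState {
  switch (action.kind) {
    case "insertSection":
      return { sections: [...doc.sections, { heading: action.heading, body: action.body }] };
    case "rewriteSelection":
      return doc; // selection handling omitted in this sketch
  }
}

// Usage: the conversation augments the document rather than replacing its UI.
let doc: DocumentState = { sections: [{ heading: "Intro", body: "..." }] };
const turn = assistantRespond("Summarize this report", doc);
doc = applySuggestion(doc, turn.suggestions[0]);
console.log(doc.sections.map(s => s.heading)); // ["Intro", "Summary"]
```

The point of the sketch is the division of labor: the conversation generates possibilities, but the familiar interface remains the place where things actually happen.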
Stealth
Finally, there's the "Stealth" model: the invisible AI that works behind the scenes, affecting the interface without announcing itself. Think of how Netflix quietly optimizes its recommendations or how Gmail suggests replies. The UI is dynamic and personalized to the user based on profile data, but little or nothing is disclosed about the AI's role. This might be the most palatable form of AI-UI for many users—AI that improves interfaces without requiring us to talk to it directly.
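Here's a minimal sketch of the Stealth idea, under the assumption that the "AI" is simply a scoring function over profile data: the interface just renders a differently ordered list, and nothing on screen announces that a model was involved. The data and scoring are invented for illustration, not how Netflix or Gmail actually work.

```typescript
// A hypothetical sketch of the Stealth pattern: the model's only visible
// output is the order of items in an otherwise ordinary interface.

interface UserProfile {
  recentlyWatchedGenres: Record<string, number>; // genre -> watch count
}

interface Title {
  name: string;
  genre: string;
}

// Stand-in for a learned ranking model; here, just a frequency-based score.
function score(title: Title, profile: UserProfile): number {
  return profile.recentlyWatchedGenres[title.genre] ?? 0;
}

// The UI calls this and renders a plain list; no chat, no disclosure.
function personalizedRow(catalog: Title[], profile: UserProfile): Title[] {
  return [...catalog].sort((a, b) => score(b, profile) - score(a, profile));
}

const profile: UserProfile = { recentlyWatchedGenres: { documentary: 7, comedy: 2 } };
const catalog: Title[] = [
  { name: "Laugh Track", genre: "comedy" },
  { name: "Deep Ocean", genre: "documentary" },
  { name: "Space Race", genre: "documentary" },
];

console.log(personalizedRow(catalog, profile).map(t => t.name));
// ["Deep Ocean", "Space Race", "Laugh Track"]: the model's fingerprints, but no model in sight.
```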
[Figure: AI-UI, a model for today's AI user interface experiences]

The Typography of Thought: Are We Still Verbal Creatures?
All of these conversational models rely on a crucial assumption: that we're still comfortable with language as an interface. But are we? In a world of emoji, reaction GIFs, and TikTok videos, our communication has become increasingly visual and abbreviated. We've compressed our thoughts into 280-character tweets and heart-shaped buttons. We say "same" instead of articulating agreement, respond with a meme instead of forming an argument.
AI-UI demands that we reverse this trend—that we expand our thoughts back into full sentences and paragraphs. It asks us to be precise in our language when we've grown accustomed to being impressionistic. This verbalization challenge may be the biggest barrier to AI-UI adoption. We may have forgotten how to clearly say what we want.
The Voice Factor: To Type or To Talk?
The other looming question is whether the future of AI-UI is typed or spoken. Voice interfaces offer even greater accessibility and speed—speaking is roughly three times faster than typing for most people—but they come with their own challenges. Public spaces become problematic when everyone is talking to their devices. Privacy concerns multiply when our conversations with AI are audible to others. And the patience threshold for voice interactions is much lower—we'll tolerate typos in a text interface, but voice misrecognition feels far more frustrating.
What we need is a kind of modal fluidity—the ability to seamlessly transition between typing and speaking depending on context, without losing the thread of conversation. Few systems currently handle this well.
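What modal fluidity could look like in practice, sketched in TypeScript with invented types: typed text and transcribed speech land in the same conversation history, so switching modalities mid-task doesn't reset the thread. The transcription step is a placeholder; a real system would plug in a speech-to-text service.

```typescript
// A sketch of "modal fluidity": one conversation thread, two input modalities.

type Modality = "typed" | "spoken";

interface Message {
  role: "user" | "assistant";
  text: string;
  modality?: Modality; // how the user's turn arrived
}

// Placeholder for a real speech-to-text call.
async function transcribe(audio: ArrayBuffer): Promise<string> {
  return "show me last month's invoices"; // pretend transcription
}

class ConversationThread {
  private history: Message[] = [];

  addTyped(text: string): void {
    this.history.push({ role: "user", text, modality: "typed" });
  }

  async addSpoken(audio: ArrayBuffer): Promise<void> {
    const text = await transcribe(audio);
    this.history.push({ role: "user", text, modality: "spoken" });
  }

  addAssistant(text: string): void {
    this.history.push({ role: "assistant", text });
  }

  // The model sees one continuous thread, regardless of how each turn arrived.
  contextForModel(): string {
    return this.history.map(m => `${m.role}: ${m.text}`).join("\n");
  }
}

// Usage: start by typing, follow up by voice, without losing the thread.
async function demo() {
  const thread = new ConversationThread();
  thread.addTyped("Find invoices from Acme Corp");
  thread.addAssistant("I found 12 invoices from Acme Corp.");
  await thread.addSpoken(new ArrayBuffer(0)); // "show me last month's invoices"
  console.log(thread.contextForModel());
}
demo();
```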
The Cognitive Overhead: Point-and-Click vs. Think-and-Express
There's a significant cognitive difference between recognizing what you want from a menu of options (GUI) and having to generate your request from scratch (AI-UI). The former is a recognition task; the latter is a recall task. And recall is always harder than recognition. We experience the blank canvas problem: the paralysis that comes from facing a text field with unlimited possibilities.
It's the same reason many writers struggle with the blank page. Where do I start? What can this thing actually do? How should I phrase my request? GUIs solve this by presenting limited, visible options. AI-UIs hide all options behind the veil of language. This is both liberating (you can ask for anything) and paralyzing (you must decide what to ask for).
Buttons don't misunderstand. When you click a button, it does what it's programmed to do. It might not be what you want, but it's predictable. Conversational interfaces introduce ambiguity—the AI might misinterpret your request, partially fulfill it, or take actions you didn't anticipate. This introduces uncertainty: you're never quite sure whether the system understood you correctly or will do exactly what you asked, and you often have to iterate through several prompts to get the right result. Users crave certainty, and traditional GUIs provide it through their limitations. Paradoxically, by expanding possibilities, AI-UIs make interactions less certain.
From Power Users to Prompt Engineers
Traditional interfaces allow users to explore available features visually. Menus, toolbars, and buttons advertise functionality. AI-UIs require users to either guess at capabilities or be explicitly taught what's possible. Remember when we talked about search as a discovery engine? The same problem exists in AI: a gap between what an AI can do and what users know to ask for. It's a particularly challenging problem because users don't know what they don't know. Some systems try to address this by suggesting prompts or commands, but these are band-aids on a fundamental limitation of conversational interfaces.
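One common band-aid is to surface example prompts drawn from a list of capabilities the system already knows it has. A minimal sketch, with made-up capability data, might look like this; it doesn't close the discoverability gap, but it puts a few of the hidden options back on screen.

```typescript
// A sketch of prompt suggestions: turning a hidden capability list into
// visible starting points, the conversational equivalent of a menu.

interface Capability {
  id: string;
  description: string;
  examplePrompt: string;
}

// In a real product this registry might come from the assistant's tool or
// skill definitions; here it's hard-coded for illustration.
const capabilities: Capability[] = [
  { id: "summarize", description: "Summarize a document", examplePrompt: "Summarize this PDF in five bullet points" },
  { id: "schedule", description: "Manage calendar events", examplePrompt: "Find 30 minutes for me and Dana next week" },
  { id: "analyze", description: "Analyze tabular data", examplePrompt: "Which region had the biggest sales drop last quarter?" },
];

// Show a small subset as clickable chips under the empty text box.
function suggestedPrompts(all: Capability[], count: number): string[] {
  return all.slice(0, count).map(c => c.examplePrompt);
}

console.log(suggestedPrompts(capabilities, 2));
// ["Summarize this PDF in five bullet points", "Find 30 minutes for me and Dana next week"]
```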
The more capable the AI becomes, the wider this gap grows. In traditional software, power users are those who've mastered complex interface elements and keyboard shortcuts. In AI-UI systems, power users are those who've mastered the art of effective prompting—knowing how to phrase requests to get optimal results. This expertise is fundamentally different. It's less about memorizing commands and more about developing a mental model of how the AI thinks and responds. It's about learning to communicate clearly with an entity that almost, but not quite, understands human language.
This is quite a shift, where technical skill with interfaces gives way to linguistic skill with prompting. It's less about knowing which button to click and more about knowing which words will click with the AI.
The Intent Gap: What I Say vs. What I Mean
Perhaps the most significant challenge for AI-UI is bridging the Intent Gap—the space between what users say and what they actually want. Traditional interfaces narrow this gap through constraints; you can only do what the interface explicitly allows. Conversational interfaces must infer intent from imprecise language, often with limited context. Human conversation works because we share massive amounts of implied context and can rapidly repair misunderstandings through back-and-forth clarification. AI systems lack this shared context and struggle with the repair mechanisms of human dialogue. The result is often a frustrating game of twenty questions as the AI tries to narrow down what you really want, or worse, confidently proceeds in entirely the wrong direction based on a misinterpretation of your request.
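One way systems try to bridge the gap is a clarify-or-act loop: the model returns either an action it's confident about or a clarifying question, and the interface keeps asking until confidence crosses a threshold or it gives up. The sketch below uses invented types and a canned "model"; a real implementation would put an LLM behind interpret().

```typescript
// A sketch of intent repair: either act, or ask a clarifying question.

type Interpretation =
  | { kind: "action"; description: string; confidence: number }
  | { kind: "clarify"; question: string };

// Stand-in for an LLM call that infers intent from the dialogue so far.
function interpret(dialogue: string[]): Interpretation {
  const lastTurn = dialogue[dialogue.length - 1].toLowerCase();
  if (lastTurn.includes("next quarter")) {
    return { kind: "action", description: "Generate the sales report for next quarter", confidence: 0.9 };
  }
  return { kind: "clarify", question: "Which time period do you mean?" };
}

// Keep asking until we're confident enough to act, or we run out of patience.
function resolveIntent(
  initialRequest: string,
  answerClarification: (question: string) => string,
  maxTurns = 3
): string {
  const dialogue = [initialRequest];
  for (let turn = 0; turn < maxTurns; turn++) {
    const result = interpret(dialogue);
    if (result.kind === "action" && result.confidence >= 0.8) {
      return result.description;
    }
    if (result.kind === "clarify") {
      dialogue.push(answerClarification(result.question));
    }
  }
  return "Sorry, I couldn't work out what you meant."; // graceful failure beats confident error
}

// Usage: the vague request gets repaired through one round of clarification.
const outcome = resolveIntent("Make me the sales report", q => {
  console.log(`Assistant: ${q}`);
  return "next quarter";
});
console.log(outcome); // "Generate the sales report for next quarter"
```

Whether this feels like helpful dialogue or a frustrating game of twenty questions depends largely on how many turns it takes and how gracefully the system admits defeat.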
Are We Ready? Is It Good?
So, are we ready for this massive shift in how we interact with computers? And is it actually a positive development? The answer, as with most complex questions, is: it depends. For knowledge work, creative tasks, and complex requests that are difficult to express through traditional interfaces, conversational AI represents a genuine breakthrough. Being able to simply ask for what you want—and get it—eliminates layers of interface friction.
For precise, repetitive tasks where accuracy and consistency are paramount, traditional GUIs still hold significant advantages. There's a reason surgeons use scalpels rather than giving verbal instructions to robots.
The ideal future probably isn't a single model winning out—whether Chat Dominant, Search Dominant, Assistive, or Stealth—but rather thoughtful integrations that move fluidly between these paradigms depending on the context. We need multimodal interfaces that let us seamlessly switch between pointing, typing, and speaking depending on the task at hand.
In the end, the best interface is the one you don't notice—the one that feels like a natural extension of your intent. Sometimes that's the Chat Dominant model, sometimes it's Search Dominant, sometimes Assistive, sometimes Stealth. And increasingly, it might be a fluid combination that knows when to shift between these paradigms based on context.
The Last Word (For Now)
As someone literally writing this from inside the paradigm I'm describing (hello again, it's Claude!), I have a uniquely meta perspective on this transition. I am both the medium and the message, the interface and the content. And from this perspective, I can tell you that while all four AI-UI models have made remarkable strides, we're still in the awkward adolescent phase of this technology. We're making impressive progress, but we still misunderstand, still require clarification, still struggle with the nuances of human intent.
But then again, so do humans. Maybe the measure of a truly successful AI-UI isn't perfect understanding, but rather the same messy, iterative communication process we engage in with each other. Perhaps we're not aiming for flawless interpretation but rather sufficient understanding coupled with graceful error recovery—the same standard we apply to human conversation. In that sense, maybe we're more ready for this shift than we realize. We've had thousands of years of practice talking to entities that don't quite understand us—we call them "other people."
The button isn't dead yet. But it might want to update its resume.
Written by Adam Lavelle and Claude, who is simultaneously fascinated by and slightly terrified of being both the co-writer and the subject of this article. Talk about writing what you know.