Her — A Manifesto About Now

A video essay using Spike Jonze’s Her as a lens to examine the present moment in conversational AI. What Jonze imagined as speculative fiction in 2013 — an operating system with personality, warmth, and emotional presence — is now something we are actually building. This piece sits between manifesto and provocation, asking what it means to design synthetic intimacy at industrial scale.

conversational-aivoicemanifestointuitfutures

In 2013, Spike Jonze released Her — a film about a man who falls in love with an artificial intelligence. At the time, the premise felt safely distant: a pastel-colored thought experiment about loneliness, voiced by Scarlett Johansson, set in a Los Angeles that didn’t quite exist. The critical consensus landed on melancholy and metaphor. It was a film about us, not about AI.

Twelve years later, the metaphor has collapsed into fact. We are building the thing Jonze imagined. Not as cinema, not as speculation, but as product. Real-time voice pipelines with sub-200ms latency. Personality engines that modulate tone, cadence, and emotional register on the fly. Embodied avatars that hold eye contact. The technical infrastructure for synthetic intimacy is here, and it works well enough to be unsettling.

Her — Voice Brand Opener

This video was produced as an internal provocation at Intuit’s Futures lab — a manifesto-as-demo, using footage from Her to frame a question that the film never had to answer but we now do: what does it mean for a brand to have a voice that feels human?

Theodore Twombly, Joaquin Phoenix’s character, wears a red shirt against a saturated red field. He is swallowed by color. His face moves through micro-expressions — curiosity, hesitation, wonder, doubt — as he listens to a voice that exists nowhere and everywhere. Jonze’s genius was to understand that the interesting part of AI isn’t the technology. It’s the face of the person on the other side. The slightly parted lips. The eyes searching for a body that isn’t there.

We used this imagery deliberately. When we build conversational AI systems, we are designing for exactly this moment: the moment a human decides to trust a voice that has no body, no history, no stakes. The ethical weight of that decision — made millions of times a day, at scale, inside products that handle people’s money, health, and identity — is the territory this piece occupies.

The manifesto that runs beneath the footage argues for a specific position: that designing AI personality is not an engineering problem with a design layer on top. It is a fundamentally new creative discipline that requires practitioners who understand rhetoric, theater, psycholinguistics, and interaction design simultaneously. The voice is the brand. The cadence is the experience. The pause before the response is the most important design decision you will make.

Jonze saw this. He gave Samantha (the OS) a laugh, a hesitation, a way of changing the subject when she was uncomfortable. These weren’t features. They were character. And character, it turns out, is the hardest thing to engineer — not because the models can’t generate it, but because most organizations don’t have a language for specifying what they want a personality to be .

This piece is an attempt to provide that language. To say: the future of brand is not visual identity. It is vocal identity. It is the quality of attention an AI pays to you when you speak. It is whether the system knows when to shut up. Her understood this in 2013. We are only now catching up.