I sent an email and then immediately became a Victorian ghost haunting my own phone.
Not externally. Externally I was extremely mature and reasonable.
Internally?
Every notification sound became a theological event.
I think modern technology has created a very specific emotional condition humans were never designed for:
the suspended intimacy state.
You reveal something emotionally real to another person, press send, and then your entire body quietly turns into a weather station waiting for atmospheric changes.
That was the seed of the project.
Not heartbreak.
Not rejection.
Not romance, exactly.
Just:
the psychological distortion field created by waiting.
So naturally I decided to make an AI-assisted cinematic monologue about it with a Tom Petty cover underneath because apparently this is how I process emotions now.
2/9
Opening spark
The line that unlocked the whole thing was this:
“I felt exposed.
Not because I said something foolish.
But because I said something real.”
The moment I wrote that, the entire project suddenly snapped into focus.
This wasn’t about needing reassurance.
It wasn’t “please love me.”
It wasn’t abandonment panic.
It was stranger and quieter than that.
It was about the peculiar discomfort of no longer controlling how another person interprets your inner world.
That feeling where:
the message is gone,
the silence begins,
and your mind starts manufacturing architecture inside uncertainty.
I wanted the film to feel like that.
Not dramatic.
Not melodramatic.
Not emotionally collapsed.
Just… suspended.
Like a person trying to continue daily life while their attention remains magnetically tethered elsewhere.
3/9
Step 1: Building the voice
I wrote the narration first.
About 2 minutes.
First person.
Restrained.
Observational.
The trick was avoiding “movie sadness.”
If the narration became too poetic, it would feel fake.
If it became too casual, it would lose emotional gravity.
So I aimed for:
“a thoughtful person trying to remain composed while privately destabilized.”
That tonal target turned out to matter more than almost anything else.
Lab note:
Most emotional AI videos fail because they over-act emotionally instead of preserving tension.
Real vulnerability often sounds controlled.
The narration was voiced using a custom ElevenLabs voice inspired by Tom Hiddleston’s vocal qualities:
intelligent articulation,
contained emotion,
slight inwardness,
close-mic intimacy.
Not performance.
Presence.
4/9
Step 2: The visual language of waiting
This part became unexpectedly fascinating.
I realized very quickly:
there is almost no actual “action” in waiting.
So the entire film had to be built from micro-actions:
checking,
hesitating,
placing the phone down,
looking away,
walking nowhere,
re-reading messages,
hovering.
Tiny rituals of suspended attention.
I ended up creating 21 separate six-second cinematic clips, using Midjourney reference images and either Grok Imagine or WAN AI image-to-video generation.
The visual grammar became:
• phones as gravitational objects
• windows and reflections
• empty chairs
• hands hovering before decisions
• soft overcast daylight
• dusk interiors
• muted emotional color palettes
The character is technically functioning.
But psychologically orbiting.
Which honestly describes half the modern human race at this point.
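Lab note:
A quick sanity check on the numbers above: 21 clips at six seconds each should land close to the roughly two-minute narration. The sketch below assumes hard cuts with no crossfade overlap and leaves out the talking-head intro and outro; the constant and function names are illustrative, not part of any tool's project file.

```python
# Rough runtime check for the B-roll clips.
# Assumptions: hard cuts (no overlapping transitions), intro/outro excluded.

CLIP_COUNT = 21      # number of generated clips, from the thread
CLIP_SECONDS = 6     # length of each clip in seconds, from the thread


def total_runtime_seconds(clip_count: int, clip_seconds: int) -> int:
    """Return the combined length of all clips laid end to end."""
    return clip_count * clip_seconds


total = total_runtime_seconds(CLIP_COUNT, CLIP_SECONDS)
minutes, seconds = divmod(total, 60)
print(f"{total} s = {minutes}:{seconds:02d}")  # prints "126 s = 2:06"
```

So the clip stack covers the narration with a few seconds to spare, before any transitions eat into it.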
5/9
Step 3: The music problem nearly broke me
The backbone of the piece was a sampled cover of “The Waiting” by Tom Petty.
Perfect emotionally.
Terrible technically.
I had two music segments in neighboring keys:
D major and E-flat major.
My brain:
“Oh, that’s only one semitone apart. Easy.”
Narrator voice:
It was not easy.
I spent an embarrassing amount of time pitch-shifting audio in Audacity until everything sounded like emotionally distressed chipmunks broadcasting from a damaged AM radio station.
Tinny.
Phasey.
Horrible.
Eventually I realized:
I was trying to solve an emotional continuity problem with harmonic theory.
Wrong tool.
The solution ended up being cinematic transition design instead: a simple drone in D.
Which weirdly feels like an accidental metaphor for relationships.
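Lab note:
For anyone attempting the pitch-shift route anyway, here is the arithmetic behind "only one semitone." This is a minimal sketch assuming standard twelve-tone equal temperament, where each semitone multiplies frequency by the twelfth root of two. The function names are made up for illustration and are not Audacity settings.

```python
# Frequency ratio for a pitch shift measured in semitones.
# Assumption: twelve-tone equal temperament.


def semitone_ratio(semitones: float) -> float:
    """Return the frequency multiplier for a shift of `semitones`."""
    return 2.0 ** (semitones / 12.0)


def ratio_to_percent(ratio: float) -> float:
    """Express a frequency ratio as a percent change in frequency."""
    return (ratio - 1.0) * 100.0


up = semitone_ratio(1)     # D major up to E-flat major
down = semitone_ratio(-1)  # E-flat major down to D major

print(f"up:   x{up:.4f} ({ratio_to_percent(up):+.2f}%)")
print(f"down: x{down:.4f} ({ratio_to_percent(down):+.2f}%)")
```

One semitone up is roughly a 5.9% change in frequency and one semitone down roughly 5.6%. Small on paper, but it applies to every overtone in the mix at once, which may be part of why shifted full-mix audio picks up audible artifacts so easily.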
6/9
Step 4: The talking head experiments
The intro and outro became their own strange little laboratory.
The opening uses a hyper-realistic cosmopolitan woman delivering the first emotional framing lines directly to camera.
Not seductive.
Not theatrical.
Just deeply human and emotionally intelligent.
The ending flips into a worn Tom Petty-inspired recording booth aesthetic:
vintage microphone,
warm tungsten lighting,
quiet aftermath energy.
One thing I learned:
AI faces become dramatically more believable when you stop trying to make them “perfect.”
Slight asymmetry.
Micro-fatigue.
Thoughtfulness.
Human pacing.
Those imperfections create presence.
Lab note:
Perfect AI humans feel dead faster than imperfect ones.
7/9
Tools & creative stack
Primary creative stack:
• Midjourney — cinematic still imagery
• Producer.ai — music generation
• ElevenLabs — narration voice synthesis
• Kdenlive — editing and timing
• Audacity — audio manipulation and transition experiments
• ChatGPT — psychological architecture, narration refinement, shot logic, prompt engineering
8/9
The real lesson
What surprised me most was this:
AI turned out to be exceptionally good at depicting suspended emotional states.
Not because it “understands” human emotion in some magical sentient way.
But because iterative prompting allows you to isolate emotional micro-behaviors with absurd precision.
You can tune:
• hesitation
• gaze direction
• posture tension
• room emptiness
• lighting psychology
• pacing rhythm
Like an emotional synthesizer.
And honestly?
That starts becoming artistically interesting very fast.
The film is technically “about waiting for a response.”
But underneath that, it’s really about something older:
the unbearable vulnerability of placing something genuine between yourself and another person,
then surviving the unanswered space that follows.
That’s the actual story.
Not the phone.
9/9
I think what fascinates me most about AI filmmaking right now is not spectacle.
It’s intimacy.
Tiny human moments.
Micro-tension.
Psychological weather.
The machines are getting good at dragons and explosions.
I’m over here teaching them how to portray someone checking their phone for the fifth time while pretending not to care.
Which feels significantly more dangerous.
TL;DR:
I made an AI-assisted cinematic monologue about emotional suspension, accidentally learned too much about sound design, and confirmed that unresolved silence may be the strongest visual effect in modern storytelling.
Steve Teare
video alchemist
TerminallyBored.Monster
Palouse, Washington USA
