I thought I was directing. Turns out I was negotiating.

I was sitting in the dim glow of the monitor at 1:17 a.m., the ice in my drink long since melted beside me, when I finally admitted it out loud:

“This AI doesn’t want to make a detective. It wants to make a fantasy.”

Lissette kept coming back wrong. Sometimes too polished. Sometimes too soft. Sometimes her skirt was suddenly mid-calf like she was heading to a church picnic instead of a crime scene. I’d type “mid-thigh pleated skirt” until my fingers hurt, and the machine would politely nod… then hand me something completely different. Usually a miniskirt. I wanted consistency in her wardrobe. The audience needs clichés as shortcuts to follow who is who in a 3-minute episode.

The quiet frustration was familiar. Not rage. Just that low, persistent hum of “we’re not speaking the same language yet.”

Lab note: The real work of AI art isn’t prompting. It’s learning how the machine sees the world — and then gently, persistently, correcting its vision.

The Long War with Photorealism

For the first stretch I chased photoreal. Dramatic noir lighting, cool shadows, warm skin tones, the whole moody package. The still-image results looked expensive but felt dead when converted to motion. Lips didn’t sync. Fabric behaved strangely. The characters felt like beautiful statues instead of living, tired, slightly dangerous people. The biggest problem was the character morphing into someone unrecognizable. An eerie, monster-like mutation. Not good.

Then I swung hard into painterly territory — Ross Tran, Artgerm, WLOP. Gorgeous stills. But the moment I tried to turn them into video, the magic evaporated. Too dreamy. Too floaty. Too blob. The micro-motions turned into mush.

I was stuck between two extremes and neither one wanted to play nice with vertical short-form storytelling.

The Comic Book Pivot

That’s when I started throwing Midjourney --sref numbers like darts, after a little online research into detective noir style references.

Most missed badly. Some got warm. Then I landed on --sref 1738164590 --sv 4.

Suddenly the images had structure. Cleaner lines. A graphic novel energy that still felt premium. Lip sync in WAN actually held. Grok conversions behaved. The style had enough stylization to forgive imperfections, but enough definition that Lissette started to feel like a real character instead of a digital ghost. And it felt nostalgic. Bonus.

REFERENCE: https://sref-midjourney.com/style/noir

Of course, the AI still had opinions.

The skirt length battle was brutal. Mid-thigh, mid-calf, sometimes a random long dress. No matter how explicitly I described it, the model kept defaulting toward whatever was most popular in its training data (always a miniskirt). In the end, the only reliable solution was to give in and put her in a miniskirt. Inconsistency was more disturbing than immodesty.

THE SAME PROMPT with different style references: full body shot, beautiful woman kneeling on both knees on the floor beside a luxurious leather couch in a high-end apartment crime scene, one hand lifting the dust ruffle to look underneath the couch, other hand resting on the floor for balance, graceful athletic build, dark red shoulder-length wavy hair, sharp intelligent green eyes, confident focused expression, wearing an open camel-hair blazer in rich warm camel color soft brown with golden-beige undertones, white blouse with modest neckline, navy-blue pleated mid-thigh skirt, black 4-inch stiletto high-heels, professional elegant detective, dramatic cinematic lighting, painterly digital illustration, textured brushwork, glossy skin highlights, stylized proportions --no photorealistic, photo, realistic skin pores, blurry face, long hair covering face, cleavage, deep neckline

Sorry ladies. That’s how the internet-trained eye works. Male-dominated popularity wins again. This is how the AI interprets a mid-thigh skirt.

Even the negative prompts meant to exclude the offending wardrobe were ignored.

This simple reference style won:

Lab note: Sometimes you don’t win the fight. You just choose which compromise keeps the project breathing.
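
For anyone who wants to repeat the same-prompt, different-reference test without retyping everything, here is a minimal sketch of how the variants could be assembled. It only builds text to paste into Midjourney by hand; it does not call any Midjourney API. The function name `build_prompt`, the shortened base prompt, and the two extra sref codes are my own placeholders for illustration. Only 1738164590 comes from this project.

```python
# Sketch only: assembles prompt strings for manual pasting into Midjourney.
# build_prompt and the placeholder sref codes are hypothetical, for illustration.

# Abbreviated stand-in for the full Lissette prompt used in this post.
BASE_PROMPT = (
    "full body shot, professional elegant detective, "
    "navy-blue pleated mid-thigh skirt, dramatic cinematic lighting"
)

# Things to exclude, passed to Midjourney's --no parameter.
EXCLUDE = ["photorealistic", "blurry face", "deep neckline"]


def build_prompt(base, sref, sv=4, exclude=()):
    """Append the style reference, style version, and --no list to a base prompt."""
    parts = [base.strip(), f"--sref {sref}", f"--sv {sv}"]
    if exclude:
        parts.append("--no " + ", ".join(exclude))
    return " ".join(parts)


# 1738164590 is the code from this post; the other two are made-up placeholders.
SREF_CANDIDATES = [1738164590, 1111111111, 2222222222]

if __name__ == "__main__":
    for code in SREF_CANDIDATES:
        print(build_prompt(BASE_PROMPT, code, exclude=EXCLUDE))
        print()
```

Keeping the base prompt fixed and changing only the sref code makes it easier to tell whether a difference in the result comes from the style reference or from the wording.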

What We’ve Learned So Far

1. Photorealism is beautiful but often unforgiving in motion and consistency.

2. Heavy painterly styles look incredible as stills but often collapse in video.

3. Comic book / graphic novel stylization gives us the best balance of consistency and workable motion right now.

4. --sref is currently the most powerful tool we have for locking visual language.

5. The AI will always drift toward the path of least resistance — which usually means whatever got the most clicks.

Current Creative Stack

Midjourney with --sref 1738164590 --sv 4 — primary style lock
Grok Imagine — bulk image-to-video
WAN — precision shots and lip sync
ElevenLabs — custom Lissette voice (sexy, calm, observant, slightly world-weary)
Producer.ai — background music
Audacity + editing tools — slicing and final assembly

The Real Lesson

I thought I was building a murder mystery series.

What I’m actually doing is learning how to have a long, patient conversation with a machine that has very strong, very internet-shaped opinions about how women should look.

The skirt fight taught me humility. The successful lip sync test taught me hope. We’re not there yet, but for the first time the pixels are starting to move in the right direction — and Lissette is finally beginning to feel like *her*.

There’s still plenty of friction ahead. But friction, I’m learning, is where the interesting work happens.

TL;DR: Photorealism and painterly styles fought me hard. A comic book stylization using --sref 1738164590 --sv 4 finally gave us decent consistency and workable lip sync. The skirt length battle ended in a strategic retreat to a miniskirt. Progress is messy, but real.

— Steve Teare
video alchemist

TerminallyBored.Monster
Palouse, Washington