I was alone in the studio, headphones on, the kind of late-night quiet where you can hear your own pulse.
Elton John's "Tiny Dancer" slipped through the headphones, that soft piano intro rising like fog off wet pavement. I thought: what if I made it breathe differently? Not a straight cover, but a needle drop. Two voices, male and female, calling and responding across a dark cinematic score. Rainy neon streets. Shadowy detectives. The kind of tension that sits right under your collarbone.
It sounded so clear in my head. The reality was fourteen rounds of AI refusal, a manual pitch shift that felt like surgery, and a last-minute key compromise that somehow made everything better. This is the story of how a 47-second clip turned into something that feels alive.
The Setup: Two Covers, One Vision
I started simple. Two covers. Nolan Neal's live take (raw, vulnerable male voice) and Florence + the Machine's version (ethereal female power). Both trimmed to the first two verses, exactly 47 seconds, in Audacity on Linux. Clean cuts, no extra reverb, just the voices floating naked.
Lab note: I cut it first to reduce processing time and save credits during the next step. The original song is about 3 minutes.
Then I processed them through lalal.ai to isolate the vocal stems. The separation was eerily good. Suddenly I had pure male and female lines ready to weave together like a conversation in a dimly lit alley.
Lab note: There’s a tiny thrill when stems come back cleaner than you dared hope. It's like the AI handed you the bones and said, "Now make it dance."
Next I ran both through tunebat.com to determine key and beats per minute. Nolan's was locked in B major. Florence sat closer to where I needed her. I knew I wanted the whole piece centered in C major for that bright-yet-haunted lift against the dark instrumental I was dreaming up. So I opened Nolan's stem again in Audacity.
The Pitch-Shift Grind
Changing B major to C major without touching tempo is straightforward on paper: Effect > Pitch and Tempo > Change Pitch. I set it from B to C (+1 semitone), checked the high-quality stretching box, and hit OK. It worked. Mostly. There were a couple of tiny artifacts on the louder notes. Audacity is honest like that. But nothing that broke the spell.
I listened, eyes closed, feeling the new key settle into my chest. It was close enough to call it home.
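For the curious, here is the arithmetic behind that one-semitone move, as a minimal Python sketch. It assumes twelve-tone equal temperament, where each semitone multiplies frequency by the twelfth root of two. The function names are my own for illustration and have nothing to do with Audacity's internals.

```python
# Assumption: 12-tone equal temperament; names below are illustrative only.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def semitone_shift(from_note: str, to_note: str) -> int:
    """Smallest signed shift in semitones from one note name to another."""
    diff = (NOTE_NAMES.index(to_note) - NOTE_NAMES.index(from_note)) % 12
    # Prefer the shorter direction: B -> C is +1, not -11.
    return diff - 12 if diff > 6 else diff

def frequency_ratio(semitones: int) -> float:
    """Frequency multiplier for a shift of the given number of semitones."""
    return 2 ** (semitones / 12)

shift = semitone_shift("B", "C")
ratio = frequency_ratio(shift)
percent = (ratio - 1) * 100
print(f"B -> C: {shift:+d} semitone, ratio {ratio:.4f}, {percent:.2f}% higher")
```

So a shift from B up to C raises every frequency by a little under 6 percent. That is a small nudge, which may be why the artifacts stayed minor.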
The AI Orchestra That Wouldn’t Cooperate
Now came the base layer. I fed Producer.ai this prompt generated in Grok:
Instrumental only, no vocals, cinematic neo-noir murder mystery theme in b major. Dark atmospheric orchestral score with jazz influences. Moody and suspenseful, blending classic film noir tension with modern neo-noir edge. Slow to mid-tempo (around 70-85 bpm), building subtle dread and mystery.
Instrumentation: sultry tenor saxophone solos, muted trumpet, dark piano chords and arpeggios, upright bass, brushed drums, lush low strings, distant atmospheric pads, subtle cinematic percussion, occasional vibraphone or harp accents for unease.
Evoke rainy neon-lit city streets at night, shadowy detectives, hidden secrets, and impending danger. Tense, brooding, elegant yet ominous, with chromatic harmonies and unresolved chords for a psychological thriller feel. High-quality cinematic production with reverb and film grain atmosphere.
B major key, instrumental cinematic neo-noir jazz orchestral hybrid, murder mystery soundtrack style.
I asked for conversion to C major fourteen different times. Fourteen. The model kept sliding into its comfort-zone minor keys or just ignoring the root note entirely. After the last failure I finally surrendered and accepted an A-minor version. It wasn’t what I ordered, but it felt right. Brooding, unsettled, perfect for the shadowy vibe I was chasing.
Lab note: Producer.ai admitted key adherence is still one of its challenging frontiers. Genre gravity pulls it toward D minor or G minor for noir jazz. I get it. The model feels the mood more than it reads the theory. That friction taught me more than any perfect match ever could.
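One plausible reason the A-minor result still fit the C-major plan: A natural minor is the relative minor of C major, so the two keys use the same seven notes. The sketch below checks that, and compares against the B major of the original stem. It is a simplification on my part. It assumes plain major and natural minor scales, while the actual instrumental leans on chromatic harmony and may stray from them.

```python
# Assumption: plain major and natural minor scales, sharps-only note spelling.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]          # whole/half-step pattern
NATURAL_MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]

def scale(root: str, steps: list[int]) -> list[str]:
    """Build the seven notes of a scale from its root and step pattern."""
    idx = NOTE_NAMES.index(root)
    notes = [root]
    for step in steps[:-1]:  # the last step only returns to the octave
        idx = (idx + step) % 12
        notes.append(NOTE_NAMES[idx])
    return notes

c_major = scale("C", MAJOR_STEPS)
a_minor = scale("A", NATURAL_MINOR_STEPS)
b_major = scale("B", MAJOR_STEPS)

print("C major:", c_major)
print("A minor:", a_minor)
print("Same notes:", set(c_major) == set(a_minor))
print("B major shares with A minor:", sorted(set(b_major) & set(a_minor)))
```

By this measure a C-major vocal line sits inside the A-minor note set, while an unshifted B-major stem would share only two notes with it.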
Layering the Call-and-Response
With the A-minor instrumental locked, I dropped the pitch-shifted male stem and the female stem on top, in true call-and-response. His line answers hers, hers echoes his, the voices braiding like two strangers trading secrets under a streetlight. The timing was tight. I nudged a few breaths in the Kdenlive timeline until the exchange felt alive instead of mechanical.
Then the intro. I stacked three A-minor drones from my private sample library, low, warm, almost subliminal. They sit under everything like distant thunder you feel more than hear.
Finally I added two free gems from multiplysound.com: a soft violin basic (Am) that drifts like smoke and a vocal cadence (A) that feels like a held breath. The latter sits right at the intro, giving the piece its first heartbeat.
Tools & Creative Stack
- Audacity (Linux) – cutting, pitch shifting, initial assembly
- lalal.ai – vocal stem separation
- tunebat.com – key and BPM detective work
- Producer.ai – the moody instrumental base (after much negotiation)
- multiplysound.com – free A-minor violin and vocal cadence samples
- Kdenlive – final timeline layering and rendering
Total cost: basically zero beyond the lalal.ai credits I already had. The rest was time, stubbornness, and happy accidents.
What I Actually Learned
AI will fight you on the details. It nails vibe, texture, and atmosphere way before it nails strict music theory. That’s not a bug. It's a feature. The moment I stopped forcing C major and let A minor in, the whole track exhaled. The unresolved tension I wanted was suddenly there, breathing on its own.
Happy accidents aren’t accidents when you’re directing an AI orchestra. They’re the places where the machine's intuition collides with yours and something neither of you planned shows up.
Serendipity beats perfection. Every single time.
TL;DR: Planned for C major. Fought Producer.ai for fourteen rounds. Settled on A minor and layered call-and-response vocals over a jazz-noir bed. The key compromise unlocked the magic. The piece now breathes like a rainy-night fever dream.
I sat back last night and hit play one more time. The drones swell, the violin sighs, the two voices trade lines over brushed drums and distant sax, and for under three minutes the piece just lives. It's not what I sketched in my head. It's better. It breathes.
Steve Teare
video alchemist
TerminallyBored.Monster
Palouse, Washington USA
BONUS: I tested the music as an MP4 upload to YouTube, and it passed the copyright check. None detected.
