Now I'm still waiting for someone to succeed at a clean-room recreation of Majel Barrett's voice, so we can finally have computers sound like they always should have.
We could've been there a decade ago, but the high-quality audio samples, made officially and specifically with the possibility of this use in mind, got trapped somewhere between the estate, the producers, and a commercial interest that called dibs and then procrastinated on the project instead.
I did this. She recorded clean (imo, i cleaned it up) audio for “Star Trek: The Next Generation Interactive Technical Manual” which is available on archive.org.
Nurse Chapel, and "Number One"* from the original series' original pilot, The Cage. Both of these characters are main cast in SNW; sadly, no mind-swap plot with these two has happened yet.
* I don't think she had a full name at that point?
I just yeeted a bunch of extremely noisy fragments into elevenlabs, and it came out pretty good on their cheap $5 plan. If you're after this for your own amusement, let me know if you want a screencap, or a dump of the source files.
Obv no clean room reconstruction but good enough for personal use...
I have lots of super high quality, clean audio recordings of her, ripped from an old video game that she did voice work for. I've tried various TTS models with them over the years. Getting the pitch and tune is easy, but getting the impersonal, detached, robot-y feeling is kinda tricky. But I haven't tried in the past 6 months, so maybe it's time to give it another shot.
the inflection and impersonal feel is definitely hard to get right. there are parameters in the elevenlabs API docs to make the voice more stable (= monotonous; see speak.sh in that repo) but still the voice cloner on my $5 plan doesn't really get it right.
nevertheless... i'm still having a lot of fun with this.
edit: if I am forced to rot my brain with the 10x productivity boosting slop gun, at least I'll do it grinning
> pod cleaned up. waiting on the behemoth to finish grinding through Italy.
< if only postgres had progress indicators
... then they coulda called it progresql
> lmaooo
> Bash(~/speak.sh "Joke detected. Humor subroutine engaged. Ha. Ha. Ha.")
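For anyone curious about the stability knob mentioned above, here is a rough sketch of what such a request could look like. It is not what speak.sh does (that script isn't shown here), and the endpoint, the `xi-api-key` header, and the `voice_settings` fields are written from memory of the ElevenLabs docs, so check the current docs before relying on them. `build_tts_request` and the default values are made up for illustration. The request is only built, not sent.

```python
import json
import urllib.request

# Assumption: endpoint shape as remembered from the ElevenLabs API docs.
API_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text, voice_id, api_key, stability=0.9, similarity_boost=0.75):
    """Build (but do not send) a TTS request with a high stability setting.

    Hypothetical helper. Higher stability is meant to give a flatter,
    more monotonous delivery; the defaults here are guesses, not tuned values.
    """
    payload = {
        "text": text,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        API_URL.format(voice_id=voice_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_tts_request("Working.", "YOUR_VOICE_ID", "YOUR_API_KEY")
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the result to `urllib.request.urlopen` would actually send it; the response body should then be the audio bytes, assuming the API still works the way I remember.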
“Director John Badham states in the commentary that the actor voicing the raw content that was later modified for the computerized effect was John Wood (the Falken character), reading the script word-for-word in reverse order in order to portray a "flat quality" with limited inflection. That raw audio was then edited and re-assembled after being run through audio processing equipment to achieve the desired effect.”
Apparently John Wood read the lines in reverse word order to make the enunciation weird. If you train a model, feed it the lines you want in reverse word order, then split the generated audio on silence and put the pieces back in the original order, and you should come close.
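A toy sketch of that reverse, split, and reassemble idea, in plain Python. Everything here is illustrative: the function names, the threshold, and the gap lengths are made up, audio is just a list of float samples, and it assumes the TTS output has a clear pause between every word, which real output often won't.

```python
def reverse_word_order(line):
    """Return the line with its words in reverse order (text fed to the TTS)."""
    return " ".join(reversed(line.split()))


def split_on_silence(samples, threshold=0.01, min_silence=3):
    """Split samples into non-silent chunks.

    A run of at least `min_silence` samples quieter than `threshold`
    ends the current chunk; shorter quiet runs stay inside the chunk.
    """
    chunks, current, quiet = [], [], []
    for s in samples:
        if abs(s) < threshold:
            quiet.append(s)
            if len(quiet) >= min_silence and current:
                chunks.append(current)
                current = []
        else:
            if current:
                current.extend(quiet)  # brief pause inside a word: keep it
            quiet = []
            current.append(s)
    if current:
        chunks.append(current)
    return chunks


def restore_word_order(chunks, gap=2):
    """Concatenate chunks in reverse order, with `gap` silent samples between."""
    out = []
    for i, chunk in enumerate(reversed(chunks)):
        if i:
            out.extend([0.0] * gap)
        out.extend(chunk)
    return out


if __name__ == "__main__":
    print(reverse_word_order("shall we play a game"))
    # Stand-in for audio of three "words" spoken in reverse order.
    fake_audio = [0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0,
                  0.9, 0.0, 0.9, 0.0, 0.0, 0.0, 0.3]
    words = split_on_silence(fake_audio)
    print(restore_word_order(words, gap=1))
```

With real audio you'd swap the list-of-floats part for an actual audio library and tune the threshold by ear; the word count after splitting should match the word count of the text, otherwise the split went wrong.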