this post was submitted on 25 Aug 2024
86 points (93.0% liked)

No Stupid Questions


I have several tapes (yes actual cassette tapes) of my grandfather reading a novel.

Unfortunately a few of the tapes have degraded to the point that I cannot play them back.

I would love to recreate his voice, to "rerecord" the missing bits.

The recordings are in Danish.

Is this possible?

If it is, how can I go about it?

[–] Deestan 49 points 2 weeks ago

While tools exist, like people already commented, remember that the result may not be what you expect.

A recreation, whether by AI or a skilled voice actor, will have slightly different intonations, emphasis, tempo variations, pauses and lack of pauses that are not your grandfather's. It is very likely to feel flat and wrong in an unpleasant way.

[–] corroded 24 points 2 weeks ago (1 children)

I can't speak to the AI voice generation part of this, but you might be interested in the Domesday Duplicator for digitizing your audio, especially if some of it is slightly degraded.

https://github.com/harrypm/DomesdayDuplicator

The project was originally designed for laserdisc, but it's been expanded to support VHS and cassette tape. Traditionally, you would play your tape on a cassette player, then the built-in analog circuitry would convert the magnetic signals into audio, amplify them, and feed them to a sound card on your PC, which then converts the analog signal to a digital audio stream.

With the Domesday Duplicator, you record the raw magnetic signal from the read head and directly digitize it into a bitstream that you can then process as needed. For DIY archiving from an analog source, it's one of the best options for signal fidelity, and it will give you the truest representation of what's actually on the tape.

[–] boojumliussnark 4 points 2 weeks ago

This project looks very cool, but at 300+ USD I think it is out of my budget range for this project.

[–] [email protected] 17 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

Maybe the term you are searching for is "AI voice cloning". The engine of https://elevenlabs.io/voice-cloning claims to be able to understand and reproduce even Danish.

Edit: They seem to require some voice verification to make sure the voice is yours. Which is odd in your case.

https://speechify.com/da should let you recreate the voice of "your beloved one"; at least they mention it on their German page.

[–] boojumliussnark 8 points 2 weeks ago* (last edited 2 weeks ago) (4 children)

I did sign up for ElevenLabs. Unfortunately, they cannot allow me to clone a dead person's voice, as per their FAQ:

You may only clone your own voice or a voice you have the rights to clone. For added security, when creating a Professional Voice Clone we require users to complete a Voice Captcha mechanism by reading a text prompt within a specific time to confirm your voice matches the training samples you upload for training. If there’s a match, your request is sent for fine-tuning. If not, you’ll have to reach out via our help center to have your voice verified manually.

Now I'm sure it wouldn't be an issue to get the legal rights, but when I spoke to their support, they did not have any way to verify beyond the captcha.

[–] tlou3please 9 points 2 weeks ago

Would it be worth reaching out to them on social media? Something like this would be great PR for them.

[–] Dkarma 4 points 2 weeks ago* (last edited 2 weeks ago)

Just find a GitHub project that does AI voice replication... it would be free.

Check it:

https://github.com/topics/voice-cloning

Some of these may have the advantage that you can use your own voice as the carrier for the words, but they'll come out sounding like him, so you can reproduce the inflections he would have used.

Private message me if you have questions. I'd love to dig into this, but I don't read thread responses.

[–] [email protected] 4 points 2 weeks ago

If you have the equipment (mainly a capable Nvidia GPU), doing voice cloning locally is the way to go if you keep running into legal issues. Plus, with everything on your own computer, you may be able to tweak and try different methods to get the best results for your needs. A year ago this would have been a maybe, but there's a lot out there to look at and try now. See what others have done first in videos and such, and follow their lead.

[–] [email protected] 3 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

Maybe https://speechify.com/da/ works. At least they mention the recreation of the voice of "your beloved one" on their German page.

[–] boojumliussnark 2 points 2 weeks ago (1 children)

I can't find this. Where is it on the German page?

[–] [email protected] 2 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

#### Lasting moments
Clone the voice of a loved one and have it read your favorite memories or stories to your children.

#### Is there software that can imitate someone's voice?
Speechify AI Voice Cloning can clone any voice in seconds. All the AI needs is to listen to the voice for about 30 seconds. Once it has sampled a person's voice, it can use the sampled voice to read long documents, create podcasts, and much more. Do you have a loved one whose voice you would like to hear again? Simply convert any text into their voice. Create audio podcasts or voice-overs. Now you can speak for hours in your own voice, without saying a single word.

https://speechify.com/de/voice-cloning/

Alternatively, https://www.resemble.ai/voice-cloning/ could do it, as they state:

#### Can I clone anybody's voice?
While Resemble AI empowers users to create AI replicas of various voices, it's essential to adhere to ethical guidelines and obtain proper consent before cloning someone's voice. Respect for privacy and intellectual property rights is paramount in utilizing our technology. Please read our Ethics page for more details.

However, Danish is only in their 'Pro' tier list, so you should perhaps try it with some English text first.

[–] boojumliussnark 2 points 2 weeks ago

Thank you again for diving into this. It looks like Speechify doesn't support Danish.

Resemble AI has a requirement similar to ElevenLabs':

For professional voice cloning through data upload, we require explicit, verifiable consent from the voice talent. This involves providing a clear audio consent statement along with the training data, so that we can confirm the identity. By uploading voice data, you're confirming that you have such consent, which should align with our guidelines. The consent recording must follow our template, e.g., "I acknowledge my recordings will be used by [Your Company] to create a synthetic voice by Resemble AI." For any questions regarding consent, please reach out to us.

I will try to contact them regarding my use, which is neither commercial, nor using "voice talent" exactly.

[–] hperrin 10 points 2 weeks ago

The very first thing you should do is get them professionally digitized, so that what you have can't degrade any further. Then you can try training a voice AI, and as long as you have the digitized version, you can always train whatever new AI is invented in the future.

[–] Boozilla 9 points 2 weeks ago (1 children)

I've been able to generate very good results with this open source project. You need a pretty good Nvidia GPU, and it takes some time and tedious work to get it working the way you want it to:

https://github.com/neonbjb/tortoise-tts

Some voices sound exactly right. Others sound like a broken robot. The main reason I like it is that I can run it locally without having to sign up for some stupid cloud service.

[–] boojumliussnark 1 points 2 weeks ago (1 children)

Looks very cool. I was unable to see anything regarding languages. Is it completely language independent somehow, or is it English only?

[–] Boozilla 1 points 2 weeks ago

I have only used it with American English. Oddly, it will sometimes slip into a British accent. I believe it is possible to retrain it on other languages, but I have not done the deep dive required to do so.

[–] poleslav 6 points 2 weeks ago (1 children)

If you can get them into a digital format: I've personally used ElevenLabs to clone voices and make narrations for missions I created for a video game. I tried getting different open source projects to run on my own machine, to no avail, but ElevenLabs has been solid (it is unfortunately paid software, something like $5 to $10 a month).

[–] boojumliussnark 2 points 2 weeks ago (1 children)

Was this with the "Instant voice clone"?

[–] poleslav 1 points 2 weeks ago
[–] [email protected] 4 points 2 weeks ago

Voice cloning is still kinda early, especially in the open source area. Whatever you may find now or later (it's a rapidly developing field), the first and most important step is to digitize the cassettes into actual audio files to prevent further degradation or loss. Make sure to back those files up too, preferably in the cloud I guess.
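For the backup part, one way to confirm that a copy really matches the originals is to compare checksums. This is only a minimal sketch using Python's standard library; the folder layout, the `*.wav` pattern, and the function names are my own assumptions, not anything a specific backup tool requires.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large recordings don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(folder: Path, pattern: str = "*.wav") -> dict[str, str]:
    """Map each matching file name in the folder to its SHA-256 checksum."""
    return {p.name: sha256_of(p) for p in sorted(folder.glob(pattern))}


def verify_copy(original: Path, backup: Path, pattern: str = "*.wav") -> list[str]:
    """Return the names of files that are missing or different in the backup."""
    source = build_manifest(original, pattern)
    copy = build_manifest(backup, pattern)
    return [name for name, digest in source.items() if copy.get(name) != digest]
```

An empty list from `verify_copy` means every original file has an identical twin in the backup folder; any name it returns is worth copying again.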

You should not need anything special to digitize the tapes either. A simple cassette player with a 3.5mm audio output fed into the mic input on your computer, plus recording software such as Tenacity, should be enough (start recording in the program, then play the cassette).
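Once you have a recording exported as a WAV file, it may be worth checking that the input level wasn't too hot, since a clipped capture can't be repaired afterwards. Here is a rough sketch with Python's standard library; it assumes a 16-bit PCM export, and the function name and the clipping threshold are just my choices for illustration.

```python
import array
import sys
import wave


def inspect_wav(path: str) -> dict:
    """Report duration and peak level of a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        frames = wav.getnframes()
        samples = array.array("h", wav.readframes(frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Peak is taken over all channels together.
    peak = max((abs(s) for s in samples), default=0)
    return {
        "seconds": frames / rate,
        "peak": peak / 32768,      # 1.0 means full scale
        "clipped": peak >= 32767,  # any sample at full scale suggests clipping
    }
```

If `clipped` comes back true, lowering the input gain and recording that side again is probably safer than trying to fix it in software.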

[–] Grimy 4 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

ElevenLabs is currently the best, but you can get some very good results by running XTTS first and then RVC as a second pass. It involves fine-tuning models and running things with Python and notebooks, so it requires some know-how.

You can explore more models on the huggingface page https://huggingface.co/models?pipeline_tag=text-to-speech&sort=trending

Most have a huggingface space dedicated to them where you can try them, here is the xtts space for example https://huggingface.co/spaces/coqui/xtts

The language adds another layer of difficulty. I would try their demo first to see if it gives anything workable, but Danish isn't a language current TTS software caters to, and sadly it doesn't seem to be an available option on XTTS.
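If you want to try the XTTS step locally rather than in the demo space, here is a rough sketch. It assumes the Coqui `TTS` Python package is installed and that the `xtts_v2` model name below is still current. Since Danish doesn't appear to be an option, it defaults to English as a dry run. The 200-character chunk size and both function names are my own assumptions, and the RVC second pass isn't covered.

```python
import re
from pathlib import Path


def split_sentences(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept whole.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize(text: str, speaker_wav: str, out_dir: str, language: str = "en") -> list[Path]:
    """Write one WAV per chunk, cloned from the reference clip speaker_wav."""
    from TTS.api import TTS  # imported lazily: heavy dependency, downloads the model

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, chunk in enumerate(split_sentences(text)):
        target = target_dir / f"chunk_{index:03d}.wav"
        tts.tts_to_file(
            text=chunk,
            speaker_wav=speaker_wav,
            language=language,
            file_path=str(target),
        )
        paths.append(target)
    return paths
```

Something like `synthesize("A short test sentence.", "grandfather_sample.wav", "out")` would be the first thing to try, where `grandfather_sample.wav` is a hypothetical clean clip cut from one of the digitized tapes.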

[–] boojumliussnark 4 points 2 weeks ago (1 children)

Thank you for the tips. As I see it currently, I expect the language to be the biggest hurdle. It doesn't look like something I can add myself, even if I had the data for a model. So as far as I can tell, it involves two steps that are currently more or less impossible: getting the model data and teaching the language to the model.

[–] Grimy 2 points 2 weeks ago* (last edited 2 weeks ago)

If you have material with him speaking in English, you might be able to train an XTTS model on it and then use that to bypass the ElevenLabs captcha, but I'm not sure if they give enough time. Although GPU rental is cheap these days, so captcha time is less of a factor.

If anything, the tech is moving quite fast, it will definitely be easier in a few years, maybe even months.

[–] [email protected] 2 points 2 weeks ago

There are a lot of search results for "AI voice clone". I can't personally speak to how effective any of them are, but it might be worth a look.

[–] InSamsara -1 points 2 weeks ago

Clone his vocal cords and get surgery to replace your old vocal cords with the new cloned ones.