this post was submitted on 08 Nov 2023
83 points (87.4% liked)
Showerthoughts
Yeah, but Vocaloids suck, and I've recently heard AI singing that made me double-check because it was so good.
Does this suck? To my ears, it doesn't. Not unmistakably human by any stretch, but still pretty good. And that's from 9 years ago.
And by "AI singing" do you mean "a famous voice overlaid on another singer's performance" or something closer to text-to-speech (text-to-song)?
I don't understand the language, nor am I familiar with that style, so I couldn't really judge.
I'm not sure about your second point. I'll keep it in mind, and the next example I come across, I'll come back here to share.
Well, if you're talking about the newest AI-powered UTAU voicebanks, that's because the developers finally thought about crossing the streams: instead of having the singers merely pronounce syllables at several pitches, they used that data (expanded to also include several syllable clusters) to train an AI. Most trained AI voice models use samples recorded from live performances, so they vary in quality and in how many data points they have for each individual syllable. These voicebanks, by contrast, have the full set of voice training data prerecorded by design, so every possible combination of phonemes comes out as clear as possible.
That's very interesting. Where can I read more about it?
https://dreamtonics.com/en/synthesizer-v-ai-announcement/