this post was submitted on 05 Jul 2023
Asklemmy
I remember some years back there was a news story about a chatbot passing the Turing test. The researchers had made their chatbot impersonate a young Russian boy, so the native-English-speaking test subjects were more likely to put its limitations down to age and a language barrier than to it being non-human. So it wasn't actually that impressive.
That will likely be the first kind of thing we'll see for an artificial voice-chatbot as well. It's a big world and many of the people I talk with on Discord (and even IRL) are not native English speakers and not from my country.
I'm not intimately familiar with the accents and speech patterns from everywhere in the world, so I'm conditioned to shrug off a lot of "strange" language. Because of this wide range of human speech patterns, I'm not confident that I could tell real voices from generated ones with low enough false-positive and false-negative rates in practice.
I haven't really dug into the latest voice-generation AI yet, so I'm not sure how capable off-the-shelf programs are. I am familiar with the general techniques, though, and I think adding realistic inflection is within reach. I don't think it's possible to automate the entire pipeline yet, at least not with publicly available programs, but the field is advancing quickly, so I can't take much solace in that.