this post was submitted on 12 May 2024
79 points (93.4% liked)

Selfhosted

Hey all,

Almost as impressive as the LLMs themselves these days is the voice that ChatGPT uses, with its emphasis, dramatic pauses, umms, etc.

I would love to integrate that with a self-hosted Llama3 engine.

Is there a project that y'all have heard of?

top 12 comments
[–] [email protected] 14 points 6 months ago

Regarding the TTS specifically, I remember looking into TorToiSe TTS back when this stuff was first coming out. You can generate ElevenLabs-quality audio with it, but it's insanely slow. In fact, when I was looking into it, it seemed like ElevenLabs may have been using a (much faster at the time) version of TorToiSe TTS, given that the output is so similar.

According to the linked GitHub page, they seem to have solved the speed issues now, so it might be worth looking into. Of course, the other commenters have provided solutions that are pre-integrated with the LLM, but if you're just looking for TTS this could be worth checking out. Also worth noting that it requires an NVIDIA GPU.

[–] requiem 12 points 6 months ago (1 children)
[–] [email protected] 2 points 6 months ago

This is what OP is looking for. It exists! Other repos only cover part of it (e.g. either Ollama or TTS).

[–] [email protected] 10 points 6 months ago

You mean just the text-to-speech part? Look into Piper.
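For anyone who wants to try Piper from a script, here is a minimal sketch, not a tested integration. It assumes the `piper` command-line binary is on your PATH and that you have downloaded a voice model; the model filename and the helper names (`split_sentences`, `piper_command`, `speak`) are made up for illustration. The sentence splitting is there because an LLM streams text, and synthesizing one sentence at a time lets playback start before the full reply is finished.

```python
import re
import shutil
import subprocess


def split_sentences(text):
    """Split text on sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def piper_command(model_path, wav_path):
    """Build the Piper CLI call; Piper reads the text to speak from stdin."""
    return ["piper", "--model", model_path, "--output_file", wav_path]


def speak(text, model_path, wav_path):
    """Synthesize `text` to a WAV file (assumes `piper` is installed)."""
    if shutil.which("piper") is None:
        raise RuntimeError("piper binary not found on PATH")
    subprocess.run(
        piper_command(model_path, wav_path),
        input=text.encode("utf-8"),
        check=True,
    )


if __name__ == "__main__":
    # Hypothetical model file; substitute whichever voice you downloaded.
    reply = "Hello there. This came from a local LLM!"
    for i, sentence in enumerate(split_sentences(reply)):
        speak(sentence, "en_US-lessac-medium.onnx", f"chunk_{i}.wav")
```

The regex splitter is deliberately naive (it will break on abbreviations like "e.g."), so treat it as a starting point.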

[–] [email protected] 9 points 6 months ago
[–] [email protected] 6 points 6 months ago (1 children)

When can I get one of these voices to read an epub on my phone? I'd love to have something like that

[–] [email protected] 3 points 6 months ago (1 children)

Librera FD as your reader app: https://www.f-droid.org/en/packages/com.foobnix.pro.pdf.reader/
Sherpa Onnx as your TTS engine: https://github.com/k2-fsa/sherpa-onnx

I recommend the Piper TTS pretrained models, either Lessac medium or Kusal high/medium.

[–] [email protected] 3 points 6 months ago

Once Sherpa Onnx TTS is installed, you can select it as your system TTS engine.

[–] [email protected] 5 points 6 months ago* (last edited 6 months ago)
[–] PeachMan 3 points 6 months ago

epub2tts: https://github.com/aedocw/epub2tts

It looks like that project uses Coqui TTS: https://github.com/coqui-ai/TTS

[–] [email protected] 3 points 6 months ago

Oh WOW! Thanks to all who commented. Next time I get a chance I'm going to check these all out! 👍🏻 I hope others find this thread helpful too!

[–] [email protected] -2 points 6 months ago

New Lemmy Post: ChatGPT's voice, self-hosted? (https://lemmyverse.link/lemmy.world/post/15336896)
Tagging: #SelfHosted

(Replying in the OP of this thread (NOT THIS BOT!) will appear as a comment in the lemmy discussion.)

I am a FOSS bot. Check my README: https://github.com/db0/lemmy-tagginator/blob/main/README.md