this post was submitted on 03 Jan 2025
3 points (80.0% liked)

Perchance - Create a Random Text Generator

494 readers
13 users here now

⚄︎ Perchance

This is a Lemmy Community for perchance.org, a platform for sharing and creating random text generators.

Feel free to ask for help, share your generators, and start friendly discussions at your leisure :)

This community is mainly for discussions between those who are building generators. For discussions about using generators, especially the popular AI ones, the community-led Casual Perchance forum is likely a more appropriate venue.

See this post for the Complete Guide to Posting Here on the Community!

Rules

1. Please follow the Lemmy.World instance rules.

2. Be kind and friendly.

  • Please be kind to others in this community (and in general), and remember that for many people Perchance is their first experience with coding. We have members for whom English is not their first language, so please take that into account too :)

3. Be thankful to those who try to help you.

  • If you ask a question and someone has made an effort to help you out, please remember to be thankful! Even if they don't manage to help you solve your problem, remember that they're spending time out of their day to try to help a stranger :)

4. Only post about stuff related to perchance.

  • Please only post about Perchance-related stuff, like generators on it, bugs, and the site itself.

5. Refrain from requesting Prompts for the AI Tools.

  • We would like to ask you to refrain from posting here when you need help specifically with prompting/achieving certain results with the AI plugins (text-to-image-plugin and ai-text-plugin), e.g. "What is a good prompt for X?", "How do I achieve X with the Y generator?"
  • See Perchance AI FAQ for FAQ about the AI tools.
  • You can ask for help with prompting at the 'sister' community Casual Perchance, which is for more casual discussions.
  • We will still be helping/answering questions about the plugins as long as it is related to building generators with them.

6. Search through the Community Before Posting.

  • Please search through the Community Posts here (and on Reddit) before posting, to see if a similar post already exists.

founded 2 years ago
MODERATORS
submitted 4 days ago* (last edited 4 days ago) by ufl to c/perchance
 

Hi, I've been playing with some AI models on my machine in the GPT4All software, and it has this thing called "LocalDocs".

It looks like it is just RAG for the AI, and a simple structured text document is more than enough for casual usage. A document like this was enough for Llama 3.x to stay aware of dates and current state:

{
    "Today": "12-Dec-2024",
    "Tomorrow": "13-Dec-2024",
    "Deadline": "23-Dec-2024",
    "Tasks Left": [
        {"Task Name": "Get groceries"},
        {"Task Name": "Buy presents"}
    ]
}
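A minimal sketch of how a document like that could be flattened into plain "Key: value" context lines before being handed to the model (the function name and format are my own, not GPT4All's actual mechanism):

```python
import json

# The kind of small structured document shown above.
DOC = """
{
    "Today": "12-Dec-2024",
    "Deadline": "23-Dec-2024",
    "Tasks Left": [
        {"Task Name": "Get groceries"},
        {"Task Name": "Buy presents"}
    ]
}
"""

def flatten(doc: str) -> str:
    """Turn the JSON document into plain 'Key: value' lines for the model."""
    data = json.loads(doc)
    lines = []
    for key, value in data.items():
        if isinstance(value, list):
            # Collapse a list of small objects into a comma-separated string.
            items = ", ".join(str(v) for d in value for v in d.values())
            lines.append(f"{key}: {items}")
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines)

print(flatten(DOC))
# Today: 12-Dec-2024
# Deadline: 23-Dec-2024
# Tasks Left: Get groceries, Buy presents
```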

Can Perchance have something like that, or is it up to the generator creator to set up RAG?

In my experience, text AIs tend to ignore or forget information very quickly. Before setting up RAG I was constantly correcting the AI about everything, but with RAG in place it worked flawlessly.

Also, RAG seems to not increase context size. It looks like the AI just uses the documents during generation and then forgets them, so the context is increased only by the AI's reply.
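In other words, the retrieved document is spliced into each request and then discarded, while the chat history only ever grows by the visible turns. A rough sketch of that flow (all names are mine, for illustration):

```python
def build_request(history: list[str], user_msg: str, retrieved_doc: str) -> str:
    """The retrieved doc is injected fresh each turn and never stored."""
    return (
        "Reference documents:\n" + retrieved_doc + "\n\n"
        + "\n".join(history)
        + f"\nUser: {user_msg}\nAssistant:"
    )

history: list[str] = []
doc = "Today: 12-Dec-2024\nDeadline: 23-Dec-2024"

prompt = build_request(history, "How long until the deadline?", doc)
reply = "11 days."  # imagine the model's answer here

# Only the visible turns are kept; the doc is not part of the history,
# so the persistent context costs no tokens between turns.
history += ["User: How long until the deadline?", f"Assistant: {reply}"]
assert doc not in "\n".join(history)
```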

To sum up, I've found this very useful. It would be super helpful for all text generators, especially generators where the AI must be aware of some persistent context like world rules, story characters, etc. Here are a couple of examples:

But all generator authors and users will benefit from this.

Note: if you decide to implement this, please don't make it a file upload. That is how RAG is implemented in LM Studio, and it is really annoying to delete the previous document and upload a new one. A live editor is significantly better.

top 6 comments
[–] wthit56 2 points 4 days ago (1 children)

Sounds like it's basically reminding the AI of some data each time a prompt is sent to it. What I do is just put it in the prompt, and it seems to work fine.

Those generators you listed all do the same, reminding the AI of things it needs to know about, but are pretty complex and have a lot to send, most likely. You can make your own Text AI pretty easily, and send whatever you like with the prompt. Maybe try that.
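The suggestion above amounts to something like this (a sketch with made-up names, not Perchance's actual API):

```python
# Persistent facts the AI keeps forgetting.
FACTS = (
    "Today: 12-Dec-2024\n"
    "Deadline: 23-Dec-2024\n"
)

def with_facts(user_prompt: str) -> str:
    """Prepend the persistent data to every prompt sent to the AI."""
    return f"Use these facts when answering:\n{FACTS}\n{user_prompt}"

print(with_facts("How many days until the deadline?"))
```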

[–] ufl 1 points 3 days ago* (last edited 3 days ago)

I noticed from my tries that RAG doesn't affect the AI's output in the same way. When I put text into the prompt, the AI tends to quote from it verbatim or ignore what I said completely lol. RAG is more like telling the AI "here are documents that you have to look through every time you generate output", and it just does it.

[–] j4k3 3 points 4 days ago* (last edited 4 days ago) (1 children)

Augmented generation is very difficult to implement once complexity increases. The chunking size and strategy are difficult and unique to each use case. While it may work at a surface level, in my experience with my own tailored learning setup using llama.cpp, LangChain, and Chroma DB with parts of the computer science intermediate curriculum, automated chunking strategies are not effective for reliable assistance with open source models I can run on my hardware. The database needs to be human curated first, from the perspective of an educator.
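To illustrate the chunking problem described above, here's a naive fixed-size chunker (a toy sketch, not LangChain's actual splitter): a chunk boundary can easily cut through the very sentence that answers a query, which is one reason automated strategies tend to need human curation.

```python
def chunk(text: str, size: int, overlap: int = 0) -> list[str]:
    """Naive fixed-size chunking; real pipelines split on sentences/sections."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

text = "A binary heap stores the minimum at the root." * 3

# With no overlap, a key phrase can be split across two chunks, so a
# retriever matching that phrase may miss it; overlap reduces the risk
# at the cost of indexing (and retrieving) duplicated text.
assert any("minimum at the root" in p for p in chunk(text, size=30, overlap=10))
```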

You might have a look at Storm.
https://www.youtube.com/watch?v=Rrls3Uvb7ic
I think that is a more practical form of augmentation that is more accessible. I haven't tried it yet, but it looks interesting.

[–] ufl 3 points 4 days ago* (last edited 4 days ago) (1 children)

This is probably true. I don't have a lot of experience with RAG from the dev side; I was just a user.

From my attempts with small structured data (under ~1000 words), all Llama-family AIs were good at "consuming" it without any additional preparation, just plug and play. If you want to feed your AI the whole of Wikipedia, you will most likely need to curate the data first to get reliable results, yes. But for casual usage, for ensuring the AI won't forget or ignore some rules and stays aware of the present context, it was enough. I was running Llama 3.2 8B Instruct at Q4 and Q8, and I believe this is the family of AIs that Perchance uses for text generation. I was satisfied with the results. They were probably not ideal, but noticeably better with just the default RAG and some .txt file with a markdown-like structured list; .json was also good. If it were up to me, I would incorporate it as an optional feature and leave it up to users to evaluate the results.

Perchance's chat text generators have a "reminder note" feature, which is basically text that goes right before the AI output. It could have been useful, but the AI tends to quote directly from it. There are also /mem and /lore, but the UX of using them as a sort of RAG, especially a live one (where you constantly update it based on the AI's output), is not so great, and it is not rare for the AI to just ignore it and make something up.

The Qwen and Mistral families were not so good with default RAG and simple structured files in my tests, btw; Llama had the best results.

Thanks for the tip on Storm, I'll look into it.

[–] j4k3 3 points 4 days ago* (last edited 4 days ago) (1 children)

If you do not have the RAM to load a Mixtral 8×7B Q4, look into setting up DeepSpeed. Once the model is actually loaded, it runs about like a 13B, but with nearly the attention of a 70B. I run either an 8×7B Q4K or a 70B Q4L on a 16GB GPU and a 12th-gen i7 with 64GB of system memory; that does not require DeepSpeed to load. The 70B is only marginally better, but it is a little slower than my fastest reading pace. The Mixtral model is much faster, and it is a large enough model to stay coherent. Your softmax settings per model are very important too.
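Rough arithmetic behind that RAM advice (my numbers, not from the comment above): a quantized model's weight footprint is approximately parameter count × bits per weight / 8. Mixtral 8×7B has about 47B total parameters, so at roughly 4.5 bits per weight for a Q4 variant it needs on the order of 26GB just for weights, which is why it fits in 64GB of system RAM but not on a 16GB GPU alone.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB for a quantized model (weights only,
    ignoring KV cache and runtime overhead)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

mixtral = weight_gb(46.7, 4.5)   # Mixtral 8x7B at ~Q4
llama70 = weight_gb(70.0, 4.5)   # a 70B at ~Q4
print(f"Mixtral 8x7B Q4 ~ {mixtral:.0f} GB, 70B Q4 ~ {llama70:.0f} GB")
```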

[–] ufl 2 points 4 days ago

Thanks for this tip. I don't have a lot of VRAM, just 64GB of regular RAM, but I don't mind waiting for output :)

But anyway, all non-Llama models weren't so good at using RAG in plug-and-play mode. I probably should've spent more time working on the system prompt and Jinja template, as well as RAG curation, to squeeze out all the juice, but I wanted something quick and easy to set up, and for that Llama 3.2 8B Instruct was the best. I used the default setup and the same system prompt for all models.

Also, the new Qwen reasoning model was good; it was faster in my setup, but too "independent", I guess. It tended to ignore instructions from the system prompt and other settings, while Llama was more "obedient".