Today, thanks to a NetworkChuck video, I discovered Open WebUI and how easy it is to set up a local LLM chat assistant. In particular, the ability to upload documents and use them as context for chats really caught my interest. So now my question is: let's say I've uploaded 10 different documents to Open WebUI. Is there a way to ask llama3 which of the uploaded documents contains a certain piece of information (without having to explicitly tag all the documents)? And if not, is something like this possible with different local LLM combinations?

[–] [email protected] 2 points 5 months ago (1 children)

Only if your model has a large enough token context to contain all the documents' info would you be able to do something like that
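
A rough way to check whether a set of documents would fit is sketched below. It assumes about 4 characters per token, which is only a rule of thumb for English text (the real count depends on the model's tokenizer), and the function names and the `reserve` margin for the question and answer are made up for illustration:

```python
def estimated_tokens(text, chars_per_token=4):
    """Very rough token estimate; assumption: ~4 characters per token."""
    return len(text) // chars_per_token + 1


def fits_in_context(documents, context_len, reserve=1024, chars_per_token=4):
    """True if all documents plus a reserved margin fit in the context window."""
    total = sum(estimated_tokens(doc, chars_per_token) for doc in documents)
    return total + reserve <= context_len


# Placeholder documents of 12,000 and 8,000 characters.
docs = ["a" * 12000, "b" * 8000]
print(fits_in_context(docs, context_len=8192))  # True
print(fits_in_context(docs, context_len=4096))  # False
```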

[–] [email protected] 1 points 5 months ago (1 children)

And where do I find out how much token context my LLM has?

[–] General_Effort 2 points 5 months ago

It probably says somewhere on the page where you downloaded the model. It's also in the metadata. I forget where it's displayed. Maybe in the terminal window.
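
If the model runs through Ollama, one way to read that metadata is to ask the local API. A minimal sketch, assuming Ollama is listening on its default port 11434 and that the `/api/show` response has a `model_info` map with a key ending in `.context_length` (older versions may expect `name` instead of `model` in the request). The helper names are made up, and the sample dict is invented so the snippet runs without a server:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def find_context_length(model_info):
    """Return the first value whose key ends in '.context_length', or None."""
    for key, value in model_info.items():
        if key.endswith(".context_length"):
            return int(value)
    return None


def fetch_model_info(model):
    """Ask a running Ollama server for a model's metadata (needs the server up)."""
    body = json.dumps({"model": model}).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/show",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("model_info", {})


# Invented sample shaped like the metadata, so this part runs offline.
sample_info = {"general.architecture": "llama", "llama.context_length": 8192}
print(find_context_length(sample_info))  # 8192
```

With a server running, `find_context_length(fetch_model_info("llama3"))` would be the real call.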

Things you should know:

What a token is depends on the model.
Context takes a lot of (V)RAM.
People modify models to increase the context but that often doesn't work well. Watch out for the model missing things, esp. in the middle of the document.
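
To put a number on the (V)RAM point: most of the extra memory goes to the key/value cache, which grows linearly with context length. A back-of-the-envelope sketch, assuming the Llama 3 8B shape (32 layers, 8 KV heads, head dimension 128) and a 16-bit cache. It ignores the model weights and any runtime overhead, so treat it as a lower bound:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Approximate key/value cache size for a single sequence."""
    # The leading 2 counts one key tensor plus one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Assumed Llama 3 8B shape: 32 layers, 8 KV heads, head dimension 128.
size = kv_cache_bytes(32, 8, 128, context_len=8192)
print(size / 2**30, "GiB")  # 1.0 GiB
```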

Llama 3 is probably not the right base for the task. Maybe Phi-3 or Cohere.

[–] [email protected] 2 points 5 months ago (1 children)

Open WebUI's document management loads everything into a vector database. When you use the hashtag, it will trigger a search against the vector database based on your prompt. These results are then fed into the LLM. Open WebUI should generate a hashtag that can reference all the documents. But the quality of the results will be influenced by the embeddings and the LLM that responds to you.

[–] [email protected] 1 points 5 months ago (1 children)

Ok, thank you for your explanation! So basically, if I wanted an LLM to answer based on a large set of documents, this would be the wrong approach (because of the context window problem other comments were mentioning)... but then how could I do it?

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago)

A vector search converts your query into magic numbers, and then searches the database for other magic numbers that are "similar" (closest to it in the vector space, which is basically an N-dimensional graph of points and directions). These results are then returned as snippets to the LLM and it does stuff from that point.
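
Here's a toy version of that "similar magic numbers" step, which also shows how you could tell which document a hit came from. It's only a sketch: the vectors and file names are made up, cosine similarity is one common choice of "closeness", and I'm not claiming this is how Open WebUI does it internally:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, chunks):
    """chunks: list of (document_name, vector). Rank documents by their best chunk."""
    best = {}
    for doc, vec in chunks:
        score = cosine_similarity(query_vec, vec)
        best[doc] = max(score, best.get(doc, -1.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Made-up 3-dimensional vectors; real embeddings have hundreds of dimensions.
chunks = [
    ("invoice.pdf", [0.9, 0.1, 0.0]),
    ("invoice.pdf", [0.2, 0.8, 0.1]),
    ("manual.pdf", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]
ranking = rank_documents(query, chunks)
print(ranking[0][0])  # invoice.pdf
```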

The effectiveness of the vector search depends on how Open WebUI breaks up the documents into smaller sections, and how good the embeddings are.
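
To give an idea of what "breaking up the documents" means, here's a simple character-based splitter with overlap. This is an illustration only: the function name, chunk size and overlap are placeholders, and Open WebUI's actual splitter and defaults may differ:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


pieces = chunk_text("abcdefghij" * 5, chunk_size=20, overlap=5)
print(len(pieces))  # 3
```

The overlap is there so a sentence cut at a chunk boundary still shows up whole in at least one chunk.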

I'm not exactly sure what you want to achieve, but you might have success in using an LLM to summarize the documents beforehand, using a specific prompt to extract the info you want, then feed that into the vector DB. This would require some scripting, of course.
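
The scripting part could look roughly like this. A sketch, assuming Ollama is running locally on its default port and using its `/api/generate` endpoint with streaming turned off. The prompt wording, function names and the `generate` parameter (there so you can swap in another backend or a stub) are all my own invention:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address

# Hypothetical extraction prompt; adjust it to the info you care about.
PROMPT_TEMPLATE = (
    "List the key facts, names and dates in the following document "
    "as short bullet points.\n\n{document}"
)


def ollama_generate(prompt, model="llama3"):
    """Send one prompt to a running Ollama server and return the reply text."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]


def summarize_documents(documents, generate=ollama_generate):
    """documents: dict of name -> full text. Returns dict of name -> summary."""
    return {
        name: generate(PROMPT_TEMPLATE.format(document=text))
        for name, text in documents.items()
    }
```

You'd then save each summary to a file and upload those instead of (or alongside) the originals.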

The easiest thing to do is try it. See if Open WebUI's vector search is able to handle it. Make sure to use a good embedding model like nomic-embed-text (can be found on ollama.com). You can change the vector search settings in the documents settings from the workspace on OpenWebUI.

Edit: https://ollama.com/library/nomic-embed-text
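
If you want to see what the embedding model returns outside of Open WebUI, something like this should work. A sketch, assuming a local Ollama server on the default port with nomic-embed-text already pulled, and using the `/api/embeddings` endpoint (which takes a `prompt` and returns an `embedding` list). The function names are made up:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def build_embedding_request(text, model="nomic-embed-text"):
    """Build (but do not send) the HTTP request for one embedding."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
    )


def embed(text, model="nomic-embed-text"):
    """Return the embedding vector for text (needs the server running)."""
    with urllib.request.urlopen(build_embedding_request(text, model)) as resp:
        return json.load(resp)["embedding"]
```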