6000+ tokens of context now possible with ExLlama

submitted 1 year ago by Blaed to c/fosai

From a recent PR by oobabooga:

This is what I get with 24 GB of VRAM (I haven't tested extensively; it may be possible to go higher):

Model      Params                                     Maximum context
llama-13b  max_seq_len = 8192, compress_pos_emb = 4   6079 tokens
llama-30b  max_seq_len = 3584, compress_pos_emb = 2   3100 tokens

I also removed the chat_prompt_size parameter, since truncation_length can be reused for its purpose.

Now possible in text-generation-webui after this PR: https://github.com/oobabooga/text-generation-webui/pull/2875

I didn't do anything other than exposing the compress_pos_emb parameter implemented by turboderp here, which in turn is based on kaiokendev's recent discovery: https://kaiokendev.github.io/til#extending-context-to-8k
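To make the idea behind compress_pos_emb concrete, here is a minimal sketch, assuming the standard rotary position embedding formulation with a base of 10000. The function name rotary_angles and its arguments are hypothetical and are not taken from ExLlama's code; the only point is that dividing positions by the compression factor maps a longer sequence back into the position range the model saw during training.

```python
def rotary_angles(position, head_dim, compress_pos_emb=1.0, base=10000.0):
    """Return the rotary embedding angles for a single token position.

    Hypothetical helper for illustration only. Dividing the position by
    compress_pos_emb squeezes a long sequence into the original range.
    """
    scaled_position = position / compress_pos_emb
    return [
        scaled_position * base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# With a factor of 4, position 8188 receives exactly the angles that
# position 2047 receives without compression.
same = rotary_angles(8188, 128, compress_pos_emb=4) == rotary_angles(2047, 128)
print(same)
```

Under this assumption, a factor of 4 means positions 0 to 8191 land in the interval 0 to 2047.75, which is why max_seq_len = 8192 is paired with compress_pos_emb = 4.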

How to use it

1. Open the Model tab and set the loader to ExLlama or ExLlama_HF.
2. Set max_seq_len to a number greater than 2048. The length that you will be able to reach depends on the model size and your GPU memory.
3. Set compress_pos_emb to max_seq_len / 2048. For instance, use 2 for max_seq_len = 4096, or 4 for max_seq_len = 8192.
4. Select the model that you want to load.
5. Set truncation_length accordingly in the Parameters tab. You can set a higher default for this parameter by copying settings-template.yaml to settings.yaml in your text-generation-webui folder and editing the values in settings.yaml.
Those two new parameters can also be used from the command line. For instance: python server.py --max_seq_len 4096 --compress_pos_emb 2
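The max_seq_len / 2048 rule can be sketched as a small helper that also assembles the command line shown above. The function names are hypothetical, and rounding the factor up to a whole number is an assumption: it matches the llama-30b row in the table, where max_seq_len = 3584 is paired with compress_pos_emb = 2.

```python
import math
import shlex

TRAINED_CONTEXT = 2048  # context length used as the divisor in the rule above


def compress_factor(max_seq_len):
    """Return max_seq_len / 2048, rounded up (rounding is an assumption)."""
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    return max(1, math.ceil(max_seq_len / TRAINED_CONTEXT))


def build_command(max_seq_len):
    """Assemble the server.py invocation with the two flags from the post."""
    args = [
        "python", "server.py",
        "--max_seq_len", str(max_seq_len),
        "--compress_pos_emb", str(compress_factor(max_seq_len)),
    ]
    return shlex.join(args)


print(build_command(4096))
# prints: python server.py --max_seq_len 4096 --compress_pos_emb 2
```

Whether a given max_seq_len actually fits still depends on the model size and available GPU memory, so the helper only covers the arithmetic, not the memory limit.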

ArkyonVeil 4 points 1 year ago

Thanks for reposting the breakthroughs!

Makes me have to visit Reddit less for news.

It even rhymes, how neat is that.