this post was submitted on 04 Oct 2023

27 points (96.6% liked)

Free Open-Source Artificial Intelligence

3137 readers

18 users here now

Welcome to Free Open-Source Artificial Intelligence!

We are a community dedicated to forwarding the availability and access to:

Free Open Source Artificial Intelligence (F.O.S.A.I.)

More AI Communities

LLM Leaderboards

Developer Resources

GitHub Projects

GitHub Stars

FOSAI Time Capsule

founded 2 years ago

MODERATORS

Blaed

fosai

Mistral 7B Megathread (self.fosai)

submitted 1 year ago* (last edited 1 year ago) by Blaed to c/fosai

5 comments fedilink hide all child comments

Starting a Mistral Megathread to aggregate resources.

This is my new favorite 7B model. It is really good for what it is. I am excited to see what we can tune together. I will be using this thread as a living document, expect a lot of changes and notes, revisions and updates.

Let me know if there's something in particular you want to see here. I will be adding to this thread throughout my fine-tuning journey with Mistral.

Mistral Model Megathread

Key

Link #1 - Base Model
Link #2 - Instruct Model

Quantized Base Models from TheBloke

GPTQ

GGUF

AWQ

Quantized Samantha Models from TheBloke

GPTQ

GGUF

AWQ

Quantized Kimiko Models from TheBloke

GPTQ

https://huggingface.co/TheBloke/Kimiko-Mistral-7B-GPTQ

GGUF

https://huggingface.co/TheBloke/Kimiko-Mistral-7B-GGUF

AWQ

https://huggingface.co/TheBloke/Kimiko-Mistral-7B-AWQ

Quantized Dolphin Models from TheBloke

GPTQ

https://huggingface.co/TheBloke/dolphin-2.0-mistral-7B-GPTQ

GGUF

https://huggingface.co/TheBloke/dolphin-2.0-mistral-7B-GGUF

AWQ

https://huggingface.co/TheBloke/dolphin-2.0-mistral-7B-AWQ

Quantized Orca Models from TheBloke

GPTQ

https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-GPTQ

GGUF

https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-GGUF

AWQ

https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-AWQ

Quantized Airoboros Models from TheBloke

GPTQ

https://huggingface.co/TheBloke/airoboros-mistral2.2-7B-GPTQ

GGUF

https://huggingface.co/TheBloke/airoboros-mistral2.2-7B-GGUF

AWQ

https://huggingface.co/TheBloke/airoboros-mistral2.2-7B-AWQ

If you like to run any of the quantized/optimized models from TheBloke, do visit the full model pages from each of the quantized model cards to see and support the developers of each fine-tuned model.

Mistral - Mistral.ai
Mistral Samantha - Eric Hartford
Mistral Kimiko - nRuaif
Mistral Dolphin - Eric Hartford
Mistral OpenOrca - OpenOrca/Alignment Lab
Mistral Airoboros - teknium

top 5 comments

sorted by: hot top controversial new old

[–] [email protected] 3 points 1 year ago (1 children)

Looks like an interesting model. I couldn't find it on their website, do you know what the training data was for this model?

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

https://mistral.ai/news/announcing-mistral-7b/

They don't publish the training dataset. It's a secret. There are open bugreports on their Github, HuggingFace #8, #10, #38 and i think someone said so explicitly on their Discord.

[–] [email protected] 3 points 1 year ago (1 children)

Nice list, thanks! What are some use-cases people are using 7B models for?

[–] Blaed 5 points 1 year ago* (last edited 1 year ago) (1 children)

I am actively exploring this question.

So far - it’s been the best performing 7B model I’ve been able to get my hands on. Anyone running consumer hardware could get a GGUF version running on almost any dedicated GPU/CPU combo.

I am a firm believer there is more performance and better quality of responses to be found in smaller parameter models. Not too mention interesting use cases you could apply fine-tuning an ensemble approach.

A lot of people sleep on 7B, but I think Mistral is a little different - there’s a lot of exploring to be had finding these use cases but I think they’re out there waiting to be discovered.

I’ll definitely report back on how the first attempt at fine-tuning this myself goes. Until then, I suppose it would be great for any roleplay or basic chat interaction. Given it’s low headroom - it’s much more lightweight to prototype with outside of the other families and model sizes.

If anyone else has a particular use case for 7B models - let us know here. Curious to know what others are doing with smaller params.

[–] [email protected] 5 points 1 year ago

That's fair, I think chat/roleplay are great use cases.

I also think some of these lightweight models might make for interesting personal recommendation/categorization engines, etc. In my experiments with using models to categorize credit card transaction statements ala Mint, only GPT4 was able to do a good job out of the box. I bet a small model could do quite well with fine tuning though.

Another thought I had was to make some sort of personal recommendation engine, so you could export your Netflix/Spotify likes and have it recommend movies or music that you might enjoy, etc. I suppose it's still early days for those kind of uses for open source models!