There has been an overwhelming number of new models hitting Hugging Face. I wanted to kick off a thread and ask: which open-source LLM has become your new daily driver?

Personally, I am using many Mistral/Mixtral models and a few random OpenHermes fine-tunes for flavor. I was also pleasantly surprised by some of the DeepSeek models. Those were fun to test.

I believe 2024 is the year open-source LLMs will catch up with GPT-3.5 and GPT-4. We're already most of the way there. Curious to hear what new contenders are on the block and how others feel about their performance/precision compared to state-of-the-art closed-source models.

top 6 comments

[–] xodoh74984 18 points 1 year ago (1 children)

This one is only 7B parameters, but it punches far above its weight for such a little model:
https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha

My personal setup is capable of running larger models, but for everyday use like summarization and brainstorming, I find myself coming back to Starling the most. Since it's so small, it runs inference blazing fast on my hardware. I don't rely on it for writing code. Deepseek-Coder-33B is my pick for that.

Others have said Starling's overall performance rivals LLaMA 70B. YMMV.

[–] Blaed 2 points 1 year ago

What sort of tokens per second are you seeing with your hardware? Mind sharing some notes on what you're running there? Super curious!

[–] [email protected] 7 points 1 year ago (1 children)

I would also be interested in code-copilot models that approach the performance of GitHub's or Microsoft's paid models.

Currently I use TabbyML, but the available models are far inferior.

[–] xodoh74984 8 points 1 year ago* (last edited 1 year ago) (1 children)

Of all the code-specific LLMs I'm familiar with, Deepseek-Coder-33B is my favorite. There are multiple pre-quantized versions available here:
https://huggingface.co/TheBloke/deepseek-coder-33B-base-GGUF/tree/main

In my experience, quantization of at least 5 bits performs best.
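
For a rough sense of what each quantization level costs in memory, here is a back-of-the-envelope sketch. It assumes "33B" means exactly 33 billion parameters and that every weight is stored at the same bit width; real GGUF files mix bit widths and carry extra metadata, so actual file sizes will differ somewhat. The function name is made up for illustration.

```python
def approx_weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone, in gigabytes (10^9 bytes).

    Assumption: uniform bit width for every weight, no metadata,
    no KV cache or runtime overhead.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "33B" taken as exactly 33 billion parameters.
N_PARAMS = 33e9

for bits in (4, 5, 8, 16):
    size = approx_weight_size_gb(N_PARAMS, bits)
    print(f"{bits:>2}-bit: ~{size:.1f} GB")
```

By this estimate a 5-bit version needs on the order of 20 GB for the weights alone, so leave headroom for context and runtime overhead when picking a file.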

[–] Blaed 3 points 1 year ago* (last edited 1 year ago)

I was pleasantly surprised by many models of the Deepseek family. Verbose, but in a good way? At least that was my experience. Love to see it mentioned here.

[–] [email protected] 2 points 1 year ago

Personally, I find myself renting a GPU and running Goliath 120b. Smaller models could do what I’m doing if I spent more time optimizing my prompts. But every day I’m doing different tasks, and Goliath 120b will just handle whatever I throw at it, no matter how sloppy I am. I’ve also been playing with LLaVA and Hermes vision models to describe images to me. However, when I really need alt-text for an image I can’t see, I still find myself resorting to GPT-4; the open-source options just aren’t as accurate or detailed.