this post was submitted on 24 Nov 2024
Which backend are you using to run it, and does that backend have an option to adjust context size?
I noticed in LM Studio, for example, that the default context size is much smaller than the maximum that the model supports. Qwen should certainly support more than 2000 tokens. I'd try setting it to 32k if you can.
I found the cause of the cut-off: by default, aider only sends 2048 tokens to ollama. That is why I had not noticed it anywhere except when coding.
When running /tokens in aider:
Even though it will only send 2048 tokens to ollama.
To fix it, I needed to add a file .aider.model.settings.yml to the repository:
That's because ollama's default max ctx is 2048, as far as I know.
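For anyone hitting the same limit, here is a minimal sketch of what a .aider.model.settings.yml with a larger context window could look like. This is not the file from the comment above, which was not preserved. The model name is a hypothetical placeholder, and 32768 is simply the 32k suggested earlier in the thread. It assumes aider passes extra_params through to ollama as request options, as described in aider's ollama documentation.

```yaml
# .aider.model.settings.yml, placed in the repository root
# Hypothetical model name: replace it with the exact name you pass to aider via --model
- name: ollama/qwen2.5-coder:14b
  extra_params:
    # Assumed value (32k): raises the context window above the 2048-token default
    num_ctx: 32768
```

A larger num_ctx uses more memory, so pick a value your hardware can actually hold.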