PumpkinEscobar

joined 2 years ago
[–] PumpkinEscobar 30 points 4 months ago

It's not uncommon on sensitive stories like this for the government to loop in journalists ahead of time so they can pull together background and research, with an agreed-upon embargo until some point in the future.

This wasn't the US government telling the newspaper they couldn't report on a story they had uncovered from their own investigation.

[–] PumpkinEscobar 64 points 4 months ago

I guess this solves part of the mystery about why the French rioted when they raised the retirement age last year

[–] PumpkinEscobar 19 points 4 months ago

batmanties?

[–] PumpkinEscobar 9 points 4 months ago (1 children)

There's quantization, which basically compresses the model by using a smaller data type for each weight. It cuts memory requirements in half or even more.
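As a toy illustration of the idea (not any particular library's actual scheme; real quantizers like the GGUF formats use fancier per-group methods), per-tensor int8 quantization looks roughly like this:

```python
import numpy as np

# Toy per-tensor int8 quantization: store one float scale plus int8 weights.
weights = np.random.randn(1024, 1024).astype(np.float32)

scale = np.abs(weights).max() / 127.0            # map the value range onto int8
quantized = np.round(weights / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

print(weights.nbytes // quantized.nbytes)        # 4 -> int8 is 4x smaller than fp32
print(np.abs(weights - dequantized).max() <= scale)  # True: rounding error is bounded
```

The trade-off is that rounding error, which is why heavily quantized models lose some quality.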

There's also airllm, which loads part of the model into RAM, runs those calculations, unloads that part, loads the next part, and so on. It's a nice option, but the performance of all that loading/unloading is never going to be great, especially on a huge model like llama 405b.
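The trick can be sketched like this (toy numpy "layers" standing in for real transformer blocks, and made-up file names; this is the general idea, not airllm's actual API):

```python
import numpy as np, tempfile, os

# Keep only one layer resident at a time: load it from disk, apply it, discard it.
# The result is identical to holding the whole model in memory; the cost is I/O.
rng = np.random.default_rng(1)
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(6):
    w = rng.standard_normal((32, 32)) * 0.1
    p = os.path.join(tmpdir, f"layer_{i}.npy")
    np.save(p, w)                # "model file" for layer i
    paths.append(p)

x = rng.standard_normal(32)
for p in paths:
    w = np.load(p)               # load one layer
    x = np.tanh(x @ w)           # compute (stand-in for a transformer block)
    del w                        # unload before loading the next layer
print(x.shape)  # (32,)
```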

Then there are some neat projects that distribute models across multiple computers, like exo and petals. They're targeted more at a p2p-style random collection of computers. I've run petals on a small cluster and it works reasonably well.
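Stripped of all the networking, the idea behind that kind of sharding is just splitting the layer stack between nodes and passing activations along (toy numpy sketch, not the actual petals/exo API):

```python
import numpy as np

# Pipeline-style sharding: each "node" holds only some layers. Here nodes are
# plain functions; in the real systems they're separate machines on a network.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 64)) * 0.1 for _ in range(8)]

def run_shard(x, shard):
    for w in shard:
        x = np.tanh(x @ w)       # stand-in for a transformer block
    return x

x = rng.standard_normal(64)
# node A holds layers 0-3, node B holds layers 4-7
out_distributed = run_shard(run_shard(x, layers[:4]), layers[4:])
out_local = run_shard(x, layers)
print(np.allclose(out_distributed, out_local))  # True: same output, split memory
```

Each node only needs memory for its own shard, which is the whole point.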

[–] PumpkinEscobar 15 points 5 months ago (1 children)

Is this the new "Simpsons already did it"?

Cunk already did it...

(3:40 if you want to get right to it) https://www.youtube.com/watch?v=UoSUx1xyj1E

[–] PumpkinEscobar 1 points 5 months ago* (last edited 5 months ago)

Yale Z-Wave locks work well, last a long time between battery replacements, and can run off rechargeables. You can add them to Home Assistant, and they work with the Siri and Alexa integrations there.

I had some Schlage locks that ran through batteries way too fast.

[–] PumpkinEscobar 5 points 5 months ago

Florida... woman

[–] PumpkinEscobar 4 points 5 months ago

Exactly, if she could feel shame or humiliation she would be hiding in a cabin somewhere and never speaking to another person or camera again for the rest of her life...

Sorry, I got distracted by a happy thought there, what were we talking about?

[–] PumpkinEscobar 111 points 5 months ago* (last edited 5 months ago) (2 children)

When they’re not recording your desktop in an unencrypted database for AI, boot-looping your computer with bad patches or showing ads in your start menu, they’re disabling your account for calling family to see if they’re still alive. Damn.

[–] PumpkinEscobar 6 points 5 months ago

Take ollama, for instance: either the whole model fits in VRAM and compute runs on the GPU, or it sits in system RAM and compute runs on the CPU. Running models on CPU is horribly slow; you won't want to do it for large models.

LM Studio and others let you run part of the model on the GPU and part on the CPU, splitting the memory requirements, but that's still pretty slow.

Even the smaller 7B-parameter models run pretty slowly on CPU, and the huge models are orders of magnitude slower.

So technically more system RAM will let you run some larger models, but you'll quickly figure out you just don't want to.
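For rough sizing, the memory math is just parameters times bytes per weight (this ignores activation and KV-cache overhead, so treat it as a lower bound):

```python
# Rough memory estimate for a dense LLM: params x bytes per weight.
def model_memory_gb(n_params_billion: float, bits_per_weight: int) -> float:
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

print(model_memory_gb(7, 16))    # 14.0  -> 7B model at fp16
print(model_memory_gb(7, 4))     # 3.5   -> 7B model at 4-bit quantization
print(model_memory_gb(405, 16))  # 810.0 -> why llama 405b needs extreme measures
```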

[–] PumpkinEscobar 128 points 5 months ago (9 children)

Boeing made $76B in revenue in 2023. This is slightly more than one day's revenue for them (about $210M/day), or a bit more than ten days of profit ($21M/day). They'll keep doing what they're doing, but increase their spending on a PR campaign to improve their public image.
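Back-of-envelope check on the per-day figure (using the 2023 revenue number as stated above):

```python
# Annual revenue divided across 365 days.
revenue = 76e9
revenue_per_day = revenue / 365
print(f"${revenue_per_day / 1e6:.0f}M / day")  # ~$208M/day, i.e. roughly $210M
```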
