Technology

59708 readers

5518 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to each another!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, to ask if your bot can be added please contact us.
Check for duplicates before posting, duplicates may be removed

Approved Bots

founded 1 year ago

MODERATORS

666

submitted 1 year ago by L4s to c/technology

129 comments fedilink hide all child comments

Over just a few months, ChatGPT went from correctly answering a simple math problem 98% of the time to just 2%, study finds. Researchers found wild fluctuations—called drift—in the technology’s abi...::ChatGPT went from answering a simple math correctly 98% of the time to just 2%, over the course of a few months.

you are viewing a single comment's thread
view the rest of the comments

[–] DominicHillsun 217 points 1 year ago (51 children)

It seems rather suspicious how much ChatGPT has deteorated. Like with all software, they can roll back the previous, better versions of it, right? Here is my list of what I personally think is happening:

They are doing it on purpose to maximise profits from upcoming releases of ChatGPT.
They realized that the required computational power is too immense and trying to make it more efficient at the cost of being accurate.
They got actually scared of it's capabilities and decided to backtrack in order to make proper evaluations of the impact it can make.
All of the above

[+] [email protected] 52 points 1 year ago* (last edited 11 months ago) (9 children)

[deleted]

[–] killerinstinct101 37 points 1 year ago (1 children)

This is what was addressed at the start of the comment, you can just roll back to a previous version. It's heavily ingrained in CS to keep every single version of your software forever.

[+] [email protected] 23 points 1 year ago* (last edited 1 year ago) (3 children)

[deleted]

[–] [email protected] 30 points 1 year ago (1 children)

That's not how these LLMs work. There is a training phase which takes a large amount of compute power, and the training generates a model which is a set of weights and could easily be backed up and version-controlled. The model is then used for inference which is a less compute-intensive process and runs on much smaller hardware than the training phase.

The inference architecture does use feedback mechanisms but the feedback does not modify the model-weights that were generated at training time.

[+] [email protected] 0 points 1 year ago* (last edited 1 year ago) (3 children)

[deleted]

[–] [email protected] 13 points 1 year ago

They list the currently available models that users of their API can select here:

https://platform.openai.com/docs/models/overview

They even say that while the main models are being continuously updated (read: re-trained) there are snapshots of previous models that will remain static.

So yes, they are storing and snapshotting the models and they have many different models available with which to perform inference at the same time.

[–] [email protected] 4 points 1 year ago

Each parameter corresponds to a single number, so if it’s using 16 bit numbers then that’s 200 TB. They might be using 32 bit numbers (400 TB) but wouldn’t be using anything larger.

[–] Lukecis 1 points 1 year ago

Makes me wonder how exactly they curate said data, its such an insane amount even teams of thousands of human programmers sifting through all of it 24/7 all day everyday wouldn't be able to fact check or assess all the data for years. Presumably they use ai to go over the data scraped and thrown into the model, since I cant imagine any human being able to curate it all.

I've heard from various videos detailing the topic that many of the developers have little to no clue as to what's going on inside the LLM once it's assembled and set about its work on training itself and what not- and I'm inclined to believe them, the human programmers simply set the params, and system up and then the system eats all the data loaded into it and immediately becomes a sort of black box which nobody knows exactly whats going on inside of it to produce the output it does.

[–] Lazylazycat 6 points 1 year ago

Exactly this, that's why Loab exists forever now.

[–] [email protected] 2 points 1 year ago

Even so, surely they can take snapshots. If they're that clueless about rudimentary practices of IT operations then it is just a matter of time before an outage wipes everything. I find it hard to believe nobody considered a way to do backups, rollbacks, or any of that.

load more comments (7 replies)

load more comments (48 replies)