this post was submitted on 08 Dec 2024
458 points (94.6% liked)

Technology

59974 readers
3693 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 2 years ago
MODERATORS
458
The GPT Era Is Already Ending (www.theatlantic.com)
submitted 1 week ago* (last edited 1 week ago) by [email protected] to c/technology
 

If this is the way to superintelligence, it remains a bizarre one. “This is back to a million monkeys typing for a million years generating the works of Shakespeare,” Emily Bender told me. But OpenAI’s technology effectively crunches those years down to seconds. A company blog boasts that an o1 model scored better than most humans on a recent coding test that allowed participants to submit 50 possible solutions to each problem—but only when o1 was allowed 10,000 submissions instead. No human could come up with that many possibilities in a reasonable length of time, which is exactly the point. To OpenAI, unlimited time and resources are an advantage that its hardware-grounded models have over biology. Not even two weeks after the launch of the o1 preview, the start-up presented plans to build data centers that would each require the power generated by approximately five large nuclear reactors, enough for almost 3 million homes.

https://archive.is/xUJMG

you are viewing a single comment's thread
view the rest of the comments
[–] NocturnalMorning 68 points 1 week ago (20 children)

How is it useful to type millions of solutions out that are wrong to come up with the right one? That only works on a research project when youre searching for patterns. If you are trying to code, it needs to be right the first time every time it's run, especially if it's in a production environment.

[–] Khanzarate 6 points 1 week ago (11 children)

Well actually there's ways to automate quality assurance.

If a programmer reasonably knew that one of these 10,000 files was the "correct" code, they could pull out quality assurance tests and find that code pretty dang easily, all things considered.

Those tests would eliminate most of the 9,999 wrong ones, and then the QA person could look through the remaining ones by hand. Like a capcha for programming code.

The power usage still makes this a ridiculous solution.

[–] NocturnalMorning 15 points 1 week ago (9 children)

That seems like an awful solution. Writing a QA test for every tiny thing I want to do is going to add far more work to the task. This would increase the workload, not shorten it.

[–] Khanzarate 5 points 1 week ago

I do agree it's not realistic, but it can be done.

I have to assume the people that allow the AI to generate 10,000 answers expect that to be useful in some way, and am extrapolating on what basis they might have for that.

Unit tests would be it. QA can have a big back and forth with programming, usually. Unlike that, QA can just throw away a failed solution in this case, with no need to iterate on that case.

I mean, consider the quality of AI-generated answers. Most will fail with the most basic QA tools, reducing 10,000 to hundreds, maybe even just dozens of potential successes. While the QA phase becomes more extensive afterwards, its feasible.

All we need is... Oh right, several dedicated nuclear reactors.

The overall plan is ridiculous, overengineered, and solved by just hiring a developer or 2, but someone testing a bunch of submissions that are all wrong in different ways is in fact already in the skill set of people teaching computer science in college.

load more comments (8 replies)
load more comments (9 replies)
load more comments (17 replies)