this post was submitted on 19 Oct 2023
Not really.
One of the big mistakes I see people make when trying to estimate capabilities is thinking in terms of all-in-one models.
You'll have one model that plays the game, trying a wider range of inputs and approaches to reach goals than humans would produce (similar to existing research, like OpenAI training models to play Minecraft and mine diamonds off a handful of videos with input data plus a lot of YouTube videos).
Then the outputs generated by that model would be passed through another process that looks specifically for things ranging from sequence breaks to clipping. Some of those, like sequence breaks, aren't even detections that need AI, and depending on just what data is generated by the 'player' AIs, a fair number of other issues can be detected with similarly dumb approaches. The bugs that would be difficult for an AI to detect would be things like "I threw item A down 45 minutes ago but this NPC just had dialogue thanking me for bringing it back." But even things like this are going to be well within the capabilities of multimodal AI within a few years, as long as hardware continues to scale so that it doesn't become cost prohibitive.
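To make the "doesn't need AI" point concrete, here's a rough sketch in Python of what a dumb detector could look like. Everything in it is made up for illustration: the milestone names, the `PREREQUISITES` table, the event log format, and the axis-aligned box representation of solid geometry are all assumptions, not anything from a real engine or QA tool.

```python
from dataclasses import dataclass

# Hypothetical story milestones, each mapped to the milestones that must
# already have happened before it is legitimately reachable.
PREREQUISITES = {
    "got_key": set(),
    "opened_gate": {"got_key"},
    "beat_boss": {"opened_gate"},
}


@dataclass(frozen=True)
class Event:
    tick: int   # game tick at which the milestone was logged
    name: str   # milestone identifier (hypothetical)


def find_sequence_breaks(events):
    """Return (tick, name, missing) for milestones hit before their prerequisites."""
    seen = set()
    breaks = []
    for event in sorted(events, key=lambda e: e.tick):
        missing = PREREQUISITES.get(event.name, set()) - seen
        if missing:
            breaks.append((event.tick, event.name, sorted(missing)))
        seen.add(event.name)
    return breaks


def find_clipping(samples, solid_boxes):
    """Return ticks where the sampled position is strictly inside a solid box.

    samples: iterable of (tick, (x, y, z))
    solid_boxes: iterable of ((min_x, min_y, min_z), (max_x, max_y, max_z))
    """
    ticks = []
    for tick, point in samples:
        for lo, hi in solid_boxes:
            if all(l < p < h for l, p, h in zip(lo, point, hi)):
                ticks.append(tick)
                break  # one hit per sample is enough
    return ticks


# Example run over a made-up log from a 'player' AI.
log = [Event(10, "got_key"), Event(50, "beat_boss"), Event(90, "opened_gate")]
breaks = find_sequence_breaks(log)

wall = ((4.0, 0.0, 4.0), (6.0, 2.0, 6.0))
path = [(1, (0.0, 0.0, 0.0)), (2, (5.0, 1.0, 5.0))]
clipped = find_clipping(path, [wall])
```

Here the boss being beaten at tick 50 gets flagged because the gate was never opened first, and the position sample at tick 2 gets flagged for sitting inside the wall. Neither check knows anything about the game beyond a lookup table and some bounds, which is the whole point: the expensive model only has to generate the play data.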
The way it's going to start is that third-party companies dedicated to QA start feeding their own data and play tests into models to replicate and extend those behaviors, offering synthetic play testing as a cheap additional service to find low-hanging fruit and cut down on the human tester hours needed. Over time it will shift more and more towards synthetic testing.
You'll still have human play testers for broader quality questions like "is this fun" - but the bug QA that's already being outsourced is almost certainly going to go the way of AI replacing humans entirely, or nearly so.