Think about how often you see a question raised and then answered with "I don't know" in internet data.
Wikipedia, textbooks, actual books, forums, wikiHow...
The vast majority of the time, the format of what you see is:
<short question>?
<long winded answer or how to guide>
And that's what the LLMs are all trained on. That's the most common pattern, so LLMs replicate that pattern.
The majority of their training data is from wikis and textbooks, which pretty much never have "I dunno" anywhere. If an answer isn't known, the page simply doesn't exist in the first place.
They probably do use lots of NoSQL DBs too, which perform better for non-relational "data lake" style architectures where you just wanna dump mountains of data into storage as fast as possible, to be perused later.
When you have very, very high volume of data coming in, but very low need to query it (some potential need, just very low), NoSQL DBs excel.
Stuff like census data where you just gotta legally store it for historical reasons, and very rarely some person will wanna query it for a study or something.
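To make the "dump it now, maybe query it someday" idea concrete, here's a rough sketch. It's not any particular NoSQL product, just an append-only pile of JSON lines. The names `append_record` and `scan` and the census-ish fields are all made up for illustration, and the in-memory buffer is only a stand-in for cheap bulk storage.

```python
import io
import json

def append_record(sink, record):
    """Write path: append one schemaless record as a JSON line (no indexes, no validation)."""
    sink.write(json.dumps(record) + "\n")

def scan(sink, predicate):
    """Rare query path: read everything back and filter, i.e. a full scan."""
    sink.seek(0)
    matches = [rec for rec in map(json.loads, sink) if predicate(rec)]
    sink.seek(0, io.SEEK_END)  # put the cursor back at the end so later appends still work
    return matches

store = io.StringIO()  # hypothetical stand-in for bulk storage
append_record(store, {"household": 1, "region": "north", "size": 4})
append_record(store, {"household": 2, "region": "south"})  # fields can differ per record
append_record(store, {"household": 3, "region": "north", "size": 2})

# The once-in-a-blue-moon study query: slow, but nobody cares because it's rare.
north = scan(store, lambda r: r.get("region") == "north")
```

Writes are about as cheap as they can get, and the price is that every query is a full scan, which is fine when queries almost never happen.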
Keep in mind, when I talk about low need to query, the opposite (high need) is on the scale of, like, "this DB gets queried multiple times per minute".
Stuff like... logins to a website, data that gets queried many times per minute or even per second, that's where NoSQL DBs sometimes fall off.
Depends what is queried.
Super basic "lookup by ID" stuff that operates as just a big ole KeyValuePair mapping ID -> Value? And that's all you gotta query?
NoSQL is still the right tool for the job.
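Here's roughly what that "big ole ID -> Value mapping" access pattern looks like, as a toy sketch. `KeyValueStore` and the `user:42` key are hypothetical; a real key-value store adds persistence and distribution on top, but the shape of the queries is the same.

```python
class KeyValueStore:
    """Toy in-memory key-value store: the only operations are put and get by ID."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        # Single lookup by ID; no filtering, no combining records.
        return self._data.get(key, default)

users = KeyValueStore()
users.put("user:42", {"name": "ada", "last_login": "2024-01-01"})

profile = users.get("user:42")   # the stored value for that ID
missing = users.get("user:999")  # unknown ID falls back to the default (None)
```

As long as every question you ask is "give me the value for this ID", there's nothing here a relational engine would do better.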
The moment any kind of
enters the discussion though, chances are you actually wanna use SQL now.