this post was submitted on 14 Feb 2025
483 points (96.9% liked)

No Stupid Questions


I'm a tech interested guy. I've touched SQL once or twice, but wasn't able to really make sense of it. That combined with not having a practical use leaves SQL as largely a black box in my mind (though I am somewhat familiar with technical concepts in databasing).

With that, I keep seeing [pic related] as proof that Elon Musk doesn't understand SQL.

Can someone give me a technical explanation for how one would come to that conclusion? I'd love if you could pass technical documentation for that.

[–] 9point6 290 points 1 week ago (1 children)

The statement "this [guy] thinks the government uses SQL" demonstrates a complete and total lack of knowledge as to what SQL even is. Every government on the planet makes extensive and well-documented use of it.

The initial statement I believe is down to a combination of the above and also the lack of domain knowledge around social security. The primary key on the social security table would be a composite key of both the SSN and a date of birth—duplicates are expected of just parts of the key.

If he knew the domain, he would know this isn't an issue. If he knew the technology, he would be able to see the constraint and, after some investigation, reach the conclusion that it's not an issue.
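To make the composite-key idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample values are all made up for illustration, and nothing here reflects the real Social Security schema (which nobody in this thread has seen). It only shows that when the key is the pair (SSN, date of birth), a query on SSN alone can report "duplicates" while the key itself stays unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the primary key is the (ssn, date_of_birth) pair,
# not the ssn column on its own.
conn.execute("""
    CREATE TABLE person (
        ssn           TEXT NOT NULL,
        date_of_birth TEXT NOT NULL,
        name          TEXT,
        PRIMARY KEY (ssn, date_of_birth)
    )
""")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("000-00-0001", "1950-01-01", "Alice Example"),
        ("000-00-0001", "1982-06-15", "Bob Example"),  # same SSN, different DOB: allowed
        ("000-00-0002", "1975-03-09", "Carol Example"),
    ],
)

# Looking at the ssn column in isolation shows a "duplicate".
repeated_ssns = conn.execute(
    "SELECT ssn, COUNT(*) FROM person GROUP BY ssn HAVING COUNT(*) > 1"
).fetchall()

# The full key is still enforced: repeating an exact (ssn, dob) pair fails.
try:
    conn.execute(
        "INSERT INTO person VALUES (?, ?, ?)",
        ("000-00-0001", "1950-01-01", "Duplicate Row"),
    )
    key_violation = False
except sqlite3.IntegrityError:
    key_violation = True

print(repeated_ssns, key_violation)
```

Whether the real table is keyed this way is the commenter's claim, not something the sketch can confirm.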

The man continues to be a malignant moron

[–] spankmonkey 30 points 1 week ago* (last edited 1 week ago) (3 children)

The initial statement I believe is down to a combination of the above and also the lack of domain knowledge around social security. The primary key on the social security table would be a composite key of both the SSN and a date of birth—duplicates are expected of just parts of the key.

Since SSNs are never reused, what would be the purpose of using the SSN and birth date together as part of the primary key? I guess it is the one thing that isn't supposed to ever change (barring a clerical error) so I could see that as a good second piece of information, just not sure what it would be adding.

Note: if duplicate SSNs are accidentally issued, my understanding is that they issue a new one to one of the people. Also, I don't know how to find the start of the thread on Twitter, since I only use it when I accidentally click on a link to it.

https://www.ssa.gov/history/hfaq.html

Q20: Are Social Security numbers reused after a person dies?

A: No. We do not reassign a Social Security number (SSN) after the number holder's death. Even though we have issued over 453 million SSNs so far, and we assign about 5 and one-half million new numbers a year, the current numbering system will provide us with enough new numbers for several generations into the future with no changes in the numbering system.

[–] [email protected] 27 points 1 week ago (10 children)

Take this with a grain of salt as I'm not a dev, but do work on CMS reporting for a health information tech company. Depending on how the database is designed an SSN could appear in multiple tables.

In my experience reduplication happens as part of generating a report so that all relevant data related to a key and scope of the report can be gathered from the various tables.

[–] [email protected] 25 points 1 week ago* (last edited 1 week ago) (1 children)

A given SSN appearing in multiple tables actually makes sense. To someone not familiar with SQL (i.e. at about my level of understanding), I could see that being misinterpreted as having multiple SSN repeated "in the database".

Of all the comments so far, I find yours the most compelling.

[–] [email protected] 14 points 1 week ago* (last edited 1 week ago) (4 children)

Theoretically, yeah, that's one solution. The more reasonable thing to do would be to use the foreign key though. So, for example:

SSN_Table

ID | SSN | Other info

Other_Table

ID | SSN_ID | Other info

When you want to connect them to have both sets of info, it'd be the following:

SELECT * FROM SSN_Table JOIN Other_Table ON SSN_Table.ID = Other_Table.SSN_ID

EDIT: Oh, just to clear up any confusion, the SSN_ID in this simple example is not the SSN itself. To access that in this example query, it'd be SSN_Table.SSN
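For anyone who wants to run that, here is a rough sketch of the same two tables in an in-memory SQLite database via Python's sqlite3 module. The Info column and the sample rows are invented for the example; only the table names, the ID/SSN/SSN_ID columns, and the JOIN condition come from the comment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SSN_Table (
        ID  INTEGER PRIMARY KEY,
        SSN TEXT NOT NULL
    );
    CREATE TABLE Other_Table (
        ID     INTEGER PRIMARY KEY,
        SSN_ID INTEGER NOT NULL REFERENCES SSN_Table(ID),  -- foreign key, not the SSN itself
        Info   TEXT
    );
    INSERT INTO SSN_Table (ID, SSN) VALUES (1, '000-00-0001'), (2, '000-00-0002');
    INSERT INTO Other_Table (SSN_ID, Info) VALUES
        (1, 'address on file'),
        (1, 'name change'),
        (2, 'address on file');
""")

# Each SSN is stored exactly once...
stored_ssns = conn.execute("SELECT COUNT(*) FROM SSN_Table").fetchone()[0]

# ...but the joined result repeats it once per related row.
rows = conn.execute("""
    SELECT SSN_Table.SSN, Other_Table.Info
    FROM SSN_Table
    JOIN Other_Table ON SSN_Table.ID = Other_Table.SSN_ID
    ORDER BY Other_Table.ID
""").fetchall()

for row in rows:
    print(row)
```

So even with a properly normalized layout, the output of a query can show the same SSN several times.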

[–] schteph 21 points 1 week ago (3 children)

This is true, but there are many instances where denormalization makes sense and is frequently used.

A common example is a table that is frequently read. Instead of going to the "central" table the data is denormalized for faster access. This is completely standard practice for every large system.

There's nothing inherently wrong with it, but it can be easily misused. With SSN, I'd think the most stupid thing to do is to use it as the primary key. The second one would be to ignore the security risks that are ingrained in an SSN. The federal government, large as it is, surely has instances of both. However, since Musky is using his posse of young, arrogant brogrammers, I'm positively certain they're completely ignoring the security aspect.

[–] [email protected] 94 points 1 week ago (2 children)

Because SQL is everywhere. If Musk knew what it was, he would know that the government absolutely does use it.

[–] darkmarx 72 points 1 week ago (2 children)

"The government" is multiple agencies and departments. There is no single computer system, database, mainframe, or file store that the entire US government uses. There is no standard programming language used. There is no standard server configuration. Each agency is different. Each software project is different.

When someone says the government doesn't use SQL, they don't know what they are talking about. It could be referring to the fact that many government systems are ancient mainframe applications that store everything in VSAM. But it is patently false that the government doesn't use SQL. I've been on a number of government contracts over the years, spanning multiple agencies. MSSQL was used in all but one.

Furthermore, some people share SSNs; they are not unique. It's a common misconception that they are, but anyone working on government software learns this pretty quickly. The fact that it seems to be a big shock goes to show that he doesn't know what he is doing and neither do the people reporting to him.

Not only is he failing to understand the technology, he is failing to understand the underlying data he is looking at.

[–] [email protected] 12 points 1 week ago* (last edited 1 week ago) (9 children)

Yeah, obviously ol' boy is tripping if he thinks SQL isn't used in the government.

Big thing I'm prying at is whether there would be a legitimate purpose to have duplicated SSNs in the database (thus showing the Vice Bro doesn't understand how SQL works).

I'm not aware of any instance where two people share an SSN though. The Social Security Administration even goes as far as to say they don't recycle the SSNs of dead people (it's linked a couple of times in other comments, and Voyager doesn't let me save drafts of comments, so I'll make an edit to this comment with that link for you).

Can you point me to somewhere showing multiple people can share an SSN?

Edit: as promised: The Social Security FAQ page

[–] [email protected] 12 points 1 week ago (2 children)

Assuming the whole "duplicate SSN" thing isn't just a complete fabrication, we have no idea what table he was even looking at! A table of transactions e.g. would have a huge number of duplicate SSNs.
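As a quick illustration of that point, here is a small sketch with Python's sqlite3 module. The payment table, its columns, and the amounts are hypothetical; the only claim is that a table with one row per transaction will naturally repeat the same SSN many times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical transactions table: one row per payment, so SSNs repeat by design.
conn.execute("""
    CREATE TABLE payment (
        id           INTEGER PRIMARY KEY,
        ssn          TEXT NOT NULL,
        paid_on      TEXT NOT NULL,
        amount_cents INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO payment (ssn, paid_on, amount_cents) VALUES (?, ?, ?)",
    [
        ("000-00-0001", "2025-01-03", 150000),
        ("000-00-0001", "2025-02-03", 150000),
        ("000-00-0001", "2025-03-03", 150000),
        ("000-00-0002", "2025-01-03", 98000),
    ],
)

# A naive "find duplicate SSNs" query flags every repeat recipient.
duplicates = conn.execute(
    "SELECT ssn, COUNT(*) FROM payment GROUP BY ssn HAVING COUNT(*) > 1"
).fetchall()

# Counting distinct SSNs shows how many people are actually involved.
distinct_people = conn.execute(
    "SELECT COUNT(DISTINCT ssn) FROM payment"
).fetchone()[0]

print(duplicates, distinct_people)
```

The "duplicate" here is just one person being paid three times, which is exactly what such a table is for.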

[–] [email protected] 65 points 1 week ago (9 children)

Because a simple query would have shown that SSN was a compound key with another column (birth date, I think), and not the identifier he thinks it is.

[–] GaMEChld 55 points 1 week ago (10 children)

Because of course the government uses SQL. It's as stupid as saying the government doesn't use electricity or something equally stupid. The government is myriad agencies running myriad programs on myriad hardware with myriad people. My damned computers at home are using at least 2-3 SQL databases for some of the programs I run.

SQL is damn near everywhere where data sets are found.

[–] [email protected] 43 points 1 week ago* (last edited 6 days ago) (17 children)

It's because the comments he made are inconsistent with common conventions in data engineering.

  1. It is very common not to deduplicate data and instead just append rows. The current value is the most recent one, and all the old ones are simply historical. That way you don't risk losing data and you have an entire history.
    • Whilst you could do some trickery to deduplicate the data, it does create more complexity. There's an old saying with ZFS: "Friends don't let friends dedupe", and it's much the same here.
    • Compression is usually good enough. It will catch duplicated data and deal with it in a fairly efficient way. It's not as efficient as deduplication, but it's probably fine and it's definitely a lot simpler.
  2. Claiming the government does not use SQL
    • It's possible they have rolled their own solution or they are using MongoDB or something, but this would be unlikely and wouldn't really refute the initial claim.
    • I believe many other commenters noted that it probably is MySQL anyway.
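Here is a minimal sketch of the append-only pattern from point 1, using Python's sqlite3 module. The person_history table, its columns, and the sample names are assumptions made up for the example, not anything known about a real government system. Rows are only ever added, the newest row per SSN is treated as current, and the older rows remain as history.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical append-only table: updates are new rows, nothing is overwritten.
conn.execute("""
    CREATE TABLE person_history (
        id          INTEGER PRIMARY KEY,  -- grows with every appended row
        ssn         TEXT NOT NULL,
        name        TEXT NOT NULL,
        recorded_on TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO person_history (ssn, name, recorded_on) VALUES (?, ?, ?)",
    [
        ("000-00-0001", "Alice Example", "1990-05-01"),
        ("000-00-0002", "Bob Example", "1992-08-12"),
        ("000-00-0001", "Alice Sample", "2015-09-20"),  # name change, appended
    ],
)

# Current state: for each SSN, keep only the most recently appended row.
current = conn.execute("""
    SELECT h.ssn, h.name
    FROM person_history AS h
    WHERE h.id = (SELECT MAX(id) FROM person_history WHERE ssn = h.ssn)
    ORDER BY h.ssn
""").fetchall()

# Full history for one person is still available.
history = conn.execute(
    "SELECT name FROM person_history WHERE ssn = ? ORDER BY id",
    ("000-00-0001",),
).fetchall()

print(current)
print(history)
```

Deleting the "duplicate" SSN rows from a table like this would throw away either the current name or the history, depending on which row you kept.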

Basically what he said is ~~incoherent~~ inconsistent with typical practices among data engineers ~~to anybody who has worked with larger data.~~

In terms of using SQL, it's basically just a more reliable and better Excel that doesn't come with a default GUI.

If you need to store data, it's almost always best to throw it into a SQLite database because it keeps it structured. It's standardised and it can be used from any programming language.

However, many people use Excel because they don't have experience with programming languages.

Get ChatGPT to help you write a PyQt GUI for a SQLite database and I think you would develop a high-level understanding of how the pieces fit together.

Edit: @zalgotext made a good point.

[–] [email protected] 31 points 1 week ago* (last edited 1 week ago) (5 children)

To oversimplify, there are two basic kinds of databases: SQL (Structured Query Language, usually pronounced like "sequel" or spelled aloud) and NoSQL ("Not Only SQL").

SQL databases work as you'd imagine, with tables of rows and columns like a spreadsheet that are structured according to a fixed schema.

NoSQL includes all other forms of databases, document-based, graph-based, key-value pairs, etc.

The former are highly consistent and efficient at processing complicated queries or recording transactions, while the latter are more flexible and can be very fast at reads/writes but are harder to keep in sync as a result.

All large orgs will have both types in use for different purposes; SQL is better for banking needs where provable consistency is paramount, NoSQL better for real-time web apps and big data processing that need minimal response times and scalable capacity.

That Musk would claim the government doesn't use SQL immediately betrays him as someone who is entirely unfamiliar with database administration, because SQL is everywhere.

[–] SolidShake 26 points 1 week ago (1 children)

How come Republicans keep saying that doggy is going to expose all the fraud in the government, yet the biggest fraud, with 37 felonies, is president? What the actual fuck do these people think?

[–] [email protected] 21 points 1 week ago* (last edited 1 week ago) (2 children)

I think a lot of comments here miss the mark; it's not really just about stating the gov does not use SQL or speculation regarding keys.

Deduplication is generally part of a compression strategy and has nothing to do with SQL. If we're being generous he may have been talking about normalization, but no one I have ever met has confused the two terms (they are distinctly different from an engineering perspective).

There are degrees of normalization too, so it may make total sense to normalize to 3NF (third normal form) rather than, say, 6NF, depending on the data.

[–] P00ptart 21 points 1 week ago (1 children)

Everything they don't understand (which is nearly everything) is either God or fraud. Do with that information what you will.

[–] Lemminary 1 points 1 week ago

Well, here it's Cake or Death! Choose carefully.

[–] spankmonkey 19 points 1 week ago

If he doesn't think the government uses SQL after having his goons break into multiple government servers, he is an idiot.

If he is lying to cover his ass for fucking up so many things (the more likely explanation), then saying "he never used SQL" is basically a dig at how technically inept he really is despite bragging about being a tech bro.

[–] jacksilver 18 points 1 week ago (4 children)

If SSNs are used as a primary key (a unique identifier for a row of data), then they'd have to be repeated in other tables to be able to merge data together.

However, even if they aren't using SSN as an identifier because it's sensitive information, it's not uncommon to repeat data, whether for speed/performance, for simplicity in table design, because it's in a lookup table, or because you have disconnected tables.

Having a value repeated doesn't tell you anything about fraud risk, efficiency, or really anything. Using it as the primary piece of evidence for a claim isn't a strong argument.

[–] surewhynotlem 14 points 1 week ago (4 children)

Dedup is about saving storage and has literally nothing to do with primary keys.

[–] [email protected] 14 points 1 week ago

I'm still learning SQL, so if I'm out of line someone please correct me, but the gist of it is that SQL (Structured Query Language) is a language used in pretty much all relational databases, and something like the Social Security database is almost guaranteed to be one. Having duplicates of information in a relational database is not a sign of fraud, or of anything shady going on.

When you're born, your name, along with your SSN and any other relevant info, is put into the database. Later in life, say you change your name: the original name, along with your SSN, will stay there, and a new line would be added to the database with your new name, along with your SSN again (a duplicate). That way the database has a reference point between old and new name, and keeps all your information lined up between the two.

If you were to get rid of all of that duplicate information, it would cause chaos in the database for anyone who's ever had a name change, been married, etc., with hundreds of millions of entries that now have no relation to anything and are basically dead ends.

[–] [email protected] 12 points 1 week ago (3 children)

Rows in a SQL table have a primary key which works as the unique identifier for that row. The primary key can be as simple as an incrementing number.
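A small sketch of that, again with Python's sqlite3 module and a made-up person table: the id column is an incrementing number that the database assigns and keeps unique, while the ssn column is ordinary data that is free to repeat. None of the names or values come from a real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: id is the primary key, ssn is just a regular column.
conn.execute("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,  -- SQLite assigns the next integer automatically
        ssn  TEXT NOT NULL,
        name TEXT NOT NULL
    )
""")

first_id = conn.execute(
    "INSERT INTO person (ssn, name) VALUES (?, ?)",
    ("000-00-0001", "Alice Example"),
).lastrowid
second_id = conn.execute(
    "INSERT INTO person (ssn, name) VALUES (?, ?)",
    ("000-00-0001", "Alice Sample"),  # same SSN again: fine, it is not the key
).lastrowid

# Reusing an existing primary key value is what the database refuses.
try:
    conn.execute(
        "INSERT INTO person (id, ssn, name) VALUES (?, ?, ?)",
        (first_id, "000-00-0003", "Someone Else"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(first_id, second_id, rejected)
```

The uniqueness guarantee lives on the key, so a repeated value in any other column says nothing by itself about the integrity of the table.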

[–] [email protected] 10 points 1 week ago

Musk is the walking Dunning-Kruger effect; he is too stupid to realize how terrible he sounds.

[–] [email protected] 10 points 1 week ago (11 children)

I saw a comment about this in the last couple of days that was really interesting and educational. Unfortunately I can't seem to find it again to link it, but the gist of it was that there would be two things wrong with using SSNs as primary keys in a SQL database:

  • You should not use externally generated data as primary keys
  • You should not use personally identifying data as primary keys

Using SSNs as keys would violate both.

I went looking for best practices regarding SQL primary keys and found this really interesting post and discussion on Stack Overflow:

https://stackoverflow.com/questions/337503/whats-the-best-practice-for-primary-keys-in-tables

My first thought was that people's SSNs can and do change, and sometimes (rarely?) people may have more than one SSN. Like someone mentions in that link, human error would be another reason why you would not want to use external data and particularly SSNs as primary keys.

[–] jj4211 10 points 1 week ago

Frankly, the whole exchange sounds like Hollywood tech jargon: vaguely relevant words used in a not quite sensible way....

[–] [email protected] 10 points 1 week ago (1 children)

Might seem like a stupid question, but I'm in nostupidquestions sooo... Did Elon really do this tweet with the word "retard" in it? Obviously am on Lemmy so don't use Twitter.

[–] [email protected] 11 points 1 week ago (1 children)

Yep, just another example of what a trash human being he is.
