There are over 600 lines of test code for every line of code in SQLite
Holy mother of god
And they still find new bugs
A limitation of testing is that you can only write tests for cases that you can think of, and cases you can think of ways to write tests for.
It's still valuable despite this limitation, of course.
That's not entirely true, e.g. you can do fuzz testing or constrained random testing. Maybe you aren't including those in "testing"?
I was mostly thinking about hand-written tests and manual test procedures, but yeah, fuzzing can help you catch issues as well and you don't necessarily consciously know about the test cases you put into the system in that case.
Then again, you have to design the fuzzing input consciously so I guess that's kind of a "what you can think about"-limitation.
Good point regardless, thanks
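For anyone curious what the constrained random testing mentioned above could look like, here is a minimal sketch using Python's built-in sqlite3 module. It is not how the SQLite project tests itself: the function name `fuzz_order_by`, the table `t`, and the choice of `ORDER BY` as the property under test are all assumptions made for illustration. It also shows the "you still have to design the input" limitation, since the generator only ever produces small lists of integers.

```python
import random
import sqlite3


def fuzz_order_by(trials=200, seed=0):
    """Feed random integer lists to SQLite and compare ORDER BY with sorted().

    Returns the first input that produced a mismatch, or None if none did.
    (Hypothetical helper, written for this illustration only.)
    """
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER)")
    for _ in range(trials):
        # The input space is a conscious design choice: 0 to 50 integers.
        values = [rng.randint(-10**6, 10**6) for _ in range(rng.randint(0, 50))]
        conn.execute("DELETE FROM t")
        conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
        got = [row[0] for row in conn.execute("SELECT v FROM t ORDER BY v")]
        if got != sorted(values):  # Python's sort acts as the reference model
            return values
    return None


failing_input = fuzz_order_by()
```

If `failing_input` comes back as something other than `None`, you have a reproducible case that nobody had to think of by hand.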
You can write shit tests. Finding new bugs doesn't surprise me. Putting that much effort in does, but 600:1? That's some serious red flags there. There are only so many variables in a single line of code. How many unhappy paths can there be for a single line?
SQLite is one of the best tested codebases in existence. Having only so many variables per line means nothing
I had no idea the maintainers of sqlite were religious fanatics.
As long as they don't go on a holy crusade or forcefully evangelize the entire world by genocide I wouldn't call them fanatics.
That's a pretty high bar for fanatic. This list definitely smacks of fanaticism to me and it was derived from rules for monks. We could squabble over where to draw the line for "fanatic" but either way they're very religious.
I get it...I've never been the maintainer of a codebase that's deployed on trillions of devices, and backwards compatibility is something to be taken seriously and responsibly when you're that prolific. I do not begrudge SQLite or any large projects when they make decisions in service to that.
However
It always makes me feel oddly icky when known bugs (particularly of the footgun variety) become the new standard that the project intentionally upholds.
It's not on trillions of devices, just billions. But e.g. a typical Android phone has 1000s of SQLite DBs for different purposes.
You’re right, that’s a distinction I failed to make
Another one not mentioned there: SQLite is really tiny (from https://sqlite.org/faq.html#q18 ):
The default configuration of SQLite only supports case-insensitive comparisons of ASCII characters. The reason for this is that doing full Unicode case-insensitive comparisons and case conversions requires tables and logic that would nearly double the size of the SQLite library.
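A quick way to see the ASCII-only behavior from the FAQ quote, sketched with Python's built-in sqlite3 module. The helper name `like` is made up for this example. The non-ASCII result depends on how your SQLite library was built: a default build should report no match, while a build with the ICU extension enabled may report a match, so treat that second value as something to check on your own machine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def like(text, pattern):
    """Return SQLite's result for `text LIKE pattern` (hypothetical helper)."""
    return conn.execute("SELECT ? LIKE ?", (text, pattern)).fetchone()[0]


# ASCII letters compare case-insensitively under LIKE by default.
ascii_match = like("a", "A")

# Non-ASCII letters: expected to be 0 on a default build (no Unicode case
# folding), but may be 1 if the library was compiled with ICU support.
non_ascii_match = like("ä", "Ä")

print("a LIKE A:", ascii_match)
print("ä LIKE Ä:", non_ascii_match)
```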
Why do we even need a server? Why can’t I pull this directly off the disk drive? That way if the computer is healthy enough, it can run our application at all, we don’t have dependencies that can fail and cause us to fail, and I looked around and there were no SQL database engines that would do that, and one of the guys I was working with says, “Richard, why don’t you just write one?” “Okay, I’ll give it a try.” I didn’t do that right away, but later on, it was a funding hiatus. This was back in 2000, and if I recall correctly, Newt Gingrich and Bill Clinton were having a fight of some sort, so all government contracts got shut down, so I was out of work for a few months, and I thought, “Well, I’ll just write that database engine now.”
Gee, thanks Newt Gingrich and Bill Clinton?! Government shutdown leads to actual production of value for everyone instead of just making a better military vessel.
Their commitment to backwards compatibility, to the point of keeping a known bug that allows primary keys to be null, is both amazing and "wtf".
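Here is a small sketch of the NULL primary key behavior, using Python's built-in sqlite3 module. The table names `loose` and `tight` are made up for the example. As far as I know this applies to ordinary rowid tables whose primary key is not an `INTEGER PRIMARY KEY`; adding an explicit `NOT NULL` is one way to opt out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Ordinary table with a non-INTEGER primary key: NULL keys are accepted,
# and more than one row can have a NULL key.
conn.execute("CREATE TABLE loose (id TEXT PRIMARY KEY, note TEXT)")
conn.execute("INSERT INTO loose VALUES (NULL, 'first')")
conn.execute("INSERT INTO loose VALUES (NULL, 'second')")
null_rows = conn.execute(
    "SELECT count(*) FROM loose WHERE id IS NULL"
).fetchone()[0]

# Same shape, but with an explicit NOT NULL on the key column.
conn.execute("CREATE TABLE tight (id TEXT PRIMARY KEY NOT NULL, note TEXT)")
try:
    conn.execute("INSERT INTO tight VALUES (NULL, 'rejected')")
    rejected = False
except sqlite3.IntegrityError:
    # The NOT NULL constraint fires before the row is stored.
    rejected = True

print(null_rows, rejected)
```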
You can do backwards compatibility and make breaking changes to fix bugs. All you need is an opt-in "target version". CMake and Android are good examples of this.
I'm glad to see I've been pronouncing it right all these years.
ass-keh-leet
Unfortunately, you're both wrong 🙂
It’s pronounced “gif”
like the peanut butter?
SKIPPY
like the clear soda?
Yeah I was really surprised by that. Surely "sequelite", given SQL is commonly pronounced "sequel" (cf. PRQL).
Hmm, well... I have never murdered anyone, not even once! Is that good enough for their Code of Ethics?
So nerdy, so good
Here's a fun fact not noted in the article: Temporary files in SQLite are named etilqs_something in order to prevent people from contacting the SQLite developers for support when other applications (specifically, McAfee) have decided to dump and not prune temp files.
Source: https://github.com/sqlite/sqlite/blob/95f6df5b8d55e67d1e34d2bff217305a2f21b1fb/src/os.h#L57
Here’s a fun fact not noted in the article:
It's #19 in the article.
Well, I can't read I guess.
At least I linked to the code, since the article doesn't seem to do that. The twitter thread it linked to probably does, but I can't view the replies without logging in.
At least I linked to the code,
I appreciate that. :)
Just to ponder about the first point, about the number of SQLite databases:
SQLite does not have a daemon like MySQL/MariaDB, PostgreSQL, SQL Server and others. While it would be theoretically and technically possible to count, for example, how many MySQL servers there are (by discovering, mapping and counting MySQL daemons on the internet, except for daemons running behind firewalls, or on air-gapped systems, or daemons that are using UNIX sockets instead of TCP ports), counting SQLite databases seems impractical.
Each SQLite database is a file, on a (often private and unexposed) file system. A single project can use multiple SQLite files simultaneously. Should we consider each of these files as a separate database, or as the same database?
And then there are things like WhatsApp and Chromium, for example. WhatsApp uses a few encrypted SQLite files to store the various WhatsApp things like chats, E2EE keys, etc. Chromium uses a few SQLite files to store things like browsing history, session storage, local storage, cookies, and more. Each of these could be counted as a separate "database", which would require counting how many users WhatsApp has, how many users the different Chromium derivatives have, and multiplying that by how many different databases each of those platforms uses. And there are many other projects that also use SQLite, with a large number of users, which consequently results in a large number of different databases.
This article is written as though it is targeting a FOSS newbie or something -- a weird mix of jargon and simple language designed to overawe someone.
Their VCS is at least as interesting as SQLite :)