this post was submitted on 29 Mar 2024
SQL
Have you considered raw queries (properly parametrized and escaped and all that)? You can’t beat the speed of getting exactly what you need with no overhead.
What you’re describing is basically a fancy ORM.
I'm not sure if what I use is proper enough in your sense. So, can you elaborate more?
Parameterization: https://cheatsheetseries.owasp.org/cheatsheets/Query_Parameterization_Cheat_Sheet.html
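A minimal sketch of what parameterization means in practice, using Python's built-in sqlite3 module. The `users` table and the `find_user` helper are made up for illustration and are not from the linked cheat sheet:

```python
import sqlite3

# Hypothetical in-memory table, just to have something to query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))


def find_user(conn, name):
    # The value is bound separately from the SQL text, so it is treated
    # as data and never parsed as SQL, whatever characters it contains.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


# A classic injection attempt is just an odd-looking name here.
hostile = "alice' OR '1'='1"
normal_rows = find_user(conn, "alice")
hostile_rows = find_user(conn, hostile)
```

With string concatenation the second call would have matched every row; with a bound parameter it matches none.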
Yes, I've used this.
My #1 issue with raw sql is it's just absolutely a nightmare to maintain.
I simply just can't easily, at a glance, do something as simple as "give me the list of every single chunk of code that touches this column on this table", which is like, 80% of my start points for debugging an error showing up on our backend.
"We sometimes get NULL being set on this column that should no longer be NULL if (other column) is getting set, can you investigate how that is happening?"
If you have an application that uses raw sql, simply just step 1 of "find all backend code that touches that column" is already 100x more effort than it should be, and that's even on a well maintained project.
If the sql is even slightly poorly maintained (and since you are tasking BE devs, who work in some language other than sql, with maintaining SQL, it very often is very poorly maintained), it's often just shoved as raw magic strings in the middle of their code, sometimes even generated dynamically.
At which point it's just a fucking nightmare to figure out what the fuck is writing to that column.
With an ORM, the issue suddenly becomes as easy as clicking the "find references" button on the field for that column and, boom, all bits of code that touch that field in any way are now listed out for you, ez.
If you’re at the point where you need the performance boost from raw SQL over an ORM, you have solutions for these problems such as a well-maintained, centralized interface or store for SQL.
OP wasn’t asking for best principles, OP was asking for a language that replaces ORMs. That’s SQL directly in your code.
You can still do that.
For example, you'd still write classes for your tables:
and then you'd just do
That lets you write raw sql about as close as it gets, while still having some degree of type-safety. You could drop a query like that into Dapper, and you're pretty close to just using raw sql.
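A rough sketch of that idea, written in Python with sqlite3 rather than C# and Dapper. The `Users` class, its column constants, and the `mark_verified_sql` helper are all hypothetical names chosen for illustration:

```python
import sqlite3


class Users:
    """Hypothetical table description; every name here is made up."""
    TABLE = "users"
    ID = "id"
    EMAIL = "email"
    VERIFIED_AT = "verified_at"


def mark_verified_sql():
    # Identifiers come from the constants above; values stay as bound
    # parameters, so the query is still plain parameterized SQL.
    return (
        f"UPDATE {Users.TABLE} "
        f"SET {Users.VERIFIED_AT} = :now "
        f"WHERE {Users.ID} = :id"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    f"CREATE TABLE {Users.TABLE} ("
    f"{Users.ID} INTEGER PRIMARY KEY, "
    f"{Users.EMAIL} TEXT, "
    f"{Users.VERIFIED_AT} TEXT)"
)
conn.execute(
    f"INSERT INTO {Users.TABLE} ({Users.EMAIL}) VALUES (?)",
    ("a@example.com",),
)
conn.execute(mark_verified_sql(), {"now": "2024-03-29", "id": 1})
row = conn.execute(
    f"SELECT {Users.VERIFIED_AT} FROM {Users.TABLE} WHERE {Users.ID} = ?",
    (1,),
).fetchone()
```

Because every query names the column through `Users.VERIFIED_AT`, an editor's "find references" on that attribute should list the raw SQL that touches it, as long as nobody types the column name as a bare string somewhere else.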
I don't see why I'd do that over
Which will produce pretty much the exact same sql under the hood but be 100x easier to read, maintain, and debug.
Because just Dapper will perform a lot better executing raw sql queries than EF having to go through an entire expression tree builder.
Anyway, I wasn't saying that that example is a better way than doing it with EF, I was just going over your points where you mentioned that with raw SQL it's just all unreferenced magic strings with no references to tables or columns. And that you can't find where anything is used.
So that's just to explain - if you write your sql inside code in the poorest possible way - yea, you're gonna have a poor experience. But if you want to write raw sql instead of using an ORM, it's pretty easy to negate all those downsides about not having references.
I'd like to see some benchmarks on truly how much this difference matters when running on the cloud.
I expect latency alone between the App<->Db will dwarf whatever microseconds your raw sql would save, to the point that the saving is hard to distinguish from the chaos of latency variance.
There are lots of ways to find out what code touches a column. For example, if the code is deployed as stored procedures, you can easily query the text of all stored procedures for references to that column. If it's not deployed that way (maybe in a Git repo somewhere), it's still possible to search that text for the references.
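As a minimal sketch of that kind of text search, assuming the procedure or file text has already been loaded into a dict: where it comes from depends on the setup (SQL Server, for instance, exposes procedure definitions through `sys.sql_modules`). The `find_column_references` helper and the sample procedures are hypothetical:

```python
import re


def find_column_references(sources, column):
    """Return the names of sources whose text mentions `column` as a whole word."""
    # \b keeps "verified_at" from matching inside "verified_at_source".
    pattern = re.compile(rf"\b{re.escape(column)}\b", re.IGNORECASE)
    return sorted(name for name, text in sources.items() if pattern.search(text))


# Hypothetical procedure names mapped to their SQL text.
procs = {
    "set_verified": "UPDATE users SET verified_at = CURRENT_TIMESTAMP WHERE id = @id",
    "rename_user": "UPDATE users SET name = @name WHERE id = @id",
    "audit": "SELECT id, verified_at_source FROM audit_log",
}

hits = find_column_references(procs, "verified_at")
```

This is a plain text match, so a column with the same name on a different table would also show up, and dynamically built SQL can slip through.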
But the problem you describe wouldn't be present if you had good documentation. If developers (front end, back end, and data alike) were able to create documentation that detailed what their code does, and you maintained a knowledge base or data governance platform (like Collibra, though even a wiki would do), you could simply click on the field name and immediately see every article or code reference that uses it as one of their attributes.
Good documentation is all I'm saying. It just usually doesn't exist because the bean counters don't prioritize time to create it, and the developers commonly don't want to (though they'll complain about the lack of it later) or aren't trained to do it effectively.
Yeah considering that:
You are asking backend devs that are specialized in (BE language that isn't SQL) to maintain documentation on SQL code.
Also, wikis etc. are even worse to try and search on; I'd argue this solution will be even worse than just grepping the codebase.
Doesn't matter, because you know how I do it with my ORM?
I hit F12 and that's it, I get a list brought up of all code that touches that column, with no false positives, in the same IDE + LSP I use to do my backend code.
ORMs take moments to find the exact BE code that matters.
SQL takes minutes to actually find wtf calls what calls what.
If you have BE code that calls a stored proc that calls a stored proc that calls a stored proc that runs against a view of a view of a table (I've seen this sort of shit on very old long maintained large codebases a few times) it can take you hours just to work out the exact chain of what calls what to figure out how a table got to be a specific way at some point.
There is no way to maintain that sort of pile of code easily, you have to spend a tonne of extra time writing documentation just to even approach "not a total nightmare at least..."
No. Thanks.
It's really not that hard. I can do it in under 30 seconds from memory because I am a data dude. If you're not, that's no big deal; just find one who can help you with it. Projects don't have to be a single person; they can be composed of multiple individuals, each with different specialties. If you want to work alone, learn all the specialties so you can do it, too, instead of whining about how hard it is.
I'm not asking anyone to do anything. I'm saying that if there was good documentation, we'd all have an easier time. You can't deny that. Is it gonna be easy to create and maintain all that documentation? Not necessarily. Will it make our lives easier down the road if we do, though? Yeah, probably.
And by the way, as a SQL Dev, I create documentation on my SQL deployments. I don't expect the BEs to do it because, as you stated, they don't know it as well as I do. I created it for them so they can have an understanding of the working parts they don't know like the back of their hand, and I create it for me so I can explain it later because I don't have every line of code I've ever written memorized and I might not be the next one who needs to work on it.
I'll admit I haven't used a lot of ORMs because I'm old-school. If it has all the functionality you describe, that's great! My point was that you were quick to disparage the SQL development process and its nuances. Not to mention the fact that the data design is, itself, often nuanced and detailed beyond just what a look-up tool can tell you. Data experts like myself exist so we can explain what's going on, not just paint a picture of it. You can see what picture the puzzle makes, but you need someone who knows how all the pieces go together, don't you?
You're absolutely right. This is bad design. It should not be done this way. However, if you have an intelligent and creative Data Expert, you'll get an easily designed solution that you can use over and over and requires minimal effort to maintain and update as the needs change. It comes down to who built it and what skills they had. Categorically believing that all data delivery solutions are terrible because of this one kind of experience is a logical fallacy.
Look, that's fine. If you don't wanna touch it, don't touch it. Let data experts touch it. I've designed database systems for IBM, Nike, and Amazon AWS, and I was singularly capable of doing so because of my background. None of the other hundreds of developers on the dozens of Agile teams were able to do that work because they all had other specialties and other jobs and just needed data to be delivered to them quickly, efficiently, and in a manner that was easy to digest and utilize. That was my job as the Data Expert. If you don't have one but you need one, you need to become one. And if you were one, you wouldn't be disparaging the trade so much.
The moment it involves me having to leave the context of my workspace, go to some other workspace, and then try and connect the dots between the two without there being any existing solutions to aid that, it's never going to be easy.
That's just a cold hard fact. There's no fluid LSP mechanism to context switch between the SQL and my backend code. Full stop.
If I have BE code that invokes a stored proc, I can't just see the definition of "wtf does that stored proc do?" from within my workspace.
I definitely can't just hit a button at this time to just switch to its definition.
Also, the gap between what my DB thinks the stored proc is and what the code says it does is always a big wrench in the gears. I despise the entire concept of "well the code defines it as this but my local db is out of date".
The concept of literally switching git branches is all it takes for my codebase to say one thing and my db to say another.
You can't reconcile that without tonnes of extra work.
And add in the fact that this problem can layer up with stored procs calling stored procs, DB schema changing...
I just am not interested in a stack where I have to maintain 2 entirely separate chains of distinct sources of truth, especially when the latter isn't even kept under Version Control.
Like I said, if you actually have a DB Engineer, use them.
This may surprise you, but most projects just don't have one, nor can they afford one.
At smaller companies looking to cut costs, having a BE dev is necessary, but if the need for a DB Engineer can be replaced by a BE Dev just using an ORM for now, then I think it's pretty straightforward which of the two roles will be hired and which will be passed on.
Until you get to a fairly large scale, actually having dedicated DB engineers simply just isn't part of most companies' strategies.
Do I wish I had another entire person who could just handle it for me?
Sure.
But it's not gonna happen if I'm not working on FAANG scale projects. I also wish I had a million dollars right now, but that's just not reality.