this post was submitted on 29 Aug 2024
153 points (91.4% liked)

Taco Bell Programming

by Ted Dziuba on Thursday, October 21, 2010

Every item on the menu at Taco Bell is just a different configuration of roughly eight ingredients. With this simple periodic table of meat and produce, the company pulled down $1.9 billion last year.

The more I write code and design systems, the more I understand that many times, you can achieve the desired functionality simply with clever reconfigurations of the basic Unix tool set. After all, functionality is an asset, but code is a liability. This is the opposite of a trend of nonsense called DevOps, where system administrators start writing unit tests and other things to help the developers warm up to them - Taco Bell Programming is about developers knowing enough about Ops (and Unix in general) so that they don't overthink things, and arrive at simple, scalable solutions.

Here's a concrete example: suppose you have millions of web pages that you want to download and save to disk for later processing. How do you do it? The cool-kids answer is to write a distributed crawler in Clojure and run it on EC2, handing out jobs with a message queue like SQS or ZeroMQ.

The Taco Bell answer? xargs and wget. In the rare case that you saturate the network connection, add some split and rsync. A "distributed crawler" is really only like 10 lines of shell script.
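As a rough illustration of the xargs-and-wget idea, here is a minimal sketch; it is not the author's actual script. The function names `fetch_all` and `shard_urls` and the file names in the usage note are assumptions made up for this example; only the tools themselves come from the post. The `-0` and `-P` options to xargs are GNU/BSD extensions rather than POSIX.

```shell
#!/bin/sh
# Sketch only. fetch_all, shard_urls, and the file names in the usage
# note are hypothetical; the post names only the tools.

# fetch_all URL_FILE OUT_DIR [JOBS]
# Download every URL listed in URL_FILE (one per line) into OUT_DIR,
# running up to JOBS wget processes at a time (default 8).
fetch_all() {
    url_file=$1
    out_dir=$2
    jobs=${3:-8}
    mkdir -p "$out_dir" || return 1
    # NUL-delimit the list so xargs does not interpret quotes in URLs.
    # xargs: -n1 = one URL per wget, -P = number of parallel processes.
    # wget:  -q = quiet, -x = recreate host/path dirs, -P = output dir.
    tr '\n' '\0' < "$url_file" |
        xargs -0 -n1 -P"$jobs" wget -q -x -P "$out_dir"
}

# shard_urls URL_FILE N PREFIX
# Split URL_FILE into about N line-aligned chunks named PREFIXaa,
# PREFIXab, ... so that each chunk can be copied to another machine.
shard_urls() {
    total=$(wc -l < "$1")
    per=$(( (total + $2 - 1) / $2 ))
    [ "$per" -gt 0 ] || per=1
    split -l "$per" "$1" "$3"
}
```

Calling `fetch_all urls.txt pages 16` would run 16 downloads at a time. For the multi-machine case, `shard_urls urls.txt 4 chunk.` produces chunk files that could be copied out with rsync, with each host running `fetch_all` on its own chunk.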

Moving on, once you have these millions of pages (or even tens of millions), how do you process them? Surely, Hadoop MapReduce is necessary, after all, that's what Google uses to parse the web, right?

Pfft, fuck that noise:

find crawl_dir/ -type f -print0 | xargs -n1 -0 -P32 ./process

32 concurrent parallel parsing processes and zero bullshit to manage. Requirement satisfied.
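The post never shows what `./process` does, so the following is a hypothetical stand-in meant only to show the contract that `xargs -n1` implies: the worker receives exactly one file path and signals failure through its exit status. The word-count body, the `.wc` suffix, and the helper names `write_worker` and `run_all` are assumptions for this sketch.

```shell
#!/bin/sh
# Sketch only: the real ./process is not shown in the post. This
# stand-in just counts words; the .wc suffix is a made-up convention.

# write_worker PATH: create an executable worker script at PATH.
write_worker() {
    cat > "$1" <<'EOF'
#!/bin/sh
# Contract implied by "xargs -n1": exactly one input file per run.
# Exit non-zero on failure so xargs can report it.
in=$1
[ -r "$in" ] || exit 1
wc -w < "$in" > "$in.wc"
EOF
    chmod +x "$1"
}

# run_all DIR WORKER [JOBS]: the post's pipeline, parameterized.
run_all() {
    # Skip earlier results so a rerun does not process its own output.
    find "$1" -type f ! -name '*.wc' -print0 |
        xargs -0 -n1 -P"${3:-32}" "$2"
}
```

After `write_worker ./process`, running `run_all crawl_dir ./process || echo "some files failed"` is equivalent to the one-liner above plus a failure check, since xargs exits non-zero when any worker invocation fails.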

Every time you write code or introduce third-party services, you are introducing the possibility of failure into your system. I have far more faith in xargs than I do in Hadoop. Hell, I trust xargs more than I trust myself to write a simple multithreaded processor. I trust syslog to handle asynchronous message recording far more than I trust a message queue service.

Taco Bell programming is one of the steps on the path to Unix Zen. This is a path that I am personally just beginning, but it's already starting to pay dividends. To really get into it, you need to throw away a lot of your ideas about how systems are designed: I made most of a SOAP server using static files and Apache's mod_rewrite. I could have done the whole thing Taco Bell style if I had only manned up and broken out sed, but I pussied out and wrote some Python.

If you don't want to think of it from a Zen perspective, be capitalist: you are writing software to put food on the table. You can minimize risk by using the well-proven tool set, or you can step into the land of the unknown. It may not get you invited to speak at conferences, but it will get the job done, and help keep your pager from going off at night.

[–] [email protected] 82 points 4 months ago (3 children)

This works until you scale the team beyond 1 person and someone else needs to decipher the 30 line awk | sed | xargs monstrosity you created. Give me a real programming language any day.

[–] Dran_Arcana 15 points 4 months ago* (last edited 4 months ago)

It's fuckin' art though

[–] [email protected] 14 points 4 months ago (1 children)

If you use multi-line commands, and you use bash enough, it starts to look like any other language.

[–] teft 14 points 4 months ago

I don’t see multi-line commands anymore.

[–] jimmy90 2 points 4 months ago

same principle could be applied within a language or vertical like web dev

[–] geekworking 48 points 4 months ago (1 children)

This is great until your job outgrows a single computer or you want to have redundancy. Also, chains of bash tools don't have the best error management when something chokes in one of the middle steps in the pipe. You can still leverage simple bash tools for a lot of the under-the-hood stuff, but you start needing something more to act as the glue pretty quickly when you scale. KISS should still apply.
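To make the error-management point concrete, here is a small sketch, assuming bash. The `fetch` and `parse` functions are invented stages for the example, not real tools. By default a pipeline reports only the exit status of its last command, so a failure in an earlier stage goes unnoticed unless `pipefail` is enabled or `PIPESTATUS` is inspected.

```shell
#!/usr/bin/env bash
# Sketch only: fetch and parse are hypothetical stages, not real tools.

fetch() { echo "partial data"; return 3; }   # emits output, then fails
parse() { cat >/dev/null; }                  # consumes input, succeeds

# Default behavior: the pipeline's status is that of the last stage,
# so the failure in fetch is hidden.
fetch | parse
default_status=$?

# PIPESTATUS (a bash array) holds the status of every stage of the
# most recent pipeline; copy it immediately before it is overwritten.
fetch | parse
stage_statuses=("${PIPESTATUS[@]}")

# With pipefail, the pipeline fails if any stage fails.
set -o pipefail
fetch | parse
pipefail_status=$?
set +o pipefail

echo "default=$default_status stages=${stage_statuses[*]} pipefail=$pipefail_status"
```

Run under bash, this prints `default=0 stages=3 0 pipefail=3`: the failure is invisible by default and only surfaces once the script asks for it.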

[–] vinniep 8 points 4 months ago (1 children)

> This is great until

I think that's the point. Don't jump to the complex right away. Keep it simple and compose the capabilities you have readily available until you need to become more complex. When the task requires it, yeah, do the complex thing, but keep the simplicity mandate in mind and only add the new complexity that you need. You can get pretty far with the simple, and what about all of the situations where that future pivot or growth never happens?

The philosophy strikes a chord with me - I'm often contending with teams that are building for the future complexities that they think might come up, and we realize later that we did get complexity in the problem, but not the kind we had planned for. All of that infrastructure and planning was wasted on an imaginary problem, and it not only didn't help us but often actually made our task harder. The trick is to keep the solution set composable and flexible so that if complexity shows up later, we can reconfigure and build the new capabilities that we need rather than having to maneuver a large complicated system that we built on a whiteboard before we really knew what the problem looked like.

[–] [email protected] 4 points 4 months ago (1 children)

> Don't jump to the complex right away

It's more complex to have 10 different ways to do the same thing. Like, just take a week to teach your ops team how to use Docker and Kubernetes, so everything can be simplified to just one Kubernetes cluster instead of 20 bespoke EC2 instances.

[–] vinniep 1 points 4 months ago

I absolutely agree, but you're talking about a situation where we already have 10 different ways and 20 EC2 instances. When you get to that point (or start approaching it), yeah, do the complex thing - no argument at all. The challenge is to wait until the last responsible moment to make that pivot and to not dive deeper into the complexity than you need at the current time and place. I've worked with countless small companies and teams in the past that have created whole K8s clusters, Terraform provisioning plans, and the whole kit for a single low volume service because "we'll need it when things scale out later" and later never arrives.

[–] False 39 points 4 months ago

Truly a Taco Bell-level take.

[–] DocMcStuffin 38 points 4 months ago (1 children)

And just like Taco Bell when something goes bad you get to deal with all the diarrhea.

But seriously, shouldn't this be in [email protected] and not technology?

[–] [email protected] 9 points 4 months ago (1 children)

I get the feeling like half the people here are programmers or in a related field anyway.

[–] [email protected] 5 points 4 months ago (1 children)

I mean some of us aren't programmers...

I think...

... Looks around...

...yet.

[–] [email protected] 8 points 4 months ago

https://www.factorio.com/

One of us, one of us, one of us

[–] friend_of_satan 19 points 4 months ago (1 children)

Related standup comedy from 1996, Jim Gaffigan: https://youtu.be/SLaltfyTEno

[–] [email protected] 6 points 4 months ago* (last edited 4 months ago)

Love ol' Jim. My stepdad and I caught him for his 2014 tour and it's been one of the highlights of both our live entertainment experiences lol that and seeing Rush in 2012.

Edit: "live" entertainment lol... Gotta love mobile.

[–] eager_eagle 18 points 4 months ago* (last edited 4 months ago)

I've hacked plenty of bash aliases, functions, and scripts using coreutils myself; but sometimes you need something a bit more robust when it comes to error handling, retrying, maintainability, and an actually distributed solution instead.

xargs on its own might be more resilient than a distributed crawler, as one would expect, but if I'm tasked with building a distributed data processing pipeline I want more guarantees from the system as a whole, not only from its individual building blocks.

The time and effort put into embedding these guarantees in hacked shell scripts running on a dozen machines might be better invested into building a more solid foundation instead.
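For a sense of what embedding one of these guarantees in shell looks like, here is a minimal retry wrapper. `retry` and `RETRY_DELAY` are names invented for this sketch, and it covers only retrying, not distribution or whole-system guarantees.

```shell
#!/bin/sh
# Sketch only: retry and RETRY_DELAY are hypothetical names.

# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_ATTEMPTS runs have failed.
# Returns 0 on success, 1 once the attempts are exhausted.
retry() {
    max=$1
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "retry: giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        # Fixed pause between attempts; set RETRY_DELAY=0 to disable.
        sleep "${RETRY_DELAY:-1}"
    done
}
```

A call such as `retry 3 wget -q -x -P pages "$url"` would try a download up to three times. Every further guarantee (backoff, idempotent outputs, tracking which machine owns which chunk) adds more of this kind of code, which is the trade-off the comment describes.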

[–] [email protected] 13 points 4 months ago

So... When is someone going to start making logic gates out of taco bell ingredients?

[–] shalafi 11 points 4 months ago (1 children)

Send this to my last DevOps team. Jesus. They got a Rube Goldberg machine in place of sane infra. Shit's so complicated, I know only one guy with his finger on the pulse, and that's because he built a ton of it over 18 years!

[–] [email protected] 9 points 4 months ago

That guy: wow, these people are such suckers... keeping me employed for over 18 years despite not building better infra?

How much chaos is it gonna be for your company if he's hit by a bus?

[–] [email protected] 9 points 4 months ago

I think I thought of that in like, 2003, when developing my first web site. My lazy ass was just thinking of using PHP's ability to execute terminal commands to do all the heavy lifting on the backend for everything for me, because I sucked as a programmer. That would have been a terrible, terrible idea in hindsight.

This was way before I learned about form input sanitization too. I was working off of "For Dummies" books.

[–] [email protected] 5 points 4 months ago

Cloud Native development isn't about making systems unnecessarily complex. It's about simplifying tools down to common, scalable components, and reusing code as often as possible.

For example, we use Kubernetes to run code, because Kubernetes is the only platform to run code that can be automated with simple HTTP APIs. It is a common platform for computing, much simpler to use than the mess of EC2 instances, cron jobs, and shell scripts that the industry used to rely on. Of course, it is a higher-level abstraction than programming everything yourself in assembly, but that's the point.