this post was submitted on 06 Feb 2025

In my youth I wrote m68k assembly programs with tens of thousands of lines and speed-optimized every section of the code, even initialization and cleanup code that executed exactly once. It was very, very silly. It was a lot of fun.

#development #assembly #coding #programming

Treczoks 2 points 2 weeks ago* (last edited 2 weeks ago)

I wrote loads of assembly programs, too, but only one stands out for crazy, stupid, and useless optimization.

I was trying to draw a spline on the screen. I took the algorithms from a scientific paper I got, and my C program was dead slow. The problem boiled down to one subroutine that solved an equation. I moved this subroutine from C to MC68K assembler and optimized the heck out of it, with no real change to the result: it was still dead slow. Whenever I changed a param, it redrew the line, and it took something like five or more seconds to do that. For a single spline.

So I dove down into the algorithm. What were they doing with that time-eating equation? Turned out this was about measuring the distance between two coordinates - and they did not even use Pythagoras for that, but something oddly complicated. I remember 15 or 16 MULS per call. With up to 70 clock cycles per MULS instruction, this burned.
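For scale, 16 MULS at up to 70 cycles each is up to 1120 cycles per call spent on multiplication alone. A plain Pythagorean check would already have been much cheaper. The sketch below is only an illustration, not the paper's routine or the original code. The names `dist_squared` and `within_distance` are made up here, and it assumes integer screen coordinates and a non-negative limit. By comparing squared distances it needs two multiplications for the distance itself and no square root.

```c
#include <stdbool.h>

/* Squared Euclidean distance between two integer screen points.
   Hypothetical helper: two multiplications, no square root. */
long dist_squared(int x0, int y0, int x1, int y1)
{
    long dx = (long)x1 - x0;
    long dy = (long)y1 - y0;
    return dx * dx + dy * dy;
}

/* True if the two points are at most `limit` pixels apart.
   Assumes limit >= 0; comparing squares avoids taking a root. */
bool within_distance(int x0, int y0, int x1, int y1, long limit)
{
    return dist_squared(x0, y0, x1, y1) <= limit * limit;
}
```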

I replaced this with a simple "are delta X and delta Y both in the -1, 0, or +1 range" function, and suddenly the algorithm ran like a lightning bolt on steroids. I could move any definition point around with the mouse, and the spline followed smoothly.
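A minimal C sketch of such a test might look like the following. The function name `is_adjacent` and the integer pixel coordinates are assumptions for illustration. The comment does not show the original routine or say exactly where it plugged into the spline algorithm.

```c
#include <stdbool.h>

/* True if the two points are the same pixel or direct neighbours,
   i.e. delta X and delta Y are each -1, 0, or +1.
   Only subtractions and comparisons, no multiplication at all. */
bool is_adjacent(int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0;
    int dy = y1 - y0;
    return dx >= -1 && dx <= 1 && dy >= -1 && dy <= 1;
}
```

On a 68000 this kind of check compiles down to a handful of subtract, compare, and branch instructions, each costing far fewer cycles than a single MULS.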

So it is nice to be able to optimize assembler, but by choosing the right algo, you can do way better than that.