[email protected] 3 points 1 year ago

Yes. Also, the required clock cycles depend a lot on the individual CPU architecture.

Take division, for example: some CPUs have hardwired logic that computes the division directly at the circuit level. Others essentially run a for loop with subtraction in microcode. The difference in required clock cycles for a single division can then be huge.
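To show what I mean by "a for loop with subtraction", here is a minimal C sketch (my own toy version; real microcoded dividers use a smarter shift-and-subtract loop with one iteration per result bit, but the cost model is similar):

```c
#include <stdint.h>

/* Toy illustration: division implemented as repeated subtraction.
 * Assumes divisor > 0. A hardware divider does the same job in a
 * handful of cycles; this loop can take thousands of iterations. */
uint32_t div_by_subtraction(uint32_t dividend, uint32_t divisor) {
    uint32_t quotient = 0;
    while (dividend >= divisor) {
        dividend -= divisor;  /* one subtraction per pass */
        quotient++;
    }
    return quotient;          /* the remainder is left in dividend */
}
```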

Another example: is it a scalar or a superscalar CPU, i.e. can it issue more than one instruction per clock cycle?
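A rough illustration of why that matters (my own sketch, not tied to any particular CPU): a superscalar core can execute independent instruction chains in parallel, so splitting a loop-carried dependency into two accumulators can nearly double throughput, while a scalar core gains nothing:

```c
#include <stddef.h>

/* Summation with two independent accumulators. On a superscalar CPU
 * the two addition chains can be issued in parallel; on a scalar CPU
 * this runs no faster than a single-accumulator loop. */
double sum_two_accumulators(const double *a, size_t n) {
    double s0 = 0.0, s1 = 0.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += a[i];       /* chain 1 */
        s1 += a[i + 1];   /* chain 2, independent of chain 1 */
    }
    if (i < n)
        s0 += a[i];       /* leftover element when n is odd */
    return s0 + s1;
}
```

(Note that this changes the order of floating-point additions, which is why compilers won't do it for you without flags like -ffast-math.)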

A rather obvious example: the bit width of the CPU. 32-bit systems handle 64-bit data far less efficiently than 64-bit systems, since every operation has to be split across two registers.
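For a concrete picture of that overhead, here is what a single 64-bit addition boils down to on a 32-bit target, spelled out in C (compilers emit the equivalent add / add-with-carry instruction pair):

```c
#include <stdint.h>

/* One 64-bit addition done with 32-bit halves: two adds plus
 * carry propagation instead of a single instruction. */
typedef struct { uint32_t lo, hi; } u64_halves;

u64_halves add64_on_32bit(u64_halves a, u64_halves b) {
    u64_halves r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo);  /* unsigned wraparound => carry out */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```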

Then there is other stuff like branch prediction, plus system dependencies like memory bus width and clock, cache size and associativity, and so on.
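Branch prediction is easy to see in a toy benchmark (my own sketch; the absolute numbers will differ wildly between CPUs, which is exactly the point):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 1000000

/* Same loop, same values: only the order changes. With sorted data
 * the branch becomes predictable and the loop typically runs faster. */
static long sum_above_threshold(const int *v, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] >= 128)   /* the hard-to-predict branch */
            sum += v[i];
    return sum;
}

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

int main(void) {
    int *v = malloc(N * sizeof *v);
    if (!v) return 1;
    for (size_t i = 0; i < N; i++)
        v[i] = rand() % 256;

    clock_t t0 = clock();
    long r1 = sum_above_threshold(v, N);      /* random order */
    clock_t t1 = clock();

    qsort(v, N, sizeof *v, cmp_int);

    clock_t t2 = clock();
    long r2 = sum_above_threshold(v, N);      /* sorted: predictable */
    clock_t t3 = clock();

    printf("unsorted: %ld in %.4f s, sorted: %ld in %.4f s\n",
           r1, (double)(t1 - t0) / CLOCKS_PER_SEC,
           r2, (double)(t3 - t2) / CLOCKS_PER_SEC);
    free(v);
    return 0;
}
```

(Depending on compiler and flags, the optimizer may replace the branch with a conditional move and erase the difference entirely; yet another reason to measure on the actual target.)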

Long story short: When evaluating the performance of code, multiple performance metrics have to be considered simultaneously and prioritized according to the development goals.

Lines of code is usually a veeery bad metric. (I sometimes spend hours just to write a few lines of code. But then they are good ones.) Cycles per code segment is better, but still not great (unless you are developing for a very specific target system). Do benchmarking, do profiling, run it on different systems, and maybe design individual performance metrics based on your expectations.
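If you just want a starting point, a bare-bones timing harness can look like this (my sketch; `workload` is a hypothetical placeholder for the code under test, and for serious work you'd reach for a profiler like perf instead):

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Placeholder workload; replace with the code you want to measure. */
static void workload(void) {
    volatile double x = 1.0;          /* volatile: keep the loop alive */
    for (int i = 0; i < 1000000; i++)
        x /= 1.000001;
}

int main(void) {
    double best = 1e9;
    for (int run = 0; run < 10; run++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        workload();
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double s = (t1.tv_sec - t0.tv_sec)
                 + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        if (s < best)
            best = s;                 /* keep the least-noisy sample */
    }
    printf("best of 10 runs: %.6f s\n", best);
    return 0;
}
```

Then repeat that on every system you actually care about, because the numbers will not transfer.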