this post was submitted on 26 Jun 2023
Explain Like I'm Five
Software consists of instructions for a computer to do something. These are made to be easy to follow for a computer, not for a human.
Humans write software in a human-readable form, the source code. This then (usually) gets converted to a machine-readable form, called machine code (or bytecode for some languages).
Depending on the programming language and the build settings used, a smaller or larger share of the original information is lost in that conversion. For some platforms (.NET, Java), you can recover most of the structure from the bytecode, sometimes even most of the original variable and function names, and see relatively easily what the program does.
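To make the bytecode point concrete, here is a minimal sketch in Python. Python is not one of the languages named above; it is only a convenient stand-in that also compiles to bytecode. The function name `show_dialog` is hypothetical, modelled on the `ShowDialog` example further down.

```python
import dis

# Hypothetical source line; show_dialog is never defined or called here,
# we only compile the text to look at the result.
source = "show_dialog('hello')"

# compile() turns source text into a code object that holds Python bytecode.
code = compile(source, "<example>", "exec")

# The raw bytecode is just a sequence of bytes, not meant for human eyes.
raw_bytes = code.co_code

# Names and constants survive compilation and are stored next to the bytecode.
surviving_names = code.co_names    # includes 'show_dialog'
surviving_consts = code.co_consts  # includes 'hello'

print(raw_bytes)
print(surviving_names, surviving_consts)

# dis prints the bytecode as readable instructions.
# The exact listing differs between Python versions.
dis.dis(code)
```

Because the function name and the string are kept alongside the instructions, a tool can rebuild something close to the original line. That is roughly why bytecode-based platforms are comparatively easy to inspect.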
For other languages (e.g. C and C++), even the structure is lost: you can't reliably tell which parts of the program belong to the same function. You can read the machine code, and it "clearly" says what it does, but making sense of that mess is slow and error-prone, and you won't fully understand every part (there is just too much of it). So you mostly look for the parts that seem related to what you're interested in. For example, if you're looking for an encryption algorithm, you might look for code that opens two files, reads from one and writes to the other, and then look for a piece of code "nearby" that does a lot of math. For malware, you may want to focus on network connections. Since the software has to ask the operating system to make network connections, this tends to happen in a standardized way, and you can quickly find the part of the code that talks to the network features of the OS.
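One simple way to hunt for the "interesting" parts is to pull readable text out of the binary and look for names of operating system functions. The sketch below does this in Python, roughly like the Unix `strings` tool. The bytes in `fake_binary` are made up for illustration and are not a real program, and the watch list in `network_hints` is an assumption (these are common socket function names, but real malware analysis uses far longer lists and better tools).

```python
import re

def printable_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_length long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]

# Made-up bytes standing in for a compiled program (NOT a real executable).
fake_binary = (
    b"\x7fELF\x02\x01\x00"
    b"\x90\x90connect\x00"
    b"\xc3\x00socket\x00"
    b"\x01\x02getaddrinfo\x00\xff\xfe"
)

# Hypothetical watch list of networking-related function names.
network_hints = {"socket", "connect", "getaddrinfo", "send", "recv"}

# Keep only the extracted strings that exactly match a name on the watch list.
found = [s for s in printable_strings(fake_binary) if s in network_hints]
print(found)
```

Here the scan finds `connect`, `socket` and `getaddrinfo`, which would tell you this program probably talks to the network and give you a starting point for where to read the machine code.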
You can also run the program step by step in a debugger and observe what it does (possibly messing with it while doing so, to see how that changes its behavior).
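As a rough illustration of the step-by-step idea, the Python sketch below uses the standard `sys.settrace` hook to record every line a function executes, together with the variable values at that moment. The function `mystery` is a hypothetical stand-in for the program under investigation. A real debugger for machine code works on instructions and registers instead, but the principle is the same.

```python
import sys

def mystery(x):
    # Stand-in for a program whose behaviour we want to observe.
    y = x * 2
    y = y + 1
    return y

# Each entry: (line offset inside mystery, snapshot of its local variables).
observed = []

def tracer(frame, event, arg):
    # Called by the interpreter; we record each line of mystery() just
    # before that line runs.
    if frame.f_code is mystery.__code__ and event == "line":
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        observed.append((offset, dict(frame.f_locals)))
    return tracer

sys.settrace(tracer)   # start observing
result = mystery(5)
sys.settrace(None)     # stop observing

for offset, local_vars in observed:
    print(offset, local_vars)
print("result:", result)
```

Watching `y` change from one step to the next tells you what the function computes without having to read and understand its code first.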
For an example of how machine code looks, what in source code would be
ShowDialog('hello')
could become (Made-up, inaccurate example just to illustrate the idea. It's horrible to read.)
Thanks a lot for this good explanation!