this post was submitted on 29 Aug 2024
504 points (98.5% liked)

Linux

48328 readers
105 users here now

From Wikipedia, the free encyclopedia

Linux is a family of open source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991 by Linus Torvalds. Linux is typically packaged in a Linux distribution (or distro for short).

Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the GNU Project. Many Linux distributions use the word "Linux" in their name, but the Free Software Foundation uses the name GNU/Linux to emphasize the importance of GNU software, causing some controversy.

Rules

Related Communities

Community icon by Alpár-Etele Méder, licensed under CC BY 3.0

founded 5 years ago
MODERATORS
 

Wedson Almeida Filho is a Microsoft engineer who has been prolific in his contributions to the Rust for the Linux kernel code over the past several years. Wedson has worked on many Rust Linux kernel features and even did a experimental EXT2 file-system driver port to Rust. But he's had enough and is now stepping away from the Rust for Linux efforts.

From Wedon's post on the kernel mailing list:

I am retiring from the project. After almost 4 years, I find myself lacking the energy and enthusiasm I once had to respond to some of the nontechnical nonsense, so it's best to leave it up to those who still have it in them.

...

I truly believe the future of kernels is with memory-safe languages. I am no visionary but if Linux doesn't internalize this, I'm afraid some other kernel will do to it what it did to Unix.

Lastly, I'll leave a small, 3min 30s, sample for context here: https://youtu.be/WiPp9YEBV0Q?t=1529 -- and to reiterate, no one is trying force anyone else to learn Rust nor prevent refactorings of C code."

you are viewing a single comment's thread
view the rest of the comments
[–] ik5pvx 15 points 2 months ago (8 children)

At the cost of sounding naive and stupid, wouldn't it be possible to improve compilers to not spew out unsafe executables? Maybe as a compile time option so people have time to correct the source.

[–] chrash0 67 points 2 months ago

the semantics of C make that virtually impossible. the compiler would have to make some semantics of the language invalid, invalidating patterns that are more than likely highly utilized in existing code, thus we have Rust, which built its semantics around those safety concepts from the beginning. there’s just no way for the compiler to know the lifetime of some variables without some semantic indication

[–] it_depends_man 51 points 2 months ago (2 children)

At the cost of sounding naive and stupid

It may be a naive question, but it's a very important naive question. Naive doesn't mean bad.

The answer is that that is not possible, because the compiler is supposed to translate the very specific language of C into mostly very specific machine instructions. The programmers who wrote the code, did so because they usually expect a very specific behavior. So, that would be broken.

But also, the "unsafety" is in the behavior of the system and built into the language and the compiler.

It's a bit of a flawed comparison, but you can't build a house on a foundation of wooden poles, because of the advantages that wood offers, and then complain that they are flammable. You can build it in steel, but you have to replace all of the poles. Just the poles on the left side won't do.

And you can't automatically detect the unsafe parts and just patch those either. If we could, we could just fix them directly or we could automatically transpile them. Darpa is trying that at the moment.

[–] ik5pvx 14 points 2 months ago

Thank you and all the others that took time to educate me on what is for me a "I know some of those words" subject

[–] [email protected] 2 points 2 months ago

Meaning a (current) kernel is actually a C to machine code transpiler?

[–] [email protected] 16 points 2 months ago* (last edited 2 months ago)

The problem is that C is a prehistoric language and don't have any of the complex types for example. So, in a modern language you create a String. That string will have a length, and some well defined properties (like encoding and such). With C you have a char * , which is just a pointer to the memory that contains bytes, and hopefully is null terminated. The null termination is defined, but not enforced. Any encoding is whatever the developer had in mind. So the compiler just don't have the information to make any decisions. In rust you know exactly how long something lives, if something try to use it after that, the compiler can tell you. With C, all lifetimes lives in the developers head, and the compiler have no way of knowing. So, all these typing and properties of modern languages, are basically the implementation of your suggestion.

[–] [email protected] 13 points 2 months ago

Modern C compilers have a lot of features you can use to check for example for memory errors. Rusts borrow-checker is much stricter as it's designed to be part of the language, but for low-level code like the Linux kernel you'll end up having to use Rust's unsafe feature on a lot of code to do things from talking to actual hardware to just implementing certain data structures and then Rust is about as good as C.

[–] [email protected] 10 points 2 months ago

Compilers follow specs and in some cases you can have undefined behavior. You can and should use compiler flags but should complement that with good programming practices (e.g. TDD) and other tools in your pipeline (such as valgrind).

[–] [email protected] 5 points 2 months ago (1 children)

If you write unsafe code then how should it compile?

[–] [email protected] 4 points 2 months ago

I’d like to add that there’s a difference between unsafe and unspecified behavior. Sometimes I’d like the compiler to produce my unsafe code that has specified behavior. In this case, I want the compiler to produce exactly that unsafe behavior that was specified according to the language semantics.

Especially when developing a kernel or in an embedded system, an example would be code that references a pointer from a hardcoded constant address. Perhaps this code then performs pointer arithmetic to access other addresses. It’s clear what the code should literally do, but it’s quite an unsafe thing to do unless you as the developer have some special knowledge that you know the address is accessible and contains data that makes sense to be processed in such a manner. This can be the case when interacting directly with registers representing some physical device or peripheral, but of course, there’s nothing in the language that would suggest doing this is safe. It’s making dangerous assumptions that are not enforced as part of the program. Those assumptions are only true in the program is running on the hardware that makes this a valid thing to do, where that magical address and offsets to that address do represent something I can read in memory.

Of course, pointer arithmetic can be quite dangerous, but I think the point still stands that behavior can be specified and unsafe in a sense.

[–] [email protected] 2 points 2 months ago

This has been done to a limited extent. Some compilers can check for common cases and you can enforce these warnings as errors. However, this is generally not possible as others have described because the language itself has behaviors that are not safe, and too much code relies on those properties that are fundamentally unsafe.