this post was submitted on 07 Sep 2024

So I understand that the subnet mask provides information about the length of the routing prefix (NID). It can be applied to a given IP address to extract the most significant bits allocated for the routing prefix and "zero out" the host identifier.

But why do we need the bitwise AND for that, specifically? I understand the idea, but would it not be easier to read only the first n bits of the IP address ~~string~~ sequence of bits and disregard the remainder (the host identifier)? The information necessary for that is already available from the subnet mask WITHOUT the bitwise AND. For example, with 255.255.255.0 or 1111 1111.1111 1111.1111 1111.0000 0000, you count the number of 1s, which in this case is 24 and corresponds to the suffix in CIDR notation. At this point, you already know that you only need to consider the first 24 bits of the IP address, making the subsequent bitwise AND redundant.

In the case of 192.168.2.150/24, for example, with subnet mask 255.255.255.0, you would get 192.168.2.0 (1100 0000.1010 1000.0000 0010.0000 0000) as the routing prefix or network identifier when represented as the first address of the network. However, the last eight bits are redundant, making the NID effectively only 192.168.2.

Now let's imagine an example where we create two subnets for the 192.168.2.0 network by taking one bit from the host identifier and appending it to the routing prefix. The corresponding subnet mask for these two subnets is 255.255.255.128, as we now have 25 bits making up the NID and 7 bits constituting the HID. So host A at 192.168.2.5/25 (HID 5, final octet 0000 0101) now wants to send a request to host B at 192.168.2.133/25 (HID 5, final octet 1000 0101). In order to identify the network to route to, the router needs the NID of the destination, and it gets that either by discarding the 7 least significant bits or by zeroing them out with a bitwise AND operation. Now, my point is, for identifying the network that the destination host (in this case, B) is part of, the bitwise AND is redundant, is it not?

So why doesn't the router just store the NID with only the bits that are strictly required? Is it because the routing table entries are always of a fixed size of 32 bits for IPv4? Or is it because the bitwise AND operation is more efficiently computable?
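As a rough sketch of the two options described in this post (an illustration only, not how any particular router does it), both can be written on plain 32-bit integers. The function names `network_by_and` and `network_by_truncation` are made up for this example. The AND keeps the result 32 bits wide, while the shift produces the shorter prefix-only value:

```python
import ipaddress

def network_by_and(addr: str, prefix_len: int) -> int:
    """Zero out the host bits with a mask and a single AND."""
    ip = int(ipaddress.IPv4Address(addr))
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return ip & mask

def network_by_truncation(addr: str, prefix_len: int) -> int:
    """Keep only the first prefix_len bits by shifting the host bits away."""
    ip = int(ipaddress.IPv4Address(addr))
    return ip >> (32 - prefix_len)

for host in ("192.168.2.5", "192.168.2.133"):
    full = network_by_and(host, 25)
    short = network_by_truncation(host, 25)
    # Shifting the truncated prefix back by the 7 host bits
    # reproduces the 32-bit network address.
    assert short << 7 == full
    print(host, "->", ipaddress.IPv4Address(full))
```

Both carry the same information. The difference is only whether the result is kept at a fixed width of 32 bits or stored with a variable number of bits.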

all 11 comments
[–] [email protected] 23 points 3 months ago* (last edited 3 months ago) (2 children)

But why do we need the bitwise AND for that, specifically? I understand the idea, but would it not be easier to only parse the IP address string of bits only for the first n bits and then disregard the remainder (the host identifier)?

Essentially it boils down to:

Bit operations are stupid fast and efficient; string operations are super slow.

Also, IP addresses are normally stored as integers (32 bits for IPv4), so applying string operations would require converting them first.

[–] ricdeh 10 points 3 months ago

Okay, that makes sense. Thank you.

[–] ricdeh 5 points 3 months ago (1 children)

Though I would like to clarify that maybe my wording was a bit confusing. By "string of bits", I did not mean the term as it is typically used in programming language environments, but rather a raw binary sequence, e.g., the first 24 bits of an IP address, therefore allocating 3 bytes of memory for storing the NID.

[–] [email protected] 10 points 3 months ago (1 children)

but rather a raw binary sequence, e.g., the first 24 bits of an IP address, therefore allocating 3 bytes of memory for storing the NID.

That would require dynamic memory allocation, since you can never know what CIDR your stack encounters. It could be a nibble, a byte, a byte and a nibble, ..., 4 bytes. So you would allocate an int32/int64 anyway to be on the safe side.

[–] ricdeh 2 points 3 months ago (1 children)

Yep, I agree. Though one could make a hypothetical argument for expanding the array dynamically when needed. Of course, due to the varying sizes of NIDs resulting from CIDR (which you correctly mentioned), you would need to have a second array that can store the length of each NID, with 5 bits per element, leaving you with 3 bits "saved" per IP address.

That can end up wasting more memory than the 32-bits-per-NID approach, e.g., when the host identifier is smaller than 5 bits. And there's the slowness of memory allocation and copying from one array to another that comes on top of that.

I think that it is theoretically possible to deploy a NID-extracting and tracking program that is a tiny bit more memory efficient than the 32-bit implementation, but it would probably come with a performance overhead and depend on you knowing the range of your expected IP addresses really well. So, not useful at all, lol

Anyway, thanks for your contributions.

[–] [email protected] 4 points 3 months ago

sure thing buddy, and never feel discouraged to ask "stupid questions", it's how we learn after all :)

[–] [email protected] 10 points 3 months ago

Probably because it’s only four bytes of data, and counting/extracting bits takes more CPU time than one AND operation.
Most CPUs are optimised to work with whole integers (32/64 bit) rather than individual bits.

If memory was a serious concern you could compress it down to one byte as a ‘number of 1s’ counter at the cost of additional CPU operations, but because 3 extra bytes is such a small amount of data, this memory/time trade-off isn’t worth it in most systems.

It’d be useful if you wanted to compress some data logs or something with many subnet masks though.
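A minimal sketch of that one-byte idea, assuming CIDR-style masks where all the 1s are contiguous from the most significant bit (the helper names are hypothetical):

```python
def mask_to_prefix_len(mask: int) -> int:
    """Compress a contiguous 32-bit mask to its 'number of 1s' counter."""
    return bin(mask).count("1")

def prefix_len_to_mask(prefix_len: int) -> int:
    """Rebuild the full 32-bit mask from the counter (the extra CPU work)."""
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF

mask = 0xFFFFFF80  # 255.255.255.128
n = mask_to_prefix_len(mask)
assert prefix_len_to_mask(n) == mask  # round trip only holds for contiguous masks
```

The counter fits in one byte, but the mask has to be rebuilt with a shift before every AND, which is the time side of the trade-off.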

[–] [email protected] 5 points 3 months ago

I'll address your question in two parts: 1) is it redundant to store both the IP subnet and its subnet mask, and 2) why doesn't the router store only the bits necessary to make the routing decision.

Prior to the introduction of CIDR -- which came with the "slash" notation, like /8 for the 10.0.0.0 RFC 1918 private IPv4 subnet range -- subnet masks could genuinely be any bit arrangement imaginable. The most sensible would be to have contiguous MSBit-justified subnet masks, such as 255.0.0.0. But the standard did not preclude using something unconventional like 255.0.0.1.

For those confused about what a 255.0.0.1 subnet mask would do -- and to be clear, a lot of software might prove unable to handle this -- this is describing a subnet with 2^23 addresses, where the LSBit must match the IP subnet. So if your IP subnet was 10.0.0.0, then only even numbered addresses are part of that subnet. And if the IP subnet is 10.0.0.1, then that only covers odd numbered addresses.

Yes, that means two machines with addresses 10.69.3.3 and 10.69.3.4 aren't on the same subnet. This would not be allowed when using CIDR, as contiguous set bits are required with CIDR.
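To make that concrete, here is a small sketch using plain integers, since the AND itself does not care whether the mask bits are contiguous. The helper names `to_int` and `same_subnet` are just for illustration:

```python
import ipaddress

def to_int(addr: str) -> int:
    """Dotted-quad string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))

MASK = to_int("255.0.0.1")  # non-contiguous: first octet plus the last bit

def same_subnet(a: str, b: str, mask: int = MASK) -> bool:
    """Two hosts share a subnet if their masked addresses are equal."""
    return (to_int(a) & mask) == (to_int(b) & mask)

print(same_subnet("10.69.3.3", "10.69.3.4"))  # odd vs. even: False
print(same_subnet("10.69.3.3", "10.69.3.5"))  # both odd: True

# 9 mask bits are set, so 32 - 9 = 23 bits are free to vary.
print(2 ** (32 - bin(MASK).count("1")))  # 8388608, i.e. 2^23
```

This is also a case where simply keeping the first n bits would not work, because the bits that matter are not all at the front.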

So in answer to the first question, CIDR imposed a stricter (and sensible) limit on valid IP subnet/mask combinations, so if CIDR cannot be assumed, then it would be required to store both of the IP subnet and the subnet mask, since mask bits might not be contiguous.

For all modern hardware in the last 15-20 years, CIDR subnets are basically assumed. So this is really a non-issue.

For the second question, the router does in fact store only the necessary bits to match the routing table entry, at least for hardware appliances. Routers use what's known as TCAM (ternary content-addressable memory) for routing tables, where the bitwise AND operation can be performed, but with a twist.

Suppose we're storing a route for 10.0.42.0/24. The subnet size indicates that the first 24 bits must match a prospective destination IP address. And the remaining 8 bits don't matter. TCAMs can store 1's and 0's, but also X's (aka "don't cares") which means those bits don't have to match. So in this case, the TCAM entry will mirror the route's first 24 bits, then populate the rest with X's. And this will precisely match the intended route.
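A software imitation of that ternary match might look like the sketch below. It is only an approximation: a real TCAM compares all entries in parallel in hardware, while this loops over them, and the names `TernaryEntry`, `make_entry`, `lookup` and the next-hop labels are invented for the example. Each entry stores the bits to match plus a "care" mask, where a 0 bit plays the role of an X:

```python
import ipaddress
from typing import List, NamedTuple, Optional

class TernaryEntry(NamedTuple):
    value: int     # bits that must match (X positions are stored as 0)
    care: int      # 1 = compare this bit, 0 = X ("don't care")
    next_hop: str  # hypothetical label for where to forward

def make_entry(cidr: str, next_hop: str) -> TernaryEntry:
    net = ipaddress.IPv4Network(cidr)
    return TernaryEntry(int(net.network_address), int(net.netmask), next_hop)

def lookup(table: List[TernaryEntry], dest: str) -> Optional[str]:
    d = int(ipaddress.IPv4Address(dest))
    best = None
    for entry in table:
        # Only the "care" bits take part in the comparison.
        if (d & entry.care) == entry.value:
            # For contiguous masks, a numerically larger mask is a longer prefix.
            if best is None or entry.care > best.care:
                best = entry
    return best.next_hop if best else None

table = [
    make_entry("10.0.0.0/8", "port-2"),
    make_entry("10.0.42.0/24", "port-1"),
]
print(lookup(table, "10.0.42.7"))    # matches both, /24 is more specific
print(lookup(table, "10.1.2.3"))     # matches only the /8
print(lookup(table, "192.168.2.5"))  # no matching entry
```

Note that every entry is still two full 32-bit words here, which lines up with the point about table width.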

As a practical matter then, the TCAM must still be as wide as the longest possible route, which is 32 bits for IPv4 and 128 bits for IPv6. Yes, I suppose some savings could be made if a CIDR-only TCAM could conserve the X bits, but this makes little difference in practice and it's generally easier to design the TCAM for max width anyway, even though non-CIDR isn't supported on most routing hardware anymore.