Rust

Welcome to the Rust community! This is a place to discuss the Rust programming language.

Arc vs String, is Arc really faster? (blocklisted.github.io)

submitted 11 months ago by [email protected] to c/[email protected]


[–] [email protected] 2 points 11 months ago (1 children)

which are presumably going to be used to look up values in a hashmap or similar (which again, I would suspect it is faster to hash a usize than a string of any type).

Yeah, for any primitive integer type, you can just use the value itself as the hash value here. So, effectively a no-op. I assume Rust's implementation makes use of this...

[–] MantisWaffle 4 points 11 months ago (1 children)

Probably not, for DoS/security reasons. You would need to use something like the nohash-hasher crate to get no-ops.
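For illustration, here is roughly what a pass-through hasher looks like using only the standard library. This is a minimal sketch, not how any particular crate implements it: `IdentityHasher`, `IdentityMap` and `identity_hash` are names made up for this example, and it only handles `usize` and `u64` keys.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hash, Hasher};

/// Hypothetical pass-through hasher: the integer key itself becomes the hash.
#[derive(Default)]
struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    // Anything that is not a single usize/u64 ends up here; this sketch
    // deliberately refuses it instead of silently hashing badly.
    fn write(&mut self, _bytes: &[u8]) {
        unimplemented!("IdentityHasher only supports usize and u64 keys");
    }

    fn write_u64(&mut self, n: u64) {
        self.0 = n;
    }

    fn write_usize(&mut self, n: usize) {
        self.0 = n as u64;
    }
}

/// A HashMap keyed by usize that skips SipHash entirely.
type IdentityMap<V> = HashMap<usize, V, BuildHasherDefault<IdentityHasher>>;

/// Helper that shows what hash value the map would see for a key.
fn identity_hash(key: usize) -> u64 {
    let mut hasher = IdentityHasher::default();
    key.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let mut names: IdentityMap<&str> = IdentityMap::default();
    names.insert(7, "seven");
    println!("hash of 7 = {}", identity_hash(7));
    println!("lookup 7 = {:?}", names.get(&7));
}
```

The trade-off is the one mentioned above: with no mixing and no random seed, anyone who can choose the keys can also choose where they land in the table.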

[–] [email protected] 1 points 11 months ago (1 children)

Hmm, I was wondering if there's an overlap with using hashes for security stuff. Do you happen to know of an exploit that makes use of something like predictable placement in a hashmap?

Or is your assumption rather that they wouldn't include special treatment for primitive types in the hashmap implementation?
It definitely feels a bit freaky to me, too, since you'd rob users of the ability to customize the Hasher implementation, but I also felt like they almost have to do it, because it might make a massive difference in performance.

...but after thinking about it some more, I guess you'd typically use a BTreeMap when your keys are primitives. So, now I'm on board with the guess that they wouldn't build special treatment into HashMap. 🙃

[–] [email protected] 5 points 11 months ago (1 children)

This is covered in the HashMap docs:

By default, HashMap uses a hashing algorithm selected to provide resistance against HashDoS attacks. The algorithm is randomly seeded, and a reasonable best-effort is made to generate this seed from a high quality, secure source of randomness provided by the host without blocking the program.

The default hashing algorithm is currently SipHash 1-3, though this is subject to change at any point in the future. While its performance is very competitive for medium sized keys, other hashing algorithms will outperform it for small keys such as integers as well as large keys such as long strings, though those algorithms will typically not protect against attacks such as HashDoS.
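The random seeding described in that quote is easy to observe. A small sketch, where `hash_with` is a helper name invented for this example: the same key hashes consistently under one `RandomState`, but two separately created states are expected to disagree, since each gets its own keys.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hash a single u64 key with whatever hasher `state` builds.
fn hash_with<S: BuildHasher>(state: &S, key: u64) -> u64 {
    let mut hasher = state.build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Each RandomState carries its own keys for the default hasher.
    let a = RandomState::new();
    let b = RandomState::new();

    // Stable within one state, so a single HashMap stays consistent.
    println!("a, first:  {:#018x}", hash_with(&a, 7));
    println!("a, second: {:#018x}", hash_with(&a, 7));

    // A different state is expected to give a different value for the same
    // key, so an attacker cannot precompute colliding keys offline.
    println!("b:         {:#018x}", hash_with(&b, 7));
}
```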

Basically, if an attacker controls the keys inserted into a hashmap, then with a simple, predictable hashing algorithm they can force many keys to collide. Operations on those keys then degrade from roughly constant time to a much slower linear lookup. This can be enough to stress a server and slow down all requests going through it, or even cause it to crash. So a lot of effort goes into the default hasher to mitigate this. There are faster hashing implementations out there that you can opt into if you are not worried about this, but the default is to be secure.
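Opting into a different hasher is just a matter of the third type parameter on `HashMap`. As a rough sketch, here is a hand-written FNV-1a hasher plugged in through `BuildHasherDefault`. The names `Fnv1a`, `FastMap` and `fnv1a` are made up for this example, FNV-1a is only one possible choice of fast hasher, and in real code you would more likely pull in an existing crate than write your own. Note that it is unseeded, so it gives up exactly the HashDoS protection described above.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hand-rolled 64-bit FNV-1a. Fast for small keys, but NOT HashDoS resistant.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // Standard 64-bit FNV offset basis.
        Fnv1a(0xcbf29ce484222325)
    }
}

impl Hasher for Fnv1a {
    fn finish(&self) -> u64 {
        self.0
    }

    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            // FNV-1a: xor the byte in first, then multiply by the FNV prime.
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }
}

/// HashMap that uses the unseeded FNV-1a hasher instead of the default.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Convenience helper: hash a raw byte slice directly.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hasher = Fnv1a::default();
    hasher.write(bytes);
    hasher.finish()
}

fn main() {
    let mut ports: FastMap<u32, &str> = FastMap::default();
    ports.insert(80, "http");
    ports.insert(443, "https");
    println!("443 -> {:?}", ports.get(&443));
    println!("fnv1a(\"a\") = {:#018x}", fnv1a(b"a"));
}
```

This is only reasonable when the keys are not attacker-controlled, e.g. internal IDs you generate yourself.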

[–] [email protected] 1 points 11 months ago

Thanks. :)