Re: BSD license compatible hash algorithm?



"Aryeh M. Friedman" <gmail.com!aryeh.friedman@xxxxxxxxxxxxxxx> wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dag-Erling Sm??rgrav wrote:
"Aryeh M. Friedman" <aryeh.friedman@xxxxxxxxx> writes:
All hashs have issues with pooling.... see
http://www.burtleburtle.net/bob/hash/index.html... btw it is
a old wives tale that the number of buckets should be prime
(mostly based on the very weak implementation Knuth offered)

Not an "old wives' tale", but rather an easy way to implement a
hash algorithm that is good enough for most simple uses: metric
modulo table size, where metric is a number derived from the
item in such a manner as to give a good spread.

... the above only applies if your using a very primitive hash
like Knuth's multiplication one.... every modern hash I know of
should have 2^k buckets actually for some k ...

It very much depends on what is used for a rehash (collision step)
value. The step value and the table size must have no common factor
larger than 1, or there will be edge cases (bugs) in which some
combinations of hash and step values will cause the table to appear
full when in fact it is not. Making the table size prime is one
simple way of preventing such problems, while still allowing the
rehash value to depend on the key (thus reducing the liklihood of
collision on the second probe).

At the other extreme, the table can be any size at all if the step
value is 1 (and the "modulo table size" operation will certainly be
more efficient if the table size is 2^k).
_______________________________________________
freebsd-hackers@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: Perl hashing speedup ?
    ... I feel that disbalanced bucket usage / and less number of hash buckets ... Now I have some queries regarding way perl hashing mechanism maps keys ... about how perl hashes work internally; but I guess we will be needing ...
    (comp.lang.perl.moderated)
  • Re: Hashing
    ... What you have to keep in mind is that a hash table for words in by ... linear testing of buckets. ... What is important is to ensure that you have an effective collision ... memory for the table, you need two arrays, one is the pointers to the ...
    (alt.lang.asm)
  • Re: Hash table in C++? STL?
    ... > a tree-like structure instead of lists for the buckets). ... > the standard will require equality or a relational comparator. ... the hashing function is where all the complexity lies and I'd be ... But we need hash only the first maxchars characters. ...
    (comp.lang.cpp)
  • Re: Hash tables
    ... A hash table is an obvious and thus common way of implementing this. ... array of buckets with buckets that are arrays with up to 500 elements due ... So I would certainly represent buckets as arrays and not lists. ... That should still be a *lot* faster than a tree, ...
    (comp.lang.functional)
  • Re: SGI hash_map
    ... >> studied hash functions you'll find on the web. ... > And before inserting into the map, each hash value is calculated with: ... a uniform distribution. ... a program to count the number of buckets that contain multiple items. ...
    (comp.lang.cpp)