
"Linear-time worst case" is different from "amortized constant time", which is what cuckoo hashing provides.

Almost all the containers we use on a regular basis provide only amortized constant time inserts.



Linear-time worst case and amortized constant time are not mutually exclusive. You are right that most containers we use have amortized constant-time inserts, but they also have linear worst-case complexity, usually because an internal array has to be resized and its contents copied over to the new, larger array.
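A quick sketch of why resize-and-copy still yields amortized O(1): with capacity doubling, the total number of element copies over n appends stays below 2n. The class and counter below are made up for illustration:

```python
class DynamicArray:
    """Toy dynamic array that doubles capacity on overflow and counts copies."""
    def __init__(self):
        self._items = [None]   # capacity 1
        self._size = 0
        self.copies = 0        # total elements moved during resizes

    def append(self, x):
        if self._size == len(self._items):
            # Worst case: every existing element is copied (linear time).
            bigger = [None] * (2 * len(self._items))
            for i in range(self._size):
                bigger[i] = self._items[i]
            self.copies += self._size
            self._items = bigger
        self._items[self._size] = x
        self._size += 1

a = DynamicArray()
for i in range(1000):
    a.append(i)
# Copies total 1 + 2 + 4 + ... + 512 = 1023 < 2 * 1000, hence amortized O(1).
```

Any single append can cost O(n), but the geometric growth spreads that cost over the cheap appends in between.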


No, they're not mutually exclusive. But what actually matters in most software is amortized constant time: the linear-time worst case only matters in hard real-time systems, which can't tolerate any deviation from the expected runtime.

Even data structures like binary trees can end up with very bad worst cases unless great care is taken in allocating their nodes.


Cuckoo hashing's linear-time behavior is not triggered only when the table has to be enlarged, as with standard containers (say, std::vector or a Python list): a linear number of keys can be "cuckooed" out of their slots by a single insertion even when the load factor is below the threshold.
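To make that displacement cascade concrete, here is a toy two-table cuckoo hash set (the class, its eviction bound, and the rehash policy are all invented for illustration, not any particular library's implementation):

```python
import random

class CuckooHash:
    """Toy cuckoo hash set with two tables and two seeded hash functions."""
    def __init__(self, capacity=8):
        self.cap = capacity
        self.t = [[None] * capacity, [None] * capacity]
        self.seeds = [random.randrange(1 << 30) for _ in range(2)]

    def _h(self, i, key):
        return hash((self.seeds[i], key)) % self.cap

    def lookup(self, key):
        # Worst-case constant time: only two slots can hold the key.
        return any(self.t[i][self._h(i, key)] == key for i in (0, 1))

    def insert(self, key):
        if self.lookup(key):
            return
        i = 0
        for _ in range(2 * self.cap):        # bound the eviction cascade
            slot = self._h(i, key)
            if self.t[i][slot] is None:
                self.t[i][slot] = key
                return
            # Slot taken: "cuckoo" the occupant out and re-place it next.
            key, self.t[i][slot] = self.t[i][slot], key
            i = 1 - i                        # the evictee goes to the other table
        self._rehash()                       # chain too long: rebuild everything
        self.insert(key)

    def _rehash(self):
        keys = [k for table in self.t for k in table if k is not None]
        self.cap *= 2
        self.seeds = [random.randrange(1 << 30) for _ in range(2)]
        self.t = [[None] * self.cap, [None] * self.cap]
        for k in keys:
            self.insert(k)

random.seed(42)               # deterministic seeds for the demo
c = CuckooHash(capacity=8)
for k in range(40):           # enough keys to force evictions and rehashes
    c.insert(k)
```

Even with the table well below its threshold, a single insert can walk a long eviction chain; the bounded cascade plus the occasional full rehash is exactly what makes insertion amortized rather than worst-case constant.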

In standard containers, if you pre-define the capacity (say, with std::vector::reserve), all insertions are guaranteed to be constant time. Cuckoo hashing can't give that guarantee.
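A rough Python analogue of the reserve guarantee (std::vector is only being mimicked here): preallocate the full capacity once, after which every write is constant time because no resize can ever fire.

```python
n = 1000
buf = [None] * n        # one up-front allocation, like vector.reserve(n)
for i in range(n):
    buf[i] = i * i      # plain index write: O(1), no resize possible
```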


I'm unaware of any hash table implementations which successfully insert elements in absolute constant time in the event of collisions.


Chained hash tables have constant-time insertion, but I'm not sure about the amortized lookup time (the worst case is linear).


Chaining requires a memory allocation and a list insertion on every collision; is it fair to call that absolute constant time? The probability of a collision depends on the number of elements in the container.


Pooling memory for allocating links turns the allocation into amortized constant time.

List insertion is O(1) if you insert at the head, and while moving the most-recently-accessed link to the head of the chain doesn't change the worst-case behavior, it greatly improves the average case in practice (and is dead easy).
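A sketch of both ideas together, head insertion plus the move-to-front heuristic; the class and its names are invented for illustration:

```python
class Node:
    __slots__ = ("key", "next")
    def __init__(self, key, next=None):
        self.key, self.next = key, next

class ChainedSet:
    """Toy chained hash set: O(1) insert at head; hits move to the front."""
    def __init__(self, nbuckets=16):
        self.buckets = [None] * nbuckets

    def insert(self, key):
        b = hash(key) % len(self.buckets)
        self.buckets[b] = Node(key, self.buckets[b])   # head insert: O(1)

    def contains(self, key):
        b = hash(key) % len(self.buckets)
        prev, cur = None, self.buckets[b]
        while cur is not None:
            if cur.key == key:
                if prev is not None:           # move-to-front heuristic
                    prev.next = cur.next
                    cur.next = self.buckets[b]
                    self.buckets[b] = cur
                return True
            prev, cur = cur, cur.next
        return False

s = ChainedSet(nbuckets=1)   # one bucket: force everything into a single chain
for k in (1, 2, 3):
    s.insert(k)              # chain is now 3 -> 2 -> 1
s.contains(1)                # hit at the tail moves it to the head: 1 -> 3 -> 2
```

The worst case is still a full chain walk, but recently-touched keys cluster at the head, which is where the average-case win comes from.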


As silentbycicle said, list insertion at the head is constant time.

About allocation, I agree with you that it is "slow", but all reasonable computation models count a memory allocation as a constant-time operation. If we go down that road and measure memory operations too, even a random memory access would not be worst-case constant time: the MMU has to map each page to its physical location, which is usually a logarithmic or amortized constant-time operation.


Yeah, allocation is usually included in the constant factors. Keep them in mind, though - things like locality can have a big impact.


You are kidding, right? The name "chain" itself suggests linear worst-case behavior.


"insertion".


My fault. I just saw "constant time" and "chain" in one sentence and panicked ;)


Well, there is perfect hashing [1]. Of course, it only works under several assumptions, and you pay for it either in the time it takes to construct the hash function or in space.
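A brute-force illustration of that trade-off (just trial-and-error over seeds, not one of the clever constructions like FKS; the function and names are made up): for a static key set, keep trying seeds until one maps every key to a distinct slot. The more slots you allow, the faster the search; shrinking the table toward len(keys), a minimal perfect hash, makes construction take longer.

```python
import itertools

def find_perfect_seed(keys, nslots):
    """Try seeds until every key in the static set `keys` gets its own slot."""
    for seed in itertools.count():
        if len({hash((seed, k)) % nslots for k in keys}) == len(keys):
            return seed

keys = ["apple", "pear", "plum", "fig"]
seed = find_perfect_seed(keys, nslots=8)
slots = {k: hash((seed, k)) % 8 for k in keys}   # collision-free by construction
```

Once the seed is found, every lookup is a single hash and one probe: true worst-case constant time, paid for up front at construction.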

[1] http://en.wikipedia.org/wiki/Perfect_hash_function



