
CSE 373: Open addressing, Study notes of Data Structures and Algorithms

University of Wales, Swansea Data Structures and Algorithms


Typology: Study notes

2021/2022

Uploaded on 09/27/2022


CSE 373: Open addressing

Michael Lee

Friday, Jan 26, 2018

Warmup

With your neighbor, discuss and review:

- How do we implement get, put, and remove in a hash table using separate chaining?
- What about in a hash table using open addressing with linear probing?
- Compare and contrast your answers: what do we do the same? What do we do differently?

In both implementations, for all three methods, we start by finding the initial index to consider:

index = key.hashCode() % array.length
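One caveat with the expression above when it is written in Java: hashCode() may return a negative number, and % keeps the sign of its left operand, so the index can come out negative. A minimal sketch of one way around this, using Math.floorMod; the names IndexDemo and initialIndex are made up for illustration and are not from the slides.

```java
final class IndexDemo {
    // Maps a key to a slot in [0, capacity), even when hashCode() is negative.
    static int initialIndex(Object key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    public static void main(String[] args) {
        Integer key = -7; // Integer.hashCode() returns the int value itself
        System.out.println(key.hashCode() % 10);   // prints -7, not a valid index
        System.out.println(initialIndex(key, 10)); // prints 3
    }
}
```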

If we’re using separate chaining, we then search/insert/delete from the bucket:

IDictionary<K, V> bucket = array[index]
bucket.get(key) // or .put(...) or .remove(...)

...and resize when λ ≈ 1.

(When exactly to resize is a tunable parameter.)
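To make the separate chaining recipe concrete, here is a minimal sketch. It swaps the course's IDictionary buckets for java.util.LinkedList buckets of entries, resizes once λ would exceed 1, and returns null from get on a missing key; the class name ChainedMap and all of those choices are assumptions for illustration only.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

final class ChainedMap<K, V> {
    private List<LinkedList<Map.Entry<K, V>>> buckets = newBuckets(8);
    private int size = 0;

    private static <K, V> List<LinkedList<Map.Entry<K, V>>> newBuckets(int n) {
        List<LinkedList<Map.Entry<K, V>>> b = new ArrayList<>(n);
        for (int i = 0; i < n; i++) b.add(new LinkedList<>());
        return b;
    }

    // Finds the initial index, then hands back the bucket stored there.
    private LinkedList<Map.Entry<K, V>> bucketFor(K key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    V get(K key) {
        for (Map.Entry<K, V> e : bucketFor(key)) {
            if (e.getKey().equals(key)) return e.getValue();
        }
        return null; // assumption: report a missing key with null
    }

    void put(K key, V value) {
        for (Map.Entry<K, V> e : bucketFor(key)) {
            if (e.getKey().equals(key)) { e.setValue(value); return; }
        }
        if (size + 1 > buckets.size()) resize(); // keep the load factor at or below 1
        bucketFor(key).add(new SimpleEntry<>(key, value));
        size++;
    }

    boolean remove(K key) {
        boolean removed = bucketFor(key).removeIf(e -> e.getKey().equals(key));
        if (removed) size--;
        return removed;
    }

    int size() { return size; }

    // Doubles the bucket count and re-hashes every entry into its new bucket.
    private void resize() {
        List<LinkedList<Map.Entry<K, V>>> old = buckets;
        buckets = newBuckets(old.size() * 2);
        for (LinkedList<Map.Entry<K, V>> bucket : old) {
            for (Map.Entry<K, V> e : bucket) bucketFor(e.getKey()).add(e);
        }
    }

    public static void main(String[] args) {
        ChainedMap<String, Integer> m = new ChainedMap<>();
        m.put("a", 1);
        m.put("a", 2);
        System.out.println(m.get("a")); // prints 2
    }
}
```

All three operations share the same first step (find the bucket) and differ only in what they do inside it.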

If we’re using linear probing, search until we find an array element where the key is equal to ours or until the array slot is null:

while (array[index] != null
        // keep probing while this slot holds a different key;
        // comparing hash codes first skips most of the equals() calls
        && (array[index].hashcode != key.hashCode()
            || !array[index].key.equals(key))) {
    index = (index + 1) % this.array.length
}
if (array[index] == null)
    // throw exception if implementing get
    // add new key-value pair if implementing put
else
    // return or set array[index]

How do we delete? (Complicated; see section 04 handouts.)

When do we resize?
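The probing loop above can be turned into a small runnable sketch. This version assumes parallel key and value arrays, resizes so that λ never exceeds 1/2 (which guarantees the loop always reaches an empty slot), and leaves out remove, which the slides defer to the section handout. The class name LinearProbingMap and the helper findSlot are hypothetical.

```java
import java.util.NoSuchElementException;

final class LinearProbingMap<K, V> {
    private Object[] keys = new Object[8];
    private Object[] values = new Object[8];
    private int size = 0;

    // Returns the slot holding key, or the first empty slot on its probe path.
    private int findSlot(Object key) {
        int index = Math.floorMod(key.hashCode(), keys.length);
        while (keys[index] != null && !keys[index].equals(key)) {
            index = (index + 1) % keys.length;
        }
        return index;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int index = findSlot(key);
        if (keys[index] == null) throw new NoSuchElementException("no such key: " + key);
        return (V) values[index];
    }

    void put(K key, V value) {
        if (2 * (size + 1) > keys.length) resize(); // keep the load factor at or below 1/2
        int index = findSlot(key);
        if (keys[index] == null) {
            keys[index] = key;
            size++;
        }
        values[index] = value;
    }

    // Doubles the arrays and re-inserts every stored pair.
    private void resize() {
        Object[] oldKeys = keys;
        Object[] oldValues = values;
        keys = new Object[oldKeys.length * 2];
        values = new Object[oldValues.length * 2];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != null) {
                int index = findSlot(oldKeys[i]);
                keys[index] = oldKeys[i];
                values[index] = oldValues[i];
            }
        }
    }

    public static void main(String[] args) {
        LinearProbingMap<Integer, String> m = new LinearProbingMap<>();
        m.put(38, "a");
        m.put(8, "b");
        System.out.println(m.get(8)); // prints b
    }
}
```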

Open addressing: linear probing

Strategy: Linear probing

If we collide, check each next element until we find an open slot.

So, h′(k, i) = (h(k) + i) mod T, where T is the table size.

i = 0
while (index in use)
    try (hash(key) + i) % array.length
    i += 1


Assume internal capacity of 10, insert the following keys:

38, 19, 8, 109, 10

0 1 2 3 4 5 6 7 8 9

What’s the problem? Lots of keys close together: a “cluster”. We ended up having to probe many slots!
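The exercise can be checked mechanically. The sketch below assumes h(k) = k % capacity for non-negative integer keys and resolves collisions by linear probing; LinearProbingTrace and insertAll are hypothetical names.

```java
import java.util.Arrays;

final class LinearProbingTrace {
    // Inserts non-negative int keys using h(k) = k % capacity and linear probing.
    // Assumes there are fewer keys than slots, so an open slot always exists.
    static Integer[] insertAll(int capacity, int... keys) {
        Integer[] table = new Integer[capacity];
        for (int key : keys) {
            int i = 0;
            while (table[(key % capacity + i) % capacity] != null) {
                i += 1;
            }
            table[(key % capacity + i) % capacity] = key;
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(insertAll(10, 38, 19, 8, 109, 10)));
        // prints [8, 109, 10, null, null, null, null, null, 38, 19]
    }
}
```

The five keys end up in slots 8, 9, 0, 1, and 2: one contiguous run that wraps around the end of the array. Inserting 10 alone takes three probes even though nothing else hashes to slot 0.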

Primary clustering

When using linear probing, we sometimes end up with a long chain of occupied slots. This problem is known as “primary clustering”.

It happens when λ is large, or if we get unlucky. In linear probing, we expect to get O(lg(n)) size clusters.

Questions:

- When is performance good? When is it bad?
  Runtime is bad when the table is nearly full. Runtime is also bad when we hit a “cluster”.
- What is the maximum load factor?
  Load factor is at most λ = 1.0!
- When do we resize?

Punchline: clustering can be potentially bad, but in practice, it tends to be ok as long as λ is small.

Question: when do we resize? Usually when λ ≈ 1/2.

Nifty equations:

- Average number of probes for a successful probe: (1/2) (1 + 1/(1 − λ))
- Average number of probes for an unsuccessful probe: (1/2) (1 + 1/(1 − λ)^2)

*These equations aren’t important to know
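To get a feel for why λ ≈ 1/2 is a common resize threshold, the sketch below evaluates the classical estimates for linear probing, (1/2)(1 + 1/(1 − λ)) for a successful search and (1/2)(1 + 1/(1 − λ)^2) for an unsuccessful one, at a few load factors. The class and method names are made up for illustration.

```java
final class ProbeEstimates {
    // Expected probes for a successful search under linear probing.
    static double successful(double lambda) {
        return 0.5 * (1 + 1 / (1 - lambda));
    }

    // Expected probes for an unsuccessful search under linear probing.
    static double unsuccessful(double lambda) {
        return 0.5 * (1 + 1 / ((1 - lambda) * (1 - lambda)));
    }

    public static void main(String[] args) {
        for (double lambda : new double[] {0.25, 0.5, 0.75, 0.9}) {
            System.out.printf("lambda=%.2f  successful=%.2f  unsuccessful=%.2f%n",
                    lambda, successful(lambda), unsuccessful(lambda));
        }
    }
}
```

At λ = 1/2 the estimates come to 1.5 and 2.5 probes; at λ = 0.9 the unsuccessful case is already around 50 probes, which is why the table is resized well before it fills up.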

Open addressing: quadratic probing

Problem: We can still get unlucky, or somebody can feed us a malicious series of inputs that causes severe slowdown.

Can we pick a different collision strategy that minimizes clustering?

Idea: Rather than probing linearly, probe quadratically!

Exercise: assume internal capacity of 10, insert the following:

89, 18, 49, 58, 79

0 1 2 3 4 5 6 7 8 9
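A sketch for checking this exercise. The preview does not show the slide that defines the probe function, so this assumes the usual form h′(k, i) = (h(k) + i^2) mod T with h(k) = k % T; the names QuadraticProbingTrace and insertAll are hypothetical.

```java
import java.util.Arrays;

final class QuadraticProbingTrace {
    // Assumes probe i lands on (k % capacity + i * i) % capacity for non-negative keys.
    // Quadratic probing can miss some slots, so this gives up after `capacity` tries.
    static Integer[] insertAll(int capacity, int... keys) {
        Integer[] table = new Integer[capacity];
        for (int key : keys) {
            boolean placed = false;
            for (int i = 0; i < capacity && !placed; i++) {
                int index = (key % capacity + i * i) % capacity;
                if (table[index] == null) {
                    table[index] = key;
                    placed = true;
                }
            }
            if (!placed) throw new IllegalStateException("no open slot found for " + key);
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(insertAll(10, 89, 18, 49, 58, 79)));
        // prints [49, null, 58, 79, null, null, null, null, 18, 89]
    }
}
```

Under that assumption, 49 lands in slot 0, 58 in slot 2, and 79 in slot 3. Colliding keys still end up near each other in this small table, but each one jumps by growing steps rather than walking through the same run one slot at a time.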

Open addressing: double-hashing

How many different probe sequences are there?

There are T different starting positions and T − 1 different jump intervals (since we can’t jump by 0), so there are O(T^2) different probe sequences.

Result: in practice, double-hashing is very effective and commonly used “in the wild”.
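The counting argument can be checked by brute force. The preview omits the slide that defines double hashing, so this sketch assumes the usual form h′(k, i) = (h(k) + i · g(k)) mod T, where the starting position is h(k) and the jump interval is g(k), a second hash that is never 0. The names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DoubleHashingDemo {
    // Probe sequence (start + i * jump) mod tableSize for i = 0 .. tableSize - 1.
    static List<Integer> probeSequence(int start, int jump, int tableSize) {
        List<Integer> sequence = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            sequence.add((start + i * jump) % tableSize);
        }
        return sequence;
    }

    // Counts distinct sequences over every start in [0, T) and every jump in [1, T).
    static int countDistinctSequences(int tableSize) {
        Set<List<Integer>> seen = new HashSet<>();
        for (int start = 0; start < tableSize; start++) {
            for (int jump = 1; jump < tableSize; jump++) {
                seen.add(probeSequence(start, jump, tableSize));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(probeSequence(3, 2, 7));    // prints [3, 5, 0, 2, 4, 6, 1]
        System.out.println(countDistinctSequences(7)); // prints 42
    }
}
```

For T = 7 there are 7 · 6 = 42 sequences, compared with only 7 for linear probing, where the starting position alone determines the whole path. A prime table size is a common choice here because every jump interval then visits every slot.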


Summary

So, what strategy is best? Separate chaining? Open addressing? No obvious answer: both implementations are common.

Separate chaining:

- Don’t have to worry about clustering
- Potentially more “compact” (λ can be higher)

Open addressing:

- Managing clustering can be tricky
- Less compact (we typically keep λ < 1/2)
- Array lookups tend to be a constant factor faster than traversing pointers

Applications of hash functions

Can we use hash functions for more than just dictionaries?

Yes! Lots of possible applications, ranging from cryptography to biology.

Important: Depending on the application, we might want our hash function to have different properties.

How would you implement the following using hash functions? For each application, also discuss what properties you want your hash function to have.

- Suppose we’re sending a message over the internet. This message might become mildly corrupted. How can we detect if corruption probably occurred?
- Suppose you have many fragments of DNA and want to see where they appear in a (significantly longer) segment of DNA. How can we do this efficiently?
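One possible answer to the first question, offered only as a sketch: the sender computes a short checksum of the message and sends it along, and the receiver recomputes it and compares. The example uses CRC-32 from java.util.zip; picking CRC-32 is an assumption made for illustration, and ChecksumDemo is a hypothetical name. The property that matters here is that small changes to the input should change the hash, not that the hash is hard to forge.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class ChecksumDemo {
    // Computes the CRC-32 checksum of the given bytes.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] sent = "hello, world".getBytes(StandardCharsets.UTF_8);
        long expected = checksum(sent);

        byte[] received = sent.clone();
        received[0] ^= 1; // simulate a single flipped bit in transit

        System.out.println(checksum(received) == expected); // prints false
    }
}
```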

Same question as before:

- Suppose you’re designing a video uploading site and want to detect if somebody is uploading a pirated movie. A naive way to do this is to check if the movie is byte-for-byte identical to some movie. How can we do this more efficiently?
- Suppose you’re designing a website with a user login system. Directly storing your users’ passwords is dangerous – what if they get stolen? How can you store passwords in a safe way so that even if they’re stolen, the passwords aren’t compromised?

- You are trying to build an image sharing site. Users upload many images, and you need to assign each image some unique ID. How might you do this?
- Suppose we have a long series of financial transactions stored on some (potentially untrustworthy) computer. Somebody claims they made a specific transaction several months ago. Can you design a system that lets you audit and determine if they’re lying or not? Assume you have access to just the very latest transaction, obtained from a different trustworthy source.
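For the image ID question, one possible approach (not necessarily the one the lecture has in mind) is to use a hash of the image bytes as the ID. The sketch below uses SHA-256 through java.security.MessageDigest; ContentId and idFor are hypothetical names. IDs produced this way are unique only in the sense that collisions are astronomically unlikely, and identical uploads deliberately share an ID.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ContentId {
    // Hex-encoded SHA-256 digest of the content; identical bytes give identical IDs.
    static String idFor(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real site would hash the uploaded image file.
        System.out.println(idFor("abc".getBytes(StandardCharsets.UTF_8)));
        // prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
    }
}
```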
