
Problem Set 9 in CSE3358: String Matching, Maximal Repeats, Knapsack Problem, and Circuits, Exams of Data Structures and Algorithms

Five problems from the university course CSE3358 on computer science topics such as string matching with anagrams, finding maximal repeats in a string, the knapsack problem, and the largest common circuit of two logic circuits. Students are asked to design an algorithm for each problem. The document also gives the instructions and the number of points allocated for each problem.

Typology: Exams

2012/2013

Uploaded on 04/07/2013

seshu_lin3
CSE3358 Problem Set 9
04/15/05
Due 04/22/05


Problem 1: Relaxed string matching (20 points)

We have seen in class a number of algorithms for the string matching problem where, given a text T[1..n] over some finite alphabet Σ and a pattern P[1..m], we need to determine all positions in T where P occurs (called valid shifts). The naive algorithm runs in O(mn) time. The Rabin-Karp algorithm runs in O(n + mv) expected time, where v is the number of valid shifts. The suffix tree algorithm runs in O(n + m + v) = O(n + m) time. In this problem we consider a relaxed version of the string matching problem that can be solved in linear time.
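For reference, the naive O(mn) algorithm mentioned above can be sketched as follows. This is a minimal illustration, not the version presented in class; the function name `naive_valid_shifts` is hypothetical, and positions are reported 1-based to match the T[1..n] convention.

```python
def naive_valid_shifts(text, pattern):
    """Return all 1-based positions i where pattern occurs in text."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):           # every candidate 0-based start
        if text[s:s + m] == pattern:     # up to m symbol comparisons
            shifts.append(s + 1)         # report 1-based, as in T[1..n]
    return shifts
```

Each of the n − m + 1 candidate starts costs up to m comparisons, which gives the O(mn) bound.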

Let $T_i^m = T[i..i+m-1]$. We say $T_i^m$ is an anagram of P if there is a way of permuting the symbols of $T_i^m$ so that the resulting array is equal to P.
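The definition can be read as "the window and the pattern contain the same symbols with the same multiplicities". A direct, unoptimized transcription of that reading is sketched below; the name `is_anagram_window` is hypothetical and the index i is 1-based as in the problem statement.

```python
from collections import Counter

def is_anagram_window(text, pattern, i):
    """True if the length-m window of text starting at 1-based index i
    is a permutation of pattern (m = len(pattern))."""
    m = len(pattern)
    if i < 1 or i + m - 1 > len(text):
        return False                      # window does not fit inside text
    window = text[i - 1:i - 1 + m]
    # Two arrays are permutations of each other exactly when their
    # symbol counts agree.
    return Counter(window) == Counter(pattern)
```

For example, with text "cbaebabacd" and pattern "abc", the window at i = 1 is "cba", which is an anagram of the pattern, while the window at i = 2 is "bae", which is not.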

(a) (10 points) Give an algorithm that, given an index i, determines whether $T_i^m$ is an anagram of P. Your algorithm should be asymptotically as efficient as possible.

(b) (10 points) Give an algorithm that, given T and P, reports all i's such that $T_i^m$ is an anagram of P. Your algorithm should run in O(n + m) time.
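One possible approach to part (b), offered as a sketch rather than the intended solution, is a sliding window over symbol counts. It assumes symbols are hashable and that dictionary operations take expected constant time; the name `anagram_shifts` and the helper `bump` are hypothetical.

```python
from collections import Counter

def anagram_shifts(text, pattern):
    """Return all 1-based i such that text[i..i+m-1] is an anagram of pattern."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # diff[c] = (count of c in current window) - (count of c in pattern)
    diff = Counter(text[:m])
    diff.subtract(pattern)
    nonzero = sum(1 for c in diff.values() if c != 0)
    result = [1] if nonzero == 0 else []

    def bump(symbol, delta):
        """Change diff[symbol] by +1 or -1 and keep `nonzero` in sync."""
        nonlocal nonzero
        before = diff[symbol]
        diff[symbol] = before + delta
        if before == 0:
            nonzero += 1                  # entry moved away from zero
        elif before + delta == 0:
            nonzero -= 1                  # entry returned to zero

    for s in range(1, n - m + 1):         # s = 0-based start of the new window
        bump(text[s - 1], -1)             # symbol leaving on the left
        bump(text[s + m - 1], +1)         # symbol entering on the right
        if nonzero == 0:
            result.append(s + 1)
    return result
```

Building the initial counts costs O(m), and each of the remaining window moves does a constant number of count updates, so under the stated assumptions the total is O(n + m).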

Problem 2: Finding maximal repeats (20 points)

Consider a string T[1..n]. Define T[0] and T[n + 1] as a special symbol that does not occur in T. Consider a string P[1..m]. We say P is a maximal repeat of T if ∃ 0 < i, j ≤ n such that:

  • $P = T_i^m = T_j^m$, $i \neq j$
  • $T_{i-1}^{m+1} \neq T_k^{m+1} \ \forall k$
  • $T_i^{m+1} \neq T_k^{m+1} \ \forall k$

(a) (5 points) Based on the above definition, explain in plain English what a maximal repeat is.

(b) (10 points) Consider the suffix tree of T. Show that P being a maximal repeat of T is equivalent to the existence of a node v in the suffix tree of T such that:

  • v is not a leaf
  • the path from the root to v spells P
  • there are two leaf nodes in v's subtree, say i and j, such that T[i − 1] ≠ T[j − 1]

(c) (5 points) Show that a string T of length n has O(n) maximal repeats.

Problem 3: The Thief and the Knapsack problem... and You (20 points)

A thief robbing a store finds a set S of n items 1 to n. Item i has a value of $v_i$ dollars and a weight of $w_i$ pounds, where $v_i$ and $w_i$ are integers. He wants to take as valuable a load as possible, but he can carry at most W pounds in his knapsack for some integer W. Unfortunately, he did not take CSE3358; otherwise, he would not be stealing (would he?). But he needs an algorithm to find a set of items A ⊆ S of maximum total value such that

$\sum_{i \in A} w_i \le W$.

Now you have a moral issue here... Should you help the thief in solving his problem and in return get some points for this homework, or should you just refuse and lose the points? It's up to you... :) But if you decide not to help him, you definitely get 5 points for honesty.

(a) (10 points) This problem satisfies the optimality condition for subproblems: if A is an optimal solution for S with weight at most W, and i ∈ A, then A − {i} is an optimal solution for S − {i} with weight at most $W - w_i$ (cut and paste argument). Give a dynamic programming formulation for this problem based on the above idea and an algorithm to solve it in O(nW) time. More precisely, let $c(i, w)$ be the value of an optimal solution for the set {1, ..., i} with a weight at most w. Obviously, if $w_i > w$, then i cannot be part of the solution and $c(i, w) = c(i-1, w)$. Express $c(i, w)$ in terms of $c(i-1, w)$ and $c(i-1, w - w_i)$, describe how the dynamic programming table is computed, and finally explain how to obtain the solution itself from the table.
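One standard way to fill and read such a table is sketched below. It is an illustration under the assumptions that weights are positive integers and values are non-negative; the function name `knapsack` and the choice of a full (n + 1) × (W + 1) table are assumptions, not something the problem set prescribes.

```python
def knapsack(values, weights, capacity):
    """0/1 knapsack by dynamic programming.

    Returns (best_value, chosen) where chosen lists 1-based item numbers.
    """
    n = len(values)
    # c[i][w] = best total value using items 1..i with weight limit w
    c = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v_i, w_i = values[i - 1], weights[i - 1]
        for w in range(capacity + 1):
            c[i][w] = c[i - 1][w]                      # case: item i is left out
            if w_i <= w:                               # case: item i fits and is taken
                c[i][w] = max(c[i][w], c[i - 1][w - w_i] + v_i)
    # Walk back through the table to recover one optimal set of items.
    chosen, w = [], capacity
    for i in range(n, 0, -1):
        if c[i][w] != c[i - 1][w]:                     # value changed, so item i was taken
            chosen.append(i)
            w -= weights[i - 1]
    chosen.reverse()
    return c[n][capacity], chosen
```

The two nested loops touch each of the (n + 1)(W + 1) table cells once with constant work per cell, and the walk back takes O(n) steps.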

(b) (10 points) Assume that any two items i and j satisfy the following: $w_i < w_j \Rightarrow v_i > v_j$. In other words, lighter items are more valuable. Describe an efficient greedy way of solving the knapsack problem in this case with a running time independent of W. Use a "cut and paste" argument that justifies the greedy choice.
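A possible greedy strategy under this assumption is to take items from lightest to heaviest until the next one no longer fits. The sketch below additionally assumes non-negative values and breaks ties between equal weights by preferring the higher value; the name `greedy_knapsack` is hypothetical, and the correctness argument is still left to the reader as the problem asks.

```python
def greedy_knapsack(values, weights, capacity):
    """Greedy knapsack for the case where lighter items are more valuable.

    Returns (total_value, chosen) where chosen lists 1-based item numbers
    in the order they were taken.
    """
    # Lightest first; among equal weights, the more valuable item first.
    order = sorted(range(len(values)), key=lambda i: (weights[i], -values[i]))
    total_value, remaining, chosen = 0, capacity, []
    for i in order:
        if weights[i] > remaining:
            break                         # every later item is at least as heavy
        chosen.append(i + 1)
        remaining -= weights[i]
        total_value += values[i]
    return total_value, chosen
```

The cost is dominated by the sort, O(n log n), and does not depend on W.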

Problem 4: Largest common circuit (20 points)

Define a circuit as a logic network consisting of AND, OR, and NOT gates connected together such that:

  • every gate has a fan-in of 2 (a gate can have at most two inputs)
  • every gate has a fan-out of 1 (the output of a gate can be input to at most one gate)
  • there are no feedback loops

The problem: Given two circuits as described above, one with m gates and one with n gates, we would like to obtain the largest circuit (in terms of the number of gates) that is a sub-circuit of both.

(a) (5 points) Give a brute force algorithm for finding the largest common circuit of two circuits and analyze its running time.

(b) (15 points) Give a dynamic programming formulation of the problem and design an algorithm (based on dynamic programming) for finding the largest common circuit of two circuits that runs in O(mn) time.