Clustering Algorithms: Types, Sequential Algorithms, and Challenges, Slides of Pattern Classification and Recognition

Banasthali Vidyapith Pattern Classification and Recognition

These slides cover various clustering algorithms, their major categories, and the challenges they face. They discuss sequential clustering algorithms, such as the basic sequential clustering algorithm (BSAS) and the maxmin algorithm, together with refinement stages. They also touch on the issues of cluster closeness and sensitivity to the order of data presentation.

Typology: Slides

2011/2012

Uploaded on 07/17/2012

bandhula 🇮🇳


CLUSTERING ALGORITHMS

Number of possible clusterings

Let $X = \{x_1, x_2, \ldots, x_N\}$.

Question: In how many ways can the $N$ points be assigned into $m$ groups?

Answer:

$$S(N, m) = \frac{1}{m!} \sum_{i=0}^{m} (-1)^{m-i} \binom{m}{i} i^{N}$$

Examples:

• $S(15, 3) = 2\,375\,101$
• $S(20, 4) = 45\,232\,115\,901$
• $S(100, 5) \approx 10^{68}$ !!
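The counting formula above can be checked numerically. The following is a minimal sketch in Python; the function name `count_clusterings` is a hypothetical choice made for illustration and does not come from the slides.

```python
from math import comb, factorial


def count_clusterings(n_points: int, n_clusters: int) -> int:
    """Evaluate S(N, m): ways to split N points into m non-empty groups."""
    total = sum(
        (-1) ** (n_clusters - i) * comb(n_clusters, i) * i ** n_points
        for i in range(n_clusters + 1)
    )
    # The alternating sum is always divisible by m!, so integer division is exact.
    return total // factorial(n_clusters)


print(count_clusterings(15, 3))   # 2375101
print(count_clusterings(20, 4))   # 45232115901
print(count_clusterings(100, 5))  # a 68-digit integer, on the order of 10^68
```

Python integers have arbitrary precision, so the third example is computed exactly rather than approximated.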


^

A way out:^ 

Consider only a small fraction of clusterings of

X^

and

select a “sensible” clustering among them.• Question 1: Which fraction of clusterings is

considered?• Question 2: What “sensible” means?• The answer depends on the specific clusteringalgorithm and the specific criteria to be adopted.

^

Cost function optimization.

For most of the cases a

single clustering is obtained.^ ^ Hard clustering

(each point belongs exclusively to a single

cluster):• Basic hard clustering algorithms (e.g.,

k -means)

-^ k

-medoids algorithms

Mixture decomposition• Branch and bound• Simulated annealing• Deterministic annealing• Boundary detection• Mode seeking• Genetic clustering algorithms

^ Fuzzy clustering

(each point belongs to more than one

clusters simultaneously).  Possibilistic clustering

(it is based on the

possibility of a

point to belong to a cluster).

^

Other schemes:^ ^ Algorithms based on graph theory

(^ e.g., Minimum

Spanning Tree, regions of influence, directed trees).  Competitive learning algorithms (basic competitivelearning scheme, Kohonen self organizing maps).  Subspace clustering algorithms.  Binary morphology clustering algorithms.

^

Basic Sequential Clustering Algorithm (BSAS)^ •^

m =

{number of clusters}\

-^ C

={ m

x }^1

-^ For

i =

to^

N

^ Find

C: dk^

( x^ ,Ci^

)= mink

1  j  m

d ( x^ i

,C ) j

^ If

( d (

x^ ,Ci^

)> Θk

)^ AND

( m <

q )^ then

o^ m

= m + o^ C

={ m x^ } i

^ Else

o^ C

= Ck

{ k x^ } i

o Where necessary, update representatives (*) ^ End {if}

-^ End {for} (*) When the mean vector

m^ C

is used as representative of the cluster

C

with

nc^

elements, the updating in the light of a new vector

x^ becomes

m^ C

new =(

nC^

m^ C

+^ x

nC +1)
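The BSAS steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: it assumes Euclidean distance, uses the cluster mean as representative (so $d(x, C)$ is the distance to the mean), and reads $\Theta$ as a dissimilarity threshold and $q$ as the maximum allowed number of clusters. The names `bsas`, `theta`, and `q` are hypothetical.

```python
import numpy as np


def bsas(X, theta, q):
    """Single-pass sequential clustering; returns (labels, cluster means)."""
    X = np.asarray(X, dtype=float)
    means = [X[0].copy()]   # C_1 = {x_1}
    counts = [1]            # n_C for each cluster
    labels = [0]
    for x in X[1:]:
        # Find the closest existing cluster C_k (assumed: distance to its mean).
        dists = [np.linalg.norm(x - m) for m in means]
        k = int(np.argmin(dists))
        if dists[k] > theta and len(means) < q:
            # Too far from every cluster and room left: open a new cluster.
            means.append(x.copy())
            counts.append(1)
            labels.append(len(means) - 1)
        else:
            # Assign to C_k and update its mean incrementally.
            means[k] = (counts[k] * means[k] + x) / (counts[k] + 1)
            counts[k] += 1
            labels.append(k)
    return np.array(labels), np.array(means)


points = [[0, 0], [0.1, 0], [5, 5], [5.1, 5], [0, 0.1]]
labels, means = bsas(points, theta=1.0, q=10)
print(labels)  # [0 0 1 1 0]
```

Because each vector is assigned as soon as it is seen, shuffling `points` before the call can change both the number of clusters and their contents.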

^

Remarks:•^

The order of presentation of the data in the algorithm plays importantrole in the clustering results. Different order of presentation may leadto totally different clustering results, in terms of the number ofclusters as well as the clusters themselves.• In BSAS the decision for a vector

x^ is reached prior to the final cluster

formation.• BSAS perform a single pass on the data. Its complexity is

O (

N ).

-^ If clusters are represented by point representatives, compact clustersare favored.

^

MBSAS, a modification of BSASIn BSAS a decision for a data vector

x^ is reached prior to the final cluster

formation, which is determined after all vectors have been presented tothe algorithm.•^

MBSAS deals with the above drawback,

at the cost of presenting the

data twice to the algorithm.• MBSAS consists of:^ ^

A cluster determination phase (first pass on the data),which is the same as BSAS with the exception that no vector is assignedto an already formed cluster. At the end of this phase, each clusterconsists of a single element.  A pattern classification phase (second pass on the data),where each one of the unassigned vector is assigned to its closest cluster.

^ Remarks:•^

In MBSAS, a decision for a vector

x^ during the pattern classification

phase is reached taking into account all clusters.• MBSAS is

sensitive to the order of presentation of the vectors.

-^ MBSAS requires two passes on the data. Its complexity is

O (

N ).
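The two phases of MBSAS can be sketched in Python as below. This is a sketch under assumptions: Euclidean distance is used, and each cluster keeps its founding vector as a fixed representative during the second pass (updating running means during that pass would be an alternative). The names `mbsas`, `theta`, and `q` are hypothetical, with `theta` and `q` playing the same roles as in BSAS.

```python
import numpy as np


def mbsas(X, theta, q):
    """Two-pass sequential clustering; returns (labels, seed indices)."""
    X = np.asarray(X, dtype=float)

    # Pass 1 (cluster determination): only decide which vectors found a cluster.
    seeds = [0]
    for i in range(1, len(X)):
        d_min = min(np.linalg.norm(X[i] - X[s]) for s in seeds)
        if d_min > theta and len(seeds) < q:
            seeds.append(i)

    # Pass 2 (pattern classification): assign every vector to its closest
    # cluster. Assumption: representatives stay fixed at the seed vectors.
    reps = X[seeds]
    dists = np.linalg.norm(X[:, None, :] - reps[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)
    return labels, seeds


points = [[0, 0], [0.1, 0], [5, 5], [5.1, 5], [0, 0.1]]
labels, seeds = mbsas(points, theta=1.0, q=10)
print(labels, seeds)  # [0 0 1 1 0] [0, 2]
```

Every assignment in the second pass sees all clusters found in the first pass, which is the point of the modification.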

^

The maxmin algorithmLet

W^

be the set of all points that have been chosen to form clusters up to the current iteration step. The formation of clusters is carried out asfollows:

For each

x 

X-W

determine

d = x

min

z  W^

d ( x,z

Determine

y: d

=max y

x  X-W

d x

d y^

is greater than a prespecified threshold then  this vector forms a new cluster

else

^ the cluster determination phase of the algorithm terminates.

End {if} After the formation of the clusters, each unassigned vector is assigned toits closest cluster.
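A minimal Python sketch of the maxmin procedure follows. The slides do not say how $W$ is initialized, so this sketch assumes it starts with the first vector; it also assumes Euclidean distance and a non-negative threshold. The names `maxmin` and `threshold` are hypothetical.

```python
import numpy as np


def maxmin(X, threshold):
    """Maxmin cluster determination, then nearest-representative assignment."""
    X = np.asarray(X, dtype=float)
    chosen = [0]  # assumption: W is initialized with the first vector

    # d_min[i] holds d_x for X[i]: its distance to the nearest point of W.
    d_min = np.linalg.norm(X - X[0], axis=1)
    while len(chosen) < len(X):
        y = int(np.argmax(d_min))   # the point farthest from W
        if d_min[y] <= threshold:
            break                   # cluster determination phase terminates
        chosen.append(y)            # y forms a new cluster
        d_min = np.minimum(d_min, np.linalg.norm(X - X[y], axis=1))

    # Assign every vector to its closest chosen point.
    reps = X[chosen]
    dists = np.linalg.norm(X[:, None, :] - reps[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)
    return labels, chosen


points = [[0, 0], [0.1, 0], [5, 5], [5.1, 5], [0, 0.1]]
labels, chosen = maxmin(points, threshold=1.0)
print(labels, chosen)  # [0 0 1 1 0] [0, 3]
```

Points already in $W$ have distance zero to $W$, so they are never selected again while the threshold is non-negative.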

^

Refinement stagesThe problem of closeness of clusters: “

In all the above algorithms it may

happen that two formed clusters lie very close to each other”.^ ^

A simple merging procedure• (

A) Find

C , i

Cj^

( i < j

) such that

d ( C

,C )= i j

min

k,r =

,…,m,k

d (≠ r C,Ck^

) r

d ( C

,C ) i j

 M

then { 1

M^1

is a user-defined threshold }

^ Merge

C , i

Cj^

to^ C

and eliminate i

Cj^

^ If necessary, update the cluster representative of

Ci^

^ Rename the clusters

Cj +

,…,C

to m C,…,Cj^

, respectively. m -

^ m=m

^ Go to (

A)

Else

^ Stop

End {if}
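The merging loop can be sketched in Python as follows. This is an illustration under assumptions: clusters are summarized by a mean vector and an element count, and the distance between two clusters is taken to be the Euclidean distance between their means (the slides leave $d(C_i, C_j)$ unspecified). The names `merge_close_clusters` and `m1` are hypothetical.

```python
import numpy as np


def merge_close_clusters(means, counts, m1):
    """Repeatedly merge the closest pair of clusters while their distance <= m1."""
    means = [np.asarray(m, dtype=float) for m in means]
    counts = list(counts)
    while len(means) > 1:
        # (A) Find the closest pair (i < j), assuming distance between means.
        best = None
        for i in range(len(means)):
            for j in range(i + 1, len(means)):
                d = np.linalg.norm(means[i] - means[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > m1:
            break  # Stop: no pair is close enough
        # Merge C_j into C_i and update the representative of C_i.
        total = counts[i] + counts[j]
        means[i] = (counts[i] * means[i] + counts[j] * means[j]) / total
        counts[i] = total
        # Deleting entry j shifts later clusters down, which renames them.
        del means[j], counts[j]
    return np.array(means), counts


means, counts = merge_close_clusters([[0, 0], [0.5, 0], [10, 10]], [1, 3, 2], m1=1.0)
print(means, counts)
```

In the usage example, the first two clusters are merged into one with four elements and the third is left untouched.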

^

The problem of sensitivity to the order of data presentation:“A vector

x^ may have been assigned to a cluster

at the current stage

but another cluster

may be formed at a later stage that lies closer to

x ”

^ A simple reassignment procedure

i =

to^

N

^ Find

Cj^

such that

d ( x

,C )= i^ j

min

k =1 ,…,m

d ( x^ i

,C ) k

^ Set

b ( i

)= j^

{^ b

( i )^ is the index of the cluster that lies closet to

x } i^

End {for}• For

j =

to^

m ^ Set

C ={ j

x^  i X: b

( i )=

j }

^ If necessary, update representatives

End {for}
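One pass of the reassignment procedure can be sketched in Python as below. The sketch assumes Euclidean distance and mean-vector representatives, and it leaves the representative of an empty cluster unchanged, which is a choice made here rather than something the slides specify. The name `reassign` is hypothetical.

```python
import numpy as np


def reassign(X, means):
    """Reassign each vector to its closest cluster, then recompute the means."""
    X = np.asarray(X, dtype=float)
    means = np.asarray(means, dtype=float)

    # First loop: b[i] is the index of the cluster closest to x_i.
    dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    b = np.argmin(dists, axis=1)

    # Second loop: rebuild each cluster from b and update its representative.
    new_means = means.copy()
    for j in range(len(means)):
        members = X[b == j]
        if len(members) > 0:  # assumption: an empty cluster keeps its old mean
            new_means[j] = members.mean(axis=0)
    return b, new_means


b, new_means = reassign([[0, 0], [1, 0], [10, 0], [11, 0]], [[0, 0], [4, 0]])
print(b)          # [0 0 1 1]
print(new_means)  # rows [0.5, 0.0] and [10.5, 0.0]
```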


^

Remarks:•^

In practice, a few passes (

^2 ) of the data set are required.

-^ TTSAS is less sensitive to the order of data presentation, compared toBSAS.
