









These slides cover various clustering algorithms, their major categories, and the challenges they face. They discuss sequential clustering algorithms, such as the basic sequential algorithmic scheme (BSAS) and the maxmin algorithm, as well as refinement stages. They also touch upon the issues of cluster closeness and sensitivity to the order of data presentation.
single clustering is obtained.
• Hard clustering (each point belongs exclusively to a single cluster):
  - Basic hard clustering algorithms (e.g., k-means)
  - k-medoids algorithms
• Fuzzy clustering (each point belongs to more than one cluster)
• Possibilistic clustering (it is based on the possibility of a point to belong to a cluster)
• $m = 1$ {number of clusters}
• $C_m = \{x_1\}$
• For $i = 2$ to $N$
  - Find $C_k$: $d(x_i, C_k) = \min_{1 \le j \le m} d(x_i, C_j)$
  - If ($d(x_i, C_k) > \Theta$) AND ($m < q$) then
    o $m = m + 1$
    o $C_m = \{x_i\}$
  - Else
    o $C_k = C_k \cup \{x_i\}$
    o Where necessary, update representatives (*)
  - End {if}
• End {for}

(*) When the mean vector $m_C$ is used as representative of the cluster $C$ with $n_C$ elements, the updating in the light of a new vector $x$ becomes $m_C^{new} = \frac{(n_C^{new} - 1)\, m_C^{old} + x}{n_C^{new}}$.
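The scheme above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes Euclidean distance, takes the distance from a vector to a cluster to be the distance to the cluster's mean vector, and uses the hypothetical names `bsas`, `theta` (the dissimilarity threshold) and `q` (the maximum number of clusters allowed).

```python
import math

def bsas(points, theta, q):
    """One-pass BSAS sketch; returns (clusters as index lists, mean vectors)."""
    clusters = [[0]]                  # the first vector opens the first cluster
    reps = [list(points[0])]          # assumption: mean vector as representative
    for i in range(1, len(points)):
        x = points[i]
        # Find the closest cluster, measuring distance to each mean vector.
        dists = [math.dist(x, r) for r in reps]
        k = min(range(len(reps)), key=dists.__getitem__)
        if dists[k] > theta and len(clusters) < q:
            clusters.append([i])      # x is too far from every cluster: open a new one
            reps.append(list(x))
        else:
            clusters[k].append(i)
            n = len(clusters[k])      # cluster size after adding x
            # Running-mean update: m_new = ((n - 1) * m_old + x) / n
            reps[k] = [((n - 1) * m + xi) / n for m, xi in zip(reps[k], x)]
    return clusters, reps

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (0.5, 0.5)]
clusters, reps = bsas(points, theta=2.0, q=3)
print(clusters)  # [[0, 1, 4], [2, 3]]
```

Updating the mean incrementally keeps the cost of each step independent of the cluster size, which is what makes a single pass over the data sufficient.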
• The order of presentation of the data to the algorithm plays an important role in the clustering results. A different order of presentation may lead to totally different clustering results, in terms of the number of clusters as well as the clusters themselves.
• In BSAS the decision for a vector $x$ is reached prior to the final cluster formation.
• BSAS performs a single pass on the data. Its complexity is $O(N)$.
• If clusters are represented by point representatives, compact clusters are favored.
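The sensitivity to presentation order can be seen on a tiny example. The sketch below is a stripped-down one-dimensional variant written only for this illustration (the name `bsas_1d`, the threshold `theta` and the cluster limit `q` are assumptions, and the mean is used as cluster representative); it runs the same three values in two different orders.

```python
def bsas_1d(values, theta, q):
    """Minimal 1-D BSAS sketch with mean representatives; returns the clusters."""
    clusters = [[values[0]]]
    for x in values[1:]:
        means = [sum(c) / len(c) for c in clusters]
        # Index of the cluster whose mean is closest to x.
        k = min(range(len(clusters)), key=lambda j: abs(x - means[j]))
        if abs(x - means[k]) > theta and len(clusters) < q:
            clusters.append([x])      # open a new cluster
        else:
            clusters[k].append(x)     # join the closest existing cluster
    return clusters

first = bsas_1d([1.0, 2.0, 3.0], theta=1.6, q=5)
second = bsas_1d([1.0, 3.0, 2.0], theta=1.6, q=5)
print(first)   # [[1.0, 2.0, 3.0]]
print(second)  # [[1.0, 2.0], [3.0]]
```

With the same data and threshold, the first ordering yields one cluster and the second yields two, because the value 3.0 is compared against a different representative in each run.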
• In BSAS, the decision for a vector $x$ is reached prior to the final cluster formation, which is determined after all vectors have been presented to the algorithm.
• MBSAS deals with the above drawback, at the cost of presenting the data twice to the algorithm.
• MBSAS consists of:
  - A cluster determination phase (first pass on the data), which is the same as BSAS with the exception that no vector is assigned to an already formed cluster. At the end of this phase, each cluster consists of a single element.
  - A pattern classification phase (second pass on the data), where each one of the unassigned vectors is assigned to its closest cluster.

Remarks:
• In MBSAS, a decision for a vector $x$ during the pattern classification phase is reached taking into account all clusters.
• MBSAS is sensitive to the order of presentation of the vectors.
• MBSAS requires two passes on the data. Its complexity is $O(N)$.
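The two phases of MBSAS can be sketched as follows. As with any short illustration, several choices are assumptions rather than part of the slides: Euclidean distance, mean vectors as representatives (updated during the second pass), and the hypothetical names `mbsas`, `theta` and `q`.

```python
import math

def mbsas(points, theta, q):
    """Two-pass MBSAS sketch; returns (clusters as index lists, mean vectors)."""
    # Pass 1 (cluster determination): only open new clusters, assign nothing else.
    clusters, reps = [[0]], [list(points[0])]
    seeds = {0}
    for i in range(1, len(points)):
        nearest = min(math.dist(points[i], r) for r in reps)
        if nearest > theta and len(clusters) < q:
            clusters.append([i])
            reps.append(list(points[i]))
            seeds.add(i)
    # Pass 2 (pattern classification): each remaining vector joins its closest cluster.
    for i, x in enumerate(points):
        if i in seeds:
            continue
        k = min(range(len(reps)), key=lambda j: math.dist(x, reps[j]))
        clusters[k].append(i)
        n = len(clusters[k])
        # Running-mean update of the representative after adding x.
        reps[k] = [((n - 1) * m + xi) / n for m, xi in zip(reps[k], x)]
    return clusters, reps

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (0.5, 0.5)]
clusters, reps = mbsas(points, theta=2.0, q=3)
print(clusters)  # [[0, 1, 4], [2, 3]]
```

Because every cluster already exists when the second pass starts, each assignment there is made with all clusters in view, which is the point of the modification.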
Let $W$ be the set of all points that have been chosen to form clusters up to the current iteration step. The formation of clusters is carried out as follows:
• For each $x \in X - W$ determine $d_x = \min_{z \in W} d(x, z)$
• Determine $y$: $d_y = \max_{x \in X - W} d_x$
• If $d_y$ is greater than a prespecified threshold then
  - this vector forms a new cluster
• Else
  - the cluster determination phase of the algorithm terminates.
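The cluster determination phase described above can be sketched as a short loop. The function name `maxmin_seeds` and the parameter `threshold` are hypothetical, Euclidean distance is assumed, and starting $W$ from the first point is an assumption made here because the text does not say how $W$ is initialized.

```python
import math

def maxmin_seeds(points, threshold):
    """Return the indices of the points chosen to form clusters (the set W)."""
    W = [0]  # assumption: the first point seeds the first cluster
    while len(W) < len(points):
        rest = [i for i in range(len(points)) if i not in W]
        # d_x: distance from each remaining point to its nearest member of W.
        d = {i: min(math.dist(points[i], points[z]) for z in W) for i in rest}
        # y: the remaining point that lies farthest from W.
        y = max(rest, key=d.__getitem__)
        if d[y] > threshold:
            W.append(y)   # y forms a new cluster
        else:
            break         # the cluster determination phase terminates
    return W

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (0.5, 0.5)]
seeds = maxmin_seeds(points, threshold=2.0)
print(seeds)  # [0, 3]
```

Each iteration rescans all remaining points against $W$, so this phase is more expensive than a single sequential pass, in exchange for seeds that are well separated.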
A simple merging procedure
• (A) Find $C_i$, $C_j$ ($i < j$) such that $d(C_i, C_j) = \min_{k, r = 1, \dots, m,\; k \ne r} d(C_k, C_r)$
• If $d(C_i, C_j) \le M_1$ then {$M_1$ is a user-defined threshold}
  - Merge $C_i$, $C_j$ to $C_i$ and eliminate $C_j$
  - If necessary, update the cluster representative of $C_i$
  - Rename the clusters $C_{j+1}, \dots, C_m$ to $C_j, \dots, C_{m-1}$, respectively
  - $m = m - 1$
  - Go to (A)
• Else
  - Stop
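The merging procedure above might be sketched as follows. The distance between two clusters is taken here to be the Euclidean distance between their mean vectors, which is only one possible choice; the names `merge_clusters` and `m1` (the user-defined threshold) are hypothetical.

```python
import math

def merge_clusters(clusters, m1):
    """Repeatedly merge the two closest clusters while their distance is <= m1."""
    clusters = [list(c) for c in clusters]   # work on a copy

    def mean(c):
        return [sum(col) / len(c) for col in zip(*c)]

    while len(clusters) > 1:
        reps = [mean(c) for c in clusters]
        # All pairs (i < j) with the distance between their representatives.
        pairs = [(math.dist(reps[i], reps[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        if d > m1:
            break                            # closest pair is too far apart: stop
        clusters[i].extend(clusters[j])      # merge C_j into C_i
        del clusters[j]                      # later clusters shift down (the renaming step)
    return clusters

initial = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 5.0)], [(0.5, 0.5)], [(5.0, 6.0)]]
merged = merge_clusters(initial, m1=2.0)
print(len(merged))  # 2
```

Deleting the merged cluster from the list plays the role of the renaming and the $m = m - 1$ steps, since the remaining clusters are renumbered implicitly.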
[Figure: two clusters $C_i$ and $C_j$ and a vector $x$.]

A simple reassignment procedure
• For $i = 1$ to $N$
  - Find $C_j$ such that $d(x_i, C_j) = \min_{k = 1, \dots, m} d(x_i, C_k)$
  - Set $b(i) = j$ {$b(i)$ is the index of the cluster that lies closest to $x_i$}
• For $j = 1$ to $m$
  - Set $C_j = \{x_i \in X : b(i) = j\}$
  - If necessary, update representatives
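A possible sketch of the reassignment procedure is given below. It assumes Euclidean distance to mean-vector representatives; the function name `reassign` is hypothetical, and keeping the old representative when a cluster ends up empty is a choice made here, not something the procedure specifies.

```python
import math

def reassign(points, reps):
    """Reassign every point to its closest representative, then update the means."""
    # b[i] is the index of the cluster that lies closest to points[i].
    b = [min(range(len(reps)), key=lambda k: math.dist(x, reps[k])) for x in points]
    # Rebuild each cluster from the labels.
    clusters = [[i for i, bi in enumerate(b) if bi == j] for j in range(len(reps))]
    new_reps = []
    for j, members in enumerate(clusters):
        if members:
            cols = zip(*(points[i] for i in members))
            new_reps.append([sum(col) / len(members) for col in cols])
        else:
            new_reps.append(list(reps[j]))   # assumption: keep the old representative
    return clusters, new_reps

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (2.0, 2.0)]
clusters, new_reps = reassign(points, reps=[(0.0, 0.5), (5.0, 5.5)])
print(clusters)  # [[0, 1, 4], [2, 3]]
```

Unlike the sequential schemes, every point here is compared against the final set of clusters, so a vector that was placed early in a cluster that is no longer its closest one gets moved.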
Remarks:
• In practice, a few passes ($\ge 2$) of the data set are required.
• TTSAS is less sensitive to the order of data presentation, compared to BSAS.