The basic k-means clustering algorithm partitions a data set into k clusters, each represented by a centroid, using a proximity function such as Euclidean distance to measure how close a point is to a centroid. The algorithm first chooses k points as initial centroids and then assigns each point to the cluster with the closest centroid. The algorithm is formally described as follows:
Input: A data set D containing m objects (points) with n attributes in a Euclidean space
Output: A partitioning of the m objects into k clusters C1, C2, C3, …, Ck, i.e. Ci ⊂ D and Ci ∩ Cj = ∅ (for 1 ≤ i, j ≤ k, i ≠ j)
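To make the input/output contract above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a definitive implementation: the function name kmeans, the parameters initial_centroids and max_iter, and the sample data are hypothetical. The proximity function is assumed to be Euclidean distance, and the centroid update (mean of the assigned points) is the standard k-means step, included here so that the sketch runs end to end.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Partition `points` (m tuples of n floats) into k clusters.

    `initial_centroids` holds the k starting centroids (an assumption:
    the caller picks them). Returns (labels, centroids), where
    labels[i] is the index of the cluster that points[i] belongs to.
    """
    centroids = [list(c) for c in initial_centroids]
    k = len(centroids)
    labels = [None] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels

        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centroids


# Hypothetical data set D with m = 6 points and n = 2 attributes.
D = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(D, initial_centroids=[D[0], D[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because every point receives exactly one label, the clusters Ci = {points with label i} are pairwise disjoint subsets of D, which matches the output condition stated above. Passing the initial centroids explicitly keeps the sketch deterministic; how they are chosen in practice is left open here.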