Week 7 – Oct 20

This week I went through an overview of clustering as a whole.

There are many clustering methods, but two of the most commonly encountered families are:

  1. Hierarchical Clustering:
    • Agglomerative: This approach starts with individual data points and gradually combines them into larger clusters. The result is a hierarchical structure, often depicted as a dendrogram.
    • Divisive: In contrast, divisive clustering begins with all data points grouped together and then progressively splits them into smaller clusters until individual data points are reached.
  2. Partitional and Other Non-Hierarchical Clustering:
    • K-Means: A widely used partitional method that divides data into ‘k’ clusters, where ‘k’ is a parameter set by the user. It aims to minimize the sum of squared distances between data points and the center (centroid) of their assigned cluster.
    • DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Strictly a density-based rather than a partitional method. DBSCAN forms clusters where data points are densely packed and labels points in sparse regions as noise, without needing the number of clusters in advance.
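    • Gaussian Mixture Models (GMM): GMM assumes data points originate from a mixture of Gaussian distributions. It estimates the parameters of these distributions, typically with the Expectation-Maximization (EM) algorithm, and assigns each point to the component most likely to have generated it.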
    • Fuzzy Clustering: Unlike traditional clustering, where each data point belongs exclusively to one cluster, fuzzy clustering allows data points to have partial membership in multiple clusters.
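
To make the difference between the bottom-up (agglomerative) and centroid-based (K-Means) ideas above more concrete, here is a minimal sketch in plain Python. It is only an illustration under a few assumptions of mine: the toy `points` are made up, the agglomerative version uses single linkage (distance between the closest pair of members), and K-Means is initialized with the first `k` points rather than a random or smarter start. A real project would more likely use a library implementation.

```python
import math


def agglomerative(points, n_clusters):
    """Single-linkage agglomerative clustering (illustrative sketch).

    Starts with every point in its own cluster and repeatedly merges
    the two closest clusters until n_clusters remain.
    """
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


def k_means(points, k, n_iter=10):
    """Basic K-Means (Lloyd's algorithm), illustrative sketch.

    Assumption: the first k points serve as the initial centroids,
    which keeps the example deterministic but is not a robust choice.
    """
    centroids = list(points[:k])
    groups = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return groups, centroids


# Hypothetical toy data: two well-separated blobs of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

hier = agglomerative(points, 2)
groups, centroids = k_means(points, 2)

print("agglomerative:", hier)
print("k-means groups:", groups)
print("k-means centroids:", centroids)
```

On data this cleanly separated both approaches should recover the same two groups. The practical difference is that the agglomerative version records a full sequence of merges (the information a dendrogram displays), whereas K-Means only returns a flat partition for the chosen ‘k’.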

I believe hierarchical clustering would be quite beneficial for the project.
