Features and functionality described on this page are available with Prism Enterprise.

The Dunn index is a clustering validation metric that evaluates clustering quality by measuring the ratio between the minimal inter-cluster distance and the maximal intra-cluster distance. Developed by Dunn in 1974, this index is designed to identify clusters that are compact and well-separated.

The fundamental principle behind the Dunn index is that good clustering should have large distances between clusters and small distances within clusters. The index captures this by taking the ratio of the smallest distance between different clusters to the largest distance within any cluster.

Mathematical calculation

The Dunn index is calculated as follows:

\[
D = \frac{\min_{1 \le i < j \le k} d(C_i, C_j)}{\max_{1 \le l \le k} \Delta(C_l)}
\]

where:

k is the number of clusters

d(Ci, Cj) is the distance between clusters i and j

Δ(Cl) is the diameter of cluster l

Inter-cluster distance

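The distance between two clusters Ci and Cj is defined as the minimum distance between a point in Ci and a point in Cj:

\[
d(C_i, C_j) = \min_{x \in C_i,\; y \in C_j} d(x, y)
\]

Here d(x, y) is the distance between the individual points x and y. The cluster distance is therefore the distance between the closest pair of points, one taken from each cluster.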

Cluster diameter

The diameter of a cluster C is defined as the maximum distance between any two points within the cluster:

\[
\Delta(C) = \max_{x,\, y \in C} d(x, y)
\]

This represents the largest distance between any pair of points within the cluster, measuring the cluster's internal spread.
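The three definitions above can be combined into a short Python sketch. This is only an illustration, not Prism's implementation: it assumes Euclidean distance between points, and the function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def inter_cluster_distance(ci, cj):
    """Smallest distance between a point in ci and a point in cj."""
    return min(dist(x, y) for x in ci for y in cj)


def diameter(c):
    """Largest distance between two points of c (0 for a single point)."""
    return max((dist(x, y) for x, y in combinations(c, 2)), default=0.0)


def dunn_index(clusters):
    """Minimum inter-cluster distance divided by the maximum diameter."""
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")
    min_separation = min(
        inter_cluster_distance(ci, cj) for ci, cj in combinations(clusters, 2)
    )
    max_diameter = max(diameter(c) for c in clusters)
    if max_diameter == 0:
        raise ValueError("undefined when every cluster has zero diameter")
    return min_separation / max_diameter


# Hypothetical toy data: two tight groups of 2-D points.
clusters = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(5.0, 0.0), (5.0, 1.0), (6.0, 0.0)],
]
print(dunn_index(clusters))
```

For this toy data the closest pair of points from different clusters is 4 units apart and both diameters equal the square root of 2, so the index is 4/√2, or about 2.83.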

Interpretation

The Dunn index evaluates clustering quality through the relationship between inter-cluster separation and intra-cluster compactness:

Higher Dunn values: Indicate better clustering with compact clusters that are well-separated from each other

Lower Dunn values: Suggest that clusters are either internally spread out or too close to neighboring clusters

When the index is used to choose the number of clusters, the candidate clustering with the largest Dunn index is considered optimal. The index is large when:

The minimum inter-cluster distance is large (clusters are well-separated)

The maximum intra-cluster distance is small (clusters are compact)
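As a minimal sketch of this selection rule, the Python example below scores three candidate partitions of the same points and keeps the one with the largest index. The partitions are written by hand and are hypothetical; in practice each would come from running a clustering algorithm with a given number of clusters. Euclidean distance is assumed.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Minimum inter-cluster distance divided by the maximum diameter."""
    separation = min(
        dist(x, y)
        for ci, cj in combinations(clusters, 2)
        for x in ci
        for y in cj
    )
    # Assumes at least one cluster contains two or more distinct points.
    spread = max(dist(x, y) for c in clusters for x, y in combinations(c, 2))
    return separation / spread


# Hypothetical data: three well-separated pairs of points on a line.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(10.0, 0.0), (11.0, 0.0)]
c = [(20.0, 0.0), (21.0, 0.0)]

# Hypothetical candidate partitions, keyed by the number of clusters.
candidates = {
    2: [a + b, c],                # two groups merged: large diameter
    3: [a, b, c],                 # matches the structure of the data
    4: [a, b, [c[0]], [c[1]]],    # one group split: small separation
}

scores = {k: dunn_index(clusters) for k, clusters in candidates.items()}
best_k = max(scores, key=scores.get)

for k, score in sorted(scores.items()):
    print(k, round(score, 3))
print("best k:", best_k)
```

Merging two groups inflates the maximum diameter, and splitting a group shrinks the minimum separation, so both alternatives score lower than the three-cluster partition.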

Advantages and considerations

The Dunn index offers several advantages:

It has a clear geometric interpretation

It doesn't make assumptions about cluster shape or size

It can be applied to clusters of different shapes, because it relies only on pairwise distances between points

The calculation is relatively straightforward

However, there are some limitations:

It can be computationally expensive for large datasets, because it requires the distances between all pairs of points, and the number of pairs grows quadratically with the number of points

It's sensitive to outliers, as a single outlier can dramatically affect the cluster diameter

It may not perform well when clusters have very different densities

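The index is determined entirely by two worst-case distances (the minimum between-cluster distance and the maximum within-cluster diameter), so it does not reflect how well the remaining clusters are formed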
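Two of these limitations can be illustrated with a small Python sketch. It assumes Euclidean distance, and the data and names are hypothetical. The first part adds a single stray point to one cluster and recomputes the index; the second part counts the point pairs a naive calculation has to examine.

```python
from itertools import combinations
from math import comb, dist


def dunn_index(clusters):
    """Minimum inter-cluster distance divided by the maximum diameter."""
    separation = min(
        dist(x, y)
        for ci, cj in combinations(clusters, 2)
        for x in ci
        for y in cj
    )
    # Assumes at least one cluster contains two or more distinct points.
    spread = max(dist(x, y) for c in clusters for x, y in combinations(c, 2))
    return separation / spread


# Hypothetical data: two compact, well-separated clusters.
compact = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(10.0, 0.0), (11.0, 0.0), (10.0, 1.0)],
]

# The same clustering with one outlier assigned to the first cluster.
with_outlier = [compact[0] + [(0.0, 6.0)], compact[1]]

before = dunn_index(compact)
after = dunn_index(with_outlier)
print(round(before, 3), round(after, 3))

# Every pair of points is either within one cluster or between two clusters,
# so a naive calculation examines all n * (n - 1) / 2 pairs.
n = 10_000
print(comb(n, 2))
```

The outlier leaves the minimum separation unchanged at 9 but stretches the largest diameter from √2 to √37, so the index falls from about 6.36 to about 1.48 even though all other points are unchanged.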

The Dunn index is particularly useful when you want to ensure that clusters are both internally cohesive and externally well-separated, making it valuable for applications where clear cluster boundaries are important.

The Dunn index is one of 17 methods used in Prism's consensus approach for determining optimal cluster numbers, as described on the cluster metrics page.

© 1995-2019 GraphPad Software, LLC. All rights reserved.