FB*-Index: A Compact Index for Efficient and Exact Density-based Clustering

Zhao, Bide

Permalink

Access status:

USyd Access

Type

Thesis

Thesis type

Masters by Research

Author/s

Zhao, Bide

Abstract

Density-based clustering is a fundamental technique for discovering arbitrarily shaped clusters and handling noise, without requiring the number of clusters to be specified in advance. However, existing methods often struggle with efficiency and accuracy across varying query ...
See moreDensity-based clustering is a fundamental technique for discovering arbitrarily shaped clusters and handling noise, without requiring the number of clusters to be specified in advance. However, existing methods often struggle with efficiency and accuracy across varying query parameters $\varepsilon$ and $\mu$. In this thesis, we propose a novel index-based algorithm for efficient and exact cluster extraction. We introduce FB, the first linear-size index that supports exact clustering with running time linear in the output size for any query $\varepsilon$, along with a compact variant, FB$^*$, for efficiently extracting density-based clusters. Due to the compactness of the index and the efficiency of the query algorithm, our method is well-suited for disk-based storage, enabling multiple versions of the index to support arbitrary $(\varepsilon,\mu)$ queries. We provide formal analyses of time and space complexity. Extensive experiments on 19 real-world datasets demonstrate that our method significantly outperforms existing approaches while guaranteeing exact clustering results.
See less

Date

2025

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering

Awarding institution

The University of Sydney

Subjects

FB*-Index
Compact Index
Density-based Clustering