Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data - Nature Machine Intelligence



As a pivotal branch of machine learning, manifold learning uncovers the intrinsic low-dimensional structure within complex non-linear manifolds in high-dimensional space for visualization, classification, clustering and gaining key insights. Although existing techniques have achieved remarkable successes, they suffer from extensive distortions of cluster structure, which hinders the understanding of underlying patterns. Scalability issues also limit their applicability for handling large-scale data. Here we propose a sampling-based scalable manifold learning technique that enables uniform and discriminative embedding (SUDE) for large-scale and high-dimensional data. It starts by seeking a set of landmarks to construct the low-dimensional skeleton of the entire data and then incorporates the non-landmarks into the learned space by constrained locally linear embedding. We empirically validated the effectiveness of SUDE on synthetic datasets and real-world benchmarks and applied it to analyse single-cell data and detect anomalies in electrocardiogram signals. SUDE exhibits a distinct advantage in scalability with respect to data size and embedding dimension and shows promising performance in cluster separation, integrity and global structure preservation. The experiments also demonstrate notable robustness in embedding quality as the sampling rate decreases.
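The second stage described above, placing non-landmarks into a space already learned from landmarks, can be illustrated with the standard locally linear embedding (LLE) reconstruction step. The sketch below is not the SUDE implementation: the abstract does not specify how landmarks are sampled or what constraint is imposed on the LLE weights, so the sum-to-one constraint, the regularization term `reg`, the neighbor count `k`, and the function names `lle_weights` and `embed_non_landmarks` are all assumptions made for illustration. The landmark embedding `Y_landmarks` is taken as given.

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """Weights (summing to 1) that best reconstruct x from its neighbors.

    Standard LLE weight solve; assumed here, not taken from the SUDE paper.
    """
    diffs = neighbors - x                       # (k, d) offsets to neighbors
    gram = diffs @ diffs.T                      # (k, k) local Gram matrix
    # Regularize so the system stays solvable when neighbors are collinear.
    trace = np.trace(gram)
    gram = gram + reg * (trace if trace > 0 else 1.0) * np.eye(len(neighbors))
    w = np.linalg.solve(gram, np.ones(len(neighbors)))
    return w / w.sum()                          # enforce the sum-to-one constraint

def embed_non_landmarks(X, landmark_idx, Y_landmarks, k=5):
    """Place every non-landmark using weights over its k nearest landmarks."""
    X = np.asarray(X, dtype=float)
    landmark_idx = np.asarray(landmark_idx)
    Y_landmarks = np.asarray(Y_landmarks, dtype=float)
    L = X[landmark_idx]                         # landmarks in the input space
    Y = np.zeros((len(X), Y_landmarks.shape[1]))
    Y[landmark_idx] = Y_landmarks               # landmarks keep their coordinates
    rest = np.setdiff1d(np.arange(len(X)), landmark_idx)
    for i in rest:
        dist = np.linalg.norm(L - X[i], axis=1)
        nn = np.argsort(dist)[:k]               # k nearest landmarks
        w = lle_weights(X[i], L[nn])
        Y[i] = w @ Y_landmarks[nn]              # same weights, embedded space
    return Y

# Toy data: 13 points on a straight line in 3-D, every 4th one a landmark.
t = np.arange(13.0)
direction = np.array([1.0, 2.0, 2.0]) / 3.0     # unit vector
X = t[:, None] * direction
landmark_idx = np.arange(0, 13, 4)
Y_landmarks = t[landmark_idx][:, None]          # assumed 1-D landmark embedding
Y = embed_non_landmarks(X, landmark_idx, Y_landmarks, k=2)
```

On this toy line, each non-landmark lands close to its true position along the line, because the weights that reconstruct it from its two nearest landmarks in 3-D are reused unchanged in the 1-D embedding. The cost of this step grows with the number of non-landmarks times the number of landmarks, which is one reason a sampling-based scheme can scale better than embedding all points jointly.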
