A widespread answer to the “curse of dimensionality” in machine learning is that the relevant features in data often lie in a low-dimensional subspace of a higher-dimensional space. Subspace clustering is precisely the unsupervised task of finding such low-dimensional clusters in data. In our recent work “Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap”, we characterise the statistical-to-computational trade-offs of subspace clustering in a simple Gaussian mixture model where the relevant features (the means) are sparse vectors. Check it out!
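As a toy illustration of the setting (not the model or estimator analysed in the paper), here is a minimal sketch: data drawn from a two-cluster Gaussian mixture whose means differ only on a small set of coordinates, clustered by a simple diagonal-thresholding spectral heuristic. All parameter values (`d`, `n`, `k`, the signal strength) are made up for illustration, and the sparsity `k` is assumed known.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, k = 200, 1000, 10          # ambient dimension, samples, sparsity (toy values)
mu = np.zeros(d)
mu[:k] = 3.0                     # sparse mean: only k of the d coordinates are informative

labels = rng.integers(0, 2, n)   # hidden cluster assignments
signs = 2 * labels - 1           # map {0, 1} -> {-1, +1}
X = signs[:, None] * mu[None, :] + rng.standard_normal((n, d))

# Diagonal thresholding: keep the coordinates with the largest empirical variance
# (informative coordinates have variance 1 + mu_j^2 instead of 1), then cluster
# by the sign of the top principal component of the data restricted to them.
var = X.var(axis=0)
support = np.argsort(var)[-k:]   # estimated sparse support

cov = X[:, support].T @ X[:, support] / n
eigvals, eigvecs = np.linalg.eigh(cov)
v = eigvecs[:, -1]               # leading eigenvector of the restricted covariance

pred = (X[:, support] @ v > 0).astype(int)
# clusters are only recovered up to a global sign flip
accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
```

At this (easy) signal-to-noise level the heuristic recovers both the support and the clusters; the interesting regimes studied in the paper are those where such simple procedures start to fail.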