Unsupervised Learning

Model estimation:

U1. Topic Models: find better topic models — rigorous, practical relevant and efficient.

U2. Analyze EM for learning mixture models.

U3. The complexity of a learning a spherical Gaussian mixture with separation O(\log k).

U4. Learn general Gaussian mixtures in time f(k)poly(n) or prove a lower bound.

U5. Learn a mixture of spherical Gaussians in time poly(n,k), for any k.

Model fitting/agnostic unsupervised learning:

U6. ICA with arbitrary noise.

U7. k-means with noisy assumptions.

U8. Learn a mixture of arbitrary Gaussians with linearly independent means.


U9. Clustering from assumptions on the pairwise similarities.

U10. Give a probabilistic model for which k-means converges to the “right” clustering, possibly with some pre-processing.


P1. PAC-Learn an intersection of 2 halfspaces

P2. Learn DNF over nonproduct distributions with membership queries

P3. Learn DNF over the uniform distribution (on the Boolean hypercube) without membership queries.

P4. (Valiant) Learn decision lists of k out of n variables in polynomial-time, using poly(k, \log n) examples.

P5. Agnostically learn intersections of halfspaces over logconcave distributions.

P6. (A. Blum) Learn a parity of k out of n variables.

Statistical Algorithms

S1. Prove a tight lower bound for planted k-SAT.

S2. Prove a lower bound for planted clique (not bipartite clique).

S3. (Montanari) Lower bound for finding a planted vector: T = v \otimes v \otimes v + \beta N(0,1)^{n \times n \times n}

S4. Prove an exponential lower bound for learning a decision list of k variables using only O(k \log n) examples.

Online Learning

O1. What learning algorithms are efficient when applied to a sequence of learning tasks? (multitask/transfer/lifelong learning)

O2. For what nonconvex optimization problems can we achieve near-optimal regret?

O3. What is the communication/space complexity of distributed/stream learning a halfspace?

O4. When does the multiplicative update method work (fail)?

O5. 2-player approximate-Nash with polynomial convergence.

Exploratory topics

E1. Computational theories of cortex.

E2. Non i.i.d. models of example distributions motivated by real-life learning.

E3. Give a class of distributions for which a deep neural net is provably effective with an algorithm that is close to one used in practice.

E4. Give a nonconvex optimization problem for which gradient descent fails but stochastic gradient descent succeeds.



  1. James Bailey, Jeff Pavelka, and Amy Musselman would like to do something in online learning. We’re still deciding exactly what.

      1. Our relevant paper is “New algorithms for approximate Nash equilibria in bimatrix games” by Hartwig Bosse, Jaroslaw Byrka, and Evangelos Markakis.

  2. I would like to work on O4. When does the multiplicative update method work (fail)?
    I am open to work with other interested students on this topic.

    I will start with reading: S. Arora, E. Hazan, and S. Kale.
    The Multiplicative Weights Update Method: a Meta-Algorithm and Applications.
    I am also interested to read the survey on Online Learning (by N. Cesa-Bianchi)

    1. I would like to work with you on O4. I will start reading the paper mentioned and we can meet after the next class.

  3. Group members: Bo Xie, Hanjun Dai
    Topic: E4. Give a nonconvex optimization problem for which gradient descent fails but stochastic gradient descent succeeds.

  4. Nihal Balani and Ajitesh Jain are interested in:

    U7. k-means with noisy assumptions.

    We are planning to read – ‘The algorithm of noisy k-means’ by Brunet, Loustau

    We are alternatively considering O4. When does the multiplicative update method work (fail)?

  5. Hi everyone, this is Wenchen Li, I’d like to focus on the E1: Computational theories of cortex. Anyone interested in this topic? I’m still looking for teammates.

    Although deep neural network works so well in practise, “explicit
    means of allocating neurons to new concepts needs to be specified, multiple kinds of tasks need to be explained within a consistent framework, and executions of long sequences of such tasks need to be supported”(excerpt from
    Better understanding this can help us delve deeper in the artificial neural network, fundamentally understand how they work.

    The below document might give you a sense of Computational theories of cortex.

    1. I am John Stewart, and I would like to join this group. I am interested in learning about adaptable structures for artificial neural networks, particularly random graphs with local rules.

Comments are closed.