Bonus HW

Due in class, Tue, Dec 6th.

  1. Consider the Weighted majority algorithm with both reward and penalty. On a round where the algorithm makes a mistake, experts that predicted correctly have their weights multiplied by a fixed factor R (R > 1), while those that predicted incorrectly have their weights multiplied by a factor P (P < 1). What guarantee does this give on the number of mistakes?
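To make the setup in problem 1 concrete, here is a minimal sketch of the reward-and-penalty variant (the function name, the binary-label setting, and the tie-breaking rule are illustrative choices, not part of the problem statement):

```python
# Sketch of Weighted majority with both reward (R > 1) and penalty (P < 1).
# Weights are updated only on rounds where the algorithm itself errs,
# as described in the problem.
def weighted_majority(expert_preds, labels, R=1.5, P=0.5):
    n = len(expert_preds[0])   # number of experts
    w = [1.0] * n              # start with equal weights
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        # weighted vote over labels {0, 1}; ties broken toward 1
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(w) - vote1
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
            # reward correct experts, penalize incorrect ones
            w = [wi * (R if p == y else P) for wi, p in zip(w, preds)]
    return mistakes
```

The quantity to track when deriving the mistake bound is the total weight: on each mistake at least half of it is scaled by P, while at most half is scaled by R.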
  2. A kernel is a pairwise similarity function K(x,y) for which there is a mapping \phi that represents K as an inner product in the mapped space, K(x,y)=\phi(x)\cdot\phi(y). Show that if K_1, K_2 are kernels, then so are K_1+K_2 and K_1K_2. Show that K(x,y)=(1+x\cdot y)^d and K(x,y)=e^{-||x-y||^2} are kernels. Give an example of a positive similarity function that is not a kernel.
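A useful numerical sanity check for problem 2 (not a proof): a kernel's Gram matrix on any finite sample must be symmetric positive semidefinite, so its smallest eigenvalue should be nonnegative up to floating-point error. The sample data and the choice d = 3 below are arbitrary:

```python
import numpy as np

def min_gram_eig(K, X):
    """Smallest eigenvalue of the Gram matrix of K on the sample X."""
    G = np.array([[K(x, y) for y in X] for x in X])
    return float(np.linalg.eigvalsh(G).min())

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))            # 8 random points in R^3

poly  = lambda x, y: (1 + x @ y) ** 3               # (1 + x.y)^d, d = 3
rbf   = lambda x, y: np.exp(-np.linalg.norm(x - y) ** 2)
ksum  = lambda x, y: poly(x, y) + rbf(x, y)         # K_1 + K_2
kprod = lambda x, y: poly(x, y) * rbf(x, y)         # K_1 * K_2

for K in (poly, rbf, ksum, kprod):
    print(min_gram_eig(K, X))
```

A function that fails this test on even one sample is not a kernel, which is one way to hunt for the counterexample the problem asks for.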
  3. Consider the EM algorithm for a mixture of two Gaussians: start with random centers, identity covariances, and 0.5 mixing weights; repeat: assign each point to the Gaussian that gives the point higher likelihood; compute the mean and covariance of each cluster. Give an example where this algorithm fails with high probability.
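One plausible reading of the procedure in problem 3, sketched in code so the failure mode is easier to experiment with (the function name, the equal mixing weights kept fixed throughout, and the small covariance regularizer are assumptions of this sketch, not specified by the problem):

```python
import numpy as np

def hard_em(X, iters=20, seed=0):
    """Hard-assignment EM for two Gaussians, as described in the problem:
    each point is assigned wholly to the more likely component, then each
    component's mean and covariance are re-fit to its cluster."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    mu = X[rng.choice(len(X), size=2, replace=False)]   # random centers
    cov = [np.eye(d), np.eye(d)]                        # identity covariances
    z = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # log-density of each point under each Gaussian (equal weights,
        # so the constant term cancels in the comparison)
        ll = np.stack([
            -0.5 * np.sum((X - mu[k]) @ np.linalg.inv(cov[k]) * (X - mu[k]),
                          axis=1)
            - 0.5 * np.log(np.linalg.det(cov[k]))
            for k in range(2)
        ], axis=1)
        z = ll.argmax(axis=1)                           # hard assignment
        for k in range(2):
            pts = X[z == k]
            if len(pts) > 1:
                mu[k] = pts.mean(axis=0)
                # small regularizer keeps the covariance invertible
                cov[k] = np.cov(pts.T) + 1e-6 * np.eye(d)
    return mu, z
```

Running this on adversarial inputs (for instance, clusters of very different sizes or spreads) is a good way to build intuition for the failure example the problem asks for.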