# Bonus HW

Due in class, Tue, Dec 6th.

1. Consider the Weighted Majority algorithm with both reward and penalty: when the algorithm makes a mistake, experts that got it right have their weight multiplied by a fixed factor $R$ ($R > 1$), while those that got it wrong have their weight multiplied by a fixed factor $P$ ($P < 1$). What guarantee does this give on the number of mistakes?
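   To make the update rule above concrete, here is a minimal sketch of the reward-and-penalty variant. The factors R and P and the update-on-mistake rule come from the problem statement; the representation of experts as per-round {0,1} prediction lists, and the defaults R = 2, P = 1/2, are illustrative assumptions.

   ```python
   def weighted_majority(expert_preds, labels, R=2.0, P=0.5):
       """Run the reward-and-penalty Weighted Majority sketch.

       expert_preds[t][i] is expert i's {0,1} prediction at round t.
       Returns (number of algorithm mistakes, final weights).
       """
       n = len(expert_preds[0])
       w = [1.0] * n  # all experts start with weight 1
       mistakes = 0
       for preds, y in zip(expert_preds, labels):
           # predict by weighted vote
           vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
           vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
           guess = 1 if vote1 >= vote0 else 0
           if guess != y:
               mistakes += 1
               # on a mistake: reward correct experts, penalize wrong ones
               w = [wi * (R if p == y else P) for wi, p in zip(w, preds)]
       return mistakes, w
   ```

   Tracking how the total weight changes on each mistake (it can now grow as well as shrink, unlike in the penalty-only version) is the natural route to the mistake bound asked for.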
2. A kernel is a pairwise similarity function $K(x,y)$ for which there is a mapping $\phi$ that represents $K$ as an inner product in the mapped space: $K(x,y)=\phi(x)\cdot\phi(y)$. Show that if $K_1, K_2$ are kernels, then so are $K_1+K_2$ and $K_1K_2$. Show that $K(x,y)=(1+x\cdot y)^d$ and $K(x,y)=e^{-\|x-y\|^2}$ are kernels. Give an example of a positive similarity function that is not a kernel.
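   A useful sanity check (not a proof) while working on this problem: every Gram matrix of a kernel must be positive semidefinite, so the closure claims can be spot-checked numerically on random data. The random points and the choice $d=3$ below are illustrative assumptions.

   ```python
   import numpy as np

   rng = np.random.default_rng(0)
   X = rng.standard_normal((10, 3))  # 10 random points in R^3

   def gram(k):
       # Gram matrix of similarity function k on the sample X
       return np.array([[k(x, y) for y in X] for x in X])

   def is_psd(G, tol=1e-8):
       # symmetric matrix is PSD iff its smallest eigenvalue is >= 0
       return np.min(np.linalg.eigvalsh(G)) > -tol

   k1 = lambda x, y: (1 + x @ y) ** 3                # polynomial kernel, d = 3
   k2 = lambda x, y: np.exp(-np.sum((x - y) ** 2))   # Gaussian kernel

   # the two given kernels, their sum, and their product all pass the check
   for k in (k1, k2,
             lambda x, y: k1(x, y) + k2(x, y),
             lambda x, y: k1(x, y) * k2(x, y)):
       assert is_psd(gram(k))
   ```

   A candidate non-kernel similarity function should fail this check on some sample, which is one way to hunt for the counterexample the last part asks for.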
3. Consider the following hard-assignment variant of the EM algorithm for a mixture of two Gaussians: start with random centers, identity covariances, and 0.5 mixing weights; then repeat: assign each point to the Gaussian that gives it higher likelihood, and recompute the mean and covariance of each cluster. Give an example where, with high probability, this algorithm fails.
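   The procedure described above can be sketched in one dimension for readability. The well-separated data and the particular initialization below are illustrative assumptions (on this easy input the procedure succeeds; the problem asks for an input where it does not). With equal 0.5 mixing weights and hard assignments, the weights cancel out of the comparison, so they are omitted.

   ```python
   import numpy as np

   def hard_em(data, mu, iters=20):
       """Hard-assignment EM for two 1-D Gaussians.

       mu: initial centers (length 2); variances start at 1 ("identity").
       """
       mu = np.array(mu, dtype=float)
       var = np.ones(2)
       for _ in range(iters):
           # E-step (hard): assign each point to its higher-likelihood Gaussian
           ll = -(data[:, None] - mu) ** 2 / (2 * var) - 0.5 * np.log(var)
           z = np.argmax(ll, axis=1)
           # M-step: recompute mean and variance of each cluster
           for k in range(2):
               pts = data[z == k]
               if len(pts) == 0:
                   continue  # a cluster can lose all its points and die
               mu[k] = pts.mean()
               var[k] = max(pts.var(), 1e-6)
       return mu, var

   rng = np.random.default_rng(1)
   data = np.concatenate([rng.normal(-5, 1, 100), rng.normal(5, 1, 100)])
   mu, var = hard_em(data, mu=[-1.0, 1.0])
   ```

   The empty-cluster branch and the variance floor hint at two of the degeneracies worth probing when constructing the failing example.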