Week 5

I met with Professor Verma on June 8 to review my findings from the various readings on Bayesian Nonparametrics, and to discuss how they might help me define a loss function that accounts for cluster structure. I learned that a Bayesian nonparametric model operates in an infinite-dimensional parameter space, where the number of parameters is allowed to grow with the data. To establish a prior, a BN model can use a Dirichlet Process as a prior over discrete distributions, and a Dirichlet Process Mixture Model for continuous distributions. These priors are often described through constructive analogies such as the Stick-Breaking Process or the Chinese Restaurant Process; the latter is a single-parameter distribution over partitions of the integers.
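To make those two analogies concrete for myself, I wrote a minimal sketch of both constructions (the function names and NumPy setup are my own; `alpha` is the single concentration parameter in each case):

```python
import numpy as np

def chinese_restaurant_process(n_customers, alpha, rng=None):
    """Sample a partition of the integers 0..n_customers-1 from a CRP
    with concentration parameter alpha."""
    rng = np.random.default_rng() if rng is None else rng
    tables = [0]  # the first customer opens table 0
    for i in range(1, n_customers):
        counts = np.bincount(tables)
        # Join existing table t with probability counts[t] / (i + alpha),
        # or open a new table with probability alpha / (i + alpha).
        probs = np.append(counts, alpha) / (i + alpha)
        tables.append(rng.choice(len(probs), p=probs))
    return tables

def stick_breaking_weights(alpha, n_atoms, rng=None):
    """Truncated stick-breaking construction of DP mixture weights:
    w_k = beta_k * prod_{j<k} (1 - beta_j), with beta_k ~ Beta(1, alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

print(chinese_restaurant_process(20, alpha=1.0))  # e.g. [0, 0, 1, 0, 2, ...]
print(stick_breaking_weights(1.0, 10))            # weights of the first 10 atoms
```

Smaller `alpha` values concentrate customers at fewer tables (fewer clusters), which is the behavior I would want the prior to control.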

Assuming a fixed number of clusters, we can use a BN model to compute the probability that the transformed data falls into that many clusters, and derive a loss value from it. That idea informed my initial draft of the metric learning loss function, which still needs refinement: in particular, the loss must respond to changes in the transformation so that it can be improved via gradient descent or a similar optimizer, as sketched below.
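To convince myself the draft can be optimized at all, I sketched the simplest differentiable stand-in I could think of: a K-component, equal-weight, spherical Gaussian mixture likelihood on linearly transformed data. This is my own simplification for illustration, not the BN model itself; `gmm_nll`, the linear map `L`, and all the shapes here are hypothetical.

```python
import math
import torch

def gmm_nll(z, means, log_var):
    """Mean negative log-likelihood of points z under a K-component,
    equal-weight, spherical Gaussian mixture -- a differentiable
    stand-in for the cluster-structure probability, not the final loss."""
    d = z.shape[1]
    sq = ((z[:, None, :] - means[None, :, :]) ** 2).sum(-1)  # (n, K) squared distances
    log_comp = -0.5 * (sq / log_var.exp() + d * log_var + d * math.log(2 * math.pi))
    return -torch.logsumexp(log_comp - math.log(means.shape[0]), dim=1).mean()

# Hypothetical setup: a learnable linear transformation L of raw data X,
# with the number of clusters K fixed in advance.
n, d, K = 64, 5, 3
X = torch.randn(n, d)
L = torch.randn(d, d, requires_grad=True)
means = torch.randn(K, d, requires_grad=True)
log_var = torch.zeros((), requires_grad=True)

loss = gmm_nll(X @ L, means, log_var)
loss.backward()  # the gradient flows back into the transformation L
print(loss.item(), L.grad.norm().item())
```

The part that still needs work is replacing this fixed-K mixture with the DPMM-based probability while keeping the loss differentiable with respect to the transformation.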

Written on June 8, 2023