We consider here the question of recovering the posterior
distribution of a random process conditioned on observing the
empirical frequencies of the outcomes. We find that, under a rather
broad assumption on the dependence structure of the process, such as
``independence'' or ``Markovian dependence'', the posterior marginal
distribution of the process at a given time index can be identified as
an empirical distribution calculated from the observed empirical
frequencies of the process's outcomes. We show through two examples,
an i.i.d. sequence with discrete values and a finite Markov chain,
that a certain ``conditional symmetry'' given by the observation of
the empirical frequencies leads to the desired posterior distribution
result. Our results concern finite-time observations, and we further
investigate their infinite-time limit in connection with the idea of
Gibbs conditioning. Finally, since our results demonstrate the
importance of empirical frequencies in understanding the information
behind data, we use the Large Deviations Principle (LDP) to construct
a general notion of ``data-driven entropy'', which allows us to apply
the formalism of thermodynamics to data science. The talk is based on
joint work with Professor Hong Qian.
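
The i.i.d. case of the statement above can be checked numerically on a small example. The following is a minimal sketch, not the construction used in the talk: the function name `posterior_marginal`, the three-symbol alphabet, and the chosen probabilities are illustrative assumptions. It enumerates every sequence of length n, keeps those matching the observed counts, and computes the posterior marginal at one time index exactly. By the standard exchangeability argument for i.i.d. sequences, the result should equal the empirical distribution (counts divided by n), whatever the underlying probabilities are.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def posterior_marginal(p, counts, k):
    """Return P(X_k = a | observed counts) for an i.i.d. sequence.

    p      : dict mapping each symbol to its probability (hypothetical prior law)
    counts : dict mapping each symbol to its observed number of occurrences
    k      : zero-based time index at which the marginal is evaluated

    Brute-force enumeration, so only suitable for very small examples.
    """
    symbols = sorted(p)
    n = sum(counts.values())
    target = {a: counts.get(a, 0) for a in symbols}

    total = Fraction(0)                      # probability of the conditioning event
    mass = {a: Fraction(0) for a in symbols}  # joint mass of {X_k = a} and the event

    for seq in product(symbols, repeat=n):
        seen = Counter(seq)
        if any(seen.get(a, 0) != target[a] for a in symbols):
            continue  # sequence does not match the observed frequencies
        weight = Fraction(1)
        for x in seq:
            weight *= p[x]
        total += weight
        mass[seq[k]] += weight

    return {a: mass[a] / total for a in symbols}


# Illustrative (assumed) law and observed counts; note that p differs
# deliberately from the empirical frequencies.
p = {"a": Fraction(1, 5), "b": Fraction(3, 10), "c": Fraction(1, 2)}
counts = {"a": 3, "b": 2, "c": 1}

post = posterior_marginal(p, counts, k=2)
empirical = {a: Fraction(c, sum(counts.values())) for a, c in counts.items()}
print(post == empirical)
```

The comparison holds for any time index `k` and any strictly positive `p`, since every sequence with the same counts carries the same weight. The Markov chain case involves a different symmetry and is not covered by this sketch.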
On the Posterior Distribution of a Random Process Conditioned on Observing the Empirical Frequencies: the i.i.d. and finite Markov chain case
Wenqing Hu, Missouri University of Science and Technology
Online