The term ‘probability distribution’ is actually pretty vague. A probability distribution is an abstract object that can be distinguished uniquely (i.e. from other probability distributions) by way of any of a bunch of concrete representations. Probability density or mass functions, moment generating functions, characteristic functions, cumulative distribution functions, random variables, and measures all reify a probability distribution as something tangible that can be worked with. Image measures in particular are sometimes called ‘distributions’, though they still just form a single possible reification of the underlying concept. In formal probability theory the term ‘law’ is often used to refer to the abstract object being characterized.
Different characterizations are useful when doing different kinds of work. Measures are useful when doing proofs. Probability densities and mass functions are useful for a whole whack of applications. Moment generating functions are useful for exercises in introductory mathematical statistics courses.
Sampling functions - procedures that produce samples from some target distribution - also characterize probability distributions uniquely. They are an excellent basis for introducing and transforming uncertainty in probabilistic programming languages. A sampling function takes as its sole input a stream of randomness, which is consumed and transformed to produce a sample from the distribution that it characterizes. The stream can be a lazy list or, similarly, a pseudo-random number generator.
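To make that concrete, here is a minimal sketch of a sampling function that consumes a lazy list of uniform variates. Everything in it is an assumption made for illustration: the `Randomness` synonym, the small linear congruential generator behind `randomness`, and the choice of the standard exponential distribution as the target.

```haskell
-- Hypothetical names throughout; this is a sketch, not a library API.

-- A stream of randomness: a lazy, infinite list of uniform variates in [0, 1).
type Randomness = [Double]

-- A small linear congruential generator, used only to build the stream.
-- It is meant for illustration and is not a high-quality generator.
randomness :: Int -> Randomness
randomness seed = map toUnit (tail (iterate step (fromIntegral seed `mod` modulus)))
  where
    modulus  = 2 ^ (32 :: Int) :: Integer
    step x   = (1664525 * x + 1013904223) `mod` modulus
    toUnit x = fromIntegral x / fromIntegral modulus

-- A sampling function for the standard exponential distribution: it consumes
-- one uniform variate from the stream and applies the inverse CDF.
standardExponential :: Randomness -> Double
standardExponential (u:_) = negate (log (1 - u))
standardExponential []    = error "standardExponential: empty stream"
```

Calling `standardExponential (randomness 42)` yields one draw; feeding in a different part of the stream yields another.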
Sampling functions are precursors to generative models: models that take both randomness and causal inputs as arguments and produce a possible effect. Generative models are the main currency of probabilistic programming; they specify a mapping between hypothesized causes and a probability distribution over effects. Sampling functions are the basis for handling all uncertainty in several probabilistic programming implementations.
It’s useful to look at the types of sampling functions and generative models to emphasize the distinction between the two. In Haskelly pseudocode, they have the following types:
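As a sketch of those types, following the description above: a sampling function takes only randomness, while a generative model takes causes as well. The synonym names and the choice of a lazy list for `Randomness` are assumptions made for illustration.

```haskell
-- Placeholder for the stream of randomness (an assumption for this sketch).
type Randomness = [Double]

-- A sampling function consumes randomness and yields a sample of type a.
type SamplingFunction a = Randomness -> a

-- A generative model also takes causes of type a and yields an effect of type b.
type GenerativeModel a b = a -> Randomness -> b
```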
It’s easy to see that a generative model, when provided with some causes, is itself a sampling function.
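Partial application shows this directly. In the sketch below, `biasedCoin` and `fairCoin` are hypothetical examples, and the type synonyms are the assumed ones from the description above, with a lazy list standing in for the stream of randomness.

```haskell
type Randomness = [Double]
type SamplingFunction a  = Randomness -> a
type GenerativeModel a b = a -> Randomness -> b

-- Hypothetical model: the cause is a success probability p and the effect
-- is the outcome of a single coin flip.
biasedCoin :: GenerativeModel Double Bool
biasedCoin p (u:_) = u < p
biasedCoin _ []    = error "biasedCoin: empty stream"

-- Fixing the cause leaves a function of randomness alone, which is
-- exactly a sampling function.
fairCoin :: SamplingFunction Bool
fairCoin = biasedCoin 0.5
```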
We can use a monad to abstract out the provisioning of randomness and make everything a little cleaner. Imagine ‘Observable’ to be a monad that handles the propagation of randomness in our functions; any type tagged with ‘Observable’ is a probability distribution over some other type (and maybe we would run it with a function called ‘sample’). Using that, we can write the above as follows:
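Here is one way the monadic version might look. This is a sketch under stated assumptions: `Observable` is implemented as a simple state-passing wrapper around a lazy list of uniforms, the signature of `sample` is a guess at how such a runner could work, and `uniform`, `standardExponential`, and `exponential` are hypothetical examples.

```haskell
-- Assumed representation: a lazy stream of uniform variates in [0, 1).
type Randomness = [Double]

-- One possible 'Observable': a computation that threads the stream through,
-- returning a value together with the unconsumed randomness.
newtype Observable a = Observable { runObservable :: Randomness -> (a, Randomness) }

instance Functor Observable where
  fmap f (Observable g) = Observable $ \r -> let (a, r') = g r in (f a, r')

instance Applicative Observable where
  pure a = Observable $ \r -> (a, r)
  Observable f <*> Observable g = Observable $ \r ->
    let (h, r')  = f r
        (a, r'') = g r'
    in  (h a, r'')

instance Monad Observable where
  Observable g >>= k = Observable $ \r ->
    let (a, r') = g r in runObservable (k a) r'

-- Draw one uniform variate from the stream.
uniform :: Observable Double
uniform = Observable $ \r -> case r of
  (u:us) -> (u, us)
  []     -> error "uniform: empty stream"

-- Run a distribution against a stream and return a single sample.
sample :: Observable a -> Randomness -> a
sample m = fst . runObservable m

-- The two types, with randomness now hidden inside the monad.
type SamplingFunction a  = Observable a
type GenerativeModel a b = a -> Observable b

-- A sampling function: no causes, just a distribution.
standardExponential :: SamplingFunction Double
standardExponential = fmap (\u -> negate (log (1 - u))) uniform

-- A generative model: the cause is a rate, the effect is a scaled draw.
exponential :: GenerativeModel Double Double
exponential rate = fmap (/ rate) standardExponential
```

The randomness argument no longer appears in either type; it only shows up when a distribution is actually run with `sample`.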
Very clean. Here it’s immediately clear that the only difference between a sampling function and a model is the introduction of causes to the latter. A generative model contains some internal logic that manipulates external causes in some way; a sampling function does not.