## Probability of Data

### Probability of Data

In Lecture 30, L2 Regularization - Theory, you mention the Probability of Data from 2:09 to 2:37. I believe the term is used for two different but proportional quantities:

P(w | Y,X) - stated as P of w given data. In this case, data is denoted by Y,X which I interpret as the intersection of events.

P(Y | X) - stated as P of data, which I interpret as the conditional statement Y given X.

Thanks in advance for the clarification.

### Re: Probability of Data

> In this case, data is denoted by Y,X which I interpret as the intersection of events.

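The issue is that there is no such thing as p(w | Y | X). A probability has only one conditioning bar, so everything being conditioned on goes to the right of it, separated by commas, as in p(w | Y, X).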

A simpler way to think of it is that "X" is on the "given" side for all terms (and hence effectively ignored).
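
To make that concrete, here is Bayes' rule with X kept explicit in every term. The second equality is an assumption on my part (though the usual one): the prior over the weights does not depend on the inputs alone, so p(w | X) = p(w).

```latex
p(w \mid Y, X)
  = \frac{p(Y \mid X, w)\, p(w \mid X)}{p(Y \mid X)}
  = \frac{p(Y \mid X, w)\, p(w)}{p(Y \mid X)}
```

The denominator p(Y | X) does not depend on w, so as a function of w the posterior is proportional to likelihood times prior.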

So a lazy way to think of it is p(w | Y) = p(Y | w)p(w) / p(Y), which is just Bayes' rule.
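
If it helps, the same identity can be checked numerically on a toy problem. This is only a sketch: the joint table below is made up for illustration (w, X and Y are each binary), and the helper names `marginal`, `posterior_direct` and `posterior_bayes` are hypothetical, not from the lecture. It computes p(w | Y, X) once straight from the joint and once via Bayes' rule with X on the "given" side of every term.

```python
# Hypothetical joint p(w, x, y); keys are (w, x, y) and the values sum to 1.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.20, (0, 1, 1): 0.05,
    (1, 0, 0): 0.05, (1, 0, 1): 0.15,
    (1, 1, 0): 0.10, (1, 1, 1): 0.30,
}
NAMES = ("w", "x", "y")


def marginal(**fixed):
    """Sum the joint over every variable that is not pinned in `fixed`."""
    return sum(
        p for key, p in joint.items()
        if all(key[NAMES.index(name)] == value for name, value in fixed.items())
    )


def posterior_direct(w, x, y):
    """p(w | y, x) by definition: p(w, x, y) / p(x, y)."""
    return marginal(w=w, x=x, y=y) / marginal(x=x, y=y)


def posterior_bayes(w, x, y):
    """p(w | y, x) via Bayes' rule, with x conditioned on in every term."""
    likelihood = marginal(w=w, x=x, y=y) / marginal(w=w, x=x)  # p(y | w, x)
    prior = marginal(w=w, x=x) / marginal(x=x)                 # p(w | x)
    evidence = marginal(x=x, y=y) / marginal(x=x)              # p(y | x)
    return likelihood * prior / evidence


# Largest disagreement between the two routes over all (w, x, y).
max_gap = max(
    abs(posterior_direct(w, x, y) - posterior_bayes(w, x, y))
    for (w, x, y) in joint
)
print("max gap between the two computations:", max_gap)
```

The two routes agree up to floating-point error for every combination, which is the sense in which X can be carried along silently on the "given" side.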