This is the difference between probability (which assumes that the model parameters are known, and infers/computes the probability of an event) and statistics (which first learns the parameters from the data).
In your example below, if you assume that the model parameters are given (e.g., saying that the coin is a fair one), you can say that the next toss is going to come up heads with probability 0.5. If, however, you cannot assume the model, i.e., you have to learn/update it first, then the "likelihood" that 2 million heads out of 3 million tosses is caused by a biased coin is much higher than that it is caused by a fair coin.
So, you first learn/estimate the biased probabilities (the model parameters), and subsequently infer, using the new model, that the next toss will come up heads with higher probability.
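To make the comparison concrete, here is a minimal sketch, assuming the tosses are independent with a fixed heads probability p (a binomial model, which is my assumption for illustration). It compares the log-likelihood of 2 million heads in 3 million tosses under a fair coin against the maximum-likelihood estimate p = heads/tosses. The function name `log_likelihood` is just illustrative. Logs are used because the raw probabilities are far too small for floating point, and the binomial coefficient is left out since it is the same for both models and cancels in the comparison.

```python
import math

def log_likelihood(heads, tosses, p):
    """Log-likelihood (up to a constant shared by all p) of seeing
    `heads` heads in `tosses` independent tosses with P(heads) = p."""
    tails = tosses - heads
    return heads * math.log(p) + tails * math.log(1.0 - p)

heads, tosses = 2_000_000, 3_000_000

p_fair = 0.5             # model parameters assumed known ("probability" view)
p_mle = heads / tosses   # parameters learned from the data ("statistics" view)

ll_fair = log_likelihood(heads, tosses, p_fair)
ll_mle = log_likelihood(heads, tosses, p_mle)

# Positive value means the data favor the learned (biased) model.
log_ratio = ll_mle - ll_fair

print(f"estimated P(heads)      = {p_mle:.4f}")
print(f"log-likelihood (fair)   = {ll_fair:.1f}")
print(f"log-likelihood (biased) = {ll_mle:.1f}")
print(f"log-likelihood ratio    = {log_ratio:.1f}")
```

The log-likelihood ratio comes out on the order of 10^5 in favor of the biased model, so "the coin is fair and just having a bad day" is possible but astronomically unlikely. Under the learned model, the prediction for the next toss is heads with probability about 2/3.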
On Wed, Nov 9, 2011 at 12:59 AM, Shaunak Shah <firstname.lastname@example.org> wrote:
Hello Dr. Rao,
I have a question not from today's class but on the topic "need for smoothing" in classification.
Why do we consider that if a coin comes up Heads 2 million times out of 3 million tosses, then it should be a biased coin? Should we not think instead that the coin is not actually biased and is just having a bad day, because we know for sure that the probability of Heads/Tails is 0.5?
--Shaunak Shah
On Tue, Nov 8, 2011 at 7:10 PM, Subbarao Kambhampati <email@example.com> wrote:
folks
I couldn't call on several raised hands today. If you remember your questions, feel free to ask me by email and I will respond.
(and if I think the question is of general interest I will post my answer to the blog too)
rao