Statistics is an attempt to estimate the characteristics of a population from a sample.
An equation appropriate to the assumed distribution is built from parameters estimated from the sample, and the values that equation produces are also subject to random error.
The value that equation yields is a ''probability'': the chance that a given observation falls within the distribution. More precisely, it is a conditional probability, because the parameters are taken as given (the condition) and the probability of observing the data is computed under them.
<!--
The chance that a value <math>x_i</math> in the sample falls in a given range (e.g., <math>x_i > n</math>) can also be calculated from the dataset. This chance is the ''probability'':
:<math>P(x_i\text{ in a range} \mid \text{parameters})</math>
-->
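As a minimal sketch of this conditional probability, suppose the population is assumed to follow a normal distribution; the mean, standard deviation, and threshold below are illustrative assumptions, not values from the text.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and s.d. sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Fix the parameters (the condition / hypothesis): mu = 170, sigma = 10.
mu, sigma = 170.0, 10.0

# Probability that an observation falls in a range, given those fixed
# parameters: P(x_i > 180 | mu, sigma).
p = 1.0 - normal_cdf(180.0, mu, sigma)
print(round(p, 4))  # → 0.1587
```

The parameters stay fixed throughout; only the observed value is plugged in.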
===Likelihood===
Parameters derived directly from the sample, however, may not be good estimates of the parameters of the population. In that case, multiple different sets of parameters can be considered for the same sample: one set of parameters could have produced the observed data only with a very low chance, another set with a relatively high chance, and yet another with the highest chance of all. These chances are the ''likelihood'' of each parameter set. It is then natural to adopt the most likely set of parameters, i.e., the parameters with the maximum ''likelihood'', and to use them to build the relevant equation. That is the maximum likelihood estimation method.
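The idea can be sketched as a comparison of a few candidate parameter sets against one fixed sample; the normal model, the sample values, and the candidate sets below are all illustrative assumptions.

```python
import math

def log_likelihood(data, mu, sigma):
    """Log-likelihood of a normal(mu, sigma) parameter set; the data stay fixed."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -n * math.log(sigma * math.sqrt(2 * math.pi)) - ss / (2 * sigma ** 2)

# One fixed sample (illustrative values).
sample = [168.0, 172.0, 171.0, 169.0, 175.0]

# Several candidate parameter sets (mu, sigma); each could have produced
# the sample with some chance -- its likelihood.
candidates = [(165.0, 5.0), (171.0, 2.5), (175.0, 5.0)]

# Maximum likelihood estimation: keep the candidate with the highest chance.
best = max(candidates, key=lambda p: log_likelihood(sample, p[0], p[1]))
print(best)  # → (171.0, 2.5)
```

Real maximum likelihood estimation searches the whole parameter space (analytically or numerically) rather than a short hand-picked list, but the selection rule is the same.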
In contrast to the above-mentioned (conditional) ''probability'', which fixes the parameters (hypothesis) and varies the observations (data), the ''likelihood'' fixes the observations (the data of the sample) and varies the parameters (hypothesis).
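The contrast can be made concrete with a single density function read in two ways; the normal model and the numeric values below are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and s.d. sigma at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Probability view: the parameters (hypothesis) are fixed, the observation varies.
fixed_mu, fixed_sigma = 170.0, 10.0
prob_view = {x: normal_pdf(x, fixed_mu, fixed_sigma) for x in (160.0, 170.0, 180.0)}

# Likelihood view: the observation (data) is fixed, the parameters vary.
fixed_x = 170.0
likelihood_view = {mu: normal_pdf(fixed_x, mu, fixed_sigma) for mu in (160.0, 170.0, 180.0)}

print(prob_view)        # density as a function of x, parameters fixed
print(likelihood_view)  # density as a function of mu, data fixed
```

The same function supplies both numbers; only which argument is held fixed distinguishes a probability statement from a likelihood statement.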