 Application to the binomial distribution

Let us now apply what we have just learned about the mean, variance, and standard deviation of a general distribution function to the specific case of the binomial distribution function. Recall that if a simple system has just two possible outcomes, denoted 1 and 2, with respective probabilities $p$ and $q=1-p$, then the probability of obtaining $n_1$ occurrences of outcome 1 in $N$ observations is
\begin{displaymath}
P_N(n_1) = \frac{N!}{n_1 !\,(N-n_1)!} \,p^{n_1}q^{N-n_1}.
\end{displaymath} (38)

Thus, the mean number of occurrences of outcome 1 in $N$ observations is given by
\begin{displaymath}
\overline{n_1} = \sum_{n_1=0}^N P_N(n_1)\,n_1 = \sum_{n_1=0}^N
\frac{N!}{n_1!\,(N-n_1)!}\,p^{n_1}q^{N-n_1}\, n_1.
\end{displaymath} (39)

This is a rather nasty-looking expression! However, we can see that if the final factor $n_1$ were absent, it would just reduce to the binomial expansion, which we know how to sum. We can take advantage of this fact by using a rather elegant mathematical sleight of hand. Observe that since
\begin{displaymath}
n_1\,p^{n_1} \equiv p\,\frac{\partial}{\partial p}\,p^{n_1},
\end{displaymath} (40)

the summation can be rewritten as
\begin{displaymath}
\sum_{n_1=0}^N\frac{N!}{n_1!\,(N-n_1)!}\,p^{n_1}q^{N-n_1}\,n_1
\equiv p\,\frac{\partial}{\partial p}\left[\sum_{n_1=0}^N
\frac{N!}{n_1!\,(N-n_1)!}\,p^{n_1}q^{N-n_1}
\right].
\end{displaymath} (41)

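This is just algebra, and has nothing to do with probability theory: for the purpose of the differentiation, $p$ and $q$ are treated as independent variables, so the partial derivative acts on $p$ alone. The term in square brackets is the familiar binomial expansion, and can be written more succinctly as $(p+q)^N$. Thus,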
\begin{displaymath}
\sum_{n_1=0}^N\frac{N!}{n_1!\,(N-n_1)!}\,p^{n_1}q^{N-n_1}\,n_1
\equiv p\,\frac{\partial}{\partial p} \,(p+q)^N\equiv pN(p+q)^{N-1}.
\end{displaymath} (42)
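
As a quick numerical sanity check of Eqs. (41) and (42), the following sketch evaluates both sides for values of $p$ and $q$ that deliberately do not sum to one, since the identity is purely algebraic. The function names, the central-difference step `h`, and the test values are illustrative assumptions, not part of the derivation.

```python
from math import comb

def binomial_sum(p, q, N):
    """Sum over n1 of C(N, n1) p^n1 q^(N-n1), with p and q independent."""
    return sum(comb(N, n1) * p**n1 * q**(N - n1) for n1 in range(N + 1))

def weighted_sum(p, q, N):
    """Same sum, but with the extra factor n1 in each term."""
    return sum(comb(N, n1) * p**n1 * q**(N - n1) * n1 for n1 in range(N + 1))

def p_d_dp(f, p, q, N, h=1e-6):
    """Apply the operator p * d/dp to f, using a central difference in p only."""
    return p * (f(p + h, q, N) - f(p - h, q, N)) / (2.0 * h)

# Illustrative values; note that p + q != 1 here on purpose.
p, q, N = 0.4, 0.9, 8

lhs = weighted_sum(p, q, N)             # left-hand side of Eq. (41)
middle = p_d_dp(binomial_sum, p, q, N)  # p d/dp acting on the plain sum
rhs = p * N * (p + q) ** (N - 1)        # closed form in Eq. (42)

print(lhs, middle, rhs)
```

The three printed numbers should agree to within the accuracy of the finite difference.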

However, $p+q=1$ for the case in hand, so
\begin{displaymath}
\overline{n_1} = Np.
\end{displaymath} (43)
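
The result $\overline{n_1}=Np$ can also be checked by brute force, summing Eq. (39) term by term. The sketch below does this; the helper names and the values $N=20$, $p=0.3$ are assumptions chosen purely for illustration.

```python
from math import comb

def binomial_pmf(n1, N, p):
    """P_N(n1) = N! / (n1! (N-n1)!) * p^n1 * q^(N-n1), with q = 1 - p."""
    q = 1.0 - p
    return comb(N, n1) * p**n1 * q**(N - n1)

def mean_by_summation(N, p):
    """Evaluate the sum in Eq. (39) directly."""
    return sum(binomial_pmf(n1, N, p) * n1 for n1 in range(N + 1))

N, p = 20, 0.3  # illustrative values
print(mean_by_summation(N, p), N * p)
```

Both printed values should coincide up to floating-point rounding.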

In fact, we could have guessed this result. By definition, the probability $p$ is the number of occurrences of the outcome 1 divided by the number of trials, in the limit as the number of trials goes to infinity:

\begin{displaymath}
p = \lim_{N\rightarrow\infty}\frac{n_1}{N}.
\end{displaymath} (44)

If we think carefully, however, we can see that taking the limit as the number of trials goes to infinity is equivalent to taking the mean value, so that
\begin{displaymath}
p = \overline{\left(\frac{n_1}{N}\right)} = \frac{\overline{n_1}}{N}.
\end{displaymath} (45)

But this is just a simple rearrangement of Eq. (43)!
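
The limiting behaviour in Eq. (44) can be illustrated with a simple simulation. The sketch below is only an illustration under stated assumptions: it uses Python's pseudo-random generator with a fixed seed, and the function name, the seed, and the value $p=0.3$ are hypothetical choices.

```python
import random

def observed_frequency(N, p, seed=0):
    """Simulate N two-outcome trials and return n1 / N for outcome 1."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    n1 = sum(1 for _ in range(N) if rng.random() < p)
    return n1 / N

p = 0.3  # illustrative probability of outcome 1
for N in (100, 10_000, 100_000):
    print(N, observed_frequency(N, p))
```

As $N$ grows, the printed ratio $n_1/N$ should settle ever closer to $p$.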

Let us now calculate the variance of $n_1$. Recall that

\begin{displaymath}
\overline{({\mit\Delta} n_1)^2}= \overline{(n_1)^2} - (\overline{n_1})^2.
\end{displaymath} (46)

We already know $\overline{n_1}$, so we just need to calculate $\overline{(n_1)^2}$. This average is written
\begin{displaymath}
\overline{(n_1)^2}=\sum_{n_1=0}^{N}\frac{N!}{n_1!\,(N-n_1)!}\,p^{n_1}
q^{N-n_1}\,(n_1)^2.
\end{displaymath} (47)

The sum can be evaluated using a simple extension of the mathematical trick we used earlier to evaluate $\overline{n_1}$. Since
\begin{displaymath}
(n_1)^2 \,p^{n_1} \equiv \left(p\,\frac{\partial}{\partial p}\right)^2 p^{n_1},
\end{displaymath} (48)

then
\begin{displaymath}
\begin{array}{rcl}
\displaystyle \sum_{n_1=0}^{N}\frac{N!}{n_1!\,(N-n_1)!}\,p^{n_1}q^{N-n_1}\,(n_1)^2
& \equiv & \displaystyle \left(p\,\frac{\partial}{\partial p}\right)^2\sum_{n_1=0}^N
\frac{N!}{n_1!\,(N-n_1)!}\,p^{n_1}q^{N-n_1} \\[2ex]
& \equiv & \displaystyle \left(p\,\frac{\partial}{\partial p}\right)^2(p+q)^N \\[2ex]
& \equiv & \displaystyle \left(p\,\frac{\partial}{\partial p}\right)\left[pN (p+q)^{N-1}\right] \\[2ex]
& \equiv & \displaystyle p\left[N(p+q)^{N-1}+pN(N-1)(p+q)^{N-2}\right].
\end{array}
\end{displaymath} (49)

Using $p+q=1$ yields
\begin{displaymath}
\begin{array}{rcl}
\displaystyle \overline{(n_1)^2}
& = & \displaystyle p\left[N+pN(N-1)\right]= Np\left[1+pN-p\right] \\[1ex]
& = & \displaystyle (Np)^2 + Npq = (\overline{n_1})^2 + Npq,
\end{array}
\end{displaymath} (50)

since $\overline{n_1}= Np$. It follows that the variance of $n_1$ is given by
\begin{displaymath}
\overline{({\mit\Delta} n_1)^2}= \overline{(n_1)^2}- (\overline{n_1})^2 = Npq.
\end{displaymath} (51)

The standard deviation of $n_1$ is just the square root of the variance, so

\begin{displaymath}
{\mit\Delta}^\ast n_1 = \sqrt{Npq}.
\end{displaymath} (52)
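
The variance and standard deviation can be verified in the same brute-force manner as the mean, by summing Eq. (47) directly. In this sketch the function name and the values $N=20$, $p=0.3$ are illustrative assumptions.

```python
from math import comb, sqrt

def binomial_moments(N, p):
    """Return (mean, mean of n1^2, variance) of n1 by direct summation."""
    q = 1.0 - p
    probs = [comb(N, n1) * p**n1 * q**(N - n1) for n1 in range(N + 1)]
    mean = sum(P * n1 for n1, P in enumerate(probs))
    mean_sq = sum(P * n1**2 for n1, P in enumerate(probs))
    variance = mean_sq - mean**2  # Eq. (46)
    return mean, mean_sq, variance

N, p = 20, 0.3  # illustrative values
q = 1.0 - p
mean, mean_sq, variance = binomial_moments(N, p)

print(mean_sq, (N * p) ** 2 + N * p * q)  # compare with Eq. (50)
print(variance, N * p * q)                # compare with Eq. (51)
print(sqrt(variance), sqrt(N * p * q))    # compare with Eq. (52)
```

Each printed pair should agree up to floating-point rounding.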

Recall that this quantity is essentially the width of the range over which $n_1$ is distributed around its mean value. The relative width of the distribution is characterized by
\begin{displaymath}
\frac{{\mit\Delta}^\ast n_1}{\overline{n_1}}= \frac{\sqrt{N pq}}{Np} =
\sqrt{\frac{q}{p}}\frac{1}{\sqrt{N}}.
\end{displaymath} (53)

It is clear from this formula that the relative width decreases like $N^{-1/2}$ with increasing $N$. So, the greater the number of trials, the more likely it is that an observation of $n_1$ will yield a result which is relatively close to the mean value $\overline{n_1}$. This is a very important result!
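
To make the $N^{-1/2}$ scaling concrete, the sketch below evaluates Eq. (53) for a few values of $N$, and also sums the binomial probabilities lying within a fixed fractional distance of the mean. The function names, the 10% window, and the choice $p=0.5$ are assumptions made for illustration; the probabilities are computed via logarithms (using `math.lgamma`) only to avoid floating-point underflow at large $N$.

```python
from math import exp, lgamma, log, sqrt

def relative_width(N, p):
    """Relative width sqrt(q/p) / sqrt(N) from Eq. (53)."""
    q = 1.0 - p
    return sqrt(q / p) / sqrt(N)

def log_pmf(n1, N, p):
    """Logarithm of the binomial probability P_N(n1); requires 0 < p < 1."""
    q = 1.0 - p
    return (lgamma(N + 1) - lgamma(n1 + 1) - lgamma(N - n1 + 1)
            + n1 * log(p) + (N - n1) * log(q))

def prob_within(N, p, rel=0.1):
    """Probability that n1 lies within a fraction rel of its mean N p."""
    mean = N * p
    return sum(exp(log_pmf(n1, N, p)) for n1 in range(N + 1)
               if abs(n1 - mean) <= rel * mean)

p = 0.5  # illustrative value
for N in (100, 10_000):
    print(N, relative_width(N, p), prob_within(N, p))
```

Increasing $N$ by a factor of 100 reduces the relative width by a factor of 10, and the probability of landing within the fixed fractional window rises accordingly.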


 