Hypothesis Testing

So far we have seen two statistical inference frameworks: point estimation and confidence intervals. Hypothesis testing is a third inference framework, concerned with choosing, out of a number of competing alternatives, the hypothesis that is best supported by the available data. We start with the basic definitions and then describe a few important cases.

We assume that we have data X_1, ..., X_n sampled from a distribution characterized by a parameter θ. A hypothesis is a set of possible values for θ. We will consider cases where there are two competing hypotheses: the null hypothesis H_0 and the alternative hypothesis H_A, with H_0 ∩ H_A = ∅. The two hypotheses are not treated symmetrically. The null hypothesis usually describes a set of standard or believable values. The alternative hypothesis is sometimes called the research hypothesis, since it may describe a research statement that one wishes to examine.

A hypothesis test is executed by observing the value of a test statistic T(X_1, ..., X_n). If it lies in a set called the rejection region (RR), the null hypothesis is rejected and the alternative is accepted. Otherwise, the null is accepted and the alternative is rejected. A hypothesis test is thus composed of a sampling distribution, a parameter of interest θ, null and alternative hypotheses, a test statistic, and a rejection region.

A test can be good or bad depending on how often it leads to the right decision. Formally, two types of errors can be made. A type-1 error is made if H_0 is rejected while it is true, and a type-2 error is made if H_0 is accepted while it is not true. The corresponding probabilities of these two errors are denoted by α and β:

α = P(T ∈ RR) when θ ∈ H_0,    β = P(T ∉ RR) when θ ∈ H_A.

The two probabilities α, β are defined for particular values of θ and thus are functions of the true value of the parameter θ.
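To make the dependence of α and β on θ concrete, here is a minimal Python sketch for a hypothetical test that is not taken from the text. It assumes n Bernoulli observations with parameter θ, the test statistic T = X_1 + ... + X_n, the hypotheses H_0: θ ≥ 0.5 and H_A: θ < 0.5, and the rejection region RR = {T ≤ c}. The values n = 20 and c = 5, the thresholds, and all function names are illustrative assumptions.

```python
from math import comb


def binom_pmf(k, n, theta):
    """P(T = k) when T is the sum of n Bernoulli(theta) variables."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def reject_prob(theta, n, c):
    """P_theta(T in RR) for the assumed rejection region RR = {T <= c}."""
    return sum(binom_pmf(k, n, theta) for k in range(c + 1))


def type1_error(theta, n, c):
    """alpha(theta): probability of rejecting H_0, for a theta in H_0."""
    return reject_prob(theta, n, c)


def type2_error(theta, n, c):
    """beta(theta): probability of accepting H_0, for a theta in H_A."""
    return 1.0 - reject_prob(theta, n, c)


if __name__ == "__main__":
    n, c = 20, 5  # hypothetical sample size and rejection threshold
    for theta in (0.5, 0.6, 0.7):  # values assumed to lie in H_0
        print(f"theta={theta}: alpha={type1_error(theta, n, c):.4f}")
    for theta in (0.1, 0.2, 0.4):  # values assumed to lie in H_A
        print(f"theta={theta}: beta={type2_error(theta, n, c):.4f}")
```

Under these assumptions, the type-1 error is largest at the boundary value θ = 0.5 and shrinks as θ moves further into H_0, while the type-2 error grows as θ in H_A approaches that boundary.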
Example: A company wishes to test whether a new drug helps to prevent a certain disease. The presence or absence of the disease in people treated with the new drug is a sequence of Bernoulli random variables with parameter θ. Suppose the rate of contracting the disease for people not taking the …