Estimating Population Means
ESTIMATION VERSUS HYPOTHESIS TESTING
In this section, we move from descriptive
statistics to inferential statistics. In descriptive statistics, we simply
summarize information available in the data we are given. In inferential
statistics, we draw conclusions about a population based on a sample and a
known or assumed sampling distribution. Implicit in statistical inference is
the assumption that the data were gathered as a random sample from a population.
Examples of the types of inferences that can be
made are estimation, conclusions from hypothesis tests, and predictions of
future observations. In estimation, we are interested in choosing the “best”
estimate of a population parameter based on the sample and statistical theory.
For example, as we saw in Chapter 7, when data are
sampled from a normal distribution, the sample mean has a normal distribution
whose mean equals the population mean and whose variance equals the
population variance divided by the sample size n. Recall that the distribution
of a statistic such as a sample mean is called a sampling distribution. The
Gauss–Markov theory goes on to determine that the sample mean is the best
estimate of the population mean. That means that, for a sample of size n,
it gives us the most accurate answer (e.g., it has properties such as smallest
mean square error and minimum variance among unbiased estimators).
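The claim about the sampling distribution of the sample mean can be checked informally by simulation. The following is a minimal sketch, not a procedure from the text: the values of mu, sigma, and n, the number of replications, and the function name simulate_sample_means are all illustrative assumptions.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, replications, seed=0):
    """Draw `replications` samples of size n from N(mu, sigma^2)
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

# Illustrative values (assumptions, not taken from the text)
mu, sigma, n = 50.0, 10.0, 25
means = simulate_sample_means(mu, sigma, n, replications=20000)

# The average of the sample means should be close to mu, and their
# variance should be close to sigma**2 / n.
print("mean of sample means:    ", statistics.fmean(means))
print("variance of sample means:", statistics.pvariance(means))
print("theoretical variance:    ", sigma**2 / n)
```

With more replications, the simulated mean and variance should settle closer to the theoretical values.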
The sample mean is a point estimate, but we know it
has a sampling distribution. Hence, the sample mean will not be exactly equal
to the population mean. However, the theory we have tells us about its sampling
distribution; thus, statistical theory can aid us in describing our uncertainty
about the population mean based on our knowledge of the sampling distribution
for the sample mean.
In Section 8.2, we will further discuss point
estimates, and in Section 8.3 we will discuss confidence intervals. Confidence
intervals are merely interval estimates (based on the observed data) of
population parameters; each expresses a range of values that is likely to
contain the parameter. We will describe how the sampling distribution of the
point estimate is used to get confidence intervals in Section 8.3.
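As a preview of how a sampling distribution leads to an interval estimate, here is a minimal sketch of one common case: a two-sided interval for the mean of normal data when the population standard deviation sigma is assumed known. The data values, sigma, and the function name z_confidence_interval are hypothetical, and the intervals developed in Section 8.3 may differ in detail.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_confidence_interval(sample, sigma, level=0.95):
    """Interval for the population mean, assuming normal data and a
    KNOWN population standard deviation sigma (an assumption here)."""
    n = len(sample)
    # Standard normal quantile that leaves (1 - level) / 2 in each tail
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)  # z times the standard error
    center = fmean(sample)            # the point estimate
    return center - half_width, center + half_width

# Hypothetical data, for illustration only
sample = [48.0, 52.0, 50.0, 49.0, 51.0]
lower, upper = z_confidence_interval(sample, sigma=2.0)
print(f"95% interval for the mean: ({lower:.3f}, {upper:.3f})")
```

The interval is centered at the point estimate, and its width shrinks as the sample size grows, because the standard error is sigma divided by the square root of n.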
In hypothesis testing, we construct a null and an
alternative hypothesis. Usually, the null hypothesis is an uninteresting
hypothesis that we would like to reject. You will see examples in Chapter 9.
The alternative hypothesis is generally the interesting scientific hypothesis
that we would like to “prove.” However, we do not actually “prove” the
alternative hypothesis; we merely reject the null hypothesis and retain a
degree of uncertainty about its status.
Due to statistical uncertainty, one can never
absolutely prove a hypothesis based on a sample. We will draw conclusions based
on our sample data and associate an error probability with our possible
conclusion. When our conclusion favors the null hypothesis, we prefer to say
that we fail to reject the null hypothesis rather than that we accept the null
hypothesis.
In setting up the hypothesis test, we will
determine a critical value in advance of looking at the data. This critical
value is selected to control the type I error rate (i.e., the probability of
falsely rejecting the null hypothesis). This is the so-called Neyman–Pearson
formulation that we will describe in Section 9.2.
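To make the role of the critical value concrete, the sketch below fixes a significance level alpha before any data are seen, derives the critical value for a one-sided z test, and then estimates by simulation how often a true null hypothesis is rejected. The choice of a one-sided z test with known sigma, the numerical values, and all function names are illustrative assumptions rather than the formulation given in Section 9.2.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def critical_value(alpha):
    """Upper-tail critical value of the standard normal, fixed in advance."""
    return NormalDist().inv_cdf(1 - alpha)

def rejects_null(sample, mu0, sigma, alpha=0.05):
    """One-sided z test of H0: mu = mu0 against H1: mu > mu0 (sigma known)."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return z > critical_value(alpha)

def type_one_error_rate(mu0, sigma, n, alpha, replications, seed=0):
    """Fraction of samples drawn UNDER the null that are (falsely) rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        rejections += rejects_null(sample, mu0, sigma, alpha)
    return rejections / replications

# Illustrative values (assumptions, not taken from the text)
rate = type_one_error_rate(mu0=50.0, sigma=10.0, n=25, alpha=0.05,
                           replications=20000)
print("critical value:", critical_value(0.05))
print("estimated type I error rate:", rate)  # should be close to alpha
```

Because the critical value depends only on alpha, it can be written down before the sample is collected, which is the point of the paragraph above.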
In Section 9.9, we will describe a relationship
between confidence intervals and hypothesis tests that enables one to construct
a hypothesis test from a confidence interval or a confidence interval from a
hypothesis test. Usually, hypothesis tests are constructed based directly on
the sampling distribution of the point estimate. However, in Chapter 9 we will
introduce the simplest form of bootstrap hypothesis testing. This test is
based on a bootstrap percentile method confidence interval that we will
introduce in Section 8.8.
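The idea of building a test from an interval can be sketched as follows: form a bootstrap percentile interval for the mean, then reject the null value if it falls outside that interval. This is one simple version written under stated assumptions; the number of resamples, the way percentile positions are chosen, the data, and the function names are hypothetical, and the procedures given in Section 8.8 and Chapter 9 may differ in detail.

```python
import random
from statistics import fmean

def bootstrap_percentile_ci(sample, level=0.95, resamples=5000, seed=0):
    """Percentile-method interval for the mean: resample the data with
    replacement, compute each resample's mean, and read off percentiles."""
    rng = random.Random(seed)
    n = len(sample)
    boot_means = sorted(
        fmean(rng.choices(sample, k=n)) for _ in range(resamples)
    )
    alpha = 1 - level
    # Approximate positions of the alpha/2 and 1 - alpha/2 percentiles
    lower_index = int((alpha / 2) * resamples)
    upper_index = int((1 - alpha / 2) * resamples) - 1
    return boot_means[lower_index], boot_means[upper_index]

def bootstrap_test_rejects(sample, mu0, level=0.95, **kwargs):
    """Reject H0: mu = mu0 when mu0 lies outside the percentile interval."""
    lower, upper = bootstrap_percentile_ci(sample, level, **kwargs)
    return not (lower <= mu0 <= upper)

# Hypothetical data, for illustration only
sample = [48.0, 52.0, 50.0, 49.0, 51.0, 47.0, 53.0, 50.0]
print("interval:", bootstrap_percentile_ci(sample))
print("reject mu0 = 50? ", bootstrap_test_rejects(sample, 50.0))
print("reject mu0 = 100?", bootstrap_test_rejects(sample, 100.0))
```

A test built this way at confidence level 0.95 corresponds, approximately, to a two-sided test with a 0.05 significance level.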