Home

Monday, October 25, 2010

UNIT 4 (TESTING)

Symbols


  • α = the probability of a Type I error (rejecting a null hypothesis when it is in fact true)
  • n = sample size
  • n₁ = sample 1 size
  • n₂ = sample 2 size
  • x̄ = sample mean
  • μ₀ = hypothesized population mean
  • μ₁ = population 1 mean
  • μ₂ = population 2 mean
  • σ = population standard deviation
  • σ² = population variance
  • s = sample standard deviation
  • s² = sample variance
  • s₁ = sample 1 standard deviation
  • s₂ = sample 2 standard deviation
  • t = t statistic
  • df = degrees of freedom
  • d̄ = sample mean of differences
  • d₀ = hypothesized population mean difference
  • sd = standard deviation of differences
  • p̂ = x/n = sample proportion, unless specified otherwise
  • p₀ = hypothesized population proportion
  • p₁ = proportion 1
  • p₂ = proportion 2
  • dp = hypothesized difference in proportions
  • min(n₁, n₂) = minimum of n₁ and n₂
  • x₁ = n₁p₁
  • x₂ = n₂p₂
  • χ² = chi-square statistic
  • F = F statistic


Theory of the t Distribution

According to the central limit theorem, the sampling distribution of a statistic (like a sample mean) will follow a normal distribution, as long as the sample size is sufficiently large. Therefore, when we know the standard deviation of the population, we can compute a z-score and use the normal distribution to evaluate probabilities involving the sample mean.
But sample sizes are sometimes small, and often we do not know the standard deviation of the population. When either of these problems occurs, statisticians rely on the distribution of the t statistic (also known as the t score), whose values are given by:
t = [ x̄ - μ ] / [ s / sqrt( n ) ]
where x̄ is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size. The distribution of the t statistic is called the t distribution or the Student t distribution.
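
To make the formula concrete, here is a minimal Python sketch; the sample values and hypothesized mean are invented for illustration:

    import math
    import statistics

    sample = [19.1, 20.4, 18.7, 21.0, 19.8, 20.2, 18.9, 20.6]  # hypothetical data
    n = len(sample)
    x_bar = statistics.mean(sample)   # sample mean
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 in the denominator)
    mu = 20.0                         # hypothesized population mean

    t = (x_bar - mu) / (s / math.sqrt(n))
    print(f"t = {t:.4f} with {n - 1} degrees of freedom")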

Degrees of Freedom

There are actually many different t distributions. The particular form of the t distribution is determined by its degrees of freedom. The degrees of freedom refers to the number of independent observations in a set of data.
When estimating a mean score or a proportion from a single sample, the number of independent observations is equal to the sample size minus one. Hence, the distribution of the t statistic from samples of size 8 would be described by a t distribution having 8 - 1 or 7 degrees of freedom. Similarly, a t distribution having 15 degrees of freedom would be used with a sample of size 16.
For other applications, the degrees of freedom may be calculated differently. We will describe those computations as they come up.

Properties of the t Distribution

The t distribution has the following properties:
  • The mean of the distribution is equal to 0.
  • The variance is equal to v / (v - 2), where v is the degrees of freedom (see the previous section) and v > 2.
  • The variance is always greater than 1, although it is close to 1 when there are many degrees of freedom. With infinite degrees of freedom, the t distribution is the same as the standard normal distribution.
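
These properties are easy to check numerically. The short Python sketch below (scipy assumed available) compares the variance reported by scipy.stats.t with the formula v / (v - 2):

    from scipy.stats import t

    for v in (3, 5, 30, 1000):
        print(v, t.var(v), v / (v - 2))  # the two values agree, and both approach 1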

When to Use the t Distribution

The t distribution can be used with any statistic having a bell-shaped distribution (i.e., approximately normal). The central limit theorem states that the sampling distribution of a statistic will be normal or nearly normal if any of the following conditions apply:
  • The population distribution is normal.
  • The population distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
  • The population distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
  • The sample size is greater than 40, without outliers.
The t distribution should not be used with small samples from populations that are not approximately normal.


Probability and the Student t Distribution

When a sample of size n is drawn from a population having a normal (or nearly normal) distribution, the sample mean can be transformed into a t score, using the equation presented at the beginning of this lesson. We repeat that equation below:
t = [ x̄ - μ ] / [ s / sqrt( n ) ]
where x̄ is the sample mean, μ is the population mean, s is the standard deviation of the sample, n is the sample size, and degrees of freedom are equal to n - 1.
The t score produced by this transformation can be associated with a unique cumulative probability. This cumulative probability represents the likelihood of finding a sample mean less than or equal to x̄, given a random sample of size n.
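
In code, this transformation-plus-lookup is a one-liner. A hedged Python sketch (scipy assumed), with an invented t score and sample size:

    from scipy.stats import t

    t_score = -0.50   # hypothetical t score
    df = 9            # degrees of freedom for a sample of size 10
    print(t.cdf(t_score, df))  # cumulative probability P(T <= -0.50) with 9 df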


Notation and t Scores

Statisticians use tα to represent the t score that has a cumulative probability of (1 - α). For example, suppose we were interested in the t score having a cumulative probability of 0.95. In this example, α would be equal to (1 - 0.95) or 0.05. We would refer to this t score as t0.05.
Of course, the value of t0.05 depends on the number of degrees of freedom. For example, with 2 degrees of freedom, t0.05 is equal to 2.92, but with 20 degrees of freedom, t0.05 is equal to 1.725.
Note: Because the t distribution is symmetric about a mean of zero, the following is true:
tα = -t(1 - α)       and       t(1 - α) = -tα
Thus, if t0.05 = 2.92, then t0.95 = -2.92.
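
The quoted values can be reproduced with scipy (assuming it is installed); t.ppf inverts the cumulative probability, so tα is t.ppf(1 - α, df):

    from scipy.stats import t

    print(t.ppf(0.95, 2))    # ~2.92   (t0.05 with 2 degrees of freedom)
    print(t.ppf(0.95, 20))   # ~1.725  (t0.05 with 20 degrees of freedom)
    print(t.ppf(0.05, 2))    # ~-2.92  (symmetry: t0.95 = -t0.05)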

Examples

  1. The Acme Chain Company claims that their chains have an average breaking strength of 20,000 pounds, with a standard deviation of 1750 pounds. Suppose a customer tests 14 randomly-selected chains. What is the probability that the average breaking strength in the test will be no more than 19,800 pounds?

    Solution:

    One strategy would be a two-step approach:

    • Compute a t score, assuming that the mean of the sample test is 19,800 pounds.
    • Determine the cumulative probability for that t score.

    We will follow that strategy here. First, we compute the t score:

    t = [ x̄ - μ ] / [ s / sqrt( n ) ]
    t = (19,800 - 20,000) / [ 1750 / sqrt(14) ]
    t = ( -200 ) / [ (1750) / (3.74166) ] = ( -200 ) / (467.707) = -0.4276

    where x̄ is the sample mean, μ is the population mean, s is the standard deviation of the sample, n is the sample size, and t is the t score.

    Now, we can determine the cumulative probability for the t score. We know the following:

    • The t score is equal to -0.4276.
    • The number of degrees of freedom is equal to 13. (In situations like this, the number of degrees of freedom is equal to number of observations minus 1. Hence, the number of degrees of freedom is equal to 14 - 1 or 13.)

    Now, we are ready to use the T Distribution Calculator. Since we have already computed the t score, we select "t score" from the drop-down box. Then, we enter the t score (-0.4276) and the degrees of freedom (13) into the calculator, and hit the Calculate button. The calculator reports that the cumulative probability is 0.338. Therefore, there is a 33.8% chance that the average breaking strength in the test will be no more than 19,800 pounds. (A code sketch that reproduces this result appears after Example 3.)

    Note: The strategy that we used required us to first compute a t score, and then use the T Distribution Calculator to find the cumulative probability. An alternative strategy, which does not require us to compute a t score, would be to use the calculator in the "Sample mean" mode. That strategy may be a little bit easier. It is illustrated in the next example.
  2. Let's look again at the problem that we addressed above in Example 1. This time, we will illustrate a different, easier strategy to solve the problem.

    Here, once again, is the problem: The Acme Chain Company claims that their chains have an average breaking strength of 20,000 pounds, with a standard deviation of 1750 pounds. Suppose a customer tests 14 randomly-selected chains. What is the probability that the average breaking strength in the test will be no more than 19,800 pounds?

    Solution:

    We know the following:

    • The population mean is 20,000.
    • The standard deviation is 1750.
    • The sample mean, for which we want to find a cumulative probability, is 19,800.
    • The number of degrees of freedom is 13. (In situations like this, the number of degrees of freedom is equal to number of observations minus 1. Hence, the number of degrees of freedom is equal to 14 - 1 or 13.)

    First, we select "Sample mean" from the dropdown box in the T Distribution Calculator. Then, we plug our known inputs (degrees of freedom, sample mean, standard deviation, and population mean) into the T Distribution Calculator and hit the Calculate button. The calculator reports that the cumulative probability is 0.338. Thus, there is a 33.8% probability that the average breaking strength in the test will be no more than 19,800 pounds.

    Note: This is the same answer that we found in Example 1. However, the approach that we followed in this example may be a little bit easier than the approach that we used in the previous example, since this approach does not require us to compute a t score.
  3. The school board administered an IQ test to 25 randomly selected teachers. They found that the average IQ score was 115 with a standard deviation of 11. Assume that the cumulative probability is 0.90. What population mean would have produced this sample result?

    Note: In this situation, a cumulative probability of 0.90 suggests that 90% of the random samples drawn from the teacher population will have an average IQ of 115 or less. This problem asks you to find the true population IQ for which this would be true.

    Solution:

    We know the following:

    • The cumulative probability is 0.90.
    • The standard deviation is 11.
    • The sample mean is 115.
    • The number of degrees of freedom is 24. (In situations like this, the number of degrees of freedom is equal to number of observations minus 1. Hence, the number of degrees of freedom is equal to 25 - 1 or 24.)

    First, we select "Sample mean" from the dropdown box in the T Distribution Calculator. Then, we plug the known inputs (cumulative probability, standard deviation, sample mean, and degrees of freedom) into the calculator and hit the Calculate button. The calculator reports that the population mean is 112.1. (The code sketch after this example reproduces the computation.)

    Here is what this means. Suppose we randomly sampled every possible combination of 25 teachers. If the true population mean were 112.1, we would expect 90% of our samples to have a sample mean of 115 or less.
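
For readers who prefer code to the web calculator, here is a Python sketch (scipy assumed) that reproduces the results of Examples 1 and 3; it is an equivalent computation, not the calculator's own code:

    import math
    from scipy.stats import t

    # Example 1: P(sample mean <= 19,800) for a test of n = 14 chains
    t_score = (19800 - 20000) / (1750 / math.sqrt(14))
    print(round(t_score, 4))             # -0.4276
    print(round(t.cdf(t_score, 13), 3))  # 0.338

    # Example 3: population mean that puts a sample mean of 115 at cumulative probability 0.90
    t_90 = t.ppf(0.90, 24)               # t score with cumulative probability 0.90, 24 df
    mu = 115 - t_90 * (11 / math.sqrt(25))
    print(round(mu, 1))                  # 112.1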

Sunday, October 10, 2010

UNIT 3 (SAMPLING)

INTRODUCTION TO SAMPLING

Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen. Let's begin by covering some of the key terms in sampling like "population" and "sampling frame." Then, because some types of sampling rely upon quantitative models, we'll talk about some of the statistical terms used in sampling. Finally, we'll discuss the major distinction between probability and nonprobability sampling methods and work through the major types in each.


External Validity

External validity is related to generalizing. That's the major thing you need to keep in mind. Recall that validity refers to the approximate truth of propositions, inferences, or conclusions. So, external validity refers to the approximate truth of conclusions that involve generalizations. Put in more pedestrian terms, external validity is the degree to which the conclusions in your study would hold for other persons in other places and at other times.
In science there are two major approaches to how we provide evidence for a generalization. I'll call the first approach the Sampling Model. In the sampling model, you start by identifying the population you would like to generalize to. Then, you draw a fair sample from that population and conduct your research with the sample. Finally, because the sample is representative of the population, you can automatically generalize your results back to the population. There are several problems with this approach. First, perhaps you don't know at the time of your study who you might ultimately like to generalize to. Second, you may not be easily able to draw a fair or representative sample. Third, it's impossible to sample across all times that you might like to generalize to (like next year).
I'll call the second approach to generalizing the Proximal Similarity Model. 'Proximal' means 'nearby' and 'similarity' means... well, it means 'similarity'. The term proximal similarity was suggested by Donald T. Campbell as an appropriate relabeling of the term external validity (although he was the first to admit that it probably wouldn't catch on!). Under this model, we begin by thinking about different generalizability contexts and developing a theory about which contexts are more like our study and which are less so. For instance, we might imagine several settings that have people who are more similar to the people in our study or people who are less similar. This also holds for times and places. When we place different contexts in terms of their relative similarities, we can call this implicit theoretical dimension a gradient of similarity. Once we have developed this proximal similarity framework, we are able to generalize. How? We conclude that we can generalize the results of our study to other persons, places or times that are more like (that is, more proximally similar to) our study. Notice that here, we can never generalize with certainty -- it is always a question of more or less similar.

Threats to External Validity

A threat to external validity is an explanation of how you might be wrong in making a generalization. For instance, you conclude that the results of your study (which was done in a specific place, with certain types of people, and at a specific time) can be generalized to another context (for instance, another place, with slightly different people, at a slightly later time). There are three major threats to external validity because there are three ways you could be wrong -- people, places or times. Your critics could come along, for example, and argue that the results of your study are due to the unusual type of people who were in the study. Or, they could argue that it might only work because of the unusual place you did the study in (perhaps you did your educational study in a college town with lots of high-achieving educationally-oriented kids). Or, they might suggest that you did your study in a peculiar time. For instance, if you did your smoking cessation study the week after the Surgeon General issued the well-publicized results of the latest smoking and cancer studies, you might get different results than if you had done it the week before.

Improving External Validity

How can we improve external validity? One way, based on the sampling model, suggests that you do a good job of drawing a sample from a population. For instance, you should use random selection, if possible, rather than a nonrandom procedure. And, once selected, you should try to assure that the respondents participate in your study and that you keep your dropout rates low. A second approach would be to use the theory of proximal similarity more effectively. How? Perhaps you could do a better job of describing the ways your contexts and others differ, providing lots of data about the degree of similarity between various groups of people, places, and even times. You might even be able to map out the degree of proximal similarity among various contexts with a methodology like concept mapping. Perhaps the best approach to criticisms of generalizations is simply to show them that they're wrong -- do your study in a variety of places, with different people and at different times.


Sampling Terminology

As with anything else in life you have to learn the language of an area if you're going to ever hope to use it. Here, I want to introduce several different terms for the major groups that are involved in a sampling process and the role that each group plays in the logic of sampling.
The major question that motivates sampling in the first place is: "Who do you want to generalize to?" Or should it be: "To whom do you want to generalize?" In most social research we are interested in more than just the people who directly participate in our study. We would like to be able to talk in general terms and not be confined only to the people who are in our study. Now, there are times when we aren't very concerned about generalizing. Maybe we're just evaluating a program in a local agency and we don't care whether the program would work with other people in other places and at other times. In that case, sampling and generalizing might not be of interest. In other cases, we would really like to be able to generalize almost universally. When psychologists do research, they are often interested in developing theories that would hold for all humans. But in most applied social research, we are interested in generalizing to specific groups.

The group you wish to generalize to is often called the population in your study. This is the group you would like to sample from because this is the group you are interested in generalizing to. Let's imagine that you wish to generalize to urban homeless males between the ages of 30 and 50 in the United States. If that is the population of interest, you are likely to have a very hard time developing a reasonable sampling plan. You are probably not going to find an accurate listing of this population, and even if you did, you would almost certainly not be able to mount a national sample across hundreds of urban areas. So we probably should make a distinction between the population you would like to generalize to, and the population that will be accessible to you. We'll call the former the theoretical population and the latter the accessible population. In this example, the accessible population might be homeless males between the ages of 30 and 50 in six selected urban areas across the U.S.
Once you've identified the theoretical and accessible populations, you have to do one more thing before you can actually draw a sample -- you have to get a list of the members of the accessible population. (Or, you have to spell out in detail how you will contact them to assure representativeness.) The listing of the accessible population from which you'll draw your sample is called the sampling frame. If you were doing a phone survey and selecting names from the telephone book, the book would be your sampling frame. That wouldn't be a great way to sample, because significant subportions of the population either don't have a phone or have moved in or out of the area since the last book was printed. Notice that in this case, you might identify the area code and all three-digit prefixes within that area code and draw a sample simply by randomly dialing numbers (cleverly known as random-digit dialing). In this case, the sampling frame is not a list per se, but is rather a procedure that you follow as the actual basis for sampling.

Finally, you actually draw your sample (using one of the many sampling procedures). The sample is the group of people who you select to be in your study. Notice that I didn't say that the sample was the group of people who are actually in your study. You may not be able to contact or recruit all of the people you actually sample, or some could drop out over the course of the study. The group that actually completes your study is a subsample of the sample -- it doesn't include nonrespondents or dropouts. The problem of nonresponse and its effects on a study will be addressed when discussing "mortality" threats to internal validity.
People often confuse what is meant by random selection with the idea of random assignment. You should make sure that you understand the distinction between random selection and random assignment.
At this point, you should appreciate that sampling is a difficult multi-step process and that there are lots of places you can go wrong. In fact, as we move from each step to the next in identifying a sample, there is the possibility of introducing systematic error or bias. For instance, even if you are able to identify perfectly the population of interest, you may not have access to all of them. And even if you do, you may not have a complete and accurate enumeration or sampling frame from which to select. And, even if you do, you may not draw the sample correctly or accurately. And, even if you do, they may not all come and they may not all stay. Depressed yet? This is a very difficult business indeed. At times like this I'm reminded of what Donald Campbell used to say (I'll paraphrase here): "Cousins to the amoeba, it's amazing that we know anything at all!"

Statistical Terms in Sampling

Let's begin by defining some very simple terms that are relevant here. First, let's look at the results of our sampling efforts. When we sample, the units that we sample -- usually people -- supply us with one or more responses. In this sense, a response is a specific measurement value that a sampling unit supplies. Imagine, for example, a person responding to a survey instrument with an answer of '4'. When we look across the responses that we get for our entire sample, we summarize them with a statistic. There are a wide variety of statistics we can use -- mean, median, mode, and so on. In this example, suppose the mean or average for the sample is 3.75. But the reason we sample is so that we might get an estimate for the population we sampled from. If we could, we would much prefer to measure the entire population. If you measure the entire population and calculate a value like a mean or average, we don't refer to this as a statistic; we call it a parameter of the population.

The Sampling Distribution

So how do we get from our sample statistic to an estimate of the population parameter? A crucial midway concept you need to understand is the sampling distribution. In order to understand it, you have to be able and willing to do a thought experiment. Imagine that instead of just taking a single sample like we do in a typical study, you took three independent samples of the same population. And furthermore, imagine that for each of your three samples, you collected a single response and computed a single statistic, say, the mean of the response. Even though all three samples came from the same population, you wouldn't expect to get the exact same statistic from each. They would differ slightly just due to the random "luck of the draw" or to the natural fluctuations or vagaries of drawing a sample. But you would expect that all three samples would yield a similar statistical estimate because they were drawn from the same population.

Now, for the leap of imagination! Imagine that you did an infinite number of samples from the same population and computed the average for each one. If you plotted them on a histogram or bar graph you should find that most of them converge on the same central value and that you get fewer and fewer samples that have averages farther away up or down from that central value. In other words, the bar graph would be well described by the bell curve shape that is an indication of a "normal" distribution in statistics. The distribution of an infinite number of samples of the same size as the sample in your study is known as the sampling distribution.

We don't ever actually construct a sampling distribution. Why not? You're not paying attention! Because to construct it we would have to take an infinite number of samples, and at least the last time I checked, on this planet infinite is not a number we know how to reach. So why do we even talk about a sampling distribution? Now that's a good question! Because we need to realize that our sample is just one of a potentially infinite number of samples that we could have taken. When we keep the sampling distribution in mind, we realize that while the statistic we got from our sample is probably near the center of the sampling distribution (because most of the samples would be there), we could have gotten one of the extreme samples just by the luck of the draw. If we take the average of the sampling distribution -- the average of the averages of an infinite number of samples -- we would be much closer to the true population average -- the parameter of interest. So the average of the sampling distribution is essentially equivalent to the parameter.

But what is the standard deviation of the sampling distribution? (OK, never had statistics? There are any number of places on the web where you can learn about them, or even just brush up if you've gotten rusty. This isn't one of them. I'm going to assume that you at least know what a standard deviation is, or that you're capable of finding out relatively quickly.) The standard deviation of the sampling distribution tells us something about how different samples would be distributed. In statistics it is referred to as the standard error (so we can keep it separate in our minds from standard deviations. Getting confused? Go get a cup of coffee and come back in ten minutes... OK, let's try once more... A standard deviation is the spread of the scores around the average in a single sample. The standard error is the spread of the averages around the average of averages in a sampling distribution. Got it?)
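
If the thought experiment feels too abstract, you can approximate it in code. The Python sketch below (all numbers hypothetical) draws many samples from the same population and looks at the spread of the sample means:

    import random
    import statistics

    random.seed(1)
    population_mean, population_sd, n = 3.75, 0.25, 100

    means = [
        statistics.mean(random.gauss(population_mean, population_sd) for _ in range(n))
        for _ in range(10_000)  # "many" samples standing in for infinitely many
    ]
    print(statistics.mean(means))   # close to the population mean, 3.75
    print(statistics.stdev(means))  # close to the standard error, 0.25 / sqrt(100) = 0.025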

Sampling Error

In sampling contexts, the standard error is called sampling error. Sampling error gives us some idea of the precision of our statistical estimate. A low sampling error means that we had relatively less variability or range in the sampling distribution. But here we go again -- we never actually see the sampling distribution! So how do we calculate sampling error? We base our calculation on the standard deviation of our sample. The greater the sample standard deviation, the greater the standard error (and the sampling error). The standard error is also related to the sample size. The greater your sample size, the smaller the standard error. Why? Because the greater the sample size, the closer your sample is to the actual population itself. If you take a sample that consists of the entire population you actually have no sampling error because you don't have a sample, you have the entire population. In that case, the mean you estimate is the parameter.
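
The sample-size point is worth a quick numeric check. A tiny sketch with a hypothetical sample standard deviation of .25:

    import math

    s = 0.25
    for n in (25, 100, 400, 1600):
        print(n, s / math.sqrt(n))  # quadrupling n halves the standard error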

The 68, 95, 99 Percent Rule

You've probably heard this one before, but it's so important that it's always worth repeating... There is a general rule that applies whenever we have a normal or bell-shaped distribution. Start with the average -- the center of the distribution. If you go up and down (i.e., left and right) one standard unit, you will include approximately 68% of the cases in the distribution (i.e., 68% of the area under the curve). If you go up and down two standard units, you will include approximately 95% of the cases. And if you go plus-and-minus three standard units, you will include about 99% of the cases. Notice that I didn't specify in the previous few sentences whether I was talking about standard deviation units or standard error units. That's because the same rule holds for both types of distributions (i.e., the raw data and sampling distributions). For instance, suppose the mean of a distribution is 3.75 and the standard unit is .25. (If this were a distribution of raw data, we would be talking in standard deviation units. If it were a sampling distribution, we'd be talking in standard error units.) If we go up and down one standard unit from the mean, we would be going up and down .25 from the mean of 3.75. Within this range -- 3.5 to 4.0 -- we would expect to see approximately 68% of the cases. I leave it to you to figure out the other ranges. But what does this all mean, you ask? If we are dealing with raw data and we know the mean and standard deviation of a sample, we can predict the intervals within which 68, 95 and 99% of our cases would be expected to fall. We call these intervals the -- guess what -- 68, 95 and 99% confidence intervals.
Now, here's where everything should come together in one great aha! experience if you've been following along. If we had a sampling distribution, we would be able to predict the 68, 95 and 99% confidence intervals for where the population parameter should be! And isn't that why we sampled in the first place? So that we could predict where the population is on that variable? There's only one hitch. We don't actually have the sampling distribution (now this is the third time I've said this in this essay)! But we do have the distribution for the sample itself. And we can from that distribution estimate the standard error (the sampling error) because it is based on the standard deviation and we have that. And, of course, we don't actually know the population parameter value -- we're trying to find that out -- but we can use our best estimate for that -- the sample statistic. Now, if we have the mean of the sampling distribution (or set it to the mean from our sample) and we have an estimate of the standard error (we calculate that from our sample) then we have the two key ingredients that we need for our sampling distribution in order to estimate confidence intervals for the population parameter.
Perhaps an example will help. Let's assume we did a study and drew a single sample from the population. Furthermore, let's assume that the average for the sample was 3.75 and the standard deviation was .25 -- the same values as in the previous section. Now, what would the sampling distribution be in this case? Well, we don't actually construct it (because we would need to take an infinite number of samples), but we can estimate it. For starters, we assume that the mean of the sampling distribution is the mean of the sample, which is 3.75. Then, we calculate the standard error. To do this, we use the standard deviation for our sample and the sample size (in this case N=100) and we come up with a standard error of .025 (that is, .25 divided by the square root of 100). Now we have everything we need to estimate a confidence interval for the population parameter. We would estimate that the probability is 68% that the true parameter value falls between 3.725 and 3.775 (i.e., 3.75 plus and minus .025); that the 95% confidence interval is 3.700 to 3.800; and that we can say with 99% confidence that the population value is between 3.675 and 3.825. The real value (in this fictitious example) was 3.72, and so we have correctly estimated that value with our sample.
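
The arithmetic in this example is simple enough to script. A Python sketch using the stated values (mean 3.75, standard deviation .25, N = 100):

    import math

    mean, sd, n = 3.75, 0.25, 100
    se = sd / math.sqrt(n)  # 0.025

    for k, level in ((1, 68), (2, 95), (3, 99)):
        print(f"{level}% interval: {mean - k * se:.3f} to {mean + k * se:.3f}")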

Probability Sampling

A probability sampling method is any method of sampling that utilizes some form of random selection. In order to have a random selection method, you must set up some process or procedure that assures that the different units in your population have equal probabilities of being chosen. Humans have long practiced various forms of random selection, such as picking a name out of a hat, or choosing the short straw. These days, we tend to use computers as the mechanism for generating random numbers as the basis for random selection.
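
As a minimal illustration of computer-based random selection, the Python sketch below draws a simple random sample from an invented sampling frame; random.sample selects without replacement, giving every unit the same chance of being chosen:

    import random

    sampling_frame = [f"unit_{i}" for i in range(1, 501)]  # N = 500 (hypothetical)
    sample = random.sample(sampling_frame, 25)             # n = 25
    print(sample[:5])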

Some Definitions

Before I can explain the various probability methods we have to define some basic terms. These are:
  • N = the number of cases in the sampling frame
  • n = the number of cases in the sample
  • NCn = the number of combinations (subsets) of n from N
  • f = n/N = the sampling fraction
That's it. With those terms defined we can begin to define the different probability sampling methods.
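
The same definitions in code, with hypothetical numbers; math.comb gives NCn, the number of possible samples of size n from a frame of size N:

    import math

    N, n = 500, 25
    print(math.comb(N, n))  # NCn, the number of combinations of n from N
    print(n / N)            # f, the sampling fraction (0.05 here)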


Nonprobability Sampling

The difference between nonprobability and probability sampling is that nonprobability sampling does not involve random selection and probability sampling does. Does that mean that nonprobability samples aren't representative of the population? Not necessarily. But it does mean that nonprobability samples cannot depend upon the rationale of probability theory. At least with a probabilistic sample, we know the odds or probability that we have represented the population well. We are able to estimate confidence intervals for the statistic. With nonprobability samples, we may or may not represent the population well, and it will often be hard for us to know how well we've done so. In general, researchers prefer probabilistic or random sampling methods over nonprobabilistic ones, and consider them to be more accurate and rigorous. However, in applied social research there may be circumstances where it is not feasible, practical or theoretically sensible to do random sampling. Here, we consider a wide range of nonprobabilistic alternatives.


Sampling Distribution

The sampling distribution of the mean is a very important distribution. The standard deviation of the sampling distribution of a statistic is referred to as the standard error of that quantity. For the case where the statistic is the sample mean, the standard error is:

\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}
where σ is the standard deviation of the population distribution of that quantity and n is the size (number of items) in the sample.
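
For instance, with a hypothetical population standard deviation of 15 and a sample of 25 items, the standard error of the mean is 15 / sqrt(25) = 3. In Python:

    import math

    sigma, n = 15, 25
    print(sigma / math.sqrt(n))  # standard error of the mean = 3.0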