Research Design in Occupational Education Copyright 1997. James P. Key. Oklahoma State University Except for those materials which are supplied by different departments of the University (ex. IRB, Thesis Handbook) and references used by permission.

MODULE S2

CORRELATION

Statistical correlation refers to a quantifiable relationship between two variables. Furthermore, it is a measure of the strength and direction of that relationship. Two measures for each subject (or object) in the group are required.

Pearson Product Moment Correlation

Negative Correlation

  x    y    x - x̄    y - ȳ    (x - x̄)²    (y - ȳ)²    (x - x̄)(y - ȳ)
  1    5     -2        2         4           4             -4
  2    4     -1        1         1           1             -1
  3    3      0        0         0           0              0
  4    2      1       -1         1           1             -1
  5    1      2       -2         4           4             -4
 15   15      0        0        10          10            -10   (Sum)

[Figure: scatter plot of y against x]

Positive Correlation

  x    y    x - x̄    y - ȳ    (x - x̄)²    (y - ȳ)²    (x - x̄)(y - ȳ)
  1    1     -2       -2         4           4              4
  2    2     -1       -1         1           1              1
  3    3      0        0         0           0              0
  4    4      1        1         1           1              1
  5    5      2        2         4           4              4
 15   15      0        0        10          10             10   (Sum)

[Figure: scatter plot of y against x]

Zero Correlation

  x    y    x - x̄    y - ȳ    (x - x̄)²    (y - ȳ)²    (x - x̄)(y - ȳ)
  1    2     -2       -1         4           1              2
  2    5     -1        2         1           4             -2
  3    3      0        0         0           0              0
  4    1      1       -2         1           4             -2
  5    4      2        1         4           1              2
 15   15      0        0        10          10              0   (Sum)

[Figure: scatter plot of y against x]

Moderate Positive Correlation

  x    y    x - x̄    y - ȳ    (x - x̄)²    (y - ȳ)²    (x - x̄)(y - ȳ)
  1    2     -2       -1         4           1              2
  2    1     -1       -2         1           4              2
  3    3      0        0         0           0              0
  4    5      1        2         1           4              2
  5    4      2        1         4           1              2
 15   15      0        0        10          10              8   (Sum)

[Figure: scatter plot of y against x]
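The four example tables all follow the same deviation-score arithmetic. As a minimal sketch, assuming the usual deviation-score form of the Pearson formula (the sum of cross products divided by the square root of the product of the two sums of squares), the function below reproduces the coefficients for the four data sets. The name pearson_r is chosen here for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from deviation scores, mirroring the table columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]                   # x - mean of x
    dy = [y - mean_y for y in ys]                   # y - mean of y
    sum_dx2 = sum(d * d for d in dx)                # sum of squared x deviations
    sum_dy2 = sum(d * d for d in dy)                # sum of squared y deviations
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))   # sum of cross products
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)

x = [1, 2, 3, 4, 5]
examples = {
    "negative": [5, 4, 3, 2, 1],
    "positive": [1, 2, 3, 4, 5],
    "zero": [2, 5, 3, 1, 4],
    "moderate positive": [2, 1, 3, 5, 4],
}
for label, y in examples.items():
    print(f"{label}: r = {pearson_r(x, y):.2f}")
```

With these data the coefficients come out to -1.00, 1.00, 0.00, and 0.80, which are the limits and the moderate case the section headings describe.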

Correlation

Pounds of Nitrogen (x) and Bushels of Corn (y)

   x    x - x̄    (x - x̄)²     y    y - ȳ    (y - ȳ)²    (x - x̄)(y - ȳ)
  10     -40       1600       30     -20       400           800
  20     -30        900       40     -10       100           300
  50       0          0       50       0         0             0
  70      20        400       60      10       100           200
 100      50       2500       70      20       400          1000
 250       0       5400      250       0      1000          2300   (Sum)
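Substituting the column sums for the nitrogen and corn data into the deviation-score form of the Pearson formula (assumed here, since the formula itself is not reproduced in the text) gives the coefficient used in the hypothesis test:

```latex
r = \frac{\sum (x-\bar{x})(y-\bar{y})}
         {\sqrt{\sum (x-\bar{x})^{2}\,\sum (y-\bar{y})^{2}}}
  = \frac{2300}{\sqrt{(5400)(1000)}}
  = \frac{2300}{2323.79}
  \approx .99
```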

Steps for Hypothesis Testing

1. State the null hypothesis.

Ho: There is no significant relationship between the pounds of nitrogen (x) applied to a crop of corn and the yield in bushels of corn (y).

2. Choose a significance level based on confidence sought, typically .05 or .01.

3. Calculate the degrees of freedom (df), which equal the number of pairs of measures minus two. Our example has five pairs, so df = 5 - 2 = 3. Enter the table of critical values of the correlation coefficient with the df down the left side and the significance level (.05) across the top; the table value is .878.

4. Compare calculated value to table value.

If the calculated value is equal to or greater than the table value, we reject the null hypothesis.

If the calculated value is less than the table value, we fail to reject (retain) the null hypothesis.

calculated value >= table value: reject null

calculated value < table value: fail to reject null

5. Reject or fail to reject the null hypothesis.

Therefore, in our example we compare the calculated value (.99) to the table value (.878) and find the calculated value to be greater. We reject the null hypothesis and conclude that the relationship is statistically significant at the .05 level. If nitrogen and yield were truly unrelated, a correlation of this magnitude would result from chance factors less than five percent of the time.
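The five steps can be sketched as a table lookup followed by a comparison. This is only an illustration: the function name significance_decision and the dictionary CRITICAL_R_05 are hypothetical, the df = 3 entry (.878) is the one given in the text, and the remaining entries are assumed from a standard two-tailed table of critical values and should be checked against the table you are using.

```python
# Critical values of r at the .05 level (two-tailed), keyed by degrees of freedom.
# Only the df = 3 value comes from the text; the others are assumed table values.
CRITICAL_R_05 = {1: 0.997, 2: 0.950, 3: 0.878, 4: 0.811, 5: 0.754}

def significance_decision(r, n_pairs, table=CRITICAL_R_05):
    """Return (df, table value, decision) for a calculated correlation r."""
    df = n_pairs - 2                      # step 3: number of pairs minus two
    critical = table[df]                  # step 3: look up the table value
    # Assumption: the size of r is compared, so a strong negative r also rejects.
    if abs(r) >= critical:                # step 4: compare calculated to table value
        decision = "reject null"
    else:
        decision = "fail to reject null"
    return df, critical, decision         # step 5: the decision

df, critical, decision = significance_decision(r=0.99, n_pairs=5)
print(df, critical, decision)
```

For the nitrogen and corn example this gives df = 3, a table value of .878, and the decision to reject the null hypothesis.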

Assumptions: (Same for Correlation and Regression)

1. Representative Sample (Random)

2. Normal Population

3. Interval Measures

4. Linearity (Measures approximate a straight line)

5. Homoscedasticity (Equal variances)

According to Popham (1973, p. 80), "multiple correlation describes the degree of relationship between a variable and two or more variables considered simultaneously. . . . partial correlation allows the statistician to describe the relationship between two variables after controlling or partialing out the confounding relationship of another variable(s)."
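Both ideas can be expressed in terms of ordinary Pearson coefficients. The sketch below uses the standard textbook formulas for a first-order partial correlation and for a multiple correlation with two predictors; these formulas are not quoted from Popham, and the function names and input values are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y after partialing out a third variable z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

def multiple_r(r_y1, r_y2, r_12):
    """Correlation of y with two predictors x1 and x2 considered together."""
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return sqrt(r_squared)

# Hypothetical coefficients, used only to illustrate the calculations.
print(round(partial_r(0.5, 0.5, 0.5), 3))   # x-y relationship with z controlled
print(round(multiple_r(0.5, 0.5, 0.0), 3))  # y versus two uncorrelated predictors
```

In the first call the x-y correlation of .50 shrinks to about .33 once the shared influence of z is removed; in the second, two uncorrelated predictors that each correlate .50 with y combine to a multiple correlation of about .71.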

Four Special Correlation Methods and Variable Relationships Assessed

 Method                              Variable Relationship
 Point biserial coefficient (rpb)    Continuous versus dichotomous
 Biserial coefficient (rb)           Continuous versus dichotomized
 Phi coefficient (φ)                 Dichotomous versus dichotomous
 Tetrachoric coefficient (rt)        Dichotomized versus dichotomized
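Two of these methods reduce to simple arithmetic and can be sketched briefly. The functions below use the standard textbook formulas for the phi and point biserial coefficients, with hypothetical names and data; the biserial and tetrachoric coefficients additionally assume an underlying normal distribution and are not shown.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts laid out as [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def point_biserial(scores, groups):
    """Point biserial r: continuous scores versus a 0/1 group code."""
    n = len(scores)
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    p = len(ones) / n                     # proportion coded 1
    q = len(zeros) / n                    # proportion coded 0
    mean = sum(scores) / n
    sd = sqrt(sum((s - mean) ** 2 for s in scores) / n)   # population SD of all scores
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * sqrt(p * q)

# Hypothetical data for illustration only.
print(round(phi_coefficient(10, 0, 0, 10), 3))
print(round(point_biserial([1, 2, 3, 4], [0, 0, 1, 1]), 3))
```

The point biserial value is the same number the Pearson formula gives when the dichotomous variable is coded 0 and 1.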

The correlation ratio (eta) is the method to use when the relationship is non-linear.
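As a minimal sketch, assuming the usual definition of eta as the square root of the between-group sum of squares divided by the total sum of squares, with y grouped by each distinct value of x, the calculation might look like the following. The function name and the data are hypothetical.

```python
from math import sqrt
from collections import defaultdict

def correlation_ratio(xs, ys):
    """Eta of y on x: sqrt(between-group SS / total SS), grouping y by x value."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    grand_mean = sum(ys) / len(ys)
    ss_total = sum((y - grand_mean) ** 2 for y in ys)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return sqrt(ss_between / ss_total)

# Hypothetical curved data: y rises and then falls as x increases.
x = [1, 1, 2, 2, 3, 3]
y = [2, 4, 8, 10, 2, 4]
print(round(correlation_ratio(x, y), 3))
```

For these data the Pearson coefficient is zero because the rise and fall cancel out, while eta is about .94, which is why a linear coefficient alone can miss a strong curved relationship.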

SELF ASSESSMENT

1. Define correlation.

2. State the number of measures required per subject (or object) in the group.

3. Illustrate graphically the positive, negative, and neutral limits of the correlation coefficient as well as a moderate positive relationship.

4. Calculate the Pearson Product Moment Correlation coefficient for:

 Student    Hours    Competency Rating
 A          3        5
 B          4        8
 C          2        5
 D          5        9
 E          3        6
 F          1        2
 G          2        5
 H          3        5
 I          3        6
 J          4        9
 K          4        8
 L          3        5
 M          3        6
 N          4        9
 O          5        9
 P          3        6
 Q          3        5
 R          2        5
 S          2        5
 T          1        2

5. State the null hypothesis for the above relationship.

6. Test the statistical significance of the calculated correlation coefficient and state the effect on the null hypothesis for the above relationship.

7. State the assumptions required for Pearson Product Moment Correlation.

8. Describe multiple correlation.

9. Describe partial correlation.

10. Name four special correlation methods and variable relationships assessed.

11. Name the correlation method used when the relationship is non-linear.