- The independent variable is the nature of the parents' relationship. One possible operational definition would be to record whether or not the parents are divorced; this would make "parents' relationship" a qualitative variable. The dependent variable would be attitude towards marriage, which could be measured on a 10-point scale; this would make attitude a quantitative variable. Because nothing is manipulated, the study is observational rather than a true experiment.
- a) Low arousal: Mean = 5.1; Median = 5; Mode = 5; s² = 2.32; s = 1.52
  Med arousal: Mean = 17; Median = 17.5; Mode = 18; s² = 4.0; s = 2.0
  High arousal: Mean = 7.7; Median = 8; Mode = 8; s² = 1.57; s = 1.25
  b) The data suggest that a moderate level of arousal is optimal. This idea has been borne out by many experiments and is often referred to as the Yerkes-Dodson law.
- These data suggest that all three distributions are symmetrical (or at least pretty darn close).
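If you want to check descriptive statistics like these by machine, here is a minimal Python sketch using the standard library's statistics module. The scores in the list are hypothetical placeholders (the raw data are not reproduced here); substitute the actual scores from the problem.

```python
import statistics

# Hypothetical raw scores for one group; substitute the actual data from the problem.
low_arousal = [5, 3, 6, 5, 7, 4, 5, 6, 5, 5]

print("Mean:", statistics.mean(low_arousal))
print("Median:", statistics.median(low_arousal))
print("Mode:", statistics.mode(low_arousal))
print("Variance (s^2):", statistics.variance(low_arousal))  # sample variance, n - 1 in the denominator
print("SD (s):", statistics.stdev(low_arousal))             # sample standard deviation
```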
a) Area in tail = .0582
b) Area in tail = .0582
c) Area in body (z = 0.24): .5948; area in tail (z = 1.98): .0239; .5948 - .0239 = .5709
d) Area in tail (z = 0.24): .4052; area in tail (z = 1.98): .0239; .4052 - .0239 = .3813
z = (x - mean) / sd = (8 - 12) / 2.5 = -4 / 2.5 = -1.6
Area in tail: .0548
Therefore, you are a relatively industrious individual, as only about 5.5% of college students procrastinate less than you.
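If you prefer to let the computer look up the normal-curve areas instead of the unit-normal table, scipy's norm object does the same job. Parts c) and d) appear to be areas between two z-scores, which is how this sketch treats them; the variable names are mine.

```python
from scipy.stats import norm

# z-score for the procrastination example: x = 8, mu = 12, sigma = 2.5
z = (8 - 12) / 2.5            # -1.6
tail = norm.cdf(z)            # area below z = -1.6, about .0548

# Part (c) read as the area between z = -1.98 and z = 0.24
between_c = norm.cdf(0.24) - norm.cdf(-1.98)   # about .5709
# Part (d) read as the area between z = 0.24 and z = 1.98
between_d = norm.cdf(1.98) - norm.cdf(0.24)    # about .3813

print(z, round(tail, 4), round(between_c, 4), round(between_d, 4))
```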
1. Researchers have developed a filament that should add to the life expectancy of light bulbs. The standard 60-watt bulb burns for an average of μ = 750 hours with σ = 20. A sample of 100 bulbs is prepared using the new filament. The average life for this sample is 820 hours.
- 80% CI = 820 +/- [1.28 * (20/sqrt(100))] = 820 +/- 2.56 = [817.44, 822.56]
- 99% CI = 820 +/- [2.58 * (20/sqrt(100))] = 820 +/- 5.16 = [814.84, 825.16]
2. A survey of students produced the following data regarding their age when they first consumed an alcoholic drink: 11, 13, 14, 12, and 10.
- 95% CI = 12 +/- [2.776 * (1.58/sqrt(5))] = 12 +/- 1.96 = [10.04, 13.96]
- 99% CI = 12 +/- [4.604 * (1.58/sqrt(5))] = 12 +/- 3.25 = [8.75, 15.25]
- The confidence interval gets wider as the level of confidence increases.
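Here is a minimal scipy sketch of both intervals: a z-based interval for problem 1 (σ known) and a t-based interval for problem 2 (σ estimated from the sample). Small rounding differences from the hand calculations above (which use s = 1.58) are expected; the variable names are mine.

```python
import math
from scipy.stats import norm, t

# Problem 1: sigma is known, so a z-based interval.
mean, sigma, n = 820, 20, 100
for conf in (0.80, 0.99):
    z_crit = norm.ppf(1 - (1 - conf) / 2)          # 1.28 for 80%, 2.58 for 99%
    half = z_crit * sigma / math.sqrt(n)
    print(conf, (round(mean - half, 2), round(mean + half, 2)))

# Problem 2: sigma unknown, so a t-based interval with df = n - 1.
ages = [11, 13, 14, 12, 10]
m = sum(ages) / len(ages)                                           # 12
s = math.sqrt(sum((x - m) ** 2 for x in ages) / (len(ages) - 1))    # about 1.58
for conf in (0.95, 0.99):
    t_crit = t.ppf(1 - (1 - conf) / 2, df=len(ages) - 1)            # 2.776, 4.604
    half = t_crit * s / math.sqrt(len(ages))
    print(conf, (round(m - half, 2), round(m + half, 2)))
```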
- H0: μ = 1250; Ha: μ ≠ 1250
  RR: t < -2.00 or t > 2.00 (df = 60, α = .05)
  tobs = (M - μ0) / [s/sqrt(n)] = (1284 - 1250) / [90/sqrt(61)] = 2.95
  tobs falls in the rejection region. Therefore, we would REJECT the null. Interpretation: Athletes are drawn from a population with a higher mean SAT score than the general student body.
- H0: μ = 1; Ha: μ ≠ 1
  RR: t < -2.776 or t > 2.776
  tobs = (M - μ0) / [s/sqrt(n)] = (1.34 - 1) / [.5/sqrt(5)] = 1.52
  tobs does not fall in the rejection region. Therefore, we would FAIL TO REJECT the null. Interpretation: I should probably stop taking the drug because there is no evidence that nandrolone affects my publication rate (and there might be some negative side effects).
- H0: μ = 2.6; Ha: μ ≠ 2.6
  RR: t < -2.445 or t > 2.445
  M = 4
  s = 2.08
  tobs = (M - μ0) / [s/sqrt(n)] = (4 - 2.6) / [2.08/sqrt(7)] = 1.78
  tobs does not fall in the rejection region. Therefore, we would FAIL TO REJECT the null hypothesis. Interpretation: I do not remember more or fewer dreams than the average person according to Susan Boyle's book.
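All three of these are one-sample, two-tailed t tests, so one helper covers them. The sketch below runs the SAT and dream-recall versions from the summary statistics given above; α = .05 is assumed, and the function name is mine.

```python
import math
from scipy.stats import t

def one_sample_t(mean, mu0, s, n, alpha=0.05):
    """Two-tailed one-sample t test from summary statistics."""
    t_obs = (mean - mu0) / (s / math.sqrt(n))
    t_crit = t.ppf(1 - alpha / 2, df=n - 1)
    return round(t_obs, 2), round(t_crit, 3), abs(t_obs) > t_crit

# SAT problem: M = 1284, mu0 = 1250, s = 90, n = 61
print(one_sample_t(1284, 1250, 90, 61))   # t about 2.95 -> reject

# Dream-recall problem: M = 4, mu0 = 2.6, s = 2.08, n = 7
print(one_sample_t(4, 2.6, 2.08, 7))      # t about 1.78 -> fail to reject
```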
- H0: μHP = μE; Ha: μHP ≠ μE; tcrit = ±1.98 (use the table value for df = 120)
  SE = sqrt[(50²/100) + (75²/100)] = 9.01
  t = (165 - 150) / 9.01 = 1.66
  tobs is less than tcrit. Therefore, we would fail to reject the null.
  Interpretation: there is no evidence that either printer is faster than the other.
  1a) For a one-tailed test, tcrit = 1.658; 1.66 is greater than 1.658, so we would reject the null.
- 95% CI = 15 +/- 1.98 * 9.01 = [-2.84, 32.84]. Yes, it makes sense. We failed to reject the null, which is consistent with the fact that the interval includes the value 0.
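A quick check of the two-sample arithmetic. Which mean belongs to which printer is not spelled out above, so the labels in the comments are only assumptions; the df = 120 row is kept to match the table value used in the solution.

```python
import math
from scipy.stats import t

# Two-sample t test from summary statistics (equal n), as in the printer problem.
m1, s1, n1 = 165, 50, 100    # first printer ("HP" label assumed from the hypotheses)
m2, s2, n2 = 150, 75, 100    # second printer ("E")

se = math.sqrt(s1**2 / n1 + s2**2 / n2)     # about 9.01
t_obs = (m1 - m2) / se                      # about 1.66

t_crit = t.ppf(0.975, df=120)               # about 1.98, matching the table row used above

ci = ((m1 - m2) - t_crit * se, (m1 - m2) + t_crit * se)   # roughly (-2.84, 32.84)
print(round(t_obs, 2), round(t_crit, 2), tuple(round(v, 2) for v in ci))
```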
- tobs = (69.4 - 66.2) / (12.2/sqrt(120)) = 2.87, which exceeds tcrit (2.66).
  Therefore, you would reject the null and conclude that the jingle improved the consumers' moods.
- 99% CI = 3.2 +/- 2.66 * (12.2/sqrt(120)) = 3.2 +/- 2.66 * 1.11 = 3.2 +/- 2.95 = [.25, 6.15]
  In other words, we would expect that the average person's mood would increase by between approximately .25 and 6.15 points after listening to the jingle.
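The same kind of check for the jingle problem. The sketch keeps the 2.66 cutoff used above for the interval and also prints scipy's exact two-tailed .01 value for df = 119 for comparison; variable names are mine.

```python
import math
from scipy.stats import t

# Jingle problem: mean mood 69.4 after vs. 66.2 baseline, s = 12.2, n = 120.
diff, s, n = 69.4 - 66.2, 12.2, 120
se = s / math.sqrt(n)                       # about 1.11
t_obs = diff / se                           # about 2.87

t_crit = 2.66                               # cutoff used in the solution above
print(round(t.ppf(0.995, df=n - 1), 3))     # scipy's exact cutoff for df = 119, about 2.618

ci_99 = (diff - t_crit * se, diff + t_crit * se)   # roughly (0.25, 6.15)
print(round(t_obs, 2), tuple(round(v, 2) for v in ci_99))
```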
| East | West | Same |
| --- | --- | --- |
| 2 | 6 | 1 |
| 1 | 4 | 0 |
| 3 | 6 | 1 |
| 3 | 8 | 1 |
| 2 | 5 | 0 |
| 4 | 7 | 0 |

|  | East | West | Same | Sum |
| --- | --- | --- | --- | --- |
| Sum(x) | 15 | 36 | 3 | 54 |
| Sum(x²) | 43 | 226 | 3 | 272 |
| T²/n | 37.5 | 216 | 1.5 | 255 |
| Sum(x²) - T²/n | 5.5 | 10 | 1.5 | 17 |

| Source | SS | df | MS | F |
| --- | --- | --- | --- | --- |
| Between | 93 | 2 | 46.5 | 41.03 |
| Within | 17 | 15 | 1.13 |  |
| Total | 110 | 17 |  |  |
The observed value of F exceeds the critical value. Therefore, we would reject the null and conclude that direction of travel does influence the severity of jet lag.
- Tukey's HSD = q * sqrt(MSE/n) = 3.67 * sqrt(1.13/6) = 3.67 * .434 = 1.59
Based on Tukey's HSD, I would conclude that traveling West is more difficult than either traveling East or staying within the same time zone. Traveling East is also more difficult than staying within the same time zone.
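Here is the same sums-of-squares bookkeeping in numpy, with the raw scores read off the table above. The q = 3.67 used for the HSD is the studentized-range table value from the solution, not something computed here, and the variable names are mine.

```python
import numpy as np
from scipy.stats import f

# Jet-lag data from the table above.
groups = {
    "East": [2, 1, 3, 3, 2, 4],
    "West": [6, 4, 6, 8, 5, 7],
    "Same": [1, 0, 1, 1, 0, 0],
}

all_scores = np.concatenate(list(groups.values()))
N, k = all_scores.size, len(groups)
n = N // k

correction = all_scores.sum() ** 2 / N                                  # G^2 / N = 162
ss_total = np.sum(all_scores ** 2) - correction                         # 110
ss_between = sum(np.sum(g) ** 2 / len(g) for g in groups.values()) - correction   # 93
ss_within = ss_total - ss_between                                       # 17

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
F_obs = ms_between / ms_within                                          # about 41.03
print(round(F_obs, 2), round(f.ppf(0.95, k - 1, N - k), 2))             # F and its .05 critical value

# Tukey's HSD with the table value q = 3.67 for k = 3, df = 15.
hsd = 3.67 * np.sqrt(ms_within / n)                                     # about 1.59
means = {name: np.mean(g) for name, g in groups.items()}
print(round(hsd, 2), means)
```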
- In a repeated measures experiment, the same units
(people) serve as observations (subjects) for each level of the
independent variable. In a between subjects experiment, each
level of the independent variable consists of a unique set of
observations.
- The difference between RM- and BS-ANOVA relates to the
error term. In a BS-ANOVA, the error term consists of all of
the variability within each level of the independent variable.
In a RM-ANOVA, the variability within each level of the IV is
derived from two sources: chance variation (error), and differences
between the subjects that comprise the sample. Thus, we can reduce
the error term in our F-ratio by partitioning out the variability
due to individual differences. Reducing the error term in our
F-ratio increases the value of F, which increases the chance
that we will reject the null hypothesis. Because reducing the
error term increases the probability of rejecting the null without
affecting alpha, we say that our test is more powerful.
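To make the partitioning concrete, here is a minimal numpy sketch with hypothetical repeated-measures scores (the numbers are made up purely for illustration). It shows how the within-treatment variability splits into a subjects piece plus residual error, and how removing the subject variability shrinks the error term and inflates F.

```python
import numpy as np

# Hypothetical repeated-measures data: rows = subjects, columns = levels of the IV.
scores = np.array([
    [3, 5, 7],
    [2, 4, 8],
    [4, 6, 9],
    [1, 3, 6],
], dtype=float)
n_subj, k = scores.shape
grand = scores.mean()

ss_treat = n_subj * np.sum((scores.mean(axis=0) - grand) ** 2)    # between-treatment SS
ss_within = np.sum((scores - scores.mean(axis=0)) ** 2)           # BS-ANOVA error term
ss_subjects = k * np.sum((scores.mean(axis=1) - grand) ** 2)      # individual differences
ss_error_rm = ss_within - ss_subjects                             # RM-ANOVA error term

f_bs = (ss_treat / (k - 1)) / (ss_within / (k * (n_subj - 1)))
f_rm = (ss_treat / (k - 1)) / (ss_error_rm / ((k - 1) * (n_subj - 1)))
print(round(f_bs, 2), round(f_rm, 2))   # the RM F is larger because SS_subjects has been removed
```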
Omnibus Test

| Source | df | SS | MS | F | p-value |
| --- | --- | --- | --- | --- | --- |
| Model | 8 | 144 | 18.00 | 3.6 | .0372 |
| Error | 99 | 495 | 5.00 |  |  |
| Total | 107 | 639 |  |  |  |

Individual Effects

| Source | df | SS | MS | F | p-value |
| --- | --- | --- | --- | --- | --- |
| A | 2 | 28.0 | 14.0 | 2.8 | .0496 |
| B | 2 | 16.0 | 8.0 | 1.6 | .3428 |
| AxB | 4 | 100 | 25.0 | 5.0 | .0215 |
- a) See above.
  b) 3 (b/c dfA = 2)
  c) 3 (b/c dfB = 2)
  d) 108 (b/c dfTotal = 107, so N = 107 + 1 = 108)
  e) 12 (N / number of treatments = 108 / 9)
  f) A is significant; B is not significant; the interaction is significant.
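If it helps, the degrees-of-freedom bookkeeping behind answers b) through e) can be spelled out in a few lines of Python; the variable names are mine.

```python
# Degrees-of-freedom bookkeeping for the A x B factorial design above.
df_A, df_B = 2, 2
levels_A, levels_B = df_A + 1, df_B + 1      # 3 levels each (answers b and c)
df_AxB = df_A * df_B                         # 4
df_model = df_A + df_B + df_AxB              # 8, matching the omnibus table
df_error = 99
df_total = df_model + df_error               # 107

N = df_total + 1                             # 108 subjects (answer d)
n_per_cell = N // (levels_A * levels_B)      # 108 / 9 = 12 per cell (answer e)
print(levels_A, levels_B, N, n_per_cell)
```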
|  | Old AM | Old PM | Young AM | Young PM | Total |
| --- | --- | --- | --- | --- | --- |
| Mean | 4 | 2 | 4 | 9 |  |
| Sum(x) | 24 | 12 | 24 | 54 | 114 |
| Sum(x²) | 110 | 28 | 118 | 498 | 754 |
| SS | 14 | 4 | 22 | 12 | 52 |
| G²/N |  |  |  |  | 541.5 |
| T²/n | 96 | 24 | 96 | 486 | 702 |
| Sum(x²) - T²/n | 14 | 4 | 22 | 12 | 52 |

| Source | SS | df | MS | F | Fcrit |
| --- | --- | --- | --- | --- | --- |
| Model | 160.5 | 3 | 53.5 | 20.58 | 3.10 |
| Error | 52 | 20 | 2.6 |  |  |
| Total | 212.5 | 23 |  |  |  |
Because the observed value of F exceeds the critical value, we conclude that at least one of our treatment means differs from the others (F(3, 20) = 20.58, MSE = 2.6). Now, on to test the main and interaction effects.
Marginal T²/n values:

|  | Older | Younger | AM | PM | Total |
| --- | --- | --- | --- | --- | --- |
| Age | 108 | 507 |  |  | 615 |
| Test Time |  |  | 192 | 363 | 555 |

| Source | SS | df | MS | F | Fcrit |
| --- | --- | --- | --- | --- | --- |
| Age | 73.5 | 1 | 73.5 | 28.27 | 4.35 |
| Test Time | 13.5 | 1 | 13.5 | 5.19 |  |
| Age x Test Time | 73.5 | 1 | 73.5 | 28.27 |  |
The main effect of Age was significant (F(1, 20) = 28.27, MSE = 2.6). Younger adults performed significantly better than older adults. The main effect of Test Time was also significant (F(1, 20) = 5.19, MSE = 2.6). The subjects performed better in the afternoon than in the morning. Finally, the interaction effect was also significant (F(1, 20) = 28.27, MSE = 2.6). Whereas older adults tended to perform better in the morning, younger adults tended to perform better in the afternoon. Moreover, older adults performed just as well as younger adults in the morning, but performed much worse than younger adults when tested in the afternoon.
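For anyone who wants to check the main and interaction effects by machine, here is a sketch that rebuilds the same SS values from the cell totals given above (no raw scores are needed); the variable names are mine.

```python
from scipy.stats import f

# Summary numbers from the tables above (n = 6 observations per cell).
n, N = 6, 24
cell_T = {("Old", "AM"): 24, ("Old", "PM"): 12, ("Young", "AM"): 24, ("Young", "PM"): 54}
sum_x2, G = 754, 114

correction = G**2 / N                                             # 541.5
ss_total = sum_x2 - correction                                    # 212.5
ss_model = sum(T**2 / n for T in cell_T.values()) - correction    # 160.5
ss_error = ss_total - ss_model                                    # 52

# Marginal totals for the main effects (12 scores per margin).
T_old = cell_T[("Old", "AM")] + cell_T[("Old", "PM")]             # 36
T_young = cell_T[("Young", "AM")] + cell_T[("Young", "PM")]       # 78
T_am = cell_T[("Old", "AM")] + cell_T[("Young", "AM")]            # 48
T_pm = cell_T[("Old", "PM")] + cell_T[("Young", "PM")]            # 66

ss_age = (T_old**2 + T_young**2) / 12 - correction                # 73.5
ss_time = (T_am**2 + T_pm**2) / 12 - correction                   # 13.5
ss_interaction = ss_model - ss_age - ss_time                      # 73.5

mse = ss_error / (N - 4)                                          # 2.6, df = 20
for name, ss in [("Age", ss_age), ("Test Time", ss_time), ("Age x Time", ss_interaction)]:
    print(name, round(ss / mse, 2), round(f.ppf(0.95, 1, N - 4), 2))   # each effect has df = 1
```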
- Calculate the regression equation.
  SP = Σxy - [(Σx)(Σy)/n] = 15,013 - [(109.5 * 11,087)/77] = 15,013 - 15,766.58 = -753.58
  SSx = Σx² - [(Σx)²/n] = 246.25 - (109.5²/77) = 246.25 - 155.72 = 90.53
  b1 = SP / SSx = -753.58 / 90.53 = -8.32
  b0 = My - b1(Mx) = (11,087/77) - (-8.32)(109.5/77) = 143.98 - (-8.32)(1.42) = 155.82
  y = 155.82 - 8.32x
- The slope tells us the change in y for a unit change in
x. Therefore, if you ate one additional serving of vegetables,
I would expect you to lose 8.32 pounds.
- Use the regression equation with 3 substituted for x.
  y = 155.82 - 8.32x
  y = 155.82 - 8.32(3)
  y = 155.82 - 24.96
  y = 130.86
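A quick Python check of the slope, intercept, and the prediction at x = 3, working from the summary statistics used above (Σx = 109.5, Σy = 11,087, Σxy = 15,013, Σx² = 246.25, n = 77); variable names are mine.

```python
# Least-squares slope and intercept from the summary statistics above.
n = 77
sum_x, sum_y = 109.5, 11087
sum_xy, sum_x2 = 15013, 246.25

sp = sum_xy - sum_x * sum_y / n       # about -753.58
ss_x = sum_x2 - sum_x**2 / n          # about 90.53

b1 = sp / ss_x                        # about -8.32
b0 = sum_y / n - b1 * (sum_x / n)     # about 155.82

predicted = b0 + b1 * 3               # predicted weight at 3 servings, about 130.9
print(round(b1, 2), round(b0, 2), round(predicted, 2))
```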
- Performing the hypothesis test.
  SSy = Σy² - [(Σy)²/n] = 1,666,795 - (11,087²/77) = 1,666,795 - 1,596,384.01 = 70,410.99
  SSE = SSy - b1(SP) = 70,410.99 - (-8.32 * -753.58) = 70,410.99 - 6,269.79 = 64,141.20
  s² = MSE = SSE/(n - 2) = 64,141.20 / (77 - 2) = 855.22
  s = sqrt(855.22) = 29.24
  t = (b1 - 0) / [s/sqrt(SSx)] = -8.32 / (29.24/sqrt(90.53)) = -8.32 / (29.24/9.515) = -8.32 / 3.073 = -2.71
  Reject the null. We conclude that there is a significant relationship between weight and vegetable consumption.
- Calculating the correlation coefficient.
  SSy = Σy² - [(Σy)²/n] = 1,666,795 - (11,087²/77) = 1,666,795 - 1,596,384.01 = 70,410.99
  r = SP / sqrt(SSx * SSy) = -753.58 / sqrt(90.53 * 70,410.99) = -753.58 / sqrt(6,374,306.65) = -.30
  r² = (-.30)² = .09
  Although vegetable consumption does help us predict weight, it can only explain about 9% of the variance in our dependent measure. Clearly, other factors are involved.
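And a sketch that reproduces the slope test and the correlation from the same summary numbers (rounding will differ slightly from the hand calculations above, which carry rounded intermediate values); variable names are mine.

```python
import math

# Continuing from the summary statistics used above.
n = 77
sp, ss_x = -753.58, 90.53
sum_y, sum_y2 = 11087, 1666795

ss_y = sum_y2 - sum_y**2 / n            # about 70,411
b1 = sp / ss_x                          # about -8.32

sse = ss_y - b1 * sp                    # about 64,141
mse = sse / (n - 2)                     # about 855
t_obs = b1 / (math.sqrt(mse) / math.sqrt(ss_x))   # about -2.71

r = sp / math.sqrt(ss_x * ss_y)         # about -.30
print(round(t_obs, 2), round(r, 3), round(r**2, 3))
```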
- y = 6.967 + .596(x)
- .596 hours later.
- y = 14.119 (approximately 2:00 AM)
- tobs = 5.212; because my p-value is less than .05, I would reject the null and conclude that weekday bedtime is a significant predictor of weekend bedtime.
- r = .53; therefore, r² = .281. This suggests that we can explain almost 30% of the variance in weekend bedtimes if we know weekday bedtimes. That is a reasonably high value for "real" data.
Solution

| x (RT) | y (errors) | x² | y² | x·y |
| --- | --- | --- | --- | --- |
| 184 | 10 | 33856 | 100 | 1840 |
| 213 | 6 | 45369 | 36 | 1278 |
| 234 | 2 | 54756 | 4 | 468 |
| 197 | 7 | 38809 | 49 | 1379 |
| 189 | 13 | 35721 | 169 | 2457 |
| 221 | 10 | 48841 | 100 | 2210 |
| 237 | 4 | 56169 | 16 | 948 |
| 192 | 9 | 36864 | 81 | 1728 |
| Σ = 1667 | Σ = 61 | Σ = 350385 | Σ = 555 | Σ = 12308 |
- SSx = 350385 - (1667²/8) = 3023.875
- SSy = 555 - (61²/8) = 89.875
- SP = 12308 - (1667 * 61/8) = -402.875
- r = SP / sqrt(SSx * SSy) = -402.875 / sqrt(3023.875 * 89.875) = -.7728
- Thus, there is a strong negative relationship between the speed of response and accuracy. In other words, faster responses tend to produce more errors.
- 2) Yes. The correlation coefficient is significant: r(6) = -.773, p = .025.
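Since the raw scores are in the table, scipy can confirm both r and the p-value directly; this is just a check on the hand calculation above, and the variable names are mine.

```python
from scipy.stats import pearsonr

# Reaction-time and error data from the table above.
rt     = [184, 213, 234, 197, 189, 221, 237, 192]
errors = [10, 6, 2, 7, 13, 10, 4, 9]

r, p = pearsonr(rt, errors)
print(round(r, 3), round(p, 3))   # roughly r = -.773, p = .025
```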
- Yes, the overall model will help SMS predict success in the fraternity. This conclusion is based on the fact that the omnibus ANOVA leads us to reject the null (F(6, 42) = 6.00, MSE = 5.00, p = .0125).
- R² = 180/360 = .50
- y = 22 - .16(GPA) + 2.17(Math-GPA) + 4.81(Protect) + 6.03(Trek) + .88(T-shirts) - 3.25(Friends)
- y = 22 - .16(3.20) + 2.17(3.80) + 4.81(2) + 6.03(2) + .88(12) - 3.25(2)
  y = 22 - .512 + 8.246 + 9.62 + 12.06 + 10.56 - 6.5
  y = 55.474
- Therefore, we would expect your brother to be admitted to the fraternity.
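A tiny sketch of the plug-in prediction, with the predictor labels and coefficients taken from the equation above (nothing is fit here; the dictionary names are only labels).

```python
# Evaluating the fitted multiple-regression equation for the brother's profile.
coefs = {"intercept": 22, "GPA": -0.16, "Math-GPA": 2.17, "Protect": 4.81,
         "Trek": 6.03, "T-shirts": 0.88, "Friends": -3.25}
profile = {"GPA": 3.20, "Math-GPA": 3.80, "Protect": 2, "Trek": 2,
           "T-shirts": 12, "Friends": 2}

# Predicted score = intercept plus the sum of coefficient * predictor value.
y_hat = coefs["intercept"] + sum(coefs[k] * v for k, v in profile.items())
print(round(y_hat, 3))   # about 55.474
```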