EPSY6210_20100208

Page history last edited by Starr Hoffman 14 years, 2 months ago

EPSY6210

02.08.2010

Becoming a Behavioral Science Researcher (another helpful title)

Research Methods in Anthropology (H. Russell Bernard -- univariate chapter; takes only about a page to explain most concepts)

 

How regression is like/unlike ANOVA

 

ANOVA: grouping variable, outcome variable

  • (1 or more independent variables, which may have various levels (group 1--farming method 1; group 2, etc.))
  • level of measurement of the independent variable (predictor) in ANOVA: nominal (categorical, grouping variable)
  • the outcome variable is: interval (continuous)
  • you can have more than one independent variable in an ANOVA

 

regression

  • continuous outcome (also) -- only one
  • its predictors (independent variables) can be categorical or continuous
  • if ANOVA can do it, regression can do it
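
The claim that regression can do what ANOVA does can be checked on a small case. The sketch below uses made-up scores for two groups and a 0/1 dummy-coded predictor (the data and variable names are assumptions for illustration, not from the lecture). The regression line then predicts each group's mean, and the share of outcome variability it explains matches the between-groups share from an ANOVA.

```python
from statistics import mean

# Hypothetical scores for two groups (illustrative data only)
group_0 = [4.0, 5.0, 6.0]
group_1 = [8.0, 9.0, 10.0]

# Dummy-code the grouping variable: 0 = first group, 1 = second group
x = [0.0] * len(group_0) + [1.0] * len(group_1)
y = group_0 + group_1

mean_x, mean_y = mean(x), mean(y)

# Least-squares slope and intercept for y hat = a + bX
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# Regression view: proportion of SOS of y reproduced by the predicted scores
y_hat = [a + b * xi for xi in x]
sos_y = sum((yi - mean_y) ** 2 for yi in y)
r_squared = sum((yh - mean_y) ** 2 for yh in y_hat) / sos_y

# ANOVA view: SOS between groups divided by SOS total
sos_between = sum(
    len(g) * (mean(g) - mean_y) ** 2 for g in (group_0, group_1)
)
eta_squared = sos_between / sos_y

print(a, b)                    # intercept = mean of group 0, slope = mean difference
print(r_squared, eta_squared)  # the two effect sizes agree
```

With dummy coding, the intercept is the mean of the group coded 0 and the slope is the difference between the two group means.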

 

purposes of regression:

  • prediction (can this variable predict this other variable?)
    • want a big effect size (a big R squared)
    • don't care "why" necessarily
  • explanation (theory testing)
    • also care about "why" (which of the predictors, if any (or all) predict this best)
    • helps explain or inform a theory, supports it
    • understand which predictor contributed to a prediction

 

example: equation for predicting height at age 20 using height at age 2

  • predictor: height at age two  (x)
  • outcome variable: height at age 20  (y)
  • Y hat = 0 + 2X

 

another example: predicting 9th grade algebra scores...

  • r squared = .60 (predictor can explain 60% of the variance)
  • y hat = 3 + 1.5x
  • once you have your equation; use it to plug in for the next group of subjects to make predictions for them
  • use data (regression equation) to make quality decisions
    • you can be confident in the predictions if you have a large r squared (effect size)
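
Plugging new subjects into the equation can be sketched in a few lines. The intercept 3 and slope 1.5 come from the example above; the function name and the new predictor scores are hypothetical.

```python
def predict_score(x, intercept=3.0, slope=1.5):
    """Return the predicted outcome, y hat = intercept + slope * x."""
    return intercept + slope * x

# Hypothetical predictor scores for the next group of subjects
new_predictor_scores = [10, 20, 30]
predictions = [predict_score(x) for x in new_predictor_scores]
print(predictions)  # [18.0, 33.0, 48.0]
```

How far to trust these predictions depends on the r squared from the original sample, not on anything in the equation itself.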

 

ANOVA Review

  • SOS total = the differences (variability) of all the scores on the outcome (dependent variable)
  • SOS between = how much do the means of the groups (groups defined by the independent variable) differ? (the null hypothesis says they do not differ)
    • SOS of the means (how different are the groups?) = take the four group means and compute their SOS around the grand mean, weighting each squared difference by that group's n (grand mean = mean of all scores, which is the mean of the means if all groups have the same number of scores)
    • degrees of freedom between = number of groups, minus 1
  • SOS within = how different are all the scores within a single group from that group's mean? Compute this for each group, then SUM across groups
    • degrees of freedom within = SUM(group 1's n-1 + group 2's n-1 + ...) OR total n minus the number of cells (number of unique groups)
  • statistical significance asks: if the null hypothesis were true, how likely is it that random sampling from the population would produce differences among the means as large as the ones we observed?
  • mean square (MS) SHOULD be called variance = the sum of squares divided by its degrees of freedom
  • f calculated = MSb/MSw; compare it with the f critical value from a table (all f distributions are positively skewed because we used squares and cannot have a negative; f calc shows us spreadoutness, and you can't have less spreadoutness than none, so f cannot be below 0)
    • if you reject the null hypothesis (that there are no differences), then you declare that there are differences
    • what influences how big or small f calc is? (the bigger, the more statistically significant)  ALSO TRUE FOR REGRESSION
      • sample size (degrees of freedom) -- (more subjects per group = more power)
      • number of groups (more groups = less power)
      • SOS between (smaller SOS between = smaller f calc = less power)
  • eta squared = overall effect size from an ANOVA
    • eta squared = SOSb / SOSt
    • how much of the differences in output (coffee output) is due to differences between groups (farming method)? = eta squared
    • for most social science research, effect sizes as large as .6 are rare
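
The ANOVA pieces reviewed above can be computed by hand in a short script. This is a minimal sketch with invented coffee-output scores for three farming methods (the data and names are assumptions for illustration); it builds SOS total, between, and within, then the degrees of freedom, mean squares, f calculated, and eta squared.

```python
from statistics import mean

# Hypothetical coffee output for three farming methods (illustrative data only)
groups = {
    "method_1": [4.0, 5.0, 6.0],
    "method_2": [6.0, 7.0, 8.0],
    "method_3": [8.0, 9.0, 10.0],
}

all_scores = [s for scores in groups.values() for s in scores]
grand_mean = mean(all_scores)

# SOS total: every score's squared distance from the grand mean
sos_total = sum((s - grand_mean) ** 2 for s in all_scores)

# SOS between: each group mean's squared distance from the grand mean, weighted by n
sos_between = sum(
    len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()
)

# SOS within: each score's squared distance from its own group mean, summed over groups
sos_within = sum((s - mean(g)) ** 2 for g in groups.values() for s in g)

df_between = len(groups) - 1                # number of groups minus 1
df_within = len(all_scores) - len(groups)   # total n minus number of groups

ms_between = sos_between / df_between
ms_within = sos_within / df_within
f_calc = ms_between / ms_within
eta_squared = sos_between / sos_total

print(sos_total, sos_between, sos_within)  # 30.0 24.0 6.0
print(f_calc)                              # 12.0
print(eta_squared)                         # 0.8
```

SOS between and SOS within add up to SOS total, which is a useful check on hand calculations. The f calculated would still need to be compared against an f critical value for 2 and 6 degrees of freedom.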

 

Area World (not in the original metric; has a power of 2 ; squared)

  • variance
  • sum of squares (SOS)
  • r squared
  • eta squared (η²); an effect size from an ANOVA (really the same as r squared)
  • COV (covariance)

 

Score World (in metric of the original scores)

  • unstandardized
    • mean

    • standard deviation

    • skewness

    • kurtosis

    • pearson r

    • (raw scores: GPA, SAT, etc.)

  • standardized
    • Z score

 

Case-One Regression (Score World)

  • = 1 predictor, 1 outcome (same as Thompson paper)
  • line of best fit-- best prediction
  • best-case scenario (perfect prediction) -- all dots line up in graph -- r = 1 -- y hat = 1.63 + 5x -- error = y minus y hat = 0 (no error if perfect prediction)
    • the factor times x = slope (slope = rise/run)
    • standardized version: line must go thru centroid; in Z score form, the means of X and Y are now 0 (so the line goes thru the origin, and the intercept is 0)
      • y hat = 0 + (beta times X)  -- instead of unstandardized (B times X)  --  if you say BETA, then the equation is in STANDARDIZED form
  • worst-case scenario -- no prediction (r = 0); no relationship between predictor variable and outcome variable;  y hat = a + 0x, where a is the mean of y (which means the line is FLAT, and also goes through the centroid)
    • if you have ZERO INFORMATION to predict, then always predict the MEAN (scores on a normal distribution cluster around the mean, so it is the best single guess)
    • standardized version: mean of y is 0, so regression line is right along the X axis (y hat = 0x)
  • real-world scenario (some prediction; in the middle) -- y hat = .3 + .9x (we're assuming r = .5)
    • standardized version: like the other standardized graphs, the centroid is the point of origin (and the line always goes through the centroid); y hat = beta times x
    • in the real world, what does a slope of .5 mean?  =  if we move one unit on x, y increases by .5
    • when interpreting slope, you must invoke the standard deviations for Y and X (otherwise you have no context)
    • you can also standardize Y and X so that they are in the same scale/context (after standardizing, both standard deviations equal 1)

 

convert B to Beta:  b = Beta(SDy/SDx), so Beta = b(SDx/SDy)

  • b = slope  ( Y hat = a + bX )
  • take the standard deviations into account, to standardize the b into a beta score
  • Thompson page 7
    • IMPORTANT FOR HOMEWORK
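
The conversion can be tried on a small data set. The sketch below uses made-up scores in which Y is on a much larger scale than X (the data and function names are assumptions for illustration). It computes the unstandardized slope b, converts it to Beta with the standard deviations, and converts back again.

```python
from statistics import mean, stdev

# Hypothetical raw scores; Y is deliberately on a larger scale than X
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [20.0, 10.0, 40.0, 30.0, 50.0]

def slope_b(x, y):
    """Unstandardized least-squares slope for y hat = a + bX."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def b_to_beta(b, sd_x, sd_y):
    """Standardize the slope: Beta = b * (SDx / SDy)."""
    return b * sd_x / sd_y

def beta_to_b(beta, sd_x, sd_y):
    """Return to the raw-score metric: b = Beta * (SDy / SDx)."""
    return beta * sd_y / sd_x

sd_x, sd_y = stdev(x), stdev(y)
b = slope_b(x, y)
beta = b_to_beta(b, sd_x, sd_y)

print(b)     # slope in raw units: change in Y per one unit of X
print(beta)  # slope in standard deviation units
print(beta_to_b(beta, sd_x, sd_y))  # recovers b
```

Here b is 8 while Beta is about .8, which shows why a raw slope has no context until the two standard deviations are taken into account. With a single predictor, Beta also equals the Pearson r between X and Y.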

 

Back to Area World...

  • to get from Score World to Area World, we SQUARE it
  • thus, square-root to get from Area World to Score World

 

perfect prediction

  • draw a box on the graph for Y -- this is the SOSy (because it's squared) (this is also SOSt of the dependent variable)
  • how much of this is y hat (predicted scores), and how much is error?
    • if it's a perfect prediction, all of the SOSy is y hat
  • how much of SOSx is useful for predicting SOSy?
    • the SOSx can be larger than SOSy, because there can be more  variability for x than for y
    • BUT the dependent variable drives the bus, so we really care about how much of SOSy is explained by SOSx
    • SOSx that does NOT overlap is IRRELEVANT
  • r squared = 1
  • square root = r = plus or minus 1

 

no prediction

  • the square graph for y and the square graph for x can be anywhere, as long as they DO NOT overlap (there is no relationship)
  • r squared = 0
  • r = 0

 

real-world prediction (middle)

  • some overlap (but not total)
  • r squared = .25
  • r = plus or minus .5
  • 25% of y is explained by x (25% = y hat)
  • 75 % is not explained = 75% is error
  • SOSy = SOSe + SOSyhat  (SOSe is sum of squares of error)
    • any SOSx that does not overlap with SOSy is irrelevant (useless info)
  • how many variables are there in this scenario? (if n = 50) -- FOUR = y, x, e (error), y hat
  • synthetic variables = unobserved, unmeasured (created from something else)
    • error and y hat are synthetic, because they are unobserved
  • SOSy (total y SOS) = explained (y hat) + error
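
The split of SOSy into explained and error parts can be verified numerically. This is a minimal sketch with invented scores (the data are an assumption for illustration and give an r squared of about .64, not the .25 used above). It creates the two synthetic variables, y hat and error, and checks that their sums of squares add up to SOSy.

```python
from statistics import mean

# Hypothetical observed scores (illustrative data only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

mean_x, mean_y = mean(x), mean(y)

# Least-squares line y hat = a + bX
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# The two synthetic variables: predicted scores and errors
y_hat = [a + b * xi for xi in x]
error = [yi - yh for yi, yh in zip(y, y_hat)]

sos_y = sum((yi - mean_y) ** 2 for yi in y)       # total SOS of the outcome
sos_y_hat = sum((yh - mean_y) ** 2 for yh in y_hat)  # explained part
sos_e = sum(e ** 2 for e in error)                # error part (errors have mean 0)

r_squared = sos_y_hat / sos_y

print(sos_y, sos_y_hat, sos_e)  # explained + error = total
print(r_squared)                # proportion of SOSy that is y hat
```

Taking the square root of r squared moves back from Area World to Score World and gives r, with the sign taken from the slope.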

 

beta = r (with one predictor, the standardized slope equals the Pearson r between x and y)

 
