EPSY6210_20100301


03.01.2010

 

first walking through the homework

need to know how to interpret our results from SPSS like this--applying what we've learned

 

Two Most Important Questions:

  1. do I have anything?
    1. is it worth interpreting?
    2. is there enough prediction, statistical significance, effect size, etc.?
    3. (if you don't have anything, don't bother going to the next question)
  2. where does it come from?
    1. what variables are the important ones for creating the effect in question one?

 

is there an effect?

  • t-test for each predictor--this can be converted to an F test
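
A quick numerical check of that conversion (my own sketch, not from class; the t value and degrees of freedom below are made up, as if read off an SPSS coefficients table): squaring a predictor's t gives an F with 1 and df-residual degrees of freedom, and the p-values match.

    import scipy.stats as stats

    t = 2.31           # t statistic for one predictor (made-up value)
    df_residual = 47   # residual degrees of freedom (made-up value)

    F = t ** 2         # t squared is an F with (1, df_residual) degrees of freedom

    p_from_t = 2 * stats.t.sf(abs(t), df_residual)   # two-tailed p from the t-test
    p_from_F = stats.f.sf(F, 1, df_residual)         # p from the equivalent F test
    print(F, p_from_t, p_from_F)                     # the two p-values are identical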

 

the predictors create the effect

what stats do we look at to see which predictors are important? (#2)

  • correlation, Pearson r
  • beta weights (very important!)
  • structure coefficients (very important!)
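
A minimal sketch (mine, not from class) of how all three of these could be computed outside SPSS, using only numpy on a made-up dataset; the variable names and data are hypothetical:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)            # correlated predictors (made up)
    y = 0.6 * x1 + 0.2 * x2 + rng.normal(size=n)
    X = np.column_stack([x1, x2])

    # Pearson r of each predictor with y
    r_xy = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

    # beta weights: regress z-scored y on z-scored predictors
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    zy = (y - y.mean()) / y.std()
    coefs, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Z]), zy, rcond=None)
    betas = coefs[1:]                             # drop the (near-zero) intercept

    # multiple R = correlation between y and y-hat
    y_hat = Z @ betas
    R = np.corrcoef(zy, y_hat)[0, 1]

    # structure coefficients: r between predictor and y, divided by multiple R;
    # equivalently, the correlation between each predictor and y-hat
    structure = r_xy / R
    check = np.array([np.corrcoef(X[:, j], y_hat)[0, 1] for j in range(X.shape[1])])
    print(r_xy, betas, R, structure, check)       # structure and check agree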

 

it's a different question to ask:

  • do these three variables collectively affect another dependent variable? (one multiple regression)

than:

  • does this variable affect the dependent variable, does another one affect it, does another? (several separate regressions)
  • use the statistical analysis that matches the question you're asking
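
A toy illustration (my own, with made-up data) of why these are different analyses: with correlated predictors, a variable's slope in the one multiple regression usually differs from its slope in a simple regression run on its own.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + rng.normal(size=n)     # x2 overlaps heavily with x1 (made up)
    y = 1.0 * x1 + rng.normal(size=n)      # y is really driven only by x1

    def ols_slopes(X, y):
        # ordinary least squares; returns the slopes, dropping the intercept
        X = np.column_stack([np.ones(len(y)), X])
        coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
        return coefs[1:]

    print(ols_slopes(np.column_stack([x1, x2]), y))   # joint question: x2's slope is near zero
    print(ols_slopes(x1, y))                          # separate question: y on x1 alone
    print(ols_slopes(x2, y))                          # separate question: y on x2 alone (clearly nonzero)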

 

size of the effect (to be considered relevant) highly depends on your study

(some don't need large effects)

context matters a lot--2% can be a large effect if it saves 2% of all cancer patients from death, for example.

 

there is probably already a literature out there that lists effect sizes relevant to the research you want to perform in your dissertation.  use those effect sizes, so you can relate your results to the existing literature.

 

figure out what Cohen's D is.  this is important.

  • this is a nice frame to talk about statistical power--but Cohen did this very cautiously
  • Cohen picked small, medium, and large effects to frame his book
  • he noted that these are very general terms, and aren't always effective measures--but people tend to interpret them as gospel
  • using this only makes sense if there is no literature on your topic, and you have no benchmarks to work from

 

Cohen's D (a mean-difference effect size; benchmark r-squared values in parentheses)

  • .2 = small (r-squared = .01 or 1%)
  • .5 = medium (r-squared = .09 or 9%)
  • .8 = large (r-squared = .25 or 25%)
  • effect size, measures & analytic thinking -- article to read... pg 611

 

Cohen's D and r-squared are different metrics and thus do not directly relate.

  • they both talk about how far from the null you are (the magnitude of the result), but in different ways.
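
For reference (my addition, not from the lecture): with two equal-sized groups, d can be converted to a point-biserial r via r = d / sqrt(d^2 + 4). The r-squared benchmarks listed above appear to be Cohen's separate conventions for r (.1, .3, .5) squared rather than algebraic conversions of d, which is exactly why the two metrics shouldn't be treated as interchangeable.

    import math

    def d_to_r(d):
        # approximate point-biserial r for a two-group Cohen's d, equal group sizes
        return d / math.sqrt(d ** 2 + 4)

    for d in (0.2, 0.5, 0.8):
        r = d_to_r(d)
        print(d, round(r, 3), round(r ** 2, 3))   # e.g. d = 0.5 -> r ~ .24, r^2 ~ .06 (not .09)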

 

never report an r-squared without telling the reader what the effect means -- tell them if the r-squared is small, medium, or large considering your context

 

you must make a judgement--not just on the p value, but on the effect size (r-squared) as well.  

  • if a result is repeated (over several studies), it gains  importance/relevance even if it's not "statistically significant."
  • trust stability (over multiple samples) over significance

 

 when describing the sample (in your statistical article), ALWAYS tell how many are in your sample.

 

beta weights:

  • if betas are larger, they give the scores for that variable more weight (they act as multiplicative constants)
  • if they are smaller (like a decimal/fraction), they effectively remove the variable (it's not contributing much to predicting the dependent variable)
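
A tiny illustration of that idea (numbers are made up): in the standardized prediction equation the betas are the multiplicative constants, so a near-zero beta effectively drops its variable out.

    # hypothetical standardized equation: y-hat = .60 * z1 + .05 * z2
    beta1, beta2 = 0.60, 0.05   # made-up beta weights
    z1, z2 = 1.0, 1.0           # a case one SD above the mean on both predictors

    y_hat = beta1 * z1 + beta2 * z2
    print(y_hat)                # 0.65 -- almost all of it comes from the first predictor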

 

structure coefficients:

  • the computer doesn't give them to you
  • = the Pearson r between the predictor and the dependent variable, divided by the multiple R
  • so the resulting structure coefficient gives the relationship between y-hat and the predictor
  • in Area World (square the structure coefficient)...
    • now look at the squared structure coefficients: which of these explains the most?
    • but it's the beta weights that tell us which predictor gets the most credit in the regression equation--don't get confused... the squared structure coefficient tells us how strongly each predictor is related to y-hat, but NOT which predictor gets the most credit for the variance explained in y.
    • (we definitely need to review: beta weights, squared structure coefficients)

 

SAMPLE STATISTICAL ANALYSIS DESCRIPTION OF REGRESSION

  • a regression analysis was run with the predictors (v2, v3, v4 -- use the real names of these variables)
  • these were statistically significant, with an F value of --, degrees of freedom of --, and an effect size of --
  • ///// (I missed this line.) ////
  • this effect was due primarily to the variable (?), which had a significant beta weight and squared structure coefficient.
  • secondary to variable (?), variable (3) could also explain a substantial part of the effect, although its beta weight was near zero, indicating that this variable received little credit in the regression equation for the variance explained.
  • v(4), although having a substantial beta, had a much smaller structure coefficient.
  • (then we would explain suppression here--we haven't gotten to that yet; a toy sketch of this pattern follows below.)
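
A toy sketch (my own; only a preview, since the class hasn't covered suppression yet) of the pattern in the last two bullets--a predictor with a substantial beta weight but a near-zero structure coefficient:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 500
    true_score = rng.normal(size=n)
    contamination = rng.normal(size=n)

    y = true_score + rng.normal(scale=0.5, size=n)
    x1 = true_score + contamination     # x1 measures y's cause, plus contamination
    x2 = contamination                  # x2 measures only the contamination

    Z = np.column_stack([x1, x2])
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    zy = (y - y.mean()) / y.std()

    coefs, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Z]), zy, rcond=None)
    betas = coefs[1:]
    y_hat = Z @ betas
    R = np.corrcoef(zy, y_hat)[0, 1]

    r_xy = np.array([np.corrcoef(Z[:, j], zy)[0, 1] for j in range(2)])
    structure = r_xy / R

    print(betas)      # x2 gets a substantial (negative) beta weight...
    print(structure)  # ...but its structure coefficient is near zero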

 

to prep for the midterm, start reading lots of articles that use multiple regression, and see how they interpret and describe results.

  • the Journal of Applied Psychology publishes a fair number of regression articles

 

<< BREAK >>

 

Homoscedasticity

(looking at handout from today's class--from Tabachnick & Fidell)

 

homogeneity of variance: if you're comparing MEANS ACROSS GROUPS, the variance on the dependent variable is the SAME across all groups (if you don't have that, you have an apples-to-oranges comparison)

  • an ANOVA compares MEANS--group means.
  • you use variances within, between, and total for both groups.
  • so a mean is the best possible description of a group of scores... if the variance (distribution in the graph--the SHAPE the curve makes) isn't similar, then you can't COMPARE the means in a meaningful way (they don't describe their groups equally).
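
One common way to check that assumption (my addition; the class didn't name a specific test) is Levene's test on the dependent variable across groups, e.g. in scipy with made-up scores:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    group_a = rng.normal(loc=50, scale=10, size=40)   # made-up scores for group A
    group_b = rng.normal(loc=55, scale=10, size=40)   # made-up scores for group B

    # Levene's test: the null hypothesis is equal variances across groups
    stat, p = stats.levene(group_a, group_b)
    print(stat, p)   # a small p suggests the homogeneity-of-variance assumption is violated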

 

we've been using scatterplots to describe our data--we've only had ONE group in these examples.

 

if we have a scatterplot with a dichotomous grouping variable (like male/female), then points will group around the mean of each group--where the datapoints pile up, that's the "high point" of the curve in the graph (see homework one--histogram and scatterplot).

 

for an explanation of homoscedasticity, see the drawing on paper...

essentially, it assumes that any conditional "slice" of the data (the distribution of y at a given value of x) has the same spread as any other slice (a similar distribution curve could be drawn from each slice of data points).

 

if you run the regression with heteroscedasticity, you will get weaker results (an attenuated r-squared), less variance explained, and less power than a regression on a homoscedastic set of data

 

therefore, you should ALWAYS check for homoscedasticity when running a regression.
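
A minimal sketch (mine) of one common check: plot the residuals against the predicted values and look for a fan or funnel shape, i.e. spread that grows or shrinks with y-hat. The data here are made up to be deliberately heteroscedastic.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(4)
    n = 200
    x = rng.uniform(0, 10, size=n)
    y = 2 + 0.5 * x + rng.normal(scale=0.3 * x, size=n)   # error spread grows with x

    X = np.column_stack([np.ones(n), x])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coefs

    plt.scatter(X @ coefs, residuals, s=10)
    plt.axhline(0, color="gray")
    plt.xlabel("predicted values (y-hat)")
    plt.ylabel("residuals")
    plt.title("a funnel shape suggests heteroscedasticity")
    plt.show()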

 

when either the dependent or the independent variable is skewed, that can cause heteroscedasticity.

  • often data collection/testing methods can produce skewness...
    • ceiling or floor effects (highest or lowest possible scores on tests) may skew data in this way.
    • giving Likert scales may produce similar effects--respondents may only use the 3-5 answers perceived as "positive" rather than the 1-2 answers perceived as "negative."
  • maybe what you're measuring is skewed in the real world, the data itself.

 

if we could get rid of the skewness, that could help r-squared (make it stronger)--the analysis could be stronger.

  • not necessarily true, but it might be.

 

2nd part of the handout is about Data Transformations.  (p.81) -- two pages later is a graph.

  • discussion about whether data transformation is cheating or not
  • often our data measurement is arbitrarily scaled (like Likert scales)
  • then we should use some caution attributing too much meaning to that measurement scale; therefore, why not transform it if it creates a better prediction?  generally, the more arbitrary the scale, the more data transformation is accepted.
  • a less arbitrary scale: blood pressure (established medical interpretations of those numbers); not as useful to transform this data because you may affect the meaning of the data
  • the part that gets missed is that this affects your interpretation of the data--you're now interpreting using a different metric. you should change your variable symbol somehow to indicate this clearly in your results.
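
A small sketch (my own; the log transform is just one common choice, not something prescribed in class) of checking skewness before and after transforming a made-up, positively skewed variable:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    raw = rng.lognormal(mean=2.0, sigma=0.8, size=300)   # made-up positively skewed variable

    transformed = np.log(raw)        # log transform (square root or reciprocal are other options)

    print(stats.skew(raw))           # large positive skew
    print(stats.skew(transformed))   # much closer to zero
    # remember: results are now interpreted on the log scale, so label the variable accordingly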

 

Midterm project:

  • check for data transformations-- should you transform that data?
  • consider why or why not, and what you might do if you were to transform it.
    • explain why you chose to or not to--what statistics are you looking at and basing your decision on?
    • JUSTIFY YOUR DECISIONS.  use lots of "because" synonyms in academic writing. ;)
  • check for skewness in the data, and choose transformations accordingly.
  • then re-check the descriptives and graphs of pre- and post-transformation data (trial and error!!).

 

then walked through the regression in the second handout (health-based dataset)

 
