EPSY 6210
02.01.2010
(2nd class)
normal curves can take lots of different shapes (get rid of the "bell curve" idea)
- normal distribution is a symmetrical curve
Turned in First Homework:
here's what we learned from it...
additive constants (alter dataset by adding/subtracting a specific amount):
- mean = goes up (or down) by the same amount as the constant added to (or subtracted from) the dataset
- SD = same (the variability is unchanged)
- SD for population = same (see above)
- skewness = same (it doesn't have stronger or weaker central tendency--same PATTERN on the graph); if you have a normal (symmetrical) distribution, your skewness should be 0
- kurtosis = same (measures normality; may be calculated differently in SPSS or other methods)
- covariance = same
- pearson's r = same
- this is fairly clearly indicated by the scatterplot (graph)
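The additive-constant rules above can be checked in a few lines of Python (my sketch, not from class; the data are made up):

```python
import statistics as st

def pearson_r(a, b):
    """Pearson's r: covariance divided by the product of the SDs."""
    ma, mb = st.mean(a), st.mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)
    return cov / (st.stdev(a) * st.stdev(b))

x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0]
x_shift = [xi + 10 for xi in x]  # add a constant of 10 to every score

print(st.mean(x_shift) - st.mean(x))            # 10.0 -- mean shifts by the constant
print(st.stdev(x_shift) - st.stdev(x))          # 0.0  -- SD unchanged
print(pearson_r(x_shift, y) - pearson_r(x, y))  # 0.0  -- pearson's r unchanged
```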
multiplicative constants:
- mean = multiplied by the constant
- SD = multiplied by the absolute value of the constant (the formula squares the deviations, which wipes out the sign--statistical difference (spreadoutness) can't be negative)
- SD for population = same rule: multiplied by the absolute value of the constant
- skewness = same magnitude (but possibly with a flipped positive/negative sign; the pattern won't change in shape, but may change in direction) (multiplying doesn't change central tendency/symmetry)
- kurtosis = same (multiplying doesn't change normality ("shape"))
- covariance = multiplied by the constant (can see in the scatterplot--the spreadoutness between the Y datapoints is larger, although the pattern is the same)
- covariance is similar to variance (which is SD squared, or simply the SD formula without the last square-root step)
- covariance = the variance of two variables, crossed together (what is the pattern of their relationship together)
- the covariance will be affected by a change in the SD, because its formula never removes the influence of the SD
- if multiplied by a negative constant, the pattern will change direction on the scatterplot (and thus the covariance result will be negative)
- pearson's r = same (because the influence of the SDs has been removed)
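The multiplicative rules can be checked the same way (my sketch, made-up data), including the sign flip from a negative constant:

```python
import statistics as st

def cov(a, b):
    """Sample covariance of two paired variables."""
    ma, mb = st.mean(a), st.mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
c = -3.0
y_scaled = [c * yi for yi in y]  # multiply every y by -3

print(abs(st.mean(y_scaled) - c * st.mean(y)) < 1e-12)        # True: mean scales by c
print(abs(st.stdev(y_scaled) - abs(c) * st.stdev(y)) < 1e-12)  # True: SD scales by |c|
print(abs(cov(x, y_scaled) - c * cov(x, y)) < 1e-12)           # True: covariance keeps the sign

r_before = cov(x, y) / (st.stdev(x) * st.stdev(y))
r_after = cov(x, y_scaled) / (st.stdev(x) * st.stdev(y_scaled))
print(round(r_before, 4), round(r_after, 4))  # 0.6 -0.6 -- same magnitude, flipped direction
```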
additive & multiplicative constants together:
- it changes first by the additive constant, then by the multiplicative one (so see above)
more info on these concepts:
- Z scores = remove the influence of the SD (divide by SD--that's how you remove its influence)
- the property of all Z scores: the mean = 0 and the SD = 1.
- we standardize Z scores to make that happen; this puts the scores in a standard metric
- SD = shows the datapoints relation to the rest of the dataset (gives it context and thus meaning)
- pearson r (correlation) = standardized covariance (the scores/datapoints are put on the same metric, so different groups of data can now be compared) -- you standardize it by removing the influence of the SDs, which puts it in a broader (standardized) context
- property of all r = ranges from -1 to +1
- because pearson's r is standardized (and covariance is not), pearson's r is used more often in journal articles--you can't compare variables using just the covariance. Because r falls between -1 and +1, it's easy to see an r in an article and understand what it means: a strong positive correlation (near +1), a weak or non-existent correlation (near 0), or an inverse correlation (near -1).
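A quick Python sketch of the z-score and standardized-covariance ideas above (mine, not from class; the data are invented): z-scores come out with mean 0 and SD 1, and averaging the cross-products of paired z-scores gives pearson's r directly.

```python
import statistics as st

def z_scores(data):
    """Subtract the mean, divide by the SD -- this removes the SD's influence."""
    m, s = st.mean(data), st.stdev(data)
    return [(d - m) / s for d in data]

x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [12.0, 15.0, 11.0, 19.0, 25.0]
zx, zy = z_scores(x), z_scores(y)

print(round(st.mean(zx), 10), round(st.stdev(zx), 10))  # 0.0 1.0

# pearson's r is just the covariance of the z-scores:
r = sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)
print(round(r, 4))
```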
"Some people like to think about this crap." (lol)
when in doubt, graph something.
- if x = 1, 2, 3, 4
- and y = 2, 4, 6, 8
- then the sample size is 4 (four people tested on 2 variables)
- these variables are positively related (they go up together)
- graph it: a straight line, going up a tad steeply
- it's a perfect pattern; as x goes up, y doubles
- if you can do mathematical stuff to x (like multiplying by 2), and turn it into y exactly, you have a perfect relationship
- in this case, 2x = y
- in another instance, 2x + 1 = y (linear equation; because when graphed, it makes a line)
- a few cases can dramatically influence our results (on a graph, a few more datapoints can change a curvilinear pattern (best fit) to a linear pattern (line of best fit)).
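The x = 1, 2, 3, 4 and y = 2, 4, 6, 8 example above can be run through the r formula to see the perfect relationship (my sketch):

```python
import statistics as st

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]  # y = 2x exactly -- a perfect linear pattern

mx, my = st.mean(x), st.mean(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
r = cov / (st.stdev(x) * st.stdev(y))
print(round(r, 10))  # 1.0 -- a perfect positive relationship
```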
this points to a major point of multiple regression: it's all about predicting a relationship between variables
- we want to hammer at x until we can find a value (prediction) as close to y as possible.
- y with a caret on top ("y hat") = predicted score
- should be as close to y as possible IF you want a strong, positive correlation.
- if the pattern/relationship is NOT perfect (linear), then r = less strong
r = if x gets bigger as y gets bigger, then r is between 0 and +1 (positive value).
centroid (a Cartesian coordinate): on the graph, the point at the plotted mean of x and the plotted mean of y
- the line of best fit MUST go through the centroid (even though it might not go through any others)
- the means of x and y are the numbers that best represent x and y... therefore the line of best fit must go thru both means.
line of best fit:
- goes thru centroid
- trying to get the prediction from x (y hat) as close to y as possible (a better predictor means a more perfect line, and thus closer to r = 1)
- y hat = a + bx ...so we can USE this formula to calculate y hats (predicted scores) for future x scores (no error and y hat = y? then r = 1)
- y - (y hat) = "error" = tells us how close a prediction we have
- this can be negative or positive, depending on its direction (if y hat is bigger or smaller than y)
- four variables: x, y, y hat, and error
- plot x with y hat and you get a line (because it's a linear equation)
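These pieces fit together in a short Python sketch (mine, with made-up data): the slope comes from b = r * SDy / SDx, the intercept a = mean(y) - b * mean(x) forces the line through the centroid, and error = y - y hat.

```python
import statistics as st

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 7.0]

mx, my = st.mean(x), st.mean(y)
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
r = cov / (st.stdev(x) * st.stdev(y))

b = r * st.stdev(y) / st.stdev(x)  # slope (the weight applied to x)
a = my - b * mx                    # intercept: forces the line through the centroid
y_hat = [a + b * xi for xi in x]   # predicted scores
error = [yi - yh for yi, yh in zip(y, y_hat)]  # error = y - y hat

print(round(b, 3), round(a, 3))              # 1.1 0.9
print(round(a + b * mx, 3) == round(my, 3))  # True: the line passes through the centroid
```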
all of our analyses (this semester) are related to each other;
- they are all correlational
- they all yield r-squared-type effect sizes
- they all apply weights to observed variables to create synthetic (unobserved) variables
- the synthetic/unobserved variables become the focus of the analysis
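The weighting idea can be seen in miniature with simple regression (my sketch, made-up data and weight): applying a weight to the observed x creates a synthetic variable, and that synthetic variable carries the same strength of relationship with y as x itself.

```python
import statistics as st

def pearson_r(a, b):
    """Pearson's r: covariance divided by the product of the SDs."""
    ma, mb = st.mean(a), st.mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)
    return cov / (st.stdev(a) * st.stdev(b))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 7.0]

weight = 1.1  # a regression-style weight (any positive weight preserves r)
synthetic = [weight * xi for xi in x]  # the synthetic (unobserved) variable

print(round(pearson_r(x, y), 6) == round(pearson_r(synthetic, y), 6))  # True
```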
SPSS/PASW
- "what's the relationship between x and y?" (pearson r)
- "does x predict y?" (regression; this is more relevant when we have more than one predictor)
- SPSS: analyze - descriptive stats - descriptives - calculates various descriptive stats (we've done this before)
- tip: click/look around and try to find things (he won't always tell us how to calculate things in SPSS)
- save the file: format is ".sav" for data files (the "Daddy" file)
- SPSS (regression) = analyze - regression - linear - (y is usually dependent variable--the outcome of interest); (x is the independent variable or predictor) - statistics (pick descriptive)
- then click SAVE (y hat = predicted values, unstandardized; error = residuals, unstandardized (standardized forms transform them into Z scores))
- then click PASTE (NOT "ok") -- this builds a command file
- "Syntax Editor" (syntax or command file) opens in a new window
- can add comments to show myself what I'm doing by typing an asterisk and space, then comment, and end each comment/command with a period (* .)
- file, save as: (give it the same name as the datafile--different extension); file extension = .sps (the "Mommy" file)
- need to put the Daddy & Mommy files together to get a "Baby" file
- highlight what you want to run; click the "play" arrow button on the top menu (it runs the selection)
- that produces the Output file (the "Baby" file)
- ANOVA = really our regression summary table; (regression = between; residuals = within)
- click back to the dataset; PRE = y hat; RES = error
- if you run the same regression a second time, you'll get two more variable columns, appended with "2" instead of "1" -- values will be the same, simply calculated them a second time
- it's lined up in the dataset table just like you'd calculate it: y - PRE (y hat) should = RES (error)
- if you want to check it graphically, graphs - scatterplot - for the variables you want to check
- add fit line - linear - apply (ignore the curves)
TO DO:
- email Paul and ask him to put SPSS/PASW on work computer
- let Annie know what he says
- email Dr. Henson about graphical book on figuring out basic statistical concepts
- read handouts from Dr. Henson (last week, and workshop)
- do homework
- look at book from Annie / library
- look at textbook...
- start research topic -- paper / project