walking through what we did on the Midterm.

confidence affects us even when performing this kind of study... knowledge of what we don't know sometimes paralyzes us from doing what we do know.

**MISSING DATA**

why can we plug in the mean (or an EM estimate) to replace missing values? it's our best shot at the missing data.

thus, if it's a dichotomous variable, then put in the MODE for all missing values--it's your best shot. a 75% chance of being right is better than 50% or less.

could do a t-test comparing cases with observed values to cases with imputed values, to check whether the missing cases differ systematically.
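a minimal sketch of what mean/mode imputation looks like in code (Python with NumPy; the data and the two-level check for "dichotomous" are made up for illustration):

```python
import numpy as np

def impute(values):
    """Fill NaNs with the mean (continuous) or the mode (dichotomous)."""
    vals = np.asarray(values, dtype=float)
    missing = np.isnan(vals)
    observed = vals[~missing]
    if len(np.unique(observed)) <= 2:           # treat as dichotomous
        levels, counts = np.unique(observed, return_counts=True)
        fill = levels[np.argmax(counts)]        # the MODE: best single guess
    else:
        fill = observed.mean()                  # the MEAN: best single guess
    vals[missing] = fill
    return vals

print(impute([1.0, 2.0, np.nan, 4.0]))   # NaN becomes the mean, 2.33...
print(impute([0, 0, 1, np.nan, 0]))      # NaN becomes the mode, 0
```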

some people spend all their lives debating what is best to do with missing data.

it's a huge thing, and it affects almost any researcher who works with data--there is always something missing.

the larger the data set, the more chance some of it will be missing or unable to be gathered.

if there's nothing systematic about the missing data, and there isn't a large percentage missing, then just listwise delete it.

**NORMALITY, scatterplots, etc. of variables**

many tried transforming data, but no one found a big change.

mosaic had some kurtosis problems, not big

(don't make transformations on categorical data!)

(was supposed to treat faed, maed as continuous variables)

**+/- 1 skewness & kurtosis**, that's pretty close to normal. don't worry unless it's quite a bit larger than one in either direction.

(the problem has to be large enough to justify transforming your data metric!!!)
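a quick check against the +/- 1 rule of thumb (a simulated sketch; these are the simple moment-based estimates, so SPSS's small-sample-corrected values will differ slightly):

```python
import numpy as np

def skew_kurtosis(x):
    """Moment-based sample skewness and excess kurtosis."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    skew = np.mean(z ** 3)                 # 0 for a symmetric distribution
    excess_kurtosis = np.mean(z ** 4) - 3  # 0 for a normal curve
    return skew, excess_kurtosis

rng = np.random.default_rng(0)
s, k = skew_kurtosis(rng.normal(size=5000))
print(abs(s) < 1 and abs(k) < 1)           # within the +/- 1 rule of thumb
```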

**HOMOSCEDASTICITY / REGRESSION**


ran the multiple regression with all predictor variables.

look at the graph of standardized y-hats and errors.

options--click for **casewise diagnostics** in SPSS--it will flag cases that have weird scores

case # 35, case #298 are both flagged

(case 35 is more of an issue, 298 is not)

**z-scores** and **standardized residual** (error scores): z scores have a mean of 0 and a standard deviation of 1.

therefore, if the **standardized residual** is **-5**, the difference between the y-hat (predicted score) and the specific obtained score is **5 standard deviations** below the **mean**. (formula for residual is y - y-hat.)

it's an ABSTRACT, but EASIER TO GRASP way to look at relationships between data points / cases.
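a simulated sketch of how a standardized residual flags a case (all the data here is generated; index 35 is planted just to echo the flagged case above):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))                 # two fake predictors
y = X @ np.array([2.0, 1.0]) + rng.normal(size=100)
y[35] -= 8.0                                  # plant one wildly low score

X1 = np.column_stack([np.ones(len(X)), X])    # add an intercept column
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
resid = y - X1 @ b                            # residual = y - y-hat
std_resid = resid / resid.std()               # mean ~0, SD = 1

flagged = np.where(np.abs(std_resid) > 3)[0]  # |z| > 3 is a common cutoff
print(flagged, round(std_resid[35], 2))       # case 35 sits several SDs below 0
```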

all the information on the predictors says that person SHOULD have done better on the math achievement test. (it stands out as an odd case.)

we don't know WHY, but we can see the result.

it could be legitimate data, but that person just doesn't tend to fit the model for whatever reason.

you can **DROP oddballs** like this and **test** how this affects your model.

if there is a **score THAT far out there**, you at least need to **take a LOOK** at it and see what's going on in that data.

if you drop it, your R-squared **goes up by 3%**--going up 3 percentage points sounds small, but it takes your R-squared **from .579 to .6**--that **IS** a **BIG DEAL**. it's much more generalizable now.

document it and tell the reader: this one case was dropped because it had a large standardized residual. try to describe this in only one sentence: tell the reader what you're doing in the most efficient way possible.

(data was homoscedastic)

**REGRESSION**

1) do we have anything?

ALWAYS say **"statistically significant"** or drop the word "significant" altogether and say "this has a statistical relationship of..."--you will probably have to report statistical significance, but put the primary focus on the **effect size.**

if your **R-squared** is **close** to your **Adjusted-R-squared**, what does this tell you?

indication that there is minimal sampling error in our sample.

("...minimal shrinkage due to a theoretical correction for sampling error.")

in this case, the risk of sampling error comes from our large number of predictors--but we have a large sample.

that gives me greater confidence that my sample may be representative.

(YES, we have something.)
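the shrinkage idea can be sketched with the standard adjusted R-squared correction (the sample sizes and R-squared values below are hypothetical, not the midterm's numbers):

```python
def adjusted_r2(r2, n, k):
    """Wherry-style adjustment: shrinks R-squared given n cases, k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# large sample, several predictors: barely any shrinkage
print(round(adjusted_r2(0.60, 300, 6), 3))   # 0.592
# same R-squared from a tiny sample: noticeable shrinkage
print(round(adjusted_r2(0.60, 25, 6), 3))    # 0.467
```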

2) where does it come from?

you can organize your research questions into **GROUPS** when looking for results.

"grades" and "math grades" are also secondary variables (look at your research questions).

point out that mosaic isn't a very good predictor (according to beta weights)--this affects your research question of how visual-spatial predicts mathach (since mosaic is half of that).

prior coursework is awfully predictive of math achievement (in general).

visual ability is awfully predictive of math achievement.

are they equally predictive? are their relationships equally strong?

relationship between visual ability and geometry: mildly related.

all these variables are related to each other (that is part of the problem with our not-entirely-clear results).

what does it mean that visual has a **very high beta**, and a **moderate structure coefficient**?

(it's a case 3 regression, but that's not what we are looking for; we don't need to frame it in these terms in the paper)

-- it's getting the largest beta weight, even though it does NOT explain the most variance (that's what structure coefficients tell us: which predictor explains the most variance).

-- therefore, it looks like visual isn't SHARING the variance it does explain--therefore the variance it DOES explain is UNIQUE.

if the **SUM** of the **squared structure coefficients** equals **MORE THAN 1**, then you know that **your beta weights DO NOT credit each predictor with ALL the variance it can actually explain** (some of the beta weights are lower than the actual contribution of that predictor).

WHERE do beta weights come from?

when we look at the variance explained in Y in the AREA WORLD example, each beta corresponds to the amount of variance being CREDITED to that predictor (well, the SQUARE ROOT of that, since beta is in SCORE WORLD).

**summarizing all this:**

beta weight for visual is the largest beta weight, so it's getting the most credit in the regression. however, because its squared structure coefficient is moderate instead of large, it does not explain the most variance.... therefore at least some of the variance visual does explain is unique to just visual, and is different from the variance that can be explained by prior coursework.

both visual and prior coursework are important predictors, but visual also explains variance that prior coursework does not.
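a simulated sketch of betas vs. structure coefficients with two correlated predictors (the names just echo "prior coursework" and "visual"; all data is generated, so the exact numbers are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
course = rng.normal(size=n)                  # stand-in for prior coursework
visual = 0.5 * course + rng.normal(size=n)   # correlated with coursework
y = course + visual + rng.normal(size=n)     # outcome depends on both

def z(a):
    return (a - a.mean()) / a.std()

Z = np.column_stack([z(course), z(visual)])
betas, *_ = np.linalg.lstsq(Z, z(y), rcond=None)  # standardized weights
yhat = Z @ betas

# structure coefficient = r(predictor, y-hat); its square is the share of
# the explained variance that the predictor could account for on its own
rs = np.array([np.corrcoef(Z[:, j], yhat)[0, 1] for j in range(2)])

print("betas:", betas.round(2))
print("squared structure coefficients:", (rs ** 2).round(2))
print("sum of squared r_s:", (rs ** 2).sum().round(2))  # > 1: shared variance
```

with correlated predictors, the squared structure coefficients sum to more than 1, which is exactly the beta-weights-don't-tell-the-whole-story situation described above.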

(Dr. Henson will probably hand out a version of his writeup to us later.)

**<< CLASS BREAK >>**

## Internal Replicability

preface with "hierarchical decision strategy" handout concepts.

(you can't interpret something that isn't there.)

can also use replicability to determine if something is there.

**what is science?**

- we want to identify principles in the world that we can use to help answer questions, based on observations that are repeatable.
- if the results are not replicable, then my answer does not apply to anyone else in the future (generalizability).

we emphasize alpha (.05) so much because we are so concerned with Type I Error (rejecting the null hypothesis of NO effect when you SHOULDn't)--because that's what can discredit you as an academic.

if your case is unique and non-replicable for other reasons, is it maybe due to sampling error?

internal replicability gives us another way to look at potential sampling error--without spending the time, money, and effort to truly re-run the study. not as good as an external replication, but better than nothing.

**if you can estimate the amount of sampling error in your study, you can adjust the amount of confidence you have in the generalizability of your results.**

(this will apply to a homework assignment we have in a few weeks.)

### 1) Jackknife

- small sample of 20 people. (probably a lot of sampling error; small sample.)
- randomly drop a person from the sample; run the analysis again (regression; comparing R-squared is one of the easy methods of comparison).
- now put that person back, and drop a different person; run the regression again (compare the R-squared to the previous one).
- do this for all possible groups of 19 in that sample of 20.
- create a frequency table or histogram of all the R-squared values; it is NOT a distribution of SCORES; it is hopefully close to a **sampling distribution of the mean**. (theoretical distribution, ??? distribution)--MANY NAMES! it's IMPORTANT!!
- want to know if the mean of your sample is different from the mean of your population--but the mean of the population is usually UNKNOWN.
- different samples would help, but you can't always test a second, different, sample.
- jackknifing is *theoretically* sampling a bunch of different means (but *really* sampling artificial sub-samples of our original sample).
- in this, a statistically significant difference deals with **how likely** one sample mean is compared to other potential sample means.
- if we take sub-samples, these are empirical samples, producing an **empirical distribution** (not a TRUE distribution or sample).
- the **SMALLER the standard deviation**, the **more CONFIDENT** you are in the results (the less variance, the more similar all data is to the mean).
- for a t-test, take the difference between the two group means and divide by the **standard error** (the SD of the sample divided by the square root of N).
- the standard error is the standard deviation of all the theoretical means (all the sample means contained in the population)--the formula is a mathematical way to estimate this, since we can't actually take all samples from the population.
- the reason we do the **JACKKNIFE** thing is to estimate the sampling error in our study. how much sampling error we have indicates the likelihood of obtaining/replicating that result. if **ONE PERSON** has a big effect on the sampling distribution, then that tells us they are a big part of the effect and they need to be looked at separately.
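the jackknife loop above can be sketched like this (simulated n = 20 data; each pass drops one case, reruns the regression, and keeps the R-squared):

```python
import numpy as np

def r_squared(X, y):
    X1 = np.column_stack([np.ones(len(X)), X])   # intercept + predictors
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ b
    return 1 - resid.var() / y.var()

def jackknife_r2(X, y):
    """One leave-one-out R-squared per case in the sample."""
    n = len(y)
    return np.array([r_squared(X[np.arange(n) != i], y[np.arange(n) != i])
                     for i in range(n)])

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 2))                     # fake sample of 20 people
y = X @ np.array([1.0, 0.5]) + rng.normal(size=20)

r2s = jackknife_r2(X, y)
# a small SD across the 20 runs suggests less sampling error; a run that
# jumps far from the rest points at one influential person
print(r2s.mean().round(3), r2s.std().round(3))
```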

### 2) Bootstrap

- similar thing, approached differently.
- what kind of sampling do we assume, usually? RANDOM sampling.
- assume same n = 20 regression analysis as above... instead of dropping one person from a sample...
- we will copy that sample and their data and paste it--doubled the same data. **concatenating the sample** into a mega-sample.
- end up with n = 1,000 (but built off the same 20 people); now you'll **sample** from that file **with replacement**. (you have the potential to draw out the same person each time--TRUE probability.)
- sample with replacement multiple times, and **form a sub-sample** of n = 20. (even possible that all 20 scores are the same person; impossible to say.)
- then form another sub-sample, another, another, etc.
- plot all the **effect sizes** (R-squared) in a **histogram** (similar to above). there is **no limit on how many times you can re-sample** from it--usually very high, like 1,000 times.
- what would it tell us if the standard deviation is very small in that distribution? **LESS sampling error.**
- sampling distribution of R-squareds. (since they are a distribution, you can use confidence intervals just like in usual kinds of distributions.)
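a sketch of the bootstrap (simulated n = 20 data; drawing with replacement directly from the original 20 cases is equivalent to building the n = 1,000 mega-sample and drawing from that):

```python
import numpy as np

def bootstrap_r2(X, y, reps=1000, seed=0):
    """Resample cases with replacement, rerun the regression, keep R-squared."""
    rng = np.random.default_rng(seed)
    n = len(y)
    out = []
    for _ in range(reps):
        idx = rng.integers(0, n, size=n)       # sampling WITH replacement
        Xb, yb = X[idx], y[idx]
        X1 = np.column_stack([np.ones(n), Xb])
        b, *_ = np.linalg.lstsq(X1, yb, rcond=None)
        resid = yb - X1 @ b
        out.append(1 - resid.var() / yb.var())
    return np.array(out)

rng = np.random.default_rng(4)
X = rng.normal(size=(20, 2))                   # fake sample of 20 people
y = X @ np.array([1.0, 0.5]) + rng.normal(size=20)

r2s = bootstrap_r2(X, y)
lo, hi = np.percentile(r2s, [2.5, 97.5])       # empirical 95% interval
print(round(lo, 2), round(hi, 2))              # tight interval = less sampling error
```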

### 3) Double Cross-Validation

- this is what the HOMEWORK ASSIGNMENT is on.
- it's the weakest of the three (less robust), but the easiest to do.
