EPSY6420

Page history last edited by Starr Hoffman 15 years, 2 months ago
  • first class meeting
  • second class meeting
  • project abstract
  • next class meeting...

 

    * reminder about using LaTeX (see class page on Blackboard or Google it)

 

 

Class Instruction Assignment

    * where to find a dataset?

 

    * possible to collect data

 

    * better to use an existing dataset (there are lots around: IPEDS, NSF, etc.)

 

          o difficult to do analyses if you don't have student-level data

 

          o need to take a course in multi-level or hierarchical modelling to know how to use multi-level data

 

          o example: states to school districts to schools to teachers/classes to students (5 levels of data)

 

          o make sure your data supports your research question (appropriate data level)

 

    * dataframes-- if having problems with this, read Ch. 4 of the R Book (p. 107).

 

          o a table (like a matrix) with variable names at the top; unlike a matrix, each column can hold a different type of data

 

          o correct vs. incorrect dataframes: see pg. 108
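A rough sketch of what a correctly laid out dataframe looks like (one row per subject, one column per variable). The name `students` and all the values are made up for illustration; they are not from the R Book or from class.

```r
# Toy dataframe: column names and values are hypothetical.
students <- data.frame(
  id    = 1:4,
  group = c("green", "green", "blue", "blue"),
  score = c(82.5, 90.0, 77.0, 88.5)
)

str(students)     # shows each column and its type
names(students)   # the variable names from the top row
students$score    # pull out one column by name
```

Note that each column keeps its own type (numbers, text), which is the main difference from a plain matrix.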

 

    * 1 hour 15 minutes at most

 

    * be focused

 

    * brief abstract and topic before next class (DO THIS ON THURSDAY/FRIDAY!!!)

 

    * be sure the class knows everything you did

 

          o give the steps, including organizing the data

 

          o can leave problems open for discussion--give hints and have the class try to figure it out

 

          o explain where your dataset came from, what your research question is

 

          o output is very important

 

          o have class generate output; have them alter the output if there is time

 

          o try to get us into chapter 5 of the R Book (READ THIS for next week--graphics chapter)

 

    * NOT WORKING ON MAC FOR SOME REASON -- CHECK THIS

 

    * copy the .txt file content, paste into R console (NOTE FOR MACS: take out all > and + at beginning of lines)

 

    * to display all color names, type in R console: "colors()"
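A small sketch of listing the built-in color names. The function is `colors()` (with an "s"); the variable name `all_colors` is just a label chosen here.

```r
# colors() returns a character vector of the color names R knows about.
all_colors <- colors()

head(all_colors)        # first few names
length(all_colors)      # how many names there are
"red" %in% all_colors   # check whether a given name is available
```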

 

    * back to presentation...

 

          o descriptive stats first

 

          o then do statistical inference

 

    * Dr. Hull will assess us based on a form (see handout) (see presentation)

 

          o if you have a reading for the class, assign it in advance (week early--email to group in Blackboard)

 

          o lecture is used to: summarize scattered material, adapt to audience, provide structure, motivate, model ways of approaching a problem

 

          o helps show everyone the problems you ran into, and how you solved them

 

    * improving your lecture

 

          o get attention with pitch, gestures, movement, visuals, etc.

 

          o help the class: analyze material, formulate a problem, develop hypothesis, etc. (see presentation)

 

    * intro, body, conclusion (see presentation)

 

    * you can include *more* in your paper than you do in your lecture (additional statistical considerations, etc.)

G*Power -- free software -- go to the G*Power 3 website

 

-- show statistical power, effect sizes

 

-- larger effects can be shown in small groups--smaller effects need larger-sized groups to be exhibited
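The same idea can be checked inside R without G*Power, using `power.t.test()` from the built-in stats package. This is only a sketch for a two-group t-test; the effect sizes (0.8 and 0.2 standard deviations), the 0.05 significance level, and the 80% power target are assumptions picked for illustration.

```r
# Required group size for a large vs. a small effect (two-sample t-test).
large <- power.t.test(delta = 0.8, sd = 1, sig.level = 0.05, power = 0.8)
small <- power.t.test(delta = 0.2, sd = 1, sig.level = 0.05, power = 0.8)

ceiling(large$n)   # subjects needed per group for the large effect
ceiling(small$n)   # subjects needed per group for the small effect
```

The small effect needs far more subjects per group than the large one.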

----------------------------

(meeting with groups: I'm in GREEN group)

to post to group: PDF or Word file:

 

- name:

 

- research question

 

- abstract (less than 250 words)

 

- group: green

-- dataset available

 

characteristics of that dataset; how many subjects, what about subjects, what variables

 

(include in the abstract)

 

describe how your procedure addresses your research question

each group needs to nominate one person to begin next week

 

can start creating semester schedule tonight

----------------------------

GROUP DISCUSSION

the important part is selecting a dataset (it's MORE about R than it is about the DATA)

I'm going on March 3, topic to be decided

---------------------

don't use spaces inside variable names or data values--use periods or another character

 

spaces make R jump to another column (see worms2.txt for an example)

 

create file in Excel, save as .csv comma separated values file (for a .csv file, use read.csv() or add sep="," to read.table())

need to type in the file pathname (complete with drive and all; in Windows you can get this by copy/pasting in Properties)

in R console, we're going to read the file into an object named "worms"

worms <- read.table("PATHNAME", header=TRUE, row.names=1)

to find the pathname, i went to Finder, clicked on the file, then clicked to the right on "more info" and it comes up under  "general: where"

 

in a Windows pathname, double every backslash (\\), because a single backslash starts an escape sequence inside an R string; forward slashes (as in Mac pathnames) can be left alone

 

then hit enter (nothing will happen, that's GOOD)

 

then if you type in "worms" it will display the data (table)

-- things ARE case-sensitive in R --

 tell R to attach the dataframe so you can refer to its columns by name:

attach(worms)

 

names(worms)

this will bring up the column names (which are the variable names)

 

to get descriptive stats, type:

summary(worms)
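A self-contained sketch of the steps above, so it can be run without the class file. It writes a tiny made-up file to a temporary location first; the field names and numbers are hypothetical, not the real worms data, and `worms_demo` is a name chosen here to avoid clobbering `worms`.

```r
# Write a small whitespace-separated file (no spaces inside names or values).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Field.Name\tArea\tWorm.density",
  "Field.A\t3.6\t4",
  "Field.B\t5.1\t7",
  "Field.C\t2.8\t2"
), path)

# header = TRUE: first line holds the variable names
# row.names = 1: first column holds the row labels
worms_demo <- read.table(path, header = TRUE, row.names = 1)

worms_demo            # display the table
names(worms_demo)     # column (variable) names
summary(worms_demo)   # descriptive stats for each column

# Columns can also be reached with $ instead of attach()
mean(worms_demo$Area)
```

Using `worms_demo$Area` instead of `attach()` is a design choice: it avoids confusion if two dataframes have columns with the same name.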

if a statement is incomplete when you hit enter, R shows a + prompt so you can continue on the next line--you don't type the + yourself

 

hit the "ESC" key to get rid of the plus and get back to the > prompt (R book page 9)

up arrow brings up the last thing you typed (hit it multiple times to go back further--you can edit the statements)

------------

Central Tendency

yvalues -- just like with worms, connect the name and the file

attach()

 

names()

 

summary()

 

mean()

 

hist()

 

sd()

 

pie()

 

plot()

sd = standard deviation

 

pie = piechart
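A quick sketch of those functions on a plain vector, in case the yvalues file isn't handy. The numbers in `yvals` and the group labels are made up for illustration.

```r
# When run as a script (not at the console), send plots to a null device.
if (!interactive()) pdf(NULL)

yvals <- c(4, 7, 2, 5, 9, 6, 8, 0, 4, 5)   # hypothetical data

mean(yvals)   # arithmetic mean
sd(yvals)     # standard deviation
hist(yvals)   # histogram of the values
plot(yvals)   # values plotted in order

# pie() wants counts, so tabulate a categorical variable first
groups <- c("green", "green", "blue", "red")
pie(table(groups))

if (!interactive()) dev.off()
```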

--------------------

open the file Simulation (2).R  using TextEdit (to view)

 

copy and paste code into the R console

-- How to IMPUTE MISSING DATA --

SPSS is good for organizing data, but not as good for other tasks

putting the mean in for missing data shrinks your variance -- that's bad

 

regression imputation -- predicts from a regression line

 

multiple imputation is even more powerful -- it creates several completed datasets (Amelia makes five by default), you run your analysis on each one, and the results are pooled
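A small sketch of why filling in the mean is a problem: the filled-in values sit exactly at the center, so the spread looks smaller than it really is. The vector `y` is made up for illustration.

```r
y <- c(2, 4, 6, 8, 10, NA, NA)   # hypothetical data with two missing values

# Variance using only the observed values
var_observed <- var(y, na.rm = TRUE)

# Mean imputation: replace each NA with the observed mean
y_filled <- y
y_filled[is.na(y_filled)] <- mean(y, na.rm = TRUE)
var_mean_imputed <- var(y_filled)

var_observed       # variance before filling in
var_mean_imputed   # smaller variance after filling in the mean
```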

install a package in R, then load it

 

"packages" menu

 

pick a mirror site

 

install "Amelia"

 

go back up to "packages" and refresh list, then click on "not loaded" to load it

 

then type (when it's done):

 

AmeliaView()

 

 

 

browse:  cdmse class version .CSV file

 

click "load data"

click "variables" to look at them--select "ID variables" for subjectID , sex, and group

 

the rest of the "items" are fine

 

click "ok"

 

then click "run Amelia"

Google Amelia software website, documentation is there on how to impute missing data
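The menu steps above can also be done by typing commands. This is only a sketch, assuming the Amelia package is installed; the toy dataframe stands in for the class .CSV file (which isn't reproduced here), and the column names `item1` to `item3` are hypothetical.

```r
# install.packages("Amelia")   # one-time install, same as the "packages" menu

n_imputations <- NA   # stays NA if Amelia is not installed

if (requireNamespace("Amelia", quietly = TRUE)) {
  set.seed(1)
  n <- 100
  item1 <- rnorm(n)
  toy <- data.frame(
    subjectID = 1:n,
    sex       = sample(c("F", "M"), n, replace = TRUE),
    group     = sample(c("a", "b"), n, replace = TRUE),
    item1     = item1,
    item2     = item1 + rnorm(n),
    item3     = item1 + rnorm(n)
  )
  # Knock out some values so there is something to impute
  toy$item2[sample(n, 10)] <- NA
  toy$item3[sample(n, 10)] <- NA

  # idvars: columns to carry along but not use in the imputation model
  # m = 5: number of completed datasets; p2s = 0: no progress printing
  out <- Amelia::amelia(toy, m = 5,
                        idvars = c("subjectID", "sex", "group"), p2s = 0)

  n_imputations <- length(out$imputations)
  summary(out$imputations[[1]]$item2)   # first completed dataset, no NAs
}
```

Each element of `out$imputations` is one completed copy of the data; the analysis is then run on each copy and the results combined.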
