SPSS Guide Page 1 of 13

A Guide to SPSS for Public Affairs Students

This is intended as a handy how-to guide for most of what you might want to do in SPSS. First, here is what a typical data set might look like in SPSS.

What: Copy variables to a new data set
Why: So you can focus on analyzing the variables you care about

Open the original data set, which has the hundreds and hundreds of variables you're starting with. Then go to the Start button in the lower left-hand corner of the screen, click on it, and start up a new copy of SPSS. To do this you will probably have to go through the Start/Programs/SPSS process. You will then have two copies of SPSS running on your computer. Highlight the variables you want in the first copy, copy them (Edit/Copy), click over to the second open SPSS data window, highlight the same number of columns, and paste them in.

What: Save your data or your output
Why: So you can carry your data or your output around with you like a tiny friend

From the File menu, choose Save or Save As while you are looking at either your data (if you want to save your data) or your output (if you want to save that). Indicate the name you would like to give the file and where you would like to have it saved.

What: Switch from variable view to data view
Why: So you can switch from seeing a list of variables to seeing your data

From the View menu, choose Variables or Data. Choosing Variables will allow you to see a list of your variables; choosing Data will allow you to see all of your data.

What: Select cases
Why: So you can get rid of silly people who didn't answer the survey correctly, or so you can focus on one sub-group of your data set, such as left-handed men

From the Data menu, choose Select Cases. Click on the dot next to If condition is satisfied, then click on the button just below If condition is satisfied. From there, select the variables that will determine which cases (a.k.a. observations) you want by highlighting them in the left-hand side of the window and clicking the arrow to bring them to the large rectangle on the right-hand side of the window. Construct the proper conditions in the window. For example, if you want to look only at people who actually reported a non-negative age, you would click on the Age variable, click on the button pointing to the right, and type >=0 next to it, to indicate you only want cases where Age>=0. You may combine multiple conditions using the & character, so you could choose only cases of males between the ages of 30 and 40 who are left-handed. After your conditions are specified, click on the Continue button, indicate that you want to have the Unselected Cases filtered, and click OK. If you then look at your data, you should see lines through the row numbers for the cases you have filtered out.
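If it helps to see the logic behind Select Cases spelled out, here is a minimal sketch in Python (this is not SPSS syntax). The records and field names (age, sex, left_handed) are made up for illustration. The only point is that cases failing the condition are set aside rather than deleted, and that several conditions can be joined the way & joins them in the dialog.

```python
# Hypothetical survey records; the field names are illustrative only.
cases = [
    {"id": 1, "age": 34, "sex": "M", "left_handed": True},
    {"id": 2, "age": -9, "sex": "F", "left_handed": False},  # invalid age
    {"id": 3, "age": 38, "sex": "M", "left_handed": False},
    {"id": 4, "age": 52, "sex": "M", "left_handed": True},
    {"id": 5, "age": 31, "sex": "F", "left_handed": True},
]


def select_cases(rows, condition):
    """Split rows into (selected, filtered_out) without deleting anything."""
    selected = [r for r in rows if condition(r)]
    filtered_out = [r for r in rows if not condition(r)]
    return selected, filtered_out


# Single condition: Age >= 0
valid_age, dropped = select_cases(cases, lambda r: r["age"] >= 0)

# Several conditions joined with "and", the counterpart of & in the dialog
target, _ = select_cases(
    cases,
    lambda r: r["sex"] == "M" and 30 <= r["age"] <= 40 and r["left_handed"],
)

print([r["id"] for r in valid_age])  # ids of cases with a non-negative age
print([r["id"] for r in target])     # ids of left-handed males aged 30 to 40
```

The filtered-out list plays the role of the rows with lines through their row numbers: still in the data set, just excluded from the analysis.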

What: Compute new variables based on old ones
Why: So you can do things like calculate the percentage of the vote from each Florida county that went for each candidate

From the Transform menu, choose Compute. In the box Target Variable, type the name of the new variable you want to generate, remembering that you're limited to an eight-character name. In the box below Target Variable, click on the variables you want to use to generate your new variable and use the arrow button to move them to the Numeric Expression box. In the Numeric Expression box, assemble the formula for your new variable. For example, if you wanted the total number of votes from each Florida county, it would be the number of votes for the first candidate plus the number of votes for the second candidate and so on. Then click OK. If you go back to your data, you should see the new variable you created in the column furthest to the right.

What: Generate frequency tables
Why: So you can see absolute or relative frequencies of different categories of a qualitative variable

From the Analyze menu, choose Descriptive Statistics and then Frequencies. Click on the variables you want and use the arrow button to move them to the right-hand side of the window. You may use the Statistics button to choose what statistics to generate for your chosen variables and the Charts button to choose what kind of charts to generate, if you choose to do so. For qualitative data, I would seriously suggest making Bar Charts, which I will also describe below. Then click OK. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.
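To make the columns of a frequency table less mysterious, here is a rough sketch in Python (not SPSS) of how Frequency, Percent, Valid Percent, and Cumulative Percent relate to one another. The brand values are invented, and using None to mark a missing answer is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical values of a qualitative variable; None marks a missing answer.
brand = ["C", "N", "P", "C", "S", None, "C", "P", "N", "C"]


def frequency_table(values):
    """Rows of (category, frequency, percent, valid percent, cumulative percent)."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows, cumulative = [], 0.0
    for category in sorted(counts):
        freq = counts[category]
        percent = 100.0 * freq / len(values)       # share of all cases
        valid_percent = 100.0 * freq / len(valid)  # share of non-missing cases
        cumulative += valid_percent                # running total of valid percent
        rows.append((category, freq, percent, valid_percent, cumulative))
    return rows


table = frequency_table(brand)
for category, freq, pct, valid_pct, cum_pct in table:
    print(f"{category:<3}{freq:>4}{pct:>8.1f}{valid_pct:>8.1f}{cum_pct:>8.1f}")
```

Percent and Valid Percent only differ when some answers are missing, which is why the two columns are often identical in clean data.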

[Sample output: Frequencies. A Statistics table (N Valid and N Missing for BRAND and PART), followed by frequency tables for BRAND (categories C, N, P, S, Total) and PART (categories BC, BL, C, F, P, S, SP, WBC, Total), each with the columns Frequency, Percent, Valid Percent, and Cumulative Percent.]

What: Generate descriptive statistics
Why: To describe quantitative variables

From the Analyze menu, choose Descriptive Statistics and then Descriptives. Click on the variables you want and use the arrow button to move them to the window on the right. Then click on OK. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.

What: Generate a crosstab table
Why: To see the relative frequencies of occurrences of pairs of qualitative variables

From the Analyze menu, choose Descriptive Statistics and then Crosstabs. By clicking on the variables and then using the arrow buttons, indicate which variable you want in rows and which in columns, and then click the OK button. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.
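A crosstab is just a count of how often each pair of values shows up together. The following Python sketch (not SPSS) shows the idea under made-up data; the pairs below are invented and only borrow the category labels from the sample output.

```python
from collections import Counter

# Hypothetical paired observations: (row variable, column variable).
pairs = [("C", "BC"), ("C", "BL"), ("N", "BC"), ("P", "BC"),
         ("C", "BC"), ("N", "BL"), ("P", "BL"), ("C", "BC")]


def crosstab(observations):
    """Count how often each (row, column) pair occurs."""
    counts = Counter(observations)
    row_labels = sorted({r for r, _ in observations})
    col_labels = sorted({c for _, c in observations})
    table = {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}
    return table, row_labels, col_labels


table, rows, cols = crosstab(pairs)
total = len(pairs)

# Print the table of counts, one row per category of the row variable.
print("     " + "".join(f"{c:>6}" for c in cols))
for r in rows:
    print(f"{r:<5}" + "".join(f"{table[r][c]:>6}" for c in cols))

# Relative frequency of one cell, as a share of all cases.
share_c_bc = table["C"]["BC"] / total
```

Dividing a cell by the grand total gives its relative frequency; dividing by a row or column total instead gives row or column percentages.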

What: Do a hypothesis test
Why: To see if the mean value of a variable is significantly different for two groups

From the Analyze menu, choose Compare Means and then Independent Samples T-test. Click on the variable whose mean you want to compare across groups and use the arrow button to move it to the Test Variable(s) window. Then click on the variable that will define the different groups and use the other arrow button to move it to the Grouping Variable window. You will probably also need to click on Define Groups to tell SPSS which values of the grouping variable put observations into the first and second groups to be compared. Click OK. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.

What: Generate correlation coefficients
Why: To examine the relationships between quantitative variables

From the Analyze menu, choose Correlate and then Bivariate. Click on the variables you want to calculate correlation coefficients for and use the arrow button to move them to the Variables box. You should probably stick with the Pearson version of this. If you like, you can get covariances by clicking on the Options button. Then click OK. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.

What: Generate a histogram
Why: To show the distribution of a quantitative variable

From the Graphs menu, choose Histogram. Indicate the variable you want to examine and use the arrow button to move it to the Variable box. Then click OK. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.

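For the hypothesis test described above (Analyze/Compare Means), here is a minimal Python sketch (not SPSS) of the arithmetic behind a two-group t statistic. It assumes equal variances in the two groups, which is only one of the versions a statistics package may report, and the two groups of numbers are invented. It stops at the t statistic and degrees of freedom and does not compute a p-value.

```python
import math
import statistics


def pooled_t(group1, group2):
    """t statistic and degrees of freedom, assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    # statistics.variance is the sample variance (divides by n - 1).
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error, n1 + n2 - 2


# Made-up measurements for the two groups named in Define Groups.
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [16.0, 18.0, 17.0, 15.0, 19.0]

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A t statistic far from zero, in either direction, is what leads to a small Sig. value in the output.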
What: Generate bar charts
Why: To show the frequencies or relative frequencies of different values of a qualitative variable

From the Graphs menu, choose Bar and then Simple. Please avoid the more complex versions unless you have a darn good reason. Click on the Define button. Click on the qualitative variable you want to examine and then use the arrow button to move it to the Category Axis box. You may also indicate what you want the bars' heights to represent. Then click on OK. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.

[Figure: "Histogram of Parts". Count (vertical axis) by PART (horizontal axis), with categories BC, BL, C, F, P, S, SP, WBC.]

What: Generate scatter plots
Why: To show the relationship between pairs of quantitative variables

From the Graphs menu, choose Scatter. Choose Simple and then click on the Define button. Click on the variable you want on the y-axis and use the arrow button to move it to the Y Axis box. Click on the variable you want on the x-axis and use the arrow button to move it to the X Axis box. Click on the OK button. In the output window, you should see the results. At this point you should click on the title of your output in the output window and add some descriptive notes.

Everything You Always Wanted to Know About Regressions But Were Afraid to Ask

1. What is a regression?

A regression is a statistical way of estimating a relationship between a dependent variable and one or more explanatory or independent variables. For example, you may believe that a person's annual health care expenditures depend on their age. Annual health care expenditures are the dependent variable and age is the explanatory variable. The equation you would like to estimate is a linear relationship between health care expenditures and age:

EXPS = β₀ + β₁ AGE

If you have observations on a large number of people's ages and annual health care expenditures, then you can use regression analysis to estimate the values of β₀ and β₁ in the above equation. Using the MEPS data, this relationship is estimated as

EXPS = [...] + [...]*AGE (EXPS is Total Health Care Charges, 96, Excl. RX)

This means that the predicted health care expenditures for a person of age zero are $[...] and that expected health care expenses increase by $[...] as a person ages by one year. For example, the predicted health care expenses for a person of age 30 would be

EXPS = [...] + [...]*30 = [...]

If you wanted to include gender in this, you might generate a male dummy variable and estimate the equation

EXPS = β₀ + β₁ AGE + β₂ MALE

Again from the MEPS data, this is estimated as

EXPS = [...] + [...]*AGE + [...]*MALE

suggesting that expenses rise by about $[...] per year of life and that men have expenses of about $[...] more than women, correcting for age. The predicted expenses for a 30-year-old male would be $[...].
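To show what "estimating β₀ and β₁" amounts to, here is a small Python sketch (not SPSS) that fits a straight line by least squares and then makes a prediction for a 30 year old. The ages and expenditures are invented for the example and are not the MEPS figures, so the resulting coefficients say nothing about real health care costs.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx             # slope: change in y per one-unit change in x
    b0 = mean_y - b1 * mean_x  # intercept: predicted y when x is zero
    return b0, b1


# Made-up ages and annual expenditures, NOT the MEPS data.
age = [20, 30, 40, 50, 60]
exps = [400.0, 900.0, 1000.0, 1500.0, 1700.0]

b0, b1 = fit_line(age, exps)
predicted_30 = b0 + b1 * 30  # plug AGE = 30 into EXPS = b0 + b1 * AGE
print(f"EXPS = {b0:.1f} + {b1:.1f}*AGE, prediction at age 30: {predicted_30:.1f}")
```

The prediction step is the same plug-in calculation as in the text: multiply the slope by the age and add the intercept.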

2. What do these things look like?

Basically, regression analysis lets you draw the best possible line through some data. To show this in two dimensions (one explanatory variable), consider the following diagrams:

A regression would allow you to find the best value for the intercept and the slope of a straight line through the data:

3. O.K., let's cut to the chase. How would I do one of these in SPSS?

Choose Analyze/Regression/Linear.

Then specify your dependent variable and explanatory variable or variables. Don't do anything fancy at first; just do a straight linear regression with your quantitative dependent variable and quantitative or dummy explanatory variables.

4. Splendid, now how do I read the output?

Here's the output from the regression (EXPS regressed on AGE) above:

[SPSS output: a Model Summary table (R = .159, plus R Square, Adjusted R Square, and Std. Error of the Estimate); an ANOVA table (Sum of Squares, df, Mean Square, F, and Sig. for Regression, Residual, and Total); and a Coefficients table (unstandardized B and Std. Error, standardized Beta, t, and Sig. for the Constant and AGE). Predictors: (Constant), AGE-RD1 (EDITED/IMPUTED). Dependent Variable: TOTAL HEALTH CARE CHARGES 96, EXCL. RX.]

The highlights are:

From the Model Summary Table:

1. R Square
This tells you how much of the variation in the dependent variable is explained by the model. Basically, it's a measure of the explanatory power of the model and ranges from 0 (total crap) to 1 (suspiciously perfect). It is generally true that, for a given dependent variable, the higher the R Square the better. There is no standard for what a good R Square number is, but the 0.025 (or 2.5%) we have here is pretty much crap and suggests that age does a poor job of explaining expenses.

2. Adjusted R Square

This is the R Square number adjusted for the number of explanatory variables. Adding an explanatory variable, even one that is unrelated to the dependent variable, will never lower the R Square and will almost always raise it a little, but the adjusted R Square will let you know whether the new variable really improved the model. If the adjusted R Square rises when a new explanatory variable is added, that new variable may well belong in the model.

From the ANOVA Table:

1. Sig.
This is the p-value for a hypothesis test in which the null hypothesis is, "This model explains none of the variation in the dependent variable." Put more clearly, the null hypothesis is, "This model is total crap." We get a value of 0.000 here, meaning that we reject the null and the model is not total crap. It's not great (see the low R Square) but it's not total crap.

From the Coefficients Table:

1. B
These are the estimated coefficients attached to the explanatory variables. See the above regressions for illustrations of what this means.

2. t
These are t-stats for a hypothesis test in which the null hypothesis is that the coefficient on this variable is zero.

3. Sig.
These are p-values for a hypothesis test in which the null hypothesis is that the coefficient on this variable is zero. Small values here mean that the explanatory variable has a statistically significant impact on the dependent variable. In this case, the p-value attached to AGE is 0.000, so AGE has a significant impact on health care expenditures.
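For readers who want to see where R Square and Adjusted R Square come from, here is a short Python sketch (not SPSS) using the usual textbook formulas. The observed and fitted values are invented and have nothing to do with the MEPS regression; the one number taken from the output above is R = .159, which squares to the 2.5% quoted in the text.

```python
def r_squared(y, fitted):
    """Share of the variation in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - ss_resid / ss_total


def adjusted_r_squared(r2, n, k):
    """Penalize R Square for k explanatory variables given n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Made-up observed values and the fitted values from some one-variable model.
observed = [400.0, 900.0, 1000.0, 1500.0, 1700.0]
fitted = [460.0, 780.0, 1100.0, 1420.0, 1740.0]

r2 = r_squared(observed, fitted)
adj_r2 = adjusted_r_squared(r2, n=len(observed), k=1)
print(f"R Square = {r2:.4f}, Adjusted R Square = {adj_r2:.4f}")

# The Model Summary reports R = .159; squaring it gives the R Square.
r2_from_output = 0.159 ** 2
print(f"{r2_from_output:.3f}")
```

Because the penalty grows with the number of explanatory variables, the adjusted figure is never larger than the plain R Square, which is what makes it useful for judging whether an added variable earned its place.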