






































This document describes the steps to perform statistical analysis on experimental data using IBM SPSS Statistics software. The analysis includes calculating cumulative probabilities, performing simple linear regression and correlation, and conducting a Chi-Square Test of Independence. The data used for analysis includes variables such as protein, number, cum_bin, bin_prob, cum_pois, pois_pro, cum_norm, int_norm, inv_norm, s_ns, cadmium, lmci, umci, lny, education, policy favored, count, m, mexposed, and munexposed.
SPSS Help. SPSS has a good online help system. Once SPSS is up and running, you can find it by going to Help>Topics in the menu bar, i.e., click Help in the menu bar and then click Topics in the drop window that opens. You will now be in the help contents window. Click Tutorial.
*Brother Walter Schreiner, FSC (July 1, 2011)
You can then open any of the books comprising the tutorial by clicking on the + to get to the various subtopics. Once a subtopic is open, you can just keep clicking on the right and left arrows to move through it page by page. I suggest going through the entire Overview booklet. Once you are working with a data set, and have an idea of what you want to do with the data, you can also use the Statistics Coach under the Help menu to help get the information you wish. It will lead you through the SPSS process.

Using the SPSS Data Editor. When you begin SPSS, you open up to the Data Editor. For our purposes right now, you can learn how to do this by going to Help>Tutorial>Using the Data Editor, and then working your way through the subtopics. The data we will use is given in the table below, with the numbers indicating total protein (μg/ml).

76.33 77.63 149.49 54.38 55.47 51.
78.15 85.40 41.98 69.91 128.40 88.
58.50 84.70 44.40 57.73 88.78 86.
54.07 95.06 114.79 53.07 72.30 59.
76.33 77.63 149.49 54.38 55.47 51.
59.20 67.10 109.30 82.60 62.80 61.
74.78 77.40 57.90 91.47 71.50 61.
106.00 61.10 63.96 54.41 83.82 79.
153.56 70.17 55.05 100.36 51.16 72.
62.32 73.53 47.23 35.90 72.20 66.
59.76 95.33 73.50 62.20 67.20 44.
57.

For our data, double click on the var at the top of the first column or click on the Variable View tab at the bottom of the page, type in “protein” in the Name column, and hit Enter. Under the assumption that you are going to enter numerical data, the rest of the row is filled in. Changes in the type and display of the variable can be made by clicking in the appropriate cells and using any buttons given. Then hit the Data View tab and type in the data values, following each by Enter. Save the file as usual where you wish under the name protein.sav. You need only type protein; the suffix is attached automatically.
Click the Statistics... button, then make sure Descriptives and Percentiles are checked. We will use 95% for Confidence Interval for Mean. Click Continue. Then click Plots.... Under Boxplots, select Factor levels together, and under Descriptive, choose both Stem-and-leaf and Histogram. Then click Continue.
Then click OK. This opens an output window with two frames. The frame on the left contains an outline of the data on the right.
Now click on a number on the horizontal axis and then click on Number Format. In the diagram to the left below, we see that we have 2 decimal places. The values in this window can be changed as desired. Next, click on one of the bars and then Binning in the Properties window. Suppose we want bars of width 20 beginning at
Then click back to Data View. From the menu, choose Transform>Compute Variable.... When the Compute Variable window comes up, click Reset, and type cum_bin in the box labeled Target Variable. Scroll down the Function group: window to CDF & Noncentral CDF to select it, then scroll to and select Cdf.Binom in the Functions and Special Variables: window. Then press the up arrow. We need to fill in the three arguments indicated by question marks. The first is the x. This is given by the number column. At this point, the first question mark should be highlighted. Click on number in the box on the left to highlight it, then hit the right arrow to the right of that box. Now highlight the second question mark and type in 15 (our n), and then highlight the third question mark and type in .75 (our p). Hit OK. If you get a message about changing the existing variable, hit OK for that too. The cumulative binomial probabilities are now found in the column cum_bin. Now we want to put the individual binomial probabilities into the column bin_prob. Do basically the same as the above, except make the Target Variable “bin_prob,” and the Numeric Expression “CDF.BINOM(number,15,.75) - CDF.BINOM(number-1,15,.75).” The Data View now looks like the table at the top of the next page, with the cumulative binomial probabilities in the second column and the individual binomial probabilities in the third column.
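As a cross-check outside SPSS, the same two columns can be sketched in plain Python. This is only an illustration using the standard library; the helper names binom_pmf and binom_cdf are my own and are not SPSS functions. The individual probability is obtained exactly as in the Numeric Expression above, as the difference of two cumulative values.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P(X <= x), the quantity that CDF.BINOM(x, n, p) computes."""
    return sum(binom_pmf(k, n, p) for k in range(0, min(x, n) + 1))


n, p = 15, 0.75  # the n and p used in the text

# Each row mirrors the Data View: number, cum_bin, bin_prob.
rows = []
for number in range(n + 1):
    cum_bin = binom_cdf(number, n, p)
    # Same difference as CDF.BINOM(number,15,.75) - CDF.BINOM(number-1,15,.75)
    bin_prob = cum_bin - binom_cdf(number - 1, n, p)
    rows.append((number, cum_bin, bin_prob))

for number, cum_bin, bin_prob in rows:
    print(f"{number:2d}  {cum_bin:.10f}  {bin_prob:.10f}")
```

The last cumulative value should be 1, and the bin_prob column should sum to 1, which is a quick way to confirm the columns were computed correctly.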
Poisson Distribution. Let us assume that λ = .5. We will first find P(X ≤ x | .5) for x = 0, ..., 15, i.e., the cumulative probabilities. First put the numbers 0 through 15 in a column of a worksheet. (We have already done this above. Again, you only need to enter the numbers whose cumulative probability you desire.) Then click Variable View, type in number (we have done this above and the name you choose is optional) under Name, and I suggest putting in 0 for Decimal. Still in Variable View, put the names cum_pois and pois_pro in new rows under Name, and set Width to 12, Decimal to 10, and Columns to 12 for each of these. Then click back to Data View. From the menu, choose Transform>Compute Variable.... When the Compute Variable window comes up, click Reset, and type cum_pois in the box labeled Target Variable. Scroll down the Function group: window to CDF & Noncentral CDF to select it, then scroll to and select Cdf.Poisson in the Functions and Special Variables: window. Then press the up arrow. We need to fill in the two arguments indicated by question marks. The first is the x. That is given by the number column. At this point, the first question mark should be highlighted. Click on number in the box on the left to highlight it, then hit the right arrow to the right of that box. Now highlight the second question mark and type in .5 (our λ). Then hit OK. If you get a message about changing the existing variable, hit OK for that too. The
The probability is now found in the column cum_norm. Staying with the normal distribution with mean 100 and standard deviation 20, suppose we wish to find P(90 ≤ X ≤ 135). Do as above except make the Target Variable “int_norm,” and the Numeric Expression “CDF.NORMAL(135,100,20) - CDF.NORMAL(90,100,20).” The probability is now found in the column int_norm. Continuing to use a normal distribution with mean 100 and standard deviation 20, suppose we wish to find x such that P(X ≤ x) = .6523. Again, do as above except make the Target Variable “inv_norm,” and the Numeric Expression “IDF.NORMAL(.6523,100,20)” by choosing Inverse DF under Function Group: and Idf.Normal under Functions and Special Variables:. The x-value is now found in the column inv_norm. From the table below we see that for the normal distribution with mean 100 and standard deviation 20, P(X ≤ 135) = .9599 and P(90 ≤ X ≤ 135) = .6514. Finally, if P(X ≤ x) = .6523, then x = 107.8307.
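The three normal-distribution calculations can also be sketched in Python with the standard library's statistics.NormalDist, as a rough check on the SPSS columns. The variable names simply reuse the SPSS target variable names; nothing here is part of SPSS itself.

```python
from statistics import NormalDist

# Normal distribution with mean 100 and standard deviation 20, as in the text.
dist = NormalDist(mu=100, sigma=20)

# Counterpart of CDF.NORMAL(135,100,20): P(X <= 135)
cum_norm = dist.cdf(135)

# Counterpart of CDF.NORMAL(135,100,20) - CDF.NORMAL(90,100,20): P(90 <= X <= 135)
int_norm = dist.cdf(135) - dist.cdf(90)

# Counterpart of IDF.NORMAL(.6523,100,20): the x with P(X <= x) = .6523
inv_norm = dist.inv_cdf(0.6523)

print(f"cum_norm = {cum_norm:.4f}")
print(f"int_norm = {int_norm:.4f}")
print(f"inv_norm = {inv_norm:.4f}")
```

The printed values should agree, to rounding, with the probabilities and x-value quoted above.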
A Single Population Mean. We found earlier that the sample mean of the data given on page 2, which you may have saved under the name protein.sav, is 73.3292 to four decimal places. We wish to test whether the mean of the population from which the sample came is 70 as opposed to a true mean greater than 70. We test H0: μ = 70 versus Ha: μ > 70. From the menu, choose Analyze>Compare Means>One-Sample T Test. Select protein from the left-hand window and click the right arrow to move it to the Test Variable(s) window. Set the Test Value to 70. Click on Options. Set the Confidence Interval to 95% (or any other value you desire). Then click Continue followed by OK. You get the following output.
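For readers who want to see what lies behind the t value in the One-Sample T Test output, here is a minimal Python sketch of the statistic t = (sample mean − test value) / (s / √n) with n − 1 degrees of freedom. The function name one_sample_t and the five sample values are hypothetical, chosen only for illustration; they are not the protein data. The sketch stops at t and the degrees of freedom, since the p-value requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import fmean, stdev


def one_sample_t(sample, test_value):
    """Return (t, df) for testing H0: mu = test_value.

    t = (mean - test_value) / (s / sqrt(n)), with df = n - 1,
    where s is the sample standard deviation.
    """
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    t = (fmean(sample) - test_value) / standard_error
    return t, n - 1


# Hypothetical values for illustration only (NOT the protein.sav data).
sample = [72.0, 75.5, 68.0, 80.0, 71.5]
t, df = one_sample_t(sample, test_value=70)
print(f"t = {t:.4f} with {df} degrees of freedom")
```

For the one-sided alternative Ha: μ > 70, a large positive t counts as evidence against H0; SPSS reports a two-tailed Sig. value, which is halved for the one-sided test when t has the expected sign.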
and again press Add. Then hit OK and complete the Variable View as follows. Returning to Data View gives a window whose beginning looks like that below. Now we wish to test the hypotheses H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 ≠ 0, where μ1 refers to the population mean for the non-smokers and μ2 refers to the population mean for the smokers. From the menu, choose Analyze>Compare Means>Independent-Samples T Test, and in the window that comes up, move cadmium to the Test Variable(s) window, and s_ns into the Grouping Variable window. Notice the two question marks that appear. Click on Define Groups..., put in 1 for Group 1 and 2 for Group 2.
Then click Continue. As before, click Options..., enter 95 (or any other number) for Confidence Interval, and again click Continue followed by OK. The first table of output gives the descriptives. To get the second table as it appears here, I first double-clicked on the Independent Samples Test table, giving it a fuzzy border and bringing us into the table editor, and then chose Pivot>Transpose Rows and Columns from the menu. In interpreting the data, the first thing we need to determine is whether we are assuming equal variances. Levene's Test for Equality of Variances is an aid in this regard. Since the p-value of Levene's test is p =. for a null hypothesis of all variances equal, in the absence of other information we have no strong evidence to
The first output table gives the descriptives and a second (not shown here) gives a correlation coefficient. From the third table, which has been pivoted to interchange rows and columns, we see that we have a t-score of 12.740. The fact that Sig.(2-tailed) is given as .000 really means that it is less than .001. Thus, for our one-sided test, we can conclude that p < .0005, so that in almost any situation we would reject the null hypothesis. We also see that the mean of the weight losses for the sample is 22.5889, with a 99% Confidence Interval of the Difference (the mean weight loss for the population from which the sample was drawn) being (16.6393, 28.5384).
For data, we will use percent predicted residual volume measurements as categorized by smoking history.

Never: 35, 120, 90, 109, 82, 40, 68, 84, 124, 77, 140, 127, 58, 110, 42, 57, 93
Former: 62, 73, 60, 77, 52, 115, 82, 52, 105, 143, 80, 78, 47, 85, 105, 46, 66, 95, 82, 141, 64, 124, 65, 42, 53, 67, 95, 99, 69, 118, 131, 76, 69, 69
Current: 96, 107, 63, 134, 140, 103, 158

We will place the volume measurements in the first column, and the second column will be coded by 1 = “Never,” 2 = “Former,” and 3 = “Current.” The Variable View looks as below. We test to see if there is a difference among the population means from which the samples have been drawn. We use the hypotheses
H0: μN = μF = μC versus Ha: Not all of μN, μF, and μC are equal. From the menu we choose Analyze>Compare Means>One-Way ANOVA.... In the window that opens, place volume under Dependent List and Smoker[smoking] under Factor. Then click Post Hoc.... For a post-hoc test, we will only choose Tukey (Tukey's HSD test) with Significance Level .05, and then click Continue.
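To make the quantity tested by One-Way ANOVA concrete, the F statistic for the three smoking-history groups can be sketched in Python from the data listed above. This is a minimal illustration using only the standard library; the function name one_way_anova is my own, and the sketch covers only the overall F ratio and its degrees of freedom, not the Tukey HSD comparisons or the p-value that SPSS reports.

```python
from statistics import fmean

# Percent predicted residual volume by smoking history, as listed in the text.
groups = {
    "Never": [35, 120, 90, 109, 82, 40, 68, 84, 124, 77, 140, 127, 58, 110,
              42, 57, 93],
    "Former": [62, 73, 60, 77, 52, 115, 82, 52, 105, 143, 80, 78, 47, 85,
               105, 46, 66, 95, 82, 141, 64, 124, 65, 42, 53, 67, 95, 99,
               69, 118, 131, 76, 69, 69],
    "Current": [96, 107, 63, 134, 140, 103, 158],
}


def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for s in samples for v in s]
    grand_mean = fmean(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(s) * (fmean(s) - grand_mean) ** 2 for s in samples)
    # Within-groups sum of squares: observations around their own group mean.
    ss_within = sum((v - fmean(s)) ** 2 for s in samples for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(list(groups.values()))
for name, values in groups.items():
    print(f"{name:8s} n = {len(values):2d}  mean = {fmean(values):.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

The F value and degrees of freedom printed here should correspond to the Between Groups row of the SPSS ANOVA table, provided the data were entered as listed.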