Exploring Parametric Statistics: Assessing Normality and Homogeneity of Variances, Study notes of Statistics

Instructions on how to use R packages and techniques to assess the normality and homogeneity of variances for parametric statistics. It covers graphical displays, skewness and kurtosis, and statistical tests such as Shapiro-Wilk and Levene's test. The document also includes examples using the Festival dataset.

Typology: Study notes

2021/2022

Uploaded on 09/12/2022

kiras

http://www.pelagicos.net/classes_biometry_fa18.htm
Parametric Statistics:
Exploring Assumptions

  • Reading - Field: Chapter

Exploring Assumptions

  • Assumptions of parametric tests based on the normal distribution

  • Aim of this chapter:
    • Quantify the assumption of normality
      o Graphical displays
      o Skew
      o Kurtosis
      o Normality tests
    • Quantify the homogeneity of variances (when dealing with 2 or more samples): Levene’s test
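
As a rough illustration of the homogeneity-of-variances check named above, the sketch below runs a median-centred Levene-type test in base R: a one-way ANOVA on the absolute deviations of each observation from its group median. The data are simulated and the helper name levene_sketch is hypothetical; the car package provides leveneTest() for routine use.

```r
# Simulated example data (an assumption, not the Festival data):
# three groups of 30, the third with a larger spread.
set.seed(42)
dat <- data.frame(
  score = c(rnorm(30, 0, 1), rnorm(30, 0, 1), rnorm(30, 0, 3)),
  group = factor(rep(c("a", "b", "c"), each = 30))
)

# Hypothetical helper: Levene-type test centred on the group medians.
levene_sketch <- function(y, g) {
  # Absolute deviation of each value from its own group's median
  dev <- abs(y - ave(y, g, FUN = median))
  # One-way ANOVA on those deviations; the F test for g is the Levene test
  anova(lm(dev ~ g))
}

res <- levene_sketch(dat$score, dat$group)
print(res)
```

A small p-value in the row for g would suggest that the group variances differ.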

Assessing Normality

  • We cannot sample the entire biological population, so we test the observed data
    1. Central Limit Theorem
    • If N < 25, the sampling distribution is rarely normal
    2. Graphical Displays
    • Histogram
    • Q-Q plot
    3. Skewness / Kurtosis (point estimate +/- SE)
    • Does the interval include 0, the value expected for a normal distribution?
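
The skewness / kurtosis check in step 3 can be sketched in base R as follows. This is a minimal sketch, assuming the "type 2" (SAS/SPSS-style) estimators and their usual large-sample standard errors; the helper name shape_check and the simulated data are hypothetical.

```r
# Hypothetical helper: type-2 skewness and excess kurtosis with standard
# errors, plus a flag for whether estimate +/- SE includes 0.
shape_check <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2); m3 <- mean(d^3); m4 <- mean(d^4)
  g1 <- m3 / m2^1.5          # moment-based skewness
  g2 <- m4 / m2^2 - 3        # moment-based excess kurtosis
  skew <- g1 * sqrt(n * (n - 1)) / (n - 2)
  kurt <- ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
  se_skew <- sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
  se_kurt <- 2 * se_skew * sqrt((n^2 - 1) / ((n - 3) * (n + 5)))
  data.frame(
    statistic = c("skewness", "kurtosis"),
    estimate = c(skew, kurt),
    se = c(se_skew, se_kurt),
    includes_zero = abs(c(skew, kurt)) <= c(se_skew, se_kurt)
  )
}

# Simulated right-skewed data (an assumption, not the Festival data)
set.seed(1)
print(shape_check(rexp(200)))
```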

Assessing Normality - Graphically

Characteristics of Normal Distributions: Unimodal, Symmetrical, Bell-shaped

Assessing Normality - Graphically

Comparing observations against a cumulative normal distribution (same mean and S.D.)
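
Both graphical checks can be sketched in base R. The example below uses simulated scores (an assumption, not the Festival data) and draws a histogram with a normal curve of the same mean and S.D., followed by a Q-Q plot with the matching reference line.

```r
# Simulated scores (hypothetical stand-in for a hygiene variable)
set.seed(1)
hygiene <- rnorm(200, mean = 1.8, sd = 0.7)

op <- par(mfrow = c(1, 2))

# Histogram on the density scale, with the normal curve that has the
# same mean and S.D. as the sample
hist(hygiene, freq = FALSE, main = "Histogram", xlab = "Hygiene score")
curve(dnorm(x, mean = mean(hygiene), sd = sd(hygiene)), add = TRUE, lwd = 2)

# Q-Q plot: sample quantiles against standard normal quantiles.
# The line y = mean + sd * x is the pattern expected for a normal
# distribution with the sample's mean and S.D.
qq <- qqnorm(hygiene, main = "Normal Q-Q plot")
abline(a = mean(hygiene), b = sd(hygiene))

par(op)
```

Points that bend away from the line, especially in the tails, indicate departures from normality.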

Example: Festival Data Set

A biologist was worried about the potential health effects of music festivals.
Measured hygiene of 810 concert-goers over the three days of a music festival.
Hygiene measured using a standardized index (from 0 to 4):
0 = you smell terrible
4 = you smell beautiful
Import the Download Festival data (MusicFestival.xlsx)
For ease of use, rename the Data Set “Festival”

Festival <- DownloadFestival

Explore Data Graphically: Rcmdr

[Figure: histograms and density plots for day1, day2 and day3]

Graphs in Rcmdr – Quantiles

The solid red line is the expected pattern for a normal distribution with the same mean and SD as the sampled data. Points outside the dashed-line envelope suggest significant deviations from normality.

[Figure: Q-Q plot for day 1]

Graphs in Rcmdr – Quantiles

[Figure: Q-Q plots for day 2 and day 3]

Note: The straight line represents the expected pattern for a normal distribution

Explore Festival Data Set

We can also explore the summary statistics describing the three datasets (day1, day2, day3) using Rcmdr.
What statistics would you use to assess data normality?
NOTE: multiple datasets can be analyzed at once

Explore Festival Data Set

Exploring the summary statistics describing the three datasets (day1, day2, day3) using Rcmdr:

numSummary(Festival[,c("day1", "day2", "day3"), drop=FALSE],
           statistics=c("mean", "quantiles", "skewness", "kurtosis"),
           quantiles=c(.5), type="2")

          mean skewness    kurtosis  50%   n  NA
day1 1.7933580 8.865312 170.4502658 1.79 810   0
day2 0.9609091 1.095226   0.8222057 0.79 264 546
day3 0.9765041 1.032868   0.7315003 0.76 123 687
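
A similar per-variable summary can be put together in base R, which also makes it easy to add a normality test. The sketch below is only an illustration: the data frame scores is simulated (its column names mimic the Festival variables, but the values are not the real data), and summarise_col is a hypothetical helper. Missing values are dropped per column, and shapiro.test() supplies the Shapiro-Wilk p-value.

```r
# Simulated data frame with missing values on later days (an assumption)
set.seed(7)
scores <- data.frame(
  day1 = rnorm(100, 1.8, 0.7),
  day2 = c(rexp(40, rate = 1), rep(NA, 60)),
  day3 = c(rexp(25, rate = 1), rep(NA, 75))
)

# Hypothetical helper: summary of one column, ignoring missing values
summarise_col <- function(x) {
  obs <- x[!is.na(x)]
  c(mean = mean(obs),
    median = median(obs),
    n = length(obs),
    n_missing = sum(is.na(x)),
    shapiro_p = shapiro.test(obs)$p.value)  # Shapiro-Wilk normality test
}

# One row per variable, one column per statistic
result <- t(sapply(scores, summarise_col))
print(round(result, 3))
```

A Shapiro-Wilk p-value below the chosen significance level suggests the variable departs from a normal distribution.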

Further Explore Festival Data Set

Exploring additional datasets using other functions: stat.desc() function in pastecs package

stat.desc(Festival$day1, basic = FALSE, norm = TRUE)
basic argument: Basic statistics included if TRUE (Note: TRUE is the default)
norm argument: Statistics relating to normal distribution included if TRUE (Note: FALSE is the default)

Further Explore Festival Data Set

stat.desc(Festival$day1, basic = FALSE, norm = TRUE)

      median         mean
1.790000e+00 1.793358e+00
     SE.mean CI.mean.0.95
3.318617e-02 6.514115e-02
         var      std.dev
8.920705e-01 9.444949e-01
    coef.var
5.266627e-01
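
The descriptive statistics reported for day1 are linked by simple formulas, which the base-R sketch below reproduces from the reported mean, standard deviation and sample size (n = 810). The variable names are illustrative, and the 95% confidence level is an assumption that matches the reported interval.

```r
# Values reported above for day1
n <- 810
mean_day1 <- 1.793358
sd_day1 <- 0.9444949

variance <- sd_day1^2                         # var = std.dev squared
se_mean <- sd_day1 / sqrt(n)                  # SE.mean = std.dev / sqrt(n)
ci_half <- qt(0.975, df = n - 1) * se_mean    # half-width of the 95% CI
coef_var <- sd_day1 / mean_day1               # coef.var = std.dev / mean

print(c(var = variance, SE.mean = se_mean,
        CI.half.width = ci_half, coef.var = coef_var))
```

The confidence interval for the mean is then the mean plus or minus ci_half.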