I think the Shapiro-Wilk test is a great way to check whether a variable is plausibly normally distributed. Normality is an important assumption behind many models and behind many of the methods used to evaluate them.

Let’s look at how to do this in R!

shapiro.test(data$CreditScore)

And here is the output:

Shapiro-Wilk normality test
data: data$CreditScore
W = 0.96945, p-value = 0.2198

So how do we read this? The p-value may look too high, but that is not a problem. The usual threshold for the p-value is 0.05, and 0.2198 is above it, so here we fail to reject the null hypothesis that the data come from a normal distribution. We don’t have enough evidence to say the population is not normally distributed.
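
The same decision rule can be written out in code. This is only a minimal sketch: the `scores` vector is simulated and stands in for the real credit scores, and `alpha` is simply the conventional 0.05 threshold mentioned above.

```r
# Hypothetical data: 50 simulated scores, NOT the data set used in this post
set.seed(42)
scores <- rnorm(50, mean = 650, sd = 50)

result <- shapiro.test(scores)
alpha <- 0.05  # conventional significance threshold

# result$statistic holds W, result$p.value holds the p-value
if (result$p.value < alpha) {
  message("Reject H0: evidence that the data are not normally distributed")
} else {
  message("Fail to reject H0: no evidence against normality")
}
```

Note that failing to reject the null hypothesis does not prove normality; it only means the test found no evidence against it.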

Let’s make a histogram to take a look using base R graphics:
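
The plotting code is not shown here, so what follows is a minimal sketch of how it might look rather than the original. The `data` frame below is simulated as a stand-in, and the overlaid normal curve is an optional extra for visual comparison.

```r
# Hypothetical stand-in for the post's data frame (simulated values)
set.seed(1)
data <- data.frame(CreditScore = rnorm(50, mean = 650, sd = 50))

# Base R histogram on the density scale (freq = FALSE)
h <- hist(data$CreditScore,
          main = "Histogram of CreditScore",
          xlab = "Credit score",
          freq = FALSE)

# Overlay a normal density using the sample mean and standard deviation
curve(dnorm(x, mean = mean(data$CreditScore), sd = sd(data$CreditScore)),
      add = TRUE)
```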

I would suggest discussing Royston’s extension of the Shapiro-Wilk test for sample sizes larger than 50, since the original test was tabulated only for samples of up to 50 observations.

The Shapiro-Wilk and Anderson-Darling tests have better power at a given significance level than the Kolmogorov-Smirnov test or the Lilliefors test (an adaptation of the Kolmogorov–Smirnov test).

Normality tests are often applied to independent variables (predictors), although most statistical models, such as regression, make no strong assumptions about the predictors. Their assumptions concern the differences (as in a Bland-Altman plot) or the residuals.
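
A minimal sketch of that last point, using simulated data (the variables `x` and `y` and the model are made up for illustration): the predictor is deliberately skewed, while the errors are normal by construction, and it is the residuals that get tested.

```r
# Hypothetical data: a skewed predictor with normally distributed errors
set.seed(123)
x <- rexp(100, rate = 1)       # clearly non-normal predictor
y <- 2 + 3 * x + rnorm(100)    # errors are normal by construction

fit <- lm(y ~ x)

# Testing the predictor says nothing about the model's assumptions
pred_test <- shapiro.test(x)

# Testing the residuals is what the regression assumption refers to
res_test <- shapiro.test(residuals(fit))

print(pred_test)
print(res_test)
```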

Thank you! Maybe you have something with nonparametric tests too!
