# Statistical Analysis of the Missing Values

This analysis examines the two cases classed as outliers: visual inspection shows that they lie several standard deviations beyond the base pay range that covers 99.5% of all cases. Cases 158 and 379 are very probably managers, an inference that rests solely on their base pay being more than twice the highest base pay of anyone else. Nothing in their age or educational qualifications points to it.

One therefore runs the variant of the two-sample t-test that assumes unequal variances. The result (overleaf) shows, first of all, that the variances for gender and basic wage are worlds apart. This stands to reason, given the respective ranges of the two variables. Secondly, the output reports a t value whose magnitude, 67.51, is so large that the associated p-value is microscopic: 0.21 with 219 leading zeroes. At 399 degrees of freedom, t = -67.51, p < 0.001. Going by the outcome of the t-test reported in item #5 above, we can reject the null hypothesis that there is no difference in basic pay by sex. The computed difference in item 6 is statistically significant.

The output overleaf shows that the calculated F value is associated with a very low significance level, p < 0.05. Taken at face value, this means the variances for the two variables are not equal. Recognizing at this point that the Excel Data Analysis setup for the F test returned erroneous output, we change the way the variable ranges are defined and obtain a different result. The F statistic is now just 1.21 and the associated significance level is p > 0.05. This leads us to assume that the variances of bonuses are equal across gender. Running the two-sample t-test with equal variances assumed, one finds a calculated t statistic of 0.99, for which the one-tailed p-value is p = 0.16.
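The sequence of tests described above (a variance-ratio F test to choose between the equal-variance and unequal-variance forms of the two-sample t-test) can be sketched outside Excel as follows. This is a minimal illustration using only the Python standard library, not the workbook procedure itself. The function names and the two sample lists are hypothetical, and the numbers are made up for demonstration; they are not the study data. The sketch assumes the values have already been split into two groups by sex.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio_f(a, b):
    """Return (F, df_numerator, df_denominator), larger sample variance on top."""
    va, vb = variance(a), variance(b)
    if va >= vb:
        return va / vb, len(a) - 1, len(b) - 1
    return vb / va, len(b) - 1, len(a) - 1


def welch_t(a, b):
    """Return (t, df) for the two-sample t-test with unequal variances assumed."""
    se_a, se_b = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df


def pooled_t(a, b):
    """Return (t, df) for the two-sample t-test with equal variances assumed."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical bonus figures for two groups (illustrative only, not the study data)
bonus_group_1 = [410.0, 395.0, 430.0, 402.0, 418.0, 399.0]
bonus_group_2 = [405.0, 388.0, 421.0, 397.0, 409.0, 392.0]

f_stat, df_num, df_den = variance_ratio_f(bonus_group_1, bonus_group_2)
print(f"F = {f_stat:.2f} on ({df_num}, {df_den}) degrees of freedom")

t_equal, df_equal = pooled_t(bonus_group_1, bonus_group_2)
print(f"Equal variances assumed:   t = {t_equal:.2f}, df = {df_equal}")

t_welch, df_welch = welch_t(bonus_group_1, bonus_group_2)
print(f"Unequal variances assumed: t = {t_welch:.2f}, df = {df_welch:.1f}")
```

The sketch stops at the test statistics and degrees of freedom. Converting them to p-values requires the F and t distributions, which the standard library does not provide; a statistics package such as SciPy (for example `scipy.stats.ttest_ind` with `equal_var=False` for the unequal-variance case) would be one way to obtain them.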