# Statistical test


## Basic concept of statistical tests

• First remind that
• Statistical test is to test

## Comparing Proportions

Independent samples (unpaired in the case of two groups)
Dependent samples (paired in the case of two groups)
2 proportions
• Z test
$\displaystyle{ \begin{aligned} z & = \frac{p_1-p_2}{SE_{pooled(p_1-p_2)}} \\ & = \frac{p_1-p_2}{\sqrt{\frac{\bar{p}(1-\bar{p})}{n_1}+\frac{\bar{p}(1-\bar{p})}{n_2}}} \end{aligned} }$
$\displaystyle{ \bar{p} }$ = pooled proportion of the two samples combined
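As an illustration, the Z test for two independent proportions can be sketched in plain Python. The function name and the counts are hypothetical, and the two-sided p-value assumes the standard normal approximation holds for these sample sizes.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for two independent proportions, using the pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion of both samples
    se = math.sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)
    return (p1 - p2) / se

# Hypothetical data: 30 events in 100 subjects vs 20 events in 100 subjects
z = two_proportion_z(30, 100, 20, 100)
# Two-sided p-value from the standard normal distribution
p_value = math.erfc(abs(z) / math.sqrt(2))
print(round(z, 3), round(p_value, 3))
```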
≥ 3 proportions (sufficiently large sample)
• $\displaystyle{ \chi^2 }$ test
$\displaystyle{ \chi^2 = \sum \frac{(O - E)^2}{E} }$
$\displaystyle{ O }$ = observed values
$\displaystyle{ E }$ = expected values
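A minimal sketch of the $\chi^2$ calculation for a contingency table follows. It assumes the expected value of each cell is (row total × column total) / grand total; the function name and the counts are made up for illustration.

```python
def chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: rows = outcome yes/no, columns = three groups
chi2_stat, chi2_df = chi_squared([[10, 20, 30],
                                  [30, 20, 10]])
print(chi2_stat, chi2_df)
```

The statistic is then compared with the $\chi^2$ distribution on the returned degrees of freedom.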
• McNemar's $\displaystyle{ \chi^2 }$ test
$\displaystyle{ \text{McNemar's } \chi^2 = \frac{(n_1-n_2)^2}{n_1+n_2} }$
$\displaystyle{ n_i }$ = number of discordant pairs of type $\displaystyle{ i }$ (pairs whose two members differ, counted separately for each direction)
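The sketch below counts the two kinds of discordant pairs and applies the formula above. The pair encoding (1 = positive, 0 = negative) and the data are assumptions made for the example.

```python
def mcnemar_chi2(pairs):
    """McNemar's chi-squared from (first, second) binary observations per pair."""
    # Only discordant pairs contribute; concordant pairs are ignored
    n1 = sum(1 for first, second in pairs if first and not second)
    n2 = sum(1 for first, second in pairs if second and not first)
    return (n1 - n2) ** 2 / (n1 + n2)

# Hypothetical paired data: 15 (+,-) pairs, 5 (-,+) pairs, 80 concordant pairs
pairs = [(1, 0)] * 15 + [(0, 1)] * 5 + [(1, 1)] * 30 + [(0, 0)] * 50
mcnemar_stat = mcnemar_chi2(pairs)
print(mcnemar_stat)
```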
Testing linear association
• $\displaystyle{ \chi^2 }$ trend test
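$\displaystyle{ \begin{aligned} \chi^2_{trend} & = \frac{(\bar{x}_1-\bar{x}_2)^2}{s^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)} \\ s & = \sqrt{\sum \frac{(x_i-\bar{x})^2}{n-1}} \end{aligned} }$
$\displaystyle{ x_i }$ = weighted values (scores), with $\displaystyle{ \bar{x}_1 }$ and $\displaystyle{ \bar{x}_2 }$ the mean scores in the two outcome groups and $\displaystyle{ \bar{x} }$ the mean score of all $\displaystyle{ n }$ observations combined
$\displaystyle{ n_i }$ = number of observations in outcome group $\displaystyle{ i }$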
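One way to read the trend statistic is as a comparison of mean scores between the two outcome groups, scaled by the standard deviation of all scores combined. The sketch assumes that reading; the scores 0, 1, 2 and the counts are hypothetical.

```python
import statistics

def chi2_trend(scores_1, scores_2):
    """Chi-squared for trend from the exposure scores of two outcome groups."""
    all_scores = list(scores_1) + list(scores_2)
    s2 = statistics.variance(all_scores)  # variance of all scores, n - 1 denominator
    diff = statistics.mean(scores_1) - statistics.mean(scores_2)
    return diff ** 2 / (s2 * (1 / len(scores_1) + 1 / len(scores_2)))

# Hypothetical data: exposure levels scored 0, 1, 2
scores_cases = [0] * 10 + [1] * 20 + [2] * 30
scores_controls = [0] * 30 + [1] * 20 + [2] * 10
trend_stat = chi2_trend(scores_cases, scores_controls)
print(round(trend_stat, 3))
```

The result is referred to the $\chi^2$ distribution with 1 degree of freedom.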
≥ 1 cell with an expected value < 5

Fisher's exact test

• Rarely needed in real research

## Comparing Means

Parametric (i.e., normally distributed)
Non-parametric (i.e., not normally distributed)
Independent samples (unpaired in the case of two groups)
Dependent samples (paired in the case of two groups)
2 means, sufficiently large sample
• Z test
$\displaystyle{ \begin{aligned} z & = \frac{\bar{x}_1-\bar{x}_2}{SE_{(\bar{x}_1-\bar{x}_2)}} \\ & = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}} \end{aligned} }$
• Paired Student's t test
$\displaystyle{ H_0 }$: the mean of the paired differences in the population is zero.
$\displaystyle{ \begin{aligned} paired\ t & = \frac{\bar{d}}{SE_d} \\ & = \frac{\bar{d}}{\frac{s}{\sqrt{n}}} \end{aligned} }$
where $\displaystyle{ \bar{d} }$ is the mean of the differences between paired observations, $\displaystyle{ s }$ is the standard deviation of those differences, and $\displaystyle{ n }$ is the number of pairs
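A short sketch of the paired t statistic, assuming the differences are roughly normally distributed. The measurements are invented, and the statistic would be compared with the t distribution on n − 1 degrees of freedom.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return d_bar / se, n - 1

# Hypothetical before/after measurements on the same five subjects
before = [120, 130, 125, 140, 135]
after = [115, 128, 120, 138, 130]
t_paired, df_paired = paired_t(before, after)
print(round(t_paired, 3), df_paired)
```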
• Wilcoxon rank sum test
= Mann-Whitney test
$\displaystyle{ H_0 }$: the medians (or the mean ranks) in the two populations are the same
1. Rank all observations of the two groups combined
2. Separate the ranks back into the two groups
3. Look up the critical range for the two group sizes and check whether the sum of ranks in the group with fewer observations (the test statistic) falls outside that range
If it falls outside the range, the p-value is smaller than the designated significance level
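The ranking steps above can be sketched as follows. Ties receive the average of the ranks they span, which is a common convention assumed here; the helper names and data are hypothetical. The lookup in a table of critical ranges is left out.

```python
def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def rank_sum_statistic(group_a, group_b):
    """Sum of ranks in the smaller group after ranking both groups together."""
    ranks = midranks(list(group_a) + list(group_b))
    ranks_a, ranks_b = ranks[:len(group_a)], ranks[len(group_a):]
    return sum(ranks_a) if len(group_a) <= len(group_b) else sum(ranks_b)

# Hypothetical data
rank_sum = rank_sum_statistic([1.1, 2.3, 3.0], [2.9, 4.2, 5.5, 6.1])
print(rank_sum)
```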
• Wilcoxon signed rank test
$\displaystyle{ H_0 }$: the median of the paired differences in the population is zero
1. Calculate the difference within each pair and discard zero differences
2. Rank the absolute values of the remaining differences
3. Sum the ranks of the positive differences and, separately, the ranks of the negative differences ('signed rank')
4. Look up the critical value for the number of pairs with non-zero differences and check whether the smaller of the two rank sums (the test statistic) is below it
If it is smaller than the critical value, the p-value is smaller than the designated significance level
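A sketch of the signed rank procedure, under the same tie convention (average ranks), which is an assumption of this example. Names and data are hypothetical, and the critical value still has to come from a table.

```python
def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def signed_rank_statistic(first, second):
    """Return (smaller signed-rank sum, number of non-zero differences)."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # drop zero differences
    ranks = midranks([abs(d) for d in diffs])
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(positive, negative), len(diffs)

# Hypothetical paired data; differences are 5, -2, 3, 0, -1, 4, 6
first = [10, 8, 9, 7, 6, 12, 15]
second = [5, 10, 6, 7, 7, 8, 9]
signed_rank, n_nonzero = signed_rank_statistic(first, second)
print(signed_rank, n_nonzero)
```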
Small sample size (< 30 in a group)
• Student's t test
$\displaystyle{ \begin{aligned} t & = \frac{\bar{x}_1-\bar{x}_2}{SE_{(\bar{x}_1-\bar{x}_2)}} \\ & = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{(n_1-1)+(n_2-1)}}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \end{aligned} }$
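The pooled-variance formula can be checked with a small sketch. It assumes the two groups have similar standard deviations; the function name and data are hypothetical.

```python
import math
import statistics

def student_t(group_1, group_2):
    """Return (t statistic, degrees of freedom) using the pooled SD."""
    n1, n2 = len(group_1), len(group_2)
    v1, v2 = statistics.variance(group_1), statistics.variance(group_2)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / ((n1 - 1) + (n2 - 1)))
    se = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    diff = statistics.mean(group_1) - statistics.mean(group_2)
    return diff / se, n1 + n2 - 2

# Hypothetical data
t_stat, t_df = student_t([5, 7, 9], [1, 2, 3, 4, 5])
print(round(t_stat, 3), t_df)
```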
Large discrepancy in SDs between groups
• Bootstrap
• Non-parametric
• Fisher-Behrens
• Welch
≥ 3 means
• One-way ANOVA
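$\displaystyle{ \begin{aligned} F & = \frac{ \sum_{j=1}^k n_j (\bar{x}_j-\bar{x})^2 }{ k-1 } \\ & \div \frac{ \sum_{j=1}^k \sum_{i=1}^{n_j} (x_{ij}-\bar{x}_j)^2 }{ n-k } \end{aligned} }$
$\displaystyle{ n }$ is the sample size (total number of observations in all groups combined)
$\displaystyle{ k }$ is the number of groups
$\displaystyle{ n_j }$ and $\displaystyle{ \bar{x}_j }$ are the number of observations and the mean in group $\displaystyle{ j }$, and $\displaystyle{ \bar{x} }$ is the overall mean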
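A minimal sketch of the one-way ANOVA F statistic, using the usual definition of F as the between-groups mean square divided by the within-groups mean square. The function name and data are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, between-groups df, within-groups df)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares around each group's own mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k

# Hypothetical data: three groups of three observations
f_stat, df_between, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 3), df_between, df_within)
```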

• Linear regression model
• Repeated measures ANOVA
• Kruskal-Wallis test
$\displaystyle{ H_0 }$: the medians (or the mean ranks) in all populations are the same
1. Rank all observations of all groups combined
2. Separate the ranks back into the original groups
3. Sum the ranks in each group
$\displaystyle{ H = \frac{n-1}{n} \sum_{i=1}^k \frac{n_i(\bar{R}_i-E_R)^2}{s^2} }$
$\displaystyle{ H }$ is the Kruskal-Wallis statistic
$\displaystyle{ n_i }$ is the number of observations in group $\displaystyle{ i }$
$\displaystyle{ \bar{R}_i }$ is the mean rank in group $\displaystyle{ i }$ (rank sum divided by $\displaystyle{ n_i }$)
$\displaystyle{ E_R }$ is the expected value of the ranks, $\displaystyle{ (n+1)/2 }$
$\displaystyle{ s^2 }$ is the variance of the ranks, $\displaystyle{ (n^2-1)/12 }$ when there are no ties
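4. Look up the critical value for $\displaystyle{ H }$; unless the groups are very small, $\displaystyle{ H }$ is compared with the $\displaystyle{ \chi^2 }$ distribution with $\displaystyle{ k-1 }$ degrees of freedom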
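The Kruskal-Wallis steps can be sketched as below. The sketch assumes the usual form of $H$, in which the deviation of each group's mean rank from the expected rank is squared, and takes $s^2$ as the variance of all ranks with denominator $n$. Names and data are hypothetical.

```python
import statistics

def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of groups."""
    combined = [x for g in groups for x in g]
    n = len(combined)
    ranks = midranks(combined)
    expected_rank = (n + 1) / 2
    s2 = statistics.pvariance(ranks)  # variance of the ranks, dividing by n
    total, start = 0.0, 0
    for g in groups:
        group_ranks = ranks[start:start + len(g)]
        start += len(g)
        mean_rank = sum(group_ranks) / len(g)
        total += len(g) * (mean_rank - expected_rank) ** 2 / s2
    return (n - 1) / n * total, len(groups) - 1

# Hypothetical data: three completely separated groups
h_stat, h_df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h_stat, 3), h_df)
```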
• * Try transforming the data toward a normal distribution (e.g., a logarithmic transformation), or consider other approaches

## Comparing Survival time

Life table, Kaplan-Meier
• Log rank test
= Mantel-Cox $\displaystyle{ \chi^2 }$ test
$\displaystyle{ H_0 }$: the event (survival) rates in each interval are the same in the two groups
$\displaystyle{ Log\ rank\ statistics = \frac{}{} }$