Statistical test

Comparing Proportions

	Independent samples (Unpaired in case of two)	Dependent samples (Paired in case of two)
2 proportions	Z test [math]\displaystyle{ \begin{align} z & = \frac{p_1-p_2}{SE_{pooled}(p_1-p_2)} \\ & = \frac{p_1-p_2}{\sqrt{\frac{\bar{p}(1-\bar{p})}{n_1}+\frac{\bar{p}(1-\bar{p})}{n_2}}} \end{align} }[/math] [math]\displaystyle{ \bar{p} }[/math] = pooled proportion in both groups combined
≥ 3 proportions	Sufficiently large sample: [math]\displaystyle{ \chi^2 }[/math] test [math]\displaystyle{ \chi^2 = \sum \frac{(O - E)^2}{E} }[/math] [math]\displaystyle{ O }[/math] = observed values [math]\displaystyle{ E }[/math] = expected values	McNemar's [math]\displaystyle{ \chi^2 }[/math] test [math]\displaystyle{ \begin{align} & McNemar's\ \chi^2 \\ & = \frac{(n_1-n_2)^2}{n_1+n_2} \end{align} }[/math] [math]\displaystyle{ n_i }[/math] = number of discordant pairs of type [math]\displaystyle{ i }[/math]
	Testing linear association: [math]\displaystyle{ \chi^2 }[/math] trend test [math]\displaystyle{ \begin{align} & \chi^2\ trend \\ & = \frac{(\bar{x_1}-\bar{x_2})^2}{s^2(\frac{1}{n_1}+\frac{1}{n_2})} \\ & s = \sqrt{\sum \frac{(x_i-\bar{x})^2}{n-1}} \end{align} }[/math] [math]\displaystyle{ x_i }[/math] = weighted values [math]\displaystyle{ \bar{x} }[/math] = mean of the weighted values in both groups combined [math]\displaystyle{ n_i }[/math] = number of observations in group [math]\displaystyle{ i }[/math] [math]\displaystyle{ n = n_1+n_2 }[/math]
	At least one cell with expected value < 5: Fisher's exact test (rarely needed in real research)
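
The formulas in the table above can be checked numerically with a short sketch. This is only an illustration using the Python standard library; the function names and the argument layout (event counts `x1`, `x2` and a table given as a list of rows) are assumptions made for this example and do not come from the page. The p-value helper uses the normal approximation only.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for two independent proportions x1/n1 and x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion
    se = math.sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)
    return (p1 - p2) / se


def chi_square(observed):
    """Pearson chi-square, sum of (O - E)^2 / E, for a table given as rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected count
            stat += (o - e) ** 2 / e
    return stat


def mcnemar_chi_square(n1, n2):
    """McNemar's chi-square from the two discordant-pair counts."""
    return (n1 - n2) ** 2 / (n1 + n2)


def two_sided_p_from_z(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


z = two_proportion_z(30, 100, 20, 100)
chi2 = chi_square([[30, 70], [20, 80]])
print(z, chi2, two_sided_p_from_z(z), mcnemar_chi_square(15, 5))
```

For a 2 x 2 table the Pearson statistic equals the square of the pooled z statistic, which is a convenient consistency check on both functions.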

Comparing Means

	Parametric i.e., normally distributed		Non-parametric i.e., not normally distributed
	Independent samples (Unpaired in case of two)	Dependent samples (Paired in case of two)	Independent samples (Unpaired in case of two)	Dependent samples (Paired in case of two)
2 means	Sufficiently large sample: Z test [math]\displaystyle{ \begin{align} z & = \frac{\bar{x_1}-\bar{x_2}}{SE_{(\bar{x_1}-\bar{x_2})}} \\ & = \frac{\bar{x_1}-\bar{x_2}}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}} \end{align} }[/math]	Paired Student's t test [math]\displaystyle{ H_0 }[/math]: the mean of the paired differences in the population is zero. [math]\displaystyle{ \begin{align} paired\ t & = \frac{\bar{d}}{SE_d} \\ & = \frac{\bar{d}}{\frac{s}{\sqrt{n}}} \end{align} }[/math] where [math]\displaystyle{ \bar{d} }[/math] is the mean of the differences of paired observations, [math]\displaystyle{ s }[/math] is their standard deviation, and [math]\displaystyle{ n }[/math] is the number of pairs	Wilcoxon rank sum test (= Mann-Whitney test) [math]\displaystyle{ H_0 }[/math]: the medians (or mean ranks) in the two populations are the same. (1) Rank all observations of the two groups combined. (2) Separate the ranks back into the two groups. (3) Look up the critical range for the two group sizes and check whether the sum of ranks in the smaller group (the test statistic) lies outside it; if it does, the p-value is smaller than the designated significance level.	Wilcoxon signed rank test [math]\displaystyle{ H_0 }[/math]: the median of the paired differences in the population is zero. (1) Calculate the differences between pairs and discard zero differences. (2) Rank the absolute values of the non-zero differences. (3) Sum the ranks of the positive differences and, separately, the ranks of the negative differences ('signed ranks'). (4) Look up the critical value for the number of pairs with non-zero differences and check whether the smaller rank sum (the test statistic) is below it; if it is, the p-value is smaller than the designated significance level.
	Small sample (< 30 in a group): Student's t test [math]\displaystyle{ \begin{align} t & = \frac{\bar{x_1}-\bar{x_2}}{SE_{(\bar{x_1}-\bar{x_2})}} \\ & = \frac{\bar{x_1}-\bar{x_2}}{\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{(n_1-1)+(n_2-1)}}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \end{align} }[/math]
	Large discrepancy in SDs between groups: bootstrap, non-parametric test, Fisher-Behrens test, or Welch's t test
≥ 3 means	One-way ANOVA [math]\displaystyle{ \begin{align} F & = \frac{ \sum_{j=1}^k n_j (\bar{x_j}-\bar{x})^2 }{ k-1 } \\ & \div \frac{ \sum_{j=1}^k \sum_{i=1}^{n_j} (x_{ij}-\bar{x_j})^2 }{ n-k } \end{align} }[/math] [math]\displaystyle{ n }[/math] is the sample size (all observations combined) [math]\displaystyle{ k }[/math] is the number of groups [math]\displaystyle{ n_j }[/math] is the number of observations in group [math]\displaystyle{ j }[/math]	Linear regression model Repeated measures ANOVA	Kruskal-Wallis test [math]\displaystyle{ H_0 }[/math]: the medians (or mean ranks) in all populations are the same. (1) Rank all observations of all groups combined. (2) Separate the ranks back into the original groups. (3) Sum the ranks in each group. [math]\displaystyle{ H = \frac{n-1}{n} \sum_{i=1}^k \frac{n_i(\bar{R_i}-E_R)^2}{s^2} }[/math] [math]\displaystyle{ H }[/math] is the Kruskal-Wallis statistic [math]\displaystyle{ n_i }[/math] is the number of observations in group [math]\displaystyle{ i }[/math] [math]\displaystyle{ \bar{R_i} }[/math] is the mean rank in group [math]\displaystyle{ i }[/math] [math]\displaystyle{ E_R }[/math] is the expected value of the ranks [math]\displaystyle{ s^2 }[/math] is the variance of the ranks (4) Look up the critical value for [math]\displaystyle{ H }[/math]; for sufficiently large groups, [math]\displaystyle{ H }[/math] approximately follows a [math]\displaystyle{ \chi^2 }[/math] distribution with [math]\displaystyle{ k-1 }[/math] degrees of freedom.	Consider transforming the data so that a parametric method applies (e.g., logarithmic transformation), or other approaches
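
The test statistics in the table above can be sketched in a few lines of standard-library Python. All function names here are hypothetical, chosen for this illustration. The sketch returns only the statistics and leaves critical values and p-values to tables or a statistics package. For Kruskal-Wallis it assumes that the variance of the ranks is computed with divisor n, which is what makes the (n-1)/n factor agree with the usual form of the statistic.

```python
import math
from statistics import mean, pvariance, stdev, variance


def pooled_t(a, b):
    """Student's t for two independent samples, using the pooled SD."""
    n1, n2 = len(a), len(b)
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / ((n1 - 1) + (n2 - 1))
    )
    return (mean(a) - mean(b)) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))


def paired_t(a, b):
    """Paired t: mean of the differences divided by its standard error."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def rank_sum_statistic(a, b):
    """Wilcoxon rank sum: sum of ranks in the smaller group."""
    ranks = average_ranks(list(a) + list(b))
    ranks_a, ranks_b = ranks[:len(a)], ranks[len(a):]
    return sum(ranks_a) if len(a) <= len(b) else sum(ranks_b)


def signed_rank_statistic(a, b):
    """Wilcoxon signed rank: smaller of the positive and negative rank sums."""
    d = [x - y for x, y in zip(a, b) if x != y]  # discard zero differences
    ranks = average_ranks([abs(v) for v in d])
    positive = sum(r for r, v in zip(ranks, d) if v > 0)
    negative = sum(r for r, v in zip(ranks, d) if v < 0)
    return min(positive, negative)


def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    n, k = sum(len(g) for g in groups), len(groups)
    grand_mean = mean([x for g in groups for x in g])
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return between / within


def kruskal_wallis_h(groups):
    """H = (n-1)/n * sum of n_i * (mean rank_i - E_R)^2 / s^2."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    expected_rank = (n + 1) / 2
    s2 = pvariance(ranks)  # assumption: variance of ranks with divisor n
    total, pos = 0.0, 0
    for g in groups:
        mean_rank = mean(ranks[pos:pos + len(g)])
        total += len(g) * (mean_rank - expected_rank) ** 2 / s2
        pos += len(g)
    return (n - 1) / n * total


groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(one_way_anova_f(groups), kruskal_wallis_h(groups))
```

Computing the rank variance from the actual ranks, rather than from a closed formula, keeps the sketch usable when there are ties, because tied values receive average ranks.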

Comparing Survival time

Life table	Kaplan-Meier
Log rank test (= Mantel-Cox [math]\displaystyle{ \chi^2 }[/math] test) [math]\displaystyle{ H_0 }[/math]: the event (survival) rates in every interval are the same in the two groups [math]\displaystyle{ Log\ rank\ statistic = \frac{}{} }[/math]
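
As a rough illustration of how a two-group log rank comparison can be computed, the sketch below uses one commonly used form of the statistic: observed minus expected events in group 1, squared, divided by the hypergeometric variance summed over event times. That form is an assumption of this example and is not taken from this page. The function name and the `(time, event)` input layout, with `event` equal to 1 for an event and 0 for a censored observation, are also hypothetical.

```python
def log_rank_chi_square(group1, group2):
    """Log rank chi-square (1 degree of freedom) for two groups.

    Each group is a list of (time, event) tuples, where event is 1 for an
    observed event and 0 for a censored observation.
    """
    event_times = sorted({t for t, e in group1 + group2 if e == 1})
    observed1 = expected1 = var = 0.0
    for t in event_times:
        # Numbers still at risk just before time t
        n1 = sum(1 for time, _ in group1 if time >= t)
        n2 = sum(1 for time, _ in group2 if time >= t)
        # Events occurring exactly at time t
        d1 = sum(1 for time, e in group1 if time == t and e == 1)
        d2 = sum(1 for time, e in group2 if time == t and e == 1)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n  # expected events in group 1 under H0
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        raise ValueError("variance is zero; the statistic is undefined")
    return (observed1 - expected1) ** 2 / var


early = [(1, 1), (2, 1)]  # both events occur early
late = [(3, 1), (4, 1)]   # both events occur late
print(log_rank_chi_square(early, late))
```

The returned value would be compared with a [math]\displaystyle{ \chi^2 }[/math] distribution with 1 degree of freedom. With samples this small the approximation is poor, so the example only demonstrates the arithmetic.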
