Difference between revisions of "Table-related commands in STATA"

From: Vaccipedia | Resources for Vaccines, Tropical medicine and Travel medicine

(39 intermediate revisions by the same user not shown)
Line 25: Line 25:
 
!table
 
!table
 
|<pre>table v1</pre>
 
|<pre>table v1</pre>
create a one-way table<br> of ''v1'' with '''simple''' statistics
+
create a one-way table of ''v1''<br> with '''simple''' frequency
 
|<pre>table v1 v2</pre>
 
|<pre>table v1 v2</pre>
create a two-way table<br> of ''v1'' in row&dagger; and ''v2'' in column&dagger;
+
create a two-way table of ''v1''<br> in row&dagger; and ''v2'' in column&dagger;
 
|<pre>,statistics( )</pre>
 
|<pre>,statistics( )</pre>
 
|-
 
|-
 
!tabulate
 
!tabulate
 
|<pre>tabulate v1</pre>
 
|<pre>tabulate v1</pre>
create a one-way table<br> of ''v1'' with '''detailed''' statistics
+
create a one-way table of ''v1''<br> with '''detailed''' frequency
 
|<pre>tabulate v1 v2</pre>
 
|<pre>tabulate v1 v2</pre>
create a two-way table<br> with ''v1'' in row&dagger; and ''v2'' in column&dagger;
+
create a two-way table with ''v1''<br> in row&dagger; and ''v2'' in column&dagger;
 
|<pre>,chi2</pre>
 
|<pre>,chi2</pre>
 
Pearson's chi-squared test; <nowiki>*</nowiki>only for two-way
 
Pearson's chi-squared test; <nowiki>*</nowiki>only for two-way
Line 42: Line 42:
 
!tabstat
 
!tabstat
 
|<pre>tabstat v1</pre>
 
|<pre>tabstat v1</pre>
create a one-way table of ''v1''<br> with detailed statistics
+
create a one-way table of ''v1''<br> with '''detailed''' statistics
 
|''*no two- or multiple-way table''
 
|''*no two- or multiple-way table''
 
|<pre>,statistics( )</pre>
 
|<pre>,statistics( )</pre>
Line 50: Line 50:
 
!summarize
 
!summarize
 
|<pre>summarize v1</pre>
 
|<pre>summarize v1</pre>
detailed statistics of ''v1''
+
'''detailed''' statistics of ''v1''
 
|''*no two- or multiple-way summary''
 
|''*no two- or multiple-way summary''
 
|<pre>,detail</pre>
 
|<pre>,detail</pre>
Line 60: Line 60:
  
 
[[file:STATAsample.jpg]]
 
[[file:STATAsample.jpg]]
 +
 +
Where,
 +
{|
 +
!id
 +
|''discrete''
 +
|:Identification number
 +
|-
 +
!sex
 +
|''binary''
 +
|:Male=0, Female=1
 +
|-
 +
!data1
 +
|''continuous''
 +
|:Results of a certain test
 +
|-
 +
!factorA, B, C
 +
|''binary''
 +
|:Negative=0, Positive=1
 +
|-
 +
!SES
 +
|''categorical''
 +
|:Categories of Socio-Economic Status, divided into four
 +
|-
 +
!disease
 +
|''binary''
 +
|:Free from a certain disease=0, Having the disease=1
 +
|}
  
 
==One-way==
 
==One-way==
Line 66: Line 93:
 
|-
 
|-
 
!table
 
!table
|[[file:table_sex.jpg]]
+
|'''table sex'''
 +
[[file:table_sex.jpg]]
 
|rowspan="2"|Both '''table''' and '''tabulate''' report frequencies,<br> but '''tabulate''' is more detailed

|rowspan="2"|Both '''table''' and '''tabulate''' report frequencies,<br> but '''tabulate''' is more detailed
 
|-
 
|-
 
!tabulate
 
!tabulate
|[[file:tabulate_sex.jpg]]
+
|'''tabulate sex'''
 +
[[file:tabulate_sex.jpg]]
 
|-
 
|-
 
!tabstat
 
!tabstat
|[[file:Tabstat_sex.jpg]]
+
|'''tabstat sex'''
 +
[[file:Tabstat_sex.jpg]]
 
|rowspan="2"|Both '''tabstat''' and '''summarize''' report the mean,<br> but '''summarize''' is more detailed

|rowspan="2"|Both '''tabstat''' and '''summarize''' report the mean,<br> but '''summarize''' is more detailed
 
|-
 
|-
 
!summarize
 
!summarize
|[[file:summarize_sex.jpg]]
+
|'''summarize sex'''
 +
[[file:summarize_sex.jpg]]
 
|}
 
|}
  
===Summary of data1, a continuous variable===
+
===Summary of ''data1'', a continuous variable===
 
{|class="wikitable"
 
{|class="wikitable"
 
|-
 
|-
 
!table
 
!table
|[[file:table_data1.jpg]]
+
|'''table data1'''
 +
[[file:table_data1.jpg]]
 
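|rowspan="2"|Both '''table''' and '''tabulate''' report the frequency of each distinct value,<br> which is not meaningful for a continuous variable

|rowspan="2"|Both '''table''' and '''tabulate''' report the frequency of each distinct value,<br> which is not meaningful for a continuous variable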
 
|-
 
|-
 
!tabulate
 
!tabulate
|[[file:tabulate_data1.jpg]]
+
|'''tabulate data1'''
 +
[[file:tabulate_data1.jpg]]
 
|-
 
|-
 
!tabstat
 
!tabstat
|[[file:tabstat_data1.jpg]]
+
|'''tabstat data1'''
 +
[[file:tabstat_data1.jpg]]
 
|rowspan="2"|Both '''tabstat''' and '''summarize''' report the mean,<br> but '''summarize''' is more detailed

|rowspan="2"|Both '''tabstat''' and '''summarize''' report the mean,<br> but '''summarize''' is more detailed
 
|-
 
|-
 
!summarize
 
!summarize
|[[file:summarize_data1.jpg]]
+
|'''summarize data1'''
 +
[[file:summarize_data1.jpg]]
 
|}
 
|}
  
===Summary of SES, a categorical variable===
+
===Summary of ''SES'', a categorical variable===
 
{|class="wikitable"
 
{|class="wikitable"
 
|-
 
|-
 
!table
 
!table
|[[file:table_SES.jpg]]
+
|'''table SES'''
 +
[[file:table_SES.jpg]]
 
|rowspan="2"|Both '''table''' and '''tabulate''' report frequencies,<br> but '''tabulate''' is more detailed

|rowspan="2"|Both '''table''' and '''tabulate''' report frequencies,<br> but '''tabulate''' is more detailed
 
|-
 
|-
 
!tabulate
 
!tabulate
|[[file:tabulate_SES.jpg]]
+
|'''tabulate SES'''
 +
[[file:tabulate_SES.jpg]]
 
|-
 
|-
 
!tabstat
 
!tabstat
|[[file:tabstat_SES.jpg]]
+
|'''tabstat SES'''
 +
[[file:tabstat_SES.jpg]]
 
|rowspan="2"|Both '''tabstat''' and '''summarize''' report the mean,<br> but '''summarize''' is more detailed

|rowspan="2"|Both '''tabstat''' and '''summarize''' report the mean,<br> but '''summarize''' is more detailed
 
|-
 
|-
 
!summarize
 
!summarize
|[[file:summarize_SES.jpg]]
+
|'''summarize SES'''
 +
[[file:summarize_SES.jpg]]
 
|}
 
|}
  
Line 120: Line 159:
 
|-
 
|-
 
!table
 
!table
|colspan="2"|''*Does not create one-way multiple table''
+
|rowspan="2" colspan="2"|''*Neither command creates a one-way table of multiple variables''
 
|-
 
|-
 
!tabulate
 
!tabulate
|colspan="2"|''*Does not create one-way multiple table''
 
 
|-
 
|-
 
!tabstat
 
!tabstat
|[[file:tabstat_sex_data1_SES.jpg]]
+
|'''tabstat sex data1 SES'''
 +
[[file:tabstat_sex_data1_SES.jpg]]
 
|Reports mean in row (transverse) direction
 
|Reports mean in row (transverse) direction
 
|-
 
|-
 
!summarize
 
!summarize
|[[file:summarize_sex_data1_SES.jpg]]
+
|'''summarize sex data1 SES'''
 +
[[file:summarize_sex_data1_SES.jpg]]
 
|Reports more details in column (longitudinal) direction
 
|Reports more details in column (longitudinal) direction
 
|}
 
|}
  
 
==Two-way==
 
==Two-way==
===Relation between '''sex''' and '''factorA'''===
+
===Summary of ''factorA'' based on ''sex''===
 +
{|class="wikitable"
 +
|-
 +
!table
 +
|'''table sex factorA'''
 +
[[file:table_sex_factorA.jpg]]
 +
|rowspan="2"|Both create the same table,<br> but '''tabulate''' displays it more clearly
 +
|-
 +
!tabulate
 +
|'''tabulate sex factorA'''
 +
[[file:tabulate_sex_factorA.jpg]]
 +
|-
 +
!tabstat
 +
|'''tabstat factorA, by(sex)'''
 +
[[file:tabstat_factorA_by(sex).jpg]]
 +
|rowspan="2"|Both report the mean,<br> but '''summarize''' is more detailed;<br>'''summarize''' needs the '''bysort''' prefix before the command
 +
|-
 +
!summarize
 +
|'''bysort sex: summarize factorA'''
 +
[[file:summarize_factorA_bysort_sex.jpg]]
 +
|}
 +
 
 +
===Summary of ''sex'' based on ''factorA''===
 +
{|class="wikitable"
 +
|-
 +
!table
 +
|'''table factorA sex'''
 +
[[file:table_factorA_sex.jpg]]
 +
|rowspan="2"|Both create the same table,<br> but '''tabulate''' displays it more clearly
 +
|-
 +
!tabulate
 +
|'''tabulate factorA sex'''
 +
[[file:tabulate_factorA_sex.jpg]]
 +
|-
 +
!tabstat
 +
|'''tabstat sex, by(factorA)'''
 +
[[file:tabstat_sex_by(factorA).jpg]]
 +
|rowspan="2"|Both report the mean,<br> but '''summarize''' is more detailed;<br>'''summarize''' needs the '''bysort''' prefix before the command
 +
|-
 +
!summarize
 +
|'''bysort factorA: summarize sex'''
 +
[[file:summarize_sex_bysort_factorA.jpg]]
 +
|}
 +
 
 +
===Summary of ''data1'' based on ''disease''===
 +
{|class="wikitable"
 +
|-
 +
!table
 +
|rowspan="2" colspan="2"|''*Neither command creates a meaningful table for a continuous variable''
 +
|-
 +
!tabulate
 +
|-
 +
!tabstat
 +
|'''tabstat data1, by(disease)'''
 +
[[file:tabstat_data1_by(disease).jpg]]
 +
|rowspan="2"|Both report the mean,<br> but '''summarize''' is more detailed;<br>'''summarize''' needs the '''bysort''' prefix before the command
 +
|-
 +
!summarize
 +
|'''bysort disease: summarize data1'''
 +
[[file:summarize_data1_bysort_disease.jpg]]
 +
|}
 +
 
 +
==Two-way with proportions==
 +
===Summary of ''factorA'' based on ''sex'' with proportions===
 +
{|class="wikitable"
 +
|-
 +
!rowspan="3"|table
 +
|'''table sex factorA, statistic(percent)'''
 +
[[file:table_sex_factorA_statistic(percent).jpg]]
 +
|This calculates proportions of cells compared to the whole<br> without showing raw values
 +
|-
 +
|'''table sex factorA, statistic(percent, across(sex))'''
 +
[[file:table_sex_factorA_statistic(percent_across(sex)).jpg]]
 +
|This calculates proportions in column (longitudinal) directions<br> without showing raw values
 +
|-
 +
|'''table sex factorA, statistic(percent, across(factorA))'''
 +
[[file:table_sex_factorA_statistic(percent_across(factorA)).jpg]]
 +
|This calculates proportions in row (transverse) directions<br> without showing raw values
 +
|-
 +
!rowspan="2"|tabulate
 +
|'''tabulate sex factorA, column'''
 +
[[file:tabulate_sex_factorA_column.jpg]]
 +
|This calculates proportions in column (longitudinal) directions
 +
|-
 +
|'''tabulate sex factorA, row'''
 +
[[file:tabulate_sex_factorA_row.jpg]]
 +
|This calculates proportions in row (transverse) directions
 +
|}
 +
 
 +
==Two-way, multiple==
 +
===Summary of ''factorA'', ''factorB'', ''factorC'' based on ''disease''===
 +
{|class="wikitable"
 +
|-
 +
!rowspan="3"|tabstat
 +
|'''tabstat factorA factorB factorC, by(disease)'''
 +
[[File:tabstat_factorABC_by(disease).jpg]]
 +
|
 +
|-
 +
|'''tabstat factorA factorB factorC, by(disease) statistic(sum)'''
 +
[[File:tabstat_factorABC_by(disease)_statistic(sum).jpg]]
 +
|''factorA'', ''factorB'' and ''factorC'' are binary (0/1) variables, so the sum of each variable equals its number of positive observations
 +
|-
 +
|'''tabstat factorA factorB factorC, by(disease) statistic(n)'''
 +
[[File:tabstat_factorABC_by(disease)_statistic(n).jpg]]
 +
|''statistic(n)'' (equivalent to ''statistic(count)'') counts only the observations with non-missing values
 +
|}
 +
 
 +
===Summary of ''factorA'', ''factorB'', ''factorC'' based on ''SES''===
 +
{|class="wikitable"
 +
|-
 +
!rowspan="3"|tabstat
 +
|'''tabstat factorA factorB factorC, by(SES)'''
 +
[[File:tabstat_factorABC_by(SES).jpg]]
 +
|
 +
|-
 +
|'''tabstat factorA factorB factorC, by(SES) statistic(sum)'''
 +
[[File:tabstat_factorABC_by(SES)_statistic(sum).jpg]]
 +
|''factorA'', ''factorB'' and ''factorC'' are binary (0/1) variables, so the sum of each variable equals its number of positive observations
 +
|-
 +
|'''tabstat factorA factorB factorC, by(SES) statistic(n)'''
 +
[[File:tabstat_factorABC_by(SES)_statistic(n).jpg]]
 +
|''statistic(n)'' (equivalent to ''statistic(count)'') counts only the observations with non-missing values
 +
|}
 +
 
 +
==Two-way of binary/categorical plus summary of continuous==
 +
===Summary of ''data1'' based on ''disease'' and ''SES''===
 +
{|class="wikitable"
 +
|-
 +
!tabulate
 +
|'''tabulate disease SES, summarize(data1)'''
 +
[[File:tabulate_disease_SES_summarize(data1).jpg]]
 +
|This reports the mean, SD and frequency of a continuous variable within each cell of a two-way table of binary/categorical variables
 +
|}

Latest revision as of 19:50, 2 April 2023 (Sun)

Abbreviations of commands

table (no abbreviation)
tabulate ta, tab
tabstat (no abbreviation)
summarize su
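
As a quick check of the abbreviations listed above, the following sketch runs each long and short form on Stata's bundled `auto` example dataset. The dataset and its variables `foreign` and `price` are not part of this page; they are used here only so the lines can be run without the sample data.

```stata
* Load Stata's bundled example dataset (an assumption made for this sketch)
sysuse auto, clear

* tabulate and its two abbreviations should print the same frequency table
tabulate foreign
tab foreign
ta foreign

* summarize and its abbreviation should print the same summary
summarize price
su price
```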

Differences between table, tabulate, tabstat, summarize

{|class="wikitable"
!
!one-way
!two-way
!options
|-
!table
|<pre>table v1</pre>
create a one-way table of ''v1''<br> with '''simple''' frequency
|<pre>table v1 v2</pre>
create a two-way table of ''v1''<br> in row&dagger; and ''v2'' in column&dagger;
|<pre>,statistics( )</pre>
|-
!tabulate
|<pre>tabulate v1</pre>
create a one-way table of ''v1''<br> with '''detailed''' frequency
|<pre>tabulate v1 v2</pre>
create a two-way table with ''v1''<br> in row&dagger; and ''v2'' in column&dagger;
|<pre>,chi2</pre>
Pearson's chi-squared test; <nowiki>*</nowiki>only for two-way
<pre>,summarize(v3)</pre>
detailed statistics for ''v3''
|-
!tabstat
|<pre>tabstat v1</pre>
create a one-way table of ''v1''<br> with '''detailed''' statistics
|''*no two- or multiple-way table''
|<pre>,statistics( )</pre>
<pre>,by(v3)</pre>
detailed statistics for each of ''v3''
|-
!summarize
|<pre>summarize v1</pre>
'''detailed''' statistics of ''v1''
|''*no two- or multiple-way summary''
|<pre>,detail</pre>
|}

† row = transverse direction, column = longitudinal direction

Sample data

Suppose we have the following dataset in STATA.

STATAsample.jpg

Where,

id discrete :Identification number
sex binary :Male=0, Female=1
data1 continuous :Results of a certain test
factorA, B, C binary :Negative=0, Positive=1
SES categorical :Categories of Socio-Economic Status, divided into four
disease binary :Free from a certain disease=0, Having the disease=1
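
For readers without the original file, a dataset with the same variable names and codings can be sketched as follows. Every value below is invented for illustration and does not reproduce the data in the screenshot; the label names (`sexlbl`, `posneg`, `dislbl`) are also hypothetical.

```stata
* Hypothetical toy data: all values are made up, only the variable
* names and 0/1 codings follow the description above
clear
input id sex data1 factorA factorB factorC SES disease
1 0 12.5 1 0 0 1 0
2 1 15.0 0 1 0 2 1
3 1  9.8 1 1 0 3 0
4 0 11.2 0 0 1 4 1
5 1 14.1 1 0 1 2 0
end

* Attach value labels matching the codings listed above
label define sexlbl 0 "Male" 1 "Female"
label values sex sexlbl
label define posneg 0 "Negative" 1 "Positive"
label values factorA factorB factorC posneg
label define dislbl 0 "Free from disease" 1 "Having the disease"
label values disease dislbl

* Inspect the resulting variables
describe
```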

One-way

Summary of sex, a binary variable

table table sex

Table sex.jpg

Both table and tabulate report frequencies,
but tabulate is more detailed
tabulate tabulate sex

Tabulate sex.jpg

tabstat tabstat sex

Tabstat sex.jpg

Both tabstat and summarize report the mean,
but summarize is more detailed
summarize summarize sex

Summarize sex.jpg

Summary of data1, a continuous variable

table table data1

Table data1.jpg

Both table and tabulate report the frequency of each distinct value,
which is not meaningful for a continuous variable
tabulate tabulate data1

Tabulate data1.jpg

tabstat tabstat data1

Tabstat data1.jpg

Both tabstat and summarize report the mean,
but summarize is more detailed
summarize summarize data1

Summarize data1.jpg

Summary of SES, a categorical variable

table table SES

Table SES.jpg

Both table and tabulate report frequencies,
but tabulate is more detailed
tabulate tabulate SES

Tabulate SES.jpg

tabstat tabstat SES

Tabstat SES.jpg

Both tabstat and summarize report the mean,
but summarize is more detailed
summarize summarize SES

Summarize SES.jpg
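
The options column of the comparison table hints at how the one-way commands can be extended. The lines below are a minimal sketch, assuming the sample dataset described above is already in memory; the particular statistics requested are only one possible choice.

```stata
* Assumes the sample dataset (sex, data1, SES, ...) is in memory

* tabstat reports only the mean by default; statistics() asks for more
tabstat data1, statistics(n mean sd median min max)

* summarize with detail adds percentiles, variance, skewness and kurtosis
summarize data1, detail

* tabulate with the missing option also lists missing values as a category
tabulate SES, missing
```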

One-way, multiple

table *Neither command creates a one-way table of multiple variables
tabulate
tabstat tabstat sex data1 SES

Tabstat sex data1 SES.jpg

Reports mean in row (transverse) direction
summarize summarize sex data1 SES

Summarize sex data1 SES.jpg

Reports more details in column (longitudinal) direction
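
If the transverse layout of tabstat is hard to read with several statistics, its orientation can be switched. This is a sketch assuming the sample dataset is in memory; `columns(statistics)` places one variable per row, which resembles the summarize layout.

```stata
* Assumes the sample dataset is in memory

* Default layout: variables across the top, statistics down the side
tabstat sex data1 SES, statistics(n mean sd min max)

* Alternative layout: one row per variable, one column per statistic
tabstat sex data1 SES, statistics(n mean sd min max) columns(statistics)
```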

Two-way

Summary of factorA based on sex

table table sex factorA

Table sex factorA.jpg

Both table and tabulate create the same table,
but tabulate displays it more clearly
tabulate tabulate sex factorA

Tabulate sex factorA.jpg

tabstat tabstat factorA, by(sex)

Tabstat factorA by(sex).jpg

Both tabstat and summarize report the mean,
but summarize is more detailed;
summarize needs the bysort prefix before the command
summarize bysort sex: summarize factorA

Summarize factorA bysort sex.jpg

Summary of sex based on factorA

table table factorA sex

Table factorA sex.jpg

Both table and tabulate create the same table,
but tabulate displays it more clearly
tabulate tabulate factorA sex

Tabulate factorA sex.jpg

tabstat tabstat sex, by(factorA)

Tabstat sex by(factorA).jpg

Both tabstat and summarize report the mean,
but summarize is more detailed;
summarize needs the bysort prefix before the command
summarize bysort factorA: summarize sex

Summarize sex bysort factorA.jpg

Summary of data1 based on disease

table *Neither command creates a meaningful table for a continuous variable
tabulate
tabstat tabstat data1, by(disease)

Tabstat data1 by(disease).jpg

Both tabstat and summarize report the mean,
but summarize is more detailed;
summarize needs the bysort prefix before the command
summarize bysort disease: summarize data1

Summarize data1 bysort disease.jpg
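
The two-way commands above can be combined with the options from the comparison table. A short sketch, assuming the sample dataset is in memory:

```stata
* Assumes the sample dataset is in memory

* Two-way frequency table plus Pearson's chi-squared test
tabulate sex factorA, chi2

* Grouped summary of a continuous variable with more than the mean
tabstat data1, by(disease) statistics(n mean sd min max)

* bysort is a prefix, not an option: it sorts by disease and then
* runs summarize separately within each disease group
bysort disease: summarize data1, detail
```

`bysort disease:` is shorthand for `by disease, sort:`, which is why it goes before the command rather than after the comma.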

Two-way with proportions

Summary of factorA based on sex with proportions

table table sex factorA, statistic(percent)

Table sex factorA statistic(percent).jpg

This calculates proportions of cells compared to the whole
without showing raw values
table sex factorA, statistic(percent, across(sex))

Table sex factorA statistic(percent across(sex)).jpg

This calculates proportions in column (longitudinal) directions
without showing raw values
table sex factorA, statistic(percent, across(factorA))

Table sex factorA statistic(percent across(factorA)).jpg

This calculates proportions in row (transverse) directions
without showing raw values
tabulate tabulate sex factorA, column

Tabulate sex factorA column.jpg

This calculates proportions in column (longitudinal) directions
tabulate sex factorA, row

Tabulate sex factorA row.jpg

This calculates proportions in row (transverse) directions
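
Counts and percentages can also be shown together. The sketch below assumes the sample dataset is in memory; the `statistic()` syntax of table with `percent` and `across()` belongs to the redesigned table command, so the second command assumes Stata 17 or later.

```stata
* Assumes the sample dataset is in memory

* tabulate: frequencies with both row and column percentages in each cell
tabulate sex factorA, row column

* table (Stata 17 or later): repeat statistic() to show raw counts
* alongside percentages computed across the levels of factorA
table sex factorA, statistic(frequency) statistic(percent, across(factorA))
```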

Two-way, multiple

Summary of factorA, factorB, factorC based on disease

tabstat tabstat factorA factorB factorC, by(disease)

Tabstat factorABC by(disease).jpg

tabstat factorA factorB factorC, by(disease) statistic(sum)

Tabstat factorABC by(disease) statistic(sum).jpg

factorA, factorB and factorC are binary (0/1) variables, so the sum of each variable equals its number of positive observations
tabstat factorA factorB factorC, by(disease) statistic(n)

Tabstat factorABC by(disease) statistic(n).jpg

statistic(n) (equivalent to statistic(count)) counts only the observations with non-missing values

Summary of factorA, factorB, factorC based on SES

tabstat tabstat factorA factorB factorC, by(SES)

Tabstat factorABC by(SES).jpg

tabstat factorA factorB factorC, by(SES) statistic(sum)

Tabstat factorABC by(SES) statistic(sum).jpg

factorA, factorB and factorC are binary (0/1) variables, so the sum of each variable equals its number of positive observations
tabstat factorA factorB factorC, by(SES) statistic(n)

Tabstat factorABC by(SES) statistic(n).jpg

statistic(n) (equivalent to statistic(count)) counts only the observations with non-missing values
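
The difference between sum and n is easiest to see on a tiny example with one missing value. The five observations below are invented for illustration and are not taken from the sample dataset.

```stata
* Hypothetical values: three positives, one negative, one missing
clear
input factorA
1
0
1
.
1
end

* sum counts the positives, n counts the non-missing observations
tabstat factorA, statistics(sum n mean)
* sum = 3, n = 4, mean = .75 (the proportion positive among non-missing)

* The missing observation can be counted separately
count if missing(factorA)
```

Because the variable is coded 0/1, the mean is the sum divided by n, so it can be read directly as the proportion of positives among observations with a recorded value.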

Two-way of binary/categorical plus summary of continuous

Summary of data1 based on disease and SES

tabulate tabulate disease SES, summarize(data1)

Tabulate disease SES summarize(data1).jpg

This reports the mean, SD and frequency of a continuous variable within each cell of a two-way table of binary/categorical variables
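
When the three-statistic cells become crowded, the summarize() output of tabulate can be trimmed with its suppression options. A sketch assuming the sample dataset is in memory:

```stata
* Assumes the sample dataset is in memory

* Default: mean, standard deviation and frequency in every cell
tabulate disease SES, summarize(data1)

* Drop the frequencies, keeping means and standard deviations
tabulate disease SES, summarize(data1) nofreq

* Keep the means only
tabulate disease SES, summarize(data1) nostandard nofreq
```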