Difference between revisions of "Regression model"

From Vaccipedia | Resources for Vaccines, Tropical medicine and Travel medicine
 
(21 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{Epi Stat}}
+
{{Floating_Menu}}
  
 
==Classification of Regression models==
 
==Classification of Regression models==
Line 6: Line 6:
 
|-
 
|-
 
!colspan="2" rowspan="2" style="width:150px"|
 
!colspan="2" rowspan="2" style="width:150px"|
!colspan="2"|Independent variable (exposure)
+
!colspan="3"|Independent variable (exposure)
 
|-
 
|-
!style="width:300px"|Monovariable (single variable)
+
!style="width:300px"|Univariable (single variable)
 
!style="width:300px"|Multivariable (multiple variables)
 
!style="width:300px"|Multivariable (multiple variables)
 +
!How to derive coefficients <math>b_i</math>
 
|-
 
|-
 
!rowspan="6"|Dependent<br>variable<br>(outcome)
 
!rowspan="6"|Dependent<br>variable<br>(outcome)
 
!Continuous
 
!Continuous
 
|
 
|
*'''Simple linear regression'''
+
*'''Single linear regression'''
 
::<math>Y = a + bX</math>
 
::<math>Y = a + bX</math>
 
 
|
 
|
 
*'''Multivariable&dagger; linear regression'''
 
*'''Multivariable&dagger; linear regression'''
 
::<math>Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math>
 
::<math>Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math>
 
+
|
 +
Least squares method
 
|-
 
|-
 
!Binary
 
!Binary
 
|
 
|
*'''Simple binary logistic regression'''
+
*'''Single binary logistic regression'''
::<math>\log Y = a + bX</math><br>where <math>Y</math> is odds of outcome
+
::<math>\log Y = a + bX</math><br>where <math>Y</math> is '''odds''' of outcome <math>\frac{p}{1-p}</math>
 
 
 
|
 
|
 
*'''Multivariable&dagger; binary logistic regression'''
 
*'''Multivariable&dagger; binary logistic regression'''
::<math>\log Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math><br>where <math>Y</math> is odds of outcome
+
::<math>\log Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math><br>where <math>Y</math> is '''odds''' of outcome <math>\frac{p}{1-p}</math>
 
+
|
 +
Maximum likelihood estimation method
 
|-
 
|-
 
!Multinomial<br>&ge; 3
 
!Multinomial<br>&ge; 3
 
|
 
|
*'''Simple multinomial logistic regression'''
+
*'''Single multinomial logistic regression'''
 
 
 
|
 
|
 
*'''Multivariable&dagger; multinomial logistic regression'''
 
*'''Multivariable&dagger; multinomial logistic regression'''
 
+
|
 +
Maximum likelihood estimation method
 
|-
 
|-
 
!Ordinal
 
!Ordinal
 
|
 
|
*'''Simple ordinal logistic regression'''
+
*'''Single ordinal logistic regression'''
 
 
 
|
 
|
 
*'''Multivariable&dagger; ordinal logistic regression'''
 
*'''Multivariable&dagger; ordinal logistic regression'''
 
+
|
 +
Maximum likelihood estimation method
 
|-
 
|-
 
!Rate ratio
 
!Rate ratio
Line 52: Line 53:
 
|
 
|
 
*'''Multivariable Poisson regression'''
 
*'''Multivariable Poisson regression'''
::<math>\log Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math><br>where <math>Y</math> is rate ratio
+
::<math>\log Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math><br>where <math>Y</math> is '''rate ratio''' <math>\frac{events_1/person \text{-} time_1}{events_2/person \text{-} time_2}</math>
 
+
|
 +
Maximum likelihood estimation method
 
|-
 
|-
 
!Survival time
 
!Survival time
Line 60: Line 62:
 
*'''Multivariable proportional hazard regression'''<br>= '''Cox hazard regression'''
 
*'''Multivariable proportional hazard regression'''<br>= '''Cox hazard regression'''
 
::<math>\log h(T) = \log h_0(T) + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math><br>where <math>h(T)</math> is the hazard at time <math>T</math><br>and <math>h_0(T)</math> is the baseline hazard at time <math>T</math>
 
::<math>\log h(T) = \log h_0(T) + b_1X_1 + b_2X_2 + b_3X_3 + \cdots</math><br>where <math>h(T)</math> is the hazard at time <math>T</math><br>and <math>h_0(T)</math> is the baseline hazard at time <math>T</math>
 +
|
 +
Maximum likelihood estimation method
 
|}
 
|}
  
 
&dagger;'Multivariable' can be rephrased as 'Multiple'; Multivariable is <font color="red">'''NOT equal to 'Multivariate'!!'''</font>
 
&dagger;'Multivariable' can be rephrased as 'Multiple'; Multivariable is <font color="red">'''NOT equal to 'Multivariate'!!'''</font>
 +
 +
==Binary logistic regression==
 +
===Conversion of logit of outcome odds to outcome probability <math>p</math>===
 +
The binary logistic regression equation can be solved for the outcome probability <math>p</math> as follows:
 +
:<math>
 +
\begin{array}{lrll}
 +
\log Y =        & \log \left ( \dfrac{p}{1-p} \right ) & = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots \\
 +
\Leftrightarrow & \dfrac{p}{1-p}                      & = \exp (a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots)
 +
                                                        & = e^{(a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots)} \\
 +
\Leftrightarrow & p                                  & = \dfrac { \exp (a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots) }{ 1 + \exp (a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots) }
 +
                                                        & = \dfrac { e^{(a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots)} }{ 1 + e^{(a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots)} } \\
 +
\end{array}
 +
</math>
 +
 +
===Conversion of coefficient to odds ratio===
 +
Let <math>p</math> be the outcome probability and <math>p\prime</math> the outcome probability after adding <math>1</math> to explanatory variable <math>X_1</math> while all other variables are held fixed. The following two equations are obtained:
 +
:<math>
 +
\begin{array}{lcl}
 +
\log \left ( \dfrac{p}{1-p} \right ) & = & a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots \\
 +
\log \left ( \dfrac{p\prime}{1-p\prime} \right ) & = & a + b_1( {\color{red}X_1 + 1} ) + b_2X_2 + b_3X_3 + \cdots
 +
\end{array}
 +
</math>
 +
 +
Subtracting the first equation from the second gives:
 +
:<math>
 +
\begin{align}
 +
\log \left ( \frac{p\prime}{1-p\prime} \right ) - \log \left ( \frac{p}{1-p} \right ) & = b_1({\color{red}X_1 + 1}) - b_1X_1 \\
 +
& = b_1 \\
 +
\Leftrightarrow \frac {\left ( \dfrac{p\prime}{1-p\prime} \right )}{ \left ( \dfrac{p}{1-p} \right ) } & = \exp (b_1) = e^{b_1}
 +
\end{align}
 +
</math>
 +
 +
 +
Because <math>\dfrac{p\prime}{1-p\prime}</math> and <math>\dfrac{p}{1-p}</math> are the odds corresponding to <math>p\prime</math> and <math>p</math>, respectively,
 +
 +
<math>\frac {\left ( \dfrac{p\prime}{1-p\prime} \right )}{ \left ( \dfrac{p}{1-p} \right ) }</math> is the '''odds ratio''' comparing the outcome after <math>1</math> is added to <math>X_1</math> with the outcome before the addition.
 +
 +
 +
Thus, converting <math>b_1</math> to <math>\color{red}{e^{b_1}}</math> or <math>\color{red}{\exp (b_1)}</math> gives the '''odds ratio of the outcome''' for a one-unit increase in <math>X_1</math>.
 +
 +
::<math>
 +
\begin{align}
 +
\exp (\text{coefficient}) = e^{\text{coefficient}} & = \text{odds ratio} \\
 +
\log (\text{odds ratio})                        & = \text{coefficient}
 +
\end{align}
 +
</math>
 +
 +
==Generalized linear model==
 +
 +
==Penalized multivariable logistic regression model==
 +
*[http://www.sthda.com/english/articles/36-classification-methods-essentials/149-penalized-logistic-regression-essentials-in-r-ridge-lasso-and-elastic-net/ Penalized Logistic Regression Essentials in R: Ridge, Lasso and Elastic Net]
 +
*[https://jojoshin.hatenablog.com/entry/2016/07/06/180923 About penalized/regularized regression model (in Japanese)]
 +
 +
==Restricted cubic spline==
 +
*[https://statakahiro.com/restricted-cubic-splines%E3%82%92stata%E3%81%A7%E5%AE%9F%E8%A1%8C%E3%81%97%E3%81%A6%E3%81%BF%E3%82%8B Trying out restricted cubic splines in Stata (in Japanese)]

Latest revision as of 13:19, 10 September 2023 (Sunday)


Classification of Regression models

{|
|-
!colspan="2" rowspan="2"|
!colspan="3"|Independent variable (exposure)
|-
!Univariable (single variable)
!Multivariable (multiple variables)
!How to derive coefficients [math]\displaystyle{ b_i }[/math]
|-
!rowspan="6"|Dependent<br>variable<br>(outcome)
!Continuous
|
*'''Single linear regression'''
::[math]\displaystyle{ Y = a + bX }[/math]
|
*'''Multivariable† linear regression'''
::[math]\displaystyle{ Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots }[/math]
|
Least squares method
|-
!Binary
|
*'''Single binary logistic regression'''
::[math]\displaystyle{ \log Y = a + bX }[/math]<br>where [math]\displaystyle{ Y }[/math] is '''odds''' of outcome [math]\displaystyle{ \frac{p}{1-p} }[/math]
|
*'''Multivariable† binary logistic regression'''
::[math]\displaystyle{ \log Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots }[/math]<br>where [math]\displaystyle{ Y }[/math] is '''odds''' of outcome [math]\displaystyle{ \frac{p}{1-p} }[/math]
|
Maximum likelihood estimation method
|-
!Multinomial<br>≥ 3
|
*'''Single multinomial logistic regression'''
|
*'''Multivariable† multinomial logistic regression'''
|
Maximum likelihood estimation method
|-
!Ordinal
|
*'''Single ordinal logistic regression'''
|
*'''Multivariable† ordinal logistic regression'''
|
Maximum likelihood estimation method
|-
!Rate ratio
|
|
*'''Multivariable Poisson regression'''
::[math]\displaystyle{ \log Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots }[/math]<br>where [math]\displaystyle{ Y }[/math] is '''rate ratio''' [math]\displaystyle{ \frac{events_1/person \text{-} time_1}{events_2/person \text{-} time_2} }[/math]
|
Maximum likelihood estimation method
|-
!Survival time
|
|
*'''Multivariable proportional hazard regression'''<br>= '''Cox hazard regression'''
::[math]\displaystyle{ \log h(T) = \log h_0(T) + b_1X_1 + b_2X_2 + b_3X_3 + \cdots }[/math]<br>where [math]\displaystyle{ h(T) }[/math] is the hazard at time [math]\displaystyle{ T }[/math]<br>and [math]\displaystyle{ h_0(T) }[/math] is the baseline hazard at time [math]\displaystyle{ T }[/math]
|
Maximum likelihood estimation method
|}

†'Multivariable' can be rephrased as 'Multiple'; Multivariable is NOT equal to 'Multivariate'!!
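
The table names the least squares method as the way to derive the coefficients of a linear regression. As a minimal sketch for the single-variable case [math]\displaystyle{ Y = a + bX }[/math], the closed-form least squares estimates can be computed as below. The function name `fit_simple_linear` and the data values are illustrative assumptions, not part of the original page.

```python
def fit_simple_linear(xs, ys):
    """Least squares estimates (a, b) for Y = a + bX.

    b = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    a = mean_y - b * mean_x
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_simple_linear(xs, ys)
print(a, b)  # 1.0 2.0
```

With several explanatory variables the same criterion is usually solved in matrix form by a statistics package rather than by hand.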

Binary logistic regression

Conversion of logit of outcome odds to outcome probability [math]\displaystyle{ p }[/math]

The binary logistic regression equation can be solved for the outcome probability [math]\displaystyle{ p }[/math] as follows:

[math]\displaystyle{ \begin{array}{lrll} \log Y = & \log \left ( \dfrac{p}{1-p} \right ) & = a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots \\ \Leftrightarrow & \dfrac{p}{1-p} & = \exp (a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots) & = e^{(a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots)} \\ \Leftrightarrow & p & = \dfrac { \exp (a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots) }{ 1 + \exp (a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots) } & = \dfrac { e^{(a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots)} }{ 1 + e^{(a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots)} } \\ \end{array} }[/math]
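
The conversion above can be checked numerically. The following is a minimal sketch in Python; the helper name `logit_to_probability` and the intercept, coefficients and covariate values are hypothetical and do not come from any fitted model.

```python
import math


def logit_to_probability(a, coefs, xs):
    """Convert the linear predictor a + b_1*X_1 + b_2*X_2 + ... to probability p."""
    log_odds = a + sum(b * x for b, x in zip(coefs, xs))
    odds = math.exp(log_odds)      # p / (1 - p)
    return odds / (1.0 + odds)     # p = e^z / (1 + e^z)


# Hypothetical intercept, coefficients and covariate values
p = logit_to_probability(a=-1.0, coefs=[0.5, 0.25], xs=[2.0, 0.0])
print(p)  # the log-odds here are 0, so p is 0.5
```

For very large positive log-odds `math.exp` can overflow, so production code would normally rely on a numerically stable routine from a statistics library instead of this direct formula.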

Conversion of coefficient to odds ratio

Let [math]\displaystyle{ p }[/math] be the outcome probability and [math]\displaystyle{ p\prime }[/math] the outcome probability after adding [math]\displaystyle{ 1 }[/math] to explanatory variable [math]\displaystyle{ X_1 }[/math] while all other variables are held fixed. The following two equations are obtained:

[math]\displaystyle{ \begin{array}{lcl} \log \left ( \dfrac{p}{1-p} \right ) & = & a + b_1X_1 + b_2X_2 + b_3X_3 + \cdots \\ \log \left ( \dfrac{p\prime}{1-p\prime} \right ) & = & a + b_1( {\color{red}X_1 + 1} ) + b_2X_2 + b_3X_3 + \cdots \end{array} }[/math]

Subtracting the first equation from the second gives:

[math]\displaystyle{ \begin{align} \log \left ( \frac{p\prime}{1-p\prime} \right ) - \log \left ( \frac{p}{1-p} \right ) & = b_1({\color{red}X_1 + 1}) - b_1X_1 \\ & = b_1 \\ \Leftrightarrow \frac {\left ( \dfrac{p\prime}{1-p\prime} \right )}{ \left ( \dfrac{p}{1-p} \right ) } & = \exp (b_1) = e^{b_1} \end{align} }[/math]


Because [math]\displaystyle{ \dfrac{p\prime}{1-p\prime} }[/math] and [math]\displaystyle{ \dfrac{p}{1-p} }[/math] are the odds corresponding to [math]\displaystyle{ p\prime }[/math] and [math]\displaystyle{ p }[/math], respectively,

[math]\displaystyle{ \frac {\left ( \dfrac{p\prime}{1-p\prime} \right )}{ \left ( \dfrac{p}{1-p} \right ) } }[/math] is the odds ratio comparing the outcome after [math]\displaystyle{ 1 }[/math] is added to [math]\displaystyle{ X_1 }[/math] with the outcome before the addition.


Thus, converting [math]\displaystyle{ b_1 }[/math] to [math]\displaystyle{ \color{red}{e^{b_1}} }[/math] or [math]\displaystyle{ \color{red}{\exp (b_1)} }[/math] gives the odds ratio of the outcome for a one-unit increase in [math]\displaystyle{ X_1 }[/math].

[math]\displaystyle{ \begin{align} \exp (\text{coefficient}) = e^{\text{coefficient}} & = \text{odds ratio} \\ \log (\text{odds ratio}) & = \text{coefficient} \end{align} }[/math]
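
The relationship between a coefficient and the odds ratio can be verified with a small numerical sketch. The intercept `a`, the coefficient `b1` and the values of [math]\displaystyle{ X_1 }[/math] below are hypothetical, and the helper names `probability` and `odds` are introduced only for illustration.

```python
import math


def probability(a, b1, x1):
    """Outcome probability p for the single-variable model log(p/(1-p)) = a + b1*x1."""
    z = a + b1 * x1
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds p / (1 - p) corresponding to probability p."""
    return p / (1.0 - p)


a, b1 = -2.0, 0.7                      # hypothetical intercept and coefficient
p_before = probability(a, b1, 3.0)     # X_1 = 3
p_after = probability(a, b1, 4.0)      # X_1 = 3 + 1

odds_ratio = odds(p_after) / odds(p_before)
print(odds_ratio, math.exp(b1))        # the two values agree up to rounding error
print(math.log(odds_ratio), b1)        # log(odds ratio) recovers the coefficient
```

The starting value of [math]\displaystyle{ X_1 }[/math] does not matter: the odds ratio for a one-unit increase is the same at any starting point, although the change in probability itself is not.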

Generalized linear model

Penalized multivariable logistic regression model

Restricted cubic spline