install.packages(“mlogit”)

library(mlogit)

Multinomial logit model

\(p_{ij}=P(y_i=j)=\dfrac{\exp{w_i^T \gamma_j}}{\sum_{k=1}^{m}\exp{w_i^T \gamma_k}}\)

m - pocet mozností j=1,..,m

Jeden koeficient \(\gamma=0\), normalizujeme. Ziskame base-group, referencni kategorii, ke ktere budeme intepretovat odhady zbylych koeficientu.

Budeme odhadovat jak prijem ovlivni to, jak bude clovek rybarit

ryby=read.csv("C:/Users/Lukas/Desktop/multinomial_fishing1.csv")
attach(ryby)

Income v tis. USD se nemeni se zmenou typu rybareni - pouzijeme Multinomial logit model

Nejprve si upravie data. Nyni totiz vidime pouze to, jak chyta dany rybar.
Data si upravime do tzv. dlouheho formatu, abychom u kazdeho jedince meli i zaznam k ostatnim typum zavisle promenne. Jsou to samozrejme nuly.

ryby$mode=as.factor(ryby$mode)  # typ rybareni musime zadefinovat jako faktor a nebo 0,1
mldata=mlogit.data(ryby, varying=4:15, choice="mode", shape="wide")
mldata[1:12,]
##            mode   price catchrate   income mode1     alt d  catch chid
## 1.beach   FALSE 157.930    0.5391 7.083332     4   beach 0 0.0678    1
## 1.charter  TRUE 182.930    0.5391 7.083332     4 charter 1 0.5391    1
## 1.pier    FALSE 157.930    0.5391 7.083332     4    pier 0 0.0503    1
## 1.private FALSE 157.930    0.5391 7.083332     4 private 0 0.2601    1
## 2.beach   FALSE  15.114    0.4671 1.250000     4   beach 0 0.1049    2
## 2.charter  TRUE  34.534    0.4671 1.250000     4 charter 1 0.4671    2
## 2.pier    FALSE  15.114    0.4671 1.250000     4    pier 0 0.0451    2
## 2.private FALSE  10.534    0.4671 1.250000     4 private 0 0.1574    2
## 3.beach   FALSE 161.874    0.2413 3.750000     3   beach 0 0.5333    3
## 3.charter FALSE  59.334    0.2413 3.750000     3 charter 0 1.0266    3
## 3.pier    FALSE 161.874    0.2413 3.750000     3    pier 0 0.4522    3
## 3.private  TRUE  24.334    0.2413 3.750000     3 private 1 0.2413    3

porovnejte s puvodnim formatem ryby

Multinomial logit model coefficients

Nejprve normalizujeme parametr pro “charter”"

mlogit.model1 <- mlogit(mode ~ 1 | income, data=mldata, reflevel="charter")  # 1 je zde protoze tu nemame zadne promenne co se meni s variantou, treba cena rybareni 
summary(mlogit.model1)
## 
## Call:
## mlogit(formula = mode ~ 1 | income, data = mldata, reflevel = "charter", 
##     method = "nr", print.level = 0)
## 
## Frequencies of alternatives:
## charter   beach    pier private 
## 0.38240 0.11337 0.15059 0.35364 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 8.32E-07 
## gradient close to zero 
## 
## Coefficients :
##                      Estimate Std. Error t-value  Pr(>|t|)    
## beach:(intercept)   -1.341291   0.194517 -6.8955 5.367e-12 ***
## pier:(intercept)    -0.527141   0.177784 -2.9651  0.003026 ** 
## private:(intercept) -0.602371   0.136096 -4.4261 9.597e-06 ***
## beach:income         0.031640   0.041846  0.7561  0.449591    
## pier:income         -0.111763   0.043979 -2.5413  0.011046 *  
## private:income       0.123546   0.027911  4.4265 9.577e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1477.2
## McFadden R^2:  0.013736 
## Likelihood ratio test : chisq = 41.145 (p.value = 6.0931e-09)

S vyssim prijmem:

  • poroste pravdepodobnost, ze dany jedinec bude rybarit ze soukrome lode, nez z pujcene lode.

  • klesa pravdepodobnost, ze bude rybarit z mola, nez z pujcene lode.

Zmenime referencni promennou na “pier”

mlogit.model2 <- mlogit(mode ~ 1 | income, data = mldata, reflevel="pier")
summary(mlogit.model2)
## 
## Call:
## mlogit(formula = mode ~ 1 | income, data = mldata, reflevel = "pier", 
##     method = "nr", print.level = 0)
## 
## Frequencies of alternatives:
##    pier   beach charter private 
## 0.15059 0.11337 0.38240 0.35364 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 8.32E-07 
## gradient close to zero 
## 
## Coefficients :
##                      Estimate Std. Error t-value  Pr(>|t|)    
## beach:(intercept)   -0.814150   0.228632 -3.5610 0.0003695 ***
## charter:(intercept)  0.527141   0.177784  2.9651 0.0030262 ** 
## private:(intercept) -0.075229   0.183240 -0.4106 0.6814007    
## beach:income         0.143403   0.053288  2.6911 0.0071223 ** 
## charter:income       0.111763   0.043979  2.5413 0.0110455 *  
## private:income       0.235309   0.043668  5.3886 7.101e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1477.2
## McFadden R^2:  0.013736 
## Likelihood ratio test : chisq = 41.145 (p.value = 6.0931e-09)

S vyssim prijmem poroste pravdepodobnost, ze dany jedinec vybere variantu rybareni beach, obou lodi nez z mola. Jinak receno s roustoucim prijem se stavaji uvedene 3 varianty pravdepodobnejsi, nez ze ho najdeme na mole.

Marginal effects

POZOR znaménko koeficientu nutne nemusí korespondovat s marginalnim efektem!

Opet musime zvolit, zdali budeme pouzivat - marginal effects at mean (MEM) a average marginal effects (AME).

The marginal effect - zvyseni regresoru, zmeni pravdepodobnost vyberu alternativy j o:

\(\dfrac{\partial p_{ij}}{\partial w_i} = p_{ij}(\gamma_i-\bar{\gamma}_i)\)

m = mlogit(mode ~ price+catch |income, data = mldata, reflevel="charter")
z = with(mldata, data.frame(price = tapply(price, index(m)$alt, mean),catch = tapply(catch, index(m)$alt, mean), income=mean(income)))

Marginalni efekt

effects(mlogit.model1, covariate = "income", data = z)
##       charter         beach          pier       private 
## -1.201367e-02  7.495845e-05 -2.065980e-02  3.259851e-02
effects(mlogit.model2, covariate = "income", data = z)
##          pier         beach       charter       private 
## -2.065980e-02  7.495844e-05 -1.201367e-02  3.259851e-02

Interpretace v procentech, ale pozor na jednotky regresoru.

S rustem Income o jednotku (1000 USD) dojde k:

  • prob. vyberu charter klesne o 1.2%

  • prob. vyberu pier klesne o 2%

  • prob. vyberu private vzroste o 3.2%

Conditional logit model

Nyni si do regrese zahrneme i promenne, ktere se budou menit s variantou. Budou to cena rybareni a “catch rate”

\(p_{ij}=P(y_i=j)=\dfrac{\exp{(x_{it}^T \beta + w_i^T \gamma_j)}}{\sum_{k=1}^{m}\exp{(x_{it}^T \beta+w_i^T \gamma_k)}}\)

VSIMNETE SI, ZE \(\beta\) NEMA ZADNY INDEX. NEMENI SE S VARIANTOU

Conditional logit coefficients

Opet zacneme s interpretaci vzhledem k referencni kategorii “charter”.

clogit.model1 <- mlogit(mode ~ price+catch |income, data = mldata, reflevel="charter")
summary(clogit.model1)
## 
## Call:
## mlogit(formula = mode ~ price + catch | income, data = mldata, 
##     reflevel = "charter", method = "nr", print.level = 0)
## 
## Frequencies of alternatives:
## charter   beach    pier private 
## 0.38240 0.11337 0.15059 0.35364 
## 
## nr method
## 7 iterations, 0h:0m:0s 
## g'(-H)^-1g = 1.37E-05 
## successive function values within tolerance limits 
## 
## Coefficients :
##                       Estimate Std. Error  t-value  Pr(>|t|)    
## beach:(intercept)   -1.6943657  0.2240506  -7.5624 3.952e-14 ***
## pier:(intercept)    -0.9164063  0.2072648  -4.4214 9.805e-06 ***
## private:(intercept) -1.1670869  0.1590475  -7.3380 2.169e-13 ***
## price               -0.0251166  0.0017317 -14.5042 < 2.2e-16 ***
## catch                0.3577820  0.1097733   3.2593  0.001117 ** 
## beach:income         0.0332917  0.0503409   0.6613  0.508403    
## pier:income         -0.0942854  0.0500600  -1.8834  0.059640 .  
## private:income       0.1227315  0.0286306   4.2867 1.813e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1215.1
## McFadden R^2:  0.18868 
## Likelihood ratio test : chisq = 565.17 (p.value = < 2.22e-16)

NEMENNE PROMENNE

Ve srovnani s charter fishing, tak s rostoucim prijmem:

  • roste prob. privatni lode

  • klesa pravdepodobnost na mole

VARIABILNI PROMENNE

s rostouci cenou alternativy, klesa pravdepodobnost, ze bude vybrana tato varianta a roste pravdepodobnost, ze bude vybrana jina varianta.

S rostouci “catch rate” alternativy, je naopak pravdepodobnejsi, ze bude zvolena alternativa.

Nyni bude referenci kategorie “prier”.

clogit.model2 <- mlogit(mode ~ price+catch | income, data = mldata, reflevel="pier")
summary(clogit.model2)
## 
## Call:
## mlogit(formula = mode ~ price + catch | income, data = mldata, 
##     reflevel = "pier", method = "nr", print.level = 0)
## 
## Frequencies of alternatives:
##    pier   beach charter private 
## 0.15059 0.11337 0.38240 0.35364 
## 
## nr method
## 7 iterations, 0h:0m:0s 
## g'(-H)^-1g = 1.37E-05 
## successive function values within tolerance limits 
## 
## Coefficients :
##                       Estimate Std. Error  t-value  Pr(>|t|)    
## beach:(intercept)   -0.7779594  0.2204939  -3.5283 0.0004183 ***
## charter:(intercept)  0.9164063  0.2072648   4.4214 9.805e-06 ***
## private:(intercept) -0.2506806  0.2039395  -1.2292 0.2190004    
## price               -0.0251166  0.0017317 -14.5042 < 2.2e-16 ***
## catch                0.3577820  0.1097733   3.2593 0.0011170 ** 
## beach:income         0.1275771  0.0506395   2.5193 0.0117582 *  
## charter:income       0.0942854  0.0500600   1.8834 0.0596396 .  
## private:income       0.2170169  0.0500582   4.3353 1.456e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1215.1
## McFadden R^2:  0.18868 
## Likelihood ratio test : chisq = 565.17 (p.value = < 2.22e-16)

NEMENNE PROMENNE

Ve srovnani s olem, tak s rostoucim prijmem:

  • roste prob., ze si rybar zvoli jednu ze 3 variant

VARIABILNI PROMENNE

Nemeni se, hodnoty zustavaji stejne.

Marginal effects

\(\dfrac{\partial p_{ij}}{\partial x_{ik}} = p_{ij}(\delta_{ijk}-p_{ik})\beta\)

Budeme zkoumat, marginalni efekt jednotlivých regresoru na závisle promennou.

effects(clogit.model1, covariate = "income", data = z)
##       charter         beach          pier       private 
## -0.0217339246 -0.0007214181 -0.0093059734  0.0317613161

Kdyz vzroste income o (1000 USD), tak prob. výberu private vzroste o 3.2%.

effects(clogit.model1, covariate = "price", data = z)
##               charter         beach          pier       private
## charter -0.0062430047  6.091542e-04  7.642235e-04  0.0048696270
## beach    0.0006091541 -1.249124e-03  8.681094e-05  0.0005531588
## pier     0.0007642234  8.681094e-05 -1.545008e-03  0.0006939736
## private  0.0048696270  5.531588e-04  6.939737e-04 -0.0061167595

Kdyz vzroste cena charter o 1 USD, tak klesne prob. vyberu charter o 6.24%. Kdyz vzroste cena beach o 1 USD, tak vzroste prob. vyberu charter o 0.6%. Kdyz vzroste cena pier o 1 USD, tak vzroste prob. vyberu charter o 0.07%. Kdyz vzroste cena private o 1 USD, tak vzroste prob. vyberu charter o 0.04%.

effects(clogit.model1, covariate = "catch", data = z)
##              charter        beach         pier      private
## charter  0.088930726 -0.008677316 -0.010886256 -0.069367154
## beach   -0.008677329  0.017793621 -0.001236612 -0.007879681
## pier    -0.010886272 -0.001236612  0.022008455 -0.009885571
## private -0.069367164 -0.007879671 -0.009885559  0.087132394

Ordered

install.packages(“rms”)

library(rms)

Vliv promennych na zdravotni stav - (fair=1, good=2, excellent=3).

zdravi=read.csv("C:/Users/Lukas/Desktop/ordered_health.csv")
ologit<- lrm(healthstatus1 ~ age+logincome+numberdiseases, data=zdravi)
print(ologit)
## Logistic Regression Model
##  
##  lrm(formula = healthstatus1 ~ age + logincome + numberdiseases, 
##      data = zdravi)
##  
##                        Model Likelihood     Discrimination    Rank Discrim.    
##                           Ratio Test           Indexes           Indexes       
##  Obs          5574    LR chi2     740.39    R2       0.148    C       0.675    
##   1            523    d.f.             3    g        0.799    Dxy     0.350    
##   2           2034    Pr(> chi2) <0.0001    gr       2.224    gamma   0.357    
##   3           3017                          gp       0.075    tau-a   0.198    
##  max |deriv| 2e-11                          Brier    0.077                     
##  
##                 Coef    S.E.   Wald Z Pr(>|Z|)
##  y>=2            1.3960 0.2061   6.77 <0.0001 
##  y>=3           -0.9513 0.2054  -4.63 <0.0001 
##  age            -0.0293 0.0017 -17.43 <0.0001 
##  logincome       0.2837 0.0231  12.27 <0.0001 
##  numberdiseases -0.0550 0.0041 -13.51 <0.0001 
## 

\(\ln{(\dfrac{\pi_1} {\pi_3})}=\)

\(\ln{(\dfrac{\pi_2} {\pi_3})}=\)

\(\pi_1=\dfrac{\exp{L_1}}{1+\exp{L_1}+\exp{L_2}}\)

\(\pi_2=\dfrac{\exp{L_2}}{1+\exp{L_1}+\exp{L_2}}\)

\(\pi_k=\dfrac{1}{1+\exp{L_1}+\exp{L_2}}\)