Advanced Econometrics - Part II - Chapter 4: Discrete choice analysis: Multinomial Models


Nam T. Hoang, UNE Business School, University of New England

Chapter 4
DISCRETE CHOICE ANALYSIS: MULTINOMIAL MODELS

We look at settings with multiple, unordered choices. A key notion here is the "independence of irrelevant alternatives" (IIA) property.

Models for discrete choice with more than two choices:

Assume that the $i$-th consumer faces $J$ choices ($j = 1, 2, \dots, J$) and that the utility of choice $j$ is

$$U_{ij} = X_{ij}\beta + \varepsilon_{ij}.$$

If the consumer makes choice $j$, we assume that $U_{ij}$ is the maximum among the $J$ utilities. The probability that individual $i$ makes choice $j$ is therefore

$$\text{Prob}(U_{ij} > U_{ik}) \quad \text{for all } k \neq j.$$

Let $Y_i$ be a random variable that indicates the choice made:

$$Y_i = j \quad \text{if } U_{ij} > U_{ik} \text{ for all } k \neq j.$$

The model is completed by a particular choice of distribution for the disturbances. McFadden (1974) has shown that if and only if the $J$ disturbances are independent and identically distributed with the type I extreme value distribution,

$$F(\varepsilon_{ij}) = \exp(-\exp(-\varepsilon_{ij})) = \exp(-e^{-\varepsilon_{ij}}),$$

then

$$\text{Prob}(Y_i = j) = \frac{\exp(X_{ij}\beta)}{\sum_{l=1}^{J}\exp(X_{il}\beta)}.$$

More generally, utility depends on $Z_{ij}$, which includes aspects specific to the individual ($i$) as well as to the choice ($j$):

$$\text{Prob}(Y_i = j) = \frac{\exp(Z_{ij}\theta)}{\sum_{l=1}^{J}\exp(Z_{il}\theta)}.$$

Let $Z_{ij} = [X_{ij}, w_i]$ and $\theta = [\beta, \alpha]$.

• $X_{ij}$ varies across choices ($j$) and possibly across individuals ($i$) as well.
• $w_i$ contains the characteristics of individual $i$ and is therefore the same for all choices.

$$\text{Prob}(Y_i = j) = \frac{\exp(X_{ij}\beta + w_i\alpha)}{\sum_{l=1}^{J}\exp(X_{il}\beta + w_i\alpha)} = \frac{\exp(X_{ij}\beta)\exp(w_i\alpha)}{\left[\sum_{l=1}^{J}\exp(X_{il}\beta)\right]\exp(w_i\alpha)} = \frac{\exp(X_{ij}\beta)}{\sum_{l=1}^{J}\exp(X_{il}\beta)}$$

A term that is the same for every choice therefore cancels out of the probability; individual characteristics matter only if their coefficients differ across choices.

For example, in a model of shopping centre choice, the utility of centre $j$ for individual $i$ depends on the number of stores $S_{ij}$, the distance from the centre of the city $D_{ij}$, and the income of the individual $I_i$, which varies across individuals but not across choices:

$$Z_{ij} = (S_{ij}, D_{ij}, I_i).$$

I. THE MULTINOMIAL LOGIT MODEL:

Suppose we have only individual-specific characteristics $w_i$, which are the same for all choices. The model's response probabilities are

$$\text{Prob}(Y_i = j \mid w_i) = P_{ij} = \frac{\exp(w_i\alpha_j)}{1 + \sum_{l=1}^{J}\exp(w_i\alpha_l)}$$

for all choices $j = 1, \dots, J$. For the first choice $j = 0$, to satisfy $\sum_{j=0}^{J} P_{ij} = 1$:

$$\text{Prob}(Y_i = 0 \mid w_i) = P_{i0} = \frac{1}{1 + \sum_{l=1}^{J}\exp(w_i\alpha_l)}.$$

The log-likelihood is

$$\ln L = \sum_{i=1}^{n}\sum_{j=0}^{J} d_{ij}\ln P_{ij},$$

where $d_{ij} = 1$ if alternative $j$ is chosen by individual $i$ and 0 if not. Its derivatives are

$$\frac{\partial \ln L}{\partial \alpha_j} = \sum_{i=1}^{n}(d_{ij} - P_{ij})\,w_i, \quad j = 1, \dots, J.$$

The marginal effects of the characteristics on the probabilities are

$$\delta_{ijk} = \frac{\partial P_{ij}}{\partial w_{ik}} = P_{ij}\left[\alpha_{jk} - \sum_{l=0}^{J} P_{il}\alpha_{lk}\right] = P_{ij}\left[\alpha_{jk} - \bar{\alpha}_k\right], \quad \text{where } \bar{\alpha}_k = \sum_{l=0}^{J} P_{il}\alpha_{lk}.$$

II. CONDITIONAL LOGIT MODEL:

When the data consist of choice-specific characteristics $X_{ij}$ instead of individual-specific characteristics, the model is

$$\text{Prob}(Y_i = j \mid X_{i1}, X_{i2}, \dots, X_{iJ}) = \text{Prob}(Y_i = j \mid X_i) = P_{ij} = \frac{\exp(X_{ij}\beta)}{\sum_{l=0}^{J}\exp(X_{il}\beta)}.$$

Notes:
  - When the regressors $w_i$ do not vary across choices (multinomial logit) → the coefficients $\alpha_j$ vary across choices.
  - When the regressors $X_{ij}$ vary across choices (conditional logit) → the coefficient $\beta$ is the same for all choices.

The multinomial logit model can be viewed as a special case of the conditional logit model. Suppose we have a vector of individual characteristics $X_i$ with dimension $K$. Then define for each choice $j$ the vector $X_{ij}$ by placing $X_i$ in the $j$-th of $J$ blocks of length $K$ and zeros elsewhere:

$$X_{i0} = [0, 0, \dots, 0]$$
$$X_{i1} = [X_i, 0, \dots, 0]$$
$$\vdots$$
$$X_{ij} = [0, \dots, X_i, \dots, 0]$$
$$\vdots$$
$$X_{iJ} = [0, 0, \dots, X_i]$$

So $X_{ij}$ varies across choices even though $X_i$ does not. Stacking the coefficients as

$$\beta = (\beta_1', \beta_2', \dots, \beta_J')',$$

with each $\beta_j$ of dimension $K \times 1$, gives $X_{ij}\beta = X_i\beta_j$ and

$$P_{ij} = \frac{\exp(X_{ij}\beta)}{\sum_{l=0}^{J}\exp(X_{il}\beta)} = \frac{\exp(X_i\beta_j)}{1 + \sum_{l=1}^{J}\exp(X_i\beta_l)}.$$

In this model, the coefficients are not directly tied to the marginal effects:

$$\frac{\partial P_{ij}}{\partial x_{im}} = \left[P_{ij}\big(1(j = m) - P_{im}\big)\right]\beta,$$

where $1(j = m)$ equals 1 if $j = m$ and 0 if not.

Log-likelihood:

$$\ln L = \sum_{i=1}^{n}\sum_{j=0}^{J} d_{ij}\ln P_{ij}.$$

III. MIXED LOGIT MODEL:

A model that combines the two previous models:

$$\text{Prob}(Y_i = j) = \frac{\exp(X_{ij}\beta + W_i\alpha_j)}{\sum_{l=1}^{J}\exp(X_{il}\beta + W_i\alpha_l)} \;\rightarrow\; \Pr[Y_i = j] = \frac{\exp(Z_{ij}\theta)}{\sum_{l=1}^{J}\exp(Z_{il}\theta)},$$

where

$$Z_{i1} = [X_{i1}, 0, 0, \dots, 0]$$
$$Z_{i2} = [X_{i2}, 0, W_i, \dots, 0]$$
$$Z_{ij} = [X_{ij}, 0, \dots, W_i, \dots, 0]$$
$$Z_{iJ} = [X_{iJ}, 0, \dots, 0, W_i]$$

$$\theta = (\beta, \alpha_1, \dots, \alpha_j, \dots, \alpha_J)'.$$

Here $W_i$ sits in the block of $Z_{ij}$ that multiplies $\alpha_j$, so that $Z_{ij}\theta = X_{ij}\beta + W_i\alpha_j$, with $\alpha_1$ normalized to zero.

This model does not share one advantage of the conditional logit model. In the conditional logit model, if an additional alternative is added to the choice set, one can predict its probability of selection, because the parameters do not vary across alternatives.

IV. INDEPENDENCE OF IRRELEVANT ALTERNATIVES:

• The ratio of the probabilities of any two alternatives is independent of the introduction of a third alternative. This is unrealistic in many economic choice models.
• In the multinomial logit and conditional logit models, $P_{ij}/P_{im}$ is independent of the remaining probabilities. This property is called the Independence of Irrelevant Alternatives (IIA).
• Consider the conditional probability of choosing $j$ given that either $j$ or $l$ is chosen:
$$\text{Prob}(Y_i = j \mid Y_i \in \{j, l\}) = \frac{\Pr(Y_i = j)}{\Pr(Y_i = j) + \Pr(Y_i = l)} = \frac{\exp(X_{ij}\beta)}{\exp(X_{ij}\beta) + \exp(X_{il}\beta)}$$

• This probability does not depend on the characteristics $X_{im}$ of alternatives $m$ other than $j$ and $l$. The traditional example is McFadden's famous blue bus/red bus example.
• Suppose there are initially three choices: commuting by car, by red bus or by blue bus.
• People are indifferent between red and blue buses, $U_{i,\text{redbus}} = U_{i,\text{bluebus}}$, with the choice between the blue and the red bus being random. Suppose $X_{i,\text{redbus}} = X_{i,\text{bluebus}} = X_{i,\text{bus}}$. Then suppose that the probability of commuting by bus is

$$\Pr(Y_i = \text{bus}) = \Pr(Y_i = \text{redbus or bluebus}) = \frac{\exp(X_{i,\text{bus}}\beta)}{\exp(X_{i,\text{bus}}\beta) + \exp(X_{i,\text{car}}\beta)}$$

and

$$\Pr(Y_i = \text{redbus} \mid Y_i = \text{bus}) = \frac{1}{2}.$$

• That would imply that the conditional probability of commuting by car, given that one commutes by car or red bus, would differ from the same conditional probability if there were no blue bus. Presumably taking away the blue bus choice would lead all the current blue bus users to shift to the red bus, not to cars.
• In the conditional logit model, however, the ratio $P_{il}/P_{ik} = \exp(X_{il}\beta - X_{ik}\beta)$ does not depend on any alternative other than $l$ and $k$.
• The conditional logit model does not allow for this type of substitution pattern. Again, consider a commuter initially choosing between two modes of transportation, car and red bus, and suppose that

$$\frac{P_{ic}}{P_{irb}} = \exp(X_{\text{car}}\beta - X_{\text{redbus}}\beta) = 1.$$

• Now suppose a third choice, blue bus, is added. Assuming bus commuters do not care about the colour of the bus, consumers will choose between the two buses with equal probability. The ratio of the probabilities of taking the red bus and the blue bus is 1: $P_{irb}/P_{ibb} = 1$.
But then IIA implies that $P_{ic}/P_{irb}$ is the same whether or not another alternative (blue bus) is added, so we have

$$\frac{P_{irb}}{P_{ibb}} = \frac{P_{ic}}{P_{irb}} = 1 \quad \text{and} \quad P_{ic} + P_{irb} + P_{ibb} = 1, \quad \text{hence} \quad P_{ic} = P_{irb} = P_{ibb} = \frac{1}{3},$$

which are the probabilities that the logit model predicts.
• In real life, however, we would expect the probability of taking a car to remain the same when a new bus is introduced that is exactly the same as the old bus. We would expect the original probability of taking the bus to be split between the two buses after the second one is introduced. That is, we would expect:

$$P_{ic} = \frac{1}{2}, \quad P_{ibb} = \frac{1}{4}, \quad P_{irb} = \frac{1}{4}.$$

• In this case, the logit model, because of its IIA property, underestimates the probability of taking a car (1/3 instead of 1/2) and overestimates the probability of taking a bus. The ratio of the probabilities of car and red bus, $P_{ic}/P_{irb}$, actually changes (from 1 to 2) with the introduction of the blue bus, rather than remaining constant as required by the logit model.
• The same kind of misprediction arises with logit models when the characteristics of another alternative change.
  - Suppose individuals choose among three restaurants: the Purdue restaurant (P), the Krannert restaurant (K) and the Chauncey restaurant (C), with prices $P_p = 95$, $P_k = 85$, $P_c = 5$ and qualities $Q_p = 10$, $Q_k = 9$, $Q_c = 2$. Suppose that the market shares of the three restaurants are $S_p = 0.1$, $S_k = 0.25$ and $S_c = 0.65$.
  - With utility $U_{ij} = -0.2 P_j + 2 Q_j + \varepsilon_{ij}$, the conditional logit model implies $P_{ip}/P_{ic} = 0.1/0.65$.
  - Suppose that the Krannert restaurant raises its price to 1000 (taking it out of business).
  - The conditional logit model would predict $P_{ip} = 0.13$ and $P_{ic} = 0.87$, so that the ratio $P_{ip}/P_{ic} = 0.1/0.65$ stays constant.
  - This seems implausible: people who were planning to go to Krannert would appear to be more likely to go to the Purdue restaurant (PMU) than to Chauncey, so one would expect $S_p \approx 0.35$ and $S_c \approx 0.65$. IIA does not hold in reality, so the conditional logit model is not valid in this case.

IIA: adding another alternative, or changing the characteristics of a third alternative, does not affect the ratio between the probabilities of two alternatives.

• Test of IIA
Hausman and McFadden offer tests of the IIA assumption based on the following observation: if the conditional logit model is true, $\beta$ can be consistently estimated by conditional logit using any subset of the alternatives. Hausman's test compares the estimate of $\beta$ that uses all alternatives with the estimate that uses a subset of alternatives:

$$(\hat{\beta}_s - \hat{\beta}_f)'\,[\hat{V}_s - \hat{V}_f]^{-1}\,(\hat{\beta}_s - \hat{\beta}_f) \sim \chi^2,$$

where $s$ denotes the restricted subset and $f$ the full set of alternatives, and the null hypothesis is $H_0: \beta_s = \beta_f$.
• IIA must hold for the conditional logit model to apply. If $H_0$ is rejected → IIA does not hold → the conditional logit model is not a valid model in this case.
• In the model $U_{ij} = X_{ij}\beta + \varepsilon_{ij}$:
  - The IIA assumption needs to hold in reality for the conditional logit model to apply.
  - The IIA property follows from the initial assumption that the $\varepsilon_{ij}$ are independent with extreme value distributions.

V. NESTED LOGIT MODEL.

• If the test of IIA fails ($H_0: \beta_s = \beta_f$ is rejected) then the conditional logit model is not valid and we need to modify the multinomial logit model. One way to introduce correlation between the choices is through nesting them.
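Before turning to the nested structure, the IIA arithmetic of Section IV can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper name `logit_probs` and the utility index values are illustrative assumptions, and only the market shares 0.10, 0.25 and 0.65 are taken from the restaurant example.

```python
import math

def logit_probs(v):
    """Conditional logit probabilities for a dict {alternative: index X_ij * beta}."""
    m = max(v.values())  # subtract the maximum for numerical stability
    expv = {k: math.exp(x - m) for k, x in v.items()}
    total = sum(expv.values())
    return {k: e / total for k, e in expv.items()}

# Red bus / blue bus: equal utility indices (the value 0.0 is an arbitrary assumption).
two = logit_probs({"car": 0.0, "red_bus": 0.0})
three = logit_probs({"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0})
print(two)    # car and red bus each get 1/2
print(three)  # each alternative gets 1/3, so P(car) falls from 1/2 to 1/3

# IIA: the car / red-bus odds are unchanged by adding the blue bus.
assert math.isclose(two["car"] / two["red_bus"], three["car"] / three["red_bus"])

# Restaurants: indices set to log-shares (an assumption) so the logit reproduces the shares.
shares = {"P": 0.10, "K": 0.25, "C": 0.65}
v = {k: math.log(s) for k, s in shares.items()}
after = logit_probs({k: x for k, x in v.items() if k != "K"})  # K leaves the choice set
print({k: round(p, 2) for k, p in after.items()})  # {'P': 0.13, 'C': 0.87}
```

Setting the indices to log-shares is only a device to make the logit match the stated shares; the point is that removing an alternative rescales the remaining probabilities proportionally, which is exactly the IIA restriction.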
Suppose the set of choices $\{0, 1, \dots, J\}$ can be partitioned into $S$ sets $B_1, B_2, \dots, B_S$, so that the full set of choices can be written as

$$\{0, 1, \dots, J\} = \bigcup_{s=1}^{S} B_s.$$

Let $Z_s$ be set-specific characteristics (branch characteristics). McFadden (1981) studied the following model, in which the probabilities are adjusted by set-specific parameters $\rho_s$.
• Conditional probability:

$$\Pr(Y_i = j \mid X_i, Y_i \in B_s) = \frac{\exp(\rho_s^{-1}X_{ij}\beta)}{\sum_{l \in B_s}\exp(\rho_s^{-1}X_{il}\beta)}$$

• Within the sets, the correlation coefficient for the $\varepsilon_{ij}$ is equal to $(1 - \rho_s^2)$. Between the sets the $\varepsilon_{ij}$ are independent → the probabilities are adjusted by $\rho_s$ in each group. The probability of a choice in the set $B_s$ is

$$\Pr(Y_i \in B_s \mid X_i) = \frac{\exp(Z_s\alpha)\left[\sum_{l \in B_s}\exp(\rho_s^{-1}X_{il}\beta)\right]^{\rho_s}}{\sum_{t=1}^{S}\exp(Z_t\alpha)\left[\sum_{l \in B_t}\exp(\rho_t^{-1}X_{il}\beta)\right]^{\rho_t}}.$$

The product of the two probabilities gives $\Pr(Y_i = j \mid X_i)$ for $j \in B_s$. If we fix $\rho_s = 1$ for all $s$, then

$$\Pr(Y_i = j \mid X_i) = \frac{\exp(X_{ij}\beta + Z_s\alpha)}{\sum_{t=1}^{S}\sum_{l \in B_t}\exp(X_{il}\beta + Z_t\alpha)}$$

and we are back in the conditional logit model.
  - In general this model corresponds to individuals choosing the option with the highest utility, where the utility of choice $j$ in set $B_s$ for individual $i$ is

$$U_{ij} = X_{ij}\beta + Z_s\alpha + \varepsilon_{ij}.$$

McFadden supposes that the joint distribution function of the $\varepsilon_{ij}$ is

$$F(\varepsilon_{i0}, \dots, \varepsilon_{iJ}) = \exp\left(-\sum_{s=1}^{S}\left(\sum_{j \in B_s}\exp(-\rho_s^{-1}\varepsilon_{ij})\right)^{\rho_s}\right).$$

From this he derives the probabilities given above.
• How do we estimate these models?
  - One approach is to construct the log-likelihood and maximize it directly. That is complicated, especially since the log-likelihood function is not concave, but it is not impossible.
  - An easier alternative is to use the nesting structure directly. Within a nest we have a conditional logit model with coefficient $\rho_s^{-1}\beta$.
Hence we can directly estimate $\rho_s^{-1}\beta$ using the concavity of the conditional logit log-likelihood (the Newton-Raphson procedure will converge to a global maximum). Denote these estimates of $\rho_s^{-1}\beta$ by $\hat{\lambda}_s$.
  - Then the probability of a particular set $B_s$ can be used to estimate $\rho_s$ and $\alpha$ through

$$\Pr(Y_i \in B_s \mid X_i) = \frac{\exp(Z_s\alpha)\left(\sum_{l \in B_s}\exp(X_{il}\hat{\lambda}_s)\right)^{\rho_s}}{\sum_{t=1}^{S}\exp(Z_t\alpha)\left(\sum_{l \in B_t}\exp(X_{il}\hat{\lambda}_t)\right)^{\rho_t}} = \frac{\exp(Z_s\alpha + \rho_s\hat{W}_s)}{\sum_{t=1}^{S}\exp(Z_t\alpha + \rho_t\hat{W}_t)},$$

where $\hat{W}_s$ is called the "inclusive value":

$$\hat{W}_s = \ln\left(\sum_{l \in B_s}\exp(X_{il}\hat{\lambda}_s)\right).$$

• We then have another conditional logit model, with likelihood function

$$\prod_{i=1}^{n}\ \prod_{s:\,Y_i \in B_s}\Pr(Y_i \in B_s \mid X_i) = \prod_{i=1}^{n}\ \prod_{s:\,Y_i \in B_s}\frac{\exp(Z_s\alpha + \rho_s\hat{W}_s)}{\sum_{t=1}^{S}\exp(Z_t\alpha + \rho_t\hat{W}_t)}.$$

• These models can be extended to many layers of nests. It should be noted that both the order of the nests and the elements of each nest are very important.

VI. MULTINOMIAL PROBIT MODEL:

• The IIA problem arises because the logit models rule out correlation across choices. A natural alternative that avoids it is to work with normally distributed errors, $\varepsilon_{ij} \sim N(\cdot)$. We no longer assume that the $\varepsilon_{ij}$ follow an extreme value distribution.
• Note that the extreme value distribution is close to the normal distribution, but the extreme value distribution is much easier to compute with.
• The cost of using the normal distribution is a complicated likelihood function.

$$U_{ij} = X_{ij}\beta + \varepsilon_{ij}, \quad j = 1, 2, \dots, J$$

$$U_i = \begin{pmatrix} U_{i1} \\ U_{i2} \\ \vdots \\ U_{iJ} \end{pmatrix} = \begin{pmatrix} X_{i1}\beta + \varepsilon_{i1} \\ X_{i2}\beta + \varepsilon_{i2} \\ \vdots \\ X_{iJ}\beta + \varepsilon_{iJ} \end{pmatrix}, \quad \text{with } \varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{iJ} \end{pmatrix} \Bigg|\, X_i \sim N(0, \Sigma)$$

with unrestricted covariance matrix $\Sigma$. Then

$$\Pr(Y_i = q) = \Pr[U_{iq} > U_{ij},\ j = 1, \dots, J,\ j \neq q],$$

or

$$\Pr(Y_i = q) = \Pr[\varepsilon_{i1} - \varepsilon_{iq} < (X_{iq} - X_{i1})\beta;\ \dots;\ \varepsilon_{iJ} - \varepsilon_{iq} < (X_{iq} - X_{iJ})\beta].$$

• The main obstacle to the implementation of the multinomial probit (MNP) model is the difficulty in computing the multivariate normal probabilities for any $J > 2$.
• Recent results on accurate simulation of multivariate normal integrals have made estimation of the MNP model feasible.
• For the method, read Geweke, Keane and Runkle (1994), Review of Economics and Statistics 76, No. 4, if you want to use the multinomial probit model.
• For $J = 3$:

$$P(y_i = 1) = P(U_{i1} > U_{i2};\ U_{i1} > U_{i3}) = P\big(u_1 < (X_{i1} - X_{i2})\beta;\ u_2 < (X_{i1} - X_{i3})\beta\big),$$

where $u_1 = \varepsilon_{i2} - \varepsilon_{i1}$ and $u_2 = \varepsilon_{i3} - \varepsilon_{i1}$. This is the integral of the bivariate normal density of $(u_1, u_2)$ over the region defined by the two inequalities, with

$$\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \sim N(0, \Sigma^*), \quad \Sigma^* = \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}\Sigma\begin{pmatrix} -1 & -1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

• Each element of the likelihood is a double integral and must be evaluated numerically.
• This model does not suffer from the IIA problem.

VII. ORDERED LOGIT, ORDERED PROBIT & SEQUENTIAL MODELS:

1. Ordered Probit:

$$Y_i^* = X_i\beta + \varepsilon_i$$

$Y_i^*$ is unobservable. What we observe is

$$Y_i = \begin{cases} 0 & \text{if } Y_i^* \leq 0 \\ 1 & \text{if } 0 < Y_i^* \leq \mu_1 \\ 2 & \text{if } \mu_1 < Y_i^* \leq \mu_2 \\ \ \vdots & \\ J & \text{if } \mu_{J-1} < Y_i^* \end{cases}$$

$\mu_1, \mu_2, \dots, \mu_{J-1}$ are unknown parameters to be estimated together with $\beta$. Assume that $\varepsilon$ is normally distributed across observations, and normalize its mean and variance so that $\varepsilon \sim N(0, 1)$. We have:

$$P(y_i = 0 \mid X_i) = \Phi(-X_i\beta)$$
$$P(y_i = 1 \mid X_i) = \Phi(\mu_1 - X_i\beta) - \Phi(-X_i\beta)$$
$$P(y_i = 2 \mid X_i) = \Phi(\mu_2 - X_i\beta) - \Phi(\mu_1 - X_i\beta)$$
$$\vdots$$
$$P(y_i = J \mid X_i) = 1 - \Phi(\mu_{J-1} - X_i\beta)$$

We must have $0 < \mu_1 < \mu_2 < \dots < \mu_{J-1}$ for all the probabilities to be positive.

Likelihood function:

$$L = \prod_{i=1}^{n}\Pr(Y_i = y_i \mid X_i),$$

the product over all observations of the probability of the observed category $y_i \in \{0, 1, \dots, J\}$.

Marginal effects (with $\mu_0 = 0$):

$$\frac{\partial P(Y_i = 0 \mid X_i)}{\partial X_{ik}} = -\phi(X_i\beta)\,\beta_k$$
$$\frac{\partial P(Y_i = j \mid X_i)}{\partial X_{ik}} = [\phi(\mu_{j-1} - X_i\beta) - \phi(\mu_j - X_i\beta)]\,\beta_k, \quad 0 < j < J$$
$$\frac{\partial P(Y_i = J \mid X_i)}{\partial X_{ik}} = \phi(\mu_{J-1} - X_i\beta)\,\beta_k$$

2. Ordered Logit:

Replacing $\Phi$ with the logistic function

$$F(X_i\beta) = \frac{\exp(X_i\beta)}{1 + \exp(X_i\beta)}$$

gives the ordered logit model.

3. Sequential Multinomial Models:

A special case of an ordered variable (where the choices have a natural ranking) is a sequential variable. This occurs when the second event is dependent on the first event, the third event is dependent on the previous two events, and so on.
  - Person $i$ being in the $n$-th category means that person $i$ has passed through all $(n-1)$ previous categories:

$$y_i = \begin{cases} 1 & \text{not high school} \\ 2 & \text{high school, not college} \\ 3 & \text{college} \end{cases}$$

$$\Pr[y_i = 2] = \Pr[y_i = 2 \mid y_i \neq 1] \times \Pr[y_i \neq 1] = \Phi(X_{2i}\beta_2)\,(1 - \Phi(X_{1i}\beta_1))$$

The parameters $\beta_1$ and $\beta_2$ can be estimated by maximizing the log-likelihood

$$\ln L = \sum_{i=1}^{n}\sum_{j=1}^{m} y_{ij}\ln p_{ij},$$

where $y_{ij} = 1$ if person $i$ is in category $j$ and 0 otherwise, $p_{i1} = \Phi(X_{1i}\beta_1)$, $p_{i2}$ is given in the preceding equation, and $p_{i3} = 1 - p_{i1} - p_{i2}$.

Note: $P(y_i = 2)$ means $P(y_i = 2 \text{ and } y_i \neq 1)$.
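The ordered probit probabilities of Section VII.1 can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than anything from the notes: the function names `norm_cdf` and `ordered_probit_probs`, the index value 0.3 and the cut-points 0.8 and 1.9 are all hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(xb, mu):
    """Return [P(y=0), ..., P(y=J)] for the index xb = X_i * beta.

    mu holds the cut-points mu_1 < ... < mu_{J-1}; the first threshold
    is normalised to 0, as in the text.
    """
    cuts = [0.0] + list(mu)
    if any(a >= b for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cut-points must satisfy 0 < mu_1 < ... < mu_{J-1}")
    cdf = [norm_cdf(c - xb) for c in cuts]              # Phi(mu_j - X_i*beta), mu_0 = 0
    probs = [cdf[0]]                                    # P(y=0) = Phi(-X_i*beta)
    probs += [hi - lo for lo, hi in zip(cdf, cdf[1:])]  # interior categories
    probs.append(1.0 - cdf[-1])                         # P(y=J) = 1 - Phi(mu_{J-1} - X_i*beta)
    return probs

# Hypothetical values: index 0.3 and cut-points (0.8, 1.9), so J = 3.
p = ordered_probit_probs(0.3, [0.8, 1.9])
print([round(v, 4) for v in p])
print(sum(p))  # the probabilities sum to 1 up to rounding error
```

Swapping `norm_cdf` for the logistic function `1 / (1 + math.exp(-z))` would give the ordered logit probabilities of Section VII.2 with the same cut-point logic.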

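The multinomial logit log-likelihood and its derivative from Section I, $\partial \ln L / \partial \alpha_j = \sum_i (d_{ij} - P_{ij}) w_i$, can be checked against a finite difference. The sketch below is a hedged illustration: the function names, the toy data `W` and `y`, and the parameter values are assumptions made up for this check and are not taken from the notes.

```python
import math

def mnl_probs(w, alphas):
    """Return [P_i0, ..., P_iJ] for one individual with characteristics w.

    alphas[j-1] is the coefficient vector alpha_j for j = 1..J;
    alpha_0 is normalised to zero (base choice j = 0).
    """
    idx = [0.0] + [sum(wk * ak for wk, ak in zip(w, a)) for a in alphas]
    m = max(idx)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in idx]
    s = sum(e)
    return [v / s for v in e]

def log_lik(W, y, alphas):
    """ln L = sum_i sum_j d_ij ln P_ij, where y[i] in 0..J is the chosen alternative."""
    return sum(math.log(mnl_probs(w, alphas)[yi]) for w, yi in zip(W, y))

def score(W, y, alphas):
    """Analytic derivative: sum_i (d_ij - P_ij) w_i for j = 1..J."""
    g = [[0.0] * len(a) for a in alphas]
    for w, yi in zip(W, y):
        p = mnl_probs(w, alphas)
        for j in range(1, len(alphas) + 1):
            d = 1.0 if yi == j else 0.0
            for k, wk in enumerate(w):
                g[j - 1][k] += (d - p[j]) * wk
    return g

# Hypothetical toy data: a constant plus one characteristic, J = 2 (three alternatives).
W = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
y = [0, 2, 1, 2]
alphas = [[0.2, -0.4], [-0.1, 0.3]]

# Central finite difference with respect to the second element of alpha_1.
h = 1e-6
up = [[0.2, -0.4 + h], [-0.1, 0.3]]
dn = [[0.2, -0.4 - h], [-0.1, 0.3]]
numeric = (log_lik(W, y, up) - log_lik(W, y, dn)) / (2 * h)
analytic = score(W, y, alphas)[0][1]
print(analytic, numeric)  # the two values should agree closely
```

A full estimator would pass `log_lik` and `score` to a numerical optimiser; the check above only confirms that the derivative formula and the probabilities are mutually consistent.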