An overview of hierarchical Bayesian estimation and the random coefficients model in econometrics, covering sample data generation, prior densities for structural parameters, the random coefficients model, discrete vs. continuous parameter variation, and estimation of a latent class (LC) model. The slides also discuss the EM algorithm and its application to mixed models.
20. Individual Heterogeneity
and Random Parameter Variation
Heterogeneity
Observational: observable differences across individuals
Choice strategy: how consumers make decisions
Structural: differences in model frameworks
Preferences: differences in model 'parameters'
Distinguish Bayes and Classical
Both depart from the heterogeneous 'model':
$f(y_{it} \mid x_{it}) = g(y_{it}, x_{it}, \beta_i)$
What do we mean by 'randomness'?
With respect to the information of the analyst (Bayesian)
With respect to some stochastic process governing 'nature' (Classical)
Bayesian: no difference between 'fixed' and 'random' parameters.
Classical: full specification of joint distributions for observed random variables; piecemeal definitions of 'random' parameters. Usually a form of 'random effects'.
Hierarchical Bayesian Estimation
Sample data generation: $f(y_{it} \mid x_{it}) = g(y_{it}, x_{it}, \beta_i, \Omega)$
Individual heterogeneity: $\beta_i = \beta + u_i$, $u_i \sim N[0, \Gamma]$
What information exists about 'the model?'

Prior densities for structural parameters $\beta$, $\Gamma$, $\Omega$:
$p(\beta) = N[\beta^0, A]$, e.g., $\beta^0 = 0$ and $A$ large
$p(\Gamma) = \text{Inverse Wishart}[\Sigma, v^0]$, e.g., $\Sigma = \gamma^0 I$ with $v^0$ (large)
$p(\Omega)$ = whatever works for the other parameters in the model

Priors for the parameters of interest $\beta_i$ follow from $p(\beta_i \mid \beta, \Gamma) = N[\beta, \Gamma]$.
End result: a joint prior distribution for all parameters,
$p(\beta, \Gamma, \Omega, \beta_i \mid \text{prior 'beliefs' in } \beta^0, \Sigma, \gamma, A, \text{ assumed densities})$
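To make the hierarchy concrete, here is a minimal Gibbs-sampling sketch assuming a linear kernel for $g$ with a known error variance, so the conditional posteriors are conjugate. This is one standard way such a model is sampled, not the method on the slides; all names (N, T, K, s2) and values are illustrative.

```python
import numpy as np
from scipy.stats import invwishart

# Toy data: y_it = x_it' b_i + e_it, e_it ~ N(0, s2), s2 fixed for brevity.
rng = np.random.default_rng(0)
N, T, K, s2 = 50, 10, 3, 1.0
X = rng.normal(size=(N, T, K))
b_true = rng.normal(size=(N, K))
y = np.einsum("ntk,nk->nt", X, b_true) + rng.normal(size=(N, T))

beta = np.zeros(K)          # current draw of the population mean vector
Gamma = np.eye(K)           # current draw of the heterogeneity covariance
b = np.zeros((N, K))        # current draws of the individual b_i

for sweep in range(500):    # in practice, keep draws after a burn-in period
    Ginv = np.linalg.inv(Gamma)
    # 1. b_i | y_i, beta, Gamma: conjugate normal update, unit by unit
    for i in range(N):
        V = np.linalg.inv(X[i].T @ X[i] / s2 + Ginv)
        m = V @ (X[i].T @ y[i] / s2 + Ginv @ beta)
        b[i] = rng.multivariate_normal(m, V)
    # 2. beta | {b_i}, Gamma: with a diffuse prior, normal around the mean
    beta = rng.multivariate_normal(b.mean(axis=0), Gamma / N)
    # 3. Gamma | beta, {b_i}: inverse Wishart (diffuse prior, df = K + 2)
    S = (b - beta).T @ (b - beta)
    Gamma = invwishart.rvs(df=K + 2 + N, scale=np.eye(K) + S)
```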
Priors
Prior densities:
$\beta_i \sim N[\beta, V_\beta]$
Implies $\beta_i = \beta + w_i$, $w_i \sim N[0, V_\beta]$
$\lambda_j \sim \text{Inverse Gamma}[v, s_j]$ (looks like chi-squared), $v = 3$

Priors over structural model parameters:
$\beta \sim N[\beta^0, a V_\beta]$
$V_\beta \sim \text{Wishart}[v_0, V_0]$, $v_0 = 8$, $V_0 = 8I$
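A short sketch of drawing from these prior densities with scipy. The dimension K, the value of a, and $\beta^0 = 0$ are placeholders; the inverse-gamma scale $s_j$ is truncated in the source, so the 1.0 below is purely illustrative.

```python
import numpy as np
from scipy.stats import wishart, invgamma

rng = np.random.default_rng(1)
K = 4                                    # number of random parameters (illustrative)

# V_beta ~ Wishart[v0, V0] with v0 = 8, V0 = 8I, as on the slide
V_beta = wishart.rvs(df=8, scale=8 * np.eye(K), random_state=rng)

# beta ~ N[beta0, a * V_beta]; beta0 = 0 and a = 1 are placeholders
beta0, a = np.zeros(K), 1.0
beta = rng.multivariate_normal(beta0, a * V_beta)

# beta_i = beta + w_i, w_i ~ N[0, V_beta]
beta_i = beta + rng.multivariate_normal(np.zeros(K), V_beta)

# lambda_j ~ Inverse Gamma[v, s_j], v = 3; the s_j value is cut off in the
# source, so scale = 1.0 here is only a placeholder
lam = invgamma.rvs(a=3, scale=1.0, random_state=rng)
```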
Bayesian Posterior Analysis
Estimation of posterior distributions for the upper-level parameters: $\beta$ and $V_\beta$.
Estimation of posterior distributions for the lower (individual) level parameters: $\beta_i \mid \text{data}_i$.
Detailed examination of 'individual' parameters.
(Comparison of results to counterparts using classical methods.)
Antonio Alvarez, University of Oviedo
Carlos Arias, University of Leon
William Greene, Stern School of Business, New York University
The Production Function Model
Definition: maximal output, given the inputs.
Inputs: variable factors, quasi-fixed (land).
Form: log-quadratic (translog).
Latent management as an unobservable input.

With one variable input $x$ and latent management $m$:
$$\ln y_{it} = \alpha + \beta_x \ln x_{it} + \tfrac{1}{2}\beta_{xx}(\ln x_{it})^2 + \beta_m m_i + \tfrac{1}{2}\beta_{mm} m_i^2 + \beta_{xm} \ln x_{it}\, m_i + v_{it}$$

In general, with $K$ inputs and inefficiency $u_{it}$:
$$\ln y_{it} = \ln y^*_{it} - u_{it}$$
$$\ln y^*_{it} = \alpha + \sum_{k=1}^{K} \beta_k \ln x_{itk} + \tfrac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{K} \beta_{kl}\ln x_{itk}\ln x_{itl} + \beta_m m^*_i + \tfrac{1}{2}\beta_{mm} m^{*2}_i + \sum_{k=1}^{K} \beta_{km}\ln x_{itk}\, m^*_i + v_{it}$$

$m^*_i$ is an unobserved, time-invariant effect.
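To make the functional form concrete, here is a small simulation of the one-input version of the translog with latent management. All parameter values and distributions are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100, 6
# Invented parameter values: one variable input x and latent management m*
alpha, b_x, b_xx, b_m, b_mm, b_xm = 1.0, 0.6, -0.05, 0.4, -0.10, 0.08

m = rng.normal(size=N)                           # m*_i: unobserved, time invariant
lnx = rng.normal(size=(N, T))                    # log input
v = rng.normal(scale=0.1, size=(N, T))           # symmetric noise
u = np.abs(rng.normal(scale=0.2, size=(N, T)))   # one-sided inefficiency

lny_star = (alpha + b_x * lnx + 0.5 * b_xx * lnx**2
            + b_m * m[:, None] + 0.5 * b_mm * (m**2)[:, None]
            + b_xm * lnx * m[:, None] + v)
lny = lny_star - u                               # ln y_it = ln y*_it - u_it
```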
Random Coefficients Model

Collecting the terms in $m_i$:
$$\ln y_{it} = \left(\alpha + \beta_m m_i + \tfrac{1}{2}\beta_{mm} m_i^2\right) + \sum_{k=1}^{K} (\beta_k + \beta_{km} m_i)\ln x_{itk} + \tfrac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{K} \beta_{kl}\ln x_{itk}\ln x_{itl} + v_{it} - u_{it}$$
$$= \alpha_i + \sum_{k=1}^{K} \beta_{ki}\ln x_{itk} + \tfrac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{K} \beta_{kl}\ln x_{itk}\ln x_{itl} + \varepsilon_{it}$$

with $\beta_{ki} = \beta_k + \beta_{km} m_i$.

[Chamberlain/Mundlak:]
(1) The same random effect appears in each random parameter.
(2) Only the first-order terms are random.
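The substitution above is just a regrouping. The few lines below trace the same mapping from $m_i$ to the heterogeneous intercept and first-order coefficients, with invented parameter values (names and numbers are illustrative).

```python
import numpy as np

rng = np.random.default_rng(3)
# Invented structural parameters for K = 2 inputs
alpha, b_m, b_mm = 1.0, 0.4, -0.10
b_k = np.array([0.6, 0.3])                    # first-order input coefficients
b_km = np.array([0.08, -0.02])                # management interactions

m = rng.normal(size=100)                      # draws of the latent effect m_i
alpha_i = alpha + b_m * m + 0.5 * b_mm * m**2     # heterogeneous intercept
beta_ki = b_k[None, :] + np.outer(m, b_km)        # beta_ki = beta_k + beta_km * m_i
# The Chamberlain/Mundlak point in code: the single effect m_i drives every
# random coefficient, and only the first-order terms vary across i.
```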
Discrete Parameter Variation
The Latent Class Model
(1) The population is a (finite) mixture of Q types of individuals, q = 1, ..., Q. The Q 'classes' are differentiated by $\beta_q$.
(a) The analyst does not know class memberships ('latent').
(b) The 'mixing probabilities' (from the point of view of the analyst) are $\pi_1, \dots, \pi_Q$, with $\sum_{q=1}^{Q} \pi_q = 1$.
(2) The conditional density is
$$P(y_{it} \mid \text{class} = q) = f(y_{it} \mid x_{it}, \beta_q)$$
Heterogeneity with respect to 'latent' consumer classes:
$$\Pr(\text{choice}_i) = \sum_{q=1}^{Q} \Pr(\text{choice}_i \mid \text{class} = q)\Pr(\text{class} = q)$$
$$\Pr(\text{choice}_i \mid \text{class} = q) = \frac{\exp(x_{i,\text{choice}}'\beta_q)}{\sum_{j=\text{choices}} \exp(x_{i,j}'\beta_q)}, \quad \text{e.g., a multinomial logit}$$
$$\Pr(\text{class} = q \mid i) = F_{i,q} = \frac{\exp(z_i'\delta_q)}{\sum_{q=\text{classes}} \exp(z_i'\delta_q)}$$

Simple discrete random parameter variation:
$$\Pr(\text{choice}_i \mid \beta_i) = \frac{\exp(x_{i,\text{choice}}'\beta_i)}{\sum_{j=\text{choices}} \exp(x_{i,j}'\beta_i)}$$
$$\Pr(\beta_i = \beta_q) = \frac{\exp(z_i'\delta_q)}{\sum_{q=\text{classes}} \exp(z_i'\delta_q)}, \quad q = 1, \dots, Q$$
$$\Pr(\text{choice}_i) = \sum_{q=1}^{Q} \Pr(\text{choice}_i \mid \beta_i = \beta_q)\Pr(\beta_i = \beta_q)$$
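A compact sketch of the two logits on this slide: class-membership probabilities built from $z_i$ and $\delta_q$, and choice probabilities built from the attributes and $\beta_q$. All shapes, names, and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
N, J, Q, Kx, Kz = 200, 3, 2, 4, 2
X = rng.normal(size=(N, J, Kx))          # attributes of the J alternatives
Z = rng.normal(size=(N, Kz))             # individual characteristics
beta = rng.normal(size=(Q, Kx))          # class-specific taste parameters
delta = rng.normal(size=(Q, Kz))         # class-membership parameters

def softmax(v, axis):
    e = np.exp(v - v.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Pr(class = q | z_i): multinomial logit over classes
pi = softmax(Z @ delta.T, axis=1)                      # (N, Q)

# Pr(choice = j | class = q): multinomial logit over alternatives
utils = np.einsum("njk,qk->nqj", X, beta)              # (N, Q, J)
p_choice = softmax(utils, axis=2)                      # (N, Q, J)

# Pr(choice = j) = sum_q Pr(choice = j | class = q) Pr(class = q)
p_marginal = np.einsum("nq,nqj->nj", pi, p_choice)     # (N, J)
```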
Estimating an LC Model
The conditional density for each observation is
$$P(y_{it} \mid x_{it}, \text{class} = q) = f(y_{it} \mid x_{it}, \beta_q)$$
The joint conditional density for the $T_i$ observations is
$$f(y_{i1}, y_{i2}, \dots, y_{iT_i} \mid X_i, \beta_q) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$$
($T_i$ may be 1. This is not only a 'panel data' model.)
Maximize this for each class if the classes are known. They aren't. The unconditional density for individual i is
$$f(y_{i1}, y_{i2}, \dots, y_{iT_i} \mid X_i, z_i) = \sum_{q=1}^{Q} \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$$
Log likelihood:
$$\log L(\beta_1, \dots, \beta_Q, \delta_1, \dots, \delta_Q) = \sum_{i=1}^{N} \log \sum_{q=1}^{Q} \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$$
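A sketch of evaluating this unconditional log likelihood for a generic kernel $f$; here a binary-logit density stands in for $f(y_{it} \mid x_{it}, \beta_q)$, with constant mixing probabilities and toy data. All names and values are illustrative, not the application on the slides.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T, Q, K = 200, 5, 2, 3
X = rng.normal(size=(N, T, K))
y = rng.integers(0, 2, size=(N, T)).astype(float)   # binary outcome (toy data)
beta = rng.normal(size=(Q, K))                      # class-specific parameters
pi = np.full(Q, 1.0 / Q)                            # mixing probabilities

def loglik(beta, pi):
    # f(y_it | x_it, beta_q) for a binary-logit kernel
    idx = np.einsum("ntk,qk->nqt", X, beta)          # (N, Q, T)
    p = 1.0 / (1.0 + np.exp(-idx))
    f = np.where(y[:, None, :] == 1.0, p, 1.0 - p)   # (N, Q, T)
    joint = f.prod(axis=2)                           # prod over t: (N, Q)
    return np.log(joint @ pi).sum()                  # sum_i log sum_q pi_q prod_t f

print(loglik(beta, pi))
```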
Estimating Which Class
Prior class probability: $\Pr[\text{class} = q \mid z_i] = \pi_{iq}$.
The joint conditional density for the $T_i$ observations is
$$P(y_{i1}, y_{i2}, \dots, y_{iT_i} \mid X_i, \text{class} = q) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$$
The joint density for the data and class membership is the product
$$P(y_{i1}, y_{i2}, \dots, y_{iT_i}, \text{class} = q \mid X_i, z_i) = \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)$$
The posterior probability for the class, given the data, is
$$P(\text{class} = q \mid y_{i1}, \dots, y_{iT_i}, X_i, z_i) = \frac{P(y_i, \text{class} = q \mid X_i, z_i)}{P(y_{i1}, \dots, y_{iT_i} \mid X_i, z_i)}$$
Use Bayes' theorem to compute the posterior (conditional) probability:
$$w(q \mid y_i, X_i, z_i) = \frac{\pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)}{\sum_{q=1}^{Q} \pi_{iq} \prod_{t=1}^{T_i} f(y_{it} \mid x_{it}, \beta_q)} = w_{iq}$$
Best guess = the class with the largest posterior probability.
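The posterior weights $w_{iq}$ and the 'best guess' follow directly from Bayes' theorem, continuing the binary-logit toy setup from the previous sketch (all names and values illustrative).

```python
import numpy as np

rng = np.random.default_rng(6)
N, T, Q, K = 200, 5, 2, 3
X = rng.normal(size=(N, T, K))
y = rng.integers(0, 2, size=(N, T)).astype(float)
beta = rng.normal(size=(Q, K))
pi = np.full(Q, 1.0 / Q)                             # prior Prob[class = q]

idx = np.einsum("ntk,qk->nqt", X, beta)
p = 1.0 / (1.0 + np.exp(-idx))
f = np.where(y[:, None, :] == 1.0, p, 1.0 - p)
joint = pi[None, :] * f.prod(axis=2)                 # pi_q * prod_t f(y_it|x_it,b_q)
w = joint / joint.sum(axis=1, keepdims=True)         # posterior w(q | y_i, X_i)
best_guess = w.argmax(axis=1)                        # class with largest posterior
```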