# Economics homework help

Intro to Econometrics FINAL EXAM Thursday 05/14/20
T. Christensen Time Allowed: 24 hours
Question 1. (25 points in total, each part is worth 5 points)
You wish to estimate the causal effect $\beta_1$ of $X$ on $Y$:
$$Y_i = \beta_0 + \beta_1 X_i + u_i . \tag{1}$$
You are concerned that endogeneity bias might lead to inconsistency of the OLS estimate of $\beta_1$. You
have a control variable $C_i$, which is not binary. The control variable satisfies conditional mean
independence:
$$E[u_i \mid X_i, C_i] = E[u_i \mid C_i] . \tag{2}$$
However, the conditional mean of $u_i$ depends on $C_i$ in a nonlinear fashion:
$$E[u_i \mid C_i] = \gamma_0 + \gamma_1 C_i + \gamma_2 C_i^2 , \tag{3}$$
where each of the coefficients is non-zero.
You have data on $X_i$, $Y_i$ and $C_i$ drawn i.i.d. from their joint distribution. You also know that each
of $Y_i$ and $X_i$ has finite nonzero fourth moments and $C_i$ has a finite nonzero eighth moment.
Hint: conditioning on $C_i$ is the same as conditioning on $C_i$ and $C_i^2$. This is because $C_i^2$ contains no
extra information beyond that contained in $C_i$. Therefore, $E[u_i \mid C_i] = E[u_i \mid C_i, C_i^2]$ and similarly for
other conditional expectations.
(a) Propose an approach for consistently estimating $\beta_1$ from data on $X_i$, $Y_i$ and $C_i$.
Be sure to clearly describe the model you would estimate. You should state what the dependent and explanatory variable/s are and the method you would use to estimate $\beta_1$.
(b) Write the model from (a) in BLP form. In answering, clearly relate the BLP coefficients to
(c) Show that the procedure you describe in part (a) will produce a consistent and unbiased
estimate of $\beta_1$.
You do not need to provide a formal proof of consistency and unbiasedness, but you should
be able to show whether or not the relevant key assumption is satisfied.
(d) Briefly explain and distinguish the concepts of consistency and unbiasedness. In answering,
give an example of an estimator we’ve used this semester which is consistent but not unbiased.
(e) How, if at all, would your answer to (a) change if $C_i$ was binary? Explain.
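The setup of Question 1 can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the sample size, the data-generating process for `X` and `C`, and every coefficient are hypothetical and do not come from the exam. It compares a regression of `Y` on `X` alone with one that also includes `C` and its square.

```r
# Hypothetical simulation: all numeric values are assumptions for illustration.
set.seed(1)
n <- 5000
C <- rnorm(n)                      # non-binary control variable
X <- 0.5 * C + rnorm(n)            # regressor correlated with the control

# Error term whose conditional mean depends on C only, and is quadratic in C,
# so that E[u | X, C] = E[u | C] = 1 + 2*C + 1.5*C^2 (assumed coefficients).
u <- 1 + 2 * C + 1.5 * C^2 + rnorm(n)

beta0 <- 0.3                       # assumed intercept
beta1 <- 2                         # assumed causal effect of X on Y
Y <- beta0 + beta1 * X + u

fit_short <- lm(Y ~ X)               # ignores the control
fit_long  <- lm(Y ~ X + C + I(C^2))  # includes C and C^2 as controls

coef(fit_short)["X"]               # slope when the control is omitted
coef(fit_long)["X"]                # slope when C and C^2 are included
```

In this simulated design the slope on `X` in `fit_long` should sit close to the assumed value of 2, while the slope in `fit_short` should not.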
Question 2. (15 points in total, each part is worth 5 points)
You wish to investigate whether an individual’s previous union membership status influences their
current status. You have panel data on individuals’ union membership over 4 years ($t = 1, 2, 3, 4$)
on the variable $M_{it}$, which takes the value 1 if individual $i$ was a union member in year $t$ and 0
otherwise. You model individual $i$’s utility from choosing to be a union member ($U_1$) or not ($U_0$) in
year $t$ as a function of previous membership status $M_{i,t-1}$, a fixed effect $\alpha_i$, and random components
$\varepsilon_{it,1}$ and $\varepsilon_{it,0}$:
$$U_1(M_{i,t-1}, \alpha_i, \varepsilon_{it,1}) = u_1(M_{i,t-1}, \alpha_i) + \varepsilon_{it,1} , \tag{4}$$
$$U_0(M_{i,t-1}, \alpha_i, \varepsilon_{it,0}) = u_0(M_{i,t-1}, \alpha_i) + \varepsilon_{it,0} . \tag{5}$$
The $\varepsilon_{it,0}$ and $\varepsilon_{it,1}$ terms represent the parts of individual $i$’s utility from each choice in year $t$ that
are not explained by previous membership status and the fixed effect. These are drawn randomly
each year whereas the fixed effect is constant over time. You assume
$$u_1(M_{i,t-1}, \alpha_i) - u_0(M_{i,t-1}, \alpha_i) = \beta_1 M_{i,t-1} + \alpha_i . \tag{6}$$
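You also assume that, for each year $t$, the conditional distribution of $\varepsilon_{it,1} - \varepsilon_{it,0}$ given $M_{i,t-1}$ and
$\alpha_i$ is a logistic distribution:
$$(\varepsilon_{it,1} - \varepsilon_{it,0}) \mid M_{i,t-1}, \alpha_i \ \text{has cdf } \Lambda, \quad \text{where } \Lambda(u) = \frac{1}{1 + e^{-u}} . \tag{7}$$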
(a) Derive an expression for $\Pr(M_{it} = 1 \mid M_{i,t-1}, \alpha_i)$.
(b) Explain the role of the individual fixed effects in this model. What is it that we are attempting
to control for by the inclusion of individual fixed effects?
(c) Unlike panel regression models, here there is no obvious way to difference out the individual
fixed effect $\alpha_i$ from the expression you obtained in (a). After some algebra, you deduce
$$\Pr(M_{i2} = 1 \mid M_{i4}, M_{i2} + M_{i3} = 1, M_{i1}, \alpha_i) = \frac{1}{1 + e^{-\beta_1 (M_{i1} - M_{i4})}} , \tag{8}$$
$$\Pr(M_{i2} = 0 \mid M_{i4}, M_{i2} + M_{i3} = 1, M_{i1}, \alpha_i) = \frac{e^{-\beta_1 (M_{i1} - M_{i4})}}{1 + e^{-\beta_1 (M_{i1} - M_{i4})}} . \tag{9}$$
Describe how you could use these expressions to estimate $\beta_1$. Be sure to clearly describe the
model you would estimate. You should state what the dependent and explanatory variable/s
are, the (subset of) data you would use, and the method you would use to estimate $\beta_1$.
Hint: You might want to consider only “switchers”: these are individuals who change union
membership status between dates 2 and 3 (i.e., for whom $M_{i2} + M_{i3} = 1$).
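The switcher idea in the hint can be tried out on simulated data. The sketch below is a hedged illustration, not a definitive solution: the sample size, the true slope, the distribution of the fixed effects, and the rule generating first-year status are all assumptions made here. It fits a logit with no intercept to switchers only, using year-2 status as the outcome and the difference between year-1 and year-4 status as the single regressor, which is one way to read expressions (8) and (9).

```r
# Hypothetical simulation: every numeric value and the initial-condition rule
# for M1 are assumptions for illustration, not part of the exam.
set.seed(2)
n     <- 20000
beta1 <- 1                                   # assumed true state dependence
alpha <- rnorm(n)                            # individual fixed effects

M1 <- rbinom(n, 1, plogis(alpha))            # assumed rule for year-1 status
M2 <- rbinom(n, 1, plogis(beta1 * M1 + alpha))
M3 <- rbinom(n, 1, plogis(beta1 * M2 + alpha))
M4 <- rbinom(n, 1, plogis(beta1 * M3 + alpha))
d  <- data.frame(M1, M2, M3, M4)

# Keep only individuals who switch status between years 2 and 3.
sw <- subset(d, M2 + M3 == 1)

# Logit of M2 on (M1 - M4) without an intercept, estimated by maximum likelihood.
fit <- glm(M2 ~ 0 + I(M1 - M4), family = binomial, data = sw)
coef(fit)
```

The fixed effects `alpha` are never used in the estimation step, which is the point of conditioning on switchers.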
Question 3. (15 points in total, each part is worth 5 points)
Two economists wish to investigate whether news about COVID-19 related hospitalizations triggers
consumers to seek face masks, disinfectant, and the like, in response. Together, they assemble a
data set of COVID-19 related hospitalizations in New York and a Google Trends index of searches
for “face mask” in New York. The data are daily and span the period February 29 to May 7, 2020.
[Figure: plot of cd\$gti (Google Trends index) against Time.]
Figure 2: Hospitalizations. [Plot of cd\$hosp against Time.]
The first economist runs a regression of the Google Trends index $gti_t$ on the total number of
hospitalizations the previous day $hosp_t$ (note: $hosp_t$ represents the total number of hospitalizations
on day $t - 1$, since this is only known at the end of date $t - 1$) and obtains the following R output:
> fm0 <- lm(gti ~ hosp, data = cd)
> summary(fm0)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.284007 3.460304 6.151 4.84e-08 ***
hosp 0.018346 0.004189 4.380 4.27e-05 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 18.82 on 67 degrees of freedom
F-statistic: 19.18 on 1 and 67 DF, p-value: 4.272e-05
> coeftest(fm0, df = Inf, vcov = vcovHAC)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 21.2840070 6.7613407 3.1479 0.001644 **
hosp 0.0183465 0.0059299 3.0939 0.001976 **

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The second economist runs a regression of $\Delta gti_t = gti_t - gti_{t-1}$ on the change in hospitalizations
$\Delta hosp_t = hosp_t - hosp_{t-1}$ and obtains:
> fmd0 <- lm(diff(gti) ~ diff(hosp), data = cd)
> summary(fmd0)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.26005 1.35401 0.192 0.8483
diff(hosp) 0.02231 0.01230 1.814 0.0742 .

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 11.17 on 66 degrees of freedom
F-statistic: 3.29 on 1 and 66 DF, p-value: 0.07425
> coeftest(fmd0, df = Inf, vcov = vcovHAC)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.260051 1.416218 0.1836 0.8543
diff(hosp) 0.022314 0.015189 1.4690 0.1418

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
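For reference, the two fitted specifications (`fm0` and `fmd0`) can be written side by side. The coefficient and error symbols below are labels introduced here for illustration and do not appear in the exam.

```latex
% Levels specification (fm0): coefficient labels are assumptions
gti_t = \beta_0 + \beta_1 \, hosp_t + u_t

% First-difference specification (fmd0): coefficient labels are assumptions
\Delta gti_t = \delta_0 + \delta_1 \, \Delta hosp_t + v_t ,
\qquad \Delta gti_t = gti_t - gti_{t-1}, \quad \Delta hosp_t = hosp_t - hosp_{t-1}
```

In the output above, the estimate of $\beta_1$ is 0.018346 and the estimate of $\delta_1$ is 0.02231.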
(a) How does the interpretation of the slope coefficient differ across the two economists’ models?
Which of the two interpretations seems more relevant to the economists’ research question?
(b) Which of the two sets of results provides more reliable evidence of the causal effect of news
about hospitalizations on consumer behavior? In answering, be sure to state whether the
effect is significant or not.
The two economists notice that the second spike in the Google Trends index around day 47 coincides
with the announcement by Governor Cuomo that face masks would be mandatory in New York.
They define a dummy variable $D_t$ that takes the value 0 before April 15 and 1 on and after April 15.
The first economist performs a Chow test for a structural break on April 15 and obtains:
> fm1 <- lm(gti ~ hosp + D + D:hosp, data = cd)
> summary(fm1)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.208618 3.084119 3.310 0.00152 **
hosp 0.022826 0.003187 7.162 8.97e-10 ***
D 2.826095 6.360400 0.444 0.65828
hosp:D 0.060502 0.013699 4.417 3.88e-05 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 13.09 on 65 degrees of freedom
F-statistic: 37.69 on 3 and 65 DF, p-value: 3.122e-14
> coeftest(fm1, df = Inf, vcov = vcovHAC)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 10.2086176 1.2444053 8.2036 2.333e-16 ***
hosp 0.0228260 0.0042302 5.3960 6.815e-08 ***
D 2.8260955 5.9497960 0.4750 0.6348
hosp:D 0.0605016 0.0139306 4.3431 1.405e-05 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> waldtest(fm0, fm1, test = "Chisq", vcov = vcovHAC)
Wald test
Model 1: gti ~ hosp
Model 2: gti ~ hosp + D + D:hosp
Res.Df Df Chisq Pr(>Chisq)
1 67
2 65 2 111.88 < 2.2e-16 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(c) Explain what components of this output provide evidence of a structural break in the relation
between search behavior for face masks and hospitalizations at April 15. In answering, be sure
to state the null hypothesis you are testing and whether or not you reject the null hypothesis.
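The regression behind `fm1` and the restriction compared by `waldtest(fm0, fm1, ...)` can be written out as follows. The symbols $\theta_0, \dots, \theta_3$ are labels introduced here for illustration; the exam itself does not name these coefficients.

```latex
% Unrestricted model (fm1), with assumed coefficient labels theta_0..theta_3
gti_t = \theta_0 + \theta_1 \, hosp_t + \theta_2 \, D_t + \theta_3 \, (hosp_t \times D_t) + u_t

% Restricted model (fm0) corresponds to the joint restriction
H_0 : \theta_2 = 0 \ \text{and} \ \theta_3 = 0
\qquad \text{against} \qquad
H_1 : \theta_2 \neq 0 \ \text{or} \ \theta_3 \neq 0

% With two restrictions, the Wald statistic is compared with a chi-squared
% distribution with 2 degrees of freedom (5% critical value approximately 5.99)
W \ \overset{a}{\sim} \ \chi^2_2 \quad \text{under } H_0
```

The `Df` column of the Wald test output reports 2, which matches the two restrictions in $H_0$.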
Question 4. (15 points in total, each part is worth 5 points)
You have been tasked with the following consulting project by a firm. The firm would like individuals
to be remunerated for how they perform. However, the firm is worried that there may be unconscious
bias, through which workers who currently earn high wages may be more likely to earn high wages
in future, irrespective of their performance on the job.
The firm decides to run an experiment to investigate this issue. For an incoming cohort of graduates,
the firm randomly assigns a wage $W_{i1}$ to each individual $i$ for their first year. Each individual’s wages
are then recorded for the subsequent two years. It is hypothesized that wages for the subsequent
two years (t = 2, 3) evolve according to the model
$$W_{it} = \beta_1 W_{i,t-1} + \alpha_i + u_{it} , \tag{10}$$
where $\beta_1 < 1$ is an unknown parameter to be estimated, $\alpha_i$ is an individual fixed effect, and $u_{it}$ is
drawn independently each year.
The firm has given you a balanced panel of wages for years t = 1, 2, 3 for a large cohort of individuals.
(a) Explain whether or not you can estimate the model (10) by a panel regression of $W_{it}$ on $W_{i,t-1}$
using either of our two approaches for panel regression.
Hint: check if any of our assumptions for the fixed effects model are violated. If your answer
is negative, you should provide some reasoning for why the relevant assumption fails.
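(b) You notice that
$$\Delta W_{i3} = \beta_1 \Delta W_{i2} + \Delta u_{i3} , \tag{11}$$
where $\Delta W_{i3} = W_{i3} - W_{i2}$, $\Delta W_{i2} = W_{i2} - W_{i1}$, and $\Delta u_{i3} = u_{i3} - u_{i2}$.
Calculate $\mathrm{Cov}(\Delta W_{i2}, W_{i1})$ and $\mathrm{Cov}(\Delta u_{i3}, W_{i1})$.
(c) Using your answer to (b), propose an estimator of $\beta_1$ and show it is consistent.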
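A simulation can help check intuition about the differenced equation in part (b). The sketch below is only one possible illustration under assumed values: the sample size, the true slope, and the distributions of the fixed effect, the shocks, and the randomly assigned first-year wage are all hypothetical. It compares the OLS slope in the differenced equation with a ratio of sample covariances that uses the first-year wage as an instrument.

```r
# Hypothetical simulation: all numeric values and distributions are assumptions.
set.seed(3)
n     <- 50000
beta1 <- 0.5                        # assumed true persistence parameter
alpha <- rnorm(n)                   # individual fixed effects
W1    <- rnorm(n, mean = 10)        # randomly assigned, independent of alpha
u2    <- rnorm(n)                   # year-2 shock
u3    <- rnorm(n)                   # year-3 shock

W2 <- beta1 * W1 + alpha + u2       # wage equation for t = 2
W3 <- beta1 * W2 + alpha + u3       # wage equation for t = 3

dW2 <- W2 - W1                      # first difference of wages, year 2
dW3 <- W3 - W2                      # first difference of wages, year 3

# OLS slope in the differenced equation: dW2 contains u2, which also
# appears (with a minus sign) in the differenced error u3 - u2.
ols_fd <- cov(dW3, dW2) / var(dW2)

# Ratio of sample covariances using W1 as an instrument for dW2.
iv <- cov(dW3, W1) / cov(dW2, W1)

c(ols_fd = ols_fd, iv = iv)
```

In this simulated design the covariance ratio should land near the assumed value of 0.5, while the OLS slope in differences should fall well below it.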