ESB - Graded Exercise 2019


Question 1 (25 points)
Answer the following questions. Be concise and to the point.
(a) Suppose you estimate the gender difference in returns to education using the 
following model:
log(𝑤𝑎𝑔𝑒) = (𝛽0 + 𝛿0𝑓𝑒𝑚𝑎𝑙𝑒) + (𝛽1 + 𝛿1𝑓𝑒𝑚𝑎𝑙𝑒)𝑒𝑑𝑢𝑐 + 𝑢
where wage is the hourly wage, female is a gender dummy, which is =1 if the 
individual is female, and educ is the number of years of education. Provide an 
interpretation if 𝛿0 < 0 and 𝛿1 < 0. [5 points]
(b) Someone asserts that expected wages are the same for men and women who have the 
same level of education. Referring to the model in part (a), what would be your null 
hypothesis to test this? How would you test it? [5 points]
(c) Suppose your estimation returns the following values for the model from part 
(a): 𝛿̂0 = −0.1, 𝛿̂1 = −0.01. Based on this, what is the expected wage differential 
between a man and a woman with 10 years of schooling? [5 points]
(d) Suppose you find in addition that 𝛽̂1 = 0.01. What does this imply about the effect of 5 
more years of education on the expected wage of a woman? [5 points]
(e) Suppose we have estimated the following wage equation
𝑊 = 10 + 10𝐴𝐺𝐸 − 0.1𝐴𝐺𝐸² + 𝜖
Based on this, at what age would we expect the highest wage? [5 points]
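For the interaction model in parts (a) to (d), the following R sketch may help fix ideas. It runs on simulated data only: the data frame sim, the sample size and the coefficient values used to generate the data are assumptions made for illustration, and gap_at() is a hypothetical helper. It fits the model with a separate intercept and slope for women, compares it with a restricted model that drops both female terms, and evaluates the fitted gap in log wage at a chosen level of education.

```r
# Simulated data; all coefficient values below are illustrative assumptions.
set.seed(1)
n <- 2000
female <- rbinom(n, 1, 0.5)
educ <- sample(8:18, n, replace = TRUE)
lwage <- 1.0 + 0.08 * educ - 0.10 * female - 0.01 * female * educ +
  rnorm(n, sd = 0.3)
sim <- data.frame(lwage, female, educ)

# Unrestricted model: separate intercept and slope for women.
full <- lm(lwage ~ female * educ, data = sim)
# Restricted model: imposes delta0 = 0 and delta1 = 0.
restricted <- lm(lwage ~ educ, data = sim)
# F-test of the two restrictions taken jointly.
joint_test <- anova(restricted, full)
print(coef(full))
print(joint_test)

# Hypothetical helper: fitted gap in log wage (female minus male)
# at a given number of years of education.
gap_at <- function(fit, years) {
  b <- coef(fit)
  unname(b["female"] + b["female:educ"] * years)
}
print(gap_at(full, 10))
```

Because the dependent variable is in logs, the value returned by gap_at() is a difference in log wages, which is only approximately a proportional difference in wages.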
Question 2 (25 points)
Consider the dataset ets_thres_final.dta. It contains emission figures (lnco2 = log of CO2 
emissions) for a sample of firms regulated by the European Union Emissions Trading System 
(EU ETS) for the years 2005 to 2017. The firm identifiers have gone missing 
from the dataset. Note that an Emissions Trading System requires firms to buy permits for 
every unit of CO2 they emit. By restricting the total number of permits that are issued, 
governments can control the total amount of emissions while allowing firms to trade permits 
freely, so that permits end up with the businesses that find it hardest to reduce emissions. 
In the early days of the EU ETS (which started in 2005), permits were given to firms for free. 
This changed from 2013 onwards, when free permits were given only to certain firms and sectors 
that were deemed at risk from foreign competition. The variable free indicates those firms in 
the dataset. According to economic theory, the method of permit allocation should have no 
effect on the eventual emissions of firms (the Independence hypothesis): firms that have been 
given free permits still have an incentive to reduce emissions, as that frees up permits to sell 
in the permit market. 
(a) Examine this hypothesis by running a regression of lnco2 on the free variable. Report 
what you find. [5 points]
(b) Provide an interpretation of the regression coefficient along with a discussion of the 
implications of your result. [5 points]
(c) The variable period is a categorical variable equal to 1 for observations from before 
2013 and equal to 2 for observations from year 2013 onward. Convert it into a factor 
variable and run a regression of lnco2 on period. Provide an interpretation of the 
estimated coefficients. [5 points]
(d) Would you say your results in part (a) provide a causal estimate of the effect of free 
permits? [5 points]
(e) With the data at hand can you propose and implement an alternative regression 
approach that might address some of the concerns raised in (d)? If yes, implement this 
regression and discuss its results. What does the result tell you about the 
Independence hypothesis discussed in the introduction? [5 points]
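The factor conversion mentioned in part (c) can be sketched in R as follows. The sketch uses simulated data rather than ets_thres_final.dta: the data frame sim, the column period_f, the level labels and all numbers used to generate lnco2 are assumptions for illustration. It shows the mechanics only, namely that a regression on a single two-level factor reproduces the two group means.

```r
# Simulated stand-in for the emissions data; all numbers are assumptions.
set.seed(2)
n <- 1000
sim <- data.frame(
  period = sample(1:2, n, replace = TRUE),
  free = rbinom(n, 1, 0.4)
)
sim$lnco2 <- 10 + 0.5 * sim$free - 0.2 * (sim$period == 2) + rnorm(n)

# Convert the numeric code into a factor so lm() treats it as categorical.
sim$period_f <- factor(sim$period, levels = c(1, 2),
                       labels = c("pre2013", "from2013"))
fit_period <- lm(lnco2 ~ period_f, data = sim)
print(coef(fit_period))

# With one two-level factor as the only regressor, the intercept is the
# mean of the baseline group and the slope is the difference in means.
group_means <- tapply(sim$lnco2, sim$period_f, mean)
print(group_means)
```

The first factor level ("pre2013" here) is the baseline, so the reported coefficient is measured relative to it.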
Question 3 (25 points)
For this question use the dataset hals1prep.dta, containing data from the UK Health and 
Lifestyle Survey (1984-85). In this survey, several thousand people in the UK were 
asked questions about their health and lifestyle.
(a) The variable bmi records the body mass index (BMI) of the respondents. The BMI 
uses the weight and height to work out whether a weight is healthy or if someone is 
overweight. A value between 18.5 and 24.9 indicates a healthy weight. Based on the 
information below, which region of the UK had – on average – the most overweight 
population? Run a regression of BMI on regional categories (recorded in the variable 
region). Use this to figure out which UK regions are on average outside the healthy 
BMI range. [5 points]
(b) The variable ownh_num records responses to the question “Would you say that for 
someone of your age your own health in general is…” where respondents had the following 
response options:
• Excellent (1)
• Good (2)
• Fair (3)
• Poor (4)
The numbers in brackets indicate how these options were recorded in the ownh_num 
variable. Run a regression of ownh_num on bmi and provide a discussion of what 
you find. Is it in line with your expectations? [5 points]
(c) Can you think of at least two reasons why the estimate in (b) does not provide a correct 
representation of the causal relationship between bmi and health? [5 points]
(d) The dataset includes several additional control variables. These include 
• incomeB – a categorical variable representing income brackets, where “1” 
represents the lowest and “12” the highest income group.
• agyrs – a variable recording the age of the participant
Include those in the regression of reported health from (b). Discuss what the output 
suggests about the relationships between health and age, and health and income. Are 
they in line with what you would have expected? In each case can you provide an 
explanation for the kind of relationship found?
Also discuss the usefulness of including both the age and income controls for 
estimating the causal effect of BMI. In each case discuss at least one reason for and 
one reason against including these controls. [5 points]
(e) Consider the R output below. It builds a new dataframe as a transformation of the 
dataframe halsx, which holds the health survey data. ownh_num is defined as in (b). Can you 
provide an interpretation for the coefficients of the linear regression reported at the 
end of the R output? Note that the rbind() command combines dataframes vertically. [5 
points]
library(haven)  # provides read_dta()
halsx=read_dta("../data/hals1prep.dta")
labels=c("excellent", "good", "fair", "poor")
for(i in 1:4){
  fr=halsx
  fr['dum']=fr$ownh_num==i
  fr['label']=labels[i]
  if(i==1){
    longframe=fr
  }
  else {
    longframe=rbind(longframe,fr)
  }
  print(nrow(longframe))
}
## [1] 8971
## [1] 17942
## [1] 26913
## [1] 35884
summary(lm(dum~label,longframe))
## 
## Call:
## lm(formula = dum ~ label, data = longframe)
## 
## Residuals:
## Min 1Q Median 3Q Max 
## -0.50864 -0.23141 -0.20622 0.08254 0.94627 
## 
## Coefficients:
## Estimate Std. Error t value Pr(>|t|) 
## (Intercept) 0.206220 0.004231 48.74 < 2e-16 ***
## labelfair 0.025192 0.005984 4.21 2.56e-05 ***
## labelgood 0.302419 0.005984 50.54 < 2e-16 ***
## labelpoor -0.152491 0.005984 -25.48 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4007 on 35880 degrees of freedom
## Multiple R-squared: 0.1436, Adjusted R-squared: 0.1435 
## F-statistic: 2005 on 3 and 35880 DF, p-value: < 2.2e-16
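The stacking step in the output above can be tried on a small scale with the following R sketch. The data frame toy and its ten responses are made up for illustration, and lapply() with do.call(rbind, ...) is swapped in for the explicit loop. The sketch compares the regression coefficients with the share of responses in each category.

```r
# Toy responses coded 1-4 as in ownh_num; these values are made up.
toy <- data.frame(ownh_num = c(1, 2, 2, 2, 3, 3, 4, 2, 1, 2))
labels <- c("excellent", "good", "fair", "poor")

# Stack one copy of the data per category, flagging rows in that category.
pieces <- lapply(1:4, function(i) {
  fr <- toy
  fr$dum <- as.numeric(fr$ownh_num == i)
  fr$label <- labels[i]
  fr
})
longframe <- do.call(rbind, pieces)

# label is a character column, so lm() orders its levels alphabetically
# and uses "excellent" as the baseline category.
fit <- lm(dum ~ label, data = longframe)

# Share of responses in each category, for comparison.
shares <- sapply(1:4, function(i) mean(toy$ownh_num == i))
names(shares) <- labels
print(coef(fit))
print(shares)
```

In this toy case the intercept matches the share of the baseline category, and each remaining coefficient matches the difference between that category's share and the baseline share.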
Question 4 (25 points)
Air pollution has been shown to have a variety of adverse health effects. Recently, 
researchers have also started to investigate other negative effects. Below we report regression 
tables from a study that investigates a link between air pollution and car accidents.
(a) Can you suggest a causal mechanism that might explain why air pollution could have 
an effect on car accidents? [5 points]
(b) Table 3 below, extracted from an academic paper, reports various regressions of the 
log number of accidents per day across geographic grid cells for the UK over a period 
from 2009 to 2014. Column 6 provides a simple OLS regression of accidents on 
pollution concentration (measured in micrograms per cubic metre of PM). Can you 
think of reasons why this might not be a valid estimate of the causal impact? [5 
points]
(c) Column 7 of Table 3 in sub-question (b) repeats the same regression including various 
variables measuring weather conditions as well as region interacted with year, month 
and day of the week fixed effects/dummies. Would you say this provides a better 
estimate of the causal effect of pollution? Could it also lead to a worse estimate? [5 
points]
(d) The study proposes an instrument for pollution derived from a weather phenomenon 
known as temperature inversion. Temperature inversion occurs from time to time 
when a layer of warmer air sits on top of colder air nearer to the ground. As a 
consequence, pollution is trapped near the ground and cannot easily escape. Thus, all 
else equal, pollution will be more severe near the ground when this happens. 
Meteorological studies suggest that the phenomenon is driven by wider movements in 
the atmosphere and crucially is not itself driven by local pollution. Table 2 reports 
regressions of the pollution variable from Table 3 on a binary variable that is equal to 
1 if a temperature inversion is occurring in a particular area at a particular time. 
Discuss what this table is telling us. [5 points]
(e) Columns 1 to 3 of Table 3 in sub-question (b) report two-stage least squares (2SLS) 
regressions using the temperature inversion as an instrument. Discuss whether this provides 
a better estimate of the causal effect of pollution on accidents. Can you comment on the 
relative size of the coefficients comparing columns 1 and 6? Are they in line with 
what you would expect? [5 points]
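The mechanics of the first stage and of two-stage least squares described in parts (d) and (e) can be sketched in R on simulated data. Everything in the sketch is an assumption for illustration: the variable names, the sample size, the true effect of 0.05, and in particular the choice of an unobserved confounder (traffic) that pushes the OLS estimate upward. The direction of the OLS bias in the actual study may differ. The two stages are run by hand with lm() rather than with a dedicated IV routine.

```r
# Simulated data; every number below is an assumption for illustration.
set.seed(4)
n <- 5000
inversion <- rbinom(n, 1, 0.3)   # instrument: 1 if an inversion occurs
traffic <- rnorm(n)              # unobserved confounder
pollution <- 10 + 3 * inversion + 2 * traffic + rnorm(n)
log_acc <- 1 + 0.05 * pollution + 0.3 * traffic + rnorm(n, sd = 0.5)

# Naive OLS picks up the confounder and overstates the effect here.
ols <- coef(lm(log_acc ~ pollution))["pollution"]

# First stage: does the instrument shift pollution?
first_stage <- lm(pollution ~ inversion)
# Second stage: replace pollution by its first-stage fitted values.
pollution_hat <- fitted(first_stage)
iv <- coef(lm(log_acc ~ pollution_hat))["pollution_hat"]

# With one binary instrument and no controls, 2SLS equals the Wald ratio:
# the difference in mean outcomes over the difference in mean pollution.
wald <- (mean(log_acc[inversion == 1]) - mean(log_acc[inversion == 0])) /
  (mean(pollution[inversion == 1]) - mean(pollution[inversion == 0]))

print(coef(first_stage))
print(c(ols = unname(ols), iv = unname(iv), wald = wald))
```

The point estimate from the manual second stage is the 2SLS estimate, but the standard errors that lm() reports for it are not valid, because they ignore that pollution_hat is itself estimated.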