
Intermediate Econometrics: Assignment 1

Simple Regression Model

National School of Development

March 12, 2024

1 Theoretical Deduction

Consider the simple linear regression model:

$$y = \beta_0 + \beta_1 x + u \tag{1}$$

where $y$ is the dependent variable, $x$ is the independent variable, and $u$ is the error term, which includes the unobservable factors. $\beta_0$ is the intercept parameter and $\beta_1$ is the slope parameter.

Q1. What is the Chinese translation of the concept error term?

We want to estimate the parameters $\beta_0$ and $\beta_1$ in this model.

1.1 Method of Moments

Let’s obtain the explicit form of the estimators of these two parameters. The main assumption we need is $E(u \mid x) = 0$.

Q2. Let $u$ and $x$ be the random variables in (1). Show that:

$$E(u \mid x) = 0 \;\Rightarrow\; E(u) = 0. \tag{2}$$

Q3. True or False: If $\mathrm{Cov}(x, u) = 0$, then $E(u \mid x) = 0$ for every $x$.

Using condition (2), we can obtain another condition.

Q4. Show that under condition (2), we have:

$$\mathrm{Cov}(x, u) = E(xu) = 0. \tag{3}$$

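Equation (1) is the population version of the regression. Let’s rewrite it in its sample version:

$$y_i = \beta_0 + \beta_1 x_i + u_i, \tag{4}$$

where $\{(x_i, y_i) : i = 1, \dots, n\}$ is a sample of size $n$.

Rewrite condition (2) and the second equality of (3) in sample form:

$$\frac{1}{n}\sum_{i=1}^{n} u_i = 0 \tag{5}$$

$$\frac{1}{n}\sum_{i=1}^{n} x_i u_i = 0 \tag{6}$$

We call equations (2) and (3) the population moment conditions, and equations (5) and (6) the sample moment conditions.

Rewrite equation (4) as

$$u_i = y_i - \beta_0 - \beta_1 x_i. \tag{7}$$

Plugging this into (5) and (6), we obtain

$$\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) &= 0, \\
\frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) &= 0.
\end{aligned} \tag{8}$$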

The explicit form of an estimator is an algebraic expression in the observed data tuples $(x_i, y_i)$, $i = 1, 2, \dots, n$. We use a hat to denote estimators: $\hat\beta_1$ is the estimator of $\beta_1$, $\hat\beta_0$ is the estimator of $\beta_0$, and so on.

Q5. Define $\bar y = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$. Use the sample moment conditions to show that:

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x \tag{9}$$

Q6. Use the sample moment conditions and equation (9) to show that:

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2} \tag{10}$$

1.2 Ordinary Least Squares

Now forget about equations (9) and (10). Rather than MoM, we want to use another method, OLS, to rederive the explicit form of the estimators of the parameters $\beta_0$ and $\beta_1$. Define the fitted value of $y_i$ as

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i. \tag{11}$$

Note that at this point we do not yet know how to express $\hat\beta_0$ and $\hat\beta_1$.

Define the residual of regression (1) to be

$$\hat u_i = y_i - \hat y_i. \tag{12}$$

Q7. What is the Chinese translation of the concept residual?

Q8. Can you draw a scatter plot with a fitted line, and mark $\hat y_i$ and $\hat u_i$ on it?

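Define the Residual Sum of Squares (SSR) of the regression to be

$$\mathrm{SSR} \equiv \sum_{i=1}^{n} \hat u_i^2. \tag{13}$$

So we have

$$\mathrm{SSR} = \sum_{i=1}^{n} \hat u_i^2 \overset{(12)}{=} \sum_{i=1}^{n} (y_i - \hat y_i)^2 \overset{(11)}{=} \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2. \tag{14}$$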

Define the Ordinary Least Squares method as choosing $\hat\beta_0$ and $\hat\beta_1$ to minimize SSR. This is an optimization problem, and we can solve it by differentiation. Assume the function to be optimized, $\mathrm{SSR}(\hat\beta_0, \hat\beta_1)$, is well-defined so that we can use the first order conditions (F.O.C.s) to obtain the explicit estimators.

The optimization problem can be written as

$$\min_{b_0, b_1} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \tag{15}$$

Q9. Write out the first order conditions of this optimization problem. (For simplicity, the teaching assistants have listed them here for you! But you should know how to derive them by differentiation.) Solve for the OLS estimators.

Soln. The F.O.C.s of this problem are

(w.r.t. $b_0$) $\sum_{i=1}^{n} [-2(y_i - b_0 - b_1 x_i)] = 0$

(w.r.t. $b_1$) $\sum_{i=1}^{n} [-2(y_i - b_0 - b_1 x_i)x_i] = 0$

where "w.r.t." stands for "with respect to". Therefore, the OLS estimators are ...

Q10. Compare your answer with equations (9) and (10). Are they the same? Did you use the assumption E(u|x) = 0 to obtain OLS estimators?
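As a numerical sanity check for Q10, the following sketch computes the intercept and slope by hand from formulas (9) and (10) and compares them with the output of Stata's built-in `regress`. It is only an illustration under stated assumptions: the simulated data borrow the data-generating process from the Stata exercises in Section 2, and the seed, the helper variables `num` and `den`, and the scalar names `xbar`, `ybar`, `s_xy`, `s_xx`, `b0`, `b1` are hypothetical choices, not part of the assignment.

```stata
clear
set seed 20240312            // arbitrary seed, only for reproducibility
set obs 100
gen z = rnormal()
gen x = z + 0.4*rnormal()
gen y = z + 0.4*rnormal()

* Sample means of x and y
quietly summarize x
scalar xbar = r(mean)
quietly summarize y
scalar ybar = r(mean)

* Numerator and denominator of formula (10), stored in double precision
gen double num = (x - xbar)*(y - ybar)
gen double den = (x - xbar)^2
quietly summarize num
scalar s_xy = r(sum)
quietly summarize den
scalar s_xx = r(sum)

scalar b1 = s_xy/s_xx        // formula (10)
scalar b0 = ybar - b1*xbar   // formula (9)

* Compare with the built-in OLS routine
quietly regress y x
display "by hand: b0 = " b0 "   b1 = " b1
display "regress: b0 = " _b[_cons] "   b1 = " _b[x]
assert reldif(b1, _b[x]) < 1e-6
assert reldif(b0, _b[_cons]) < 1e-6
```

If both `assert` lines pass silently, the hand-computed values agree with `regress` up to floating-point rounding. A numerical match on one simulated sample is not a proof; the algebraic argument asked for in Q9 and Q10 is still required.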

Q11. 1. Define $x_i' = a x_i + b$ and regress $y_i$ on $x_i'$ by $y_i = \beta_0' + \beta_1' x_i' + u_i'$. What is the estimate of $\beta_1'$?

2. Define $y_i' = a y_i + b$ and regress $y_i'$ on $x_i$ by $y_i' = \beta_0' + \beta_1' x_i + u_i'$. What is the estimate of $\beta_1'$?

3. Replace $y_i$ with $\ln(y_i)$. Assume $y_i > 0$ for $i = 1, 2, \dots$. Regress $\ln(y_i)$ on $x_i$ by $\ln(y_i) = \beta_0' + \beta_1' x_i + u_i'$. What is the economic interpretation of $\beta_1'$?

4. Replace $y_i$ with $\ln(y_i)$ and replace $x_i$ with $\ln(x_i)$. Assume $y_i > 0$ and $x_i > 0$ for $i = 1, 2, \dots$. Regress $\ln(y_i)$ on $\ln(x_i)$ by $\ln(y_i) = \beta_0' + \beta_1' \ln(x_i) + u_i'$. What is the economic interpretation of $\beta_1'$?

1.3 Some algebraic properties of simple linear regression

Now, define zero conditional mean condition to be

E(u|x) = 0.

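Recall the OLS estimators of simple linear regression. Use the data $\{(x_i, y_i) : i = 1, \dots, n\}$ to fit the model and obtain the estimates

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x.$$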

Recall that, when deriving the OLS estimator, we did not assume zero conditional mean (on the contrary, at the start of deriving MoM estimators, we made this assumption). The assumptions we need for deriving OLS estimators were trivial:

(i) The function to be optimized in the optimization problem (15) should be well-defined;

(ii) The values $\{x_i : i = 1, \dots, n\}$ are not all equal, so that $\sum_{i=1}^{n} (x_i - \bar x)^2 > 0$.

Also, while deriving the algebraic properties of the OLS estimator, we do not need the zero conditional mean condition either.

Q12. Using the definition of the OLS residual $\hat u_i$, show that:

$$\sum_{i=1}^{n} \hat u_i = 0. \tag{18}$$

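Q13. Define the sample covariance estimator between the regressor $x_i$ and the OLS residual $\hat u_i$ to be

$$\begin{aligned}
\widehat{\mathrm{Cov}}(x_i, \hat u_i) &\equiv \frac{1}{n-1}\sum_{i=1}^{n} x_i \hat u_i - \frac{n}{n-1}\Big(\frac{1}{n}\sum_{i=1}^{n} x_i\Big)\Big(\frac{1}{n}\sum_{i=1}^{n} \hat u_i\Big) \\
&\approx \frac{1}{n}\sum_{i=1}^{n} x_i \hat u_i - \Big(\frac{1}{n}\sum_{i=1}^{n} x_i\Big)\Big(\frac{1}{n}\sum_{i=1}^{n} \hat u_i\Big)
\end{aligned}$$

Using the F.O.C. w.r.t. $b_1$ of the optimization problem (15), together with conclusion (18) obtained just above, show that:

$$\widehat{\mathrm{Cov}}(x_i, \hat u_i) = 0. \tag{19}$$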

Q14. (Now trivial) Show that:

$$\sum_{i=1}^{n} x_i \hat u_i = 0. \tag{20}$$

Equations (18) and (20) are important properties and will be needed soon.

Q15. Define $\bar{\hat y} = \frac{1}{n}\sum_{i=1}^{n} \hat y_i$. Show that:

$$\bar{\hat y} = \bar y. \tag{21}$$
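The algebraic properties (18), (20) and (21) can also be checked numerically. The sketch below is a minimal illustration, assuming simulated data from the same data-generating process as the Stata exercises in Section 2; the seed and the variable names `yhat`, `uhat`, `xu` and the scalar `yhat_mean` are hypothetical. It does not replace the proofs asked for in Q12 to Q15.

```stata
clear
set seed 20240312                // arbitrary seed, only for reproducibility
set obs 100
gen z = rnormal()
gen x = z + 0.4*rnormal()
gen y = z + 0.4*rnormal()

quietly regress y x
predict double yhat, xb          // fitted values
predict double uhat, residuals   // OLS residuals

* (18): the residuals sum to zero
quietly summarize uhat
display "sum of uhat          = " r(sum)
assert abs(r(sum)) < 1e-8

* (20): the residuals are orthogonal to the regressor
gen double xu = x*uhat
quietly summarize xu
display "sum of x*uhat        = " r(sum)
assert abs(r(sum)) < 1e-8

* (21): the fitted values have the same mean as y
quietly summarize yhat
scalar yhat_mean = r(mean)
quietly summarize y
display "mean(yhat) - mean(y) = " yhat_mean - r(mean)
assert abs(yhat_mean - r(mean)) < 1e-8
```

The displayed values will generally not be exactly zero; they should be zero only up to floating-point rounding, which is why the checks use a small tolerance.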

Define

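$$l_{xx} = \sum_{i=1}^{n} (x_i - \bar x)^2, \quad l_{yy} = \sum_{i=1}^{n} (y_i - \bar y)^2, \quad l_{xy} = \sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y).$$

So we have

$$\hat\beta_1 = \frac{l_{xy}}{l_{xx}}.$$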

Q16. Show that

$$l_{xy} = \sum_{i=1}^{n} (x_i - \bar x) y_i = \sum_{i=1}^{n} x_i (y_i - \bar y).$$

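Q17. Define the (Pearson) correlation coefficient

$$\mathrm{Corr}(x_i, y_i) = \frac{\mathrm{Cov}(x_i, y_i)}{\sqrt{\mathrm{Var}(x_i)}\sqrt{\mathrm{Var}(y_i)}}.$$

Define the sample correlation coefficient

$$\widehat{\mathrm{Corr}}(x_i, y_i) = \frac{l_{xy}}{\sqrt{l_{xx}\, l_{yy}}}.$$

Define R-squared to be

$$R^2 = \frac{\sum_{i=1}^{n} (\hat y_i - \bar y)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2}.$$

We want to show that:

$$R^2 = \big(\widehat{\mathrm{Corr}}(x_i, y_i)\big)^2. \tag{23}$$

Sketchy Hints. 1.

$$\big(\widehat{\mathrm{Corr}}(x_i, y_i)\big)^2 = \frac{\big[\sum_{i=1}^{n} (y_i - \bar y)(\hat y_i - \bar{\hat y})\big]^2}{\sum_{i=1}^{n} (y_i - \bar y)^2 \, \sum_{i=1}^{n} (\hat y_i - \bar{\hat y})^2}$$

2. Focus on the numerator:

$$\Big[\sum_{i=1}^{n} (y_i - \bar y)(\hat y_i - \bar{\hat y})\Big]^2.$$

Apply an identity transformation (add and subtract $\hat y_i$):

$$\Big[\sum_{i=1}^{n} (y_i - \hat y_i + \hat y_i - \bar y)(\hat y_i - \bar{\hat y})\Big]^2.$$

3. Use conclusions (21), (18), and (20) to show that the numerator collapses to

$$\Big[\sum_{i=1}^{n} (\hat y_i - \bar y)^2\Big]^2.$$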

Q18. 1. Fit another linear regression

$$x_i = \delta_0 + \delta_1 y_i + e_i. \tag{24}$$

Show that $\hat\delta_1 = l_{xy}/l_{yy}$, where $\hat\delta_1$ is the OLS estimate of $\delta_1$.

2. Define the R-squared of regression (24) to be $R^{2\prime}$. $R^2$ is the R-squared in (23). Show that:

$$R^2 = R^{2\prime} = \hat\beta_1 \hat\delta_1.$$
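Before attempting the proofs in Q17 and Q18, it may help to see the claimed identities hold on one simulated sample. The sketch below is only an illustration under stated assumptions: the data-generating process is borrowed from the Stata exercises in Section 2, and the seed and the scalar names `b1`, `d1`, `r2_yx`, `r2_xy`, `rho2` are hypothetical.

```stata
clear
set seed 20240312          // arbitrary seed, only for reproducibility
set obs 100
gen z = rnormal()
gen x = z + 0.4*rnormal()
gen y = z + 0.4*rnormal()

* Regression of y on x: slope and R-squared
quietly regress y x
scalar b1    = _b[x]
scalar r2_yx = e(r2)

* Inverse regression (24) of x on y: slope and R-squared
quietly regress x y
scalar d1    = _b[y]
scalar r2_xy = e(r2)

* Squared sample correlation between x and y
quietly correlate x y
scalar rho2 = r(rho)^2

display "R2 (y on x)  = " r2_yx
display "R2 (x on y)  = " r2_xy
display "b1 * d1      = " b1*d1
display "corr(x,y)^2  = " rho2
assert reldif(r2_yx, r2_xy) < 1e-8
assert reldif(r2_yx, b1*d1) < 1e-8
assert reldif(r2_yx, rho2)  < 1e-8
```

All four displayed numbers should coincide up to rounding. The agreement on one sample only suggests the result; the algebra in the hints is what establishes it.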

1.4 Is the OLS estimator appropriate?

Now we have obtained explicit estimators for the simple linear regression. Note that, so far, $\hat\beta_0$ and $\hat\beta_1$ are just two numbers. They are obtained by applying "+ − × ÷" operations to the observed data points, and by themselves they carry no statistical implications about the real world.

To make use of the estimators, we need to derive some of their properties. In this section we should not treat $\hat\beta_0$ and $\hat\beta_1$ simply as numbers; we will regard them as functions of random variables.

We need to know how appropriate the estimators are. Three properties are used to define the appropriateness:

1. Unbiasedness

2. Consistency

3. Efficiency

You may find the definition of unbiasedness in Slides Chapter 3 and the definition of efficiency (denoted as "best") in Slides Chapter 4.

Q19. Under the assumption $E(u_i \mid x_i) = 0$, show that the OLS estimator $\hat\beta_1$ is unbiased. That is, show that

$$E(\hat\beta_1) = \beta_1.$$

(Hint: You may refer to Slides Chapter 3, p. 68.)

2 Stata Exercises

1. Inverse regression.

(1) Run the following code in Stata.

    set obs 100
    gen z = rnormal()
    gen u1 = rnormal()
    gen u2 = rnormal()
    gen x = z + 0.4*u1
    gen y = z + 0.4*u2
    scatter y x

Now you have generated 100 pairs. You can see that y and x are distributed near the line y = x.

(2) Regress y on x.

    reg y x

Interpret the results. (If x increases by one unit, how much of an increase in y do you observe?)

(3) Regress x on y.

    reg x y

Interpret the results. (If y increases by one unit, how much of an increase in x do you observe?)

(4) Compare these two slope coefficients. Are these results consistent? How do you interpret the results? You can change the sample size and run the regressions again to confirm your claim.
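One possible way to rerun the exercise at several sample sizes is sketched below. It is a minimal sketch rather than a required solution: the sample sizes 100, 1000 and 10000, the seed, and the scalar names `b_yx` and `d_xy` are arbitrary assumptions chosen for illustration.

```stata
foreach n in 100 1000 10000 {
    clear
    set seed 20240312            // arbitrary seed, only for reproducibility
    set obs `n'
    gen z  = rnormal()
    gen u1 = rnormal()
    gen u2 = rnormal()
    gen x  = z + 0.4*u1
    gen y  = z + 0.4*u2

    * Slope from regressing y on x
    quietly reg y x
    scalar b_yx = _b[x]

    * Slope from the inverse regression of x on y
    quietly reg x y
    scalar d_xy = _b[y]

    display "n = `n'   slope(y on x) = " b_yx "   slope(x on y) = " d_xy "   product = " b_yx*d_xy
}
```

Printing the product of the two slopes next to them makes it easy to relate what you observe here to the result in Q18.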

2. The following questions aim to provide a taste of the estimation and inference involved in a multiple regression research design. You may leave this part blank in your submission and come back to it after March 19, 2024, when we will learn multiple regression (inference, part 1). No points will be deducted for missing answers.

(1) Adventure (male) and Angel (female) started a business that helps others illegally cross the border and sneak into BD Island. They took advantage of loopholes in the access system of BD Island, obtained multiple fake identity cards, and planned to smuggle a group of wanderers to BD Island this Sunday. The security department of BD Island learned of this in advance and decided to strengthen its vigilance. Now the A-A team is deciding whether to continue the smuggling operation. Please define the relevant variables and use a regression model to describe the A-A team’s action strategy.

(Hints: What is the dependent variable? What are potential independent variables?)

(2) Now re-consider the previous setting from a provincial perspective. Please define relevant variables and use a regression model to describe potential factors contributing to provincial economic crime rates.

(Hints: The dependent variable is now the provincial economic crime rate. What are potential independent variables? What are the expected signs of the coefficients?)
