ECMT1020 Introduction to Econometrics
Semester 1, 2025
Group Assignment
Due: 11.59pm Friday 23 May 2025
Instructions
1. This group assignment counts for 10% of your final grade. You may form a group with any student enrolled in this unit (not necessarily from your tutorial), with a maximum group size of two. The assignment will be marked based on the final group submission, and group members will receive the same mark.
2. Once you have formed a group, make sure to register your group on Canvas (instructions are provided on this Canvas page). Only one group member needs to submit the assignment on behalf of the group. Individual submissions are also permitted.
If you complete the assignment on your own, there is no need to join a group on Canvas.
3. The assignment consists of 12 questions and is worth a total of 40 marks. Marks allocated to each question are indicated. You are expected to attempt all questions.
4. The dataset assigned to your group is provided in the Excel file EAWE#.xlsx, where # corresponds to the last digit of the sum of the last digits of the University of Sydney SIDs of your group members. For example, if student A and student B form a group, and student A’s SID ends in 3 and student B’s SID ends in 8, then 3 + 8 = 11, and the last digit of 11 is 1. Therefore, this group should use dataset EAWE1.xlsx.
5. You must use your assigned dataset to answer the questions. Clearly indicate your dataset number and the SIDs of both group members on the front page of your submission. Use of the incorrect dataset may be treated as a case of Academic Dishonesty.
6. Round all numerical answers to three decimal places if applicable. When asked to “report the estimation results”, include both the relevant Stata commands and output tables in your answer (you may insert the screenshots into your document). A separate Stata do-file is not required.
7. Your answers must be typed; handwritten submissions will not be accepted. Ensure your answers are clear and concise. Lengthy responses that lack focus or understanding will be penalized.
8. Submit a single PDF file named EAWE# SID1 SID2.pdf where # is your assigned dataset number, and SID1 and SID2 are the 9-digit SIDs of the group members. Do not include your names or a cover sheet.
9. The submission is via a file upload under the Canvas module “Assignment”. You may submit up to two times, and only the most recent submission will be counted. Late submission will incur a penalty of 5% of the total 40 marks (i.e., 2 marks) per calendar day. Submissions more than 10 calendar days late will receive a mark of zero. These rules follow Section 7A of the University Assessment Procedures 2011.
10. Regarding the use of AI tools: I will not impose unenforceable restrictions. You are permitted to use AI to assist with this assignment; however, you must clearly declare any such use and explain how it contributed to your work.
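The dataset rule in item 4 and the late penalty in item 9 are simple arithmetic, and can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example SIDs are hypothetical, and the sketch assumes that a student working alone simply uses the last digit of their own SID.

```python
TOTAL_MARKS = 40
PENALTY_PER_DAY = 2   # 5% of the total 40 marks, per calendar day late
MAX_DAYS_LATE = 10    # submissions later than this receive zero


def dataset_number(sids):
    """Last digit of the sum of the last digits of the group's SIDs."""
    return sum(int(sid[-1]) for sid in sids) % 10


def mark_after_penalty(raw_mark, days_late):
    """Apply the late penalty to a raw mark out of 40."""
    if days_late > MAX_DAYS_LATE:
        return 0
    return max(0, raw_mark - PENALTY_PER_DAY * days_late)


# Hypothetical SIDs ending in 3 and 8, as in the example in item 4.
group = ["480000003", "490000008"]
print(f"EAWE{dataset_number(group)}.xlsx")   # 3 + 8 = 11, last digit 1
print(mark_after_penalty(30, 3))             # 30 - 3 * 2
```

Taking the sum modulo 10 is just a compact way of reading off "the last digit of the sum".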
Questions
1. (3 marks) Create the following two scatter plots using the variables in your dataset:
(i) EARNINGS (hourly earnings in dollars) on the y-axis and S (education: years of schooling) on the x-axis.
(ii) EARNINGS (hourly earnings in dollars) on the y-axis and EXP (work experience: years spent working after leaving full-time education) on the x-axis.
Ensure that each plot includes clearly labeled axes. What are the observed values of S in your dataset?
2. (2 marks) Estimate a regression model with EARNINGS as the dependent variable and EXP as the only explanatory variable. Report the estimation results and write down the fitted regression equation.
3. (5 marks) Is the slope coefficient in the fitted model from Question 2 significantly different from zero at the 10% significance level? Explain how you can reach your conclusion based on the regression output. In your explanation, clearly state the null and alternative hypotheses, and describe the three ways to make your decision based on
(i) the test statistic,
(ii) the p-value of the test, or
(iii) the confidence interval for the slope coefficient.
4. (4 marks) What are the confidence intervals for the intercept and slope coefficients in your regression output from Question 2? Explain how these confidence intervals are constructed. What are the center values of these confidence intervals?
5. (4 marks) Estimate another regression model with EARNINGS as the dependent variable and both EXP and S as the explanatory variables. Report the estimation results and write down the fitted regression equation. Carefully interpret the estimated intercept and slope coefficients.
6. (3 marks) Is the slope coefficient on EXP in the fitted model from Question 2 smaller or larger than the slope coefficient on EXP in the fitted model from Question 5? Provide a reasonable explanation for this difference.
7. (4 marks) What are the sample means of S and EXP in your dataset? Define two new variables, SDEV and EXPDEV, as deviations of S and EXP from their respective sample means:
$SDEV = S - \bar{S}$ and $EXPDEV = EXP - \overline{EXP}$, where $\bar{S}$ and $\overline{EXP}$ denote the sample means of S and EXP.
Estimate a regression model with EARNINGS as the dependent variable and both EXPDEV and SDEV as explanatory variables. Report the estimation results and write down the fitted regression equation. Interpret the estimated slope coefficients.
8. (2 marks) How does the regression with “demeaned” explanatory variables in Question 7 improve the interpretation of the fitted intercept compared to the regression in Question 5?
9. (3 marks) The variable TENURE in your dataset represents the number of years an individual has spent working with the current employer. Define a new variable
PREVEXP = EXP − TENURE
to represent prior work experience with previous employers. Also, define LGEARN as the natural logarithm of EARNINGS. Estimate a regression model with LGEARN as the dependent variable and PREVEXP, TENURE, and S as explanatory variables. Report the estimation results and interpret each of the three slope coefficients.
10. (2 marks) What would happen if you include the variable EXP in the regression model from Question 9? Explain why this happens.
11. (2 marks) Based on the fitted relationship between LGEARN and the explanatory variables in Question 9, derive the corresponding fitted relationship between the original dependent variable EARNINGS and the variables PREVEXP, TENURE and S.
12. (6 marks) In your dataset, the variable HOURS represents the number of hours worked per week (a proxy for labor supply) and MARRIED is a dummy variable which takes value 1 if the individual is married, and 0 otherwise. Estimate the following regression model:
$LGHOURS = \beta_1 + \beta_2 LGEARN + \beta_3 (MARRIED \times LGEARN) + \beta_4 S + \beta_5 EXP + u,$
where LGHOURS is the natural logarithm of HOURS, and the other variables are as previously defined. Report the estimation results and carefully interpret the estimates of $\beta_2$ and $\beta_3$. Additionally, write out the fitted relationships between HOURS and the variables EARNINGS, S, and EXP separately for married and unmarried individuals.
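Questions 7, 9 and 12 each define new variables from existing ones. The Python sketch below only illustrates those definitions on a few made-up records; it is not a substitute for the Stata commands and output the assignment asks for. The numbers are invented and do not come from any EAWE dataset, and the name MARRIED_LGEARN for the interaction term is a hypothetical label.

```python
import math
from statistics import mean

# Made-up illustrative records, NOT taken from any EAWE dataset.
rows = [
    {"EARNINGS": 20.0, "S": 12, "EXP": 6.0, "TENURE": 2.0, "HOURS": 40, "MARRIED": 1},
    {"EARNINGS": 15.0, "S": 16, "EXP": 4.0, "TENURE": 4.0, "HOURS": 35, "MARRIED": 0},
    {"EARNINGS": 25.0, "S": 14, "EXP": 8.0, "TENURE": 3.0, "HOURS": 45, "MARRIED": 1},
]

# Sample means used for demeaning (Question 7).
s_bar = mean(r["S"] for r in rows)
exp_bar = mean(r["EXP"] for r in rows)

for r in rows:
    r["SDEV"] = r["S"] - s_bar                     # deviation from mean schooling
    r["EXPDEV"] = r["EXP"] - exp_bar               # deviation from mean experience
    r["PREVEXP"] = r["EXP"] - r["TENURE"]          # experience with previous employers
    r["LGEARN"] = math.log(r["EARNINGS"])          # natural log of hourly earnings
    r["LGHOURS"] = math.log(r["HOURS"])            # natural log of weekly hours
    r["MARRIED_LGEARN"] = r["MARRIED"] * r["LGEARN"]  # interaction term (hypothetical name)

for r in rows:
    print(r["SDEV"], r["EXPDEV"], r["PREVEXP"], round(r["LGEARN"], 3))
```

By construction the demeaned variables average to zero over the sample, and the interaction term equals LGEARN for married individuals and zero otherwise.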