ECON30025/ECOM90020 Assignment 1

Assignment 1, 2022

Due 11:59 pm Monday April 11, 2022

This assignment is worth 20% of your final grade for those in ECON30025.

This assignment is worth 25% of your final grade for those in ECOM90020.

Make sure to include the coversheet with your answers. Read the instructions on the coversheet. Try to keep your answers short and clear. Please submit copies of all the programs. Write no more than 5/8 pages of text (not including the programs) and cut and paste any results into a Word document. The program code should be included as an appendix. When writing code, remember to add comments to the programs so that they look like ones you would write for someone else to use. One method for saving space is to paste the code and results as a picture and then reduce the size. All assignments are to be submitted as PDFs. The exam in this subject will be in a similar form.

It is not enough just to provide the computer results; you will be graded on your interpretation of what you find. If you are in doubt as to a particular definition or question, state your assumption and move on.

There are 4 parts to this assignment. The assessment for this subject differs depending on your enrolment. All students are to submit answers to questions in parts I and II.

Students enrolled in ECON30025 are to answer only the non-starred questions in part III (parts a to e) and not part IV.

Students enrolled in ECOM90020 are also to answer all questions in part III (including the starred ones) and the question in part IV.

J. Hirschberg ECON30025/ECOM90020

Part I. (10pts)

1. (4 pts) List all the errors in the following code. Fix them and run the code.

DATA CLASS;

INPUT NAME $ SEX AGE HEIGHT WEIGHT;

CAROL F 14 62.8 102.5

HENRY 14 63.S 102.5

JAMES M 12 57.3 B3.o

ALFRED M 14 69.Ø 112.5

ROBERT M 12 64.8 128:0

RONALD M 15 67.0 133.0.

ALICE F 13 56.5 84.0

BARBARA F 13 65.3 98.0

JEFFREY M 13 62.5 -84.0

JOHN M 12 59.0 99.5

JOYCE F 11 51.3 50.5

RUN

2. (6pts) Answer the following questions.

(a)(2pt) Briefly explain what the following code does after the code from part 1.

DATA CLASS1;

SET CLASS;

BY AGE WEIGHT;

IF ?? THEN OUTPUT;

RUN;

(b)(1pt) Can the code in I.2a immediately follow the code in I.1?

(c)(1pt) Replace the ?? in this code so that the CLASS1 data set contains the heaviest student in each age group. Print out CLASS1.

(d)(1pt) Generate a new data set called CLASS2 that computes the Body-Mass Index (BMI), where $\mathrm{BMI} = 703\,(\text{weight}/\text{height}^2)$, for all students in CLASS.

(e)(1pt) Create another data set called CLASS3 of the students in each age group with the lowest

BMI and print it out.

Part II. (5 pts) Linear Algebra

Consider a system of linear equations defined as:

1234

1 2 34

34 2

3 4 3 1

2 3 3 -9

10 2 4 5 3

5 2

2 6 5 14

xxxx

x x xx

xx x

x x xx

+++=

−+ = −

+ =

+ + −+

−

1.(3 pts) Define the matrix A, and the vectors b and x where this system of equations is

equivalent to the expression: Ax = b . Then modify the IML routine lms_example1 to solve for the

elements of x.

2.(2 pts) We now have two additional equations to include defined as:

3 42

14 2

2 12 3

10 5

7 x xx

xx x

− + =

+ +=


Describe how one might find a solution for the vector x. By modifying the IML routine you used for part 1 above, compute this solution. (Hint: We might consider the minimization of the squared errors.)
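One way to read the hint is sketched below. It assumes that the original equations and the two additional ones are stacked into a single matrix A with more rows than columns, and a matching vector b; these dimensions are inferred from the wording, not stated in the question. With more equations than unknowns an exact solution generally does not exist, so x can instead be chosen to minimise the sum of squared errors:

```latex
\begin{aligned}
\hat{x} &= \arg\min_{x}\; (Ax - b)^{\top}(Ax - b), \\
A^{\top}A\,\hat{x} &= A^{\top}b, \\
\hat{x} &= (A^{\top}A)^{-1}A^{\top}b \quad \text{(provided } A \text{ has full column rank).}
\end{aligned}
```

The second line is the set of normal equations obtained by setting the derivative of the squared-error criterion with respect to x to zero. When A is square and invertible, the last line reduces to the exact solution of Ax = b.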

Part III. (5pts-ECON30025 /8pts-ECOM90020) The World's Super Yachts

This question requires you to consider two data series. This assignment combines the methods used in the AFL football attendance program and the multivariate statistics routines.

Recent events have brought to the fore the existence of Super Yachts that are owned by individuals of very high wealth. The syt data set, read by the code below, is a sample of over 1000 so-called super yachts and their characteristics. [2] These characteristics include: country of the owner, the value in US$, the length in metres, number of crew, number of guests, and the year in which it was built.

The other dataset to consider is the United Nations' (UN) cross-country series entitled un_plus, which we considered for the principal component analysis example with quality of life indicators (see PCA_Example.sas). This data set records several country-specific characteristics that include GDP per capita, population, infant mortality rates, and many other national characteristics. [3]

I have written a program to read these data series from the subject datasets online, as listed below. It is called assign1Q3_22.sas; you should use it at the start of your routine. If you have any questions about the interpretation of the UN variables, go to the source website.

[1] https://edition.cnn.com/travel/article/skyscraper-superyacht-concept/index.html

[2] The list provided here is a modification of information that can be found on the internet.

[3] Most of these variables can be found at http://hdr.undp.org/en/data. Note we add c_n for the country number to the dataset.

/* assign1Q3_22.sas
   Read a file of data found from lists of Super Yachts online.
   These data list the value of the yachts and other characteristics
   by country of their owners. */

filename csvFile1 url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/Super_Yachts_v.csv"
termstr=crlf;

proc import datafile=csvFile1 out=syt replace dbms=csv; run;

data syt ; set syt ;
age = 2022 - yr;
label
Value = "Estimated value (mill $US)"
Guests = "Number of Guests that can be accommodated"
Crew = "Number of Crew"
yr = "Year Built"
size = "Length in metres"
up_cnty = "Country of Owner"
c_n = "Country Number"
age = "Age of Yacht"
;
run;

/* The United Nations' (UN) cross country series,
   as considered for the principal component analysis example with quality of life indicators
   (see PCA_Example.sas). This data set records several country specific characteristics that
   include GDP per capita, population, infant mortality rates, and many other national
   characteristics. */

filename csvFile2 url

"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/un_plus.csv"

termstr=crlf;

proc import datafile=csvFile2 out=un_plus replace dbms=csv; run;

data un_plus ; set un_plus ;

productivity = productivity / 1000 ; * rescale productivity to be in 1,000s ;

gini = gini * 100 ; * rescale the gini coefficient ;

/* Define the uppercase name of the country and change some of the country names
   on the UN data set. As assembled from: http://hdr.undp.org/en/data */

up_cnty = upcase(country) ;

if up_cnty = "CZECH REPUBLIC" then up_cnty = "CZECHIA";

if up_cnty = "BURKIAN FASO" then up_cnty = "BURKINA FASO";

label

ARTICLES= "Scientific articles per capita"

CO2= "Carbon dioxide emissions per capita, (tonnes), 2011"

EMP_RATIO_15= "Employment to population ratio, (% ages 15 and older), 2013"

EN_SEC= "Gross enrolment ratio, Secondary, (% of secondary school age population), 2008"

EN_TER= "Gross enrolment ratio, Tertiary, (% of tertiary school age population), 2008-2012"

EQ_MATH= "Education quality, Performance of 15-year-old students, Mathematics, 2012"

EQ_READING= "Education quality, Performance of 15-year-old students, Reading, 2012"

EQ_SCIENCE= "Education quality, Performance of 15-year-old students, Science, 2012"

EQ_SEC= "Education quality, Population with at least some secondary education, (% ages 25+)"

FER_2010= "Total fertility rate, (births per woman),2010/2015"

GDP_CAP= "Gross Domestic Product per capita"

GINI= "GINI index (World Bank estimate)"

GR_HDI= "Average annual HDI growth, (%), 1990-2014"

HDI= "Rank of Human Development Index 2013"

HDI_VALUE= "HDI, Value, 2014"

IMMIGRANTS= "Human mobility, Stock of immigrants (% of population), 2013"

INEQ_GEN= "Gender Inequality Index Value, 2014"

INEQ_PALMA= "Income inequality, Palma ratio, 2005-2013"

INEQU_GINI= "Income inequality, Gini coefficient, 2005-2013"

INEQU_QUIN= "Income inequality, Quintile ratio, 2005-2013"

INT_STUDENTS= "Human mobility, International student mobility, (% of tertiary enrolment)"


INTERNET= "Communication, Internet users, (% of population), 2014"

LGDP_CAP= "Log GDP per capita"

LIFE_EXP= "Healthy life expectancy at birth"

MED_AGE= "Population, Median age, (years), 2015"

MIGRATION= "Human mobility, Net migration rate, (per 1,000 people), 2010/2015"

MORT_INF= "Mortality rates, (per 1,000 live births), Infant, 2013"

POP= "Population, Total, (millions), 2014"

POP_GR_2000= "Population, Average annual growth=, 2000/2005"

POP_GR_2010= "Population, Average annual growth=, 2010/2015"

POP_OV_65= "Population, Ages 65 and older, (millions), 2014"

POP_URBAN= "Population, Urban, (%), 2014"

PRIS_POP= "Prison population, (per 100,000 people), 2002-2013"

PRODUCTIVITY= "Labour productivity, Output per worker, (2011 PPP $), 2005-2012"

R_N_D= "Research and development expenditure (% of GDP), 2005-2012"

SCH_YR_F= "Mean years of schooling(years), Female, 2014"

SCH_YR_M= "Mean years of schooling(years), Male, 2014"

SEX_RATIO= "Sex ratio at birth, (male to female births), 2010/2015"

SOILSUIT= "Soil fertility"

TEMP= "Geographic temperature average 1961-1990"

TOURISTS= "Human mobility, International inbound tourists, (thousands), 2013"

WOMEN_MPS= "Share of seats in parliament (% held by women), 2014"

;

run;

proc sort data = un_plus ; by up_cnty ; run;

data un_plus ; set un_plus ; c_n = _n_ ;

label c_n = Country Number ; run;

a) (1pt) Using proc sgscatter, create scatter plots showing how the value of the yachts in syt varies with the characteristics of the yachts, as in the example below with a variable y on the y-axis and x1 and x2 on the x-axes. [4]

proc sgscatter data=syt ;

plot (y ) * (x1 x2 ) / columns = 2

loess=( )reg=(degree=3) datalabel = up_cnty; run;

b) (1pt) Again using the super yacht data (syt), estimate a hedonic regression with proc reg to predict the value of the yacht based on the characteristics of the yacht. Do the signs and significance of the coefficients match your prior opinions?
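A hedonic regression of this kind might be written as below. This is only a sketch: the particular set of regressors is an assumption for illustration, not the required specification. Because the supplied program defines age as 2022 - yr, the variables age and yr are perfectly collinear, so only one of them can enter a model that includes an intercept.

```latex
\text{value}_i = \beta_0 + \beta_1\,\text{size}_i + \beta_2\,\text{crew}_i
               + \beta_3\,\text{guests}_i + \beta_4\,\text{age}_i + \varepsilon_i
```

Each coefficient is then read as the change in estimated value (in millions of US$) associated with a one-unit change in that characteristic, holding the others fixed.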

c) (1pt) Sort the super yacht data by country number (keep the country name as an id variable) and compute the mean, the median and the maximum of the value and the characteristics by country, using the code listed below to create c_syt. Once this is done, sort both the new dataset c_syt and the un_plus data by country using the code given below:

proc sort data = syt; by c_n ; run;

proc summary data = syt ; by c_n ; id up_cnty ;

var value crew guests yr size age;

output out = c_syt
mean = avg_value avg_crew avg_guests avg_yr avg_size avg_age
median = med_value med_crew med_guests med_yr med_size med_age
max = max_value max_crew max_guests max_yr max_size max_age; run;

[4] The loess line is a non-parametric fit to the data and the reg line is a 3rd order polynomial.

d) (1 pt) Once this is done, sort the un_plus dataset by country number and merge it with c_syt using the code given below. What is in the result? What do you learn from this?

proc sort data = un_plus ; by c_n ; run;

data match miss1 miss2 ; merge un_plus(in=i1) c_syt(in=i2) ; by c_n ;

if i1 & i2 then output match ;

if i1 & not i2 then output miss1 ;

if i2 & not i1 then output miss2 ;

run;

e) (1 pt) Using the data set match, create two new variables that could be considered measures of wealth inequality in each country. The first is the ratio of the average value of a super yacht owned by someone in the country to the average income (assumed equal to GDP per capita); this is equivalent to the number of years it would take at that income level to buy the yacht. The other is the proportion of the population that owns a super yacht, assuming only one super yacht per owner. [5] Using proc sgscatter, plot the two new variables on the y-axis and the UN variables on the x-axis. Limit your analysis to no more than five variables and choose ones that you can justify using.
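The two measures could be written, for country c, roughly as follows. The names years and share are hypothetical. The scaling rests on assumptions: avg_value is in millions of US$ and POP is in millions (both as given by the variable labels), GDP_CAP is taken to be in US$ per person (its label does not state units), and _freq_ is the number of sampled yachts for the country.

```latex
\begin{aligned}
\text{years}_c &= \frac{10^{6}\times \text{avg\_value}_c}{\text{GDP\_CAP}_c}, \\[4pt]
\text{share}_c &= \frac{\text{\_freq\_}_c}{10^{6}\times \text{POP}_c}.
\end{aligned}
```

Since syt is a sample rather than a full register of yachts, the second measure would understate the true proportion of owners in each country.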

f) *(1 pt) Use the Proc Princomp routine to compute the principal components of the average

yacht characteristics excluding the value using the full data set (syt). Base this computation

on the correlation matrix of the characteristics. Interpret the results of this routine by

commenting on the variables that have the greatest influence on the first two components.
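For reference, a sketch of what basing the computation on the correlation matrix amounts to, using standard notation that does not appear in the assignment itself: each of the p characteristics is standardised to mean zero and unit variance, and the components are the eigenvectors of the resulting correlation matrix R.

```latex
\begin{aligned}
z_{ik} &= \frac{x_{ik} - \bar{x}_k}{s_k}, \qquad R = \frac{1}{n-1}\, Z^{\top} Z, \\
R\,v_j &= \lambda_j v_j, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0, \\
\text{PC}_{ij} &= v_j^{\top} z_i, \qquad
\frac{\lambda_j}{\sum_{k=1}^{p} \lambda_k} = \frac{\lambda_j}{p}.
\end{aligned}
```

The elements of v_j are the loadings on component j, so the variables with the largest absolute loadings in v_1 and v_2 are the ones with the greatest influence on the first two components, and the last expression gives the share of total variance each component explains.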

g) *(2 pt) Construct a new variable formed by the ratio of the average value to the average size in the data matched to the UN data set (match). Find a regression relationship that best explains the variation in this new variable when including only the average age of the yachts and at least two variables from the UN data. Interpret the signs of the estimated coefficients.

*Part IV 3-D Fun (2 pts-ECOM90020)

Consider the following function:

$q = \min\left(\log\left(f(x,y)^{-1}\right),\, 5\right)$, where $f(x,y) = 2x^{2} - 1.05x^{4} + \tfrac{1}{6}x^{6} + xy + y^{2}$,

where $-5 \le x \le 5$ and $-5 \le y \le 5$, with increments of x and y of 0.1.

1. *(1 pt) Using a program similar to the one that we used to plot the Hat in three dimensions

(three_d_plot), construct a 3-D plot of this function. Try changing the perspective to obtain

the best view.

2. *(1 pt) Use the contour program to locate the extreme values of this function in the

neighbourhood of the range of the xs and ys specified. Identify the values of x and y where

these points occur. Is there a global extremum?

[5] Recall that the _freq_ variable in the c_syt data set is the count of the values used in the computation for each country.
