Final Project for R Course
1. Attach data “Florida” in library(car). The Florida data give the vote by
county in Florida for President in the 2000 election. (1) Check the class of
Florida and its dimensions. (2) Show the names of the top three
candidates with the largest numbers of votes. (3) Use a box plot to show the
vote information of the top three candidates. (4) Use a pie chart to show the
percentages of votes of the top three candidates. (20 marks)
2. Attach data “Salaries” in library(car). It contains the 2008-2009 nine-month
academic salaries for Assistant Professors, Associate Professors and
Professors in a college in the U.S. The data were collected as part of the
on-going effort of the college's administration to monitor salary
differences between male and female faculty members. (1) Show the
general information of the data. (2) Extract the salaries of male and
female faculty members, respectively. (3) Plot the densities of the
extracted salary data for the two sexes in a single graph. Please add a
legend to the graph to identify the curves. (4) Use a hypothesis
test to determine whether the salaries of the female and male professors are
different. (20 marks)
3. In numerical analysis, Newton's method (also known as the Newton–
Raphson method), named after Isaac Newton and Joseph Raphson, is a
method for finding successively better approximations to the roots (or
zeroes) of a function $f(x)$.
The Newton–Raphson method is implemented as follows: one starts
with an initial guess $x_0$ which is reasonably close to the true root, then
the function is approximated by its tangent line. The tangent line is a
straight line with slope $f'(x_0)$ that passes through the point
$(x_0, f(x_0))$, that is, $y = f'(x_0)(x - x_0) + f(x_0)$. Next, one
computes the x-intercept of this tangent line, which will typically be a
better approximation to the function's root than the original guess $x_0$.
Setting $y = 0$, we obtain an updated guess of the root,
$x_1 = x_0 - f(x_0)/f'(x_0)$. Geometrically, $(x_1, 0)$ is the
intersection of the x-axis and the tangent line of the graph of
$f(x)$ at $(x_0, f(x_0))$.
The process is repeated as $x_{n+1} = x_n - f(x_n)/f'(x_n)$. Theory shows
that those $x_n$ will converge to the true root. In practice, if $x_{n+1}$
and $x_n$ become very close, we stop the algorithm and report
the current value as the root value.
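The iteration described above can be sketched in R as follows. This is only a minimal illustration under stated assumptions: the helper name newton_raphson, its arguments tol and max_iter, and the test function g(x) = x^2 - 2 are hypothetical choices made for this example and are not part of the assignment.

```r
# Hypothetical helper: a basic Newton-Raphson root finder.
# f        : the function whose root is sought
# fprime   : its derivative
# x0       : initial guess
# tol      : assumed stopping tolerance on |x_new - x|
# max_iter : assumed cap on the number of iterations
newton_raphson <- function(f, fprime, x0, tol = 1e-8, max_iter = 100) {
  x <- x0
  for (i in seq_len(max_iter)) {
    # x-intercept of the tangent line at the current guess
    x_new <- x - f(x) / fprime(x)
    if (abs(x_new - x) < tol) {
      return(list(root = x_new, iterations = i, converged = TRUE))
    }
    x <- x_new
  }
  list(root = x, iterations = max_iter, converged = FALSE)
}

# Illustrative function (not the one from the assignment): g(x) = x^2 - 2
g <- function(x) x^2 - 2
g_prime <- function(x) 2 * x

res <- newton_raphson(g, g_prime, x0 = 1)
print(res$root)   # should be close to sqrt(2)
```

Returning a list with a converged flag makes it easy to tell whether the loop stopped because successive iterates were close or because the iteration cap was reached.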
In many applications, we use the Newton–Raphson method to find the
maximum or minimum value of a function F(x) by setting
$F'(x) = f(x) = 0$ and finding its root $x_c$.
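As a small worked illustration of this idea, take the function $F(x) = e^x - 2x$. It is chosen here only as an example and is not the function from the assignment. The root of $f = F'$ is found with the same update rule:

```latex
\begin{aligned}
F(x) &= e^{x} - 2x, \qquad f(x) = F'(x) = e^{x} - 2, \qquad f'(x) = F''(x) = e^{x},\\
x_{n+1} &= x_n - \frac{f(x_n)}{f'(x_n)} = x_n - 1 + 2e^{-x_n},\\
x_0 &= 1 \;\Rightarrow\; x_1 = 2e^{-1} \approx 0.736, \qquad x_n \to x_c = \ln 2 \approx 0.693 .
\end{aligned}
```

Because $F''(x_c) = 2 > 0$, the root $x_c$ of $f$ is a local minimum of $F$. In general the sign of $F''(x_c)$ should be checked, since a root of $f$ may also be a maximum or an inflection point.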
Use the Newton–Raphson method to find the value of x that locally minimizes
352.0)( 34 xxxxf . Set the initial value x0=1. Stop the repeated
algorithm when it either converges (for instance, when $|x_{n+1} - x_n|$ is very small)
or . Report the final values of $x_n$, $f(x_n)$ and $F(x_n)$. (15 marks)
4. Create a function with two arguments. The first argument is a vector,
and the second argument is a single number. The function checks
whether the vector contains the specified number and returns (1) the
subscripts of the elements of the vector that equal the specified number; (2)
the number of times that the specified number appears in the vector. Use
one example to show how to apply the function. (15 marks)
5. Attach data “Highway1” in library(car). The data include 39 sections
of large highways in the state of Minnesota in 1973. The goal of this
analysis is to understand the impact of the selected variables on the
automobile accident rate. This data frame contains the following
columns:
(1) rate: 1973 accident rate per million vehicle miles
(2) len: length of the Highway1 segment in miles
(3) adt: average daily traffic count in thousands
(4) trks: truck volume as a percent of the total volume
(5) sigs1: the number of signals per mile of roadway
(6) slim: speed limit in 1973
(7) shld: width in feet of outer shoulder on the roadway
(8) lane: total number of lanes of traffic
(9) acpt: number of access points per mile
(10) itg: number of freeway-type interchanges per mile
(11) lwid: lane width, in feet
(12) htype: an indicator of the type of roadway or the source of funding
for the road, either MC, FAI, PA, or MA
Use a linear model to fit the data. Propose a model that you think fits
the data well and interpret the results. Show the analysis steps in detail,
including building the model and checking model assumptions. Save the
coefficients and their standard deviations together in a text file named
“coefficient.txt”. (30 marks)
Note: do not just show the graphs; give a formal, detailed and easily
understood report. Use the attached template. To see the details of the
data in R, use help("Highway1").
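The mechanics of saving a coefficient table can be sketched as follows. This is only an illustration under stated assumptions: it uses the built-in cars data rather than Highway1, it writes to a temporary file rather than the required file name, and it assumes that the "standard deviations" of the coefficients refer to the "Std. Error" column reported by summary().

```r
# Illustrative only: a simple model on the built-in `cars` data, not Highway1.
fit <- lm(dist ~ speed, data = cars)

# summary() of an lm fit holds a coefficient matrix with the columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coef_table <- summary(fit)$coefficients[, c("Estimate", "Std. Error")]

# Write estimates and standard errors together to a text file.
# A temporary file is used here; the name is an arbitrary choice.
out_file <- tempfile(fileext = ".txt")
write.table(coef_table, file = out_file, quote = FALSE, sep = "\t")

print(readLines(out_file))
```

Keeping the row names in the output labels each line with its coefficient name, which makes the text file readable on its own.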
Save the source code as final.r
Save the report as final.doc
Send the above two files together to email: , with email