Exercises for
Statistical Modeling: A Fresh Approach
Daniel Kaplan
Second Edition, Copyright (c) 2012, v2.02
Contents
1 Introduction
2 Data: Cases, Variables, Samples
3 Describing Variation
4 Group-wise Models
5 Confidence Intervals
6 The Language of Models
7 Model Formulas and Coefficients
8 Fitting Models to Data
9 Correlation and Partitioning of Variance
10 Total and Partial Relationships
11 Modeling Randomness
12 Confidence in Models
13 The Logic of Hypothesis Testing
14 Hypothesis Testing on Whole Models
15 Hypothesis Testing on Parts of Models
16 Models of Yes/No Variables
17 Causation
18 Experiment
Chapter 1
Introduction
Reading Questions
1. How can a model be useful even if it is not exactly
correct?
2. Give an example of a model used for classification.
3. Often we describe personalities as “patient,” “kind,”
“vengeful,” etc. How can these descriptions be used as
models for prediction?
4. Give three examples of models that you use in everyday
life. For each, say what is the purpose of the model and
in what ways the representation differs from the real
thing.
5. Make a sensible statement about how precisely these
quantities are typically measured:
The speed of a car.
Your weight.
The national unemployment rate.
A person’s intelligence.
6. Give an example of a controlled experiment. What
quantity or quantities have been varied and what has
been held constant?
7. Using one of your textbooks from another field, pick an
illustration or diagram. Briefly describe the illustration
and explain how this is a model, in what ways it is faithful
to the system being described and in what ways it
fails to reflect that system.
Prob 1.01 Many fields of natural and social science have
principles that are identified by name. Sometimes these are
called “laws,” sometimes “principles,” “theories,” etc. Some
examples:
Kepler’s Law; Newton’s Laws of Motion; Ohm’s Law;
Grimm’s Law; Nernst equation; Raoult’s Law;
Nash equilibrium; Boyle’s Law; Zipf’s Law;
Law of diminishing marginal utility; Pareto principle;
Snell’s Law; Hooke’s Law; Fitts’s Law;
Laws of supply and demand; Ideal gas law;
Newton’s law of cooling; Le Chatelier’s principle;
Poiseuille’s law
These laws and principles can be thought of as models.
Each is a description of a relationship. For instance, Hooke’s
law relates the extension and stiffness of a spring to the force
exerted by the spring. The laws of supply and demand relate
the quantity of a good to its price and postulate that the
market price is established at the equilibrium of supply and
demand.
Pick a law or principle from an area of interest to you —
chemistry, linguistics, sociology, physics, ... whatever. Describe
the law, what quantities or qualities it relates to one
another, and the ways in which the law is a model, that is, a
representation that is suitable for some purposes or situations
and not others.
Enter your answer here:
An example is given below.
EXAMPLE: As described in the text, Hooke’s
Law, f = -kx, relates the force (f), the stiffness
(k) and the extension past resting length (x) for a
spring. It is a useful and accurate approximation
for small extensions. For large extensions, however,
springs are permanently distorted or break.
Springs involve friction, which is not included in
the law. Some springs, such as passive muscle,
are really composites and show a different pattern,
e.g., f = -k x^3/|x| for moderate-sized extensions.
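The two force laws in the example can be compared numerically. The following is a minimal sketch in R, not part of the original exercise: the stiffness value, the extensions, and the function names hooke and composite are all hypothetical, chosen only for illustration.

```r
# Hypothetical values, chosen only for illustration
k <- 2                                  # stiffness
x <- c(-2, -1, -0.5, 0.5, 1, 2)         # extension past resting length (zero is
                                        # omitted because x^3/|x| is undefined there)

# Hooke's Law: the restoring force is proportional to the extension
hooke <- function(x, k) -k * x

# The composite-spring pattern from the example: f = -k x^3/|x|
composite <- function(x, k) -k * x^3 / abs(x)

# Tabulate both models side by side
forces <- data.frame(x = x,
                     linear = hooke(x, k),
                     composite = composite(x, k))
forces
```

With these values the two models agree when |x| = 1 and diverge for larger extensions, which is one way of seeing that a model can be suitable over one range and not another.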
Prob 1.02 NOTE: Before starting, instruct R to use the
mosaic package:
> require(mosaic)
Each of the following statements has a syntax mistake.
Write the statements properly and give a sentence saying what
was wrong. (Cut and paste the correct statement from R,
along with any output that R gives and your sentence saying
what was wrong in the original.)
Here’s an example:
QUESTION: What is wrong with this statement?
> a = fetchData(myfile.csv)
ANSWER: It should be
> a = fetchData("myfile.csv")
The file name is a character string and therefore
should be in quotes. Otherwise it’s treated as an
object name, and there is no object called myfile.csv.
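The difference between a quoted string and an unquoted name can be checked directly in base R, without reading any file. This is a minimal sketch that assumes a fresh session in which no object named myfile.csv has been created; the variable name unquoted is hypothetical.

```r
# A quoted name is a character string
is.character("myfile.csv")

# An unquoted name is looked up as an object. Assuming no such object
# exists in the session, the lookup fails; tryCatch() captures the error
# message instead of stopping.
unquoted <- tryCatch(myfile.csv, error = function(e) conditionMessage(e))
unquoted
```

The captured message names the missing object, which is the same complaint R makes when the file name is passed without quotes.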
Now for the real thing. Say what’s wrong with each of
these statements for the purpose given:
(a) > seq(5;8) to give [1] 5 6 7 8
A Nothing is wrong.
B Use a comma instead of a semi-colon to separate