Statistics 710 Advanced Statistics: Large-Sample Statistical Theory
Instructor: Eric Slud, Statistics Program, Math. Dept.
Office: Mth 2314, x5-5469, email evs@math.umd.edu
Office hours: initially MF 1
Course Text: A. van der Vaart, Asymptotic Statistics (2000), Cambridge University Press (paperback).
Assigned work and Grading: the course grade
will be based on 7
Prerequisite: Stat 700 and Stat 600.
This course consists of five topical modules on advanced probability and statistical theory, with the common theme of statistical inference from large-sample data. Three of the modules are mostly about Probability Theory tools.

(I) Empirical processes: material generalizing the Law of Large Numbers to provide results about uniform almost-sure convergence of empirical averages of random variables like f(X_j) (for iid r.v.'s X_j), where "uniformity" is over classes of functions f.

(II) Contiguity Theory and Local Asymptotic Normality. References here are Chapters 6 and 7 of van der Vaart's book, possibly supplemented with the books Le Cam, L. and Yang, G. L. (1990), "Asymptotics in Statistics: Some Basic Concepts", or Greenwood, P. E. and Shiryaev, A. N. (1985), "Contiguity and the Statistical Invariance Principle".

(III) Estimating Equations. Maximum likelihood and generalizations. Minimum contrast, misspecified likelihood, and M-estimators. The reference is Chapter 5 of van der Vaart, plus other materials to be filled in later, in conjunction with module (IV) on efficient estimating equations.

(IV) Efficiency of Estimators. Asymptotically linear estimators and influence functions. Least favorable alternatives. Regular estimators and the convolution theorem. The reference is Chapter 8 of van der Vaart.

(V) Counting processes, compensators, martingales, and statistics defined in terms of stochastic integrals with respect to compensated counting process martingales. References come from several books on martingales (e.g. Bremaud 1981, "Point Processes and Queues: Martingale Dynamics") and on Survival Analysis, such as Fleming, T. and Harrington, D. (1991), "Counting Processes and Survival Analysis", plus my own notes.

The first two lectures will motivate the study of uniform limit theorems by considering the statistical topic of estimating equations, including M-estimation. The reading is Chapter 5 of the van der Vaart text, pp. 41-59. From there, we will branch to Chapter 19 and introduce enough empirical process theory to complete the proof of Theorem 5.23 via Lemma 19.31. That will take a few weeks.
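
As a small numerical illustration of the kind of uniform limit theorem treated in module (I), the following Python sketch computes sup_t |F_n(t) - F(t)| for simulated data, that is, uniformity over the class of indicator functions f_t(x) = 1{x <= t}. The Uniform(0,1) distribution, the seed, the sample sizes, and the function name sup_ecdf_error are assumptions chosen only for illustration and are not taken from the course text.

```python
import numpy as np

def sup_ecdf_error(sample):
    """sup_t |F_n(t) - F(t)| for data assumed Uniform(0,1), so F(t) = t."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # The supremum is attained at a data point, either at the jump of F_n
    # or just before it, so both one-sided gaps are checked.
    upper = np.arange(1, n + 1) / n - x   # F_n(x_(i)) - F(x_(i))
    lower = x - np.arange(0, n) / n       # F(x_(i)) - F_n(x_(i)-)
    return float(max(upper.max(), lower.max()))

rng = np.random.default_rng(0)            # illustrative seed
sizes = [100, 1_000, 10_000, 100_000]     # illustrative sample sizes
errors = [sup_ecdf_error(rng.uniform(size=n)) for n in sizes]
for n, err in zip(sizes, errors):
    print(f"n={n:>6}  sup|F_n - F| = {err:.4f}  sqrt(n)*sup = {np.sqrt(n) * err:.3f}")
```

If the sketch behaves as expected, the sup distance should shrink toward 0 as n grows (the Glivenko-Cantelli property), while the rescaled column sqrt(n)*sup should stay of order 1, which is the scale at which the Donsker-type results of Chapter 19 operate.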
Homework Set #1, due Monday September 17: in Chapter 5 of the van der Vaart text, #5.12, 5.14, 5.18, 5.25. As a final assigned problem, prove the assertion on p.46, lines 3-6 from the bottom, "One simple set of sufficient conditions [for {m_θ(.)}_θ to be a Glivenko-Cantelli class] is ...". (The hint, as we shall discuss in class, is to study the proof of Theorem 5.14.) For HW1 partial solutions and comments, click here.

Homework Set #2, due Friday October 5: in Chapter 19 of the van der Vaart text, #19.3, 19.4, 19.5, 19.6, 19.7, and 19.10.

Notes. Problem 19.3 involves only checking equality of covariances and invoking an appropriate Theorem to conclude that the finite-dimensional distributions determine the law of the stochastic process uniquely. In Problem 19.4, the meaning of the notations F_m, G_n is different from the empirical-process usage of the chapter: here they are "empirical distribution functions". That is, F_m(t) is the proportion of observations X_1, ..., X_m less than or equal to t, and G_n(t) is the proportion of observations Y_1, ..., Y_n less than or equal to t. Problems 19.4(c) and 19.10 are exercises in formulating limiting probabilities using empirical process convergence plus the continuous mapping Theorem. Problem 19.5 is about bracketing and is fairly straightforward. Problems 19.6 and 19.7 give some practice in estimating the VC numbers used to measure the size of function classes in proving GC and Donsker properties.
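
To make the notation in the note on Problem 19.4 concrete, here is a minimal Python sketch of the two empirical distribution functions F_m and G_n exactly as defined above (proportion of observations less than or equal to t). The helper names ecdf and two_sample_sup_distance and the small data arrays are hypothetical. The sup-distance between F_m and G_n is included only as one natural functional of the pair and is not claimed to be the statistic the problem asks about.

```python
import numpy as np

def ecdf(sample):
    """Return the function t -> proportion of sample values <= t."""
    xs = np.sort(np.asarray(sample, dtype=float))
    def F(t):
        # side="right" counts the observations less than or equal to t
        return np.searchsorted(xs, t, side="right") / xs.size
    return F

def two_sample_sup_distance(x, y):
    """sup_t |F_m(t) - G_n(t)|; both are step functions, so it suffices
    to evaluate the difference at the pooled data points."""
    F_m, G_n = ecdf(x), ecdf(y)
    pooled = np.concatenate([x, y])
    return float(np.max(np.abs(F_m(pooled) - G_n(pooled))))

x = np.array([0.2, 0.5, 0.9])        # hypothetical X_1, ..., X_m (m = 3)
y = np.array([0.1, 0.4, 0.6, 0.8])   # hypothetical Y_1, ..., Y_n (n = 4)
F_m, G_n = ecdf(x), ecdf(y)
print(F_m(0.5), G_n(0.5))
print(two_sample_sup_distance(x, y))
```

Returning a closure from ecdf keeps the sorted sample hidden and lets F_m and G_n be evaluated at scalars or arrays alike, which is convenient when experimenting with continuous-mapping arguments.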
For HW2 solutions, click here.

Homework Set #3, due Monday October 29: Chap. 6, p.91: #1, 2, 3, 4, 6.

Homework Set #4, due Friday November 16: Chap. 7, p.106, #1, 5, 6, 10. Chap. 8, p.123, #3.

Homework Set #5, due Wednesday, Dec. 12. There are 7 problems in all. The reading for this part of the course is from Notes of mine, about Martingale Methods, together with a little bit of material from Chapter 13 (from which some problems will be taken). This part of the course is about compensated counting processes and martingale methods in statistics, especially with reference to rank-based statistics.

NOTES

(1). A very useful general lemma on uniform convergence of random functions M_n(θ)

(2). A really nice article by Peter Bickel along the lines of our semiparametric

(3). A set of Chapters I wrote on Martingale Methods in Statistics can be found

The UMCP Math Department home page. The University of Maryland home page. My home page.
© Eric V Slud, December 5, 2007.
Grading is based on the homework problem sets assigned throughout the course.
For the empirical process material of module (I), the references are: Chapter 19 of the van der Vaart book; a 1984 book, "Convergence of Stochastic Processes" by David Pollard; and some results from a 1996 book of van der Vaart and Wellner, "Weak Convergence and Empirical Processes".
For HW3 solutions, click here.
For HW4 solutions, click here.
Problems for Homework Set #5:

(1). Prove that when X_i, for i=1,2,...,n, are iid with not necessarily continuous distribution function F, the Kiefer-Wolfowitz Generalized Nonparametric Maximum Likelihood Estimator (GNPMLE) of the unknown d.f. F based on the sample of size n is the empirical distribution function. (For this problem, you can briefly summarize the argument given in class to establish that the GNPMLE must assign all its mass to probability atoms at the points X_i.)

(2). Suppose that (X_i, Z_i) are iid, where X_i given Z_i has continuous cumulative hazard function exp(β' Z_i) Λ(t). Then we saw in class that the Cox Partial Likelihood is equal to the product of Likelihood factors involving (β,Λ) with Λ replaced by the weighted Nelson-Aalen estimator Λ_β (found in class). Show that this same Cox Partial Likelihood expression is equal to the "marginal rank likelihood", that is, to the probability conditional on the Z's that the X_i observations would fall in their observed sorted order.

(3)--(4). Two problems from Chapter 13, numbers 4 and 6 on pp.190-191. For #13.6, you may if you prefer do the problem for the 2-sample (not the signed-rank) version of the Wilcoxon.

(5)--(7). The remaining problems are about the martingale material: Exercises 2 and 3, respectively on pp. 8 and 11 of Chapter 1 of the Martingale Methods in Statistics notes referenced as Handout (3) in the NOTES, and Exercise 5 on p. 44 of those Notes. In the latter problem find also the variance process of the process M(t).
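
The identity in problem (2) can be sanity-checked numerically before attempting a proof. The Python sketch below is only such a check, under stated assumptions: a scalar covariate, no censoring, no ties, and baseline cumulative hazard Λ(t) = t, so that X_i given Z_i is exponential with rate exp(β Z_i). The function names, the value of β, the covariate values, the number of simulations, and the seed are all hypothetical.

```python
import numpy as np

def cox_partial_likelihood(beta, z_sorted):
    """Cox partial likelihood for untied, uncensored data.

    z_sorted[i] is the scalar covariate of the i-th smallest X, so the
    risk set at the i-th failure is {i, i+1, ..., n-1}.
    """
    r = np.exp(beta * np.asarray(z_sorted, dtype=float))
    risk_sums = np.cumsum(r[::-1])[::-1]   # sum of r[j] over j >= i
    return float(np.prod(r / risk_sums))

def rank_probability_mc(beta, z_sorted, n_sim=200_000, seed=0):
    """Monte Carlo estimate of P(X_0 < X_1 < ... < X_{n-1} | Z)."""
    rng = np.random.default_rng(seed)
    rates = np.exp(beta * np.asarray(z_sorted, dtype=float))
    # Assumption: Lambda(t) = t, hence X_i ~ Exponential(rate = rates[i]).
    x = rng.exponential(scale=1.0 / rates, size=(n_sim, rates.size))
    in_order = np.all(np.diff(x, axis=1) > 0, axis=1)
    return float(in_order.mean())

beta = 0.7                 # hypothetical regression coefficient
z = [1.0, -0.5, 0.3]       # hypothetical covariates, in sorted-X order
pl = cox_partial_likelihood(beta, z)
mc = rank_probability_mc(beta, z)
print(f"partial likelihood = {pl:.4f}, simulated rank probability = {mc:.4f}")
```

The two printed numbers should agree up to Monte Carlo error. Fixing Λ(t) = t loses no generality for this check, since the ordering of the X_i is unchanged by a common increasing transformation of the time scale, but the simulation is of course no substitute for the argument the problem asks for.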
The lemma of Note (1), on uniform convergence of random functions M_n(θ) defined in terms of data (and which will be maximized to estimate θ), is given in Appendix II (p.1116) of P. K. Andersen and R. D. Gill (1982), "Cox's Regression Model for Counting Processes: A Large Sample Study", Annals of Statistics 10, No. 4 (Dec., 1982), pp. 1100-1120, which can be found in JSTOR. The Lemma and proof are restricted to a single page, and can be found here.
The Bickel article of Note (2), along the lines of our semiparametric efficiency discussion, is "On Adaptive Estimation", the 1980 Wald Memorial Lectures, published in Annals of Statistics (1982) 10, 647-671. The Stable URL is
http://links.jstor.org/sici?sici=0090-5364%28198209%2910%3A3%3C647%3AOAE%3E2.0.CO%3B2-1
The Chapters on Martingale Methods in Statistics of Note (3) can be found here, as reading material for the last segment of the course.
Important Dates