# Econometric Analysis of Cross Section and Panel Data

Jeffrey M. Wooldridge
Pages: 1096
https://www.jstor.org/stable/j.ctt5hhcfr

1. Front Matter
(pp. i-iv)
2. Table of Contents
(pp. v-xx)
3. Preface
(pp. xxi-xxviii)
4. Acknowledgments
(pp. xxix-xxx)
5. ### I INTRODUCTION AND BACKGROUND

• 1 Introduction
(pp. 3-12)

The goal of most empirical studies in economics and other social sciences is to determine whether a change in one variable, say $w$, causes a change in another variable, say $y$. For example, does having another year of education cause an increase in monthly salary? Does reducing class size cause an improvement in student performance? Does lowering the business property tax rate cause an increase in city economic activity? Because economic variables are properly interpreted as random variables, we should use ideas from probability to formalize the sense in which a change in $w$ causes a change in $y$.

The...

• 2 Conditional Expectations and Related Concepts in Econometrics
(pp. 13-36)

As we suggested in Section 1.1, the conditional expectation plays a crucial role in modern econometric analysis. Although it is not always explicitly stated, the goal of most applied econometric studies is to estimate or test hypotheses about the expectation of one variable (called the explained variable, the dependent variable, the regressand, or the response variable, and usually denoted $y$) conditional on a set of explanatory variables, independent variables, regressors, control variables, or covariates, usually denoted $\textbf{x} = (x_1, x_2, \ldots, x_K)$.

A substantial portion of research in econometric methodology can be interpreted as finding ways to estimate conditional expectations in the numerous settings...

• 3 Basic Asymptotic Theory
(pp. 37-50)

This chapter summarizes some definitions and limit theorems that are important for studying large-sample theory. Most claims are stated without proof, as several require tedious epsilon-delta arguments. We do prove some results that build on fundamental definitions and theorems. A good, general reference for background in asymptotic analysis is White (2001). In Chapter 12 we introduce further asymptotic methods that are required for studying nonlinear models.

Asymptotic analysis is concerned with the various kinds of convergence of sequences of estimators as the sample size grows. We begin with some definitions regarding nonstochastic sequences of numbers. When we apply these results...

6. ### II LINEAR MODELS

• 4 Single-Equation Linear Model and Ordinary Least Squares Estimation
(pp. 53-88)

This and the next couple of chapters cover what is still the workhorse in empirical economics: the single-equation linear model. Though you are assumed to be comfortable with ordinary least squares (OLS) estimation, we begin with OLS for a couple of reasons. First, it provides a bridge between more traditional approaches to econometrics, which treat explanatory variables as fixed, and the current approach, which is based on random sampling with stochastic explanatory variables. Second, we cover some topics that receive at best cursory treatment in first-semester texts. These topics, such as proxy variable solutions to the omitted variable problem, arise...

• 5 Instrumental Variables Estimation of Single-Equation Linear Models
(pp. 89-122)

In this chapter we treat instrumental variables estimation, which is probably second only to ordinary least squares in terms of methods used in empirical economic research. The underlying population model is the same as in Chapter 4, but we explicitly allow the unobservable error to be correlated with the explanatory variables.

To motivate the need for the method of instrumental variables, consider a linear population model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_K x_K + u, \tag{5.1}$$

$$\text{E}(u) = 0, \quad \text{Cov}(x_j, u) = 0, \quad j = 1, 2, \ldots, K-1, \tag{5.2}$$

but where $x_K$ might be correlated with $u$. In other words, the explanatory variables $x_1, x_2, \ldots, x_{K-1}$ are exogenous, but $x_K$ is potentially endogenous in equation (5.1). The endogeneity can come from any of...
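The consequence of a regressor being correlated with $u$ can be illustrated with a small simulation. The sketch below is not taken from the book: it assumes a single endogenous regressor, a hypothetical instrument `z` that is correlated with `x` but not with `u`, and made-up parameter values. It compares the OLS slope with the simple instrumental variables slope $\text{Cov}(z, y)/\text{Cov}(z, x)$.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def simulate(n=20000, beta1=2.0, seed=0):
    """Draw (x, y, z) where x is correlated with the error u (assumed design)."""
    rng = random.Random(seed)
    x, y, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)                  # instrument, independent of u
        ui = rng.gauss(0.0, 1.0)                  # unobserved error
        xi = zi + 0.8 * ui + rng.gauss(0.0, 1.0)  # endogenous: Cov(x, u) = 0.8
        yi = 1.0 + beta1 * xi + ui                # population model with slope beta1
        x.append(xi)
        y.append(yi)
        z.append(zi)
    return x, y, z


def ols_slope(x, y):
    """OLS slope from a regression of y on x and a constant."""
    return cov(x, y) / cov(x, x)


def iv_slope(x, y, z):
    """Simple IV slope using z as the instrument for x."""
    return cov(z, y) / cov(z, x)


x, y, z = simulate()
print("OLS slope:", round(ols_slope(x, y), 3))
print("IV slope: ", round(iv_slope(x, y, z), 3))
```

In this assumed design the OLS slope drifts away from the true value of 2 because $\text{Cov}(x, u) \neq 0$, while the IV slope should land close to it in large samples.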

• 6 Additional Single-Equation Topics
(pp. 123-160)

In this section we discuss the large-sample properties of OLS and 2SLS estimators when some regressors or instruments have been estimated in a first step.

We often need to draw on results for OLS estimation when one or more of the regressors have been estimated from a first-stage procedure. To illustrate the issues, consider the model

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_K x_K + \gamma q + u. \tag{6.1}$$

We observe $x_1, \ldots, x_K$, but $q$ is unobserved. However, suppose that $q$ is related to observable data through the function $q = f(\textbf{w}, \boldsymbol{\delta})$, where $f$ is a known function and $\textbf{w}$ is a vector of observed variables, but the vector of parameters $\boldsymbol{\delta}$ is...

• 7 Estimating Systems of Equations by Ordinary Least Squares and Generalized Least Squares
(pp. 161-206)

This chapter begins our analysis of linear systems of equations. The first method of estimation we cover is system ordinary least squares, which is a direct extension of OLS for single equations. In some important special cases the system OLS estimator turns out to have a straightforward interpretation in terms of single-equation OLS estimators. But the method is applicable to very general linear systems of equations.

We then turn to a generalized least squares (GLS) analysis. Under certain assumptions, GLS—or its operationalized version, feasible GLS—will turn out to be asymptotically more efficient than system OLS. Nevertheless, we emphasize...

• 8 System Estimation by Instrumental Variables
(pp. 207-238)

In Chapter 7 we covered system estimation of linear equations when the explanatory variables satisfy certain exogeneity conditions. For many applications, even the weakest of these assumptions, Assumption SOLS.1, is violated, in which case instrumental variables procedures are indispensable.

The modern approach to system instrumental variables (SIV) estimation is based on the principle of generalized method of moments (GMM). Method of moments estimation has a long history in statistics for obtaining simple parameter estimates when maximum likelihood estimation requires nonlinear optimization. Hansen (1982) and White (1982b) showed how the method of moments can be generalized to apply to a variety...

• 9 Simultaneous Equations Models
(pp. 239-280)

The emphasis in this chapter is on situations where two or more variables are jointly determined by a system of equations. Nevertheless, the population model, the identification analysis, and the estimation methods apply to a much broader range of problems. In Chapter 8, we saw that the omitted variables problem described in Example 8.2 has the same statistical structure as the true simultaneous equations model in Example 8.1. In fact, any or all of simultaneity, omitted variables, and measurement error can be present in a system of equations. Because the omitted variable and measurement error problems are conceptually easier—and...

• 10 Basic Linear Unobserved Effects Panel Data Models
(pp. 281-344)

In Chapter 7 we covered a class of linear panel data models where, at a minimum, the error in each time period was assumed to be uncorrelated with the explanatory variables in the same time period. For certain panel data applications this assumption is too strong. In fact, a primary motivation for using panel data is to solve the omitted variables problem.

In this chapter we study population models that explicitly contain a time-constant, unobserved effect. The treatment in this chapter is “modern” in the sense that unobserved effects are treated as random variables, drawn from the population along with...
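To make the omitted variables motivation concrete, here is a minimal sketch, assuming a two-period panel $y_{it} = \beta x_{it} + c_i + u_{it}$ in which the regressor is correlated with the time-constant effect $c_i$. The design, the parameter values, and the function names are illustrative assumptions, not the book's own example. Differencing across the two periods removes $c_i$, so the first-difference slope should be close to $\beta$, while pooled OLS is not.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def simulate_panel(n=10000, beta=1.5, seed=0):
    """Return a list of (x1, y1, x2, y2) tuples, one per cross-section unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        c = rng.gauss(0.0, 1.0)           # time-constant unobserved effect
        row = []
        for _ in range(2):                # two time periods
            x = c + rng.gauss(0.0, 1.0)   # regressor correlated with c
            y = beta * x + c + rng.gauss(0.0, 1.0)
            row.extend([x, y])
        panel.append(tuple(row))
    return panel


def pooled_ols_slope(panel):
    """Slope from pooling both periods and ignoring the unobserved effect."""
    xs = [r[0] for r in panel] + [r[2] for r in panel]
    ys = [r[1] for r in panel] + [r[3] for r in panel]
    return cov(xs, ys) / cov(xs, xs)


def first_difference_slope(panel):
    """Slope from regressing the change in y on the change in x."""
    dx = [r[2] - r[0] for r in panel]
    dy = [r[3] - r[1] for r in panel]
    return cov(dx, dy) / cov(dx, dx)


panel = simulate_panel()
print("pooled OLS slope:      ", round(pooled_ols_slope(panel), 3))
print("first-difference slope:", round(first_difference_slope(panel), 3))
```

First differencing is only one of several ways to remove a time-constant effect; it is used here because it needs nothing beyond subtraction.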

• 11 More Topics in Linear Unobserved Effects Models
(pp. 345-394)

This chapter continues our treatment of linear, unobserved effects panel data models. In Section 11.1 we briefly treat the GMM approach to estimating the standard, additive effect model from Chapter 10, emphasizing some equivalences between the standard estimators and GMM 3SLS estimators. In Section 11.2, we cover estimation of models where, at a minimum, the assumption of strict exogeneity conditional on the unobserved heterogeneity (Assumption FE.1) fails. Instead, we assume we have available instrumental variables (IVs) that are uncorrelated with the idiosyncratic errors in all time periods. Depending on whether these instruments are also uncorrelated with the unobserved effect, we...

7. ### III GENERAL APPROACHES TO NONLINEAR ESTIMATION

• [III Introduction]
(pp. 395-396)

In this part we begin our study of nonlinear econometric methods. What we mean by nonlinear needs some explanation because it does not necessarily mean that the underlying model is what we would think of as nonlinear. For example, suppose the population model of interest can be written as $y = \textbf{x}\boldsymbol{\beta} + u$, but, rather than assuming $\text{E}(u|\textbf{x}) = 0$, we assume that the *median* of $u$ given $\textbf{x}$ is zero for all $\textbf{x}$. This assumption implies $\text{Med}(y|\textbf{x}) = \textbf{x}\boldsymbol{\beta}$, which is a linear model for the conditional median of $y$ given $\textbf{x}$. (The conditional mean, $\text{E}(y|\textbf{x})$, may or may not be linear...

• 12 M-Estimation, Nonlinear Regression, and Quantile Regression
(pp. 397-468)

We begin our study of nonlinear estimation with a general class of estimators known as M-estimators, a term introduced by Huber (1967). (You might think of the “M” as standing for minimization or maximization.) M-estimation methods include maximum likelihood, nonlinear least squares, least absolute deviations, quasi-maximum likelihood, and many other procedures used by econometricians.

Much of this chapter is somewhat abstract and technical, but it is useful to develop a unified theory early on so that it can be applied in a variety of situations. We will carry along the example of nonlinear least squares for cross section data to...

• 13 Maximum Likelihood Methods
(pp. 469-524)

This chapter contains a general treatment of maximum likelihood estimation (MLE) under random sampling. All the models we considered in Part II could be estimated without making full distributional assumptions about the endogenous variables conditional on the exogenous variables: maximum likelihood methods were not needed. Instead, we focused primarily on zero-covariance and zero-conditional-mean assumptions, and secondarily on assumptions about conditional variances and covariances. These assumptions were sufficient for obtaining consistent, asymptotically normal estimators, some of which were shown to be efficient within certain classes of estimators.

Some texts on advanced econometrics take MLE as the unifying theme, and then most...

• 14 Generalized Method of Moments and Minimum Distance Estimation
(pp. 525-558)

In Chapter 8 we saw how the generalized method of moments (GMM) approach to estimation can be applied to multiple-equation linear models, including systems of equations, with exogenous or endogenous explanatory variables, and to panel data models. In this chapter we extend GMM to nonlinear estimation problems. This setup allows us to treat various efficiency issues that we have glossed over until now. We also cover the related method of minimum distance estimation. Because the asymptotic analysis has many features in common with Chapters 8 and 12, the analysis is not quite as detailed here as in previous chapters. A...

8. ### IV NONLINEAR MODELS AND RELATED TOPICS

• [IV Introduction]
(pp. 559-560)

We now apply the general methods of Part III to study specific nonlinear models that often arise in applications. Many nonlinear econometric models are intended to explain limited dependent variables. Roughly, a limited dependent variable is a variable whose range is restricted in some important way. Most variables encountered in economics are limited in range, but not all require special treatment. For example, many variables—wage, population, and food consumption, to name just a few—can only take on positive values. If a strictly positive variable takes on numerous values, we can avoid special econometric tools by taking the log...

• 15 Binary Response Models
(pp. 561-642)

In binary response models, the variable to be explained, $y$, is a random variable taking on the values zero and one, which indicate whether or not a certain event has occurred. For example, $y = 1$ if a person is employed, $y = 0$ otherwise; $y = 1$ if a family contributes to charity during a particular year, $y = 0$ otherwise; $y = 1$ if a firm has a particular type of pension plan, $y = 0$ otherwise. Regardless of the definition of $y$, it is traditional to refer to $y = 1$ as a *success* and $y$ =...

• 16 Multinomial and Ordered Response Models
(pp. 643-666)

In this chapter we consider discrete response models with more than two outcomes. Most applications fall into one of two categories. The first is an unordered response, sometimes called a nominal response, where the values attached to different outcomes are arbitrary and have no effect on estimation, inference, or interpretation. Examples of unordered responses include occupational choice, health plan choice, and transportation mode for commuting to work. For example, if there are four health plans to choose from, we might label these 0, 1, 2, and 3—or 100, 200, 300, 400—and it does not matter which plan we...

• 17 Corner Solution Responses
(pp. 667-722)

We now turn to models for limited dependent variables that have features of both continuous and discrete random variables. In particular, they are continuously distributed over a range of values—sometimes a very wide range—but they take on one or two focal points with positive probability. Such variables arise often in modeling individual, family, or firm behavior, and even when studying outcomes at a more aggregated level, such as the classroom or school level.

The most common case is when the nonnegative response variable, $y$, has a (roughly) continuous distribution over strictly positive values, but $\text{P}(y = 0) > 0$....

• 18 Count, Fractional, and Other Nonnegative Responses
(pp. 723-776)

A count variable is a variable that takes on nonnegative integer values. Many variables that we would like to explain in terms of covariates come as counts. A few examples include the number of times someone is arrested during a given year, number of emergency room drug episodes during a given week, number of cigarettes smoked per day, and number of patents applied for by a firm during a year. These examples have two important characteristics in common: there is no natural a priori upper bound, and the outcome will be zero for at least some members of the population....

• 19 Censored Data, Sample Selection, and Attrition
(pp. 777-852)

In previous chapters we assumed that we can obtain a random sample from the population of interest. For example, in Part II, where we studied models linear in the parameters, we assumed that data on the dependent variable, the explanatory variables, and instrumental variables can be obtained by means of random sampling—whether in a cross section or panel data context. In earlier chapters of Part IV we studied various nonlinear models for response variables that are limited in some way. Chapter 15 extensively considered binary response models, and we saw that the most commonly used models imply nonconstant partial...

• 20 Stratified Sampling and Cluster Sampling
(pp. 853-902)

In this chapter we study estimation when the data have been obtained by means of two common nonrandom sampling schemes. Stratified sampling occurs when units in a population are sampled with probabilities that do not reflect their frequency in the population. For example, in obtaining a data set on families, low-income families might be oversampled and high-income families undersampled. There are various mechanisms by which stratified samples are obtained, and we will cover the most common ones in this chapter.

The case of truncated sampling covered in Section 19.7 can be viewed as an extreme case of stratified sampling, where...
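The oversampling example can be sketched numerically. The code below is an illustration under stated assumptions, not the chapter's estimator: it assumes two income strata with made-up population shares (`POP_SHARE`) and sampling shares (`SAMPLE_SHARE`) that are both known, and it reweights each observation by the ratio of the two to recover the population mean.

```python
import random

# Hypothetical shares: low-income families are 30% of the population
# but make up 50% of the sample (they are oversampled).
POP_SHARE = {"low": 0.3, "high": 0.7}
SAMPLE_SHARE = {"low": 0.5, "high": 0.5}
STRATUM_MEAN = {"low": 20.0, "high": 80.0}  # assumed mean outcome per stratum


def draw_stratified_sample(n=10000, seed=0):
    """Draw (stratum, outcome) pairs using the sampling shares, not the population shares."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        stratum = "low" if rng.random() < SAMPLE_SHARE["low"] else "high"
        sample.append((stratum, rng.gauss(STRATUM_MEAN[stratum], 5.0)))
    return sample


def unweighted_mean(sample):
    """Plain sample average, which ignores the sampling scheme."""
    return sum(y for _, y in sample) / len(sample)


def weighted_mean(sample):
    """Average with weights equal to population share over sampling share."""
    weights = [POP_SHARE[s] / SAMPLE_SHARE[s] for s, _ in sample]
    total = sum(w * y for w, (_, y) in zip(weights, sample))
    return total / sum(weights)


sample = draw_stratified_sample()
population_mean = sum(POP_SHARE[s] * STRATUM_MEAN[s] for s in POP_SHARE)
print("population mean:", population_mean)
print("unweighted mean:", round(unweighted_mean(sample), 2))
print("weighted mean:  ", round(weighted_mean(sample), 2))
```

The unweighted average is pulled toward the oversampled stratum, whereas the weighted average should sit near the population mean implied by the assumed shares.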

• 21 Estimating Average Treatment Effects
(pp. 903-982)

We now explicitly cover the problem of estimating an average treatment effect (ATE), sometimes called an average causal effect. An ATE is a special case of an average partial effect—it is an APE for a binary explanatory variable—and therefore many of the econometric models and methods that we have used in previous chapters can be applied or adapted to the problem of estimating ATEs.

Estimating ATEs has become important in the program evaluation literature, such as the evaluation of job-training programs or school voucher programs. Many of the early applications of the methods described in this chapter were...
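As a minimal sketch of what an ATE measures, the simulation below generates two potential outcomes per unit and assigns a binary treatment `w` at random. Random assignment, the outcome distributions, and the true effect of 3 are assumptions made for illustration; with them, the difference in mean outcomes between treated and untreated units should be close to the ATE.

```python
import random


def simulate_experiment(n=20000, true_ate=3.0, seed=0):
    """Return (w, y) pairs: treatment indicator and the observed outcome."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)                  # outcome without treatment
        y1 = y0 + true_ate + rng.gauss(0.0, 1.0)   # outcome with treatment
        w = 1 if rng.random() < 0.5 else 0         # randomly assigned treatment
        y = y1 if w == 1 else y0                   # only one outcome is observed
        data.append((w, y))
    return data


def difference_in_means(data):
    """Mean observed outcome of treated units minus that of untreated units."""
    treated = [y for w, y in data if w == 1]
    control = [y for w, y in data if w == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


data = simulate_experiment()
print("difference in means:", round(difference_in_means(data), 3))
```

Without random assignment the same difference can be badly misleading, which is why the methods listed for this chapter go beyond a simple comparison of means.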

• 22 Duration Analysis
(pp. 983-1024)

Some response variables in economics come in the form of a duration, which is the time elapsed until a certain event occurs. A few examples include weeks unemployed, months spent on welfare, days until arrest after incarceration, and quarters until an Internet firm files for bankruptcy.

The recent literature on duration analysis is quite rich. In this chapter we focus on the developments that have been used most often in applied work. In addition to providing a rigorous introduction to modern duration analysis, this chapter should prepare you for more advanced treatments, such as Lancaster’s (1990) monograph, van den Berg...

9. References
(pp. 1025-1044)
10. Index
(pp. 1045-1064)