Ch17 Endogeneity and IV (Part B)

background-image: url("../pic/slide-front-page.jpg")
class: center,middle
count: false

# Advanced Econometrics III

## Simultaneous Equation Models (SEM)

### Hu Huaping (胡华平 )

### NWAFU (西北农林科技大学)

### School of Economics and Management (经济管理学院)

### huhuaping01 at hotmail.com

### 2026-06-04
<div>
<style type="text/css">.xaringan-extra-logo {
width: 110px;
height: 70px;
z-index: 0;
background-image: url(../pic/logo/nwafu-logo-circle-wb.png);
background-size: contain;
background-repeat: no-repeat;
position: absolute;
top:0.2em;left:1em;
}
</style>
<script>(function () {
  let tries = 0
  function addLogo () {
    if (typeof slideshow === 'undefined') {
      tries += 1
      if (tries < 10) {
        setTimeout(addLogo, 100)
      }
    } else {
      document.querySelectorAll('.remark-slide-content:not(.title-slide):not(.inverse):not(.hide_logo)')
        .forEach(function (slide) {
          const logo = document.createElement('div')
          logo.classList = 'xaringan-extra-logo'
          logo.href = null
          slide.appendChild(logo)
        })
    }
  }
  document.addEventListener('DOMContentLoaded', addLogo)
})()</script>
</div>

???

Good evening everyone. Welcome to my class.

I am your teacher, Hu Huaping.

Here is my email.

You can contact me if you have any questions about our course.

In this part, we will study Simultaneous Equation Models together over about eight lessons.

---
count: false
class: center, middle, duke-orange,hide_logo

# Simultaneous Equation Models (SEM)

Chapter 18. Why Should We Be Concerned with SEM?

Chapter 19. What is the Identification Problem?

Chapter 20. How to Estimate SEM?


???

As you can see, this part contains four chapters.

First, we will go through Chapter 17, which discusses regressor endogeneity problems and instrumental variables solutions. This chapter is a good starting point for learning SEM.

The next three chapters focus closely on SEM. We will answer three important questions in turn.

In Chapter 18, we will see why SEM is important in social science and what new challenges it brings.

A large SEM system contains many parameters to be estimated and faces the identification problem. In Chapter 19, we will give guidelines and rules for checking the identification status of an SEM.

Finally, in Chapter 20 we will discuss different SEM estimation approaches, including 2SLS, three-stage least squares (3SLS), and the full information maximum likelihood (FIML) method.

---
class: center, middle,duke-softblue,hide_logo
count: false

## Chapter 17 Endogeneity and IV <br><br><br>(Part B)

[Back to Part A](SEM-slide-eng-part0-IV-v3a.html#chapter17a)

???
Part B: identification bridge, 2SLS, and tests.

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: chapter17b

# Chapter 17. Endogeneity and Instrumental Variables <br> (Part B)

[17.6 Two-stage least squares method](#TSLS)

[17.7 Testing instrument validity](#validity)

[17.8 Testing regressor endogeneity](#endogeneity)

[Exercise and Computation](#exercise)

???
[source](https://web.sgh.waw.pl/~mrubas/AdvEcon/pdf/T2_Endogeneity.pdf)

So let us start the first chapter.

In this chapter:

- You will see how the method of **instrumental variables** (IV) can be used to solve the problem of **endogeneity** of one or more regressors.

- We will also learn the method of **two-stage least squares** in section 17.6. The 2SLS method is second in popularity only to ordinary least squares for estimating linear equations in econometrics.

- Some useful testing techniques will be introduced to check instrument validity and regressor endogeneity. This content is covered in the last two sections.

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: TSLS

## 17.6 Two-stage least squares method

???
In this section, we will discuss how to perform the two-stage least squares estimation procedure.

---
layout: true

<div class="my-footer"><span>huhuaping@   <a href="#chapter17"> Chapter 17. Endogeneity and Instrumental Variables  |</a> &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; <a href="#TSLS"> 17.6 Two-stage least squares method </a></span></div>

---

### Recall 17.5: IV as a ratio of two regressions

From section 17.5, the IV estimand can be written as `$\beta^{IV} = \rho / \pi$` — the ratio of a **reduced-form** coefficient to a **first-stage** coefficient. **2SLS** generalizes this when we have multiple instruments and controls.
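
The ratio can be checked numerically. The following is a minimal sketch on simulated data; the variable names `z`, `x`, `y`, `u`, the coefficient values, and the seed are illustrative assumptions and are not part of the wage example.

```r
set.seed(123)                       # illustrative seed
n <- 1000
z <- rnorm(n)                       # instrument
u <- rnorm(n)                       # unobserved confounder
x <- 0.8 * z + u + rnorm(n)         # endogenous regressor (depends on u)
y <- 1 + 0.5 * x + u + rnorm(n)     # outcome; the true slope is 0.5

pi_hat  <- coef(lm(x ~ z))[["z"]]   # first-stage coefficient
rho_hat <- coef(lm(y ~ z))[["z"]]   # reduced-form coefficient

beta_iv  <- rho_hat / pi_hat        # IV estimate as a ratio of two regressions
beta_cov <- cov(z, y) / cov(z, x)   # the same estimate from sample covariances
beta_ols <- coef(lm(y ~ x))[["x"]]  # OLS slope, contaminated by the confounder u

c(iv = beta_iv, cov_ratio = beta_cov, ols = beta_ols)
```

In this design the two IV expressions coincide exactly, while the OLS slope absorbs part of the effect of `u`.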

---

### Two-stage least squares: glance

When we have **more** instruments than endogenous variables,
`$\boldsymbol{\hat{\beta}_{IV}}$` can be computed in 2 steps:

- **Step 1**: Regress each column of `$X$` on all the instruments (`$Z$`, in matrix form). For each column of `$X$`, get the fitted values and combine them into the matrix `$\hat{X}$`.

- **Step 2**: Regress `$Y$` on `$\hat{X}$`.

This procedure is called **two-stage least squares** (**2SLS** or **TSLS**).
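
The two steps can be written directly in matrix form. The sketch below uses simulated data, and every name and number in it (`Z`, `X`, `Y`, the coefficients, the seed) is an assumption made for illustration. It also compares the two-step result with the closed form `$(X'P_Z X)^{-1}X'P_Z Y$`, where `$P_Z = Z(Z'Z)^{-1}Z'$`.

```r
set.seed(42)                                  # illustrative seed
n <- 500
Z <- cbind(1, rnorm(n), rnorm(n))             # constant plus two instruments
u <- rnorm(n)                                 # source of endogeneity
x <- 0.6 * Z[, 2] + 0.4 * Z[, 3] + u + rnorm(n)
X <- cbind(1, x)                              # constant plus one endogenous regressor
Y <- 2 + 1.5 * x + u + rnorm(n)

# Step 1: regress each column of X on Z and collect the fitted values.
# The constant column lies in the span of Z, so its fitted value is itself.
X_hat <- Z %*% solve(crossprod(Z), crossprod(Z, X))

# Step 2: regress Y on X_hat
beta_2sls <- solve(crossprod(X_hat), crossprod(X_hat, Y))

# Closed form with the projection matrix P_Z gives the same numbers
P_Z <- Z %*% solve(crossprod(Z)) %*% t(Z)
beta_closed <- solve(t(X) %*% P_Z %*% X, t(X) %*% P_Z %*% Y)

cbind(two_step = as.numeric(beta_2sls), closed_form = as.numeric(beta_closed))
```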

---

### Two-stage least squares: identification

Consider the model setting

`$$\begin{align}
Y_{i}=\beta_{0}+\sum_{j=1}^{k} \beta_{j} X_{j i}+\sum_{s=1}^{r} \beta_{k+s} W_{s i}+\epsilon_{i}
\end{align}$$`

where
`$\left(X_{1 i}, \ldots, X_{k i}\right)$` are **endogenous regressors**,
`$\left(W_{1 i}, \ldots, W_{r i}\right)$` are **exogenous regressors** and there are
`$m$` **instrumental variables**
`$\left(Z_{1 i}, \ldots, Z_{m i}\right)$` satisfying instrument relevance and instrument exogeneity conditions.

- When `$m=k$`, the coefficients are **exactly identified**.

- When `$m>k$`, the coefficients are **overidentified**.

- When `$m<k$`, the coefficients are **underidentified**.

- Finally, the coefficients can be identified only when `$m \geq k$`.
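
The counting rule above can be expressed as a small helper. `order_condition()` is a hypothetical function written for this slide, not part of any package. It only compares the counts `$m$` and `$k$` and says nothing about whether the instruments are actually relevant or exogenous.

```r
# Hypothetical helper: classify the identification status from the number of
# instruments (m) and the number of endogenous regressors (k).
order_condition <- function(m, k) {
  stopifnot(m >= 0, k >= 1)
  if (m < k) {
    "underidentified"
  } else if (m == k) {
    "exactly identified"
  } else {
    "overidentified"
  }
}

order_condition(m = 1, k = 1)  # one instrument, one endogenous regressor
order_condition(m = 2, k = 1)  # two instruments, one endogenous regressor
order_condition(m = 1, k = 2)  # one instrument, two endogenous regressors
```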

???

Model identification must be settled before any estimation procedure is applied, so we should state the identification status of the model explicitly.

We denote the general model as below.

---

### Two-stage least squares: the procedure

- **Stage 1**: Regress `$X_{1i}$` on a constant, all the instruments `$\left(Z_{1i}, \ldots, Z_{m i}\right)$` and all exogenous regressors `$\left(W_{1i}, \ldots, W_{ri}\right)$` using OLS and obtain the fitted values `$\hat{X}_{1 i}$`. Repeat this for each endogenous regressor to get `$\left(\hat{X}_{1 i}, \ldots, \hat{X}_{k i}\right)$`.

- **Stage 2**: Regress `$Y_{i}$` on a constant, `$\left(\hat{X}_{1 i}, \ldots, \hat{X}_{k i}\right)$` and `$\left(W_{1 i}, \ldots, W_{r i}\right)$` using OLS to obtain `$\left(\hat{\beta}_{0}^{IV}, \hat{\beta}_{1}^{IV}, \ldots, \hat{\beta}_{k+r}^{IV}\right)$`.
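
The two stages can be sketched with `lm()` for the case `$k=1$`, `$r=1$`, `$m=2$`. The data are simulated, and all names (`dat`, `x`, `w`, `z1`, `z2`), coefficient values, and the seed are assumptions for illustration only.

```r
set.seed(2024)                        # illustrative seed
n  <- 800
w  <- rnorm(n)                        # exogenous regressor W
z1 <- rnorm(n)                        # instrument 1
z2 <- rnorm(n)                        # instrument 2
u  <- rnorm(n)                        # source of endogeneity
x  <- 0.5 * z1 + 0.5 * z2 + 0.3 * w + u + rnorm(n)  # endogenous regressor
y  <- 1 + 2 * x - 1 * w + u + rnorm(n)              # true slope on x is 2
dat <- data.frame(y, x, w, z1, z2)

# Stage 1: endogenous regressor on a constant, all instruments,
# and all exogenous regressors
stage1 <- lm(x ~ z1 + z2 + w, data = dat)
dat$x_hat <- fitted(stage1)

# Stage 2: outcome on a constant, the fitted values, and the exogenous regressors
stage2 <- lm(y ~ x_hat + w, data = dat)

coef(stage2)  # point estimates only; the standard errors lm() reports here are not valid
```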

???

So, in the exactly identified and overidentified cases, we can go ahead with **two-stage least squares** as a complete solution for IV estimation.

---

### Two-stage least squares: the solutions

We can carry out the **2SLS** procedure in either of two ways:

- use the **"Step-by-Step solution"**, without variance correction.

- use the **"Integrated solution"**, with variance correction.

**Notice**:

DO NOT use the **"Step-by-Step solution"** in your paper! It is for teaching purposes only.

In the `R` ecosystem, two packages implement the **Integrated solution**:

- We can use the `systemfit` package function `systemfit::systemfit()`.

- Or we may use the `AER` package function `AER::ivreg()`.
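
To see what the variance correction changes, the sketch below computes both versions by hand under homoskedasticity. The data are simulated, and all names, coefficients, and the seed are illustrative assumptions. The step-by-step second stage builds its residuals from `$\hat{X}$`, whereas the 2SLS residuals must be built from the observed `$X$`.

```r
set.seed(7)                            # illustrative seed
n <- 600
z <- rnorm(n)                          # instrument
u <- rnorm(n)                          # source of endogeneity
x <- 0.7 * z + u + rnorm(n)            # endogenous regressor
y <- 1 + 0.5 * x + u + rnorm(n)

x_hat  <- fitted(lm(x ~ z))            # stage 1
stage2 <- lm(y ~ x_hat)                # stage 2 (step-by-step)
b      <- coef(stage2)

# Naive residuals use x_hat; the 2SLS residuals use the observed x
res_naive <- y - (b[1] + b[2] * x_hat)
res_2sls  <- y - (b[1] + b[2] * x)

dof <- n - 2                           # observations minus estimated coefficients
sigma2_naive <- sum(res_naive^2) / dof
sigma2_2sls  <- sum(res_2sls^2)  / dof

# Rescale the stage-2 standard errors with the corrected error variance
se_naive <- sqrt(diag(vcov(stage2)))
se_2sls  <- se_naive * sqrt(sigma2_2sls / sigma2_naive)

rbind(naive = se_naive, corrected = se_2sls)
```

The point estimates are identical in both cases; only the estimated error variance, and therefore the standard errors, differ.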


???

Let us apply these solutions to the empirical wage examples.

---

### Step-by-step solution: stage 1 model

First, let us use `$motheduc$` as the instrument for the endogenous variable `$educ$`.

**Stage 1 of 2SLS**: with mother's education as the instrument

We obtain the fitted variable `$\widehat{educ}$` from the following **stage 1** OLS regression:

`$$\begin{align}
\widehat{educ} = \hat{\gamma}_1 +\hat{\gamma}_2 exper + \hat{\gamma}_3 expersq +\hat{\gamma}_4 motheduc
\end{align}$$`

???

Again, let us demonstrate the two-stage least squares procedure with the wage example.

In the first stage, we obtain the fitted variable `$\widehat{educ}$` from the following OLS regression.

---

### Step-by-step solution: stage 1 OLS estimate (tidy)

Here we obtain the OLS results of **Stage 1 of 2SLS**:

``` r
mod_step1 <- formula(educ~exper + expersq + motheduc)  # model setting
ols_step1 <- lm(formula = mod_step1, data = mroz)  # OLS estimation
```

`$$\begin{equation}
\begin{alignedat}{999}
&\widehat{educ}=&&+9.78&&+0.05exper_i&&-0.00expersq_i&&+0.27motheduc_i\\ 
&(s)&&(0.4239)&&(0.0417)&&(0.0012)&&(0.0311)\\ 
&(t)&&(+23.06)&&(+1.17)&&(-1.03)&&(+8.60)\\ 
&(fit)&&R^2=0.1527&&\bar{R}^2=0.1467 && &&\\ 
&(Ftest)&&F^*=25.47&&p=0.0000 && &&
\end{alignedat}
\end{equation}$$`

The t-value for the coefficient of `$motheduc$` is 8.60, well above 2, so the coefficient is statistically significant. This indicates a strong partial correlation between this instrument and the endogenous variable `$educ$`, even after controlling for the other variables.
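
With a single excluded instrument, the first-stage F statistic for excluding that instrument equals the square of its t statistic (here roughly `$8.60^2 \approx 74$`). The sketch below checks this identity on simulated data; the variable names `x`, `w`, `z`, the coefficients, and the seed are illustrative assumptions and do not come from the `mroz` data.

```r
set.seed(99)                             # illustrative seed
n <- 428
w <- rnorm(n)                            # exogenous control
z <- rnorm(n)                            # single instrument
x <- 1 + 0.3 * w + 0.4 * z + rnorm(n)    # endogenous regressor
sim <- data.frame(x, w, z)

full       <- lm(x ~ w + z, data = sim)  # first stage with the instrument
restricted <- lm(x ~ w, data = sim)      # first stage without the instrument

t_val <- summary(full)$coefficients["z", "t value"]
F_val <- anova(restricted, full)$F[2]    # F statistic for excluding z

c(t_squared = t_val^2, F = F_val)        # the two numbers coincide
```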

???
We should note that the t-value for the coefficient of `$motheduc$` is larger than 2, so the t-test is significant. This means there is a strong correlation between the instrument `$motheduc$` and the endogenous variable `$educ$`, even when we control for all other variables.

---

### Step-by-step solution: stage 1 OLS estimate (output)

Here is the raw `R` output of the stage 1 regression:

``` r
summary(ols_step1)
```

```

Call:
lm(formula = mod_step1, data = mroz)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.4423 -1.2963 -0.0837  1.1761  5.9870

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  9.775103   0.423889  23.061   <2e-16 ***
exper        0.048862   0.041669   1.173    0.242    
expersq     -0.001281   0.001245  -1.029    0.304    
motheduc     0.267691   0.031130   8.599   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.111 on 424 degrees of freedom
Multiple R-squared:  0.1527,	Adjusted R-squared:  0.1467 
F-statistic: 25.47 on 3 and 424 DF,  p-value: 3.617e-15
```

---

### Step-by-step solution: stage 1 OLS predicted values

After the **Stage 1 of 2SLS** regression, we extract the fitted values `$\widehat{educ}$` and add them to a new data set.

``` r
library(dplyr)  # provides %>% and mutate()

mroz_add <- mroz %>%
  # add fitted educ to data set
  mutate(educHat = fitted(ols_step1))
```

<div class="datatables html-widget html-fill-item" id="htmlwidget-6ac7fa90f4adc1d5db31" style="width:100%;height:auto;"></div>
<script type="application/json" data-for="htmlwidget-6ac7fa90f4adc1d5db31">{"x":{"filter":"none","vertical":false,"data":[["1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107","108","109","110","111","112","113","114","115","116","117","118","119","120","121","122","123","124","125","126","127","128","129","130","131","132","133","134","135","136","137","138","139","140","141","142","143","144","145","146","147","148","149","150","151","152","153","154","155","156","157","158","159","160","161","162","163","164","165","166","167","168","169","170","171","172","173","174","175","176","177","178","179","180","181","182","183","184","185","186","187","188","189","190","191","192","193","194","195","196","197","198","199","200","201","202","203","204","205","206","207","208","209","210","211","212","213","214","215","216","217","218","219","220","221","222","223","224","225","226","227","228","229","230","231","232","233","234","235","236","237","238","239","240","241","242","243","244","245","246","247","248","249","250","251","252","253","254","255","256","257","258","259","260","261","262","263","264","265","266","267","268","269","270","271","272","273","274","275","276","277","278","279","280","281","282","283","284","285","286","287","288","289","290","291","292","293","294","295","296","297","298","299","300","301","302","303","304","305","306","307","308","309","310","311","312","313","314","315","316","317","318","319","320","321","322","323","324","325","326","327","328","329","330","331"
,"332","333","334","335","336","337","338","339","340","341","342","343","344","345","346","347","348","349","350","351","352","353","354","355","356","357","358","359","360","361","362","363","364","365","366","367","368","369","370","371","372","373","374","375","376","377","378","379","380","381","382","383","384","385","386","387","388","389","390","391","392","393","394","395","396","397","398","399","400","401","402","403","404","405","406","407","408","409","410","411","412","413","414","415","416","417","418","419","420","421","422","423","424","425","426","427","428"],[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381
,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428],[1.210153698921204,0.3285121023654938,1.514137744903564,0.09212332218885422,1.524272203445435,1.556480050086975,2.120259523391724,2.059634208679199,0.7543363571166992,1.544899344444275,1.401921629905701,1.524272203445435,0.7339532375335693,0.8183690905570984,1.302831172943115,0.2980283796787262,1.167609572410583,1.643839359283447,0.6931471824645996,2.021931648254395,1.254247546195984,1.272957682609558,1.178655028343201,1.178655028343201,0.7675586938858032,1.331811785697937,1.386294364929199,1.553269624710083,1.981814861297607,1.769360423088074,0.430807888507843,0.8997548222541809,1.766629695892334,1.272957682609558,1.336788892745972,0.9017048478126526,0.8651236891746521,1.511847138404846,1.72602915763855,2.683142423629761,0.9852942824363708,1.365938544273376,0.9450336694717407,1.512376189231873,0.6931471824645996,1.244788408279419,0.7011649012565613,1.519863247871399,0.8209685683250427,0.9698315262794495,0.828508198261261,0.09430964291095734,0.1625438928604126,0.4700036346912384,0.6292484402656555,1.397160172462463,2.265443801879883,2.084541082382202,1.525838851928711,0.762160062789917,1.48160457611084,1.262826442718506,0.9996755719184875,1.832581520080566,2.479307651519775,1.279015302658081,1.937935590744019,1.070452809333801,1.12392258644104,1.321755886077881,1.744999766349792,1.301743626594543,1.641866445541382,2.107020139694214,1.46706759929657,1.605811357498169,-1.029739379882812,1.08768618106842,0,0.9382086992263794,-0.1505903750658035,0,1.073670506477356,1.265848398208618,0.4863689839839935,2.120259523391724,1.129852533340454,0.9932518005371094,1.658627986907959,0.3474121987819672,1.568324208259583,0.5108456015586853,0.1148454323410988,-0.6931471824645996,-0.3364522755146027,1.028225541114807,1.58068859577179,0.5558946132659912,0.9014207124710083,0.8843045830726624,0.42820
45960426331,1.058415055274963,0.8783395886421204,1.654908299446106,1.321755886077881,0.3285121023654938,1.386294364929199,1.172884583473206,1.224187135696411,0.2876570820808411,2.23026180267334,1.504077434539795,1.531152009963989,1.375157594680786,1.760268807411194,-0.6931471824645996,1.406489133834839,1.791759490966797,1.299292087554932,1.351003885269165,1.016280889511108,1.075343608856201,1.478964686393738,1.689486742019653,2.288597822189331,-1.822631120681763,-0.9607651829719543,1.290994167327881,0.8648711442947388,1.540452122688293,0.6162121295928955,1.648658633232117,1.193498134613037,2.143976211547852,0.7244035601615906,0.9416075348854065,0.7827593684196472,1.832581520080566,1.203962802886963,1.491644859313965,1.892132639884949,2.130894899368286,1.48060405254364,0.8943313360214233,0.2025325447320938,0.4855078160762787,1.098612308502197,1.553269624710083,0.1215979680418968,2.001804351806641,1.495036602020264,0.9052298069000244,0.6325475573539734,1.386294364929199,2.102913856506348,1.959643959999084,0.5108456015586853,1.236923933029175,1.443312525749207,1.021659255027771,0.6361534595489502,1.616453289985657,0.2231435477733612,1.049807071685791,1.415051937103271,0.5753766298294067,2.60668158531189,1.517914533615112,0.7550415992736816,1.094972372055054,0.9421143531799316,1.724942803382874,1.031546115875244,0.474369078874588,0.8109301924705505,0.7092666029930115,1.710549473762512,0.4602688848972321,1.331811785697937,1.098612308502197,2.157998561859131,1.437581300735474,1.544899344444275,1.41059672832489,3.218875885009766,0.9681618809700012,1.791759490966797,1.688729524612427,-0.4091719686985016,0.2231435477733612,0.8221558332443237,1.24170196056366,1.427124381065369,1.497097492218018,0.5596157908439636,1.300028204917908,1.884429812431335,0.9555113911628723,1.582087278366089,1.755614042282104,1.513103246688843,2.251891613006592,2.364432334899902,0.1053504794836044,1.399728775024414,0.988462507724762,1.090647339820862,1.154614448547363,1.266947627067566,2.88519167900
0854,1.228880047798157,1.203962802886963,1.357380270957947,0.8377236127853394,0.5369611382484436,0.7487238049507141,2.295872688293457,1.107803225517273,0.6208452582359314,-2.054163694381714,1.892012000083923,1.729724526405334,0.4693784117698669,0.9808416962623596,2.069492340087891,1.675188183784485,1.386294364929199,1.799214959144592,1.832581520080566,1.090647339820862,1.443123579025269,1.250360131263733,1.602312564849854,1.018558502197266,1.297053217887878,1.685194492340088,-0.4209848940372467,1.562094688415527,2.146527528762817,2.347462892532349,0.9698315262794495,1.924146413803101,1.62672758102417,-0.03926072642207146,1.460148692131042,1.955393552780151,0.9263598918914795,2.066191673278809,1.422843217849731,2.10103178024292,2.261461019515991,0.7013137936592102,2.031012535095215,1.162369251251221,0.4700036346912384,1.41059672832489,0.3930551111698151,1.290994167327881,0,0.9571254849433899,0.5596157908439636,1.568615913391113,1.710187911987305,1.41059672832489,0.2231435477733612,0.5108456015586853,1.332392454147339,0.8601858615875244,2.322779893875122,1.91959547996521,1.976106762886047,0.8954347372055054,0.18123759329319,0.4953058362007141,0.5777924060821533,1.07881772518158,1.603198528289795,0.6208452582359314,2.083894014358521,1.379169106483459,1.112383723258972,1.067121624946594,1.118806958198547,1.588541030883789,1.390311241149902,1.714806437492371,0.2010615319013596,0.9872710108757019,0.9835006594657898,2.233170747756958,1.143617510795593,-0.6113829016685486,2.153052091598511,1.299837350845337,0.8409204483032227,1.058484435081482,1.152658462524414,1.293575882911682,1.832581520080566,2.327180147171021,1.166146278381348,2.034993171691895,0.6792510747909546,1.547136902809143,0.7530185580253601,0.8472836017608643,0.8711259961128235,0.2282504737377167,0.08965782821178436,1.321755886077881,1.196101903915405,1.636118769645691,1.892012000083923,1.518308997154236,2.472159147262573,1.321755886077881,1.473641037940979,1.369478821754456,1.203962802886963,1.198729157447815
,1.270209908485413,0.4700036346912384,0.7999816536903381,1.565945625305176,1.758978009223938,0.858025848865509,0.6931471824645996,0.6418538689613342,1.633740186691284,1.703747630119324,1.844004034996033,1.966118812561035,0.8649974465370178,0.9333052039146423,0.7792331576347351,0.9555113911628723,1.316247344017029,1.475906491279602,1.491397261619568,1.455750465393066,0.5108456015586853,1.180438041687012,1.688489437103271,0.7907274961471558,1.401798605918884,-0.4335560202598572,1.683171510696411,-1.766676664352417,3.155595064163208,2.259521007537842,1.306926369667053,0.7984976768493652,0.5590441823005676,0.1479026228189468,1.944494843482971,1.378337860107422,3.064745187759399,-0.7419173121452332,0.7657003998756409,0.619392991065979,1.465452075004578,2.18925952911377,1.021659255027771,0.9770094752311707,0.9162907600402832,2.905096054077148,-0.1996711939573288,0.6931471824645996,2.733392953872681,1.868334650993347,2.120259523391724,1.515193223953247,0.9146093130111694,1.499556064605713,0.803077220916748,0.7280316352844238,0.5164099931716919,1.22644829750061,0.9162907600402832,1.376471281051636,1.828974962234497,1.368283152580261,1.064710736274719,1.406489133834839,1.047318935394287,1.948093414306641,1.078001379966736,0.6539384722709656,1.927891612052917,1.361027836799622,0.6931471824645996,1.604686617851257,0.1839036494493484,3.113515377044678,1.926829218864441,1.2701256275177,0.6826927065849304,1.68106997013092,0.5562959909439087,1.628220438957214,0.9162907600402832,1.341558456420898,0,1.122231245040894,0.5401707887649536,1.391505718231201,1.697173953056335,3.218875885009766,0.871167778968811,1.167329549789429,1.216987729072571,0.5753766298294067,1.151615738868713,0.9942512512207031,0.5263249278068542,-1.543182134628296,1.91204309463501,0.554287314414978,0.9162907600402832,1.500939130783081,0.9446837902069092,1.241268634796143,1.564984321594238,0.8380264639854431,1.668857097625732,1.769428610801697,1.22644829750061,1.406489133834839],[12,12,12,12,14,12,16,12,12,12,12,1
1,12,12,10,11,12,12,12,12,16,12,13,12,12,17,12,12,17,12,11,16,13,12,16,11,12,10,14,17,12,12,16,12,12,12,16,12,12,12,12,12,12,8,10,16,14,17,14,12,14,12,8,12,12,8,17,12,12,12,12,12,9,10,12,12,12,17,15,12,6,14,12,14,9,17,13,9,15,12,12,12,12,12,12,12,12,13,12,13,12,12,12,16,12,13,11,12,12,12,17,14,16,17,12,11,12,12,17,10,13,11,12,16,17,12,16,12,16,8,12,12,12,13,11,12,12,14,12,12,12,17,14,12,9,12,12,12,14,16,17,15,12,16,17,17,12,16,13,12,11,16,14,16,12,9,17,14,12,12,11,12,12,10,12,5,17,11,12,12,14,11,12,14,12,10,16,13,12,12,12,11,12,9,13,12,12,12,13,16,12,16,17,12,12,9,12,12,13,12,12,12,12,10,12,16,12,11,12,10,12,12,12,12,16,17,12,17,12,12,12,8,12,13,12,12,8,12,17,17,12,13,12,12,12,12,9,10,12,16,13,8,16,13,12,11,13,12,12,10,12,17,15,16,10,11,12,12,14,16,14,8,7,12,12,14,12,12,12,14,16,12,12,12,13,13,10,12,12,12,12,14,17,10,9,12,12,16,12,17,12,17,11,16,11,13,11,8,11,12,10,17,12,12,17,14,12,12,12,12,12,12,9,10,12,12,12,12,12,17,12,17,12,10,12,12,12,12,12,12,16,13,13,12,16,17,12,14,12,17,12,14,12,12,17,16,16,12,9,12,12,16,14,12,12,11,12,16,17,17,14,12,14,12,10,12,13,16,12,7,16,14,12,10,12,16,10,12,14,12,6,15,12,17,14,13,6,16,14,15,14,8,14,12,12,12,12,12,12,8,12,17,12,12,14,13,17,8,12,11,12,12,17,10,12,13,12,12],[14,5,15,6,7,33,11,35,24,21,15,14,0,14,6,9,20,6,23,9,5,11,18,15,4,21,31,9,7,7,32,11,16,14,27,0,17,28,24,11,1,14,6,10,6,4,10,22,16,6,12,32,15,17,34,9,37,10,35,6,19,10,11,15,12,12,14,11,9,24,12,13,29,11,13,19,2,24,9,6,22,30,10,6,29,29,36,19,8,13,16,11,15,6,13,22,24,2,6,2,2,14,9,11,9,6,19,26,19,3,7,28,13,9,15,20,29,9,1,8,19,23,3,13,8,17,4,15,11,7,0,0,10,8,2,4,6,18,3,22,33,28,23,27,11,6,11,14,17,17,14,11,7,8,6,8,4,25,24,11,19,9,19,14,22,6,23,15,6,11,2,22,10,14,12,9,13,18,8,11,9,9,14,9,2,12,15,11,7,9,19,11,8,13,4,7,19,14,14,3,9,7,7,14,29,19,14,16,10,12,24,6,9,14,26,7,4,15,23,1,29,9,6,11,17,6,7,2,24,4,11,25,11,2,19,7,2,20,10,19,17,12,11,6,10,4,2,13,21,9,4,2,19,4,9,14,6,24,1,13,3,10,16,9,19,4,10,5,7,3,38,16,13,1,7,15,10,2,19,25,25,7,15,11,25,19,4,14,19,18,14,11,4,29,21,24,19
,31,28,15,27,13,4,10,8,4,18,3,11,8,10,33,19,35,21,7,18,4,12,16,14,3,1,27,12,6,9,2,6,9,16,22,26,11,11,15,13,6,20,17,8,13,15,14,14,6,24,10,2,9,23,12,8,16,10,7,19,2,9,14,9,16,7,6,22,9,9,14,17,12,13,8,10,16,1,6,4,8,4,15,7,14,16,15,23,19,4,12,12,25,14,14,11,7,18,4,37,13,14,17,5,2,0,3,21,20,19,4,19,11,14,8,13,24,1,1,3,4,21,10,13,9,14,2,21,22,14,7],[196,25,225,36,49,1089,121,1225,576,441,225,196,0,196,36,81,400,36,529,81,25,121,324,225,16,441,961,81,49,49,1024,121,256,196,729,0,289,784,576,121,1,196,36,100,36,16,100,484,256,36,144,1024,225,289,1156,81,1369,100,1225,36,361,100,121,225,144,144,196,121,81,576,144,169,841,121,169,361,4,576,81,36,484,900,100,36,841,841,1296,361,64,169,256,121,225,36,169,484,576,4,36,4,4,196,81,121,81,36,361,676,361,9,49,784,169,81,225,400,841,81,1,64,361,529,9,169,64,289,16,225,121,49,0,0,100,64,4,16,36,324,9,484,1089,784,529,729,121,36,121,196,289,289,196,121,49,64,36,64,16,625,576,121,361,81,361,196,484,36,529,225,36,121,4,484,100,196,144,81,169,324,64,121,81,81,196,81,4,144,225,121,49,81,361,121,64,169,16,49,361,196,196,9,81,49,49,196,841,361,196,256,100,144,576,36,81,196,676,49,16,225,529,1,841,81,36,121,289,36,49,4,576,16,121,625,121,4,361,49,4,400,100,361,289,144,121,36,100,16,4,169,441,81,16,4,361,16,81,196,36,576,1,169,9,100,256,81,361,16,100,25,49,9,1444,256,169,1,49,225,100,4,361,625,625,49,225,121,625,361,16,196,361,324,196,121,16,841,441,576,361,961,784,225,729,169,16,100,64,16,324,9,121,64,100,1089,361,1225,441,49,324,16,144,256,196,9,1,729,144,36,81,4,36,81,256,484,676,121,121,225,169,36,400,289,64,169,225,196,196,36,576,100,4,81,529,144,64,256,100,49,361,4,81,196,81,256,49,36,484,81,81,196,289,144,169,64,100,256,1,36,16,64,16,225,49,196,256,225,529,361,16,144,144,625,196,196,121,49,324,16,1369,169,196,289,25,4,0,9,441,400,361,16,361,121,196,64,169,576,1,1,9,16,441,100,169,81,196,4,441,484,196,49],[7,7,7,7,14,7,7,3,7,7,3,7,16,10,7,10,7,12,7,7,16,10,3,7,7,14,7,7,12,12,7,3,10,14,12,3,3,3,7,17,12,9,16,3,7,7,16,10,7,7,7,3,7,7,3,12,7,1
7,7,7,3,12,7,7,7,12,16,7,7,7,12,10,9,0,10,14,7,3,12,12,7,17,3,7,7,12,7,7,12,10,0,12,10,7,7,7,3,12,7,12,7,7,10,14,7,12,7,7,10,7,12,7,7,17,7,7,7,10,10,12,7,12,7,10,7,10,7,7,7,7,7,7,16,12,7,3,7,7,7,12,7,12,12,14,7,7,7,12,12,14,10,12,7,16,7,17,3,10,9,7,3,16,12,7,7,7,12,3,7,7,7,7,10,10,7,12,17,10,7,7,12,7,12,7,7,7,7,14,7,12,7,7,12,3,7,12,12,7,7,14,12,17,17,7,7,10,7,7,7,3,0,7,12,7,7,12,7,7,12,7,7,3,7,7,10,12,7,12,7,7,7,7,12,7,7,7,7,7,14,17,7,10,7,7,12,7,7,9,7,14,7,3,16,3,16,7,16,12,7,7,12,12,12,16,7,9,7,12,12,10,7,12,7,7,3,10,17,7,3,3,12,7,7,7,10,7,10,7,9,9,12,12,12,7,9,12,7,12,7,12,12,14,7,14,10,12,7,7,12,7,7,12,12,12,12,14,10,7,7,7,7,7,7,7,7,7,12,12,7,14,10,12,7,7,12,7,10,12,10,7,14,7,12,12,7,16,0,12,7,17,7,12,3,7,14,7,12,12,7,7,14,7,12,12,7,7,7,7,3,12,7,7,14,10,10,10,10,12,3,7,16,7,7,0,7,7,10,7,12,7,7,7,7,16,12,10,7,7,16,7,7,7,12,10,10,10,7,10,12,12,7,17,12,16,10,16,7,7,9,3,7,7,7,7,7,7,16,12],[12,7,12,7,12,14,14,3,7,7,12,14,16,10,7,16,10,12,7,12,10,12,7,7,12,16,3,3,12,12,7,3,12,7,12,10,3,10,7,14,12,9,14,3,12,12,14,10,7,12,7,7,12,7,7,12,7,17,17,12,14,12,7,7,7,12,12,12,7,12,12,10,7,0,7,12,7,3,10,7,12,12,7,7,7,7,7,7,7,10,7,12,10,12,7,7,7,14,7,12,12,7,7,14,12,10,7,7,7,7,12,7,12,10,10,7,7,7,12,7,7,12,14,12,7,10,7,7,12,10,7,7,12,10,7,12,7,7,7,7,3,12,16,7,3,12,7,12,12,16,12,12,7,14,7,10,7,14,7,7,12,12,17,7,7,3,12,7,7,7,3,7,10,10,12,7,14,10,7,7,10,12,12,7,7,7,12,7,12,12,12,10,12,10,12,12,12,7,12,12,12,12,16,7,16,7,7,10,12,10,0,7,12,12,10,12,3,7,12,10,7,7,7,7,12,12,7,12,7,10,10,7,12,17,7,7,7,7,12,14,7,12,7,7,16,7,10,12,7,16,10,3,16,7,12,7,7,7,12,12,7,10,14,16,7,10,7,14,14,12,7,7,3,7,7,7,12,10,7,3,12,7,12,7,10,7,0,7,10,9,12,12,12,3,9,12,12,14,7,12,12,12,7,12,10,12,7,7,3,12,7,16,12,12,7,14,7,7,12,10,7,3,7,7,10,7,7,12,12,12,10,14,7,7,14,10,10,7,7,7,14,7,12,14,14,14,0,16,7,12,7,10,7,10,10,14,7,12,7,7,12,14,12,7,7,7,12,16,3,16,7,16,7,10,10,10,10,12,10,7,16,7,7,7,7,7,10,10,12,7,7,7,7,14,14,7,7,7,16,12,7,7,12,12,12,10,3,12,12,12,7,16,12,10,10,12,7,7,7,7,7,7,7,7,7,7,12,12],[13.42036466
548422,11.86121922986205,13.43207528132518,11.89598901522093,13.26665071609442,13.74012376382735,13.90524165646517,10.71902302758776,12.08372093085125,12.1100802020396,13.43207528132518,13.95574628366802,14.05815563570166,12.88498304729926,11.89598901522094,14.39414287344911,12.91681479315288,13.23444306068189,12.09506948452707,13.32337963708035,12.66429165713861,13.36986003828079,12.1133803037106,12.09362123586423,13.16234136001772,14.51929748386931,10.86177818025326,10.91416235525064,13.26665071609442,13.26665071609442,11.90069582337521,10.96064275645108,13.44122376722034,12.08191062002269,13.37275653560648,12.45201078114852,11.03859284133942,12.81577784396079,12.08372093085125,13.90524165646517,13.03497283442366,12.61729223820707,13.76982467886627,10.93868362082405,13.23444306068189,13.16234136001772,13.88328252083813,12.9069283355331,12.10276972175939,13.23444306068189,12.05080299850049,11.90069582337521,13.43207528132518,12.10935607770818,11.82931824704246,13.32337963708035,11.70303590794554,14.6863549481147,14.46669435487842,13.23444306068189,13.98867806341197,13.34790090265376,12.03140599281983,12.09362123586423,12.05080299850049,13.38925704396144,13.42036466548364,13.36986003828079,11.9849255916194,13.4221749763122,13.38925704396144,12.87071030151135,11.98854621327652,10.15757032917451,12.06763787423478,13.45329644522759,11.74153709410711,11.01295769448249,12.78799801889597,11.89598901522094,13.44230995371748,13.30027892538341,12.0094468571928,11.89598901522094,11.98854621327652,11.98854621327652,11.74769215092422,12.11484239976664,11.95784219609962,12.87071030151135,12.10276972175939,13.36986003828079,12.8966936631408,13.23444306068189,12.06763787423478,12.10385590825653,12.08372093085125,13.61537275775244,11.89598901522094,13.07999113956806,13.07999113956806,12.08191062002269,11.9849255916194,13.90524165646517,13.32337963708035,12.69906144249751,12.11484239976664,12.05333743366048,12.11484239976664,11.78399326930513,13.26665071609442,12.01270541668422,13.4
0609191969573,12.78799801889597,12.8966936631408,12.11374236587631,11.98854621327652,11.9849255916194,13.03497283442366,11.95784219609962,12.11484239976664,13.43352352998803,13.65782893295046,13.40609191969573,11.95784219609962,12.91242850498475,11.82388731455677,12.09362123586423,13.36986003828079,12.73126909791004,11.64893835387195,11.64893835387195,13.34790090265376,12.76091462337619,11.74153709410711,13.16234136001772,11.89598901522094,12.1133803037106,11.78399326930513,12.10385590825653,10.79552486381326,13.35115946214517,14.50428676635678,12.03430249014554,10.96064275645108,13.23444306068189,12.03140599281984,13.42036466548364,13.44781012316913,14.51857335953789,13.42036466548364,13.36986003828079,11.92819667063347,13.83167785974495,11.89598901522094,12.76091462337619,11.82388731455677,13.94364591087438,12.08372093085125,12.03140599281984,13.45329644522759,13.32337963708035,14.79175049068854,12.08191062002269,12.10385590825653,10.82522577885218,13.43352352998803,12.09362123586423,11.89598901522094,12.03140599281984,10.67077385773835,12.10385590825653,12.81251928446937,12.88498304729926,13.38925704396144,11.9849255916194,13.94147353788011,12.91645273098717,11.95784219609962,12.03140599281984,12.78799801889597,13.32337963708035,13.42036466548364,11.9849255916194,11.74153709410711,12.05080299850049,13.43207528132518,12.03140599281984,13.26665071609442,13.32337963708035,13.45329644522759,12.83447842009641,13.29629624156057,12.87071030151135,13.16234136001772,13.26665071609442,13.45329644522759,12.08191062002269,13.42036466548364,13.12244731476608,13.32337963708035,13.26665071609442,14.33741395246318,12.08191062002269,14.39776349510623,12.11484239976664,12.08191062002269,12.90584214903596,13.34790090265376,12.85387542577706,10.20988526720592,11.89598901522094,13.32337963708035,13.42036466548364,12.85640986093705,13.26665071609442,10.75312407818801,12.09362123586423,13.43352352998803,12.49959121623928,11.98854621327652,11.9849255916194,11.89598901522094,12.031405992
81984,13.44781012316913,13.23444306068189,11.92819667063347,13.07999113956806,12.08372093085125,12.62695974183334,12.83447842009641,12.06981024722905,13.36986003828079,14.41844518502901,12.11484239976664,11.92819667063347,11.74153709410711,12.11374236587631,13.34790090265376,13.98867806341197,12.10935607770818,13.38925704396144,12.03140599281984,11.89598901522094,14.41866413902251,11.82388731455677,12.54460952138368,13.40609191969573,12.1100802020396,14.39414287344911,12.62695974183334,10.67077385773835,14.52405968159635,11.82388731455677,13.32337963708035,12.08191062002269,11.89598901522094,12.08372093085125,13.03497283442366,13.40609191969573,11.78399326930513,12.81251928446937,13.97660538540472,14.39414287344911,12.11484239976664,12.62695974183334,12.0094468571928,13.73505489350737,13.8020323342788,13.12244731476608,11.65581753502048,12.10276972175939,10.99687463786602,11.69651878896271,11.92819667063347,12.09362123586423,13.34790090265376,12.54460952138368,12.11484239976664,10.99904701086029,13.40826429269,11.92819667063347,13.43207528132518,12.03140599281984,12.87288267450562,12.11484239976664,9.950051650911444,12.08191062002269,12.91791482704321,12.64876192189498,13.42036466548364,13.36986003828079,13.16234136001772,10.91778297690776,12.64546182022398,13.4221749763122,13.45329644522759,13.80637708026735,12.01270541668422,13.43207528132518,13.37275653560648,13.40609191969573,11.82388731455677,13.34790090265376,12.76091462337619,13.16234136001772,12.1133803037106,11.78399326930513,10.96064275645108,13.29629624156057,12.00944685719281,14.27550538201173,13.45329644522759,13.12824030941747,12.1100802020396,13.8020323342788,12.1133803037106,11.82388731455677,13.38925704396144,12.90584214903596,12.08191062002269,10.71323003293637,11.69651878896271,12.03430249014554,12.85387542577706,11.89598901522094,11.9849255916194,13.07999113956806,13.23444306068189,13.32337963708035,12.90584214903596,13.97769157190186,12.05333743366048,12.03140599281984,13.90524165646517,12.89669
36631408,12.87071030151135,11.89598901522094,12.11374236587631,12.10935607770818,13.83167785974495,12.06763787423478,13.43207528132518,13.95574628366802,13.95574628366802,13.76982467886627,10.20988526720592,14.41866413902251,11.74153709410711,13.32337963708035,12.09506948452707,12.85387542577706,11.95784219609962,12.90584214903596,12.81251928446937,13.8020323342788,12.11484239976664,13.07999113956806,11.9849255916194,12.08191062002269,13.32337963708035,13.97660538540472,13.26665071609442,11.89598901522094,12.10385590825653,11.9849255916194,13.32337963708035,14.4911279018524,11.03859284133942,14.4600202803302,12.06763787423478,14.36705947792933,12.0094468571928,12.90584214903596,12.49959121623928,12.69906144249751,12.62695974183334,13.29629624156057,12.62695974183334,12.09362123586423,14.33741395246318,12.08191062002269,12.10276972175939,12.09362123586423,12.09506948452707,12.11484239976664,12.62695974183334,12.85387542577706,13.38925704396144,12.06981024722905,12.08191062002269,12.08191062002269,12.03140599281984,13.8020323342788,13.98721596735593,11.82388731455677,11.70303590794554,12.06763787423478,14.4911279018524,13.44781012316913,11.86121922986204,11.74153709410711,12.9873923993329,13.12244731476608,13.44853424750055,12.91681479315288,11.04407916339788,13.16234136001772,13.45329644522759,13.36986003828079,12.08191062002269,14.36705947792933,13.40609191969573,12.88679335812782,12.49959121623928,13.03497283442366,11.78399326930513,11.82388731455677,12.1100802020396,12.0094468571928,12.06763787423478,11.9849255916194,12.08191062002269,11.74153709410711,12.1100802020396,12.10385590825653,13.42036466548364,13.26665071609442]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>id<\/th>\n      <th>lwage<\/th>\n      <th>educ<\/th>\n      <th>exper<\/th>\n      <th>expersq<\/th>\n      <th>fatheduc<\/th>\n      <th>motheduc<\/th>\n      <th>educHat<\/th>\n    <\/tr>\n  
<\/thead>\n<\/table>","options":{"pageLength":5,"dom":"tip","columnDefs":[{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 2, 3, \",\", \".\", null);\n  }"},{"targets":8,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 2, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[1,2,3,4,5,6,7,8]},{"orderable":false,"targets":0},{"name":" ","targets":0},{"name":"id","targets":1},{"name":"lwage","targets":2},{"name":"educ","targets":3},{"name":"exper","targets":4},{"name":"expersq","targets":5},{"name":"fatheduc","targets":6},{"name":"motheduc","targets":7},{"name":"educHat","targets":8}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[5,10,25,50,100]}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render"],"jsHooks":[]}</script>

---

### Step-by-step solution: stage 2 model

**Stage 2 of 2SLS**: with mother education as instrument

In the second stage, we regress log(wage) on the fitted values `$\widehat{educ}$` from stage 1, together with experience (`exper`) and its square (`expersq`).

`$$\begin{align}
lwage = \hat{\beta}_1 +\hat{\beta}_2\widehat{educ} + \hat{\beta}_3exper +\hat{\beta}_4expersq + \hat{\epsilon}
\end{align}$$`

``` r
mod_step2 <- formula(lwage~educHat + exper + expersq)
ols_step2 <- lm(formula = mod_step2, data = mroz_add)
```

---

### Step-by-step solution: stage 2 OLS estimate(tidy)

Using the new data set (`mroz_add`), the results of the explicit 2SLS procedure are shown below.
`$$\begin{equation}
\begin{alignedat}{999}
&\widehat{lwage}=&&+0.20&&+0.05educHat_i&&+0.04exper_i&&-0.00expersq_i\\ 
&(s)&&(0.4933)&&(0.0391)&&(0.0142)&&(0.0004)\\ 
&(t)&&(+0.40)&&(+1.26)&&(+3.17)&&(-2.17)\\ 
&(fit)&&R^2=0.0456&&\bar{R}^2=0.0388 && &&\\ 
&(Ftest)&&F^*=6.75&&p=0.0002 && &&
\end{alignedat}
\end{equation}$$`

Keep in mind, however, that the **standard errors** calculated in this way are incorrect (Why?).
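
One way to see why, sketched in the notation of the stage 2 model (the symbols `$n$`, `$k$`, `$\hat{\sigma}^2_{naive}$` and `$\hat{\sigma}^2_{2SLS}$` are introduced here only for illustration): the stage 2 OLS run builds its residuals from `$\widehat{educ}$`, whereas the 2SLS residuals should be built from the observed `$educ$` with the same coefficients.

`$$\begin{align}
\hat{\sigma}^2_{naive} &= \frac{1}{n-k}\sum_{i}\left(lwage_i - \hat{\beta}_1 - \hat{\beta}_2\widehat{educ}_i - \hat{\beta}_3exper_i - \hat{\beta}_4expersq_i\right)^2 \\
\hat{\sigma}^2_{2SLS} &= \frac{1}{n-k}\sum_{i}\left(lwage_i - \hat{\beta}_1 - \hat{\beta}_2educ_i - \hat{\beta}_3exper_i - \hat{\beta}_4expersq_i\right)^2
\end{align}$$`

Here `$n=428$` and `$k=4$`. Both variance estimates multiply the same matrix built from the stage 2 regressors, so each naive standard error differs from the 2SLS one by the factor `$\hat{\sigma}_{naive}/\hat{\sigma}_{2SLS}$`.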


???
The t-test on the coefficient of education is not significant, because the t statistic (1.26) is below the rule-of-thumb critical value of 2. The model F-test, however, is significant, with a small p-value (0.0002).
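
---

### (Supplements) Step-by-step solution: correcting the standard errors

The explicit stage 2 regression computes its residuals with `$\widehat{educ}$`, while 2SLS standard errors should use residuals computed with the observed `$educ$`. The following is a minimal sketch of the rescaling in base `R`. It runs on simulated data: the data-generating process, the object `sim`, and names such as `sigma2_2sls` are assumptions made for illustration only, not part of the `mroz` analysis.

``` r
# Simulated data (assumed DGP): educ is endogenous through the unobserved
# 'ability'; motheduc is a valid instrument by construction
set.seed(123)
n <- 500
motheduc <- rnorm(n, mean = 9, sd = 3)
exper    <- runif(n, min = 0, max = 30)
ability  <- rnorm(n)
educ     <- 8 + 0.3 * motheduc + ability + rnorm(n)
lwage    <- 0.1 + 0.06 * educ + 0.04 * exper - 0.001 * exper^2 +
  0.3 * ability + rnorm(n, sd = 0.5)
sim <- data.frame(lwage, educ, exper, expersq = exper^2, motheduc)

# Stage 1 and stage 2, as in the step-by-step routine
sim$educHat <- fitted(lm(educ ~ exper + expersq + motheduc, data = sim))
ols_step2   <- lm(lwage ~ educHat + exper + expersq, data = sim)

# Naive residual variance: residuals are built with educHat
sigma2_naive <- summary(ols_step2)$sigma^2

# 2SLS residual variance: same coefficients, but the observed educ
X_actual    <- model.matrix(~ educ + exper + expersq, data = sim)
res_2sls    <- sim$lwage - as.vector(X_actual %*% coef(ols_step2))
sigma2_2sls <- sum(res_2sls^2) / df.residual(ols_step2)

# Rescale the naive standard errors
se_naive <- sqrt(diag(vcov(ols_step2)))
se_2sls  <- se_naive * sqrt(sigma2_2sls / sigma2_naive)
round(cbind(se_naive, se_2sls), 4)
```

The same rescaling is consistent with the reports in these slides: the residual standard error is 0.709 in the explicit stage 2 output and 0.6796 in the `ivreg()` output, and the standard error of the education coefficient changes from 0.0391 to 0.0374.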

---

### Step-by-step solution: stage 2 OLS estimate(output)

The raw `R` output of the stage 2 regression is shown below:

``` r
summary(ols_step2)
```

```

Call:
lm(formula = mod_step2, data = mroz_add)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.15299 -0.34773  0.02906  0.39023  2.35624

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.1981861  0.4933427   0.402  0.68809   
educHat      0.0492630  0.0390562   1.261  0.20788   
exper        0.0448558  0.0141644   3.167  0.00165 **
expersq     -0.0009221  0.0004240  -2.175  0.03019 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.709 on 424 degrees of freedom
Multiple R-squared:  0.04559,	Adjusted R-squared:  0.03884 
F-statistic: 6.751 on 3 and 424 DF,  p-value: 0.0001861
```

---

### Integrated solution:  the whole story

We need an **integrated solution** for the following reasons:

- We should obtain the correct standard errors for testing and inference.

- We should avoid the tedious steps of the step-by-step routine. When the model contains more than one endogenous regressor and many instruments are available, the step-by-step solution becomes extremely tedious.

---

### Integrated solution: the `R` toolbox

In the `R` ecosystem, two packages can carry out the integrated solution:

- The `systemfit` package, with the function `systemfit::systemfit()`.

- The `AER` package, with the function `AER::ivreg()`.

Both tools carry out the integrated solution and adjust the variance of the estimators automatically.

---

### Rscript: `AER::ivreg()` for IV (m)

---

### Integrated solution: `motheduc` IV model

To get the correct standard errors, we need to use the **"integrated solution"** for 2SLS and carry out the estimation with dedicated software tools.

First, let's consider using `$motheduc$` as the only instrument for `$educ$`.

`$$\begin{cases}
  \begin{align}
  \widehat{educ} &= \hat{\gamma}_1 +\hat{\gamma}_2exper + \hat{\gamma}_3expersq +\hat{\gamma}_4motheduc  && \text{(stage 1)}\\
  lwage & = \hat{\beta}_1 +\hat{\beta}_2\widehat{educ} + \hat{\beta}_3exper +\hat{\beta}_4expersq + \hat{\epsilon}  && \text{(stage 2)}
  \end{align}
\end{cases}$$`

---

### Integrated solution: `motheduc` IV results

<div class="datatables html-widget html-fill-item" id="htmlwidget-d0b3807cef7d80b1d6f9" style="width:100%;height:auto;"></div>
<script type="application/json" data-for="htmlwidget-d0b3807cef7d80b1d6f9">{"x":{"filter":"none","vertical":false,"caption":"<caption>2SLS result (`motheduc` as instrument)<\/caption>","data":[["1","2","3","4","5","6","7","8"],["eq1","eq1","eq1","eq1","eq2","eq2","eq2","eq2"],["(Intercept)","exper","expersq","motheduc","(Intercept)","educ","exper","expersq"],[9.775102690226637,0.04886150006395119,-0.001281064973186706,0.2676908090921895,0.1981860564727007,0.04926295335038281,0.04485584787359646,-0.0009220761624694296],[0.4238886153570713,0.04166926042173844,0.001244905624400281,0.03112979662060363,0.4728772295378676,0.0374360256307117,0.0135768173487378,0.000406381308328036],[23.06054547370275,1.17260300685492,-1.029045855426867,8.599182717275291,0.4191067873290992,1.31592369970942,3.303855883261669,-2.268992553479153],[0,0.2416134223719495,0.3040447864773395,0,0.6753503302668307,0.1889106699097165,0.001034570786957012,0.02377054666535239]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>eq<\/th>\n      <th>vars<\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>t value<\/th>\n      <th>Pr(>|t|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":10,"dom":"t","columnDefs":[{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":5,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":6,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? 
data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[3,4,5,6]},{"orderable":false,"targets":0},{"name":" ","targets":0},{"name":"eq","targets":1},{"name":"vars","targets":2},{"name":"Estimate","targets":3},{"name":"Std. Error","targets":4},{"name":"t value","targets":5},{"name":"Pr(>|t|)","targets":6}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>

- The t-test for variable `educ` is not significant (p-value ≈ 0.19, greater than 0.05).

**Note**: The corresponding `R` code is given on the following slides. The table is built from the report of the `systemfit::systemfit()` function.

---

### (Supplements) R code (m): `systemfit::systemfit()`

The R code using `systemfit::systemfit()` is as follows:

``` r
# load pkg
require(systemfit)
# set two models
eq_1 <- educ ~  exper + expersq + motheduc
eq_2 <- lwage ~ educ + exper + expersq
sys <- list(eq1 = eq_1, eq2 = eq_2)
# specify the instruments
instr <- ~  exper + expersq + motheduc
# fit models
fit.sys <- systemfit(
  sys, inst=instr,
  method="2SLS", data = mroz)
# summary of model fit
smry.system_m <- summary(fit.sys)
```

---

### (Supplements) R report (m): `systemfit::systemfit()`

The 2SLS analysis report from `systemfit::systemfit()` is as follows:

``` r
smry.system_m
```

```

systemfit results 
method: 2SLS

         N  DF     SSR detRCov   OLS-R2 McElroy-R2
system 856 848 2085.49 1.96552 0.150003   0.112323

      N  DF      SSR      MSE     RMSE       R2   Adj R2
eq1 428 424 1889.658 4.456742 2.111100 0.152694 0.146699
eq2 428 424  195.829 0.461861 0.679604 0.123130 0.116926

The covariance matrix of the residuals
         eq1      eq2
eq1 4.456742 0.304759
eq2 0.304759 0.461861

The correlations of the residuals
         eq1      eq2
eq1 1.000000 0.212418
eq2 0.212418 1.000000

2SLS estimates for 'eq1' (equation 1)
Model Formula: educ ~ exper + expersq + motheduc
Instruments: ~exper + expersq + motheduc

               Estimate  Std. Error  t value Pr(>|t|)
(Intercept)  9.77510269  0.42388862 23.06055  < 2e-16 ***
exper        0.04886150  0.04166926  1.17260  0.24161    
expersq     -0.00128106  0.00124491 -1.02905  0.30404    
motheduc     0.26769081  0.03112980  8.59918  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.1111 on 424 degrees of freedom
Number of observations: 428 Degrees of Freedom: 424 
SSR: 1889.658428 MSE: 4.456742 Root MSE: 2.1111 
Multiple R-Squared: 0.152694 Adjusted R-Squared: 0.146699

2SLS estimates for 'eq2' (equation 2)
Model Formula: lwage ~ educ + exper + expersq
Instruments: ~exper + expersq + motheduc

                Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  0.198186056  0.472877230  0.41911 0.6753503   
educ         0.049262953  0.037436026  1.31592 0.1889107   
exper        0.044855848  0.013576817  3.30386 0.0010346 **
expersq     -0.000922076  0.000406381 -2.26899 0.0237705 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.679604 on 424 degrees of freedom
Number of observations: 428 Degrees of Freedom: 424 
SSR: 195.829058 MSE: 0.461861 Root MSE: 0.679604 
Multiple R-Squared: 0.12313 Adjusted R-Squared: 0.116926 
```

**NOTE**: `systemfit::systemfit()` reports the results of both 2SLS equations (stage 1 and stage 2) at once.

---

### (Supplements) R code (m): `AER::ivreg()`

The R code using `AER::ivreg()` is as follows:

``` r
# load pkg
require(AER)
# specify model
mod_iv_m <- formula(lwage ~ educ + exper + expersq
                     | motheduc + exper + expersq)
# fit model
lm_iv_m <- ivreg(formula = mod_iv_m, data = mroz)
# summary of model fit
smry.ivm <- summary(lm_iv_m)
```


---

### (Supplements) R report (m): `AER::ivreg()`

The 2SLS analysis report from `AER::ivreg()` is as follows:

``` r
smry.ivm
```

```

Call:
AER::ivreg(formula = lwage ~ educ + exper + expersq | motheduc + 
    exper + expersq, data = mroz)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.10804 -0.32633  0.06024  0.36772  2.34351

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.1981861  0.4728772   0.419  0.67535   
educ         0.0492630  0.0374360   1.316  0.18891   
exper        0.0448558  0.0135768   3.304  0.00103 **
expersq     -0.0009221  0.0004064  -2.269  0.02377 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6796 on 424 degrees of freedom
Multiple R-Squared: 0.1231,	Adjusted R-squared: 0.1169 
Wald test: 7.348 on 3 and 424 DF,  p-value: 8.228e-05 
```

**Note**: `AER::ivreg()` reports only the second-stage equation of 2SLS; the first-stage equation is not included.

???

We can see that the t-test on the coefficient of `$educ$` is still not significant.

Note that the full instrument list (motheduc + exper + expersq) is specified after the `|` in the model formula of this code chunk, but these instruments do not appear in the output.

---

### Rscript: `AER::ivreg()` for IV (f)

---

### Integrated solution: `fatheduc` IV model

Now let's consider using `$fatheduc$` as the only instrument for `$educ$`.

`$$\begin{cases}
  \begin{align}
  \widehat{educ} &= \hat{\gamma}_1 +\hat{\gamma}_2exper + \hat{\gamma}_3expersq +\hat{\gamma}_4fatheduc  && \text{(stage 1)}\\
  lwage & = \hat{\beta}_1 +\hat{\beta}_2\widehat{educ} + \hat{\beta}_3exper +\hat{\beta}_4expersq + \hat{\epsilon}  && \text{(stage 2)}
  \end{align}
\end{cases}$$`

We will repeat the whole procedure with `R`.

???
We will repeat the whole procedure with `R`.

---

### Integrated solution: `fatheduc` IV results

<div class="datatables html-widget html-fill-item" id="htmlwidget-aa2f0ea4a2dc7e8f2d71" style="width:100%;height:auto;"></div>
<script type="application/json" data-for="htmlwidget-aa2f0ea4a2dc7e8f2d71">{"x":{"filter":"none","vertical":false,"caption":"<caption>2SLS result (`fatheduc` as instrument)<\/caption>","data":[["1","2","3","4","5","6","7","8"],["eq1","eq1","eq1","eq1","eq2","eq2","eq2","eq2"],["(Intercept)","exper","expersq","fatheduc","(Intercept)","educ","exper","expersq"],[9.88703428977956,0.0468243339408936,-0.001150382546545442,0.2705061011723723,-0.06111693330746015,0.07022629127205504,0.0436715881293298,-0.0008821549586141926],[0.3956077875642711,0.04110742425261585,0.001228568028189189,0.02887859434343325,0.4364461275559863,0.03444269413256235,0.01340012103140733,0.0004009170075461476],[24.992011281308,1.139072437454262,-0.9363604783375439,9.367010663865054,-0.1400331666354863,2.038931420456527,3.259044304672467,-2.200343068540568],[0,0.2553160733348245,0.3496206296559616,0,0.8887002806250497,0.04207657247632368,0.001207928423314186,0.02832119375282405]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>eq<\/th>\n      <th>vars<\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>t value<\/th>\n      <th>Pr(>|t|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":10,"dom":"t","columnDefs":[{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":5,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":6,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? 
data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[3,4,5,6]},{"orderable":false,"targets":0},{"name":" ","targets":0},{"name":"eq","targets":1},{"name":"vars","targets":2},{"name":"Estimate","targets":3},{"name":"Std. Error","targets":4},{"name":"t value","targets":5},{"name":"Pr(>|t|)","targets":6}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>

- The t-test for variable `educ` is significant at the 5% level (p-value ≈ 0.042).

**Note**: The corresponding `R` code is given on the following slides. The table is built from the report of the `systemfit::systemfit()` function.

---

### (Supplements) R code (f): `systemfit::systemfit()`

The R code using `systemfit::systemfit()` is as follows:

``` r
# load pkg
require(systemfit)
# set two models
eq_1 <- educ ~  exper + expersq + fatheduc
eq_2 <- lwage ~ educ + exper + expersq 
sys <- list(eq1 = eq_1, eq2 = eq_2)
# specify the instruments
instr <- ~ exper + expersq + fatheduc
# fit models 
fit.sys <- systemfit(
  sys, inst=instr, 
  method="2SLS", data = mroz)
# summary of model fit
smry.system_f <- summary(fit.sys)
```

---

### (Supplements) R report (f): `systemfit::systemfit()`

The 2SLS analysis report from `systemfit::systemfit()` is as follows:

``` r
smry.system_f
```

```

systemfit results 
method: 2SLS

         N  DF     SSR detRCov   OLS-R2 McElroy-R2
system 856 848 2030.11 1.91943 0.172575   0.134508

      N  DF      SSR      MSE     RMSE       R2   Adj R2
eq1 428 424 1838.719 4.336602 2.082451 0.175535 0.169701
eq2 428 424  191.387 0.451384 0.671851 0.143022 0.136959

The covariance matrix of the residuals
         eq1      eq2
eq1 4.336602 0.195036
eq2 0.195036 0.451384

The correlations of the residuals
         eq1      eq2
eq1 1.000000 0.139402
eq2 0.139402 1.000000

2SLS estimates for 'eq1' (equation 1)
Model Formula: educ ~ exper + expersq + fatheduc
Instruments: ~exper + expersq + fatheduc

               Estimate  Std. Error  t value Pr(>|t|)
(Intercept)  9.88703429  0.39560779 24.99201  < 2e-16 ***
exper        0.04682433  0.04110742  1.13907  0.25532    
expersq     -0.00115038  0.00122857 -0.93636  0.34962    
fatheduc     0.27050610  0.02887859  9.36701  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.082451 on 424 degrees of freedom
Number of observations: 428 Degrees of Freedom: 424 
SSR: 1838.719104 MSE: 4.336602 Root MSE: 2.082451 
Multiple R-Squared: 0.175535 Adjusted R-Squared: 0.169701

2SLS estimates for 'eq2' (equation 2)
Model Formula: lwage ~ educ + exper + expersq
Instruments: ~exper + expersq + fatheduc

                Estimate   Std. Error  t value  Pr(>|t|)
(Intercept) -0.061116933  0.436446128 -0.14003 0.8887003   
educ         0.070226291  0.034442694  2.03893 0.0420766 * 
exper        0.043671588  0.013400121  3.25904 0.0012079 **
expersq     -0.000882155  0.000400917 -2.20034 0.0283212 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.671851 on 424 degrees of freedom
Number of observations: 428 Degrees of Freedom: 424 
SSR: 191.386653 MSE: 0.451384 Root MSE: 0.671851 
Multiple R-Squared: 0.143022 Adjusted R-Squared: 0.136959 
```

**NOTE**: `systemfit::systemfit()` reports the results of both 2SLS equations (stage 1 and stage 2) at once.

---

### (Supplements) R code (f): `AER::ivreg()`

The R code using `AER::ivreg()` is as follows:

``` r
# load pkg
require(AER)
# specify model
mod_iv_f <- formula(lwage ~ educ + exper + expersq | fatheduc + exper + expersq)
# fit model
lm_iv_f <- ivreg(formula = mod_iv_f, data = mroz)
# summary of model fit
smry.ivf <- summary(lm_iv_f)
```


???

The full instrument list (fatheduc + exper + expersq) is specified after the `|` in the model formula of this code chunk.

The t-test on the coefficient of education is now significant, with a p-value below 0.05.

---

### (Supplements) R report (f): `AER::ivreg()`

The 2SLS analysis report from `AER::ivreg()` is as follows:

``` r
smry.ivf
```

```

Call:
AER::ivreg(formula = lwage ~ educ + exper + expersq | fatheduc + 
    exper + expersq, data = mroz)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.09170 -0.32776  0.05006  0.37365  2.35346

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept) -0.0611169  0.4364461  -0.140  0.88870   
educ         0.0702263  0.0344427   2.039  0.04208 * 
exper        0.0436716  0.0134001   3.259  0.00121 **
expersq     -0.0008822  0.0004009  -2.200  0.02832 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6719 on 424 degrees of freedom
Multiple R-Squared: 0.143,	Adjusted R-squared: 0.137 
Wald test: 8.314 on 3 and 424 DF,  p-value: 2.201e-05 
```

**Note**: `AER::ivreg()` reports only the second-stage equation of 2SLS; the first-stage equation is not included.

---

### Rscript: `AER::ivreg()` for IV (mf)

---

### Integrated solution: `motheduc` and `fatheduc` IV model

We can also use both `$motheduc$` and `$fatheduc$` as instruments for `$educ$`.

`$$\begin{cases}
  \begin{align}
  \widehat{educ} &= \hat{\gamma}_1 +\hat{\gamma}_2exper + \hat{\gamma}_3expersq +\hat{\gamma}_4motheduc + \hat{\gamma}_5fatheduc  && \text{(stage 1)}\\
  lwage & = \hat{\beta}_1 +\hat{\beta}_2\widehat{educ} + \hat{\beta}_3exper +\hat{\beta}_4expersq + \hat{\epsilon}  && \text{(stage 2)}
  \end{align}
\end{cases}$$`
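
---

### (Supplements) Two instruments: 2SLS as a projection

With two instruments for one endogenous regressor the model is over-identified, and stage 1 projects `educ` on all instruments at once. The following is a minimal base `R` sketch of this projection form of 2SLS. It uses simulated data: the data-generating process and the object names (`sim`, `X_hat`, `b_2sls`, `b_steps`) are assumptions for illustration, not the `mroz` analysis.

``` r
# Simulated data (assumed DGP): educ is endogenous through 'ability';
# motheduc and fatheduc are valid instruments by construction
set.seed(42)
n <- 500
motheduc <- rnorm(n, mean = 9, sd = 3)
fatheduc <- 0.5 * motheduc + rnorm(n, mean = 4.5, sd = 2.5)
exper    <- runif(n, min = 0, max = 30)
ability  <- rnorm(n)
educ     <- 7 + 0.15 * motheduc + 0.2 * fatheduc + ability + rnorm(n)
lwage    <- 0.1 + 0.06 * educ + 0.04 * exper - 0.001 * exper^2 +
  0.3 * ability + rnorm(n, sd = 0.5)
sim <- data.frame(lwage, educ, exper, expersq = exper^2, motheduc, fatheduc)

X <- model.matrix(~ educ + exper + expersq, data = sim)                 # regressors
Z <- model.matrix(~ exper + expersq + motheduc + fatheduc, data = sim)  # instruments
y <- sim$lwage

# Stage 1 for all columns of X at once: X_hat = Z (Z'Z)^{-1} Z'X
X_hat <- Z %*% solve(crossprod(Z), crossprod(Z, X))
# Stage 2: b = (X_hat'X_hat)^{-1} X_hat'y
b_2sls <- solve(crossprod(X_hat), crossprod(X_hat, y))

# Same coefficients as the two explicit lm() steps
sim$educHat <- fitted(lm(educ ~ exper + expersq + motheduc + fatheduc, data = sim))
b_steps <- coef(lm(lwage ~ educHat + exper + expersq, data = sim))
cbind(matrix_form = as.vector(b_2sls), two_step = unname(b_steps))
```

The exogenous columns (`exper`, `expersq`, and the intercept) are reproduced exactly by the projection, so only `educ` is replaced by its fitted values.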

---

### Integrated solution: `motheduc` and `fatheduc` IV results
<div class="datatables html-widget html-fill-item" id="htmlwidget-f5b4cb21f3d3b9464908" style="width:100%;height:auto;"></div>
<script type="application/json" data-for="htmlwidget-f5b4cb21f3d3b9464908">{"x":{"filter":"none","vertical":false,"caption":"<caption>2SLS result (`motheduc` + `fatheduc` as instruments)<\/caption>","data":[["1","2","3","4","5","6","7","8","9"],["eq1","eq1","eq1","eq1","eq1","eq2","eq2","eq2","eq2"],["(Intercept)","exper","expersq","motheduc","fatheduc","(Intercept)","educ","exper","expersq"],[9.102640109600149,0.0452254233687072,-0.001009090957170813,0.1575970327485924,0.1895484101549553,0.04810030693242318,0.06139662866013517,0.04417039294876216,-0.0008989695881555132],[0.4265613672307983,0.04025071238007708,0.001203344812335161,0.03589411554669008,0.03375646678192469,0.4003280776043034,0.03143669564470841,0.0134324755294433,0.0004016856118761809],[21.33957927013819,1.123593116605123,-0.8385717433830233,4.390609166663949,5.615173275671472,0.1201522192004904,1.953024241288851,3.288328562515743,-2.237993001433717],[0,0.2618229379317158,0.4021832850606946,1.429840367217494e-05,3.561512373906339e-08,0.9044194793608125,0.05147417391522291,0.001091838425269831,0.02574002733425673]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>eq<\/th>\n      <th>vars<\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>t value<\/th>\n      <th>Pr(>|t|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":10,"dom":"t","columnDefs":[{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":5,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":6,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? 
data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[3,4,5,6]},{"orderable":false,"targets":0},{"name":" ","targets":0},{"name":"eq","targets":1},{"name":"vars","targets":2},{"name":"Estimate","targets":3},{"name":"Std. Error","targets":4},{"name":"t value","targets":5},{"name":"Pr(>|t|)","targets":6}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>

**Note**: The corresponding `R` code is given on the following slides. The table is built from the report of the `systemfit::systemfit()` function.

???

We can see that the t-test on the coefficient of `educ` is significant only at the 10% level (p-value ≈ 0.051, below 0.1 but above 0.05).

The instruments (motheduc + fatheduc + exper + expersq) are all included in the instrument list of the code chunk.

---

### (Supplements) R code (mf): `systemfit::systemfit()`

The R code using `systemfit::systemfit()` is as follows:

``` r
# load pkg
require(systemfit)
# set two models
eq_1 <- educ ~ exper + expersq + motheduc + fatheduc
eq_2 <- lwage ~ educ + exper + expersq 
sys <- list(eq1 = eq_1, eq2 = eq_2)
# specify the instruments
instr <- ~ exper + expersq + motheduc + fatheduc
# fit models 
fit.sys <- systemfit(
  sys, inst=instr, 
  method="2SLS", data = mroz)
# summary of model fit
smry.system_mf <- summary(fit.sys)
```

---

### (Supplements) R report (mf): `systemfit::systemfit()`

The 2SLS analysis report from `systemfit::systemfit()` is as follows:

``` r
smry.system_mf
```

```

systemfit results 
method: 2SLS

         N  DF    SSR detRCov   OLS-R2 McElroy-R2
system 856 847 1951.6 1.83425 0.204575   0.149485

      N  DF     SSR      MSE     RMSE       R2   Adj R2
eq1 428 423 1758.58 4.157388 2.038967 0.211471 0.204014
eq2 428 424  193.02 0.455236 0.674712 0.135708 0.129593

The covariance matrix of the residuals
         eq1      eq2
eq1 4.157388 0.241536
eq2 0.241536 0.455236

The correlations of the residuals
         eq1      eq2
eq1 1.000000 0.175571
eq2 0.175571 1.000000

2SLS estimates for 'eq1' (equation 1)
Model Formula: educ ~ exper + expersq + motheduc + fatheduc
Instruments: ~exper + expersq + motheduc + fatheduc

               Estimate  Std. Error  t value   Pr(>|t|)
(Intercept)  9.10264011  0.42656137 21.33958 < 2.22e-16 ***
exper        0.04522542  0.04025071  1.12359    0.26182    
expersq     -0.00100909  0.00120334 -0.83857    0.40218    
motheduc     0.15759703  0.03589412  4.39061 1.4298e-05 ***
fatheduc     0.18954841  0.03375647  5.61517 3.5615e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.038967 on 423 degrees of freedom
Number of observations: 428 Degrees of Freedom: 423 
SSR: 1758.575263 MSE: 4.157388 Root MSE: 2.038967 
Multiple R-Squared: 0.211471 Adjusted R-Squared: 0.204014

2SLS estimates for 'eq2' (equation 2)
Model Formula: lwage ~ educ + exper + expersq
Instruments: ~exper + expersq + motheduc + fatheduc

                Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  0.048100307  0.400328078  0.12015 0.9044195   
educ         0.061396629  0.031436696  1.95302 0.0514742 . 
exper        0.044170393  0.013432476  3.28833 0.0010918 **
expersq     -0.000898970  0.000401686 -2.23799 0.0257400 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.674712 on 424 degrees of freedom
Number of observations: 428 Degrees of Freedom: 424 
SSR: 193.020015 MSE: 0.455236 Root MSE: 0.674712 
Multiple R-Squared: 0.135708 Adjusted R-Squared: 0.129593 
```

**NOTE**: `systemfit::systemfit()` reports the results of both 2SLS equations (stage 1 and stage 2) at once.

---

### (Supplements) R code (mf): `AER::ivreg()`

The R code using `AER::ivreg()` is as follows:

``` r
# load pkg 
require(AER)
# specify model
mod_iv_mf <- formula(
  lwage ~ educ + exper + expersq
  | motheduc + fatheduc + exper + expersq)
# fit model
lm_iv_mf <- ivreg(formula = mod_iv_mf, data = mroz)
# summary of model fit
smry.ivmf <- summary(lm_iv_mf)
```


---

### (Supplements) R report (mf): `AER::ivreg()`

The 2SLS analysis report from `AER::ivreg()` is as follows:

``` r
smry.ivmf
```

```

Call:
AER::ivreg(formula = lwage ~ educ + exper + expersq | motheduc + 
    fatheduc + exper + expersq, data = mroz)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.0986 -0.3196  0.0551  0.3689  2.3493

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.0481003  0.4003281   0.120  0.90442   
educ         0.0613966  0.0314367   1.953  0.05147 . 
exper        0.0441704  0.0134325   3.288  0.00109 **
expersq     -0.0008990  0.0004017  -2.238  0.02574 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6747 on 424 degrees of freedom
Multiple R-Squared: 0.1357,	Adjusted R-squared: 0.1296 
Wald test: 8.141 on 3 and 424 DF,  p-value: 2.787e-05 
```

**Note**: `AER::ivreg()` reports only the second-stage equation of 2SLS; the first-stage equation is not included.

---

### Solutions comparison: a glance

So far, we have obtained **five** sets of estimation results under different model settings or solutions:

a. The misspecified model (endogeneity of `$educ$` ignored), estimated directly by OLS.

b. (**Step-by-step solution**) Explicit 2SLS estimation **without** variance correction (IV regression step by step, with only `$motheduc$` as instrument).

c. (**Integrated solution**) Dedicated IV estimation **with** variance correction (using the `R` tools `systemfit::systemfit()` or `AER::ivreg()`):

- The IV model with only `$motheduc$` as instrument for the endogenous variable `$educ$`

- The IV model with only `$fatheduc$` as instrument for the endogenous variable `$educ$`

- The IV model with both `$motheduc$` and `$fatheduc$` as instruments

For comparison, all results are shown on the next slide.

???

We use the `R` function `AER::ivreg()` to obtain the IV estimates **with** variance correction for the last three models, which differ in their instruments.

---

### Solutions comparison: tidy reports (png)

---

### Solutions comparison: tidy reports (html)

<table style="text-align:center"><tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="5">Dependent variable: lwage</td></tr>
<tr><td></td><td colspan="5" style="border-bottom: 1px solid black"></td></tr>
<tr><td style="text-align:left"></td><td>OLS</td><td>explicit 2SLS</td><td>IV mothereduc</td><td>IV fathereduc</td><td>IV mothereduc and fathereduc</td></tr>
<tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td><td>(3)</td><td>(4)</td><td>(5)</td></tr>
<tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Constant</td><td>-0.5220<sup>***</sup></td><td>0.1982</td><td>0.1982</td><td>-0.0611</td><td>0.0481</td></tr>
<tr><td style="text-align:left"></td><td>(0.1986)</td><td>(0.4933)</td><td>(0.4729)</td><td>(0.4364)</td><td>(0.4003)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">educ</td><td>0.1075<sup>***</sup></td><td></td><td>0.0493</td><td>0.0702<sup>**</sup></td><td>0.0614<sup>*</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.0141)</td><td></td><td>(0.0374)</td><td>(0.0344)</td><td>(0.0314)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">educHat</td><td></td><td>0.0493</td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left"></td><td></td><td>(0.0391)</td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">exper</td><td>0.0416<sup>***</sup></td><td>0.0449<sup>***</sup></td><td>0.0449<sup>***</sup></td><td>0.0437<sup>***</sup></td><td>0.0442<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.0132)</td><td>(0.0142)</td><td>(0.0136)</td><td>(0.0134)</td><td>(0.0134)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td style="text-align:left">expersq</td><td>-0.0008<sup>**</sup></td><td>-0.0009<sup>**</sup></td><td>-0.0009<sup>**</sup></td><td>-0.0009<sup>**</sup></td><td>-0.0009<sup>**</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.0004)</td><td>(0.0004)</td><td>(0.0004)</td><td>(0.0004)</td><td>(0.0004)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>428</td><td>428</td><td>428</td><td>428</td><td>428</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.1568</td><td>0.0456</td><td>0.1231</td><td>0.1430</td><td>0.1357</td></tr>
<tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.1509</td><td>0.0388</td><td>0.1169</td><td>0.1370</td><td>0.1296</td></tr>
<tr><td style="text-align:left">Residual Std. Error (df = 424)</td><td>0.6664</td><td>0.7090</td><td>0.6796</td><td>0.6719</td><td>0.6747</td></tr>
<tr><td style="text-align:left">F Statistic (df = 3; 424)</td><td>26.2862<sup>***</sup></td><td>6.7510<sup>***</sup></td><td></td><td></td><td></td></tr>
<tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr></table>

]
]

.footnote[
Table source: [`comparison-stargazer.R`](scripts/chapter17/chunk-sources/comparison-stargazer.R) (run in a clean R session; slides use cached HTML).
]

???

See the online book "Principles of Econometrics with R" (10.1 The Instrumental Variables (IV) Method, [free online](https://bookdown.org/ccolonescu/RPoE4/random-regressors.html#the-instrumental-variables-iv-method))

---

### Solutions comparison: report tips

- Column (1) shows the result of the direct OLS estimation, and column (2) shows the result of the explicit 2SLS estimation without variance correction.

- Columns (3) to (5) show the results of the IV solution with variance correction.

- We should also remember that `$educ$` in the IV model is equivalent to `$educHat$` in the explicit 2SLS.

- The value in parentheses is the standard error of the estimator.

---

### Solutions comparison: report insights

The key points of this comparison are:

- Firstly, the estimated effect of education on wage is smaller in the IV models (3), (4) and (5), with coefficients 0.049, 0.070 and 0.061 respectively, than in the OLS model (0.108). The standard error also decreases from (3) to (4) to (5).

- Secondly, the explicit 2SLS model (2) and the IV model with only `$motheduc$` as instrument (3) yield the same coefficients, but their **standard errors** differ. The standard error in the explicit 2SLS is 0.039, slightly larger than the 0.037 from the IV estimation.

- Thirdly, the t-test on the education coefficient shows no significance when `motheduc` is the only instrument. This holds under both the explicit 2SLS estimation and the IV estimation.

- Fourthly, the comparison illustrates the **relative efficiency** of 2SLS: with both instruments in model (5), the standard error on `$educ$` is the smallest of the three IV models.

---
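
### Solutions comparison: why the standard errors differ (sketch)

The gap between the explicit 2SLS and the dedicated IV standard errors can be reproduced on simulated data. The sketch below is only an illustration under stated assumptions: the data-generating process, the seed and all object names (`stage1`, `se_naive`, `se_iv`, ...) are hypothetical and are not taken from the `mroz` example. It shows that the explicit second stage computes residuals from the fitted `x_hat`, while the corrected variance uses residuals computed from the observed `x`.

``` r
# Hypothetical simulated data (not the mroz data set)
set.seed(123)
n <- 500
z <- rnorm(n)                        # instrument
u <- rnorm(n)                        # structural error
x <- 0.8 * z + 0.5 * u + rnorm(n)    # endogenous regressor (correlated with u)
y <- 1 + 0.5 * x + u

# explicit two-step 2SLS
stage1 <- lm(x ~ z)
x_hat  <- fitted(stage1)
stage2 <- lm(y ~ x_hat)
b      <- coef(stage2)

# naive SE: stage 2 uses residuals y - b0 - b1 * x_hat
sigma2_naive <- sum(resid(stage2)^2) / (n - 2)
se_naive     <- summary(stage2)$coefficients["x_hat", "Std. Error"]

# corrected SE: residuals must be computed with the observed x
e_iv      <- y - b[1] - b[2] * x
sigma2_iv <- sum(e_iv^2) / (n - 2)
se_iv     <- sqrt(sigma2_iv / sum((x_hat - mean(x_hat))^2))

c(coef = unname(b[2]), se_naive = se_naive, se_iv = se_iv)
```

The point estimate is the same in both cases; only the residual variance, and hence the standard error, changes.

---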

### Solutions comparison: further thinking

After the empirical comparison, these results may leave us even more confused.

New questions naturally arise:

- Which estimation is the best?

- How to judge and evaluate different instrument choices?

We will discuss these topics in the next section.

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: validity

## 17.7 Testing Instrument validity

???
As we know, valid instruments should satisfy both relevance condition and exogeneity condition.

So, let us check these conditions in this section.

---
layout: true

<div class="my-footer"><span>huhuaping@   <a href="#chapter17"> Chapter 17. Endogeneity and Instrumental Variables  |</a> &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; <a href="#validity"> 17.7 Testing Instrument validity  </a></span></div>

---

### Instrument validity: the concept

Consider the general model

`$$\begin{align}
Y_{i}=\beta_{0}+\sum_{j=1}^{k} \beta_{j} X_{j i}+\sum_{s=1}^{r} \beta_{k+s} W_{s i}+\epsilon_{i}
\end{align}$$`

> - `$Y_{i}$` is the dependent variable
- `$\beta_{0}, \ldots, \beta_{k+r}$` are `$1+k+r$` unknown regression coefficients
- `$X_{1 i}, \ldots, X_{k i}$` are `$k$` endogenous regressors
- `$W_{1 i}, \ldots, W_{r i}$` are `$r$` exogenous regressors which are uncorrelated with `$\epsilon_{i}$`
- `$\epsilon_{i}$` is the error term
- `$Z_{1 i}, \ldots, Z_{m i}$` are `$m$` instrumental variables

An instrument is **valid** if it satisfies both the relevance condition and the exogeneity condition, stated in turn below.

`$$E\left(Z_{i} X_{i}^{\prime}\right) \neq 0$$`

]

`$$E\left(Z_{i} \epsilon_{i}\right)=0$$`

]

???

Consider the general model as we have done.

---

### Instrument Relevance: relax condition

In practice, **Instrument Relevance** also requires the following:

If there are
`$k$` endogenous variables and
`$m$` instruments
`$Z$`, with
`$m \geq k$`, the vector of second-stage regressors

`$$\left(\hat{X}_{1 i}^{*}, \ldots, \hat{X}_{k i}^{*}, W_{1 i}, \ldots, W_{r i}, 1\right)$$`

must .red[not] be **perfectly multicollinear**.

> **Where**:
> - `$\hat{X}_{1i}^{\ast}, \ldots, \hat{X}_{ki}^{\ast}$` are the predicted values from the  `$k$` first stage regressions.
> - 1 denotes the constant regressor which equals 1 for all observations.

???

The concept of **Instrument Relevance** is rather tricky.

So, what is the meaning of **Instrument Relevance**?

___

Obviously, perfect multicollinearity is rare and can be ruled out with careful inspection.

What we really need to pay attention to is the more common problem, which is called **weak instruments**.

---

### Instrument Relevance: Weak instrument

Instruments that explain little of the variation in the endogenous regressor `$X$` are called **weak instruments**.

Formally, when
`$\operatorname{corr}\left(Z_{i}, X_{i}\right)$` is close to zero,
`$Z_{i}$` is called a weak instrument.

- Consider a simple one regressor model
`$Y_{i}=\beta_{0}+\beta_{1} X_{i}+\epsilon_{i}$`

- The IV estimator of
`$\beta_{1}$` is
`$\widehat{\beta}_{1}^{IV}=\frac{\sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right)\left(Y_{i}-\bar{Y}\right)}{\sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right)\left(X_{i}-\bar{X}\right)}$`

> Note that `$\frac{1}{n} \sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right)\left(Y_{i}-\bar{Y}\right) \xrightarrow{p} \operatorname{Cov}\left(Z_{i}, Y_{i}\right)$`

> and `$\frac{1}{n} \sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right)\left(X_{i}-\bar{X}\right) \xrightarrow{p} \operatorname{Cov}\left(Z_{i}, X_{i}\right)$`.

- Thus, if `$\operatorname{Cov}\left(Z_{i},X_{i}\right) \approx 0$`, the denominator is close to zero and `$\widehat{\beta}_{1}^{IV}$` becomes unstable and uninformative.

???

Let me give you an example.

---
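
### Instrument Relevance: why a weak instrument hurts (sketch)

A short derivation, offered as a sketch for the one-regressor model `$Y_{i}=\beta_{0}+\beta_{1} X_{i}+\epsilon_{i}$` and assuming `$\operatorname{Cov}\left(Z_{i}, X_{i}\right) \neq 0$`, makes the problem explicit. Substituting the model into `$\operatorname{Cov}\left(Z_{i}, Y_{i}\right)$` gives the probability limit of the IV estimator:

`$$\begin{align}
\operatorname{Cov}\left(Z_{i}, Y_{i}\right) &= \beta_{1} \operatorname{Cov}\left(Z_{i}, X_{i}\right)+\operatorname{Cov}\left(Z_{i}, \epsilon_{i}\right) \\
\widehat{\beta}_{1}^{IV} &\xrightarrow{p} \frac{\operatorname{Cov}\left(Z_{i}, Y_{i}\right)}{\operatorname{Cov}\left(Z_{i}, X_{i}\right)}=\beta_{1}+\frac{\operatorname{Cov}\left(Z_{i}, \epsilon_{i}\right)}{\operatorname{Cov}\left(Z_{i}, X_{i}\right)}
\end{align}$$`

> With an exactly exogenous instrument the second term is zero. With a weak instrument the denominator is close to zero, so even a very small `$\operatorname{Cov}\left(Z_{i}, \epsilon_{i}\right)$` is magnified into a large asymptotic bias.

---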

### Example: Weak instrument

We focus on the effect of cigarette smoking on infant birth weight. Without other
explanatory variables, the variables and the model are:

- `$bwght =$`  child birth weight, in ounces.

- `$packs =$` packs smoked per day while pregnant.

- `$cigprice=$` cigarette price in home state

`$$\begin{align}
\log (\text {bwght})=\beta_{0}+\beta_{1} \text {packs}+u_{i}
\end{align}$$`

- We might worry that packs is correlated
with other health factors or the availability of good prenatal care, so that
`$packs$` and
`$u_i$` might be correlated.

- A possible instrumental variable for
`$packs$` is the average price of cigarettes (
`$cigprice$`) in the state. We will assume that
`$cigprice$` and
`$u_i$` are uncorrelated.

We will use the data set `wooldridge::bwght`.
]

???

The assumed true model may be

`$$\begin{align}
\log (\text {bwght})=\beta_{0}+\beta_{1} \text {packs}+ ( \beta_2 care\_support)+u_{i}
\end{align}$$`

- `$cigs =$` number of cigarettes smoked by the mother while pregnant, per day.

- `$faminc=$` annual family income, in thousands of dollars.

---

### Example: Weak instrument

However, by regressing `$packs$` on `$cigprice$` in stage 1, we find basically no effect.

`$$\begin{equation}
\begin{alignedat}{999}
&\widehat{packs}=&&+0.0674&&+0.0003cigprice_i\\ 
&(s)&&(0.1025)&&(0.0008)\\ 
&(t)&&(+0.66)&&(+0.36)\\ 
&(Ftest)&&F^*=0.13&&p=0.7179
\end{alignedat}
\end{equation}$$`

If we insist on using `$cigprice$` as instrument and run the stage 2 OLS, we will find
`$$\begin{equation}
\begin{alignedat}{999}
&\widehat{lbwght}=&&+4.4481&&+2.9887packs\_hat_i\\ 
&(s)&&(0.1843)&&(1.7654)\\ 
&(t)&&(+24.13)&&(+1.69)\\ 
&(Ftest)&&F^*=2.87&&p=0.0907
\end{alignedat}
\end{equation}$$`

.footnote[
Obviously, this estimation is meaningless (Why?). `$cigprice$` behaves as a **weak instrument**, and the problem was already exposed in the stage 1 regression.

]

???

`$$\begin{alignedat}{3}
\widehat{packs} &&= &&
0.067 + &&0.0003 \text { cigprice } \\
(se)&& &&(0.103)  &&(0.0008)
\end{alignedat}$$`

`$$\begin{alignedat}{3}
\log \widehat{(bwght)} &=
& 4.45   + &2.99 \text {packs} \\
(se)& &(0.91) &(8.70)\\
\end{alignedat}$$`

- because the standard error on the coefficient of packs is huge and the coefficient is not significant.

- and the coefficient of packs also has the wrong sign: it should not be positive.

As we have discussed, the result is not credible since cigprice is a weak instrument for packs.

---

### Weak instrument: the strategy

Weak instruments (
`$Z_i$` and
`$X_i$` are only weakly correlated) led to an **important finding**: even with very large sample
sizes, the 2SLS estimator can be biased and can have a distribution that is very different from standard normal (Staiger and Stock 1997).

There are two ways to proceed if instruments are weak:

- Discard the **weak instruments** and/or find **stronger instruments**.

> While the former is only an option if the unknown coefficients remain identified when the weak instruments are discarded, the latter can be difficult and even may require a redesign of the whole study.

- Stick with the weak instruments but use methods that improve upon 2SLS.

> Such as **limited information maximum likelihood estimation (LIML)**.

???
So, what should we do if the instruments are weak or some of them are weak?

---
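
### Weak instrument: a small Monte Carlo (sketch)

The finding above can be explored with a simulation. The following is a minimal sketch under assumed settings: the data-generating process, the first-stage strengths (`pi1 = 1` versus `pi1 = 0.05`), the seed and the helper `sim_iv()` are all hypothetical choices for illustration, with a true slope of 1. One would typically expect the weak-instrument IV estimates to be far more dispersed than the strong-instrument ones; no output is printed here because the exact numbers depend on the seed.

``` r
# Hypothetical Monte Carlo: IV and OLS slopes under a strong and a weak instrument
set.seed(17)
n_rep <- 500   # number of replications
n     <- 200   # sample size per replication

sim_iv <- function(pi1) {
  replicate(n_rep, {
    z <- rnorm(n)                       # instrument
    u <- rnorm(n)                       # structural error
    x <- pi1 * z + 0.8 * u + rnorm(n)   # endogenous regressor
    y <- 1 * x + u                      # true slope equals 1
    c(iv  = cov(z, y) / cov(z, x),      # simple IV estimator
      ols = cov(x, y) / var(x))         # OLS estimator
  })
}

strong <- sim_iv(pi1 = 1)      # 2 x n_rep matrix of estimates
weak   <- sim_iv(pi1 = 0.05)

# median and interquartile range of the estimates in each design
median_tab <- rbind(strong = apply(strong, 1, median),
                    weak   = apply(weak, 1, median))
iqr_tab    <- rbind(strong = apply(strong, 1, IQR),
                    weak   = apply(weak, 1, IQR))
median_tab
iqr_tab
```

Medians and interquartile ranges are used instead of means and variances because the just-identified IV estimator can produce extreme values when the sample covariance in the denominator is close to zero.

---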

### Weak IV: Angrist and Kolesár (2022)

If the first-stage **F-statistic** `$< 10$` (Staiger and Stock 1997), treat the instruments as **weak**: 2SLS can be biased toward OLS, and conventional inference can be unreliable.
]

Angrist and Kolesár (2022) argue that in many applied settings the **standard-error inflation** is large enough that spurious rejection of `$\beta = 0$` remains unlikely.

- Still report first-stage F and inspect first-stage coefficients.

.footnote[
Angrist, J., and M. Kolesár. One Instrument to Rule Them All: The Bias and Coverage of Just-ID IV[J]. Journal of Econometrics, 2022, 240(2):105398.

]

---

### Many instruments and many-weak bias

With **many** instruments, first-stage F can look adequate while 2SLS still drifts toward OLS (**many-weak bias**).

- Flexible first stages fit `$D_i$` better `$\rightarrow$` smaller SEs but risk **overfitting** endogenous variation.
- Famous example: Angrist–Krueger (1991) QOB instruments with many interactions.
- **Practical rule**: prefer **few, strong** instruments; check F after every specification change; beware constructed instruments (judge IV, shift-share, etc.).

---

### Weak instrument: restricted F-test (idea)

With a **single** endogenous regressor, we can use an **F-test** to check for **weak instruments**.

The basic idea of the F-test is very simple:

If the estimated coefficients of **all instruments** in the **first-stage** of a 2SLS estimation are **zero**, the instruments do not explain any of the variation in
`$X$`, which clearly violates the relevance assumption.

]

---

### Weak instrument: restricted F-test  (procedure)

We may use the following rule of thumb:

- Conduct the **first-stage regression** of a 2SLS estimation

`$$\begin{align}
X_{i}=\gamma_{0}+\gamma_{1} W_{1 i}+\ldots+\gamma_{p} W_{p i}+ \theta_{1} Z_{1 i}+\ldots+\theta_{q} Z_{q i}+v_{i} \quad \text{(3)}
\end{align}$$`

- Test the restricted joint hypothesis
`$H_0: \theta_1=\ldots=\theta_q=0$` by computing the
`$F$`-statistic. We call this the **Restricted F-test**, which is different from the **Classical overall F-test**.

- If the
`$F$`-statistic is less than the critical value (commonly 10, following Staiger and Stock 1997), the instruments are **weak**.

The rule of thumb is easily implemented in `R`. Run the first-stage regression using `lm()` and then compute the restricted `$F$`-statistic with the `R` function `car::linearHypothesis()`.

]

The Classical overall F-test has
`$H_0:\gamma_1 = \cdots = \gamma_p =\theta_1=\ldots=\theta_q=0$`

]

???

You may also ask: how do we know whether the instruments, or some of them, are weak?

We will test this under different situations.

---
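
### Weak instrument: restricted F-test (base R sketch)

The procedure can be sketched with base `R` alone by comparing a restricted and an unrestricted first stage with `anova()`; under the default homoskedastic setting this should give the same statistic as `car::linearHypothesis()`. The simulated data, the seed and the object names (`fs_res`, `fs_unres`, ...) are hypothetical and only serve the illustration. With a single instrument, the restricted `$F$` equals the squared t statistic on that instrument, which agrees with the wage example (t = 8.60 on `motheduc`, restricted F = 73.946).

``` r
# Hypothetical simulated first stage with one exogenous regressor and one instrument
set.seed(42)
n <- 300
w <- rnorm(n)                            # included exogenous regressor
z <- rnorm(n)                            # single external instrument
x <- 1 + 0.5 * w + 0.4 * z + rnorm(n)    # endogenous regressor

fs_unres <- lm(x ~ w + z)   # first stage including the instrument
fs_res   <- lm(x ~ w)       # restricted first stage (theta = 0 imposed)

# restricted F-test: only the instrument coefficient is set to zero
F_restricted <- anova(fs_res, fs_unres)$F[2]

# with one instrument, the restricted F is the squared t statistic on z
t_z <- summary(fs_unres)$coefficients["z", "t value"]

# the classical overall F-test (all slopes zero) is a different statistic
F_overall <- unname(summary(fs_unres)$fstatistic["value"])

c(F_restricted = F_restricted, t_squared = t_z^2, F_overall = F_overall)
```

---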

### Wage example: restricted F-test (models)

For all three IV models, we can test instrument relevance in turn.

`$$\begin{align}
educ &= \gamma_1 +\gamma_2exper +\gamma_3expersq + \theta_1motheduc  +v
&& \text{(relevance test 1)}\\
educ &= \gamma_1 +\gamma_2exper +\gamma_3expersq + \theta_2fatheduc +v
&& \text{(relevance test 2)} \\
educ &= \gamma_1 +\gamma_2exper +\gamma_3expersq + \theta_1motheduc + \theta_2fatheduc +v
&& \text{(relevance test 3)}
\end{align}$$`

???

And we will test the weak instrument issues by using restricted F test.

---

### Wage example: restricted F-test (model 1)

Consider model 1:

`$$\begin{align}
educ &= \gamma_1 +\gamma_2exper +\gamma_3expersq + \theta_1motheduc  +v
\end{align}$$`

The null hypothesis of the restricted F-test:
`$H_0: \theta_1  =0$`.

We will test whether `motheduc` is a weak instrument.

---

### Wage example: restricted F-test (model 1)

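The result shows that the p-value of
`$F^{\ast}$` is much smaller than 0.01, so the null hypothesis `$H_0$` is rejected. With `$F^{\ast}=73.95$`, well above the rule-of-thumb value of 10, `motheduc` satisfies **instrument relevance** (it is not a weak instrument).

``` r
# first-stage regression (formula as shown in the test output)
ols_relevance1 <- lm(educ ~ exper + expersq + motheduc, data = mroz)
# restricted F-test: H0 is that the coefficient on motheduc is zero
constrain_test1 <- car::linearHypothesis(model = ols_relevance1, c("motheduc=0"))
# obtain F statistics (row 2 belongs to the unrestricted model)
F_r1 <- constrain_test1$F[[2]]
```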

]

```

Linear hypothesis test:
motheduc = 0

Model 1: restricted model
Model 2: educ ~ exper + expersq + motheduc

Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1    425 2219.2                                  
2    424 1889.7  1    329.56 73.946 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

]

---

### Wage example (compare): classic F-test (model 1)

> Note: the **restricted F-test** (73.95) is different from the **classical OLS F-test** (25.47, shown below).

`$$\begin{align}
educ &= \gamma_1 +\gamma_2exper +\gamma_3expersq + \theta_1motheduc  +v
\end{align}$$`

The null hypothesis of the classical OLS F-test:
`$H_0: \gamma_2 = \gamma_3= \theta_1  =0$`.

The OLS estimation results are:
`$$\begin{equation}
\begin{alignedat}{999}
&\widehat{educ}=&&+9.78&&+0.05exper_i&&-0.00expersq_i&&+0.27motheduc_i\\ 
&(s)&&(0.4239)&&(0.0417)&&(0.0012)&&(0.0311)\\ 
&(t)&&(+23.06)&&(+1.17)&&(-1.03)&&(+8.60)\\ 
&(fit)&&R^2=0.1527&&\bar{R}^2=0.1467 && &&\\ 
&(Ftest)&&F^*=25.47&&p=0.0000 && &&
\end{alignedat}
\end{equation}$$`

???
The restricted F-test takes the null hypothesis that the coefficients on the instruments all equal zero, while the classical F-test takes the null hypothesis that the coefficients on all regressors equal zero.

---

### Wage example: restricted F-test (model 2)

Consider model 2:

`$$\begin{align}
educ &= \gamma_1 +\gamma_2exper +\gamma_3expersq + \theta_1fatheduc +v
&& \text{(relevance test 2)}
\end{align}$$`

The null hypothesis of the restricted F-test:
`$H_0: \theta_1  =0$`.

We will test whether `fatheduc` is a weak instrument.

---

### Wage example: restricted F-test (model 2)

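The result shows that the p-value of
`$F^{\ast}$` is much smaller than 0.01, so the null hypothesis `$H_0$` is rejected. With `$F^{\ast}=87.74$`, well above the rule-of-thumb value of 10, `fatheduc` satisfies **instrument relevance** (it is not a weak instrument).

``` r
# first-stage regression (formula as shown in the test output)
ols_relevance2 <- lm(educ ~ exper + expersq + fatheduc, data = mroz)
# restricted F-test: H0 is that the coefficient on fatheduc is zero
constrain_test2 <- car::linearHypothesis(ols_relevance2, c("fatheduc=0"))
# obtain F statistics
F_r2 <- constrain_test2$F[[2]]
```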

]

```

Linear hypothesis test:
fatheduc = 0

Model 1: restricted model
Model 2: educ ~ exper + expersq + fatheduc

Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1    425 2219.2                                  
2    424 1838.7  1     380.5 87.741 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

]

---

### Wage example: restricted F-test (model 3)

Consider model 3:

`$$\begin{align}
educ &= \gamma_1 +\gamma_2exper +\gamma_3expersq + \theta_1motheduc + \theta_2fatheduc +v
&& \text{(relevance test 3)}
\end{align}$$`

The null hypothesis of the restricted F-test:
`$H_0: \theta_1 = \theta_2 =0$`.

We will test whether `motheduc` and `fatheduc` are weak instruments.

---

### Wage example: restricted F-test (model 3)

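The result shows that the p-value of
`$F^{\ast}$` is much smaller than 0.01, so the null hypothesis `$H_0$` is rejected. With `$F^{\ast}=55.4$`, well above the rule-of-thumb value of 10, `fatheduc` and `motheduc` jointly satisfy **instrument relevance** (they are not weak instruments).

``` r
# first-stage regression (formula as shown in the test output)
ols_relevance3 <- lm(educ ~ exper + expersq + motheduc + fatheduc, data = mroz)
# restricted F-test: H0 is that both instrument coefficients are zero
constrain_test3 <- car::linearHypothesis(ols_relevance3, c("motheduc=0", "fatheduc=0"))
# obtain F statistics
F_r3 <- constrain_test3$F[[2]]
```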

]

```

Linear hypothesis test:
motheduc = 0
fatheduc = 0

Model 1: restricted model
Model 2: educ ~ exper + expersq + motheduc + fatheduc

Res.Df    RSS Df Sum of Sq    F    Pr(>F)    
1    425 2219.2                                
2    423 1758.6  2    460.64 55.4 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

]

???

In sum, all relevance tests are significant, and we can conclude that the instruments mother's education and father's education satisfy the relevance condition.

Until now, we have shown the relevance F-test for the situation with only one endogenous regressor.

Next, I will give an example with two endogenous variables, using the Cragg-Donald test.

---

### Weak instrument: Cragg-Donald test

The former test for weak instruments might be unreliable with **more than** one endogenous regressor, because there is then one
`$F$`-statistic for each endogenous regressor.

An alternative is the **Cragg-Donald test** based on the following statistic:

`$$\begin{align}
F=\frac{N-G-B}{L} \frac{r_{B}^{2}}{1-r_{B}^{2}}
\end{align}$$`

- where:
`$N$` is the sample size;
`$G$` is the number of exogenous regressors;
`$B$` is the number of endogenous regressors;
`$L$` is the number of external instruments;
`$r_B$` is the lowest canonical correlation.

> **Canonical correlation** measures the correlation between two sets of variables, here the endogenous regressors and the external instruments (after partialling out the exogenous regressors). It can be calculated by the function `cancor()` in `R`.

???

external (ɪkˈstɜːnl)

Canonical (kəˈnɒnɪkl)

---
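
### Weak instrument: canonical correlation (base R sketch)

To make the role of `cancor()` more concrete, the sketch below computes canonical correlations on simulated data. The data, the seed and the object names are hypothetical. It also checks one known special case: when one of the two sets contains a single variable, the canonical correlation equals the square root of the `$R^2$` from regressing that variable on the other set.

``` r
# Hypothetical simulated data: two "endogenous" columns and two "instrument" columns
set.seed(7)
n  <- 250
z1 <- rnorm(n)
z2 <- rnorm(n)
x1 <- 0.7 * z1 + rnorm(n)   # clearly related to z1
x2 <- 0.1 * z2 + rnorm(n)   # only weakly related to z2

X <- cbind(x1, x2)
Z <- cbind(z1, z2)

# canonical correlations between the two sets, largest first
cc    <- cancor(X, Z)$cor
r_min <- min(cc)            # the lowest canonical correlation

# special case: one variable against a set gives sqrt(R^2)
r_single <- cancor(x1, Z)$cor
R2       <- summary(lm(x1 ~ z1 + z2))$r.squared

c(r_min = r_min, r_single = r_single, sqrt_R2 = sqrt(R2))
```

The lowest canonical correlation is the relevant quantity for the Cragg-Donald statistic because it reflects the direction in which the instruments explain the endogenous regressors least well.

---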

### Hour example: background

Let us construct another IV model with two endogenous regressors. We assume the following work hours determination model:

`$$hushrs=\beta_{1}+\beta_{2} mtr+\beta_{3} educ+\beta_{4} kidslt6+\beta_{5} nwifeinc+e$$`

> - `$hushrs$`: work hours of husband, 1975
- `$mtr$`: federal marginal tax rate facing the woman
- `$kidslt6$`: number of kids younger than 6 years
- `$nwifeinc$`: family income net of the wife's earnings, in thousands of dollars

There are:

- Two **endogenous variables**: `$educ$` and `$mtr$`
- Two **exogenous regressors**:  `$nwifeinc$` and  `$kidslt6$`
- And two external **instruments**: `$motheduc$`  and  `$fatheduc$`.

---

### Hour example: Cragg-Donald test (R code)

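The data set is still `mroz`, restricted to women who are in the labor force (`$inlf=1$`).

``` r
library(dplyr)  # provides %>% and filter()
# filter samples: women in the labor force with positive wage
mroz1 <- wooldridge::mroz %>%
  filter(wage > 0, inlf == 1)
# set parameters: sample size, exogenous, endogenous, external instruments
N <- nrow(mroz1); G <- 2; B <- 2; L <- 2
# partial out the exogenous regressors from the endogenous variables
x1 <- resid(lm(mtr ~ kidslt6 + nwifeinc, data = mroz1))
x2 <- resid(lm(educ ~ kidslt6 + nwifeinc, data = mroz1))
# partial out the exogenous regressors from the instruments
z1 <- resid(lm(motheduc ~ kidslt6 + nwifeinc, data = mroz1))
z2 <- resid(lm(fatheduc ~ kidslt6 + nwifeinc, data = mroz1))
# column bind
X <- cbind(x1, x2)
Z <- cbind(z1, z2)
# lowest canonical correlation between the two residualized sets
rB <- min(cancor(X, Z)$cor)
# Cragg-Donald F statistic, following the formula on the earlier slide
# (N - G - B) and (N - G - L) coincide here because B == L
CraggDonaldF <- ((N - G - B) / L) * (rB^2 / (1 - rB^2))
```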

]

.footnote[
R script download: [`Cragg-Donald-test.R`](scripts/chapter17/chunk-sources/Cragg-Donald-test.R)
]

---

### Hour example: Cragg-Donald test (result)

Running these code lines gives the following results:

Table: Cragg-Donald test results

| G | L | B |  N  |   rb   | CraggDonaldF |
|:-:|:-:|:-:|:---:|:------:|:------------:|
| 2 | 2 | 2 | 428 | 0.0218 |    0.1008    |

The result shows the Cragg-Donald `$F=$` 0.1008, which is much smaller than **the critical value** `4.58`<sup>[1]</sup>.

The test cannot reject the null hypothesis of weak instruments, so we may conclude that some of these instruments are **weak**.

.footnote[
[1] The critical value can be found in table 10E.1 at: Hill C, Griffiths W, Lim G. Principles of econometrics[M]. John Wiley & Sons, 2018.
]

???

You can look up the critical values in Table 10E.1 of the textbook, Hill, Griffiths, and Lim 2011.

---

### Instrument Exogeneity: the difficulty

**Instrument Exogeneity** means all
`$m$` instruments must be uncorrelated with the error term,

`$$Cov{(Z_{1 i}, \epsilon_{i})}=0; \quad \ldots; \quad Cov{(Z_{mi}, \epsilon_{i})}=0.$$`

- In the context of the simple IV estimator, we will find that the exogeneity requirement **can not** be tested. (Why?)

- However, if we have more instruments than we need, we can effectively test whether **some of** them are uncorrelated with the structural error.

???
As we know, to call an instrument valid, we should also check that it satisfies the exogeneity condition.

---
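
### Instrument Exogeneity: why the simple IV case cannot be tested (sketch)

One way to see the difficulty, sketched here for the simple IV estimator with one regressor and one instrument: write the IV residual as `$\widehat{\epsilon}_{i}^{IV}=\left(Y_{i}-\bar{Y}\right)-\widehat{\beta}_{1}^{IV}\left(X_{i}-\bar{X}\right)$`, which uses `$\widehat{\beta}_{0}^{IV}=\bar{Y}-\widehat{\beta}_{1}^{IV} \bar{X}$`. Then the sample covariance between the instrument and the residual is zero by construction:

`$$\begin{align}
\sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right) \widehat{\epsilon}_{i}^{IV}
&=\sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right)\left(Y_{i}-\bar{Y}\right)-\widehat{\beta}_{1}^{IV} \sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right)\left(X_{i}-\bar{X}\right) \\
&=0
\end{align}$$`

> The last step follows from the definition of `$\widehat{\beta}_{1}^{IV}$` as the ratio of these two sums. The residuals are therefore uncorrelated with `$Z_i$` in the sample whether or not `$Z_i$` is truly exogenous, so they carry no information for a test. Only with more instruments than endogenous regressors do some sample moments remain free to be tested.

---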

### Instrument Exogeneity: over-identification case

Under **over-identification**
`$(m>k)$`, consistent IV estimation with (multiple) different combinations of instruments is possible.

> If instruments are exogenous, the obtained estimates should be **similar**.

> If estimates are very **different**, some or all instruments may .red[not] be exogenous.

The **Overidentifying Restrictions Test** (**J test**) formally checks this.

- The null hypothesis is Instrument Exogeneity.

`$$H_{0}: E\left(Z_{h i} \epsilon_{i}\right)=0, \text { for all } h=1,2, \dots, m$$`

---

### Instrument Exogeneity: J-test (procedure)

The **overidentifying restrictions test** (also called the  `$J$`-test, or **Sargan test**) is an approach to test the hypothesis that the additional instruments are exogenous.

Procedure of overidentifying restrictions test is:

- **Step 1**: Compute the **IV regression residuals** :

`$$\widehat{\epsilon}_{i}^{IV}=Y_{i}-\left(\hat{\beta}_{0}^{ IV}+\sum_{j=1}^{k} \hat{\beta}_{j}^{IV} X_{j i}+\sum_{s=1}^{r} \hat{\beta}_{k+s}^{IV} W_{s i}\right)$$`

- **Step 2**: Run the **auxiliary regression**: regress the IV residuals on the instruments and the exogenous regressors, and test the joint hypothesis
`$H_{0}: \theta_{1}=0, \ldots, \theta_{m}=0$`

`$$\widehat{\epsilon}_{i}^{IV}=\theta_{0}+\sum_{h=1}^{m} \theta_{h} Z_{h i}+\sum_{s=1}^{r} \gamma_{s} W_{s i}+v_{i} \quad \text{(2)}$$`

???
auxiliary (ɔːɡˈzɪliəri)

---

### Instrument Exogeneity: J-test (procedure)

- **Step 3**: Compute the **J statistic**:
`$J=m F$`

> where
`$F$` is the F-statistic of the
`$m$` restrictions
`$H_0: \theta_{1}=\ldots=\theta_{m}=0$` in eq(2)

Under the **null hypothesis**, the
`$J$` statistic is approximately distributed as
`$\chi^{2}(m-k)$` in large samples (
`$k=$` number of endogenous regressors
).

`$$\boldsymbol{J} \sim \chi^{2}({m-k})$$`

> If `$J$` is **less** than the **critical value**, we cannot reject the null hypothesis that all instruments are .red[ex]ogenous.

> If `$J$` is **larger** than the **critical value**, at least some of the instruments are .red[en]dogenous.

- We can apply the `$J$`-test by using the `R` function `linearHypothesis()`.

???
approximately (əˈprɒksɪmətli)

---
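
### Instrument Exogeneity: J-test (base R sketch)

The three steps can be sketched with base `R` on simulated data. Everything in this block is an assumption made for illustration: the data-generating process (two exogenous instruments, one exogenous regressor, one endogenous regressor), the seed and the object names. The auxiliary F-statistic is obtained with `anova()` rather than `linearHypothesis()` so that no extra package is needed.

``` r
# Hypothetical simulated data: m = 2 instruments, k = 1 endogenous regressor
set.seed(2024)
n  <- 400
z1 <- rnorm(n)
z2 <- rnorm(n)
w  <- rnorm(n)                                       # exogenous regressor
u  <- rnorm(n)                                       # structural error
x  <- 0.6 * z1 + 0.4 * z2 + 0.3 * w + 0.5 * u + rnorm(n)
y  <- 1 + 0.5 * x + 0.2 * w + u

# 2SLS coefficients via the two explicit stages
x_hat <- fitted(lm(x ~ z1 + z2 + w))
b     <- coef(lm(y ~ x_hat + w))

# Step 1: IV residuals use the observed x, not x_hat
e_iv <- y - (b[1] + b[2] * x + b[3] * w)

# Step 2: auxiliary regression of the IV residuals
aux_u <- lm(e_iv ~ z1 + z2 + w)   # instruments and exogenous regressor
aux_r <- lm(e_iv ~ w)             # restricted: instrument coefficients set to zero
F_aux <- anova(aux_r, aux_u)$F[2]

# Step 3: J statistic and its chi-square p-value with m - k degrees of freedom
m <- 2
k <- 1
J <- m * F_aux
p_value <- pchisq(J, df = m - k, lower.tail = FALSE)

c(J = J, p_value = p_value)
```

Since both simulated instruments are exogenous by construction, a small `J` would be the expected outcome in most draws, although any single draw can still reject by chance.

---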

### Wage example: J-test (models)

Again, we can use both `$motheduc$` and `$fatheduc$` as instruments for `$educ$`.

Thus, the IV model is over-identified, and we can test the exogeneity of these two instruments by using the **J-test**.

The 2SLS model is the wage equation estimated before, with `$educ$` instrumented by `$motheduc$` and `$fatheduc$`.

The auxiliary regression should be

`$$\begin{align}
  \hat{\epsilon}^{IV} &= \hat{\alpha}_1 +\hat{\alpha}_2exper + \hat{\alpha}_3expersq +\hat{\theta}_1motheduc + \hat{\theta}_2fatheduc  + v && \text{(auxiliary model)}
  \end{align}$$`

---

### Wage example: J-test (R code for 2SLS residuals)

We have done the 2SLS estimation before with `AER::ivreg()`; the fitted model is stored as `lm_iv_mf`.

After the 2SLS estimation, we can obtain the IV residuals of the second stage:

``` r
# obtain residual of IV regression, add to data set
mroz_resid <- mroz %>%
  mutate(resid_iv_mf = residuals(lm_iv_mf))
```

---

### Wage example: J-test (new data set)
<div class="datatables html-widget html-fill-item" id="htmlwidget-fdfdeb10b1eedbf0f2c4" style="width:100%;height:auto;"></div>
<script type="application/json" data-for="htmlwidget-fdfdeb10b1eedbf0f2c4">{"x":{"filter":"none","vertical":false,"caption":"<caption>Data set with the 2SLS residuals<\/caption>","data":[["1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107","108","109","110","111","112","113","114","115","116","117","118","119","120","121","122","123","124","125","126","127","128","129","130","131","132","133","134","135","136","137","138","139","140","141","142","143","144","145","146","147","148","149","150","151","152","153","154","155","156","157","158","159","160","161","162","163","164","165","166","167","168","169","170","171","172","173","174","175","176","177","178","179","180","181","182","183","184","185","186","187","188","189","190","191","192","193","194","195","196","197","198","199","200","201","202","203","204","205","206","207","208","209","210","211","212","213","214","215","216","217","218","219","220","221","222","223","224","225","226","227","228","229","230","231","232","233","234","235","236","237","238","239","240","241","242","243","244","245","246","247","248","249","250","251","252","253","254","255","256","257","258","259","260","261","262","263","264","265","266","267","268","269","270","271","272","273","274","275","276","277","278","279","280","281","282","283","284","285","286","287","288","289","290","291","292","293","294","295","296","297","298","299","300","301","302","303","304","305","306","307","308","309","310","311","312","313","314","315","316","317","318","319","320",
"321","322","323","324","325","326","327","328","329","330","331","332","333","334","335","336","337","338","339","340","341","342","343","344","345","346","347","348","349","350","351","352","353","354","355","356","357","358","359","360","361","362","363","364","365","366","367","368","369","370","371","372","373","374","375","376","377","378","379","380","381","382","383","384","385","386","387","388","389","390","391","392","393","394","395","396","397","398","399","400","401","402","403","404","405","406","407","408","409","410","411","412","413","414","415","416","417","418","419","420","421","422","423","424","425","426","427","428"],[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,36
5,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428],[1.210153698921204,0.3285121023654938,1.514137744903564,0.09212332218885422,1.524272203445435,1.556480050086975,2.120259523391724,2.059634208679199,0.7543363571166992,1.544899344444275,1.401921629905701,1.524272203445435,0.7339532375335693,0.8183690905570984,1.302831172943115,0.2980283796787262,1.167609572410583,1.643839359283447,0.6931471824645996,2.021931648254395,1.254247546195984,1.272957682609558,1.178655028343201,1.178655028343201,0.7675586938858032,1.331811785697937,1.386294364929199,1.553269624710083,1.981814861297607,1.769360423088074,0.430807888507843,0.8997548222541809,1.766629695892334,1.272957682609558,1.336788892745972,0.9017048478126526,0.8651236891746521,1.511847138404846,1.72602915763855,2.683142423629761,0.9852942824363708,1.365938544273376,0.9450336694717407,1.512376189231873,0.6931471824645996,1.244788408279419,0.7011649012565613,1.519863247871399,0.8209685683250427,0.9698315262794495,0.828508198261261,0.09430964291095734,0.1625438928604126,0.4700036346912384,0.6292484402656555,1.397160172462463,2.265443801879883,2.084541082382202,1.525838851928711,0.762160062789917,1.48160457611084,1.262826442718506,0.9996755719184875,1.832581520080566,2.479307651519775,1.279015302658081,1.937935590744019,1.070452809333801,1.12392258644104,1.321755886077881,1.744999766349792,1.301743626594543,1.641866445541382,2.107020139694214,1.46706759929657,1.605811357498169,-1.029739379882812,1.08768618106842,0,0.9382086992263794,-0.1505903750658035,0,1.073670506477356,1.265848398208618,0.4863689839839935,2.120259523391724,1.129852533340454,0.9932518005371094,1.658627986907959,0.3474121987819672,1.568324208259583,0.5108456015586853,0.1148454323410988,-0.6931471824645996,-0.3364522755146027,1.028225541114807,1.58068859577179
,0.5558946132659912,0.9014207124710083,0.8843045830726624,0.4282045960426331,1.058415055274963,0.8783395886421204,1.654908299446106,1.321755886077881,0.3285121023654938,1.386294364929199,1.172884583473206,1.224187135696411,0.2876570820808411,2.23026180267334,1.504077434539795,1.531152009963989,1.375157594680786,1.760268807411194,-0.6931471824645996,1.406489133834839,1.791759490966797,1.299292087554932,1.351003885269165,1.016280889511108,1.075343608856201,1.478964686393738,1.689486742019653,2.288597822189331,-1.822631120681763,-0.9607651829719543,1.290994167327881,0.8648711442947388,1.540452122688293,0.6162121295928955,1.648658633232117,1.193498134613037,2.143976211547852,0.7244035601615906,0.9416075348854065,0.7827593684196472,1.832581520080566,1.203962802886963,1.491644859313965,1.892132639884949,2.130894899368286,1.48060405254364,0.8943313360214233,0.2025325447320938,0.4855078160762787,1.098612308502197,1.553269624710083,0.1215979680418968,2.001804351806641,1.495036602020264,0.9052298069000244,0.6325475573539734,1.386294364929199,2.102913856506348,1.959643959999084,0.5108456015586853,1.236923933029175,1.443312525749207,1.021659255027771,0.6361534595489502,1.616453289985657,0.2231435477733612,1.049807071685791,1.415051937103271,0.5753766298294067,2.60668158531189,1.517914533615112,0.7550415992736816,1.094972372055054,0.9421143531799316,1.724942803382874,1.031546115875244,0.474369078874588,0.8109301924705505,0.7092666029930115,1.710549473762512,0.4602688848972321,1.331811785697937,1.098612308502197,2.157998561859131,1.437581300735474,1.544899344444275,1.41059672832489,3.218875885009766,0.9681618809700012,1.791759490966797,1.688729524612427,-0.4091719686985016,0.2231435477733612,0.8221558332443237,1.24170196056366,1.427124381065369,1.497097492218018,0.5596157908439636,1.300028204917908,1.884429812431335,0.9555113911628723,1.582087278366089,1.755614042282104,1.513103246688843,2.251891613006592,2.364432334899902,0.1053504794836044,1.399728775024414,0.988462507724762,1.
090647339820862,1.154614448547363,1.266947627067566,2.885191679000854,1.228880047798157,1.203962802886963,1.357380270957947,0.8377236127853394,0.5369611382484436,0.7487238049507141,2.295872688293457,1.107803225517273,0.6208452582359314,-2.054163694381714,1.892012000083923,1.729724526405334,0.4693784117698669,0.9808416962623596,2.069492340087891,1.675188183784485,1.386294364929199,1.799214959144592,1.832581520080566,1.090647339820862,1.443123579025269,1.250360131263733,1.602312564849854,1.018558502197266,1.297053217887878,1.685194492340088,-0.4209848940372467,1.562094688415527,2.146527528762817,2.347462892532349,0.9698315262794495,1.924146413803101,1.62672758102417,-0.03926072642207146,1.460148692131042,1.955393552780151,0.9263598918914795,2.066191673278809,1.422843217849731,2.10103178024292,2.261461019515991,0.7013137936592102,2.031012535095215,1.162369251251221,0.4700036346912384,1.41059672832489,0.3930551111698151,1.290994167327881,0,0.9571254849433899,0.5596157908439636,1.568615913391113,1.710187911987305,1.41059672832489,0.2231435477733612,0.5108456015586853,1.332392454147339,0.8601858615875244,2.322779893875122,1.91959547996521,1.976106762886047,0.8954347372055054,0.18123759329319,0.4953058362007141,0.5777924060821533,1.07881772518158,1.603198528289795,0.6208452582359314,2.083894014358521,1.379169106483459,1.112383723258972,1.067121624946594,1.118806958198547,1.588541030883789,1.390311241149902,1.714806437492371,0.2010615319013596,0.9872710108757019,0.9835006594657898,2.233170747756958,1.143617510795593,-0.6113829016685486,2.153052091598511,1.299837350845337,0.8409204483032227,1.058484435081482,1.152658462524414,1.293575882911682,1.832581520080566,2.327180147171021,1.166146278381348,2.034993171691895,0.6792510747909546,1.547136902809143,0.7530185580253601,0.8472836017608643,0.8711259961128235,0.2282504737377167,0.08965782821178436,1.321755886077881,1.196101903915405,1.636118769645691,1.892012000083923,1.518308997154236,2.472159147262573,1.321755886077881,1.4736
41037940979,1.369478821754456,1.203962802886963,1.198729157447815,1.270209908485413,0.4700036346912384,0.7999816536903381,1.565945625305176,1.758978009223938,0.858025848865509,0.6931471824645996,0.6418538689613342,1.633740186691284,1.703747630119324,1.844004034996033,1.966118812561035,0.8649974465370178,0.9333052039146423,0.7792331576347351,0.9555113911628723,1.316247344017029,1.475906491279602,1.491397261619568,1.455750465393066,0.5108456015586853,1.180438041687012,1.688489437103271,0.7907274961471558,1.401798605918884,-0.4335560202598572,1.683171510696411,-1.766676664352417,3.155595064163208,2.259521007537842,1.306926369667053,0.7984976768493652,0.5590441823005676,0.1479026228189468,1.944494843482971,1.378337860107422,3.064745187759399,-0.7419173121452332,0.7657003998756409,0.619392991065979,1.465452075004578,2.18925952911377,1.021659255027771,0.9770094752311707,0.9162907600402832,2.905096054077148,-0.1996711939573288,0.6931471824645996,2.733392953872681,1.868334650993347,2.120259523391724,1.515193223953247,0.9146093130111694,1.499556064605713,0.803077220916748,0.7280316352844238,0.5164099931716919,1.22644829750061,0.9162907600402832,1.376471281051636,1.828974962234497,1.368283152580261,1.064710736274719,1.406489133834839,1.047318935394287,1.948093414306641,1.078001379966736,0.6539384722709656,1.927891612052917,1.361027836799622,0.6931471824645996,1.604686617851257,0.1839036494493484,3.113515377044678,1.926829218864441,1.2701256275177,0.6826927065849304,1.68106997013092,0.5562959909439087,1.628220438957214,0.9162907600402832,1.341558456420898,0,1.122231245040894,0.5401707887649536,1.391505718231201,1.697173953056335,3.218875885009766,0.871167778968811,1.167329549789429,1.216987729072571,0.5753766298294067,1.151615738868713,0.9942512512207031,0.5263249278068542,-1.543182134628296,1.91204309463501,0.554287314414978,0.9162907600402832,1.500939130783081,0.9446837902069092,1.241268634796143,1.564984321594238,0.8380264639854431,1.668857097625732,1.769428610801697,1.2264
4829750061,1.406489133834839],[12,12,12,12,14,12,16,12,12,12,12,11,12,12,10,11,12,12,12,12,16,12,13,12,12,17,12,12,17,12,11,16,13,12,16,11,12,10,14,17,12,12,16,12,12,12,16,12,12,12,12,12,12,8,10,16,14,17,14,12,14,12,8,12,12,8,17,12,12,12,12,12,9,10,12,12,12,17,15,12,6,14,12,14,9,17,13,9,15,12,12,12,12,12,12,12,12,13,12,13,12,12,12,16,12,13,11,12,12,12,17,14,16,17,12,11,12,12,17,10,13,11,12,16,17,12,16,12,16,8,12,12,12,13,11,12,12,14,12,12,12,17,14,12,9,12,12,12,14,16,17,15,12,16,17,17,12,16,13,12,11,16,14,16,12,9,17,14,12,12,11,12,12,10,12,5,17,11,12,12,14,11,12,14,12,10,16,13,12,12,12,11,12,9,13,12,12,12,13,16,12,16,17,12,12,9,12,12,13,12,12,12,12,10,12,16,12,11,12,10,12,12,12,12,16,17,12,17,12,12,12,8,12,13,12,12,8,12,17,17,12,13,12,12,12,12,9,10,12,16,13,8,16,13,12,11,13,12,12,10,12,17,15,16,10,11,12,12,14,16,14,8,7,12,12,14,12,12,12,14,16,12,12,12,13,13,10,12,12,12,12,14,17,10,9,12,12,16,12,17,12,17,11,16,11,13,11,8,11,12,10,17,12,12,17,14,12,12,12,12,12,12,9,10,12,12,12,12,12,17,12,17,12,10,12,12,12,12,12,12,16,13,13,12,16,17,12,14,12,17,12,14,12,12,17,16,16,12,9,12,12,16,14,12,12,11,12,16,17,17,14,12,14,12,10,12,13,16,12,7,16,14,12,10,12,16,10,12,14,12,6,15,12,17,14,13,6,16,14,15,14,8,14,12,12,12,12,12,12,8,12,17,12,12,14,13,17,8,12,11,12,12,17,10,12,13,12,12],[14,5,15,6,7,33,11,35,24,21,15,14,0,14,6,9,20,6,23,9,5,11,18,15,4,21,31,9,7,7,32,11,16,14,27,0,17,28,24,11,1,14,6,10,6,4,10,22,16,6,12,32,15,17,34,9,37,10,35,6,19,10,11,15,12,12,14,11,9,24,12,13,29,11,13,19,2,24,9,6,22,30,10,6,29,29,36,19,8,13,16,11,15,6,13,22,24,2,6,2,2,14,9,11,9,6,19,26,19,3,7,28,13,9,15,20,29,9,1,8,19,23,3,13,8,17,4,15,11,7,0,0,10,8,2,4,6,18,3,22,33,28,23,27,11,6,11,14,17,17,14,11,7,8,6,8,4,25,24,11,19,9,19,14,22,6,23,15,6,11,2,22,10,14,12,9,13,18,8,11,9,9,14,9,2,12,15,11,7,9,19,11,8,13,4,7,19,14,14,3,9,7,7,14,29,19,14,16,10,12,24,6,9,14,26,7,4,15,23,1,29,9,6,11,17,6,7,2,24,4,11,25,11,2,19,7,2,20,10,19,17,12,11,6,10,4,2,13,21,9,4,2,19,4,9,14,6,24,1,13,3,10,16,9,19,4,10,5,7,3,38,16,13,
1,7,15,10,2,19,25,25,7,15,11,25,19,4,14,19,18,14,11,4,29,21,24,19,31,28,15,27,13,4,10,8,4,18,3,11,8,10,33,19,35,21,7,18,4,12,16,14,3,1,27,12,6,9,2,6,9,16,22,26,11,11,15,13,6,20,17,8,13,15,14,14,6,24,10,2,9,23,12,8,16,10,7,19,2,9,14,9,16,7,6,22,9,9,14,17,12,13,8,10,16,1,6,4,8,4,15,7,14,16,15,23,19,4,12,12,25,14,14,11,7,18,4,37,13,14,17,5,2,0,3,21,20,19,4,19,11,14,8,13,24,1,1,3,4,21,10,13,9,14,2,21,22,14,7],[196,25,225,36,49,1089,121,1225,576,441,225,196,0,196,36,81,400,36,529,81,25,121,324,225,16,441,961,81,49,49,1024,121,256,196,729,0,289,784,576,121,1,196,36,100,36,16,100,484,256,36,144,1024,225,289,1156,81,1369,100,1225,36,361,100,121,225,144,144,196,121,81,576,144,169,841,121,169,361,4,576,81,36,484,900,100,36,841,841,1296,361,64,169,256,121,225,36,169,484,576,4,36,4,4,196,81,121,81,36,361,676,361,9,49,784,169,81,225,400,841,81,1,64,361,529,9,169,64,289,16,225,121,49,0,0,100,64,4,16,36,324,9,484,1089,784,529,729,121,36,121,196,289,289,196,121,49,64,36,64,16,625,576,121,361,81,361,196,484,36,529,225,36,121,4,484,100,196,144,81,169,324,64,121,81,81,196,81,4,144,225,121,49,81,361,121,64,169,16,49,361,196,196,9,81,49,49,196,841,361,196,256,100,144,576,36,81,196,676,49,16,225,529,1,841,81,36,121,289,36,49,4,576,16,121,625,121,4,361,49,4,400,100,361,289,144,121,36,100,16,4,169,441,81,16,4,361,16,81,196,36,576,1,169,9,100,256,81,361,16,100,25,49,9,1444,256,169,1,49,225,100,4,361,625,625,49,225,121,625,361,16,196,361,324,196,121,16,841,441,576,361,961,784,225,729,169,16,100,64,16,324,9,121,64,100,1089,361,1225,441,49,324,16,144,256,196,9,1,729,144,36,81,4,36,81,256,484,676,121,121,225,169,36,400,289,64,169,225,196,196,36,576,100,4,81,529,144,64,256,100,49,361,4,81,196,81,256,49,36,484,81,81,196,289,144,169,64,100,256,1,36,16,64,16,225,49,196,256,225,529,361,16,144,144,625,196,196,121,49,324,16,1369,169,196,289,25,4,0,9,441,400,361,16,361,121,196,64,169,576,1,1,9,16,441,100,169,81,196,4,441,484,196,49],[7,7,7,7,14,7,7,3,7,7,3,7,16,10,7,10,7,12,7,7,16,10,3,7,7,14,7,7,12,12
,7,3,10,14,12,3,3,3,7,17,12,9,16,3,7,7,16,10,7,7,7,3,7,7,3,12,7,17,7,7,3,12,7,7,7,12,16,7,7,7,12,10,9,0,10,14,7,3,12,12,7,17,3,7,7,12,7,7,12,10,0,12,10,7,7,7,3,12,7,12,7,7,10,14,7,12,7,7,10,7,12,7,7,17,7,7,7,10,10,12,7,12,7,10,7,10,7,7,7,7,7,7,16,12,7,3,7,7,7,12,7,12,12,14,7,7,7,12,12,14,10,12,7,16,7,17,3,10,9,7,3,16,12,7,7,7,12,3,7,7,7,7,10,10,7,12,17,10,7,7,12,7,12,7,7,7,7,14,7,12,7,7,12,3,7,12,12,7,7,14,12,17,17,7,7,10,7,7,7,3,0,7,12,7,7,12,7,7,12,7,7,3,7,7,10,12,7,12,7,7,7,7,12,7,7,7,7,7,14,17,7,10,7,7,12,7,7,9,7,14,7,3,16,3,16,7,16,12,7,7,12,12,12,16,7,9,7,12,12,10,7,12,7,7,3,10,17,7,3,3,12,7,7,7,10,7,10,7,9,9,12,12,12,7,9,12,7,12,7,12,12,14,7,14,10,12,7,7,12,7,7,12,12,12,12,14,10,7,7,7,7,7,7,7,7,7,12,12,7,14,10,12,7,7,12,7,10,12,10,7,14,7,12,12,7,16,0,12,7,17,7,12,3,7,14,7,12,12,7,7,14,7,12,12,7,7,7,7,3,12,7,7,14,10,10,10,10,12,3,7,16,7,7,0,7,7,10,7,12,7,7,7,7,16,12,10,7,7,16,7,7,7,12,10,10,10,7,10,12,12,7,17,12,16,10,16,7,7,9,3,7,7,7,7,7,7,16,12],[12,7,12,7,12,14,14,3,7,7,12,14,16,10,7,16,10,12,7,12,10,12,7,7,12,16,3,3,12,12,7,3,12,7,12,10,3,10,7,14,12,9,14,3,12,12,14,10,7,12,7,7,12,7,7,12,7,17,17,12,14,12,7,7,7,12,12,12,7,12,12,10,7,0,7,12,7,3,10,7,12,12,7,7,7,7,7,7,7,10,7,12,10,12,7,7,7,14,7,12,12,7,7,14,12,10,7,7,7,7,12,7,12,10,10,7,7,7,12,7,7,12,14,12,7,10,7,7,12,10,7,7,12,10,7,12,7,7,7,7,3,12,16,7,3,12,7,12,12,16,12,12,7,14,7,10,7,14,7,7,12,12,17,7,7,3,12,7,7,7,3,7,10,10,12,7,14,10,7,7,10,12,12,7,7,7,12,7,12,12,12,10,12,10,12,12,12,7,12,12,12,12,16,7,16,7,7,10,12,10,0,7,12,12,10,12,3,7,12,10,7,7,7,7,12,12,7,12,7,10,10,7,12,17,7,7,7,7,12,14,7,12,7,7,16,7,10,12,7,16,10,3,16,7,12,7,7,7,12,12,7,10,14,16,7,10,7,14,14,12,7,7,3,7,7,7,12,10,7,3,12,7,12,7,10,7,0,7,10,9,12,12,12,3,9,12,12,14,7,12,12,12,7,12,10,12,7,7,3,12,7,16,12,12,7,14,7,7,12,10,7,3,7,7,10,7,7,12,12,12,10,14,7,7,14,10,10,7,7,7,14,7,12,14,14,14,0,16,7,12,7,10,7,10,10,14,7,12,7,7,12,14,12,7,7,7,12,16,3,16,7,16,7,10,10,10,10,12,10,7,16,7,7,7,7,7,10,10,12,7,7,7,7,14,14,7,7,7,16,12,7,7,12,12,12,10,3,
12,12,12,7,16,12,10,10,12,7,7,7,7,7,7,7,7,7,7,12,12],[-0.01689361393701838,-0.6547254735284581,0.2689901571530899,-0.9253959811841497,0.3514758544493812,0.292975113425143,0.7127141556275087,0.8300483501089928,-0.5728064417300516,0.2289068300428165,0.1567740421552262,0.358621519247367,-0.05090661332045698,-0.4086782223011236,0.4081051268904199,-0.750151842413413,-0.1410703021564883,0.6263200559104434,-0.632076794076698,0.9123547975021009,0.02542345566141502,0.1109988294859601,-0.1714023776863196,-0.06649255940727383,-0.1795992153527856,-0.2911638720042924,0.09606210688098571,0.4436927739577894,0.6248286263210914,0.7193573314123289,-0.7855630497751813,-0.507790545510034,0.4437831437657636,0.04591036975133611,-0.230909252599889,0.1782416256187805,-0.4108306308313943,0.3178016994197019,0.2760931014714907,1.214200427205391,0.1571630082217373,0.1388912314151545,-0.31807214854188,0.3757093677057712,-0.3243721209084043,0.2976304990408302,-0.6810884349101567,0.1983560328118654,-0.4404813551413733,-0.04768777709355443,-0.3569447472835228,-1.183457924032221,-1.082603694890062,-0.5603641706741909,-0.4954026696182108,0.04199680706955311,0.9541755207862375,0.6408911175553298,0.1734597360381958,-0.2553592405830869,0.05924202323415684,0.1261596211924045,0.08330323343550661,0.5874339323300919,1.293854705974992,0.3391488717539144,0.4039051345850253,-0.09150604378979676,0.01434573568874642,-0.005386912768869934,0.5595468208050087,0.09459452780488498,0.5162885087924938,1.067854543890924,0.2599185005069113,0.3062420619417945,-1.899344138281742,-0.5464397610791014,-1.293766736732756,-0.0793106041466245,-1.103717818164411,-1.423692267297247,-0.06299631504874537,0.1255358375153057,-0.6392089527648945,0.5035085573616014,-0.1414735060796277,-0.1221276090388022,0.3937491601253218,-0.8597369000076913,0.3068742847931665,-0.6511132515649127,-1.130302155409376,-1.710666485837604,-1.543601374304261,-0.2932816739447264,0.2535457969250388,-0.3751067737930928,-0.1160985909019956,-0.04669680398642162,
-0.4414001623562966,-0.1686322575832586,-0.2312372621101733,0.2473629316818911,0.2121790353255872,-0.7504038296676643,0.1481216980329789,-0.1527020424555183,-0.07538215985996333,-0.6216232213260737,0.8732755676968238,0.06444548091403357,0.07841639653371413,-0.04140239937227852,0.5151212196607193,-1.940430428371517,0.09672131110548809,0.6821826402145033,0.164177670039527,0.393108201787299,-0.3446850347054202,-0.1884837390249421,0.569684382986823,0.2367511285893782,0.9009257380863853,-3.098585440687809,-2.15350960685116,0.04584657957740634,-0.5426742234694761,0.7360355456531659,-0.1686477212611308,0.8637987823780904,0.05683131308693579,1.001890642085523,-0.08380456957718485,-0.005550374353182286,-0.2347599349533567,0.4211274853908917,0.2946824994800481,0.1701376442544313,0.6286277032231167,0.5070730597620621,0.03258681868203417,-0.4277802946838207,-0.7752364224110413,-0.5320114872967252,-0.06334654462140077,0.326222311851861,-1.277149609284458,0.4802635171599772,-0.03899385413872958,-0.4409189322040363,-0.4174555343217714,0.06001890948640787,0.7784114098325725,0.5719718758961387,-0.4363123076799035,-0.3359262635873348,0.05477309824230137,-0.140299598095827,-0.6020192073472701,0.2612899245927465,-1.199219005103322,-0.4228267558130478,0.09354472204373798,-0.2579527875631342,0.974474465469821,0.1499736885443292,-0.2624777040993223,-0.06698648106854432,0.1339062234411562,0.40343558832334,-0.1051207056508572,-0.6298849766633254,-0.3745227530742332,0.02946615286179777,0.1964172316720827,-0.7669952638119799,0.2511228448957625,-0.06334654462140077,0.9256284537865289,0.3894010786433344,0.3178520315860529,0.1782266202522882,2.349271126610836,-0.09449780725447399,0.3010253885757053,0.4653740428286746,-1.459175060374246,-0.8864333029789324,-0.4774134623120507,0.141139736100216,0.3464354402631942,0.4741382794088218,-0.4489387470547794,0.2500251132421629,0.584860516874961,-0.2715359216953497,0.2936433368477125,0.6007472242345728,0.4035263959365492,0.9563020066902301,1.0074460999233
86,-1.121696833374618,0.08996095229506329,-0.1269169018511496,-0.1363999730373602,-0.1068354749190528,0.06888417688131043,1.699738733456071,-0.09826275104859405,0.186443499513959,0.2478034202056532,-0.2665304427525741,-0.7886254876802803,-0.5468658013656476,1.348714779054868,-0.0759477335730474,-0.7043787183053662,-2.759501711276039,0.5822441773545726,0.6201476756530409,-0.5481408916031369,-0.1811171568612384,0.5479515054412272,0.3506857371107097,0.3362912732534544,0.6226270574448913,0.5054387212338156,0.143489430582273,0.2811647259016705,0.1686829639284571,0.4403537117262555,0.08755711513818165,-0.002516077668496042,0.6351914006643431,-1.045003137795559,0.2534148138484555,0.702877563935945,0.7409104536752029,-0.3061227937265969,0.6772968395981624,0.4647687279005719,-1.056780029795075,0.3234818706049412,1.008235643541563,0.2409450194730126,0.9818358318094589,0.1068507034482731,0.7458684148500097,1.252906481617248,0.0772955499008976,0.4858567248982235,0.1538147133524776,-0.6395732160610552,0.2449460441268225,-0.685860820863343,-0.03614863151886993,-0.8281312742146335,-0.1272303565259598,-0.3496645125629512,0.1249659485642409,0.2645481025404259,0.05543336293197987,-0.9536324904627047,-0.3749156790197492,0.1957256326212375,-0.1230517143064275,1.149983544879069,0.7647286619176783,0.6880908079553036,-0.1204286716202936,-0.7189283621956969,-0.3328254380139194,-0.4722106855935915,-0.2891231198892035,0.4665317067636936,-0.2487595001629983,0.784324718802146,-0.0708878328127418,-0.4604664733575374,0.01711853327084945,-0.1263406295519272,0.426582177760191,0.001650930513855009,0.353840513275842,-0.6233031200169207,-0.2397763019825201,-0.3160686360905847,0.9445099703875919,-0.08342980206262873,-1.896135012112455,0.8989110390591506,0.1128627854362945,-0.2908821801177726,-0.2686583637652689,-0.1469108330319604,-0.2422428897771485,0.5157428237751134,0.7750494161197745,-0.1559653523238964,0.520860929601465,-0.2065102057874799,0.1648835666424251,-0.2662737541166602,-0.161270936137878
8,-0.3561381525963885,-0.435443315028581,-1.010904396251659,0.2410669452757064,0.1822283397096127,0.06563068968308738,0.5924427045275489,0.2887231385840294,0.8491834895603438,0.1489595370818275,0.1849802605716129,0.4223209125158668,0.01850985734217914,-0.0627207660186011,0.0431625956271906,-0.4392766687156764,0.1560402654561674,0.3666272519202405,0.5735250636791542,-0.1594934545074949,-0.416429668287694,-0.2277508894375955,0.6162208833182803,0.2871876360662591,0.5825541115296167,0.3376284542007306,-0.4605891793917061,-0.1058603918886472,-0.3827256954888629,-0.2896361965876022,0.1090982452273703,0.4583871879065982,0.1827173870524961,0.17979614538702,-0.815429853884106,-0.08810768576280115,0.3819452206926426,-0.4363198167110662,-0.07083522157995459,-1.758058466933632,0.3560287118496603,-3.026136743198827,2.285990305764278,0.8429610134847771,-0.01829760687424442,-0.5097485260157268,-0.5216447585016069,-1.113547300647469,0.5008448786560988,0.08274825379106021,1.519589377562408,-1.611522070544163,-0.1596865648961898,-0.607654321792243,0.355875224252284,0.6822230910067364,-0.1511370939682823,-0.04050982814183324,-0.4052164550192503,1.856915831985009,-1.309248044709622,-0.7794866450342393,1.150455490565863,0.3758985621477922,0.7903171672817568,0.4345042831510726,-0.3448507658352402,0.2381061411392968,0.09773920402242309,-0.2894876680885801,-0.4921445447270512,-0.099827157942181,-0.03086714919830558,0.4383068366019325,0.5333853559181354,0.01844258240173091,-0.1967391871916968,0.2841348034046727,-0.2779050411470105,0.4029376041096493,0.2536367280484556,-0.5315144732738182,0.6196454091878254,0.03376415482372863,-0.165520358432697,0.1934494190125726,-0.9780552036742496,1.756529142068162,0.5153751841747662,0.2615710896189571,-0.1374025452274812,0.2283343567006453,-0.7935445792346216,0.1680762329707053,-0.1897400731739771,0.7175402126625858,-0.9076531081743348,0.2129509416339788,-0.7758217256365048,0.08282584366412937,0.397604657499961,2.271717975771177,-0.4284015165875634,0.250
9572113064478,-0.01005958378565119,-0.812295454273539,-0.0555333599209451,-0.3328915476260477,-0.4245996037280877,-2.432710037503083,0.6957796479273237,-0.1472840801829937,-0.3997017543611752,0.4256689379171341,-0.2624653085827493,0.131691784043849,0.03095386543524503,0.09121496290682196,0.3528645832242741,0.3865247670820091,-0.000599015357611643,0.3564860421590941]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>id<\/th>\n      <th>lwage<\/th>\n      <th>educ<\/th>\n      <th>exper<\/th>\n      <th>expersq<\/th>\n      <th>fatheduc<\/th>\n      <th>motheduc<\/th>\n      <th>resid_iv_mf<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":10,"dom":"tip","columnDefs":[{"targets":8,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 4, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatRound(data, 2, 3, \",\", \".\", null);\n  }"},{"className":"dt-center","targets":"_all"},{"orderable":false,"targets":0},{"name":" ","targets":0},{"name":"id","targets":1},{"name":"lwage","targets":2},{"name":"educ","targets":3},{"name":"exper","targets":4},{"name":"expersq","targets":5},{"name":"fatheduc","targets":6},{"name":"motheduc","targets":7},{"name":"resid_iv_mf","targets":8}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render"],"jsHooks":[]}</script>

???

This table shows the new data set after adding the IV estimation residuals (`resid_iv_mf`).

---

### Wage example: J-test (run auxiliary regression)

We run the auxiliary regression with the following `R` code:

``` r
# set model formula
mod_jtest <- formula(resid_iv_mf ~ exper + expersq + motheduc + fatheduc)
# OLS estimate
lm_jtest <- lm(formula = mod_jtest, data = mroz_resid)
```

Then we can obtain the OLS estimation results.

]

```

Call:
lm(formula = mod_jtest, data = mroz_resid)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.1012 -0.3124  0.0478  0.3602  2.3441

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.096e-02  1.413e-01   0.078    0.938
exper       -1.833e-05  1.333e-02  -0.001    0.999
expersq      7.341e-07  3.985e-04   0.002    0.999
motheduc    -6.607e-03  1.189e-02  -0.556    0.579
fatheduc     5.782e-03  1.118e-02   0.517    0.605

Residual standard error: 0.6752 on 423 degrees of freedom
Multiple R-squared:  0.0008833,	Adjusted R-squared:  -0.008565 
F-statistic: 0.0935 on 4 and 423 DF,  p-value: 0.9845
```

]

???

Note that the model summary reports the overall F-test of the regression, which is different from the restricted F-statistic used in the J-test.

---

### Wage example: J-test (Restricted F-test)

As before, we conduct a restricted F-test on the auxiliary regression.

We test the joint restriction `$\theta_1 = \theta_2 =0$` with the `R` function `linearHypothesis()` (from the `car` package):

``` r
# restricted F-test
restricted_ftest <- linearHypothesis(lm_jtest, c("motheduc = 0", "fatheduc = 0"), test = "F")
# obtain the F statistics
restricted_f <- restricted_ftest$F[[2]]
```

]

```

Linear hypothesis test:
motheduc = 0
fatheduc = 0

Model 1: restricted model
Model 2: resid_iv_mf ~ exper + expersq + motheduc + fatheduc

Res.Df    RSS Df Sum of Sq     F Pr(>F)
1    425 193.02                          
2    423 192.85  2    0.1705 0.187 0.8295
```

]

The restricted F-statistic is 0.1870 (rounded to 4 digits).

???

Please pay attention to the code `c("motheduc = 0", "fatheduc = 0")`

---

### Wage example: J-test (calculate J-statistic by hand)

Finally, we can calculate the J-statistic by hand or obtain it with existing tools.

- Calculate the J-statistic by hand

``` r
# number of instruments
m <- 2
# calculate the J-statistic: J = m * F
(jtest_calc <- m * restricted_f)
```

```
[1] 0.373985
```

- The calculated J-statistic is 0.3740 (rounded to 4 digits).

---

### Wage example: J-test (obtain J-statistic with tools)

We can also obtain the J-statistic with existing tools.

- Use `linearHypothesis(., test = "Chisq")`:

``` r
# chi square test directly
jtest_chitest <- linearHypothesis(
  lm_jtest,  c("motheduc = 0", "fatheduc = 0"),
  test = "Chisq")
# obtain the chi square value
jtest_chi <- jtest_chitest$Chisq[2]
```

]

- The chi square test result:

```

Linear hypothesis test:
motheduc = 0
fatheduc = 0

Model 1: restricted model
Model 2: resid_iv_mf ~ exper + expersq + motheduc + fatheduc

Res.Df    RSS Df Sum of Sq Chisq Pr(>Chisq)
1    425 193.02                              
2    423 192.85  2    0.1705 0.374     0.8294
```

]

- We obtain the J-statistic 0.3740 (rounded to 4 digits). It is the same as the value we calculated by hand!

???

In `R`, we can use the function `linearHypothesis()` with the argument `test = "Chisq"`.

Please check the relation between the restricted F-statistic and the `$\chi^2$` statistic: here `$\chi^2 = m F$` with `$m=2$` restrictions.

---

### Wage example: J-test (adjust the degrees of freedom)

**Caution**: In this case the
`$p$`-value reported by `linearHypothesis(., test = "Chisq")` is wrong because the degrees of freedom are set to 2, while the correct degrees of freedom are `$(m-k)=1$` (`$m=2$` instruments and `$k=1$` endogenous regressor).

]

- We have obtained the J-statistic
`${\chi^2}^{\ast} =0.3740$`, and its correct degrees of freedom are `$(m-k)=1$`.

- Then we can compute the correct
`$p$`-value of the J-statistic (by using the function `pchisq()` in `R`).

``` r
# correct degrees of freedom: m instruments minus k = 1 endogenous regressor
f <- m - 1
# compute the correct p-value for the J-statistic
(pchi <- pchisq(jtest_chi, df = f, lower.tail = FALSE))
```

```
[1] 0.5408401
```
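As a quick check of the caution above, the sketch below recomputes the `$p$`-value under both choices of degrees of freedom. It assumes only the J-statistic 0.373985 printed earlier; the object names `j_stat`, `p_df2` and `p_df1` are made up for this illustration.

```r
# J-statistic as printed on the earlier slide (typed by hand, so slightly rounded)
j_stat <- 0.373985
# p-value with the default degrees of freedom used by linearHypothesis()
p_df2 <- pchisq(j_stat, df = 2, lower.tail = FALSE)
# p-value with the correct degrees of freedom m - k = 2 - 1 = 1
p_df1 <- pchisq(j_stat, df = 1, lower.tail = FALSE)
# show both p-values side by side
round(c(df2 = p_df2, df1 = p_df1), 4)
```

The first value should be close to the 0.8294 reported by `linearHypothesis()`, and the second should match the corrected `$p$`-value of about 0.5408.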
???

The 2 degrees of freedom used by `linearHypothesis()` differ from the degree of overidentification (
`$m-k=2-1=1$`). So the
`$J$`-statistic is
`$\chi^2(1)$` distributed instead of following the `$\chi^2(2)$` distribution assumed by default in `linearHypothesis()`.

---

### Wage example: J-test (the conclusions)

Now we can draw the conclusions of the J-test.

Since the p-value of the J-test (0.5408) is larger than the significance level 0.1, we cannot reject the null hypothesis that both instruments are exogenous.

This suggests that both instruments (
`motheduc` and `fatheduc`) can be treated as **exogenous**.
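The whole J-test procedure can be sketched in base `R` on simulated data. This is only an illustration under stated assumptions: the data-generating process, the variable names (`w`, `z1`, `z2`, `x`, `y`) and the seed are hypothetical and not taken from the wage data, and 2SLS is computed by hand with two `lm()` calls instead of `ivreg()`.

```r
set.seed(123)
n <- 500
# hypothetical data: w is an exogenous control, z1 and z2 are valid instruments
w  <- rnorm(n); z1 <- rnorm(n); z2 <- rnorm(n)
u  <- rnorm(n)                                   # structural error
x  <- 0.5 * w + 0.7 * z1 + 0.7 * z2 + 0.6 * u + rnorm(n)  # endogenous regressor
y  <- 1 + 1 * x + 0.5 * w + u

# 2SLS by hand: first stage, then second stage on the fitted values
x_hat <- fitted(lm(x ~ w + z1 + z2))
b     <- unname(coef(lm(y ~ x_hat + w)))
# IV residuals must use the observed x, not the fitted x_hat
resid_iv <- y - (b[1] + b[2] * x + b[3] * w)

# auxiliary regression of the IV residuals on all exogenous variables
aux_full       <- lm(resid_iv ~ w + z1 + z2)
aux_restricted <- lm(resid_iv ~ w)
f_stat <- anova(aux_restricted, aux_full)$F[2]   # restricted F-test on z1, z2

# J-statistic and p-value with the degree of overidentification m - k
m <- 2; k <- 1
j_stat <- m * f_stat
p_val  <- pchisq(j_stat, df = m - k, lower.tail = FALSE)
c(J = j_stat, p_value = p_val)
```

Because both simulated instruments are exogenous by construction, a large `$p$`-value is the typical outcome, although any single draw can reject by chance.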

???

We have now gone through all the instrument validity tests in this section.

In the next section we will illustrate how to test regressor endogeneity.

---

### 2SLS as a weighted average of just-identified IVs

Another useful view of **2SLS** when there are more instruments than endogenous regressors (`$L > J$`):

`$$\begin{aligned}
\beta^{2SLS} & = \left(\pi' \operatorname{Cov}(\tilde{Z}_i, X_i')\right)^{-1} \pi' \operatorname{Cov}(\tilde{Z}_i, Y_i) \\
& = \left(\pi' \operatorname{Var}(\tilde{Z}_i) \pi\right)^{-1} \pi' \operatorname{Var}(\tilde{Z}_i) \rho
\end{aligned}$$`

This is a `$\operatorname{Var}(\tilde{Z}_i)$`-weighted regression of reduced-form coefficients `$\rho$` on first-stage coefficients `$\pi$` (through the origin).

- When `$J = 1$`: `$\beta^{2SLS} = \sum_\ell \omega_\ell \beta_\ell^{IV}$` where each `$\beta_\ell^{IV} = \rho_\ell / \pi_\ell$` uses one instrument at a time.
- Intuition: 2SLS **combines** multiple one-at-a-time IV estimands, weighting by first-stage strength.
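The weighted-average view can be checked numerically. The sketch below is an illustration on simulated data (all names, coefficients and the seed are hypothetical). It assumes one endogenous regressor and two instruments, takes `$\pi$` and `$\rho$` from joint regressions on both instruments, and uses weights `$\omega_\ell = (\pi' V)_\ell \pi_\ell / (\pi' V \pi)$` with `$V$` the sample covariance matrix of the instruments.

```r
set.seed(1)
n  <- 500
z1 <- rnorm(n); z2 <- rnorm(n)          # two hypothetical instruments
u  <- rnorm(n)                          # structural error
x  <- 0.8 * z1 + 0.4 * z2 + 0.5 * u + rnorm(n)
y  <- 1 + 2 * x + u + rnorm(n)
Z  <- cbind(z1, z2)

# first-stage and reduced-form slope coefficients
pi_hat  <- coef(lm(x ~ Z))[-1]
rho_hat <- coef(lm(y ~ Z))[-1]
V       <- cov(Z)

# 2SLS by hand: regress y on the first-stage fitted values
x_hat  <- fitted(lm(x ~ Z))
b_2sls <- unname(coef(lm(y ~ x_hat))[2])

# ratios rho_l / pi_l and their weights
b_each <- rho_hat / pi_hat
omega  <- as.vector(t(pi_hat) %*% V) * pi_hat /
  as.numeric(t(pi_hat) %*% V %*% pi_hat)
b_weighted <- sum(omega * b_each)

c(two_sls = b_2sls, weighted = b_weighted, sum_weights = sum(omega))
```

The two estimates agree up to floating-point error and the weights sum to one. The weights are not guaranteed to be positive when the instruments are correlated.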

---

### Overidentification tests: caveats

Under a **constant-effects** model, overidentification checks whether all just-identified IVs agree (`$\beta_\ell^{IV} = \beta$` for all `$\ell$`).

- The **J-test** / **Sargan test** (above) implements this when `$L > J$`.

**Do not over-interpret rejections:**

- Tests often have **low power** (each `$\hat{\beta}_\ell^{IV}$` is noisy).
- Rejection need **not** mean invalid instruments — **treatment effect heterogeneity** can make valid IVs differ.
- Rejection does **not** identify which instrument fails.

---

### Class Exercise: three IVs for `educ`

For the wage example, the original (mis-specified) model was assumed to have only one endogenous variable (`educ`):

`$$\begin{align}
lwage = {\beta}_1 +{\beta}_2{educ} + {\beta}_3exper +{\beta}_4expersq + {\epsilon}
\end{align}$$`

If we add husband’s education (`huseduc`) to the IV list, then we will have three IVs in total (`fatheduc`, `motheduc` and `huseduc`) for `educ`.

- Use these three IVs to obtain the TSLS results. Compare them with the TSLS results obtained earlier with two IVs (`fatheduc` and `motheduc`).

- Conduct the over-identification test (J-test).

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: endogeneity

## 17.8 Testing Regressor endogeneity

???

In this section, we focus mainly on regressor endogeneity issues.

---
layout: true

<div class="my-footer"><span>huhuaping@   <a href="#chapter17"> Chapter 17. Endogeneity and Instrumental Variables  |</a> &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; <a href="#endogeneity"> 17.8 Testing Regressor endogeneity </a></span></div>

---

### Regressor Endogeneity: the concepts

It is the researcher’s responsibility to specify which variables are endogenous and which are exogenous. So, how can we test for regressor endogeneity?

Since OLS is in general more efficient than IV (recall that OLS is BLUE if the Gauss-Markov assumptions hold), we do not want to use IV when OLS is already consistent.

On the other hand, if OLS is not consistent we need IV, so we should check whether the suspected regressors are really **endogenous** in the model.

So we should test the following hypotheses:

`$$H_{0}: \operatorname{Cov}(X, \epsilon)=0 \text { vs. } H_{1}: \operatorname{Cov}(X, \epsilon) \neq 0$$`

---

### Regressor Endogeneity: Hausman test

The `Hausman` test tells us that we should use OLS if we fail to reject
`$H_{0}$`, and that we should use IV estimation if we reject
`$H_{0}$`.

Let's see how to construct a `Hausman test`. The idea is very simple.

- If
`$X$` is in fact **.red[ex]ogenous**, then both OLS and IV are consistent, but OLS estimates are more efficient than IV estimates.

- If
`$X$` is in fact **.red[en]dogenous**, then the OLS estimators are inconsistent, while the IV estimators (e.g. 2SLS) remain consistent.

In fact, this general class of tests is called
**Durbin-Wu-Hausman** tests, **Wu-Hausman** tests, or **Hausman** tests.

]

---

### Hausman test: the idea

We can compare the estimates computed by OLS and by IV.

- If the difference is **small**, we can conjecture that both OLS and IV are consistent and that the small difference between the estimates is not systematic.
- If the difference is **large**, this is evidence that the OLS estimates are not consistent. We should use IV in this case.
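The idea can be illustrated with a small simulation. The sketch below is hypothetical (made-up data-generating process, names and seed) and uses one instrument. It compares the OLS slope with the simple IV slope `Cov(z, y) / Cov(z, x)`, once with an exogenous regressor and once with an endogenous one. The true slope is 2 in both cases.

```r
set.seed(2024)
n <- 5000
z <- rnorm(n)                     # instrument
u <- rnorm(n)                     # structural error
x_exo  <- z + rnorm(n)            # regressor unrelated to u (exogenous)
x_endo <- z + u + rnorm(n)        # regressor correlated with u (endogenous)
y_exo  <- 1 + 2 * x_exo  + u
y_endo <- 1 + 2 * x_endo + u

# helper (hypothetical name): OLS slope and simple IV slope with one instrument
ols_iv <- function(y, x, z) {
  c(ols = unname(coef(lm(y ~ x))[2]),
    iv  = cov(z, y) / cov(z, x))
}

est_exo  <- ols_iv(y_exo,  x_exo,  z)
est_endo <- ols_iv(y_endo, x_endo, z)
rbind(exogenous = est_exo, endogenous = est_endo)
```

In the exogenous case the two estimates should be close to each other and to 2. In the endogenous case the OLS slope should drift away from 2 while the IV slope stays near it, which is the kind of systematic difference the Hausman test looks for.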

---

### Hausman test: the statistics

The **Hausman test** statistic takes the following form (
`$k=$` number of endogenous regressors
)

`$$\begin{align}
\hat{H}=n\boldsymbol{\left[\hat{\beta}_{IV}-\hat{\beta}_{\text {OLS}}\right] ^{\prime}\left[\operatorname{Var}\left(\hat{\beta}_{IV}-\hat{\beta}_{\text {OLS}}\right)\right]^{-1}\left[\hat{\beta}_{IV}-\hat{\beta}_{\text {OLS}}\right]} \xrightarrow{d} \chi^{2}(k)
\end{align}$$`

- If
`$\hat{H}$` is **less** than the critical
`$\chi^2$` value, we cannot reject the null hypothesis, and we treat the regressor as **not endogenous**.

- If
`$\hat{H}$` is **larger** than the critical
`$\chi^2$` value,
the null hypothesis is rejected, and we treat the regressor as **endogenous**.

The approaches of the three authors (Durbin, Wu and Hausman) yield the same statistic except for possible differences in the choice of
`$\hat{\sigma}^2$`.

- Durbin (1954) proposed setting
`$\hat{\sigma}^2$` to be the OLS estimator of
`$\sigma^2$`.

- Wu (1973) proposed a set of possible estimators
`$\hat{\sigma}^2$`.

- Hausman (1978) proposed a Wald statistic.

]

---

### Hausman test: the procedure

The original model is

`$$\begin{align}
Y_{i}=\hat{\alpha}_{0} + \hat{\alpha}_1 X_i+\hat{\beta}_{1} W_{1 i}+\ldots+\hat{\beta}_{p} W_{p i} +u_{i} \quad \text{(original model)}
\end{align}$$`

- Conduct the **first-stage regression** of 2SLS estimation and obtain the residuals `$v_i$`.

`$$\begin{align}
X_{i}=\hat{\gamma}_{0}+\hat{\gamma}_{1} W_{1 i}+\ldots+\hat{\gamma}_{p} W_{p i}+ \hat{\theta}_{1} Z_{1 i}+\ldots+\hat{\theta}_{q} Z_{q i}+v_{i} \quad \text{(reduced model)}
\end{align}$$`

- Then estimate the control function regression by least squares, adding the first-stage residuals `$v_i$` as an extra regressor

`$$\begin{align}
Y_{i}=\hat{\delta}_{0} + \hat{\delta}_1 X_i + \hat{\delta}_{2} W_{1 i}+\ldots+\hat{\delta}_{p+1} W_{p i}+ \hat{\lambda}_{1} v_i+ u_{i} \quad \text{(control model)}
\end{align}$$`

- Conduct the **restricted F-test** with
`$H_0: \lambda_1=0$` (**Wu-Hausman F-test**).

- If the
`$F$`-statistic is larger than the critical value, the regressor
`$X_i$` is **endogenous**.

The restricted F statistic equals the square of the t statistic of `$\hat{\lambda}_1$` in the control function regression, that is,
`$t^2_{\hat{\lambda}_1} = F^{\ast} \quad \text{(Wu-Hausman F)}$`
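
The three steps, and the identity between the squared t statistic and the restricted F statistic, can be checked on simulated data. This is a minimal sketch under stated assumptions: the variables `w`, `z1`, `z2`, `x`, `y` and all coefficient values are invented for illustration, and only base R is used.

``` r
# Control function (Wu-Hausman) procedure on simulated data
set.seed(42)
n  <- 1000
w  <- rnorm(n)                       # exogenous control
z1 <- rnorm(n)                       # instrument 1
z2 <- rnorm(n)                       # instrument 2
e  <- rnorm(n)
x  <- 0.5 * w + z1 + z2 + e          # endogenous regressor
u  <- 0.6 * e + rnorm(n)             # error correlated with x
y  <- 1 + 0.5 * x + 0.3 * w + u

# Step 1: first-stage regression and its residuals
v_hat <- resid(lm(x ~ w + z1 + z2))

# Step 2: control function regression (unrestricted) and restricted model
fit_u <- lm(y ~ x + w + v_hat)
fit_r <- lm(y ~ x + w)

# Step 3: restricted F-test of H0: lambda_1 = 0, and the matching t statistic
t_lambda <- unname(summary(fit_u)$coefficients["v_hat", "t value"])
F_wu     <- anova(fit_r, fit_u)$F[2]

c(t_squared = t_lambda^2, F = F_wu)
```

Because only one restriction is tested, the two numbers agree up to floating-point error.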

]

---

### Wage example: Hausman test (the original & IV model)

The original model is

`$$\begin{aligned}
  lwage & = \hat{\alpha}_0 +\hat{\alpha}_1 {educ} + \hat{\beta}_1 exper +\hat{\beta}_2 expersq + u_i  && \text{(original model)}
  \end{aligned}$$`

Again, we use both `$motheduc$` and `$fatheduc$` as instruments for `$educ$` in our IV model setting.

`$$\begin{cases}
  \begin{align}
  {educ} &= \hat{\gamma}_0 +\hat{\gamma}_1exper + \hat{\gamma}_2expersq + \hat{\theta}_1motheduc + \hat{\theta}_2fatheduc +v_i && \text{(stage 1)}\\
  lwage & = \hat{\eta}_1 +\hat{\eta}_2\widehat{educ} + \hat{\eta}_3exper +\hat{\eta}_4expersq + e_i  && \text{(stage 2)}
  \end{align}
\end{cases}$$`

---

### Wage example: Hausman test (R solutions)

In `R`, we have at least two equivalent ways to conduct the Hausman test:

- Solution 1 (automatic): use the IV model **diagnostics**. Calling `summary(lm_iv_mf, diagnostics = TRUE)` reports the **Wu-Hausman F** statistic together with the weak-instruments and Sargan tests.

- Solution 2 (by hand): follow the procedure step by step to obtain the **Wu-Hausman F** statistic.

So let's try both of these solutions!

---

### Wage example: Hausman test (Solution 1 diagnostics)

``` r
library(AER)  # provides ivreg() and its diagnostic tests
mod_iv_mf <- formula(
  lwage ~ educ + exper + expersq | motheduc + fatheduc + exper + expersq)
lm_iv_mf <- ivreg(formula = mod_iv_mf, data = mroz)
summary(lm_iv_mf, diagnostics = TRUE)
```

]

``` r
### ==== solution 1 for Hausman test  (full model diagnose) ====
summary(lm_iv_mf, diagnostics = TRUE)
```

```

Call:
AER::ivreg(formula = lwage ~ educ + exper + expersq | motheduc + 
    fatheduc + exper + expersq, data = mroz)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.0986 -0.3196  0.0551  0.3689  2.3493

Diagnostic tests:
                 df1 df2 statistic p-value    
Weak instruments   2 423    55.400  <2e-16 ***
Wu-Hausman         1 423     2.793  0.0954 .  
Sargan             1  NA     0.378  0.5386    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6747 on 424 degrees of freedom
Multiple R-Squared: 0.1357,	Adjusted R-squared: 0.1296 
Wald test: 8.141 on 3 and 424 DF,  p-value: 2.787e-05 
```

]

.footnote[
R script download: [`ivreg-mf-diagnostics.R`](scripts/chapter17/chunk-sources/ivreg-mf-diagnostics.R)
]

---

### Wage example: Hausman test (Solution 2 calculation)

``` r
### ==== solution 2 for Hausman test (calculate) ====
## guided by Hansen's chpt 12.29 Endogeneity test
library(car)  # linearHypothesis(); also attached when AER is loaded

## reduced-form (first-stage) equation for the endogenous regressor educ
red_mf <- formula(educ ~ exper + expersq + motheduc + fatheduc)
fit_red_mf <- lm(formula = red_mf, data = mroz)

## extract the first-stage residuals and append them to the data set
resid_mf <- data.frame(resid_mf = resid(fit_red_mf))
tbl_mf <- cbind(mroz, resid_mf)

## control function OLS estimation
control_mf <- formula(lwage ~ educ + exper + expersq + resid_mf)
fit_control_mf <- lm(formula = control_mf, data = tbl_mf)
smry_control_mf <- summary(fit_control_mf)

## extract the t statistic of the residual term (lambda)
t_star_resid <- smry_control_mf$coefficients["resid_mf", "t value"]

## calculate the equivalent F statistic and its p-value
restricted_F_mf <- linearHypothesis(model = fit_control_mf, "resid_mf=0")
F_star_resid <- restricted_F_mf$F[2]
p_F_resid <- restricted_F_mf$`Pr(>F)`[2]
```

]

``` r
# the OLS result of control model
smry_control_mf
```

```

Call:
lm(formula = control_mf, data = tbl_mf)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.03743 -0.30775  0.04191  0.40361  2.33303

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.0481003  0.3945753   0.122 0.903033    
educ         0.0613966  0.0309849   1.981 0.048182 *  
exper        0.0441704  0.0132394   3.336 0.000924 ***
expersq     -0.0008990  0.0003959  -2.271 0.023672 *  
resid_mf     0.0581666  0.0348073   1.671 0.095441 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.665 on 423 degrees of freedom
Multiple R-squared:  0.1624,	Adjusted R-squared:  0.1544 
F-statistic:  20.5 on 4 and 423 DF,  p-value: 1.888e-15
```

]

``` r
# t statistics
number(t_star_resid, 0.0001)
```

```
[1] "1.6711"
```

``` r
# F statistics and probability
number(F_star_resid, 0.01)
```

```
[1] "2.79"
```

``` r
number(p_F_resid, 0.001)
```

```
[1] "0.095"
```

]

---

### Wage example: the diagnosed conclusions

The results for the lwage equation are as follows:

- **(Wu-)Hausman test** for endogeneity: the null that `educ` is uncorrelated with the error term is **rejected only at the 10% level** (not at the 5% level), indicating that `educ` is marginally endogenous. The Wu-Hausman statistic is
`$F^{\ast} = 2.79$`, and its p-value is 0.095.

- **Weak instruments test**: **rejects** the null hypothesis (weak instruments). The instruments `motheduc` and `fatheduc` are jointly significant in the first stage, so at least one of them is strong. The **restricted F-test** statistic is
`$F^{\ast}_R = 55.4$`, and its p-value is below 0.001.

- **Sargan overidentifying restrictions** (instrument exogeneity J-test): **does not** reject the null (statistic 0.378, p-value 0.539). There is no evidence against the exogeneity of the instruments (`motheduc` and `fatheduc`).
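
The p-values behind these conclusions can be recovered from the reported statistics and degrees of freedom. The sketch below simply re-enters the numbers printed in the diagnostic table of the wage example; the data frame `diag_tbl` and its column names are hypothetical helpers, not output of any package.

``` r
# Statistics and degrees of freedom copied from the diagnostic table above
diag_tbl <- data.frame(
  test      = c("Weak instruments", "Wu-Hausman", "Sargan"),
  statistic = c(55.400, 2.793, 0.378),
  df1       = c(2, 1, 1),
  df2       = c(423, 423, NA)      # NA: the Sargan test uses a chi-square law
)

# F tests use pf(); the chi-square test uses pchisq()
is_chisq <- is.na(diag_tbl$df2)
p_value  <- numeric(nrow(diag_tbl))
p_value[is_chisq]  <- pchisq(diag_tbl$statistic[is_chisq],
                             df = diag_tbl$df1[is_chisq], lower.tail = FALSE)
p_value[!is_chisq] <- pf(diag_tbl$statistic[!is_chisq],
                         diag_tbl$df1[!is_chisq], diag_tbl$df2[!is_chisq],
                         lower.tail = FALSE)

diag_tbl$p_value   <- p_value
diag_tbl$reject_5  <- p_value < 0.05   # decision at the 5% level
diag_tbl$reject_10 <- p_value < 0.10   # decision at the 10% level
diag_tbl
```

The Wu-Hausman row is the only one whose decision changes between the 5% and the 10% level, which is why the endogeneity of `educ` is described as marginal.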

???

So far, we have finished both the instrument validity tests and the regressor endogeneity test.

Now I will show you two examples. You can download the data set and go through all the tests we have discussed.

---

### Summary

- An **instrumental variable** must have two properties:

    - (1) it must be exogenous, that is, uncorrelated with the error term of the structural equation;
    - (2) it must be partially correlated with the endogenous explanatory variable.

> Finding a variable with these two properties is usually challenging.

- Though we can **never** test whether .red[all] IVs are **exogenous**, we can test that at least .red[some of] them are.

- When we have valid instrumental variables, we can test whether an explanatory variable is **endogenous**.

- The method of **two stage least squares**  is used routinely in the empirical social sciences.

>  But when instruments are poor, then 2SLS can be **worse** than OLS.
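
A small Monte Carlo experiment illustrates the last point. The sketch below is built entirely on assumptions (sample size, number of replications, coefficient values, and the helper `one_draw()` are all invented for illustration). It compares the spread of the OLS and IV slope estimates when the instrument is almost irrelevant and when it is strong.

``` r
# Monte Carlo: IV with a weak versus a strong instrument (simulated data)
set.seed(2024)
n_rep <- 500    # number of replications
n     <- 100    # sample size per replication
beta  <- 0.5    # true slope

one_draw <- function(pi_z) {
  z <- rnorm(n)
  e <- rnorm(n)
  u <- 0.8 * e + rnorm(n)        # error correlated with x
  x <- pi_z * z + e              # pi_z controls instrument strength
  y <- 1 + beta * x + u
  c(ols = cov(x, y) / var(x),    # OLS slope
    iv  = cov(z, y) / cov(z, x)) # just-identified IV slope
}

weak   <- t(replicate(n_rep, one_draw(pi_z = 0.05)))
strong <- t(replicate(n_rep, one_draw(pi_z = 1)))

# Interquartile range and median absolute error of each estimator
rbind(
  IQR_weak   = apply(weak, 2, IQR),
  IQR_strong = apply(strong, 2, IQR),
  MAE_weak   = apply(abs(weak - beta), 2, median),
  MAE_strong = apply(abs(strong - beta), 2, median)
)
```

With the weak instrument, the IV estimates scatter far more widely than the OLS estimates, while with the strong instrument IV removes most of the OLS bias at a modest cost in precision.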

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: exercise

## Exercise and Computation

---
layout: true

<div class="my-footer"><span>huhuaping@   <a href="#chapter17"> Chapter 17. Endogeneity and Instrumental Variables  |</a> &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; <a href="#exercise"> Exercise and Computation  </a></span></div>

---

### Card wage case: introduction

Using the data set `Card1995.dta`, researchers were interested in the return to education, that is, the effect of education (`edu`) on the log wage (`lwage`), and they focused on whether a student lived near a college.

In an influential paper, David Card (1995) suggested that if a potential student lives close to a college, this reduces the cost of attendance and thereby raises the likelihood that the student will attend college.

However, college proximity does not directly affect a student’s skills or abilities, so it should not have a direct effect on his or her market wage.

The original model is

`$$\begin{aligned}
  lwage & = \hat{\alpha}_0 +\hat{\alpha}_1 {edu} + \hat{\alpha}_2 exp +\hat{\alpha}_3 exp2  +\hat{\alpha}_4 black +\hat{\alpha}_5 south +\hat{\alpha}_6 urban + u_i
\end{aligned}$$`

Please follow our accompanying course repository at <https://github.com/huhuaping/course-emiii-accompany> or <https://gitee.com/kevinhhp/course-emiii-accompany>.
]

---

### Card wage case: variables

<div class="datatables html-widget html-fill-item" id="htmlwidget-2d9dadf9e251c83e7235" style="width:100%;height:auto;"></div>
<script type="application/json" data-for="htmlwidget-2d9dadf9e251c83e7235">{"x":{"filter":"none","vertical":false,"extensions":["Buttons"],"caption":"<caption>variables and definition<\/caption>","data":[["1","2","3","4","5","6","7","8","9","10","11","12","13","14","15"],["obs","lwage","edu","exp","exp2","black","south","urban","college","public","private","age","age2","momedu","dadedu"],["index","quantity variable: log of wage","quantity variable: education years","quantity variable: working years","quantity variable: square working years/100","dummy: 1=black; 0=nonblack","dummy: 1=southern area; 0= other","dummy: 1=live in urban; 0= other","dummy: 1=college nearby; 0= other","dummy: 1=public college nearby; 0= other","dummy: 1=private college nearby; 0= other","quantity variable: age (years)","quantity variable: age square /100","quantity variable: mother' education years","quantity variable: father' education years"]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>variable<\/th>\n      <th>definition<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"dom":"Btip","pageLength":8,"buttons":["copy","csv","excel"],"initComplete":"function(settings, json) {\n$(this.api().table().header()).css({'background-color': '#517fb9', 'color': '#fff'});\n}","columnDefs":[{"orderable":false,"targets":0},{"name":" ","targets":0},{"name":"variable","targets":1},{"name":"definition","targets":2}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[8,10,25,50,100]},"callback":"function(table) {\n$(\"button.buttons-copy\").css(\"background\",\"yellow\");\n                $(\"button.buttons-csv\").css(\"background\",\"orange\");\n                $(\"button.buttons-excel\").css(\"background\",\"#a88d32\");\n                $(\"button.dt-button\").css(\"padding\",\"0.2em 0.2em\");\n                $(\"button.dt-button\").css(\"font-size\",\"0.6em\");\n                $(\"button.dt-button\").css(\"margin-right\",\"0.2em\");\n      
          $(\"button.dt-button\").css(\"margin-bottom\",\"0.1em\");\n                $(\"button.dt-button\").css(\"line-height\",\"1em\");\n                return table;\n}"},"evals":["options.initComplete","callback"],"jsHooks":[]}</script>

---

### Card wage case: models and IV list sets

Let's consider the following estimation strategies:

a. Misspecified model, estimated directly by OLS.

b. Equivalent IVs for endogenous regressors (just-identification)

- The IV model using `college` as the instrument for `edu`

- The IV model using (`college`, `age`, `age2`) as instruments for (`edu`,`exp`,`exp2`)

c. Abundant IVs for endogenous regressors (over-identification)

- The IV model using both (`public`,`private`) as instruments for `edu`

- The IV model using both (`public`,`private`,`age`,`age2`) as instruments for (`edu`, `exp`,`exp2`)
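
The labels "just-identification" and "over-identification" follow from counting excluded instruments against endogenous regressors (the order condition). The sketch below tabulates this count for the four IV models listed above. The helper `identification_status()` is a hypothetical function written for this illustration, and the short model labels are taken from the results table of this exercise.

``` r
# Order condition: compare the number of excluded instruments
# with the number of endogenous regressors
identification_status <- function(n_endogenous, n_excluded_instruments) {
  if (n_excluded_instruments < n_endogenous) {
    "under-identified"
  } else if (n_excluded_instruments == n_endogenous) {
    "just-identified"
  } else {
    "over-identified"
  }
}

iv_sets <- data.frame(
  model                  = c("IV_c", "IV_ca", "2SLS_pp", "2SLS_ppa"),
  endogenous             = c("edu", "edu, exp, exp2", "edu", "edu, exp, exp2"),
  instruments            = c("college", "college, age, age2",
                             "public, private", "public, private, age, age2"),
  n_endogenous           = c(1, 3, 1, 3),
  n_excluded_instruments = c(1, 3, 2, 4)
)

iv_sets$status <- unname(mapply(identification_status,
                                iv_sets$n_endogenous,
                                iv_sets$n_excluded_instruments))
iv_sets[, c("model", "n_endogenous", "n_excluded_instruments", "status")]
```

Only the over-identified models leave a degree of freedom for the overidentifying restrictions (J) test.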

---

### Equivalent IVs (TSLS 1/4)

We use `college` as the instrument for `edu` in our IV model setting.

`$$\begin{cases}
  \begin{align}
  {edu} &= \hat{\gamma}_0 +\hat{\gamma}_1exp + \hat{\gamma}_2exp2 + \hat{\gamma}_3black + \hat{\gamma}_4south + \hat{\gamma}_5urban + \hat{\theta}_1college +v_i && \text{(stage 1)}\\
  lwage & = \hat{\eta}_1 +\hat{\eta}_2\widehat{edu} + \hat{\eta}_3exp +\hat{\eta}_4exp2 +\hat{\eta}_5 black +\hat{\eta}_6 south +\hat{\eta}_7 urban+ e_i  && \text{(stage 2)}
  \end{align}
\end{cases}$$`

---

### Equivalent IVs (TSLS 2/4)

We use (`college`, `age`, `age2`) as instruments for (`edu`,`exp`,`exp2`) in our IV model setting.

`$$\begin{cases}
  \begin{align}
  {edu} &= \hat{\gamma}_0 +\hat{\gamma}_1age + \hat{\gamma}_2age2 + \hat{\gamma}_3black + \hat{\gamma}_4south + \hat{\gamma}_5urban + \hat{\theta}_1college +v_{1i} && \text{(1 of stage 1)}\\
  {exp} &= \hat{\lambda}_0 +\hat{\lambda}_1age + \hat{\lambda}_2age2 + \hat{\lambda}_3black + \hat{\lambda}_4south + \hat{\lambda}_5urban + \hat{\lambda}_6college +v_{2i} && \text{(2 of stage 1)}\\
  {exp2} &= \hat{\delta}_0 +\hat{\delta}_1age + \hat{\delta}_2age2 + \hat{\delta}_3black + \hat{\delta}_4south + \hat{\delta}_5urban + \hat{\delta}_6college +v_{3i} && \text{(3 of stage 1)}\\
  lwage & = \hat{\eta}_1 +\hat{\eta}_2\widehat{edu} + \hat{\eta}_3\widehat{exp} +\hat{\eta}_4\widehat{exp2} +\hat{\eta}_5 black +\hat{\eta}_6 south +\hat{\eta}_7 urban+ e_i  && \text{(stage 2)}
  \end{align}
\end{cases}$$`

---

### Abundant IVs (TSLS 3/4)

We use both (`public`,`private`) as instruments for `edu` in our IV model setting.

`$$\begin{cases}
  \begin{align}
  {edu} &= \hat{\gamma}_0 +\hat{\gamma}_1exp + \hat{\gamma}_2exp2 + \hat{\gamma}_3black + \hat{\gamma}_4south + \hat{\gamma}_5urban &&
  \\&+ \hat{\theta}_1public + \hat{\theta}_2private +v_i && \text{(stage 1)}\\
  lwage & = \hat{\eta}_1 +\hat{\eta}_2\widehat{edu} + \hat{\eta}_3exp +\hat{\eta}_4exp2 +\hat{\eta}_5 black &&\\ &+\hat{\eta}_6 south +\hat{\eta}_7 urban+ e_i  && \text{(stage 2)}
  \end{align}
\end{cases}$$`

---

### Abundant IVs (TSLS 4/4)

We use (`public`,`private`,`age`,`age2`) as instruments for (`edu`, `exp`,`exp2`) in our IV model setting.

`$$\begin{cases}
  \begin{align}
    {edu} &= \hat{\gamma}_0 +\hat{\gamma}_1age + \hat{\gamma}_2age2 + \hat{\gamma}_3black + \hat{\gamma}_4south + \hat{\gamma}_5urban &&\\
    & + \hat{\theta}_1public + \hat{\theta}_2private +v_{1i} && \text{(1 of stage 1)}\\
  {exp} &= \hat{\lambda}_0 +\hat{\lambda}_1age + \hat{\lambda}_2age2 + \hat{\lambda}_3black + \hat{\lambda}_4south + \hat{\lambda}_5urban &&\\
  &+ \hat{\lambda}_6public + \hat{\lambda}_7private +v_{2i} && \text{(2 of stage 1)}\\
  {exp2} &= \hat{\delta}_0 +\hat{\delta}_1age + \hat{\delta}_2age2 + \hat{\delta}_3black + \hat{\delta}_4south + \hat{\delta}_5urban &&\\
  &+ \hat{\delta}_6public + \hat{\delta}_7private +v_{3i} && \text{(3 of stage 1)}\\
  lwage & = \hat{\eta}_1 +\hat{\eta}_2\widehat{edu} + \hat{\eta}_3\widehat{exp} +\hat{\eta}_4\widehat{exp2} +\hat{\eta}_5 black &&\\
  &+\hat{\eta}_6 south +\hat{\eta}_7 urban+ e_i  && \text{(stage 2)}
  \end{align}
\end{cases}$$`

---

### Exercise tasks 1/2: compare all results

.scroll-box-20[
<div class="datatables html-widget html-fill-item" id="htmlwidget-38dfe0cc2df906b537de" style="width:100%;height:auto;"></div>
<script type="application/json" data-for="htmlwidget-38dfe0cc2df906b537de">{"x":{"filter":"none","vertical":false,"caption":"<caption>The stage 2 results for lwage<\/caption>","data":[["1","2","3","4","5","6","7"],["(Intercept)","edu","exp","exp2","black","south","urban"],["4.7337<br/>(0.0676)","0.0740<br/>(0.0035)","0.0836<br/>(0.0066)","-0.2241<br/>(0.0318)","-0.1896<br/>(0.0176)","-0.1249<br/>(0.0151)","0.1614<br/>(0.0156)"],["3.7528<br/>(0.8293)","0.1323<br/>(0.0492)","0.1075<br/>(0.0213)","-0.2284<br/>(0.0334)","-0.1308<br/>(0.0529)","-0.1049<br/>(0.0231)","0.1313<br/>(0.0301)"],["4.0657<br/>(0.6085)","0.1329<br/>(0.0514)","0.0560<br/>(0.0260)","-0.0796<br/>(0.1340)","-0.1031<br/>(0.0774)","-0.0982<br/>(0.0288)","0.1080<br/>(0.0497)"],["3.2680<br/>(0.6872)","0.1611<br/>(0.0408)","0.1193<br/>(0.0182)","-0.2305<br/>(0.0350)","-0.1017<br/>(0.0453)","-0.0950<br/>(0.0217)","0.1164<br/>(0.0271)"],["3.7481<br/>(0.4834)","0.1597<br/>(0.0409)","0.0470<br/>(0.0250)","-0.0323<br/>(0.1281)","-0.0640<br/>(0.0630)","-0.0857<br/>(0.0256)","0.0835<br/>(0.0412)"],["3.2220<br/>(0.7015)","0.1638<br/>(0.0416)","0.1204<br/>(0.0185)","-0.2307<br/>(0.0352)","-0.0990<br/>(0.0461)","-0.0941<br/>(0.0219)","0.1150<br/>(0.0275)"]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>term<\/th>\n      <th>OLS<\/th>\n      <th>IV_c<\/th>\n      <th>IV_ca<\/th>\n      <th>2SLS_pp<\/th>\n      <th>2SLS_ppa<\/th>\n      <th>LIML_pp<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"dom":"t","pageLength":15,"columnDefs":[{"width":10,"targets":[2,3,4,5,6,7]},{"orderable":false,"targets":0},{"name":" ","targets":0},{"name":"term","targets":1},{"name":"OLS","targets":2},{"name":"IV_c","targets":3},{"name":"IV_ca","targets":4},{"name":"2SLS_pp","targets":5},{"name":"2SLS_ppa","targets":6},{"name":"LIML_pp","targets":7}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[10,15,25,50,100]}},"evals":[],"jsHooks":[]}</script>

]

---

### Exercise tasks 2/2: conduct several tests

Now conduct the tests we have learned.

- Weak instrument test (restricted F-test or Cragg-Donald test)

- Instrument exogeneity test (J-test)

- Regressor endogeneity test (Wu-Hausman test)

Find these results and draw your conclusions!
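
As a starting point, the three tests can be rehearsed by hand on simulated data before turning to the Card data set. The sketch below is only a template under stated assumptions: the variables `w`, `z1`, `z2`, `x`, `y` and all coefficient values are invented, there is one endogenous regressor with two instruments, and only base R is used. The Sargan statistic is computed as `$n R^2$` from a regression of the 2SLS residuals on all exogenous variables.

``` r
# Template: weak-instrument F, Sargan J, and Wu-Hausman F on simulated data
set.seed(7)
n  <- 1000
w  <- rnorm(n)                       # exogenous control
z1 <- rnorm(n)                       # instrument 1
z2 <- rnorm(n)                       # instrument 2
e  <- rnorm(n)
x  <- 0.5 * w + z1 + z2 + e          # endogenous regressor
u  <- 0.6 * e + rnorm(n)
y  <- 1 + 0.5 * x + 0.3 * w + u

# (1) Weak instrument test: joint F-test on z1 and z2 in the first stage
fs_u    <- lm(x ~ w + z1 + z2)
fs_r    <- lm(x ~ w)
F_first <- anova(fs_r, fs_u)$F[2]

# 2SLS coefficients, then residuals computed with the ORIGINAL x
x_hat  <- fitted(fs_u)
b_2sls <- coef(lm(y ~ x_hat + w))
e_2sls <- drop(y - cbind(1, x, w) %*% b_2sls)

# (2) Sargan J-test: n * R^2, df = instruments minus endogenous regressors
J    <- n * summary(lm(e_2sls ~ w + z1 + z2))$r.squared
df_J <- 2 - 1
p_J  <- pchisq(J, df = df_J, lower.tail = FALSE)

# (3) Wu-Hausman F-test: significance of the first-stage residuals
v_hat <- resid(fs_u)
F_wu  <- anova(lm(y ~ x + w), lm(y ~ x + w + v_hat))$F[2]

c(first_stage_F = F_first, sargan_J = J, sargan_p = p_J, wu_hausman_F = F_wu)
```

For the exercise itself, the same quantities should be obtained for each IV model of the Card case and compared with the packaged diagnostics.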

---
layout:false
background-image: url("../pic/thank-you-gif-funny-little-yellow.gif")
class: inverse,center

# End Of This Chapter

???

So we have finished all the content of Chapter 17.

The next three chapters will focus closely on SEM.

See you in the next class.

If you have questions, please let me know. You can leave messages by QQ or email.

Thanks. Goodbye!