How do we estimate the coefficients in a regression function?
Population regression: \[ \begin{cases} \begin{aligned} E(Y|X_i) &= \beta_1 +\beta_2X_i && \text{(PRF)} \\ Y_i &= \beta_1 +\beta_2X_i + u_i && \text{(PRM)} \end{aligned} \end{cases} \]
Sample regression: \[ \begin{cases} \begin{aligned} \hat{Y}_i & =\hat{\beta}_1 + \hat{\beta}_2X_i && \text{(SRF)} \\ Y_i &= \hat{\beta}_1 + \hat{\beta}_2X_i +e_i && \text{(SRM)} \end{aligned} \end{cases} \]
The first question to answer is: how do we estimate the coefficients of the sample regression function? In fact, a variety of methods exist:
The graphical method: rather crude, but it provides basic visual intuition
Ordinary least squares (OLS): the most commonly used method
Maximum likelihood (ML)
Method of moments (MM)
Population regression function (PRF):
\[ \begin{aligned} E(Y|X_i) &= \beta_1 +\beta_2X_i \end{aligned} \]
Population regression model (PRM):
\[ \begin{aligned} Y_i &= \beta_1 +\beta_2X_i + u_i \end{aligned} \]
Sample regression function (SRF):
\[ \begin{aligned} \hat{Y}_i =\hat{\beta}_1 + \hat{\beta}_2X_i \end{aligned} \]
Sample regression model (SRM):
\[ \begin{aligned} Y_i &= \hat{\beta}_1 + \hat{\beta}_2X_i +e_i \end{aligned} \]
Think about it:
The PRF cannot be observed directly; we can only approximate it with the SRF
Deviations exist between the estimated values and the observed values
So how is the SRF itself determined?
The principle of ordinary least squares: an illustration
The basic principle of OLS: minimize the residual sum of squares.
\[ \begin{aligned} e_i &= Y_i - \hat{Y}_i \\ &= Y_i - (\hat{\beta}_1 +\hat{\beta}_2X_i) \end{aligned} \]
\[ \begin{aligned} Q &= \sum{e_i^2} \\ &= \sum{(Y_i - \hat{Y}_i)^2} \\ &= \sum{\left( Y_i - (\hat{\beta}_1 +\hat{\beta}_2X_i) \right)^2} \\ &\equiv f(\hat{\beta}_1,\hat{\beta}_2) \end{aligned} \]
\[ \begin{aligned} Min(Q) &= Min \left ( f(\hat{\beta}_1,\hat{\beta}_2) \right) \end{aligned} \]
Suppose we have the 4 observations \((X_i, Y_i)\) shown below:
Consider the two candidate SRFs below, complete the table of calculations, and decide which SRF gives the better \((\hat{\beta}_1, \hat{\beta}_2)\):
\[ \begin{aligned} SRF1:\hat{Y}_{1i} & = \hat{\beta}_1 +\hat{\beta}_2X_i = 1.572 + 1.357X_i \\ SRF2:\hat{Y}_{2i} & = \hat{\beta}_1 +\hat{\beta}_2X_i = 3.0 + 1.0X_i \end{aligned} \]
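The comparison can be sketched numerically. The table of four observations is not reproduced in the text, so the data below are hypothetical stand-ins; only the two candidate coefficient pairs come from the formulas above.

```python
# Compare two candidate SRFs by their residual sum of squares
# Q = sum(e_i^2). The (X_i, Y_i) pairs are hypothetical stand-ins
# for the table in the text.
data = [(1, 3), (2, 5), (3, 6), (4, 7)]  # hypothetical observations

def rss(b1, b2, data):
    """Residual sum of squares for the line Y-hat = b1 + b2*X."""
    return sum((y - (b1 + b2 * x)) ** 2 for x, y in data)

q1 = rss(1.572, 1.357, data)  # SRF1
q2 = rss(3.0, 1.0, data)      # SRF2
print(q1, q2)  # OLS prefers the SRF with the smaller Q
```

With these stand-in data, SRF1 yields the smaller Q, so OLS would prefer it.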
\[ \begin{aligned} Min(Q) = Min \left ( f(\hat{\beta}_1,\hat{\beta}_2) \right) &= Min\left(\sum{\left( Y_i - (\hat{\beta}_1 +\hat{\beta}_2X_i) \right)^2} \right) \\ &= Min \sum{\left( Y_i - \hat{\beta}_1 - \hat{\beta}_2X_i \right)^2} \end{aligned} \]
\[ \begin{aligned} \left \{ \begin{split} \sum{\left[ \hat{\beta}_1 - (Y_i -\hat{\beta}_2X_i) \right]} &=0 \\ \sum{\left[ X_i^2\hat{\beta}_2 - (Y_i-\hat{\beta}_1 )X_i \right ] }&=0 \end{split} \right. \end{aligned} \]
\[ \begin{aligned} \left \{ \begin{split} \sum{Y_i} - n\hat{\beta}_1- (\sum{X_i})\hat{\beta}_2 &=0 \\ \sum{X_iY_i}-(\sum{X_i})\hat{\beta}_1 - (\sum{X_i^2})\hat{\beta}_2 &=0 \end{split} \right. \end{aligned} \]
This yields formula 1 for the regression coefficients (Favorite Five, FF):
\[ \begin{aligned} \left \{ \begin{split} \hat{\beta}_2 &=\frac{n\sum{X_iY_i}-\sum{X_i}\sum{Y_i}}{n\sum{X_i^2}-\left ( \sum{X_i} \right)^2}\\ \hat{\beta}_1 &=\frac{\sum{X_i^2}\sum{Y_i}-\sum{X_i}\sum{X_iY_i}}{n\sum{X_i^2}-\left ( \sum{X_i} \right)^2} \end{split} \right. &&\text{(FF solution)} \end{aligned} \]
In addition, we can obtain the following deviation-form formulas (favorite five, ff):
\[ \begin{aligned} \left \{ \begin{split} \hat{\beta}_2 &=\frac{\sum{x_iy_i}}{\sum{x_i^2}}\\ \hat{\beta}_1 &=\bar{Y}-\hat{\beta}_2\bar{X} \end{split} \right. && \text{(ff solution)} \end{aligned} \]
where the deviations are \(x_i=X_i-\bar{X};\ y_i=Y_i - \bar{Y}\).
Why are the following identities equivalent? Can you derive them?
\[ \begin{aligned} \left\{ \begin{split} \sum{x_iy_i} &= \sum{\left[ (X_i-\bar{X})(Y_i-\bar{Y})\right]} &&= \sum{X_iY_i} - \frac{1}{n}\sum{X_i}\sum{Y_i} \\ \sum{x_i^2} &= \sum{(X_i- \bar{X})^2} &&= \sum{X_i^2} -\frac{1}{n} \left( \sum{X_i} \right)^2 \end{split} \right. \end{aligned} \]
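A minimal numerical check of these two identities, using an arbitrary made-up sample:

```python
# Numerically verify the two deviation-form identities from the text
# on an arbitrary sample (any numbers would do).
X = [2, 4, 5, 7, 9]
Y = [1, 3, 4, 6, 8]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

# sum(x_i*y_i) = sum(X_i*Y_i) - (1/n)*sum(X_i)*sum(Y_i)
lhs_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(X, Y))
rhs_xy = sum(xi * yi for xi, yi in zip(X, Y)) - sum(X) * sum(Y) / n

# sum(x_i^2) = sum(X_i^2) - (1/n)*(sum(X_i))^2
lhs_xx = sum((xi - xbar) ** 2 for xi in X)
rhs_xx = sum(xi ** 2 for xi in X) - sum(X) ** 2 / n

print(abs(lhs_xy - rhs_xy) < 1e-9, abs(lhs_xx - rhs_xx) < 1e-9)
```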
Transforming the PRM:
\[ \begin{alignedat}{2} &\left. \begin{split} Y_i &&= \beta_1 + &&\beta_2X_i +u_i \ && \text{(PRM)} \\ \bar{Y} &&= \beta_1 + &&\beta_2\bar{X} +\bar{u} && \\ \end{split} \right \} \Rightarrow \\ & y_i = \beta_2x_i +(u_i- \bar{u}) \end{alignedat} \]
Transforming the residual:
\[ \begin{alignedat}{2} &\left. \begin{split} & e_i = y_i - \hat{\beta}_2x_i \\ & e_i = \beta_2x_i +(u_i- \bar{u}) -\hat{\beta}_2x_i \end{split} \right \} \Rightarrow \\ & e_i =-(\hat{\beta}_2- \beta_2)x_i + (u_i- \bar{u}) \end{alignedat} \]
Computing the residual sum of squares:
\[ \begin{alignedat}{2} & \sum{e_i^2} && = (\hat{\beta}_2 - \beta_2)^2\sum{x_i^2} + \sum{(u_i-\bar{u})^2} - 2(\hat{\beta}_2 - \beta_2)\sum{x_i(u_i-\bar{u})} \end{alignedat} \]
Taking the expectation of the residual sum of squares:
\[ \begin{aligned} E(\sum{e_i^2}) &= \sum{x_i^2 E \left[ (\hat{\beta}_2 - \beta_2)^2 \right ]}+ E\left[ \sum{(u_i-\bar{u})^2} \right ]\\ &- 2E \left[ (\hat{\beta}_2 - \beta_2)\sum{x_i(u_i-\bar{u})} \right ] \\ & \equiv A + B + C \\ & = \sigma^2 + (n-1)\sigma^2 -2\sigma^2 \\ & = (n-2)\sigma^2 \end{aligned} \]
Regression error variance:
\[ \begin{aligned} \hat{\sigma}^2=\frac{\sum{e_i^2}}{n-2} \end{aligned} \]
Regression error standard deviation (the standard error of the regression), sometimes written \(se\):
\[ \begin{aligned} \hat{\sigma}=\sqrt{\frac{\sum{e_i^2}}{n-2}} \end{aligned} \]
\[ \begin{aligned} A & = \sum{x_i^2 E \left[ (\hat{\beta}_2 - \beta_2)^2 \right ]} \\ & = \sum{ \left[ x_i^2 \cdot var(\hat{\beta}_2) \right] } \\ & = var(\hat{\beta}_2) \cdot \sum{x_i^2} \\ & = \frac{\sigma^2}{\sum{ x_i^2}} \cdot \sum{ x_i^2} \\ & = \sigma^2 \end{aligned} \]
\[ \begin{aligned} B = E \left[ \sum{(u_i-\bar{u})^2} \right ] & = E\left(\sum{u_i^2}\right) - 2E \left[ \sum{u_i\bar{u}} \right] +nE(\bar{u}^2) \\ & = n\sigma^2 - 2E \left[ \frac{\sum{u_i}}{n} \sum{u_i} \right] + nE\left[ \left( \frac{\sum{u_i}}{n} \right)^2 \right] \\ & = n\sigma^2 - 2E\left[ \frac{\left(\sum{u_i}\right)^2}{n} \right] + E\left[ \frac{\left(\sum{u_i}\right)^2}{n} \right] \\ & = n\sigma^2 - \frac{E\left[\left(\sum{u_i}\right)^2\right]}{n} = n\sigma^2 - \frac{E(u_1^2) + E(u_2^2) + \cdots + E(u_n^2)}{n} && \leftarrow \left[ E(u_iu_j)=0,\ i \neq j \right] \\ & = n\sigma^2 - \frac{n\sigma^2}{n} = n\sigma^2 - \sigma^2 = (n-1)\sigma^2 \end{aligned} \]
\[ \begin{aligned} C &= - 2E \left[ (\hat{\beta}_2 - \beta_2)\sum{x_i(u_i-\bar{u})} \right ] \\ &= - 2E \left[ \frac{\sum{x_iu_i}}{\sum{x_i^2}} \left( \sum{x_iu_i}-\bar{u}\sum{x_i} \right) \right ] && \leftarrow \left[ \sum{x_i} = 0 \right] \\ &= - 2\frac{ E\left[\left( \sum{x_iu_i} \right)^2\right]}{\sum{x_i^2}} \\ &= -2\frac{\sigma^2\sum{x_i^2}}{\sum{x_i^2}} = -2\sigma^2 \end{aligned} \]
\[ \begin{aligned} \hat{\beta}_2 & = \sum{k_iY_i} = \sum{k_i(\beta_1 +\beta_2X_i +u_i)} = \beta_1\sum{k_i} +\beta_2 \sum{k_iX_i}+\sum{k_iu_i} = \beta_2 +\sum{k_iu_i} \\ \hat{\beta}_2 - \beta_2 & = \sum{k_iu_i} = \frac{ \sum{x_iu_i} }{\sum{x_i^2}} \end{aligned} \]
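The result \(E(\sum e_i^2)=(n-2)\sigma^2\) derived above can be checked by simulation. The sketch below assumes hypothetical values for \(X\), the true coefficients, and \(\sigma=1\):

```python
import random

# Monte Carlo check that E(sum e_i^2) = (n-2)*sigma^2 for OLS with
# one regressor. X, the true coefficients (2.0, 0.5), and sigma are
# hypothetical choices.
random.seed(0)
X = [1, 2, 3, 4, 5, 6, 7, 8]
n, sigma = len(X), 1.0
xbar = sum(X) / n

def rss_once():
    """Draw one sample, fit OLS, return the residual sum of squares."""
    Y = [2.0 + 0.5 * xi + random.gauss(0, sigma) for xi in X]
    ybar = sum(Y) / n
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(X, Y)) \
         / sum((xi - xbar) ** 2 for xi in X)
    b1 = ybar - b2 * xbar
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(X, Y))

reps = 20000
avg = sum(rss_once() for _ in range(reps)) / reps
print(avg)  # should be close to (n - 2) * sigma^2 = 6
```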
Formula 1 (Favorite Five, FF form):
\[ \begin{aligned} \hat{\beta}_2 &=\frac{n\sum{X_iY_i}-\sum{X_i}\sum{Y_i}}{n\sum{X_i^2}-\left ( \sum{X_i} \right)^2}\\ &=\frac{ 13 \ast 1485.04 - 156 \ast 112.771}{ 13 \ast 2054 - 156^2} \\ &= 0.7241 \end{aligned} \]
\[ \begin{aligned} \hat{\beta}_1 &= \bar{Y} - \hat{\beta}_2 \bar{X} \\ &= 8.6747 - 0.7241 \ast 12 \\ &= -0.0145 \end{aligned} \]
Formula 2 (deviation form; favorite five, ff form):
\[ \begin{aligned} \hat{\beta}_2 = \frac{\sum{x_iy_i}}{\sum{x_i^2}} = \frac{ 131.786 }{ 182 } = 0.7241 \end{aligned} \]
\[ \begin{aligned} \hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = 8.6747 - 0.7241 \ast 12 = -0.0145 \end{aligned} \]
\[ \begin{aligned} \hat{Y_i} &= \hat{\beta}_1 + \hat{\beta}_2X_i \\ &= -0.0145 + 0.7241X_i \end{aligned} \]
From the sample regression equation above, we can compute the fitted values \(\hat{Y}_i\) of \(Y_i\) and the regression residuals \(e_i\):
\[ \begin{aligned} \hat{Y}_i &=\hat{\beta}_1 +\hat{\beta}_2X_i\\ e_i &= Y_i - \hat{Y}_i \end{aligned} \]
Regression error variance \(\hat{\sigma}^2\):
\[ \begin{aligned} \hat{\sigma}^2= \frac{\sum{e_i^2}} {(n-2)} = \frac{ 9.693 }{ 11 } = 0.8812 \end{aligned} \]
Regression error standard deviation \(\hat{\sigma}\):
\[ \begin{aligned} \hat{\sigma}=\sqrt{\frac{\sum{e_i^2}}{(n-2)}} = \sqrt{ 0.8812 } = 0.9387 \end{aligned} \]
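The whole computation above can be replicated in a few lines. The education-wage data themselves are not reproduced in this text, so the sample below is synthetic; only the formulas (the ff form, then \(\hat{\sigma}^2\)) come from the derivation above:

```python
# OLS via the deviation ("ff") formulas, then the regression error
# variance. The data are synthetic, not the education-wage sample.
X = [8, 10, 12, 14, 16, 12, 11, 15, 9, 13]
Y = [5.2, 7.1, 8.4, 10.3, 11.8, 8.9, 7.9, 10.9, 6.1, 9.5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
x = [xi - xbar for xi in X]   # deviations x_i = X_i - X-bar
y = [yi - ybar for yi in Y]   # deviations y_i = Y_i - Y-bar

b2 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
b1 = ybar - b2 * xbar

e = [yi - (b1 + b2 * xi) for xi, yi in zip(X, Y)]   # residuals
sigma2_hat = sum(ei ** 2 for ei in e) / (n - 2)     # regression error variance
print(round(b1, 4), round(b2, 4), round(sigma2_hat, 4))
```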
Understanding "estimates" versus "estimators" under the OLS method
Formula 1 for the regression coefficients (Favorite Five, FF):
\[ \begin{aligned} \left \{ \begin{split} \hat{\beta}_2 &=\frac{n\sum{X_iY_i}-\sum{X_i}\sum{Y_i}}{n\sum{X_i^2}-\left ( \sum{X_i} \right)^2}\\ \hat{\beta}_1 &=\frac{\sum{X_i^2}\sum{Y_i}-\sum{X_i}\sum{X_iY_i}}{n\sum{X_i^2}-\left ( \sum{X_i} \right)^2} \end{split} \right. &&\text{(FF solution)} \end{aligned} \]
If a parameter estimation result is computed from a specific sample, it is an "estimate" (a "point estimate"): one particular numerical value of the estimator;
If instead the expression above is viewed as a formula for parameter estimation, then it is a function of \((X_i,Y_i)\); since \(Y_i\) is a random variable, the parameter estimate is also a random variable, and from this angle it is called an "estimator".
The OLS estimators are expressed purely in terms of observable (i.e., sample) quantities (X and Y), so they are easy to compute.
They are point estimators: for a given sample, each estimator provides only a single (point) value of the corresponding population parameter.
Once the OLS estimates are obtained from the sample data, the sample regression line is easy to draw.
\[ \begin{aligned} \bar{Y} = \hat{\beta}_1 +\hat{\beta}_2\bar{X} \end{aligned} \]
\[ \begin{aligned} \hat{Y}_i &= \hat{\beta}_1 +\hat{\beta}_2X_i \\ & =(\bar{Y} - \hat{\beta}_2\bar{X}) + \hat{\beta}_2X_i \\ & = \bar{Y} + \hat{\beta}_2(X_i - \bar{X}) \end{aligned} \]
\[ \begin{aligned} &\Rightarrow \frac{1}{n}\sum{\hat{Y}_i} = \frac{1}{n}\sum{\left[ \bar{Y} + \hat{\beta}_2(X_i - \bar{X}) \right]} && \leftarrow \left[ \sum{(X_i - \bar{X})} = 0 \right] \\ &\Rightarrow \bar{\hat{Y}} = \bar{Y} \end{aligned} \]
\[ \begin{aligned} \sum{\left[ \hat{\beta}_1 - (Y_i -\hat{\beta}_2X_i) \right]} &=0 \\ \sum{\left( Y_i- \hat{\beta}_1 - \hat{\beta}_2X_i \right)} &=0 \\ \sum{( Y_i- \hat{Y}_i )} &=0 \\ \sum{e_i} &=0 \\ \bar{e} &=0 \end{aligned} \]
\[ \begin{alignedat}{2} & \left. \begin{split} Y_i && = \hat{\beta}_1 + \hat{\beta}_2X_i + e_i \\ \bar{Y} &&= \hat{\beta}_1 + \hat{\beta}_2\bar{X} \end{split} \right \} \Rightarrow \\ & Y_i - \bar{Y} =\hat{\beta}_2(X_i - \bar{X}) + e_i \Rightarrow \\ & y_i=\hat{\beta}_2x_i +e_i \ &&\text{(SRM-dev)} \end{alignedat} \]
\[ \begin{alignedat}{999} & \left. \begin{split} \hat{Y}_i && = \hat{\beta}_1 + \hat{\beta}_2X_i\\ \bar{Y} &&= \hat{\beta}_1 + \hat{\beta}_2\bar{X} \end{split} \right \} \Rightarrow \\ & \hat{Y}_i - \bar{Y} =\hat{\beta_2}(X_i - \bar{X}) \Rightarrow \\ & \hat{y}_i=\hat{\beta_2}x_i \ &&\text{(SRF-dev)} \end{alignedat} \]
\[ \begin{aligned} Cov(e_i, \hat{Y}_i) &= E \left[ \left( e_i-E(e_i)\right )\cdot \left( \hat{Y}_i-E(\hat{Y}_i)\right ) \right] \propto \sum{(e_i \cdot \hat{y}_i)} \\ & = \sum{(e_i \cdot \hat{\beta}_2x_i)} \\ & = \sum{ \left[ (y_i-\hat{\beta}_2x_i) \cdot \hat{\beta}_2x_i \right]} \\ & = \hat{\beta}_2\sum \left[ (y_i-\hat{\beta}_2x_i)\cdot x_i \right]\\ & = \hat{\beta}_2\sum{ (y_ix_i-\hat{\beta}_2x_i^2) }\\ & = \hat{\beta}_2\sum{x_iy_i}-\hat{\beta}_2^2\sum{x_i^2} && \Leftarrow \hat{\beta}_2 = \frac{\sum{x_iy_i}}{\sum{x_i^2}} \\ & = \hat{\beta}_2^2\sum{x_i^2}- \hat{\beta}_2^2\sum{x_i^2} = 0 \end{aligned} \]
\[ \begin{aligned} x_i &= X_i - \bar{X} \\ y_i &= Y_i - \bar{Y} \\ \hat{y}_i &= \hat{Y}_i - \bar{\hat{Y}}_i = \hat{Y}_i - \bar{Y} \end{aligned} \]
\[ \begin{alignedat}{999} & \left. \begin{split} Y_i && = \beta_1 + \beta_2X_i + u_i \\ \bar{Y} &&= \beta_1 + \beta_2\bar{X} + \bar{u} \end{split} \right \} \Rightarrow \\ & Y_i - \bar{Y} =\beta_2(X_i - \bar{X}) + (u_i- \bar{u}) \Rightarrow \\ & y_i=\beta_2x_i + (u_i- \bar{u}) \ &&\text{(PRM-dev)} \end{alignedat} \]
\[ \begin{alignedat}{999} & \left. \begin{split} \hat{Y}_i && = \hat{\beta}_1 + \hat{\beta}_2X_i\\ \bar{Y} &&= \hat{\beta}_1 + \hat{\beta}_2\bar{X} \end{split} \right \} \Rightarrow \\ & \hat{Y}_i - \bar{Y} =\hat{\beta_2}(X_i - \bar{X}) \Rightarrow \\ & \hat{y}_i=\hat{\beta_2}x_i \ \end{alignedat} \]
\[ \begin{aligned} y_i=\hat{\beta_2}x_i +e_i &&\text{(SRM-dev)} \ \Rightarrow \\ e_i =y_i - \hat{\beta_2}x_i \ &&\text{(residual-dev)} \end{aligned} \]
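The numerical properties just derived (\(\sum e_i = 0\), \(\bar{\hat{Y}}=\bar{Y}\), and the orthogonality of \(e_i\) and \(\hat{Y}_i\)) hold for any OLS fit; the sample below is arbitrary:

```python
# Check the OLS numerical properties: sum(e_i) = 0, the mean of the
# fitted values equals Y-bar, and e is orthogonal to Y-hat.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]   # arbitrary sample
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(X, Y)) \
     / sum((xi - xbar) ** 2 for xi in X)
b1 = ybar - b2 * xbar
Yhat = [b1 + b2 * xi for xi in X]
e = [yi - yh for yi, yh in zip(Y, Yhat)]
print(abs(sum(e)) < 1e-9)                                  # sum of residuals is 0
print(abs(sum(Yhat) / n - ybar) < 1e-9)                    # fitted mean equals Y-bar
print(abs(sum(ei * yh for ei, yh in zip(e, Yhat))) < 1e-9) # e orthogonal to Y-hat
```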
Summary:
Ordinary least squares (OLS) fits a sample regression line by minimizing the sum of squared vertical (plumb-line) distances, and thereby yields the parameter estimators of the model.
You should be able to recall the OLS parameter-estimator formulas fluently, together with their major properties!
Questions for discussion:
Why has the OLS criterion of "minimizing the sum of squared vertical distances" been enshrined as the classic method of econometric analysis? Can you think of feasible alternatives?
What is the practical meaning of the regression standard error \(se\)? Is there an inner connection between estimating the regression parameters and estimating the variance of the disturbance term?
Do the properties of the OLS method make it "naturally gifted", "born with a golden key in its mouth"? Why can we say so?
We have already used OLS to obtain point estimators for 3 important parameters (in fact there are more) of the population regression model (PRM):
\[ \begin{aligned} Y_i &= \beta_1 +\beta_2X_i + u_i \\ \hat{\beta}_2 &=\frac{\sum{x_iy_i}}{\sum{x_i^2}} ; \quad \hat{\beta}_1 =\bar{Y}-\hat{\beta}_2\bar{X} ; \quad \hat{\sigma}^2 =\frac{\sum{e_i^2}}{n-2} \end{aligned} \]
The question is: how do we know whether the OLS point estimators are reliable? Are they stable? Are they trustworthy?
We therefore need measures that express the stability, or precision, of the OLS estimators!
Population variance ( \(\sigma^2_{\hat{\beta}_2}\) ) and population standard deviation ( \(\sigma_{\hat{\beta}_2}\) ) of the slope coefficient ( \(\hat{\beta}_2\) ):
\[ \begin{aligned} Var(\hat{\beta}_2) \equiv \sigma_{\hat{\beta}_2}^2 & =\frac{\sigma^2}{\sum{x_i^2}} \\ \sigma_{\hat{\beta}_2} &=\sqrt{\frac{\sigma^2}{\sum{x_i^2}}} \end{aligned} \]
Sample variance ( \(S^2_{\hat{\beta}_2}\) ) and sample standard deviation ( \(S_{\hat{\beta}_2}\) ) of the slope coefficient ( \(\hat{\beta}_2\) ):
\[ \begin{aligned} S_{\hat{\beta}_2}^2 &=\frac{\hat{\sigma}^2}{\sum{x_i^2}} \\ S_{\hat{\beta}_2} &=\sqrt{\frac{\hat{\sigma}^2}{\sum{x_i^2}}} \end{aligned} \]
Step 1: transforming \(\hat{\beta}_2\):
\[ \begin{aligned} \hat{\beta}_2 &=\frac{\sum{x_iy_i}}{\sum{x_i^2}}= \frac{\sum{\left[ x_i (Y_i -\bar{Y}) \right]} }{\sum{x_i^2}} \\ & = \frac{\sum{ x_iY_i}- \sum{ x_i \bar{Y} } }{\sum{x_i^2}} \\ & = \frac{\sum{x_iY_i}- \bar{Y}\sum{x_i} }{\sum{x_i^2}} && \leftarrow \left[ \sum{x_i}=\sum{(X_i -\bar{X})} = 0 \right] \\ & = \sum{ \left(\frac{x_i}{\sum{x_i^2}} \cdot Y_i \right) } && \leftarrow \left[ k_i \equiv \frac{x_i}{\sum{x_i^2}} \right]\\ & = \sum{k_iY_i} \end{aligned} \]
- where \(k_i \equiv \frac{x_i}{\sum{x_i^2}}\).
Step 2: computing the population variance of \(\hat{\beta}_2\) ( \(\sigma^2_{\hat{\beta}_2}\) ):
\[ \begin{aligned} \sigma^2_{\hat{\beta}_2} & \equiv Var(\hat{\beta}_2) = Var(\sum{k_iY_i} ) \\ & = \sum{\left( k_i^2Var(Y_i) \right)} \\ & = \sum{\left( k_i^2Var(\beta_1 +\beta_2X_i +u_i) \right)} \\ & = \sum{ \left( k_i^2Var(u_i) \right)} && \leftarrow \left[ k_i \equiv \frac{x_i}{\sum{x_i^2}} \right]\\ & = \sum{ \left( \left(\frac{x_i}{\sum{x_i^2}} \right)^2 \cdot \sigma^2 \right)} \\ & = \frac{\sigma^2}{\sum{x_i^2}} \end{aligned} \]
where \(Var(u_i) \equiv \sigma^2\) denotes the population variance of the disturbance term \(u_i\).
Population variance ( \(\sigma^2_{\hat{\beta}_1}\) ) and population standard deviation ( \(\sigma_{\hat{\beta}_1}\) ) of the intercept coefficient ( \(\hat{\beta}_1\) ):
\[ \begin{aligned} Var(\hat{\beta}_1) \equiv \sigma_{\hat{\beta}_1}^2 &=\frac{\sum{X_i^2}}{n} \cdot \frac{\sigma^2}{\sum{x_i^2}} \\ \sigma_{\hat{\beta}_1} & =\sqrt{\frac{\sum{X_i^2}}{n} \cdot \frac{\sigma^2}{\sum{x_i^2}}} \end{aligned} \]
Sample variance \((S^2_{\hat{\beta}_1})\) and sample standard deviation \((S_{\hat{\beta}_1})\) of the intercept coefficient \((\hat{\beta}_1)\):
\[ \begin{aligned} S_{\hat{\beta}_1}^2 &=\frac{\sum{X^2_i}}{n} \cdot \frac{\hat{\sigma}^2}{\sum{x_i^2}} \\ S_{\hat{\beta}_1} &=\sqrt{\frac{\sum{X^2_i}}{n} \cdot \frac{\hat{\sigma}^2}{\sum{x_i^2}}} \end{aligned} \]
Step 1: transforming \(\hat{\beta}_1\):
\[ \begin{aligned} \hat{\beta}_1 & = \bar{Y}-\hat{\beta}_2\bar{X} && \leftarrow \left[ \hat{\beta}_2= \sum{k_iY_i} \right] \\ & = \frac{1}{n} \sum{Y_i} - \sum{\left( k_iY_i \cdot \bar{X} \right)} \\ & = \sum{\left( (\frac{1}{n} - k_i\bar{X}) \cdot Y_i \right)} && \leftarrow \left[ w_i \equiv \frac{1}{n} - k_i\bar{X} \right]\\ & = \sum{w_iY_i} \end{aligned} \]
- where we define \(w_i \equiv \frac{1}{n} - k_i\bar{X}\)
Step 2: computing the population variance of \(\hat{\beta}_1\) ( \(\sigma^2_{\hat{\beta}_1}\) ):
\[ \begin{aligned} \sigma^2_{\hat{\beta}_1} & \equiv Var(\hat{\beta_1}) = Var(\sum{w_iY_i}) \\ & = \sum{\left( w_i^2Var(\beta_1 +\beta_2X_i + u_i) \right)} && \leftarrow \left[w_i \equiv \frac{1}{n} - k_i\bar{X} \right]\\ & = \sum{\left( \left( \frac{1}{n} - k_i\bar{X} \right)^2Var(u_i) \right)} \\ & = \sigma^2 \cdot \sum{ \left( \frac{1}{n^2} - \frac{2 \bar{X} k_i}{n} + k_i^2 \bar{X}^2 \right) } && \leftarrow \left[ \sum{k_i} = \sum{\left( \frac{x_i}{\sum{x_i^2}} \right)= \frac{\sum{x_i}} {\sum{x_i^2}}}=0 \right] \\ & = \sigma^2 \cdot \left( \frac{1}{n} + \bar{X}^2\sum{k_i^2} \right) && \leftarrow \left[ k_i \equiv \frac{x_i}{\sum{x_i^2}} \right]\\ & = \sigma^2 \cdot \left( \frac{1}{n} + \bar{X}^2\sum{ \left( \frac{x_i}{\sum{x_i^2}} \right) ^2} \right) \end{aligned} \]
Step 2: computing the population variance of \(\hat{\beta}_1\) ( \(\sigma^2_{\hat{\beta}_1}\) ) (continued):
\[ \begin{aligned} & = \sigma^2 \cdot \left( \frac{1}{n} + \bar{X}^2 \frac{\sum{x_i^2}}{\left( \sum{x_i^2} \right)^2} \right) \\ & = \sigma^2 \cdot \left( \frac{1}{n} + \frac{ \bar{X}^2 } { \sum{x_i^2} } \right) \\ & = \frac{\sum{x_i^2} + n\bar{X}^2} {n\sum{x_i^2}} \cdot \sigma^2 && \leftarrow \left[ \sum{x_i^2} + n\bar{X}^2 = \sum{(X_i-\bar{X})^2} + n\bar{X}^2 = \sum{X_i^2}\right]\\ & = \frac{\sum{X_i^2}}{n} \cdot \frac{\sigma^2}{\sum{x_i^2}} \end{aligned} \]
Now a brief summary:
To gauge whether the OLS point estimators are stable or trustworthy, we generally use variance and standard-deviation measures.
You should memorize the final formulas for the population and sample variances of the slope and intercept estimators.
Please think about the following questions:
Are the population variance and the sample variance fixed numbers?
Which factors affect each of them? How are the two related?
What properties do the \(k_i\) and \(w_i\) defined in the derivations have?
\[ \begin{cases} \begin{aligned} \sum{k_i} & =0 \\ \sum{k_iX_i} & = 1 \end{aligned} \end{cases} \]
\[ \begin{cases} \begin{aligned} \sum{w_i} & =1 \\ \sum{w_iX_i} & = 0 \end{aligned} \end{cases} \]
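These four weight identities are easy to confirm numerically; the X values below are arbitrary:

```python
# Verify the weight identities used in the variance derivations:
# sum(k_i) = 0, sum(k_i*X_i) = 1, sum(w_i) = 1, sum(w_i*X_i) = 0.
X = [3, 5, 8, 11, 13]   # arbitrary sample
n = len(X)
xbar = sum(X) / n
x = [xi - xbar for xi in X]
sxx = sum(xi ** 2 for xi in x)
k = [xi / sxx for xi in x]            # k_i = x_i / sum(x_i^2)
w = [1 / n - ki * xbar for ki in k]   # w_i = 1/n - k_i * X-bar
print(abs(sum(k)) < 1e-12,
      abs(sum(ki * Xi for ki, Xi in zip(k, X)) - 1) < 1e-12)
print(abs(sum(w) - 1) < 1e-12,
      abs(sum(wi * Xi for wi, Xi in zip(w, X))) < 1e-12)
```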
For the "education-level case", using the FF-ff computation table together with the quantities we have already computed,
we can further compute the sample variances and sample standard deviations of the regression coefficients:
\[ \begin{aligned} S^2_{\hat{\beta}_2} &= \frac{\hat{\sigma}^2}{\sum{x_i^2}} \\ S_{\hat{\beta}_2} &= \sqrt{\frac{\hat{\sigma}^2}{\sum{x_i^2}}} = \sqrt{0.0048} = 0.0696 \end{aligned} \]
\[ \begin{aligned} S^2_{\hat{\beta}_1} &= \frac{\sum{X_i^2}} {n} \frac{\hat{\sigma}^2} {\sum{x_i^2}} \\ S_{\hat{\beta}_1} &= \sqrt{\frac{\sum{X_i^2}}{n}\frac{\hat{\sigma}^2} {\sum{x_i^2}}} = \sqrt{0.765} = 0.8746 \end{aligned} \]
\[ \begin{aligned} \hat{\beta}_2 & \sim N(\mu_{\hat{\beta}_2}, \sigma^2_{\hat{\beta}_2}) && \leftarrow \left[ \mu_{\hat{\beta}_2}= \beta_2; \quad \sigma^2_{\hat{\beta}_2} = \frac{\sigma^{2}}{\sum x_{i}^{2}} \right] \end{aligned} \]
\[ \begin {aligned} &Z=\frac{\left(\hat{\beta}_{2}-\beta_{2}\right)}{\sqrt{\operatorname{var}\left(\hat{\beta}_{2}\right)}} =\frac{\left(\hat{\beta}_{2}-\beta_{2}\right)}{\sqrt{\sigma_{\hat{\beta}_{2}}^{2}}} =\frac{\hat{\beta}_{2}-\beta_{2}}{\sigma_{\hat{\beta}_{2}}} =\frac{\left(\hat{\beta}_{2}-\beta_{2}\right)}{\sqrt{\frac{\sigma^{2}}{\sum x_{i}^{2}}}} && \leftarrow Z \sim N(0, 1) \end {aligned} \]
\[ \begin{aligned} T&=\frac{\left(\hat{\beta}_{2}-\beta_{2}\right)}{\sqrt{S_{\hat{\beta}_{2}}^{2}}} =\frac{\hat{\beta}_{2}-\beta_{2}}{S_{\hat{\beta}_{2}}} && \leftarrow T \sim t(n-2) \end{aligned} \]
\[ \begin{aligned} S^2_{\hat{\beta}_2} =\frac{\hat{\sigma}^{2}}{\sum x_{i}^{2}} ; \quad \hat{\sigma}^{2}=\frac{\sum e_{i}^{2}}{n-2} \end{aligned} \]
\[ \begin{aligned} \operatorname{Pr}\left[-t_{\alpha / 2,(n-2)} \leq \mathrm{T} \leq t_{\alpha / 2,(n-2)}\right]=1-\alpha \end{aligned} \]
\[ \begin {aligned} \operatorname{Pr}\left[-t_{\alpha / 2,(n-2)} \leq \frac{\hat{\beta}_{2}-\beta_{2}}{S_{\hat{\beta}_{2}}} \leq t_{\alpha / 2 ,(n-2)}\right]=1-\alpha \end {aligned} \]
\[ \begin {aligned} \operatorname{Pr}\left[\hat{\beta}_{2}-t_{\alpha / 2,(n-2)} \cdot S_{\hat{\beta}_{2}} \leq \beta_{2} \leq \hat{\beta}_{2}+t_{\alpha / 2,(n-2)} \cdot S_{\hat{\beta}_{2}}\right]=1-\alpha \end {aligned} \]
Therefore, the \(100(1-\alpha)\%\) upper and lower confidence limits for \(\beta_2\) are:
\[ \hat{\beta}_{2} \pm t_{\alpha / 2} \cdot S_{\hat{\beta}_{2}} \] and the \(100(1-\alpha)\%\) confidence interval for \(\beta_2\) is:
\[ \left[ \hat{\beta}_{2} - t_{\alpha / 2} \cdot S_{\hat{\beta}_{2}}, \quad \hat{\beta}_{2} + t_{\alpha / 2} \cdot S_{\hat{\beta}_{2}} \right] \]
\[ \begin{aligned} \hat{\beta}_1 & \sim N(\mu_{\hat{\beta}_1}, \sigma^2_{\hat{\beta}_1}) && \leftarrow \left[ \mu_{\hat{\beta}_1}= \beta_1; \quad \sigma^2_{\hat{\beta}_1} = \frac{\sum{X_i^2}}{n} \frac{\sigma^{2}}{\sum x_{i}^{2}} \right] \end{aligned} \]
\[ \begin {aligned} &Z=\frac{\left(\hat{\beta}_{1}-\beta_{1}\right)}{\sqrt{\operatorname{var}\left(\hat{\beta}_{1}\right)}} =\frac{\left(\hat{\beta}_{1}-\beta_{1}\right)}{\sqrt{\sigma_{\hat{\beta}_{1}}^{2}}} =\frac{\hat{\beta}_{1}-\beta_{1}}{\sigma_{\hat{\beta}_{1}}} =\frac{\left(\hat{\beta}_{1}-\beta_{1}\right)}{\sqrt{\frac{\sum{X^2_i}}{n} \cdot \frac{\sigma^{2}}{\sum x_{i}^{2}}}} && \leftarrow Z \sim N(0, 1) \end {aligned} \]
\[ \begin{aligned} T&=\frac{\left(\hat{\beta}_{1}-\beta_{1}\right)}{\sqrt{S_{\hat{\beta}_{1}}^{2}}} =\frac{\hat{\beta}_{1}-\beta_{1}}{S_{\hat{\beta}_{1}}} && \leftarrow T \sim t(n-2) \end{aligned} \]
\[ \begin{aligned} S^2_{\hat{\beta}_1} =\frac{\sum{X_i^2}}{n} \cdot \frac{\hat{\sigma}^{2}}{\sum x_{i}^{2}} ; \quad \hat{\sigma}^{2}=\frac{\sum e_{i}^{2}}{n-2} \end{aligned} \]
\[ \begin{aligned} \operatorname{Pr}\left[-t_{\alpha / 2,(n-2)} \leq \mathrm{T} \leq t_{\alpha / 2,(n-2)}\right]=1-\alpha \end{aligned} \]
\[ \begin {aligned} \operatorname{Pr}\left[-t_{\alpha / 2,(n-2)} \leq \frac{\hat{\beta}_{1}-\beta_{1}}{S_{\hat{\beta}_{1}}} \leq t_{\alpha / 2 ,(n-2)}\right]=1-\alpha \end {aligned} \]
\[ \begin {aligned} \operatorname{Pr}\left[\hat{\beta}_{1}-t_{\alpha / 2,(n-2)} \cdot S_{\hat{\beta}_{1}} \leq \beta_{1} \leq \hat{\beta}_{1}+t_{\alpha / 2,(n-2)} \cdot S_{\hat{\beta}_{1}}\right]=1-\alpha \end {aligned} \]
Therefore, the \(100(1-\alpha)\%\) upper and lower confidence limits for \(\beta_1\) are:
\[ \hat{\beta}_{1} \pm t_{\alpha / 2} \cdot S_{\hat{\beta}_{1}} \] and the \(100(1-\alpha)\%\) confidence interval for \(\beta_1\) is:
\[ \left[ \hat{\beta}_{1} - t_{\alpha / 2} \cdot S_{\hat{\beta}_{1}}, \quad \hat{\beta}_{1} + t_{\alpha / 2} \cdot S_{\hat{\beta}_{1}} \right] \]
\[ \begin {aligned} \chi^{2} & =(n-2) \frac{\hat{\sigma}^{2}}{\sigma^{2}} &&\leftarrow \quad \chi^{2} \sim \chi^{2}(n-2) \end {aligned} \]
\[ \begin {aligned} \operatorname{Pr}\left(\chi_{\alpha / 2}^{2} \leq \chi^{2} \leq \chi_{1-\alpha / 2}^{2}\right)=1-\alpha \end {aligned} \]
\[ \begin {aligned} \operatorname{Pr}\left(\chi_{\alpha / 2}^{2} \leq (n-2) \frac{\hat{\sigma}^{2}}{\sigma^{2}} \leq \chi_{1-\alpha / 2}^{2}\right)=1-\alpha \end {aligned} \]
\[ \begin {aligned} \operatorname{Pr}\left[(n-2) \frac{\hat{\sigma}^{2}}{\chi_{1-\alpha/2}^{2}} \leq \sigma^{2} \leq (n-2) \frac{\hat{\sigma}^{2}}{\chi_{\alpha / 2}^{2}}\right]=1-\alpha \end {aligned} \]
Therefore, the \(100(1-\alpha)\%\) confidence interval for \(\sigma^2\) is:
\[ \left[ (n-2) \frac{\hat{\sigma}^{2}}{\chi_{1-\alpha/2}^{2}}, \quad (n-2) \frac{\hat{\sigma}^{2}}{\chi_{\alpha / 2}^{2}}\right] \]
We continue to analyze the education-and-wage case using the sample data.
The population regression model (PRM) for the education-and-wage case is:
\[ \begin{aligned} Wage_i & = \beta_1 + \beta_2 Edu_i +u_i \\ Y_i & = \beta_1 + \beta_2 X_i +u_i \\ \end{aligned} \]
The sample regression model (SRM) for the education-and-wage case is:
\[ \begin{aligned} Wage_i & = \hat{\beta}_1 + \hat{\beta}_2 Edu_i +e_i \\ Y_i & = \hat{\beta}_1 + \hat{\beta}_2 X_i + e_i \\ \end{aligned} \]
We previously computed the following quantities for the "education-level case":
Regression coefficients: \(\hat{\beta}_1 =\) -0.0145; \(\hat{\beta}_2 =\) 0.7241.
Regression error variance: \(\hat{\sigma}^2=\) 0.8812.
Sample variances of the regression coefficients: \(S^2_{\hat{\beta}_1} = \frac{\sum{X_i^2}}{n} \cdot \frac{\hat{\sigma}^2} {\sum{x_i^2}}=\) 0.7650; \(S^2_{\hat{\beta}_2} = \frac{\hat{\sigma}^2} {\sum{x_i^2}}=\) 0.0048;
Sample standard deviations of the regression coefficients: \(S_{\hat{\beta}_1} =\) 0.8746; \(S_{\hat{\beta}_2} =\) 0.0696.
Given \(\alpha=0.05,\quad (1-\alpha) 100 \%=95 \%\), the t-distribution table gives the critical value \(t_{\alpha / 2}(n-2)=t_{0.05 / 2}(11)=\) 2.2010.
Next we compute the confidence intervals for the regression coefficients:
The 95% confidence interval for the intercept parameter \(\beta_1\) is:
\[ \begin{aligned} \hat{\beta}_{1} - t_{\alpha / 2} \cdot S_{\hat{\beta}_{1}} \quad \leq & \beta_1 \leq \quad \hat{\beta}_{1} + t_{\alpha / 2} \cdot S_{\hat{\beta}_{1}} \\ -0.0145 - 2.201 \cdot 0.8746 \quad \leq & \beta_1 \leq \quad -0.0145 + 2.201 \cdot 0.8746 \\ -1.9395 \quad \leq & \beta_1 \leq \quad 1.9106 \\ \end{aligned} \]
The 95% confidence interval for the slope parameter \(\beta_2\) is:
\[ \begin{aligned} \hat{\beta}_{2} - t_{\alpha / 2} \cdot S_{\hat{\beta}_{2}} \quad \leq & \beta_2 \leq \quad \hat{\beta}_{2} + t_{\alpha / 2} \cdot S_{\hat{\beta}_{2}} \\ 0.7241 - 2.201 \cdot 0.0696 \quad \leq & \beta_2 \leq \quad 0.7241 + 2.201 \cdot 0.0696 \\ 0.5709 \quad \leq & \beta_2 \leq \quad 0.8772 \\ \end{aligned} \]
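The interval arithmetic above can be reproduced directly from the reported values (any discrepancy in the last digit is rounding):

```python
# Reproduce the 95% CI for beta_2 from the values reported in the
# text: beta_2-hat, its sample standard deviation, and the t-table
# critical value t_{0.025}(11) = 2.201.
b2_hat, s_b2, t_crit = 0.7241, 0.0696, 2.201
lower = b2_hat - t_crit * s_b2
upper = b2_hat + t_crit * s_b2
print(round(lower, 4), round(upper, 4))  # close to the (0.5709, 0.8772) in the text
```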
Given \(\alpha=0.05,\quad (1-\alpha) 100 \%=95 \%\),
the chi-square distribution table gives:
\(\chi^2_{\alpha / 2}(n-2)=\chi^2_{0.05 / 2}(11)=\chi^2_{0.025}(11)=\) 3.8157
\(\chi^2_{1-\alpha / 2}(n-2)=\chi^2_{1-0.05 / 2}(11)=\chi^2_{0.975}(11)=\) 21.9200
We previously computed the regression error variance \(\hat{\sigma}^2=\frac{\sum{e_i^2}}{n-2}=\) 0.8812. Hence the 95% confidence interval for \(\sigma^2\) can be computed as:
\[ \begin{aligned} (n-2) \frac{\hat{\sigma}^{2}}{\chi_{1-\alpha / 2}^{2}} \leq \sigma^{2} \leq(n-2) \frac{\hat{\sigma}^{2}}{\chi_{\alpha / 2}^{2}}\\ 11 \cdot \frac{0.8812}{21.92} \leq \sigma^{2} \leq 11 \cdot \frac{0.8812}{3.8157}\\ 0.4422 \leq \sigma^{2} \leq 2.5403\\ \end{aligned} \]
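Likewise for the chi-square interval, using the critical values quoted above for 11 degrees of freedom:

```python
# Reproduce the 95% CI for sigma^2 using the chi-square critical
# values quoted in the text for n - 2 = 11 degrees of freedom.
n, sigma2_hat = 13, 0.8812
chi2_lo, chi2_hi = 3.8157, 21.92   # chi^2_{0.025}(11), chi^2_{0.975}(11)
lower = (n - 2) * sigma2_hat / chi2_hi
upper = (n - 2) * sigma2_hat / chi2_lo
print(round(lower, 4), round(upper, 4))  # (0.4422, 2.5403), as in the text
```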
Chapter 5 Correlation and Regression Analysis [05-03] The OLS Method and Parameter Estimation