Day 04 · 手写教程

Hausman 检验

FE 还是 RE，不只是看 p 值

Hausman 检验是什么？

Hausman 检验用于面板数据里比较固定效应模型和随机效应模型。它真正检验的不是“哪个模型显著”，而是随机效应模型的关键假设是否可以接受：个体不可观测特征是否与解释变量无关。

如果这个假设站得住，随机效应模型可以使用；如果这个假设站不住，随机效应估计会不一致，通常应优先考虑固定效应。

这个案例怎么跑？

先用 xtset firm_id year 声明企业-年份面板。然后用同一组变量分别跑 xtreg ..., fe 和 xtreg ..., re，并分别保存结果。最后执行 hausman fe re, sigmamore。

这里最重要的是“同一组变量”。不要 FE 加了控制变量、RE 少了控制变量，然后拿它们做 Hausman。那样比较的是两个不同模型，不是 FE 和 RE 的差异。

结果应该怎么读？

本案例的 Hausman 统计量为 Chi2(13)=13.3383，p=0.4220。p 值不小，所以不拒绝随机效应的原假设。论文里可以说“未发现随机效应设定与固定效应估计存在系统差异”，但不能写“证明随机效应一定正确”。

论文里怎么写？

可以写：本文分别估计固定效应模型和随机效应模型，并进行 Hausman 检验。检验结果显示 Chi2(13)=13.3383，p=0.4220，未拒绝随机效应模型的外生性设定。因此，在本案例口径下，随机效应模型可以作为可接受设定之一；同时，主回归仍需结合研究设计、固定效应控制和稳健性检验共同判断。

常见错误

不要只写“p>0.05，所以选择 RE”。更严谨的说法是“不拒绝 RE 的关键假设”。如果你的主表使用聚类稳健标准误，还需要说明 classic Hausman 与稳健标准误口径并不完全等价。

本页案例代码和输出

下面这部分是本教程对应的实际案例材料，方便你把前面的解释和真实输出对上。

Stata 代码

log using "/root/workspace/empirical-wizard/workspace/4396bde4/analysis.log", replace text
global JOB_DIR "/root/workspace/empirical-wizard/workspace/4396bde4"
set more off
adopath + "/root/ado/plus"
global DATA_PATH "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv"
import delimited "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv", clear case(preserve)
capture confirm global JOB_DIR
if _rc global JOB_DIR "."
quietly duplicates drop
capture confirm numeric variable stkcd
if _rc {
    tempvar __hid
    encode stkcd, gen(`__hid')
    local idnum "`__hid'"
}
else {
    local idnum "stkcd"
}
tempname fh
capture file close `fh'
file open `fh' using "$JOB_DIR/hausman_test.csv", write replace
file write `fh' "统计量,值" _n
xtset `idnum' year
capture noisily xtreg patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age, fe
if _rc {
    file write `fh' "状态,skipped" _n
    file write `fh' "原因,固定效应模型不可估；可能是完整样本不足、核心解释变量无组内变化或变量共线" _n
    file close `fh'
    di as error "Hausman skipped: FE not estimable"
    exit 0
}
capture scalar __hausman_se = abs(_se[dfi_index])
if _rc {
    file write `fh' "状态,skipped" _n
    file write `fh' "原因,固定效应中核心解释变量被吸收或标准误不可取，Hausman 不具备解释性" _n
    file close `fh'
    di as error "Hausman skipped: core coefficient not available in FE"
    exit 0
}
if scalar(__hausman_se) <= 1e-12 {
    file write `fh' "状态,skipped" _n
    file write `fh' "原因,固定效应中核心解释变量被吸收或标准误为0，Hausman 不具备解释性" _n
    file close `fh'
    di as error "Hausman skipped: core coefficient not estimable in FE"
    exit 0
}
estimates store __fe
capture noisily xtreg patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age, re
if _rc {
    file write `fh' "状态,skipped" _n
    file write `fh' "原因,随机效应模型不可估；可能是完整样本不足或解释变量无有效变化" _n
    file close `fh'
    di as error "Hausman skipped: RE not estimable"
    exit 0
}
estimates store __re
capture noisily hausman __fe __re, sigmamore
if _rc {
    file write `fh' "状态,skipped" _n
    file write `fh' "原因,Hausman 统计量不可计算；FE/RE 方差差矩阵不满足要求" _n
    file close `fh'
    di as error "Hausman skipped: statistic unavailable"
    exit 0
}
scalar __chi2 = r(chi2)
scalar __df = r(df)
scalar __p = r(p)
local chi2_s : display %9.4f __chi2
local df_s : display %9.0f __df
local p_s : display %9.4f __p
* Guard against missing/negative chi2 — sigmamore should keep
* it non-negative, but FE-RE variance diffs occasionally produce
* missing values (e.g. when the FE absorbs the core regressor).
* Without the guard, missing __p would silently fall through the
* `cond(__p < 0.05, ...)` branch as "不拒绝原假设" — falsely
* recommending Random Effects.
local conclude ""
if missing(__chi2) | missing(__p) | __chi2 < 0 {
    local conclude = "无法计算（chi² 缺失或负值，FE/RE 方差差矩阵退化）；建议直接采用 FE 设定"
}
else if __p < 0.05 {
    local conclude = "拒绝原假设，使用固定效应模型 (FE)"
}
else {
    local conclude = "不拒绝原假设，使用随机效应模型 (RE)"
}
file write `fh' "Chi2,`chi2_s'" _n
file write `fh' "df,`df_s'" _n
file write `fh' "p-value,`p_s'" _n
file write `fh' "结论,`conclude'" _n
file close `fh'
di "Hausman: chi2=`chi2_s' df=`df_s' p=`p_s'"
di "结论: `conclude'"
log close

输出表

统计量	值
Chi2	13.3383
df	13
p-value	0.4220
结论	不拒绝原假设，使用随机效应模型 (RE)

案例图

写作检查

本文在共用企业面板样本上报告Hausman 检验，核心输出见 hausman_test.csv。结果解释时同时关注样本口径、变量构造、系数方向、标准误和适用前提，避免只凭单个 p 值完成方法选择。

确认本页使用的因变量、核心解释变量、控制变量与论文主模型一致。
先看表格里的样本口径，再看系数、p 值或诊断指标。
代码里的输出文件名要能对应网页展示的结果表。

返回方法库 · 打开 empirical-wizard