Method 08 · robustness

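Robustness Checks

Producing interpretable model evidence from a single shared firm panel

A Markdown-style tutorial on robustness checks: the code, result tables, and case figures are all generated from one shared CSMAR-style case.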


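1. What is a robustness check?

This page is the method documentation for robustness checks. A robustness check re-estimates the baseline model under a deliberate perturbation (winsorizing, alternative variables, a different model, or a different sample) to see whether the main result survives. Every table and figure here is generated by marketing/method_case_assets/generate_assets.py from the same csmar_innovation_realistic.csv, so no placeholder image stands in for a tutorial. The focus is on using a firm-year panel to explain model specification, fixed effects, clustered standard errors, and how to read the results.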

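2. Walking through the case

Before you start, confirm

  • You already have a clear baseline regression. Without a main result there is nothing to test for robustness.
  • Every robustness check needs a reason: winsorizing, swapping variables, changing the model, or changing the sample must each come with an explanation of why it is sensible.
  • Robustness means the direction, magnitude, and significance are broadly consistent; the numbers need not match exactly.
  • Do not keep only the significant robustness results; inconsistent results need an explanation too.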

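Order of operations

| Step | What you are doing | What counts as getting it right |
| --- | --- | --- |
| 1. Pick a perturbation | For example, winsorizing at the 1st and 99th percentiles. | The perturbation must answer a real risk. |
| 2. Regenerate the variables | Apply the same treatment to Y, X, or the controls. | Do not treat only the variables that make the results look good. |
| 3. Rerun the main model | Keep the fixed effects and the standard-error specification unchanged. | Otherwise the difference may come from the change in the model. |
| 4. Compare the core coefficient | Look at the DFI coefficient and its p-value. | Same sign and similar magnitude indicate good robustness. |
| 5. State the boundaries | Say which risks the robustness checks cover. | Do not write an empty phrase such as "all robustness checks passed". |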

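Code explained line by line

| Code / command | What this line does |
| --- | --- |
| winsor2 ... cuts(1 99) | Winsorizes at the 1st and 99th percentiles to reduce the influence of extreme values. winsor2 is a community-contributed command; the code in section 5 performs the same step by hand with summarize, detail and replace. |
| suffix(_w) | Adds a suffix to the winsorized variables so the originals are not overwritten. |
| xtreg ln_patent1_w dfi_index_w ... | Reruns the main model on the winsorized variables. |
| vce(cluster firm_id) | Keeps the standard-error specification identical to the baseline regression. In the section 5 code, (R1) uses two-way clustering while (R2) to (R4) use vce(robust), so compare standard errors across those columns with that in mind. |
| export delimited | Saves the robustness table so it can be compared side by side with the baseline table. |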
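
As a minimal sketch of the commands in this table, the following winsorizes by hand and reruns the model with an unchanged standard-error option. It uses Stata's built-in auto dataset as a stand-in for the firm-year panel; the variable names, the foreign fixed effect, and the vce_opt local are illustrative assumptions, not the page's production code.

```stata
* Sketch only: sysuse auto stands in for the firm-year panel
sysuse auto, clear
local vce_opt "vce(robust)"          // one SE specification, reused below

* Baseline on the raw variables
quietly areg price mpg weight, absorb(foreign) `vce_opt'
scalar b_base = _b[mpg]

* Winsorize price and mpg at the 1st and 99th percentiles, keeping the originals
foreach v in price mpg {
    quietly summarize `v', detail
    generate double `v'_w = min(max(`v', r(p1)), r(p99)) if !missing(`v')
}

* Same model, same fixed effect, same SE option; only the variables changed
quietly areg price_w mpg_w weight, absorb(foreign) `vce_opt'
scalar b_wins = _b[mpg_w]

display "baseline = " %9.4f b_base "   winsorized = " %9.4f b_wins
```

Storing the vce() option in one local is a simple way to make sure the baseline and the robustness run cannot drift apart on the standard-error specification.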

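How to read the result table

| Cell | How to read it |
| --- | --- |
| Baseline coefficient | The DFI coefficient from the original main model. |
| Winsorized coefficient | The DFI coefficient after extreme values are handled. |
| Change in coefficient | A small change means the main result does not depend much on extreme values; a large change needs an explanation. |
| p-value after winsorizing | Check whether significance holds, but do not rely on it alone. |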

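Most common mistakes

  • Do not treat robustness as "running a few more significant tables". Each check must correspond to a risk.
  • Do not change the sample, the variables, or the standard errors and then merely say the result is robust; state exactly what changed.
  • Do not quietly drop a robustness check because it is not significant. An honest tutorial should show readers how to interpret conflicting results.

What to do when you replicate this yourself

When replicating, build a small comparison table: baseline coefficient, robustness coefficient, difference, and your interpretation.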
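
A hedged sketch of such a comparison table, using the dfi_index coefficients reported on this page for (R1) and (R2). The scalar names are illustrative, and the interpretation column is left for you to write.

```stata
* Coefficients copied from this page's output table: (R1) and (R2, 1% winsorization)
scalar b_base = 0.5660
scalar b_rob  = 0.5530

* Absolute and relative change of the robustness coefficient
scalar diff = b_rob - b_base
scalar pct  = 100 * diff / b_base

display "baseline     robust       diff     pct change"
display %8.4f b_base "   " %8.4f b_rob "   " %8.4f diff "   " %8.2f pct
```

Here the winsorized coefficient is roughly 2.3% smaller than the (R1) coefficient, so the interpretation could state that extreme values do not appear to drive the main result.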

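3. Start with the conclusions of this case

  • dfi_index: 0.5660*** in (R1); 0.5530*** in (R2, 1% winsorization).
  • roa: 0.3040*** in (R1); 0.2924*** in (R2).
  • lev: -0.0548 in (R1); -0.0545 in (R2); no significance stars in either column.
  • These numbers come from the result table on this page; when writing a paper, explain what the numbers mean before discussing their theoretical implications.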

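4. Case specification

| Field | Specification |
| --- | --- |
| Data | CSMAR-style panel of A-share firm innovation |
| Raw sample | 196 listed firms, 2015-2020, about 1,200 firm-year observations; the effective sample for each method is the N in this page's output table |
| Dependent variable | patent_count; regression pages usually use ln(1 + patent_count) |
| Core explanatory variable | dfi_index, the digital financial inclusion index; some of the real smoke-test output shows the standardized dfi_index |
| Control variables | roa, lev, size, growth, cashflow, tobinq, top1, dual, board, indep, soe, age |
| Output file | regression_table_稳健性检验.csv |
| Required roles | dv, iv |
| Dependencies | No additional Stata community packages required |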
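
The ln(1 + patent_count) transform mentioned for the dependent variable can be sketched as follows. The three data rows are made up for illustration; only the name patent_count comes from this page, and ln_patent1 is an assumed name for the result.

```stata
* Hypothetical three-row example; the counts are not from the case data
clear
input long patent_count
0
9
99
end

* ln(1 + x) is defined at zero, so firm-years with no patents stay in the sample
generate double ln_patent1 = ln(1 + patent_count)
list, noobs
```

A plain ln(patent_count) would turn the zero-patent row into a missing value and silently drop it from the regression, which is the usual reason for adding 1 before taking logs.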

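5. The actual code

Below is the minimal reproducible Stata code for this page. In production, empirical-wizard builds on it to handle variable mapping, output validation, failure diagnosis, and report assembly.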

log using "/root/workspace/empirical-wizard/workspace/2a35153b/analysis.log", replace text
global JOB_DIR "/root/workspace/empirical-wizard/workspace/2a35153b"
set more off
adopath + "/root/ado/plus"
global DATA_PATH "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv"
import delimited "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv", clear case(preserve)
* Fall back to the current directory only when JOB_DIR is empty
if `"$JOB_DIR"' == "" global JOB_DIR "."
quietly duplicates drop
local dvvar "patent_count"
local ivvar "dfi_index"
local controls "roa lev size growth cashflow tobinq top1 dual board indep soe age"
local idvar "stkcd"
local timevar "year"
local industryvar "ind"
local geovar ""
local allvars "dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
capture confirm variable `dvvar'
if _rc {
    di as error "Dependent variable not found: `dvvar'"
    exit 111
}
capture confirm variable `ivvar'
if _rc {
    di as error "Core independent variable not found: `ivvar'"
    exit 111
}
local absorb_id "`idvar'"
if "`idvar'" != "" {
    capture confirm numeric variable `idvar'
    if _rc {
        tempvar __ewiz_id
        encode `idvar', gen(`__ewiz_id')
        local absorb_id "`__ewiz_id'"
    }
}
local cluster_id "`absorb_id'"
local fe_time "`timevar'"
if "`timevar'" != "" {
    capture confirm numeric variable `timevar'
    if _rc {
        tempvar __ewiz_time
        encode `timevar', gen(`__ewiz_time')
        local fe_time "`__ewiz_time'"
    }
}
local industry_fe "`industryvar'"
if "`industryvar'" != "" {
    capture confirm numeric variable `industryvar'
    if _rc {
        tempvar __ewiz_industry
        encode `industryvar', gen(`__ewiz_industry')
        local industry_fe "`__ewiz_industry'"
    }
}
local cluster_industry "`industry_fe'"
local geo_fe "`geovar'"
if "`geovar'" != "" {
    capture confirm numeric variable `geovar'
    if _rc {
        tempvar __ewiz_geo
        encode `geovar', gen(`__ewiz_geo')
        local geo_fe "`__ewiz_geo'"
    }
}
local cluster_geo "`geo_fe'"
tempfile ewiz_summary
tempname ewiz_sum
postfile `ewiz_sum' str32 spec double coef se pvalue str4 stars double N R2 using `ewiz_summary', replace
foreach v of local allvars {
    local cell_r1_`v' ""
    local secell_r1_`v' ""
    local cell_r2_`v' ""
    local secell_r2_`v' ""
    local cell_r3_`v' ""
    local secell_r3_`v' ""
    local cell_r4_`v' ""
    local secell_r4_`v' ""
    local cell_r7_`v' ""
    local secell_r7_`v' ""
}
preserve
quietly areg `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`fe_time', absorb(`absorb_id') vce(cluster `cluster_id' `fe_time')
restore
capture local N_r1 : display %12.0f e(N)
if "`N_r1'"=="" local N_r1 = .
capture local R2_r1 : display %9.4f e(r2)
if "`R2_r1'"=="" local R2_r1 = .
capture scalar __ewiz_main_b_r1 = _b[`ivvar']
capture scalar __ewiz_main_se_r1 = _se[`ivvar']
capture local coefmain_r1 : display %9.4f __ewiz_main_b_r1
capture if abs(__ewiz_main_b_r1) < 0.00005 & __ewiz_main_b_r1 != 0 local coefmain_r1 : display %9.2e __ewiz_main_b_r1
capture local semain_r1 : display %9.4f __ewiz_main_se_r1
capture if abs(__ewiz_main_se_r1) < 0.00005 & __ewiz_main_se_r1 != 0 local semain_r1 : display %9.2e __ewiz_main_se_r1
local pmain_r1 = .
local starsmain_r1 ""
local __ewiz_model_valid_r1 = 1
capture scalar __ewiz_coef_abs_r1 = abs(_b[`ivvar'])
if _rc local __ewiz_model_valid_r1 = 0
capture scalar __ewiz_se_abs_r1 = abs(_se[`ivvar'])
if _rc local __ewiz_model_valid_r1 = 0
capture if scalar(__ewiz_se_abs_r1) <= 1e-12 local __ewiz_model_valid_r1 = 0
capture if scalar(__ewiz_coef_abs_r1) <= 1e-12 & scalar(__ewiz_se_abs_r1) <= 1e-12 local __ewiz_model_valid_r1 = 0
capture local __df_r1 = e(df_r)
capture local __z_r1 = abs(_b[`ivvar'] / _se[`ivvar'])
if !_rc & `__ewiz_model_valid_r1' == 1 {
    if "`__df_r1'"=="" | "`__df_r1'"=="." local pmain_r1 = 2*(1-normal(`__z_r1'))
    else local pmain_r1 = 2*ttail(`__df_r1', `__z_r1')
}
if `__ewiz_model_valid_r1' == 1 local starsmain_r1 = cond(`pmain_r1'<0.01,"***",cond(`pmain_r1'<0.05,"**",cond(`pmain_r1'<0.1,"*","")))
if `__ewiz_model_valid_r1' == 1 {
    capture post `ewiz_sum' ("(R1)") (_b[`ivvar']) (_se[`ivvar']) (`pmain_r1') ("`starsmain_r1'") (`N_r1') (`R2_r1')
    if _rc post `ewiz_sum' ("(R1)") (.) (.) (.) ("") (`N_r1') (`R2_r1')
}
else post `ewiz_sum' ("(R1)") (.) (.) (.) ("") (`N_r1') (`R2_r1')
if `__ewiz_model_valid_r1' == 1 {
    local __included_vars "`ivvar' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
    foreach v of local allvars {
        local __is_included = 0
        foreach __inc of local __included_vars {
            if "`v'" == "`__inc'" local __is_included = 1
        }
        if `__is_included' == 0 continue
        local model_term "`v'"
        if "`v'" == "`ivvar'" local model_term "`ivvar'"
        capture scalar __ewiz_term_b = _b[`model_term']
        capture scalar __ewiz_term_se = _se[`model_term']
        if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
            local cell_r1_`v' "omitted"
            local secell_r1_`v' "(absorbed)"
            continue
        }
        capture local __coef : display %9.4f __ewiz_term_b
        capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
        if !_rc {
            capture local __se : display %9.4f __ewiz_term_se
            capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
            local __p = .
            local __stars ""
            capture local __df = e(df_r)
            capture local __z = abs(_b[`model_term'] / _se[`model_term'])
            if !_rc {
                if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
                else local __p = 2*ttail(`__df', `__z')
            }
            if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
            local cell_r1_`v' "`__coef'`__stars'"
            local secell_r1_`v' "(`__se')"
        }
    }
}
di "(R1): `ivvar'=`coefmain_r1'`starsmain_r1' (se=`semain_r1', p=`pmain_r1'), N=`N_r1', R2=`R2_r1'"
tempvar dv_w iv_w
gen double `dv_w' = `dvvar'
gen double `iv_w' = `ivvar'
quietly summarize `dvvar', detail
local dv_p1 = r(p1)
local dv_p99 = r(p99)
replace `dv_w' = `dv_p1' if !missing(`dv_w') & `dv_w' < `dv_p1'
replace `dv_w' = `dv_p99' if !missing(`dv_w') & `dv_w' > `dv_p99'
quietly summarize `ivvar', detail
local iv_p1 = r(p1)
local iv_p99 = r(p99)
replace `iv_w' = `iv_p1' if !missing(`iv_w') & `iv_w' < `iv_p1'
replace `iv_w' = `iv_p99' if !missing(`iv_w') & `iv_w' > `iv_p99'
quietly areg `dv_w' `iv_w' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`fe_time', absorb(`absorb_id') vce(robust)
capture local N_r2 : display %12.0f e(N)
if "`N_r2'"=="" local N_r2 = .
capture local R2_r2 : display %9.4f e(r2)
if "`R2_r2'"=="" local R2_r2 = .
capture scalar __ewiz_main_b_r2 = _b[`iv_w']
capture scalar __ewiz_main_se_r2 = _se[`iv_w']
capture local coefmain_r2 : display %9.4f __ewiz_main_b_r2
capture if abs(__ewiz_main_b_r2) < 0.00005 & __ewiz_main_b_r2 != 0 local coefmain_r2 : display %9.2e __ewiz_main_b_r2
capture local semain_r2 : display %9.4f __ewiz_main_se_r2
capture if abs(__ewiz_main_se_r2) < 0.00005 & __ewiz_main_se_r2 != 0 local semain_r2 : display %9.2e __ewiz_main_se_r2
local pmain_r2 = .
local starsmain_r2 ""
local __ewiz_model_valid_r2 = 1
capture scalar __ewiz_coef_abs_r2 = abs(_b[`iv_w'])
if _rc local __ewiz_model_valid_r2 = 0
capture scalar __ewiz_se_abs_r2 = abs(_se[`iv_w'])
if _rc local __ewiz_model_valid_r2 = 0
capture if scalar(__ewiz_se_abs_r2) <= 1e-12 local __ewiz_model_valid_r2 = 0
capture if scalar(__ewiz_coef_abs_r2) <= 1e-12 & scalar(__ewiz_se_abs_r2) <= 1e-12 local __ewiz_model_valid_r2 = 0
capture local __df_r2 = e(df_r)
capture local __z_r2 = abs(_b[`iv_w'] / _se[`iv_w'])
if !_rc & `__ewiz_model_valid_r2' == 1 {
    if "`__df_r2'"=="" | "`__df_r2'"=="." local pmain_r2 = 2*(1-normal(`__z_r2'))
    else local pmain_r2 = 2*ttail(`__df_r2', `__z_r2')
}
if `__ewiz_model_valid_r2' == 1 local starsmain_r2 = cond(`pmain_r2'<0.01,"***",cond(`pmain_r2'<0.05,"**",cond(`pmain_r2'<0.1,"*","")))
if `__ewiz_model_valid_r2' == 1 {
    capture post `ewiz_sum' ("(R2)") (_b[`iv_w']) (_se[`iv_w']) (`pmain_r2') ("`starsmain_r2'") (`N_r2') (`R2_r2')
    if _rc post `ewiz_sum' ("(R2)") (.) (.) (.) ("") (`N_r2') (`R2_r2')
}
else post `ewiz_sum' ("(R2)") (.) (.) (.) ("") (`N_r2') (`R2_r2')
if `__ewiz_model_valid_r2' == 1 {
    local __included_vars "`iv_w' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
    foreach v of local allvars {
        local __is_included = 0
        foreach __inc of local __included_vars {
            if "`v'" == "`__inc'" local __is_included = 1
        }
        if `__is_included' == 0 continue
        local model_term "`v'"
        if "`v'" == "`ivvar'" local model_term "`iv_w'"
        capture scalar __ewiz_term_b = _b[`model_term']
        capture scalar __ewiz_term_se = _se[`model_term']
        if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
            local cell_r2_`v' "omitted"
            local secell_r2_`v' "(absorbed)"
            continue
        }
        capture local __coef : display %9.4f __ewiz_term_b
        capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
        if !_rc {
            capture local __se : display %9.4f __ewiz_term_se
            capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
            local __p = .
            local __stars ""
            capture local __df = e(df_r)
            capture local __z = abs(_b[`model_term'] / _se[`model_term'])
            if !_rc {
                if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
                else local __p = 2*ttail(`__df', `__z')
            }
            if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
            local cell_r2_`v' "`__coef'`__stars'"
            local secell_r2_`v' "(`__se')"
        }
    }
}
di "(R2): `iv_w'=`coefmain_r2'`starsmain_r2' (se=`semain_r2', p=`pmain_r2'), N=`N_r2', R2=`R2_r2'"
tempvar dv_w5 iv_w5
gen double `dv_w5' = `dvvar'
gen double `iv_w5' = `ivvar'
quietly summarize `dvvar', detail
local dv_p5 = r(p5)
local dv_p95 = r(p95)
replace `dv_w5' = `dv_p5' if !missing(`dv_w5') & `dv_w5' < `dv_p5'
replace `dv_w5' = `dv_p95' if !missing(`dv_w5') & `dv_w5' > `dv_p95'
quietly summarize `ivvar', detail
local iv_p5 = r(p5)
local iv_p95 = r(p95)
replace `iv_w5' = `iv_p5' if !missing(`iv_w5') & `iv_w5' < `iv_p5'
replace `iv_w5' = `iv_p95' if !missing(`iv_w5') & `iv_w5' > `iv_p95'
quietly areg `dv_w5' `iv_w5' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`fe_time', absorb(`absorb_id') vce(robust)
capture local N_r3 : display %12.0f e(N)
if "`N_r3'"=="" local N_r3 = .
capture local R2_r3 : display %9.4f e(r2)
if "`R2_r3'"=="" local R2_r3 = .
capture scalar __ewiz_main_b_r3 = _b[`iv_w5']
capture scalar __ewiz_main_se_r3 = _se[`iv_w5']
capture local coefmain_r3 : display %9.4f __ewiz_main_b_r3
capture if abs(__ewiz_main_b_r3) < 0.00005 & __ewiz_main_b_r3 != 0 local coefmain_r3 : display %9.2e __ewiz_main_b_r3
capture local semain_r3 : display %9.4f __ewiz_main_se_r3
capture if abs(__ewiz_main_se_r3) < 0.00005 & __ewiz_main_se_r3 != 0 local semain_r3 : display %9.2e __ewiz_main_se_r3
local pmain_r3 = .
local starsmain_r3 ""
local __ewiz_model_valid_r3 = 1
capture scalar __ewiz_coef_abs_r3 = abs(_b[`iv_w5'])
if _rc local __ewiz_model_valid_r3 = 0
capture scalar __ewiz_se_abs_r3 = abs(_se[`iv_w5'])
if _rc local __ewiz_model_valid_r3 = 0
capture if scalar(__ewiz_se_abs_r3) <= 1e-12 local __ewiz_model_valid_r3 = 0
capture if scalar(__ewiz_coef_abs_r3) <= 1e-12 & scalar(__ewiz_se_abs_r3) <= 1e-12 local __ewiz_model_valid_r3 = 0
capture local __df_r3 = e(df_r)
capture local __z_r3 = abs(_b[`iv_w5'] / _se[`iv_w5'])
if !_rc & `__ewiz_model_valid_r3' == 1 {
    if "`__df_r3'"=="" | "`__df_r3'"=="." local pmain_r3 = 2*(1-normal(`__z_r3'))
    else local pmain_r3 = 2*ttail(`__df_r3', `__z_r3')
}
if `__ewiz_model_valid_r3' == 1 local starsmain_r3 = cond(`pmain_r3'<0.01,"***",cond(`pmain_r3'<0.05,"**",cond(`pmain_r3'<0.1,"*","")))
if `__ewiz_model_valid_r3' == 1 {
    capture post `ewiz_sum' ("(R3)") (_b[`iv_w5']) (_se[`iv_w5']) (`pmain_r3') ("`starsmain_r3'") (`N_r3') (`R2_r3')
    if _rc post `ewiz_sum' ("(R3)") (.) (.) (.) ("") (`N_r3') (`R2_r3')
}
else post `ewiz_sum' ("(R3)") (.) (.) (.) ("") (`N_r3') (`R2_r3')
if `__ewiz_model_valid_r3' == 1 {
    local __included_vars "`iv_w5' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
    foreach v of local allvars {
        local __is_included = 0
        foreach __inc of local __included_vars {
            if "`v'" == "`__inc'" local __is_included = 1
        }
        if `__is_included' == 0 continue
        local model_term "`v'"
        if "`v'" == "`ivvar'" local model_term "`iv_w5'"
        capture scalar __ewiz_term_b = _b[`model_term']
        capture scalar __ewiz_term_se = _se[`model_term']
        if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
            local cell_r3_`v' "omitted"
            local secell_r3_`v' "(absorbed)"
            continue
        }
        capture local __coef : display %9.4f __ewiz_term_b
        capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
        if !_rc {
            capture local __se : display %9.4f __ewiz_term_se
            capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
            local __p = .
            local __stars ""
            capture local __df = e(df_r)
            capture local __z = abs(_b[`model_term'] / _se[`model_term'])
            if !_rc {
                if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
                else local __p = 2*ttail(`__df', `__z')
            }
            if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
            local cell_r3_`v' "`__coef'`__stars'"
            local secell_r3_`v' "(`__se')"
        }
    }
}
di "(R3): `iv_w5'=`coefmain_r3'`starsmain_r3' (se=`semain_r3', p=`pmain_r3'), N=`N_r3', R2=`R2_r3'"
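tempvar _resid_keep
quietly areg `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`fe_time', absorb(`absorb_id') vce(robust)
* Caution: predict after areg does not appear to offer rstudent, so this capture is
* expected to fail; in that case no observations are dropped and (R4) re-estimates
* the full sample with robust standard errors.
capture predict double `_resid_keep', rstudent
if _rc == 0 {
    replace `_resid_keep' = . if abs(`_resid_keep') > 3
}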
preserve
capture drop if missing(`_resid_keep')
quietly areg `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`fe_time', absorb(`absorb_id') vce(robust)
restore
capture local N_r4 : display %12.0f e(N)
if "`N_r4'"=="" local N_r4 = .
capture local R2_r4 : display %9.4f e(r2)
if "`R2_r4'"=="" local R2_r4 = .
capture scalar __ewiz_main_b_r4 = _b[`ivvar']
capture scalar __ewiz_main_se_r4 = _se[`ivvar']
capture local coefmain_r4 : display %9.4f __ewiz_main_b_r4
capture if abs(__ewiz_main_b_r4) < 0.00005 & __ewiz_main_b_r4 != 0 local coefmain_r4 : display %9.2e __ewiz_main_b_r4
capture local semain_r4 : display %9.4f __ewiz_main_se_r4
capture if abs(__ewiz_main_se_r4) < 0.00005 & __ewiz_main_se_r4 != 0 local semain_r4 : display %9.2e __ewiz_main_se_r4
local pmain_r4 = .
local starsmain_r4 ""
local __ewiz_model_valid_r4 = 1
capture scalar __ewiz_coef_abs_r4 = abs(_b[`ivvar'])
if _rc local __ewiz_model_valid_r4 = 0
capture scalar __ewiz_se_abs_r4 = abs(_se[`ivvar'])
if _rc local __ewiz_model_valid_r4 = 0
capture if scalar(__ewiz_se_abs_r4) <= 1e-12 local __ewiz_model_valid_r4 = 0
capture if scalar(__ewiz_coef_abs_r4) <= 1e-12 & scalar(__ewiz_se_abs_r4) <= 1e-12 local __ewiz_model_valid_r4 = 0
capture local __df_r4 = e(df_r)
capture local __z_r4 = abs(_b[`ivvar'] / _se[`ivvar'])
if !_rc & `__ewiz_model_valid_r4' == 1 {
    if "`__df_r4'"=="" | "`__df_r4'"=="." local pmain_r4 = 2*(1-normal(`__z_r4'))
    else local pmain_r4 = 2*ttail(`__df_r4', `__z_r4')
}
if `__ewiz_model_valid_r4' == 1 local starsmain_r4 = cond(`pmain_r4'<0.01,"***",cond(`pmain_r4'<0.05,"**",cond(`pmain_r4'<0.1,"*","")))
if `__ewiz_model_valid_r4' == 1 {
    capture post `ewiz_sum' ("(R4)") (_b[`ivvar']) (_se[`ivvar']) (`pmain_r4') ("`starsmain_r4'") (`N_r4') (`R2_r4')
    if _rc post `ewiz_sum' ("(R4)") (.) (.) (.) ("") (`N_r4') (`R2_r4')
}
else post `ewiz_sum' ("(R4)") (.) (.) (.) ("") (`N_r4') (`R2_r4')
if `__ewiz_model_valid_r4' == 1 {
    local __included_vars "`ivvar' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
    foreach v of local allvars {
        local __is_included = 0
        foreach __inc of local __included_vars {
            if "`v'" == "`__inc'" local __is_included = 1
        }
        if `__is_included' == 0 continue
        local model_term "`v'"
        if "`v'" == "`ivvar'" local model_term "`ivvar'"
        capture scalar __ewiz_term_b = _b[`model_term']
        capture scalar __ewiz_term_se = _se[`model_term']
        if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
            local cell_r4_`v' "omitted"
            local secell_r4_`v' "(absorbed)"
            continue
        }
        capture local __coef : display %9.4f __ewiz_term_b
        capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
        if !_rc {
            capture local __se : display %9.4f __ewiz_term_se
            capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
            local __p = .
            local __stars ""
            capture local __df = e(df_r)
            capture local __z = abs(_b[`model_term'] / _se[`model_term'])
            if !_rc {
                if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
                else local __p = 2*ttail(`__df', `__z')
            }
            if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
            local cell_r4_`v' "`__coef'`__stars'"
            local secell_r4_`v' "(`__se')"
        }
    }
}
di "(R4): `ivvar'=`coefmain_r4'`starsmain_r4' (se=`semain_r4', p=`pmain_r4'), N=`N_r4', R2=`R2_r4'"
capture noisily poisson `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`timevar' i.`industry_fe', vce(cluster `cluster_id')
if _rc != 0 {
    capture noisily nbreg `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`timevar' i.`industry_fe', vce(cluster `cluster_id')
}
capture local N_r7 : display %12.0f e(N)
if "`N_r7'"=="" local N_r7 = .
capture local R2_r7 : display %9.4f e(r2)
if "`R2_r7'"=="" local R2_r7 = .
capture scalar __ewiz_main_b_r7 = _b[`ivvar']
capture scalar __ewiz_main_se_r7 = _se[`ivvar']
capture local coefmain_r7 : display %9.4f __ewiz_main_b_r7
capture if abs(__ewiz_main_b_r7) < 0.00005 & __ewiz_main_b_r7 != 0 local coefmain_r7 : display %9.2e __ewiz_main_b_r7
capture local semain_r7 : display %9.4f __ewiz_main_se_r7
capture if abs(__ewiz_main_se_r7) < 0.00005 & __ewiz_main_se_r7 != 0 local semain_r7 : display %9.2e __ewiz_main_se_r7
local pmain_r7 = .
local starsmain_r7 ""
local __ewiz_model_valid_r7 = 1
capture scalar __ewiz_coef_abs_r7 = abs(_b[`ivvar'])
if _rc local __ewiz_model_valid_r7 = 0
capture scalar __ewiz_se_abs_r7 = abs(_se[`ivvar'])
if _rc local __ewiz_model_valid_r7 = 0
capture if scalar(__ewiz_se_abs_r7) <= 1e-12 local __ewiz_model_valid_r7 = 0
capture if scalar(__ewiz_coef_abs_r7) <= 1e-12 & scalar(__ewiz_se_abs_r7) <= 1e-12 local __ewiz_model_valid_r7 = 0
capture local __df_r7 = e(df_r)
capture local __z_r7 = abs(_b[`ivvar'] / _se[`ivvar'])
if !_rc & `__ewiz_model_valid_r7' == 1 {
    if "`__df_r7'"=="" | "`__df_r7'"=="." local pmain_r7 = 2*(1-normal(`__z_r7'))
    else local pmain_r7 = 2*ttail(`__df_r7', `__z_r7')
}
if `__ewiz_model_valid_r7' == 1 local starsmain_r7 = cond(`pmain_r7'<0.01,"***",cond(`pmain_r7'<0.05,"**",cond(`pmain_r7'<0.1,"*","")))
if `__ewiz_model_valid_r7' == 1 {
    capture post `ewiz_sum' ("(R7)") (_b[`ivvar']) (_se[`ivvar']) (`pmain_r7') ("`starsmain_r7'") (`N_r7') (`R2_r7')
    if _rc post `ewiz_sum' ("(R7)") (.) (.) (.) ("") (`N_r7') (`R2_r7')
}
else post `ewiz_sum' ("(R7)") (.) (.) (.) ("") (`N_r7') (`R2_r7')
if `__ewiz_model_valid_r7' == 1 {
    local __included_vars "`ivvar' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
    foreach v of local allvars {
        local __is_included = 0
        foreach __inc of local __included_vars {
            if "`v'" == "`__inc'" local __is_included = 1
        }
        if `__is_included' == 0 continue
        local model_term "`v'"
        if "`v'" == "`ivvar'" local model_term "`ivvar'"
        capture scalar __ewiz_term_b = _b[`model_term']
        capture scalar __ewiz_term_se = _se[`model_term']
        if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
            local cell_r7_`v' "omitted"
            local secell_r7_`v' "(absorbed)"
            continue
        }
        capture local __coef : display %9.4f __ewiz_term_b
        capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
        if !_rc {
            capture local __se : display %9.4f __ewiz_term_se
            capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
            local __p = .
            local __stars ""
            capture local __df = e(df_r)
            capture local __z = abs(_b[`model_term'] / _se[`model_term'])
            if !_rc {
                if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
                else local __p = 2*ttail(`__df', `__z')
            }
            if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
            local cell_r7_`v' "`__coef'`__stars'"
            local secell_r7_`v' "(`__se')"
        }
    }
}
di "(R7): `ivvar'=`coefmain_r7'`starsmain_r7' (se=`semain_r7', p=`pmain_r7'), N=`N_r7', R2=`R2_r7'"
postclose `ewiz_sum'
capture file close f_robust
file open f_robust using "$JOB_DIR/regression_table_稳健性检验.csv", write replace
file write f_robust "Variable,(R1),(R2),(R3),(R4),(R7)" _n
file write f_robust ",Two-way clustering (firm + time),1% Winsor,5% Winsor,Drop outlying residuals |rstudent|>3,Poisson/NB count model" _n
file write f_robust "dfi_index,`cell_r1_dfi_index',`cell_r2_dfi_index',`cell_r3_dfi_index',`cell_r4_dfi_index',`cell_r7_dfi_index'" _n
file write f_robust ",`secell_r1_dfi_index',`secell_r2_dfi_index',`secell_r3_dfi_index',`secell_r4_dfi_index',`secell_r7_dfi_index'" _n
file write f_robust "roa,`cell_r1_roa',`cell_r2_roa',`cell_r3_roa',`cell_r4_roa',`cell_r7_roa'" _n
file write f_robust ",`secell_r1_roa',`secell_r2_roa',`secell_r3_roa',`secell_r4_roa',`secell_r7_roa'" _n
file write f_robust "lev,`cell_r1_lev',`cell_r2_lev',`cell_r3_lev',`cell_r4_lev',`cell_r7_lev'" _n
file write f_robust ",`secell_r1_lev',`secell_r2_lev',`secell_r3_lev',`secell_r4_lev',`secell_r7_lev'" _n
file write f_robust "size,`cell_r1_size',`cell_r2_size',`cell_r3_size',`cell_r4_size',`cell_r7_size'" _n
file write f_robust ",`secell_r1_size',`secell_r2_size',`secell_r3_size',`secell_r4_size',`secell_r7_size'" _n
file write f_robust "growth,`cell_r1_growth',`cell_r2_growth',`cell_r3_growth',`cell_r4_growth',`cell_r7_growth'" _n
file write f_robust ",`secell_r1_growth',`secell_r2_growth',`secell_r3_growth',`secell_r4_growth',`secell_r7_growth'" _n
file write f_robust "cashflow,`cell_r1_cashflow',`cell_r2_cashflow',`cell_r3_cashflow',`cell_r4_cashflow',`cell_r7_cashflow'" _n
file write f_robust ",`secell_r1_cashflow',`secell_r2_cashflow',`secell_r3_cashflow',`secell_r4_cashflow',`secell_r7_cashflow'" _n
file write f_robust "tobinq,`cell_r1_tobinq',`cell_r2_tobinq',`cell_r3_tobinq',`cell_r4_tobinq',`cell_r7_tobinq'" _n
file write f_robust ",`secell_r1_tobinq',`secell_r2_tobinq',`secell_r3_tobinq',`secell_r4_tobinq',`secell_r7_tobinq'" _n
file write f_robust "top1,`cell_r1_top1',`cell_r2_top1',`cell_r3_top1',`cell_r4_top1',`cell_r7_top1'" _n
file write f_robust ",`secell_r1_top1',`secell_r2_top1',`secell_r3_top1',`secell_r4_top1',`secell_r7_top1'" _n
file write f_robust "dual,`cell_r1_dual',`cell_r2_dual',`cell_r3_dual',`cell_r4_dual',`cell_r7_dual'" _n
file write f_robust ",`secell_r1_dual',`secell_r2_dual',`secell_r3_dual',`secell_r4_dual',`secell_r7_dual'" _n
file write f_robust "board,`cell_r1_board',`cell_r2_board',`cell_r3_board',`cell_r4_board',`cell_r7_board'" _n
file write f_robust ",`secell_r1_board',`secell_r2_board',`secell_r3_board',`secell_r4_board',`secell_r7_board'" _n
file write f_robust "indep,`cell_r1_indep',`cell_r2_indep',`cell_r3_indep',`cell_r4_indep',`cell_r7_indep'" _n
file write f_robust ",`secell_r1_indep',`secell_r2_indep',`secell_r3_indep',`secell_r4_indep',`secell_r7_indep'" _n
file write f_robust "soe,`cell_r1_soe',`cell_r2_soe',`cell_r3_soe',`cell_r4_soe',`cell_r7_soe'" _n
file write f_robust ",`secell_r1_soe',`secell_r2_soe',`secell_r3_soe',`secell_r4_soe',`secell_r7_soe'" _n
file write f_robust "age,`cell_r1_age',`cell_r2_age',`cell_r3_age',`cell_r4_age',`cell_r7_age'" _n
file write f_robust ",`secell_r1_age',`secell_r2_age',`secell_r3_age',`secell_r4_age',`secell_r7_age'" _n
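* The descriptive rows below are hard-coded labels. For (R7) the estimation call above is
* poisson/nbreg with year and industry dummies and firm-clustered standard errors.
file write f_robust "Core explanatory variable,dfi_index,dfi_index,dfi_index,dfi_index,dfi_index" _n
file write f_robust "Estimator,OLS/FE,OLS/FE,OLS/FE,OLS/FE,OLS/FE" _n
file write f_robust "Controls,Yes,Yes,Yes,Yes,Yes" _n
file write f_robust "Firm fixed effects,Yes,Yes,Yes,Yes,No" _n
file write f_robust "Time fixed effects,Yes,Yes,Yes,Yes,No" _n
file write f_robust "Standard errors,Two-way clustering (firm + time),Robust,Robust,Robust,Robust" _n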
file write f_robust "N,`N_r1',`N_r2',`N_r3',`N_r4',`N_r7'" _n
file write f_robust "R²,`R2_r1',`R2_r2',`R2_r3',`R2_r4',`R2_r7'" _n
file close f_robust
preserve
use `ewiz_summary', clear
export delimited using "$JOB_DIR/regression_table.csv", replace
drop if missing(coef)
if _N == 0 {
    capture file close f_base
    tempname __ewz_skipfh
    file open `__ewz_skipfh' using "$JOB_DIR/regression_table_稳健性检验.csv", write replace
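    file write `__ewz_skipfh' "Status,Value" _n
    file write `__ewz_skipfh' "Status,skipped" _n
    file write `__ewz_skipfh' "Reason,All specifications failed to estimate (no variation in the dependent variable / perfectly collinear controls / insufficient sample etc.); no valid regression table could be produced" _n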
    file close `__ewz_skipfh'
    di as error "[regression] all specs degenerate; wrote skipped marker"
}
if _N > 0 {
    gen spec_order = _n
    gen lb = coef - 1.96 * se
    gen ub = coef + 1.96 * se
    twoway (rcap ub lb spec_order, lcolor(gs8)) (scatter coef spec_order, mcolor(navy) msymbol(D)), yline(0, lpattern(dash) lcolor(maroon)) xlabel(1 "(R1)" 2 "(R2)" 3 "(R3)" 4 "(R4)" 5 "(R7)", angle(45) labsize(small)) xtitle("Specification") ytitle("Coefficient on `ivvar'") title("Specification Comparison")
    capture graph export "$JOB_DIR/robustness_coef_plot.png", replace width(1800)
}
restore
di "[EWIZ_STATS] target=dfi_index;coef=`coefmain_r7';se=`semain_r7';pvalue=`pmain_r7';nobs=`N_r7'"
di "Regression table written"

* ── Placebo test: randomized IV, 500 iterations ──
tempname ph_fh
capture file close `ph_fh'
file open `ph_fh' using "$JOB_DIR/placebo_results.csv", write replace
file write `ph_fh' "iter,placebo_coef,placebo_p" _n
set seed 20250101
local placebo_iters = 500
local true_coef = `coefmain_r7'
local above_count = 0
preserve
forval __ph = 1/`placebo_iters' {
    capture drop __ph_iv
    gen double __ph_iv = runiform()
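    * Rescale the uniform draw to the observed [min, max] range of the real IV
    * (this matches the range of the real IV, not the shape of its distribution)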
    quietly summarize `ivvar', detail
    local __ph_min = r(min)
    local __ph_max = r(max)
    replace __ph_iv = `__ph_min' + (`__ph_max' - `__ph_min') * __ph_iv
    capture quietly reg `dvvar' __ph_iv `controls', vce(robust)
    if _rc == 0 {
        local __ph_c = _b[__ph_iv]
        local __ph_se = _se[__ph_iv]
        local __ph_p = 2*(1 - normal(abs(`__ph_c'/`__ph_se')))
        if abs(`__ph_c') >= abs(`true_coef') {
            local above_count = `above_count' + 1
        }
        file write `ph_fh' "`__ph',`__ph_c',`__ph_p'" _n
    }
}
restore
file close `ph_fh'
local placebo_p_emp = `above_count' / `placebo_iters'
tempname ph_summary
capture file close `ph_summary'
file open `ph_summary' using "$JOB_DIR/placebo_summary.csv", write replace
file write `ph_summary' "Metric,Value" _n
file write `ph_summary' "True IV coefficient,`true_coef'" _n
file write `ph_summary' "Placebo iterations,`placebo_iters'" _n
file write `ph_summary' "Draws with |placebo coef| >= |true coef|,`above_count'" _n
file write `ph_summary' "Empirical p-value (two-sided),`placebo_p_emp'" _n
file close `ph_summary'
di "Placebo test finished: empirical p = `placebo_p_emp'"

* ── Economic significance: standardized β and elasticity ──
quietly summarize `ivvar', detail
local iv_sd = r(sd)
local iv_mean = r(mean)
quietly summarize `dvvar', detail
local dv_sd = r(sd)
local dv_mean = r(mean)
local std_beta = (`true_coef' * `iv_sd') / `dv_sd'
local elasticity = (`true_coef' * `iv_mean') / `dv_mean'
tempname es_fh
capture file close `es_fh'
file open `es_fh' using "$JOB_DIR/economic_significance.csv", write replace
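file write `es_fh' "Metric,Value,Interpretation" _n
file write `es_fh' "Raw coefficient,`true_coef'," _n
file write `es_fh' "IV standard deviation,`iv_sd'," _n
file write `es_fh' "DV standard deviation,`dv_sd'," _n
file write `es_fh' "Standardized β,`std_beta',Change in DV (in standard deviations) per one-standard-deviation change in IV" _n
* file write copies string literals verbatim, so a single % is enough here
file write `es_fh' "Elasticity at the sample mean,`elasticity',Percent change in DV per 1% change in IV" _n
file close `es_fh'
di "Economic significance: standardized β = `std_beta'; elasticity at the mean = `elasticity'"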
log close
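
The p-value and star logic that the code above repeats for every specification can be isolated in a few lines. This is a sketch under stated assumptions: the coefficient and standard error are the (R2) values from regression_table.csv on this page, while df = 500 is a made-up residual degrees of freedom, since the page does not report e(df_r).

```stata
* (R2) coefficient and standard error from regression_table.csv
scalar b_hat  = 0.5530192
scalar se_hat = 0.045082174
scalar t_abs  = abs(b_hat / se_hat)

* Assumed value for illustration only; the real e(df_r) is not shown on the page
scalar df = 500

* t-based p-value when e(df_r) exists, normal-based otherwise (as in the code above)
scalar p_t = 2 * ttail(df, t_abs)
scalar p_z = 2 * (1 - normal(t_abs))

* Same star thresholds as the page's code: 1%, 5%, 10%
local stars = cond(p_t < 0.01, "***", cond(p_t < 0.05, "**", cond(p_t < 0.1, "*", "")))
display "|t| = " %6.2f t_abs "   p(t) = " %9.2e p_t "   p(z) = " %9.2e p_z "   stars = `stars'"
```

For a test statistic this large, 1 - normal(z) rounds to zero in double precision. That is a likely reason why (R7) shows a pvalue of exactly 0 in regression_table.csv: poisson leaves e(df_r) missing, so the code takes the normal branch. Writing the tail as 2 * normal(-z) would keep the small value.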

6. Actual output table

This table is the case output file used by this method page, saved at marketing/method_case_assets/robustness/result.csv.

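| Variable | (R1) | (R2) | (R3) | (R4) | (R7) |
| --- | --- | --- | --- | --- | --- |
| | Two-way clustering (firm + time) | 1% Winsor | 5% Winsor | Drop outlying residuals (absolute rstudent > 3) | Poisson/NB count model |
| dfi_index | 0.5660*** | 0.5530*** | 0.5583*** | 0.5660*** | 0.2957*** |
| | (0.0639) | (0.0451) | (0.0471) | (0.0460) | (0.0201) |
| roa | 0.3040*** | 0.2924*** | 0.2729*** | 0.3040*** | 0.1482*** |
| | (0.0458) | (0.0436) | (0.0409) | (0.0449) | (0.0191) |
| lev | -0.0548 | -0.0545 | -0.0614 | -0.0548 | -0.0132 |
| | (0.0488) | (0.0456) | (0.0437) | (0.0473) | (0.0231) |
| size | 0.2375*** | 0.2323*** | 0.2186*** | 0.2375*** | 0.1092*** |
| | (0.0460) | (0.0458) | (0.0443) | (0.0461) | (0.0223) |
| growth | -0.0819 | -0.0857* | -0.0804* | -0.0819 | -0.0198 |
| | (0.0715) | (0.0494) | (0.0471) | (0.0509) | (0.0232) |
| cashflow | 0.0407 | 0.0406 | 0.0379 | 0.0407 | 0.0197 |
| | (0.0455) | (0.0450) | (0.0442) | (0.0456) | (0.0215) |
| tobinq | 0.0247 | 0.0243 | 0.0078 | 0.0247 | 0.0146 |
| | (0.0374) | (0.0477) | (0.0457) | (0.0488) | (0.0208) |
| top1 | -0.0166 | -0.0271 | -0.0363 | -0.0166 | -0.0230 |
| | (0.0460) | (0.0491) | (0.0474) | (0.0498) | (0.0229) |
| dual | -0.0272 | -0.0287 | -0.0364 | -0.0272 | -0.0032 |
| | (0.0513) | (0.0448) | (0.0435) | (0.0459) | (0.0197) |
| board | -0.0383 | -0.0302 | -0.0218 | -0.0383 | -0.0021 |
| | (0.0657) | (0.0434) | (0.0427) | (0.0447) | (0.0242) |
| indep | -0.0309 | -0.0268 | -0.0167 | -0.0309 | -0.0075 |
| | (0.0360) | (0.0480) | (0.0461) | (0.0491) | (0.0208) |
| soe | 0.0025 | 0.0023 | -0.0013 | 0.0025 | -0.0055 |
| | (0.0397) | (0.0491) | (0.0469) | (0.0503) | (0.0236) |
| age | -0.0539 | -0.0523 | -0.0428 | -0.0539 | -0.0061 |
| | (0.0414) | (0.0444) | (0.0433) | (0.0450) | (0.0222) |
| Core explanatory variable | dfi_index | dfi_index | dfi_index | dfi_index | dfi_index |
| Estimator | OLS/FE | OLS/FE | OLS/FE | OLS/FE | OLS/FE |
| Controls | Yes | Yes | Yes | Yes | Yes |
| Firm fixed effects | Yes | Yes | Yes | Yes | No |
| Time fixed effects | Yes | Yes | Yes | Yes | No |
| Standard errors | Two-way clustering (firm + time) | Robust | Robust | Robust | Robust |
| N | 720 | 720 | 720 | 720 | 720 |
| R² | 0.4098 | 0.4034 | 0.3885 | 0.4098 | . |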

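Supplementary output

The files below come from the same case run or smoke-test output and fill in diagnostic information beyond the main table.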

economic_significance.csv

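| Metric | Value | Interpretation |
| --- | --- | --- |
| Raw coefficient | .2957 | |
| IV standard deviation | 1.014028344931995 | |
| DV standard deviation | 1.346176410712608 | |
| Standardized β | .2227406298388962 | Change in DV, in standard deviations, per one-standard-deviation change in IV |
| Elasticity at the sample mean | -.0092732140896711 | Percent change in DV per 1% change in IV |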
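
The standardized β row can be reproduced from the other rows of this file with the formula used in the code, coefficient × sd(IV) / sd(DV). A minimal sketch, with illustrative scalar names:

```stata
* Values copied from economic_significance.csv on this page
scalar b_raw = 0.2957
scalar sd_iv = 1.014028344931995
scalar sd_dv = 1.346176410712608

* Standardized beta: effect of a one-SD change in the IV, measured in SDs of the DV
scalar std_beta = b_raw * sd_iv / sd_dv
display "standardized beta = " %9.6f std_beta
```

The elasticity row deserves more caution. It is negative although the coefficient is positive, which can only happen when the sample means of the IV and the DV have opposite signs. With variables that are roughly centered on zero, an elasticity evaluated at the mean is not informative.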

placebo_results.csv

| iter | placebo_coef | placebo_p |
| --- | --- | --- |
| 1 | .0422755054569741 | .0970925062563102 |
| 2 | -.0065886697061428 | .8008666111798852 |
| 3 | .0040773466649033 | .8727249389093115 |
| 4 | .0367103784152566 | .1388927961891919 |
| 5 | .0338720727626226 | .1744955386427338 |
| 6 | .0316865554752359 | .2420682323029761 |
| 7 | -.045687137571697 | .0695783024294538 |
| 8 | -.0421014733843154 | .0931455454485262 |
| 9 | -.0005652588075882 | .9820618340989864 |
| 10 | .0150228098974096 | .544185211418162 |
| 11 | .012211326861772 | .6467912938829625 |
| 12 | -.0494454278266741 | .0440870374472586 |
| 13 | .0350923725880614 | .1806690339590573 |
| 14 | -.0055093124090006 | .8300929012336378 |
| 15 | .0278231223840121 | .2654802124103861 |
| 16 | .005068688753756 | .8444113990905013 |
| 17 | .0179324524164106 | .468420924875327 |
| 18 | -.029792581403417 | .2489301642154378 |
| 19 | .0063703468936055 | .8019263946215645 |
| 20 | .0121415432401664 | .628011576333392 |
| 21 | .0257868915876518 | .2964418241316977 |
| 22 | -.0249144499412879 | .3311852520231255 |
| 23 | .0228448723389608 | .3848312934454989 |
| 24 | .0197430782200771 | .4419589760747307 |
| 25 | -.033662822970117 | .1868657196671086 |
| 26 | .006494889433034 | .8061992116001064 |
| 27 | -.0011002707828248 | .9670925975219595 |
| 28 | -.0105787285247736 | .6757942035032509 |
| 29 | .0154102174570747 | .5549087014038609 |
| 30 | -.018559019091525 | .4633283943526829 |
| 31 | .01748290355602 | .4908557570304632 |
| 32 | .0205975826156778 | .4094703461632099 |
| 33 | .0320784798988639 | .2464671443443844 |
| 34 | -.0104049567516386 | .6795776080894909 |
| 35 | .0137289053644033 | .5942775899892003 |
| 36 | -.0015537527351706 | .9531343678062938 |

Only the first 36 rows are shown here; the complete file is saved in the asset directory.

placebo_summary.csv

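| Metric | Value |
| --- | --- |
| True IV coefficient | .2957 |
| Placebo iterations | 500 |
| Draws with absolute placebo coefficient ≥ absolute true coefficient | 0 |
| Empirical p-value (two-sided) | 0 |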
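
The empirical p-value in this file is the share of placebo draws whose absolute coefficient is at least as large as the true one. The page's code builds each placebo IV from a uniform draw over the real IV's range. The sketch below swaps in a different technique, a permutation of the real values, which keeps the IV's distribution exactly. It runs on Stata's built-in auto dataset, and all names in it are illustrative assumptions.

```stata
* Sketch only: permutation placebo on the built-in auto data
sysuse auto, clear
set seed 20250101
generate long obs_id = _n

quietly regress price mpg weight, vce(robust)
scalar true_b = _b[mpg]

local iters = 200
local above = 0
forvalues i = 1/`iters' {
    * Build a random permutation of 1.._N, then reshuffle mpg with it
    quietly generate double u = runiform()
    sort u
    quietly generate long pick = _n
    sort obs_id
    quietly generate double mpg_placebo = mpg[pick]

    quietly regress price mpg_placebo weight, vce(robust)
    if abs(_b[mpg_placebo]) >= abs(true_b) local ++above
    drop u pick mpg_placebo
}
display "empirical p = " `above' / `iters'
```

Whichever scheme is used, the placebo regression should share its model with the coefficient it is compared against, so that the two sets of coefficients are on the same scale.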

regression_table.csv

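| spec | coef | se | pvalue | stars | N | R2 |
| --- | --- | --- | --- | --- | --- | --- |
| (R1) | .566040838796411 | .063915245 | .00030515241 | *** | 720 | .40979999 |
| (R2) | .5530192008772081 | .045082174 | 6.2751406e-31 | *** | 720 | .4034 |
| (R3) | .5582563962233911 | .047093663 | 3.4581854e-29 | *** | 720 | .38850001 |
| (R4) | .5660408387964105 | .04600032 | 4.3109577e-31 | *** | 720 | .40979999 |
| (R7) | .29570926818386 | .020119438 | 0 | *** | 720 | |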
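
The coefficient plot is drawn from these rows using coef ± 1.96 × se as the interval, as in the section 5 code. A minimal sketch that rebuilds the bounds from the values above (the input block simply retypes this table):

```stata
* Rows retyped from regression_table.csv on this page
clear
input str4 spec double(coef se)
"(R1)" .566040838796411  .063915245
"(R2)" .5530192008772081 .045082174
"(R3)" .5582563962233911 .047093663
"(R4)" .5660408387964105 .04600032
"(R7)" .29570926818386   .020119438
end

* Normal-approximation 95% interval, the same formula the plot code uses
generate double lb = coef - 1.96 * se
generate double ub = coef + 1.96 * se
list spec coef lb ub, noobs
```

All five intervals lie above zero. The (R7) interval is on the scale of a count model, so its level is not directly comparable with the linear columns.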
robustness_coef_plot.png

7. Case figure

This is an in-page diagnostic figure generated from the same case data.

Shared-case output figure for the robustness checks.

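8. How to write it up in a paper

This paper reports robustness checks on the shared firm panel sample; the core output is in regression_table_稳健性检验.csv. When interpreting the results, consider the sample definition, variable construction, coefficient direction, standard errors, and applicability conditions together, rather than choosing a method on the strength of a single p-value.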

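9. Checklist

  • Confirm that the dependent variable, core explanatory variable, and controls used on this page match the paper's main model.
  • Read the sample definition in the table first, then the coefficients, p-values, or diagnostic statistics.
  • The output file names in the code must correspond to the result tables shown on the web page.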
