Method 10 · heterogeneity

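Heterogeneity Analysis

Producing interpretable model evidence from a single shared firm panel

A Markdown-style tutorial on heterogeneity analysis: the code, result tables, and case figures are all generated from the shared CSMAR-style case.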


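I. What is heterogeneity analysis?

This page is the method documentation for heterogeneity analysis. Every table and figure is generated by marketing/method_case_assets/generate_assets.py from the same csmar_innovation_realistic.csv, so the tutorial does not rely on placeholder images. The focus is on using a firm-year panel to explain model specification, fixed effects, clustered standard errors, and how to read the results.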

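II. Walk through the case

Before you start, confirm

  • Have a stable baseline regression first. If the average effect is not robust, do not rush into heterogeneity.
  • The grouping variable needs a theoretical meaning. This case uses SOE to separate state-owned firms from private firms.
  • Check the sample size of each group first. If a group is too small, its coefficient will be unstable.
  • Comparing "significant in this group, insignificant in that group" is not enough; ideally run a between-group difference test.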

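Order of operations

Step | What you do | What counts as done right
1. Define the groups | For example, SOE=1 for state-owned firms and SOE=0 for private firms. | The grouping must map onto a theoretical mechanism.
2. Run the regression by group | Estimate the coefficient of DFI on innovation separately for each group. | Look at sign and magnitude first.
3. Add the interaction term | Use dfi_index × SOE to test the between-group difference. | The interaction term is more reliable than comparing by eye.
4. Check sample sizes | Confirm that N is large enough in both groups. | Do not over-interpret a group whose N is too small.
5. Write the conclusion | State which group has the stronger effect and whether the difference is significant. | Do not just say "heterogeneity exists" without reporting a test.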
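
The steps above can be sketched on simulated data. Everything in this block is an assumption made for illustration: the variables firm_id, soe, dfi_index, and y, the 120 firms, and the data-generating coefficients are invented and are not the page's CSMAR-style sample. The block only shows the order of work: check group sizes, then fit one regression per group.

```stata
* Simulated firm-year panel; all names and numbers are illustrative only
clear
set seed 12345
set obs 120                          // 120 hypothetical firms
gen firm_id = _n
gen byte soe = mod(firm_id, 2)       // alternate SOE / non-SOE firms
expand 6                             // six yearly observations per firm
bysort firm_id: gen year = 2014 + _n
gen dfi_index = rnormal()
gen y = 0.5*dfi_index + 0.2*soe*dfi_index + rnormal()

* Check that neither group is too small before interpreting anything
tabulate soe

* One regression per group, standard errors clustered by firm
foreach g in 0 1 {
    quietly regress y dfi_index i.year if soe == `g', vce(cluster firm_id)
    display "SOE=`g': b = " %6.4f _b[dfi_index] "  se = " %6.4f _se[dfi_index] "  N = " e(N)
}
```

The two printed lines correspond to the two columns of a split-sample table. They show sign and magnitude per group, but they are not yet a test of whether the groups differ.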

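Code explained line by line

Code / command | What this line does
xtreg $y c.$x##i.soe ... | Estimates the DFI main effect and the DFI×SOE interaction in one model.
c.$x | Tells Stata that dfi_index is a continuous variable.
i.soe | Tells Stata that SOE is a group dummy.
lincom c.dfi_index#1.soe | Tests the coefficient difference of the SOE group relative to the private-firm group.
vce(cluster firm_id) | Keeps using standard errors clustered by firm.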
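
A minimal sketch of how these pieces fit together, again on simulated data. The data set, the firm_id and year identifiers, and the globals $y and $x are assumptions for illustration, not the page's actual inputs. The interaction is written as 1.soe#c.dfi_index, which is how Stata labels the coefficient in its output.

```stata
* Simulated panel; names and coefficients are illustrative only
clear
set seed 2024
set obs 120
gen firm_id = _n
gen byte soe = mod(firm_id, 2)
expand 6
bysort firm_id: gen year = 2014 + _n
gen dfi_index = rnormal()
gen y = 0.5*dfi_index + 0.2*soe*dfi_index + rnormal()
xtset firm_id year

global y "y"
global x "dfi_index"

* Main effect plus interaction with firm fixed effects.
* soe is time-invariant in this simulation, so its own main effect
* is absorbed by the firm effects; the interaction is still identified.
xtreg $y c.$x##i.soe i.year, fe vce(cluster firm_id)

* Between-group difference in the slope (SOE minus non-SOE)
lincom 1.soe#c.dfi_index

* Implied slope for the SOE group
lincom dfi_index + 1.soe#c.dfi_index
```

The first lincom is the between-group difference test; the second recovers the SOE-group slope from the pooled model so it can be reported next to the private-firm slope _b[dfi_index].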

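How to read the results table

Cell | How to read it
dfi_index, coefficients of the two groups | Check first whether the signs agree, then which group is larger.
N | The sample size of each group determines how credible the result is.
R² | A secondary check on model fit, not the core evidence of heterogeneity.
Between-group difference test | Only a small p-value gives reasonable confidence that the two groups' effects differ.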

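Most common mistakes

  • Do not claim heterogeneity just because one group is significant and the other is not.
  • Do not keep re-cutting the sample without a theoretical reason in search of a significant result.
  • Do not ignore small group sizes. An insignificant result in a small sample may only reflect low statistical power.

When you reproduce this yourself

Write two sentences when reproducing: the first reports the two group coefficients, the second reports the between-group difference test. Neither can be omitted.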

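III. Start with the conclusion of this case

  • In the split-sample regressions the dfi_index coefficients of the two groups are 0.5890*** and 0.5784***; check first that the signs agree, then discuss the size of the gap.
  • The group sample sizes are 360 / 360, and R² is 0.2840 / 0.3323.
  • A heterogeneity page cannot just paste two regression columns; it should also add an interaction term or a between-group difference test, otherwise the "difference" is only judged by eye.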
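
To see why two starred coefficients do not settle the question, it helps to write the gap down. The point difference here is 0.5890 − 0.5784 = 0.0106. One common approximation, stated here as an assumption and not as the test this page's pipeline runs, treats the two subsample estimates as independent and compares that gap with its standard error:

```latex
z \;=\; \frac{\hat\beta_{1} - \hat\beta_{2}}
             {\sqrt{\widehat{\mathrm{se}}_{1}^{\,2} + \widehat{\mathrm{se}}_{2}^{\,2}}},
\qquad
\hat\beta_{1} - \hat\beta_{2} = 0.5890 - 0.5784 = 0.0106
```

The bullets above do not report the two standard errors, so z cannot be evaluated from them; the formula only shows that the verdict depends on the standard errors as much as on the gap itself.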

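IV. Case definitions

Field | Definition
Data | CSMAR-style panel of A-share firm innovation
Raw sample | 196 listed firms, 2015-2020, roughly 1,200 firm-year observations; the effective sample for each method is the N reported in this page's output table
Dependent variable | patent_count; regression pages usually use ln(1 + patent_count)
Core explanatory variable | dfi_index, the digital financial inclusion index; some real smoke-test outputs show the standardized dfi_index
Control variables | roa, lev, size, growth, cashflow, tobinq, top1, dual, board, indep, soe, age
Output file | regression_table_异质性检验.csv
Required roles | dv, iv
Dependencies | The script as listed calls reghdfe, a community-contributed command that must be installed separately; the tutorial commands (xtreg, lincom) are built into Stata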
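
The two transformations mentioned in the table can be sketched as follows. The simulated values and the new variable names ln_patent and dfi_std are hypothetical and are not taken from the page's data or pipeline.

```stata
* Simulated inputs; distributions and names are illustrative only
clear
set seed 7
set obs 200
gen patent_count = rpoisson(3)          // non-negative counts, zeros allowed
gen dfi_index = 200 + 50*rnormal()

* Log transform that stays defined when patent_count is 0
gen ln_patent = ln(1 + patent_count)

* Standardize to mean 0 and standard deviation 1
egen dfi_std = std(dfi_index)

summarize ln_patent dfi_std
```

With a standardized regressor, a coefficient reads as the change in the outcome per one standard deviation of dfi_index, which is worth stating whenever a table mixes raw and standardized versions.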

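V. The actual code

Below is the Stata script that corresponds to this page. In production, empirical-wizard builds on it to handle variable mapping, output validation, failure diagnosis, and report assembly. As listed, the script forms the two groups by a median split on size (the output columns are labelled size<中位数 and size≥中位数) and uses vce(robust), so it applies the procedure described above with a different grouping variable than the SOE walkthrough.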

log using "/root/workspace/empirical-wizard/workspace/f3e008aa/analysis.log", replace text
global JOB_DIR "/root/workspace/empirical-wizard/workspace/f3e008aa"
set more off
adopath + "/root/ado/plus"
global DATA_PATH "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv"
import delimited "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv", clear case(preserve)
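* Default JOB_DIR to the current directory only when it has not been set
if "$JOB_DIR" == "" global JOB_DIR "."
* Drop exact duplicate rows (same value in every column) so N is not inflated and xtset does not fail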
quietly duplicates drop
local idvar ""
local timevar ""
capture confirm variable stkcd
if !_rc {
    capture confirm numeric variable stkcd
    if _rc {
        tempvar __ewiz_id
        capture encode stkcd, gen(`__ewiz_id')
        if !_rc local idvar "`__ewiz_id'"
    }
    else {
        local idvar "stkcd"
    }
}
else {
    di as text "面板ID变量不存在,跳过 xtset ID:stkcd"
}
capture confirm variable year
if !_rc {
    capture confirm numeric variable year
    if _rc {
        tempvar __ewiz_time
        capture encode year, gen(`__ewiz_time')
        if !_rc local timevar "`__ewiz_time'"
    }
    else {
        local timevar "year"
    }
}
else {
    di as text "时间变量不存在,跳过 xtset time:year"
}
if "`idvar'" != "" & "`timevar'" != "" {
    capture xtset `idvar' `timevar'
}
capture confirm numeric variable size
tempvar __grp_src
if _rc {
    encode size, gen(`__grp_src')
}
else {
    gen double `__grp_src' = size
}
quietly summarize `__grp_src', detail
local __med = r(p50)
tempvar __grp
gen byte `__grp' = cond(`__grp_src' >= `__med', 1, 0) if !missing(`__grp_src')
levelsof `__grp', local(groups_all)
// Sample-size guard: a group needs at least params+10 observations to be regressed; otherwise it is skipped
local _min_n = 27  // params (iv+controls) + safety buffer
local groups ""
foreach g of local groups_all {
    quietly count if `__grp'==`g'
    if r(N) >= `_min_n' {
        local groups "`groups' `g'"
    }
    else {
        di as text "[异质性] 跳过 g=`g',N=" r(N) " < " `_min_n' "(样本不足以承载 FE + 控制变量,会导致 R²=1 过拟合)"
    }
}
if "`groups'" == "" {
    di as error "[异质性] 所有分组样本均不足,无法执行异质性分析。建议改为 by_median 或 by_terciles 粗分组。"
    exit 0
}
tempname fh
capture file close `fh'
file open `fh' using "$JOB_DIR/regression_table_异质性检验.csv", write replace
local __hdr = "变量"
foreach g of local groups {
    local __grp_label "(size<中位数)"
    if "`g'" == "1" local __grp_label "(size≥中位数)"
    local __hdr = "`__hdr',`__grp_label'"
}
file write `fh' "`__hdr'" _n
local __row = "dfi_index"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[dfi_index]
        capture local __se = _se[dfi_index]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[dfi_index]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "roa"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[roa]
        capture local __se = _se[roa]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[roa]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "lev"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[lev]
        capture local __se = _se[lev]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[lev]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "size"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[size]
        capture local __se = _se[size]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[size]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "growth"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[growth]
        capture local __se = _se[growth]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[growth]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "cashflow"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[cashflow]
        capture local __se = _se[cashflow]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[cashflow]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "tobinq"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[tobinq]
        capture local __se = _se[tobinq]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[tobinq]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "top1"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[top1]
        capture local __se = _se[top1]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[top1]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "dual"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[dual]
        capture local __se = _se[dual]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[dual]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "board"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[board]
        capture local __se = _se[board]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[board]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "indep"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[indep]
        capture local __se = _se[indep]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[indep]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "soe"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[soe]
        capture local __se = _se[soe]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[soe]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "age"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __c = _b[age]
        capture local __se = _se[age]
        capture local __p = 2*ttail(e(df_r), abs(`__c'/`__se'))
        local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
        local __coef_s : display %9.4f `__c'
        capture if abs(`__c') < 0.00005 & `__c' != 0 local __coef_s : display %9.2e `__c'
        local __row = "`__row',`__coef_s'`__stars'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = ""
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        capture local __se = _se[age]
        local __se_s : display %9.4f `__se'
        capture if abs(`__se') < 0.00005 & `__se' != 0 local __se_s : display %9.2e `__se'
        local __row = "`__row',(`__se_s')"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "N"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        local __v = e(N)
        local __v_s : display %9.0f `__v'
        local __row = "`__row',`__v_s'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
local __row = "R²"
foreach g of local groups {
    capture quietly reghdfe patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp'==`g', absorb(`timevar' ind) vce(robust) keepsingletons
    if _rc==0 {
        local __v = e(r2)
        local __v_s : display %9.4f `__v'
        local __row = "`__row',`__v_s'"
    }
    else {
        local __row = "`__row',"
    }
}
file write `fh' "`__row'" _n
file close `fh'

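* ── Between-group coefficient difference test (two-group case) ──
* Build a group × core-regressor interaction; a significant interaction means the between-group difference is significant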
tempname diff_fh
capture file close `diff_fh'
file open `diff_fh' using "$JOB_DIR/heterogeneity_diff_test.csv", write replace
file write `diff_fh' "项目,值,说明" _n
quietly count if !missing(`__grp')
local _n_total = r(N)
quietly levelsof `__grp', local(_all_g)
local _n_groups : word count `_all_g'
file write `diff_fh' "分组数,`_n_groups',仅二分组情形下 Chow/交互项检验成立" _n
if `_n_groups' == 2 {
    tempvar __g_bin __iv_x_g
    quietly gen double `__g_bin' = `__grp' - `: word 1 of `_all_g''
    quietly replace `__g_bin' = `__g_bin' / (`: word 2 of `_all_g'' - `: word 1 of `_all_g'')
    quietly gen double `__iv_x_g' = dfi_index * `__g_bin'
    capture reg patent_count dfi_index `__g_bin' `__iv_x_g' roa lev size growth cashflow tobinq top1 dual board indep soe age if !missing(`__grp'), vce(robust)
    if _rc == 0 {
        local diff_coef = _b[`__iv_x_g']
        local diff_se = _se[`__iv_x_g']
        local diff_t = `diff_coef' / `diff_se'
        * Use t-distribution with regression df_r when available; fall
        * back to normal only when df_r is missing. Normal approximation
        * over-rejects in small/medium samples.
        capture local _diff_df = e(df_r)
        if "`_diff_df'" == "" | "`_diff_df'" == "." {
            local diff_p = 2*(1 - normal(abs(`diff_t')))
        }
        else {
            local diff_p = 2*ttail(`_diff_df', abs(`diff_t'))
        }
        local diff_coef_s : display %12.6f `diff_coef'
        local diff_se_s : display %12.6f `diff_se'
        local diff_p_s : display %9.4f `diff_p'
        file write `diff_fh' "组间系数差(组2-组1),`diff_coef_s'," _n
        file write `diff_fh' "标准误,`diff_se_s'," _n
        file write `diff_fh' "p 值,`diff_p_s',p<0.1 * / p<0.05 ** / p<0.01 ***" _n
        if `diff_p' < 0.01 {
            file write `diff_fh' "结论,组间差异显著,在 1% 水平显著" _n
        }
        else if `diff_p' < 0.05 {
            file write `diff_fh' "结论,组间差异显著,在 5% 水平显著" _n
        }
        else if `diff_p' < 0.1 {
            file write `diff_fh' "结论,组间差异边际显著,在 10% 水平显著" _n
        }
        else {
            file write `diff_fh' "结论,组间差异不显著,p 值 = `diff_p_s'不能拒绝组间系数相等" _n
        }
    }
    else {
        file write `diff_fh' "结论,交互项回归失败," _n
    }
}
else if `_n_groups' >= 3 & `_n_groups' <= 6 {
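    * ── More than 2 groups: suest joint test of coefficient equality ──
    * Estimate the same specification separately for each group; suest then combines
    * the variance matrices for a Wald test of H0: [g1] iv = [g2] iv = ... = [gk] iv.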
    local __suest_ok = 1
    local __est_names ""
    local __test_eq ""
    local __i = 0
    foreach _gv of local _all_g {
        local __i = `__i' + 1
        capture noisily {
            * suest needs models fitted with the default VCE; robust variance is applied at the suest step
            quietly reg patent_count dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age if `__grp' == `_gv'
        }
        if _rc != 0 {
            local __suest_ok = 0
        }
        else {
            estimates store __het`__i'
            local __est_names "`__est_names' __het`__i'"
            if `__i' == 1 {
                local __test_eq "[__het`__i'_mean]dfi_index"
            }
            else {
                local __test_eq "`__test_eq' = [__het`__i'_mean]dfi_index"
            }
        }
    }
    if `__suest_ok' == 1 {
        capture noisily suest `__est_names', vce(robust)
        if _rc == 0 {
            capture noisily test `__test_eq'
            if _rc == 0 {
                local joint_chi2 = r(chi2)
                local joint_df = r(df)
                local joint_p = r(p)
                local joint_chi2_s : display %9.4f `joint_chi2'
                local joint_df_s : display %2.0f `joint_df'
                local joint_p_s : display %9.4f `joint_p'
                file write `diff_fh' "Wald chi2,`joint_chi2_s',suest 联合检验" _n
                file write `diff_fh' "df,`joint_df_s'," _n
                file write `diff_fh' "p 值,`joint_p_s'," _n
                if `joint_p' < 0.01 file write `diff_fh' "结论,各组系数显著不等,1% 水平拒绝同质" _n
                else if `joint_p' < 0.05 file write `diff_fh' "结论,各组系数显著不等,5% 水平拒绝同质" _n
                else if `joint_p' < 0.1 file write `diff_fh' "结论,各组系数边际不等,10% 水平拒绝同质" _n
                else file write `diff_fh' "结论,各组系数同质性未被拒绝,p=`joint_p_s'" _n
            }
            else {
                file write `diff_fh' "结论,suest test 失败,可能是估计向量不可比" _n
            }
        }
        else {
            file write `diff_fh' "结论,suest 合并方差矩阵失败," _n
        }
    }
    else {
        file write `diff_fh' "结论,部分子组回归失败 — 无法做 suest 联合检验," _n
    }
    foreach _en of local __est_names {
        capture estimates drop `_en'
    }
}
else {
    file write `diff_fh' "结论,分组数过多 (>6) — 跳过 suest,建议改为子样本回归," _n
}
file close `diff_fh'

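* ── Extended heterogeneity candidate matrix (exploratory disclosure matrix) ──
* Purpose: record the between-group difference test for each candidate grouping variable; significant candidates are not automatically promoted to pre-specified main conclusions.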
tempname cm_fh
capture file close `cm_fh'
file open `cm_fh' using "$JOB_DIR/heterogeneity_candidate_matrix.csv", write replace
file write `cm_fh' "candidate,method,source_groups,n_group0,n_group1,diff_coef,diff_se,diff_p,status,note" _n
* candidate grouping: size
capture confirm variable size
if _rc {
    file write `cm_fh' "size,auto_binary_or_median,.,.,.,.,.,.,missing_variable,候选变量不存在" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable size
    if _rc {
        capture encode size, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = size
    }
    if _rc {
        file write `cm_fh' "size,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,变量无法数值化" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "size,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,有效分组少于2组" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "size,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,二分后仍不是2组" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev growth cashflow tobinq top1 dual board indep soe age if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "size,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "size,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,交互项回归失败" _n
                }
            }
        }
    }
}
* candidate grouping: age
capture confirm variable age
if _rc {
    file write `cm_fh' "age,auto_binary_or_median,.,.,.,.,.,.,missing_variable,候选变量不存在" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable age
    if _rc {
        capture encode age, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = age
    }
    if _rc {
        file write `cm_fh' "age,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,变量无法数值化" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "age,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,有效分组少于2组" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "age,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,二分后仍不是2组" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev size growth cashflow tobinq top1 dual board indep soe if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "age,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "age,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,interaction regression failed" _n
                }
            }
        }
    }
}
* candidate grouping: indep
capture confirm variable indep
if _rc {
    file write `cm_fh' "indep,auto_binary_or_median,.,.,.,.,.,.,missing_variable,candidate variable not found" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable indep
    if _rc {
        capture encode indep, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = indep
    }
    if _rc {
        file write `cm_fh' "indep,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,variable could not be converted to numeric" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "indep,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,fewer than 2 valid groups" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "indep,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,split did not yield exactly 2 groups" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev size growth cashflow tobinq top1 dual board soe age if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "indep,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "indep,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,interaction regression failed" _n
                }
            }
        }
    }
}
* candidate grouping: ind
capture confirm variable ind
if _rc {
    file write `cm_fh' "ind,auto_binary_or_median,.,.,.,.,.,.,missing_variable,candidate variable not found" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable ind
    if _rc {
        capture encode ind, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = ind
    }
    if _rc {
        file write `cm_fh' "ind,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,variable could not be converted to numeric" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "ind,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,fewer than 2 valid groups" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "ind,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,split did not yield exactly 2 groups" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev size growth cashflow tobinq top1 dual board indep soe age if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "ind,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "ind,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,interaction regression failed" _n
                }
            }
        }
    }
}
* candidate grouping: board
capture confirm variable board
if _rc {
    file write `cm_fh' "board,auto_binary_or_median,.,.,.,.,.,.,missing_variable,candidate variable not found" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable board
    if _rc {
        capture encode board, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = board
    }
    if _rc {
        file write `cm_fh' "board,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,variable could not be converted to numeric" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "board,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,fewer than 2 valid groups" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "board,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,split did not yield exactly 2 groups" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev size growth cashflow tobinq top1 dual indep soe age if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "board,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "board,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,interaction regression failed" _n
                }
            }
        }
    }
}
* candidate grouping: cashflow
capture confirm variable cashflow
if _rc {
    file write `cm_fh' "cashflow,auto_binary_or_median,.,.,.,.,.,.,missing_variable,candidate variable not found" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable cashflow
    if _rc {
        capture encode cashflow, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = cashflow
    }
    if _rc {
        file write `cm_fh' "cashflow,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,variable could not be converted to numeric" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "cashflow,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,fewer than 2 valid groups" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "cashflow,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,split did not yield exactly 2 groups" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev size growth tobinq top1 dual board indep soe age if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "cashflow,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "cashflow,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,interaction regression failed" _n
                }
            }
        }
    }
}
* candidate grouping: dual
capture confirm variable dual
if _rc {
    file write `cm_fh' "dual,auto_binary_or_median,.,.,.,.,.,.,missing_variable,candidate variable not found" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable dual
    if _rc {
        capture encode dual, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = dual
    }
    if _rc {
        file write `cm_fh' "dual,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,variable could not be converted to numeric" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "dual,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,fewer than 2 valid groups" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "dual,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,split did not yield exactly 2 groups" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev size growth cashflow tobinq top1 board indep soe age if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "dual,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "dual,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,interaction regression failed" _n
                }
            }
        }
    }
}
* candidate grouping: growth
capture confirm variable growth
if _rc {
    file write `cm_fh' "growth,auto_binary_or_median,.,.,.,.,.,.,missing_variable,candidate variable not found" _n
}
else {
    tempvar __cm_src __cm_grp __cm_gbin __cm_xg __cm_tag
    capture confirm numeric variable growth
    if _rc {
        capture encode growth, gen(`__cm_src')
    }
    else {
        capture gen double `__cm_src' = growth
    }
    if _rc {
        file write `cm_fh' "growth,auto_binary_or_median,.,.,.,.,.,.,encoding_failed,variable could not be converted to numeric" _n
    }
    else {
        quietly egen byte `__cm_tag' = tag(`__cm_src') if !missing(`__cm_src')
        quietly count if `__cm_tag' == 1
        local __cm_ng = r(N)
        if `__cm_ng' < 2 {
            file write `cm_fh' "growth,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,fewer than 2 valid groups" _n
        }
        else {
            if `__cm_ng' == 2 {
                quietly levelsof `__cm_src' if !missing(`__cm_src'), local(__cm_lvls)
                quietly gen double `__cm_grp' = `__cm_src'
                local __cm_method "binary_or_two_value"
            }
            else {
                quietly summarize `__cm_src' if !missing(`__cm_src'), detail
                local __cm_med = r(p50)
                quietly gen byte `__cm_grp' = (`__cm_src' >= `__cm_med') if !missing(`__cm_src')
                local __cm_method "median_split"
            }
            quietly levelsof `__cm_grp' if !missing(`__cm_grp'), local(__cm_glvls)
            local __cm_gg : word count `__cm_glvls'
            if `__cm_gg' != 2 {
                file write `cm_fh' "growth,auto_binary_or_median,`__cm_ng',.,.,.,.,.,degenerate,split did not yield exactly 2 groups" _n
            }
            else {
                local __cm_g0 : word 1 of `__cm_glvls'
                local __cm_g1 : word 2 of `__cm_glvls'
                quietly count if `__cm_grp'==`__cm_g0'
                local __cm_n0 = r(N)
                quietly count if `__cm_grp'==`__cm_g1'
                local __cm_n1 = r(N)
                quietly gen double `__cm_gbin' = (`__cm_grp' == `__cm_g1') if !missing(`__cm_grp')
                quietly gen double `__cm_xg' = dfi_index * `__cm_gbin'
                capture reg patent_count dfi_index `__cm_gbin' `__cm_xg' roa lev size cashflow tobinq top1 dual board indep soe age if !missing(`__cm_grp'), vce(robust)
                if _rc == 0 {
                    local __cm_diff = _b[`__cm_xg']
                    local __cm_se = _se[`__cm_xg']
                    local __cm_t = `__cm_diff' / `__cm_se'
                    capture local __cm_df = e(df_r)
                    local __cm_p = cond("`__cm_df'"=="" | "`__cm_df'"==".", 2*(1-normal(abs(`__cm_t'))), 2*ttail(`__cm_df', abs(`__cm_t')))
                    local __cm_diff_s : display %12.6f `__cm_diff'
                    local __cm_se_s : display %12.6f `__cm_se'
                    local __cm_p_s : display %9.4f `__cm_p'
                    local __cm_status "not_supported"
                    if `__cm_p' < 0.1 local __cm_status "supported"
                    file write `cm_fh' "growth,`__cm_method',`__cm_ng',`__cm_n0',`__cm_n1',`__cm_diff_s',`__cm_se_s',`__cm_p_s',`__cm_status',exploratory disclosed candidate" _n
                }
                else {
                    file write `cm_fh' "growth,auto_binary_or_median,`__cm_ng',`__cm_n0',`__cm_n1',.,.,.,regression_failed,interaction regression failed" _n
                }
            }
        }
    }
}
file close `cm_fh'
di "Heterogeneity analysis complete: grouping variable=size, method=by_median"
log close
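
Each candidate block in the code above applies the same grouping rule: a variable with exactly two distinct values is used as it is, and anything else is split at its median with `>=`. The rule can be sketched in Python as follows. This is a minimal illustration, not the pipeline's own code: the function name `split_two_groups`, the use of `None` for missing values, and the return layout are all assumptions made for this sketch.

```python
from statistics import median


def split_two_groups(values):
    """Mirror the grouping rule: keep two-valued variables, otherwise median split.

    Returns (method, groups), where groups maps each non-missing position to 0 or 1,
    or ("degenerate", None) when no valid two-group split exists.
    None entries stand in for Stata missing values (an assumption of this sketch).
    """
    present = {i: v for i, v in enumerate(values) if v is not None}
    levels = sorted(set(present.values()))
    if len(levels) < 2:
        return "degenerate", None
    if len(levels) == 2:
        method = "binary_or_two_value"
        # The higher of the two values becomes group 1.
        groups = {i: int(v == levels[1]) for i, v in present.items()}
    else:
        method = "median_split"
        cut = median(present.values())
        groups = {i: int(v >= cut) for i, v in present.items()}
    if len(set(groups.values())) != 2:
        # Heavy ties at the median can leave every observation in one group.
        return "degenerate", None
    return method, groups


method, groups = split_two_groups([3.0, 1.0, None, 2.0, 4.0])
print(method, groups)
```

The last check matters in practice: a variable such as `[1, 1, 1, 2, 3]` has three distinct values, but its median is 1, so `>=` puts every observation in the same group and the candidate has to be reported as degenerate rather than estimated.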

6. Actual output table

This table is the case output file used on this method page, saved at marketing/method_case_assets/heterogeneity/result.csv.

| Variable | size < median | size ≥ median |
| --- | --- | --- |
| dfi_index | 0.5890*** (0.0596) | 0.5784*** (0.0542) |
| roa | 0.2790*** (0.0636) | 0.3030*** (0.0592) |
| lev | -0.0311 (0.0613) | -0.0202 (0.0655) |
| size | 0.1006 (0.0855) | 0.3969*** (0.1092) |
| growth | 0.0389 (0.0611) | -0.1402** (0.0640) |
| cashflow | 0.0532 (0.0649) | -0.0009 (0.0546) |
| tobinq | 0.0692 (0.0695) | -0.0029 (0.0608) |
| top1 | -0.0406 (0.0633) | 0.0147 (0.0638) |
| dual | -0.0110 (0.0603) | 0.0127 (0.0657) |
| board | -0.0549 (0.0552) | 0.0066 (0.0593) |
| indep | 0.0553 (0.0667) | -0.0808 (0.0574) |
| soe | 0.0379 (0.0602) | -0.0521 (0.0681) |
| age | 0.0449 (0.0606) | -0.0628 (0.0569) |
| N | 360 | 360 |
| R² | 0.2840 | 0.3323 |
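
Comparing the two dfi_index columns by eye is not a test. As a rough cross-check, two subsample estimates can be compared with a z-statistic that treats them as independent, z = (b2 - b1) / sqrt(se1² + se2²). The sketch below is an illustration only: it assumes the figures in parentheses are standard errors, that the two subsamples are independent, and that a normal approximation is adequate. It is not the interaction-term test this page reports, so its numbers need not match heterogeneity_diff_test.csv. The function name `coef_diff_z` is hypothetical.

```python
import math


def coef_diff_z(b_a, se_a, b_b, se_b):
    """z-test for equality of two coefficients from independent subsamples."""
    diff = b_b - b_a
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value, normal approximation
    return diff, se_diff, z, p


# dfi_index row of the table above: size < median, then size >= median
diff, se_diff, z, p = coef_diff_z(0.5890, 0.0596, 0.5784, 0.0542)
print(f"diff={diff:.4f}  se={se_diff:.4f}  z={z:.3f}  p={p:.3f}")
```

With these inputs the p-value is far above 0.1, which agrees with the page's reading that the two size groups do not differ in any detectable way.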

Supplementary output

The files below come from the same case run or smoke-test output and supply diagnostics beyond the main table.

heterogeneity_candidate_matrix.csv

| candidate | method | source_groups | n_group0 | n_group1 | diff_coef | diff_se | diff_p | status | note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| size | median_split | 713 | 360 | 360 | -0.014187 | 0.081464 | 0.8618 | not_supported | exploratory disclosed candidate |
| age | median_split | 711 | 360 | 360 | 0.080481 | 0.082598 | 0.3302 | not_supported | exploratory disclosed candidate |
| indep | median_split | 711 | 360 | 360 | -0.041161 | 0.083297 | 0.6214 | not_supported | exploratory disclosed candidate |
| ind | median_split | 8 | 360 | 360 | -0.042618 | 0.082585 | 0.6060 | not_supported | exploratory disclosed candidate |
| board | median_split | 716 | 360 | 360 | -0.083428 | 0.084312 | 0.3228 | not_supported | exploratory disclosed candidate |
| cashflow | median_split | 712 | 360 | 360 | 0.027120 | 0.081387 | 0.7391 | not_supported | exploratory disclosed candidate |
| dual | median_split | 715 | 360 | 360 | 0.177181 | 0.081392 | 0.0298 | supported | exploratory disclosed candidate |
| growth | median_split | 715 | 360 | 360 | -0.009234 | 0.081983 | 0.9104 | not_supported | exploratory disclosed candidate |
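
Because this matrix screens eight candidate splits, a single p-value below 0.1 can easily arise by chance. The sketch below applies a Holm step-down adjustment to the diff_p column. The adjustment is added here for illustration and is not something the page's pipeline is known to perform; the function name `holm_adjust` is hypothetical, and the 0.1 threshold is copied from the status rule in the Stata code.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted


# diff_p column of heterogeneity_candidate_matrix.csv
diff_p = {
    "size": 0.8618, "age": 0.3302, "indep": 0.6214, "ind": 0.6060,
    "board": 0.3228, "cashflow": 0.7391, "dual": 0.0298, "growth": 0.9104,
}
adjusted = dict(zip(diff_p, holm_adjust(list(diff_p.values()))))
for name, p_adj in adjusted.items():
    flag = "supported" if p_adj < 0.1 else "not_supported"
    print(f"{name:<9} raw={diff_p[name]:.4f}  holm={p_adj:.4f}  {flag}")
```

Under this adjustment the dual split's p-value becomes 8 × 0.0298 ≈ 0.238, so it no longer clears the 0.1 threshold. That is a reason to treat its "supported" flag as exploratory, in line with the note column.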

heterogeneity_diff_test.csv

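| Item | Value | Note |
| --- | --- | --- |
| Number of groups | 2 | The Chow/interaction test applies only in the two-group case |
| Between-group coefficient difference (group 2 - group 1) | -0.000924 | |
| Standard error | 0.080567 | |
| p-value | 0.9909 | p<0.1 * / p<0.05 ** / p<0.01 *** |
| Conclusion | Between-group difference not significant | p-value = 0.9909; equality of the group coefficients cannot be rejected |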
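
The p-value in this table follows from the coefficient difference and its standard error. The sketch below recomputes it with a normal approximation and attaches stars using the thresholds listed in the table. The Stata code uses the t distribution when residual degrees of freedom are available, so small discrepancies in the later decimals are expected. `two_sided_p` and `stars` are hypothetical helper names.

```python
import math


def two_sided_p(diff, se):
    """Two-sided p-value for diff / se under a normal approximation."""
    return math.erfc(abs(diff / se) / math.sqrt(2))


def stars(p):
    """Significance stars: * p<0.1, ** p<0.05, *** p<0.01."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


p = two_sided_p(-0.000924, 0.080567)
print(f"p = {p:.4f} {stars(p)}".rstrip())
```

The same helper applied to the dual row of the candidate matrix (0.177181, 0.081392) lands within about 0.001 of the reported 0.0298, which suggests both files derive their p-values in the same way.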

7. Case figure

This is an in-page diagnostic figure generated from the same case data.

Shared-case output figure for the heterogeneity analysis.

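8. How to write it up in a paper

This paper reports the heterogeneity analysis on the shared firm panel sample; the core output is in regression_table_异质性检验.csv. When interpreting the results, consider the sample definition, variable construction, coefficient sign, standard errors, and applicability conditions together, rather than settling the choice of method on a single p-value.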

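9. Checklist

  • Confirm that the dependent variable, core explanatory variable, and control variables used on this page match the paper's main model.
  • Check the sample definition in the table first, then the coefficients, p-values, or diagnostic statistics.
  • Make sure the output file names in the code map to the result tables shown on the page.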
