Method 07 · baseline_regression
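Baseline Regression
Producing interpretable model evidence from a single firm panel
A Markdown-style tutorial on baseline regression: the code, result tables, and case figure are all generated from the shared CSMAR-style case.
1. What is a baseline regression?
This page is the method documentation for baseline regression. Every table and figure is generated by marketing/method_case_assets/generate_assets.py from the same csmar_innovation_realistic.csv, so no placeholder images stand in for real output. The focus is on using a firm-year panel to explain model specification, fixed effects, clustered standard errors, and how to read the results.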
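2. Walking through the case
Before you start
- Confirm that Y, X, and the control variables all come from the same cleaned sample.
- Decide on fixed effects first: a firm panel usually needs at least year fixed effects, and often firm fixed effects as well.
- Standard errors should match the data structure. Firm panels commonly cluster at the firm level.
- The main table usually has several columns, adding controls and fixed effects step by step from the simplest model.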
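The structural checks above can be sketched in Stata before any regression is run. This is a minimal illustration on a simulated toy panel, not the page's case data; firm_id, year, and all simulated values are assumptions made for the example.

```stata
* Toy firm-year panel (simulated; all names and values are illustrative assumptions)
clear
set seed 12345
set obs 60                              // 60 hypothetical firms
gen firm_id = _n
expand 6                                // 6 yearly rows per firm
bysort firm_id: gen year = 2014 + _n    // years 2015-2020
gen dfi_index = rnormal()
gen ln_patent1 = 0.5 * dfi_index + rnormal()

* Firm-year must identify each row uniquely; isid stops with an error otherwise
isid firm_id year

* Declare the panel; the output reports whether it is balanced
xtset firm_id year

* Count the yearly observations available for each firm
bysort firm_id: gen n_years = _N
tabulate n_years
```

If isid or xtset stops with an error, the sample still contains duplicate firm-year rows and should be cleaned before estimating anything.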
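Order of operations
| Step | What you do | What counts as done right |
|---|---|---|
| 1. Simple regression | Include only the core variable and year effects. | Check the most basic direction of the effect. |
| 2. Add controls | Add ROA, Lev, Size, and other controls. | Check whether the DFI coefficient changes sharply. |
| 3. Add firm fixed effects | Control for time-invariant firm characteristics. | This is the usual main specification in panel papers. |
| 4. Cluster the standard errors | Cluster by firm. | Multi-year observations of the same firm should not be treated as fully independent. |
| 5. Read stability | Compare coefficients, R², and N across the four columns. | The main result is more credible when sign and significance are stable. |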
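One way to sketch the five steps as Stata commands is shown here. It assumes a simulated toy panel and the hypothetical names ln_patent1, dfi_index, roa, firm_id, and year; the generated code later on this page is organized differently.

```stata
* Toy data (simulated; names and values are illustrative assumptions)
clear
set seed 12345
set obs 60
gen firm_id = _n
gen alpha = rnormal()                   // time-invariant firm effect
expand 6
bysort firm_id: gen year = 2014 + _n
gen dfi_index = rnormal() + 0.5 * alpha
gen roa = rnormal()
gen ln_patent1 = 0.5 * dfi_index + 0.3 * roa + alpha + rnormal()
xtset firm_id year

* Step 1: core variable plus year effects
regress ln_patent1 dfi_index i.year
estimates store m1
* Step 2: add controls
regress ln_patent1 dfi_index roa i.year
estimates store m2
* Step 3: add firm fixed effects
xtreg ln_patent1 dfi_index roa i.year, fe
estimates store m3
* Step 4: same model with firm-clustered standard errors
xtreg ln_patent1 dfi_index roa i.year, fe vce(cluster firm_id)
estimates store m4
* Step 5: compare coefficients, standard errors, and N side by side
estimates table m1 m2 m3 m4, keep(dfi_index roa) b(%9.4f) se(%9.4f) stats(N)
```

Columns m3 and m4 share the same coefficients by construction; only the standard errors change, which isolates the effect of clustering.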
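Line-by-line code walkthrough
The commands in this table are the hand-written shorthand for the workflow; the generated code in Section 5 uses local macros, areg ..., absorb(), and vce(robust) instead.
| Code / command | What the line does |
|---|---|
| global y ln_patent1 | Stores the dependent variable in a macro so later model changes are easier. |
| global x dfi_index | The core explanatory variable. The paper's main conclusion rests on this row. |
| global controls ... | The list of controls. Do not retype it in every column; it is easy to drop one. |
| xtreg ..., fe | Fixed-effects regression, suited to firm panels. |
| vce(cluster firm_id) | Clusters standard errors by firm to handle within-firm correlation. |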
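As a hedged illustration of the commands in the table, the sketch below runs them on a simulated toy panel; the data and the single control roa are assumptions. It also runs areg with absorb(), the form used by the generated code on this page, to show that both commands give the same slope on the core variable. Their clustered standard errors can differ because the two commands apply different degrees-of-freedom adjustments.

```stata
* Toy data (simulated; names and values are illustrative assumptions)
clear
set seed 12345
set obs 60
gen firm_id = _n
gen alpha = rnormal()
expand 6
bysort firm_id: gen year = 2014 + _n
gen dfi_index = rnormal() + 0.5 * alpha
gen roa = rnormal()
gen ln_patent1 = 0.5 * dfi_index + 0.3 * roa + alpha + rnormal()

* Macros keep the variable lists in one place
global y ln_patent1
global x dfi_index
global controls roa

* Fixed-effects regression via xtreg
xtset firm_id year
xtreg $y $x $controls i.year, fe vce(cluster firm_id)
scalar b_xtreg = _b[$x]

* Same within estimator via areg, absorbing the firm dummies
areg $y $x $controls i.year, absorb(firm_id) vce(cluster firm_id)
scalar b_areg = _b[$x]

display "xtreg, fe: " %9.4f b_xtreg "   areg, absorb(): " %9.4f b_areg
```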
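How to read the results table
| Cell | How to read it |
|---|---|
| dfi_index coefficient | This is the main conclusion. A positive value means higher DFI goes with higher ln Patent. |
| Stars / p-value | Indicate statistical significance; look at the coefficient's magnitude as well, not just the stars. |
| N | If N differs across the four columns, explain why. |
| R² | Tracks changes in explanatory power, but higher is not always better; fixed effects change what R² means. |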
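To make the star and p-value cells concrete, this sketch recomputes them for column (4) from the coefficient and standard error listed in regression_table.csv on this page. It uses the normal approximation, which is an assumption here: the residual degrees of freedom are not reported on this page, so the result need not match the listed pvalue exactly.

```stata
* Coefficient and standard error of dfi_index in column (4), as listed on this page
scalar coef = 0.5660408387964105
scalar se   = 0.04600032

* Two-sided p-value under the normal approximation;
* 2*normal(-z) avoids the rounding loss of 1 - normal(z) for large z
scalar z = abs(coef / se)
scalar p = 2 * normal(-z)

* Same star thresholds as the generated code: 0.01, 0.05, 0.1
local stars = cond(p < 0.01, "***", cond(p < 0.05, "**", cond(p < 0.1, "*", "")))

* 95% confidence interval, as used for the coefficient plot
scalar lb = coef - 1.96 * se
scalar ub = coef + 1.96 * se

display "z = " %6.2f z ", stars = `stars'"
display "95% CI: [" %6.4f lb ", " %6.4f ub "]"
```

Reading the interval next to the stars keeps magnitude in view: the cell answers both whether the effect differs from zero and roughly how large it is.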
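Where it most often goes wrong
- Do not treat column (1) as the final conclusion. The main specification is usually the one with controls and fixed effects.
- Do not report significance without economic meaning. The size of the coefficient must be interpretable.
- Always state which standard errors you report. cluster and robust are not the same thing; the case table on this page reports robust standard errors (vce(robust)), not firm-clustered ones.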
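The difference between robust and clustered standard errors can be seen in a small simulation. The sketch below is an assumption-laden toy design, not the page's case data: both the regressor and the error share a firm-level component, which is the situation where ignoring within-firm correlation tends to understate the standard error.

```stata
* Toy panel with within-firm correlation in both x and the error (simulated)
clear
set seed 2024
set obs 80                              // 80 hypothetical firms
gen firm_id = _n
gen x_firm = rnormal()                  // firm-level component of the regressor
gen u_firm = rnormal()                  // firm-level component of the error
expand 6
bysort firm_id: gen year = 2014 + _n
gen dfi_index = x_firm + 0.5 * rnormal()
gen ln_patent1 = 0.5 * dfi_index + u_firm + rnormal()

* Heteroskedasticity-robust standard errors: rows treated as independent
quietly regress ln_patent1 dfi_index, vce(robust)
scalar b_rob  = _b[dfi_index]
scalar se_rob = _se[dfi_index]

* Firm-clustered standard errors: arbitrary correlation allowed within a firm
quietly regress ln_patent1 dfi_index, vce(cluster firm_id)
scalar b_cl  = _b[dfi_index]
scalar se_cl = _se[dfi_index]

display "coef (robust)  = " %9.4f b_rob "   se = " %9.4f se_rob
display "coef (cluster) = " %9.4f b_cl  "   se = " %9.4f se_cl
```

The point estimate is identical in both runs; only the standard error, and therefore the stars, can change.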
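What to achieve when replicating
When replicating, read each column in one sentence: what this column controls for beyond the previous one, and whether the DFI coefficient changed.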
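3. The case's conclusions first
- The coefficient on the core variable dfi_index across the four models is 0.5835*** / 0.5709*** / 0.5671*** / 0.5660***; sign and significance are stable.
- R² rises from 0.1932 to 0.2824, 0.4069, and 0.4098. Controls and firm fixed effects account for most of the gain; adding time fixed effects raises R² by only about 0.003.
- N is 720 / 720 / 720 / 720 across the four columns; if N jumps around from column to column in your own table, go back and check missing values and variable definitions first.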
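If N does move between columns, one common remedy is to fix a single estimation sample and reuse it everywhere. A minimal sketch on simulated data follows; the variable names and the pattern of missing values are assumptions for illustration.

```stata
* Toy panel with missing values in one control (simulated)
clear
set seed 12345
set obs 60
gen firm_id = _n
expand 6
bysort firm_id: gen year = 2014 + _n
gen dfi_index = rnormal()
gen roa = rnormal()
gen ln_patent1 = 0.5 * dfi_index + 0.3 * roa + rnormal()
replace roa = . if mod(_n, 10) == 0     // knock out every 10th control value

* Without a common sample, N drops as soon as the control enters
quietly regress ln_patent1 dfi_index
scalar N_col1 = e(N)
quietly regress ln_patent1 dfi_index roa
scalar N_col2 = e(N)
display "N: column 1 = " N_col1 ", column 2 = " N_col2

* Flag rows with no missing value in any model variable and reuse the flag
gen byte common = !missing(ln_patent1, dfi_index, roa)
quietly regress ln_patent1 dfi_index if common
scalar N_col1_common = e(N)
display "N on the common sample = " N_col1_common
```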
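4. Case definitions
| Field | Definition |
|---|---|
| Data | CSMAR-style panel of A-share firm innovation |
| Raw sample | 196 listed firms, 2015-2020, about 1,200 firm-year observations; the effective sample for each method is the N in this page's output table |
| Dependent variable | patent_count; regression pages usually use ln(1 + patent_count), while the code in Section 5 passes patent_count to the regressions as is |
| Core explanatory variable | dfi_index, the digital inclusive finance index; some real smoke-test outputs show the standardized dfi_index |
| Controls | roa, lev, size, growth, cashflow, tobinq, top1, dual, board, indep, soe, age |
| Output file | regression_table_基准回归.csv |
| Required roles | dv, iv |
| Dependencies | No additional Stata community packages required |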
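The two variable constructions named in the table, ln(1 + patent_count) and a standardized dfi_index, can be sketched as follows. The data are simulated and the new variable names ln_patent1 and dfi_index_std are assumptions for the example.

```stata
* Toy cross-section (simulated; values are illustrative assumptions)
clear
set seed 7
set obs 200
gen patent_count = rpoisson(3)          // count outcome, zeros allowed
gen dfi_index = 200 + 50 * rnormal()

* Log transform that stays defined when patent_count is zero
gen ln_patent1 = ln(1 + patent_count)

* Standardize the index to mean 0 and standard deviation 1
egen dfi_index_std = std(dfi_index)

summarize patent_count ln_patent1 dfi_index dfi_index_std
```

With a standardized regressor, the coefficient reads as the change in the outcome associated with a one-standard-deviation change in the index, which is worth stating explicitly next to the table.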
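5. The actual code
Below is the minimal reproducible Stata code for this page. In production, empirical-wizard builds on it to handle variable mapping, output validation, failure diagnostics, and report assembly.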
log using "/root/workspace/empirical-wizard/workspace/e53e8eb7/analysis.log", replace text
global JOB_DIR "/root/workspace/empirical-wizard/workspace/e53e8eb7"
set more off
adopath + "/root/ado/plus"
global DATA_PATH "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv"
import delimited "/root/workspace/empirical-wizard/workspace/test_e2e/csmar_innovation.csv", clear case(preserve)
// Fall back to the current directory only when JOB_DIR is empty
if "$JOB_DIR" == "" global JOB_DIR "."
quietly duplicates drop
local dvvar "patent_count"
local ivvar "dfi_index"
local controls "roa lev size growth cashflow tobinq top1 dual board indep soe age"
local idvar "stkcd"
local timevar "year"
local industryvar "ind"
local geovar ""
local allvars "dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
capture confirm variable `dvvar'
if _rc {
di as error "Dependent variable not found: `dvvar'"
exit 111
}
capture confirm variable `ivvar'
if _rc {
di as error "Core independent variable not found: `ivvar'"
exit 111
}
local absorb_id "`idvar'"
if "`idvar'" != "" {
capture confirm numeric variable `idvar'
if _rc {
tempvar __ewiz_id
encode `idvar', gen(`__ewiz_id')
local absorb_id "`__ewiz_id'"
}
}
local cluster_id "`absorb_id'"
local fe_time "`timevar'"
if "`timevar'" != "" {
capture confirm numeric variable `timevar'
if _rc {
tempvar __ewiz_time
encode `timevar', gen(`__ewiz_time')
local fe_time "`__ewiz_time'"
}
}
local industry_fe "`industryvar'"
if "`industryvar'" != "" {
capture confirm numeric variable `industryvar'
if _rc {
tempvar __ewiz_industry
encode `industryvar', gen(`__ewiz_industry')
local industry_fe "`__ewiz_industry'"
}
}
local cluster_industry "`industry_fe'"
local geo_fe "`geovar'"
if "`geovar'" != "" {
capture confirm numeric variable `geovar'
if _rc {
tempvar __ewiz_geo
encode `geovar', gen(`__ewiz_geo')
local geo_fe "`__ewiz_geo'"
}
}
local cluster_geo "`geo_fe'"
tempfile ewiz_summary
tempname ewiz_sum
postfile `ewiz_sum' str32 spec double coef se pvalue str4 stars double N R2 using `ewiz_summary', replace
foreach v of local allvars {
local cell_b1_`v' ""
local secell_b1_`v' ""
local cell_b2_`v' ""
local secell_b2_`v' ""
local cell_b3_`v' ""
local secell_b3_`v' ""
local cell_b4_`v' ""
local secell_b4_`v' ""
}
preserve
quietly reg `dvvar' `ivvar', vce(robust)
restore
capture local N_b1 : display %12.0f e(N)
if "`N_b1'"=="" local N_b1 = .
capture local R2_b1 : display %9.4f e(r2)
if "`R2_b1'"=="" local R2_b1 = .
capture scalar __ewiz_main_b_b1 = _b[`ivvar']
capture scalar __ewiz_main_se_b1 = _se[`ivvar']
capture local coefmain_b1 : display %9.4f __ewiz_main_b_b1
capture if abs(__ewiz_main_b_b1) < 0.00005 & __ewiz_main_b_b1 != 0 local coefmain_b1 : display %9.2e __ewiz_main_b_b1
capture local semain_b1 : display %9.4f __ewiz_main_se_b1
capture if abs(__ewiz_main_se_b1) < 0.00005 & __ewiz_main_se_b1 != 0 local semain_b1 : display %9.2e __ewiz_main_se_b1
local pmain_b1 = .
local starsmain_b1 ""
local __ewiz_model_valid_b1 = 1
capture scalar __ewiz_coef_abs_b1 = abs(_b[`ivvar'])
if _rc local __ewiz_model_valid_b1 = 0
capture scalar __ewiz_se_abs_b1 = abs(_se[`ivvar'])
if _rc local __ewiz_model_valid_b1 = 0
capture if scalar(__ewiz_se_abs_b1) <= 1e-12 local __ewiz_model_valid_b1 = 0
capture if scalar(__ewiz_coef_abs_b1) <= 1e-12 & scalar(__ewiz_se_abs_b1) <= 1e-12 local __ewiz_model_valid_b1 = 0
capture local __df_b1 = e(df_r)
capture local __z_b1 = abs(_b[`ivvar'] / _se[`ivvar'])
if !_rc & `__ewiz_model_valid_b1' == 1 {
if "`__df_b1'"=="" | "`__df_b1'"=="." local pmain_b1 = 2*(1-normal(`__z_b1'))
else local pmain_b1 = 2*ttail(`__df_b1', `__z_b1')
}
if `__ewiz_model_valid_b1' == 1 local starsmain_b1 = cond(`pmain_b1'<0.01,"***",cond(`pmain_b1'<0.05,"**",cond(`pmain_b1'<0.1,"*","")))
if `__ewiz_model_valid_b1' == 1 {
capture post `ewiz_sum' ("(1)") (_b[`ivvar']) (_se[`ivvar']) (`pmain_b1') ("`starsmain_b1'") (`N_b1') (`R2_b1')
if _rc post `ewiz_sum' ("(1)") (.) (.) (.) ("") (`N_b1') (`R2_b1')
}
else post `ewiz_sum' ("(1)") (.) (.) (.) ("") (`N_b1') (`R2_b1')
if `__ewiz_model_valid_b1' == 1 {
local __included_vars "`ivvar' dfi_index"
foreach v of local allvars {
local __is_included = 0
foreach __inc of local __included_vars {
if "`v'" == "`__inc'" local __is_included = 1
}
if `__is_included' == 0 continue
local model_term "`v'"
if "`v'" == "`ivvar'" local model_term "`ivvar'"
capture scalar __ewiz_term_b = _b[`model_term']
capture scalar __ewiz_term_se = _se[`model_term']
if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
local cell_b1_`v' "omitted"
local secell_b1_`v' "(absorbed)"
continue
}
capture local __coef : display %9.4f __ewiz_term_b
capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
if !_rc {
capture local __se : display %9.4f __ewiz_term_se
capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
local __p = .
local __stars ""
capture local __df = e(df_r)
capture local __z = abs(_b[`model_term'] / _se[`model_term'])
if !_rc {
if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
else local __p = 2*ttail(`__df', `__z')
}
if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
local cell_b1_`v' "`__coef'`__stars'"
local secell_b1_`v' "(`__se')"
}
}
}
di "(1): `ivvar'=`coefmain_b1'`starsmain_b1' (se=`semain_b1', p=`pmain_b1'), N=`N_b1', R2=`R2_b1'"
preserve
quietly reg `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age, vce(robust)
restore
capture local N_b2 : display %12.0f e(N)
if "`N_b2'"=="" local N_b2 = .
capture local R2_b2 : display %9.4f e(r2)
if "`R2_b2'"=="" local R2_b2 = .
capture scalar __ewiz_main_b_b2 = _b[`ivvar']
capture scalar __ewiz_main_se_b2 = _se[`ivvar']
capture local coefmain_b2 : display %9.4f __ewiz_main_b_b2
capture if abs(__ewiz_main_b_b2) < 0.00005 & __ewiz_main_b_b2 != 0 local coefmain_b2 : display %9.2e __ewiz_main_b_b2
capture local semain_b2 : display %9.4f __ewiz_main_se_b2
capture if abs(__ewiz_main_se_b2) < 0.00005 & __ewiz_main_se_b2 != 0 local semain_b2 : display %9.2e __ewiz_main_se_b2
local pmain_b2 = .
local starsmain_b2 ""
local __ewiz_model_valid_b2 = 1
capture scalar __ewiz_coef_abs_b2 = abs(_b[`ivvar'])
if _rc local __ewiz_model_valid_b2 = 0
capture scalar __ewiz_se_abs_b2 = abs(_se[`ivvar'])
if _rc local __ewiz_model_valid_b2 = 0
capture if scalar(__ewiz_se_abs_b2) <= 1e-12 local __ewiz_model_valid_b2 = 0
capture if scalar(__ewiz_coef_abs_b2) <= 1e-12 & scalar(__ewiz_se_abs_b2) <= 1e-12 local __ewiz_model_valid_b2 = 0
capture local __df_b2 = e(df_r)
capture local __z_b2 = abs(_b[`ivvar'] / _se[`ivvar'])
if !_rc & `__ewiz_model_valid_b2' == 1 {
if "`__df_b2'"=="" | "`__df_b2'"=="." local pmain_b2 = 2*(1-normal(`__z_b2'))
else local pmain_b2 = 2*ttail(`__df_b2', `__z_b2')
}
if `__ewiz_model_valid_b2' == 1 local starsmain_b2 = cond(`pmain_b2'<0.01,"***",cond(`pmain_b2'<0.05,"**",cond(`pmain_b2'<0.1,"*","")))
if `__ewiz_model_valid_b2' == 1 {
capture post `ewiz_sum' ("(2)") (_b[`ivvar']) (_se[`ivvar']) (`pmain_b2') ("`starsmain_b2'") (`N_b2') (`R2_b2')
if _rc post `ewiz_sum' ("(2)") (.) (.) (.) ("") (`N_b2') (`R2_b2')
}
else post `ewiz_sum' ("(2)") (.) (.) (.) ("") (`N_b2') (`R2_b2')
if `__ewiz_model_valid_b2' == 1 {
local __included_vars "`ivvar' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
foreach v of local allvars {
local __is_included = 0
foreach __inc of local __included_vars {
if "`v'" == "`__inc'" local __is_included = 1
}
if `__is_included' == 0 continue
local model_term "`v'"
if "`v'" == "`ivvar'" local model_term "`ivvar'"
capture scalar __ewiz_term_b = _b[`model_term']
capture scalar __ewiz_term_se = _se[`model_term']
if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
local cell_b2_`v' "omitted"
local secell_b2_`v' "(absorbed)"
continue
}
capture local __coef : display %9.4f __ewiz_term_b
capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
if !_rc {
capture local __se : display %9.4f __ewiz_term_se
capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
local __p = .
local __stars ""
capture local __df = e(df_r)
capture local __z = abs(_b[`model_term'] / _se[`model_term'])
if !_rc {
if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
else local __p = 2*ttail(`__df', `__z')
}
if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
local cell_b2_`v' "`__coef'`__stars'"
local secell_b2_`v' "(`__se')"
}
}
}
di "(2): `ivvar'=`coefmain_b2'`starsmain_b2' (se=`semain_b2', p=`pmain_b2'), N=`N_b2', R2=`R2_b2'"
preserve
quietly areg `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age, absorb(`absorb_id') vce(robust)
restore
capture local N_b3 : display %12.0f e(N)
if "`N_b3'"=="" local N_b3 = .
capture local R2_b3 : display %9.4f e(r2)
if "`R2_b3'"=="" local R2_b3 = .
capture scalar __ewiz_main_b_b3 = _b[`ivvar']
capture scalar __ewiz_main_se_b3 = _se[`ivvar']
capture local coefmain_b3 : display %9.4f __ewiz_main_b_b3
capture if abs(__ewiz_main_b_b3) < 0.00005 & __ewiz_main_b_b3 != 0 local coefmain_b3 : display %9.2e __ewiz_main_b_b3
capture local semain_b3 : display %9.4f __ewiz_main_se_b3
capture if abs(__ewiz_main_se_b3) < 0.00005 & __ewiz_main_se_b3 != 0 local semain_b3 : display %9.2e __ewiz_main_se_b3
local pmain_b3 = .
local starsmain_b3 ""
local __ewiz_model_valid_b3 = 1
capture scalar __ewiz_coef_abs_b3 = abs(_b[`ivvar'])
if _rc local __ewiz_model_valid_b3 = 0
capture scalar __ewiz_se_abs_b3 = abs(_se[`ivvar'])
if _rc local __ewiz_model_valid_b3 = 0
capture if scalar(__ewiz_se_abs_b3) <= 1e-12 local __ewiz_model_valid_b3 = 0
capture if scalar(__ewiz_coef_abs_b3) <= 1e-12 & scalar(__ewiz_se_abs_b3) <= 1e-12 local __ewiz_model_valid_b3 = 0
capture local __df_b3 = e(df_r)
capture local __z_b3 = abs(_b[`ivvar'] / _se[`ivvar'])
if !_rc & `__ewiz_model_valid_b3' == 1 {
if "`__df_b3'"=="" | "`__df_b3'"=="." local pmain_b3 = 2*(1-normal(`__z_b3'))
else local pmain_b3 = 2*ttail(`__df_b3', `__z_b3')
}
if `__ewiz_model_valid_b3' == 1 local starsmain_b3 = cond(`pmain_b3'<0.01,"***",cond(`pmain_b3'<0.05,"**",cond(`pmain_b3'<0.1,"*","")))
if `__ewiz_model_valid_b3' == 1 {
capture post `ewiz_sum' ("(3)") (_b[`ivvar']) (_se[`ivvar']) (`pmain_b3') ("`starsmain_b3'") (`N_b3') (`R2_b3')
if _rc post `ewiz_sum' ("(3)") (.) (.) (.) ("") (`N_b3') (`R2_b3')
}
else post `ewiz_sum' ("(3)") (.) (.) (.) ("") (`N_b3') (`R2_b3')
if `__ewiz_model_valid_b3' == 1 {
local __included_vars "`ivvar' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
foreach v of local allvars {
local __is_included = 0
foreach __inc of local __included_vars {
if "`v'" == "`__inc'" local __is_included = 1
}
if `__is_included' == 0 continue
local model_term "`v'"
if "`v'" == "`ivvar'" local model_term "`ivvar'"
capture scalar __ewiz_term_b = _b[`model_term']
capture scalar __ewiz_term_se = _se[`model_term']
if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
local cell_b3_`v' "omitted"
local secell_b3_`v' "(absorbed)"
continue
}
capture local __coef : display %9.4f __ewiz_term_b
capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
if !_rc {
capture local __se : display %9.4f __ewiz_term_se
capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
local __p = .
local __stars ""
capture local __df = e(df_r)
capture local __z = abs(_b[`model_term'] / _se[`model_term'])
if !_rc {
if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
else local __p = 2*ttail(`__df', `__z')
}
if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
local cell_b3_`v' "`__coef'`__stars'"
local secell_b3_`v' "(`__se')"
}
}
}
di "(3): `ivvar'=`coefmain_b3'`starsmain_b3' (se=`semain_b3', p=`pmain_b3'), N=`N_b3', R2=`R2_b3'"
preserve
quietly areg `dvvar' `ivvar' roa lev size growth cashflow tobinq top1 dual board indep soe age i.`fe_time', absorb(`absorb_id') vce(robust)
restore
capture local N_b4 : display %12.0f e(N)
if "`N_b4'"=="" local N_b4 = .
capture local R2_b4 : display %9.4f e(r2)
if "`R2_b4'"=="" local R2_b4 = .
capture scalar __ewiz_main_b_b4 = _b[`ivvar']
capture scalar __ewiz_main_se_b4 = _se[`ivvar']
capture local coefmain_b4 : display %9.4f __ewiz_main_b_b4
capture if abs(__ewiz_main_b_b4) < 0.00005 & __ewiz_main_b_b4 != 0 local coefmain_b4 : display %9.2e __ewiz_main_b_b4
capture local semain_b4 : display %9.4f __ewiz_main_se_b4
capture if abs(__ewiz_main_se_b4) < 0.00005 & __ewiz_main_se_b4 != 0 local semain_b4 : display %9.2e __ewiz_main_se_b4
local pmain_b4 = .
local starsmain_b4 ""
local __ewiz_model_valid_b4 = 1
capture scalar __ewiz_coef_abs_b4 = abs(_b[`ivvar'])
if _rc local __ewiz_model_valid_b4 = 0
capture scalar __ewiz_se_abs_b4 = abs(_se[`ivvar'])
if _rc local __ewiz_model_valid_b4 = 0
capture if scalar(__ewiz_se_abs_b4) <= 1e-12 local __ewiz_model_valid_b4 = 0
capture if scalar(__ewiz_coef_abs_b4) <= 1e-12 & scalar(__ewiz_se_abs_b4) <= 1e-12 local __ewiz_model_valid_b4 = 0
capture local __df_b4 = e(df_r)
capture local __z_b4 = abs(_b[`ivvar'] / _se[`ivvar'])
if !_rc & `__ewiz_model_valid_b4' == 1 {
if "`__df_b4'"=="" | "`__df_b4'"=="." local pmain_b4 = 2*(1-normal(`__z_b4'))
else local pmain_b4 = 2*ttail(`__df_b4', `__z_b4')
}
if `__ewiz_model_valid_b4' == 1 local starsmain_b4 = cond(`pmain_b4'<0.01,"***",cond(`pmain_b4'<0.05,"**",cond(`pmain_b4'<0.1,"*","")))
if `__ewiz_model_valid_b4' == 1 {
capture post `ewiz_sum' ("(4)") (_b[`ivvar']) (_se[`ivvar']) (`pmain_b4') ("`starsmain_b4'") (`N_b4') (`R2_b4')
if _rc post `ewiz_sum' ("(4)") (.) (.) (.) ("") (`N_b4') (`R2_b4')
}
else post `ewiz_sum' ("(4)") (.) (.) (.) ("") (`N_b4') (`R2_b4')
if `__ewiz_model_valid_b4' == 1 {
local __included_vars "`ivvar' dfi_index roa lev size growth cashflow tobinq top1 dual board indep soe age"
foreach v of local allvars {
local __is_included = 0
foreach __inc of local __included_vars {
if "`v'" == "`__inc'" local __is_included = 1
}
if `__is_included' == 0 continue
local model_term "`v'"
if "`v'" == "`ivvar'" local model_term "`ivvar'"
capture scalar __ewiz_term_b = _b[`model_term']
capture scalar __ewiz_term_se = _se[`model_term']
if _rc | missing(__ewiz_term_se) | __ewiz_term_se <= 1e-12 {
local cell_b4_`v' "omitted"
local secell_b4_`v' "(absorbed)"
continue
}
capture local __coef : display %9.4f __ewiz_term_b
capture if abs(__ewiz_term_b) < 0.00005 & __ewiz_term_b != 0 local __coef : display %9.2e __ewiz_term_b
if !_rc {
capture local __se : display %9.4f __ewiz_term_se
capture if abs(__ewiz_term_se) < 0.00005 & __ewiz_term_se != 0 local __se : display %9.2e __ewiz_term_se
local __p = .
local __stars ""
capture local __df = e(df_r)
capture local __z = abs(_b[`model_term'] / _se[`model_term'])
if !_rc {
if "`__df'"=="" | "`__df'"=="." local __p = 2*(1-normal(`__z'))
else local __p = 2*ttail(`__df', `__z')
}
if !_rc local __stars = cond(`__p'<0.01,"***",cond(`__p'<0.05,"**",cond(`__p'<0.1,"*","")))
local cell_b4_`v' "`__coef'`__stars'"
local secell_b4_`v' "(`__se')"
}
}
}
di "(4): `ivvar'=`coefmain_b4'`starsmain_b4' (se=`semain_b4', p=`pmain_b4'), N=`N_b4', R2=`R2_b4'"
postclose `ewiz_sum'
capture file close f_base
file open f_base using "$JOB_DIR/regression_table_基准回归.csv", write replace
file write f_base "变量,(1),(2),(3),(4)" _n
file write f_base ",OLS,OLS + Controls,个体固定效应,个体固定效应 + 时间固定效应" _n
file write f_base "dfi_index,`cell_b1_dfi_index',`cell_b2_dfi_index',`cell_b3_dfi_index',`cell_b4_dfi_index'" _n
file write f_base ",`secell_b1_dfi_index',`secell_b2_dfi_index',`secell_b3_dfi_index',`secell_b4_dfi_index'" _n
file write f_base "roa,`cell_b1_roa',`cell_b2_roa',`cell_b3_roa',`cell_b4_roa'" _n
file write f_base ",`secell_b1_roa',`secell_b2_roa',`secell_b3_roa',`secell_b4_roa'" _n
file write f_base "lev,`cell_b1_lev',`cell_b2_lev',`cell_b3_lev',`cell_b4_lev'" _n
file write f_base ",`secell_b1_lev',`secell_b2_lev',`secell_b3_lev',`secell_b4_lev'" _n
file write f_base "size,`cell_b1_size',`cell_b2_size',`cell_b3_size',`cell_b4_size'" _n
file write f_base ",`secell_b1_size',`secell_b2_size',`secell_b3_size',`secell_b4_size'" _n
file write f_base "growth,`cell_b1_growth',`cell_b2_growth',`cell_b3_growth',`cell_b4_growth'" _n
file write f_base ",`secell_b1_growth',`secell_b2_growth',`secell_b3_growth',`secell_b4_growth'" _n
file write f_base "cashflow,`cell_b1_cashflow',`cell_b2_cashflow',`cell_b3_cashflow',`cell_b4_cashflow'" _n
file write f_base ",`secell_b1_cashflow',`secell_b2_cashflow',`secell_b3_cashflow',`secell_b4_cashflow'" _n
file write f_base "tobinq,`cell_b1_tobinq',`cell_b2_tobinq',`cell_b3_tobinq',`cell_b4_tobinq'" _n
file write f_base ",`secell_b1_tobinq',`secell_b2_tobinq',`secell_b3_tobinq',`secell_b4_tobinq'" _n
file write f_base "top1,`cell_b1_top1',`cell_b2_top1',`cell_b3_top1',`cell_b4_top1'" _n
file write f_base ",`secell_b1_top1',`secell_b2_top1',`secell_b3_top1',`secell_b4_top1'" _n
file write f_base "dual,`cell_b1_dual',`cell_b2_dual',`cell_b3_dual',`cell_b4_dual'" _n
file write f_base ",`secell_b1_dual',`secell_b2_dual',`secell_b3_dual',`secell_b4_dual'" _n
file write f_base "board,`cell_b1_board',`cell_b2_board',`cell_b3_board',`cell_b4_board'" _n
file write f_base ",`secell_b1_board',`secell_b2_board',`secell_b3_board',`secell_b4_board'" _n
file write f_base "indep,`cell_b1_indep',`cell_b2_indep',`cell_b3_indep',`cell_b4_indep'" _n
file write f_base ",`secell_b1_indep',`secell_b2_indep',`secell_b3_indep',`secell_b4_indep'" _n
file write f_base "soe,`cell_b1_soe',`cell_b2_soe',`cell_b3_soe',`cell_b4_soe'" _n
file write f_base ",`secell_b1_soe',`secell_b2_soe',`secell_b3_soe',`secell_b4_soe'" _n
file write f_base "age,`cell_b1_age',`cell_b2_age',`cell_b3_age',`cell_b4_age'" _n
file write f_base ",`secell_b1_age',`secell_b2_age',`secell_b3_age',`secell_b4_age'" _n
file write f_base "核心解释变量口径,dfi_index,dfi_index,dfi_index,dfi_index" _n
file write f_base "估计模型,OLS/FE,OLS/FE,OLS/FE,OLS/FE" _n
file write f_base "控制变量,否,是,是,是" _n
file write f_base "个体固定效应,否,否,是,是" _n
file write f_base "时间固定效应,否,否,否,是" _n
file write f_base "聚类标准误,稳健标准误,稳健标准误,稳健标准误,稳健标准误" _n
file write f_base "N,`N_b1',`N_b2',`N_b3',`N_b4'" _n
file write f_base "R²,`R2_b1',`R2_b2',`R2_b3',`R2_b4'" _n
file close f_base
capture file close f_trip
file open f_trip using "$JOB_DIR/回归结果三线表.csv", write replace
file write f_trip "变量,(1),(2),(3),(4)" _n
file write f_trip ",OLS,OLS + Controls,个体固定效应,个体固定效应 + 时间固定效应" _n
file write f_trip "dfi_index,`cell_b1_dfi_index',`cell_b2_dfi_index',`cell_b3_dfi_index',`cell_b4_dfi_index'" _n
file write f_trip ",`secell_b1_dfi_index',`secell_b2_dfi_index',`secell_b3_dfi_index',`secell_b4_dfi_index'" _n
file write f_trip "roa,`cell_b1_roa',`cell_b2_roa',`cell_b3_roa',`cell_b4_roa'" _n
file write f_trip ",`secell_b1_roa',`secell_b2_roa',`secell_b3_roa',`secell_b4_roa'" _n
file write f_trip "lev,`cell_b1_lev',`cell_b2_lev',`cell_b3_lev',`cell_b4_lev'" _n
file write f_trip ",`secell_b1_lev',`secell_b2_lev',`secell_b3_lev',`secell_b4_lev'" _n
file write f_trip "size,`cell_b1_size',`cell_b2_size',`cell_b3_size',`cell_b4_size'" _n
file write f_trip ",`secell_b1_size',`secell_b2_size',`secell_b3_size',`secell_b4_size'" _n
file write f_trip "growth,`cell_b1_growth',`cell_b2_growth',`cell_b3_growth',`cell_b4_growth'" _n
file write f_trip ",`secell_b1_growth',`secell_b2_growth',`secell_b3_growth',`secell_b4_growth'" _n
file write f_trip "cashflow,`cell_b1_cashflow',`cell_b2_cashflow',`cell_b3_cashflow',`cell_b4_cashflow'" _n
file write f_trip ",`secell_b1_cashflow',`secell_b2_cashflow',`secell_b3_cashflow',`secell_b4_cashflow'" _n
file write f_trip "tobinq,`cell_b1_tobinq',`cell_b2_tobinq',`cell_b3_tobinq',`cell_b4_tobinq'" _n
file write f_trip ",`secell_b1_tobinq',`secell_b2_tobinq',`secell_b3_tobinq',`secell_b4_tobinq'" _n
file write f_trip "top1,`cell_b1_top1',`cell_b2_top1',`cell_b3_top1',`cell_b4_top1'" _n
file write f_trip ",`secell_b1_top1',`secell_b2_top1',`secell_b3_top1',`secell_b4_top1'" _n
file write f_trip "dual,`cell_b1_dual',`cell_b2_dual',`cell_b3_dual',`cell_b4_dual'" _n
file write f_trip ",`secell_b1_dual',`secell_b2_dual',`secell_b3_dual',`secell_b4_dual'" _n
file write f_trip "board,`cell_b1_board',`cell_b2_board',`cell_b3_board',`cell_b4_board'" _n
file write f_trip ",`secell_b1_board',`secell_b2_board',`secell_b3_board',`secell_b4_board'" _n
file write f_trip "indep,`cell_b1_indep',`cell_b2_indep',`cell_b3_indep',`cell_b4_indep'" _n
file write f_trip ",`secell_b1_indep',`secell_b2_indep',`secell_b3_indep',`secell_b4_indep'" _n
file write f_trip "soe,`cell_b1_soe',`cell_b2_soe',`cell_b3_soe',`cell_b4_soe'" _n
file write f_trip ",`secell_b1_soe',`secell_b2_soe',`secell_b3_soe',`secell_b4_soe'" _n
file write f_trip "age,`cell_b1_age',`cell_b2_age',`cell_b3_age',`cell_b4_age'" _n
file write f_trip ",`secell_b1_age',`secell_b2_age',`secell_b3_age',`secell_b4_age'" _n
file write f_trip "核心解释变量口径,dfi_index,dfi_index,dfi_index,dfi_index" _n
file write f_trip "估计模型,OLS/FE,OLS/FE,OLS/FE,OLS/FE" _n
file write f_trip "控制变量,否,是,是,是" _n
file write f_trip "个体固定效应,否,否,是,是" _n
file write f_trip "时间固定效应,否,否,否,是" _n
file write f_trip "聚类标准误,稳健标准误,稳健标准误,稳健标准误,稳健标准误" _n
file write f_trip "N,`N_b1',`N_b2',`N_b3',`N_b4'" _n
file write f_trip "R²,`R2_b1',`R2_b2',`R2_b3',`R2_b4'" _n
file close f_trip
preserve
use `ewiz_summary', clear
export delimited using "$JOB_DIR/regression_table.csv", replace
drop if missing(coef)
if _N == 0 {
capture file close f_base
tempname __ewz_skipfh
file open `__ewz_skipfh' using "$JOB_DIR/regression_table_基准回归.csv", write replace
file write `__ewz_skipfh' "状态,值" _n
file write `__ewz_skipfh' "状态,skipped" _n
file write `__ewz_skipfh' "原因,所有规格估计失败(被解释变量无变异 / 控制变量完美共线 / 样本量不足等),无法生成有效回归表" _n
file close `__ewz_skipfh'
di as error "[regression] all specs degenerate; wrote skipped marker"
}
if _N > 0 {
gen spec_order = _n
gen lb = coef - 1.96 * se
gen ub = coef + 1.96 * se
twoway (rcap ub lb spec_order, lcolor(gs8)) (scatter coef spec_order, mcolor(navy) msymbol(D)), yline(0, lpattern(dash) lcolor(maroon)) xlabel(1 "(1)" 2 "(2)" 3 "(3)" 4 "(4)", angle(45) labsize(small)) xtitle("Specification") ytitle("Coefficient on `ivvar'") title("Specification Comparison")
capture graph export "$JOB_DIR/baseline_coef_plot.png", replace width(1800)
}
restore
di "[EWIZ_STATS] target=dfi_index;coef=`coefmain_b4';se=`semain_b4';pvalue=`pmain_b4';nobs=`N_b4'"
di "回归表输出完成"
log close
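The generated code above repeats one pattern per column: estimate, read the coefficient and standard error of the core variable, and post a row. A compact sketch of that pattern, written as a loop over four specification strings, is given below. It is not the generator's actual implementation; the toy data, the single control roa, and the macro names spec1 to spec4 are assumptions for illustration.

```stata
* Toy data (simulated; names and values are illustrative assumptions)
clear
set seed 12345
set obs 60
gen firm_id = _n
gen alpha = rnormal()
expand 6
bysort firm_id: gen year = 2014 + _n
gen dfi_index = rnormal() + 0.5 * alpha
gen roa = rnormal()
gen ln_patent1 = 0.5 * dfi_index + 0.3 * roa + alpha + rnormal()

* One command string per column, mirroring the four specifications on this page
local controls "roa"
local spec1 "regress ln_patent1 dfi_index, vce(robust)"
local spec2 "regress ln_patent1 dfi_index `controls', vce(robust)"
local spec3 "areg ln_patent1 dfi_index `controls', absorb(firm_id) vce(robust)"
local spec4 "areg ln_patent1 dfi_index `controls' i.year, absorb(firm_id) vce(robust)"

* Collect one summary row per specification
tempfile results
tempname handle
postfile `handle' str8 spec double(coef se N r2) using `results', replace
forvalues j = 1/4 {
    quietly `spec`j''
    post `handle' ("(`j')") (_b[dfi_index]) (_se[dfi_index]) (e(N)) (e(r2))
}
postclose `handle'

use `results', clear
list, noobs
```

Keeping the specifications in macros means a change to the control list is made once, which is the same motivation the walkthrough gives for storing variable lists in macros.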
6. Actual output table
This table is the case output file used by this method page, saved at marketing/method_case_assets/baseline_regression/result.csv.
| Variable | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | OLS | OLS + Controls | Firm FE | Firm FE + time FE |
| dfi_index | 0.5835*** | 0.5709*** | 0.5671*** | 0.5660*** |
| | (0.0426) | (0.0409) | (0.0459) | (0.0460) |
| roa | | 0.3016*** | 0.3059*** | 0.3040*** |
| | | (0.0407) | (0.0446) | (0.0449) |
| lev | | -0.0141 | -0.0509 | -0.0548 |
| | | (0.0436) | (0.0469) | (0.0473) |
| size | | 0.2278*** | 0.2377*** | 0.2375*** |
| | | (0.0403) | (0.0461) | (0.0461) |
| growth | | -0.0462 | -0.0849* | -0.0819 |
| | | (0.0433) | (0.0503) | (0.0509) |
| cashflow | | 0.0441 | 0.0423 | 0.0407 |
| | | (0.0419) | (0.0456) | (0.0456) |
| tobinq | | 0.0335 | 0.0246 | 0.0247 |
| | | (0.0441) | (0.0485) | (0.0488) |
| top1 | | -0.0310 | -0.0170 | -0.0166 |
| | | (0.0450) | (0.0499) | (0.0498) |
| dual | | -0.0025 | -0.0244 | -0.0272 |
| | | (0.0424) | (0.0458) | (0.0459) |
| board | | -0.0155 | -0.0396 | -0.0383 |
| | | (0.0404) | (0.0447) | (0.0447) |
| indep | | -0.0306 | -0.0282 | -0.0309 |
| | | (0.0429) | (0.0488) | (0.0491) |
| soe | | -0.0102 | 0.0015 | 0.0025 |
| | | (0.0435) | (0.0501) | (0.0503) |
| age | | -0.0151 | -0.0502 | -0.0539 |
| | | (0.0423) | (0.0456) | (0.0450) |
| Core explanatory variable | dfi_index | dfi_index | dfi_index | dfi_index |
| Estimator | OLS/FE | OLS/FE | OLS/FE | OLS/FE |
| Controls | No | Yes | Yes | Yes |
| Firm fixed effects | No | No | Yes | Yes |
| Time fixed effects | No | No | No | Yes |
| Standard errors | Robust | Robust | Robust | Robust |
| N | 720 | 720 | 720 | 720 |
| R² | 0.1932 | 0.2824 | 0.4069 | 0.4098 |
Supplementary output
The files below come from the same case run or smoke-test output and supply diagnostics beyond the main table.
regression_table.csv
| spec | coef | se | pvalue | stars | N | R2 |
|---|---|---|---|---|---|---|
| (1) | .5834721605890186 | .042610165 | 4.3467875e-38 | *** | 720 | .19320001 |
| (2) | .5709054170767351 | .040931679 | 0 | *** | 720 | .28240001 |
| (3) | .5670792211042022 | .045913894 | 2.5794169e-31 | *** | 720 | .40689999 |
| (4) | .5660408387964105 | .04600032 | 4.3109577e-31 | *** | 720 | .40979999 |
回归结果三线表.csv
| Variable | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | OLS | OLS + Controls | Firm FE | Firm FE + time FE |
| dfi_index | 0.5835*** | 0.5709*** | 0.5671*** | 0.5660*** |
| | (0.0426) | (0.0409) | (0.0459) | (0.0460) |
| roa | | 0.3016*** | 0.3059*** | 0.3040*** |
| | | (0.0407) | (0.0446) | (0.0449) |
| lev | | -0.0141 | -0.0509 | -0.0548 |
| | | (0.0436) | (0.0469) | (0.0473) |
| size | | 0.2278*** | 0.2377*** | 0.2375*** |
| | | (0.0403) | (0.0461) | (0.0461) |
| growth | | -0.0462 | -0.0849* | -0.0819 |
| | | (0.0433) | (0.0503) | (0.0509) |
| cashflow | | 0.0441 | 0.0423 | 0.0407 |
| | | (0.0419) | (0.0456) | (0.0456) |
| tobinq | | 0.0335 | 0.0246 | 0.0247 |
| | | (0.0441) | (0.0485) | (0.0488) |
| top1 | | -0.0310 | -0.0170 | -0.0166 |
| | | (0.0450) | (0.0499) | (0.0498) |
| dual | | -0.0025 | -0.0244 | -0.0272 |
| | | (0.0424) | (0.0458) | (0.0459) |
| board | | -0.0155 | -0.0396 | -0.0383 |
| | | (0.0404) | (0.0447) | (0.0447) |
| indep | | -0.0306 | -0.0282 | -0.0309 |
| | | (0.0429) | (0.0488) | (0.0491) |
| soe | | -0.0102 | 0.0015 | 0.0025 |
| | | (0.0435) | (0.0501) | (0.0503) |
| age | | -0.0151 | -0.0502 | -0.0539 |
| | | (0.0423) | (0.0456) | (0.0450) |
| Core explanatory variable | dfi_index | dfi_index | dfi_index | dfi_index |
| Estimator | OLS/FE | OLS/FE | OLS/FE | OLS/FE |
| Controls | No | Yes | Yes | Yes |
| Firm fixed effects | No | No | Yes | Yes |
| Time fixed effects | No | No | No | Yes |
| Standard errors | Robust | Robust | Robust | Robust |
| N | 720 | 720 | 720 | 720 |
| R² | 0.1932 | 0.2824 | 0.4069 | 0.4098 |
7. Case figure
This is an in-page diagnostic figure generated from the same case data.

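8. How to write it up in the paper
This paper reports the baseline regression on the shared firm panel sample; the core output is in regression_table_基准回归.csv. When interpreting the results, consider the sample definition, variable construction, coefficient sign, standard errors, and applicability conditions together, rather than choosing a method on the strength of a single p-value.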
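9. Checklist
- Confirm that the dependent variable, core explanatory variable, and controls used on this page match the paper's main model.
- Check the sample definition in the table first, then the coefficients, p-values, or diagnostic statistics.
- The output file names in the code must correspond to the result tables shown on the page.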