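IPUMS USA harmonizes U.S. decennial census, ACS, and selected historical samples into microdata that are comparable across years. It is not a single-year dataset but an extract platform with consistent variable names, labels, and sample weights, suited to cross-year research on labor, income, education, immigration, housing, and population structure.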
harmonized extracts, rolling · updated: rolling

IPUMS USA (Harmonized Census and ACS Microdata) registers 18 variables: 3 commonly used as dependent variables, 3 as core regressors, and 13 as controls (some variables serve more than one role). Common research directions for this dataset include returns to education and wage income, and group differences in employment.
Returns to education and wage income: DV INCWAGE, IV EDUC, controls AGE, SEX, RACE.
Group differences in employment: DV EMPSTAT, IV EDUC, SEX, RACE, controls AGE, STATEFIP, YEAR.
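As a minimal sketch of how the first design might be prepared before estimation, the Python snippet below builds an analysis sample for a log-wage regression of INCWAGE on EDUC with AGE, SEX, and RACE as controls. The special codes 999999 and 999998, the 25 to 64 age window, and the helper name `wage_sample` are assumptions for illustration, not part of this codebook; check the codes against your own extract.

```python
import math

# Assumed IPUMS special codes for INCWAGE (N/A and missing).
# Verify them in the codebook shipped with your extract.
INCWAGE_NA = 999999
INCWAGE_MISSING = 999998


def wage_sample(rows, min_age=25, max_age=64):
    """Keep persons in the age window with valid, positive wage income.

    Adds ln_incwage, the log of INCWAGE, as the dependent variable.
    The age window is an illustrative choice, not an IPUMS rule.
    """
    kept = []
    for row in rows:
        wage = row["INCWAGE"]
        if wage in (INCWAGE_NA, INCWAGE_MISSING) or wage <= 0:
            continue  # drop special codes and zero or negative wages
        if not (min_age <= row["AGE"] <= max_age):
            continue  # drop persons outside the age window
        kept.append({**row, "ln_incwage": math.log(wage)})
    return kept


# Made-up records using variable names from this codebook.
rows = [
    {"INCWAGE": 52000, "EDUC": 10, "AGE": 40, "SEX": 2, "RACE": 1},
    {"INCWAGE": 0, "EDUC": 6, "AGE": 30, "SEX": 1, "RACE": 1},
    {"INCWAGE": 999999, "EDUC": 2, "AGE": 12, "SEX": 1, "RACE": 2},
    {"INCWAGE": 30000, "EDUC": 6, "AGE": 70, "SEX": 2, "RACE": 1},
]
sample = wage_sample(rows)
```

Only the first record survives here: the others are dropped for zero wages, a special code, or age. Top-coding and inflation adjustment are not handled in this sketch.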
| item | value |
|---|---|
| weight (pweight) | PERWT |
| source | IPUMS USA variable documentation |
Stata svyset:
svyset _n [pweight=PERWT]
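Note: use PERWT for person-level analyses and HHWT for household-level analyses. Some IPUMS extracts can also include STRATA / CLUSTER or replicate weights, depending on the variables selected for the extract.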
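The difference between the two weights can be sketched in a few lines of Python. The example computes a PERWT-weighted mean of AGE over persons, then collapses to one record per household, keyed on SAMPLE and SERIAL, and computes an HHWT-weighted mean household size. The records, the sample code 202201, and the helper name `weighted_mean` are made up for illustration; this produces point estimates only and does not replace survey-design variance estimation.

```python
def weighted_mean(values, weights):
    """Return sum(v * w) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up person records; HHWT repeats on every person in a household.
persons = [
    {"SAMPLE": 202201, "SERIAL": 1, "PERNUM": 1, "PERWT": 100, "HHWT": 90, "AGE": 40},
    {"SAMPLE": 202201, "SERIAL": 1, "PERNUM": 2, "PERWT": 300, "HHWT": 90, "AGE": 20},
    {"SAMPLE": 202201, "SERIAL": 2, "PERNUM": 1, "PERWT": 100, "HHWT": 110, "AGE": 60},
]

# Person-level estimate: weight each person by PERWT.
mean_age = weighted_mean(
    [p["AGE"] for p in persons],
    [p["PERWT"] for p in persons],
)

# Household-level estimate: one record per household, weighted by HHWT.
households = {}
for p in persons:
    key = (p["SAMPLE"], p["SERIAL"])
    if key not in households:
        households[key] = {"HHWT": p["HHWT"], "size": 0}
    households[key]["size"] += 1

mean_household_size = weighted_mean(
    [h["size"] for h in households.values()],
    [h["HHWT"] for h in households.values()],
)
```

Weighting the household statistic by PERWT, or counting HHWT once per person, would overrepresent larger households, which is why the collapse to one row per household comes first.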
| name | label | type/role | data file | description · keywords |
|---|---|---|---|---|
| YEAR | Census / survey year | identifier / time | extract | Sample year; the core time variable for cross-year analysis · year |
| SAMPLE | IPUMS sample identifier | identifier / identifier | extract | IPUMS sample identifier; distinguishes ACS, census, and historical samples · sample |
| SERIAL | Household serial number | identifier / identifier | extract | Household number; identifies a person together with PERNUM · household, id |
| PERNUM | Person number within household | identifier / identifier | extract | Person number within the household · person, id |
| HHWT | Household weight | continuous / control | extract | Household-level weight; required for estimates of household and housing variables · weight, household weight |
| PERWT | Person weight | continuous / control | extract | Person-level weight; required for estimates of person-level income, education, employment, and similar variables · weight, person weight |
| STATEFIP | State FIPS code | categorical / control | extract | State code; cross-year analyses should check state boundaries and the geographic levels identifiable in each sample · state, fips |
| AGE | Age | continuous / control | extract | Age; top-coded in some samples · age |
| SEX | Sex | binary / control | extract | IPUMS harmonized sex variable · sex, gender |
| RACE | Race | categorical / control | extract | Race classification harmonized across years; detailed definitions vary by historical period · race |
| HISPAN | Hispanic origin | categorical / control | extract | Hispanic origin harmonized variable · hispanic, ethnicity |
| EDUC | Educational attainment | ordinal / iv,control | extract | Educational attainment categories; comparable across years, but check the IPUMS comparability notes for what each category means · education |
| EDUCD | Detailed educational attainment | ordinal / iv,control | extract | Detailed educational attainment categories; suited to building dummy variables such as high school / bachelor's / graduate degree · education, detailed |
| EMPSTAT | Employment status | categorical / dv,control | extract | Employed, unemployed, and not-in-labor-force status · employment |
| LABFORCE | Labor force status | binary / control | extract | Whether the person is in the labor force; often used as a sample filter for employment / wage analyses · labor force |
| INCWAGE | Wage and salary income | continuous / dv | extract | Wage and salary income; requires handling of zero income, top-codes, inflation adjustment, and sample selection · wage, income |
| FTOTINC | Family total income | continuous / dv,control | extract | Family total income; cross-year analyses should deflate by the CPI and handle top-codes · family income |
| UHRSWORK | Usual hours worked per week | continuous / iv,control | extract | Usual hours worked per week; commonly used to construct wage rates · hours, hours worked |
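Two of the table's notes lend themselves to a short Python sketch: identifying a person by combining identifiers, and building a wage rate from INCWAGE and UHRSWORK. The helper names `person_key` and `real_hourly_wage` are hypothetical. Adding SAMPLE to the key is an assumption for pooled extracts, since SERIAL is only meaningful within a sample. The `weeks_worked` and `deflator` arguments are also assumptions: neither a weeks-worked variable nor a price index is registered in this codebook, so both must come from elsewhere in your extract.

```python
def person_key(row):
    """Return a tuple identifying one person across pooled samples."""
    return (row["SAMPLE"], row["SERIAL"], row["PERNUM"])


def real_hourly_wage(incwage, uhrswork, weeks_worked, deflator=1.0):
    """Convert annual wage income to a deflated hourly wage.

    incwage      annual wage and salary income (special codes recoded first)
    uhrswork     usual hours worked per week
    weeks_worked weeks worked in the reference year (not in this codebook)
    deflator     multiplier to constant dollars (1.0 leaves nominal dollars)

    Returns None when the wage rate is undefined.
    """
    annual_hours = uhrswork * weeks_worked
    if incwage <= 0 or annual_hours <= 0:
        return None
    return incwage * deflator / annual_hours


# Made-up record using variable names from this codebook.
row = {"SAMPLE": 202201, "SERIAL": 15, "PERNUM": 2, "INCWAGE": 52000, "UHRSWORK": 40}

key = person_key(row)
wage = real_hourly_wage(row["INCWAGE"], row["UHRSWORK"], weeks_worked=52)
```

Returning None rather than 0 for undefined wage rates keeps those persons distinguishable from genuine low earners when the sample is filtered later.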
The codebook is an open lookup. The Wizard consumes the same codebook to drive code generation — upload your data and the system auto-recognises variables, applies cleaning rules, recommends research designs, runs regressions and produces a Word report.