Source: OpenReview | June 28, 2026 | Importance: 4/5

Contrast-Induced Class Overlap as a Fairness Bottleneck in Dermatological AI: Evidence from HAM10000

Abstract

AI skin cancer triage systems generate about 106 excess unnecessary referrals per 1,000 darker-skin patients due to over-prediction, not missed cancers. This over-prediction stems from melanin concentration reducing lesion-background optical contrast, causing class overlap. The authors formalize this as a signal-to-noise ratio (SNR) framework, predicting a 5.2× SNR reduction from lighter to darker skin tones. Experiments on the HAM10000 dataset with a high-confidence ITA subset show dark skin achieves slightly higher sensitivity (0.848 vs. 0.821) but substantially lower specificity (0.720 vs. 0.831, Δ=−11.1pp). An ablation study compares ITA-based tone conditioning (feature calibration) and dark-skin augmentation (decision boundary adjustment), revealing their distinct effects. Zero-shot transfer to the DDI dataset (n=656) confirms the AUC gap. Code and trained weights are publicly released.
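The link between a specificity gap and excess referrals can be illustrated with a little arithmetic. The sketch below is not the authors' derivation: the function name `excess_referrals_per_1000` is made up for illustration, and the 5% malignancy prevalence is a hypothetical value, not a figure from the paper. Only the two specificity values come from the abstract.

```python
def excess_referrals_per_1000(spec_reference: float,
                              spec_comparison: float,
                              prevalence: float) -> float:
    """Extra false-positive referrals per 1,000 patients in the comparison group.

    Every benign case that is flagged becomes an unnecessary referral, so the
    excess is the gap in false-positive rate times the number of benign cases.
    """
    benign_per_1000 = 1000.0 * (1.0 - prevalence)
    false_positive_rate_gap = (1.0 - spec_comparison) - (1.0 - spec_reference)
    return benign_per_1000 * false_positive_rate_gap


# Specificities reported in the abstract (light vs. dark skin).
light_specificity = 0.831
dark_specificity = 0.720

# ASSUMPTION: illustrative prevalence only; the paper's value is not given here.
assumed_prevalence = 0.05

excess = excess_referrals_per_1000(light_specificity, dark_specificity,
                                   assumed_prevalence)
print(f"Excess referrals per 1,000 darker-skin patients: {excess:.2f}")
```

Because sensitivity is nearly equal across groups, the disparity in this sketch comes entirely from false positives, which matches the abstract's framing of over-prediction rather than missed cancers.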

Key points

AI triage systems generate about 106 excess unnecessary referrals per 1,000 darker-skin patients because of over-prediction, not under-detection.
Melanin concentration systematically reduces lesion-background optical contrast, leading to class overlap and a predicted 5.2× SNR reduction from light to dark skin.
On a high-confidence ITA subset of HAM10000, dark-skin specificity is 11.1pp lower than light-skin specificity (0.720 vs. 0.831), while sensitivity is slightly higher (0.848 vs. 0.821).
An ablation study contrasts ITA-based tone conditioning (feature calibration) with dark-skin augmentation (decision boundary adjustment), showing that the two have distinct effects.
Zero-shot transfer to the DDI dataset (n=656) confirms the AUC gap and score suppression; all code and trained weights are publicly released.
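
The skin-tone grouping above relies on the Individual Typology Angle (ITA), which is computed from CIELAB lightness L* and the yellow-blue component b* as ITA = arctan((L* − 50) / b*) × 180/π. A minimal sketch follows, assuming L* and b* have already been measured on non-lesion skin. The bin edges are the commonly cited ITA categories, and the function names are illustrative; the summary does not state which thresholds or confidence filter the authors used.

```python
import math


def ita_degrees(l_star: float, b_star: float) -> float:
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    atan2 equals arctan((L* - 50) / b*) for the positive b* values typical
    of skin, and avoids a division by zero when b* is 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def ita_category(ita: float) -> str:
    """Map an ITA value to the commonly cited six-level scale.

    ASSUMPTION: these thresholds are the widely used ones, not necessarily
    the paper's.
    """
    if ita > 55:
        return "very light"
    if ita > 41:
        return "light"
    if ita > 28:
        return "intermediate"
    if ita > 10:
        return "tan"
    if ita > -30:
        return "brown"
    return "dark"


# Hypothetical L*, b* measurements, for illustration only.
for l_star, b_star in [(70.0, 15.0), (50.0, 10.0), (30.0, 20.0)]:
    angle = ita_degrees(l_star, b_star)
    print(f"L*={l_star}, b*={b_star}: ITA={angle:.2f} deg, {ita_category(angle)}")
```

Lower ITA values correspond to darker skin, so a "dark" group is typically defined by an upper bound on ITA.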
