Adaptive Bias Discovery for Learning Debiased Classifier

ACCV 2024
Jun-Hyun Bae, Minho Lee, Heechul Jung
Kyungpook National University

Abstract

Training deep neural networks with empirical risk minimization (ERM) often captures dataset biases, hindering generalization to new or unseen data. Previous solutions either require prior knowledge of biases or train intentionally biased models as auxiliaries; however, both still struggle when multiple biases are present. To address this, we introduce Adaptive Bias Discovery (ABD), a novel learning framework designed to mitigate the impact of multiple unknown biases. ABD trains an auxiliary model that adapts to biases starting from the debiased parameters of the debiasing phase, allowing it to navigate through multiple biases. Samples are then reweighted based on the discovered biases to update the debiased parameters. Extensive evaluations on synthetic and real-world datasets demonstrate that ABD consistently outperforms existing methods, particularly in real-world applications where multiple unknown biases are prevalent.

Overview

We propose a learning framework that sequentially discovers and removes multiple biases present in the data, without any prior bias information.

Bias-adapted model: a bias-sensitive auxiliary model $f_\phi$ is generated from the debiased parameters $\theta$ by one step of gradient descent.
Adaptive group formation: the predictions of $f_\phi$ split the data into a bias-aligned group ($G^\odot$) and a bias-conflicting group ($G^\otimes$).
Iterative debiasing: group DRO minimizes the worst-case group loss, and once $\theta$ becomes robust to one bias, $\phi$ naturally discovers the next one.

ABD Framework

Overview of the ABD framework, illustrated with two biases (Bias1, Bias2) and two training steps.

Method

Models trained with ERM readily pick up spurious correlations in the data, which degrades generalization. Existing methods either require bias information in advance or can handle only a single bias.

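ABD consists of two phases. First, one step of gradient descent from the debiased parameters $\theta$ yields the bias-adapted parameters $\phi = \theta - \alpha \nabla_\theta \mathcal{L}(f_\theta)$, where $\alpha$ is the step size. Because $f_\phi$ responds sensitively to superficial patterns in the data, its predictions are used to split the data into a bias-aligned group ($G^\odot$) and a bias-conflicting group ($G^\otimes$). Then $\theta$ is updated with group DRO so as to minimize the loss of the worst-case group.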
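A single step of this procedure can be sketched in plain Python on a toy logistic regression. This is an illustration under stated assumptions, not the authors' implementation: the model, the toy data, the step sizes `alpha` and `lr`, the rule that places a sample in $G^\otimes$ when $f_\phi$ misclassifies it, and the hard maximum over the two groups (in place of a full group DRO weighting scheme) are all choices made here for brevity.

```python
import math

def predict_proba(w, x):
    """Logistic-regression probability of class 1."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, data):
    """Mean binary cross-entropy over (x, y) pairs."""
    total = 0.0
    for x, y in data:
        p = min(max(predict_proba(w, x), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def mean_grad(w, data):
    """Gradient of the mean cross-entropy with respect to w."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = predict_proba(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(data)
    return grad

def abd_step(theta, data, alpha=1.0, lr=0.5):
    """One illustrative step: adapt phi, form groups, update theta."""
    # Phase 1: phi = theta - alpha * grad L(f_theta), a one-step ERM update.
    phi = [t - alpha * g for t, g in zip(theta, mean_grad(theta, data))]

    # Group formation (assumed rule): samples f_phi classifies correctly
    # are treated as bias-aligned, the rest as bias-conflicting.
    aligned, conflicting = [], []
    for x, y in data:
        pred = 1 if predict_proba(phi, x) >= 0.5 else 0
        (aligned if pred == y else conflicting).append((x, y))

    # Phase 2 (simplified group DRO): descend on the group whose loss
    # under theta is currently the largest.
    groups = [g for g in (aligned, conflicting) if g]
    worst = max(groups, key=lambda g: mean_loss(theta, g))
    new_theta = [t - lr * g for t, g in zip(theta, mean_grad(theta, worst))]
    return new_theta, phi, aligned, conflicting

# Toy data: x = (weak core feature, strong spurious feature), label y.
# The spurious feature agrees with the label in 6 of 8 samples.
DATA = (
    [((0.1, 1.0), 1)] * 3 + [((-0.1, -1.0), 0)] * 3
    + [((0.1, -1.0), 1), ((-0.1, 1.0), 0)]
)

theta0 = [0.0, 0.1]
theta1, phi, aligned, conflicting = abd_step(theta0, DATA)
print(len(aligned), len(conflicting))
print(theta0, "->", theta1)
```

On this toy data the one-step model `phi` leans on the spurious feature, so the two samples where that feature disagrees with the label land in the bias-conflicting group, and the update on that group shifts weight in `theta` from the spurious feature toward the core feature.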

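The key point is that $\phi$ is regenerated from $\theta$ at every step. Once $\theta$ becomes robust to the first bias, $\phi$ naturally captures the next most salient bias. This MAML-like structure makes it possible to discover and remove multiple biases sequentially without prior bias information.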
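Written as a recursion, the alternation looks roughly as follows. The step index $t$, the outer learning rate $\eta$, and the hard maximum over the two groups are notational assumptions introduced here; group DRO is often implemented with adaptively updated group weights rather than a hard maximum, and the groups are treated as fixed when differentiating with respect to $\theta$.

$$
\phi_t = \theta_t - \alpha \nabla_\theta \mathcal{L}(f_{\theta_t}), \qquad
\theta_{t+1} = \theta_t - \eta \, \nabla_\theta \max_{G \in \{G_t^\odot,\, G_t^\otimes\}} \mathcal{L}_G(f_{\theta_t})
$$

Here $G_t^\odot$ and $G_t^\otimes$ are the groups induced by the predictions of $f_{\phi_t}$, and $\mathcal{L}_G$ is the loss averaged over group $G$. Because $\phi_t$ is always recomputed from the current $\theta_t$, the grouping changes as $\theta$ changes, which is what lets a second bias surface once the first has been suppressed.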

The GradCAM visualization below shows the attention of the biased model $f_\phi$ shifting to different regions as training progresses, which confirms that ABD adaptively discovers different biases during training.

Biased Model Evolution

GradCAM visualizations of the ERM model and ABD's biased model $f_\phi$. As training steps progress, the attention of $f_\phi$ shifts to different bias features.

Results

Colored MNIST

OoD test accuracy (%) under two bias settings: Color only, and Color and Patch present at the same time.

Algorithm	Color (OoD)	Color & Patch (OoD)
ERM	16.4	14.0
IRM	66.9	13.4
Group DRO	13.6	14.1
PI	70.2	15.3
ABD (Ours)	70.7	62.3
Optimal	75.0	75.0

PI discovers only the most dominant bias (Color), whereas ABD discovers multiple biases sequentially, first Color and then Patch.

PI Baseline

Within-group Pearson correlation coefficients for PI. PI discovers only the Color bias and fails to capture Patch.

Bias Discovery - ABD

Visualization of within-group Pearson correlation coefficients for ABD. As training progresses, it discovers the biases in order, first Color and then Patch.
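A within-group Pearson analysis of this kind can be sketched as below. The page does not say exactly which quantities are correlated, so this sketch assumes the correlation is taken between the label and each bias attribute inside each group; the field names `label`, `color`, and `patch` and the toy samples are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

def within_group_correlations(samples, group_ids, attributes):
    """Correlate the label with each attribute separately inside each group."""
    result = {}
    for g in sorted(set(group_ids)):
        members = [s for s, gid in zip(samples, group_ids) if gid == g]
        labels = [s["label"] for s in members]
        result[g] = {
            a: pearson(labels, [s[a] for s in members]) for a in attributes
        }
    return result

# Hypothetical toy split: the grouping separates samples by whether
# color agrees with the label, while patch is unrelated to the split.
SAMPLES = [
    {"label": 0, "color": 0, "patch": 0},
    {"label": 1, "color": 1, "patch": 1},
    {"label": 0, "color": 0, "patch": 1},
    {"label": 1, "color": 1, "patch": 0},
    {"label": 0, "color": 1, "patch": 0},
    {"label": 1, "color": 0, "patch": 1},
    {"label": 0, "color": 1, "patch": 1},
    {"label": 1, "color": 0, "patch": 0},
]
GROUP_IDS = ["aligned"] * 4 + ["conflicting"] * 4

corr = within_group_correlations(SAMPLES, GROUP_IDS, ["color", "patch"])
for group, values in corr.items():
    print(group, values)
```

In this toy split the label-color correlation is +1 in one group and -1 in the other while the label-patch correlation is 0 in both, which is the signature of a grouping that has captured the Color bias but not Patch.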

Real-World Tasks

CivilComments (worst-case acc.), MultiNLI (worst-case acc.), Camelyon17 (OoD acc.), FMoW (worst-region acc.).

Algorithm	CivilComments	MultiNLI	Camelyon17	FMoW
ERM	56.0	61.8	70.3	32.3
Group DRO	70.0	62.7	68.4	30.8
JTT	69.3	63.2	63.8	33.4
PI	61.1	61.5	71.7	31.2
LISA	—	—	77.1	35.5
ABD (Ours)	71.1	67.1	81.1	34.1
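For readers unfamiliar with worst-case metrics, the snippet below sketches how a worst-group accuracy is typically computed: accuracy is measured per group and the minimum is reported. The group identifiers and predictions are hypothetical, and the page does not specify how groups are defined for each benchmark.

```python
from collections import defaultdict

def worst_group_accuracy(preds, labels, groups):
    """Return (worst group id, per-group accuracy dict)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    per_group = {g: correct[g] / total[g] for g in total}
    worst = min(per_group, key=per_group.get)
    return worst, per_group

# Hypothetical predictions for two groups "a" and "b".
preds = [1, 1, 0, 0, 1, 0]
labels = [1, 1, 0, 1, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]

worst, per_group = worst_group_accuracy(preds, labels, groups)
average = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
print(worst, per_group, average)
```

Here the average accuracy is 4/6 while group "b" reaches only 1/3, which shows why a model can look acceptable on average and still fail badly on the group that a bias works against.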

MultiNLI Analysis

Change in the bias composition of the misclassified group $G^\otimes$ on MultiNLI. After the Negation bias is discovered, the Overlap bias gradually emerges.

GradCAM visualization on MetaShift test data. ERM relies on the background, whereas ABD focuses on the target object.

BibTeX

@InProceedings{Bae_2024_ACCV,
  author    = {Bae, Jun-Hyun and Lee, Minho and Jung, Heechul},
  title     = {Adaptive Bias Discovery for Learning Debiased Classifier},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {3074-3090}
}