Jun-Hyun Bae*, Taewon Park*, Minho Lee
Kyungpook National University
* Equal Contribution
Abstract
The ability to perform associative reasoning, even over unfamiliar combinations of learned components, is necessary for human-level artificial intelligence. However, conventional memory-augmented neural networks (MANNs) show degraded performance on systematically different data because they are not designed for systematic generalization. In this work, we propose a novel MANN architecture that explicitly learns recomposable representations through a modular structure of RNNs. Our method binds learned representations with a Tensor Product Representation (TPR) to make their associations explicit and stores these associations in a TPR-based external memory. To demonstrate the effectiveness of our approach, we also introduce a new benchmark for evaluating systematic generalization in associative reasoning, in which the combinations of words differ systematically between training and test data. Experimental results show that our method achieves superior test accuracy on systematically different data compared to other models. Furthermore, we validate the TPR-based models by analyzing whether the learned representations exhibit symbolic properties.
Overview
To address the failure of conventional MANNs on systematically different test data, we propose a novel architecture that combines a modular encoder with a TPR-based external memory.
- Modular encoding: the input is encoded by \(N\) independent modules of Recurrent Independent Mechanisms (RIMs) that compete for activation, learning recomposable representations.
- TPR binding: a Tensor Product Representation algebraically binds each role to its filler: \(T = \sum_{k=1}^N \mathbf{r}_k \otimes \mathbf{f}_k\)
- Memory-based recall: the bound associations are stored in a TPR-based external memory, enabling systematic reasoning even over unseen combinations.
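The binding and recall steps above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: it assumes idealized orthonormal role vectors, whereas the model has to learn such representations.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 3, 8   # number of role/filler pairs and vector dimension (illustrative)

# Orthonormal roles make unbinding exact. This is an idealized assumption;
# in the model, representations with this property must be learned.
roles = np.linalg.qr(rng.normal(size=(d, d)))[0][:, :N].T   # (N, d), orthonormal rows
fillers = rng.normal(size=(N, d))

# Bind: T = sum_k r_k (outer product) f_k, superposing all pairs in one tensor.
T = sum(np.outer(r, f) for r, f in zip(roles, fillers))

# Unbind: with orthonormal roles, T^T r_k recovers f_k exactly.
recovered = T.T @ roles[1]
assert np.allclose(recovered, fillers[1])
```

Because the pairs are stored as a superposition, recall works for any role/filler pairing, including ones never seen together during training, which is the property the memory exploits.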
Method
Conventional memory-augmented neural networks (MANNs) suffer sharp performance drops on test data that are systematically different from the training data. We combine a modular RNN encoder with a TPR-based external memory to achieve systematic generalization.
Key components:
- Recurrent Independent Mechanisms (RIMs): \(N\) RNN modules learn mutually independent encoding mechanisms through competitive learning
- Tensor Product Representation (TPR): associations are bound algebraically as the tensor product of roles and fillers: \(T = \sum_{k=1}^N \mathbf{r}_k \otimes \mathbf{f}_k\)
- TPR-based external memory: role and filler representations extracted at each time step are superposed into memory
- Systematic Associative Recall (SAR): a new benchmark we propose for evaluating systematic generalization
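The competitive-learning idea behind RIMs can be sketched as follows: at each step, every module "bids" for the input, and only the top-k winners update their state while the rest keep theirs unchanged. All sizes and parameter names below are illustrative assumptions, not the paper's hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the real model's hyperparameters are not specified here.
N, k = 4, 2           # N independent modules, top-k winners update per step
d_in, d_h = 8, 16     # input and hidden dimensions

# Hypothetical per-module parameters.
W_score = rng.normal(size=(N, d_in)) * 0.5    # each module's "bid" for the input
W_in = rng.normal(size=(N, d_h, d_in)) * 0.1  # input-to-hidden weights
W_h = rng.normal(size=(N, d_h, d_h)) * 0.1    # recurrent weights

def step(x, h):
    """One competitive encoding step: the k highest-scoring modules
    read the input and update their state; the others stay fixed."""
    scores = W_score @ x                  # (N,) relevance of x to each module
    winners = np.argsort(scores)[-k:]
    h = h.copy()
    for n in winners:
        h[n] = np.tanh(W_in[n] @ x + W_h[n] @ h[n])
    return h, winners

h = np.zeros((N, d_h))
x = rng.normal(size=d_in)
h, winners = step(x, h)
# Exactly k modules changed; the rest keep their independent, untouched state.
changed = [n for n in range(N) if np.any(h[n] != 0)]
assert sorted(changed) == sorted(winners.tolist())
```

Keeping losing modules frozen is what makes the mechanisms independent: each module only specializes on the inputs it wins, which is the source of the recomposable representations fed into the TPR memory.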
Results
Quantitative
Training and test accuracy of DNC, FWM, and our method on the SAR task. DNC and FWM degrade sharply on systematically different data (the "test different" split), whereas our model successfully achieves systematic generalization.
| Model | Test Accuracy |
|---|---|
| LSTM | 80.88% |
| Transformer-XL | 87.66% |
| Meta-learned Neural Memory | 88.97% |
| Fast Weight Memory (FWM) | 96.75% |
| FWM (our trial) | 94.94% |
| Ours | 96.63% |
On the large-scale question answering task (catbAbI), our method performs on par with FWM, confirming the general effectiveness of the modular encoder.
Analysis
We verify that the learned representations have the desired symbolic properties. Analyzing the similarity between role vectors and unbinding vectors shows that FWM's vectors are far from orthogonal, while our method's are almost perfectly orthogonal.
Similarity matrices between role vectors and unbinding vectors for (a) FWM and (b) our method. FWM's vectors are not orthogonal, whereas ours are nearly perfectly orthogonal, confirming that our model learns proper symbolic representations.
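The analysis above amounts to inspecting a pairwise cosine-similarity matrix: a near-identity matrix means each unbinding vector aligns with exactly one role and is orthogonal to all others. The sketch below illustrates this check on synthetic stand-in vectors, not the trained model's actual weights.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 6, 6   # illustrative number of role vectors and their dimension

# Stand-ins for the trained model's vectors (not the paper's actual weights):
# a near-orthonormal role set whose unbinding vectors almost match it.
roles = np.linalg.qr(rng.normal(size=(d, d)))[0][:n]       # orthonormal rows
unbinding = roles + 0.01 * rng.normal(size=roles.shape)    # slightly perturbed

def cosine_matrix(A, B):
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

S = cosine_matrix(roles, unbinding)
# A near-identity matrix: each unbinding vector matches exactly one role and
# is (almost) orthogonal to every other, i.e. the symbolic property above.
assert np.allclose(S, np.eye(n), atol=0.05)
```

The same check applied to non-orthogonal vectors (as in FWM) would produce large off-diagonal entries, meaning unbinding mixes fillers from multiple roles.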
Analyzing the consistency of read vectors for the same target object shows that our method produces the same output regardless of the combination it appears in.
Similarity between read vectors for the same target object, for (a) FWM and (b) our method. With our method, the read outputs are nearly identical regardless of the combination, showing that the model performs systematic associative reasoning.
BibTeX
@inproceedings{bae2022learning,
author = {Bae, Jun-Hyun and Park, Taewon and Lee, Minho},
title = {Learning Associative Reasoning Towards Systematicity Using Modular Networks},
booktitle = {International Conference on Neural Information Processing (ICONIP)},
year = {2022},
publisher = {Springer},
doi = {10.1007/978-3-031-30108-7_10}
}