Relational Graph Attention Network for Aspect-based Sentiment Analysis

2021. 7. 2. 12:00

Relational Graph Attention Network for Aspect-based Sentiment Analysis

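Recent work tackles aspect-based sentiment analysis (ABSA) with attention-based neural networks.
- However, the complexity of language and the presence of multiple aspects in a single sentence can confuse the model about which opinion words connect to which aspect.
This paper addresses the problem by encoding syntactic information effectively.
- It defines an aspect-oriented dependency tree, rooted at the target aspect, by reshaping and pruning an ordinary dependency tree.
- It proposes R-GAT to encode the new tree structure for sentiment prediction.
The authors show that this method captures the connection between aspects and opinion words better.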

1. Introduction

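Because two aspects in one sentence can carry different sentiments, applying a sentence-level sentiment polarity is inappropriate.
Attention-based architectures tend to attend more to nearby words, which can lead to wrong results.
There have been many attempts to exploit syntactic structure, and dependency-based parses provide comprehensive syntactic information.
- Recent work combines GNNs with dependency trees.
These approaches have the following drawbacks.
1. The dependency relations, which can indicate the connection between aspects and opinion words, are ignored.
2. Only a small part of the tree is needed, not the whole tree structure.
3. The encoding process is tree-dependent, which makes batch operations inconvenient.
This paper revisits syntax information and argues that discovering task-related syntactic structures is essential.
- It therefore proposes a novel aspect-oriented dependency tree structure, built in three steps.
  - Obtain the dependency tree of the sentence with an ordinary parser.
  - Reshape the dependency tree so that the target aspect in question becomes the root.
  - Prune the tree, keeping only the edges directly connected to the aspect.
This unified tree structure focuses on the relationship between the aspect and potential opinion words, and it also allows batch and parallel operations.
The paper also proposes R-GAT (relational graph attention network) to encode the new tree.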

2. Related Work

Skipped in this summary.

3. Aspect-Oriented Dependency Tree

3.1 Aspect, Attention and Syntax

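Syntactic structure can be obtained through dependency parsing, the task of generating a dependency tree that represents the grammatical structure of a sentence.
- Relations between words are represented by directed edges and labels.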
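To make the "directed edges and labels" representation concrete, here is a minimal sketch of how a parse could be stored. The `DepEdge` class, the helper `children_of`, and the example sentence with its hand-written parse are all hypothetical illustrations, not parser output and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepEdge:
    head: int   # index of the governing word
    dep: int    # index of the dependent word
    label: str  # dependency relation label, e.g. "nsubj"


# Hypothetical, hand-written parse of "The food is great".
tokens = ["The", "food", "is", "great"]
edges = [
    DepEdge(head=1, dep=0, label="det"),    # food  -> The
    DepEdge(head=3, dep=1, label="nsubj"),  # great -> food
    DepEdge(head=3, dep=2, label="cop"),    # great -> is
]


def children_of(edges, head):
    """Return (dependent index, label) pairs for one head word."""
    return [(e.dep, e.label) for e in edges if e.head == head]
```

Calling `children_of(edges, 3)` lists the words governed by "great" together with their relation labels.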

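The paper's figure illustrates the relationship among aspect, attention, and syntax with three examples.
In (a), "like" is used as a verb and expresses a positive sentiment toward the aspect "recipe".
- Here the attention-based LSTM attends to it successfully.
In (b), "like" is used with a different meaning, but the model still links it to "recipe" and predicts wrongly.
(c) is a case with two aspects.
- For "chicken", the model attends strongly to "but" and "dried", which leads to a wrong prediction.
  - This example shows a weakness of attention-based models.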

3.2 Aspect-Oriented Dependency Tree

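The examples in 3.1 suggest focusing on the dependency relations that directly connect an aspect with its related opinion words.
In ABSA, however, the focus should be on the target aspect.
- The paper therefore proposes an aspect-oriented dependency tree rooted at the target aspect.
  1. Build the ordinary dependency tree.
  2. Place the target aspect at the root.
  3. Set the nodes directly connected to the aspect as its children.
  4. Assign a virtual relation n:con to the remaining nodes, where n is the distance between the two nodes.
  5. If the sentence has several aspects, build one tree for each aspect.
- The dependency tree is reshaped accordingly, as illustrated in the paper.
- The aspect-oriented structure has two advantages.
  1. Each aspect has its own dependency tree, so it is less affected by unrelated nodes and relations.
  2. An aspect consisting of more than one word can still be handled.
- The idea works because only a small part of the words, those related to the aspect, is needed.
- It enables batch and parallel operations.
- The n:con relations make the model more robust.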
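The construction steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function name `aspect_oriented_tree`, the `(head, dep, label)` tuple format, and the choice to ignore edge direction when keeping a direct relation are all assumptions made here.

```python
from collections import deque


def aspect_oriented_tree(n_tokens, edges, aspect_idx):
    """Return {token index: relation to the aspect root}.

    edges: (head, dep, label) tuples from an ordinary dependency parse.
    aspect_idx: set of token indices that form the target aspect.
    """
    # Undirected adjacency, used only to measure distance in the tree.
    adj = {i: [] for i in range(n_tokens)}
    for head, dep, _ in edges:
        adj[head].append(dep)
        adj[dep].append(head)

    # Multi-source BFS from every aspect token (handles multi-word aspects).
    dist = {i: 0 for i in aspect_idx}
    queue = deque(sorted(aspect_idx))
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    relations = {}
    # Words directly connected to the aspect keep their original label
    # (assumption: edge direction is ignored in this sketch).
    for head, dep, label in edges:
        if head in aspect_idx and dep not in aspect_idx:
            relations[dep] = label
        elif dep in aspect_idx and head not in aspect_idx:
            relations[head] = label
    # All other reachable words get the virtual relation n:con.
    for i in range(n_tokens):
        if i not in aspect_idx and i not in relations and i in dist:
            relations[i] = f"{dist[i]}:con"
    return relations


# Hypothetical parse of "The food is great", with "food" (index 1) as the aspect.
example_edges = [(1, 0, "det"), (3, 1, "nsubj"), (3, 2, "cop")]
example_tree = aspect_oriented_tree(4, example_edges, {1})
```

Because every word ends up as a direct child of the aspect, the result is a flat mapping of the same shape for every sentence, which is what makes batching straightforward. For a sentence with several aspects, the function would be called once per aspect.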

4. Relational Graph Attention Network

To encode the new tree structure, the paper proposes a relational graph attention network that extends GAT.

4.1 Graph Attention Network

A dependency tree is represented as a graph G with n nodes.
The edges of G represent dependencies between words, and N_i denotes the neighborhood of node i.
Multi-head attention aggregates the neighbors' representations to update each node representation.
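A minimal NumPy sketch of such a multi-head neighborhood aggregation is shown below. The function name `gat_layer`, the argument layout, and the dot-product scoring function are assumptions chosen for illustration; the paper's exact attention function may differ.

```python
import numpy as np


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def gat_layer(h, neighbors, weights):
    """One multi-head graph-attention update.

    h: (n, d) node representations.
    neighbors: dict mapping node index i to its neighbor list N_i.
    weights: list of (d, d_head) projection matrices, one per head.
    """
    n = h.shape[0]
    heads = []
    for W in weights:
        z = h @ W                      # project all nodes for this head
        out = np.zeros_like(z)
        for i in range(n):
            nbrs = neighbors.get(i, [])
            if not nbrs:
                continue               # nodes without neighbors stay zero
            scores = z[nbrs] @ z[i]    # assumed dot-product scoring
            alpha = softmax(scores)    # normalize over N_i
            out[i] = alpha @ z[nbrs]   # weighted sum of neighbor vectors
        heads.append(out)
    return np.concatenate(heads, axis=1)  # concatenate the heads
```

On the aspect-oriented tree only the root has neighbors, so `neighbors` would typically contain a single entry for the aspect node.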

4.2 Relational Graph Attention Network

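A plain GAT does not take dependency relations into account, so that information can be lost.
Neighbors with different dependency relations should have different influences.
The paper adds relation heads to GAT; they act as relation-wise gates that control the information flow from the neighborhood.
Each dependency relation is mapped to a vector representation, from which the relation heads are computed.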
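One plausible form of a relation head is sketched below: a small feed-forward gate scores each neighbor from its relation embedding alone, the scores are normalized over the neighborhood, and the neighbors are summed with those weights. The function name `relation_head`, the parameter names, and the exact gate (ReLU layer, sigmoid, then softmax) are assumptions for illustration and may not match the paper's equation term by term.

```python
import numpy as np


def relation_head(h, rel_emb, nbrs, rel_ids, W, W1, b1, W2, b2):
    """Aggregate the neighbors of one node, weighted by relation only.

    h: (n, d) node representations.
    rel_emb: (R, d_r) embedding table for dependency relations.
    nbrs: neighbor indices of the node; rel_ids: relation id per neighbor.
    W: (d, d_out) value projection.
    W1 (d_r, d_h), b1 (d_h,), W2 (d_h, 1), b2 (1,): gate parameters.
    """
    r = rel_emb[rel_ids]                       # (k, d_r) relation vectors
    hidden = np.maximum(r @ W1 + b1, 0.0)      # ReLU layer
    g = 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))
    g = g.ravel()                              # (k,) gate score per neighbor
    beta = np.exp(g) / np.exp(g).sum()         # normalize over neighbors
    return beta @ (h[nbrs] @ W)                # relation-weighted sum
```

Since the weights depend only on the relation, two neighbors with the same relation receive the same weight. In a full layer, the output of such relation heads would be combined with the attention heads before the next update.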

4.3 Model Training

A BiLSTM encodes the word embeddings of the tree nodes; another BiLSTM encodes the aspect words, and the average of its outputs is used as the initial root representation.
After R-GAT is applied to the aspect-oriented tree, a fully connected layer followed by softmax predicts the sentiment.
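The two ends of this pipeline can be sketched in a few lines. The sketch starts from already-encoded vectors, so the BiLSTM and R-GAT layers are left out; the function names `init_root` and `predict_sentiment` and the parameter shapes are assumptions for illustration.

```python
import numpy as np


def init_root(aspect_states):
    """Average the encoded aspect words, shape (k, d), into one root vector."""
    return aspect_states.mean(axis=0)


def predict_sentiment(root, W, b):
    """Fully connected layer plus softmax over the sentiment classes.

    root: (d,) final root representation.
    W: (n_classes, d) weights; b: (n_classes,) bias.
    """
    logits = W @ root + b
    e = np.exp(logits - logits.max())  # subtract the max for stability
    return e / e.sum()
```

With three classes (for example positive, neutral, negative), `predict_sentiment` returns a probability vector of length 3, and the predicted label is its argmax.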

5. Experiment

5.1 Datasets

The experiments use the Laptop and Restaurant review datasets from SemEval 2014 Task 4.
A Twitter dataset is used as well.

5.1.1 Implementation Details

The Biaffine Parser is used for dependency parsing.
- The embedding dimension is fixed at 300.
R-GAT uses GloVe embeddings, and R-GAT+BERT uses the last hidden states of BERT.

5.2 Baseline Methods

Syntax-aware models
- LSTM+SynATT, AdaRNN, PhraseRNN, AS-GCN, CDT, GAT, TD-GAT
Attention-based models
- ATAE-LSTM, IAN, RAM, MGAN, attention-equipped LSTM, and fine-tuned BERT
Other recent methods
- JCI, TNET

5.3 Results and Analysis

5.3.1 Overall Performance

The R-GAT model outperforms the other baselines.
- In particular, it improves substantially over the plain GAT.
BERT alone already performs well on ABSA, and combining it with R-GAT improves the results further.

5.3.2 Effect of Multiple Aspects

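A sentence can contain multiple aspects.
When a sentence has two or more aspects, the Euclidean distance between aspects is computed for each aspect.
Accuracy is then measured as a function of this distance for GAT, R-GAT, and R-GAT+BERT.

Aspects that are close to each other tend to lower the accuracy, which indicates that aspects with high semantic similarity in a sentence can confuse the model.
- R-GAT alleviates this problem.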
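The distance measurement can be sketched as follows. Two points are assumptions here: that each aspect is represented by a single vector (for example an averaged word embedding), and that the value reported per aspect is the distance to its nearest other aspect. The function name `nearest_aspect_distance` is hypothetical.

```python
import numpy as np


def nearest_aspect_distance(aspect_vecs):
    """For each aspect vector, return the Euclidean distance to its
    nearest other aspect in the same sentence."""
    a = np.asarray(aspect_vecs, dtype=float)
    diff = a[:, None, :] - a[None, :, :]       # pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # (m, m) distance matrix
    np.fill_diagonal(dist, np.inf)             # ignore self-distance
    return dist.min(axis=1)


# Toy 2-dimensional vectors standing in for three aspects of one sentence.
toy_distances = nearest_aspect_distance([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
```

Test examples can then be bucketed by this distance to compare accuracy across models.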

5.3.3 Effect of Different Parsers

Performance is compared across different dependency parsers.

The Biaffine parser gives the better performance.

5.3.4 Ablation Study

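The ablation study evaluates the influence of the aspect-oriented dependency tree and of the relation heads.

Using R-GAT improves performance, and reshaping the tree to be aspect-oriented improves it further.
In particular, removing the n:con relations from R-GAT lowers performance, which shows that they have a positive effect.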

5.3.5 Error Analysis

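To understand the limitations of ABSA, 100 misclassified examples were sampled at random.
The failures had several causes.
- Most errors were caused by neutral reviews.
- Some sentences were genuinely hard to comprehend.
- Some sentences had no distinctive cue.
- Sentences with double negation were hard to analyze.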
