Fine-grained Multimodal Sentiment Analysis Based on Gating and Attention Mechanism

Yingxue Sun, Junbo Gao

Abstract


In recent years, more and more people express their feelings through both images and text, driving rapid growth in multimodal data. Multimodal data carries richer semantics and is better suited to judging people's real emotions. To fully learn the features of each individual modality and integrate cross-modal information, this paper proposes FCLAG, a fine-grained multimodal sentiment analysis method based on gating and attention mechanisms. For text, the method operates at both the character level and the word level: a CNN extracts fine-grained emotional information from characters, and an attention mechanism strengthens the representation of keywords. For images, a gating mechanism controls the flow of image information between networks. The image and text vectors jointly represent the original data, and a bidirectional LSTM then performs further learning, enhancing information interaction between the modalities. Finally, the multimodal feature representation is fed into a classifier. The method is validated on a self-built image-text dataset. Experimental results show that, compared with other sentiment classification models, the proposed method achieves higher accuracy and F1 score and effectively improves the performance of multimodal sentiment analysis.
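The abstract does not specify the gating step in code; as an illustration only, a highway-style gate (in the spirit of Highway Networks, which the paper cites) can be sketched in plain NumPy. All weight names, shapes, and initializations below are assumptions, not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_gate(x, W_h, b_h, W_t, b_t):
    """Highway-style gate: the transform gate t decides how much of the
    transformed signal H(x) passes through versus the raw input x."""
    h = np.tanh(x @ W_h + b_h)        # candidate transformed features H(x)
    t = sigmoid(x @ W_t + b_t)        # transform gate, elementwise in (0, 1)
    return t * h + (1.0 - t) * x      # gated mixture of H(x) and the carried input

# Toy usage: gate a batch of 4 hypothetical "image feature" vectors of dimension 8.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
W_h, b_h = 0.1 * rng.standard_normal((8, 8)), np.zeros(8)
# A negative gate bias initially favors carrying the input through unchanged.
W_t, b_t = 0.1 * rng.standard_normal((8, 8)), np.full(8, -1.0)
y = highway_gate(x, W_h, b_h, W_t, b_t)
print(y.shape)  # (4, 8)
```

Because the output is a convex combination of `H(x)` and `x` at every coordinate, the gate can learn to pass image information through the network unchanged or to transform it, which is the "control the flow of image information" behavior the abstract describes.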


Keywords


Multimodal Sentiment Analysis; Fine-grained; Attention Mechanism; Gating Mechanism; Late-fusion



DOI: https://doi.org/10.18686/esta.v7i4.166


Copyright (c) 2020 Yingxue Sun, Junbo Gao

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.