Corpus Annotation and Analysis of Sarcasm in Twitter: #CatsMovie vs. #TheRiseOfSkywalker
Abstract
Sentiment analysis is a natural language processing task that has received increased attention in the last decade due to the vast amount of opinionated data on social media platforms such as Twitter. Although the methodologies employed have grown in number and sophistication, analysing irony and sarcasm still poses a severe problem. Within linguistics, sarcasm has been studied in discourse analysis from several perspectives, but little attention has been given to specific metrics that measure its relevance. In this paper we describe the creation of a manually annotated dataset in which detailed text markers are included. This dataset is a sample from a larger corpus of tweets (n = 76,764) on two highly controversial films: Cats and Star Wars: The Rise of Skywalker. We took two different samples for each film, one before and one after its release, to compare reception and the presence of sarcasm. We then used a sentiment analysis tool to measure the impact of sarcasm on polarity detection and manually classified the mechanisms of sarcasm generation. The resulting corpus will be useful for machine learning approaches to sarcasm detection as well as for discourse analysis studies on irony and sarcasm.
The authors retain copyright of articles. They authorise AEDEAN to publish them in its journal Atlantis and to include them in the indexing and abstracting services, academic databases and repositories the journal participates in.
Under the terms of the Creative Commons Attribution NonCommercial ShareAlike 4.0 International Licence (CC BY-NC-SA 4.0), for non-commercial (i.e., personal or academic) purposes only, users are free to share (i.e., copy and redistribute in any medium or format) and adapt (i.e., remix, transform and build upon) articles published in Atlantis, free of charge and without obtaining prior permission from the publisher or the author(s), as long as they give appropriate credit to the author, the journal (Atlantis) and the publisher (AEDEAN), provide the relevant URL link to the original publication and indicate if changes were made. Such attribution may be done in any reasonable manner, but not in any way that suggests the journal endorses the user or their use of the material published therein. Users who adapt (i.e., remix, transform or build upon the material) must distribute their contributions under the same licence as the original.
Self-archiving is also permitted, so that authors are allowed to deposit the published PDF version of their articles in academic and/or institutional repositories, without fee or embargo. Authors may also post their individual articles on their personal websites, again on condition that the original link to the online edition is provided.
Consejería de Economía, Conocimiento, Empresas y Universidad, Junta de Andalucía
Grant numbers “SentiTur: Sistema de monitorización de opinión de usuarios de recursos turísticos andaluces basado en análisis de sentimiento y análisis visual” (UMA18-FEDERJA-158)
Consejería de Economía, Conocimiento, Empresas y Universidad, Junta de Andalucía
Grant numbers “EAVITur: Extracción, análisis y visualización de inteligencia turística. Ecosistema innovador con inteligencia artificial para Andalucía 2025” (CEI A-Tech)