University of Exeter
CIDER: Context-sensitive polarity measurement for short-form text

journal contribution
posted on 2025-08-02, 12:18 authored by JC Young, R Arthur, HTP Williams
Researchers commonly perform sentiment analysis on large collections of short texts like tweets, Reddit posts or newspaper headlines that are all focused on a specific topic, theme or event. Usually, general-purpose sentiment analysis methods are used. These perform well on average but miss the variation in meaning that happens across different contexts. For example, the word "active" has a very different intention and valence in the phrase "active lifestyle" than in "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEmantic Reasoner), which performs context-sensitive linguistic analysis: the valence of sentiment-laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper, we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist unsupervised sentiment analysis techniques on a large collection of tweets about the weather. CIDER is also applicable to alternative (non-sentiment) linguistic scales. A case study on gender in the UK is presented, with the identification of highly gendered and sentiment-laden days. We have made our implementation of CIDER available as a Python package: https://pypi.org/project/ciderpolarity/.
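
To make the idea of corpus-derived valence concrete, the following is a minimal toy sketch in Python. It is not the CIDER algorithm and does not use the ciderpolarity package's API; the seed words, the co-occurrence averaging rule, and the function names (`infer_polarities`, `score_text`) are all assumptions chosen for illustration. It only shows the general two-step pattern described above: first infer word polarities from the whole corpus, then score individual texts with them.

```python
from collections import defaultdict

# Hypothetical seed lexicon: a few words with assumed, fixed polarity.
SEEDS = {"lovely": 1.0, "healthy": 1.0, "dangerous": -1.0, "awful": -1.0}


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def infer_polarities(corpus, seeds):
    """Step 1: infer a polarity for each non-seed word from the corpus.

    A word's polarity is the average, over the texts it appears in,
    of the mean polarity of the seed words in the same text.
    Texts that contain no seed word contribute nothing.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text in corpus:
        tokens = set(tokenize(text))
        seed_values = [seeds[t] for t in tokens if t in seeds]
        if not seed_values:
            continue
        text_mean = sum(seed_values) / len(seed_values)
        for token in tokens:
            if token not in seeds:
                totals[token] += text_mean
                counts[token] += 1
    lexicon = dict(seeds)
    for token, total in totals.items():
        lexicon[token] = total / counts[token]
    return lexicon


def score_text(text, lexicon):
    """Step 2: score a text as the mean polarity of its known words."""
    values = [lexicon[t] for t in tokenize(text) if t in lexicon]
    return sum(values) / len(values) if values else 0.0


# Two tiny made-up corpora in which "active" appears in different contexts.
fitness_corpus = ["active lifestyle is healthy", "lovely active morning run"]
volcano_corpus = ["active volcano is dangerous", "awful news active eruption"]

fitness_lexicon = infer_polarities(fitness_corpus, SEEDS)
volcano_lexicon = infer_polarities(volcano_corpus, SEEDS)

print(fitness_lexicon["active"], volcano_lexicon["active"])
print(score_text("stay active", fitness_lexicon))
print(score_text("stay active", volcano_lexicon))
```

In this sketch the same word receives a positive polarity in one corpus and a negative one in the other, so the same text is scored differently depending on the corpus the lexicon was derived from. A general-purpose lexicon, by contrast, would assign "active" a single fixed value.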

Funding

Engineering and Physical Sciences Research Council (EPSRC)

NE/P017436/1

Natural Environment Research Council (NERC)

History

Rights

© 2024 Young et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Notes

This is the final version. Available from Public Library of Science via the DOI in this record. Data Availability Statement: All relevant data are within the paper and its Supporting information files.

Journal

PLoS One

Publisher

Public Library of Science

Editors

AlShehhi, AM

Place published

United States

Version

  • Version of Record

Language

en

FCD date

2024-06-26T15:28:09Z

FOA date

2024-06-26T15:30:36Z

Citation

Vol. 19, No. 4, article e0299490

Department

  • Computer Science
