dc.contributor.author | Lin, Chenghua | en_GB
dc.date.accessioned | 2011-12-12T08:47:58Z | en_GB
dc.date.accessioned | 2013-03-21T11:56:54Z
dc.date.issued | 2011-09-26 | en_GB
dc.description.abstract | Sentiment analysis aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text, and has attracted rapidly growing interest in natural language processing in recent years. Probabilistic topic models, on the other hand, are capable of discovering hidden thematic structure in large archives of documents, and have been an active research area in the field of information retrieval. The work in this thesis focuses on developing topic models for automatic sentiment analysis of web data, by combining ideas from both research domains. One noticeable issue with most previous work in sentiment analysis is that the trained classifier is domain dependent, and the labelled corpora required for training can be difficult to acquire in real-world applications. Another issue is that the dependencies between sentiment/subjectivity and topics are not taken into consideration. The main contribution of this thesis is therefore the introduction of three probabilistic topic models, which address the above concerns by modelling sentiment/subjectivity and topic simultaneously. The first model, the joint sentiment-topic (JST) model, is based on latent Dirichlet allocation (LDA) and detects sentiment and topic simultaneously from text. Unlike supervised approaches to sentiment classification, which often fail to produce satisfactory performance when applied to new domains, the weakly-supervised nature of JST makes it highly portable to other domains, where the only supervision information required is a domain-independent sentiment lexicon. Apart from document-level sentiment classification results, JST can also extract sentiment-bearing topics automatically, which distinguishes it from existing sentiment analysis approaches. The second model is a dynamic version of JST called the dynamic joint sentiment-topic (dJST) model. dJST respects the ordering of documents, and allows the analysis of topic and sentiment evolution in document archives that are collected over a long time span. By accounting for the historical dependencies of documents from past epochs in the generative process, dJST gives a richer posterior topical structure than JST, and can better respond to the permutations of topic prominence. We also derive online inference procedures based on a stochastic EM algorithm for efficiently updating the model parameters. The third model, the subjectivity detection LDA (subjLDA) model, targets sentence-level subjectivity detection. Two sets of latent variables are introduced in subjLDA: one is the subjectivity label for each sentence; the other is the sentiment label for each word token. By viewing the subjectivity detection problem as weakly-supervised generative model learning, subjLDA significantly outperforms the baseline and is comparable to the supervised approach, which relies on much larger amounts of data for training. These models have been evaluated on real-world datasets, demonstrating that joint sentiment-topic modelling is an important and useful research area. | en_GB
dc.identifier.citation | Lin, C., He, Y., Everson, R. and Rüger, S. Weakly-supervised Joint Sentiment-Topic Detection from Text, IEEE Transactions on Knowledge and Data Engineering (TKDE), to appear. | en_GB
dc.identifier.citation | Lin, C., He, Y. and Everson, R. A Comparative Study of Bayesian Models for Unsupervised Sentiment Detection, In Proceedings of the 14th Conference on Computational Natural Language Learning (CoNLL), Uppsala, Sweden, 2010. | en_GB
dc.identifier.citation | Lin, C. and He, Y. Joint Sentiment/Topic Model for Sentiment Analysis, In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM), Hong Kong, China, 2009. | en_GB
dc.identifier.citation | Lin, C., He, Y. and Everson, R. Sentence Subjectivity Detection with Weakly-Supervised Learning, In Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP), Chiang Mai, Thailand, 2011. | en_GB
dc.identifier.uri | http://hdl.handle.net/10036/3307 | en_GB
dc.language.iso | en | en_GB
dc.publisher | University of Exeter | en_GB
dc.subject | sentiment analysis | en_GB
dc.subject | opinion mining | en_GB
dc.subject | subjectivity detection | en_GB
dc.subject | joint sentiment-topic model | en_GB
dc.subject | latent Dirichlet allocation | en_GB
dc.subject | topic model | en_GB
dc.title | Probabilistic topic models for sentiment analysis on the Web | en_GB
dc.type | Thesis or dissertation | en_GB
dc.date.available | 2011-12-12T08:47:58Z | en_GB
dc.date.available | 2013-03-21T11:56:54Z
dc.contributor.advisor | Everson, Richard | en_GB
dc.contributor.advisor | He, Yulan | en_GB
dc.publisher.department | Computer Science | en_GB
dc.type.degreetitle | PhD in Computer Science | en_GB
dc.type.qualificationlevel | Doctoral | en_GB
dc.type.qualificationname | PhD | en_GB

