Abstract:
|
Long before the Internet, purchasing decisions and customer behaviour were shaped by word of mouth, which was the only channel for obtaining feedback and customer reviews. In many cases, buying choices were made as a leap of faith, in the hope that the purchase would turn out to be everything we expected. With the rise of Web 2.0, customers now share information and opinions online about products and services, politics, and current events. As a result, people and organisations turn to this information to harvest valuable insights and make informed decisions. This shared information is a gold mine that, if leveraged effectively, can provide rich insights. However, it is informal and unstructured, and therefore difficult to assess automatically and at scale. Accordingly, these data require appropriate processing to yield useful information. Sentiment analysis (SA) is used to extract knowledge from online data; research in the field seeks to extract sentiment from textual data. In this thesis, two approaches to sentiment analysis of text are proposed. The first is a lexicon-based approach for multi-class Twitter sentiment analysis, built on a sentiment lexicon developed specifically for the social media domain. The second is a deep learning approach for binary sentiment analysis of reviews, based on a proposed convolutional neural network (CNN). The research uses publicly accessible data, i.e. Twitter and movie review datasets, to evaluate the reliability and validity of the proposed frameworks. Experiments were conducted with both methodologies. First, the lexicon-based approach was evaluated on Twitter data; the results show that the developed lexicon is able to capture sentiment intensity and handle social media text. Second, the proposed CNN model was trained and tested on the IMDb dataset, with accuracy as the evaluation metric.
The proposed network achieved a sizeable performance improvement over prior models from the related work.
|