An Analysis on Use of Deep Learning and Lexical-Semantic Based Sentiment Analysis Method on Twitter Data to Understand the Demographic Trend of Telemedicine
Technology has become a fundamental part of everyday life. Social media is already widely used by the public to speak one's mind openly, and this data can be leveraged to better understand the current state of decision making. However, Twitter data is highly unstructured. Sentiment analysis can be applied to health-related Twitter data to extract useful information about public opinion. The aim of this research is to understand: (i) the correlation between Deep Learning and lexical and semantic-based sentiment prediction methods, (ii) the sentiment prediction accuracy of these methods on a manually annotated sentiment dataset, (iii) the effect of domain-specific knowledge on the accuracy of the sentiment prediction methods, and (iv) how Twitter-based sentiment can be used to understand the influence of telemedicine with regard to heart attack and epilepsy. Four sentiment prediction methods are used in this research: two lexical and semantic-based methods (Valence Aware Dictionary and Sentiment Reasoner (VADER) and TextBlob) and two Deep Learning-based methods (Long Short-Term Memory (LSTM) and the sentiment model from Stanford CoreNLP). The dataset that we retrieved consists of 1.84 million historical health-related tweets. Our findings suggest that lexical and semantic-based methods for sentiment prediction offer better accuracy than Deep Learning methods when a large enough and evenly distributed training dataset is not available. We observed that domain-specific knowledge affects the prediction accuracy of sentiment, mainly when the target text contains more domain-specific words. Sentiment prediction on Twitter data can be used to understand the demographic distribution of sentiment. In our case, we observed that telemedicine attracts a high level of positive sentiment, but it is still in its infancy and has not spread to a broader demographic.