VIRTUAL EMOTION DETECTION BY SENTIMENT ANALYSIS

Rahul Kamdi

Assistant Professor, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra (India).

Prasheel N. Thakre

Assistant Professor, Shri Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra (India).

Ajinkya P. Nilawar

Assistant Professor, Shri Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra (India).

J. D. Kene

Assistant Professor, Shri Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra (India).

E-mail: jagdish.kene@gmail.com

Reception: 15/11/2022 Acceptance: 30/11/2022 Publication: 29/12/2022

Suggested citation:

Kamdi, R., Thakre, P. N., Nilawar, A. P., & Kene, J. D. (2022). Virtual emotion detection by sentiment analysis. 3C TIC. Cuadernos de desarrollo aplicados a las TIC, 11(2), 175-181. https://doi.org/10.17993/3ctic.2022.112.175-181

https://doi.org/10.17993/3ctic.2022.112.175-181

3C TIC. Cuadernos de desarrollo aplicados a las TIC. ISSN: 2254-6529

Ed. 41 Vol. 11 N.º 2 August - December 2022


ABSTRACT

As websites, social networks, blogs, and online portals proliferate on the internet, authors are producing reviews, opinions, ideas, ratings, and feedback. The emotional content of this writing may concern books, people, hotels, items, studies, events, and so on. These emotions have great value for businesses, governments, and individuals. Most of this writer-generated material, even when it is meant to be informative, requires text mining algorithms and sentiment analysis to interpret. Sentiment analysis is a technique in Natural Language Processing (NLP) that tries to identify and extract the opinions expressed within a given text. This paper applies several text-processing techniques from NLP and uses the Valence Aware Dictionary for Sentiment Reasoning (VADER) model, which is sensitive to both the polarity (positive/negative) and the intensity (strength) of emotion.

KEYWORDS

Natural Language Processing (NLP), Polarity Intensity, Sentiment Analysis, Virtual Emotion Detection, VADER.


1. INTRODUCTION

Emotions are primarily of six types: love, happiness, anger, sadness, surprise, and fear. Sentiment analysis is used to analyze the emotions present in a text. It is a method for determining whether a given piece of writing is positive, negative, or neutral. The goal of sentiment analysis is to determine the writer's attitude, sentiments, and emotions in a written text through a computational treatment of the subjectivity in that text. By applying sentiment analysis, one can determine whether a given sentence, paragraph, or document contains positive or negative emotions or expressions. In other words, sentiment analysis classifies the polarity of a given text and reports whether the opinion it expresses is positive, negative, or neutral.

2. LITERATURE SURVEY

Sentiment analysis can be performed in multiple ways, and VADER is one of the best methods in use (Mozetic et al., 2016). VADER stands for Valence Aware Dictionary for Sentiment Reasoning. It is a rule-based sentiment analysis method that contains a list of lexical features labeled according to their semantic orientation. The sentiment score of a text is obtained by analyzing the intensity of the words in it. VADER identifies words that convey positive sentiment as positive and words such as "sad", "bad", and "awful" as negative. VADER is concerned only with the expressions in the text, that is, with whether the opinions they convey are positive, negative, or neutral.

Bird et al. (2009) describe text normalization as the process of converting a text into its canonical or standard form. Several steps are needed to standardize the content and convert it into a suitable structure, which is then given to the machine learning (ML) model. This removes unnecessary information that the computer does not require and thereby improves efficiency. Library utilized for this is given in. The steps involved in this process are shown in Figure 1 and explained briefly in the following sections.

Fig 1. Text Normalization.

2.1 REMOVING STOPWORDS

A stopword is a commonly used word that can be ignored, both when indexing entries for search and when retrieving them (Saif et al., 2014). Removing stopwords from textual data, using either pre-compiled stopword lists or more complex algorithms for dynamic stopword recognition, is a popular procedure for reducing noise. The Natural Language Toolkit (NLTK) in Python includes lists of stopwords for 16 different languages.
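As a minimal sketch of list-based stopword removal, the following filters tokens against a small hand-picked set. The set `STOPWORDS` and the function name are illustrative assumptions, not NLTK's actual list; in practice a pre-compiled list such as the one shipped with NLTK could be substituted.

```python
# Hypothetical, deliberately tiny stopword set used only for illustration.
STOPWORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it"}

def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Return the tokens that are not stopwords (case-insensitive match)."""
    return [t for t in tokens if t.lower() not in stopwords]

tokens = "The movie was a great success".split()
print(remove_stopwords(tokens))  # ['movie', 'great', 'success']
```

Only the words that carry content remain, which is the noise reduction described above.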

2.2 TOKENIZATION

The method of breaking text into smaller units called tokens is known as tokenization (Stanford NLP Group, 2015). Tokens can be words, characters, or subwords. Accordingly, tokenization can be divided into three categories: word, character, and subword tokenization.
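The three categories can be sketched in a few lines of Python. The function names are hypothetical, and the fixed-size chunking in `subword_tokens` is only a crude stand-in for the learned subword vocabularies used by real subword tokenizers.

```python
import re

def word_tokens(text):
    """Word tokenization: runs of letters, digits, and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", text)

def char_tokens(text):
    """Character tokenization: every non-whitespace character is a token."""
    return [c for c in text if not c.isspace()]

def subword_tokens(word, n=3):
    """Subword tokenization (simplified): split a word into chunks of n characters."""
    return [word[i:i + n] for i in range(0, len(word), n)]

print(word_tokens("Tokenization helps!"))  # ['Tokenization', 'helps']
print(char_tokens("hi you"))               # ['h', 'i', 'y', 'o', 'u']
print(subword_tokens("tokenization"))      # ['tok', 'eni', 'zat', 'ion']
```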

2.3 DERIVING ROOT WORD

Nicolai & Kondrak (2016) note that in Natural Language Processing we often come across situations where a word has many derived forms. Stemming and lemmatization are the two main NLP processes for generating root words. Both produce the root form of an inflected word (Samir & Lahbib, 2018). A stemming algorithm works by removing the word's suffix. Lemmatization relies on a morphological analysis of the word and returns the lemma, which is the base form of all its inflectional forms.
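The difference between the two processes can be illustrated with a toy example. The suffix list and the lookup table `LEMMAS` below are hypothetical simplifications: a real stemmer applies many more rules, and a real lemmatizer uses a full morphological dictionary.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order

def stem(word):
    """Strip the first matching suffix, keeping a stem of at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical lookup table standing in for morphological analysis.
LEMMAS = {"studies": "study", "studying": "study", "ran": "run"}

def lemmatize(word):
    """Return the dictionary base form, or the word itself if it is unknown."""
    return LEMMAS.get(word, word)

for w in ["studies", "studying", "ran"]:
    print(w, stem(w), lemmatize(w))
# studies studi study
# studying study study
# ran ran run
```

The stemmer can return a non-word such as "studi" and cannot handle the irregular form "ran", while the lemmatizer returns a valid base form in each case.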

2.4 FEATURE EXTRACTION

Machine learning algorithms cannot work on raw text directly. Feature extraction therefore converts the text into a matrix or vector form. A feature extraction module can be used to extract features from datasets consisting of formats such as text and images, in a format supported by machine learning algorithms. The most popular feature extraction strategies are Bag-of-Words and the TF-IDF vectorizer. TF-IDF normalizes the counts and gives diminishing weight to tokens that appear in the majority of samples or documents (Mahajan et al., 2020).
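To make the two strategies concrete, here is a small pure-Python sketch on a three-document toy corpus. It uses the textbook weighting tf × log(N/df); library implementations typically add smoothing and vector normalization, so their numbers will differ. The corpus and function names are assumptions made for illustration.

```python
import math
from collections import Counter

docs = ["the food was great", "the service was bad", "the food was bad"]
tokenized = [d.split() for d in docs]
vocab = sorted({t for doc in tokenized for t in doc})
# vocab == ['bad', 'food', 'great', 'service', 'the', 'was']

def bag_of_words(tokens):
    """Bag-of-Words: raw count of each vocabulary term in the document."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]

def idf(term):
    """Inverse document frequency: log(N / number of documents containing term)."""
    df = sum(1 for doc in tokenized if term in doc)
    return math.log(len(tokenized) / df)

def tf_idf(tokens):
    """TF-IDF: relative term frequency scaled by inverse document frequency."""
    counts = Counter(tokens)
    return [counts[term] / len(tokens) * idf(term) for term in vocab]

bow = [bag_of_words(doc) for doc in tokenized]
tfidf = [tf_idf(doc) for doc in tokenized]
print(bow[0])  # [0, 1, 1, 0, 1, 1]
```

In this corpus "the" and "was" occur in every document, so their TF-IDF weight is zero, whereas "great", which occurs in only one document, receives the largest weight in the first vector.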

2.5 POLARITY AND INTENSITY SCORE IN EMOTIONAL ANALYSIS

A key element of emotional (sentiment) analysis is to examine the body of a text to understand the opinion it expresses. Sentiment is typically quantified as a positive or negative value, known as polarity. The overall sentiment is usually judged to be positive, neutral, or negative on the basis of the polarity score. In general, sentiment analysis works better on subjective text than on text with a purely objective context. Sentiment analysis is widely used, especially as part of social media analysis in any domain, to understand how a system is received by people and how their responses reflect their opinions.

Sentiment can be computed at several levels of a text: for each sentence, for each paragraph, or for the whole document. There are two major approaches to sentiment analysis. The first is supervised machine learning or deep learning: in this approach, traditional machine learning classifiers are trained on a TF-IDF model using the n-gram method. These classifiers are multinomial naive Bayes, logistic regression, k-nearest neighbors, and random forest. Among the four classifiers, logistic regression achieved the lowest accuracy. The second is the unsupervised lexicon-based approach: this method uses deep learning. The accuracy obtained with deep learning is much lower than that obtained with machine learning. Once the best-performing classifier has been identified, the next step is to build a final model for deployment using more advanced machine learning methods.
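The n-gram method mentioned for the supervised approach can be sketched as follows. The helper name `ngrams` is hypothetical; the point is only that contiguous word sequences become the features that a TF-IDF model then weights.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the food was not good".split()
print(ngrams(tokens, 2))
# [('the', 'food'), ('food', 'was'), ('was', 'not'), ('not', 'good')]
```

A bigram such as ('not', 'good') keeps the negation attached to the word it modifies, which single-word features would lose.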

2.6 VALENCE AWARE DICTIONARY FOR SENTIMENT REASONING (VADER)

VADER is a model for text sentiment analysis that detects both the polarity (positive/negative) and the intensity (strength) of emotion. VADER relies mainly on a dictionary that maps lexical features to emotion intensities, known as sentiment scores (Beri, 2020). The sentiment score of a text is obtained by summing the intensity of each word in the text. Sentiment analysis statistically detects whether the polarity of a piece of text is negative or positive. It is based on two approaches: polarity-based analysis, in which texts are classed as either negative or positive, and valence-based analysis, in which the intensity of the emotion or sentiment is also considered.
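The summation described above can be sketched with the five ratings listed in Table I of this paper. The normalization x / sqrt(x² + α) with α = 15, which squashes the sum into the range -1 to 1, follows the form used in the reference VADER implementation; treat the constant and the function names here as assumptions of this sketch rather than a full reimplementation.

```python
import math

# Ratings copied from Table I of this paper (an excerpt, not the full lexicon).
LEXICON = {"tragedy": -3.3, "rejoiced": 2.0, "insane": -1.6,
           "disaster": -3.2, "great": 3.3}

def raw_score(text):
    """Sum the lexicon rating of every word; unknown words contribute 0."""
    return sum(LEXICON.get(word, 0.0) for word in text.lower().split())

def normalize(score, alpha=15):
    """Map an unbounded score into (-1, 1); alpha is an assumed constant."""
    return score / math.sqrt(score * score + alpha)

def polarity(text):
    return normalize(raw_score(text))

print(round(polarity("a great day"), 3))      # 0.649
print(round(polarity("what a disaster"), 3))  # -0.637
print(polarity("xyz qwerty"))                 # 0.0
```

The sign of the result gives the polarity and its magnitude gives the intensity; text with no lexicon words scores exactly zero.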

2.7 WORKING OF VADER

VADER is a sentiment or emotion analysis method that uses a lexicon of sentiment-related words. Each word in the lexicon is rated as positive or negative, together with the strength of that positivity or negativity. Table I shows the sentiment ratings of an excerpt from VADER's lexicon, with higher positive ratings for more positive words and lower negative ratings for more negative words.

Table I. Sentiment rating of the various words in a text.


To determine whether these terms are positive or negative, a group of individuals must rate them manually, which is both time-consuming and costly (Hutto & Gilbert, 2014). The lexicon must provide good coverage of the words of interest in the text; otherwise, accuracy will be poor. When the lexicon and the text are well matched, this method is quite accurate, and it produces results quickly even on large volumes of text. VADER not only matches the words in the text against its lexicon but also takes into account certain aspects of the way the words are written, as well as their contextual meaning.
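As a hedged illustration of what "the way the words are written" and "contextual meaning" can look like in a rule-based model, the sketch below adjusts a word's rating for ALL-CAPS emphasis and for a preceding negation. The two constants and the short negation list are assumptions modelled on the kind of heuristics described by Hutto & Gilbert (2014); this is not the complete VADER rule set.

```python
LEXICON = {"great": 3.3, "disaster": -3.2}  # two ratings from Table I
NEGATIONS = {"not", "never", "no"}          # hypothetical short list
CAPS_BOOST = 0.733       # assumed emphasis increment for ALL-CAPS words
NEGATION_SCALE = -0.74   # assumed damped sign flip after a negation

def adjusted_valences(text):
    """Return the rule-adjusted rating of each lexicon word in the text."""
    tokens = text.split()
    valences = []
    for i, token in enumerate(tokens):
        word = token.strip(".,!?")
        base = LEXICON.get(word.lower())
        if base is None:
            continue  # word is not in the lexicon
        valence = base
        # Emphasis: an ALL-CAPS word in otherwise mixed-case text is intensified.
        if word.isupper() and not text.isupper():
            valence += CAPS_BOOST if valence > 0 else -CAPS_BOOST
        # Context: a negation directly before the word reverses and damps it.
        if i > 0 and tokens[i - 1].lower() in NEGATIONS:
            valence *= NEGATION_SCALE
        valences.append(round(valence, 3))
    return valences

print(adjusted_valences("The food is great"))      # [3.3]
print(adjusted_valences("The food is GREAT"))      # [4.033]
print(adjusted_valences("The food is not great"))  # [-2.442]
```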

3. EXPERIMENTATION AND PERFORMANCE ANALYSIS

In this paper, sentiment analysis is performed using VADER. First, normalization methods from Natural Language Processing (NLP) are applied to convert the test text into its vector form. The emotion analysis is implemented and evaluated in Python using the Natural Language Toolkit (NLTK) library. Figure 2 shows the virtual emotion analysis of a text whose emotional content is adjudged as 75% positive. Figure 3 shows the virtual emotion analysis of a text whose score is 26%, which is displayed in red to indicate negative polarity. Figure 4 shows the sentiment analysis of a long text whose sentiment is adjudged as 92% positive. With an added gibberish-detection feature, gibberish text can also be detected; for such text the algorithm returns 50%, as can be seen in Figure 5.
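The text does not state how a polarity score is turned into the percentages shown in the figures. One mapping that is consistent with gibberish (which carries no sentiment) returning 50% is a linear rescaling of a polarity in the range -1 to 1 onto 0 to 100; the sketch below assumes that mapping, and the function name is hypothetical.

```python
def to_percent_positive(polarity):
    """Linearly map a polarity in [-1, 1] to a percentage in [0, 100].

    This mapping is an assumption; it is not specified in the paper.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1, 1]")
    return (polarity + 1.0) / 2.0 * 100.0

print(to_percent_positive(0.0))   # 50.0
print(to_percent_positive(0.5))   # 75.0
print(to_percent_positive(-1.0))  # 0.0
```

Under this assumption, a neutral score of zero lands exactly in the middle of the scale, and values below 50% indicate negative polarity.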

Fig 2. Virtual emotion detection of text showing 75% positive polarity.

Word        Sentiment rating
tragedy     -3.3
rejoiced     2.0
insane      -1.6
disaster    -3.2
great        3.3


Fig 3. Virtual emotion detection of text showing 26% negative polarity.

Fig 4. Virtual emotion detection of long text showing 92% positive polarity.

Fig 5. Emotion detection of gibberish text.

4. CONCLUSION

The main focus of this paper is to calculate two scores for a text, its polarity and its intensity, using machine learning. The polarity ranges from -1 (negative) to 1 (positive) and indicates whether the text is positive or negative. In this paper we have tested different emotions of a text using the Natural Language Toolkit in Python and quantified each emotion as positive or negative. We were also able to identify gibberish text, i.e., words that do not make any sense.

REFERENCES

[1] Mozetic, I., Grcar, M. & Smailovic, J. (2016), ‘Multilingual Twitter sentiment classification: The role of human annotators’, PLoS One 11(5), 1–26.

[2] Bird, S., Klein, E. & Loper, E. (2009), Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit, O’Reilly Media.

[3] Saif, H., Fernandez, M., He, Y. & Alani, H. (2014), On stopwords, filtering and data sparsity for sentiment analysis of Twitter, in ‘Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14)’.

[4] Stanford NLP Group (2015), CoreNLP, Stanford University, Stanford, USA. https://stanfordnlp.github.io/CoreNLP/index.html.

[5] Nicolai, G. & Kondrak, G. (2016), Leveraging inflection tables for stemming and lemmatization, in ‘Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics’, Vol. 1, Association for Computational Linguistics, Berlin, pp. 1138–1147.


[6] Samir, A. & Lahbib, Z. (2018), Stemming and lemmatization for information retrieval systems in Amazigh language, in M. A. A. N. E. Y. Tabii, M. Lazaar, ed., ‘Big Data, Cloud and Applications. BDCA 2018’, Vol. 872 of Communications in Computer and Information Science, Springer, Cham, Kenitra, Morocco, pp. 222–233.

[7] Mahajan, A., Ray, A., Verma, A., Kohad, S. & Thakare, P. N. (2020), ‘Sentiment analysis using supervised machine learning’, International Journal of Advance Research and Innovative Ideas in Education 6(6), 103–109.

[8] Beri, A. (2020), ‘Sentimental analysis using VADER’, Available online https://towardsdatascience.com/sentimental-analysis-using-vader-a3415fef7664.

[9] Hutto, C. & Gilbert, E. (2014), VADER: A parsimonious rule-based model for sentiment analysis of social media text, in ‘Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media’, Association for the Advancement of Artificial Intelligence, Ann Arbor, MI, pp. 216–225.
