AN OPTIMIZED DEEP NEURAL NETWORK-BASED FINANCIAL
STATEMENT FRAUD DETECTION IN TEXT MINING
Ajit Kr. Singh Yadav
Assistant Professor, Department of Computer Science and Engineering,
NERIST, Itanagar, Arunachal Pradesh (India), and Research Scholar,
Department of Computer Science and Engineering,
Rajiv Gandhi University, Itanagar, Arunachal Pradesh (India).
E-mail: ajityadav101@rediffmail.com ORCID: https://orcid.org/0000-0002-2208-0828
Marpe Sora
Associate Professor, Department of Computer Science and Engineering,
Rajiv Gandhi University, Itanagar, Arunachal Pradesh (India).
E-mail: marpe.sora@rgu.ac.in ORCID: https://orcid.org/0000-0003-0159-5416
Received: 23/07/2021. Accepted: 03/11/2021. Published: 24/11/2021
Suggested citation:
Singh, A. K., & Sora, M. (2021). An optimized deep neural network-based financial statement fraud detection in text mining. 3C Empresa. Investigación y pensamiento crítico, 10(4), 77-105. https://doi.org/10.17993/3cemp.2021.100448.77-105
ABSTRACT
Identifying Financial Statement Fraud (FSF) events is a crucial task in text mining. The research community has mostly applied data mining methods to detect FSF, relying largely on quantitative data, i.e. financial ratios, to detect fraud in financial statements. Little research has examined the textual content of published reports, such as auditors' remarks. For this reason, this paper develops an optimized deep neural network-based FSF detection approach for the qualitative data present in financial reports. The text is first pre-processed using filtering, lemmatization, and tokenization. Then, feature selection is performed with the Harris Hawks Optimization (HHO) algorithm. Finally, a Deep Neural Network optimized by Deer Hunting Optimization (DNN-DHO) is used to classify each report in the financial statements as fraud or non-fraud. The developed FSF detection methodology was implemented in a Python environment using financial statement datasets. The developed approach achieves high classification accuracy (96%) in comparison with standard classifiers such as DNN, CART, LR, SVM, Bayes, BP-NN, and KNN, and it also provides better outcomes on all performance metrics.
KEYWORDS
Financial statements, Fraud, Non-fraud, Text mining, Deep neural network, Deer hunting optimization.
1. INTRODUCTION
Financial fraud is a major challenge for administrations across industries and in several states, as it causes vast damage to business. Billions of dollars are lost to financial fraud every year; Bank of America, for instance, agreed to pay $16.5 billion to settle financial fraud cases (Rezaee & Kedia, 2012). Material omissions resulting from an intentional failure to report financial data in accordance with generally accepted accounting principles are termed FSF (Dalnial et al., 2014). Companies provide financial statements that include textual data in the form of auditors' remarks alongside records with financial ratios. The qualitative data contain indicators of fraudulent financial reporting in the form of intentionally placed idioms. Agents use adverbial phrases, selective sentence constructions, and selective adjectives to cover up fraudulent activity (Throckmorton et al., 2015; Song et al., 2014). Financial statement users and regulators rely on external auditors to identify fraudulent financial reporting. Financial statements are an organization's elementary documents for reflecting its fiscal position (Kanapickienė & Grundienė, 2015).
A careful analysis of the financial accounts can indicate whether the corporation is running efficiently or is in crisis. If the corporation is in crisis, financial accounts can show whether the most pressing issue the organization faces is profit, cash, or something else (Perols & Lougee, 2011). Most organizations are required to publish their financial statements every quarter and every year (Gray & Debreceny, 2014).
FSF can be committed to inflate stock values or to acquire loans from banks. It may be done to allocate smaller profits to investors. Another feasible reason might be to avoid the expense of tax assessments (Manurung & Hardika, 2015). Recently, different organizations have been making use of fraudulent financial reports to cover up their real fiscal position and to make self-interested gains at the expense of shareholders. In the detection of FSF, financial ratios are prime elements because they present a clear picture of the financial strength of the corporation (Hajek & Henriques, 2017).
The illegal act of FSF harms an organization's economy. The investigation of financial reports helps the participants in the investment market decide whether to invest in a corporation (Omar et al., 2014). The data presented in these statements convey the company's performance, in terms of fiscal position, to creditors, shareholders, and auditors.
In worldwide organizations, the detection and prevention of FSF have become a significant challenge (Gupta & Gill, 2012a). When prevention fails, detecting fraudulent financial reporting is a challenging issue, although preventing FSF is the better approach (Asare et al., 2015). Internal and external auditors play a significant role in the discovery and prevention of FSF, but they cannot be held solely accountable for its identification and detection (Gupta & Gill, 2012b). Research on fraud detection and its antecedents is significant because it adds to the understanding of fraud. It has the potential to enhance auditors' and regulators' capability to identify fraud, either directly or by serving as a basis for future fraud research that does (Ravisankar et al., 2011). Better fraud detection can help defrauded organizations, and their workers, investors, and creditors, curb the costs linked with fraud and can also enhance the efficiency of the market. This knowledge is of interest to auditors when delivering assurance about whether financial accounts are free of material misstatements caused by fraud (Ngai et al., 2011), mainly during audit planning and client selection.
Several researchers have analysed quantitative data for the recognition of false financial reporting (Jan, 2018). Here, the text mining technique is used to recognize fraud and non-fraud financial reports from the qualitative content of financial statements (Lin et al., 2015). Text mining is the process of extracting significant structured data from unstructured text. It can be used to find fraud or non-fraud reports, and it can also examine the words themselves (Gupta & Gill, 2012c). At present, extensive data is produced from different sources in the Internet-dependent world, and a vast amount of it is available in unstructured format. Text mining and data mining methods can enable better decision making when analysing unstructured data (Kumar & Ravi, 2016). Different types of tasks are involved in text mining, for example, text summarization, web page classification, sentiment analysis, plagiarism detection,
malware analysis, document classification, topic detection, patent analysis, etc. In financial statements, the textual data is unstructured (Dong, Liao, & Liang, 2016). Before applying any data mining approach such as classification or clustering, the text must be transformed into structured data, because free-form text cannot be used directly for the discovery of FSF.
The main contributions of this work are:
- Finding a solution for financial report fraud discovery.
- Designing a model to identify fraudulent and non-fraudulent statements.
- Using optimal feature selection approaches to achieve high accuracy.
- Modelling a new hybrid classifier for financial statement fraud discovery.
The remainder of this paper is organized as follows: Section 2 reviews recent works related to this paper. The proposed method to detect FSF is given in Section 3, Section 4 provides the simulation results, and the conclusion and future scope are given in Section 5.
2. RELATED WORKS
An interpretable fuzzy rule-based system was presented by Hajek (2019) for detecting FSF. The developed fuzzy rule-based detection approach combines feature selection with rule extraction to control granularity and rule complexity. A genetic feature selection method is used to eliminate irrelevant features. A comparative investigation was performed against evolutionary fuzzy rule-based schemes and FURIA. The developed system offers both desirable interpretability and good accuracy. The results have significant implications for auditors and other users of FSF detection systems.
Fraud detection for the financial reports of business groups was introduced by Chen et al. (2019). The article suggests a methodology for fraud discovery in the financial reports of business groups. The established technique aims to improve the returns on investment for creditors and investors and to lessen the
investment losses and risks. The study proceeded through the following stages: (i) construct an effective model for fraud discovery in the financial reports of business groups, (ii) apply different fraud detection methods to the financial reports, and (iii) evaluate the developed system.
A Financial Fraudulent Statements (FFS) detection approach was developed by Temponeras et al. (2019) using a deep dense Artificial Neural Network (ANN). This system reviews the financial statements of multiple companies, and the deep dense ANN derives decisions about possible accounting fraud. To classify FFS accurately, data was obtained from 164 Greek companies. The main objective was to test a neural network architecture for forecasting FFS. In the FFS classification task, the developed approach provides superior outcomes to earlier classifiers when investigating the Greek data.
CHAID, SVM (Support Vector Machine), and C5.0 were discussed by Chi et al. (2019) for FSF detection. Through an active detection scheme, C5.0, SVM, and CHAID are applied to the discovery of FSF. The research data was obtained from the Taiwan Economic Journal (TEJ). The source sample contains 28 companies involved in FSF and 84 corporations not involved in such frauds, listed on the Taipei Exchange and the Taiwan Stock Exchange during the investigation period. Before constructing the system, key variables were chosen with C5.0 and SVM. Non-financial and financial variables are used to improve the precision of FSF recognition.
An application of an ensemble Random Forest (RF) classifier was presented by Patel et al. (2019) for identifying financial report manipulation by Indian listed corporations. Recently, investigators have tried to explore different modelling methods for FFS detection. The researchers selected 92 non-FFS and 86 FFS manufacturing corporations to conduct the test. The research data was obtained from the Bombay Stock Exchange for the period 2008-2011. For the identification of non-FFS and FFS companies, the auditor's report was considered. A T-test identified 31 significant financial ratios. The training dataset is employed to train the model, and the trained model is used for classification with better accuracy.
3. METHODOLOGY
A collection of financial statements is considered as the input to the text mining system. Here, both fraud and non-fraud financial reports are gathered in order to classify fake financial reports.
Financial statement fraud discovery includes four steps: text pre-processing, feature extraction, feature selection, and text classification. The workflow of the proposed approach is shown in Figure 1.
Figure 1. Overall proposed Methodology.
Source: own elaboration.
In text mining, pre-processing plays a major role; a high-quality pre-processing step provides better results. The pre-processing step includes several tasks such as filtering, tokenization, and lemmatization. During pre-processing, the words in all documents are converted to lower case. Then the TF-IDF, LDA, and Word2vec approaches are used for feature extraction, describing the text through a set of measurable dimensions such as word frequency. Feature selection is used to enhance the performance of the text classifier and to reduce the dimensionality of the feature space. Here, the HHO algorithm is used for feature selection. Finally, the new hybrid DNN-DHO classifier is proposed to classify the financial statements as fraud or non-fraud. In the DNN, the hidden-layer weights are updated using the DHO algorithm. This hybrid classifier concept minimizes the error during classification.
3.1. PROBLEM STATEMENT
FSF is a major problem for society, and its detection is a challenging process. FSF is not a victimless crime; it leaves behind genuine economic losses that affect workers, shareholders, and investors. Reduced trust in regulators, reduced confidence, and reduced reliability of financial markets are extensive costs to society. It leads to high transaction costs and low efficiency. In developing markets, the challenges of doing business increase the incentives for manipulating financial statements and for avoiding taxes in the home country. Recently, cases of FSF have increased. Every incidence is a serious disappointment to shareholders and investors, and it costs the public dearly. So, the construction of an efficient scheme to identify FSF is a major concern.
3.2. TEXT PRE-PROCESSING
It is a signicant role and a dangerous phase in text mining. To mining motivating, non-trivial, and
information from amorphous text data, a pre-processing method is applied in text mining. The basic
units of the fonts, words, and sentences are recognized in this phase and it’s delivered to all further
processing phases. The pre-processing step involves several tasks, for example, filtering, lemmatization, and tokenization.
3.2.1. TOKENIZATION
Tokenization breaks a given text into phrases, words, symbols, or other meaningful elements known as tokens. Particular characters like punctuation marks may be discarded. The main application of this process is to identify significant keywords.
3.2.2. FILTERING
This process eliminates particular words from the documents. The elimination of stop words is a common filtering approach. Stop words are frequently used common words like 'this', 'are', 'and', etc. They are not useful for document classification and must therefore be eliminated.
3.2.3. LEMMATIZATION
This process removes inflectional endings and returns the base form of a word, which is called the lemma. It relies on a dictionary and a morphological analysis of the words.
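A minimal sketch of these three steps is shown below, assuming the NLTK toolkit (with its 'punkt', 'stopwords', and 'wordnet' resources installed) and an illustrative list of report strings; the paper does not state which library was used.

```python
# Pre-processing sketch: lower-casing, tokenization, stop-word filtering, lemmatization.
import string
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(text):
    # Lower-casing, as described in Section 3
    text = text.lower()
    # Tokenization: split the report into word tokens
    tokens = word_tokenize(text)
    # Filtering: drop punctuation and stop words such as 'this', 'are', 'and'
    tokens = [t for t in tokens if t not in stop_words and t not in string.punctuation]
    # Lemmatization: reduce each token to its dictionary base form (lemma)
    return [lemmatizer.lemmatize(t) for t in tokens]

raw_reports = ["The auditors noted unusual revenue recognition practices."]
print(preprocess(raw_reports[0]))
```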
3.3. TEXT FEATURE EXTRACTION
Text feature extraction is the procedure of extracting a list of words from the textual data for feature selection in the classifier. It plays a major role in text classification because it directly impacts the classification accuracy. The following methods are used for extracting features from the text data.
3.3.1. TERM FREQUENCY AND INVERSE DOCUMENT FREQUENCY (TF-IDF)
TF-IDF is an important weighting method in text mining (Kalra et al., 2019). The number of times a word appears in a text is its term frequency. The IDF is used to compute the inverse likelihood of finding a word in a text. The significance of a term within a corpus is denoted
by its TF-IDF weight. Here, a document refers to a financial report, a term refers to a single word in a statement, and a corpus refers to the collection of reports. In a document d, the TF-IDF weight of a term t is computed by:
$tf(t,d) = \frac{n_{t,d}}{\sum_{k} n_{k,d}}$  (1)
$idf(t) = \log\frac{|D|}{|\{d \in D : t \in d\}|}$  (2)
$tfidf(t,d) = tf(t,d) \times idf(t)$  (3)
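A compact sketch of TF-IDF extraction is given below, using scikit-learn's TfidfVectorizer as one possible implementation; the two example reports and the vocabulary cap are illustrative, since the paper does not name a specific library or setting.

```python
# TF-IDF feature extraction sketch (documents x terms weight matrix).
from sklearn.feature_extraction.text import TfidfVectorizer

reports = [
    "the auditors noted unusual revenue recognition practices",
    "revenue and cash flow remained in line with prior periods",
]
vectorizer = TfidfVectorizer(max_features=5000)   # vocabulary cap is an assumed value
tfidf_matrix = vectorizer.fit_transform(reports)  # rows: reports, columns: terms
print(tfidf_matrix.shape, vectorizer.get_feature_names_out()[:5])
```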
3.3.2. LATENT DIRICHLET ALLOCATION (LDA)
LDA is a topic modelling scheme (Jelodar et al., 2019). It assumes that every text can be defined as a probabilistic distribution over hidden topics. The topic distributions of all documents share a common Dirichlet prior, and the word distributions of the topics share another common Dirichlet prior. Assume a corpus D that contains M documents, each document d having Nd words (d = 1, …, M). The method is based on the following generative procedure:
- From a Dirichlet distribution with parameter β, choose a multinomial distribution φt for each topic t (t = 1, …, T).
- For each document d (d = 1, …, M), select a multinomial distribution θd from a Dirichlet distribution with parameter α.
- For each word wn (n = 1, …, Nd) in document d, pick a topic zn from θd and draw the word wn from φzn.
Here, the words are the only observed variables in the documents, whereas α and β are hyper-parameters and φ and θ are hidden variables. The likelihood of the observed data D is calculated by:
$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d$  (4)
The distribution of words over topics (φ) and the parameters of the topic Dirichlet prior (α and β) are drawn from the Dirichlet distribution. Here, the number of topics is denoted by T, the number of documents by M, and the size of the vocabulary by N. The Dirichlet-multinomial pairs (α, θ) and (β, φ) describe the corpus-level topic distributions and the topic-word distributions, respectively. The document-level variables are denoted by θd, and the word-level variables are represented by wdn.
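The sketch below derives per-document topic distributions with Gensim's LdaModel as an illustration; the tiny token lists, the number of topics, and the priors are assumed values, not taken from the paper.

```python
# LDA topic-feature sketch: theta_d per document, usable as a fixed-length feature vector.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

tokenized_reports = [
    ["auditor", "noted", "unusual", "revenue", "recognition"],
    ["revenue", "cash", "flow", "line", "prior", "period"],
]
dictionary = Dictionary(tokenized_reports)                       # vocabulary of size N
bow_corpus = [dictionary.doc2bow(doc) for doc in tokenized_reports]

lda = LdaModel(corpus=bow_corpus, id2word=dictionary,
               num_topics=2,      # T, assumed
               alpha="auto",      # document-topic Dirichlet prior
               eta="auto",        # topic-word Dirichlet prior (beta in the text)
               passes=10)

theta_d = lda.get_document_topics(bow_corpus[0], minimum_probability=0.0)
print(theta_d)
```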
3.3.3. WORD2VEC
In this process, the representation of a word as a vector plays a significant role. It is helpful for discovering antonyms, synonyms, and sentences with comparable meanings, and it converts each word into a vector (Wang, Ma, & Zhang, 2016). Word2vec contains two different models for parameter updating: the Continuous Bag of Words (CBOW) model and the skip-gram model. CBOW forecasts a word from the context of its surroundings, while skip-gram uses a word to forecast its adjacent words. Both methods use three layers: an input layer, a projection layer, and an output layer. Here, the CBOW approach is taken as an instance to clarify the working of word2vec.
A sentence S = (w1, …, wt, …, wn) is assumed, where wt refers to the target term. Then the input layer is defined as follows:
(5)
where c(v(wt)) refers to the context of the term v(wt). Next, the projection layer is used to construct a contextual vector v(wt) as follows:
(6)
In the output layer, a word is assigned to a Leaf Node (LN) of a Huffman tree according to its frequency in the corpus. Every word has a single path between the Root Node (RN) and its LN. Using the logistic
model, the likelihood of choosing the left or right child can be computed at every node except the leaf nodes, which is given by:
(7)
At every node in the tree, a product of the likelihoods p(v(xw) | c(v(xw))) can be learned, which is given by:
(8)
Here, the jth digit of word w's Huffman code is defined by $d_j^w \in \{0, 1\}$, and j denotes any node on the path excluding the LN.
By maximizing the log-likelihood, the objective function can be learned, as given by (9). The gradient descent approach is then used to update θ, v(xw), and the related word vectors.
(9)
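A short Gensim-based sketch of CBOW training with hierarchical softmax is given below, together with a simple averaged document vector; the vector size and window are assumptions, and averaging word vectors is only one possible way to turn word embeddings into a document-level feature.

```python
# Word2vec (CBOW) sketch; the tiny token lists are illustrative only.
import numpy as np
from gensim.models import Word2Vec

tokenized_reports = [
    ["auditor", "noted", "unusual", "revenue", "recognition"],
    ["revenue", "cash", "flow", "line", "prior", "period"],
]
w2v = Word2Vec(sentences=tokenized_reports,
               vector_size=50,   # embedding dimensionality (assumed)
               window=3,         # context window size (assumed)
               sg=0,             # 0 = CBOW, 1 = skip-gram
               hs=1,             # hierarchical softmax over the Huffman tree
               min_count=1)

def document_vector(tokens):
    # Average the word vectors as one simple document-level feature
    vectors = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vectors, axis=0) if vectors else np.zeros(w2v.vector_size)

print(document_vector(tokenized_reports[0])[:5])
```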
3.4. FEATURE SELECTION USING HHO
Feature selection is a crucial step for text classification. It is the procedure of choosing a certain subset of the terms of the training set, and these terms are used in the subsequent classification procedure. It also reduces the size of the data, improves the classification accuracy by removing noisy features, eliminates the overfitting problem, and makes the training faster. The HHO algorithm is introduced in the feature selection process to choose the optimal, finest features for text classification. This algorithm analyses the candidate features to obtain the most relevant ones.
3.4.1. HARRIS HAWKS OPTIMIZATION ALGORITHM
HHO is inspired by the behaviour of Harris hawks: the way they discover prey, the surprise pounce, and the different attack strategies they use in nature (Heidari et al., 2019). The hawks are treated as the candidate solutions and the best solution is termed the prey. Using their powerful eyes, the Harris hawks try to
track the prey and execute a surprise pounce to catch it once detected. In this process, three feature sets, TF-IDF, LDA, and word2vec, are taken as input. These features are not identical for every text; therefore, HHO is used to select the optimal features for the classification of the text.
Generally, HHO includes exploration and exploitation stages, and the algorithm transfers from exploration to exploitation based on the escaping energy of the prey (E), which is given by:
$E = 2E_0\left(1 - \frac{t}{T}\right)$  (10)
$E_0 = 2r - 1$  (11)
Here, the current iteration is denoted by t, the maximum number of iterations is represented by T, the initial energy E0 lies in the interval [-1, 1], and r is a random number in [0, 1].
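A small sketch of this energy schedule is given below. The rule that |E| >= 1 triggers exploration and |E| < 1 triggers exploitation follows the original HHO formulation (Heidari et al., 2019) rather than an explicit statement in this paper, and the variable names are illustrative.

```python
# Escaping-energy schedule of Eqs. (10)-(11) and the usual phase switch.
import numpy as np

def escaping_energy(t, T, rng):
    E0 = 2 * rng.random() - 1          # Eq. (11): initial energy in [-1, 1]
    return 2 * E0 * (1 - t / T)        # Eq. (10): magnitude decays over the iterations

rng = np.random.default_rng(0)
T = 100
for t in (0, 50, 99):
    E = escaping_energy(t, T, rng)
    phase = "exploration" if abs(E) >= 1 else "exploitation"
    print(f"iteration {t}: E = {E:+.3f} -> {phase}")
```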
3.4.1.1. EXPLORATION PHASE
In this phase, the hawks perch at random locations, and the position of each hawk is updated as:
$X(t+1) = \begin{cases} X_k(t) - r_1\,\lvert X_k(t) - 2 r_2 X(t) \rvert, & q \ge 0.5 \\ (X_r(t) - X_m(t)) - r_3\,(lb + r_4(ub - lb)), & q < 0.5 \end{cases}$  (12)
Here, the location of the hawk is defined by X, the location of a randomly chosen hawk is denoted as Xk, and the location of the prey is defined as Xr. The lower and upper limits of the search space are denoted by lb and ub, respectively. The five independent random numbers r1, r2, r3, r4, and q lie in the range [0, 1]. The average location of the present population of hawks is defined by Xm and is given by:
$X_m(t) = \frac{1}{N}\sum_{n=1}^{N} X_n(t)$  (13)
Here, the nth hawk is denoted as Xn and the number of hawks is defined by N.
3.4.1.2. FITNESS FUNCTION
The tness value is computed for each hawk and stored for future reference. The tness function of this
feature selection process is computed by:
(14)
3.4.1.3. EXPLOITATION PHASE
In this phase, the location of the hawk is updated according to four different conditions. Which condition applies depends on the chance of the prey successfully escaping (r < 0.5) or not (r ≥ 0.5) before the surprise pounce, and on the escaping energy of the prey (E).
Soft Besiege
This stage occurs only if |E| ≥ 0.5 and r ≥ 0.5. Here, the location of the hawk is updated by the following expression:
$X(t+1) = \Delta X(t) - E\,\lvert J\,X_r(t) - X(t) \rvert$  (15)
Here, the difference between the position of the prey and the current position of the hawk is denoted as ΔX, and the jump strength is denoted by J. Both parameters are defined as:
$\Delta X(t) = X_r(t) - X(t)$  (16)
$J = 2(1 - r_5)$  (17)
where r5 is a random value in the range [0, 1] that changes in every iteration.
Hard Besiege
This phase happens only if |E| < 0.5 and r ≥ 0.5. Here, the location of the hawk is updated by the following expression:
$X(t+1) = X_r(t) - E\,\lvert \Delta X(t) \rvert$  (18)
Soft Besiege with Progressive Rapid Dives
This stage happens if |E| ≥ 0.5 and r < 0.5. The hawk progressively selects the best possible dive to catch the prey. Here, two different candidate solutions are produced by:
$Y = X_r(t) - E\,\lvert J\,X_r(t) - X(t) \rvert$  (19)
$Z = Y + \alpha \times LF(D)$  (20)
Here, the newly produced candidate positions are denoted by Y and Z, the total number of dimensions is denoted as D, α is a random vector, and LF is the Levy flight function, which is given by:
$LF(x) = 0.01 \times \frac{u\,\sigma}{\lvert v \rvert^{1/\beta}}$  (21)
Here, u and v are independent random numbers drawn from the standard normal distribution, and σ is given by:
$\sigma = \left(\frac{\Gamma(1+\beta)\,\sin(\pi\beta/2)}{\Gamma\left(\frac{1+\beta}{2}\right)\beta\,2^{(\beta-1)/2}}\right)^{1/\beta}$  (22)
where β is a constant value fixed to 1.5. The location of the hawk is then updated by:
$X(t+1) = \begin{cases} Y, & \text{if } F(Y) < F(X(t)) \\ Z, & \text{if } F(Z) < F(X(t)) \end{cases}$  (23)
Where the tness function is dened as F(.) , Y and Z are two dierent solutions gained from Equations
(19) and (20).
Hard Besiege with Progressive Rapid Dives
This process occurs if |E| < 0.5 and r < 0.5. The two candidate solutions are produced by:
$Y = X_r(t) - E\,\lvert J\,X_r(t) - X_m(t) \rvert$  (24)
$Z = Y + \alpha \times LF(D)$  (25)
The location of the hawk is updated by:
$X(t+1) = \begin{cases} Y, & \text{if } F(Y) < F(X(t)) \\ Z, & \text{if } F(Z) < F(X(t)) \end{cases}$  (26)
where Y and Z are the two new candidate solutions obtained from Equations (24) and (25).
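The sketch below collects the exploitation moves for a single hawk with a continuous position vector. It is an illustrative re-implementation based on Equations (15)-(26) and the cited HHO paper, not the authors' code, and the binary mapping needed to turn positions into selected feature subsets is omitted.

```python
# One exploitation move of a single hawk, following Eqs. (15)-(26).
import numpy as np
from math import gamma, pi, sin

def levy_flight(D, beta=1.5, rng=None):
    # Eqs. (21)-(22): Levy-distributed step of dimension D
    if rng is None:
        rng = np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, D)
    v = rng.normal(0.0, 1.0, D)
    return 0.01 * u / np.abs(v) ** (1 / beta)

def exploit(X, X_prey, X_mean, E, r, fitness, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    J = 2 * (1 - rng.random())                         # Eq. (17): jump strength
    if r >= 0.5 and abs(E) >= 0.5:                     # soft besiege, Eqs. (15)-(16)
        return (X_prey - X) - E * np.abs(J * X_prey - X)
    if r >= 0.5:                                       # hard besiege, Eq. (18)
        return X_prey - E * np.abs(X_prey - X)
    if abs(E) >= 0.5:                                  # soft besiege with rapid dives, Eq. (19)
        Y = X_prey - E * np.abs(J * X_prey - X)
    else:                                              # hard besiege with rapid dives, Eq. (24)
        Y = X_prey - E * np.abs(J * X_prey - X_mean)
    Z = Y + rng.random(X.size) * levy_flight(X.size, rng=rng)   # Eqs. (20)/(25)
    fX = fitness(X)                                    # greedy selection, Eqs. (23)/(26)
    if fitness(Y) < fX:
        return Y
    if fitness(Z) < fX:
        return Z
    return X
```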
3.5. OPTIMIZED DNN BASED CLASSIFICATION USING DHO
The structure of the DNN includes an input layer, hidden layers, and an output layer, as shown in Figure 2. The network is constructed by adjusting the fitness of its weights, and the DNN updates the weight values in the hidden layers using the DHO algorithm (Brammya et al., 2019). Owing to the repeated training iterations, this system closely fits the decision boundary of the considered training data. The total number of nodes in the hidden layers is evaluated as:
(27)
Figure 2. DNN with SoftMax regression.
Source: own elaboration.
Here, the number of hidden-layer nodes is n, the number of input-layer nodes is a, the number of output-layer nodes is b, and c is a constant in the range 0 ≤ c ≤ 1. The sigmoid function is used as the activation function to provide non-linear capability; it is computed as:
$f(x) = \frac{1}{1 + e^{-x}}$  (28)
The input information of the system is passed through the mapping function, defined as Mf:
(29)
Here w is the weight matrix and β is the bias between the output and hidden layers. For a data sample (x, l), the loss is computed as:
(30)
Here, Ws and bs are the weight and bias subsets, and the number of neurons in the hidden layer is denoted by m. The Cross-Entropy (CE) used for training and testing the model is taken as the loss function of the deep neural network. It is estimated as:
$CE = -\frac{1}{n}\sum_{k} y_k \log \hat{Y}_k$  (31)
Here, the number of training samples is n, the kth output from the training set is yk, and the expected kth output is Ŷk. The network weight values are estimated by the DHO method.
The old and new solutions are then compared, and only the best solutions are kept for the next iteration. Furthermore, the method only requires the population size to be set, and each iteration updates the solutions in a single stage.
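To make the classifier concrete, the following minimal sketch (an illustrative re-implementation, not the authors' code; the layer sizes and the flat parameter packing are assumptions) builds a feed-forward DNN with sigmoid hidden layers and a SoftMax output, and exposes the cross-entropy of Eq. (31) as a fitness function that a population-based optimizer such as DHO can minimise over a flat weight vector.

```python
# Feed-forward DNN with sigmoid hidden layers, SoftMax output, and CE fitness.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))          # Eq. (28)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

class DNN:
    def __init__(self, layer_sizes):
        # e.g. [n_features, 32, 2]: one hidden layer of 32 nodes, 2 output classes
        self.shapes = list(zip(layer_sizes[:-1], layer_sizes[1:]))
        self.n_weights = sum(a * b + b for a, b in self.shapes)

    def forward(self, X, flat_weights):
        # Unpack the flat weight vector produced by the optimizer into (W, b) pairs
        out, i = X, 0
        for layer, (a, b) in enumerate(self.shapes):
            W = flat_weights[i:i + a * b].reshape(a, b); i += a * b
            bias = flat_weights[i:i + b]; i += b
            z = out @ W + bias
            out = softmax(z) if layer == len(self.shapes) - 1 else sigmoid(z)
        return out

    def cross_entropy(self, X, y_onehot, flat_weights):
        # Eq. (31): CE = -(1/n) * sum_k y_k log(y_hat_k)
        probs = self.forward(X, flat_weights)
        return -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))
```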
In DHO, two hunters, the leader and the successor, must be at their best positions. To hunt the deer, they update their angle and position. The leader updates its angle and position as:
(32)
where Yi is the current position, Yi+1 is the next position, and p is a random number in [0, 2]. The leader's present position Ylead is taken from the present population.
The position can also be updated using the successor's position as:
(33)
where the successor's position is Ysuccessor. The coefficient factors are calculated as:
(34)
(35)
Here imax is the maximum number of iterations, b is a random number between -1 and 1, and c is a number between 0 and 1. The mean value of the leader and successor positions is used for the weight update:
(36)
The current and earlier solutions are compared: the earlier solution is replaced if the new solution is better; otherwise, the earlier solution is kept. This procedure is repeated until the end condition is satisfied. The DHO algorithm for weight estimation is shown in Figure 3.
Figure 3. DHO for weight updation.
Source: own elaboration.
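The sketch below gives one schematic reading of this weight search, assuming only the leader/successor roles, the leader-successor mean as the attraction point, the random factor p in [0, 2], and the keep-if-better rule described above; the exact angle and position updates of Eqs. (32)-(36) are not reproduced, and all constants are illustrative.

```python
# Schematic DHO-style population search over flat weight vectors.
import numpy as np

def dho_optimize(fitness, dim, pop_size=30, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    population = rng.uniform(-1.0, 1.0, (pop_size, dim))     # candidate weight vectors
    scores = np.array([fitness(p) for p in population])
    for _ in range(iterations):
        order = np.argsort(scores)
        leader, successor = population[order[0]], population[order[1]]
        target = 0.5 * (leader + successor)                  # mean of leader and successor
        for i in range(pop_size):
            p = rng.uniform(0.0, 2.0)                        # random factor p in [0, 2]
            candidate = population[i] + p * rng.random(dim) * (target - population[i])
            score = fitness(candidate)
            if score < scores[i]:                            # keep only improving solutions
                population[i], scores[i] = candidate, score
    return population[np.argmin(scores)]

# Example wiring with the DNN sketch above (layer sizes are illustrative):
#   net = DNN([X_train.shape[1], 32, 2])
#   best_w = dho_optimize(lambda w: net.cross_entropy(X_train, y_train_onehot, w),
#                         net.n_weights)
```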
3.6. DATASET DESCRIPTION
Standard datasets are used for FSF detection. The full dataset is taken from https://surfdrive.surf.nl/files/index.php/s/m34LCElefSj6M8y. The annual reports are stored in two zip files: one file contains the annual reports in the 'fraud' category and the other contains the reports in the 'no fraud' category. The dataset includes 1646 statements: 1319 no-fraud statements and 327 fraud statements. For this work, 70% of the data is used for training and 30% for testing.
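A minimal loading and splitting sketch is given below; the folder names, the file extension, and the label encoding are assumptions, since the paper only states that the reports are stored in two zip archives.

```python
# Dataset loading sketch: read the reports from two extracted folders and apply the 70/30 split.
from pathlib import Path
from sklearn.model_selection import train_test_split

documents, labels = [], []
for label, folder in [(1, "fraud"), (0, "no_fraud")]:   # assumed folder names
    for path in Path(folder).glob("*.txt"):
        documents.append(path.read_text(errors="ignore"))
        labels.append(label)

# 70% training, 30% testing, stratified to keep the 327/1319 class ratio
X_train, X_test, y_train, y_test = train_test_split(
    documents, labels, test_size=0.30, stratify=labels, random_state=42
)
print(len(X_train), "training documents,", len(X_test), "test documents")
```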
4. RESULTS
The DNN-DHO method is proposed here to optimize the DNN model for the detection of FSF. Different classifiers, namely DNN, K-nearest neighbour (KNN), SVM, backpropagation neural network (BP-NN), classification and regression tree (CART), Bayes classifier (Bayes), and logistic regression (LR), are compared with the proposed approach and a comparative performance evaluation is made.
4.1. PERFORMANCE ANALYSIS
The proposed DNN-DHO approach is executed on the financial statement dataset and correctly identifies the fraud and non-fraud statements. Initially, the pre-processing step, which includes tokenization, filtering, and lemmatization, is executed on the text data. After that, the different feature extraction methods TF-IDF, LDA, and word2vec are applied. Then, the HHO algorithm selects the optimal features, which the DNN-DHO algorithm uses to classify the financial statements. The proposed method provides better outcomes than the standard classifiers CART, DNN, SVM, Bayes, LR, BP-NN, and KNN. The performance metrics are evaluated for the different classifiers, and the accuracy obtained by this method (96%) is better than that of the other standard classifiers.
Figure 4. Confusion matrix of the proposed DNN-DHO classifier (rows: actual class, columns: predicted class).

                    Predicted Fraud    Predicted Non-Fraud
Actual Fraud              309                  18
Actual Non-Fraud           35                1284

Source: own elaboration.
The confusion matrix obtained by the proposed method is shown in Figure 4. The financial statements consist of 327 fraud statements and 1319 non-fraud statements. Of the 327 fraud statements, 309 are correctly identified as fraud, while the remaining 18 are wrongly identified as non-fraud. Similarly, of the 1319 non-fraud statements, 1284 are correctly identified as non-fraud, while the remaining 35 are wrongly identified as fraud. Therefore, the proposed scheme properly distinguishes fraud from non-fraud financial statements. The accuracy evaluation is shown in Figure 5.
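As a quick check, the snippet below re-derives the headline metrics from the counts in Figure 4; the values reported for DNN-DHO in Table 1 are reproduced when the non-fraud class is treated as the positive class.

```python
# Re-deriving the Table 1 metrics for DNN-DHO from the Figure 4 confusion matrix
# (positive class = non-fraud: TP = 1284, FN = 35, FP = 18, TN = 309).
TP, FN, FP, TN = 1284, 35, 18, 309

accuracy    = (TP + TN) / (TP + TN + FP + FN)       # 1593 / 1646, about 0.9678
sensitivity = TP / (TP + FN)                        # about 0.9735
specificity = TN / (TN + FP)                        # about 0.9450
precision   = TP / (TP + FP)                        # about 0.9862
f1          = 2 * precision * sensitivity / (precision + sensitivity)   # about 0.9798
fpr, fnr    = FP / (FP + TN), FN / (FN + TP)        # about 0.0550 and 0.0265

print(f"accuracy={accuracy:.4f}, sensitivity={sensitivity:.4f}, "
      f"precision={precision:.4f}, F1={f1:.4f}, specificity={specificity:.4f}")
```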
Figure 5. Accuracy evaluation of the proposed DNN-DHO classifier and the standard classifiers (DNN, SVM, LR, KNN, CART, Bayes, BPNN).
Source: own elaboration.
The outcomes of the proposed classifier and the standard classifiers SVM, CART, Bayes, BP-NN, DNN, LR, and KNN are shown in Table 1 and Figure 6. The accuracy is 96% for DNN-DHO, 93% for DNN, 86% for SVM, 74% for CART, 73% for BP-NN, 85% for LR, 74% for Naïve Bayes, and 79% for KNN. The DNN-DHO method also outperforms the others on all remaining metrics, so the proposed approach is better than the existing classifiers for the classification of FSF.
Table 1. Performance comparison between the DNN-DHO classifier and the other classifiers.

Performance parameter   DNN-DHO (Proposed)   DNN      SVM      LR       KNN      CART     Bayes    BPNN
Accuracy                0.9678               0.93     0.86     0.85     0.79     0.74     0.74     0.73
Sensitivity             0.9734               0.94     0.89     0.87     0.82     0.83     0.82     0.81
FPR                     0.055                0.065    0.16     0.16     0.23     0.34     0.34     0.35
FNR                     0.0265               0.06     0.106    0.12     0.17     0.16     0.17     0.18
Precision               0.9861               0.93     0.84     0.84     0.77     0.71     0.70     0.69
F1 score                0.9797               0.93     0.87     0.85     0.80     0.76     0.76     0.75
Specificity             0.9449               0.935    0.84     0.83     0.76     0.66     0.65     0.64
BER                     0.0321               0.0625   0.133    0.14     0.20     0.25     0.25     0.26
AUC                     0.9423               0.9178   0.8725   0.86     0.79     0.78     0.75     0.72

Source: own elaboration.
Figure 6. Performance comparison between the DNN-DHO classifier and the other classifiers (DNN, SVM, LR, KNN, CART, Bayes, BPNN).
Source: own elaboration.
5. CONCLUSIONS
In this paper, an optimized deep neural network-based FSF detection approach in text mining has been proposed. The fraud detection model starts with a collection of financial reports from both fraud and
no-fraud organizations. The pre-processing stage is performed through lemmatization, filtering, and tokenization. Then the TF-IDF, LDA, and word2vec approaches are used to mine the information concealed in the documents of the fraud and no-fraud organizations. Further, the HHO procedure is used to select the finest features. The DNN-DHO classifier then uses these features, with a SoftMax output layer, to classify the statements as fraud or no-fraud. In the classification process, the weights of the whole network are updated by the DHO algorithm. The outcomes show that the proposed method is the best of the evaluated models for detecting FSF. The accuracy (96%), sensitivity (97%), precision (98%), F1 score (97%), specificity (94%), BER (0.03), FPR (0.05), FNR (0.026), and AUC (0.94) calculated for the developed method are compared with those of the existing classifiers. The proposed approach provides better performance results than the other classifiers: BP-NN, DNN, CART, SVM, LR, KNN, and Bayes.
REFERENCES
Asare, S. K., Wright, A., & Zimbelman, M. F. (2015). Challenges facing auditors in detecting financial statement fraud: Insights from fraud investigations. Journal of Forensic and Investigative Accounting, 7(2), 63-111. http://web.nacva.com/JFIA/Issues/JFIA-2015-2_4.pdf
Brammya, G., Praveena, S., Ninu, N. S., Ramya, R., Rajakumar, B. R., & Binu, D. (2019). Deer Hunting Optimization Algorithm: A New Nature-Inspired Meta-heuristic Paradigm. The Computer Journal, bxy133. https://doi.org/10.1093/comjnl/bxy133
Chen, Y.-J., Liou, W.-C., Chen, Y.-M., & Wu, J.-H. (2019). Fraud detection for financial statements of business groups. International Journal of Accounting Information Systems, 32(C), 1-23. https://ideas.repec.org/a/eee/ijoais/v32y2019icp1-23.html
Chi, D.-J., Chu, C.-C., & Chen, D. (2019). Applying Support Vector Machine, C5.0, and CHAID to the Detection of Financial Statements Frauds. In International Conference on Intelligent Computing, pp. 327-336. Springer, Cham.
Dalnial, H., Kamaluddin, A., Sanusi, Z. M., & Khairuddin, K. S. (2014). Detecting fraudulent financial reporting through financial statement analysis. Journal of Advanced Management Science, 2(1), 17-22. http://www.joams.com/index.php?m=content&c=index&a=show&catid=36&id=108
Dong, W., Liao, S., & Liang, L. (2016). Financial Statement Fraud Detection using Text Mining: A Systemic Functional Linguistics Theory Perspective. In Pacific Asia Conference on Information Systems (PACIS), p. 188. https://core.ac.uk/download/pdf/301369656.pdf
Gray, G. L., & Debreceny, S. R. (2014). A taxonomy to guide research on the application of data mining to fraud detection in financial statement audits. International Journal of Accounting Information Systems, 15(4), 357-380. https://doi.org/10.1016/j.accinf.2014.05.006
Gupta, R., & Gill, N. S. (2012a). A data mining framework for prevention and detection of financial statement fraud. International Journal of Computer Applications, 50(8). https://research.ijcaonline.org/volume50/number8/pxc3880889.pdf
Gupta, R., & Gill, N. S. (2012b). Financial statement fraud detection using text mining. International Journal of Advanced Computer Science and Applications (IJACSA), 3(12). http://dx.doi.org/10.14569/IJACSA.2012.031230
Gupta, R., & Gill, N. S. (2012c). Prevention and detection of financial statement fraud–An implementation of data mining framework. International Journal of Advanced Computer Science and Applications (IJACSA), 3(8). http://dx.doi.org/10.14569/IJACSA.2012.030825
Hajek, P. (2019). Interpretable Fuzzy Rule-Based Systems for Detecting Financial Statement Fraud. In IFIP International Conference on Artificial Intelligence Applications and Innovations, pp. 425-436. Springer, Cham.
Hajek, P., & Henriques, R. (2017). Mining corporate annual reports for intelligent detection of financial statement fraud–A comparative study of machine learning methods. Knowledge-Based Systems, 128, 139-152. https://doi.org/10.1016/j.knosys.2017.05.001
Heidari, A. A., Mirjalili, S., Faris, H., Aljarah, I., Mafarja, M., & Chen, H. (2019). Harris hawks optimization: Algorithm and applications. Future Generation Computer Systems, 97, 849-872. https://doi.org/10.1016/j.future.2019.02.028
Jan, C.-L. (2018). An effective financial statement fraud detection model for the sustainable development of financial markets: Evidence from Taiwan. Sustainability, 10(2), 513. https://doi.org/10.3390/su10020513
Jelodar, H., Wang, Y., Yuan, C., Feng, X., Jiang, X., Li, Y., & Zhao, L. (2019). Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey. Multimedia Tools and Applications, 78(11), 15169-15211. https://arxiv.org/abs/1711.04305
Kalra, S., Li, L., & Tizhoosh, H. R. (2019). Automatic Classification of Pathology Reports using TF-IDF Features. arXiv preprint arXiv:1903.07406. https://arxiv.org/abs/1903.07406
Kanapickienė, R., & Grundienė, Ž. (2015). The model of fraud detection in financial statements by means of financial ratios. Procedia - Social and Behavioral Sciences, 213, 321-327. https://doi.org/10.1016/j.sbspro.2015.11.545
Kumar, B. S., & Ravi, V. (2016). A survey of the applications of text mining in the financial domain. Knowledge-Based Systems, 114, 128-147. https://doi.org/10.1016/j.knosys.2016.10.003
Lin, C., Chiu, A., Huang, S. Y., & Yen, D. C. (2015). Detecting the financial statement fraud: The analysis of the differences between data mining techniques and experts' judgments. Knowledge-Based Systems, 89, 459-470. https://www.semanticscholar.org/paper/Detecting-the-financial-statement-fraud%3A-The-of-the-Lin-Chiu/48bc08514070341439e382f887faba42b21212d9
Manurung, D. T. H., & Hardika, A. L. (2015). Analysis of factors that influence financial statement fraud in the perspective fraud diamond: Empirical study on banking companies listed on the Indonesia Stock Exchange year 2012 to 2014. In International Conference on Accounting Studies (ICAS), 279-286. https://core.ac.uk/download/pdf/42984276.pdf
Ngai, E. W. T., Hu, Y., Wong, Y. H., Chen, Y., & Sun, X. (2011). The application of data mining techniques in financial fraud detection: A classification framework and an academic review of the literature. Decision Support Systems, 50(3), 559-569. https://doi.org/10.1016/j.dss.2010.08.006
Omar, N. B., Koya, R. K., Sanusi, Z. M., & Shafie, N. A. (2014). Financial Statement Fraud: A Case Examination Using Beneish Model and Ratio Analysis. International Journal of Trade, Economics and Finance, 5, 184-186. https://www.semanticscholar.org/paper/Financial-Statement-Fraud%3A-A-Case-Examination-Using-Omar-Koya/75657feb5f290f2c5447eb71573b3b6753c17bfb
Patel, H., Parikh, S., Patel, A., & Parikh, A. (2019). An application of ensemble random forest classifier for detecting financial statement manipulation of Indian listed companies. In Advances in Intelligent Systems and Computing, Recent Developments in Machine Learning and Data Analytics. Springer Proceedings. https://www.researchgate.net/profile/Satyen-Parikh/publication/327604170_An_Application_of_Ensemble_Random_Forest_Classifier_for_Detecting_Financial_Statement_Manipulation_of_Indian_Listed_Companies_IC3_2018/links/5e8b631e299bf1307983c98e/An-Application-of-Ensemble-Random-Forest-Classifier-for-Detecting-Financial-Statement-Manipulation-of-Indian-Listed-Companies-IC3-2018.pdf
Perols, J. L., & Lougee, B. A. (2011). The relation between earnings management and financial statement fraud. Advances in Accounting, 27(1), 39-53. https://doi.org/10.1016/j.adiac.2010.10.004
Ravisankar, P., Ravi, V., Rao, G. R., & Bose, I. (2011). Detection of financial statement fraud and feature selection using data mining techniques. Decision Support Systems, 50(2), 491-500. https://doi.org/10.1016/j.dss.2010.11.006
Rezaee, Z., & Kedia, B. L. (2012). Role of corporate governance participants in preventing and detecting financial statement fraud. Journal of Forensic & Investigative Accounting, 4(2).
Song, X.-P., Hu, Z.-H., Du, J.-G., & Sheng, Z.-H. (2014). Application of machine learning methods to risk assessment of financial statement fraud: evidence from China. Journal of Forecasting, 33(8), 611-626. https://doi.org/10.1002/for.2294
Temponeras, G. S., Alexandropoulos, S. N., Kotsiantis, S. B., & Vrahatis, M. N. (2019). Financial Fraudulent Statements Detection through a Deep Dense Artificial Neural Network. In 10th International Conference on Information, Intelligence, Systems, and Applications (IISA), pp. 1-5. IEEE. https://ieeexplore.ieee.org/abstract/document/8900741
Throckmorton, C. S., Mayew, W. J., Venkatachalam, M., & Collins, L. M. (2015). Financial fraud detection using vocal, linguistic and financial cues. Decision Support Systems, 74, 78-87. https://doi.org/10.1016/j.dss.2015.04.006
Wang, Z., Ma, L., & Zhang, Y. (2016). A Hybrid Document Feature Extraction Method Using Latent Dirichlet Allocation and Word2Vec. In 2016 IEEE First International Conference on Data Science in Cyberspace (DSC), 98-103. https://www.semanticscholar.org/paper/A-Hybrid-Document-Feature-Extraction-Method-Using-Wang-Ma/840894b784378fe64ef977c44db759b8aa0527cf