DEEP LEARNING BASED MISSING OBJECT
DETECTION AND PERSON IDENTIFICATION:
AN APPLICATION FOR SMART CCTV
R. C. Dharmik
Assistant Professor, Department of Information Technology, Yeshwantrao Chavan College of
Engineering, Nagpur, Maharashtra, (India).
Sushilkumar Chavhan
Assistant Professor, Department of Information Technology, Yeshwantrao Chavan College of
Engineering, Nagpur, Maharashtra, (India).
S. R. Sathe
Professor, Department of CSE, VNIT Nagpur, (India).
Reception: 10/09/2022 Acceptance: 25/09/2022 Publication: 29/12/2022
Suggested citation:
Dharmik, R. C., Chavhan, S., and Sathe, S. R. (2022). Deep learning based missing object detection and person
identification: an application for smart CCTV. 3C Tecnología. Glosas de innovación aplicadas a la pyme, 11(2),
51-57. https://doi.org/10.17993/3ctecno.2022.v11n2e42.51-57
https://doi.org/10.17993/3ctecno.2022.v11n2e42.51-57
3C Tecnología. Glosas de innovación aplicadas a la pyme. ISSN: 2254-4143
Ed. 42 Vol. 11 N.º 2 August - December 2022
ABSTRACT
Security and protection are among the most pressing concerns in today's rapidly developing world. Deep
learning methods and computer vision help to address both. Object detection is the computer vision
subtask that allows us to recognise and locate things in an image. Video is the source considered for
detection here, and image processing helps to increase the effectiveness of state-of-the-art
techniques. Across all of these technologies, CCTV is recognised as a key element. In this article we
process CCTV data in real time using a deep convolutional neural network. The main objective is
to place the content of the scene at the centre of the analysis. Using the YOLO technique, we were able
to detect the missing item with a 10% improvement in sparsity over the current state-of-the-art
algorithm in the context of surveillance systems, where object detection is a crucial step. The
detection can be used to trigger immediate follow-up action.
KEYWORDS
Deep Learning, Object Detection, Computer Vision
1. INTRODUCTION
Object detection involves understanding the entire image: locating each item in it and assigning the
item to the proper category. It is a subtask of computer vision, which also covers face detection and
skeleton identification (Zhao, Zheng, Xu, and Wu, 2019; Felzenszwalb, Girshick, Mcallester, and
Ramanan, 2010; Sung and Poggio, 1998; Dollár, Wojek, Schiele, and Perona, 2011; Sampat and Bovik,
2003). The aim of object detection is to develop computational models that provide the most
fundamental information required by computer vision applications. The ongoing industrial revolution
is changing the face of computer vision tasks, with applications in identification, surveillance,
medicine, robotics, and self-driving cars. In recent years, advances in deep learning have accelerated
progress in object detection. Deep learning (DL) is a specific form of machine learning that uses
neural networks to learn in stages, which lets it mimic aspects of human thought. Video analytics is
closely related to deep learning, which gives it applications in different fields. Traditional object
detection is computationally complex, and its reliance on low-level features leads to performance
saturation as complexity increases (Zhao et al., 2019).
Image classification and detection are the steps involved in detecting objects for surveillance. The
primary task in both operations is to obtain the relevant features, and deep learning is well suited to
this: it learns from earlier stages and extracts new features. A deep network may be a convolutional
neural network or a multilayer perceptron, each consisting of several hidden layers and activation
functions. Images are extracted from the video at a fixed size, as the neural network model requires
when fully connected networks are used (Sung and Poggio, 1998). However, size reduction results in
information loss inside the image, reducing accuracy, precision, and sparsity as well as applicability
to surveillance systems (Zhao et al., 2019). The Viola-Jones detector (Viola and Jones, 2001) and the
HOG detector (Dalal and Triggs, 2005) are two examples of conventional object detection algorithms.
Deep learning based algorithms fall into two main categories: one-stage and two-stage. The two-stage
algorithms, RCNN (Girshick, Donahue, Darrell, and Malik, 2013), SPPNet (He, Zhang, Ren, and Sun,
2014), Fast RCNN (Girshick, 2015), Faster RCNN (Ren, He, Girshick, and Sun, 2015), Mask R-CNN,
Feature Pyramid Networks (FPN), and G-RCNN, propose object regions using deep features before
classification. Without region proposals, one-stage algorithms such as YOLO (Redmon, Divvala,
Girshick, and Farhadi, 2016), SSD (Liu, Anguelov, Erhan, Szegedy, Reed, Fu, and Berg, 2016),
RetinaNet, YOLOv3, and YOLOv4 predict bounding boxes directly over the images.
In this paper we describe an application that uses CCTV to determine whether any object has been
stolen from an area (room, cabin, etc.) and to recognise the person involved, thus helping to build a
better security infrastructure. The system can be used in official workspaces where there is a pressing
need for more secure surveillance than before. It analyses frames and finds stolen objects by using
structural similarity. We propose a deep learning based object recognition system coupled with YOLO
for remote surveillance that can accurately and quickly identify the target inside video frames. The
proposed system can send data to a localised remote server after automatically detecting a person or
item. A thin YOLO layer is used to detect and transmit the information, and we apply filters to the
data to keep it clean.
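The structural-similarity check mentioned above can be illustrated with a minimal sketch. It assumes a global (single-window) form of the SSIM index computed on small grayscale frames stored as lists of rows. The function names `ssim` and `object_missing`, the stabilising constants, and the 0.9 threshold are illustrative assumptions, not values taken from the system described in this paper.

```python
from statistics import fmean


def ssim(frame_a, frame_b, dynamic_range=255.0):
    """Global (single-window) SSIM between two equal-sized grayscale frames."""
    a = [p for row in frame_a for p in row]
    b = [p for row in frame_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and the same size")
    mu_a, mu_b = fmean(a), fmean(b)
    var_a = fmean((x - mu_a) ** 2 for x in a)
    var_b = fmean((y - mu_b) ** 2 for y in b)
    cov = fmean((x - mu_a) * (y - mu_b) for x, y in zip(a, b))
    # Small constants that stabilise the ratio (assumed, conventional choices).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def object_missing(reference, current, threshold=0.9):
    """Flag a possible missing object when similarity drops below the threshold."""
    return ssim(reference, current) < threshold


# Hypothetical 4x4 frames: a bright object on a dark background, then removed.
with_object = [[10, 10, 10, 10],
               [10, 200, 200, 10],
               [10, 200, 200, 10],
               [10, 10, 10, 10]]
without_object = [[10] * 4 for _ in range(4)]
print(object_missing(with_object, without_object))
```

In practice SSIM is usually computed over local windows so that the changed region can also be localised; the global form is used here only to keep the sketch short.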
2. LITERATURE REVIEW
Viola and Jones (2001) proposed a machine learning based object detector, popularly known as the
Viola-Jones detector, which computes new features, learns with AdaBoost, and is able to classify
more complex features. Dalal and Triggs (2005) proposed the HOG (Histograms of Oriented Gradients)
object detection algorithm, which uses a dense overlapping grid and gives better results for
person identification, reducing false positive rates with respect to Haar wavelet based detectors.
Felzenszwalb et al. (2010) extended the above work and proposed the DPM model for object detection (see also Forsyth, 2014). They
modified the SVM algorithm from data mining for the purpose of object detection. DPM uses a divide-and-conquer
approach with a root filter and several part filters, which makes it a multi-instance learning algorithm.
All of the traditional methods above surpassed many earlier algorithms in terms of accuracy.
Girshick et al. (2013) proposed the RCNN object detection algorithm, reporting improved mean average
precision. Because it combines a local region search with a CNN, it is called RCNN. It combines
computer vision and deep learning to obtain better results than traditional methods. He et al. (2014)
proposed the SPPNet model for object detection, which detects objects in images of varying size. This
matters because most object detection algorithms use a standard input size of 255*255. Girshick (2015)
proposed the Fast RCNN detector, an advanced version of RCNN and SPPNet. It uses dense boxes and
sparse proposals to accelerate what is otherwise a costly process.
Ren et al. (2015) proposed the Faster RCNN detector, whose target is to reduce the running time of
RCNN. It is mostly associated with the region proposal network (RPN), as it relies more heavily on the
deep network. A feature selection framework inside the ConvNet, popularly known as FPN (Feature
Pyramid Network), was later built on the Faster RCNN framework to address multi-scale problems with
deep learning. Redmon et al. (2016) proposed the most popular and widely used algorithm, You Only
Look Once (YOLO), a one-stage detector based on deep learning. Rather than examining candidate
regions one at a time, it divides the image into regions and simultaneously predicts the bounding boxes
and probabilities for each region. The model is trained with a loss function that helps it detect
objects. It was later improved in YOLOv3 and YOLOv4.
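To make the region-based prediction concrete, the following sketch shows two building blocks of a YOLO-style detector: finding the grid cell responsible for a box centre, and scoring a predicted box against a reference box with intersection over union (IoU). The 7 x 7 grid, the box format `(x_min, y_min, x_max, y_max)`, and the function names are assumptions made for illustration; this is not the detector used in the paper.

```python
def responsible_cell(box, image_w, image_h, grid=7):
    """Return the (row, col) of the grid cell that contains the box centre."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0
    # Clamp so a centre lying exactly on the right or bottom edge stays in range.
    col = min(int(cx / image_w * grid), grid - 1)
    row = min(int(cy / image_h * grid), grid - 1)
    return row, col


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical 448x448 frame with one labelled box and one prediction.
truth = (100, 100, 200, 200)
prediction = (110, 105, 210, 195)
print(responsible_cell(truth, 448, 448), round(iou(truth, prediction), 3))
```

Each grid cell predicts boxes and class probabilities in a single forward pass, and IoU is the usual measure for deciding whether a predicted box matches a labelled one.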
3. METHODOLOGY
Based on the literature reviewed above, we apply a CNN with a two-layer architecture, using different
activation functions and filters. The process is as follows.
Fig 1: Feature Selection.
Fig 2: Flow of Process.
In this process we use real-time data captured from CCTV and a camera. The received images are
filtered using LBPS parameters, which makes subsequent processing simpler. They are then passed to a
two-layer deep neural network with ReLU activation functions.
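The filtering step can be sketched as follows, assuming that "LBPS" refers to local binary patterns (LBP), a common texture descriptor. The text does not define the abbreviation, so this reading and the function names below are assumptions. Each interior pixel is replaced by an 8-bit code that records which of its eight neighbours are at least as bright as it is.

```python
# Clockwise neighbour offsets (row, col), starting at the top-left pixel.
_NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit local binary pattern of the interior pixel at (r, c)."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_NEIGHBOURS):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_image(image):
    """LBP codes for every interior pixel of a grayscale image (list of rows)."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


def lbp_histogram(image):
    """256-bin histogram of LBP codes, usable as a compact texture feature."""
    hist = [0] * 256
    for row in lbp_image(image):
        for code in row:
            hist[code] += 1
    return hist
```

Because the codes depend only on brightness order within each neighbourhood, they change little under uniform lighting shifts, which is one reason such a filter can simplify later stages.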
4. RESULTS
Step 1: Collection of images. We accept images from the CCTV camera and then create slices, as
shown in the figure below. The slices are stored in a specific folder at a reduced size.
Fig 3: Extraction of images of one frame.
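Step 1 can be pictured with a small sketch that cuts one frame into fixed-size slices. The frame is assumed to be a grayscale image stored as a list of rows, and the tile size and the function name `slice_frame` are hypothetical. Writing the slices to a folder is left out.

```python
def slice_frame(frame, tile_h, tile_w):
    """Split a frame into non-overlapping tiles keyed by (tile_row, tile_col).

    Tiles on the right and bottom edges may be smaller than the requested size.
    """
    if tile_h <= 0 or tile_w <= 0:
        raise ValueError("tile dimensions must be positive")
    height, width = len(frame), len(frame[0])
    tiles = {}
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            tile = [row[left:left + tile_w] for row in frame[top:top + tile_h]]
            tiles[(top // tile_h, left // tile_w)] = tile
    return tiles


# Hypothetical 4x6 frame whose pixel value encodes its position (row*10 + col).
frame = [[r * 10 + c for c in range(6)] for r in range(4)]
slices = slice_frame(frame, tile_h=2, tile_w=3)
print(sorted(slices))
```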
Step 2: Extraction of features using a dedicated one-layer deep neural network. The features are selected as the
core points of the real-time image stream.
Fig 4: Feature Selection
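As a rough illustration of Step 2, the sketch below computes one feature map with a single convolutional filter followed by ReLU. The kernel values, the "valid" (no padding) mode, and the function names are assumptions chosen for illustration. The filters in an actual network are learned and are not given in the text.

```python
def relu(x):
    """Rectified linear unit: pass positive values, zero out the rest."""
    return float(x) if x > 0 else 0.0


def conv2d_relu(image, kernel):
    """Valid cross-correlation (the usual CNN 'convolution') followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("kernel is larger than the image")
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = sum(image[r + i][c + j] * kernel[i][j]
                      for i in range(kh) for j in range(kw))
            row.append(relu(acc))
        feature_map.append(row)
    return feature_map


# Hypothetical image with a dark-to-bright vertical edge, and an edge filter.
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
vertical_edge = [[-1, 1],
                 [-1, 1]]
print(conv2d_relu(image, vertical_edge))
```

The filter responds only where brightness rises from left to right, so the feature map is non-zero along the edge and zero elsewhere.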
Step 3: Identification of images using the second deep layer, where the images are identified. The following
figure demonstrates the detection of images.
Fig 5: Detection of images.
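One way to picture Step 3 is a dense layer followed by softmax that turns extracted features into a label. This is only a sketch under stated assumptions: the weights, biases, labels, and the function names `softmax` and `classify` are hypothetical, and the paper does not specify the form of its second layer.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases, labels):
    """Dense layer plus softmax; returns the most probable label and its score."""
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical two-feature input and hand-picked (not learned) parameters.
label, confidence = classify(
    features=[1.0, 0.0],
    weights=[[2.0, 0.0], [0.0, 2.0]],
    biases=[0.0, 0.0],
    labels=["person", "bag"],
)
print(label, round(confidence, 2))
```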
5. DISCUSSION
The system developed is shown below:
Fig 6: System GUI.
The application uses a portable CCTV unit with built-in night-vision capability. On a sufficiently powerful device,
deep learning is added by means of a CNN. It offers features such as deadly weapon detection, accident detection,
fire detection, and more, and it works as a standalone device, as shown in the following figure.
Fig 7: Object Detection.
6. CONCLUSION AND FUTURE SCOPE
In this paper we used a two-layer deep network for object detection. With this architecture we were
able to achieve accuracy of up to 90%. The CNN layers provide the features that enable fast detection
of objects. In this study we observed that reducing image size helps to train the network faster, while
the deep network can still provide the essential features. As we implemented this
study for security purposes, we are able to detect objects and provide suitable alarm messages. As future
scope, a high-end camera can be used to test accuracy and speed.
REFERENCES
[1] Dalal, N. and Triggs, B. 2005. Histograms of oriented gradients for human detection. Comput. Vision
Pattern Recognit. 1, 886–893.
[2] Dollár, P., Wojek, C., Schiele, B., and Perona, P. 2011. Pedestrian detection: An evaluation of the
state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence 34, 743–61.
[3] Felzenszwalb, P., Girshick, R., Mcallester, D., and Ramanan, D. 2010. Object detection with
discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine
Intelligence 32, 1627–45.
[4] Forsyth, D. 2014. Object detection with discriminatively trained part-based models. Computer 47, 6–7.
[5] Girshick, R. 2015. Fast R-CNN.
[6] Girshick, R., Donahue, J., Darrell, T., and Malik, J. 2013. Rich feature hierarchies for accurate object
detection and semantic segmentation. Proceedings of the IEEE Computer Society Conference on
Computer Vision and Pattern Recognition.
[7] He, K., Zhang, X., Ren, S., and Sun, J. 2014. Spatial pyramid pooling in deep convolutional networks
for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 37.
[8] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A. 2016. SSD: Single shot
multibox detector. Vol. 9905. 21–37.
[9] Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. 2016. You only look once: Unified, real-time
object detection. 779–788.
[10] Ren, S., He, K., Girshick, R., and Sun, J. 2015. Faster R-CNN: Towards real-time object detection with
region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 39.
[11] Sampat, M. and Bovik, A. 2003. Detection of spiculated lesions in mammograms. Annual
International Conference of the IEEE Engineering in Medicine and Biology - Proceedings 1.
[12] Sung, K. and Poggio, T. 1998. Example based learning for view-based human face detection. IEEE
Transactions on Pattern Analysis and Machine Intelligence 20, 39–51.
[13] Viola, P. and Jones, M. 2001. Rapid object detection using a boosted cascade of simple features. IEEE
Conf Comput Vis Pattern Recognit 1, I–511.
[14] Zhao, Z.-Q., Zheng, P., Xu, S.-T., and Wu, X. 2019. Object detection with deep learning: A review.
IEEE Transactions on Neural Networks and Learning Systems PP, 1–21.