DEEP LEARNING BASED MISSING OBJECT DETECTION AND PERSON IDENTIFICATION: AN APPLICATION FOR SMART CCTV

R. C. Dharmik

Assistant Professor, Department of Information Technology, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra, (India).

Sushilkumar Chavhan

Assistant Professor, Department of Information Technology, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra, (India).

S. R. Sathe

Professor, Department of CSE, VNIT Nagpur, (India).

Reception: 10/09/2022 Acceptance: 25/09/2022 Publication: 29/12/2022

Suggested citation:

Dharmik, R. C., Chavhan, S., and Sathe, S. R. (2022). Deep learning based missing object detection and person identification: an application for smart CCTV. 3C Tecnología. Glosas de innovación aplicadas a la pyme, 11(2), 51-57. https://doi.org/10.17993/3ctecno.2022.v11n2e42.51-57

https://doi.org/10.17993/3ctecno.2022.v11n2e42.51-57

3C Tecnología. Glosas de innovación aplicadas a la pyme. ISSN: 2254-4143

Ed. 42 Vol. 11 N.º 2 August - December 2022

ABSTRACT

Security and protection are among the most pressing concerns in today's rapidly developing world, and deep learning and computer vision help to address both. Object detection is the computer vision subtask that allows us to recognise objects. Video is a common input for detection, and image processing helps to improve the effectiveness of state-of-the-art techniques. CCTV is a key element in combining these technologies. In this article, we process CCTV data in real time using a deep convolutional neural network. The main objective is to make the content of the scene the centre of the analysis. Using the YOLO technique, we were able to detect the missing item with a 10% improvement in sparsity over the current state-of-the-art algorithm in the context of surveillance systems, where object detection is a crucial step. The detection can be used to trigger immediate further action.

KEYWORDS

Deep Learning, Object Detection, Computer vision

1. INTRODUCTION

Object detection involves understanding the entire image: locating the objects it contains and assigning each one to the correct category. It is a subtask of computer vision, which also covers face detection and skeleton identification (Zhao, Zheng, Xu, and Wu, 2019; Felzenszwalb, Girshick, Mcallester, and Ramanan, 2010; Sung and Poggio, 1998; Dollár, Wojek, Schiele, and Perona, 2011; Sampat and Bovik, 2003). The aim of object detection is to develop computational models that supply the most fundamental information required by computer vision applications. Industrial adoption has changed the face of computer vision, which now spans identification, surveillance, medicine, robotics, and self-driving cars. In recent years, advances in deep learning have accelerated progress in object detection. Deep learning (DL) is a form of machine learning that uses neural networks to learn in stages, and in that sense it loosely mimics human reasoning. Video analytics is closely tied to deep learning, which has found applications in many fields. Traditional object detection is computationally complex, and its reliance on low-level features causes performance to saturate as the complexity increases (Zhao et al., 2019).

Image classification and detection are the steps used to detect objects for surveillance. The primary task in both operations is to obtain the relevant features, and deep learning is well suited to it: each layer learns from the output of the previous stage and extracts new features. A deep network may be a conventional neural network or a multilayer perceptron, consisting of several hidden layers and activation functions. Images are extracted from the video at a fixed size, as the neural network model requires (Sung and Poggio, 1998), in particular when fully connected networks are used. However, reducing the size causes information loss inside the image, which lowers accuracy, precision, and sparsity as well as applicability to surveillance systems (Zhao et al., 2019). The Viola-Jones detector (Viola and Jones, 2001) and the HOG detector (Dalal and Triggs, 2005) are two examples of conventional object detection algorithms. Detection algorithms based on deep learning fall into two main categories: one-stage and two-stage. The two-stage algorithms, which include RCNN (Girshick, Donahue, Darrell, and Malik, 2013), SPPNet (He, Zhang, Ren, and Sun, 2014), Fast RCNN (Girshick, 2015), Faster RCNN (Ren, He, Girshick, and Sun, 2015), Mask R-CNN, Feature Pyramid Networks (FPN), and G-RCNN, propose object regions using deep features before classification. One-stage algorithms such as YOLO (Redmon, Divvala, Girshick, and Farhadi, 2016), SSD (Liu, Anguelov, Erhan, Szegedy, Reed, Fu, and Berg, 2016), RetinaNet, YOLOv3, and YOLOv4 predict bounding boxes over the image without the region proposal step.

In this paper we present a CCTV-based application that helps to determine whether any object has been stolen from an area (room, cabin, etc.) and to recognise the person involved, thus helping to build a better security infrastructure. The system can be used in official workspaces where there is a pressing need for more secure surveillance than before. It analyses frames and finds stolen objects by using structural similarity. We propose a deep learning-based object recognition system coupled with YOLO for remote surveillance that can accurately and quickly identify the target inside video frames. The proposed system can send data to a localised remote server after automatically detecting a person or item. A lightweight YOLO layer was used to detect and transmit the information, and filters were applied to the data to keep it clean.
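The structural-similarity comparison mentioned above can be illustrated with a short sketch. This is not the implementation used in the paper: it assumes greyscale frames held as NumPy arrays, computes a single-window SSIM per block rather than the usual sliding Gaussian window, and the names `global_ssim` and `changed_blocks`, the block size of 16, and the threshold of 0.7 are all hypothetical choices made for illustration.

```python
import numpy as np

def global_ssim(a, b, data_range=255.0):
    """Single-window SSIM between two greyscale patches of equal shape."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    # Standard stabilising constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator

def changed_blocks(reference, current, block=16, threshold=0.7):
    """Return (row, col) indices of blocks whose SSIM drops below threshold."""
    flagged = []
    height, width = reference.shape
    for r in range(0, height - block + 1, block):
        for c in range(0, width - block + 1, block):
            score = global_ssim(reference[r:r + block, c:c + block],
                                current[r:r + block, c:c + block])
            if score < threshold:
                flagged.append((r // block, c // block))
    return flagged

# Synthetic example: a bright "object" is present in the reference frame
# and absent from the current frame.
rng = np.random.default_rng(0)
empty_scene = rng.integers(0, 50, size=(64, 64)).astype(np.uint8)
with_object = empty_scene.copy()
with_object[16:32, 32:48] = rng.integers(150, 255, size=(16, 16))
print(changed_blocks(with_object, empty_scene))
```

Blocks that are unchanged between the two frames score close to 1, so only the region where the object used to be is reported. A production system would more likely rely on a library SSIM routine and on frames aligned to the same camera pose.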

2. LITERATURE REVIEW

Viola and Jones (2001) proposed a machine learning based object detector, popularly known as the Viola-Jones detector, which computes new features, learns with AdaBoost, and is able to classify more complex features. Dalal and Triggs (2005) proposed the HOG (Histograms of Oriented Gradients) object detection algorithm, which uses a dense overlapping grid and gives better results for person identification, reducing false positive rates with respect to Haar wavelet based detectors. The DPM (deformable part model) of Felzenszwalb et al. (2010), summarised by Forsyth (2014), extended this work for object detection. It adapts the SVM algorithm for the purpose of object detection and uses a divide-and-conquer approach with a root filter and several part filters, which makes it a multi-instance learning algorithm. All of the above traditional methods far surpassed many earlier algorithms in terms of accuracy.
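To make the HOG idea more concrete, the sketch below computes the gradient-orientation histogram for a single cell. It is a simplified illustration, not the full descriptor: it assumes 9 unsigned orientation bins over 0 to 180 degrees, assigns each pixel to one bin without interpolation, and omits the overlapping block normalisation. The function name `cell_orientation_histogram` is hypothetical.

```python
import numpy as np

def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    cell = cell.astype(np.float64)
    # np.gradient returns the derivative along rows first, then columns.
    gy, gx = np.gradient(cell)
    magnitude = np.hypot(gx, gy)
    # Fold orientations into [0, 180) so opposite directions share a bin.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / bins
    bin_index = np.minimum((angle / bin_width).astype(int), bins - 1)
    hist = np.zeros(bins)
    # Accumulate each pixel's gradient magnitude into its orientation bin.
    np.add.at(hist, bin_index.ravel(), magnitude.ravel())
    return hist

# A horizontal intensity ramp has gradients pointing along the x axis,
# so all of the energy lands in the first bin.
ramp = np.tile(np.arange(8), (8, 1))
print(cell_orientation_histogram(ramp))
```

In the full method, histograms from neighbouring cells are grouped into overlapping blocks and normalised before being passed to a classifier.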

Girshick et al. (2013) proposed the RCNN object detection algorithm, which improved mean average precision. Because it combines a local (region) search with a CNN, it is called RCNN. It brings together computer vision and deep learning and gives better results than the traditional methods. He et al. (2014) proposed the SPPNet model for object detection, which detects objects in images of varying size, whereas most object detection algorithms use a standard input size of 255 × 255. Girshick (2015) proposed the Fast RCNN detector, an advanced version of R-CNN and SPPNet, and suggested dense boxes and sparse proposals to accelerate what is otherwise a costly process.

Ren et al. (2015) proposed the Faster RCNN detector, whose target is to reduce the running time of RCNN. It is best known for its Region Proposal Network (RPN), which moves more of the pipeline into the deep network. A feature selection framework inside the ConvNet, popularly known as FPN (Feature Pyramid Network), was later built on the Faster RCNN framework to address multi-scale problems with deep learning. Redmon et al. (2016) proposed the most popular and widely used algorithm, You Only Look Once (YOLO), a one-stage detector using deep learning. Rather than classifying the image as a whole, it divides the image into regions and simultaneously predicts the bounding boxes and probabilities for each region. The loss function used in training also helps the model to detect objects. YOLO was later improved in YOLOv3 and YOLOv4.
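The region-based prediction described above can be sketched as the mapping from a ground-truth box to the grid cell that is responsible for it. This is an illustration under stated assumptions rather than the paper's code: it assumes an S × S grid with S = 7 as in the original YOLO formulation, boxes given as pixel corner coordinates, and the function name `responsible_cell` is hypothetical.

```python
def responsible_cell(box, image_w, image_h, s=7):
    """Map a pixel box (x_min, y_min, x_max, y_max) to a grid cell.

    Returns (row, col, (x_off, y_off, w, h)), where the offsets are the
    box centre relative to the cell and w, h are relative to the image.
    """
    x_min, y_min, x_max, y_max = box
    # Box centre, normalised to the range [0, 1].
    cx = (x_min + x_max) / 2.0 / image_w
    cy = (y_min + y_max) / 2.0 / image_h
    # The cell containing the centre is responsible for the object.
    col = min(int(cx * s), s - 1)
    row = min(int(cy * s), s - 1)
    x_off = cx * s - col
    y_off = cy * s - row
    w = (x_max - x_min) / image_w
    h = (y_max - y_min) / image_h
    return row, col, (x_off, y_off, w, h)

# A 128 x 64 pixel box centred at (224, 96) in a 448 x 448 image.
print(responsible_cell((160, 64, 288, 128), 448, 448))
```

Because every cell produces its box and class predictions in the same forward pass, the detector needs only one evaluation of the network per frame, which is what makes the approach attractive for real-time CCTV streams.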

3. METHODOLOGY

Following the literature, we apply a CNN with a two-layer architecture, using different activation functions and filters. The process is as follows.

Fig 1: Feature Selection.

Fig 2: Flow of Process.

In this process we use real-time data captured from CCTV and other cameras. The received images are filtered using LBPS parameters, which makes the subsequent processing simpler. They are then passed to a two-layer deep neural network with ReLU activation functions.
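As a rough illustration of a two-layer network with ReLU activation, the sketch below runs a forward pass in NumPy. Several things here are assumptions and not details from the paper: fully connected layers stand in for the convolutional layers, the output uses a softmax, the weights are random and untrained, and the layer sizes and the names `init_params` and `forward` are hypothetical.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative inputs become zero."""
    return np.maximum(x, 0.0)

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_params(n_in, n_hidden, n_out, seed=0):
    """He-style random initialisation for a two-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(0.0, np.sqrt(2.0 / n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Layer 1 extracts features (ReLU); layer 2 scores the classes."""
    hidden = relu(x @ params["w1"] + params["b1"])
    return softmax(hidden @ params["w2"] + params["b2"])

# Four flattened 8 x 8 patches scored against three hypothetical classes.
params = init_params(n_in=64, n_hidden=32, n_out=3)
patches = np.random.default_rng(1).random((4, 64))
print(forward(params, patches).shape)
```

The split mirrors the description in the text, with the first layer producing features and the second layer turning them into a decision.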

4. RESULTS

Step 1: Collection of images. We accept the images from the CCTV camera and then create slices, as shown in the figure below. The slices are stored in a specific folder at a reduced size.

Fig 3: Extraction of images of one frame.
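The slicing in Step 1 could look roughly like the following sketch. It assumes that a frame is a NumPy array of shape (height, width) or (height, width, channels), that slices form a regular grid, and that "reduced size" means spatial subsampling. The paper does not state the grid or the reduction method, so `slice_frame`, `shrink`, and their parameters are hypothetical.

```python
import numpy as np

def slice_frame(frame, rows, cols):
    """Split a frame into rows x cols equal tiles, in row-major order.

    Pixels left over when the frame does not divide evenly are dropped.
    """
    tile_h = frame.shape[0] // rows
    tile_w = frame.shape[1] // cols
    return [frame[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w]
            for r in range(rows) for c in range(cols)]

def shrink(tile, step=2):
    """Reduce a tile by keeping every step-th pixel in each direction."""
    return tile[::step, ::step]

# A synthetic 480 x 640 colour frame cut into a 2 x 2 grid.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
tiles = [shrink(t) for t in slice_frame(frame, 2, 2)]
print(len(tiles), tiles[0].shape)
```

Plain subsampling is the simplest way to reduce the size. An averaging or interpolating resize would lose less detail at the cost of a little more computation.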

Step 2: Extraction of features using the first layer of the deep neural network. The features are selected as the core points of the real-time image stream.

Fig 4: Feature Selection

Step 3: Identification of images using the second deep layer, where the images are identified. The following figure demonstrates the detection of images.

Fig 5: Detection of images.

5. DISCUSSION

The developed system is shown below:

Fig 6: System GUI.

The application is a portable CCTV system with built-in night vision capability. On a sufficiently powerful device, deep learning is added by means of a CNN. Its features include deadly weapon detection, accident detection, fire detection, and more. It works as a standalone device, as shown in the following figure.

Fig 7: Object Detection.

6. CONCLUSION AND FUTURE SCOPE

In this paper we used a two-layer deep network for object detection. With this architecture we were able to achieve accuracy of up to 90%. The CNN layers provide the features that make fast detection of objects possible. We observed that reducing the image size helps to train the network faster, and that the deep network can still provide the essential features. As we implemented this study for security purposes, we are able to detect objects and raise suitable alarm messages. As future scope, a high-end camera could be used to test the accuracy and speed.

REFERENCES

[1] Dalal, N. and Triggs, B. 2005. Histograms of oriented gradients for human detection. Comput. Vision Pattern Recognit. 1, 886–893.

[2] Dollár, P., Wojek, C., Schiele, B., and Perona, P. 2011. Pedestrian detection: An evaluation of the state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence 34, 743–61.

[3] Felzenszwalb, P., Girshick, R., Mcallester, D., and Ramanan, D. 2010. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 1627–45.

[4] Forsyth, D. 2014. Object detection with discriminatively trained part-based models. Computer 47, 6–7.

[5] Girshick, R. 2015. Fast R-CNN.

[6] Girshick, R., Donahue, J., Darrell, T., and Malik, J. 2013. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[7] He, K., Zhang, X., Ren, S., and Sun, J. 2014. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 37.

[8] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A. 2016. SSD: Single shot multibox detector. Vol. 9905. 21–37.

[9] Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. 2016. You only look once: Unified, real-time object detection. 779–788.

[10] Ren, S., He, K., Girshick, R., and Sun, J. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 39.

[11] Sampat, M. and Bovik, A. 2003. Detection of spiculated lesions in mammograms. Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings 1.

[12] Sung, K. and Poggio, T. 1998. Example based learning for view-based human face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 39–51.

[13] Viola, P. and Jones, M. 2001. Rapid object detection using a boosted cascade of simple features. IEEE Conf Comput Vis Pattern Recognit 1, I–511.

[14] Zhao, Z.-Q., Zheng, P., Xu, S.-T., and Wu, X. 2019. Object detection with deep learning: A review. IEEE Transactions on Neural Networks and Learning Systems PP, 1–21.
