Using Closed Circuit Television (CCTV), a human can monitor areas for intrusion, but typically this data is used after the fact to verify incidents or for historical analysis. Given the large number of cameras deployed, it is rarely practical to dedicate a person to each camera feed. Typically, a single security agent will monitor a large number of camera feeds in real time.
Deep learning can solve this problem by automatically detecting not only whether a person enters a camera’s field of view but also whether that person is within a specific area of that field of view. This allows multiple restricted areas to be monitored, with a real-time notification if someone violates a restricted zone. In this example of the Intel® OpenVINO™ toolkit, we’ll look at how video images can be used to identify whether a person enters a user-designated restricted area.
In prior blog posts, we’ve seen examples of face and vehicle detection using images captured by a video camera. In this application, we’ll look at a different type of detection using deep learning to identify a ‘person’ and whether they are in a restricted zone.
Figure 1 shows the pipeline for the Restricted Zone Monitor deep-learning application. Let’s explore this pipeline and the activities that occur.
Figure 1: The Restricted Zone Tracking Pipeline diagram illustrates how this application of the OpenVINO™ toolkit processes a captured image to identify whether a person is in view and determine whether that person is in a user-defined restricted zone. (Source: Author)
This image processing application uses images captured by a video camera mounted above an area that includes a restricted zone. A Convolutional Neural Network (CNN), a type of image processing deep neural network, processes the captured images to determine if a person is violating the restricted zone. First, the CNN identifies whether a person is in the captured frame. If a person is detected, the CNN then checks to see if the person is in the restricted zone. The user defines the restricted zone by drawing a region on a captured image with the mouse. Once the zone is defined, the application generates a notification whenever a detected person enters it.
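To make the two steps above concrete, here is a minimal Python sketch of one way the zone check could work. It is an illustration under stated assumptions, not the sample application's actual code (which is written in Go and C++). It assumes the detector returns rows of (confidence, xmin, ymin, xmax, ymax) with coordinates normalized to the range 0 to 1, and that the restricted zone is an axis-aligned rectangle in pixels. The names Box, people_in_frame, and zone_violated, and the 0.5 confidence threshold, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (assumed zone/person shape)."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int

    def intersects(self, other: "Box") -> bool:
        # Two rectangles overlap only if they overlap on both axes.
        return (self.xmin < other.xmax and other.xmin < self.xmax
                and self.ymin < other.ymax and other.ymin < self.ymax)


def people_in_frame(detections, width, height, min_conf=0.5):
    """Turn normalized detector rows into pixel boxes, dropping weak detections.

    Each row is assumed to be (confidence, xmin, ymin, xmax, ymax) with
    coordinates in the range 0 to 1; this layout is an assumption.
    """
    boxes = []
    for conf, x0, y0, x1, y1 in detections:
        if conf < min_conf:
            continue  # ignore low-confidence detections
        boxes.append(Box(int(x0 * width), int(y0 * height),
                         int(x1 * width), int(y1 * height)))
    return boxes


def zone_violated(people, zone):
    """Return True if any detected person's box overlaps the restricted zone."""
    return any(person.intersects(zone) for person in people)


if __name__ == "__main__":
    zone = Box(400, 200, 600, 480)             # hypothetical user-drawn zone
    rows = [(0.92, 0.125, 0.25, 0.25, 0.875)]  # one person, left of the zone
    people = people_in_frame(rows, width=640, height=480)
    print("violation:", zone_violated(people, zone))
```

A rectangle overlap test is the simplest choice here. A stricter variant might test only the bottom edge of the person's box (roughly where the feet are), which would suit a zone drawn on the floor.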
Figure 2 shows an example of the completed process of this deep neural network. Note that in this example, the CNN identified the person in under half a second, and also determined that the detected person is not in the restricted zone.
Figure 2: The Restricted Zone Monitor Output screen shows an example of this application of the OpenVINO™ toolkit identifying a person and determining that the person is not in the restricted zone. (Source: Intel)
The sample application also illustrates the use of the Message Queue Telemetry Transport (MQTT) protocol, which communicates the zone information to an industrial data analytics system.
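As a rough illustration of what such a message might look like, the Python sketch below builds a small JSON payload and hands it to a publish function. The topic name and the field names are hypothetical, not the ones the sample application uses. The publish argument is assumed to be any callable that accepts a topic and a payload, such as the publish method of an MQTT client library.

```python
import json


def zone_message(in_zone: bool, people_count: int) -> str:
    """Serialize the zone state as JSON (field names are hypothetical)."""
    return json.dumps(
        {"restricted_zone_violation": in_zone, "people": people_count},
        sort_keys=True,
    )


def notify(publish, topic: str, in_zone: bool, people_count: int) -> None:
    """Send the zone state using any callable of the form publish(topic, payload)."""
    publish(topic, zone_message(in_zone, people_count))


if __name__ == "__main__":
    # Stand-in publisher that just prints; a real deployment would pass an
    # MQTT client's publish function instead (assumption).
    notify(lambda topic, payload: print(topic, payload),
           "factory/zone1", in_zone=True, people_count=1)
```

Keeping the publisher as a plain callable means the message format can be exercised without a broker, and the same code works with whichever MQTT client the analytics system requires.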
The Restricted Zone Monitor application was developed with the Intel® Distribution of OpenVINO™ toolkit and ~450 lines of Go (or 400 lines of C++). Traditional video monitoring requires a human to watch a number of monitors, which can be tedious and error-prone. Removing the human from this monitoring role reduces the probability that a mistake is made and helps to ensure compliance in the workplace. Given that these mistakes could result in life-threatening injuries, this is a great use of a cool technology. When paired with capable hardware, such as a system based on the 6th generation Intel® Core™ processor or Intel’s Neural Compute Stick 2 powered by the Intel® Movidius™ Myriad™ X VPU, the application can attain impressive inference speeds that enable real-time analytics.
Perimeter security is an obvious use case for this technology. Detecting people in or around an area is useful as part of a physical security process, but the technology could be applied in other ways. This deep learning network is pre-trained to detect people, but it could also be trained to detect animals. For example, has a bear or other wild animal wandered into a suburban area with the potential to do harm?
Detecting people in a city could also be useful, particularly for managing the flow of pedestrians and traffic. Pedestrian crossings can detect when a person is waiting to cross, but stopping a busy road for one person can be less beneficial than stopping for a large group. Applying person detection to manage the flows of vehicular and pedestrian traffic could help optimize the movement of people.
M. Tim Jones is a veteran embedded firmware architect with over 30 years of architecture and development experience. Tim is the author of several books and many articles across the spectrum of software and firmware development. His engineering background ranges from the development of kernels for geosynchronous spacecraft to embedded systems architecture and protocol development.