Video surveillance using digital cameras is a growing trend, driven in part by the growth of the Internet of Things (IoT). In 2016, an estimated 350 million surveillance cameras were operating worldwide, with about 65 percent of them in Asia.
But these cameras can do more than passively record video when movement is detected in the frame. The video can also be analyzed in real time. In this blog post, we will explore how you can use the Parking Lot Monitor application of the Intel® OpenVINO™ toolkit to automatically identify parking spot availability based on cars entering or exiting a lot.
In past blog posts, we’ve explored applications of face and expression detection using images from a camera. In this application, we’ll explore a different use of deep learning: tracking vehicles and using their direction of travel to identify whether they are entering or exiting the lot.
Figure 1 shows the Parking Lot Vehicle Tracking Pipeline. Let’s take a closer look at what occurs in this deep-learning application.
Figure 1: The Parking Lot Vehicle Tracking Pipeline diagram illustrates how this application of the OpenVINO™ toolkit performs vehicle detection from a captured image and then counts the centroids (movement of detected vehicles) to determine the ingress and egress of vehicles. (Source: Author)
The application operates on images captured by a video camera mounted above the entry and exit of the parking lot. For each captured image, a Convolutional Neural Network (CNN) trained and optimized for vehicle identification detects the vehicles in the frame. CNNs are a popular type of deep neural network commonly used to process images. Each detection yields a bounding rectangle, and the application calculates the rectangle's centroid to represent the vehicle. These centroids are then stored. When a new frame is captured and vehicles are detected, the new centroids are compared against the old ones, and the nearest old centroid is taken to be the same vehicle (a reasonable assumption given the high speed of detection and the slow speed of the vehicle). The two samples then indicate the direction in which the vehicle is travelling, which determines whether it is entering or exiting the parking lot.
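The tracking step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the sample application's actual code: the function names, the (x, y, width, height) rectangle layout, the `max_distance` and `min_shift` thresholds, and the choice of the y axis as the entry/exit axis are all hypothetical.

```python
import math

def centroid(box):
    """Return the center point of a rectangle given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def match_nearest(old_centroids, new_centroids, max_distance=50.0):
    """Pair each new centroid with its nearest old centroid.

    max_distance is an assumed cutoff in pixels: a new centroid with no old
    centroid within that range is treated as a newly arrived vehicle and is
    left unpaired. This greedy version does not stop two new centroids from
    claiming the same old one.
    """
    pairs = []
    for new in new_centroids:
        best, best_dist = None, max_distance
        for old in old_centroids:
            d = math.dist(old, new)  # Euclidean distance (Python 3.8+)
            if d <= best_dist:
                best, best_dist = old, d
        if best is not None:
            pairs.append((best, new))
    return pairs

def direction(old, new, min_shift=1.0):
    """Classify movement along the y axis.

    Assumption: increasing y means the vehicle is entering the lot. The real
    mapping depends on how the camera is mounted.
    """
    dy = new[1] - old[1]
    if dy > min_shift:
        return "entering"
    if dy < -min_shift:
        return "exiting"
    return "stationary"
```

For example, a vehicle whose rectangle moves from (100, 100, 40, 20) to (100, 110, 40, 20) has centroids (120, 110) and (120, 120), so under the assumption above it would be classified as entering.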
Figure 2 shows the result of this deep neural network. Note that the green overlays in the image are the car centroids with their coordinates (used for tracking and correlation).
Figure 2: The Parking Lot Counter output screen shows centroids as green circles to determine if a vehicle is entering or exiting a parking lot. (Source: Intel)
The sample application also illustrates the use of the Message Queuing Telemetry Transport (MQTT) protocol, which it uses to communicate the parking lot information to a data analytics system.
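As a rough idea of what such a message might look like, the sketch below builds a JSON payload for the entry and exit counts. The topic name and field names are hypothetical and are not taken from the sample application. The sketch only prepares the message; handing it to an MQTT client library for publishing is left out.

```python
import json

# Hypothetical topic name; the sample application's actual topic may differ.
TOPIC = "parking/counter"

def build_payload(entries, exits):
    """Serialize the running counts as a JSON string suitable for an MQTT message body."""
    return json.dumps(
        {"entries": entries, "exits": exits, "net": entries - exits},
        sort_keys=True,
    )
```

A data analytics system subscribed to the topic could then parse each message and derive occupancy from the net count.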
This application was developed with the Intel® Distribution of OpenVINO™ toolkit and ~800 lines of Go (or 700 lines of C++). The complex part of the application is handled by the pre-trained deep neural network, accompanied by glue code that implements simple calculations for vehicle tracking and correlation between frames (by tracking the centroids representing the vehicles). Based on the size of the detected rectangle, the application can discard objects such as pedestrians who wander into the frame. When paired with capable hardware, such as a system based on a 6th generation Intel® Core™ processor or the Intel® Neural Compute Stick 2 powered by the Intel® Movidius™ Myriad™ X VPU, the application can attain inference speeds that enable real-time analytics.
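The size-based filtering mentioned above could look something like the following sketch. The rectangle layout (x, y, width, height) and the minimum width and height are assumptions chosen for illustration. Suitable thresholds depend on the camera's height and field of view.

```python
def filter_by_size(boxes, min_width=60, min_height=60):
    """Keep only rectangles at least min_width by min_height pixels.

    Each box is (x, y, width, height). The thresholds are hypothetical:
    anything smaller is assumed to be a pedestrian or other non-vehicle.
    """
    return [b for b in boxes if b[2] >= min_width and b[3] >= min_height]
```

With these thresholds, a 120 x 80 detection would be kept as a vehicle, while a 20 x 45 detection would be discarded.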
Many use cases exist for an application that can identify vehicles in a captured frame and then track them. Consider a scenario in which road safety engineers track vehicles in a troublesome intersection, looking for potential issues (such as vehicles not honoring a stop sign, or near-miss accidents at a blind-spot intersection). The road safety engineers could use the statistics gathered by this application (centroid locations and speeds through the intersection) to propose changes, such as installing a traffic light or additional stop signs.
Another use would be tracking the number of people standing in a given area. A camera installed above a pedestrian crossing could help determine when to change the light; for example, if vehicle traffic is light, road safety engineers could optimize the signal timing for pedestrians. Similarly, a camera installed outside an elevator could help determine which floor to send the elevator to, as a way to optimize the flow of people in and out of a building.
M. Tim Jones is a veteran embedded firmware architect with over 30 years of architecture and development experience. Tim is the author of several books and many articles across the spectrum of software and firmware development. His engineering background ranges from the development of kernels for geosynchronous spacecraft to embedded systems architecture and protocol development.