Recognizing QR Codes in Video

Barcodes and QR codes are all around us. They are a great way to encode information in an easily recognizable design. This article will go through the process of tracking and reading QR codes in video retrieved through the Eagle Eye Networks Video API. This is a complete end-to-end example from generation through tracking across multiple detections.

Step 1: Setup and Required Components

In the above video we are using a 2MP camera connected to the Eagle Eye VMS through a bridge (BR304). The specifics of the camera matter only to the extent that the resolution and focal length provide an acceptable view of the work surface. In a later step we will look at the resolution needed for accurate detection.
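As a rough way to reason about resolution, the sketch below estimates how many pixels each QR module (the smallest black or white square) occupies in the frame. The 1920-pixel frame width is an assumption for a typical 2MP camera, and the view width and printed code size are illustrative values, not measurements from this setup. ZBar's actual minimum is not stated here, so treat the result as a number to compare against your own test footage.

```python
def qr_modules_per_side(version: int) -> int:
    """Modules along one side of a QR code: 21 for version 1, plus 4 per version."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 17 + 4 * version


def pixels_per_module(frame_width_px, view_width_mm, code_width_mm, version):
    """Estimate pixels per module for a code viewed head-on.

    code_width_mm is the printed width of the code without its quiet zone.
    """
    px_per_mm = frame_width_px / view_width_mm
    module_mm = code_width_mm / qr_modules_per_side(version)
    return px_per_mm * module_mm


# Assumed values: 1920 px wide frame, 1 m wide work surface, 50 mm version 2 code.
ppm = pixels_per_module(1920, 1000, 50, version=2)
print(f"{ppm:.2f} pixels per module")
```

Halving the view width or doubling the printed size doubles the result, which is usually the easiest lever when small codes fail to decode.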

The example is being run from a laptop to simplify the demonstration. Nothing requires this to run at the edge rather than in the cloud. The computational requirements are moderate, so there is no need for high-performance equipment. I would expect this to run on anything from a Raspberry Pi to a small cloud instance from your favorite provider.

Step 2: Accessing the Video Stream

The first step is to authenticate and authorize through the API. In the above example the credentials are passed from the user directly through to the API. This is acceptable for an example, but it is strongly recommended that a separate user management system be added before production use.

If you are processing video on the same network as the video source you can get the local RTSP settings using this call.

If you are processing the video remotely you can retrieve the video using the Get Video endpoint. This endpoint will retrieve live and/or historic video.
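Whichever source is used, the frames can then be read one at a time. Below is a minimal sketch using OpenCV's VideoCapture. The helper names iter_frames, open_stream, and process_stream are hypothetical, the URL passed in is whatever your own API call returned (its format is not shown here), and skipping frames with every_n is an assumption made to reduce load, not something the example above necessarily does.

```python
def iter_frames(capture, every_n=1):
    """Yield (frame_index, frame) for every n-th frame from a capture object.

    `capture` is anything with a read() method that returns (ok, frame),
    such as cv2.VideoCapture. Iteration stops when read() reports failure.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % every_n == 0:
            yield index, frame
        index += 1


def open_stream(url):
    """Open a video source with OpenCV.

    cv2 is imported here so iter_frames can be used without OpenCV installed.
    """
    import cv2

    capture = cv2.VideoCapture(url)
    if not capture.isOpened():
        raise RuntimeError(f"could not open video source: {url}")
    return capture


def process_stream(url, every_n=5):
    """Read a stream and print the shape of every n-th frame."""
    capture = open_stream(url)
    try:
        for index, frame in iter_frames(capture, every_n=every_n):
            print(index, frame.shape)
    finally:
        capture.release()
```

Keeping the frame loop separate from the code that opens the stream means the same loop works for a local RTSP source, a downloaded clip, or a test double.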

Step 3: Processing the Video Stream

Once the video stream has been established, the next step is to process each frame looking for QR codes. The very popular open-source library ZBar is available to detect the codes and parse the actual data. The appropriate language bindings make it easy to incorporate this functionality.
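As an illustration, here is a minimal sketch using pyzbar, one of the Python bindings for ZBar. Which binding the example above uses is not stated, so pyzbar is an assumption, and the Detection record and the function names are hypothetical. The frame is converted to grayscale first because ZBar works on single-channel images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One decoded QR code and its bounding box in pixels."""
    text: str
    left: int
    top: int
    width: int
    height: int


def to_detection(symbol):
    """Convert one pyzbar result (with .data and .rect) into a Detection."""
    rect = symbol.rect
    return Detection(
        text=symbol.data.decode("utf-8", errors="replace"),
        left=rect.left,
        top=rect.top,
        width=rect.width,
        height=rect.height,
    )


def find_qr_codes(frame):
    """Return a Detection for every QR code ZBar finds in a BGR frame."""
    # Imported here so to_detection works without these packages installed.
    import cv2
    from pyzbar.pyzbar import ZBarSymbol, decode

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return [to_detection(s) for s in decode(gray, symbols=[ZBarSymbol.QRCODE])]
```

Restricting the decoder to QR codes avoids spending time on other barcode types that are not of interest here.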

It is worth noting that even fewer computational resources are required if detection runs inside a GStreamer pipeline instead of parsing the video and processing it through OpenCV. Looking at each frame individually simplifies the example significantly but leaves plenty of room for optimization in a real implementation.

Each time a QR code is detected, the decoded text, metadata, time, and the location in the frame should be recorded in the database. The web front end is driven by the data in the database and can be refreshed to show the latest information.
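One possible shape for that record is sketched below using SQLite from the Python standard library. The database engine, the table layout, and the column names are all assumptions made for illustration, since the schema used in the example above is not shown. Metadata is stored as a JSON string so that extra fields can be added without changing the table.

```python
import json
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS detections (
    id          INTEGER PRIMARY KEY,
    camera_id   TEXT NOT NULL,
    text        TEXT NOT NULL,
    detected_at TEXT NOT NULL,   -- ISO 8601, UTC
    left_px     INTEGER NOT NULL,
    top_px      INTEGER NOT NULL,
    width_px    INTEGER NOT NULL,
    height_px   INTEGER NOT NULL,
    metadata    TEXT NOT NULL    -- JSON object
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_detection(conn, camera_id, text, box, detected_at=None, metadata=None):
    """Insert one detection; box is (left, top, width, height) in pixels."""
    detected_at = detected_at or datetime.now(timezone.utc)
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO detections (camera_id, text, detected_at, left_px,"
            " top_px, width_px, height_px, metadata)"
            " VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (camera_id, text, detected_at.isoformat(), *box,
             json.dumps(metadata or {})),
        )
    return cur.lastrowid
```

A call such as record_detection(conn, "camera-1", "PALLET-0042", (120, 80, 64, 64)) stores one row, where the camera identifier and decoded text are made-up values.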

The same QR code may appear on the same video stream in subsequent frames or it may appear in another video stream. The purpose of tracking the data in a database is to allow users to query the information and get results across video streams.
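Because a code that stays in view is detected again on every processed frame, it can help to collapse those repeats before showing results. The following is a minimal sketch of one way to do that. The group_sightings helper and the two-second gap are assumptions chosen for illustration, not part of the example above.

```python
from datetime import datetime, timedelta


def group_sightings(detections, max_gap=timedelta(seconds=2)):
    """Collapse per-frame detections into sightings.

    detections: iterable of (camera_id, text, detected_at) tuples.
    Detections of the same text on the same camera that are no more than
    max_gap apart are merged into one sighting with a first and last time.
    """
    open_sightings = {}  # (camera_id, text) -> sighting currently being extended
    sightings = []
    for camera_id, text, when in sorted(detections, key=lambda d: d[2]):
        key = (camera_id, text)
        current = open_sightings.get(key)
        if current is not None and when - current["last_seen"] <= max_gap:
            current["last_seen"] = when
            current["frames"] += 1
        else:
            current = {"camera_id": camera_id, "text": text,
                       "first_seen": when, "last_seen": when, "frames": 1}
            open_sightings[key] = current
            sightings.append(current)
    return sightings


def cameras_for(sightings, text):
    """Return the set of cameras on which a given code was seen."""
    return {s["camera_id"] for s in sightings if s["text"] == text}
```

Grouping by both camera and text keeps sightings on different streams separate, while cameras_for answers the cross-stream question of where a given code has appeared.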

The last step shown is to link to the History Browser so that the user can see the video of the detection at the correct point in time.

Step 4: Annotating the Video Stream

Multiple QR codes may be detected in the video at the same time. Unlike computers, humans generally have a hard time distinguishing QR codes in video. I recommend annotating the video by drawing a bounding box around each detected QR code. This is done in the above video with a red box along with the label.
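A red box and label of this kind can be drawn with OpenCV's rectangle and putText functions. The sketch below is one way to do it and is not necessarily how the above video was produced. The helper names, the font, the line thickness, and the label placement rule are all assumptions.

```python
def label_anchor(box, frame_height, text_height=20, margin=5):
    """Pick the bottom-left point for a label.

    The label goes just above the box, or just below its top edge when
    there is no room above it in the frame.
    """
    left, top, width, height = box
    y = top - margin
    if y - text_height < 0:  # label would be clipped at the top of the frame
        y = min(top + text_height + margin, frame_height - 1)
    return max(left, 0), y


def annotate(frame, box, label):
    """Draw a red rectangle and label on a BGR frame in place."""
    import cv2  # imported here so label_anchor can be used without OpenCV

    left, top, width, height = box
    red = (0, 0, 255)  # OpenCV uses BGR channel order
    cv2.rectangle(frame, (left, top), (left + width, top + height), red, 2)
    cv2.putText(frame, label, label_anchor(box, frame.shape[0]),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, red, 2)
    return frame
```

Calling annotate once per detection handles frames that contain several codes, since each call draws its own box and label.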

Drawing on the video directly is instructive in the above example, but it may not be appropriate to obscure so much of the video in a production environment. Placing annotations on the History Browser allows the user to understand what was detected and where it is on screen. The display of annotations can be turned off on a per-user basis.

You can see the annotation inside the Eagle Eye History Browser in the above video. For more details please refer to the previous article on cloud annotations.

What else can we do with this?

I hope that I have shown you something new and given you an idea of something else that can be done with the Eagle Eye API. Please reach out if you have questions.