

The Beginner's Guide to Video Processing with OpenCV
Posted by Ralabs, January 23, 2019

Computer vision is a huge part of the data science/AI domain. Sometimes, computer vision engineers have to deal with videos. Here, we aim to shed light on video processing – using Python, of course.
This might be obvious to some, but video is not a continuous signal; it is a discrete sequence of frames.
That means that each time we deal with a video, we are actually dealing with a sequence of frames. Each frame is just an image, which can be represented as an m x n array of pixels, where m is the number of rows (the height) and n is the number of columns (the width). Each pixel holds one or more intensity values, depending on the color model in use (gray-scale, RGB, or even multispectral).
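To make the array picture concrete, here is a minimal sketch using NumPy, which is the array type OpenCV's Python bindings work with. The 4 x 6 size and the variable names are made up for illustration; the one OpenCV-specific detail is that color channels are stored in blue, green, red (BGR) order.

```python
import numpy as np

# A hypothetical tiny frame: m = 4 rows (height), n = 6 columns (width).
m, n = 4, 6

# Gray-scale: one intensity value per pixel, so the array is two-dimensional.
gray_frame = np.zeros((m, n), dtype=np.uint8)

# Color: three intensity values per pixel, so a third axis of length 3 is added.
# OpenCV keeps the channels in blue, green, red (BGR) order.
color_frame = np.zeros((m, n, 3), dtype=np.uint8)

# Paint the top-left pixel pure red: B = 0, G = 0, R = 255.
color_frame[0, 0] = (0, 0, 255)

print(gray_frame.shape)            # (4, 6)
print(color_frame.shape)           # (4, 6, 3)
print(color_frame[0, 0].tolist())  # [0, 0, 255]
```

A video is then simply a sequence of such arrays, one per frame.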
Let's get acquainted with the main video processing tool for Python: OpenCV. OpenCV is an open-source library that provides tools for almost any kind of image and video processing. It is written in C++, and its primary interface is C++ as well; Python uses it through the cv2 bindings.
OpenCV code can be hard to develop and maintain because of its method naming, its error reporting, and its sometimes odd code structure. For example, once the last frame of a video has been read, cap.read() no longer returns an image, and passing that result to cv2.imshow raises an error like this one:
error                                     Traceback (most recent call last)
<ipython-input-7-5981551c9e2e> in <module>
      1 ret, image = cap.read()
      2
----> 3 cv2.imshow('img',image)

error: OpenCV(3.4.3) /io/opencv/modules/highgui/src/window.cpp:356: error: (-215:Assertion failed) size.width>0 && size.height>0 in function 'imshow'
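One way to avoid this error is to check the first value returned by read(), which is False when no frame could be grabbed. The sketch below wraps that check in a generator. The names iter_frames and FakeCapture are made up for this example; FakeCapture only imitates the (ret, frame) return convention of cv2.VideoCapture.read() so that the snippet runs without a video file.

```python
def iter_frames(cap):
    """Yield frames from a capture object until it has no more to give.

    `cap` is anything with a read() method that returns (ret, frame),
    the way cv2.VideoCapture.read() does.
    """
    while True:
        ret, frame = cap.read()
        if not ret:
            # ret is False when no frame was grabbed, e.g. at the end of the file.
            return
        yield frame


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for illustration."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(["frame 0", "frame 1", "frame 2"])))
print(frames)  # ['frame 0', 'frame 1', 'frame 2']
```

With a real capture object, the same generator could be used as `for frame in iter_frames(cap): ...`, and the loop would simply end after the last frame.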
Still, OpenCV is an excellent choice because of the range of functionality it offers.
Now, let's take a look at video processing using OpenCV and Python.
First of all, we create a cv2.VideoCapture object. cv2.VideoCapture is a class for capturing video from video files, image sequences, or cameras.
>>> import cv2
>>> # source may be a video file name or an integer camera index
>>> cap = cv2.VideoCapture(source)
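Because the source can be either a file name or an integer, a script that receives it as text (for example from the command line) has to decide which one it is. The helper below is a minimal sketch of one possible convention; parse_source is a hypothetical name, not part of OpenCV.

```python
def parse_source(text):
    """Turn a command-line style string into a value for cv2.VideoCapture.

    A string made only of decimal digits, such as "0", is treated as a
    camera index and converted to int; anything else is returned as a
    string and treated as a file name.
    """
    text = text.strip()
    return int(text) if text.isdecimal() else text


print(parse_source("0"))          # 0
print(parse_source("video.mp4"))  # video.mp4

# Typical use, assuming OpenCV is installed:
#   cap = cv2.VideoCapture(parse_source("0"))
#   if not cap.isOpened():
#       raise RuntimeError("could not open the video source")
```

Checking cap.isOpened() right after creating the object is a cheap way to catch a wrong path or a missing camera before entering the loop.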
Then, in order to play the video, we read its frames one by one in a loop:
>>> while True:
...     ret, frame = cap.read()
Note that this loop is unbounded: nothing in it stops the iteration yet. Next, we display each frame as it is read:
>>> while True:
...     ret, frame = cap.read()
...     cv2.imshow('window name', frame)
The cv2.imshow method displays an image in the specified window. We pass the window name as the first argument and the frame we would like to display as the second.
And now, we need some way to break out of the infinite loop:
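>>> while True:
...     ret, frame = cap.read()
...     if not ret:  # no frame was returned: end of file or read failure
...         break
...     cv2.imshow('window name', frame)
...     if cv2.waitKey(1) & 0xFF == ord('q'):  # stop when 'q' is pressed
...         break
...
>>> cap.release()
>>> cv2.destroyAllWindows()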
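cv2.waitKey() is a required building block for displaying video with OpenCV. It waits up to the specified number of milliseconds for a key press and, while waiting, lets the window process its events, which is what actually gets the frame drawn. It returns the code of the pressed key, or -1 if no key was pressed in that time. The expression cv2.waitKey(1) & 0xFF == ord('q') inside the 'if' statement is not special syntax: it keeps only the lowest 8 bits of the returned code and compares them with the character code of 'q', so pressing the q key is what ends the playback.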
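The bit masking can be tried out without OpenCV at all, since it is plain integer arithmetic. In the sketch below, is_key is a hypothetical helper and 0x100071 is a made-up return value that stands for a key code with extra high bits set, which is the situation the mask guards against.

```python
def is_key(code, char):
    """Return True if a cv2.waitKey() return value corresponds to `char`.

    Only the lowest 8 bits are compared, which is what `& 0xFF` selects.
    """
    return code & 0xFF == ord(char)


print(ord("q"))               # 113
print(is_key(113, "q"))       # True
# Made-up value with extra high bits set; the low byte is still 0x71 == 113.
print(is_key(0x100071, "q"))  # True
# -1 means "no key pressed"; -1 & 0xFF is 255, which matches no ASCII key.
print(is_key(-1, "q"))        # False
```

In Python, `&` binds more tightly than `==`, so the expression is evaluated as `(code & 0xFF) == ord(char)` without extra parentheses.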
cap.release() closes the video file or the capturing device, and cv2.destroyAllWindows() destroys the window that was created by the imshow method.
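If the processing code raises an exception, the release call may never be reached. One possible way to guard against that is a small context manager, sketched below. The name releasing is made up for this example, and FakeCapture is a hypothetical stand-in that only records whether release() was called, so the snippet runs without a camera or a file.

```python
from contextlib import contextmanager


@contextmanager
def releasing(cap):
    """Yield `cap` and call cap.release() on exit, even if an error occurs."""
    try:
        yield cap
    finally:
        cap.release()


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for illustration."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


cap = FakeCapture()
try:
    with releasing(cap):
        raise ValueError("something went wrong while processing a frame")
except ValueError:
    pass

print(cap.released)  # True
```

With a real capture, the frame loop would go inside the `with` block, and cv2.destroyAllWindows() could be called in the same `finally` clause if windows were opened.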
That's it. We now know how to process videos in Python and how to access each frame of a video file. From here, we can add any kind of image processing inside this loop: object detection, a human pose estimation model, or both. Diving further into OpenCV for video processing is up to the reader. There are a lot of useful tools to meet almost any requirement, from the simplest tasks (resizing, cropping, color and brightness adjustment) through image filtering to Delaunay triangulation.
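As a taste of the simplest per-frame operations, here is a minimal sketch of cropping and brightness adjustment written directly with NumPy, since each frame is just an array. The function names and the 4 x 6 test frame are made up for illustration; OpenCV has its own functions for these tasks, which are not shown here.

```python
import numpy as np


def crop(frame, top, left, height, width):
    """Return the sub-image starting at (top, left) with the given size."""
    return frame[top:top + height, left:left + width]


def adjust_brightness(frame, delta):
    """Add `delta` to every pixel, clamping the result to the 0..255 range."""
    # Widen to int16 first so the addition cannot wrap around in uint8.
    shifted = frame.astype(np.int16) + delta
    return np.clip(shifted, 0, 255).astype(np.uint8)


# A hypothetical 4 x 6 gray-scale frame with every pixel set to 200.
frame = np.full((4, 6), 200, dtype=np.uint8)

patch = crop(frame, top=1, left=2, height=2, width=3)
brighter = adjust_brightness(frame, 100)
darker = adjust_brightness(frame, -50)

print(patch.shape)          # (2, 3)
print(int(brighter[0, 0]))  # 255 (200 + 100, clamped)
print(int(darker[0, 0]))    # 150
```

Inside the playback loop, such a function would be applied to each frame right after it is read and before it is displayed.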