An AI-driven solution
Gramener, in conjunction with the Microsoft AI for Earth program, worked with the Nisqually River Foundation to attempt to automate the detection and identification of fish species from the video clips. The Nisqually salmon detection application was built as a web app to automate the process of video feed input, detection, and classification. The automated AI solution leverages the latest deep learning algorithms implemented using the Microsoft Azure and Cognitive Services platform stack.
According to Kesari, the process begins with the video feeds being processed to extract the relevant frames. Deep learning models are then trained to detect the fish by drawing bounding boxes and accurately identifying the species within the video frames.
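The frame-extraction step can be illustrated with a minimal sketch. It is not Gramener's code: the function name `sample_frame_indices`, the sampling interval, and the clip length are all assumptions made for illustration. It only computes which frame indices to keep; decoding those frames from a video file would be handled by a separate video library.

```python
from typing import List


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float) -> List[int]:
    """Return the indices of frames to keep when sampling one frame
    every `every_seconds` seconds from a clip recorded at `fps`."""
    if total_frames < 0 or fps <= 0 or every_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and every_seconds must be > 0")
    # Number of source frames between two kept frames (never less than 1).
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


# Hypothetical example: a 10-second clip at 30 fps, sampled twice per second.
indices = sample_frame_indices(total_frames=300, fps=30.0, every_seconds=0.5)
print(len(indices), indices[:4])  # 20 [0, 15, 30, 45]
```

Sampling at a fixed interval keeps the number of frames to tag manageable, at the cost of possibly missing a fish that crosses the view between two sampled frames.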
Given the nature of the problem and the format of the video files, processing power was a key requirement for the training and validation phases. “A GPU (Graphics Processing Unit) machine was the natural choice to run the object detection models, hence we selected a GPU virtual machine (VM) in the Azure portal,” Kesari said.
“For a reliable cloud solution with machine learning capabilities, Microsoft Azure Data Science VMs were chosen. For the purpose of extracting frames from the videos and tagging them, Microsoft Visual Object Tagging Tool (VOTT) came in handy. The final object detection algorithm chosen was a YOLO V3 video detection algorithm.” YOLO stands for “you only look once” and is a state-of-the-art, real-time object detection system that predicts bounding boxes and class labels in a single pass over each image.
The first challenge was to process the videos and tag the fish, Kesari said. The heavy manual work this involved was automated using Microsoft VOTT. The tagged frames were then used to train a model with the Microsoft Cognitive Toolkit (CNTK), and the model was tested against additional frames extracted from the videos. While this solution worked, it was slow and could not detect fish in video in real time. To address this, Gramener moved to video object detection using YOLO V3, which is faster and supports real-time detection.
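Detectors in the YOLO family typically emit several overlapping candidate boxes for the same object and then prune them with non-maximum suppression (NMS). The sketch below illustrates that general technique only. It is not taken from Gramener's application, and the `Detection` type, the species labels, the coordinates, and the 0.5 threshold are hypothetical.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    x1: float  # left edge of the box, in pixels
    y1: float  # top edge
    x2: float  # right edge
    y2: float  # bottom edge
    score: float  # detector confidence in [0, 1]
    label: str  # predicted species (hypothetical labels below)


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(dets: List[Detection], iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the highest-scoring box in each cluster of same-label overlaps."""
    kept: List[Detection] = []
    for det in sorted(dets, key=lambda d: d.score, reverse=True):
        # Keep a box unless it overlaps an already kept box of the same label.
        if all(det.label != k.label or iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


# Hypothetical detections for one frame: two overlapping boxes on the same fish.
frame_detections = [
    Detection(10, 10, 110, 60, 0.9, "chinook"),
    Detection(20, 15, 120, 65, 0.6, "chinook"),
    Detection(200, 40, 260, 80, 0.8, "coho"),
]
for det in non_max_suppression(frame_detections):
    print(det.label, det.score)
```

In this example the two "chinook" boxes overlap with an IoU of about 0.68, so only the higher-scoring one survives, while the separate "coho" box is kept.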