
Teaching algorithms to see

The University's Image and Video Understanding Lab develops machine learning algorithms for computer vision and object tracking. Photo by Sarah Munshi.

By Sonia Turosienski, KAUST News

Four hundred hours of video are uploaded to YouTube every minute. Yet beyond text searches of the title, description or tags associated with a video, the ability to search video content is limited. In addition to helping users find content more quickly and accurately, the ability to search video content is of paramount importance to video platforms and advertisers.

Last year, YouTube was embroiled in controversy after video ads for Coca-Cola and Amazon, among others, were shown before racist and extremist content. Advertising is a major revenue stream for video platforms like YouTube, and that revenue comes under threat when an advertiser's content is associated with irrelevant and sometimes nefarious material.

Bernard Ghanem, KAUST associate professor of electrical engineering in the University's Visual Computing Center and principal investigator of the Image and Video Understanding Lab, is developing machine learning techniques for computer vision, with applications in automated navigation, object tracking and other areas. When it comes to making video content searchable, Ghanem and his team, including Ph.D. students Fabian Caba Heilbron, Victor Escorcia and Humam Alwassel, have a solution.

The group has developed machine learning algorithms that can learn to detect specific types of activity within a video without relying on metadata.

"Our algorithms can detect—based on what inputs have been set—what and where a specific activity happens. This can help video platforms with detecting unwanted content [and] also help advertisers deliver more relevant and timely content. For example, a company like Nike can localize its ad within a part of the video that relates to its content, such as running or exercise," explained Ghanem.

The algorithm can also be used in other applications, such as surveillance. Once the activity of interest has been defined, the algorithm can learn to trawl through hours of video for the relevant moments, automating the task and completing it more quickly than a manual review.
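
To make the idea of finding "the relevant moments" concrete, here is a minimal sketch in Python. It is not the lab's algorithm: it assumes some classifier has already produced one activity score per video frame, and it simply groups consecutive high-scoring frames into time spans. The function name, the threshold and the sample scores are all hypothetical.

```python
from typing import List, Tuple


def localize_activity(
    scores: List[float],
    fps: float,
    threshold: float = 0.5,
    min_frames: int = 1,
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans where the score stays at or above threshold.

    `scores` holds one hypothetical activity score per frame; how those
    scores are produced is outside the scope of this sketch.
    """
    segments = []
    start = None  # index of the first frame in the current span, if any
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new span begins
        elif score < threshold and start is not None:
            if i - start >= min_frames:  # drop spans that are too short
                segments.append((start / fps, i / fps))
            start = None
    # Close a span that runs to the end of the video.
    if start is not None and len(scores) - start >= min_frames:
        segments.append((start / fps, len(scores) / fps))
    return segments


# Made-up scores for a ten-frame clip sampled at two frames per second.
scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.9, 0.2]
print(localize_activity(scores, fps=2.0, min_frames=2))
```

With these made-up scores, the sketch reports two spans, from 1.0 to 2.5 seconds and from 3.5 to 4.5 seconds. An advertiser or reviewer would then only need to look at those spans rather than the whole clip.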

The team's work has significantly contributed to the University's top 10 ranking in computer vision and computer graphics, according to CSRankings.

Leveling up unmanned aerial vehicles (UAVs)

Another strand of research within Ghanem's lab is to "empower UAVs with more intelligence," he said. Ghanem's UAVs run an object tracking algorithm that locates the desired object in video frames captured by the onboard camera. The object's position is then relayed to the navigation system, which uses the information to keep the object in the center of the UAV's field of view. The capability has been showcased through simulations and real-world experiments at KAUST, where one of Ghanem's students programmed the UAV to follow him around.
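
One simple way to picture the link between tracking and navigation is a proportional controller: the farther the tracked object drifts from the center of the frame, the harder the UAV turns to bring it back. The Python sketch below illustrates that idea only. It is not the lab's navigation system, and the `BoundingBox` type, the `centering_command` function and the gain value are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BoundingBox:
    """Tracker output in pixels (hypothetical format): top-left corner plus size."""
    x: float
    y: float
    w: float
    h: float


def centering_command(
    box: BoundingBox, frame_w: int, frame_h: int, gain: float = 1.0
) -> Tuple[float, float]:
    """Return (yaw, pitch) commands proportional to the object's offset from center.

    Offsets are normalized to [-1, 1]. A positive yaw means the object is to
    the right of center; a positive pitch means it is below center, because
    image rows grow downward.
    """
    center_x = box.x + box.w / 2
    center_y = box.y + box.h / 2
    error_x = (center_x - frame_w / 2) / (frame_w / 2)
    error_y = (center_y - frame_h / 2) / (frame_h / 2)
    return gain * error_x, gain * error_y


# An object to the right of center in a 640x480 frame yields a rightward yaw.
print(centering_command(BoundingBox(400, 200, 80, 80), 640, 480, gain=0.5))
```

Running this on every frame closes the loop: the tracker updates the box, the controller updates the command, and the object is nudged back toward the middle of the view. A real system would also need to handle lost targets, latency and the vehicle's dynamics, all of which this sketch ignores.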

A UAV is pictured racing through gates on a simulated version of the Safaa Stadium field at KAUST. Photo courtesy of Matthias Mueller.

Capitalizing on the recent interest in UAV racing, Ghanem and his team also developed an algorithm that teaches UAVs how to race. Currently, UAV racers fly the vehicle manually, competing to maneuver it through the greatest number of gates in the shortest amount of time. A UAV trained with machine learning recently outperformed novice and intermediate human pilots in a simulated racing game. The algorithm accomplished this feat by training itself without direct human supervision. The team received the best paper award for this work at the European Conference on Computer Vision's 2nd International Workshop on Computer Vision for UAVs this past September.

Democratizing AI

AI developments are currently limited to improving human tasks or competing in games; however, some in the community are concerned that, if the technology is not handled responsibly, the outcome could be very different.

Elon Musk, CEO of Tesla and SpaceX, has stated that "artificial intelligence is mankind's biggest threat," calling for more regulatory oversight and supporting the creation of OpenAI, a non-profit AI research company dedicated to developing safer artificial intelligence.

Ghanem noted there has been a recent trend towards the democratization of AI.

Using machine learning, UAVs can be trained to follow selected objects, such as a boat in the bay near the KAUST Beacon. Photo courtesy of Bernard Ghanem.

"The idea is that no one person or organization hoards certain information...run[ning] the risk of using AI for nefarious purposes. Sharing information helps restore a balance of power. That's why we publish all our code and data sets," said Ghanem.

Having many people work on the same problem may also increase the chances of finding creative, efficient solutions.

"The other effect of democratization is that it makes the competition fiercer, but ultimately competition is beneficial to our field—it drives down prices and leads to innovation," he concluded.

Practice, practice, practice

Humans have evolved over millennia to process information efficiently with low energy consumption, whereas machine learning technology is nascent by comparison. Humans also benefit from genetic information passed down through generations that predisposes them to learning in certain areas, such as language. Even where machine learning algorithms reach parity with human ability, the processing power needed to do so is colossal.

The KAUST Image and Video Understanding Lab team discuss a recent poster entry to ECCV 2018. Photo by Sarah Munshi.

"If the human brain required as much power as algorithms do, then it would likely melt," Ghanem illustrated. Therefore, even in areas where algorithms are outperforming humans, energy consumption is a major concern and represents an important strand of research.

"We're still playing catch up to what's in our heads," he said.

When it comes to machine learning, the old adage holds true—practice makes perfect.

"Much of learning takes place by example. In most cases, the more examples the machine experiences, the better it will get at the desired task," explained Ghanem.

With transparent and responsible machine learning development, artificial intelligence has the capacity to unlock new potential and may even shed light on how the human brain works in the process.
