Machine detection of human-object interaction in images and videos

Credit: Virginia Tech

Jia-Bin Huang, assistant professor in the Bradley Department of Electrical and Computer Engineering and a faculty member at the Discovery Analytics Center, has received a Google Faculty Research Award to support his work in detecting human-object interaction in images and videos.

The Google award, in the Machine Perception category, will allow Huang to tackle two challenges in detecting human-object interaction: modeling the relationship between a person and the relevant objects and scene to gather contextual information, and automatically mining hard examples from unlabeled but interaction-rich videos.

According to Huang, while significant progress has been made in classifying, detecting, and segmenting objects, representing images/videos as a collection of isolated object instances has failed to capture the information essential for understanding activity.

“By improving the model and scaling up the training, we aim to move a step further toward building socially intelligent machines,” Huang said.

Given an image or a video, the goal is to localize persons and object instances and to recognize the interaction, if any, between each person-object pair. The result is a structured representation: a visually grounded graph over the humans and the object instances they interact with.

For example: Two men are next to each other on the sidelines of a tennis court, one standing up and holding an umbrella and one sitting on a chair holding a tennis racquet and looking at a bag on the ground beside him. As the video progresses, the two smile at each other, exchange the umbrella and tennis racquet, sit side by side, and drink from water bottles. Eventually, they turn to look at each other, exchange the umbrella and tennis racquet again, and finally, talk to one another.
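The structured representation described above can be sketched in code. The following is a minimal illustration only, not the researchers' method: the class names, the `build_interaction_graph` and `describe` helpers, the verb labels, the box coordinates, and the confidence scores are all hypothetical and stand in for whatever a real detector would output for the opening frame of the tennis court scene.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass(frozen=True)
class Instance:
    """A localized person or object (hypothetical detector output)."""
    label: str
    box: Box
    score: float


@dataclass(frozen=True)
class Interaction:
    """A scored <person, verb, object> candidate; indices point into the instance list."""
    person: int
    obj: int
    verb: str
    score: float


def build_interaction_graph(
    instances: List[Instance],
    interactions: List[Interaction],
    threshold: float = 0.5,
) -> Dict[int, List[Tuple[str, int]]]:
    """Keep confident person-object pairs and group them by person index."""
    graph: Dict[int, List[Tuple[str, int]]] = {}
    for inter in interactions:
        if instances[inter.person].label != "person":
            continue  # edges must start at a person
        if inter.score < threshold:
            continue  # "if any": low-scoring pairs count as no interaction
        graph.setdefault(inter.person, []).append((inter.verb, inter.obj))
    return graph


def describe(instances: List[Instance], graph: Dict[int, List[Tuple[str, int]]]) -> List[str]:
    """Render each edge of the graph as a readable triplet."""
    lines = []
    for person, edges in sorted(graph.items()):
        for verb, obj in edges:
            lines.append(f"person#{person} {verb} {instances[obj].label}#{obj}")
    return lines


# Made-up detections for the first frame of the tennis court example.
instances = [
    Instance("person", (40.0, 60.0, 180.0, 420.0), 0.98),           # standing man
    Instance("person", (220.0, 150.0, 360.0, 430.0), 0.97),         # seated man
    Instance("umbrella", (30.0, 10.0, 200.0, 120.0), 0.90),
    Instance("tennis racquet", (300.0, 250.0, 380.0, 330.0), 0.85),
    Instance("chair", (210.0, 280.0, 370.0, 440.0), 0.92),
]
interactions = [
    Interaction(person=0, obj=2, verb="hold", score=0.91),
    Interaction(person=1, obj=3, verb="hold", score=0.88),
    Interaction(person=1, obj=4, verb="sit on", score=0.93),
    Interaction(person=0, obj=4, verb="sit on", score=0.12),  # rejected by threshold
]

graph = build_interaction_graph(instances, interactions)
for line in describe(instances, graph):
    print(line)
```

Each surviving edge answers "What is happening?" for one person-object pair, while the bounding boxes keep the answer grounded in the image. For a video, the same graph could be rebuilt per frame to follow events such as the exchange of the umbrella and racquet.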

“Understanding human activity in images and/or videos is a fundamental step toward building socially aware agents, semantic image/video retrieval, captioning, and question-answering,” Huang said.

He said that detecting human-object interaction leads to a deeper understanding of human-centric activity.

“Instead of answering ‘What is where?’ the goal of human-object interaction detection is to answer the question ‘What is happening?’ The outputs of human-object interaction provide a finer-grained description of the state of the scene and allow us to better predict the future and understand their intent,” Huang said.

Ph.D. student Chen Gao will work on the project with Huang. They expect the research to significantly advance the state of the art in human-object interaction detection and to enable many high-impact applications, such as long-term health monitoring and socially aware robots.

Huang plans to share the results of the research through publications in top-tier conferences and journals, and will also make the source code, collected datasets, and pre-trained models produced by the project publicly available.

“Our project aligns well with several of Google’s on-going efforts to build ‘social visual intelligence.’ We look forward to engaging with researchers and engineers at Google to exchange and share ideas and foster future collaborative relationships,” Huang said.

###

Media Contact
Lindsey Haugh

https://vtnews.vt.edu/articles/2019/06/dac-jia-bin-huang-google-award.html
