What if a security camera could not only capture video but also distinguish, in real time, between ordinary behavior and potentially dangerous actions? Researchers at the University of Virginia's School of Engineering and Applied Science have developed an AI-powered intelligent video analyzer capable of detecting human actions in video footage with state-of-the-art accuracy.
The system, called SMAST (Semantic and Motion-Aware Spatiotemporal Transformer), promises a range of societal benefits, from enhancing surveillance and public safety to enabling more accurate movement tracking in healthcare and improving how autonomous vehicles navigate complex environments.
"This AI technology paves the way for real-time action detection in even the most challenging environments," said Scott T. Acton, professor and chair of the Department of Electrical and Computer Engineering and lead researcher on the project. "It has the potential to significantly reduce the risk of accidents, improve diagnostic capabilities and ultimately save lives."
So, how does it work? At its core, SMAST is powered by artificial intelligence. The system relies on two key components that allow it to detect and interpret complex human behaviors. The first is a multi-feature selective attention model, which helps the AI focus on the most important elements of a scene, such as a person or an object, while ignoring irrelevant details. This makes the system far more accurate at identifying what is actually happening, for example, distinguishing a person throwing a ball from someone simply moving an arm.
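For readers who want a concrete picture, the sketch below shows what a selective attention step over candidate regions might look like in PyTorch. The module name, tensor shapes, and scoring scheme are illustrative assumptions, not the authors' published implementation.

```python
# Minimal sketch of a selective attention step over region features
# (illustrative only; shapes and names are assumptions, not SMAST's actual code).
import torch
import torch.nn as nn

class SelectiveAttention(nn.Module):
    """Scores candidate regions (people, objects) and re-weights their features
    so that salient regions dominate while background detail is suppressed."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.query = nn.Linear(feat_dim, feat_dim)
        self.key = nn.Linear(feat_dim, feat_dim)
        self.value = nn.Linear(feat_dim, feat_dim)

    def forward(self, region_feats: torch.Tensor) -> torch.Tensor:
        # region_feats: (batch, num_regions, feat_dim) pooled from a video frame
        q = self.query(region_feats)
        k = self.key(region_feats)
        v = self.value(region_feats)
        scores = q @ k.transpose(-2, -1) / region_feats.size(-1) ** 0.5
        weights = scores.softmax(dim=-1)   # higher weight = more relevant region
        return weights @ v                 # attention-weighted region features

# Example: 8 candidate regions per frame, 256-dimensional features
attn = SelectiveAttention(feat_dim=256)
out = attn(torch.randn(2, 8, 256))         # -> shape (2, 8, 256)
```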
The second component is a motion-aware 2D positional encoding mechanism, which enables the AI to keep track of how movements evolve over time. Think of a video in which people are constantly changing position: this component helps the AI learn how those movements relate to one another, capturing the dynamics of the interactions. By combining these capabilities, SMAST can recognize complex behaviors in real time, making it suited to high-stakes scenarios such as surveillance, healthcare diagnostics, and autonomous driving.
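As a rough illustration of the idea, the following sketch adds a per-region motion offset to a standard 2D sinusoidal positional encoding, so the encoding reflects both where a region is and where it is heading. The function names, dimensions, and the way displacement is folded in are assumptions for demonstration, not the paper's exact formulation.

```python
# Illustrative sketch of a motion-aware 2D positional encoding
# (assumed formulation, not the paper's exact scheme).
import torch

def sinusoidal_2d(x: torch.Tensor, y: torch.Tensor, dim: int) -> torch.Tensor:
    """Encode (x, y) coordinates with interleaved sine/cosine channels."""
    n = dim // 4
    freqs = torch.exp(torch.arange(n) * (-torch.log(torch.tensor(10000.0)) / n))

    def enc(coord: torch.Tensor) -> torch.Tensor:
        angles = coord.unsqueeze(-1) * freqs          # (..., dim // 4)
        return torch.cat([angles.sin(), angles.cos()], dim=-1)

    return torch.cat([enc(x), enc(y)], dim=-1)        # (..., dim)

def motion_aware_encoding(positions: torch.Tensor,
                          displacements: torch.Tensor,
                          dim: int = 128) -> torch.Tensor:
    # positions:     (num_regions, 2) box centers in the current frame
    # displacements: (num_regions, 2) movement since the previous frame
    x, y = positions[:, 0], positions[:, 1]
    dx, dy = displacements[:, 0], displacements[:, 1]
    # Encode where each region is *and* where it is moving, so a transformer
    # can relate the same actor across frames even as it changes position.
    return sinusoidal_2d(x + dx, y + dy, dim)

pe = motion_aware_encoding(torch.rand(8, 2) * 224, torch.randn(8, 2) * 5.0)
print(pe.shape)   # torch.Size([8, 128])
```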
SMAST changes how machines detect and interpret human actions. Conventional systems struggle to make sense of raw, continuous video footage, which often lacks context about what is actually happening. SMAST, by contrast, captures the dynamic relationships between people and objects with remarkable precision, driven by AI components that learn and improve from data.
With this breakthrough, the AI system can recognize actions such as a pedestrian crossing a street, a healthcare professional performing a precise procedure, or a potential hazard emerging in a crowded public space. By outperforming leading existing methods on key academic benchmarks, including AVA, UCF101-24, and EPIC-Kitchens, SMAST has set a new standard for accuracy and efficiency.
"This research could have a significant impact on society," noted Matthew Korban, a postdoctoral research associate working on the project in Professor Acton's lab. "We are excited to see how this AI technology might transform industries, making video-based applications more intelligent and capable of real-time understanding."
The work is described in the paper "A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection," published in IEEE Transactions on Pattern Analysis and Machine Intelligence. The authors are Matthew Korban, Peter Youngs, and Scott T. Acton, all of the University of Virginia.
The research was funded by the National Science Foundation (NSF) under Grants No. 2000487 and 2322993.