Imagine simply telling your vehicle, “I’m short on time,” and having it automatically take you on the most efficient route to your destination.
Researchers at Purdue University have found that autonomous vehicles (AVs) can do this with the help of AI-powered chatbots built on large language models.
The study, slated for presentation in September at the 27th IEEE International Conference on Intelligent Transportation Systems, may be among the first experiments to test how well a real AV can use large language models to interpret passenger commands and drive accordingly.
Ziran Wang, an assistant professor in Purdue’s Lyles School of Civil and Construction Engineering who led the study, believes that for vehicles to be fully autonomous, they will need to understand everything their passengers ask of them, including requests that are only implied. A taxi driver, for example, knows that when you say “I’m in a hurry,” you want the most efficient route that avoids traffic, without your having to spell it out.
Although current in-vehicle assistants let you talk to them, they need you to be more explicit than you would have to be with a human. Large language models, by contrast, can interpret requests and respond in a more humanlike way because they are trained to draw relationships from vast amounts of text data and keep learning over time.
“The conventional systems in our vehicles have either a button-based interface, where you have to press buttons to convey what you want, or an audio recognition system that requires you to be very explicit when you speak so that the vehicle can understand you,” Wang said. “But the power of large language models is that they can more naturally understand all kinds of things you say. I don’t think any other existing system can do that.”
In this study, however, the large language models did not drive the AV directly. Instead, they assisted the AV’s driving by working through its existing features. Wang and his team found that integrating large language models with an AV in this way allowed it not only to understand its passengers better but also to adapt its driving style to each individual’s preferences.
Before starting their experiments, the researchers trained ChatGPT with a range of prompts, from direct commands (“Please drive more quickly”) to indirect ones (“I’m experiencing mild motion sickness right now”). As ChatGPT learned how to respond to these commands, the researchers gave its large language models parameters to follow, requiring it to take into account traffic rules, road conditions, the weather and other information detected by the vehicle’s sensors, such as cameras and lidar.
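The article does not describe how these parameters were actually passed to ChatGPT. As a rough illustration only, the sketch below shows one way a passenger command could be combined with driving constraints into a single prompt. The `DrivingContext` class, its fields and the prompt wording are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class DrivingContext:
    """Hypothetical snapshot of what the vehicle currently knows."""
    speed_limit_mph: int
    road_condition: str
    weather: str
    sensor_notes: str


def build_prompt(command: str, ctx: DrivingContext) -> str:
    """Combine a passenger command with constraints the model must respect."""
    return (
        "You assist an autonomous vehicle. Interpret the passenger's request "
        "and suggest a driving style while obeying these constraints:\n"
        f"- Traffic rules: speed limit {ctx.speed_limit_mph} mph\n"
        f"- Road conditions: {ctx.road_condition}\n"
        f"- Weather: {ctx.weather}\n"
        f"- Sensor data (cameras, lidar): {ctx.sensor_notes}\n"
        f'Passenger says: "{command}"'
    )


# Assumed example values, chosen only for illustration.
ctx = DrivingContext(55, "dry", "clear", "no obstacles detected")
prompt = build_prompt("I'm experiencing mild motion sickness right now", ctx)
print(prompt)
```

Keeping the constraints in the same prompt as the command is one simple design choice; a real system might supply them through a separate system message or a structured interface instead.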
The researchers then made these large language models accessible over the cloud to an experimental vehicle with Level 4 autonomy, as defined by SAE International. Level 4 is one level away from what the industry considers a fully autonomous vehicle.
When the vehicle’s speech recognition system detected a command from a passenger, the large language models in the cloud reasoned about the command using the parameters the researchers had defined. The models then generated instructions for the vehicle’s drive-by-wire system, which is connected to the throttle, brakes, gears and steering, on how to drive in accordance with that command.
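The flow described here, from spoken command to model output to drive-by-wire instructions, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's implementation: `stub_llm` stands in for the cloud-hosted model, and the parameter names and numeric limits in `LIMITS` are invented for the example. The clamping step is also an assumption, added to show how model output might be kept inside a predefined safety envelope.

```python
import json

# Hypothetical safety envelope; the study's actual limits are not given in the article.
LIMITS = {
    "target_speed_mph": (0.0, 65.0),
    "max_accel_mps2": (0.5, 3.0),
    "max_decel_mps2": (0.5, 4.0),
}


def stub_llm(command: str) -> str:
    """Stand-in for the cloud-hosted model; returns driving parameters as JSON."""
    if "hurry" in command.lower():
        return json.dumps(
            {"target_speed_mph": 80, "max_accel_mps2": 2.5, "max_decel_mps2": 3.0}
        )
    return json.dumps(
        {"target_speed_mph": 45, "max_accel_mps2": 1.0, "max_decel_mps2": 1.5}
    )


def to_drive_parameters(llm_output: str) -> dict:
    """Parse the model's reply and clamp each value into its allowed range."""
    raw = json.loads(llm_output)
    params = {}
    for key, (low, high) in LIMITS.items():
        # A missing value falls back to the lower bound of its range.
        value = float(raw.get(key, low))
        params[key] = min(max(value, low), high)
    return params


params = to_drive_parameters(stub_llm("I'm in a hurry"))
print(params)
```

In this example the stubbed model asks for 80 mph, and the clamp reduces it to the assumed 65 mph ceiling before anything would reach the vehicle's controls.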
For some of the experiments, Wang’s team also tested a memory module built into the system, which allowed the large language models to store data about a passenger’s past preferences and take them into account when responding to a command.
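The article gives no detail on how the memory module works. One plausible reading is a per-passenger history that is summarized and fed back to the model along with each new command. The sketch below illustrates that idea only; the `PreferenceMemory` class, its methods and the style labels are hypothetical.

```python
from collections import defaultdict


class PreferenceMemory:
    """Hypothetical per-passenger store of past commands and chosen driving styles."""

    def __init__(self, max_entries: int = 5):
        self.max_entries = max_entries
        self._history = defaultdict(list)

    def remember(self, passenger_id: str, command: str, style: str) -> None:
        """Record a command and the style chosen for it, keeping only recent entries."""
        entries = self._history[passenger_id]
        entries.append((command, style))
        del entries[:-self.max_entries]  # drop everything but the newest entries

    def as_prompt_context(self, passenger_id: str) -> str:
        """Render the stored history as text that could accompany a new command."""
        entries = self._history.get(passenger_id, [])
        if not entries:
            return "No stored preferences for this passenger."
        lines = [f'- "{cmd}" -> {style}' for cmd, style in entries]
        return "Past preferences:\n" + "\n".join(lines)


memory = PreferenceMemory(max_entries=2)
memory.remember("rider-1", "Please drive more quickly", "assertive")
memory.remember("rider-1", "I feel a bit carsick", "gentle")
memory.remember("rider-1", "I'm in a hurry", "assertive")
context = memory.as_prompt_context("rider-1")
print(context)
```

With `max_entries=2`, only the two most recent commands are kept, so the oldest one is dropped from the context.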
The researchers conducted numerous experiments at a test site in Columbus, Indiana, that was formerly an airport runway. This controlled environment allowed them to safely test the vehicle’s responses to a passenger’s commands while driving at highway speeds on the runway and handling two-way intersections. They also tested how well the vehicle parked according to a passenger’s commands in the parking lot of Purdue’s Ross-Ade Stadium.
While riding in the vehicle, study participants used both commands the large language models had already learned and commands that were new to them. In post-ride surveys, participants reported less discomfort with the AV’s decisions than people typically report when riding in a Level 4 AV without assistance from large language models.
The team also compared the AV’s performance against baseline values derived from data on what people generally consider a safe and comfortable ride, such as how much time the vehicle allows for reacting to avoid a rear-end collision and how quickly it accelerates and decelerates. The researchers found that the AV outperformed all baseline values while using the large language models to drive, even when responding to commands the models had not previously encountered.
The large language models in this study took an average of 1.6 seconds to process a passenger’s command, which is considered acceptable in non-time-critical scenarios but should be improved for situations in which an AV needs to respond faster. This is a challenge for large language models in general, and it is being tackled by industry as well as by university researchers.
Although it was not the focus of this study, large language models such as ChatGPT are known to be prone to “hallucination,” meaning they can misinterpret something they have learned and respond incorrectly. Wang’s study was conducted in a controlled setup with a fail-safe mechanism that allowed participants to ride safely when the large language models misunderstood commands. The models’ understanding improved over the course of a participant’s ride, but hallucination remains an issue that must be addressed before large language models are integrated into AVs.
Automakers would also need to conduct far more extensive testing with large language models, beyond the studies carried out by university researchers. Regulatory approval would additionally be required to integrate these models with an AV’s controls so that they could actually drive the vehicle, Wang said.
In the meantime, Wang and his team are continuing to run experiments that may help the industry explore the potential benefits of integrating large language models into AVs.
Since testing ChatGPT, the researchers have evaluated other public and private chatbots based on large language models, such as Google’s Gemini and Meta’s Llama AI assistants. So far, ChatGPT has performed best on indicators of a safe and time-efficient ride in an AV. Published results are forthcoming.
A potential next step is to explore whether the large language models of different AVs could communicate with one another, for example to help AVs determine which one should go first at a four-way intersection. Wang’s team is also starting a project to study the use of large vision models to help AVs drive in the harsh winter weather common throughout the Midwest. Unlike large language models, which are trained on text, these models are trained on images. The project is expected to receive support from the Center for Connected and Automated Transportation (CCAT), which is funded by the U.S. Department of Transportation through its University Transportation Centers program. Purdue University is one of CCAT’s university partners.