A new study led by CU Boulder computer scientist Theodora Chaspari suggests that certain artificial intelligence tools used in healthcare may struggle to accurately process the speech patterns of people of different genders and racial backgrounds, potentially undermining their effectiveness.
The research hinges on a simple, often unspoken, fact about human society: people don’t all talk the same way. Women, for example, tend to speak at a higher pitch on average than men, and similar differences appear in the speech of people from different racial backgrounds, such as white and African American speakers.
Researchers have found that these natural variations can trip up algorithms designed to screen people for mental health conditions such as anxiety or depression, leading to inaccurate assessments. Studies have repeatedly shown that AI, mirroring human biases, can draw conclusions based on race and gender.
According to Chaspari, an associate professor in the Department of Computer Science, AI that is not trained well or informed by sound clinical practice risks perpetuating human or societal biases.
The researchers published their results in a scientific journal on July 24.
Chaspari noted that artificial intelligence could be a promising technology in healthcare: sophisticated algorithms can analyze recordings of people talking, searching for subtle changes in the way they speak that could indicate underlying mental health concerns.
But for those tools to be useful, the computer scientist said, they need to perform consistently for people from all demographic groups. To find out whether AI is up to that task, the researchers fed machine learning algorithms audio samples of real people. The findings raised concerns: the AI tools appeared to underdiagnose women at risk of depression more often than men, a gap that in the real world could keep people from getting the care they need.
“With artificial intelligence, we can identify these fine-grained patterns that humans can’t always perceive,” said Chaspari, who conducted the work as a faculty member at Texas A&M University. “There is opportunity here, but there is also a great deal of risk.”
The way people talk can offer a window into their emotions and wellbeing, something poets and playwrights have long understood about the power of language to reveal our innermost feelings.
People experiencing clinical depression often speak more quietly and in a flatter, more monotone voice. People struggling with anxiety, in contrast, often speak at a higher pitch and with more “breathiness,” a measure of the slight tremor or wavering in the voice.
“It’s well known that speech is shaped by a person’s physical attributes,” Chaspari said. “For depression, studies have shown changes in the vibration of the vocal folds, as well as in how the voice is modulated by the vocal tract.”
Over the years, researchers have designed and refined AI tools to pick up on subtle vocal changes like these.
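To make the idea concrete, here is a minimal, illustrative sketch (not the study’s actual pipeline) of how such a tool might pull two of the cues described above, average pitch and a rough “wavering” measure, out of a recording. It assumes the Python library librosa and a hypothetical file named speech_sample.wav.

```python
# Illustrative sketch only: extracts two acoustic cues often cited in
# speech-based mental health screening -- average pitch and a simple
# jitter-like measure of pitch wavering. Not the study's actual pipeline.
import numpy as np
import librosa  # assumed available; any pitch tracker would work

def voice_features(path):
    # Load the recording (librosa resamples to 22,050 Hz by default).
    y, sr = librosa.load(path)

    # Estimate the fundamental frequency (pitch) frame by frame.
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7")
    )
    f0 = f0[voiced_flag & ~np.isnan(f0)]  # keep voiced frames only

    # Mean pitch: differs across speakers (e.g., by sex) and can shift
    # with anxiety.
    mean_pitch = float(np.mean(f0))

    # Crude wavering proxy: average frame-to-frame pitch change relative
    # to the mean pitch, capturing tremor in the voice.
    wavering_proxy = float(np.mean(np.abs(np.diff(f0))) / mean_pitch)

    return {"mean_pitch_hz": mean_pitch, "wavering_proxy": wavering_proxy}

print(voice_features("speech_sample.wav"))  # hypothetical input file
```

Real screening systems combine many more features than these two, but the basic pattern is the same: turn raw audio into numbers a model can compare across speakers.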
Chaspari and her team decided to put those algorithms under the microscope. To do so, the group drew on recordings of people talking in a range of settings: in one, participants gave a talk in public; in another, men and women spoke at length in a clinical setting resembling a doctor’s office visit. In each case, the speakers separately filled out questionnaires about their mental health. The study also included Michael Yang and Abd-Allah El-Attar, undergraduate students at Texas A&M.
The results were all over the map.
In the public speaking recordings, Latino participants reported feeling much more nervous on average than white and Black speakers, but the AI failed to detect that heightened anxiety. In the second experiment, the algorithms flagged men and women as equally likely to be at risk of depression; in reality, the female speakers had reported symptoms of depression at much higher rates.
Chaspari stresses that the group’s results are only a first step. Researchers will need to examine recordings of many more people from a wide range of demographic groups to understand why the AI stumbles in certain cases, and to fix those biases by improving the training data.
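As an illustration of the kind of bias check this implies, the short sketch below (a hypothetical example, not the team’s code) compares how often a screening model misses true at-risk cases in each demographic group; a much higher miss rate for one group would flag exactly the sort of gap the study reports.

```python
# Minimal sketch of a per-group fairness check: compare how often a
# screening model misses true at-risk cases (false negatives) in each
# demographic group. Data and labels here are hypothetical placeholders.
import pandas as pd

def false_negative_rate_by_group(df):
    """df needs columns: 'group', 'actual' (1 = at risk), 'predicted'."""
    rates = {}
    for group, sub in df.groupby("group"):
        at_risk = sub[sub["actual"] == 1]
        if len(at_risk) == 0:
            continue  # no true cases in this group; rate undefined
        missed = (at_risk["predicted"] == 0).sum()
        rates[group] = missed / len(at_risk)  # share of true cases missed
    return rates

# Toy example: the model misses more at-risk women than at-risk men.
data = pd.DataFrame({
    "group":     ["women"] * 6 + ["men"] * 6,
    "actual":    [1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0],
    "predicted": [1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1],
})
print(false_negative_rate_by_group(data))  # e.g. {'men': 0.0, 'women': 0.75}
```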
AI, she said, holds real promise for medicine, but it should be introduced into the field carefully.
“If our analysis shows that an algorithm consistently underestimates depression for a specific group of people, it’s crucial that we alert clinicians to that potential bias.”