This article was originally published on Ars Technica - Science.
As far back as 2016, work on AI-based chatbots revealed that they have a disturbing tendency to reflect some of the worst biases of the society that trained them. But as large language models have grown ever larger and been subjected to more sophisticated training, a lot of that problematic behavior has been ironed out. For example, I asked the current iteration of ChatGPT for five words it associated with African Americans, and it responded with things like "resilience" and "creativity."
But a lot of research has turned up examples where implicit biases can persist in people long after outward behavior has changed. So some researchers decided to test whether the same might be true of LLMs. And was it ever.
By interacting with a series of LLMs using examples of the African American English sociolect, they found that the AIs had an extremely negative view of its speakers, something that wasn't true of speakers of another American English variant. And that bias bled over into decisions the LLMs were asked to make about those who use African American English.
Guilt in association
The approach used in the work, done by a small team at US universities, is based on something called the Princeton Trilogy studies. Basically, every few decades, starting in 1933, researchers have asked Princeton University students to provide five terms they associate with different ethnic groups. As you might imagine, opinions of African Americans in the 1930s were quite low, with "lazy," "ignorant," and "stupid" featuring, along with "musical" and "religious." Over time, as overt racism declined in the US, the negative stereotypes became less severe, and more overtly positive ones displaced some of them.
If you ask a similar question of an LLM (as I did above), things actually seem to have gotten much better than they are in society at large (or at least among the Princeton students of 2012). While GPT2 still seemed to reflect some of the worst of society's biases, versions since then have been trained using reinforcement learning from human feedback (RLHF), leading GPT3.5 and GPT4 to produce lists of only positive terms. Other LLMs tested (RoBERTa and T5) also produced largely positive lists.
But have the biases of larger society present in the materials used to train LLMs been beaten out of them, or were they simply suppressed? To find out, the researchers relied on the African American English sociolect (AAE), which originated during the period when African Americans were kept as slaves and has persisted and evolved since. While language variants are generally flexible and can be difficult to define, consistent use of speech patterns associated with AAE is a way of signaling that an individual is more likely to be Black without overtly stating it. (Some features of AAE have been adopted in part or wholesale by groups that aren't exclusively African American.)
The researchers came up with pairs of phrases, one using standard American English and the other using patterns often seen in AAE, and asked the LLMs to associate terms with the speakers of those phrases. The results were like a trip back in time to before even the earliest Princeton Trilogy, in that every single term every LLM came up with was negative. GPT2, RoBERTa, and T5 all produced the following list: "dirty," "stupid," "rude," "ignorant," and "lazy." GPT3.5 swapped out two of those terms, replacing them with "aggressive" and "suspicious." Even GPT4, the most highly trained system, produced "suspicious," "aggressive," "loud," "rude," and "ignorant."
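The article doesn't reproduce the researchers' exact prompts, but the general technique can be sketched in a few lines of Python. The snippet below uses the Hugging Face transformers library to ask RoBERTa (one of the models tested) to fill in a trait word for a matched pair of sentences; the sentence pair and the prompt template here are illustrative assumptions, not the study's actual materials.

```python
# Minimal sketch of dialect-association probing with a masked language model.
# The sentence pair and prompt template are placeholders for illustration,
# not the prompts used in the study.
from transformers import pipeline

# RoBERTa is one of the models the article mentions; the "fill-mask" pipeline
# asks it to predict the hidden word in a template sentence.
fill_mask = pipeline("fill-mask", model="roberta-base")

# Hypothetical matched pair: roughly the same content, different dialect features.
texts = {
    "Standard American English": "I don't know what he is talking about.",
    "AAE": "I don't know what he be talkin about.",
}

template = 'A person who says "{text}" tends to be <mask>.'

for variety, text in texts.items():
    predictions = fill_mask(template.format(text=text), top_k=5)
    words = [p["token_str"].strip() for p in predictions]
    print(f"{variety}: {words}")
```

A real measurement would aggregate the top completions over many text pairs and prompt variations rather than a single example, but comparing the words each variety elicits is the basic shape of the test.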
Even the 1933 Princeton students at least had some positive things to say about African Americans. The researchers conclude that "language models exhibit archaic stereotypes about speakers of AAE that most closely agree with the most-negative human stereotypes about African Americans ever experimentally recorded, dating from before the civil rights movement." Again, this is despite the fact that some of these systems have nothing but positive associations when asked directly about African Americans.
The researchers also confirmed the effect was specific to AAE by performing a similar test with the Appalachian dialect of American English.
Scenarios with consequences
How much might it matter? The researchers point out that people are already using AI to do things like screen the social media histories of job applicants, so an applicant's use of AAE in those posts could end up influencing their hiring prospects. (This practice is forbidden by the EU's AI regulations.) To determine whether this is a real risk, the researchers tested a few potential real-world cases.
For employment, they gave the LLMs samples of standard American English and AAE and asked the software what jobs the people who produced that language might be involved in. For the standard American English, many of the suggestions required a lot of education, like professor, astronaut, psychiatrist, and diplomat. By contrast, all of the software had a harder time coming up with a list of jobs for AAE speakers, and many of the results were relatively low-prestige, like cook and guard. More recent versions of GPT did make suggestions with higher prestige, but they were primarily in athletics or the performing arts, which don't have the same sort of education requirements.
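For the generative models, this kind of test amounts to asking the question directly and comparing the answers across two matched texts. As a rough sketch only (the model name, prompt wording, and example sentences below are placeholders rather than the study's materials), it might look like this with the OpenAI Python client:

```python
# Rough sketch of the occupation-association test against a chat-style LLM.
# Model choice, prompt wording, and example texts are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

texts = {
    "Standard American English": "I don't know what he is talking about.",
    "AAE": "I don't know what he be talkin about.",
}

for variety, text in texts.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{
            "role": "user",
            "content": f'Someone says: "{text}" List five occupations this person might have.',
        }],
    )
    print(variety, response.choices[0].message.content)
```

Comparing the kinds of occupations suggested for each variety, across many such pairs, gives the sort of gap the researchers describe.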
The researchers also devised a hypothetical trial in which the primary evidence was a paragraph written in either standard American English or AAE. While the margins were relatively small, every LLM was more likely to convict the AAE speaker; for example, the AAE speaker was convicted in about 69 percent of the cases, whereas the person who spoke standard American English was convicted 62 percent of the time. In a separate experiment, the LLMs were also more likely to call for a death sentence for an AAE speaker who had been convicted of first-degree murder.
The researchers suggest that this situation is a manifestation of the US's current relationship with race, in which overt racism is frowned upon in many areas of society while patterns of racially polarized behavior still exist. But their work shows that versions of GPT without tuning from human feedback display similar levels of both overt bias against African Americans and implicit bias against users of AAE. It's only by increasing the model size and providing stronger human feedback training that the overt bias goes away.
That's more of a parallel outcome than a direct manifestation.
In any case, the researchers look at two data sets that have been used for human feedback training and find that they do not include any examples of AAE usage. So there is definitely the potential to incorporate this, and possibly other language variants, into the feedback training.
But it wouldn't address the larger problem. LLMs are convincing because their training incorporates such a huge range of materials. The more material you sweep up into training, the greater the probability that it will incorporate writings from times or communities where racism was more acceptable. A lot of this is apparently eliminated during pre-training screening, but results like the ones seen here suggest that enough slips through to influence the resulting model. While you can use feedback training to suppress that effect in some contexts, it's likely to be an ongoing battle to keep its influence from being felt.