Despite the existence of thousands of languages, English dominates machine learning and artificial intelligence almost completely. Researchers who train computers to understand the content of arbitrary text typically use training samples in English.
‘This introduces a significant unintentional cultural bias. Even after extensive training, the machine will never have been exposed to bull taming in India, to Chinese hot pot cooking, or to other phenomena which are familiar to millions of people but just happen to lie outside the native English-speaking horizon,’ said Emanuele Bugliarello, a PhD researcher at the University of Copenhagen.
Bugliarello and his colleagues created a novel tool which encourages diversity. IGLUE (Image-Grounded Language Understanding Evaluation) is a benchmark which allows the performance of an ML solution to be scored in 20 languages.
Their paper was accepted for publication at the upcoming International Conference on Machine Learning (ICML).
‘When ML research teams create new solutions, they are always highly competitive. If another group has succeeded in solving a given ML task with 98 percent accuracy, you will try to get 99 percent accuracy and so forth. This is what drives the field forward. But the downside is that if you don’t have a proper benchmark for a given feature, it will not be prioritized. This has been the case for multimodal ML, and IGLUE is our attempt to change the scene,’ said Bugliarello.
In ML, it is standard to base training on images. The images are usually labelled, meaning that bits of text accompany each image to assist the machine's learning. These labels are usually in English, so IGLUE covers 20 diverse languages, spanning 11 language families, 9 scripts, and 3 geographical macro-areas.
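The core idea of a benchmark like this is simple: run the same task in every covered language and compare the scores, which exposes how much performance drops outside English. The sketch below illustrates that kind of per-language scoring; the function names, the image-text matching task, and all numbers are invented for illustration and are not IGLUE's actual tasks or results.

```python
# Minimal sketch of benchmark-style per-language scoring.
# All data below is hypothetical; real IGLUE tasks span retrieval,
# visual question answering, and visual reasoning in 20 languages.

def accuracy(predictions, gold):
    """Fraction of examples where the model's answer matches the gold label."""
    assert len(predictions) == len(gold)
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

def score_by_language(results):
    """Map language code -> accuracy, given per-language (predictions, gold)."""
    return {lang: accuracy(preds, gold) for lang, (preds, gold) in results.items()}

# Toy image-text matching results (label 1 = caption matches the image).
results = {
    "en": ([1, 0, 1, 1], [1, 0, 1, 0]),  # model trained mostly on English data
    "zh": ([1, 1, 0, 0], [1, 0, 1, 0]),  # weaker on a language unseen in training
}
scores = score_by_language(results)
print(scores)  # {'en': 0.75, 'zh': 0.5}
```

Reporting one score per language, rather than a single aggregate, is what makes the cross-lingual gap visible and comparable between models.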
Some of the images in IGLUE are culture-specific and were obtained through a mail campaign. Volunteers were asked to provide images and texts in their native language about things that are important in their country.
Emanuele Bugliarello explained that the lack of multimodal ML has practical implications:
‘Let us say you have a food allergy, and you have an app which can tell you if the problematic ingredients are present in a meal. Finding yourself at a restaurant in China, you realize the menu is all in Chinese but has pictures. If your app is good, it may translate the picture into a recipe, but only if the machine was exposed to Chinese samples during training.’
Simply put, non-English speakers get poorer versions of ML-based solutions.
‘The performance of many top ML solutions drops instantly when they are exposed to data from non-English speaking countries. And notably, the ML solutions miss out on concepts and ideas that are not formed in Europe or North America. This is something which the ML research community needs to address,’ said Bugliarello.
He noted: ‘This all began a few years ago when we wrote a paper for the EMNLP conference (Empirical Methods in Natural Language Processing, a top conference in the field, ed.). We just wanted to point to an issue, but were soon overwhelmed with interest, and much to our surprise, our contribution was selected as Best Long Paper. People clearly saw the problem, and we were encouraged to do more.’
Bugliarello admitted that the success almost feels like a burden. ‘As a public university, we do have limited resources. We cannot pursue all aspects of this huge task. Still, we can see that other groups are joining in. We can also feel interest from the large tech corporations. They are strongly engaged in ML and are beginning to realize how English bias can be a problem. Obviously, they are not happy to see the performance of their solutions dropping significantly when applied outside English-language contexts.’
Asked how close we are to achieving non-biased machine learning, Bugliarello responded, ‘Oh, we are very far away.’ He stated that this is not just about cultural equality.
‘The methodology behind IGLUE may find several applications. For example, we hope to improve solutions for the visually impaired. Tools exist which help the visually impaired follow the plot of a movie or another type of visual communication. These tools are currently far from perfect, and I would very much like to be able to improve them. This is a bit further into the future though,’ he concluded.
By Marvellous Iwendi.
Source: University of Copenhagen.