In May, Google executives unveiled an experimental new artificial intelligence trained with text and images that they said would make internet searches more intuitive. On Wednesday, Google offered a glimpse of how the technology will change the way people search the web.
Starting next year, the Multitask Unified Model, or MUM, will allow Google users to combine text and image searches using Lens, a smartphone app that is also incorporated into Google search and other products. For example, you could take a picture of a shirt with Lens, then search for “socks with this pattern.” Searching “how to fix” on an image of a bike part will surface instructional videos or blog posts.
Google will incorporate MUM into the search results to suggest additional avenues for users to explore. For example, if you ask Google how to paint, MUM can provide detailed step-by-step instructions, style guides, or how to use homemade materials. Google also plans in the coming weeks to bring MUM to YouTube videos in search, where the AI will show search suggestions under videos based on video transcripts.
MUM is trained to make inferences about text and imagery. Integrating MUM into Google’s search results also represents a continuing march toward the use of language models that rely on large amounts of text scraped from the web and a kind of neural network architecture called the Transformer. One of the first such efforts came in 2019, when Google injected a language model called BERT into the search results to change web rankings and summarize the text below the results.
Google Vice President Pandu Nayak said BERT represented the biggest change in search results in the better part of a decade, but that MUM takes the language comprehension AI applied to Google search results to the next level.
For example, MUM uses data from 75 languages instead of English alone, and it is trained on imagery and text instead of text alone. It is 1,000 times larger than BERT when measured by the number of parameters, the connections between artificial neurons in a deep learning system.
While Nayak calls MUM a major milestone in language comprehension, he also acknowledges that large language models come with known challenges and risks.
BERT and other Transformer-based models have been shown to absorb bias found in the data used to train them. In some cases, researchers have found that the larger the language model, the worse the amplification of bias and toxic text. People working to detect and change the racist, sexist, and otherwise problematic output of large language models say that scrutiny of the text used to train these models is crucial to reducing harm, and that the way data is filtered can itself have a negative impact. In April, the Allen Institute for AI reported that blocklists used in a popular dataset Google used to train its T5 language model could lead to the exclusion of entire groups, such as people who identify as queer, making it difficult for language models to understand text by or about these groups.
In the past year, several AI researchers at Google, including former Ethical AI team leads Timnit Gebru and Margaret Mitchell, have said they faced opposition from leadership to their work showing that large language models can harm people. Among Google employees, the ousting of Gebru following a dispute over a paper critical of the environmental and social costs of large language models led to accusations of racism, calls for unionization, and demands for stronger whistleblower protections for AI ethics researchers.
In June, five U.S. senators cited several incidents of algorithmic bias at Alphabet and the ousting of Gebru among reasons to question whether Google products like search, or Google’s workplace, are safe for Black people. In a letter to executives, the senators wrote: “We are concerned algorithms will rely on data that reinforces negative stereotypes and either exclude people from seeing ads for housing, employment, credit, and education or show only predatory opportunities.”