The Unintended Consequences and Negative Impact of New Machine Learning Applications

Machine learning applications are becoming more powerful and more pervasive, and as a result the risk of unintended consequences grows and must be carefully managed. Recent glitches at major companies show how easy it is to miss such consequences before launch. In this post I describe a few examples and discuss ways to reduce the unintended consequences of new machine learning applications.

It seems that machine learning is taking over the high-tech world. Most consumer applications have a machine learning component, and recent progress in machine learning and scalable infrastructure has led to use-cases that were previously considered strictly experimental. For example, improvements in speech recognition have recently led to a variety of digital assistants (MS Cortana, Amazon Alexa, Apple Siri, Google Now). Another example is new computer vision technology that has enabled automatic photo labeling.

Creating new applications that use machine learning is difficult work, so it is understandable that when engineers and scientists make a technological breakthrough and have an effective product, they rush to market without carefully considering possible unintended consequences. Product managers, on the other hand, more often than not lack the engineering and science background to provide the necessary guidance and leadership.

I describe below three recent examples.

  • In May 2015 Flickr released an automatic image tagging capability that mistakenly labeled a black man as an ape, Auschwitz as sport, and Dachau as a jungle gym. After the story broke, Flickr apologized and removed some of the offensive tags. Obviously, the engineers, scientists, and product managers did not foresee these mistakes and the anger that would follow. Instead of celebrating a technological win, Flickr suffered a PR blow.
  • Soon afterwards, Google released a photo labeling tool similar to Flickr’s, and it made similar mistakes: black men were tagged as gorillas. Google removed the offensive tags, and Google’s Yonatan Zunger admitted the mistake, saying “Lots of work being done, and lots still to be done. But we’re very much on it”. Amazingly, Google repeated Flickr’s mistake of not understanding the impact such errors could have on users and on the public. The result, again, was offending a wide segment of the population rather than scoring a PR win for a technological “success”.
  • A recent Carnegie Mellon University study showed that Google displayed ads in a way that discriminated based on the gender of the user. The study created multiple fake personas, some male and some female, with identical browsing histories. A third-party site then showed Google ads for senior executive positions six times more often to the fake men than to the fake women. The ad matching algorithm may have picked up a signal in the data, but the end result is that it discriminated based on gender. In other words, the algorithm picked up on the gender pay gap and helped to perpetuate it. There are likely millions of people today who are affected by such data-based discrimination, but it is very hard to prove without creating fake personas with controlled browsing histories, as the CMU study did (a sketch of this kind of audit follows this list).
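
To make the last example concrete, here is a minimal sketch, in Python, of the kind of audit the CMU study describes. It is not the study's actual tooling, and the impression counts are made up purely for illustration: two persona groups with identical browsing histories, differing only in declared gender, are compared on how often each is shown a senior-executive-job ad, using a simple two-proportion z-test.

```python
# A minimal sketch of an ad-delivery audit in the spirit of the CMU study.
# This is NOT the study's actual tooling; the counts below are hypothetical.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(shown_a, total_a, shown_b, total_b):
    """Return (z, two-sided p-value) for H0: both groups see the ad equally often."""
    p_a, p_b = shown_a / total_a, shown_b / total_b
    p_pool = (shown_a + shown_b) / (total_a + total_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 90 of 500 "male" personas vs. 15 of 500 "female" personas
# were shown the senior-executive ad (roughly the 6:1 ratio reported by the study,
# but these specific numbers are invented for the example).
z, p = two_proportion_z_test(shown_a=90, total_a=500, shown_b=15, total_b=500)
print(f"z = {z:.2f}, p = {p:.4f}")  # a tiny p-value flags a systematic gap
```

Even a crude test like this illustrates the study's larger point: without controlled personas, an individual user has no way to tell whether the ads they see reflect noise or systematic discrimination.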

There are more examples, but these three cases are sufficient to show a trend: machine learning technology is advancing fast, and neither the engineers and scientists nor the product managers are able to anticipate negative unintended consequences and, at least in some cases, slow down new launches.

Some may argue that this is not such a big deal: the algorithms pick up signals in the data, and when the team discovers a “glitch” they fix it. One problem is that there may be a very large number of undetected cases with a very large impact on society. The Google ads case is one example where systematic discrimination was perpetrated against a very large number of Google users for potentially a long time (and perhaps still is). A second problem is that the pace of innovation is high, and it is hard to predict what future ML applications will do and what the possible unintended consequences would be. With self-driving cars, for example, this could be a matter of life or death.

This brings us to the question of what we can do to decrease the chance of such blunders in the future. Completely stifling machine learning innovation is not a desirable solution. There seem to be two possibilities: put the burden of anticipating unintended consequences on the engineers and scientists who develop the applications, or on the product managers who provide guidance and direction.

There are two issues at hand: (a) estimating unintended consequences, and (b) assigning a value or negative impact to them. Engineers and scientists are better at (a), while product managers (and, if needed, company executives) are better at (b). But it probably makes sense for these professionals to refine their answers to (a) and (b) by working together. Engineers and scientists need to broaden their perspective and think beyond technology about the impact on the company and on society, and product managers need a better technical understanding of machine learning.

Based on the examples above and others, it is pretty clear that every novel machine learning application should undergo a careful review that estimates unintended consequences and their negative impact, escalating to company executives for resolution if needed.

We invested in an app that advises you on what to wear to be publicly recognized as beautiful, based on deep learning. I came across another problem. It led me to believe that stability and reliability are actually the biggest enemies of machine learning. The system feeds on the data it indirectly emits, creating an exaggerated pool of data and a loop that limits creativity. Randomness beyond existing logic is actually needed for advancement, and that is something machines are not very good at.

Bart Minsaer

Founder & CEO HelloSolar I PAYG I Fintech I Agritech I Telco I Exec. Coach I Digital Transformation I MVNE I Start up

8y

Maybe these consequences indicate the boundaries of the "fail forward" strategy. If these "products" had to be bought and actively consumed by customers, the testing processes would probably be more rigorous with regard to the issues you mentioned.

Tim Woodruff

Data Integration and Analytics Consultant

8y

but.....marketing ;)

Mariano Martín Perez

Financial Advisor |Edward Jones |Northern Trust |Morgan Stanley | UBS | Board Member | Investment Committees | Bilingual |Trusted advisor to families, foundations, religious institutes, endowments, and institutions

8y

Scary. Thanks for sharing.

Mike Thompson

Senior Software Engineer, Microsoft

8y

I think the development teams might learn from the domain of diagnostic medical testing. A test's accuracy is described with two numbers: the false positive rate, when the test says the condition is present but it actually isn't, and the false negative rate, when the condition is present but not detected (https://en.wikipedia.org/wiki/Sensitivity_and_specificity). There is a trade-off. It appears the teams in your examples worked so hard to ensure nothing went unidentified that they failed to consider how it could increase incorrect identifications. I think there are three lessons here for anyone developing new products.

  • Break down the silos - Too often engineers, product managers, etc. are happy to stay in their own little world, trying to minimize interactions with "those people". As Guy suggests, they need to work together and understand more of the big picture.
  • Engineers need to remember they are solving problems within a context, not just building new gadgets. I've been guilty of getting so wrapped up in the fun of creating a new widget that I've lost sight of delivering value to the "consumer". We need to move beyond "how", and embrace "who", "what", "when", "where", and "why".
  • Risk mitigation, asking what can go wrong, should be part of everyone's job. I believe most people are predisposed to optimism. It is part talent and part learned skill to think about failure cases, errors, oversights, and omissions. The obvious next step is planning for what to do when something goes awry, because sooner or later, it will.
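
As a small, hypothetical illustration of the trade-off described in this comment, here is a Python sketch that computes sensitivity and specificity for a toy classifier at a few decision thresholds. The labels and scores are invented; the point is only that pushing one number up tends to push the other down.

```python
# Toy illustration of the sensitivity/specificity trade-off; all data invented.
def sensitivity_specificity(labels, scores, threshold):
    """Compute (sensitivity, specificity) when predicting positive for score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]                      # 1 = condition present
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.2, 0.1]  # classifier confidence

for threshold in (0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
# Lowering the threshold catches more true cases (higher sensitivity) at the cost
# of more false alarms (lower specificity) -- tuning hard for one side of this
# trade-off is one way a photo tagger ends up making offensive misidentifications.
```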
