Ethics of AI

Processing with AI

For this module, let's put aside coding and machine learning to dive into the ethical concerns raised by AI. Let's see why Artificial Intelligence should be used with caution.

An overview of Ethics & AI

Technology misuse: Well, I was not expecting that

Let's start with the most discussed problem raised by AI nowadays, Deep Fakes and other misuses of AI.

For those of you who don't know, a Deep Fake is a fake image, video, or sound generated by a Deep Learning model. One of its most impressive applications is creating a fake video of someone saying or doing something that they never did.

This technology is not bad in itself: it can be used to replace actors in a movie scene (because they died, because they cannot do a stunt, or just for fun), but it is also a really powerful tool for creating fake news or even revenge porn.

In the case of a single video, especially one of a public figure, someone with a little training (or sometimes with the help of AI itself) will probably end up spotting the fake. But what happens when it affects everyone? How would you defend yourself if a Deep Fake of you started spreading on social networks?

These tools could be used to discredit or scam someone: imagine creating hundreds of fake social network profiles or fake newspaper articles, for example. Experts are worried about the use of these tools during future elections, and preventive tools were created for the 2020 American election.

Can you spot deepfakes?

Two Minute Papers video on one of the latest deep fake models, which can change what someone is saying just by editing the transcript of a video.

Bias: Are you really objective?

Gender Shades: Building a diverse dataset to evaluate classification models performance.
Learn more about Joy Buolamwini in her TED talk

As a manager, you will face choices when managing a project. You will be responsible for defining the ethical considerations of a project and ensuring they are respected; the data scientist will then build a well-performing model within the constraints you have established. This page covers a non-exhaustive list of these considerations. Keep them in mind; they will be very useful later in your career.

As you know, training a machine learning model requires what we call a training dataset: a first set of observations from which you want your model to make some generalization. The problem is that sometimes this dataset isn't entirely representative of the real world (or at least of the future data it is going to analyze).

One of the most common problems occurs when the dataset does not include enough diversity; remember that machine learning is not a magical tool. For example, if you never show your model people from minorities, it will greatly underperform on them, or even not work at all (this video isn't about Machine Learning, but you get the point).

Face recognition/analysis software tends to be a bit biased…

Even if minorities are represented, what if the data are skewed? The consequences range from sad (FaceApp whitening people to make them "hot") to life-changing (sending people to jail for longer than they should be) to life-threatening. Indeed, these biases can end up in medical imaging models: what if a new medical imaging technology advertised with 99% accuracy was mainly tested on white people? It could output false negatives for other patients who would have benefited from medical treatment.
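
To make this concrete, here is a minimal sketch (with made-up labels, predictions, and group memberships) showing how you could check a model's false negative rate per subgroup instead of trusting a single overall accuracy number.

```python
import numpy as np

# Hypothetical ground truth (1 = disease present), predictions, and group labels
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 0, 0, 1])
groups = np.array(["A", "A", "A", "B", "B", "B", "B", "B"])

for g in np.unique(groups):
    mask = (groups == g) & (y_true == 1)   # actual positives belonging to group g
    fn_rate = np.mean(y_pred[mask] == 0)   # share of positives the model missed
    print(f"Group {g}: false negative rate = {fn_rate:.2f}")
```

Here the overall accuracy looks acceptable, yet group B is missed far more often than group A, which is exactly the kind of disparity an aggregate metric hides.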

The solution is not simple: choosing the right way to eliminate bias is a philosophical question in itself. What outcome do you want? Equal chances? Equal outcomes?

We highly recommend reading this short article on fairness by Google's AI team. It is part of the documentation of the What-If Tool, a tool made to help data scientists detect biases in their models.

The first obvious solution is to include more diversity in your dataset. For example, IBM launched a Diversity in Faces dataset to make sure computer vision models can recognize everyone.

Another way is to make a dataset for each minority and train a different model on each one. For example, when L'Oreal decided to make a skincare diagnostic tool, they made three datasets (for white, black, and Asian skin) and then trained three different models. That way, each model was specialized in detecting skin "problems" for a specific skin type. While this is a good solution for skincare (where users understand why they have to pick their skin type/color), it would be very inappropriate to create such "race-based" datasets for many other applications.
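
To illustrate the pattern (not L'Oreal's actual pipeline), here is a minimal sketch with scikit-learn, using made-up features, labels, and a hypothetical skin_type column: one specialized model is trained per subgroup.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical dataset: two features, a subgroup column, and a binary label
df = pd.DataFrame({
    "feature_1": [0.1, 0.4, 0.8, 0.2, 0.9, 0.5],
    "feature_2": [1.0, 0.3, 0.7, 0.6, 0.2, 0.8],
    "skin_type": ["white", "black", "asian", "white", "black", "asian"],
    "label":     [0, 1, 0, 1, 0, 1],
})

# Train one specialized model per subgroup instead of a single global one
models = {}
for skin_type, subset in df.groupby("skin_type"):
    model = LogisticRegression()
    model.fit(subset[["feature_1", "feature_2"]], subset["label"])
    models[skin_type] = model

# At prediction time, the user picks their skin type and the matching model is used
print(models["asian"].predict([[0.6, 0.7]]))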

The problem is that sometimes debiasing means gathering data on minorities or subgroups that would prefer to stay out of the spotlight. For example, in 2017 a data scientist found that transgender people weren't accurately recognized by algorithms, so he started gathering pictures from YouTubers talking about their transition. This was bad for two reasons: first, he didn't ask for permission; second, it helped transphobic people build models trained to "reveal" trans people.

You can find more information on the side effects of debiasing in this Vox article.

Overfitting: I know my lesson by heart

If you speak to data scientists, they will probably tell you that their biggest enemy is called Overfitting. Overfitting is when you "overtrain" a model so that it knows the training dataset so well that it performs almost perfectly on it but is unable to generalize to new inputs.

Some of the biases we saw in the previous chapter were an expression of overfitting: for example, the FaceApp filter was the result of a model falsely learning that a white person is "hotter" than a black person.

Natural Adversarial Examples is a dataset of unedited images that fool classification models.

Another problem is that overfitting makes it easy to fool a model into detecting something (you may have experienced this when training your KNN). You can create an image that will be recognized as anything you want while looking, to a human, like something totally different. Sometimes changing only one pixel of an image is enough; this is called a One Pixel Attack. Imagine the consequences if your model is used for security or military purposes!
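
To give an intuition of how such an attack works, here is a rough sketch of a brute-force one pixel attack. It assumes a hypothetical model object whose predict method takes a batch of images (pixel values in [0, 1]) and returns class probabilities; the actual One Pixel Attack paper uses a smarter search (differential evolution) rather than random trials.

```python
import numpy as np

def one_pixel_attack(model, image, target_class, n_trials=1000, rng=None):
    """Randomly try changing a single pixel and keep the change that most
    increases the model's confidence in `target_class`. Illustration only."""
    rng = rng or np.random.default_rng(0)
    h, w, _ = image.shape
    best = image.copy()
    best_score = model.predict(image[None])[0][target_class]
    for _ in range(n_trials):
        candidate = image.copy()
        y, x = rng.integers(h), rng.integers(w)
        candidate[y, x] = rng.random(3)  # overwrite one pixel with a random color
        score = model.predict(candidate[None])[0][target_class]
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Even this naive random search can sometimes flip a prediction, which shows how brittle an overfitted decision boundary can be.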

Video presenting the paper Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images.

Fortunately, unlike bias, fighting overfitting is easy. You separate your dataset into two parts: a training dataset and what is called a validation dataset. You train your model only on the training one, then check its accuracy on the validation one. If the model still performs well there, it is more likely to work on unseen data. Of course, this usually requires splitting the dataset randomly; when the classes are imbalanced, a stratified random split can be applied instead of a purely random one.
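
Here is a minimal sketch of that workflow with scikit-learn and a KNN classifier, on made-up data; the stratify argument performs the stratified split mentioned above.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Toy data: 100 samples, 4 features, imbalanced labels (80% class 0, 20% class 1)
X = np.random.rand(100, 4)
y = np.array([0] * 80 + [1] * 20)

# Hold out 20% as a validation set; stratify keeps the class ratio in both splits
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

print("Training accuracy:  ", model.score(X_train, y_train))
print("Validation accuracy:", model.score(X_val, y_val))
# A large gap between the two scores is a classic symptom of overfitting
```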

With the rise of generative AI, and especially image generation, the fight over copyright has reached new heights. Entire datasets scraped from across the internet are known to "steal" art to train these models (think Midjourney, DALL-E 3, etc.), and artists are not happy about it, for good reason.
Even when the companies behind these AI tools later offer an opt-out option for their next model, artists often perceive it as insufficient and untrustworthy. In turn, many are starting to "bake" their art with special tools that either make it unusable for training or, in some cases, poison the dataset. You can discover more about this process by reading this article by Melissa Heikkilä.

Open source: Open Sesame

As with any software project, data scientists can choose either to keep the code of their models secret or to open-source it, meaning anyone can see how their models were built, reuse parts of them for something else, and sometimes contribute to the project. You might think that open source is reserved for small projects made by independent developers, but on the contrary, the biggest open-source contributors are Google and Microsoft!

Modern AI was built largely thanks to open source and data scientists sharing their discoveries: MobileNet, TensorFlow.js, Runway, and most of the tools you are using in this course are based on open-source tools. While this is great for education, research, and getting feedback from the community, developers sometimes prefer to keep their projects closed for economic reasons or because they believe their models could be misused.

You might also know that some datasets are public: you should by now be familiar with ImageNet, which was used to train MobileNet, but a lot of other datasets are available on websites like Kaggle.
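
As an illustration of how open models and public datasets get reused, here is a minimal sketch (assuming TensorFlow is installed) that downloads MobileNetV2 with its ImageNet-pretrained weights and runs it on a dummy image.

```python
import numpy as np
import tensorflow as tf

# Download an open-source architecture together with weights pretrained on ImageNet
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Classify a dummy image (replace with a real 224x224 RGB picture)
image = np.random.rand(1, 224, 224, 3).astype("float32") * 255
preds = model.predict(tf.keras.applications.mobilenet_v2.preprocess_input(image))
print(tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=3))
```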

A famous example of this is OpenAI and their Natural Language Processing (NLP) model GPT.
Officially, the developers declared that they were "too scared" to release their model because of possible misuses, but the backlash was huge, and their business model obviously played a role in that decision.
Also, thanks to their funding, they can create models that require tremendous computational power to train. Hence, they are very hard to replicate (GPT-3's training cost, for example, was estimated at $4.6 million) and consume a lot of energy. This also means that having access to open-source code doesn't mean you can use the model: if the weights are not provided, as was the case with GPT, the code alone is of little use.
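
To see concretely why code without weights is of limited use, here is a sketch with the Hugging Face transformers library and GPT-2 (whose weights were eventually released): the very same architecture produces gibberish with freshly initialized weights and sensible text with the published ones.

```python
from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
inputs = tokenizer("Artificial intelligence is", return_tensors="pt")

# Architecture only: the open-source code with freshly initialized (random) weights
untrained = GPT2LMHeadModel(GPT2Config())

# Architecture + released weights: the actual usable model
pretrained = GPT2LMHeadModel.from_pretrained("gpt2")

for name, model in [("random weights", untrained), ("pretrained weights", pretrained)]:
    output = model.generate(**inputs, max_new_tokens=10, do_sample=False)
    print(name, "->", tokenizer.decode(output[0]))
```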

But not all models are open source. We are using more and more tools built on proprietary AI that we know nothing about, neither their training dataset nor their source code; this is called a "black box". We cannot directly audit whether they are fair or how they were created. We also have no way to check whether the developers included a backdoor, a way to trick the models for their own benefit.

Even when the model and the dataset are public, who can explain how a deep learning model works? Its overly complex architecture makes it a black box by definition, the opposite of what we call Explainable AI, a field of research whose goal is to make model decisions more understandable to humans.
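
Explainable AI is a whole research field, but a simple taste of the idea is permutation importance, available in scikit-learn. Here is a minimal sketch on made-up data; it measures how much a trained model's accuracy drops when each input feature is shuffled, i.e. how much the model actually relies on that feature.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Toy data: only the first feature is truly informative
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = (X[:, 0] > 0.5).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle each feature in turn and measure how much accuracy drops
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature {i}: importance = {importance:.3f}")
```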

Conversely, some datasets need to stay private (as we saw previously with the transgender database) or should at least be anonymized. These sensitive datasets also require what's called de-identification, which makes sure no one can retrieve the original information from the trained model.

In medicine, for example, the training datasets are the medical records of real patients that cannot be open-sourced for obvious reasons.
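
As a rough illustration (not a complete de-identification procedure), here is a sketch with pandas that drops direct identifiers and replaces names with salted hashes. Real de-identification goes much further (k-anonymity, differential privacy, etc.), and re-identification can still be possible, as the blog post linked below explains.

```python
import hashlib
import pandas as pd

# Hypothetical medical records with direct identifiers
records = pd.DataFrame({
    "name":      ["Alice Martin", "Bob Durand"],
    "birthdate": ["1980-02-14", "1975-09-30"],
    "diagnosis": ["diabetes", "asthma"],
})

SALT = "replace-with-a-secret-value"  # kept out of the published dataset

def pseudonymize(value: str) -> str:
    """Replace an identifier with an irreversible salted hash."""
    return hashlib.sha256((SALT + value).encode()).hexdigest()[:12]

deidentified = records.assign(patient_id=records["name"].map(pseudonymize))
deidentified = deidentified.drop(columns=["name", "birthdate"])  # drop direct identifiers
print(deidentified)
```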

Comparison of the consequences of open and closed source.
To sum up, developers can either open-source their model or keep it private, and they can make the same choice for their dataset.

Read more on the re-identification of "anonymous" data in this blog post by Dell.

A visual guide on de-identification to help you understand the concept.

Read more on the Attacks against Machine Learning


Going Further

So, what do we do now? First, we start by educating future generations on these topics (that's why this chapter is included in this course :). Then, we should accept that not everything can nor should become AI. There are things, like empathy, that cannot be embedded in a model, so we shouldn't blindly trust their predictions without any critical thinking.

The claim that you cannot embed empathy in a model is debatable: see Replika and what models of this type could become with advances in AI. Empathy might also be a philosophical question, like intuition, which some would argue has been possible for AI since AlphaGo.

Then, we must realize that design choices matter: for example, why do most voice assistants have female voices? We also have to think about the possible misuse of our creations: a model that can recognize criminals even with their faces obscured might become a tool to identify political opponents.

Finally, and maybe most importantly, have diversity in your development team. Most of these problems wouldn't occur if someone from a minority could raise their hand during development and say "This thing is not working for me" or "My community will be harmed by what we are doing right now".

Recent studies found only 18% of authors at leading AI conferences are women, and more than 80% of AI professors are men. This disparity is extreme in the AI industry: women comprise only 15% of AI research staff at Facebook and 10% at Google. There is no public data on trans workers or other gender minorities. For black workers, the picture is even worse. For example, only 2.5% of Google’s workforce is black, while Facebook and Microsoft are each at 4%.

Discriminating Systems: Gender, Race, and Power in AI.
AI Now Institute. West, S.M., Whittaker, M. and Crawford, K. (2019).

Quiz

Some questions may need to be looked up elsewhere through a quick Internet search!

You can answer this quiz as many times as you want, only your best score will be taken into account. Simply reload the page to get a new quiz.



Resources

Sum up


You are now ready to start the Project! Prepare yourself for new tools.