- Machine learning algorithms identify patterns in data and make predictions for new, unseen data based on those patterns.
- Using more data doesn’t necessarily make an algorithm more accurate, and not every correlation reflects a causal relationship. Unwanted correlations in the training data can lead to biased algorithms that perform poorly on new data.
- Unwanted discrimination can occur even when sensitive personal data is not explicitly provided, because other attributes can act as proxies that implicitly reveal this information.
- Human bias is a well-studied form of bias that can find its way into machine learning algorithms through biased training data.
- Selection bias, meaning bias in the process of collecting data, is another source of bias that can cause ML algorithms to learn and reinforce bias. Because algorithms can be deployed at scale, even small systematic errors can lead to reinforced discrimination.
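
The point about sensitive information leaking through other attributes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the records are synthetic, the `neighborhood` attribute stands in for any proxy, and `fit_majority_rule` is a hypothetical stand-in for a real learning algorithm. The rule never sees the `group` column, yet its decisions still differ sharply between groups because `neighborhood` correlates with `group`.

```python
from collections import defaultdict

# Hypothetical synthetic records: (group, neighborhood, historically_approved).
# Group A lives mostly in "north", group B mostly in "south".
records = (
    [("A", "north", 1)] * 40 + [("A", "north", 0)] * 10
    + [("A", "south", 1)] * 5 + [("A", "south", 0)] * 5
    + [("B", "north", 1)] * 8 + [("B", "north", 0)] * 2
    + [("B", "south", 1)] * 10 + [("B", "south", 0)] * 40
)


def fit_majority_rule(rows):
    """Learn the majority historical outcome per neighborhood.

    The sensitive `group` attribute is deliberately ignored.
    """
    counts = defaultdict(lambda: [0, 0])  # neighborhood -> [rejected, approved]
    for _group, neighborhood, label in rows:
        counts[neighborhood][label] += 1
    return {hood: int(c[1] > c[0]) for hood, c in counts.items()}


def approval_rate_by_group(rows, rule):
    """Apply the learned rule and report the approval rate for each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, neighborhood, _label in rows:
        totals[group] += 1
        approved[group] += rule[neighborhood]
    return {group: approved[group] / totals[group] for group in totals}


rule = fit_majority_rule(records)
rates = approval_rate_by_group(records, rule)

for group, rate in rates.items():
    print(f"group {group}: approval rate {rate:.2f}")
```

With this synthetic data the rule approves everyone in "north" and no one in "south", so group A is approved about 83% of the time and group B about 17% of the time. Dropping the sensitive column did not remove the disparity; it only hid its cause.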