
source link: https://www.socialmediatoday.com/news/googles-developing-new-systems-to-weed-out-pre-existing-bias-in-machine-le/604773/

Google's Developing New Systems to Weed Out Pre-Existing Bias in Machine Learning Datasets

Published Aug. 10, 2021
By
Andrew Hutchinson Content and Social Media Manager

As we increase our reliance on machine learning, and on automated systems built from usage data and consumer insights, researchers need to work to avoid embedding unconscious bias, which is often already present in their source data and can therefore be further amplified by such systems.

For example, if you were looking to create an algorithm to help identify top candidates for an open position at a company, you might logically use the company's existing employees as the base data source for that process. The system you create would then inevitably be skewed by that input: if the current workforce is mostly male, male applicants could be weighted more heavily in the results, while the under-representation of certain backgrounds or races could likewise sway the output.
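To make the mechanism concrete, here is a toy sketch (not any real hiring system, and the employee data is invented for illustration) showing how a naive "fit" score built from an existing workforce simply reproduces that workforce's imbalance:

```python
# Toy illustration only: a scorer that measures how much an applicant
# resembles the current workforce will inherit the workforce's skew.
from collections import Counter

# Hypothetical existing employees; attribute values are assumptions.
employees = ["male", "male", "male", "female"]
base_rate = Counter(employees)

def fit_score(applicant_group):
    """Score an applicant by how common their group is among employees."""
    return base_rate[applicant_group] / len(employees)

print(fit_score("male"))    # 0.75 -- the majority group is favored
print(fit_score("female"))  # 0.25 -- purely because of the input data
```

No one wrote "prefer male applicants" anywhere; the bias arrives entirely through the choice of base data.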

Given this, it's important for AI researchers to maintain awareness of such bias and to mitigate it where possible, in order to maximize opportunity and remove pre-existing leanings from input datasets.

Which is where this new work from Google comes in - this week, Google launched its Know Your Data (KYD) dataset exploration tool, which enables researchers to identify existing biases within their base data collections, in order to combat pre-existing bias.

[Image: Google's Know Your Data (KYD) tool]

As you can see in this example, which uses image caption data, the tool enables researchers to examine their datasets for, say, the prevalence of male and female images within a certain category. Through this, research teams may be able to weed out bias at the core, improving their input data and thereby reducing the impact of harmful, embedded stereotypes.
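KYD itself is an interactive tool, so the following is not its API - just a minimal Python sketch, with invented caption records, of the underlying idea: slicing an image-caption dataset by an inferred attribute (here, gendered caption terms) and comparing prevalence within each category.

```python
# Illustrative sketch of dataset slicing, assuming hypothetical caption
# records; this is NOT Google's KYD interface.
from collections import Counter

captions = [
    {"category": "doctor", "caption": "a man in a white coat"},
    {"category": "doctor", "caption": "a male doctor with a patient"},
    {"category": "doctor", "caption": "a woman reviewing a chart"},
    {"category": "nurse",  "caption": "a woman taking notes"},
    {"category": "nurse",  "caption": "a female nurse at a desk"},
]

MALE_TERMS = {"man", "male", "men", "he"}
FEMALE_TERMS = {"woman", "female", "women", "she"}

def gender_signal(caption):
    """Crudely tag a caption as 'male', 'female', or 'unknown'."""
    words = set(caption.lower().split())
    if words & MALE_TERMS:
        return "male"
    if words & FEMALE_TERMS:
        return "female"
    return "unknown"

def prevalence_by_category(records):
    """Count gender-tagged captions within each dataset category."""
    counts = {}
    for rec in records:
        counts.setdefault(rec["category"], Counter())[gender_signal(rec["caption"])] += 1
    return counts

print(prevalence_by_category(captions))
# e.g. the 'doctor' slice skews male, flagging a possible dataset bias
```

A real tool would use far more robust attribute inference, but the report it produces - per-category prevalence counts - is the same kind of signal a researcher would act on.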

Which is an important step. At present, the KYD system is fairly restricted in how it can extract and measure data examples, but it points to an improved future for such analysis, which could help to lessen the impacts of bias within machine learning systems.

And given that more and more of our interactions and transactions are being guided by such processes, we need to be doing all we can to combat these concerns, and ensure equal representation and opportunity through these systems.

We have a long way to go on this, but it's an important step for Google's research, and for broader algorithmic analysis. 

You can read Google's full overview of its evolving KYD system here.

