
Which AI Writes the Best Code or Generates the Most Realistic Image?

source link: https://www.nytimes.com/2024/04/15/technology/ai-models-measurement.html
Image credit: Davide Comai

The Shift

A.I. Has a Measurement Problem

Which A.I. system writes the best computer code or generates the most realistic image? Right now, there’s no easy way to answer those questions.

By Kevin Roose

Reporting from San Francisco

There’s a problem with leading artificial intelligence tools like ChatGPT, Gemini and Claude: We don’t really know how smart they are.

That’s because, unlike companies that make cars or drugs or baby formula, A.I. companies aren’t required to submit their products for testing before releasing them to the public. There’s no Good Housekeeping seal for A.I. chatbots, and few independent groups are putting these tools through their paces in a rigorous way.

Instead, we’re left to rely on the claims of A.I. companies, which often use vague, fuzzy phrases like “improved capabilities” to describe how their models differ from one version to the next. And while there are some standard tests given to A.I. models to assess how good they are at, say, math or logical reasoning, many experts have doubts about how reliable those tests really are.
