Can ChatGPT pass a CodeSignal assessment?

What generative AI means for technical hiring, now and in the future

ChatGPT, OpenAI’s artificial intelligence-powered chatbot, has inspired both awe and fear in the tech industry and beyond. Since its public launch in November 2022, ChatGPT has proven to be adept at drafting college essays, analyzing data, and even writing code—with the latter having obvious implications for technical hiring.

Working in partnership with OpenAI, CodeSignal’s data and engineering teams have analyzed the impact of ChatGPT on our live and asynchronous technical assessments, now and in the future. In this blog post, we’ll address a few of the most common questions about technical hiring and AI-powered tools like ChatGPT:

What is ChatGPT, and how good is it at writing code?
Is it possible to tell if a candidate used ChatGPT or another generative AI solution to solve a CodeSignal question?
What’s the risk of a candidate using ChatGPT to cheat on a CodeSignal assessment?
In the longer term, how should we respond to and even make use of ChatGPT in CodeSignal evaluations?

Overview of ChatGPT as a code assist tool

ChatGPT, short for Chat Generative Pre-trained Transformer, is a browser-based chatbot that excels at mimicking human speech. It works a bit like a search engine—but rather than producing a list of search results in response to a query, it generates an original and coherently worded (though often factually incorrect) answer.

If prompted, it can also generate and improve code in a number of popular coding languages, including Python, JavaScript, and C++. Researchers recently tested its skill as a debugger and found that ChatGPT performs on par with, or even better than, other automated code repair tools.

The power of ChatGPT to do complex tasks (like writing and debugging code) lies in the innovative training of its AI model, which included billions of data points and human trainers. In fact, CodeSignal worked in close collaboration with OpenAI as they developed and trained the GPT-3 and Codex models that power ChatGPT. Some of our anonymized coding data and questions have even been used (with our consent!) in the training of ChatGPT models.

However, just as ChatGPT often makes convincing-sounding statements that are in fact false, it can also write code that looks correct but is not. For this reason, the popular developer forum Stack Overflow banned answers from ChatGPT, given its unreliability. Stack Overflow gave the following reasoning for this decision:

Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking or looking for correct answers.
Stack Overflow, Dec. 2022

So, what does all this mean for technical hiring and for software engineering in general? Here at CodeSignal, we believe it means that a major shift in how software engineers work—and how they should be evaluated in the recruiting process—is on the horizon. However, we aren’t there yet, due to limitations with current generative AI technology. These limitations, for one, mean that ChatGPT isn’t great at solving questions in a CodeSignal assessment—for a few reasons that we expand on below.

There are clear tells of ChatGPT-generated code

First, the use of generative AI tools, including ChatGPT, often leaves detectable footprints that CodeSignal currently tracks. Red flags that raise suspicion include:

Over-commenting of code
Unnecessary statements
Code that is incompatible with the format of CodeSignal’s IDE

While any one of these attributes may appear in original code from a candidate, the combination of multiple red flags may indicate suspicious behavior. Proctoring and activity monitoring of CodeSignal assessments provide an additional layer of protection against cheating via ChatGPT.
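To make the idea of combining red flags concrete, here is a minimal sketch in Python. It is not CodeSignal's detection logic, which this post does not describe. The threshold, the flag names, the expected entry-point name `solution`, and the rule of requiring at least two flags are all assumptions chosen for illustration. Each check is a crude stand-in for one of the indicators listed above.

```python
import ast
from typing import List

# Hypothetical values, chosen only for illustration.
COMMENT_RATIO_LIMIT = 0.4          # share of full-line comments treated as "over-commenting"
EXPECTED_ENTRY_POINT = "solution"  # assumed name of the function the IDE expects


def red_flags(source: str) -> List[str]:
    """Return the names of the heuristic flags a Python submission trips."""
    flags = []
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]

    # Flag 1: a large share of non-blank lines are full-line comments.
    comment_lines = sum(1 for ln in lines if ln.startswith("#"))
    if lines and comment_lines / len(lines) > COMMENT_RATIO_LIMIT:
        flags.append("over_commenting")

    try:
        tree = ast.parse(source)
    except SyntaxError:
        return flags + ["does_not_parse"]

    # Flag 2: a statement with no effect, here a `pass` sitting next to real code.
    for node in ast.walk(tree):
        body = getattr(node, "body", None)
        if (isinstance(body, list) and len(body) > 1
                and any(isinstance(stmt, ast.Pass) for stmt in body)):
            flags.append("unnecessary_statement")
            break

    # Flag 3: the top-level function the environment is assumed to call is missing.
    top_level = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    if EXPECTED_ENTRY_POINT not in top_level:
        flags.append("wrong_entry_point")

    return flags


def is_suspicious(source: str, min_flags: int = 2) -> bool:
    """A single flag is weak evidence; require several before raising suspicion."""
    return len(red_flags(source)) >= min_flags
```

The design choice worth noting is the `min_flags` parameter: a heavily commented but otherwise ordinary submission trips one flag and is not marked, which mirrors the point that any single attribute can appear in a candidate's original code.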

ChatGPT-based cheating is rare on CodeSignal assessments—and when it happens, it’s ineffective

Additionally, we’ve found that ChatGPT is very rarely used by candidates to cheat their way through the hiring process. CodeSignal has analyzed the activity logs for all of our Skills Evaluation Framework-backed assessments to identify where a coding assist tool may have been used. We are confident that fewer than 1.5% of CodeSignal assessments involve ChatGPT-based plagiarism. Where proctoring was enabled, the vast majority of these instances were caught during our review process and the results were marked as unverified.

Moreover, of this 1.5%, the vast majority of candidates did not score high enough to pass the assessment. From our extensive testing of ChatGPT on our Framework-backed assessments, we’ve found that in 91% of cases, ChatGPT makes it possible to earn at most half of all available points. In other words: the chances that a candidate can use ChatGPT to answer questions and pass a Framework-backed CodeSignal assessment are extremely low.

The future of technical hiring must embrace, rather than shun, AI tools

At CodeSignal, we believe that AI-assisted coding solutions like ChatGPT and GitHub Copilot will eventually become the norm in software development. Looking to the future, we plan to regularly adapt our Framework-based assessments so that they continue to mirror what real-world developers do in their respective job categories.

In the meantime, CodeSignal looks forward to serving as a beta tester for OpenAI’s forthcoming ChatGPT detector tool. As the solution matures, we will be able to provide our customers a consistent way to differentiate AI-written versus human-written code from within CodeSignal coding reports.

We believe ChatGPT is here to stay and will become a tool that software developers and engineers use in their everyday work. The future of technical hiring requires learning how to embrace this tool and innovate new ways of evaluating core engineering skills that incorporate ChatGPT and other AI resources.
