source link: https://devblogs.microsoft.com/oldnewthing/20211210-00/?p=106021

It’s okay to be contrary, but you need to be consistently contrary: Going against the ambient character set

Raymond

December 10th, 2021

In Windows, you declare your character set preference implicitly by defining or not defining the symbol UNICODE before including the windows.h header file. (Related: TEXT vs. _TEXT vs. _T, and UNICODE vs. _UNICODE.) This determines whether undecorated function names redirect to the ANSI version or the Unicode version, but it doesn’t make the opposite version inaccessible. You just have to call it by its explicit name. And it’s important that you be consistent about it. If you miss a spot, the characters get all messed up.

// UNICODE not defined
#include <windows.h>

void UpdateTitle(HWND hwnd, PCWSTR title)
{
    SetWindowTextW(hwnd, title);
}

In the above example, we did not define the symbol UNICODE, so the ambient character set is ANSI. Since we want to call the Unicode version of SetWindowText, we must use its explicit Unicode name SetWindowTextW.

Most of the time, these errors are detected at compile time due to type mismatches. For example, if we forgot to put the trailing W on the function name, we would get the error

error C2664: 'BOOL SetWindowTextA(HWND,const char *)': cannot convert argument 2 from 'const wchar_t *' to 'const char *'
note: Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast

And that’s your clue that you forgot to W-ize the SetWindowText call. You should have called the W version explicitly: SetWindowTextW.
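The redirection itself is ordinary preprocessor work. Here is a minimal, portable sketch of the pattern. The functions SetTitleA and SetTitleW, the macro SetTitle, and the two UpdateTitle helpers are made-up stand-ins for illustration; they are not the actual declarations from the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Make sure the ambient character set is ANSI for this sketch.
#undef UNICODE

// Hypothetical stand-ins for an ANSI/Unicode function pair.
// Each records which version ran and returns the text length it saw.
char g_lastVersion = '?';

std::size_t SetTitleA(const char* title)
{
    assert(title != nullptr);
    g_lastVersion = 'A';
    return std::strlen(title);
}

std::size_t SetTitleW(const wchar_t* title)
{
    assert(title != nullptr);
    g_lastVersion = 'W';
    return std::wcslen(title);
}

// The undecorated name follows the ambient character set.
#ifdef UNICODE
#define SetTitle SetTitleW
#else
#define SetTitle SetTitleA
#endif

// Ambient call: resolves to SetTitleA because UNICODE is not defined.
std::size_t UpdateTitleAmbient(const char* title)
{
    return SetTitle(title);
}

// Contrary call: the explicit name still reaches the W version.
std::size_t UpdateTitleContrary(const wchar_t* title)
{
    // return SetTitle(title);  // would not compile: wchar_t* passed to SetTitleA
    return SetTitleW(title);
}
```

The commented-out line is the stand-in equivalent of the C2664 error: the parameter types differ between the two versions, so the compiler catches the missing W.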

However, there’s a category of functions that elude this compile-time detection: The functions that have separate ANSI and Unicode versions but take only character-set-independent parameters. Common examples are DispatchMessage, TranslateMessage, TranslateAccelerator, CreateAcceleratorTable, and most notably, DefWindowProc.

For some reason, when I get called in to investigate this sort of problem, it’s usually the DefWindowProc that is the source of the problem.

But I don’t think it’s because people get the others right and miss the DefWindowProc. I think it’s because the mistakes in the other functions are much less noticeable. The mistakes are still there, and maybe you’ll get a bug report from a user in Japan when they run into it, but that’s not something that is going to be noticed in English-based testing as much as a string that is truncated down to its first letter.
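The first-letter truncation falls out of the byte layout. In little-endian UTF-16, each ASCII-range character is followed by a zero byte, and code that reads the buffer as a narrow string takes that zero byte as the terminator. The sketch below reproduces the effect in portable C++. The names TitleLengthA, TitleLengthW, TitleLength, MeasureWrong, and MeasureRight are hypothetical stand-ins, and the pointer-sized integer parameter is only an assumption about how text travels through a character-set-independent parameter. This is not the real DefWindowProc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-ins for an A/W pair whose parameters carry no
// character-set information: the text arrives as a pointer-sized integer,
// so both functions have exactly the same signature.
std::size_t TitleLengthA(std::intptr_t param)
{
    assert(param != 0);
    // Treats the bytes as a narrow, NUL-terminated string.
    return std::strlen(reinterpret_cast<const char*>(param));
}

std::size_t TitleLengthW(std::intptr_t param)
{
    assert(param != 0);
    // Treats the bytes as a UTF-16, NUL-terminated string.
    return std::char_traits<char16_t>::length(
        reinterpret_cast<const char16_t*>(param));
}

// The ambient character set is ANSI in this sketch, so the undecorated
// name means the A version.
#define TitleLength TitleLengthA

// The caller holds UTF-16 text but forgets the W suffix. No diagnostic:
// the argument type matches both versions.
std::size_t MeasureWrong(const char16_t* title)
{
    return TitleLength(reinterpret_cast<std::intptr_t>(title));
}

// Consistently contrary: the explicit W name.
std::size_t MeasureRight(const char16_t* title)
{
    return TitleLengthW(reinterpret_cast<std::intptr_t>(title));
}
```

On a little-endian machine, MeasureWrong(u"Hello") returns 1 and MeasureRight(u"Hello") returns 5. Both functions compile cleanly, which is why this kind of mistake shows up at run time instead of in the build log.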

