What's broken in the WM_KEYDOWN-WM_CHAR input model?


I was planning on writing a rant on the input model of the Win32 API. See, while developing the input model for NGEDIT, I've been pondering how to implement the key mappings, input handling, etc. Although I had used the API in the past, it wasn't in an edit-intensive, fully customizable, internationally targeted program (although I have implemented IME-based chat for China, Japan and Korea :) ). The more I looked at the model and thought about how to handle it, the more of a complete mess it seemed in my head. But I could not pin down exactly what was wrong, or what a better model might be.

I set out to gain a complete understanding of the model, which I will describe here for whoever needs it while programming (I couldn't find a good explanation that made sense - of course, MSDN is of no help in this kind of situation).

The Win32 model works like this: when a key is pressed, you get a WM_KEYDOWN message (more of them if the key is held down past the auto-repeat delay). When it is released, you get a WM_KEYUP. You can query the state of the modifier keys (SHIFT, ALT and CONTROL) at that moment. Both messages identify the pressed key by its virtual key code: an integer code which uniquely identifies a key on the keyboard.

Apart from this, as messages get translated by the TranslateMessage() Win32 function, they are checked against the current keyboard layout, and an additional WM_CHAR message is posted for each WM_KEYDOWN, carrying the current-codepage character code of the key pressed.
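For reference, here is the minimal shape of that pipeline - a sketch with the window creation omitted, just showing where TranslateMessage() fits and where the two messages arrive:

    // Message loop: TranslateMessage() inspects each WM_KEYDOWN and, if
    // the current keyboard layout maps it to a character, posts a WM_CHAR
    // carrying that character code.
    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0)
    {
        TranslateMessage(&msg);   // generates WM_CHAR from WM_KEYDOWN
        DispatchMessage(&msg);    // routes both to the window procedure
    }

    // Window procedure: both messages arrive here, unrelated to each other.
    LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
    {
        switch (msg)
        {
        case WM_KEYDOWN:
            // wParam is the virtual key code (VK_LEFT, VK_F1, 'R', ...);
            // modifier state must be queried separately, e.g.
            // GetKeyState(VK_SHIFT) & 0x8000.
            return 0;
        case WM_CHAR:
            // wParam is the translated character code in the current codepage.
            return 0;
        }
        return DefWindowProc(hwnd, msg, wParam, lParam);
    }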

This is all fine and dandy, although things start getting tricky after this.

  • In non-US keyboard mappings, or in the US international keyboard, you can use some keys to modify the character typed afterwards with an accent. If you press one of these “accent” keys, a WM_DEADCHAR is generated that you'd probably better ignore, and the next WM_KEYDOWN of a vowel key will generate a WM_CHAR message with a different char code representing the accented character.
  • When you want to handle a key in your program, you have a choice of handling either the WM_KEYDOWN or the WM_CHAR message. Keys such as the cursor arrows or function keys don't have a character equivalent, so you need to handle WM_KEYDOWN. For regular character keys that are used for typing, you probably prefer to handle the WM_CHAR message, as it will get you the lower-case or upper-case version, or even the accented version in international layouts. A doubt always arises about other keys which generate both a WM_KEYDOWN and a WM_CHAR: what to do with TAB, RETURN, ESC, etc…? Even worse, the DEL key doesn't generate the ASCII DEL (127) WM_CHAR message that you would expect - it is a “mute” key WM_CHAR-wise. I think I remember Charles Petzold mourning the same decision in his Programming Windows book - I don't remember what conclusion he reached, but I remember it didn't satisfy me.
  • The virtual key code numerically matches the uppercase ASCII code of the character on the keyboard, and there are special codes for cursor arrows, function keys, and some weird keys with names like VK_OEM_1 (for the 'tilde' key in a US mapping). A key such as VK_OEM_1 is mapped to an actual alphabetic character in non-English European layouts, making it difficult to distinguish alphabetic from non-alphabetic WM_KEYDOWNs.
  • Windows nicely throws a different version of the messages at you if the ALT key is pressed at the same time: you get WM_SYSKEYDOWN and WM_SYSCHAR instead of the regular WM_KEYDOWN and WM_CHAR. You also get WM_SYSKEYUP, for the sake of consistency. These messages usually get handled by DefWindowProc() in order to bring up the corresponding menu item and do some special key processing. Curiously, pressing the ALT key by itself and releasing it afterwards buys you a WM_SYSKEYDOWN/WM_SYSKEYUP pair for the ALT key. But if you press ALT, then a character key, and release both, you will get all WM_SYS* messages, except for the KEYUPs of keys released after the ALT key (and including ALT's own!), which generate regular WM_KEYUPs. The F10 key is a “special key” as well, generating the WM_SYS* version of its KEYDOWN - I guess because it brings up the menu too - and I'm unsure whether its KEYUP is SYS or not. No wonder the virtual key code of the ALT key is VK_MENU. (See the sketch after this list.)
  • I couldn't readily derive from MSDN whether having CTRL or SHIFT pressed at the same time as ALT would buy me SYS messages or plain vanilla ones.
  • European and US-international mappings treat the right ALT key as a different AltGr key, which buys you some extra characters that are not placed in regular positions. This key is reported to the application as both CTRL and ALT being pressed at the same time, and the system takes care of mapping the WM_KEYDOWN to the appropriate char code for the WM_CHAR message.
  • If you are interested in your Asian customers, you have a whole new world of IME messages to serve. It opens a whole new avenue of spec nightmares - suffice it to say that you will be receiving two WM_CHAR messages in a row for some complex kanji or hanzi or hangul double-byte character, corresponding to no keypress, as the user selects complex characters from his/her IME window. Your US-only-minded application may even work, if you just don't shuffle your WM_CHARs around too much.
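To make the zoo above concrete, here is roughly how the cases end up looking inside a window procedure - a sketch, not a complete handler:

    switch (msg)
    {
    case WM_DEADCHAR:
    case WM_SYSDEADCHAR:
        // Accent prefix key pressed: usually best ignored, since the
        // accent will be folded into the WM_CHAR of the next keypress.
        return 0;

    case WM_SYSKEYDOWN:
        // ALT was down (or the key is F10): let DefWindowProc() do its
        // menu-activation magic unless you want the key for yourself.
        break;

    case WM_KEYDOWN:
        // Raw virtual key: the only way to see arrows, F-keys and DEL.
        break;

    case WM_SYSCHAR:
        // Translated character with ALT down: eat it and the ALT+letter
        // menu shortcuts stop working; pass it on and you can't bind it.
        break;

    case WM_CHAR:
        // Translated character: case, accents and control codes already
        // applied. TAB, RETURN, ESC and BACKSPACE arrive here too - but
        // DEL does not.
        break;
    }
    return DefWindowProc(hwnd, msg, wParam, lParam);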

After writing the list, I'm looking back at the title of the post and wondering if it's really a good question. Anyway.

Coming back to development, I keep a “todo list” of vi/vim commands that need implementation. It happens that the innocent “Redo” command in vi/vim is bound to the Ctrl-R key combination. There I went and mapped the ‘r' WM_CHAR with the CTRL modifier set to the redo command, pressed Ctrl-R and… nothing. I checked what the program was receiving, and found out that Windows was translating the VK_R keypress with CTRL into the corresponding C-R ASCII control code, which happens to be 0x12, or 18 in decimal.
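In other words, you have two ways of catching Ctrl-R, sketched below (Redo() stands in for whatever command dispatch you use - it's hypothetical):

    case WM_CHAR:
        // With CTRL down, 'R' arrives as the ASCII control code C-R:
        // 'R' - 'A' + 1 == 0x12 == 18. There is no 'r'-plus-CTRL-flag here.
        if (wParam == 0x12)
            Redo();   // hypothetical command handler
        return 0;

    case WM_KEYDOWN:
        // Alternative: catch the raw key and query the modifier yourself.
        if (wParam == 'R' && (GetKeyState(VK_CONTROL) & 0x8000))
        {
            Redo();   // hypothetical command handler
            return 0;
        }
        break;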

After a few days, I finally dragged myself to do it: I started researching all the intricacies of key input, dumping the results, and trying to build a clear mental image. I found out some stuff that I didn't know and haven't seen anywhere on the web (of course, MSDN is less than complete).

Here is a nice list of the messages that the R key can generate in its different combinations:


               nothing    CTRL     ALT         CTRL+ALT
    nothing    KD+'r'     KD+^R    SKD+S'r'    KD
    SHIFT      KD+'R'     KD+^R    SKD+S'r'    KD


The row selects whether SHIFT is pressed or not; the column selects which combination of CTRL and ALT is pressed. ‘KD' represents the WM_KEYDOWN message and SKD represents WM_SYSKEYDOWN. ‘x' represents a WM_CHAR message carrying the character x, and S'x' represents WM_SYSCHAR. ^X means the ASCII C-X control code in the WM_CHAR. WM_KEYUPs are not shown. And of course, the char code on the left is toggled depending on the CAPSLOCK status (I'm unsure about what happens when CAPSLOCK is on and ALT is used together with the key, although your program had better not behave too differently).
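Tables like this one are easy to reproduce: dump every keyboard message from your window procedure and start pressing keys. Something along these lines (my own debugging sketch):

    static const char *KeyMsgName(UINT msg)
    {
        switch (msg)
        {
        case WM_KEYDOWN:     return "KD";
        case WM_SYSKEYDOWN:  return "SKD";
        case WM_CHAR:        return "CHAR";
        case WM_SYSCHAR:     return "SCHAR";
        case WM_DEADCHAR:    return "DEAD";
        case WM_SYSDEADCHAR: return "SDEAD";
        case WM_KEYUP:       return "KU";
        case WM_SYSKEYUP:    return "SKU";
        default:             return NULL;
        }
    }

    // Inside the window procedure:
    const char *name = KeyMsgName(msg);
    if (name)
    {
        char buf[128];
        wsprintfA(buf, "%s wParam=0x%02X shift=%d ctrl=%d alt=%d\n",
                  name, (UINT)wParam,
                  (GetKeyState(VK_SHIFT)   & 0x8000) != 0,
                  (GetKeyState(VK_CONTROL) & 0x8000) != 0,
                  (GetKeyState(VK_MENU)    & 0x8000) != 0);
        OutputDebugStringA(buf);
    }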

Other, non-alphabetic keys get similar mappings, although they don't get a WM_CHAR when pressed together with CTRL (in most cases, as we'll see in a moment).

I set out to complete my understanding of the mapping, and wondered what would happen with other keys. It turns out Windows goes to great lengths to produce an ASCII code that may correspond to the keypress (in some cases quite a dubious one, to my understanding of affairs).

All (English) alphabetic keys (from A to Z inclusive) get mapped to their control codes (ASCII 1 to 26). That means that if you press Ctrl-M, your program will receive a nice WM_CHAR message with code 13, usually corresponding to the RETURN key. Be sure to filter that out if you provide Ctrl+key mappings! It even works as a replacement for RETURN in unsophisticated programs such as Notepad.
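So if you bind Ctrl+letter combinations through WM_CHAR, a filter along these lines is in order (a sketch; HandleCtrlKey and HandleTypedChar are hypothetical dispatch functions):

    case WM_CHAR:
        // ASCII 1..26 are C-A..C-Z, but they collide with "real" keys:
        // 8 is also BACKSPACE (C-H), 9 is TAB (C-I), 13 is RETURN (C-M).
        if (wParam >= 1 && wParam <= 26)
        {
            char letter = (char)('A' + wParam - 1);
            // Disambiguate by checking whether CTRL is actually down.
            if (GetKeyState(VK_CONTROL) & 0x8000)
                HandleCtrlKey(letter);          // hypothetical: Ctrl+letter binding
            else
                HandleTypedChar((char)wParam);  // hypothetical: a genuine TAB/RETURN/BS
            return 0;
        }
        break;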

Then I set out to find out whether Windows would generate the rest of the control codes. Poking around, looking at ASCII charts in more detail than I'd like to, I found out that you can generate a WM_CHAR with ASCII 28 with C-\, ASCII 29 with C-], ASCII 30 with C-S-6 (which is more like C-^), and ASCII 31 with C-S-hyphen (which should be read as C-_). I usually use a non-US keyboard, for which I can tell you that the ASCII mapping is a bit lousier than for the US mapping: the ‘\[]^' symbols are moved around to other keys, but Windows still performs the mapping as if the keyboard had US keytops - except for the DEADCHAR-generating keys, which simply behave differently…

The ASCII code 27, that is, ESC, can be generated both by the ESC key and by the C-[ combination.

And I also discovered some curious behavior: RETURN by itself generates a WM_CHAR with ASCII 13, while C-RETURN generates a WM_CHAR with ASCII 10. I was tempted not to tell you that C-S-RETURN generates no WM_CHAR at all, just to get you lost if you aren't already :)

And one mysterious ASCII code, DEL (127), was conspicuously absent from the keyboard - the DEL key would not generate it with any combination of modifier keys. But I finally found it: while BACKSPACE generates ASCII 8 (BS), combined with the CTRL key it generates the famous ASCII 127 DEL character.
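For reference, gathering everything above in one place (US layout, as far as my experiments went; the parenthesized aliases just follow from the control-code rule):

    ASCII 1-26    C-A .. C-Z  (so 8 = C-H, 9 = C-I, 13 = C-M, 10 = C-J)
    ASCII 8       BACKSPACE
    ASCII 9       TAB
    ASCII 10      C-RETURN
    ASCII 13      RETURN
    ASCII 27      ESC, also C-[
    ASCII 28      C-\
    ASCII 29      C-]
    ASCII 30      C-S-6 (i.e. C-^)
    ASCII 31      C-S-hyphen (i.e. C-_)
    ASCII 127     C-BACKSPACE (the DEL key itself generates no WM_CHAR)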

Wow… I sincerely hope this explanation turns out useful or interesting for someone. I would've been happy to find it myself a couple of weeks ago.

With this understanding, I could finally go and map the Ctrl-R key combination to the Redo command, go around binding other keys, and move on. I could also comment on it with geek friends and we could all have a good laugh. But something was still dangling in my mind. Apart from being awkward, the system seemed flawed, but I could not pinpoint why.

Even if this is all a mess, and disregarding the Far East input stuff, I was thinking that I had to come up with a better design if I was to rant about it. Of course, the SYS nightmare should be easy to fix in a new design (the messages should be the same, and the fact that ALT or F10 or an ALT shortcut brings up the menu should be stuffed somewhere else), but something seemed to be fundamentally wrong and I could not grasp it.

Finally, I realized today.

The input model loses one piece of critical information: because WM_KEYDOWN and WM_CHAR are separate and completely unrelated messages, the concept of a keypress is lost. And there is no way of bringing it back.

See, having the system “cook” some of the information, such as the character mapping of the key if the key has a sensible one, is fine. Receiving extra information is great, and the system will do a much better job of supporting all the national specifics than the app could. Even if that information is “cooked” in a weird way, such as the SYS stuff (or even the DEADCHAR stuff), it is OK and you could just cope with it. But you dearly want to be able to handle a keypress as a unit, either using or ignoring the “cooked” information, or just the raw virtual key code.

Realizing this, I could finally think up what a better scheme would be. Wouldn't it be great if Windows provided a WM_KEYPRESS message including all the information? Say, the virtual key code, the fully-cooked ASCII mapping that WM_CHAR gets, some sort of indication of a DEADCHAR keypress, the state of the modifier keys, and maybe even the uncooked keytop ASCII char if one exists?

The Asian IMEs could probably generate keypresses with a null virtual key code in order to send their characters to unaware applications that just handle input, while making it clear that no key was actually pressed for that character, so the application would not try to match it against its keyboard mappings. Key maps would always refer to virtual keys, and text input would always just take the fully-cooked ASCII component. And we would all be happy.
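Just to make the proposal concrete, the payload of such a WM_KEYPRESS could look something like this - entirely made up, of course, since no such message exists in Win32:

    // Hypothetical - no such message or structure exists in the Win32 API.
    typedef struct KEYPRESSINFO
    {
        UINT  vkCode;       // virtual key code, or 0 for IME-generated chars
        WCHAR chCooked;     // fully-translated character (case, accents and
                            // control codes applied), or 0 if none
        WCHAR chKeytop;     // uncooked character printed on the keytop, if any
        BOOL  fDeadChar;    // this keypress was an accent prefix
        UINT  fsModifiers;  // SHIFT/CTRL/ALT/ALTGR state at press time
    } KEYPRESSINFO;

    // One WM_KEYPRESS per physical keypress (or per IME-composed character,
    // with vkCode == 0) would carry all of this at once.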

The good thing is that this could have been implemented at any moment, as adding a new message does not create incompatibilities. But we are now stuck with the really weird WM_KEYDOWN-WM_CHAR model.

I won't hold my breath waiting for Microsoft to include something along these lines :)
