
Why does trying to break into the NT 3.1 kernel reboot my 486DX4 machine?

 2 years ago
source link: https://retrocomputing.stackexchange.com/questions/19655/why-does-trying-to-break-into-the-nt-3-1-kernel-reboot-my-486dx4-machine

Short explanation

The Windows NT 3.1 kernel is incompatible with enhanced 486 processors. Specifically, it is incompatible with 486 processors that provide the CPUID instruction. Kernel debugging works fine with the 486DX-33 that was originally installed in the machine, and with the older non-enhanced core in a write-through Am486DX4-NV8T without SMM.

If your goal is just toying around with NT 3.1 kernel debugging, you might want to use a processor that is compatible with Windows NT 3.1 out of the box. If you are as curious as I am, you might want to fix Windows NT. Keep reading in this case.

The underlying issue

The incompatibility is due to a bug in KiSaveProcessorControlState (and a similar bug in the counterpart KiRestoreProcessorControlState), which is called from three locations inside NTOSKRNL.EXE:

  1. When an exception is reflected to the kernel debugger using KdpTrap (if I use Ctrl-C to break into the kernel debugger, the break-in polling code in the timer tick interrupt raises a breakpoint exception)
  2. When KeBugCheckEx is called (i.e. the "blue screen")
  3. When KiSaveProcessorState is invoked. This appears to never happen: the function is neither exported nor called from inside NTOSKRNL, assuming IDA's control flow analysis of NTOSKRNL.EXE is exhaustive.

This function is supposed to save the processor control registers into an extended CONTEXT structure. Its disassembly looks like this:

.text:80106740 ; __stdcall KiSaveProcessorControlState(x)
.text:80106740                 public _KiSaveProcessorControlState@4
.text:80106740 _KiSaveProcessorControlState@4 proc near
.text:80106740
.text:80106740 dest            = dword ptr  4
.text:80106740
.text:80106740                 mov     edx, [esp+dest]
.text:80106744                 xor     ecx, ecx
.text:80106746                 mov     eax, cr0
.text:80106749                 mov     [edx+0CCh], eax
.text:8010674F                 mov     eax, cr2
.text:80106752                 mov     [edx+0D0h], eax
.text:80106758                 mov     eax, cr3
.text:8010675B                 mov     [edx+0D4h], eax
.text:80106761                 mov     [edx+0D8h], ecx
.text:80106767                 cmp     ds:word_FFDFF138, 5
.text:8010676F                 jb      short @@before_pentium
.text:80106771                 mov     eax, cr4
.text:80106774                 mov     [edx+0D8h], eax
.text:8010677A @@before_pentium:
.text:8010677A                 mov     eax, dr0
.text:8010677D                 mov     [edx+0DCh], eax
.text:80106783                 mov     eax, dr1
.text:80106786                 mov     [edx+0E0h], eax
.text:8010678C                 mov     eax, dr2
.text:8010678F                 mov     [edx+0E4h], eax
.text:80106795                 mov     eax, dr3
.text:80106798                 mov     [edx+0E8h], eax
.text:8010679E                 mov     eax, dr6
.text:801067A1                 mov     [edx+0ECh], eax
.text:801067A7                 mov     eax, dr7
.text:801067AA                 mov     dr7, ecx
.text:801067AD                 mov     [edx+0F0h], eax
.text:801067B3                 sgdt    fword ptr [edx+0F6h]
.text:801067BA                 sidt    fword ptr [edx+0FEh]
.text:801067C1                 str     word ptr [edx+104h]
.text:801067C8                 sldt    word ptr [edx+106h]
.text:801067CF                 retn    4
.text:801067CF _KiSaveProcessorControlState@4 endp

The function saves all control registers (CR0, CR2, CR3, and CR4 on Pentium and later processors), all debug registers (DR0-DR3, DR6, DR7) and various global protected mode settings (the address of the GDT, the address of the IDT, the selector of the active TSS and the selector of the LDT). To detect the processor type, it uses a value from the KPRCB (Kernel Processor Control Block). The KPRCB is part of the KPCR (Kernel Processor Control Region). The KPRCB for the boot processor (or the only processor on uniprocessor systems) is located at virtual address FFDFF120, which is hard-coded into this function. Geoff Chappell writes this about the relevant part of the KPRCB in NT 3.1:

+018   CHAR CpuType;
+019   CHAR CpuID;
+01A   UShort CpuStep;

These members of the KPRCB are initialized by KiSetProcessorType, which identifies the relevant processors correctly (but be aware that it mistrusts processors that report a CPUID feature level above 3 and treats them as "generic non-CPUID capable 586 compatible processors"). The byte at offset 18h is set to 4 for 486 processors, 5 for Pentium processors and 6 for Pentium Pro and Pentium II/III processors. The byte at offset 19h is a boolean flag that indicates whether the processor supports CPUID and behaves "reasonably".

A very attentive reader might already have noticed the bug: the CMP instruction compares the word at address FFDFF138 (which is 18h bytes into the KPRCB) instead of the byte at that address. Because x86 is little-endian, the byte at offset 19h in the KPRCB becomes the high byte of that word and is treated as part of the model number. If a processor supports CPUID, its model number is considered to be 256 bigger than it actually is. This means Windows NT 3.1 treats a CPUID-capable 80-4-86 processor as an 80-260-86 processor. And as 260 is way larger than 5 (Pentium), the code goes on to read CR4, so that processor had better have one.
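The arithmetic behind the bug can be replayed outside the kernel. The following is a minimal Python sketch, not code from NT: the function names are made up for illustration, the CpuID flag is assumed to be stored as 1 when set (which is what "256 bigger" implies), and the CpuStep value is an arbitrary placeholder.

```python
import struct

KPRCB_BASE = 0xFFDFF120   # boot-processor KPRCB address, as stated in the text
CPU_TYPE_OFFSET = 0x18    # CpuType (one byte); CpuID follows at 0x19


def make_kprcb_fragment(cpu_type: int, has_cpuid: bool) -> bytes:
    """Build the bytes at KPRCB+18h: CpuType, CpuID, CpuStep (placeholder 0)."""
    return struct.pack("<BBH", cpu_type, int(has_cpuid), 0)


def buggy_reads_cr4(fragment: bytes) -> bool:
    """Word compare, mirroring 'cmp ds:word_FFDFF138, 5' followed by 'jb'."""
    (value,) = struct.unpack_from("<H", fragment, 0)  # little-endian 16-bit read
    return value >= 5


def fixed_reads_cr4(fragment: bytes) -> bool:
    """Byte compare: only CpuType takes part in the comparison."""
    return fragment[0] >= 5


plain_486 = make_kprcb_fragment(4, has_cpuid=False)
enhanced_486 = make_kprcb_fragment(4, has_cpuid=True)

print(hex(KPRCB_BASE + CPU_TYPE_OFFSET))            # address used by the CMP
print(struct.unpack_from("<H", enhanced_486)[0])    # value seen by the word compare
print(buggy_reads_cr4(plain_486), buggy_reads_cr4(enhanced_486))
print(fixed_reads_cr4(enhanced_486))
```

With the word compare, only the CPUID-capable 486 is sent down the CR4 path; with the byte compare, neither 486 variant is.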

The fix

The fix is obvious once the bug is identified. The instruction cmp ds:word_FFDFF138, 5 appears only twice in NTOSKRNL.EXE, specifically in KiSaveProcessorControlState and KiRestoreProcessorControlState, and it needs to be patched to be a byte compare instead of a word compare. Use your favorite hex editor to patch 66 83 3D 38 F1 DF FF 05 to 90 80 3D 38 F1 DF FF 05, two times. The operand-size prefix 66 becomes a NOP (90), and the opcode changes from 83 (compare a word with an 8-bit immediate) to 80 (compare a byte with an 8-bit immediate). This fix applies to the NTOSKRNL.EXE from both the original NT 3.1 Advanced Server distribution and NT 3.1 SP3.
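If you would rather script the patch than use a hex editor, something along these lines should work. It is only a sketch: the function name is hypothetical, and it deliberately refuses to touch an image in which the byte pattern does not occur exactly twice, since that would mean the file is not one of the kernels described above (or is already patched).

```python
# Byte patterns taken from the text: word compare -> NOP + byte compare.
OLD = bytes.fromhex("66 83 3D 38 F1 DF FF 05")
NEW = bytes.fromhex("90 80 3D 38 F1 DF FF 05")


def patch_kernel_image(image: bytes, expected: int = 2) -> bytes:
    """Return a copy of the image with every OLD pattern replaced by NEW.

    Raises ValueError if the pattern count differs from the expected
    number of occurrences, so an unknown image is never modified.
    """
    found = image.count(OLD)
    if found != expected:
        raise ValueError(f"expected {expected} occurrences, found {found}")
    return image.replace(OLD, NEW)
```

To use it, read a copy of NTOSKRNL.EXE with pathlib.Path.read_bytes(), pass the contents to patch_kernel_image, and write the result to a new file, keeping the original as a backup.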

