Does (E)IP Wrap Around in 16-bit Segments?

source link: http://www.os2museum.com/wp/does-eip-wrap-around-in-16-bit-segments/

The 8086/8088 is a 16-bit processor and offsets within a 64K segment always wrap around. If a one-byte instruction at offset FFFFh is executed on an 8086, execution will continue at offset 0. This is simply a consequence of the Instruction Pointer (IP) being a 16-bit register.

Funny things happen when an access crosses a segment boundary. On an 8086, it will also wrap around; accessing a word at offset FFFFh will access one byte at offset FFFFh and one byte at offset 0 in a segment. Again, that is a consequence of 16-bit address calculations.
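The 16-bit arithmetic behind both cases can be sketched in a few lines of Python. This is only an illustrative model, not a description of the actual hardware; the function names and the example segment value 1234h are made up for the demonstration.

```python
OFFSET_MASK = 0xFFFF     # offsets within a segment are 16 bits wide
ADDRESS_MASK = 0xFFFFF   # the 8086 has 20 address lines

def physical_address(segment, offset):
    """Physical address on an 8086: segment * 16 + 16-bit offset."""
    return ((segment << 4) + (offset & OFFSET_MASK)) & ADDRESS_MASK

def next_ip(ip, instruction_length):
    """IP is a 16-bit register, so advancing it is arithmetic modulo 64K."""
    return (ip + instruction_length) & OFFSET_MASK

def word_access_offsets(offset):
    """Offsets of the low and high byte of a word access on an 8086."""
    return offset & OFFSET_MASK, (offset + 1) & OFFSET_MASK

# A one-byte instruction at FFFFh is followed by the one at offset 0.
print(hex(next_ip(0xFFFF, 1)))               # 0x0

# A word at FFFFh has its high byte at offset 0 of the same segment.
low, high = word_access_offsets(0xFFFF)
print(hex(physical_address(0x1234, low)))    # 0x2233f
print(hex(physical_address(0x1234, high)))   # 0x12340
```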

The 80286 got a lot smarter about this. Segment protection prevents accesses that wrap around the end of a segment, for both data and instructions. The 80386 continued using the same logic.

The 286 and 386 support one special case, stack wraparound. When the 16-bit Stack Pointer (SP) is zero, pushing (say) a word on the stack will wrap around and the new SP will be FFFEh. This feature was required for 8086 compatibility, because a full size 64K stack needs to start with SP=0 (the pushes and pops must be aligned for the wraparound to occur; unaligned accesses will cause protection faults).
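As a rough illustration of the stack special case, the following Python model pushes a word with SP=0 and with SP=1. It is a deliberately simplified sketch: `SegmentLimitFault` is a hypothetical stand-in for whatever fault the CPU actually raises, and only the straddling check described above is modeled.

```python
class SegmentLimitFault(Exception):
    """Hypothetical stand-in for the fault raised on a straddling access."""

def push_word(memory, sp, value):
    """Push a 16-bit value on a 64K stack, 286/386 style (simplified)."""
    new_sp = (sp - 2) & 0xFFFF          # SP itself wraps modulo 64K
    if new_sp == 0xFFFF:
        # The word would occupy FFFFh and 0: not allowed on a 286/386.
        raise SegmentLimitFault("word at FFFFh straddles the segment end")
    memory[new_sp] = value & 0xFF       # low byte
    memory[new_sp + 1] = value >> 8 & 0xFF  # high byte
    return new_sp

stack = bytearray(0x10000)

# Aligned case: SP=0 wraps to FFFEh and the word fills FFFEh-FFFFh.
sp = push_word(stack, 0x0000, 0x1234)
print(hex(sp))                          # 0xfffe

# Unaligned case: SP=1 would place the word at FFFFh, which faults.
try:
    push_word(stack, 0x0001, 0x5678)
except SegmentLimitFault as fault:
    print("fault:", fault)
```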

Does the instruction pointer also wrap around in a way similar to the stack segment?

Let’s consider the following simple DOS program:

.model	small

.code
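	; Offset 0: reached only after execution runs off the top of the segment.
	; DS already points to _DATA by then.
	mov	dx, offset msg_bot
	mov	ah, 9			; DOS function 09h: print '$'-terminated string
	int	21h
	
	mov	ax, 4C00h		; DOS function 4Ch: terminate, return code 0
	int	21h
	
_start:
	mov	ax, _DATA		; program entry point: set up DS
	mov	ds, ax
	
	mov	dx, offset msg_str
	mov	ah, 9
	int	21h
	
	jmp	near_end
	
	org	0FFF8h			; place the final 8 bytes of code at FFF8h-FFFFh
near_end:
	mov	dx, offset msg_top	; 3 bytes, FFF8h-FFFAh
	mov	ah, 9			; 2 bytes, FFFBh-FFFCh
	int	21h			; 2 bytes, FFFDh-FFFEh
	inc	ax			; 1 byte at FFFFh, the last byte of the segment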

.data

msg_bot	db	'Wrapped around to start of segment',13,10,'$'
msg_top	db	'Near top of code segment',13,10,'$'
msg_str	db	'Entered program',13,10,'$'

.stack

	end	_start

The program is constructed such that the one-byte ‘inc ax’ instruction is at offset FFFFh in the code segment.
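The placement is easy to check by adding up the instruction lengths after the ‘org 0FFF8h’ directive. The short Python sketch below does just that; the byte counts are the standard 16-bit encodings (a MOV of a 16-bit immediate to DX takes 3 bytes, a MOV of an 8-bit immediate to AH takes 2, INT takes 2, INC AX takes 1), and the helper name is made up for the illustration.

```python
ORG = 0xFFF8

# (instruction text, encoded length in bytes) for the code after the ORG
INSTRUCTIONS = [
    ("mov dx, offset msg_top", 3),
    ("mov ah, 9", 2),
    ("int 21h", 2),
    ("inc ax", 1),
]

def layout(start, instructions):
    """Return [(offset, text), ...] and the offset just past the last byte."""
    placed = []
    offset = start
    for text, length in instructions:
        placed.append((offset, text))
        offset += length
    return placed, offset

placed, end = layout(ORG, INSTRUCTIONS)
for offset, text in placed:
    print(f"{offset:04X}h  {text}")
print(f"next offset: {end:X}h")   # next offset: 10000h
```

The last line shows the point of the exercise: the byte after ‘inc ax’ would be at 10000h, one past the highest offset a 64K segment can hold.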

When executed on a typical PC compatible system, the program will print the following:

C:\>wrap
Entered program
Near top of code segment
Wrapped around to start of segment

C:\>

Clearly the instruction pointer wrapped around 64K. Case closed.

But wait! Not so fast. Although it looks like the IP wrapped around, what actually happened is a bit more complicated, and much more interesting.

After executing ‘inc ax’ on a 386 compatible CPU, the EIP instruction pointer will not wrap to zero but rather advance to 10000h. Because 10000h is past the 64K segment limit, the attempt to execute the next instruction triggers a #GP (General Protection) fault.
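The difference from the 8086 can be modeled in Python as follows. This is a simplified sketch of the behavior just described, not of the real fetch logic; the names `advance_eip`, `check_fetch` and `GeneralProtectionFault` are invented for the example.

```python
SEGMENT_LIMIT = 0xFFFF   # highest valid offset in a 64K segment

class GeneralProtectionFault(Exception):
    """Models the #GP fault raised for a fetch past the segment limit."""

def advance_eip(eip, instruction_length):
    """EIP is 32 bits wide, so it does not wrap at 64K."""
    return (eip + instruction_length) & 0xFFFFFFFF

def check_fetch(eip, limit=SEGMENT_LIMIT):
    """Raise #GP if the next instruction would start past the limit."""
    if eip > limit:
        raise GeneralProtectionFault(f"fetch at {eip:X}h, limit {limit:X}h")
    return eip

eip = advance_eip(0xFFFF, 1)     # after the one-byte 'inc ax'
print(hex(eip))                  # 0x10000
try:
    check_fetch(eip)
except GeneralProtectionFault as fault:
    print("#GP:", fault)
```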

The #GP fault vector is 13 (0Dh). But in a PC compatible system, that is also the vector for hardware interrupt IRQ5. If there is nothing using IRQ5, the default BIOS handler will examine the interrupt controller state, decide that nothing happened, and execute IRET. Even if some peripheral is using IRQ5, the interrupt handler will eventually return with an IRET instruction.

And that’s where the trick is. When the #GP fault occurs in real mode, the CPU can only push a 16-bit code offset on the stack. Instead of 10000h, it pushes zero. When the interrupt handler returns, execution continues at offset zero rather than at the offset where the fault actually occurred (10000h).

In protected mode, the behavior is a bit more obvious; assuming that 32-bit interrupt handlers are used, the CPU will push the full 32-bit EIP value on the stack. An IRET instruction will not be able to return because it will #GP fault trying to transfer control to an offset past the segment limit.
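The real-mode and protected-mode cases can be put side by side in a small Python model. It is only a sketch under simplifying assumptions: just the saved offset is tracked (not CS or the flags), the helper names are invented, and the base vector 08h for IRQ0-IRQ7 is the usual PC BIOS setup rather than something the CPU dictates.

```python
SEGMENT_LIMIT = 0xFFFF
MASTER_PIC_BASE = 0x08    # PC BIOS convention: IRQ0-IRQ7 use vectors 08h-0Fh
GP_FAULT_VECTOR = 13

class GeneralProtectionFault(Exception):
    """Models the #GP fault raised when IRET targets an invalid offset."""

def irq_vector(irq):
    """Interrupt vector used by a hardware IRQ on the master PIC."""
    return MASTER_PIC_BASE + irq

def saved_offset(eip, handler_bits):
    """Offset pushed on the stack: 16 bits in real mode, 32 with 32-bit handlers."""
    return eip & ((1 << handler_bits) - 1)

def iret(offset, limit=SEGMENT_LIMIT):
    """Return to the saved offset, subject to the segment limit check."""
    if offset > limit:
        raise GeneralProtectionFault(f"return to {offset:X}h, limit {limit:X}h")
    return offset

# The #GP vector and the IRQ5 vector are the same number.
print(irq_vector(5) == GP_FAULT_VECTOR)        # True

# Real mode: the high half of EIP is lost, so IRET resumes at offset 0.
print(hex(iret(saved_offset(0x10000, 16))))    # 0x0

# Protected mode, 32-bit handler: EIP survives and IRET faults again.
try:
    iret(saved_offset(0x10000, 32))
except GeneralProtectionFault as fault:
    print("#GP:", fault)
```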

The same DOS program shown above does not successfully run in an OS/2 VDM. That is a strong hint that DOS applications do not rely on such wraparound, because it would be relatively easy for OS/2 to support that.

Figure: Not wrapping around in an OS/2 VDM

Protected-mode 16-bit programs will usually terminate with some form of protection fault if they try to execute past 64K. It is only in the PC compatible DOS environment that the wraparound seemingly occurs, due to a combination of interrupts losing the high half of EIP and the #GP fault being aliased to a hardware interrupt.

Needless to say, 16-bit code segments on a 386 can have any segment limit, up to 4GB. No tool that I know of supports oversized 16-bit code segments (normal 16-bit near jumps and calls can only generate 16-bit offsets, but it is possible to produce 32-bit offsets in 16-bit code). The utility of such segments is extremely problematic in real mode, because every interrupt will lose the high word of EIP. In the end, it’s much more straightforward to use proper 32-bit code segments or at least multiple 16-bit code segments.
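The article does not say how 32-bit offsets would be produced in 16-bit code. One way, assumed here for illustration, is the standard 66h operand-size prefix, which turns the near JMP opcode E9h into a form with a 32-bit displacement. The Python sketch below builds such an instruction by hand; the function name and the example offsets are made up.

```python
import struct

OPERAND_SIZE_PREFIX = 0x66   # switches a 16-bit code segment to 32-bit operands
JMP_NEAR_RELATIVE = 0xE9     # JMP with a displacement relative to the next instruction

def encode_jmp_rel32(instruction_offset, target_offset):
    """Encode a near JMP with a 32-bit displacement for 16-bit code."""
    length = 6  # prefix + opcode + 4 displacement bytes
    displacement = (target_offset - (instruction_offset + length)) & 0xFFFFFFFF
    return (bytes([OPERAND_SIZE_PREFIX, JMP_NEAR_RELATIVE])
            + struct.pack("<I", displacement))

# Jump from offset 0100h to offset 12345h, which no 16-bit displacement can reach.
code = encode_jmp_rel32(0x0100, 0x12345)
print(code.hex())   # 66e93f220100
```

Hand-assembling bytes like this only underlines the point about tooling: without assembler and linker support, such jumps have to be emitted manually.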

