PART II
INTRODUCTION:
Part 1 of this text introduced a number of essential preparatory topics, followed by the code for a basic Windows program, and ended with an introduction to the concept of structured programming.
The basic window program contained the minimum code necessary to open a window on the screen, and then close it again. The window lacked the usual additional features, such as a menu bar, and had no message response code other than that needed to close the program. This was deliberate: a minimal program best helps the reader to distinguish what is essential in the code for a Windows program from what can be added or not, as desired.
In addition, a basic 'skeleton' code like this can be used as a template, into which can be inserted whatever features might be desired in order to create a working program.
Part 2 will begin to discuss additional topics, such as painting or printing to the screen; executable file resources, including a menu bar and menus; program icons; how to create popup menus and dialog boxes; and so on.
The direct creation of resources, like menus, without a resource compiler, will also be discussed, as well as possibilities for displaying different kinds of windows, including custom windows, drawn entirely by the program code rather than by the system.
It is impossible, however, to begin to create programs without immediately encountering the problem of how to trace the faults you will have written into your code. Hoping to write code without any faults is a forlorn hope. In fact, finding and eliminating faults, or bugs, is one of the major parts of a programmer's task, and it is necessary to learn how to do this at the earliest opportunity. The first chapter of Part 2, on debugging, therefore deals with this problem, and discusses how to trace faults and eliminate them from your code.
CHAPTER 12
DEBUGGING A WINDOWS PROGRAM CODE.
Debugging, or tracing the locations of faults, or 'bugs', in your code, is one of the most important, possibly the most frequent, easily the most time consuming, and sometimes the most frustrating aspect of programming.
When a fault occurs, the program will either crash, or hang, freezing the entire system, or a system dialog box will be displayed, saying that the program has performed an illegal operation, or something similar, and will be closed down.
If the program simply freezes the system, you can press Ctrl+Alt+Delete on the keyboard, which brings up a dialog box listing the running processes, or programs, from which you can select your crashed program and then choose 'End Task' or 'End Process', depending on which version of Windows you are working with.
In the worst case, where nothing else works, you will have to switch off the power, creating an improper shutdown condition. Next time you start the computer, if you have an earlier version of Windows, the system may flag an improper shutdown and scan the hard disk for faults before fully loading the operating system. It is better, of course, to avoid such extreme measures, if possible.
Debugging requires some kind of systematic approach to first trace the location of the fault, and then work out what exactly the fault is. We will consider these two aspects of the matter separately.
TRACING THE LOCATION OF A FAULT:
The debugging system I use consists of printing data from the program code to a debug file, which is merely a text file (which I could call, say, debug.txt). The file shows which locations in the code were reached, together with other data, if required. This information is collected via two debug macros, one of which indicates the location at which it was inserted within the program code, while the other collects other kinds of data.
The macro that indicates a location within the code is called dbg, short for 'debug'. What it does is simply print a word, or words, which can be different each time, to the debug file. This means that, if it is used with a different word each time, at a different location in the code, the word can serve to indicate the line of the code from which the word was printed. That is, the words assigned to the macro serve to 'label' locations within the code.
Suppose, for example, I want to find out whether a fault occurred in WinMain, or in the WndProc part of the code. I can use the dbg macro as follows:
_WinMain@10h
dbg 'winmain'
_WndProc
dbg 'winproc'
These two debug macros, as indicated, are placed on lines immediately following the first line of the WinMain section of the code and the first line of the WndProc section of the code. If, now, I run the program, and later examine the debug file, and it contains only the word 'winmain', I know that the fault is somewhere in WinMain, since the code did not get as far as WndProc before it crashed. If, however, the word 'winproc' also appears in the debug file, I know that the fault is not in WinMain, because the program did get as far as WndProc, and therefore the fault must be somewhere within the WndProc code.
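For readers more familiar with C than with assembly macros, the labelling idea can be sketched as follows. This is only an illustration under stated assumptions, not the dbg macro itself: the function name dbg, the file name debug.txt, the one-label-per-line layout, and the two stand-in functions are all hypothetical. The one point worth copying is that the file is closed after every label, so that a label is not lost in a buffer if the program crashes on the next instruction.

```c
#include <stdio.h>

/* Hypothetical file name; the text suggests something like debug.txt. */
#define DEBUG_FILE "debug.txt"

/* Append one label to the debug file, one label per line.
   The file is opened and closed on every call so that the label is
   handed to the operating system before the next instruction runs. */
int dbg(const char *label)
{
    FILE *f = fopen(DEBUG_FILE, "a");
    if (f == NULL)
        return -1;               /* could not open the debug file */
    fprintf(f, "%s\n", label);
    return fclose(f);            /* 0 on success */
}

/* Stand-ins (assumptions) for the two sections being labelled. */
void win_main_section(void)
{
    dbg("winmain");
    /* ... rest of the WinMain code ... */
}

void wnd_proc_section(void)
{
    dbg("winproc");
    /* ... rest of the WndProc code ... */
}
```

If the program reaches both sections, the file holds 'winmain' followed by 'winproc'; if it crashes inside the first section, only 'winmain' appears.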
This labelling system can be refined, so as to first identify a section of the code where the fault is, and then, by creating labels within the faulty section itself, line by line, identify the actual fault location ever more precisely. The labels themselves can follow a naming scheme that identifies the sections and subsections being labelled.
Suppose that the program crashed when I clicked the left mouse button in the client area of the window. This already tells me that the fault must be somewhere within the WM_LBUTTONDOWN message response code. If we have something similar to the type of code sequence shown below, the dbg macro can be inserted in the code, on a line by line basis, as follows:
W_.LBUTTONDOWN:
dbg 'lbuttondown'
call Procedure1
dbg 'lb1'
call Procedure2
dbg 'lb2'
call Procedure3
dbg 'lb3'
jmp WP_.exit
If the debug file subsequently shows the words 'lbuttondown', 'lb1' and 'lb2', but not 'lb3', I know that the fault must be within Procedure3, since the code crashed before 'lb3' could be printed to the file. If, again, the Procedure3 code contains another series of procedure calls, I can again label within Procedure3 as follows:
Procedure3:
dbg 'lb2-1'
call Procedure3_1
dbg 'lb2-2'
call Procedure3_2
dbg 'lb2-3'
ret
If the debug file displays 'lb2-1' only, and not 'lb2-2' or 'lb2-3', I know that the fault must be within Procedure3_1. This process of nesting debug labels can be continued as far as might be necessary to determine the precise location of the fault. If, after 'lb2-1', instead of the procedure call, call Procedure3_1, I find code instructions that, for example, do the following:
mov ebx,0
and, a few lines further on,
div ebx
Then I will know that the fault is that I have attempted to divide by zero. So now I can modify the code to correct this error.
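Reading the debug file by eye is usually enough, but the rule being applied (the fault lies just after the last label printed) can also be sketched in C. The function name last_label, the file name passed to it, and the one-label-per-line layout are assumptions made for this illustration; the text does not specify how the debug file is laid out.

```c
#include <stdio.h>
#include <string.h>

/* Copy the last non-empty line of the debug file into out.
   Returns 0 on success, -1 if the file cannot be opened or holds no label.
   Assumes one label per line and labels shorter than 255 characters. */
int last_label(const char *path, char *out, size_t out_size)
{
    char line[256];
    int result = -1;
    FILE *f;

    if (out_size == 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;                          /* skip blank lines */
        strncpy(out, line, out_size - 1);      /* keep the newest label */
        out[out_size - 1] = '\0';
        result = 0;
    }
    fclose(f);
    return result;
}
```

For a file containing 'lbuttondown', 'lb1' and 'lb2', the function reports 'lb2', which points at the call that follows that label in the code.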
Suppose, however, that the debug file displays 'lb2-3' as the final label, just before the 'ret' instruction at the very end of the procedure. What then? The answer is that the ret instruction is popping an improper address off the stack, which means that the address pushed by the 'call Procedure3' instruction is not at the top of the stack, as it should be. This means that, somewhere within the Procedure3 code, I have either pushed values onto the stack without popping all of them off again before the end of the procedure, or popped more values off the stack than I pushed onto it, thus popping the return address off the stack before reaching the ret instruction. Either way, an improper address is left for the ret instruction to use. In the next section we will see a method of finding the location where the wrong address was left on the stack.
IDENTIFYING THE NATURE OF THE FAULT:
The method I use for identifying the nature of a fault consists in printing register values to the same debug file as that used previously with the dbg labelling system. The file can thus contain a mixture of debug labels and printed register values. Parameters other than registers can also be printed indirectly, by reading them into a register and then printing the register value to the file.
The printing of register values to the debug file involves simply using one of the following procedure calls:
call PrintEax
call PrintEbx
call PrintEcx
call PrintEdx
call PrintEsi
call PrintEdi
Each of these is a procedure which prints the value of the relevant CPU register to the same text file (debug.txt) as that used by the dbg macro.
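Portable C cannot read a CPU register directly, so the nearest equivalent is a function that prints a named 32-bit value to the debug file. The sketch below is an analogue only: the function name print_value, the 'name=HEXVALUE' line format, and the file name are assumptions, and the text does not say what format the PrintEax family actually uses.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical file name, shared with the label output. */
#define DEBUG_FILE "debug.txt"

/* Append "name=VALUE" to the debug file, with VALUE as 8 hex digits.
   The file is closed after each write so the value survives a crash
   that happens immediately afterwards. Returns 0 on success. */
int print_value(const char *name, uint32_t value)
{
    FILE *f = fopen(DEBUG_FILE, "a");
    if (f == NULL)
        return -1;
    fprintf(f, "%s=%08lX\n", name, (unsigned long)value);
    return fclose(f);
}
```

A variable that is not held in a register can be passed in the same way, which corresponds to reading it into a register first and then printing the register.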
There will be an entire panorama of possible faults that can be discovered, using these procedures, together with the dbg macro, of which the following are a few examples:
In iterating through an array, via a loop, for example, the code might run off the end of the assigned memory and try to read an improper address. In this case the actual number of iterations will fail to correspond to the number allowed by the size of the array, as follows:
mov esi,[MEMORY1.Address]
mov edi,[MEMORY2.Address]
dbg 'a'
mov eax,[X]
call PrintEax
for [X]
set.b '=',[edi],[esi]
inc.d esi,edi
end.for
dbg 'b'
If [X] is too large, or either MEMORY1 or MEMORY2 is too small, esi or edi can run past the end of MEMORY1 or MEMORY2 and cause an invalid page fault. In this case, the debug file will contain label 'a' and the value of [X], but not label 'b'. This allows you both to find the location of the fault and to check whether [X] is too large by comparison with the declared sizes of the memory arrays.
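The comparison of [X] against the declared sizes can also be written into the code itself. The following C sketch assumes that the loop above copies [X] bytes from one block of memory to the other; the function names and parameters are hypothetical and are not part of the code discussed in the text.

```c
#include <stddef.h>

/* Return 1 if copying count bytes stays inside both blocks, else 0. */
int copy_count_fits(size_t count, size_t src_size, size_t dst_size)
{
    return count <= src_size && count <= dst_size;
}

/* Copy count bytes from src to dst, but only if neither block would
   be overrun. Returns 0 on success, -1 if the count is too large. */
int checked_copy(unsigned char *dst, size_t dst_size,
                 const unsigned char *src, size_t src_size,
                 size_t count)
{
    size_t i;

    if (!copy_count_fits(count, src_size, dst_size))
        return -1;               /* would run off the end of a block */
    for (i = 0; i < count; i++)
        dst[i] = src[i];
    return 0;
}
```

A refused copy plays the same role as the printed value of [X]: it shows that the count, rather than the copying code, is at fault.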
In the following code, the instruction pointer is stuck in an endless loop, which it can never get out of:
dbg 'a'
until.d '>',ecx,edx
mov ecx,[esi+ebx]
end.until
dbg 'b'
which should be something like
until.d '>',ecx,edx
mov ecx,[esi+ebx]
inc ebx
end.until
In the first case 'inc ebx' has been left out, so that the value of ecx never changes and can never become greater than edx, and thus the instruction pointer can never get out of the loop. Here the program will hang, and you will have to use Ctrl+Alt+Delete to get a dialog box that will enable you to tell the system to terminate the program. The label 'a' will be the last item printed to the debug file, so you can easily find the missing code.
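The same fault is easy to reproduce in C by forgetting to advance an index. The sketch below shows a corrected loop of this kind; it is an analogue, not a translation of the macro code. The function name is hypothetical, and the upper bound on the index is an addition of mine that the assembly version does not have: it stops the loop if no element ever exceeds the limit.

```c
#include <stddef.h>

/* Return the index of the first element greater than limit,
   or count if no element is greater. */
size_t first_above(const int *values, size_t count, int limit)
{
    size_t i = 0;

    while (i < count && values[i] <= limit)
        i++;    /* the step that 'inc ebx' performs; omit it and the
                   loop re-tests the same element for ever */
    return i;
}
```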
The following is a slightly modified version of the Procedure3 code from the previous section:
Procedure3:
dbg 'lb2-1'
jmp Code3_1
code3_1return:
dbg 'lb2-2'
jmp Code3_2
code3_2return:
dbg 'lb2-3'
ret
Here, the code jumps to code segments 3_1 and 3_2, and returns via jumps back to addresses, code3_1return and code3_2return, respectively, instead of calling them. The purpose of this is to avoid using call and ret instructions, which make use of the stack and thus interfere with the value at the top of the stack. This method can be used to examine the top of the stack without changing it. To find where, within Procedure3, the wrong address was left on the stack, for example, we can do the following, using the above method:
Procedure3:
dbg 'lb2-1'
pop eax
push eax
call PrintEax
jmp Code3_1
code3_1return:
dbg 'lb2-2'
pop eax
push eax
call PrintEax
jmp Code3_2
code3_2return:
dbg 'lb2-3'
pop eax
push eax
call PrintEax
ret
Here, on each occasion, we pop the current value off the top of the stack, into register eax, and immediately push it back again, and also print it from eax to the debug file. I will assume that no value is to be deliberately passed from Code3_1 to Code3_2 via the stack, which means that all the values of eax should be the same.
On the first occasion, the value will be the address that the ret instruction requires in order to return from the original call, which was a 'call Procedure3' instruction. If the wrong value is left on the stack at the ret instruction, the last value of eax, in the debug file, will not be the same as the first, as it should be. If the second value of eax is also different from the first, then the wrong address was left on the stack within Code3_1. If only the third value of eax is different from the first, then the wrong address was left on the stack somewhere within Code3_2. By this means it is possible to trace where, in the code, an unequal number of push and pop instructions were improperly applied.
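The reasoning applied to the three printed values can be summarised as a small decision function. This C sketch only restates the comparison described above; the enum and function names are hypothetical, and the three arguments stand for the values of eax read from the debug file in the order in which they were printed.

```c
#include <stdint.h>

enum stack_fault {
    STACK_OK = 0,           /* all three values agree */
    FAULT_IN_CODE3_1 = 1,   /* top of stack changed within Code3_1 */
    FAULT_IN_CODE3_2 = 2    /* top of stack changed within Code3_2 */
};

/* first:  value printed on entry (the correct return address)
   second: value printed after Code3_1 has run
   third:  value printed after Code3_2 has run, just before ret */
enum stack_fault locate_stack_fault(uint32_t first,
                                    uint32_t second,
                                    uint32_t third)
{
    if (second != first)
        return FAULT_IN_CODE3_1;
    if (third != first)
        return FAULT_IN_CODE3_2;
    return STACK_OK;
}
```

Like the text, this assumes that no value is deliberately passed from Code3_1 to Code3_2 via the stack.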
TRACING THE USE OF LABELS IN THE CODE:
You may sometimes need to find all the locations in your code where a particular function name was called, or a particular memory label was used. The easiest way to do this is to temporarily comment out, or remove, the procedure or memory label where it is defined. Then, if you try to assemble the code, the assembler will list on the screen all uses of the missing label as 'undefined' label faults. In this way you can easily trace all uses of the relevant label.
SUMMARY:
The above simple scheme is really all that is required in order to debug your Windows program code. The particular difficulties of the debugging process on any particular occasion will depend on the complexity of the code, and how many different values and procedures, or other code segments, may be involved.
Remember also that, if a fault appears to be within the code of a Windows API function itself, you will almost always find that you have passed a parameter to the function with a wrong value, such as an invalid memory address.
Also, be aware of possible problems in using the Windows messaging system. Dialog boxes, for example, can cause new messages to be sent to your program before you have completed a current response to a message. In such a case, you will not be able to also implement default processing of the message unless you have saved the message and its parameters, and then restored them before calling the API function DefWindowProc.
The most frustrating debugging experience will undoubtedly occur when you are certain that you have identified the location where the fault must be, but cannot find anything apparently wrong with the code. In such a case, you may have developed a kind of 'blind spot', causing you to keep overlooking a fault that is probably actually staring you in the face.
In such a case, I find that it is a good idea to test all possible values of all variables and registers, even those that you might tend to presume have to be obviously correct, and therefore do not need to be checked. This will sometimes reveal an unexpectedly wrong value that will help to attract your attention to the fault. Beyond this, it is a matter of either perseverance, whatever the waste of time involved, or possibly simply redoing the section of code from scratch, in the hope that what might have been an inadvertent fault will be automatically avoided this time.
It is important not to give up, whatever the time that you might think is being wasted. Leave the problem aside, if necessary, and keep coming back to it. A fresh attempt will sometimes suddenly lead you to notice what you failed to see before.
© Alen, June 2014
alen@alenspage.net
Material on this page may be reproduced
for personal use only.