CHAPTER 5
MACROS, AND HLA LANGUAGES.
MACROS - OR, CODED CODE:
What is described in this section is not strictly necessary in order to be able to write code for Windows programs but is, nevertheless, vital from a practical point of view. Macros, used to create High Level Assembly Language (HLA), provide a coding device that allows code to be written in an easier, more compressed, and much more readable way for the programmer, as we shall see.
Just as CPU instructions, which are binary numbers, are actually written by a programmer in the form of Assembly Language code, it is also possible to write Assembly Language code itself in the form of another code, which is thus not Assembly Language, but code for Assembly Language. That is, we will have code for code, or a kind of coded code.
I am not referring here to a translation of the entire Assembly Language code, but only individual, limited segments of Assembly Language code, where this can prove useful. Such segments of coded code are referred to as 'macros'. The following example can demonstrate the nature and usefulness of macros and, hence, the reason for their existence.
Normally, one might have a section of code such as:
inc eax
inc dword [LengthA]
inc ebx
etc
Such Assembly Language code clearly tends to be stretched vertically down the page and, thus, it could often be convenient, and easier to read, if it were possible to write more than one instruction on the same line, as follows:
i.. inc eax, inc dword [LengthA], inc ebx
etc
This can be done if 'i..' is a suitably coded 'macro', and the instructions written to the right of it, on the same line, are its 'macro parameters'. This particular macro tells the assembler to write each parameter, separated by a comma from the others, as a single instruction in the usual way. Thus, when a program is assembled, the macro code is first translated to normal Assembly Language code. This is what I meant by referring to a macro as 'coded code', or 'code for code'.
An assembler that can do this is in two parts: a preprocessor, which initially translates all macros to normal Assembly Language code, storing the result in memory, and the normal assembler, which translates the resultant Assembly Language code, along with the rest of the code, into binary code, which will later be converted to an exe file.
The macro principle is, in fact, the same as that which underlies the codes of other programming languages, like C, Basic, or Java. Individual statements in such languages, called High Level Languages, or HLLs, do, in fact, represent blocks of Assembly Language code, just as if such statements were macros. Macros can therefore easily be used to create Assembly Language based equivalents of High Level Languages, and this text includes macro codes for quite a number of equivalents of HLL type statements. The macro 'i..', above, is, in fact, one from the collection of these HLL macros. They make Assembly Language code much easier to write, and also to read, without separating the programmer from the flexibility provided by normal Assembly Language, which can also be used at all times.
One may be tempted to ask: if we need, in practice, to use a high level Assembly Language, why not just go straight to an already existing high level language, like C?
I would answer this by pointing out that, in the case of high level Assembly Language, you have access to the Assembly Language code within each of the HLA code instructions. You can thus modify this code, if you want to, or write HLA instructions of your own. In the case of an existing HLL, like C, you have no access to the underlying Assembly Language, and cannot change or add to the set of instructions in the language.
The following is the (NASM) macro code for the macro i.. referred to above:
%macro i.. 1-*
%rep %0
%1
%rotate 1
%endrep
%endmacro
%macro tells the preprocessor that this is the start of a macro code. This is followed by the name of the macro (i..), and 1-* means that it can have any number of parameters greater than 0, i.e., from 1 to any number.
%rep %0 is an instruction to repeat the following code %0 times, where %0 is a special term that always indicates the total number of parameters written after the macro name, when it is inserted into the code. The preprocessor translates a macro's code each time it comes across the macro name inserted into the main body of the code, and counts the particular number of parameters, which may be different on each occasion.
%1 refers to the first macro parameter (and %2, %3, etc. would refer to the second and third parameters, etc.). The parameters associated with a macro are written on the same line across the page, separated by commas, as in the example with i.. above.
%rotate 1 is an instruction to move all the parameters to the left by 1 (which means that the parameter %2 now becomes %1 etc), putting the leftmost back to the rightmost end. Hence the concept of 'rotation'. %rotate 2, %rotate 3, etc would move the parameters left by 2, or 3, etc., again moving the leftmost parameters shifted off the leftmost end to the rightmost end of the line of parameters.
%endrep ends the code to be repeated. In the above example, %1 is translated as a separate line of normal Assembly Language code at each repeat, and represents a different parameter each time, in accordance with the parameters being rotated each time.
%endmacro ends the particular macro code.
Thus, in the macro instruction
i.. inc eax, inc dword [LengthA], inc ebx
the i.. is the macro name, and the Assembly Language instructions separated by commas are its parameters (here 3 of them). Inserting this macro instruction into Assembly Language code causes the preprocessor to translate it by inserting each of the three parameters successively as a single instruction on its own line into the code, as would be the case if the three instructions were written normally. The preprocessor selects the first parameter (%1), prints it into memory as one code instruction, rotates the parameters to the left by 1, again selects the first parameter, which is now the original second parameter, having been rotated into the first position, and again prints it into memory as the next code instruction, and so on until all the parameters, however many there may be, have each been printed as a single code instruction.
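The expansion just described can be traced in a small, self-contained file. The following is a minimal sketch, assuming that LengthA is a dword variable; the dword keyword is written out because NASM needs an explicit size when the only operand of an instruction is a memory reference. The comments show the instructions the preprocessor is expected to produce.

```nasm
; The i.. macro as given above
%macro i.. 1-*
%rep %0
%1
%rotate 1
%endrep
%endmacro

section .data
LengthA dd 0                ; assumed here to be a dword variable

section .text
; One line of macro code ...
i.. inc eax, inc dword [LengthA], inc ebx

; ... is expected to expand to three ordinary instructions:
;   repeat 1: %1 = inc eax               (parameters then rotate left by 1)
;   repeat 2: %1 = inc dword [LengthA]   (parameters rotate again)
;   repeat 3: %1 = inc ebx
```

Running NASM with the -e option (preprocess only) prints the translated code instead of assembling it, which is a convenient way to check what a macro actually produces.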
An Assembly Language macro, such as i.., is simply inserted by name into the normal Assembly Language code, just as if it were a new kind of Assembly Language instruction, but the original, non-translated macro code, that the name refers to, is written independently, and stored elsewhere. All macros can be written and stored together, one after the other, as their code is not translated where it is defined, but only where the macro name is inserted like a normal instruction. The assembler inserts the entire, translated macro code on each occasion on which the macro name is inserted.
The macro i.., of course, works only for instructions of the form inc eax, where there is no comma within the instruction. If it were used with an instruction of the form mov eax,ebx, the translated result would be
mov eax
ebx
which would be nonsense as far as Assembly Language is concerned. For instructions of the form mov eax,ebx, I have written a corresponding macro ii.. Thus
ii.. mov eax,ebx, add ebx,[LengthA]
would result in
mov eax,ebx
add ebx,[LengthA]
which is the desired code. The ii.. macro source code is as follows:
%macro ii.. 2-*
%rep %0/2
%1,%2
%rotate 2
%endrep
%endmacro
This code treats the two parts of an instruction like mov eax,ebx as two separate parameters, since a comma is always interpreted as separating individual parameters in any macro, but puts two parameters on the same line each time, separated by a comma, i.e. %1,%2. Corresponding to this, %rotate 2 shifts the parameters by two on each repeat, and the number of repeats is halved from %0 to %0/2, which is half the total number of parameters. This ensures that successive pairs of parameters are printed on a single line on each repeat, to produce the desired result.
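Putting the pieces together, the following is a minimal sketch of the ii.. macro in use. The parameter count 2-* on the %macro line and the %if check for an odd number of parameters are assumptions added here for illustration, and LengthA is again assumed to be a dword variable.

```nasm
%macro ii.. 2-*
  %if %0 & 1                       ; odd count: one pair is incomplete
    %error "ii.. expects two-part instructions only"
  %endif
  %rep %0/2                        ; one repeat per instruction
    %1,%2                          ; rejoin the two halves with a comma
    %rotate 2                      ; move on to the next pair
  %endrep
%endmacro

section .data
LengthA dd 0                       ; assumed dword variable

section .text
ii.. mov eax,ebx, add ebx,[LengthA]
; The four parameters are:  mov eax | ebx | add ebx | [LengthA]
; Expected expansion:
;   mov eax,ebx
;   add ebx,[LengthA]
```

The check is optional, but without it a stray one-part instruction such as inc eax would silently shift every following pair by one parameter.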
MACRO CODE INSTRUCTIONS:
Every macro is written into Assembly Language code just like a different kind of Assembly Language instruction. It always has the form
Macroname Parameter1,Parameter2,...etc.
The macro name is followed by a space, and then the parameters are separated from one another by commas only.
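As a small illustration of this form, the sketch below calls the i.. macro from earlier in the chapter with three parameters. It also uses a feature of the NASM preprocessor that the text above does not rely on: a parameter enclosed in braces may itself contain a comma. This is offered as an alternative worth knowing about, not as part of the macro collection described here.

```nasm
%macro i.. 1-*
%rep %0
%1
%rotate 1
%endrep
%endmacro

section .text
; Macroname Parameter1, Parameter2, Parameter3
i.. inc eax, {mov ebx,eax}, dec ecx
; Expected expansion (the braces are removed by the preprocessor):
;   inc eax
;   mov ebx,eax
;   dec ecx
```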
Only a few code instructions are needed to form macros, and I list here those which can be used to form virtually any macro (macro instructions are identified by the % character):
%macro
defines the start of an individual macro code
%endmacro
defines the end of an individual macro code
%%Label
represents a code label address, which will become an address of a code instruction when the preprocessor converts the macro code to Assembly Language
%$Label
same as %%, except that it belongs to the current context on the macro context stack (see %push below), so that it can be referred to from within another macro which is nested within the main macro (macros can be nested within one another)
%1, %2...
These represent the parameters and are inserted within the macro code. A macro can be used many times within a program's code, and the parameters can be different each time. When such a macro is translated to Assembly Language by the preprocessor, %1 in the code is replaced by Parameter1, wherever it appears, and this allows Parameter1 to be different on each occasion the macro is used. The same is true of %2, %3, etc.
%0
This is a special representation of parameters which causes the preprocessor to count the total number of parameters and replace %0 with the total parameter count.
%Label
This defines a parameter that exists only within the macro code
%assign
This assigns a number to the %Label parameter (for example: %assign %Label 10)
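%define
defines a single-line macro, that is, a name which the preprocessor replaces with the given text wherever it appears in the code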
%if
%elif
%else
%endif
These are conditional macro instructions. %elif means 'else if', and %else means 'else'. These two may appear only between a %if and a %endif. %endif terminates the conditional %if sequence of instructions.
%push
%pop
These act like the ordinary push and pop instructions, but are macro instructions only. They refer to a macro 'context stack', and not the ordinary stack. A pushed context can be used to connect a macro to a nested macro.
%ifctx
%ifnctx
These test the context on top of the context stack. %ifctx Name means 'if the top context has the name Name', and %ifnctx Name means 'if the top context does not have the name Name'.
%rep
%endrep
These tell the preprocessor to print repeatedly the Assembly Language code which exists between the %rep and %endrep instructions (e.g. %rep 3 means repeat the same code 3 times).
%rotate
If the macro code requires examining all the parameters one after another, it needs to deal only with the %1 parameter, because %rotate rotates the parameters, putting Parameter2 in the position of Parameter1, then Parameter3 in the position of Parameter1, and so on for all the parameters.
%error
You can use this to generate error messages to help find mistakes in your macro code. For example, if you have a context missing from the context stack, which should be there, you can arrange for an error message to appear if this causes a %ifctx instruction to fail to work.
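Several of these instructions are easiest to understand when seen working together. The following is a minimal sketch, in the style of the examples in the NASM manual. The macro names retz, loop_begin and loop_end, and the context name counted, are hypothetical and chosen only for illustration.

```nasm
; %%Label: a label that is unique to each use of the macro
%macro retz 0
    jnz %%skip
    ret
%%skip:
%endmacro

; %push and %$Label: open a context and mark the top of a loop
%macro loop_begin 0
    %push counted
%$top:
%endmacro

; %ifctx, %error and %pop: close the loop, or complain if none is open
%macro loop_end 1
  %ifctx counted
    dec %1
    jnz %$top               ; same label as in loop_begin, via the context
    %pop
  %else
    %error "loop_end used without loop_begin"
  %endif
%endmacro

section .text
    mov ecx, 3
    loop_begin
    inc eax                 ; loop body, executed 3 times
    loop_end ecx
    test eax, eax
    retz                    ; return if eax is zero
```

Because %$top belongs to the pushed context rather than to either macro, loop_end can jump to a label that was created inside loop_begin, which an ordinary %% label would not allow.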
ANOTHER USEFUL MACRO CODE EXAMPLE:
I will include here, as a further example, another simple but very important and useful macro, which allows the parameters passed to a Windows API function to be written on one line across the page, together with the function name. Without this macro, which I call simply 'api', each parameter must be pushed onto the stack in a single instruction each time, with as many instructions as there are parameters thus being listed down the page before the API function is called (we will encounter the use of API functions further on). The api macro, however, allows the API function call to be written in the following form:
api API_Name, parameter1, parameter2, etc.
rather than in the form:
push parameter1
push parameter2
etc.
call API_Name
The macro code that achieves this is as follows:
%macro api 1-*
%assign %i (%0-1)*4
%rep %0-1
%rotate -1
push dword %1
%endrep
%rotate -1
call %1@%i
%endmacro
Here we have the form %rotate -1, the '-' sign indicating that the parameters are rotated towards the right instead of the left. This is because the documentation for the API functions gives a list of parameters for each function which places the first parameter to be pushed onto the stack at the bottom of the list. Thus, if the list of parameters is printed across the page in the api macro from left to right, starting at the top of the list in the documentation, the last parameter must be pushed first, and then the next-last, etc. This is achieved by rotating to the right, which rotates the last parameter to the first position, and then the second-last, etc., as desired.
%assign %i (%0-1)*4 is an instruction that defines %i as a number defined as (%0-1)*4. %0-1 is the total number of macro parameters less 1 (the API function name itself), which is therefore the total number of API function parameters. Since each parameter of an API function is a dword, or 4 bytes, (%0-1)*4 calculates 4 times the number of API function parameters, or the total number of bytes involved, which has to be inserted as part of the API function call: call %1@%i.
%rep %0-1 repeats the code a number of times equal to the total number of parameters (%0) minus one (the API function name itself, which is not a parameter that is pushed onto the stack like the others).
The last rotate, after %endrep, rotates the API function name back to the %1 position. The 'push dword %1' form is used because NASM requires the dword if %1 happens to be a memory reference of the form [Var], since NASM gives the choice of writing push word [Var] or push dword [Var], and API parameters are always pushed as dwords.
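To make the order of the pushes concrete, the following sketch writes out by hand what the api macro is expected to produce for one call in 32-bit code, assuming it expands as described above. MessageBoxA and its four parameters are used only as a familiar illustration. The data labels Text and Caption are hypothetical, and the exact symbol name that must be declared with extern depends on the import library and linker in use.

```nasm
; Macro form:
;     api MessageBoxA, 0, Text, Caption, 0
; Expected expansion (4 API parameters, so (5-1)*4 = 16 bytes):

extern MessageBoxA@16       ; assumed symbol name, see the note above

section .data
Text    db 'Hello', 0       ; hypothetical message text
Caption db 'Test', 0        ; hypothetical window caption

section .text
    push dword 0            ; 4th parameter, pushed first
    push dword Caption      ; 3rd parameter
    push dword Text         ; 2nd parameter
    push dword 0            ; 1st parameter, pushed last
    call MessageBoxA@16     ; function name joined with the byte count
```

Reading the pushes from the bottom up gives the parameters in the same left-to-right order as they appear in the macro form.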
The above introduction explains the nature and usefulness of macros.
These macros are suitable for use, of course, only in conjunction with the NASM freeware assembler, since their syntax is that of the NASM assembler preprocessor. The macro codes for other assemblers might be similar, but would have to correspond in detail to the syntax designed into them.
A macro, with its code, must be defined before it is used anywhere within the code, or you will get a "parser: instruction expected" fault message from the assembler. This means that an include file that defines a macro must be listed above any include file in which the macro is actually used in the code, which ensures that the assembler has already registered the macro name when it encounters the name as a code instruction.
The next chapter repeats the Assembly Language code example given in the previous chapter, but with the use of HLA macros, so that you can see the difference these make to the writing of the code.
© Alen, August 2013
alen@alenspage.net
Material on this page may be reproduced
for personal use only.