The Quick Guide to HLA
< Overview
This guide is designed to help those who are already familiar with x86 assembly language programming to get up to speed with HLA as rapidly as possible. HLA was designed as a tool for teaching assembly language programming to University/College students who have no prior experience with assembly language but have some high level language programming experience (C/C++, Pascal, Java, etc.). The documentation that exists for HLA comes in two forms: the HLA reference manuals and the "Art of Assembly Language Programming/32-bit Edition." The "Art of Assembly" text is suitable for students and beginners to assembly language programming; it starts from square one and teaches assembly language programming using HLA. Unfortunately, this text is not particularly suitable for those programmers who already know assembly language. The HLA reference manuals are great when you need to look up some particular feature. They do fully explain the HLA language, however, the HLA language is rather large so the assembly programmer who is new to HLA is faced with reading a tremendous amount of material just to get started with HLA. Most individuals won't bother. The purpose of this guide is to present a very small subset of HLA to the advanced x86 assembly language programmer in as few pages as possible. This guide does not attempt to teach any of HLA's special features; it assumes the reader is using an assembler such as MASM, TASM, NASM, Gas, etc., and is interested in learning how to write assembly code using HLA in a fashion comparable to those assemblers. Of course, the whole reason for such a person to learn HLA is to be able to take advantage of HLA's advanced features. However, one has to learn to walk before they run, this is the guide that will get that person walking. Once the reader is comfortable using HLA in a "traditional assembly" sense, then that reader can refer to the HLA reference manuals in order to learn the more advanced features of the language.
< Installing HLA
This manual assumes that you've already installed HLA on your system and you've verified its operation. Please see the HLA Reference Guide for details on installating HLA on your system.
< Running HLA
HLA is a command line tool that you run from the Win32 Command Prompt. This document assumes that you are familiar with basic command prompt syntax and you're familiar with various commands like "DIR" and "RENAME". To run HLA from the command line prompt, you use a command like the following:
hla optional_command_line_parameters Filename_list
The filename list consists of one or more unambiguous filenames having the extension: HLA, ASM, or OBJ. HLA will first run the HLAPARSE program on all files with the HLA extension (producing files with the same basename and an ASM or INC extension). Then HLA runs the MASM program on all files with the ASM extension (including the files produced by HLAPARSE). Finally, HLA runs the linker to combine all the OBJ files together (including the OBJ files that MASM produces). The ultimate result, assuming there were no errors along the way, is an executable file with an EXE extension.
HLA supports the following command line parameters:
-@ Do not generate a linker response file.
-aXXXXX Pass XXXXX to assembler as command line option.
-c Compile and assemble to .OBJ files only.
-dXXXXX Define symbol XXXXX and set it to true during compile.
-e name Executable output filename.
-lXXXXX Pass XXXXX to linker as command line option.
-m Create a map file during link.
-masm Use MASM to assemble output.
-o:omf When using MASM, produce OMF output rather than COFF.
-o:win32 Produce Win32 COFF files.
-s Compile to .ASM files only.
-sm Compile to MASM-compatible .ASM file (default).
-st Compile to TASM-compatible .ASM file.
-sym Dump symbol table after compile.
-tasm Assemble file using TASM32.EXE assembler.
-test Send diagnostic info to stdout rather than stderr
(This option is intended for HLA test/debug purposes).
-w Compile as windows app (default is console app).
Please see the HLA Reference Manual for an explanation of each of these options. Most of the time, you will not use any of these options when compiling typical HLA programs. The "-c" and "-s" options are the ones you will use most commonly (and this document assumes that you understand their purpose).
< HLA Language Elements
Starting with this section we being discussing the HLA source language. HLA source files must contain only seven-bit ASCII characters. These are Windows text files with each source line record containing a carriage return/line feed termination sequence. White space consists of spaces, tabs, and newline sequences. Generally, HLA does not appreciate other control characters in the file and may generate an error if they appear in the source file.
< Comments
HLA uses "//" to lead off single line comments. It uses "/*" to begin an indefinite length comments and it uses "*/" to end an indefinite length comment. C/C++, Java, and Delphi users will be quite comfortable with this notation.
< Special Symbols
The following characters are HLA lexical elements and have special meaning to HLA:
* / + - ( ) [ ] { } < > : ; , . = ? & | ^ ! @
&& || <= >= <> != == := .. << >>
This document will not explain the meaning of all these symbols, only the minimum necessary to write simple HLA programs. See the HLA Reference Manual for more details.
< Reserved Words
HLA supports a large number of reserved words (mostly, they are machine instructions). For brevity, that list does not appear here; please see the HLA reference manual for a complete and up-to-date list. Note that HLA does not allow you to use a reserved word as a program identifier, so you should scan over the list at least once to familiarize yourself with reserved words that you might be used to using as identifiers in your assembly language programs. HLA reserved words are case insensitive. That is, "MOV" and "mov" (as well as any permutation with resepect to case) both represent the HLA "mov" reserved word.
< External Symbols and Assembler Reserved Words
HLA produces an assembly language file during compilation and invokes an assembler such as MASM, TASM, or Gas to complete the compilation process. HLA automatically translates normal identifiers you declare in your program to beneign identifiers in the assembly language program. However, HLA does not translate EXTERNAL symbols, but preserves these names in the assembly language file it produces. Therefore, you must take care not to use external names that conflict with the underlying assembler's set of reserved words or that assembler will generate an error when it attempts to process HLA's output.
For a list of assembler reserved words, please see the documentation for the assembler you are using to assembly HLA's output (i.e., MASM, TASM, or Gas).
< HLA Identifiers
HLA identifiers must begin with an alphabetic character or an underscore. After the first character, the identifier may contain alphanumeric and underscore symbols. There is no technical limit on identifier length in HLA, but you should avoid external symbols greater than about 32 characters in length since the assembler and linkers that process HLA identifiers may not be able to handle such symbols.
HLA identifiers are always case neutral. This means that identifiers are case sensitive insofar as you must always spell an identifier exactly the same (with respect to alphabetic case). However, you are not allowed to declare two identifiers whose only difference is alphabetic case.
< External Identifiers
HLA lets you explicitly provide a string for external identifiers. External identifiers are not limited to the format for HLA identifiers. HLA allows any string constant to be used for an external identifier. It is your responsibility to use only those characters that are legal in the assembler that processes HLA's intermediate ASM file. Note that this feature lets you use symbols that are not legal in HLA but are legal in external code (e.g., Win32 APIs use the '@' character in identifiers). See the discussion of the @EXTERNAL option for more details.
< Data Types in HLA
< Native (Primitive) Data Types in HLA
HLA provides the following basic primitive types:
One-byte types: byte, boolean, enum, uns8, int8, and char.
Two-byte types: word, uns16, int16.
Four-byte types: dword, uns32, int32, real32, string, pointer
Eight-byte types: uns64, int64, qword, thunk, and real64.
Ten-Byte types: tbyte, and real80.
Sixteen-byte types: uns128, int128, lword, and cset
For details on these particular types, please consult the HLA Reference Manual. This document will make use of the following types:
byte, word, dword, string, real32, qword, real64, and real80
These are the types a typical assembler provides.
BYTE variables and objects may hold integer numeric values in the range -128..+255, any ASCII character constant, and the two predefined boolean values true (1) and false (0). Normally, HLA does a small amount of type checking; however, you can associate any value that can fit into eight bits with a byte-sized variable (or other object).
WORD variables and object may hold integer numeric values in the range -32768..+65535. Generally, HLA does not allow the association of other values with a WORD object.
DWORD variables and objects may hold integer numeric values in the range -2147483647..+4294967295, or the address of an object (using the "&" address-of operator).
STRING variables are also DWORD objects. STRING objects hold the address of a sequence of zero or more ASCII characters that end with a zero byte. In the four bytes immediately preceding the location contained in the string pointer is the current length of the string. In the four bytes preceding the current length is the maximum allowable length of the string. Note that HLA strings are "read-only" compatible with ASCIIZ strings used by Windows and C/C++ (read-only meaning that you can pass an HLA string to a Windows API or C/C++ function but that function should not modify the string).
QWORD, UNS64, and INT64 objects consume eight bytes of memory. As of HLA v1.37, support is available for 64-bit unsigned, signed, and hexadecimal numeric constants. TBYTE objects consume ten bytes (80 bits), but, again, HLA does not support the use of 80-bit integer or BCD constants in a program. LWORD, UNS128, and INT128 values are also legal and support 128-bit hexadecimal, unsigned, or signed constants.
REAL32, REAL64, and REAL80 types in HLA support the three different IEEE floating point formats.
< Composite Data Types
In addition to the primitive types above, HLA supports arrays, records (structures), unions, classes, and pointers of the above types (except for text objects).
< Array Data Types
HLA allows you to create an array data type by specifying the number of array elements after a type name. Consider the following HLA type declaration that defines intArray to be an array of dword objects:
The "[ 16 ]" component tells HLA that this type has 16 four-byte double words. HLA arrays use a zero-based index, so the first element is always element zero. The index of the last element, in this example, is 15 (total of 16 elements with indicies 0..15).
HLA also supports multidimensional arrays. You can specify multidimensional arrays by providing a list of indicies inside the square brackets, e.g.,
type intArray4x4 : dword[ 4, 4 ];
type intArray2x2x4 : dword[ 2,2,4 ];
< Record Data Types1
HLA's records allow programmers to create data types whose fields can be different types. The following HLA static variable declaration defines a simple record with four fields:
static Planet: record
x: dword;
y: dword;
z: dword;
density: real64;
endrecord;
Objects of type Planet will consume 20 bytes of storage at run-time.
The fields of a record may be of any legal HLA data type including other composite data types. You use dot-notation to access fields of a record object, e.g.,
< Literal Constants
Literal constants are those language elements that we normally think of as non-symbolic constant objects. HLA supports a wide variety of literal constants. The following sections describe those constants.
< Numeric Constants
HLA lets you specify several different types of numeric constants.
The first and last characters of a decimal integer constant must be decimal digits (0..9). Interior positions may contain decimal digits and underscores. The purpose of the underscore is to provide a better presentation for large decimal values (i.e., use the underscore in place of a comma in large values). Example: 1_234_265.
Hexadecimal literal constants must begin with a dollar sign ("$") followed by a hexadecimal digit and must end with a hexadecimal digit (0..9, A..F, or a..f). Interior positions may contain hexadecimal digits or underscores. Hexadecimal constants are easiest to read if each group of four digits (starting from the least significant digit) is separated from the others by an underscore. E.g., $1A_2F34_5438.
Binary literal constants begin with a percent sign ("%") followed by at least one binary digit (0/1) and they must end with a binary digit. Interior positions may contain binary digits or underscore characters. Binary constants are easiest to read if each group of four digits (starting from the least significant digit) is separated from the others by an underscore. E.g., %10_1111_1010.
Real (Floating Point) Constants:
Floating point (real) literal constants always begin with a decimal digit (never just a decimal point). A string of one or more decimal digits may be optionally followed by a decimal point and zero or more decimal digits (the fractional part). After the optional fractional part, a floating point number may be followed by "e" or "E", a sign ("+" or "-"), and a string of one or more decimal digits (the exponent part). Underscores may appear between two adjacent digits in the floating point number; their presence is intended to substitute for commas found in real-world decimal numbers.
Boolean constants consist of the two predefined identifiers true and false. Note that your program may redefine these identifiers, but doing so is incredibly bad programming style.
Character literals generally consist of a single (graphic) character surrounded by apostrophes. To represent the apostrophe character, use four apostrophies, e.g., `'''.
Another way to specify a character constant is by typing the "#" symbol followed by a numeric literal constant (decimal, hexadecimal, or binary). Examples: #13, #$D, #%1101.
String literal constants consist of a sequence of (graphic) characters surrounded by quotes. To embed a quote within a string, insert a pair of quotes into the string, e.g., "He said ""This"" to me."
If two string literal constants are adjacent in a source file (with nothing but whitespace between them), then HLA will concatenate the two strings and present them to the parser as a single string. Furthermore, if a character constant is adjacent to a string, HLA will concatenate the character and string to form a single string object. This is useful, for example, when you need to embed control characters into a string, e.g.,
"This is the first line" #$d #$a "This is the second line" #$d #$a
HLA treats the above as a single string with a newline sequence (CR/LF) at the end of each of the two lines of text.
HLA allows a very limited form of a pointer constant. If you place an ampersand in front of a static object's name (i.e., the name of a static variable, readonly variable, uninitialized variable, segment variable, procedure, method, or iterator), HLA will compute the run-time offset of that variable. Pointer constants may not be used in abitrary constant expressions. You may only use pointer constants in expressions used to initialize static or readonly variables or as constant expressions in 80x86 instructions.
HLA also supports certain structured constants including character set constants, array constants, and record constants. Please see the HLA Reference Manual for more details.
< Constant Expressions in HLA
HLA provides a rich expression evaluator to process assembly-time expressions. HLA supports the following operators (sorting by decreasing precedence):
! (unary not), - (unary negation)
The expression must be either boolean or a number. For boolean values, not ("!") computes the standard logical not operation. For numbers, not ("!") computes the bitwise not operation on the bits of the number.
- expr (unary negation operator)
expr1 * expr2 (multiplication operator)
expr1 div expr2 (integer division operator)
expr1 mod expr2 (integer remainder operator)
expr1 / expr2 (real division operator)
expr1 << expr2 (integer shift left operator)
expr1 >> expr2 (integer shift right operator)
expr1 + expr2 (addition operator)
expr1 - expr2 (subtraction operator)
expr1 = expr2 (equality comparison operator)
expr1 <> expr2 (inequality comparison operator)
expr1 < expr2 (less than comparison operator)
expr1 <= expr2 (less than or equal comparison operator)
expr1 > expr2 (greater than comparison operator)
expr1 >= expr2 (greater or equal comparison operator)
expr1 & expr2 (logical/boolean AND operator)
expr1 | expr2 (logical/boolean OR operator)
expr1 ^ expr2 (logical/boolean XOR operator)
( expr ) (override operator precedence)
HLA supports several other constant operators. Furthermore, many of the above operators are overloaded depending on the operand types. Note that for numeric (integer) operands, HLA fully support 128-bit arithmetic. Please see the HLA Reference Manual for more details.
< Program Structure
An HLA program uses the following general syntax:
The three identifiers above must all match. The declaration section (declarations) consists of type, const, val, var, static, uninitialized, readonly, segment, procedure, and macro definitions. Any number of these sections may appear and they may appear in any order; more than one of each section may appear in the declaration section.
If you wish to write a library module that contains only procedures and no main program, you would use an HLA unit. Units have a syntax that is nearly identical to programs, there isn't a BEGIN associated with the unit, e.g.,
unit TestPgm;
procedure LibraryRoutine;
begin LibraryRoutine;
<< etc. >>
end LibraryRoutine;
end TestPgm;
< Procedure Declarations
Procedure declarations are nearly identical to program declarations.
procedure identifier; noframe;
Note that HLA procedures provide a very rich set of syntactical options. The template above corresponds to the syntax that creates procedures most closely resembling those that other assemblers use. HLA's procedures allow parameters, local variable declarations, and lots of other features this document won't describe. For more details, please see the HLA Reference Manual.
Note, and this is very important, that the procedure option NOFRAME must appear in the PROCEDURE declaration. Without this declaration, HLA inserts some additional code into your procedure and it will probably fail to work as you intend (indeed, it's likely the inserted code will crash when it runs).
< Declarations
Programs, units, procedures, methods, and iterators all have a declaration section. Classes and namespaces also have a declaration section, though it is somewhat limited. A declaration section can contain one or more of the following components (among other things this document doesn't cover):
The order of these sections is irrelevant as long as you ensure that all identifiers used in a program are defined before their first use. Furthermore, as noted above, you may have multiple sections within the same set of declarations. For example, the two const sections in the following procedure declaration are perfectly legal:
< Type Section
You can declare user-defined data types in the type section. The type section appears in a declaration section and begins with the reserved word type. It continues until encountering another declaration reserved word (e.g., const, var, or val) or the reserved word begin. A typical type definition begins with an indentifier followed by a colon and a type definition. The following paragraphs describe some of the legal types of type definitions.
id1 : id2; // Defines id1 to be the same as type id2.
id1 : id2 [ dim_list ]; // Defines id1 to be an array of type id2.
id1 : record // Defines id1 as a record type.
< Const Section
You may declare manifest constants in the CONST section of an HLA program. It is illegal to attempt to change the value of a constant at some later point during assembly. Of course, at run-time the constant always has a static value.
The constant declaration section begins with the reserved word const and is followed by a sequence of constant definitions. The constant declaration section ends when HLA encounters a keyword such as const, type, var, val, etc. Actual constant definitions take the forms specified in the following paragraphs.
id := expr; // Assigns the value and type of expr to id
id1 : id2 := expr; // Creates constant id1 of type id2 of value expr.
Note that HLA supports several types of constants this section doesn't discuss (e.g., array and record constants and well as compile-time variables). See the HLA Reference Manual for more details.
< Static Section
The static section lets you declare static variables you can reference at run-time by your code. The following paragraphs list some of the forms that are legal in the STATIC section. As usual, see the HLA Reference Manual for lots of additional features that HLA supports in the STATIC section.
id1 : id2; // Declares variable id1 of type id2
id1 : id2 := expr; // Declares variable id1 of type id2, init'd with expr
id1 : id2[ expr ]; // Declares array id1 of type id2 with expr elements
< The @NOSTORAGE Option
The @ nostorage option tells HLA to associate the current offset in the segment with the specified variable, but don't actually allocate any storage for the object. This option effectively creates an alias of the current variable with the next object you declare in one of the static sections. Consider the following example:
Because the b and w variables both have the @ nostorage option associated with them, HLA does not reserve any storage for these variables. The d variable does not have the nostorage option, so HLA does reserve four bytes for this variable. The b and w variables, since they don't have storage associated with them, share the same address in memory with the d variable.
< The @EXTERNAL Option
The external option gives you the ability to reference static variables that you declare in other files. Like the external clause for procedures, there are two different syntaxes for the external clause appearing after a variable declaration:
varName: varType; @external( "external_Name" );
The first form above uses the variable's name for both the internal and external names. The second form uses varName as the internal name that HLA uses and it associates this varible with external_Name in the external modules. The @external option is always the last option associated with a variable declaration.
If the actual variable definition for an external object appears in a source file after an external declaration, this tells HLA that the definition is a public variable that other modules may access (the default is local to the current source file). This is the only way to declare a variable to be public so that other modules can use it. Usually, you would put the external declaration in a header file that all modules (wanting to access the variable) include; you also include this header file in the source file containing the actual variable declaration.
< Macros
HLA has one of the most powerful macro expansion facilities of any programming language. HLA's macros are the key to extending the HLA language. If you're a big user of macros then you will want to read the HLA Reference Manual to learn all about HLA's powerful macro facilities. This section will describe HLA's limited "Standard Macro" facility which is comparable to the macro facilities other assemblers provide.
You can declare macros in the declaration section of a program using the following syntax:
#macro identifier ( optional_parameter_list ) ;
The optional parameter list must be a list of one or more identifiers separated by commas. HLA automatically associates the type "text" with all macro parameters (except for one special case noted below). Example:
#macro MacroWParms( a, b, c );
If the macro does not allow any parameters, then you follow the identifier with a semicolon (i.e., no parentheses or parameter identifiers). See the first example in this section for a macro without any parameters.
Occasionally you may need to define some symbols that are local to a particular macro invocation (that is, each invocation of the macro generates a unique symbol for a given identifier). The local identifier list allows you to do this. To declare a list of local identifiers, simply following the parameter list (after the parenthesis but before the semicolon) with a colon (":") and a comma separated list of identifiers, e.g.,
#macro ThisMacro(parm1):id1,id2;
HLA automatically renames each symbol appearing in the local identifier list so that the new name is unique throughout the program. HLA creates unique symbols of the form " _XXXX_ " where XXXX is some hexadecimal numeric value. To guarantee that HLA can generate unique symbols, you should avoid defining symbols of this form in your own programs (in general, symbols that begin and end with an underscore are reserved for use by the compiler and the HLA standard library). Example:
To invoke a macro, you simply supply its name and an appropriate set of parameters. Unless you specify a variable number of parameters (using the array syntax) then the number of actual parameters must exactly match the number of formal parameters. If you specify a variable number of parameters, then the number of actual parameters must be greater than or equal to the number of formal parameters (not counting the array parameter).
Actual macro parameters consist of a string of characters up to, but not including a separate comma or the closing parentheses, e.g.,
" v1 " is the text for parameter #1, " x+2*y " is the text for parameter #2. Note that HLA strips all leading whitespace and control characters before and after the actual parameter when expanding the code in-line. The example immediately above would expand do the following:
If (balanced) parentheses appear in some macro's actual parameter list, HLA does not count the closing parenthesis as the end of the macro parameter. That is, the following is perfectly legal:
< The #Include Directive
Like most languages, HLA provides a source inclusion directive that inserts some other file into the middle of a source file during compilation. HLA's #INCLUDE directive is very similar to the pragma of the same name in C/C++ and you primarily use them both for the same purpose: including library header files into your programs.
HLA's include directive has the following syntax:
#include( string_expression );
< The #IncludeOnce Directive
When composing complex header files, particularly when constructing library header files, you may find in necessary to insert a #INCLUDE("file") directive into some other header files. Generally, this is not a problem, HLA certainly allows nested include files (up to 256 files deep). However, unless you are very careful about how you organize your files, it is very easy to create an "include loop" where one header file includes another and that other header file includes the first. Attempting to compile a program that includes either header file results in an infinite "include loop" during compilation. Clearly, this is not desirable.
You can use the #INCLUDEONCE directive to avoid this problem. The only difference between the two is that HLA keeps track of all files it has processed using the #INCLUDE or #INCLUDEONCE directives and will not process a header file a second time if you attempt to include it using the #INCLUDEONCE directive.
Whenever HLA processes the #INCLUDEONCE directive, it first compares its string operand with a list of strings appearing in previous #INCLUDE or #INCLUDEONCE directives. If it matches one of these previous strings, then HLA ignores the #INCLUDEONCE directive; if the include filename does not appear in HLA' internal list, then HLA adds this filename to the list and includes the file.
< The #asm..#endasm and #emit Directives
HLA is far from perfect. There are many missing instructions (some left out on purpose, some left out because of laziness, and some left out because of ignorance). For example, HLA currently doesn't support the SSE instruction set. Fret not, though, HLA v1.x provides two escape mechanisms that let you do anything legal in MASM or the underlying assembler used to process HLA's output.
The first of these escape mechanisms is the #ASM..#ENDASM section. The syntax for an assembly block is as follows:
All text appearing between the #ASM and #ENDASM directives is emitted directly to the .ASM output file produced by HLA. MASM (or whatever assembler you're using) will be responsible for assembling this code. Of course, you must supply source code that is compatible with your assembler in an assembly block or the assembly process will fail when the assembler encounters your code.
The second escape mechanism is the #EMIT directive. This directive takes the following form:
HLA evaluates the string expression and then emits that expression to the output .ASM file. See the HLA documentation for more details and restrictions on the #asm..#endasm block and the #emit directive.
< The Conditional Compilation Statements (#if)
The conditional compilation statements in HLA use the following syntax:
#if( constant_boolean_expression )
<< Statements to compile if the >>
<< expression above is true. >>
#elseif( constant_boolean_expression )
<< Statements to compile if the >>
<< expression immediately above >>
<< is true and the first expres->>
<< Statements to compile if both >>
<< the expressions above are false. >>
The #ELSEIF and #ELSE clauses are optional. As you would expect, there may be more than one #ELSEIF clause in the same conditional if sequence.
Unlike some other assemblers and high level languages, HLA's conditional compilation directives are legal anywhere whitespace is legal. You could even embed them in the middle of an instruction! While directly embedding these directives in an instruction isn't recommended (because it would make your code very hard to read), it's nice to know that you can place these directives in a macro and then replace an instruction operand with a macro invocation.
An important thing to note about this directive is that the constant expression in the #IF and #ELSEIF clauses must be of type boolean or HLA will emit an error. Any legal constant expression that produces a boolean result is legal here.
Keep in mind that conditional compilation directives are executed at compile-time, not at run-time. You would not use these directives to (attempt to) make decisions while your program is actually running.
< The 80x86 Instruction Set in HLA
One of the most obvious differences between HLA and standard 80x86 assembly language is the syntax for the machine instructions. The two primary differences are the fact that HLA uses a functional notation for machine instructions and HLA arranges the operands in a (source, dest) format rather than the (dest, source) format used by Intel.
< Zero Operand Instructions (Null Operand Instructions)
The following instructions do not require any operands. There are two sytactically allowable forms for each instruction:
The zero-operand instruction mnemonics are
aaa, aad, aam, aas, cbw, cdq, clc, cld, cli, cmc, cmpsb, cmpsd, cmpsw, cpuid, cwd, cwde, daa, das,
insb, insd, insw, into, iret, iretd, lahf, leave, lodsb, lodsd, lodsw, movsb, movsd, movsw, nop, outsb,
outsd, outsw, popad, popa, popf, popfd, pusha, pushad, pushf, pushfd, rdtsc, rep.insb, rep.insd,
rep.insw, rep.movsb, rep.movsd, rep.movsw, rep.outsb, rep.outsd, rep.outsw, rep.stosb, rep.stosd,
rep.stosw, repe.cmpsb, repe.cmpsd, repe.cmpsw, repe.scasb, repe.scasd, repe.scasw, repne.cmpsb,
repne.cmpsd, repne.cmpsw, repne.scasb, repne.scasd, repne.scasw, sahf, scasb, scasd, scasw,
stc, std, sti, stosb, stosd, stosw, wait, xlat
< General Arithmetic and Logical Instructions
These instructions include adc, add, and, mov, or, sbb, sub, test, and xor. They all take the same basic form (substitute the appropriate mnemonic for "adc" in the syntax examples below):
adc( Reg8, Reg8 )
adc( Reg16, Reg16 )
adc( Reg32, Reg32 )
adc( const, Reg8 )
adc( const, Reg16 )
adc( const, Reg32 )
adc( const, mem )
adc( Reg8, mem )
adc( Reg16, mem )
adc( Reg32, mem )
adc( mem, Reg8 )
adc( mem, Reg16 )
adc( mem, Reg32 )
adc( Reg8, AnonMem )
adc( Reg16, AnonMem )
adc( Reg32, AnonMem )
adc( AnonMem, Reg8 )
adc( AnonMem, Reg16 )
adc( AnonMem, Reg32 )
Note: for the form "adc( const, mem )", if the specified memory location does not have a size or type associated with it, you must explicitly specify the size of the memory operand, e.g., "adc(5,(type byte [eax]));"
< The XCHG Instruction
The xchg instruction allows the following syntactical forms:
xchg( source, dest );
Specific Forms:
xchg( Reg8, Reg8 )
xchg( Reg8, mem )
xchg( Reg8, AnonMem)
xchg( mem, Reg8 )
xchg( AnonMem, Reg8 )
xchg( Reg16, Reg16 )
xchg( Reg16, mem )
xchg( Reg16, AnonMem)
xchg( mem, Reg16 )
xchg( AnonMem, Reg16 )
xchg( Reg32, Reg32 )
xchg( Reg32, mem )
xchg( Reg32, AnonMem)
xchg( mem, Reg32 )
xchg( AnonMem, Reg32 )
< The CMP Instruction
The "cmp" instruction uses the following general forms:
cmp( LeftOperand, RightOperand );
cmp( Reg8, Reg8 );
cmp( Reg8, mem );
cmp( Reg8, AnonMem );
cmp( mem, Reg8 );
cmp( AnonMem, Reg8 );
cmp( Reg8, const );
cmp( Reg16, Reg16 );
cmp( Reg16, mem );
cmp( Reg16, AnonMem );
cmp( mem, Reg16 );
cmp( AnonMem, Reg16 );
cmp( Reg16, const );
cmp( Reg32, Reg32 );
cmp( Reg32, mem );
cmp( Reg32, AnonMem );
cmp( mem, Reg32 );
cmp( AnonMem, Reg32 );
cmp( Reg32, const );
cmp( mem, const );
Note that the CMP instruction's operands are ordered "dest, source" rather than the usual "source,dest" format (that is, the operands are in the same order as MASM expects them). This is to allow an intuitive use of the instruction mnemonic (that is, CMP normally reads as "compare dest to source."). We will avoid this confusion by simply referring to the operands as the "left operand" and the "right operand". Left vs. right signifies the placement of the operands around a comparison operator like "<=" (e.g., "left <= right").
For the "cmp( mem, const )" form, the memory operand must have a type or size associated with it. When using anonymous memory locations you must always coerce the type of the memory location, e.g., "cmp( (type word [ebp-4]), 0 );".
< The Multiply Instructions
HLA supports several variations on the 80x86 "MUL" and IMUL instructions. The supported forms are:
mul( reg8 )
mul( reg16)
mul( reg32 )
mul( mem )
mul( reg8, al )
mul( reg16, ax )
mul( reg32, eax )
mul( mem, al )
mul( mem, ax )
mul( mem, eax )
mul( AnonMem, ax )
mul( AnonMem, dx:ax )
mul( AnonMem, edx:eax )
imul( reg8 )
imul( reg16)
imul( reg32 )
imul( mem )
imul( reg8, al )
imul( reg16, ax )
imul( reg32, eax )
imul( mem, al )
imul( mem, ax )
imul( mem, eax )
imul( AnonMem, ax )
imul( AnonMem, dx:ax )
imul( AnonMem, edx:eax )
intmul( const, Reg16 )
intmul( const, Reg16, Reg16 )
intmul( const, mem, Reg16 )
intmul( const, AnonMem, Reg16 )
intmul( const, Reg32 )
intmul( const, Reg32, Reg32 )
intmul( const, mem, Reg32 )
intmul( const, AnonMem, Reg32 )
intmul( Reg16, Reg16 )
intmul( mem, Reg16 )
intmul( AnonMem, Reg16 )
intmul( Reg32, Reg32 )
intmul( mem, Reg32 )
intmul( AnonMem, Reg32 )
mul( const, al )
mul( const, ax )
mul( const, eax )
imul( const, al )
imul( const, ax )
imul( const, eax )
The first, and probably most important, thing to note about HLA's multiply instructions is that HLA uses a different mnemonic for the extended-precision integer multiply versus the single-precision integer multiply (i.e., IMUL vs. INTMUL). Standard MASM syntax uses the same mnemonic for both instructions. There are two reasons for this change of syntax in HLA. First, there needed to be some way to differentiate the "mul( const, al )" and the "intmul( const, al )" instructions (likewise for the instructions involving AX and EAX). Second, the behavior of the INTMUL instruction is substantially different from the IMUL instruction, so it makes sense to use different mnemonics for these instructions.
The extended syntax instructions create a static data variable, initialized with the specified constant, and then specify the address of this variable as the source operand of the MUL or IMUL instruction.
< The Divide Instructions
HLA support several variations on the 80x86 DIV and IDIV instructions. The supported forms are:
div( source );
div( source, dest );
mod( source );
mod( source, dest );
idiv( source );
idiv( source, dest );
imod( source );
imod( source, dest );
div( reg8 )
div( reg16)
div( reg32 )
div( mem )
div( reg8, ax )
div( reg16, dx:ax)
div( reg32, edx:eax )
div( mem, ax )
div( mem, dx:ax)
div( mem, edx:eax )
div( AnonMem, ax )
div( AnonMem, dx:ax )
div( AnonMem, edx:eax )
mod( reg8 )
mod( reg16)
mod( reg32 )
mod( mem )
mod( reg8, ax )
mod( reg16, dx:ax)
mod( reg32, edx:eax )
mod( mem, ax )
mod( mem, dx:ax)
mod( mem, edx:eax )
mod( AnonMem, ax )
mod( AnonMem, dx:ax )
mod( AnonMem, edx:eax )
idiv( reg8 )
idiv( reg16)
idiv( reg32 )
idiv( mem )
idiv( reg8, ax )
idiv( reg16, dx:ax)
idiv( reg32, edx:eax )
idiv( mem, ax )
idiv( mem, dx:ax)
idiv( mem, edx:eax )
idiv( AnonMem, ax )
idiv( AnonMem, dx:ax )
idiv( AnonMem, edx:eax )
imod( reg8 )
imod( reg16)
imod( reg32 )
imod( mem )
imod( reg8, ax )
imod( reg16, dx:ax)
imod( reg32, edx:eax )
imod( mem, ax )
imod( mem, dx:ax)
imod( mem, edx:eax )
imod( AnonMem, ax )
imod( AnonMem, dx:ax )
imod( AnonMem, edx:eax )
div( const, ax )
div( const, dx:ax )
div( const, edx:eax )
mod( const, ax )
mod( const, dx:ax )
mod( const, edx:eax )
idiv( const, ax )
idiv( const, dx:ax )
idiv( const, edx:eax )
imod( const, ax )
imod( const, dx:ax )
imod( const, edx:eax )
The destination operand is always implied by the 80x86 "div" and "idiv" instructions (AX, DX:AX, or EDX:EAX ). HLA allows the specification of the destination operand in order to make your programs easier to read (although the use of the destination operand is optional).
The HLA divide instructions support an extended syntax that allows you to specify a constant as the divisor (source operand). HLA allocates storage in the static data segment and initializes the storage with the specified constant, and then divides the accumulator by this newly specified memory location.
< Single Operand Arithmetic and Logical Instructions
These instructions include dec, inc, neg, and not. They take the following general forms (substituting the specific mnemonic as appropriate):
dec( dest );
Specific forms allowed:
dec( Reg8 );
dec( Reg16 );
dec( Reg32 );
dec( mem );
Note: if mem is an untyped or unsized memory location (i.e., an anonymous memory location), you must explicitly provide a size; e.g., "dec( (type word [edi]));"
< Shift and Rotate Instructions
These instructions include RCL, RCR, ROL, ROR, SAL, SAR, SHL, and SHR. These instructions support the following generic syntax, making the appropriate mnemonic substitution.
shl( const, Reg8 );
shl( const, Reg16 );
shl( const, Reg32 );
shl( const, mem );
shl( cl, Reg8 );
shl( cl, Reg16 );
shl( cl, Reg32 );
shl( cl, mem );
The " const " operand is an unsigned integer constant between zero and the maximum number of bits in the destination operand. The forms with a memory operand must have a type or size associated with the operand; e.g., when using anonymous memory locations, you must coerce the type,
< The Double Precision Shift Instructions
These instruction use the following general form (you can substitute SHRD for SHLD below):
shld( const, Reg16, Reg16 )
shld( const, Reg16, mem )
shld( const, Reg16, AnonMem )
shld( cl, Reg16, Reg16 )
shld( cl, Reg16, mem )
shld( cl, Reg16, AnonMem )
shld( const, Reg32, Reg32 )
shld( const, Reg32, mem )
shld( const, Reg32, AnonMem )
shld( cl, Reg32, Reg32 )
shld( cl, Reg32, mem )
shld( cl, Reg32, AnonMem )
< The Lea Instruction
These instructions use the following syntax:
lea( Reg32, memory )
lea( Reg32, AnonMem )
lea( Reg32, ProcID )
lea( Reg32, LabelID )
lea( memory, Reg32 )
lea( AnonMem, Reg32 )
lea( ProcID, Reg32 )
lea( LabelID, Reg32 )
lea( StringConstant, Reg32 )
lea( const ConstExpr, Reg32 )
The "lea" instruction loads the specified 32-bit register with the address of the specified memory operand, procedure, or statement label. Note in the extended syntax you can reverse the order of the operands. Since exactly one operand must be a register, there is no ambiguity between the two forms (this syntax was added to satisfy those who complained about the (reg,memory) syntax). Of course, good programming style suggests that you use only one form (either reg,memory or memory, reg) within your programs.
Note: HLA does not support an LEA instruction that loads a 16-bit address into a 16-bit register. That form of the LEA instruction is not very useful in 32-bit programs running on 32-bit operating systems.
< The Sign and Zero Extension Instructions
The HLA MOVSX and MOVZX instructions use the following syntax:
movsx( source, dest );
movzx( source, dest );
movsx( Reg8, Reg16 )
movsx( Reg8, Reg32 )
movsx( Reg16, Reg32 )
movsx( mem8, Reg16 )
movsx( mem8, Reg32 )
movsx( mem16, Reg32 )
movzx( Reg8, Reg16 )
movzx( Reg8, Reg32 )
movzx( Reg16, Reg32 )
movzx( mem8, Reg16 )
movzx( mem8, Reg32 )
movzx( mem16, Reg32 )
These instructions sign (MOVSX) or zero (MOVZX) extend their source operand into the destination operand.
< The Push and Pop Instructions
These instructions take the following general forms:
pop( reg16 );
pop( reg32 );
pop( mem );
push( Reg16 )
push( Reg32 )
push( memory )
pushw( Reg16 )
pushw( memory )
pushw( AnonMem )
pushw( Const )
pushd( Reg32 )
pushd( memory )
pushd( AnonMem )
pushd( Const )
These instructions push or pop their specified operand.
< Procedure Calls
Given a procedure or a DWORD variable (containing the address of a procedure) named "MyProc" you can call this procedure as follows:
HLA actually supports several other syntaxes for calling procedures, including a syntax that will automatically push parameters on the stack for you. See the HLA Reference Manual for more details.
< The Ret Instruction
The RET( ) statement allows two syntactical forms:
ret( );
ret( integer_constant_expression );
The first form emits a simple 80x86 RET instruction, the second form emits the 80x86 RET instruction with the specified numeric constant expression value (used to remove parameters from the stack).
< The Jmp Instructions
The HLA "jmp" instruction supports the following syntax:
jmp Label;
jmp ProcedureName;
jmp( dwordMemPtr );
jmp( anonMemPtr );
jmp( reg32 );
" Label " represents a statement label in the current procedure. (You are not allowed to jump to labels in other procedures in the current version of HLA. This restriction may be relaxed somewhat in future versions.) A statement label is a unique (within the current procedure) identifier with a colon after the identifier, e.g.,
InfiniteLoop:
<< Code inside the infinite loop>>
jmp InfiniteLoop;
Jumping to a procedure transfers control to the first instruction in the specified procedure. You are responsible for explicitly pushing any parameters and the return address for that procedure.
These instructions all return the empty string as their "returns" value.
< The Conditional Jump Instructions
These instructions include JA, JAE, JB, JBE, JC, JE, JG, JGE, JL, JLE, JO, JP, JPE, JPO, JS, JZ, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JCXZ, JECXZ, LOOP, LOOPE, LOOPZ, LOOPNE, and LOOPNZ. They all take the following generic form (substituting the appropriate instruction for "JA").
" LocalLabel " must be a statement label defined in the current procedure.
Note: due to the nature of the HLA compilation process, you should avoid the use of the JCXZ, JECXZ, LOOP, LOOPE, LOOPZ, LOOPNE, and LOOPNZ instructions. Unlike the other conditional jump instructions, these instructions have a very limited +/- 128 range. Unfortunately, HLA cannot detect if the branch is out of range (this task is handled by MASM), so if a range error occurs, HLA cannot warn you about this. The MASM assembly will fail, but the result will be hard to decipher. Fortunately, these instructions are easily, and usually more efficiently, implemented using other 80x86 instructions so this should not prove to be a problem.
< The Conditional Set Instructions
These instructions include: SETA, SETAE, SETB, SETBE, SETC, SETE, SETG, SETGE, SETL, SETLE, SETO, SETP, SETPE, SETPO, SETS, SETZ, SETNA, SETNAE, SETNB, SETNBE, SETNC, SETNE, SETNG, SETNGE, SETNL, SETNLE, SETNO, SETNP, SETNS, and SETNZ. They take the following generic forms (substituting the appropriate mnemonic for seta):
< The Conditional Move Instructions
These instructions include CMOVA, CMOVAE, CMOVB, CMOVBE, CMOVC, CMOVE, CMOVG, CMOVGE, CMOVL, CMOVLE, CMOVO, CMOVP, CMOVPE, CMOVPO, CMOVS, CMOVZ, CMOVNA, CMOVNAE, CMOVNB, CMOVNBE, CMOVNC, CMOVNE, CMOVNG, CMOVNGE, CMOVNL, CMOVNLE, CMOVNO, CMOVNP, CMOVNS, and CMOVNZ. They use the following general syntax:
CMOVcc( reg16, reg16 );
CMOVcc( reg32, reg32 );
CMOVcc( mem16, reg16 );
CMOVcc( mem32, reg32 );
These instructions move the data if the specified condition is true (specified by the cc condition). If the condition is false, these instructions behave like a no-operation.
< The Input and Output Instructions
The "in" and "out" instructions use the following syntax:
The "port" parameter must be an unsigned integer constant in the range 0..255. Note that these instructions may be priviledged instructions when running under Win32. Their use may generate a fault in certain instances or when accessing certain ports.
< The Interrupt Instruction
This instruction uses the syntax "int( constant)" where the constant operand is an unsigned integer value in the range 0..255.
This instruction returns the empty string as its "returns" value.
See Chapter Six in "Art of Assembly" (DOS version) for a further discussion of this instruction. Note, however, that one generally does not use "int" under Win32 to make OS or BIOS calls. The "int $80" instruction is what you'd normally use to make very low-level Linux calls.
< Bound Instruction
This instruction takes the following forms:
bound( Reg16, mem )
bound( Reg16, AnonMem )
bound( Reg32, mem )
bound( Reg32, AnonMem )
bound( Reg16, constL, constH )
bound( Reg32, ConstL, ConstH )
The extended syntax forms emit the two constants to the static data segment and substitute the address of the first constant ( ConstL ) as their memory operand.
< The Enter Instruction
The ENTER instruction uses the syntax: "enter( const, const );". The first constant operand is the number of bytes of local variables in a procedure, the second constant operand is the lex level of the procedure. As a general rule, you should not use this instruction (and the corresponding LEAVE) instructions. HLA procedures automatically construct the display and activation record for you (more efficiently than when using ENTER). See the HLA Reference Manual for more details on building procedure activation records.
< CMPXCHG Instruction
This instruction uses the following syntax:
cmpxchg( Reg8, Reg8 )
cmpxchg( Reg8, Memory )
cmpxchg( Reg8, AnonMem )
cmpxchg( Reg16, Reg16 )
cmpxchg( Reg16, Memory )
cmpxchg( Reg16, AnonMem )
cmpxchg( Reg32, Reg32 )
cmpxchg( Reg32, Memory )
cmpxchg( Reg32, AnonMem )
Note: HLA does not currently support the Pentium cmpxchg8b instruction, although you can easily create a macro to implement this instruction. You can also use #ASM..#ENDASM or #EMIT to implement the CMPXCHG8B instruction.
< The XADD Instruction
The XADD instruction uses the following syntax:
xadd( Reg8, Reg8 )
xadd( mem, Reg8 )
xadd( AnonMem, Reg8 )
xadd( Reg16, Reg16 )
xadd( mem, Reg16 )
xadd( AnonMem, Reg16 )
xadd( Reg32, Reg32 )
xadd( mem, Reg32 )
xadd( AnonMem, Reg32 )
< BSF and BSR Instructions
The bit scan instructions use the following syntax (substitute BSR for BSF as appropriate):
bsf( Reg16, Reg16 );
bsf( mem, Reg16 );
bsf( AnonMem, Reg16 );
bsf( Reg32, Reg32 );
bsf( mem, Reg32 );
bsf( AnonMem, Reg32 );
< The BSWAP Instruction
This instruction takes the form "bswap( reg32 )". It converts between little endian and big endian data formats in the specified 32-bit register.
< Bit Test Instructions
This group of instructions includes BT, BTC, BTR, and BTS. They allow the following generic forms:
Substitute the BTC, BTR, or BTS mnemonic for BT in the examples above for these other instructions.
< Floating Point Instructions
HLA supports the following FPU instructions.
fld( FPreg );
fst( FPreg );
fld( FPmem ); // Returns operand.
fst( FPmem ); // 32 and 64-bits only! Returns operand.
fstp( FPmem ); // Returns operand.
fxch( FPreg );
fild( FPmem ); // Returns operand.
fist( FPmem ); // 32 and 64-bits only! Returns operand.
fistp( FPmem ); // Returns operand.
fbld( FPmem ); // Returns operand.
fbstp( FPmem ); // Returns operand.
fadd( );
fadd( FPreg, st0 );
fadd( st0, FPreg );
fadd( FPmem ); // Returns operand.
fadd( FPconst ); // Returns operand.
faddp( );
faddp( st0, FPreg );
fmul( );
fmul( FPreg, st0 );
fmul( st0, FPreg );
fmul( FPmem ); // Returns operand.
fmul( FPconst ); // Returns operand.
fmulp( );
fmulp( st0, FPreg );
fsub( );
fsub( FPreg, st0 );
fsub( st0, FPreg );
fsub( FPmem ); // Returns operand.
fsub( FPconst ); // Returns operand.
fsubp( );
fsubp( st0, FPreg );
fsubr( );
fsubr( FPreg, st0 );
fsubr( st0, FPreg );
fsubr( FPmem ); // Returns operand.
fsubr( FPconst ); // Returns operand.
fsubrp( );
fsubrp( st0, FPreg );
fdiv( );
fdiv( FPreg, st0 );
fdiv( st0, FPreg );
fdiv( FPmem ); // Returns operand.
fdiv( FPconst ); // Returns operand.
fdivp( );
fdivp( st0, FPreg );
fdivr( );
fdivr( FPreg, st0 );
fdivr( st0, FPreg );
fdivr( FPmem ); // Returns operand.
fdivr( FPconst ); // Returns operand.
fdivrp( );
fdivrp( st0, FPreg );
fiadd( mem16 ); // Returns operand.
fiadd( mem32 ); // Returns operand.
fiadd( const ); // Returns operand.
fimul( mem16 ); // Returns operand.
fimul( mem32 ); // Returns operand.
fimul( const ); // Returns operand.
fidiv( mem16 ); // Returns operand.
fidiv( mem32 ); // Returns operand.
fidiv( mem32 ); // Returns operand.
fidiv( const ); // Returns operand.
fidivr( mem16 ); // Returns operand.
fidivr( mem32 ); // Returns operand.
fidivr( const ); // Returns operand.
fcom( );
fcom( FPreg );
fccom( FPmem ); // Returns operand.
fcomp( );
fcomp( FPreg );
fcomp( FPmem ); // Returns operand.
fucom( );
fucom( FPreg );
fucomp( );
fucomp( FPreg );
fcompp();
fucompp();
ficom( mem16 ); // Returns operand.
ficom( mem32 ); // Returns operand.
ficom( const ); // Returns operand.
ficomp( mem16 ); // Returns operand.
ficomp( mem32 ); // Returns operand.
ficomp( const ); // Returns operand.
fsqrt(); // The following all return "st0"
fscale();
fprem();
fprem1();
frndint();
fxtract();
fabs();
fchs();
ftst();
fxam();
fldz();
fld1();
fldpi();
fldl2t();
fldl2e();
fldlg2();
fldln2();
f2xm1();
fsin();
fcos();
fsincos();
fptan();
fpatan();
fyl2x();
fyl2xp1();
finit(); // Returns ""
fwait();
fclex();
fincstp();
fdecstp();
fnop();
ffree( FPreg );
fldcw( mem );
fstcw( mem );
fstsw( mem );
See the chapter on real arithmetic in "The Art of Assembly Language Programming" for details on these instructions. Note that HLA does not support the entire FPU instruction set. If you absolutely need the few remaining instructions, use the #ASM..#ENDASM or #EMIT directives to generate them.
< Additional Floating Point Instructions for Pentium Pro and Later Processors
The FCMOVcc instructions (cc= a, ae, b, be, na, nae, nb, nbe, e, ne, u, nu) use the following basic syntax:
They move the specified floating point register to ST0 if the specified condition is true.
The FCOMI and FCOMIP instructions use the following syntax:
fcomi( st0, stn );
fcomip( st0, stn );
These instructions behave like their (syntactical equivalent) FCOM and FCOMP brethren except they store the status in the EFLAGs register directly rather than in the floating point status register.
< MMX Instructions
HLA supports the following MMX instructions found on the Pentium and later processors (note that some instructions are only available on Pentium III and later processors; see the Intel reference manuals for details):
HLA uses the symbols mm0, mm1, ..., mm7 for the MMX register set.
The following MMX instructions all use the same syntax. The syntax is
The following MMX instructions require a special syntax. The syntax is listed for each instruction.
pextrw( constant, mmxReg, Reg32 );
pinsrw( constant, Reg32, mmxReg );
pshufw( constant, mmxReg, mmxReg );
pshufw( constant, mem64, mmxReg );
Please see the appropriate Intel documentation or "The Art of Assembly Language" for a discussion of the behavior of these instructions.
< Memory Addressing Modes in HLA
HLA supports all the 32-bit addressing modes of the Intel 80x86 instruction set2. A memory address on the 80x86 may consist of one to three different components: a displacement (also called an offset), a base pointer, and a scaled index value. The following are the legal combinations of these components:
displacement + basePointer + scaledIndex
Note that a scaled index value cannot exist by itself.
HLA's syntax for memory addressing modes takes the following forms:
staticVarName[ breg32 + ireg32 ]
staticVarName[ breg32 + ireg32*index ]
staticVarName[ breg32 + constant ]
staticVarName[ ireg32 + constant ]
staticVarName[ ireg32*index + constant ]
staticVarName[ breg32 + ireg32 + constant ]
staticVarName[ breg32 + ireg32*index + constant ]
staticVarName[ breg32 - constant ]
staticVarName[ ireg32 - constant ]
staticVarName[ ireg32*index - constant ]
staticVarName[ breg32 + ireg32 - constant ]
staticVarName[ breg32 + ireg32*index - constant ]
[ breg32 + ireg32 + constant ]
[ breg32 + ireg32*index + constant ]
[ breg32 + ireg32 - constant ]
[ breg32 + ireg32*index - constant ]
" staticVarName " denotes any static variable currently in scope (local or global).
" basereg " denotes any general purpose 32-bit register.
" breg32 " denotes a base register and can be any general purpose 32-bit register.
" ireg32 " denotes an index register and may also be any general purpose register, even the same register as the base register in the address expression.
" index " denotes one of the four constants "1", "2", "4", or "8". In those address expression that have an index register without an index constant, "*1" is the default index.
Those memory addressing modes that do not have a variable name preceding them are known as "anonymous memory locations." Anonymous memory locations do not have a data type associated with them and in many instances you must use the type coercion operator in order to keep HLA happy.
Those memory addressing modes that do have a variable name attached to them inherit the base type of the variable. Read the next section for more details on data typing in HLA.
HLA allows another way to specify addition of the various addressing mode components in an address expression - by putting the components in separate brackets and concatenating them together. The following examples demonstrate the standard syntax and the alternate syntax:
The reason for allowing the extended syntax is because you might want to construct these addressing modes inside a macro from the individual pieces and it's much easier to concatenate two operands already surrounded by brackets than it is to pick the expressions apart and construct the standard addressing mode.
< Type Coercion in HLA
While an assembly language can never really be a strongly typed language, HLA is much more strongly typed than most other assembly languages.
Strong typing in an assembly language can be very frustrating. Therefore, HLA makes certain concessions to prevent the type system from interfering with the typical assembly language programmer. Within an 80x86 machine instruction, the only checking that takes place is a verification that the sizes of the operands are compatible.
Despite HLA playing fast and loose with machine instructions, there are many times when you will need to coerce the type of some operand. HLA uses the following syntax to coerce the type of a memory location or register operand:
There are two instances where type coercion is especially important: (1) when you need to assign a type other than byte, word, or dword to a register3; (2) when you need to assign an anonymous memory location a type.