HLA Strings and the HLA String Library

Chapter Overview

A string is a collection of objects stored in contiguous memory locations. Strings are usually arrays of bytes, words, or (on 80386 and later processors) double words. The 80x86 microprocessor family supports several instructions specifically designed to cope with strings. This chapter explores some of the uses of these string instructions.

The 80x86 CPUs can process three types of strings: byte strings, word strings, and double word strings. They can move strings, compare strings, search for a specific value within a string, initialize a string to a fixed value, and do other primitive operations on strings. The 80x86's string instructions are also useful for manipulating arrays, tables, and records. You can easily assign or compare such data structures using the string instructions. Using string instructions may speed up your array manipulation code considerably.

Character Strings

Since you'll encounter character strings more often than other types of strings, they deserve special attention. The following paragraphs describe character strings and various types of string operations.

At the most basic level, the 80x86's string instructions only operate upon arrays of characters. However, since most string data types contain an array of characters as a component, the 80x86's string instructions are handy for manipulating that portion of the string.

Probably the biggest difference between a character string and an array of characters is the length attribute. An array of characters contains a fixed number of characters. Never any more, never any less. A character string, however, has a dynamic run-time length, that is, the number of characters contained in the string at some point in the program. Character strings, unlike arrays of characters, have the ability to change their size during execution (within certain limits, of course).

To complicate things even more, there are two generic types of strings: statically allocated strings and dynamically allocated strings. Statically allocated strings are given a fixed, maximum length at program creation time. The length of the string may vary at run-time, but only between zero and this maximum length. Most systems allocate storage for dynamically allocated strings from a memory pool and return that storage to the pool when the string is no longer needed. Such strings may be any length (up to some reasonable maximum value). Accessing such strings is less efficient than accessing statically allocated strings. Furthermore, garbage collection may take additional time. Nevertheless, dynamically allocated strings are much more space efficient than statically allocated strings and, in some instances, accessing dynamically allocated strings is faster as well.

A string with a dynamic length needs some way of keeping track of this length. While there are several possible ways to represent string lengths, the two most popular are length-prefixed strings and zero-terminated strings. A length-prefixed string begins with a single byte, word, or double word that contains the length of that string. Immediately following this length value are the characters that make up the string. Assuming the use of byte prefix lengths, you could define the string "HELLO" as follows:

byte 5, "HELLO";

Length-prefixed strings are often called Pascal strings since this is the type of string variable supported by most versions of Pascal.

Another popular way to specify string lengths is to use zero-terminated strings. A zero-terminated string consists of a string of characters terminated with a zero byte. These types of strings are often called C-strings since they are the type used by the C/C++ programming language. If you are manually creating string values, zero terminated strings are a little easier to deal with because you don't have to count the characters in the string. Here's an example of a zero terminated string:

byte "HELLO", 0;

Pascal strings are much better than C/C++ strings for several reasons. First, computing the length of a Pascal string is trivial. You need only fetch the first byte (or word) of the string and you've got the length of the string. Computing the length of a C/C++ string is considerably less efficient. You must scan the entire string (e.g., using the SCASB instruction) for a zero byte. If the C/C++ string is long, this can take a long time. Furthermore, C/C++ strings cannot contain the NUL character. On the other hand, C/C++ strings can be any length, yet require only a single extra byte of overhead. Pascal strings, however, can be no longer than 255 characters when using only a single length byte. For strings longer than 255 bytes, you'll need two or more bytes to hold the length for a Pascal string. Since most strings are fewer than 256 characters in length, this isn't much of a disadvantage.

Common string functions like concatenation, length, substring, index, and others are much easier to write (and much more efficient) when using length-prefixed strings. So from a performance point of view, length-prefixed strings seem to be the better way to go. However, Windows requires the use of zero-terminated strings, so if you're going to call Win32 APIs, you've either got to use zero-terminated strings or convert them before each call.

HLA takes a different approach. HLA's strings are both length-prefixed and zero terminated. Therefore, HLA strings require a few extra bytes but enjoy the advantages of both schemes. HLA's string functions like concatenation are very efficient without losing Windows compatibility.

HLA's strings are an extension of length-prefixed strings because they contain two lengths: a maximum length and a dynamic length. The dynamic length field is similar to the length field of Pascal strings insofar as it holds the current number of characters in the string. HLA's length field, however, is four bytes, so HLA strings may contain over four billion characters. The maximum length field holds the maximum number of characters the string may contain. By adding this extra field, HLA can check the validity of operations like string concatenation and string assignment to verify that the destination string is large enough to hold the result. This is an extra integrity check that is often missing in string libraries found in typical high level languages.

In addition to providing two lengths, HLA also zero terminates its strings. This lets you pass HLA strings as parameters to Win32 and other functions that work with zero-terminated strings. Also, in those few instances where zero-terminated strings are more convenient, HLA's string format still shines. Of course, the drawback to zero-terminated strings is that you cannot put the NUL character (ASCII code zero) into such a string; fortunately, the need to do so is not very great.

HLA's strings actually have another few attributes that improve their efficiency. First of all, HLA almost always aligns string data on double word boundaries. HLA also allocates data for a string in four-byte chunks. By aligning strings on double word boundaries and allocating storage that is an even multiple of four bytes long, HLA allows you to use double word string instructions when processing strings. Since the double word instructions are often four times faster than the byte versions, this is an important benefit. As a result of this storage and alignment, HLA's string library routines are very efficient.

Of course, HLA strings are not without their disadvantages. To represent a string containing n characters requires between n+9 and n+12 bytes in memory. HLA's strings require at least n+9 bytes because of the two double word length values and the zero terminating byte. Furthermore, since the entire object must be an even multiple of four bytes long, HLA strings may need up to three bytes of padding to ensure this.
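As a rough illustration of this arithmetic, the sketch below computes the storage needed for a five-character string such as "HELLO". The program name and the choice of five characters are assumptions made for this example, and the add-then-mask idiom is simply one common way to round up to a multiple of four; it is not necessarily how the library computes the size internally.

```hla
// Sketch: storage needed by an HLA string of n characters.
// (Hypothetical example program, not part of the HLA Standard Library.)

program strsize_sketch;
#include( "stdlib.hhf" )

begin strsize_sketch;

    mov( 5, eax );              // n = 5 characters ("HELLO").
    add( 12, eax );             // 8 bytes of length fields + 1 zero byte + 3 for rounding.
    and( $FFFF_FFFC, eax );     // Clear the low two bits: result is a multiple of four.

    stdout.put( "bytes needed=", (type uns32 eax), nl );

end strsize_sketch;
```

For n = 5 the minimum is 5 + 9 = 14 bytes, which rounds up to 16 (that is, n + 11), inside the n+9 to n+12 range given above.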

HLA string variables are always pointers. HLA even treats string constants as literal pointer constants. The pointer points at the first byte of the character string. Successive memory locations contain successive characters in the string up to the zero terminating byte. This format is compatible with zero-terminated strings like those that C/C++ uses. The dynamic (current) length field is situated four bytes before the first character in the string (that is, at the pointer address minus four). The maximum (static) length field appears eight bytes before the first character of the string. The HLA String Format figure shows this layout.

HLA String Format

For more information on strings and HLA strings, see the chapter on character strings in The Art of Assembly Language (AoA).

HLA Standard Library String Functions

The HLA Standard Library contains a large number of efficient string functions that perform all the common string operations, and then some. This section discusses the HLA string functions and suggests some uses for many of these functions and other objects.

The stralloc and strfree Routines

procedure stralloc( strsize: uns32 ); returns( "eax" );

procedure strfree( strToFree: string );

This text has already discussed the stralloc and strfree routines in the chapter on character strings, but a review is probably useful here. These routines dynamically allocate and deallocate storage for a string object in memory. They are the principal mechanism HLA provides for allocating storage for string variables. Therefore, you need to be comfortable using these procedures.

The first thing to note about these routines is that they are not actually a part of the HLA String Library. They are members of the memory allocation package in the HLA Standard Library. The reason for mentioning this fact is to point out that the names of these routines are stralloc and strfree. Most of the string routines in the HLA Standard Library belong to the str namespace and, therefore, have names like str.cpy and str.length. Note that most HLA string function names have a period between the str and the base function name; this is not true for stralloc and strfree since they are not a part of the HLA string package.

The stralloc parameter specifies the maximum number of characters for the string it allocates. The stralloc routine allocates at least enough storage for this many characters plus the 9-12 bytes of overhead required for a string object. It initializes the MaxStrLen field to at least strsize (it could be as large as strsize+3 depending on strsize and the need for padding bytes in the string object). This function also initializes the length field to zero and stores a zero byte in the first character position of the string data (that is, it zero terminates the empty string it creates). Since the other HLA string functions require double word aligned strings, stralloc returns a pointer that points at a double word boundary.

Upon return from stralloc, the EAX register contains the address of the string object. Generally you would store this 32-bit pointer into a string variable or pass it on to some other function that needs the address of a string object. Like any other string pointer, the value stralloc returns points at the first character position in the storage it allocates.

Internally, the stralloc routine calls malloc to allocate the storage for the string data on the heap. However, the pointer that stralloc returns is not the same value that malloc returns. This is because string objects require an eight-byte prefix that holds the MaxStrLen and length fields. Therefore, stralloc actually returns a pointer that is eight bytes beyond the value that the internal call to malloc returns. Consequently, you cannot call the free procedure to return this string storage to the heap because free requires a pointer to the beginning of the storage that malloc allocates. Instead, call the strfree routine to return string object storage to the system. The strfree parameter is the address of a string object that you allocated with stralloc.

Note that you must not use strfree to attempt to free storage for objects that you do not allocate (directly or indirectly) with stralloc. In particular, do not attempt to free statically initialized strings or strings you create with str.strvar.

Many of the HLA Standard Library string routines begin with a name of the form "str.a_*****". This "a_" prefix on the function name indicates that the string function automatically allocates storage for a new string by calling stralloc. These functions typically return a pointer to the new string in the EAX register, just like stralloc. When you are done with the string these functions create, you can free the storage for the string by calling strfree.

// stralloc and strfree demonstration program.

program str_alloc_free_demo;
#include( "stdlib.hhf" )

static
    str1 :string;
    str2 :string;

begin str_alloc_free_demo;

    // Allocate a string with a maximum length of 16 characters.

    stralloc( 16 );
    mov( eax, str1 );

    // Initialize this string with the str.cpy routine:

    str.cpy( "Hello ", str1 );

    // Allocate storage for a second string with 16 characters
    // and initialize the string data:

    stralloc( 16 );
    mov( eax, str2 );
    str.cpy( "world", str2 );

    // Concatenate the two strings and print the pertinent data.
    // Note that str.cat appends its first operand (str1) to the
    // end of its second operand (str2); str1 is left unchanged.

    str.cat( str1, str2 );

    mov( str1, ebx );
    mov( [ebx-4], ecx ); // Get the current string length.
    mov( [ebx-8], edx ); // Get the maximum string length.
    stdout.put
    (
        "str1='",
        str1,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    mov( str2, ebx );
    mov( [ebx-4], ecx ); // Get the current string length.
    mov( [ebx-8], edx ); // Get the maximum string length.
    stdout.put
    (
        "str2='",
        str2,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    // Okay, we're done with the strings, free the storage
    // associated with them:

    strfree( str1 );
    strfree( str2 );

end str_alloc_free_demo;

Example of stralloc and strfree Calls

The str.strRec Data Structure

The str.strRec data structure lets you directly access the maximum and current length prefix values of an HLA string. This allows you to use symbolic (and meaningful) names to access these fields rather than using numeric offsets like -4 and -8. By using str.strRec you don't have to remember which offset is associated with the two different length values.

The str.strRec type definition is a RECORD with the following fields:

MaxStrLen

length

strData

The MaxStrLen field specifies the offset (-8) of the maximum string length double word in a string. The length field specifies the offset (-4) of the current dynamic length field. The strData field specifies the offset (0) of the first character in the string; generally, you do not use this last field because accessing the character data in a string is trivial (your string variable points directly at the first character in the string).

Generally, you use the str.strRec type to coerce a string pointer appearing in a 32-bit register. For example, if EAX contains the value of an HLA string variable (that is, the address of the string's first character), then "mov( (type str.strRec [eax]).length, ecx );" extracts the current string length. In theory, you could use this type to declare string headers, but no one really uses this data type for that purpose; instead, this type exists mainly as a mechanism for type coercion. The following sample program is a modification of the previous program that uses str.strRec rather than literal numeric offsets.

// str.strRec demonstration program.

program strRec_demo;
#include( "stdlib.hhf" )

static
    str1 :string;
    str2 :string;

begin strRec_demo;

    // Allocate a string with a maximum length of 16 characters.

    stralloc( 16 );
    mov( eax, str2 );

    // Initialize this string with the str.cpy routine:

    str.cpy( "Hello ", str2 );

    // Allocate storage for a second string with 16 characters
    // and initialize the string data:

    stralloc( 16 );
    mov( eax, str1 );
    str.cpy( "world", str1 );

    // Concatenate the two strings and print the pertinent data:

    str.cat( str1, str2 );

    mov( str1, ebx );
    mov( (type str.strRec [ebx]).length, ecx );    // Get the current str len
    mov( (type str.strRec [ebx]).MaxStrLen, edx ); // Get the max str len
    stdout.put
    (
        "str1='",
        str1,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    mov( str2, ebx );
    mov( (type str.strRec [ebx]).length, ecx );    // Get the current str len
    mov( (type str.strRec [ebx]).MaxStrLen, edx ); // Get the max str len
    stdout.put
    (
        "str2='",
        str2,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

    // Free the storage associated with these strings:

    strfree( str1 );
    strfree( str2 );

end strRec_demo;

Programming Example that uses the str.strRec Data Type

The str.strvar Macro

The str.strvar macro statically allocates storage for a string in the STATIC variable declaration section (you cannot use str.strvar in any of the other variable declaration sections). This provides a convenient mechanism for declaring static strings when you know the maximum size at compile-time.

Example:

static
    StaticString: str.strvar( 32 );

This macro invocation does two things: (1) it reserves sufficient storage for a string that can hold at least 32 characters (plus an additional nine bytes for the string overhead); (2) it allocates storage for a string pointer variable and initializes that variable with the address of the string storage. When you reference the object named StaticString you are actually accessing this pointer variable.

Note that str.strvar uses parentheses rather than square brackets to specify the string size. Syntactically, square brackets would be nice since this gives the illusion of declaring an array of characters. However, str.strvar is a macro and the character count is a parameter; macro parameters always appear within parentheses, so you must use parentheses in this declaration.

// str.strvar demonstration program.

program strvar_demo;
#include( "stdlib.hhf" )

static
    demoStr :str.strvar( 16 );

begin strvar_demo;

    // Initialize our string via str.cpy (the storage for demoStr
    // was already reserved statically by str.strvar):

    str.cpy( "Hello World", demoStr );

    mov( demoStr, ebx );
    mov( (type str.strRec [ebx]).length, ecx );    // Get the current str len
    mov( (type str.strRec [ebx]).MaxStrLen, edx ); // Get the max str len
    stdout.put
    (
        "demoStr='",
        demoStr,
        "', length=",
        (type uns32 ecx ),
        ", maximum length=",
        (type uns32 edx),
        nl
    );

end strvar_demo;

Program that Demonstrates the use of the str.strvar Declaration

The str.length Function and the str.mLength Macro

procedure str.length( s:string ); returns( "eax" );

macro str.mLength( s ); // s must be a 32-bit register or a string variable

The str.length function and str.mLength macro compute the length of an HLA string and copy this length into the EAX register. The macro version (str.mLength) is more efficient since it compiles into a single MOV instruction (accessing the str.strRec.length field directly). For this reason you should generally use the macro (str.mLength) to compute the length rather than the str.length function. You should only use the str.length function when you need procedure call semantics (e.g., when you need to pass the address of the length function to some other procedure).

You may question why HLA even provides a length function. After all, extracting the string's length using the str.strRec type definition is easy enough to do. The principal reason HLA provides a length function is that "str.length(s)" is much easier to read and understand than "mov( (type str.strRec [eax]).length, eax );". Of course, the str.mLength macro compiles directly into this instruction, so there is no efficiency reason for using the direct access mechanism. The only time you should really use the str.strRec RECORD type is when you need to move the string length into a register other than EAX.

The str.length and str.mLength parameters must be a string variable or a 32-bit register (which, presumably, contains the address of a string in memory). Remember, string variables are really nothing more than pointers, so when you pass a string variable as a parameter to an HLA string function, HLA passes the value of that pointer which happens to be the address of the first character in the string.

There is a big difference between the two calls "str.length( eax );" and "str.length( (type string [eax]) );" The first call assumes that EAX contains the value of a string pointer (that is, EAX points directly at the first character of the actual string); in this first example, HLA simply passes the value in the EAX register to the str.length function. In the second example, "str.length( (type string [eax]) );" , HLA assumes that EAX contains the address of a string variable (which is a pointer) and passes the 32-bit address at the location contained within EAX. In this example, EAX is a pointer to a string variable rather than the string itself.
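The two operand forms can be placed side by side in a short sketch. The program and variable names here are hypothetical, chosen only for illustration; the calls themselves are the ones described above. Both calls should report the same length because both ultimately pass the same string pointer to str.length.

```hla
// Sketch: a string pointer in EAX versus the address of a string variable in EAX.
// (Hypothetical example program.)

program strlen_operand_sketch;
#include( "stdlib.hhf" )

static
    demoStr :string := "Hello World";

begin strlen_operand_sketch;

    // Case 1: EAX holds the string pointer itself (the value of demoStr).

    mov( demoStr, eax );
    str.length( eax );
    mov( eax, ecx );

    // Case 2: EAX holds the address of the variable demoStr, so the
    // string pointer must be fetched from memory at [eax].

    lea( eax, demoStr );
    str.length( (type string [eax]) );

    stdout.put
    (
        "case 1 length=",
        (type uns32 ecx),
        ", case 2 length=",
        (type uns32 eax),
        nl
    );

end strlen_operand_sketch;
```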

Computing the length of a string is one of the most common string operations. In fact, length computation is probably the most frequently used function in a string library since most of the other string functions need to compute the string length in order to do their work. This is why HLA's length-prefixed string data structure is so important: computing the string length is a common operation and length-prefixed strings make this computation trivial.

// str.length demonstration program.

program strlength_demo;
#include( "stdlib.hhf" )

static
    demoStr :string;

begin strlength_demo;

    // Initialize our string via str.a_cpy (note that a_cpy automatically
    // allocates storage for the string on the heap):

    str.a_cpy( "Hello World" );
    mov( eax, demoStr );

    mov( eax, ebx );
    str.mLength( ebx );     // Can use a register or str var with str.mLength.
    mov( eax, ecx );
    str.length( demoStr );  // Can use a register or str var with str.length.
    stdout.put
    (
        "demoStr='",
        demoStr,
        "', length via str.mLength=",
        (type uns32 ecx ),
        ", length via str.length=",
        (type uns32 eax),
        nl
    );

    // Free the storage allocated by the str.a_cpy procedure:

    strfree( demoStr );

end strlength_demo;

Example of str.length and str.mLength Function Calls

The str.init Function

procedure str.init( var b:byte; numBytes:dword ); returns( "eax" );

There are four ways you can allocate storage for an HLA compatible string: you can use the str.strvar macro (see The str.strvar Macro) to statically allocate storage for a string, you can initialize a string variable in a STATIC or READONLY section, you can dynamically allocate storage using a function like stralloc, or you can manually reserve the storage yourself. To manually reserve storage you must set aside enough storage for the string, the maximum length, the current length, the zero terminating byte, and any necessary padding bytes. You must also ensure that the string begins on a double word boundary and that the entire structure's byte count is an even multiple of four. After you reserve sufficient storage, you must also initialize the MaxStrLen and length fields and supply a zero terminating byte for the string. This turns out to be quite a bit of work. Fortunately, the str.init function takes care of most of this work for you.

This function initializes a block of memory for use as a string object. It takes the address of a character array variable b and aligns this address to a double word boundary. Then it initializes the MaxStrLen, length, and zero terminating byte fields at the resulting address. Finally, it returns a pointer to the newly created string object in EAX. The numBytes parameter specifies the size of the entire buffer area, not the desired maximum length of the string. The numBytes parameter must be 16 or greater, else this routine will raise an ex.ValueOutOfRange exception. Note that string initialization may consume as many as 15 bytes: up to three bytes to align the address on a double word boundary, four bytes for the MaxStrLen field, four bytes for the length field, and at least four bytes for the string data area, which must be a multiple of four bytes long (including the zero terminating byte). This is why numBytes must be 16 or greater. Note that this function initializes the resulting string to the empty string. The MaxStrLen field will contain the maximum number of characters that you can store into the resulting string after subtracting the zero terminating byte, the sizes of the length fields, and any alignment bytes that were necessary.

In general, if you want the maximum string length to be at least m characters, you should reserve m+16 bytes and pass the address of this buffer to str.init. Note that the actual maximum length HLA writes to the MaxStrLen field is the maximum number of characters one could legally put into the string (after subtracting the overhead and padding bytes). If you need to set a specific MaxStrLen value of exactly m, then allocate m+16 bytes of storage, call str.init (passing the address of the buffer and m+16), and then store m into the MaxStrLen field upon return from str.init.
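The "exactly m" procedure can be sketched as follows for m = 32. The program name, the buffer name, and the value 32 are assumptions made for this illustration; the sketch uses only str.init and the str.strRec coercion described in this chapter.

```hla
// Sketch: forcing MaxStrLen to exactly 32 after calling str.init.
// (Hypothetical example program.)

program strinit_exact_sketch;
#include( "stdlib.hhf" )

static
    theStr :string;
    buffer :byte[ 48 ];         // m + 16 bytes, with m = 32.

begin strinit_exact_sketch;

    // Let str.init align the buffer and fill in the string header:

    str.init( buffer, 48 );
    mov( eax, theStr );

    // str.init may have recorded a larger capacity; overwrite it with m:

    mov( 32, (type str.strRec [eax]).MaxStrLen );

    mov( (type str.strRec [eax]).MaxStrLen, edx );
    stdout.put( "max length=", (type uns32 edx), nl );

end strinit_exact_sketch;
```

Lowering MaxStrLen this way only tightens the overflow check that routines such as str.cpy perform; the buffer itself still has room for at least 32 characters.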

// str.init demonstration program.

program strinit_demo;
#include( "stdlib.hhf" )

static
    theStr  :string;
    unalign :byte;       // Do this so strData is not dword aligned.
    strData :byte[ 48 ]; // Storage for a string with 32 characters.

begin strinit_demo;

    // Create a string variable using the "strData" array to hold the
    // string data:

    str.init( strData, 48 );
    mov( eax, theStr );

    // Initialize our string via str.cpy:

    str.cpy( "Hello there World, how are you?", theStr );

    mov( theStr, ebx );
    str.length( ebx );
    mov( eax, ecx );     // str.length returns the length in EAX; copy it to ECX for printing.
    mov( (type str.strRec [ebx]).MaxStrLen, edx );
    lea( esi, strData );
    stdout.put
    (
        "theStr='",
        theStr,
        "', length=",
        (type uns32 ecx ),
        ", max length=",
        (type uns32 edx),
        nl,
        "Address of strData: ",
        esi,
        nl,
        "Address of start of string data: ",
        (type dword theStr),
        nl
    );

end strinit_demo;

Programming Example that uses the str.init Function

The str.cpy and str.a_cpy Functions

procedure str.cpy( src:string; dest:string );

procedure str.a_cpy( src:string ); returns( "eax" );

The str.cpy routine copies the character data from one string to another and adjusts the destination string's length field accordingly. The destination string's maximum string length must be at least as large as the current size of the source string or str.cpy will raise a string overflow exception. Before calling this routine, you must ensure that both strings have storage allocated for them or the program will raise an exception. Note that simply declaring a destination string variable does not allocate storage for the string object. You must call stralloc or somehow otherwise allocate data storage for the string. Failing to allocate storage for the destination string is probably the most common mistake beginning programmers make when calling the str.cpy routine.

Note that there is a fundamental difference between the following two code sequences:

mov( srcStr, eax );
mov( eax, destStr );

and

str.cpy( srcStr, destStr );

The two MOV instructions above copy a string by reference whereas the call to str.cpy copies the string by value. Usually, copying a string by reference is much faster than copying the string by value, since you need only copy four bytes (the string pointer) when copying by reference. Copy by value, on the other hand, requires copying the length value (four bytes), each character in the string (length bytes), plus a zero terminating byte. This is slower than simply copying a pointer and can be much slower if the string is long. However, keep in mind that if you copy a string by reference, then the two string objects are aliases of one another. Any change you make to one of the strings is reflected in the other. When you copy a string by value (using str.cpy), each string variable has its own data, so changes to one string will not affect the other.
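The aliasing effect can be seen in a small sketch. The program and variable names (origStr, aliasStr, copyStr) are hypothetical and exist only for this illustration; the routines are the ones described in this chapter.

```hla
// Sketch: copy by reference (alias) versus copy by value.
// (Hypothetical example program.)

program str_alias_sketch;
#include( "stdlib.hhf" )

static
    origStr  :str.strvar( 16 );
    copyStr  :str.strvar( 16 );
    aliasStr :string;

begin str_alias_sketch;

    str.cpy( "Hello", origStr );

    // Copy by reference: aliasStr now points at the same string data.

    mov( origStr, eax );
    mov( eax, aliasStr );

    // Copy by value: copyStr receives its own copy of the characters.

    str.cpy( origStr, copyStr );

    // Modify the original string in place:

    str.cat( " world", origStr );

    stdout.put
    (
        "origStr='", origStr, "'" nl,
        "aliasStr='", aliasStr, "'" nl,
        "copyStr='", copyStr, "'" nl
    );

end str_alias_sketch;
```

After the call to str.cat, origStr and aliasStr refer to the same modified data, while copyStr still holds the characters it received from str.cpy.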

Although str.cpy does not automatically allocate storage for the destination string, the need to do this arises quite often. The str.a_cpy routine handles this common requirement. As you can see above, the str.a_cpy routine does not have a destination operand. Instead, str.a_cpy calls stralloc to allocate sufficient storage for a new string and copies the source string to this new string. After copying the data, str.a_cpy returns a pointer to the new string in the EAX register. When you are done with this string data you should call strfree to return the storage back to the system.

// str.cpy demonstration program.

program strcpy_demo;
#include( "stdlib.hhf" )

static
    strConst :string := "This is a string";
    srcStr   :str.strvar( 32 );
    destStr  :string;
    smallStr :str.strvar( 12 );

begin strcpy_demo;

    // Use str.cpy to initialize srcStr by copying the
    // static string constant <<strConst>> to srcStr.

    str.cpy( strConst, srcStr );
    str.length( srcStr );
    stdout.put
    (
        "srcStr='",
        srcStr,
        "', length=",
        (type uns32 eax ),
        nl
    );

    // Okay, now use str.a_cpy to make a copy of srcStr
    // whose storage is dynamically allocated on the heap:

    str.a_cpy( srcStr );
    mov( eax, destStr );
    str.mLength( eax );
    stdout.put
    (
        "destStr='",
        destStr,
        "', length=",
        (type uns32 eax ),
        nl
    );

    // Now let's demonstrate what can go wrong if a string
    // overflow occurs (srcStr holds 16 characters, but smallStr
    // was declared with room for only 12):

    try

        str.cpy( srcStr, smallStr );

    anyexception

        stdout.put( "An exception occurred while copying srcStr to smallStr" nl );

    endtry;

    // Don't forget to free the storage associated with destStr:

    strfree( destStr );

end strcpy_demo;

Program that uses the str.cpy and str.a_cpy Procedures

The str.cat and str.a_cat Functions

procedure str.cat( src: string; dest: string );

procedure str.a_cat( leftSrc: string; rightSrc: string ); returns( "eax" );

These two functions concatenate two strings. The str.cat procedure directly concatenates one string to the end of the destination string (that the second parameter specifies). The str.a_cat procedure creates a new string on the heap (by calling stralloc) and copies the string the first parameter specifies to this new string. Immediately thereafter, it concatenates the string object the second parameter specifies to the end of this new string. Finally, str.a_cat returns the address of the new string in the EAX register. Note that str.a_cat, unlike str.cat, does not affect the value of either string appearing in the parameter list. When you finish using the string that str.a_cat allocates, you can return the storage to the system by passing the address to strfree.

String concatenation is easily one of the most common string operations (the others being string copy and string comparison). Concatenation is a fundamental operation that you use to build larger strings up from smaller strings. A few common examples of string concatenation include applying suffixes (like ".HLA") to filenames and merging a person's first and last names together to form a single string.
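The two uses mentioned above can be sketched as follows. This is a minimal illustration, not the chapter's original listing: the program name, variable names, and string values are assumptions, and the calls follow the prototypes given at the start of this section.

```hla
// Sketch: appending a filename suffix with str.cat and building a
// full name with str.a_cat. (Hypothetical example program.)

program strcat_sketch;
#include( "stdlib.hhf" )

static
    fileName :str.strvar( 32 );
    fullName :string;

begin strcat_sketch;

    // str.cat appends its first operand to the end of its second,
    // so fileName must have room for the combined result:

    str.cpy( "demo", fileName );
    str.cat( ".hla", fileName );
    stdout.put( "fileName='", fileName, "'" nl );

    // str.a_cat leaves both operands untouched and returns a
    // pointer to a newly allocated string in EAX:

    str.a_cat( "John ", "Smith" );
    mov( eax, fullName );
    stdout.put( "fullName='", fullName, "'" nl );

    // The string str.a_cat created lives on the heap; release it:

    strfree( fullName );

end strcat_sketch;
```

Only the string returned by str.a_cat is passed to strfree; fileName was declared with str.strvar and must not be freed.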

Examples of the str.cat and str.a_cat Procedures

The String Comparison Routines

procedure str.eq( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ne( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.lt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.le( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.gt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ge( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ieq( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ine( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ilt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ile( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.igt( lftOperand: string; rtOperand: string ); returns( "al" );

procedure str.ige( lftOperand: string; rtOperand: string ); returns( "al" );

These procedures compare two strings. They are equivalent to the boolean expression:

lftOperand op rtOperand

where op represents one of the relational operators "=", "<>" ("!=" to C programmers), "<", "<=", ">", or ">=". These functions return true (1) or false (0) in the EAX register depending upon the result of the comparison. For example, "str.lt( s, r );" returns true in EAX if s < r; it returns false otherwise. This feature lets you use these procedures as boolean expressions. The following example shows how you could use str.lt in an IF statement:

if( str.lt( s, r )) then

    -- do something if s < r --

endif;

As you've probably noticed, there are two different sets of string comparison functions. Those that have names of the form "str.i**" do case insensitive string comparisons. That is, these functions compare the strings ignoring differences in alphabetic case. For example, these functions treat "Hello" and "hello" as though they were the same string. Note that case insensitive comparisons are relatively inefficient compared with case sensitive comparisons, so you should only use these forms if you absolutely need a case insensitive comparison.

These functions do not modify their parameters.

// String comparisons demonstration program.

program strcmp_demo;
#include( "stdlib.hhf" )

static
    str1 :string := "abcdefg";
    str2 :string := "hijklmn";
    str3 :string := "AbCdEfG";

    procedure cmpStrs( s1:string; s2:string ); nodisplay;
    var
        eq :boolean;
        ne :boolean;
        lt :boolean;
        le :boolean;
        ge :boolean;
        gt :boolean;

    begin cmpStrs;

        stdout.put( nl "String #1 = '", s1, "'" nl );
        stdout.put( "String #2 = '", s2, "'" nl nl );

        str.eq( s1, s2 ); mov( al, eq );
        str.ne( s1, s2 ); mov( al, ne );
        str.lt( s1, s2 ); mov( al, lt );
        str.le( s1, s2 ); mov( al, le );
        str.ge( s1, s2 ); mov( al, ge );
        str.gt( s1, s2 ); mov( al, gt );
        stdout.put
        (
            "eq = ", eq, nl,
            "ne = ", ne, nl,
            "lt = ", lt, nl,
            "le = ", le, nl,
            "ge = ", ge, nl,
            "gt = ", gt, nl
        );

        str.ieq( s1, s2 ); mov( al, eq );
        str.ine( s1, s2 ); mov( al, ne );
        str.ilt( s1, s2 ); mov( al, lt );
        str.ile( s1, s2 ); mov( al, le );
        str.ige( s1, s2 ); mov( al, ge );
        str.igt( s1, s2 ); mov( al, gt );
        stdout.put
        (
            "ieq = ", eq, nl,
            "ine = ", ne, nl,
            "ilt = ", lt, nl,
            "ile = ", le, nl,
            "ige = ", ge, nl,
            "igt = ", gt, nl
        );

    end cmpStrs;

begin strcmp_demo;

    cmpStrs( str1, str2 );