8.9 Side Effects
A side effect is any computation or operation by a procedure that isn't the primary purpose of that procedure. For example, if you elect not to preserve all affected registers within a procedure, the modification of those registers is a side effect of that procedure. Side effect programming, that is, the practice of using a procedure's side effects, is very dangerous. All too often a programmer will rely on a side effect of a procedure. Later modifications may change the side effect, invalidating all code relying on that side effect. This can make your programs hard to debug and maintain. Therefore, you should avoid side effect programming.
Perhaps some examples of side effect programming will help enlighten you to the difficulties you may encounter. The following procedure zeros out an array. For efficiency reasons, it makes the caller responsible for preserving necessary registers. As a result, one side effect of this procedure is that the EBX and ECX registers are modified. In particular, the ECX register contains zero upon return.procedure ClrArray; begin ClrArray; lea( ebx, array ); mov( 32, ecx ); while( ecx > 0 ) do mov( 0, (type dword [ebx])); add( 4, ebx ); dec( ecx ); endwhile; end ClrArray;
If your code expects ECX to contain zero after the execution of this subroutine, you would be relying on a side effect of the ClrArray procedure. The main purpose behind this code is zeroing out an array, not setting the ECX register to zero. Later, if you modify the ClrArray procedure to the following, your code that depends upon ECX containing zero would no longer work properly:procedure ClrArray; begin ClrArray; mov( 0, ebx ); while( ebx < 32 ) do mov( 0, array[ebx*4] ); inc( ebx ); endwhile; end ClrArray;
So how can you avoid the pitfalls of side effect programming in your procedures? By carefully structuring your code and paying close attention to exactly how your calling code and the subservient procedures interface with one another. These rules can help you avoid problems with side effect programming:
- Always properly document the input and output conditions of a procedure. Never rely on any other entry or exit conditions other than these documented operations.
- Partition your procedures so that they compute a single value or execute a single operation. Subroutines that do two or more tasks are, by definition, producing side effects unless every invocation of that subroutine requires all the computations and operations.
- When updating the code in a procedure, make sure that it still obeys the entry and exit conditions. If not, either modify the program so that it does or update the documentation for that procedure to reflect the new entry and exit conditions.
- Avoid passing information between routines in the CPU's flag register. Passing an error status in the carry flag is about as far as you should ever go. Too many instructions affect the flags and it's too easy to foul up a return sequence so that an important flag is modified on return.
- Always save and restore all registers a procedure modifies.
- Avoid passing parameters and function results in global variables.
- Avoid passing parameters by reference (with the intent of modifying them for use by the calling code).
These rules, like all other rules, were meant to be broken. Good programming practices are often sacrificed on the altar of efficiency. There is nothing wrong with breaking these rules as often as you feel necessary. However, your code will be difficult to debug and maintain if you violate these rules often. But such is the price of efficiency1. Until you gain enough experience to make a judicious choice about the use of side effects in your programs, you should avoid them. More often than not, the use of a side effect will cause more problems than it solves.
Recursion occurs when a procedure calls itself. The following, for example, is a recursive procedure:procedure Recursive; begin Recursive; Recursive(); end Recursive;
Of course, the CPU will never return from this procedure. Upon entry into Recursive, this procedure will immediately call itself again and control will never pass to the end of the procedure. In this particular case, run away recursion results in an infinite loop2.
Like a looping structure, recursion requires a termination condition in order to stop infinite recursion. Recursive could be rewritten with a termination condition as follows:procedure Recursive; begin Recursive; dec( eax ); if( @nz ) then Recursive(); endif; end Recursive;
This modification to the routine causes Recursive to call itself the number of times appearing in the EAX register. On each call, Recursive decrements the EAX register by one and calls itself again. Eventually, Recursive decrements EAX to zero and returns. Once this happens, each successive call returns back to Recursive until control returns to the original call to Recursive.
So far, however, there hasn't been a real need for recursion. After all, you could efficiently code this procedure as follows:procedure Recursive; begin Recursive; repeat dec( eax ); until( @z ); end Recursive;
Both examples would repeat the body of the procedure the number of times passed in the EAX register3. As it turns out, there are only a few recursive algorithms that you cannot implement in an iterative fashion. However, many recursively implemented algorithms are more efficient than their iterative counterparts and most of the time the recursive form of the algorithm is much easier to understand.
The quicksort algorithm is probably the most famous algorithm that usually appears in recursive form (see a "Data Structures and Algorithms" textbook for a discussion of this algorithm). An HLA implementation of this algorithm follows:program QSDemo; #include( "stdlib.hhf" ); type ArrayType: uns32[ 10 ]; static theArray: ArrayType := [1,10,2,9,3,8,4,7,5,6]; procedure quicksort( var a:ArrayType; Low:int32; High: int32 ); const i: text := "(type int32 edi)"; j: text := "(type int32 esi)"; Middle: text := "(type uns32 edx)"; ary: text := "[ebx]"; begin quicksort; push( eax ); push( ebx ); push( ecx ); push( edx ); push( esi ); push( edi ); mov( a, ebx ); // Load BASE address of "a" into EBX mov( Low, edi); // i := Low; mov( High, esi ); // j := High; // Compute a pivotal element by selecting the // physical middle element of the array. mov( i, eax ); add( j, eax ); shr( 1, eax ); mov( ary[eax*4], Middle ); // Put middle value in EDX // Repeat until the EDI and ESI indicies cross one // another (EDI works from the start towards the end // of the array, ESI works from the end towards the // start of the array). repeat // Scan from the start of the array forward // looking for the first element greater or equal // to the middle element). while( Middle > ary[i*4] ) do inc( i ); endwhile; // Scan from the end of the array backwards looking // for the first element that is less than or equal // to the middle element. while( Middle < ary[j*4] ) do dec( j ); endwhile; // If we've stopped before the two pointers have // passed over one another, then we've got two // elements that are out of order with respect // to the middle element. So swap these two elements. if( i <= j ) then mov( ary[i*4], eax ); mov( ary[j*4], ecx ); mov( eax, ary[j*4] ); mov( ecx, ary[i*4] ); inc( i ); dec( j ); endif; until( i > j ); // We have just placed all elements in the array in // their correct positions with respect to the middle // element of the array. So all elements at indicies // greater than the middle element are also numerically // greater than this element. Likewise, elements at // indicies less than the middle (pivotal) element are // now less than that element. Unfortunately, the // two halves of the array on either side of the pivotal // element are not yet sorted. Call quicksort recursively // to sort these two halves if they have more than one // element in them (if they have zero or one elements, then // they are already sorted). if( Low < j ) then quicksort( a, Low, j ); endif; if( i < High ) then quicksort( a, i, High ); endif; pop( edi ); pop( esi ); pop( edx ); pop( ecx ); pop( ebx ); pop( eax ); end quicksort; begin QSDemo; stdout.put( "Data before sorting: " nl ); for( mov( 0, ebx ); ebx < 10; inc( ebx )) do stdout.put( theArray[ebx*4]:5 ); endfor; stdout.newln(); quicksort( theArray, 0, 9 ); stdout.put( "Data after sorting: " nl ); for( mov( 0, ebx ); ebx < 10; inc( ebx )) do stdout.put( theArray[ebx*4]:5 ); endfor; stdout.newln(); end QSDemo; Program 8.9 Recursive Quicksort Program
Note that this quicksort procedure uses registers for all non-parameter local variables. Also note how Quicksort uses TEXT constant definitions to provide more readable names for the registers. This technique can often make an algorithm easier to read; however, one must take care when using this trick not to forget that those registers are being used.
8.11 Forward Procedures
As a general rule HLA requires that you declare all symbols before their first use in a program4. Therefore, you must define all procedures before their first call. There are two reasons this isn't always practical: mutual recursion (two procedures call each other) and source code organization (you prefer to place a procedure in your code after the point you've first called it). Fortunately, HLA lets you use a forward procedure definition to declare a procedure prototype. Forward declarations let you define a procedure before you actually supply the code for that procedure.
A forward procedure declaration is a familiar procedure declaration that uses the reserved word FORWARD in place of the procedure's declaration section and body. The following is a forward declaration for the quicksort procedure appearing in the last section:procedure quicksort( var a:ArrayType; Low:int32; High: int32 ); forward;
A forward declaration in an HLA program is a promise to the compiler that the actual procedure declaration will appear, exactly as stated in the forward declaration, at a later point in the source code. "Exactly as stated" means exactly that. The forward declaration must have the same parameters, they must be passed the same way, and they must all have the same types as the formal parameters in the procedure5.
Routines that are mutually recursive (that is, procedure A calls procedure B and procedure B calls procedure A) require at least one forward declaration since only one of procedure A or B can be declared before the other. In practice, however, mutual recursion (direct or indirect) doesn't occur very frequently, so the need for forward declarations is not that great.
In the absence of mutual recursion, it is always possible to organize your source code so that each procedure declaration appears before its first invocation. What's possible and what's desired are two different things, however. You might want to group a related set of procedures at the beginning of your source code and a different set of procedures towards the end of your source code. This logical grouping, by function rather than by invocation, may make your programs much easier to read and understand. However, this organization may also yield code that attempts to call a procedure before its declaration. No sweat. Just use a forward procedure definition to resolve the problem.
One major difference between the forward definition and the actual procedure declaration has to do with the procedure options. Some options, like RETURNS may appear only in the forward declaration (if a FORWARD declaration is present). Other options may only appear in the actual procedure declaration (we haven't covered any of the other procedure options yet, so don't worry about them just yet). If your procedure requires a RETURNS option, the RETURNS option must appear before the FORWARD reserved word. E.g.,procedure IsItReady( valueToTest: dword ); returns( "EAX" ); forward;
The RETURNS option must not also appear in the actual procedure declaration later in your source file.
8.12 Putting It All Together
This chapter has filled in one of the critical elements missing from your assembly language knowledge: how to create user-defined procedures in an HLA program. This chapter discussed HLA's high level procedure declaration and calling syntax. It also described how to pass parameters by value and by reference as well as the use of local variables in HLA procedures. This chapter also provided information about instruction composition and the RETURNS option for procedures. Finally, this chapter explained recursion and the use of forward procedure declarations (prototypes).
The one thing this chapter did not discuss was how procedures are written in "pure" assembly language. This chapter presents just enough information to let you start using procedures in your HLA programs. The "real stuff" will have to wait for a few chapters. Fear not, however; later chapters will teach you far more than you probably care to know about procedures in assembly language programs.
1This is not just a snide remark. Expert programmers who have to wring the last bit of performance out of a section of code often resort to poor programming practices in order to achieve their goals. They are prepared, however, to deal with the problems that are often encountered in such situations and they are a lot more careful when dealing with such code.
2Well, not really infinite. The stack will overflow and Windows or Linux will raise an exception at that point.
3Although the latter version will do it considerably faster since it doesn't have the overhead of the CALL/RET instructions.
4There are a few minor exceptions to this rule, but it is certainly true for procedure calls.
5 Actually, "exactly" is too strong a word. You will see some exceptions in a moment.