Chapter Nine Domain Specific Embedded Languages
9.1 Chapter Overview
HLA's compile-time language was designed with one purpose in mind: to give the HLA user the ability to change the syntax of the language in a user-defined manner. The compile-time language is actually so powerful that it lets you implement the syntax of other languages (not just an assembly language) within an HLA source file. This chapter discusses how to take this feature to an extreme and implement your own "mini-languages" within the HLA language.
9.2 Introduction to DSELs in HLA
One of the most interesting features of the HLA language is its ability to support Domain Specific Embedded Languages (or DSELs, for short, which you pronounce "D-cells"). A domain specific language is a language designed with a specific purpose in mind. Applications written in an appropriate domain specific language (DSL) are often much shorter and much easier to write than the same application written in a general purpose language (like C/C++, Java, or Pascal). Unfortunately, writing a compiler for a DSL is considerable work. Since most DSLs are so specific that few programs are ever written in them, it is generally cost-prohibitive to create a DSL for a given application. This economic fact has led to the popularity of domain specific embedded languages. The difference between a DSL and a DSEL is that you don't write a new compiler for a DSEL; instead, you provide some tools for use by an existing language translator to let the user extend the language as necessary for the specific application. This allows the language designer to use the features of the existing (i.e., embedding) language without having to write the translator for these features in the DSEL. The HLA language incorporates lots of features that let you extend the language to handle your own particular needs. This section discusses how to use these features to extend HLA as you choose.
As you probably suspect by now, the HLA compile-time language is the principal tool at your disposal for creating DSELs. HLA's multi-part macros let you easily create high level language-like control structures. If you need some new control structure that HLA does not directly support, it's generally an easy task to write a macro to implement that control structure. If you need something special, something that HLA's multi-part macros won't directly support, then you can write code in the HLA compile-time language to process portions of your source file as though they were simply string data. By using the compile-time string handling functions you can process the source code in just about any way you can imagine. While many such techniques are well beyond the scope of this text, it's reassuring to know that HLA can handle just about anything you want to do, even once you become an advanced assembly language programmer.
The following sections will demonstrate how to extend the HLA language using the compile-time language facilities. Don't get the idea that these simple examples push the limits of HLA's capabilities; they don't. You can accomplish quite a bit more with the HLA compile-time language; these examples must be fairly simple because of the assumed general knowledge level of the audience for this text.
9.2.1 Implementing the Standard HLA Control Structures
HLA supports a wide set of high level language-like control structures. These statements are not true assembly language statements; they are high level language statements that HLA compiles into the corresponding low-level machine instructions. They are general control statements, not "domain specific" (which is why HLA includes them), but they are quite typical of the types of statements one can add to HLA in order to extend the language. In this section we will look at how you could implement many of HLA's high-level control structures using the compile-time language. Although there is no real need to implement these statements in this manner, their example should provide a template for implementing other types of control structures in HLA.
The following sections show how to implement the FOREVER..ENDFOR, WHILE..ENDWHILE, and IF..ELSEIF..ELSE..ENDIF statements. This text leaves the REPEAT..UNTIL and BEGIN..EXIT..EXITIF..END statements as exercises. The remaining high level language control structures (e.g., TRY..ENDTRY) are a little too complex to present at this point.
Because words like "if" and "while" are reserved by HLA, the following examples will use macro identifiers like "_if" and "_while". This will let us create recognizable statements using standard HLA identifiers (i.e., no conflicts with reserved words).
9.2.1.1 The FOREVER Loop
The FOREVER loop is probably the easiest control structure to implement. After all, the basic FOREVER loop simply consists of a label and a JMP instruction. So the first pass at implementing _FOREVER.._ENDFOR might look like the following:

    #macro _forever: topOfLoop;

        topOfLoop:

    #terminator _endfor;

        jmp topOfLoop;

    #endmacro;
Unfortunately, there is a big problem with this simple implementation: you'll probably want the ability to exit the loop via break and breakif statements and you might want the equivalent of a continue and continueif statement as well. If you attempt to use the standard BREAK, BREAKIF, CONTINUE, and CONTINUEIF statements inside this _forever loop implementation, you'll quickly discover that they do not work. Those statements are valid only inside an HLA loop and the _forever macro above is not an HLA loop. Of course, we could easily solve this problem by defining _FOREVER thusly:

    #macro _forever;

        forever

    #terminator _endfor;

        endfor;

    #endmacro;
Now you can use BREAK, BREAKIF, CONTINUE, and CONTINUEIF inside the _forever.._endfor statement. However, this solution is ridiculous. The purpose of this section is to show you how you could create this statement were it not present in the HLA language. Simply renaming FOREVER to _forever is not an interesting solution.
Probably the best way to implement these additional statements is via KEYWORD macros within the _forever macro. Not only is this easy to do, but it has the added benefit of not allowing the use of these statements outside a _forever loop.
Implementing a _continue statement is very easy. Continue must transfer control to the first statement at the top of the loop. Therefore, the _continue #KEYWORD macro will simply expand to a single JMP instruction that transfers control to the topOfLoop label. The complete implementation is the following:

    #keyword _continue;
        jmp topOfLoop;
Implementing _continueif is a little bit more difficult because this statement must evaluate a boolean expression and decide whether it must jump to the topOfLoop label. Fortunately, the HLA JT (jump if true) pseudo-instruction makes this a relatively trivial task. The JT pseudo-instruction expects a boolean expression (the same kind that CONTINUEIF allows) and transfers control to the corresponding target label if the result of the expression evaluation is true. With JT, the _continueif implementation takes only two lines:

    #keyword _continueif( ciExpr );
        jt( ciExpr ) topOfLoop;
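To make the role of JT concrete, the following small program is a sketch that places a JT branch next to a hand-coded CMP/JE pair for the same test. The program name and the labels jtTaken and jeTaken are made up for this illustration, and the CMP/JE sequence is only one reasonable hand encoding of the expression; the instructions HLA actually emits for a given boolean expression may differ.

```hla
program jtDemo;
#include( "stdlib.hhf" )

begin jtDemo;

    mov( 1, ebx );

    // Medium-level form: branch if the boolean expression is true.

    jt( ebx = 1 ) jtTaken;

        stdout.put( "jt fell through" nl );

    jtTaken:

    // Hand-written equivalent for this particular expression
    // (an assumption about a sensible encoding, not HLA's
    // guaranteed output).

    cmp( ebx, 1 );
    je jeTaken;

        stdout.put( "je fell through" nl );

    jeTaken:
    stdout.put( "done" nl );

end jtDemo;
```

With EBX holding one, both branches are taken, so neither "fell through" message should appear. The benefit of JT inside a macro is that the macro never has to parse the expression itself; it simply passes the parameter text through.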
You will implement the _break and _breakif #KEYWORD macros in a similar fashion. The only difference is that you must add a new label just beyond the JMP in the _endfor macro and the break statements should jump to this local label. The following program provides a complete implementation of the _forever.._endfor loop as well as a sample test program for the _forever loop.

    /************************************************/
    /*                                              */
    /* foreverMac.hla                               */
    /*                                              */
    /* This program demonstrates how to use HLA's   */
    /* "context-free" macros, along with the JT     */
    /* "medium-level" instruction to create         */
    /* the FOREVER..ENDFOR, BREAK, BREAKIF,         */
    /* CONTINUE, and CONTINUEIF control statements. */
    /*                                              */
    /************************************************/

    program foreverDemo;
    #include( "stdlib.hhf" )

    // Emulate the FOREVER..ENDFOR loop here, plus the
    // corresponding CONTINUE, CONTINUEIF, BREAK, and
    // BREAKIF statements.

    #macro _forever:foreverLbl, foreverbrk;

        // Target label for the top of the
        // loop. This is also the destination
        // for the _continue and _continueif
        // macros.

        foreverLbl:

        // The _continue and _continueif statements
        // transfer control to the label above whenever
        // they appear in a _forever.._endfor statement.
        // (Of course, _continueif only transfers control
        // if the corresponding boolean expression evaluates
        // true.)

      #keyword _continue;
        jmp foreverLbl;

      #keyword _continueif( cifExpr );
        jt( cifExpr ) foreverLbl;

        // The _break and _breakif macros transfer
        // control to the "foreverbrk" label which
        // is at the bottom of the loop.

      #keyword _break;
        jmp foreverbrk;

      #keyword _breakif( bifExpr );
        jt( bifExpr ) foreverbrk;

        // At the bottom of the _forever.._endfor
        // loop this code must jump back to the
        // label at the top of the loop. The
        // _endfor terminating macro must also supply
        // the target label for the _break and _breakif
        // keyword macros:

      #terminator _endfor;
        jmp foreverLbl;
        foreverbrk:

    #endmacro;

    begin foreverDemo;

        // A simple main program that demonstrates the use of the
        // statements above.

        mov( 0, ebx );
        _forever

            stdout.put( "Top of loop, ebx = ", (type uns32 ebx), nl );
            inc( ebx );

            // On first iteration, skip all further statements.

            _continueif( ebx = 1 );

            // On fourth iteration, stop.

            _breakif( ebx = 4 );

            _continue;  // Always jumps to top of loop.
            _break;     // Never executes, just demonstrates use.

        _endfor;

    end foreverDemo;

Program 9.1 Macro Implementation of the FOREVER..ENDFOR Loop
9.2.1.2 The WHILE Loop
Once the FOREVER..ENDFOR loop is behind us, implementing other control structures like the WHILE..ENDWHILE loop is fairly easy. Indeed, the only notable thing about implementing the _while.._endwhile macros is that the code should implement this control structure as a REPEAT..UNTIL statement for efficiency reasons. The implementation appearing in this section takes a rather lazy approach to implementing the DO reserved word. The following code uses a #KEYWORD macro to implement a "_do" clause, but it does not enforce the (proper) use of this keyword. Instead, the code simply ignores the _do clause wherever it appears between the _while and _endwhile. Perhaps it would have been better to check for the presence of this statement (not too difficult to do) and verify that it immediately follows the _while clause and associated expression (somewhat difficult to do), but this just seems like a lot of work to check for the presence of an irrelevant keyword. So this implementation simply ignores the _do. The complete implementation appears in Program 9.2:

    /************************************************/
    /*                                              */
    /* whileMacs.hla                                */
    /*                                              */
    /* This program demonstrates how to use HLA's   */
    /* "context-free" macros, along with the JT and */
    /* JF "medium-level" instructions to create     */
    /* the basic WHILE statement.                   */
    /*                                              */
    /************************************************/

    program whileDemo;
    #include( "stdlib.hhf" )

    // Emulate the while..endwhile loop here.
    //
    // Note that this code implements the WHILE
    // loop as a REPEAT..UNTIL loop for efficiency
    // (though it inserts an extra jump so the
    // semantics remain the same as the WHILE loop).

    #macro _while( whlexpr ): repeatwhl, whltest, brkwhl;

        // Transfer control to the bottom of the loop
        // where the termination test takes place.

        jmp whltest;

        // Emit a label so we can jump back to the
        // top of the loop.

        repeatwhl:

        // Ignore the "_do" clause. Note that this
        // macro should really check to make sure
        // that "_do" follows the "_while" clause.
        // But it's not semantically important so
        // this code takes the lazy way out.

      #keyword _do;

        // If we encounter "_break" inside this
        // loop, transfer control to the first statement
        // beyond the loop.

      #keyword _break;
        jmp brkwhl;

        // Ditto for "_breakif" except, of course, we
        // only exit the loop if the corresponding
        // boolean expression evaluates true.

      #keyword _breakif( biwExpr );
        jt( biwExpr ) brkwhl;

        // The "_continue" and "_continueif" statements
        // should transfer control directly to the point
        // where this loop tests for termination.

      #keyword _continue;
        jmp whltest;

      #keyword _continueif( ciwExpr );
        jt( ciwExpr ) whltest;

        // The "_endwhile" clause does most of the work.
        // First, it must emit the target label used by the
        // "_while", "_continue", and "_continueif" clauses
        // above. Then it must emit the code that tests the
        // loop condition and transfers control
        // to the top of the loop (the "repeatwhl" label)
        // if the expression evaluates true. Finally,
        // this code must emit the "brkwhl" label the "_break"
        // and "_breakif" statements reference.

      #terminator _endwhile;
        whltest:
        jt( whlexpr ) repeatwhl;
        brkwhl:

    #endmacro;

    begin whileDemo;

        // Quick demo of the _while statement.
        // Note that the _breakif in the nested
        // _while statement only skips the
        // inner-most _while, just as you should expect.

        mov( 0, eax );
        _while( eax < 10 ) _do

            stdout.put( "eax in loop = ", eax, " ebx=" );
            inc( eax );
            mov( 0, ebx );
            _while( ebx < 4 ) _do

                stdout.puti32( ebx );
                _breakif( ebx = 3 );
                stdout.put( ", " );
                inc( ebx );

            _endwhile;
            stdout.newln();

            _continueif( eax = 5 );
            _breakif( eax = 8 );
            _continue;
            _break;

        _endwhile;

    end whileDemo;

Program 9.2 Macro Implementation of the WHILE..ENDWHILE Loop
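The same pattern carries over to the REPEAT..UNTIL statement that this text leaves as an exercise. The following is one possible sketch, not a reference solution: the macro names _repeat and _until, the local labels, and the parameter names are all invented here, and sending _continue to the termination test (rather than to the top of the loop) is a design assumption modeled on the _while macro in Program 9.2.

```hla
program repeatDemo;
#include( "stdlib.hhf" )

// Hypothetical _repeat.._until macro. Unlike _while, the test
// expression arrives as a parameter of the terminator macro.

#macro _repeat: repeatLbl, untilTest, brkRpt;

    // Top of the loop body.

    repeatLbl:

  #keyword _break;
    jmp brkRpt;

  #keyword _breakif( birExpr );
    jt( birExpr ) brkRpt;

    // Assumption: "continue" means "go evaluate the
    // termination test", as in the _while macro.

  #keyword _continue;
    jmp untilTest;

  #keyword _continueif( cirExpr );
    jt( cirExpr ) untilTest;

    // Loop again while the _until expression is false.

  #terminator _until( untilExpr );
    untilTest:
    jf( untilExpr ) repeatLbl;
    brkRpt:

#endmacro;

begin repeatDemo;

    mov( 0, ebx );
    _repeat

        stdout.put( "ebx = ", (type uns32 ebx), nl );
        inc( ebx );

    _until( ebx >= 3 );

end repeatDemo;
```

The demo loop is intended to execute its body three times, for EBX values zero through two. Note that no initial JMP is needed here: a REPEAT..UNTIL body always executes at least once, so control simply falls into the loop.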
9.2.1.3 The IF Statement
Simulating the HLA IF..THEN..ELSEIF..ELSE..ENDIF statement using macros is a little bit more involved than the simulation of FOREVER or WHILE. The semantics of the ELSEIF and ELSE clauses complicate the code generation and require careful thought. While it is easy to write #KEYWORD macros for _elseif and _else, ensuring that these statements generate correct (and efficient) code is another matter altogether.
The basic _if.._endif statement, without the _elseif and _else clauses, is very easy to implement (even easier than the _while.._endwhile loop of the previous section). The complete implementation is

    #macro _if( ifExpr ): onFalse;

        jf( ifExpr ) onFalse;

      #keyword _then;   // Just ignore _then.

      #terminator _endif;

        onFalse:

    #endmacro;
This macro generates code that tests the boolean expression you supply as a macro parameter. If the expression evaluates false, the code this macro emits immediately jumps to the point just beyond the _endif terminating macro. So this is a simple and elegant implementation of the IF..ENDIF statement, assuming you don't need an ELSE or ELSEIF clause.
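As an illustration, the fragment below shows a use of this simple _if macro followed by an approximate hand expansion. HLA renames macro-local symbols such as onFalse to unique identifiers; the name _0A3F_ used here is made up to stand in for whatever name HLA actually generates.

```hla
// Fragment only (belongs inside a program body).
// Source as written:

    _if( eax = 0 ) _then

        stdout.put( "eax is zero" nl );

    _endif;

// Approximate expansion. "_0A3F_" is a hypothetical stand-in
// for the unique name HLA substitutes for the local symbol
// onFalse; the _then keyword expands to nothing at all.

    jf( eax = 0 ) _0A3F_;

        stdout.put( "eax is zero" nl );

    _0A3F_:
```

Because each invocation of _if receives its own replacement for onFalse, nested and consecutive _if statements never collide on the label.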
Adding an ELSE clause to this statement introduces some difficulties. First of all, we need some way to emit the target label of the JF pseudo-instruction in the _else section if it is present and we need to emit this label in the terminator section if the _else section is not present.
A related problem is that the code after the _if clause must end with a JMP instruction that skips the _else section if it is present. This JMP must transfer control to the same location as the current onFalse label.
Another problem that occurs when we use #KEYWORD macros to implement the _else clause, is that we need some mechanism in place to ensure that at most one invocation of the _else macro appears in a given _if.._endif sequence.
We can easily solve these problems by introducing a compile-time variable (i.e., VAL object) into the macro. We will use this variable to indicate whether we've seen an _else section. This variable will tell us if we have more than one _else clause (which is an error) and it will tell us if we need to emit the onFalse label in the _endif macro. A reasonable implementation might be the following:

    #macro _if( ifExpr ): onFalse, ifDone, hasElse;

        ?hasElse := false;  // Haven't seen an _else clause yet.

        jf( ifExpr ) onFalse;

      #keyword _then;       // Just ignore _then.

      #keyword _else;

        // Check to see if this _if statement already has an _else clause:

        #if( hasElse )

            #error( "Only one _else clause is legal in an _if statement" )

        #endif
        ?hasElse := true;   // Let the world know we've seen an _else clause.

        // Since we've just encountered the _else clause, we've just finished
        // processing the statements in the _if section. The first thing we
        // need to do is emit a JMP instruction that will skip around the
        // _else statements (so the _if section doesn't fall in to the
        // _else code).

        jmp ifDone;

        // Okay, emit the onFalse label here so a false expression will transfer
        // control to the _else statements:

        onFalse:

      #terminator _endif;

        // If there was no _else section, we must emit the onFalse label
        // so that the former JF instruction has a proper destination.
        // If an _else section was present, we cannot emit this label
        // (since the _else code has already done so) but we must emit
        // the ifDone label.

        #if( hasElse )

            ifDone:

        #else

            onFalse:

        #endif

    #endmacro;
Adding the _elseif clause to the _if.._endif statement complicates things considerably. The problem is that _elseif can appear zero or more times in an _if statement and each occurrence needs to generate a unique onFalse label. Worse, if at least one _elseif clause appears in the sequence, then the JF instruction in the _if clause must transfer control to the first _elseif, not to the _else clause. Also, the last _elseif clause must transfer control to the _else clause (or to the first statement beyond the _endif clause) if its expression evaluates false. A straightforward implementation just isn't going to work here.
A clever solution is to create a string variable that contains the name of the previous JF target label. Whenever you encounter an _elseif or an _else clause you simply emit this string to the source file as the target label. Then the only trick is generating a unique label whenever we need one. Let's suppose that we have a string that is unique on each invocation of the _if macro. This being the case, we can generate a (source file wide) unique string by concatenating a counter value to the end of this base string. Each time we need a unique string, we simply bump the value of the counter up by one and create a new string. Consider the following macro (note that no semicolon follows the @TEXT expression, so the expansion is a bare identifier that can appear as a jump target or in front of a colon):

    #macro genLabel( base, number );

        @text( base + string( number ))

    #endmacro;
If the base parameter is a string value holding a valid HLA identifier and the number parameter is an integer numeric operand, then this macro will emit a valid HLA identifier that consists of the base string followed by a string representing the numeric constant. For example, 'genLabel( "Hello", 52)' emits the label Hello52. Since we can easily create an uns32 VAL object inside our _if macro and increment this each time we need a unique label, the only problem is to generate a unique base string on each invocation of the _if macro. Fortunately, HLA already does this for us.
Remember, HLA converts all local macro symbols to a unique identifier of the form "_xxxx_" where xxxx represents some four-digit hexadecimal value. Since local symbols are really nothing more than text constants initialized with these unique identifier strings, it's very easy to obtain a unique string in a macro invocation: just declare a local symbol (or use an existing local symbol) and apply the @STRING: operator to it to extract the unique name as a string. The following example demonstrates how to do this:

    #macro uniqueIDs: counter, base;

        ?counter := 0;          // Increment this for each unique symbol you need.
        ?base := @string:base;  // base holds the base name to use.
          .
          .
          .
        // Generate a unique label at this point:

        genLabel( base, counter ):  // Notice the colon. We're defining a
        ?counter := counter + 1;    // label at this point!
          .
          .
          .
        genLabel( base, counter ):
        ?counter := counter + 1;
          .
          .
          .
        etc.

    #endmacro;
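The short program below is a sketch that exercises genLabel on its own, using the "Hello", 52 example from the text. The program name is made up, and this version of genLabel omits the semicolon after the @TEXT expression so the expansion can be followed directly by a colon or used as a jump operand.

```hla
program genLabelDemo;
#include( "stdlib.hhf" )

// Emit the identifier formed by a base string followed by
// the decimal representation of a number.

#macro genLabel( base, number );

    @text( base + string( number ))

#endmacro;

begin genLabelDemo;

    // Use the generated identifier as a jump target...

    jmp genLabel( "Hello", 52 );

        stdout.put( "this statement is skipped" nl );

    // ...and define the very same identifier (Hello52) as a
    // statement label. Notice the colon after the invocation.

    genLabel( "Hello", 52 ):

        stdout.put( "reached label Hello52" nl );

end genLabelDemo;
```

Both invocations expand to the same identifier because they receive the same base string and number, which is exactly the property the _if macro relies on when it jumps to a label it has not emitted yet.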
Once we have the capability to generate a sequence of unique labels throughout a macro, implementing the _elseif clause simply becomes the task of emitting the last referenced label at the beginning of each _elseif (or _else) clause and jumping if false to the next unique label in the series. Program 9.3 implements the _if.._then.._elseif.._else.._endif statement using exactly this technique.

    /*************************************************/
    /*                                               */
    /* IFmacs.hla                                    */
    /*                                               */
    /* This program demonstrates how to use HLA's    */
    /* "context-free" macros, along with the JT and  */
    /* JF "medium-level" instructions to create      */
    /* an IF statement.                              */
    /*                                               */
    /*************************************************/

    program IFDemo;
    #include( "stdlib.hhf" )

    // genLabel-
    //
    // This macro creates an HLA-compatible
    // identifier of the form "_xxxx_n" where
    // "_xxxx_" is the string associated with
    // the "base" parameter and "n" represents
    // some numeric value that the caller supplies. The
    // combination of the base and the n values
    // will produce a unique label in the
    // program if base's string is unique for
    // each invocation of the "_if" macro.

    #macro genLabel( base, number );

        @text( base + string( number ))

    #endmacro;

    /*
    ** Emulate the if..elseif..else..endif statement here.
    */

    #macro _if( ifexpr ):elseLbl, ifDone, hasElse, base;

        // This macro must create a unique ID string
        // in base. One sneaky way to do this is
        // to use the converted name HLA generates
        // for the "base" object (this is generally
        // a string of the form "_xxxx_" where "xxxx"
        // is a four-digit hexadecimal value).

        ?base := @string:base;

        // This macro may need to generate a large set
        // of different labels (one for each _elseif
        // clause). This macro uses the elseLbl
        // value, along with the value of "base" above,
        // to generate these unique labels.

        ?elseLbl := 0;

        // hasElse determines if we have an _else clause
        // present in this statement. This macro uses
        // this value to determine if it must emit a
        // final else label when it encounters _endif.

        ?hasElse := false;

        // For an IF statement, we must evaluate the
        // boolean expression and jump to the current
        // else label if the expression evaluates false.

        jf( ifexpr ) genLabel( base, elseLbl );

        // Just ignore the _then keyword.
        // A slightly better implementation would require
        // this keyword; the current implementation lets
        // you write an "_if" clause without the "_then"
        // clause. For that matter, the current implementation
        // lets you arbitrarily sprinkle "_then" clauses
        // throughout the "_if" statement; we will ignore
        // this for this example.

      #keyword _then;

        // Handle the "_elseif" clause here.

      #keyword _elseif(elsex);

        // _elseif clauses are illegal after
        // an _else clause in the statement.
        // Enforce that here.

        #if( hasElse )

            #error( "Unexpected '_elseif' clause" )

        #endif

        // We've just finished the "_if" clause
        // or a previous "_elseif" clause. So
        // the first thing we have to do is jump
        // to the code just beyond this "_if"
        // statement.

        jmp ifDone;

        // Okay, this is where the previous "_if" or
        // "_elseif" statement must jump if its boolean
        // expression evaluates false. Emit the target
        // label. Next, because we're about to jump
        // to our own target label, bump up the elseLbl
        // value by one to prevent jumping back to the
        // label we're about to emit. Finally, emit
        // the code that tests the boolean expression and
        // transfers control to the next _elseif or _else
        // clause if the result is false.

        genLabel( base, elseLbl ):
        ?elseLbl := elseLbl+1;
        jf(elsex) genLabel( base, elseLbl );

      #keyword _else;

        // Only allow a single "_else" clause in this
        // "_if" statement:

        #if( hasElse )

            #error( "Unexpected '_else' clause" )

        #endif

        // As above, we've just finished the previous "_if"
        // or "_elseif" clause, so jump directly to the end
        // of the "_if" statement.

        jmp ifDone;

        // Okay, emit the current 'else' label so that
        // the failure of the previous "_if" or "_elseif"
        // test will transfer control here. Also set
        // 'hasElse' to true to catch additional "_elseif"
        // and "_else" clauses.

        genLabel( base, elseLbl ):
        ?hasElse := true;

      #terminator _endif;

        // At the end of the _if statement we must emit the
        // destination label that the _if and _elseif sections
        // jump to. Also, if there was no _else section, this
        // code has to emit the last deployed else label.

        ifDone:
        #if( !hasElse )

            genLabel( base, elseLbl ):

        #endif

    #endmacro;

    begin IFDemo;

        // Quick demo of the use of the above statements.

        for( mov( 0, eax ); eax < 5; inc( eax )) do

            _if( eax = 0 ) _then

                stdout.put( "in _if statement" nl );

            _elseif( eax = 1 ) _then

                stdout.put( "in first _elseif clause" nl );

            _elseif( eax = 2 ) _then

                stdout.put( "in second _elseif clause" nl );

            _else

                stdout.put( "in _else clause" nl );
                _if( eax > 3 ) _then

                    stdout.put( "in second _if statement" nl );

                _endif;

            _endif;

        endfor;

    end IFDemo;

Program 9.3 Macro Implementation of the IF..ENDIF Statement
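To see how the label counter threads through the clauses, the fragment below traces by hand what the outer _if statement in Program 9.3 expands to. It is an approximation for illustration only: "_0A40_" stands in for the unique string stored in base and "_0A3E_" for the local symbol ifDone, and both names are made up, since the actual identifiers depend on what HLA generates.

```hla
// Fragment only: hand expansion of the outer _if in Program 9.3.
// "_0A40_" (base) and "_0A3E_" (ifDone) are hypothetical names.

    // _if( eax = 0 ) _then            elseLbl = 0

    jf( eax = 0 ) _0A40_0;
        stdout.put( "in _if statement" nl );

    // _elseif( eax = 1 ) _then        elseLbl becomes 1

    jmp _0A3E_;
    _0A40_0:
    jf( eax = 1 ) _0A40_1;
        stdout.put( "in first _elseif clause" nl );

    // _elseif( eax = 2 ) _then        elseLbl becomes 2

    jmp _0A3E_;
    _0A40_1:
    jf( eax = 2 ) _0A40_2;
        stdout.put( "in second _elseif clause" nl );

    // _else                           hasElse becomes true

    jmp _0A3E_;
    _0A40_2:
        stdout.put( "in _else clause" nl );

        // (The nested _if.._endif expands here with its
        // own, different, generated labels.)

    // _endif                          hasElse is true, so only
    //                                 the ifDone label appears

    _0A3E_:
```

Each clause defines the label that the previous test jumps to and then references the next label in the series, so every JF has exactly one matching definition by the time _endif is reached.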