4.4 Characters

Perhaps the most important data type on a personal computer is the character data type. The term "character" refers to a human or machine readable symbol that is typically a non-numeric entity. In general, the term "character" refers to any symbol that you can normally type on a keyboard (including some symbols that may require multiple key presses to produce) or display on a video display. Many beginners confuse the terms "character" and "alphabetic character." These terms are not the same. Punctuation symbols, numeric digits, spaces, tabs, carriage returns (enter), other control characters, and other special symbols are also characters. When this text uses the term "character" it refers to any of these characters, not just the alphabetic characters. When this text refers to alphabetic characters, it will use phrases like "alphabetic characters," "upper case characters," or "lower case characters."1

Another common problem beginners have when they first encounter the character data type is differentiating between numeric characters and numbers. The character `1' is distinct and different from the value one. The computer (generally) uses two different internal, binary, representations for numeric characters (`0', `1', ..., `9') versus the numeric values zero through nine. You must take care not to confuse the two.

Most computer systems use a one or two byte sequence to encode the various characters in binary form. Windows and Linux certainly fall into this category, using either the ASCII or Unicode encodings for characters. This section will discuss the ASCII character set and the character declaration facilities that HLA provides.

4.4.1 The ASCII Character Encoding

The ASCII (American Standard Code for Information Interchange) Character set maps 128 textual characters to the unsigned integer values 0..127 ($0..$7F). Internally, of course, the computer represents everything using binary numbers; so it should come as no surprise that the computer also uses binary values to represent non-numeric entities such as characters. Although the exact mapping of characters to numeric values is arbitrary and unimportant, it is important to use a standardized code for this mapping since you will need to communicate with other programs and peripheral devices and you need to talk the same "language" as these other programs and devices. This is where the ASCII code comes into play; it is a standardized code that nearly everyone has agreed upon. Therefore, if you use the ASCII code 65 to represent the character "A" then you know that some peripheral device (such as a printer) will correctly interpret this value as the character "A" whenever you transmit data to that device.

You should not get the impression that ASCII is the only character set in use on computer systems. IBM uses the EBCDIC character set family on many of its mainframe computer systems. Another common character set in use is the Unicode character set. Unicode is an extension to the ASCII character set that uses 16 bits rather than seven to represent characters. This allows the use of 65,536 different characters in the character set, allowing the inclusion of most symbols in the world's different languages into a single unified character set.

Since the ASCII character set provides only 128 different characters and a byte can represent 256 different values, an interesting question arises: "what do we do with the values 128..255 that one could store into a byte value when working with character data?" One answer is to ignore those extra values. That will be the primary approach of this text. Another possibility is to extend the ASCII character set and add an additional 128 characters to the character set. Of course, this would tend to defeat the whole purpose of having a standardized character set unless you could get everyone to agree upon the extensions. That is a difficult task.

When IBM first created their IBM-PC, they defined these extra 128 character codes to contain various non-English alphabetic characters, some line drawing graphics characters, some mathematical symbols, and several other special characters. Since IBM's PC was the foundation for what we typically call a PC today, that character set has become a pseudo-standard on all IBM-PC compatible machines. Even on modern machines, which are not IBM-PC compatible and cannot run early PC software, the IBM extended character set still survives. Note, however, that this PC character set (an extension of the ASCII character set) is not universal. Most printers will not print the extended characters when using native fonts and many programs (particularly in non-English countries) do not use those characters for the upper 128 codes in an eight-bit value. For these reasons, this text will generally stick to the standard 128 character ASCII character set. However, a few examples and programs in this text will use the IBM PC extended character set, particularly the line drawing graphic characters (see Appendix B).

Should you need to exchange data with other machines which are not PC-compatible, you have only two alternatives: stick to standard ASCII or ensure that the target machine supports the extended IBM-PC character set. Some machines, like the Apple Macintosh, do not provide native support for the extended IBM-PC character set; however you may obtain a PC font which lets you display the extended character set. Other machines have similar capabilities. However, the 128 characters in the standard ASCII character set are the only ones you should count on transferring from system to system.

Despite the fact that it is a "standard", simply encoding your data using standard ASCII characters does not guarantee compatibility across systems. While it's true that an "A" on one machine is most likely an "A" on another machine, there is very little standardization across machines with respect to the use of the control characters. Indeed, of the 32 control codes plus delete, there are only four control codes commonly supported - backspace (BS), tab, carriage return (CR), and line feed (LF). Worse still, different machines often use these control codes in different ways. End of line is a particularly troublesome example. Windows, MS-DOS, CP/M, and other systems mark end of line by the two-character sequence CR/LF. Apple Macintosh, and many other systems mark the end of line by a single CR character. Linux, BeOS, and other UNIX systems mark the end of a line with a single LF character. Needless to say, attempting to exchange simple text files between such systems can be an experience in frustration. Even if you use standard ASCII characters in all your files on these systems, you will still need to convert the data when exchanging files between them. Fortunately, such conversions are rather simple.

Despite some major shortcomings, ASCII data is the standard for data interchange across computer systems and programs. Most programs can accept ASCII data; likewise most programs can produce ASCII data. Since you will be dealing with ASCII characters in assembly language, it would be wise to study the layout of the character set and memorize a few key ASCII codes (e.g., "0", "A", "a", etc.).

The ASCII character set (excluding the extended characters defined by IBM) is divided into four groups of 32 characters. The first 32 characters, ASCII codes 0 through $1F (31), form a special set of non-printing characters called the control characters. We call them control characters because they perform various printer/display control operations rather than displaying symbols. Examples include carriage return, which positions the cursor to the left side of the current line of characters2, line feed (which moves the cursor down one line on the output device), and back space (which moves the cursor back one position to the left). Unfortunately, different control characters perform different operations on different output devices. There is very little standardization among output devices. To find out exactly how a control character affects a particular device, you will need to consult its manual.

The second group of 32 ASCII character codes comprises various punctuation symbols, special characters, and the numeric digits. The most notable characters in this group include the space character (ASCII code $20) and the numeric digits (ASCII codes $30..$39). Note that the numeric digits differ from their numeric values only in the H.O. nibble. By subtracting $30 from the ASCII code for any particular digit you can obtain the numeric equivalent of that digit.
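
For example, the following fragment (a minimal sketch assuming the digit character is already in the AL register) converts the character '7' to the value seven:

		mov( '7', al );		// AL = $37, the ASCII code for '7'
		sub( $30, al );		// AL = 7, the numeric value of the digit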

The third group of 32 ASCII characters contains the upper case alphabetic characters. The ASCII codes for the characters "A".."Z" lie in the range $41..$5A (65..90). Since there are only 26 different alphabetic characters, the remaining six codes hold various special symbols.

The fourth, and final, group of 32 ASCII character codes represent the lower case alphabetic symbols, five additional special symbols, and another control character (delete). Note that the lower case character symbols use the ASCII codes $61..$7A. If you convert the codes for the upper and lower case characters to binary, you will notice that the upper case symbols differ from their lower case equivalents in exactly one bit position. For example, consider the character code for "E" and "e" in the following figure:



Figure 4.6 ASCII Codes for "E" ($45 = %0100_0101) and "e" ($65 = %0110_0101)

The only place these two codes differ is in bit five. Upper case characters always contain a zero in bit five; lower case alphabetic characters always contain a one in bit five. You can use this fact to quickly convert between upper and lower case. If you have an upper case character you can force it to lower case by setting bit five to one. If you have a lower case character and you wish to force it to upper case, you can do so by setting bit five to zero. You can toggle an alphabetic character between upper and lower case by simply inverting bit five.
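
For example, assuming the alphabetic character to convert is sitting in the AL register, each of the following instructions performs one of these conversions (a minimal sketch; the masks simply set, clear, or invert bit five):

		or( %0010_0000, al );		// force lower case (set bit five)
		and( %1101_1111, al );		// force upper case (clear bit five)
		xor( %0010_0000, al );		// toggle between upper and lower case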

Indeed, bits five and six determine which of the four groups in the ASCII character set you're in:

Table 9: ASCII Groups

    Bit 6   Bit 5   Group
    -----   -----   ----------------------
      0       0     Control Characters
      0       1     Digits & Punctuation
      1       0     Upper Case & Special
      1       1     Lower Case & Special

So you could, for instance, convert any upper or lower case (or corresponding special) character to its equivalent control character by setting bits five and six to zero.
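
For instance, the following sketch (assuming the character is in AL) converts 'C' into the ctrl-C code by clearing bits five and six:

		mov( 'C', al );			// AL = $43 = %0100_0011
		and( %1001_1111, al );		// clear bits five and six: AL = $03 (ctrl C)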

Consider, for a moment, the ASCII codes of the numeric digit characters:

Table 10: ASCII Codes for Numeric Digits

    Character   Decimal   Hexadecimal
    ---------   -------   -----------
       "0"        48         $30
       "1"        49         $31
       "2"        50         $32
       "3"        51         $33
       "4"        52         $34
       "5"        53         $35
       "6"        54         $36
       "7"        55         $37
       "8"        56         $38
       "9"        57         $39

The decimal representations of these ASCII codes are not very enlightening. However, the hexadecimal representation of these ASCII codes reveals something very important - the L.O. nibble of the ASCII code is the binary equivalent of the represented number. By stripping away (i.e., setting to zero) the H.O. nibble of a numeric character, you can convert that character code to the corresponding binary representation. Conversely, you can convert a binary value in the range 0..9 to its ASCII character representation by simply setting the H.O. nibble to three. Note that you can use the logical-AND operation to force the H.O. bits to zero; likewise, you can use the logical-OR operation to force the H.O. bits to %0011 (three).
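
For example, the following fragment (a sketch assuming the digit character is in AL) performs both conversions:

		mov( '5', al );		// AL = $35, the ASCII code for '5'
		and( $0F, al );		// strip the H.O. nibble: AL = 5
		or( $30, al );		// set the H.O. nibble to %0011: AL = $35 = '5' again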

Note that you cannot convert a string of numeric characters to their equivalent binary representation by simply stripping the H.O. nibble from each digit in the string. Converting 123 ($31  $32  $33) in this fashion yields three bytes: $010203, not the correct value which is $7B. Converting a string of digits to an integer requires more sophistication than this; the conversion above works only for single digits.
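
To give you a feel for what the real conversion involves, here is a rough sketch of the usual "multiply the running value by ten and add the next digit" scheme. It assumes each digit character arrives in BL (most significant digit first) and uses the intmul and add instructions, which later chapters describe in detail:

		mov( 0, eax );			// running value

		// for each digit character, most significant first:
		mov( '1', bl );			// current digit character
		and( $F, ebx );			// strip the H.O. nibble (and clear the rest of EBX): EBX = 1
		intmul( 10, eax );		// value := value * 10
		add( ebx, eax );		// value := value + digit

		// repeating these four steps with '2' and then '3' leaves 123 ($7B) in EAX.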

4.4.2 HLA Support for ASCII Characters

Although you could easily store character values in byte variables and use the corresponding numeric equivalent ASCII code when using a character literal in your program, such agony is unnecessary - HLA provides good support for character variables and literals in your assembly language programs.

Character literal constants in HLA take one of two forms: a single character surrounded by apostrophes or a pound symbol ("#") followed by a numeric constant in the range 0..127 specifying the ASCII code of the character. Here are some examples:

		'A'		#65		#$41		#%0100_0001

Note that these examples all represent the same character (`A') since the ASCII code of `A' is 65.

With a single exception, only a single character may appear between the apostrophes in a literal character constant. That single exception is the apostrophe character itself. If you wish to create an apostrophe literal constant, place four apostrophes in a row (i.e., double up the apostrophe inside the surrounding apostrophes):

''''
 

 

The pound sign operator ("#") must precede a legal HLA numeric constant (either decimal, hexadecimal or binary as the examples above indicate). In particular, the pound sign is not a generic character conversion function; it cannot precede registers or variable names, only constants. As a general rule, you should always use the apostrophe form of the character literal constant for graphic characters (that is, those that are printable or displayable). Use the pound sign form for control characters (that are invisible, or do funny things when you print them) or for extended ASCII characters that may not display or print properly within your source code.
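
For example, the following constant declarations (a hypothetical sketch; the names are arbitrary and the declarations would appear in a program's const section) follow this rule of thumb:

const
	Bell    := #7;		// control character, so use the pound sign form
	Esc     := #$1B;	// hexadecimal ASCII codes work here as well
	LetterA := 'A';		// printable character, so use the apostrophe form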

Notice the difference between a character literal constant and a string literal constant in your programs. Strings are sequences of zero or more characters surrounded by quotation marks; character constants are surrounded by apostrophes. It is especially important to realize that

'A' ≠ "A"
 

The character constant `A' and the string containing the single character "A" have two completely different internal representations. If you attempt to use a string containing a single character where HLA expects a character constant, HLA will report an error. Strings and string constants will be the subject of a later chapter.

To declare a character variable in an HLA program, you use the char data type. The following declaration, for example, demonstrates how to declare a variable named UserInput:

static
	UserInput: char;

This declaration reserves one byte of storage that you could use to store any character value (including eight-bit extended ASCII characters). You can also initialize character variables as the following example demonstrates:

static
	TheCharA: char := 'A';
	ExtendedChar: char := #128;

Since character variables are eight-bit objects, you can manipulate them using eight-bit registers. You can move character variables into eight-bit registers and you can store the value of an eight-bit register into a character variable.
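
For example, assuming the declarations given earlier, the following instructions copy TheCharA into AL and then store AL into UserInput:

		mov( TheCharA, al );		// load a character variable into an eight-bit register
		mov( al, UserInput );		// store an eight-bit register into a character variable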

The HLA Standard Library provides a handful of routines that you can use for character I/O and manipulation; these include stdout.putc, stdout.putcSize, stdout.put, stdin.getc, and stdin.get.

The stdout.putc routine uses the following calling sequence:

					stdout.putc( chvar );
 

 

This procedure outputs the single character parameter passed to it as a character to the standard output device. The parameter may be any char constant or variable, or a byte variable or register3.
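
For example, each of the following calls (a short sketch; TheCharA is the char variable declared earlier) writes a single character to the standard output:

		stdout.putc( 'A' );		// character literal constant
		stdout.putc( TheCharA );	// char variable
		stdout.putc( #7 );		// control character (sounds the bell on many consoles)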

The stdout.putcSize routine provides output width control when displaying character variables. The calling sequence for this procedure is

			stdout.putcSize( charvar, widthInt32, fillchar );
 

 

This routine prints the specified character (the charvar parameter) using at least width print positions4. If the absolute value of width is greater than one, then stdout.putcSize prints the fill character as padding. If the value of width is positive, then stdout.putcSize prints the character right justified in the print field; if width is negative, then stdout.putcSize prints the character left justified in the print field. Since character output is usually left justified in a field, the width value will normally be negative for this call. The space character is the most common fill value.
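
For example, assuming c is a char variable, the following calls (a sketch) print c left justified in a ten-position field padded with spaces, and then right justified in a ten-position field padded with periods:

		stdout.putcSize( c, -10, ' ' );		// left justified, space fill
		stdout.putcSize( c, 10, '.' );		// right justified, period fill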

You can also print character values using the generic stdout.put routine. If a character variable appears in the stdout.put parameter list, then stdout.put will automatically print it as a character value, e.g.,

	stdout.put( "Character c = `", c, "`", nl );
 

 

You can read characters from the standard input using the stdin.getc and stdin.get routines. The stdin.getc routine does not have any parameters. It reads a single character from the standard input buffer and returns this character in the AL register. You may then store the character value away or otherwise manipulate the character in the AL register. The following program reads a single character from the user, converts it to upper case if it is a lower case character, and then displays the character:


 
program charInputDemo;
#include( "stdlib.hhf" );

static
    c: char;

begin charInputDemo;

    stdout.put( "Enter a character: " );
    stdin.getc();
    if( al >= 'a' ) then

        if( al <= 'z' ) then

            and( $5f, al );

        endif;

    endif;
    stdout.put
    (
        "The character you entered, possibly ", nl,
        "converted to upper case, was '"
    );
    stdout.putc( al );
    stdout.put( "'", nl );

end charInputDemo;


Program 4.1	 Character Input Sample

You can also use the generic stdin.get routine to read character variables from the user. If a stdin.get parameter is a character variable, then the stdin.get routine will read a character from the user and store the character value into the specified variable. Here is the program above rewritten to use the stdin.get routine:


 
program charInputDemo2;
#include( "stdlib.hhf" );

static
    c: char;

begin charInputDemo2;

    stdout.put( "Enter a character: " );
    stdin.get( c );
    if( c >= 'a' ) then

        if( c <= 'z' ) then

            and( $5f, c );

        endif;

    endif;
    stdout.put
    (
        "The character you entered, possibly ", nl,
        "converted to upper case, was '",
        c,
        "'", nl
    );

end charInputDemo2;


Program 4.2	 Stdin.get Character Input Sample

As you may recall from the last chapter, the HLA Standard Library buffers its input. Whenever you read a character from the standard input using stdin.getc or stdin.get, the library routines read the next available character from the buffer; if the buffer is empty, then the program reads a new line of text from the user and returns the first character from that line. If you want to guarantee that the program reads a new line of text from the user when you read a character variable, you should call the stdin.flushInput routine before attempting to read the character. This will flush the current input buffer and force the input of a new line of text on the next input (which should be your stdin.getc or stdin.get call).
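
For example, the following fragment (a sketch) guarantees that the stdin.getc call reads the first character of a freshly entered line:

		stdin.flushInput();		// discard whatever remains in the input buffer
		stdin.getc();			// reads a new line of text; first character returned in AL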

The end of line is problematic. Different operating systems handle the end of line differently on output versus input. From the console device, pressing the ENTER key signals the end of a line; however, when reading data from a file you get an end of line sequence which is typically a line feed or a carriage return/line feed pair. To help solve this problem, HLA's Standard Library provides an "end of line" function, stdin.eoln. This procedure returns true (one) in the AL register if all the current input characters have been exhausted; it returns false (zero) otherwise. The following sample program demonstrates the use of the stdin.eoln function.


 
program eolnDemo2;
#include( "stdlib.hhf" );
begin eolnDemo2;

    stdout.put( "Enter a short line of text: " );
    stdin.flushInput();
    repeat

        stdin.getc();
        stdout.putc( al );
        stdout.put( "=$", al, nl );

    until( stdin.eoln() );

end eolnDemo2;


Program 4.3	 Testing for End of Line Using Stdin.eoln

The HLA language and the HLA Standard Library provide many other procedures and additional support for character objects. Later chapters in this textbook, as well as the HLA reference documentation, describe how to use these features.

4.4.3 The ASCII Character Set

The following table lists the binary, hexadecimal, and decimal representations for each of the 128 ASCII character codes.

Table 11: ASCII Character Set

    Binary      Hex   Decimal   Character
    ---------   ---   -------   ------------
    0000_0000   00       0      NULL
    0000_0001   01       1      ctrl A
    0000_0010   02       2      ctrl B
    0000_0011   03       3      ctrl C
    0000_0100   04       4      ctrl D
    0000_0101   05       5      ctrl E
    0000_0110   06       6      ctrl F
    0000_0111   07       7      bell
    0000_1000   08       8      backspace
    0000_1001   09       9      tab
    0000_1010   0A      10      line feed
    0000_1011   0B      11      ctrl K
    0000_1100   0C      12      form feed
    0000_1101   0D      13      return
    0000_1110   0E      14      ctrl N
    0000_1111   0F      15      ctrl O
    0001_0000   10      16      ctrl P
    0001_0001   11      17      ctrl Q
    0001_0010   12      18      ctrl R
    0001_0011   13      19      ctrl S
    0001_0100   14      20      ctrl T
    0001_0101   15      21      ctrl U
    0001_0110   16      22      ctrl V
    0001_0111   17      23      ctrl W
    0001_1000   18      24      ctrl X
    0001_1001   19      25      ctrl Y
    0001_1010   1A      26      ctrl Z
    0001_1011   1B      27      Esc (ctrl [)
    0001_1100   1C      28      ctrl \
    0001_1101   1D      29      ctrl ]
    0001_1110   1E      30      ctrl ^
    0001_1111   1F      31      ctrl _
    0010_0000   20      32      space
    0010_0001   21      33      !
    0010_0010   22      34      "
    0010_0011   23      35      #
    0010_0100   24      36      $
    0010_0101   25      37      %
    0010_0110   26      38      &
    0010_0111   27      39      '
    0010_1000   28      40      (
    0010_1001   29      41      )
    0010_1010   2A      42      *
    0010_1011   2B      43      +
    0010_1100   2C      44      ,
    0010_1101   2D      45      -
    0010_1110   2E      46      .
    0010_1111   2F      47      /
    0011_0000   30      48      0
    0011_0001   31      49      1
    0011_0010   32      50      2
    0011_0011   33      51      3
    0011_0100   34      52      4
    0011_0101   35      53      5
    0011_0110   36      54      6
    0011_0111   37      55      7
    0011_1000   38      56      8
    0011_1001   39      57      9
    0011_1010   3A      58      :
    0011_1011   3B      59      ;
    0011_1100   3C      60      <
    0011_1101   3D      61      =
    0011_1110   3E      62      >
    0011_1111   3F      63      ?
    0100_0000   40      64      @
    0100_0001   41      65      A
    0100_0010   42      66      B
    0100_0011   43      67      C
    0100_0100   44      68      D
    0100_0101   45      69      E
    0100_0110   46      70      F
    0100_0111   47      71      G
    0100_1000   48      72      H
    0100_1001   49      73      I
    0100_1010   4A      74      J
    0100_1011   4B      75      K
    0100_1100   4C      76      L
    0100_1101   4D      77      M
    0100_1110   4E      78      N
    0100_1111   4F      79      O
    0101_0000   50      80      P
    0101_0001   51      81      Q
    0101_0010   52      82      R
    0101_0011   53      83      S
    0101_0100   54      84      T
    0101_0101   55      85      U
    0101_0110   56      86      V
    0101_0111   57      87      W
    0101_1000   58      88      X
    0101_1001   59      89      Y
    0101_1010   5A      90      Z
    0101_1011   5B      91      [
    0101_1100   5C      92      \
    0101_1101   5D      93      ]
    0101_1110   5E      94      ^
    0101_1111   5F      95      _
    0110_0000   60      96      `
    0110_0001   61      97      a
    0110_0010   62      98      b
    0110_0011   63      99      c
    0110_0100   64     100      d
    0110_0101   65     101      e
    0110_0110   66     102      f
    0110_0111   67     103      g
    0110_1000   68     104      h
    0110_1001   69     105      i
    0110_1010   6A     106      j
    0110_1011   6B     107      k
    0110_1100   6C     108      l
    0110_1101   6D     109      m
    0110_1110   6E     110      n
    0110_1111   6F     111      o
    0111_0000   70     112      p
    0111_0001   71     113      q
    0111_0010   72     114      r
    0111_0011   73     115      s
    0111_0100   74     116      t
    0111_0101   75     117      u
    0111_0110   76     118      v
    0111_0111   77     119      w
    0111_1000   78     120      x
    0111_1001   79     121      y
    0111_1010   7A     122      z
    0111_1011   7B     123      {
    0111_1100   7C     124      |
    0111_1101   7D     125      }
    0111_1110   7E     126      ~
    0111_1111   7F     127      delete

4.5 The UNICODE Character Set

Although the ASCII character set is, unquestionably, the most popular character representation on computers, it is certainly not the only format around. For example, IBM uses the EBCDIC code on many of its mainframe and minicomputer lines. Since EBCDIC appears mainly on IBM's big iron and you'll rarely encounter it on personal computer systems, we will not consider that character set in this text. Another character representation that is becoming popular on small computer systems (and large ones, for that matter) is the Unicode character set. Unicode overcomes two of ASCII's greatest limitations: the limited character space (i.e., a maximum of 128/256 characters in an eight-bit byte) and the lack of international (beyond the USA) characters.

Unicode uses a 16-bit word to represent a single character. Therefore, Unicode supports up to 65,536 different character codes. This is obviously a huge advance over the 256 possible codes we can represent with an eight-bit byte. Unicode is upwards compatible from ASCII. Specifically, if the H.O. nine bits of a Unicode character contain zero, then the L.O. seven bits represent the same character as the ASCII character with the same character code. If the H.O. nine bits contain some non-zero value, then the character represents some other value. If you're wondering why so many different character codes are necessary, simply note that certain Asian character sets contain 4096 characters (at least, their Unicode subset).

This text will stick to the ASCII character set except for a few brief mentions of Unicode here and there. Eventually, this text may have to eliminate the discussion of ASCII in favor of Unicode since many new operating systems are using Unicode internally (and convert to ASCII as necessary). Unfortunately, many string algorithms are not as conveniently written for Unicode as for ASCII (especially character set functions) so we'll stick with ASCII in this text as long as possible.

1Upper and lower case characters are always alphabetic characters within this text.

2Historically, carriage return refers to the paper carriage used on typewriters. A carriage return consisted of physically moving the carriage all the way to the right so that the next character typed would appear at the left hand side of the paper.

3If you specify a byte variable or a byte-sized register as the parameter, the stdout.putc routine will output the character whose ASCII code appears in the variable or register.

4The only time stdout.putcSize uses more print positions than you specify is when you specify zero as the width; then this routine uses exactly one print position.

