Chapter 8 - Structures

Defining Structures
Creating Instances of Structures
- Heap Instances
Nested Structure Definitions
Multiform Structures
Structure of a Matrix Parameter Field

A "structure" is an object that groups different types of data according to a template designed by the programmer, and allows the programmer to designate names that can be used to store and retrieve the data items in the structure. While arrays hold data that is all one size and refer to the data elements using index numbers, structures let you group data of different sizes and refer to them using names of your choice. The structure defining words declare the type of the data (integer, real, address, extended address, etc.) and this enhances the clarity of the program.

Using structures involves defining them, instantiating them, and addressing their elements, which are called "members". Defining a structure can be thought of as setting up a template for the data. The template comprises a set of named offsets that specify the position of each member relative to the base of the structure. Instantiating a structure creates an instance of the structure and assigns it to a particular named memory location, either in the variable area, the definitions area of the dictionary, or the heap. Addressing a member in a structure is accomplished by stating the name of the structure instance followed by the name of the structure member. Executing the name of the structure instance leaves its base xaddress on the stack, and then executing the name of the structure member automatically adds an offset to the base xaddress to produce the extended address of the member. Standard fetch and store operations can then be used to access the member.

Defining Structures

To define a structure, execute STRUCTURE.BEGIN: followed by the name of the structure. Then state the appropriate member defining commands followed by member names that you choose. STRUCTURE.END finishes the definition.

For example, the following commands create a structure to hold customer information:

STRUCTURE.BEGIN: CUSTOMER.INFO    \ name the structure as a whole
INT->    +ACCOUNT#    \ now name each of the members
40 STRING->    +CUSTOMER.NAME
50 STRING->    +STREET
35 STRING->    +CITY&STATE
DOUBLE->    +ZIP.CODE
REAL->    +ACCT.BALANCE
STRUCTURE.END    \ end the structure definition

This creates a template for the data. If you try this example on your computer, you will notice that while the structure is being compiled, items are temporarily placed on the data stack by the compiler. These stack items should not be altered; they are used to initialize the size of the structure and the offset of the members.

The member defining word INT-> removes the next name from the input stream, +ACCOUNT#, and creates a header in the name area of the dictionary. The action of +ACCOUNT# is to add an offset to an extended address on the top of the stack. Because +ACCOUNT# is the first member in the structure, it adds an offset of 0. (Actually, members with offsets of 0 are smart enough to do nothing at run time, thereby saving execution time.) Note that the member defining words end with -> to suggest that they are defining the name that follows. Note also that we start each member name with + to suggest that it adds an offset to the quantity on the stack. The defining word STRING-> creates a header and action for +CUSTOMER.NAME and, based on the quantity on the stack, reserves space in the structure for the string. It reserves 1 byte more than the number on the stack to allow for a count byte at the start of the string. In the example structure two more string members are defined, then a double number field is reserved for the zip code, and a floating point field is reserved for the customer's account balance.

Executing CUSTOMER.INFO leaves the size of the entire structure (in bytes) on the stack; this structure takes 138 bytes.
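The 138-byte total can be checked by adding up the member sizes given in the definition. The tally below is a worked sketch written as Forth comments; the offsets are derived here from the stated member sizes, assuming no padding between members (which is consistent with the 138-byte total), and are not quoted from the kernel documentation.

```forth
\ Offsets implied by the CUSTOMER.INFO definition above
\ member          offset  size
\ +ACCOUNT#          0      2
\ +CUSTOMER.NAME     2     41   ( 40 characters + 1 count byte )
\ +STREET           43     51   ( 50 characters + 1 count byte )
\ +CITY&STATE       94     36   ( 35 characters + 1 count byte )
\ +ZIP.CODE        130      4
\ +ACCT.BALANCE    134      4
\ total                   138
CUSTOMER.INFO .    \ prints the structure size in bytes
```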

Here is a list of the available member defining words:

BYTE->	defines a single byte item
n BYTES->	defines an item of n bytes in size
INT->	defines a 2-byte (16-bit) integer
n INTS->	defines n 2-byte integer numbers
DOUBLE->	defines a 4-byte double number
n DOUBLES->	defines a collection of n 4-byte double numbers
REAL->	defines a 4-byte floating point number
n REALS->	defines a collection of n 4-byte floating point numbers
ADDR->	defines a 2-byte (16-bit) address
n ADDRS->	defines n 2-byte addresses
PAGE->	defines a 2-byte page; only the lower order byte is significant
XADDR->	defines a 4-byte extended address
n XADDRS->	defines a collection of n 4-byte extended addresses
XHNDL->	defines a 4-byte xaddr whose contents are a handle
n STRING->	defines a counted string n+1 bytes long, n <= 255
n STRUCT->	defines a member word with n bytes reserved
n1 n2 STRUCTS->	defines a collection of n1 structures each n2 bytes in size

Most of these words call the kernel word FIELD which creates a named offset that, when executed, calls XU+ to add the offset to the extended address on the top of the stack.
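As a rough illustration of what a member word does at run time, a member with a nonzero offset behaves much like a colon definition that pushes its offset and calls XU+. In the sketch below the name MY+STREET is hypothetical, the offset 43 is computed from the member sizes in CUSTOMER.INFO (2 + 41), and the stack picture for XU+ (an xaddress with an unsigned offset above it) is an assumption based on the description above. It is not the kernel's actual implementation of FIELD.

```forth
\ Hypothetical hand-written equivalent of +STREET in CUSTOMER.INFO.
\ Assumes XU+ takes an xaddress with an unsigned offset above it.
: MY+STREET   ( xaddr1 -- xaddr2 )
    43 XU+    \ skip 2 bytes of +ACCOUNT# and 41 bytes of +CUSTOMER.NAME
;
```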

Creating Instances of Structures

To create a structure instance in the variable area for a particular customer named BURGER.WORLD, use V.INSTANCE: as follows:

CUSTOMER.INFO V.INSTANCE: BURGER.WORLD

CUSTOMER.INFO leaves the size of the structure on the stack. V.INSTANCE: (where the "V" indicates the variable area) reserves room in the variable area for the structure. It creates a definition for BURGER.WORLD that, when executed, leaves the extended address of the structure instance on the stack.

To initialize BURGER.WORLD's account number to 1234, we can execute

1234 BURGER.WORLD +ACCOUNT# !

BURGER.WORLD leaves the extended address of the structure instance on the stack, +ACCOUNT# adds its offset to yield the extended address of the member, and ! saves the value 1234 into the specified member. Similarly, CMOVE can be used to initialize the name, street, and city strings, 2! can initialize the double number zip code, and F! can set the floating point account balance.
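The numeric members can be exercised with matching store and fetch words, as in the sketch below. Several details are assumptions rather than facts taken from this chapter: that DIN parses the next number as a double, that a number containing a decimal point is parsed as a real, and that 2@, D., F@ and F. are the fetch and print counterparts of 2! and F!. The zip code and balance values are made up for illustration.

```forth
\ Store and read back the integer account number.
1234 BURGER.WORLD +ACCOUNT# !
BURGER.WORLD +ACCOUNT# @ .

\ Store and read back the 4-byte zip code as a double number.
\ DIN is assumed to parse the next number as a double.
DIN 94560 BURGER.WORLD +ZIP.CODE 2!
BURGER.WORLD +ZIP.CODE 2@ D.

\ Store and read back the floating point balance.
\ A number with a decimal point is assumed to be parsed as a real.
150.25 BURGER.WORLD +ACCT.BALANCE F!
BURGER.WORLD +ACCT.BALANCE F@ F.
```

Each line follows the same pattern: the instance name supplies the base xaddress, the member name adds its offset, and the fetch or store word matches the member's declared type.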

If we want a structure to eventually reside in write-protected or PROM memory in the definitions area instead of in the RAM variable area, we execute

CUSTOMER.INFO D.INSTANCE: MARTHA'S.PIES

where the "D" in D.INSTANCE: means that space for the structure is reserved in the definitions area.

Executing the word SIZE.OF followed by the name of a structure instance leaves the size of the instance on the stack. For example

SIZE.OF BURGER.WORLD .↓ ok 138

which is the size of the CUSTOMER.INFO structure.

Note that structure instances are not allowed to cross page boundaries, so executing V.INSTANCE: or D.INSTANCE: may cause the variable pointer VP or the dictionary pointer DP, respectively, to be advanced to a new page.

Heap Instances

It is sometimes useful to instantiate structures in the heap, and this involves a slightly different procedure. To create a named instance of the structure, execute

CUSTOMER.INFO H.INSTANCE: JOE'S.PIZZA

Like the previous commands, this creates a definition for JOE'S.PIZZA that, when executed, leaves the base address of the instance on the stack, and SIZE.OF can be used to retrieve the size of the heap instance. However, unlike the previous commands, no memory has yet been assigned. Instead, a parameter field for JOE'S.PIZZA has been created in the variable area, and the size of the structure has been saved with the definition of JOE'S.PIZZA. To allocate memory in the heap, we can execute

SIZE.OF JOE'S.PIZZA ' JOE'S.PIZZA ALLOCATED

The word SIZE.OF removes the next word from the input stream, examines its definition to recover the size (which was put there by H.INSTANCE:) and leaves the size on the stack. ' JOE'S.PIZZA leaves the extended parameter field address on the stack, and ALLOCATED converts the size to a double number and calls FROM.HEAP to allocate memory for the structure. Note that we could have replaced the command SIZE.OF JOE'S.PIZZA with CUSTOMER.INFO, as both leave the size of the structure on the stack.

To de-allocate the heap instance, simply execute

' JOE'S.PIZZA DEALLOCATED

which returns the memory to the heap manager and clears the parameter field. The definition of the structure remains intact for future use.
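Put together, the life cycle of a heap instance might look like the following sketch. It uses only the words introduced above and assumes a heap has already been set up and that the allocation succeeds; the account number 5678 is made up for illustration.

```forth
\ Create the named heap instance; no heap memory is assigned yet.
CUSTOMER.INFO H.INSTANCE: JOE'S.PIZZA

\ Allocate heap memory of the right size for the instance.
SIZE.OF JOE'S.PIZZA ' JOE'S.PIZZA ALLOCATED

\ Use the instance exactly like a variable-area instance.
5678 JOE'S.PIZZA +ACCOUNT# !
JOE'S.PIZZA +ACCOUNT# @ .

\ Return the memory to the heap manager when finished.
' JOE'S.PIZZA DEALLOCATED
```

After DEALLOCATED the members should not be accessed until the instance has been allocated again.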

Nested Structure Definitions

Structures can be nested. For example, the following is a valid structure definition:

STRUCTURE.BEGIN: ADDRESS.LIST
    BYTE->    +FLAG
    STRUCTURE.BEGIN: SUB
        20 STRING->    +NAME
        20 STRING->    +ADDRESS
        INT->    +ID#
    STRUCTURE.END
    3 SUB STRUCTS->    +NAME&ADDRESSES
STRUCTURE.END

Embedded within the definition of ADDRESS.LIST is another structure definition for SUB. The member defining word STRUCTS-> expects on the stack the number of structures and the size of each structure. It names a member with the appropriate offset, and reserves space for the sub-structures in the main structure. Strict hierarchy must be maintained, with each STRUCTURE.END matching its corresponding STRUCTURE.BEGIN: command.

Note that SUB could also have been defined outside of ADDRESS.LIST. The following pair of structures behave identically to those defined above:

STRUCTURE.BEGIN: SUB
    20 STRING->    +NAME
    20 STRING->    +ADDRESS
    INT->        +ID#
STRUCTURE.END

STRUCTURE.BEGIN: ADDRESS.LIST
    BYTE->    +FLAG
    3 SUB STRUCTS->    +NAME&ADDRESSES
STRUCTURE.END

If the base xaddress of a particular instance of ADDRESS.LIST is on the stack and we wish to fetch and print the ID# in the first SUB structure, we execute

( base.xaddr -- ) +NAME&ADDRESSES +ID# @ .

That is, +NAME&ADDRESSES adjusts the base address to point to the first SUB structure, and +ID# further adjusts the base address to point to the identification number field within the SUB structure.
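To reach a SUB structure other than the first, the size left by SUB can be used as a stride. The sketch below is one way this might be written; NTH.SUB and MY.LIST are hypothetical names introduced for illustration, the index is zero-based, and the stack picture for XU+ (an xaddress with an unsigned offset above it) is an assumption.

```forth
\ NTH.SUB is a hypothetical helper, not a kernel word.
\ Given the base xaddress of an ADDRESS.LIST instance and a zero-based
\ index n, it leaves the xaddress of the nth SUB structure.
: NTH.SUB   ( base.xaddr n -- sub.xaddr )
    SUB *              \ byte offset of the nth SUB within the collection
    >R                 \ set the offset aside on the return stack
    +NAME&ADDRESSES    \ xaddress of the first SUB structure
    R> XU+             \ step forward to the nth SUB structure
;

ADDRESS.LIST V.INSTANCE: MY.LIST    \ MY.LIST is a hypothetical instance

77 MY.LIST 1 NTH.SUB +ID# !         \ store an ID# in the second SUB
MY.LIST 1 NTH.SUB +ID# @ .          \ fetch and print it
```

No bounds checking is done, so the caller is responsible for keeping n below the number of SUB structures reserved by STRUCTS->.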

Multiform Structures

Multiform structures (called "unions" in C and "variant records" in Pascal) are used to define structures whose members may hold different types of data at different times, or whose data may be referred to in several equivalent forms.

For example, the following structure describes the first 6 bytes of the parameter field of a heap object:

STRUCTURE.BEGIN: HEAP.STRUCTURE.PF
TYPE.OF:
    XHNDL->    +xHANDLE    \ xaddress of the handle to the heap block
                \ containing the data
OR.TYPE.OF:
     PAGE->    +HNDL.PAGE    \ handle can be referred to as page...
     ADDR->    +HNDL.ADDR    \ ... and 16-bit address
OR.TYPE.OF:
     PAGE->    +HEAP.PAGE    \ page of handle is also page of heap
TYPE.END
     ADDR->    +CURRENT.HEAP    \ address of the heap used. Its page
                \ is the same as the handle's page.
STRUCTURE.END

The TYPE.OF: ... OR.TYPE.OF: ... TYPE.END syntax is used to define the multiform structure, which in this case allows us to refer to a 32-bit quantity in one of three ways, depending on the context of its use.
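The overlay can be made concrete by listing the offsets each member name would add. The table below is a sketch derived from the member sizes listed earlier, assuming that every variant starts at the same offset, that the member following TYPE.END starts after the largest variant, and that there is no padding. Under those assumptions the total agrees with the 6 bytes mentioned above.

```forth
\ Offsets implied by HEAP.STRUCTURE.PF; the three variants overlay the
\ same 4 bytes. Assumes no padding between members.
\ member          offset  size
\ +xHANDLE           0      4
\ +HNDL.PAGE         0      2
\ +HNDL.ADDR         2      2
\ +HEAP.PAGE         0      2
\ +CURRENT.HEAP      4      2
HEAP.STRUCTURE.PF .    \ prints the total size in bytes
```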

Structure of a Matrix Parameter Field

The following structure defines the layout of a matrix parameter field:

STRUCTURE.BEGIN: MATRIX.PF
    HEAP.STRUCTURE.PF
    STRUCT->    +HEAP.STRUCTURE.PF    \ xHandle and end.heap
    INT->    +ELEMENT.SIZE        \ #Bytes per element
    INT->    +DIMENSIONS        \ #dimensions
    INT->    +#COLS            \ #columns
    INT->    +#ROWS            \ #rows
STRUCTURE.END

Note that HEAP.STRUCTURE.PF appears as a sub-structure. MATRIX.PF and its analog, ARRAY.PF, are kernel words that leave the size of their respective parameter fields on the stack. They are useful when creating stack frames to hold the parameter fields of temporary matrices and arrays. This is a key technique for writing re-entrant code, as explained in the next chapter.
