Making a Programming Language From Scratch/Arrays

Array Declarations


Arrays are generally declared alongside simple variables. The format for array declarations is essentially an extension of the format for simple variable declarations.


[format as for simple declarations][,][name of variable]['[' or ',' or ';' or '='][if '[' size of array (constant)][']'][value of array][next declaration]


int a,b,arr[5],arr2[5]=(1,2,3,4,5);

(Note that because we have designated '{' as a terminating character, the usual '{' around the initializer list is replaced with '('. If you have not done so, you can use the brace instead.)



This algorithm continues the previous algorithm. The following steps are added to detect array variables and handle them properly.

1. Get name.
2. If the next character is '[', read characters until ']' and store them in the index variable.
3. If the next character is ',' or ';', write to the data section:
  [name] [type] [index] dup (?)
4. Otherwise, skip the '=' and the opening '(', then read characters until ')'.
Then write to the data section:
  [name] [type] [value]

The dup operator replicates a value (? for uninitialized, or 0) for every member of the array, so the members need not be initialized one by one.
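The declaration steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `emit_declarator`, the `DIRECTIVES` table, and the choice to pass each declarator without its trailing ',' or ';' are all hypothetical and not part of the original algorithm.

```python
# Assumed mapping from source types to MASM data directives (illustrative only).
DIRECTIVES = {"char": "byte", "short": "word", "int": "dword"}

def emit_declarator(ctype, decl):
    """Translate one array declarator, e.g. 'arr[5]' or 'arr2[5]=(1,2,3)'.

    The end of the string stands in for the ',' or ';' terminator.
    """
    directive = DIRECTIVES[ctype]
    pos = 0
    # Step 1: get the name.
    while pos < len(decl) and (decl[pos].isalnum() or decl[pos] == "_"):
        pos += 1
    name = decl[:pos]
    # Step 2: the size sits between '[' and ']'.
    if pos >= len(decl) or decl[pos] != "[":
        raise ValueError("not an array declarator: " + decl)
    close = decl.index("]", pos)
    size = decl[pos + 1:close].strip()
    pos = close + 1
    # Step 3: no initializer, so reserve 'size' uninitialized members.
    if pos >= len(decl):
        return f"{name} {directive} {size} dup (?)"
    # Step 4: skip '=' and '(', then read the values up to ')'.
    start = decl.index("(", pos) + 1
    values = decl[start:decl.index(")", start)]
    return f"{name} {directive} {values}"

print(emit_declarator("int", "arr[5]"))
print(emit_declarator("int", "arr2[5]=(1,2,3,4,5)"))
```

For the declaration `int a,b,arr[5],arr2[5]=(1,2,3,4,5);` the two array declarators would yield `arr dword 5 dup (?)` and `arr2 dword 1,2,3,4,5`. The sketch does not check that the number of values matches the declared size.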

Special for strings


Strings are initialized differently from other array variables in that they can be initialized in one chunk delimited by double quotes (" and "). Also, strings always end with char 0 (or '\0'). The following algorithm is to be appended to the previous one.

Algorithm for strings only:

1. If the character after the char array declaration is not '=', continue with the rest of the parent algorithm.
2. Add one to the index (to skip the opening ").
3. While the character is not ", get the character and store it in the array.
4. Write to the .DATA section:
   [name] byte "[value]",0
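A minimal Python sketch of the string steps above is given here for illustration. The function name `emit_string` and its `rest` parameter (the source text that follows the char array declarator) are hypothetical, and escaped quotes inside the string are deliberately not handled.

```python
def emit_string(name, rest):
    """Emit the .DATA line for a char array, or None if it has no string value.

    'rest' is the text right after the declarator, e.g. '="hello";'.
    """
    # Step 1: without '=' the parent algorithm handles the declaration.
    if not rest.startswith("="):
        return None
    if rest[1:2] != '"':
        raise ValueError("expected a string literal after '='")
    # Steps 2 and 3: skip the opening quote and read up to the closing one.
    end = rest.index('"', 2)
    value = rest[2:end]
    # Step 4: the trailing 0 is the string terminator.
    return f'{name} byte "{value}",0'

print(emit_string("str", '="hello";'))
```

For `char str[50]="hello";` this produces `str byte "hello",0`, which reserves the five characters plus the terminating 0.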

Note that some assemblers impose a limit on the size of an initialized string (in MASM 6.1 it is 255 bytes, that is, 255 individual characters). Note also that with this algorithm the size of the initialized string is only that of the value provided. For example, in the following declaration

char str[50]="hello";

the size of the array str is only 5+1, or 6, characters (the five letters plus the terminating 0), not 50.

Array Referencing


An array is only useful if its elements can be referenced. An array element is referenced in the following format.

[...expression][array name][index of variable][...expression]

However, assembly language does not accept this format. Moreover, it takes the index as the number of bytes after the starting address, rather than the number of elements after the starting address.

Format in assembly

[...instruction][array name][number of bytes after starting][...instruction]

Furthermore, the index cannot be a memory variable; it has to be a register or a constant.

The solution is the following set of instructions:

mov ebx,[index variable (can also be a constant)]
[assignment instruction to a register of the array's type][array name] /[ebx * size of array type]/
[assignment instruction to arr[ANUM] (increment ANUM) from the register used above]

The '/' marks a different use of '[': the brackets between the slashes are to be copied literally into the generated code rather than read as placeholders.
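As an illustration, the three-instruction pattern above could be generated by a helper like the one below. The name `emit_reference`, the `TYPES` table, and the choice of `al`/`ax`/`eax` as the transfer register are assumptions made for this sketch. The temporary `arr[ANUM]` is copied literally from the format above, and its layout is left unspecified here.

```python
# Assumed element sizes in bytes and a matching register per type (illustrative).
TYPES = {"char": (1, "al"), "short": (2, "ax"), "int": (4, "eax")}

def emit_reference(name, ctype, index, anum):
    """Return (instructions, next ANUM) for one reference name[index]."""
    size, reg = TYPES[ctype]
    lines = [
        # The index may be a memory variable or a constant; move it to a register.
        f"mov ebx, {index}",
        # Scale the element index by the element size to get a byte offset.
        f"mov {reg}, {name}[ebx*{size}]",
        # Store the loaded value in the temporary slot arr[ANUM].
        f"mov arr[{anum}], {reg}",
    ]
    return lines, anum + 1

lines, next_anum = emit_reference("data", "int", "i", 0)
print("\n".join(lines))
```

The scale factor is what converts "number of elements" into "number of bytes": for an int array each step of the index moves 4 bytes from the starting address.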

Algorithm for referencing (to be run upon detection of an array variable, or as soon as an instruction is read):

1. While the character is not '[', ';' or '{',
   increment the index.
2. If the character is ';' or '{', end the process.
3. While the character is not ']', get the character and store it as the array index.
4. Get the name of the array.
5. Use the format given above.
6. Replace the reference by arr[ANUM-1].
7. Repeat from step 1.
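The scanning steps above might look like the following Python sketch. The function name `rewrite_references` is hypothetical. The sketch only collects each reference and substitutes the temporary; emitting the instructions (step 5) is left out. It also assumes that scanning resumes after the replaced text, so that the '[' in the newly written `arr[...]` is not picked up again, and it does not handle nested references such as `a[b[i]]`.

```python
def rewrite_references(stmt):
    """Return (references, rewritten statement) for one source statement.

    Each reference is an (array name, index text) pair, in order of appearance.
    """
    refs = []
    out = []      # pieces of the rewritten statement
    copied = 0    # everything before this position has been copied to 'out'
    pos = 0
    while pos < len(stmt):
        ch = stmt[pos]
        if ch in ";{":                 # step 2: end of statement
            break
        if ch != "[":                  # step 1: keep advancing
            pos += 1
            continue
        close = stmt.index("]", pos)   # step 3: read the index text
        index = stmt[pos + 1:close]
        start = pos                    # step 4: walk back over the array name
        while start > copied and (stmt[start - 1].isalnum() or stmt[start - 1] == "_"):
            start -= 1
        name = stmt[start:pos]
        out.append(stmt[copied:start])
        # Step 6: ANUM-1 after the increment is the count of earlier references.
        out.append(f"arr[{len(refs)}]")
        refs.append((name, index))
        pos = copied = close + 1       # step 7: continue after the reference
    out.append(stmt[copied:])
    return refs, "".join(out)

refs, rewritten = rewrite_references("x=a[i]+b[2];")
print(refs)
print(rewritten)
```

For the statement `x=a[i]+b[2];` the sketch collects the references `a[i]` and `b[2]` and rewrites the statement as `x=arr[0]+arr[1];`.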