Cellular Automata/Listing Preimages

Two algorithms are described for listing the preimages of a present configuration. The first algorithm is described in many texts by Andrew Wuensche the second is introduced in this book.

Preimage network

edit
 
The preimage network

A new graph the preimage network is introduced to facilitate the description of reverse algorithms in terms of paths through the graph.

The preimage diagram of a single cell is the network of preimages of that cell. Like preimage matrices of cells in a sequence   length   are aligned into a chain and multiplied, preimage diagrams for each of the cells can be aligned to form the preimage network. If all the links in the De Bruijn diagram are taken into account, than only those representing neighborhoods that are mapped into the observed cell are valid links. Each link or path consisting of many links between nodes in the diagram is a locally valid path.

Elements   in the multiplied preimage matrix   count the preimages beginning at one of the left overlaps   and ending at one of the right overlaps  . In the same way there are   different valid paths between the two overlaps in the preimage network.

The definition of what is globally valid depends on whether the finite lattice is bounded or cyclic.

  • On a bounded lattice a globally valid path connects any overlap on the left boundary to any overlap at the right boundary.
  • on a cyclic lattice a globally valid path must begin and end at the same overlap value.

A node on a locally valid path is a locally valid node. A node on a globally valid path is a globally valid node.

The arrows that point the links in the increasing direction of the position index can be omitted.

edit
 
Preimage network link weight

Overlaps and neighborhoods in the preimage network can be weighted depending on the number of globally valid paths passing through them. The weights can be illustrated on the preimage network as node size (not used in the examples) and line thickness. Since the number of preimages may increase exponentially the thickness may be a logarithm of the weight.

 

There are two different ways to define and calculate weights depending on whether the finite lattice is bounded or cyclic.

Bounded lattice

edit

The weights for links   at position index   are the product of:

  • the number of paths   from left boundary   to the source overlap   and
  • the number of paths   from the drain overlap   to the right boundary  .
 

Both parts of the product are contained inside preimage vectors,   for the left part and   for the right part.

 
 

Cyclic lattice

edit

On a cyclic lattice the weights for links   at position index   are the sum over overlaps of products of:

  • the number of paths   from one left overlap   to the source overlap   and
  • the number of paths   from the drain overlap   to the right boundary  .
 

Because boundary overlaps must be distinguished the preimage vector is not enough and the information on the number of paths is contained in preimage matrices.

 

Computing weights

edit

To compute all the link weights both preimage counts from the left boundary to each cell and from the right boundary to each cell are needed. For bounded lattices the counts are stored in preimage vectors, for cyclic lattices counts are stored in preimage matrices.

Bounded lattice

edit

For bounded lattices the preimage counts are stored as preimage vectors. The first counts at the left   and the right   are boundary vectors.

 

Left sided counts   are calculated from the left to the right, the vector at position   is calculated from the vector at position  . Right sided counts   are calculated from the right to the left, the vector at position   is calculated from the vector at position  .

 
 

The total number of all preimages is calculated by applying both boundaries.

 


Algorithm pseudo-code for counting from the left to the right
b[0] = b_L                     # the left preimage vector b[0] is initialized to the left boundary b_L
for ( x=1; x<=l; x++ )         # run the position index from the left (x=0) to the right (x=l-1)
  b[x] = b[x-1]*D[C[x-1]]      # the new preimage vector is calculated from the old using the
end                            # preimage matrix D[C[x-1]] of the cell value C[x-1] at position x-1

p = 0                          #
for ( o=0; o<|S|^(k-1); o++ )  # the number of all preimages p is the scalar product of
  p += b[l][o] * b_R[o]        # the last right preimage vector b[l] and the right boundary vector b_R
end                            #
Algorithm pseudo-code for counting from the right to the left
b[l] = b_R                     # the right preimage vector b[l] is initialized to the right boundary b_R
for ( x=l-1; x>=0; x-- )       # run the position index from the right (x=l-1) to the left (x=0)
  b[x] = D[C[x]]*b[x+1]        # the new preimage vector is calculated from the old using the
end                            # preimage matrix D[C[x]] of the cell value C[x] at position x

p = 0                          #
for ( o=0; o<|S|^(k-1); o++ )  # the number of all preimages p is the scalar product of
  p += b_L[o] * b[0][o]        # the last left preimage vector b[0] and the left boundary vector b_L
end                            #

Cyclic lattice

edit

For cyclic lattices the counts are stored as preimage matrices, this is necessary to distinguish counts for each of the overlaps at the joined boundaries. Since the lattice is cyclic, there is no need to apply a boundary vector as in the bounded lattice.

Left sided counts   are calculated from the left to the right, the matrix at position   is calculated from the matrix at position  . Right sided counts   are calculated from the right to the left, the matrix at position   is calculated from the matrix at position  .

 
 

The number of preimages must be counted separately for each of the   overlaps at the joined boundaries  . A cyclic preimage (row) vector   is constructed from the diagonal elements of the preimage matrix  .

 
 

The total number of all preimages is the sum of the elements in the cyclic preimage vector   (row), the scalar product with the unrestricted boundary vector   (row transposed into column).

 
Algorithm pseudo-code for counting from the left to the right
D[0] = I                       # the left preimage matrix D[0] is initialized to the identity matrix
for ( x=1; x<l; x++ )          # run the position index from the left (x=0) to the right (x=l-1)
  D[x] = b[x-1]*D[C[x]]        # the new preimage matrix is calculated from the old using the
end                            # preimage matrix D[C[x]] of the cell value C[x] at position x

p = 0                          #
for ( o=0; o<|S|^(k-1); o++ )  # the number of all preimages p is the scalar product of
  p += D[l-1][o][o]              # the sum of the diagonal elements of the right preimage matrix D(l-1)
  b_0[o] = D[0][o][o]          # the diagonal are copied into the cyclic preimage vector b_0
end                            #
Algorithm pseudo-code for counting from the right to the left
D[l] = I                       # the right preimage matrix D[l] is initialized to the identity matrix
for ( x=l-1; x>=0; x-- )       # run the position index from the right (x=l-1) to the left (x=0)
  D[x] = D[C[x]]*b[x+1]        # the new preimage matrix is calculated from the old using the
end                            # preimage matrix D[C[x]] of the cell value C[x] at position x

p = 0                          #
for ( o=0; o<|S|^(k-1); o++ )  # the number of all preimages p is the scalar product of
  p += D[0][o][o]              # the sum of the diagonal elements of the left preimage matrix D(0)
  b_0[o] = D[0][o][o]          # the diagonal are copied into the cyclic preimage vector b_0
end                            #

String operations

edit

Preimages as paths in the preimage network are described in terms of neighborhoods and overlaps, which are both sequences of cells. Since neighborhoods on a globally valid path overlap, each cell can be found in   neighborhoods and in   overlaps.

Neighborhoods are represented by links in the preimage diagrams and networks. The neighborhood   surrounding the cell   is defined as

 

Overlaps are represented by nodes in the preimage diagrams and networks. The overlap   surrounding the cell   is defined as

 

Sequences can be represented and manipulated in two ways:

  1. compact representations are manipulated using algebraic operations (multiplication, division (quotient without remainder) and modulo)
  2. bit arrays are manipulated using boolean logical operations (shifts, fusion with OR and masking with AND) each cell consumes  

Extracting the observed cell

edit

Extracting from the neighborhood

edit

Algebraic notation

 

Logic notation

c_x = (n_x >> m*(k-k_0-1)) AND (2^m-1)

Extracting from the overlap

edit

Algebraic notation

 

Logic notation

c_x = (o_x >> m*(k-k_0)) AND (2^m-1)

Transformations between overlaps and neighborhoods

edit

Extending an overlap into a neighborhood

edit

A neighborhood   (length  ) is created from an overlap   (length  ) by adding one cell   at the left or the right side.

String notation (concatenation of cell strings)

 
 

Algebraic notation

 
 

Logic notation

n_x = (o_x << m) OR c
n_x-1 = (c << m(k-1)) OR o_x

Shortening a neighborhood into an overlap

edit

An overlap   (length  ) is created from a neighborhood   (length  ) by removing one cell   at the left or the right side.


String notation

  •   means string   without the extreme right symbol  
  •   means string   without the extreme left symbol  )
 
 

Algebraic notation

 
 

Logic notation

n_x = o_x >> m
n_x+1 = o_x AND (2^(m*(k-1))-1)

Trace and backtrack algorithm

edit
 
Tracing and backtracking through the preimage network (rule 110 configuration 0011)

This algorithm traces paths through the preimage diagram. It starts at the first node at the left that is not masked by left boundary conditions. Depending on the position index value and whether there are links to the right there are four options:

  • tracing the first unused valid link to the next node and remembering the selected path
  • if the reached node is at the right boundary, validity of the boundary is tested and the path is stored as a preimage
  • if there are no unused links to the right, backtrack to the previous node on the current path
  • when backtracking reaches the left boundary, the tracing restarts at the next valid node till all nodes are exhausted.

The algorithm could start at the right and trace to the left obtaining the same preimages. The direction tracking is chosen so that the listed preimages are sorted in an ascending order regarding their compact representation.

Fromal algorithm description

edit
 
The trace and backtrack algorithm (simplified flowchart)
 
Tracing and backtracking from the observed node

Most parts of the algorithm are identical for bounded and cyclic latices, the differences are seen only at the boundaries.

Main loop and left boundary

The main loop counts through the available tracing start points. The algorithm begins tracing by the first overlap at the left   (on bounded lattices overlaps masked by the left boundary   are skipped). When all the traces beginning with this overlap are exhausted (backtracking reaches the left boundary   and the link index exceeds the number of available links  ) tracing continues to the next valid overlap   till it reaches the last overlap  . When traces from the last overlap are exhausted the algorithm finishes. On bounded lattices overlaps may be unrestricted or masked by a boundary vector  . On cyclic latices all overlaps are used.

Tracing

The tracing increments the current position index  . Each tracing step begins at the current overlap   at position   and traces the first not yet traced link to the drain overlap   at position index  . There are   links to be traced, if the current overlap has not yet been backtracked than tracing begins at link   otherwise the link provided by the backtracking is incremented. The validity of the trace step must be verified, each link represents a neighborhood   that must lead to the present cell value  .

  • If the link is valid, its index   is stored as a new cell   on the currently traced path and the algorithm continues from the drain node  .
  • If the link is not valid than the tracing must proceed with the next link  , if there are no links left   than the algorithm switches to backtracking.
Backtracking

Backtracking is returning back on the current path till the point where an available link for tracing is found. The current overlap without further paths is   each backtracking step steps backward on the current path over the link   to the overlap   at position index  . The backward link   and the source overlap   are sequences on the current path. When back at the source cell, the link to trace forward is incremented from the last traced link  .

Right boundary and listing of preimages

When tracing reaches the right boundary listing a preimage can begin.

  • For cyclic lattices the current overlap   is compared to the starting overlap at the left  , if they are equal a preimage   of length   is added to the preimage list.
  • For bounded lattices the current overlap   is masked by the right boundary vector  , if accepted a preimage   of length   is added to the preimage list.

For each new preimage the preimage counter   is incremented.

Algorithm pseudo-code

edit
p=0                               # the preimage counter is initialized to 0
x=0                               # the position index is initialized to 0
c=0                               # the link index is initialized to 0
o_L=0                             # the tracing starts at the first left overlap
while (bounded) then              # if the lattice is bounded
  if ( b[o_L]>0 ) then            # boundary conditions must be checked
    o_L++                         # to ensure the start point is valid
  end_if                          #
end_while                         #
o=o_L                             # the current overlap is initialized

while (o_L<|S|^(k-1)) then        ## main loop ##

if ( c<|S| ) then                 # tracing and right boundary

  if ( x<l ) then                 ## tracing ##
    n=o*|S|+c                     # the neighborhood is created
    if ( f(n)==c_x ) then         # checking it the neighborhood is valid
      o=(o*|S|+c)/|S|^k           # stepping to the new overlap at the right
      C[x+k-k0-1]=c               # storing the link selection cell
      x++                         # incrementing the position index
      c=0                         # at the new overlap start with the first link
    else                          # the selected link is not valid
      c++                         # try the next link
    end_if                        #
  end_if                          #

  else                            ## right and cyclic boundary conditions ##
    if (bounded)                  # for bounded lattices
      if ( b[o]>0 ) then          # check masking of the current overlap with the right boundary vector
        store(C)                  # list the preimage consisting of cells on the current path
      end_if                      #
    end_if                        #
    if (cyclic)                   # for cyclic lattices
      if ( o==o_L ) then          # check if overlaps at the left and right boundary are equal
        store(C)                  # list the preimage consisting of cells on the current path
      end_if                      #
    end_if                        #
    c=|S|                         # to prevent further tracing from the boundary
  end_if                          #
 
else                              # backtracking and left boundary

  if ( x==0 ) then                ## left boundary conditions ##
    o_L++                         # move to the next start point for tracing
    if (bounded) then             # for bounded lattices
      if ( b[o_L]>0 )             # check masking of the left overlap by the left boundary vector
        c=0                       # move to the next start point for tracing
      else                        # else if the current overlap is not valid
        o_L++                     # move to the next start point for tracing
      end_if                      #
    else_if (cyclic) then         # for cyclic lattices
      c=0                         #
    end_if                        #

  else                            ## backtracking ##
    o=(C[x-k0-1]*|S|^(k-1)+o)/|S| # backtracking to the previous overlap
    c=C[x+k-k0-1]                 # restoring the last used link at this overlap
    c++                           # at the backtracked overlap continue with the next link
    x--                           # decrementing the position index
  end_if                          #

end_if

end                               # if all start points are exhausted end the algorithm

Algorithm complexity

edit

The algorithm can be divided into two time consuming operations

  • tracing and backtracking
  • listing preimages

The algorithm visits all links and nodes accessible from the boundary where it starts. Accessibilyti of overlaps and neighborhoods is defined as an induction:

Base
Overlaps   at the starting boundary   (left here) are acessible if they are valid  .
Induction
Every neighborhood sourced at an acessible node is acessible. Every drain overlap to an acessible and valid neighborhood is acessible.

The valid accessible links are visited exactly twice (first while tracing and secondly while backtracking), invalid lins are visited only once (their validity is evaluated by tracing). A simple aproximation for the tracing time is counting all the accessible overlaps, counting must be performed for each of the overlaps at the boundary so preimage matrices must be used for counting.

An arbitrary overlap   is accessible from a valid overlap   at the left boundary if there is at least one valid path between the two  . Accessibility of overlaps at position index   can be extracted from the preimage matrix   by transformning its elements into boolean values or by using bolean multiplication to calculate preimage matrices  . The algorithm complexity of tracing is the sum of all accessible overlaps.

 

The upper boud to the processing time is a linear function of the observed configuration size   but grows exponentially with the increasing neighborhood size (preimage diagram size).

 

The time required for listing of preimages   is a linear function of the number of preimages  , where the time to compute a single preimage is a linear function of the configuration length  .

 

Memory consumption of the algorithm is minimal, only the current path in the preimage network is stored and used for listing if a valid preimage is found.

Count and list algorithm

edit
 
The count and list algorithm (simplified flowchart)

This algorithm consists of a first backward or counting pass that counts the preimages and a second forward or listing pass that lists them. The trace and backtrack algorithm must trace each link before knowing if it leads to a valid preimage, on the other hand the counting pass produces information obout wheather and how many preimages there are for each link. All the preimages can be listed without backtracking.

The direction of the listing pass could be inverted but is chosen so that the listed preimages are sorted in an ascending order regarding their compact representation, the counting pass is in the opposite direction.

Formal algorithm description

edit

Booth the forward (left to right, producing   vectors) and backward (right to left, producing   vectors) counting passes have been described above, while computing weights for the preimage network. Only one of the passes is needed, here the backward pass is used.

The listing pass can be divided into initialization and a few steps inside the main loop.

Initialization

Since the number of preimages   has been calculated by the counting pass, memory can be allocated for them in advance. The array of starting overlaps   holds the value of the starting point for each preimage (valid path in the preimage network). The number of preimages using each starting overlap is computed as a weight vector for overlaps at the left boundary. For bounded lattices the left boundary and the preimage count from the righ end to the left are used.

 

For cyclyc lattices the preimage matrix of the whole sequence is used.

 

The array of starting overlaps is copied into the array of overlaps at the current position index   and the listing can start.

Listing

The listing starts at the left boundary   and ends at the right boundary  . At each increment of the position index one cell is computed for each of the   preimages.

To simplify the algorithem the right cell   in the neighborhood   is stored. The first cells   of preimages on a bounded lattice can be computed from the starting overlap  . For cyclic lattices the last cells at the right can be wraped to the beginning.

Algorithm pseudocode

edit
C = malloc( p * (l+k-1) * |S| )            # reserve memory for the counted amount of preimages
                                           #
i = 0                                      # start with the first preimage
for ( o=0; o<|S|^(k-1); o++ )              # for all overlaps
  for ( j=0; j<b_L[0][o]*b_R[0][o]; i++ )  # for each overlap the number of preimages is calcuated
    o_x[i] = o                             # i-th preimage path begins with overlap o
    i++                                    # to the next preimage
  end_for                                  #
end_for                                    #
if (cyclyc)                                # for bounded lattices there is no need to know the starting points
  o_L = o_R = o_x                          # copy the current array of overlap to the array of starting overlaps
end_if                                     #

for ( x=0; x<l; x++ )                      # run from the left to the right
  i = 0                                    # start with the first preimage
  while ( i< p )                           # check if all preimages have been finished
    for ( c=0; c<|S|; c++)                 # check all links for each source overlap
      o_x[i] = (o_x[i]*|S|+c) / |S|^(k-1)  # calculate the drain overlap
      elseif (bounded)                     # for bounded lattices
        w = b_R[x+1][o_x[i]]               # the count from the drain node to the right end is used
      if (cyclic)                          # for cyclic lattices
        w = D_R[x+1][o_x[i],o_L[i]]        # the count from the drain node to the end node (same as left)
      end_if                               #
      for ( j=0; j<w; j++ )                # use the count
        C[i][x+k-k0-1] = c                 # and write a cell for each of them
        i++                                # increment the preimage counter
      end_for                              #
    end_for                                #
  end_while                                #
end_for                                    # end of algorithm

Algorithm complexity

edit

The complexity of the counting is a linear function of the configuration length  . The dependence on the neighborhood size   is not qubic since there at most   nonzero elements in the preimage matrix of a single cell. If the lattice is bounded the counts are stored as vectors and the complexity is linear, for cyclic lattices counts are stored as vectors and the complexity is quadratic.

 
 

The complexity of the listing is again a linear function of the configuration length   and the number of preimages.

 

The counting by matrix multiplication processes all the nodes in the primage diagram instead the tracing and backtracking from the previous algorithm avoids some nodes gaining on everage complexity, on the other hand this algorith is easier to optimize.

Memory consumption is another downside, and it grows as fast as time for counting.

Proofs

edit

References

edit
  1. Andrew Wuensche and Mike Lesser, The Global Dynamics of Cellular Automata An Atlas of Basin of Attraction Fields of One-Dimensional Cellular Automata
  2. Wuensche,A.,(1997), Attractor Basins of Discrete Networks, Cognitive Science Research Paper 461, Univ. of Sussex, D.Phil thesis
  3. Vladimir Batagelj, Efficient Algorithms for Citation Network Analysis (2003)
  4. J. C. Seck Tuoh, G. Juárez y H. V. McIntosh, Calculating ancestors in one-dimensional cellular automata, International Journal of Modern Physics C, Vol 15, No. 8, pp 1151-1169, October 2004
  5. MathPages
  6. Iztok Jeras, Andrej Dobnikar, Algorithms for computing preimages of cellular automata configurations (PDF)

Software

edit
  1. DDLab Tools for researching Cellular Automata, Random Boolean Networks, multi-value Discrete Dynamical Networks, and beyond; by Andy Wuensche.
  2. Iztok Jeras, Algorithms for computing preimages of cellular automata configurations TAR.BZ2