Primary links

Working with submodels : One-sided relationship enumeration

One-sided relationship enumeration

One-sided relationship enumeration is a feature specifically to make more efficient the creation of large rasters. A raster is a grid of pixels. Each pixel is equivalent in size and shape. Usually, each pixel is square, though it is also possible to work with other rasters. In this discussion, we shall consider only square grids.

In a square grid, each pixel has potentially eight neighbours: north-west, north, north-east, west, east, south-west, south and south-east. However, pixels at the edge of the grid will have only a subset of those neighbours. For example a pixel on the northern edge of the grid will have five neighbours (west, east, south-west, south, and south-west) and one in the south-west corner will have only three: (north, north-east and east).

To take advantage of the powerful features in Simile, we represent the pixels as instances of a multiple-instance submodel. The number of instances is equal to the width of the grid (in pixels) multiplied by the height. Each pixel has an id number, equal to the index of its position. The row and column that the pixel is in can be calculated from the id number, knowing the width of the grid.

Within each pixel, we can create an eight-instance submodel representing each possible neighbour. To allow for edge effects, we can make the existence of each of the eight conditional. Each (possible) neighbour-instance has a row and column calculated from its position relative to its parent pixel. Now each pixel has access to a list of its neighbours.

To use this knowledge, we can create an association submodel, representing the relationship of "neighbourhood" between the central pixel and the (up to) eight pixels that border it. Two role arrows are drawn between the pixel submodel and the neighbourhood submodel, representing the pixel in the centre and the pixels in the border.

The final step is to create a condition in the neighbourhood model that evaluates for each pixel whether any given pixel is in the neighbourhood. Since each pixel has a list of its own neighbours, this condition is simply:

any(my_id_border == {neighbours_ids_centre} )

In this expression, neighbours_ids_centre is the list of ids of neighbours from the instance in the "centre" role, and my_id_border is the index of the pixel that is potentially in the border (set to index(1) in a variable inside the grid square submodel). The condition is true (and the relationship therefore exists) for a centre and a border pixel, if any of the elements of the list of the neighbour_ids in the centre equates to the index of the pixel in the border.

So far, it is important to emphasise that we have followed the procedure that is standard in Simile for many different types of relationship. We now need to look carefully at what this model represents, how Simile is converting it into a computer program, and how it can be changed to make it more efficient.

What does the model represent?

This model looks unusual as the condition for the relation uses values from a submodel of the base model. But the operation is pretty similar to the self-relation in the previous section, with the difference that each base model instance can be in relation with many other instances. Any time the index of the instance in the "border" role matches any of the indices from the "neighbours" submodel of the instance in the "centre" role, an association instance will be created. Now if the "centre" instance needs to have information from all the "border" instances, this can be passed to the association via the "border" role and back via the "centre" role. Because different instances have different numbers of "neighbours" submodels, they have different numbers of associations.

How is Simile running it?

Now for the bad news. When Simile creates a program for any association submodel, it first creates nested loops for each of the base models. If there are two role arrows from the same base model as in this case, it will loop through all its instances, and for each one, loop through all its instances again. This is so it can compare values from every instance with those from every other instance, in order to decide which pairs the association holds between. In this model there is yet another loop inside these -- it has to loop through the instances of "neighbours" in the "centre" role to see if any match the index of the square in the "border" role.

This only has to be done when the model is initialized, as the definition of the relationship clearly dos not change over time or from run to run. Once the relationship is created, the association submodel instances have pointers to their base instances so they remember which ones to use. But if there are, say, a million grid squares (not an excessive number for Simile) it takes 8x1012 comparisons to set up the relationship, which would keep your computer busy for hours. So what can we do?

How can we improve efficiency?

Very easily, just get rid of the association submodel, and instead have any variable that needs to be shared between neighbours taken outside the grid square submodel into an array variable of its own. Then draw an influence from that array to a variable in the "neighbours" submodel, and give it the equation element([array_outside],neighbours_ids).  The values of this variable can then be summed inside the grid square but outside the "neighbours" submodel to give the sum of the original values from all an individual's neighbours. This only needs one level of looping through the grid square model, since it is using the neighbours_ids values directly to index the array rather than comparing them with those from other instances.

But this model is not as clear as one with an explicit neighbour relationship, especially if many values are passed between neighbours. So we have added a feature to the interpretation of the condition symbol that allows an association to be built in a similar manner, i.e., by using the index to look up one of the base instances rather than seaching through all of them and comparing their indices with a value we got from the first base instance. To do this, we rewrite the condition's equation like this:

any(index(1) is {neighbours_ids_centre} )

Previously, this expression would have been written with the equality operator "==", instead of the key word "is". The new key word has been introduced in order to invoke the efficient program generation that is possible for rasters. Note that we have also replaced the reference to the variable my_id in the submodel with the "border" role, with the function index(1). This is because index(1) means the same thing (because we set "Allow base instance lookup" in the properties dialogue of the "border" role arrow), and we cannot use a variable from the base model in that role until we have looked up the base instance itself.

This is not the only situation that is applicable: in general, as the name implies, the feature is useful for any situation where the relationship can be calculated in a loop over only ONE of the roles. In this case, because each pixel is able to calculate its own list of neighbours, it is not necessary to take each possible pair of pixels and determine if they are neighbours. It is from this short cut that the improvement in efficiency stems.

There is a model in the catalogue on Simulistics web site, showing all this in practice. If, without looking at the example, you could create an equivalent model, you are a graduate summa cum laude of Simile use. The example can be used as a starting-point for most work on rasters, grid squares and so on.

In: Contents >> Working with submodels >> Association submodels