Built-in functions : Statistics

Statistical functions produce variates from distributions, and generally produce different values at each point in the model where they are called.

There are up to three forms of each statistical function:

  • The form with a _const suffix. This will produce a new value when the simulation is initialized or reset, or when a submodel instance containing it is created. This value stays the same until the end of the run, or until the submodel instance containing it ceases to exist.
  • The form with a _var suffix (or no suffix). This will produce a new value on each time step for each instance where it occurs. The sequence of values for all such functions can be initialized using the Initialize pseudo-random tool. If there is no _const form of a function, the _const behaviour can be produced by wrapping this form in the at_init() function.
  • The form with an extra argument. The last argument (an integer) serves as a seed for the values produced by that particular occurrence of the function. These values will be the same on each run, and independent of any other statistical functions in the model. For instance, if the function is used in a conditional submodel, it will start producing the same sequence of results each time an instance of the submodel comes into existence (assuming the seed value is the same).

If a statistical function has only one form, it behaves like the _var form.
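The difference between the three forms can be pictured with a short Python sketch. It only illustrates the behaviour described above; it is not Simile code and not Simile's implementation, and the class name StatDemo and its attributes are hypothetical.

```python
import random

class StatDemo:
    """Hypothetical stand-in for one submodel instance (not Simile's API)."""

    def __init__(self, seed):
        # _const form: drawn once, when the instance is created.
        self.const_value = random.uniform(0, 1)
        # Seeded form: a private generator, independent of the shared one.
        self._seeded = random.Random(seed)

    def step(self):
        # _var form: a fresh draw from the shared sequence on every time step.
        var_value = random.uniform(0, 1)
        # Seeded form: the next value of this occurrence's own sequence.
        seeded_value = self._seeded.uniform(0, 1)
        return self.const_value, var_value, seeded_value

a = StatDemo(seed=42)
first_run = [a.step() for _ in range(3)]

b = StatDemo(seed=42)  # e.g. a new instance of a conditional submodel
second_run = [b.step() for _ in range(3)]
```

In this sketch the first element of each tuple never changes within an instance, the second changes on every step, and the third repeats exactly between the two instances because they share a seed.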

Built-in functions : rand_const function

rand_const function

rand_const(X,Y)

Returns a random number between X and Y at the start of the simulation or when the submodel instance is created. The random-number generator is not called again, and so the value stays the same until the simulation is reset.

Input: numeric, numeric

Result: numeric

Comment:

The main use of this function is to assign values to a set of instances of a multiple-instance submodel (fixed-membership or population). For example, to randomly assign initial sizes to a set of trees in a multiple-instance tree submodel, we could use the equation:

size = rand_const(12,20)

or to assign random locations to the trees, we could use the equations:

x = rand_const(0,50)

y = rand_const(0,100)

which would randomly place the trees in the left-hand half of a one-hectare plot, assuming that the values are in metres.
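As a rough illustration of the placement example, the following Python sketch draws one fixed location per tree. It is not Simile code; the number of trees and the variable names are assumptions made for the illustration.

```python
import random

PLOT_SIDE_M = 100  # a one-hectare plot is 100 m x 100 m

# One (x, y) pair per tree, drawn once and then left unchanged,
# mirroring x = rand_const(0,50) and y = rand_const(0,100).
trees = [(random.uniform(0, 50), random.uniform(0, 100)) for _ in range(20)]

# Every tree lies in the left-hand half of the plot.
in_left_half = all(
    0 <= x <= PLOT_SIDE_M / 2 and 0 <= y <= PLOT_SIDE_M for x, y in trees
)
```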

The use of rand_const() is deprecated because it cannot be made to behave in the same way as rand_var() when implicitly replicating over an array. It is implemented by internal conversion to at_init(rand_var(x,y)), and this form should be used in full to make the replication behaviour clear.

Historical note: You may come across some models that use a rand(X,Y) function. This behaves like rand_const if Simile deduces that the model element will only be called at initialisation time, and like rand_var if the equation contains some variable that changes over time. The use of this function is now also deprecated because the semantics of the two uses are so very different. Also, there are some situations when you need to be able to over-ride this behind-the-scenes decision about how the function should behave.

In: Contents >> Working with equations >> Functions >> Built-in functions

Built-in functions : rand_var function

rand_var function

rand_var(X,Y)

Returns a random number between X and Y, with a new value every time step.

Input: numeric, numeric

Result: numeric

Comment:

This function is used for stochastic modelling and Monte Carlo simulation, where one or more processes in the model (such as giving birth or dying) have a random element to them.

rand_var gives a new result for every call, and if it is used in an expression that is replicated to make an array, each element's random value will be different.

rand_var uses the pseudo-random sequence generator built into the C++ compiler which Simile is using to create executable models. The sequence is initialized with a value generated from the process ID and clock time when Simile starts up, so no two runs will produce the same results. However, if a model is required to behave in exactly the same way each time it runs, despite including calls to rand_var, this can be achieved by means of a tool that sets the seed to a given value; see Initializing pseudo-random sequence.
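The effect of fixing the seed can be sketched in Python. This uses Python's own generator rather than the C++ one that Simile uses, and the function run_model is hypothetical; it only shows why a fixed seed makes runs repeatable.

```python
import random

def run_model(steps, seed=None):
    """Hypothetical toy run: one uniform draw per time step."""
    # With seed=None the generator is seeded from OS entropy or the clock,
    # so separate runs differ; with a fixed seed they repeat exactly.
    rng = random.Random(seed)
    return [rng.uniform(0, 10) for _ in range(steps)]

unseeded_a = run_model(5)
unseeded_b = run_model(5)
seeded_a = run_model(5, seed=1234)
seeded_b = run_model(5, seed=1234)
```

Here seeded_a and seeded_b hold identical sequences, while unseeded_a and unseeded_b will almost certainly differ.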

Historical note: You may come across some models that use a rand(X,Y) function. This behaves like rand_const if Simile deduces that the model element will only be called at initialisation time, and like rand_var if the equation contains some variable that changes over time. The use of this function is now deprecated because the semantics of the two uses are so very different. Also, there are some situations when you need to be able to over-ride this behind-the-scenes decision about how the function should behave.

Built-in functions : binome function

binome function

binome(prob, n)

Input: Real numerical value, integer value

Result: A value from the binomial distribution with the given probability and number of trials. A new random deviate is generated each time step.

The binomial distribution describes the probability of a given number of positive outcomes occurring when a number n of trials are carried out, each with a certain probability p of a positive outcome.

This function is implemented using a pseudo-random sequence generator; notes regarding its behaviour can be found in the documentation for the rand_var function.

Examples:

coins_heads_up = binome(0.5, coins_tossed)
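The definition above can be sketched in Python by counting successes over n independent trials. The name binome_sketch is hypothetical, and this is an illustration of the distribution, not a description of how Simile computes it.

```python
import random

def binome_sketch(prob, n, rng=random):
    """Count successes in n independent trials, each with probability prob."""
    return sum(1 for _ in range(n) if rng.random() < prob)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
coins_tossed = 100
coins_heads_up = binome_sketch(0.5, coins_tossed, rng)
```

The result always lies between 0 and coins_tossed, and for a fair coin it is usually close to half the number of tosses.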

Built-in functions : colin function

colin function

colin([Array])

Returns a deviate from a distribution whose relative probabilities are given by the values in the argument array. A new deviate is generated each time step.

Inputs: array of probabilities (real).

Outputs: index to value in array (int).

This can be used to make a deviate from an explicit set of probabilities where the pattern does not match any other built-in statistical function.

This function is implemented using a pseudo-random sequence generator; notes regarding its behaviour can be found in the documentation for the rand_var function.

Example:

colin([1,1,1,10,1]) --> 4 (usually), 1,2,3 or 5 (occasionally).
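A Python sketch of the same idea is given below. It assumes, from the example above, that the returned index counts from 1; the name colin_sketch is hypothetical and the code is not Simile's implementation.

```python
import random

def colin_sketch(weights, rng=random):
    """Return a 1-based index chosen with probability proportional to weights."""
    indices = range(1, len(weights) + 1)
    return rng.choices(indices, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = [colin_sketch([1, 1, 1, 10, 1], rng) for _ in range(1000)]
share_of_fours = draws.count(4) / len(draws)  # expected to be near 10/14
```

The weights need not sum to 1; only their relative sizes matter, so index 4 is expected in roughly 10 out of every 14 draws.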

Built-in functions : exprnd function

exprnd function

exprnd(mean [, seed])

Returns: value sampled from an exponential distribution (numerical)

Arguments: mean of distribution (numerical), seed for random sequence (integer, only required if a reproducible series of values is needed)

Example: A Geiger counter pointed at a radioactive source will emit a series of clicks at random times. The durations of the intervals between the clicks are distributed exponentially.
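The Geiger counter example can be sketched in Python as follows. The generator exprnd_sketch is hypothetical, and the mean interval of 2.0 is an arbitrary value chosen for the illustration; the optional seed plays the same role as the seed argument described above.

```python
import random

def exprnd_sketch(mean, seed=None):
    """Yield exponentially distributed intervals with the given mean."""
    rng = random.Random(seed)
    while True:
        # expovariate takes the rate (1 / mean), not the mean itself.
        yield rng.expovariate(1.0 / mean)

clicks = exprnd_sketch(mean=2.0, seed=7)
intervals = [next(clicks) for _ in range(10000)]
sample_mean = sum(intervals) / len(intervals)  # close to the requested mean
```

Creating a second generator with the same seed reproduces the same series of intervals, while omitting the seed gives a different series each time.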

Built-in functions : gaussian_var function

gaussian_var function

gaussian_var(mean, sd)

Input: Two real numerical values

Result: A random sample from a Gaussian (normal) distribution, with the supplied mean and standard deviation. A new random sample is generated each time step.

This function is implemented using a pseudo-random sequence generator; notes regarding its behaviour can be found in the documentation for the rand_var function.

Examples:

daily_rainfall = gaussian_var(annual_rainfall/365, 1.0)
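The rainfall example can be mimicked in Python to see what such an equation produces over many time steps. The annual rainfall figure is a hypothetical value, and the sketch uses Python's generator rather than Simile's.

```python
import random

rng = random.Random(3)   # fixed seed so the sketch is repeatable
annual_rainfall = 730.0  # hypothetical value, chosen for illustration
daily_mean = annual_rainfall / 365

# One draw per time step, as gaussian_var(annual_rainfall/365, 1.0) would give.
daily_rainfall = [rng.gauss(daily_mean, 1.0) for _ in range(10000)]
sample_mean = sum(daily_rainfall) / len(daily_rainfall)
negative_days = sum(1 for r in daily_rainfall if r < 0)
```

Because a normal distribution is unbounded, some draws fall below zero when the mean is only a couple of standard deviations above it; a model using such an equation may need to handle those values.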

Built-in functions : with_colin function

with_colin function

with_colin({ProbList},{ValList})

Takes two lists with equal size, and returns an element from the second argument, picked at random with the probability of each element proportional to the value of the corresponding element in the first argument. A new return value is generated each time step.

Inputs: list of probabilities (real), list of corresponding values (any).

Outputs: element picked from second list (any).

This can be used to make a deviate from an explicit set of probabilities where the pattern does not match any other built-in statistical function.

Note that it only works on lists; if you want to do something similar with fixed-size arrays, you can combine the element and colin functions to achieve the same effect as follows: element([ValList], colin([ProbList]))

This function is implemented using a pseudo-random sequence generator; notes regarding its behaviour can be found in the documentation for the rand_var function.

Example:

with_colin({1,1,1,10,1}, {"apples", "pears", "oranges", "grapes", "bananas"}) --> "grapes" (usually), "apples", "pears", "oranges" or "bananas" (occasionally)
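The same selection can be sketched in Python. The name with_colin_sketch is hypothetical, and raising an error for lists of unequal size is an assumption of the sketch, not documented Simile behaviour.

```python
import random

def with_colin_sketch(prob_list, val_list, rng=random):
    """Pick one element of val_list, weighted by the matching prob_list entry."""
    if len(prob_list) != len(val_list):
        raise ValueError("both lists must have the same length")
    return rng.choices(val_list, weights=prob_list, k=1)[0]

rng = random.Random(5)  # fixed seed so the sketch is repeatable
fruits = ["apples", "pears", "oranges", "grapes", "bananas"]
picks = [with_colin_sketch([1, 1, 1, 10, 1], fruits, rng) for _ in range(1000)]
```

Over many draws, "grapes" is expected in roughly 10 out of every 14 picks.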

Built-in functions : hypergeom function

hypergeom function

hypergeom(Pop, Mark, Sample)

Returns a deviate from a hypergeometric distribution for a given population, number of marks, and size of sample.

Inputs: Population size (int), number of marked individuals (int), size of sample from population (int)

Outputs: deviate of number of marked individuals from sample

The hypergeometric distribution gives the probability of obtaining each possible number of "marked" individuals when a sample of a given size is drawn, without replacement, from a population containing a given number of "marked" individuals.

This function is implemented using a pseudo-random sequence generator; notes regarding its behaviour can be found in the documentation for the rand_var function.

Example: A research process involves ringing a certain number of seabirds from a population and releasing them, then at a later date recapturing a different number of the birds and checking how many ringed individuals are retrieved. If a random group of individuals is captured each time, the probability of getting n rings back is equal to the probability of getting the result n from the equation:

rings_retrieved = hypergeom(Seabird_population, Birds_ringed, Birds_caught)
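The ringing experiment can be simulated directly in Python by drawing a sample without replacement and counting the marked individuals in it. The name hypergeom_sketch and the population figures are hypothetical; the sketch illustrates the distribution, not Simile's implementation.

```python
import random

def hypergeom_sketch(pop, mark, sample, rng=random):
    """Count marked individuals in a sample drawn without replacement."""
    # Individuals 0 .. mark-1 are treated as the marked ones.
    caught = rng.sample(range(pop), sample)
    return sum(1 for individual in caught if individual < mark)

rng = random.Random(11)  # fixed seed so the sketch is repeatable
# Hypothetical figures: 1000 birds, 100 ringed, 50 caught later.
rings_retrieved = hypergeom_sketch(1000, 100, 50, rng)
```

The result can never exceed either the number of marked individuals or the sample size.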

Built-in functions : poidev function

poidev function

poidev(mean)

Input: Real numerical value

Result: A value from the Poisson distribution with the given mean. A new random deviate is generated each time step.

The Poisson distribution describes the probability of a given number of positive outcomes occurring in the limiting case of the binomial distribution, i.e., with very many trials each with a very small chance of a positive outcome.

This function is implemented using a pseudo-random sequence generator; notes regarding its behaviour can be found in the documentation for the rand_var function.

Example: A hospital serves a large community in which a certain percentage of individuals are thought to be carriers of the hospital superbug MRSA. If we admit a small number of individuals to hospital, we would expect the probability of getting a certain number of MRSA carriers in that group to be equal to the probability of getting that number as the result of this equation:

MRSA_positive_admissions = poidev(Total_admissions*MRSA_prevalence_percent/100)
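A Poisson deviate can be sketched in Python with Knuth's multiplication method, which suits small means. This is a swapped-in textbook technique, not necessarily the one Simile uses; the name poidev_sketch and the admission figures are hypothetical.

```python
import math
import random

def poidev_sketch(mean, rng=random):
    """Draw from a Poisson distribution using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    # Multiply uniform draws together until the product drops to the limit.
    while product > limit:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(13)  # fixed seed so the sketch is repeatable
total_admissions = 40         # hypothetical figure
mrsa_prevalence_percent = 5   # hypothetical figure
mean = total_admissions * mrsa_prevalence_percent / 100
draws = [poidev_sketch(mean, rng) for _ in range(10000)]
sample_mean = sum(draws) / len(draws)  # close to the requested mean
```

For large means this method becomes slow and exp(-mean) eventually underflows, so other algorithms are normally preferred there.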
