3.4. Functions calculating statistics over realizations¶
3.4.1. Mean and variance, static data¶
In the folder framework
, open real.py
. It is the script created in the previous exercise drawing realizations from stochastic variables. PCRaster includes functions to calculate statistics over these realizations. This is done in the postmcloop. These functions read the data written to disk in the initial or the dynamic, and calculate the statistics over these data, writing the results to disk.
The function mcaveragevariance
calculates the mean and variance of realizations, on a cell-by-cell basis.
Save real.py
under the new name statistics.py
. Add the following lines to the postmcloop:
names = ['b']
sampleNumbers = self.sampleNumbers()
print(sampleNumbers)
timeSteps = [0]
mcaveragevariance(names, sampleNumbers, timeSteps)
The mcaveragevariance
function requires three inputs:
names
is a list of filenames (given as strings) of the map variables for which the statistics are needed. As these are read from disk, you need to have stored these in the initial or the dynamic using report. Here we use a list of length 1, containing one variable.sampleNumbers
is a list of Monte Carlo samples over which statistics are calculated. It is also printed to check it.timesteps
is a list of time steps for which the statistics are created. Asb
represents a static variable, we provide a list containing one zero element. This tellsmcaveragevariance
that it gets static data.
After adding the lines, change the number of Monte Carlo samples to use to 100, for more precision. Save the script and run. Type dir
in the main folder framework
. The mcaveragevariance
function has created the maps b-ave.map
and b-var.map
, containing the mean and variance. Display these maps.
Now, change names
to calculate the statistics of vegetation height also. Save and run the script. Type dir
in the main folder framework
to see what it has stored. Display the mean and variance maps of vegetation height.
Question: How well does your Monte Carlo script calculate the original statistics of the stochastic variables?
For the variable
b
, the variance is approximately between 50% and 150% of the expected value.Idem, between 10% and 190%.
Idem, between 1% and 1000%.
Idem, between 99% and 101%.
3.4.2. Percentiles, static data¶
The calculation of percentile data is almost just as easy.
Open statistics.py
and add the following two lines to the postmcloop, below the other statements:
percentiles=[0.025,0.05,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.95,0.975]
mcpercentiles(names,percentiles,sampleNumbers,timeSteps)
Save the script and run. The functions percentiles
works similar to mcaveragevariance
, with the exception that it needs a list providing the percentiles that need to be calculated. In the framework
folder, examine what the script has stored.
Create a probability density distribution plot of the vegetation height and the variable b
. Click on the map to retreive the plot for different cell locations.
Question: What is the 25% percentile value of b
? What is the probability that b
exceeds 1.0?
The 25% percentile value is about -0.05, the exceedance probability for 1.0 is about 0.2 (values depend on cell location).
The 25% percentile value is about -0.5, the exceedance probability for 1.0 is about 0.2 (values depend on cell location).
The 25% percentile value is about -0.01, the exceedance probability for 1.0 is about 0.2 (values depend on cell location).
The 25% percentile value is about -0.005, the exceedance probability for 1.0 is about 2.0 (values depend on cell location).
3.4.3. Dynamic data¶
A very similar approach can be used when calculating statistics over dynamic data.
Open statistics.py
. Let’s first create a more interesting dynamic variable. At the top of the dynamic is already written:
c = normal(1)
self.report(c,'c')
Now, type at the bottom of the dynamic:
self.d=self.d+c+1
self.report(self.d,'d')
Initialize self.d
in the initial by adding:
self.d=scalar(0)
Save statistics.py
and run the script. Check what is stored in the folder 1
.
In the main folder framework
animate a visualisation of 12 realizations of c
and d
. Use the Aguila options --scenarios
, --timesteps
and --multi
.
Question: What happens with the uncertainty in c
and d
over time?
The uncertainty in
c
andd
increases with time.The uncertainty in
c
andd
decreases with time.The uncertainty in
c
is time-independent, while the uncertainty ind
increases with time.The uncertainty in
c
increases with time, while the uncertainty ind
is time-independent.
To calculate the sample statistics, add the following block of code at the bottom of the postmcloop:
names=['c','d']
timeSteps=self.timeSteps()
print(timeSteps)
mcaveragevariance(names,sampleNumbers,timeSteps)
mcpercentiles(names,percentiles,sampleNumbers,timeSteps)
Note that this is very similar to the code needed for static data. We only need to change the names
and the timeSteps
. Note that self.timeSteps()
generates a list of the timesteps in the model. It is printed here.
Save the script and run. It will take some time to run as it needs to calculate statistics for each time step.
Type dir c*
in the command prompt to see what is stored by the script. Also try dir d*
.
Create probability distribution plots of c
and d
. Also, plot time series and animate.
Question: Set the map view of d
to retrieve the exceedance probability of a threshold value of 5. What happens with this probability over time?
It increases over time up to a value of about 0.01 at the last time step.
It increases over time up to a value of about 0.1 at the last time step.
It increases over time up to a value of about 1 at the last time step.
It increases over time up to a value of about 0.5 at the last time step.