
Dynamic variable computation

cjacobscrisioni edited this page Mar 17, 2025 · 4 revisions

The model uses a number of grid-based variables in calibration to fit models describing the presence of built-up land. Most of these variables are static, meaning they do not change throughout the modelling sequence. Some of the static variables do require preprocessing, as described in prior variable preparation. This section describes variables that are computed dynamically, meaning that they are recomputed using results from the concurrent or prior model iteration.

Degrees of urbanisation

Degrees of urbanisation are a method for classifying geographic space into levels of urbanisation. A summary of the technique is given by GHSL here. They are computed from 1 km population grids. Depending on user inputs, the mechanism can also include built-up fractions as a secondary variable. This mechanism has been coded in a Matlab and an ArcGIS version, with tools available publicly here. The degrees of urbanisation distinguish either three levels of urbanisation (level 1) or six levels of urbanisation (level 2).

For this project the Matlab code has been reproduced as a GeoDMS template, available through /Preprocessing/Calculate_DegreeOfUrbanisation/Gen_T. It requires total built-up surface, residential surface and total population in grids, and produces a grid with level 2 degrees of urbanisation as a result.

Considerable effort has been put into accurately reproducing the results of the GHSL scripts (see e.g. issues #45 and #97). After the final round of improvements, tests with Asia yielded 3694 false negative urban grid cells (0.6% of all urban grid cells) and 3053 false positives (0.5% of urban grid cells). The remaining false negatives seem to be due to different smoothing outcomes. See the maps below from close to Calcutta (Figures 1 and 2).

Figure 1 Degurba results from gen_T template in GeoDMS, close to Calcutta

Figure 2 Results from GHSL Matlab scripts, close to Calcutta

It is important to note that the GeoDMS scripts can approximate the original Matlab code, but cannot fully reproduce the same outcomes efficiently. For instance, the Matlab scripts subset a window containing a specific urban cluster and apply the 'smoothing' process only in that window. This process is looped until all identified urban clusters are processed. This could be reproduced in GeoDMS, but would require a sizeable number of grids to be constructed. In addition, it would require an iterative metascripted approach through templates or GeoDMS's iter() command. Both are feasible in principle, but become difficult to manage in code, especially because the degrees of urbanisation are computed repeatedly and the number of urban clusters is only known after population and built-up land have been modelled in a timestep.

A number of refinements to the degree of urbanisation calculation still need to be included. To mention some:

  • There is a parameter governing whether a grid cell should be part of an urban cluster depending on the level of built-up land in that grid cell. It is currently hard-coded to 0.5. Ideally this parameter is passed on in the template call.
  • The original degree of urbanisation approach can optionally also include country borders as a restriction on clusters. If that restriction is enforced, a cross-border cluster can only be identified as a city if it has enough population inside national borders. The internally computed degrees of urbanisation do not include this restriction. For the reference project, degrees of urbanisation were generated ex post using country borders as a restriction for most countries across the globe, except for a number of small countries such as Vatican City. In principle the country borders restriction can be imposed in the cluster generation. The most straightforward approach would be to search for clusters, then label them with a unique combination of cluster id and nationality, and finally determine the size of each labelled cluster. However, this comes with the caveat that parts of a cluster that are only connected through an intermediate non-domestic part are still considered a single cluster under the national boundary restriction.
  • The template currently takes residential and total built-up land as separate inputs. It needs to be verified whether this distinction is really necessary and effective in the applied template; presumably, only total built-up land is necessary.
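The country-border approach outlined in the list above (find clusters, label them by cluster id and nationality, then measure each labelled part) can be sketched in Python. This is a minimal illustration and not the GeoDMS template: the function names, the 8-cell connectivity and the toy grids are all assumptions made for the example.

```python
from collections import deque

# 8-connectivity offsets (an assumption; the template's connectivity rule may differ).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def label_clusters(mask):
    """Label contiguous True cells; returns a grid of cluster ids (0 = no cluster)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed cell
                cr, cc = queue.popleft()
                for dr, dc in NEIGHBOURS:
                    nr, nc = cr + dr, cc + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] and not labels[nr][nc]:
                        labels[nr][nc] = next_id
                        queue.append((nr, nc))
    return labels


def population_by_cluster_and_country(labels, country, population):
    """Sum population per (cluster id, country code) combination."""
    totals = {}
    for r, row in enumerate(labels):
        for c, cluster_id in enumerate(row):
            if cluster_id:
                key = (cluster_id, country[r][c])
                totals[key] = totals.get(key, 0) + population[r][c]
    return totals


# Toy example: one dense strip crossing from country A into B and back into A.
dense = [[True, True, True, True, True]]
country = [["A", "A", "B", "A", "A"]]
population = [[30000, 5000, 40000, 10000, 15000]]

labels = label_clusters(dense)
totals = population_by_cluster_and_country(labels, country, population)
```

In the toy example the two parts in country A are only connected through the cell in country B, yet their population is summed under a single (cluster id, country) key. That is the caveat described in the list above.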

Distance to settlements

Distance to cities, and distance to villages, towns or cities, are included in the calibrated model describing locational suitability. These depend on the results of the degrees of urbanisation template described above. Here,

  • a city is a contiguous cluster of grid cells with a sufficiently high population density (1,500 inhabitants / sq km) and at least 50,000 inhabitants in the cluster;
  • a town is a cluster of grid cells with at least 300 people per square kilometre, at least 5,000 inhabitants and fewer than 50,000 inhabitants; and
  • a village is a cluster of cells with at least 300 people per square kilometre, at least 500 inhabitants and fewer than 5,000 inhabitants.
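The three definitions above can be written as a small classification function. The sketch below applies the listed thresholds literally to a single cluster. The function name and the use of the lowest cell density in the cluster as input are simplifying assumptions for illustration; in the actual template, clusters are formed from the grids themselves.

```python
def classify_settlement(min_cell_density, cluster_population):
    """Return 'city', 'town', 'village' or None for one contiguous cluster.

    min_cell_density: lowest population density (inhabitants per sq km)
        among the cells of the cluster (hypothetical simplification).
    cluster_population: total number of inhabitants in the cluster.
    """
    if min_cell_density >= 1500 and cluster_population >= 50_000:
        return "city"
    if min_cell_density >= 300:
        if 5_000 <= cluster_population < 50_000:
            return "town"
        if 500 <= cluster_population < 5_000:
            return "village"
    return None  # the cluster matches none of the three definitions
```

For example, `classify_settlement(400, 20_000)` returns `"town"`, while a cluster with fewer than 500 inhabitants returns `None`.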

Two approaches have been developed to compute these distances as weighted costs. The currently implemented version creates an explicit network that connects every grid cell to its 8 neighbours. GeoDMS's impedance_table() function is subsequently used to estimate the cost of reaching the closest of a set of destination points, provided a destination lies within the cutoff distance.
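The idea of an 8-neighbour grid network with a cutoff can be illustrated with a multi-source shortest-path search in Python. This is a sketch of the general technique only and makes no claim about how impedance_table() works internally. The function name, the per-cell cost surface and the edge-cost rule (mean of the two cell costs times the step length) are assumptions for the example.

```python
import heapq
import math


def grid_costs(cell_cost, destinations, cutoff):
    """Least accumulated cost from every cell to its closest destination cell.

    cell_cost: 2D list of per-cell traversal costs (hypothetical impedance surface).
    destinations: iterable of (row, col) destination cells.
    cutoff: paths costlier than this are not explored; such cells stay None.
    """
    rows, cols = len(cell_cost), len(cell_cost[0])
    best = [[None] * cols for _ in range(rows)]
    heap = [(0.0, r, c) for r, c in destinations]
    heapq.heapify(heap)
    while heap:
        cost, r, c = heapq.heappop(heap)
        if best[r][c] is not None:
            continue  # already settled with a lower or equal cost
        best[r][c] = cost
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if best[nr][nc] is not None:
                    continue
                # Assumed edge cost: mean of both cell costs, scaled by step length
                # (1 for straight moves, sqrt(2) for diagonal moves).
                step = math.hypot(dr, dc)
                new_cost = cost + step * (cell_cost[r][c] + cell_cost[nr][nc]) / 2
                if new_cost <= cutoff:
                    heapq.heappush(heap, (new_cost, nr, nc))
    return best
```

Starting the search from all destinations at once gives each cell the cost to its nearest destination in a single pass, and the cutoff keeps the search from expanding further than needed.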

The network definition used is given in /Preprocessing/Calculate_GridDistances/Network. Results with historical settlements can be found in /Preprocessing/GridDistances/[Y____]/get_Grid_Costs_netw.

An alternative specification has also been developed, which relies on a grid-based distance search instead. For this purpose, an extended version of the griddistance operator has been implemented in GeoDMS. This version:

  • Takes into account the latitude-specific aspect ratio of lat-long rasters.
  • Allows the specification of a maximum impedance to prevent calculating more than necessary.
  • Allows for the use of zones and a zonal boundary impedance specification.
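The latitude-specific aspect ratio in the first item above can be illustrated as follows. In a lat-long raster the north-south size of a cell is roughly constant, while its east-west size shrinks with the cosine of the latitude. The sketch below assumes a spherical Earth, and the function names are hypothetical; it does not reproduce the griddistance operator itself.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def cell_size_km(latitude_deg, cell_size_deg):
    """Approximate (width, height) in km of a lat-long cell centred at a latitude."""
    height = math.radians(cell_size_deg) * EARTH_RADIUS_KM
    width = height * math.cos(math.radians(latitude_deg))
    return width, height


def step_length_km(latitude_deg, cell_size_deg, d_row, d_col):
    """Length in km of a move of d_row rows and d_col columns between cell centres."""
    width, height = cell_size_km(latitude_deg, cell_size_deg)
    return math.hypot(d_col * width, d_row * height)
```

At 60 degrees latitude an east-west step is about half as long as a north-south step, which is why a grid distance search on a lat-long raster has to weight horizontal and vertical moves differently per row.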

This operator requires that the modelling grid can easily be warped to an equidistant projection type (such as Mercator). The model currently runs in Mollweide, for which grids are not easily warped to an equidistant equivalent. The use of this operator has therefore been discontinued. For those interested, the implementation is still present in older model versions; see the item costsgrid_zonal_untiled_maximp_latitude_specific in the GridDists.dms file from Spring - Summer 2024.
