Thursday, August 28, 2025
No menu items!
HomeNatureMapping urban gullies in the Democratic Republic of the Congo

Mapping urban gullies in the Democratic Republic of the Congo

Identifying cities significantly affected by UGs

We first identified all cities in the DRC that were significantly affected by UGs. For this, we checked all urban centres that were assigned the official status of ‘city’ by presidential decree (Articles 53–55 of Decree Law 081 of 2 July 1998) as well as other urban centres with at least 80,000 inhabitants in 2020 (according to ref. 51) that show characteristics of small cities. This list can be considered exhaustive.

The presence of UGs was checked in all these cities using available Google Earth imagery of very high resolution (that is, a resolution of 1 m or smaller; Supplementary Table 1). We considered a feature to be a UG if it could be recognized as such, based on commonly accepted geomorphic criteria25,52. More specifically, the feature had to be recognizable as a channel eroded by concentrated runoff with an elongated shape, a discernible thalweg, a gully head and visible gully edges. Furthermore, the thalweg needed to be oriented along the steepest slope or in another way that allowed effective runoff evacuation (for example, following a road downslope). Moreover, the gully needed to be located within 200 m of buildings. Field surveys of 434 gullies in Kinshasa, Kikwit and Bukavu confirmed that all mapped features were UGs. Nonetheless, these field visits also showed several smaller UGs that were not detected. We, therefore, restricted our analyses to UGs with a thalweg of at least 30 m to avoid potential biases caused by contrasts in detection accuracy.

To verify whether the detected UGs are linked to urban growth, we checked high-resolution panchromatic aerial photographs of each city, taken in the 1950s. These photographs are conserved at the Royal Museum for Central Africa in Belgium53. If identified gullies were already present, we examined whether they were inside or outside built-up areas and, when outside, whether they were linked to the road network (that is, the gully formed along a road or lies in its direct extension, within a distance of about 100 m or less). Gullies observed in the 1950s that were located outside built-up areas and not related to the road network were assumed to be of natural origin. They were not retained for further analysis.

Mapping the extent and expansion rates of UGs

In each affected city (Methods, ‘Identifying cities significantly affected by UGs’, Fig. 3), we manually mapped the polygons delineating the spatial extent of all UGs on a reference image (Supplementary Fig. 5). This reference image was a cloud-free, very high-resolution image available in Google Earth that was taken between 2021 and 2023 (Supplementary Table 1). Given that many UGs evolve into branched networks of multiple gully heads (Fig. 1), we considered a branching feature as an individual UG if it had an identifiable gully head and a thalweg of at least 30 m long. For most cities, the limits of the UGs could be identified and mapped because of the clear visual contrast between the gully channel and the surrounding environment. A notable exception was the city of Bukavu, for which the Google Earth imagery did not always allow a clear delineation. This is attributable to the clayey soils on which the city is built54, resulting in insufficient visual contrast. For this city, the mapping was complemented with handheld GPS field surveys.

Next, the areal expansion rates of the UGs were quantified by remapping their limits as observed on older images available in Google Earth of adequate quality (Supplementary Fig. 5). For Kinshasa and Kikwit, this imagery was complemented with Pléiades images (taken on 21 April 2015 for Kikwit and 28 April 2014 or 19 June 2015 for Kinshasa). Depending on their age and the availability of imagery, UGs were digitized one to five times, with image dates ranging between 2002 and 2023. For each gully observed on at least two images with different dates, the gully expansion was calculated by subtracting the area of the gully polygon, as mapped on the older image, from the area on the more recent image. We defined an expansion event as an urban gully showing an increase in mapped extent of at least 10 m2 between two image dates. It should be noted, however, that this observed expansion may, in reality, be attributable to several consecutive, smaller expansion phases that occurred between the two image dates.

For UGs that were newly formed during the observation period, the areal expansion was assumed to be the area of the polygon as mapped on the first image in which the gully was visible. For expansion events of already existing gullies, we further differentiated between gully head retreat (GH) and sidewall widening (SW). Similar to ref. 26, we considered GH to be the part of the gully expansion that occurred upslope of the gully head on the oldest image (demarcated with a straight line, perpendicular to the gully thalweg; Supplementary Fig. 5). SW was quantified as the expansion that took place downslope of this gully head. This way, each mapped gully expansion event could be attributed to the formation of a new gully, gully head retreat and/or gully sidewall widening.

For gullies with three or more suitable images available, two to four expansion rates were calculated using pairs of subsequent images and the procedure described above. Yet, the exact dates on which the observed gully expansion occurred are mostly unknown. To reconstruct the cumulative expansion of gullies over time (Fig. 4b) and estimate the displaced population (Methods, ‘Estimating the displaced population’), we, therefore, assumed that each expansion event took place halfway between the two image dates. For newly formed gullies, the formation date was assumed to be the average between the date of the last available image in which the UG was absent and the date of the first image in which it was present.

Assessing the factors controlling UG occurrence

We conducted bivariate and multivariate analyses to assess the factors that help us to explain the spatial patterns of UGs across DRC. Owing to data constraints (for example, lack of high-resolution digital elevation models) and to allow robust comparisons with areas not affected by UGs, we conducted these analyses at a resolution of 30 arcseconds (about 1 km at the equator).

We first converted our gully inventory to cells of this resolution. A cell was classified as being affected by UGs if it contained one or more gully heads as mapped on the reference image (Methods, ‘Mapping the extent and expansion rates of UGs’). This resulted in a dataset of 752 cells in which UGs occur. Next, we randomly generated cells in affected and non-affected cities across the DRC (Methods, ‘Identifying cities significantly affected by UGs’), both inside the city limits and in a 2-km buffer around them. We then checked in Google Earth whether signs of gullies were present in these cells. In total, we checked 1,469 additional cells: 754 within the city limits (which all had no gullies) and 715 in a 2-km buffer around it. Only two cells of the latter were found to have gullies and were added to the cells having a UG, bringing the total number of cells with a gully to 754 and the number of cells without to 1,467.

For each cell, we extracted a set of variables from different geospatial datasets49,55,56,57,58,59,60,61 that might be relevant in explaining the presence or absence of gullies (Supplementary Table 2). Most of these variables are commonly used in other studies aiming to explain the occurrence of gullies25. Using two-sided Mann–Whitney U-tests62, we analysed to what extent the distribution of these variables differed significantly between cells with or without gullies.

We then constructed a logistic regression model63 that combines the variables best explaining observed differences between affected and non-affected cells, making sure each variable remained significant within the model. For this, we first standardized all predictor variables to values between 0 and 1, based on the observed minimum and maximum across all cells. Next, we applied a backwards stepwise selection procedure, in which we first fitted a logistic regression model based on all remaining variables (Supplementary Table 2) and then systematically removed all variables that were not significant (P-value of Z-statistic >0.0001). We also excluded the variable Sand_iSDA (Supplementary Table 2) from the model, as it was probably subject to important uncertainties59 and strongly correlated to the more robust dummy variable Soil, which was also highly significant and functionally expressed the same property. The variables that were finally retained in the model were Slope, Soil, Tree_Cover, BUA and Road (cf. Supplementary Table 2).

The overall performance of the model was tested based on a cross-validation, in which we made ten 70:30% random splits between calibration and validation data. Each time, the four retained variables were standardized based on the range of the training data and their corresponding coefficients refitted (Supplementary Table 3). The resulting alternative model was then applied to the independent test data. We used a validation receiver operating characteristic curve and the corresponding area under the curve as a proxy for model performance64 (Supplementary Fig. 3).

Estimating the displaced population

To estimate the population that was probably displaced by the formation and expansion of UGs, we made use of JRC GHS population data8, that is, a series of gridded datasets that provide estimates of population density at a spatial resolution of about 100 m. This dataset was chosen as it covers the whole territory of the DRC and the entire observation period of our study. Furthermore, it was positively evaluated by previous studies65.

The displaced population caused by each gully expansion event was calculated as

$$\text{DP}\,=\,({\text{Area}}_{\text{recent}}\,\ast \,{\text{PopDens}}_{\text{recent}})-({\text{Area}}_{\text{old}}\,\ast \,{\text{PopDens}}_{\text{old}})$$

(1)

where DP is the total population expected to be displaced due to an observed case of gully expansion or formation; Arearecent is the area of the UG as mapped on the most recent of the two images; Areaold is the area of the UG as mapped on the oldest of the two images; and PopDensrecent and PopDensold are the average population densities in the polygons corresponding to Arearecent and Areaold, respectively (Supplementary Fig. 5). PopDens values were derived from the JRC GHS population data8, considering the year that the expansion event was expected to take place (Methods, ‘Mapping the extent and expansion rates of UGs’). As these data are originally available for 5-year intervals (for example, 2000, 2005), estimates for in-between years were obtained by first linearly interpolating these raster datasets. Summing the DP values for all (chronologically sorted) cases of UG expansion allowed us to calculate the cumulative number of displaced persons (Fig. 4b).

A similar strategy was applied to differentiate between population displaced by the formation of new gullies (New), sidewall widening (SW) or gully head retreat (GH). For the first, Areaold was assumed to be zero. For SW, we only considered the area downslope of the previously mapped gully head when calculating Arearecent. For GH, we only considered the area upslope of the previously mapped gully head when calculating Arearecent, whereas Areaold was assumed to be zero. The population density of each expansion event was calculated by dividing DP by the areal extent of the event. We tested whether the distributions of areal expansion, population density and expected displaced population were significantly different between types of gully expansion (that is, New, SW, GH or Total; Fig. 4a) using two-sided Mann–Whitney U-tests62.

Estimating the exposed population

To estimate the population exposed to UG expansion in a given year, we considered all mapped gully polygons that were present in that year (Methods, ‘Mapping the extent and expansion rates of UGs’) and generated hazard zones around them, using different buffer distances (Supplementary Fig. 5). The total exposed population was then calculated as

$$\text{PE}\,=\,\mathop{\sum }\limits_{i=1}^{n}({\text{PopDens}}_{\text{year},i}\times {\text{AH}}_{i})$$

(2)

where PE is the total population exposed to the expansion of UGs, n is the number of hazard zone polygons, PopDensyear,i is the population density of hazard zone polygon i in the considered year according to the JRC GHS population data8, and AHi is the area of hazard zone polygon i. Mapped gully polygons were excluded from the hazard zone polygons, and overlapping polygons were only counted once.

To account for various degrees of exposure, different buffers were considered when creating these hazard polygons. First, we applied a buffer distance of 100 m around the mapped contours of the gullies. Although somewhat arbitrary, this distance provides an intuitively understandable estimate of the population for which the threats of UG expansion are a regular and significant concern. These people are potentially exposed to direct impacts such as damage to their property and/or displacement, but also to numerous indirect and intangible impacts (for example, decreased housing property value, required investments in initiatives to counter further gully expansion, decreased accessibility, increased stress). Almost no data on these indirect impacts are currently available. However, our mapping efforts and field surveys indicate that this distance of 100 m is probably still a highly conservative value. For example, we also observed that people living several hundred metres away from a UG are involved in implementing measures aiming to stop gully expansion. Usually, they do so at their own expense28,35.

Second, we generated buffer areas that characterize the potential expansion zones of the UGs. People living within these buffers are expected to be directly exposed to potential property damage, displacement and even injury or death. The potential expansion areas were quantified by considering both gully widening and gully head retreat. For widening, we analysed the maximum widths of all UGs that were at least 10 years old as mapped on the reference images (Methods, ‘Mapping the extent and expansion rates of UGs’). These widths varied strongly within and between cities (Supplementary Fig. 2). Nonetheless, UGs in sandy substrates (see ‘Soil’ in Supplementary Table 2; Fig. 2) were significantly wider than those formed in other substrates according to a two-sided Mann–Whitney U-test (P < 0.0001; Supplementary Fig. 2). Hence, we treated these two samples differently and considered the 95% quantile as the expected maximum width a gully may attain in this substrate (that is, 70 m for UGs in sandy substrates and 51 m for UGs in non-sandy substrates). We then generated buffer areas around the thalweg of each mapped UG with distances equal to half of these expected maximum widths.

A similar strategy was used for the potential gully head retreat. We assessed the average linear gully head retreat rate (quantified as the Euclidean distance over which the head retreated, divided by the observation period) of all UGs for which the available imagery allowed us to do so over a period of a minimum of 10 years (Methods, ‘Assessing the factors controlling UG occurrence’). As with the maximum gully widths, these expansion rates varied considerably within and between cities but could be robustly grouped in UGs formed in sandy substrates and UGs formed in other substrates based on a two-sided Mann–Whitney U-test (Supplementary Fig. 2; P < 0.0001). We considered the 95% quantiles of these two populations (that is, 19.2 m yr−1 for sandy and 10.6 m yr−1 for other substrates) and multiplied them by 10, as we aimed to assess the population exposed to gully expansion at a decadal timescale. The resulting distances (192 m or 106 m) were used to generate buffer areas around the gully heads present in the considered year. We chose circular buffer areas because the exact direction of gully head retreat is impossible to predict. Numerous gully heads changed direction or bifurcated as they retreated, depending on the road network, local topography and other factors that are hard to quantify (Fig. 1). We then merged these circular buffers around the head with the maximum width buffers around the thalweg and subtracted the mapped UG polygons from them. The resulting polygons indicate the land areas that are potentially prone to further gully expansion and were used to calculate AHi (equation (2) and Supplementary Fig. 5).

Third, we generated hazard zones that characterize the expected expansion zones of UGs. People living in these areas are severely exposed to the impacts of UG expansion. For this, we followed the same strategy as for the potential UG expansion zones. Yet, we considered the averages rather than the 95% quantiles to create the buffers. For the maximum gully widths, these corresponded to 32 m in sandy and 19 m in non-sandy substrates (Supplementary Fig. 2). The average linear head retreat rates were 5.4 m yr−1 and 3.2 m yr−1 for UGs in sandy and non-sandy substrates, respectively.

To further assess the main factors contributing to the increases in UG exposure between 2010 and 2023 (Fig. 5b), we compared different scenarios. More specifically, for each hazard zone, we calculated the population in 2023 that lived in the hazard zones of UGs as mapped in 2010. Comparing this number with the originally exposed population of 2010 indicates how much of the increase in exposure is attributable to population increases in already existing hazard zones. Likewise, we calculated the 2023 population living in the 2023 hazard zones of UGs that were already present in 2010. Comparing this with the exposed population of 2010 indicates how much of the increase in exposure is due to further expansion of already existing UGs. Finally, by subtracting these two contributions from the total increase, we assessed the exposure increase due to new UGs that were formed since 2010.

Uncertainty assessment

Both our estimates of the displaced (Fig. 4) and exposed (Fig. 5) populations are subject to uncertainties. The main source of these uncertainties is potential errors in the JRC GHS8 population estimates that were used. As these errors are unknown, an exact calculation of the associated uncertainty is impossible. Yet, to evaluate the overall reliability of our results, we repeated the analyses of population exposure in 2020 (Methods, ‘Estimating the exposed population’) based on WorldPop Population data. For this, we used the unconstrained and unadjusted datasets66,67. WorldPop datasets have a similar spatial resolution as JRC GHS data and are informed by the same underlying census data (CIESIN GPWv4.11), but are based on a different modelling approach8,66. Moreover, WorldPop data are only available until 2020. Results show that, although comparable, estimates of the exposed population are lower when using WorldPop data (Supplementary Fig. 6 and Supplementary Table 6). This is particularly true for small cities and, to a lesser extent, for Kinshasa. For large cities (for example, Kikwit, Mbuji-Mayi, Bukavu), the deviations are typically less than 20%. Further analyses showed that this is mainly because WorldPop underestimates population density estimates in large parts of the affected areas (Supplementary Fig. 7). In many pixels in which WorldPop assumes densities of fewer than 50 persons ha−1, JRC GHS indicates population densities of 100–1,000 persons ha−1 (Supplementary Fig. 7). Further verification in Google Earth (considering the number of houses) confirmed that the latter are probably more accurate. Also, earlier studies indicated that unconstrained WorldPop data are often prone to underestimations in highly populated areas68,69. Hence, although the estimates of exposed population are higher when using JRC GHS data, they are probably also more correct. Given the highly similar procedure (Methods, ‘Estimating the displaced population’), the same probably holds for the displaced population.

The main limitation of both datasets is that they are global data products based on the interpolation and extrapolation of often very limited population statistics, using proxies that probably correlate to population densities8,65,66,67. This is a necessity as there are generally no detailed and accurate population data available for the DRC. A notable exception to this is the city of Bukavu, for which detailed census data were collected in 2018, based on the methodology presented in ref. 70. Extrapolation of these data to the level of individual neighbourhoods resulted in a population density map that directly builds on field-based observations and provides the most reliable population data that can reasonably be expected. To further assess the uncertainty of our estimates, we quantified the exposed population using this population density layer (according to the gully extents in 2020; Methods, ‘Estimating the exposed population’) and compared the results with those obtained with JRC GHS and WorldPop data for 2018 (Supplementary Table 7). Although estimates based on the JRC GHS data are very similar to those based on WorldPop data, they are 22.9–39.3% lower than the results obtained from the direct census data. This indicates that, at least in the case of Bukavu, our exposure and displacement rates are probably still underestimations.

Based on these comparisons, and taking into account the effect of other sources of errors (for example, inaccuracies in mapped UG polygons), we cautiously assumed that all estimates of displaced and exposed population are subject to relative errors of up to 40%. Accordingly, we estimated uncertainty ranges by considering values up to 40% higher or lower than the reported figures.

RELATED ARTICLES

Most Popular

Recent Comments