Data-driven modelling of joint debris flow release susceptibility and connectivity

In mountain basins, sediment supply to the fluvial system occurs mainly through episodic geomorphic processes, such as debris flows and other landslide types, whose effectiveness is strongly influenced by the structural connectivity within a catchment. This paper presents a novel data-driven approach to identify and map areas that are simultaneously susceptible to debris flow initiation and structurally connected to the main channel network (i.e. relevant sediment source areas for predicting and mitigating flood hazards in the river channels). The presented approach comprises: (i) the visual interpretation, delineation, mapping and classification of event-specific connected and disconnected debris flow areas in three catchments of the Italian Alps; (ii) the development of data-driven debris flow release susceptibility models that are combined with quantitatively classified index of connectivity (IC) maps; and (iii) a thorough evaluation of the approach, including an assessment of its spatial transferability across the catchments. The main results show: (i) quantitative IC thresholds to discriminate connected from disconnected debris flow release areas; (ii) statistically well-performing and geomorphically plausible debris flow release susceptibility models for the three basins; (iii) diverse joint debris flow connectivity-susceptibility maps that allow identifying zones which are differently relevant in terms of debris flow connectivity. This work also highlights the spatial transferability of the approach, associated benefits and potential drawbacks, as well as the utmost importance of a thorough combined quantitative and qualitative (i


| INTRODUCTION
Sediment connectivity can be defined as a state variable of a geomorphic system reflecting the degree of linkage that controls the sediment flux throughout the landscape (Cavalli et al., 2019; Heckmann et al., 2018; Wohl et al., 2019). Three types of sediment connectivity (lateral, longitudinal and vertical) can be analysed in catchments, depending on the investigated linkages among the system components. Lateral connectivity focuses on linkages between the channel network, its floodplains and hillslopes. Longitudinal connectivity concerns the sediment transfer along the channel network, while vertical connectivity is related to bed surface-subsurface sediment interactions (Bracken et al., 2015; Fryirs, 2013).
The assessment of sediment connectivity is pivotal in Alpine headwaters, since high mountains experience some of the largest sediment fluxes and are major sediment sources for river systems (Carrivick & Tweed, 2019; Carrivick et al., 2018; Cavalli et al., 2013; Hoffmann et al., 2013). Transfer of sediments to the channel network occurs primarily through episodic geomorphic processes such as rockfalls, debris flows, landslides and floods, whose effectiveness in terms of sediment supply is extremely variable and strongly influenced by climate conditions, by the complex morphological setting of catchments, valley morphometry (Cavalli et al., 2019) and other types of buffers (e.g. alluvial fans, floodplains), barriers (e.g. valley steps, dams) and blankets (e.g. bed armouring).
Interest in sediment connectivity research has grown over the past decades, along with the approaches trying to quantify it. Several methodical approaches and indices have been proposed (Heckmann et al., 2018; Wohl et al., 2019). Methodological procedures range from the development of indices based on drainage area and slope (Dalla Fontana & Marchi, 2003) to GIS-based models (Borselli et al., 2008; Cavalli et al., 2013; Lane et al., 2017) and sophisticated two-dimensional or mathematical models (Baartman et al., 2020; Cossart & Fressard, 2017; Michaelides & Wainwright, 2002). Borselli et al. (2008) introduced the index of connectivity (IC), which was further developed by Cavalli et al. (2013). The IC spatially assesses structural connectivity based on high-resolution digital elevation models (DEMs). It is a geomorphometric index considering the potential for downward routing of the sediment in the upslope catchment and the flow path length that sediment has to travel to reach a given target based on DEM-derived parameters and a sediment flux impedance factor (Cavalli et al., 2020). It has been used in many recent studies for various aims, such as to assess the effects of land use and topographic changes in Mediterranean and mountain catchments (Cislaghi & Bischetti, 2019; Lizaga et al., 2018; Llena et al., 2019; López-Vicente et al., 2017), to estimate the impact on geomorphic systems of natural infrequent disturbances (Martini et al., 2019; Ortíz-Rodríguez et al., 2017; Pellegrini et al., 2021), and to identify dominant processes acting in headwater catchments and proglacial areas (Bollati & Cavalli, 2021; Cavalli et al., 2019; Goldin et al., 2016; Heiser et al., 2015). Research based on the IC has mainly been conducted to compare the overall sediment connectivity between different basins or different sectors of the same catchment, as well as to differentiate between areas featuring relatively high/low connectivity with respect to the closest main channel or to the catchment outlet. However, most of the published research on connectivity does not explicitly account for the actual terrain susceptibility to specific processes (e.g. debris flows or other types of mass movements).
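For orientation, the IC is commonly written in the following form in the literature cited above (Borselli et al., 2008; Cavalli et al., 2013). The notation is the conventional one from those works rather than anything defined in this text, so it should be read as a reminder and not as the exact computation used here, which is detailed in the Supplementary Material.

```latex
\mathrm{IC} = \log_{10}\!\left(\frac{D_{\mathrm{up}}}{D_{\mathrm{dn}}}\right)
            = \log_{10}\!\left(\frac{\bar{W}\,\bar{S}\,\sqrt{A}}
              {\sum_{i} \dfrac{d_i}{W_i\,S_i}}\right)
```

Here $\bar{W}$ and $\bar{S}$ are the average weighting (impedance) factor and the average slope gradient of the upslope contributing area $A$, while $d_i$, $W_i$ and $S_i$ are the path length, weighting factor and slope gradient of the $i$-th cell along the downslope flow path to the target. Higher (less negative) values indicate higher structural connectivity.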
The term 'landslide susceptibility' describes the spatial propensity of an area to be affected by a landslide (Guzzetti et al., 2005). Beyond the single-slope scale, a reasonable parameterization of spatial physically based slope stability models is often hindered by the unavailability of spatial geotechnical data. In contrast, data-driven models are more flexible in terms of their input data, while often still being able to outperform their physical counterparts (Seefelder et al., 2017). In past decades, a large amount of research focused on identifying landslide-prone terrain at regional or multi-regional scale using data-driven procedures (Budimir et al., 2015; Goetz et al., 2015; Reichenbach et al., 2018; Steger & Kofler, 2019). In many cases, those data-driven assessments focused specifically on environmental conditions present at the source zone of a specific landslide type in order to identify potential process starting zones (Heckmann et al., 2014; Petschko et al., 2016; Steger et al., 2016a). Besides the delineation of landslide-prone terrain, landslide susceptibility models were also applied to obtain insights into the influence of environmental variables on landslide occurrence (Goetz et al., 2015; Pisano et al., 2017; Schmaltz et al., 2017; Vorpahl et al., 2012). Landslide susceptibility assessments, combined with further analyses, were also used to develop rules for regional-scale early warning and landslide hazard, exposure or risk zonations (Guzzetti et al., 2005; Krøgli et al., 2018; Pereira et al., 2016). Landslide susceptibility models and maps were also investigated for their potential to explain spatial variability in sediment yield (Broeckx et al., 2016), or the association between landslide susceptibility and landslide mobilization rates (Broeckx et al., 2019). Bordoni et al. (2018) included the IC in a data-driven landslide model to estimate landslide-prone road sections.
It is expected that a combined perspective which considers both lateral sediment connectivity and landslide susceptibility will be required to identify areas that are most relevant to provide specific sediments (e.g. debris flow material) to the channel network, because an area that is structurally connected to a channel is not necessarily prone to trigger a specific landslide type. Despite the relatively high amount of published research in both fields, data-driven procedures that combine the spatially explicit IC mapping with models that identify areas prone to slope instability are still missing. Up to now, to the best of our knowledge, no approaches have been presented to classify

| Geological and geomorphological settings
Three catchments situated in South Tyrol (Northern Italy) were analysed in this study (Figure 1a): Stolla, Pfitsch/Vizze and Sulden/Solda. The latter catchment is drained by the Sulden River and by a major tributary, the Trafoi River. The three catchments are spatially distributed across the region and are characterized by varying substrate lithologies, drainage areas, geomorphological setting, land use and hydrological regime (Table 1).
The Stolla River flows from south to north (Figures 1a and b) and its basin is composed of dolomites, limestones and sandstones. The Stolla channel is mainly confined in the upper part, whereas it is characterized by an alternation of confined and partly confined reaches in the middle and lower sectors (Scorpio et al., 2022). The dolomitic rock walls frequently deliver large volumes of loose sediments to talus slopes and cones, whose sediments are frequently reworked by debris flow processes during rainstorms.
The Pfitsch basin has a NE-SW orientation and drains into the Eisack/Isarco River (Figures 1a and c). A prevalence of metamorphic rocks, especially gneiss and schist, and intrusive igneous rocks (granite) characterizes its substrate. Aside from the first 3.5 km, where the channel is highly confined by the hillslopes, the channel alternates between unconfined and partly confined reaches upstream of the Novale dam (drainage area 112 km², Figure 1c). Downstream from this dam the channel flows in a narrow and confined valley for about 5 km, whereas the last 3 km of channel length is unconfined and channelized. Large debris flow fans are present at the toe of both valley sides.
The Sulden catchment (drainage area 130 km²) is drained by two main rivers flowing from north to south, the Sulden (drainage area 107 km², eastern part of the basin) and the Trafoi (drainage area 53 km², western part, Figures 1a and d). The area shown in Figure 1d will hereafter be referred to as 'Sulden'. Bedrock geology includes both metamorphic rocks (mainly quartz phyllites, ortho- and paragneiss) and carbonate rocks (limestones and dolomites). Channels are mainly confined or partly confined, with a diffuse presence of debris flow fans and talus cones at the edges of the valley bottom.

| Recent debris flow-triggering events
In the last decade, the three basins have been affected by high-magnitude storm events which caused channel widening and several mass movements, including shallow landslides and debris flows (Figure 2).
In the Stolla basin, a short-duration rainstorm (6 h) with a rainfall intensity exceeding 45 mm/h occurred on 5 August 2017. The cumulative rainfall depth ranged from 40 to 84 mm and the return period was estimated to be in the order of 200 years for a rainfall duration of 1 h (Scorpio et al., 2022). More than 500 debris flows, covering a total area of 1.7 km² (4.1% of basin area), were triggered along the hillslopes during this event, and the Stolla main channel underwent considerable widening and bed-level changes (Scorpio et al., 2022, Figures 1b and 2a-c).
In the Pfitsch basin, a very intense summer storm took place on 4 August 2012. The rain gauge in Sterzing (2 km downstream of the basin outlet) registered 72.8 mm of rainfall in 6 h, which corresponds to an estimated return period between 200 and 300 years (Macconi et al., 2012). The area close to the basin outlet received both the maximum event rainfall accumulation (136 mm) and the maximum hourly rain intensities (38 mm/h; Destro et al., 2018). The cumulative rainstorm precipitation decreased moving towards the upper part of the basin, and from the left side to the right side of the valley. The storm triggered numerous landslides and debris flows (Figures 1c and 2d and e), covering a total area of 0.77 km² (0.5% of basin area); their spatial distribution mirrored quite well the observed rainfall distribution (Destro et al., 2018).
The upper part of the Sulden basin was affected by a rainstorm on 13 and 14 August 2014 (Macconi et al., 2012). The Madritsch weather station (located at about 2900 m a.s.l.) registered a mean rainfall intensity of 2.9 mm/h and a total rainfall depth of 48.5 mm over a duration of 16.8 h (Kofler et al., 2021). Several hillslope processes were triggered by this event (Figures 1d and 2f-h), including a rock glacier front failure evolving to a debris flow and other debris flows leading to a total debris flow area of about 0.44 km², corresponding to 0.27% of the basin area (Kofler et al., 2021; Savi et al., 2021). Intense bedload transport was measured in the Sulden River (Coviello et al., 2019). However, the August 2014 storm event in Sulden was estimated to be considerably more frequent than the events analysed in the Stolla and Pfitsch basins, with a recurrence interval likely in the range 20-50 years.

| METHODS
The proposed approach builds upon three main steps: (i) creation of a classified debris flow release susceptibility map (susceptible vs not susceptible); (ii) thresholding of the sediment connectivity index map with respect to debris flow initiation areas (connected vs disconnected); and (iii) joining of the two resulting binary maps to produce a combined debris flow release connectivity-susceptibility map.
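The third step, the joining of the two binary maps, can be illustrated with a minimal sketch. It assumes that a cell is connected when its IC exceeds the cutpoint and susceptible when its release probability reaches the cutpoint (the `>=` convention is an assumption). The label "not susceptible but connected", the function names and the small input grids are hypothetical, and the cutpoints used in the example are the Stolla values reported in the results (−3.21 and 0.31).

```python
def joint_class(ic_value, release_prob, ic_cut, prob_cut):
    """Assign one of four joint connectivity-susceptibility classes to a cell."""
    connected = ic_value > ic_cut          # IC above cutpoint -> connected
    susceptible = release_prob >= prob_cut  # assumed convention for the cutpoint
    if susceptible and connected:
        return "susceptible and connected"
    if susceptible:
        return "susceptible but disconnected"
    if connected:
        return "not susceptible but connected"  # hypothetical label wording
    return "not susceptible and disconnected"


def joint_map(ic_grid, prob_grid, ic_cut, prob_cut):
    """Overlay two equally shaped grids (lists of rows) cell by cell."""
    return [
        [joint_class(ic, p, ic_cut, prob_cut) for ic, p in zip(ic_row, p_row)]
        for ic_row, p_row in zip(ic_grid, prob_grid)
    ]


# Invented 2 x 2 example grids (not data from the study).
ic_grid = [[-2.5, -3.7], [-2.9, -4.1]]
prob_grid = [[0.60, 0.45], [0.10, 0.05]]
stolla_map = joint_map(ic_grid, prob_grid, ic_cut=-3.21, prob_cut=0.31)
for row in stolla_map:
    print(row)
```

In a real workflow the same cell-wise logic would be applied to the raster layers in a GIS or array library; plain lists are used here only to keep the sketch self-contained.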
The methodological workflow is depicted in Figure 3 and started with the mapping and labelling (connected vs disconnected) of debris flow polygons and sampling of debris flow release points (Figure 3a; Section 3.1). The labelled debris flow release points then served as input for a logistic regression model that enabled us to derive a quantitative threshold for classifying the IC map (Cavalli et al., 2013; Figure 3b) into connected and disconnected areas (Figure 3c; Section 3.2). A set of predisposing factors (Figure 3e) and the debris flow release points were used to train and evaluate a generalized additive model (Figure 3f) to create a binary map that depicts susceptible and not susceptible areas in terms of debris flow release (Section 3.3). Detailed evaluations of the IC model and the susceptibility model (Section 3.4) were followed by an intersection of the associated binary maps (Figures 3c and f) to create the final combined connectivity-susceptibility maps (Figure 3h).
The previously described approach was first developed within the Stolla catchment. Two different strategies were then applied to test the spatial transferability of the approach and of the specific results (Figures 3d and g).

| Debris flow inventories and sampling of release zones
Event-specific debris flow inventories were produced for the three previously described rainstorm events, each in a specific catchment (see Section 2.2).The mapping was based on the comparison of

| Index of connectivity map and its thresholding
The structural connectivity represents the potential of a landscape to be connected through flow pathways and was evaluated by applying the IC (Cavalli et al., 2013). For more details on index computation, see the online Supplementary Material. (Swets, 1988). Besides evaluating the overall discriminatory power of the classification via the area under the ROC (AUROC), the ROC plot was also used to elaborate an 'optimal' threshold for classifying the underlying probabilities into the two classes based on the Youden index (Schisterman et al., 2005). In summary, the cutpoint value that 'optimally' separates connected from disconnected areas represents the predicted probability score in the ROC space that maximizes the sum of sensitivity and specificity, and furthermore has the property of maximizing the overall correct classification rate (Hosmer et al., 2013; Ruopp et al., 2008). This cutpoint was then used to divide the continuous IC probability map into two classes: 'connected' and 'disconnected' debris flow release areas.
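The Youden-based cutpoint selection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R implementation: the function name `youden_cutpoint` is hypothetical, the scores and labels are invented, an observation is treated as positive (connected) when its score is at or above the candidate cutpoint, and ties in the Youden index are resolved in favour of the lowest cutpoint.

```python
def youden_cutpoint(scores, labels):
    """Return (cutpoint, sensitivity, specificity) maximizing J = sens + spec - 1.

    scores: predicted probabilities; labels: 1 = connected, 0 = disconnected.
    An observation is classified as positive when score >= cutpoint.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    best = None
    for cut in sorted(set(scores)):  # every observed score is a candidate
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1.0
        if best is None or j > best[0]:  # strict '>' keeps the lowest cut on ties
            best = (j, cut, sens, spec)
    return best[1], best[2], best[3]


# Invented example data (not from the study).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
cutpoint, sensitivity, specificity = youden_cutpoint(scores, labels)
print(cutpoint, sensitivity, specificity)
```

Maximizing sensitivity + specificity − 1 is equivalent to maximizing the sum of sensitivity and specificity, which is the criterion named in the text.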
The above-described methodological framework was first tested in the Stolla catchment (Figure 3c), and the IC thresholds were then transferred and tested in the Pfitsch and Sulden catchments (Figure 3d). For these latter basins, additional optimized thresholds were also computed based on local debris flow data (Figure 3d).

| Debris flow release susceptibility modelling and thresholding
An interpretable supervised classifier, namely a binomial generalized additive model (GAM; Figure 3f), was applied to delineate areas prone to debris flow release (Hastie & Tibshirani, 1999). The statistical modelling and associated evaluations were performed within the R software (R Core Team, 2017), and the GAM was run utilizing the R package 'mgcv' (Wood, 2017).
Several recent studies demonstrated the usefulness of GAMs in the field of data-driven landslide susceptibility modelling (Goetz et al., 2015; Petschko et al., 2014; Steger et al., 2021; Vorpahl et al., 2012). GAMs are flexible semi-parametric models that allow accounting for non-linear relationships between the response variable (i.e. presence/absence of debris flow release points in this study) and continuous variables (e.g. slope angle, normalized elevation) by fitting smoothing splines to continuously scaled variables (Hastie & Tibshirani, 1999; Wood, 2017).
In detail, the GAM was used to model the binary response that represents debris flow release locations (Figure 3a) and debris flow absences. Sampling of debris flow release points is described in detail in Section 3.1. A topographical correction was applied to randomly sampled debris flow absence locations outside the mapped debris flow polygons. This procedure accounts for projection-related areal biases (i.e. the measured planar area is smaller than the true surface area on steep terrain) and ensures that differently inclined terrain is equally well represented within the study site, as described in detail by Steger et al. (2021). To represent the topographically complex terrain with a sufficient quantity of observations while ensuring computational feasibility, the number of debris flow absence points to be sampled was set to three times the number of landslide presences (e.g. for Stolla, 612 presences and 1836 absences).
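One possible way to realize such a slope-corrected absence sample is sketched below. This is not necessarily the procedure of Steger et al. (2021): it assumes that each candidate cell is weighted by the ratio of true surface area to planar area, 1/cos(slope), and draws the sample without replacement using the Efraimidis-Spirakis weighted sampling scheme. The function name, the seed and the synthetic slope values are hypothetical.

```python
import math
import random


def sample_absences(slopes_deg, presence_cells, ratio=3, seed=42):
    """Draw ratio * len(presence_cells) absence cells without replacement.

    slopes_deg: slope angle (degrees, < 90) per candidate cell.
    presence_cells: set of cell indices to exclude (mapped debris flows).
    Cells are weighted by 1 / cos(slope), i.e. true area / planar area
    (assumed form of the topographic correction).
    """
    rng = random.Random(seed)
    n_absences = ratio * len(presence_cells)
    keyed = []
    for idx, slope in enumerate(slopes_deg):
        if idx in presence_cells:
            continue
        weight = 1.0 / math.cos(math.radians(slope))
        u = 1.0 - rng.random()  # uniform in (0, 1]
        keyed.append((u ** (1.0 / weight), idx))  # Efraimidis-Spirakis key
    if n_absences > len(keyed):
        raise ValueError("not enough candidate cells for the requested sample")
    keyed.sort(reverse=True)  # the largest keys form the weighted sample
    return [idx for _, idx in keyed[:n_absences]]


# Synthetic example: 200 cells with slopes between 0 and 59 degrees.
slopes = [(i * 7) % 60 for i in range(200)]
presences = {3, 50, 120}
absences = sample_absences(slopes, presences)
print(len(absences))
```

With the 1:3 ratio used in the study, 612 presences would yield the 1836 absences reported for Stolla.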
A parsimonious GAM model was created by introducing six significant and geomorphically plausible terms as predictors (Figure 3e).
The underlying seven spatial variables (i.e. the model contains one two-variable interaction term) represent one land use variable and six topographic factors (slope angle, plan curvature, profile curvature, topographic wetness index, slope roughness, normalized height index) that were derived directly from the 5 m DTM (Section 3.1) using the respective tools within the SAGA GIS environment (Conrad et al., 2015). The rasterized (5 m) land use map, with classes of bare surface, forest, shrubland and grassland, was obtained from the Landuse information System South Tyrol (LISS, 2013).
The model construction started with the smoothing parameter selection for continuous variables. This step was conducted via internal cross-validation, while setting a maximum of three effective degrees of freedom for the spline parameterization. p-Values for each term were interpreted to reject the null hypothesis (i.e. no effect of the term) (Wood, 2013). Thus, a term was only included in the final model when showing a significant effect (p-values < 0.001), while simultaneously depicting a geomorphologically plausible relationship to debris flow occurrence (Steger et al., 2016a). In this context, the visualization of component smooth functions served as a basis to uncover implausible modelled relationships and to obtain insights into the effect of terms on the outcome by interpreting the deviation of the centred curves from the y-axis value of 0 (i.e. a horizontal line at 0 indicates no effect) (Steger et al., 2021).
The following six terms fulfilled our model selection requirements.
Slope angle was introduced to represent downslope forcing, while plan and profile curvature were used as a proxy for slope hydrological influences (e.g. surface runoff) and potential for loose material availability.
An interaction term that represents the interplay of the topographic wetness index and slope roughness was included after evaluating the first model runs (more details in the discussion). The relative height position of debris flow release zones within the catchment was represented by the normalized height index. Finally, the land use variable was used as a proxy for the spatial variation of the effects of vegetation (or its absence) on debris flow release (Reichenbach et al., 2018; Steger & Kofler, 2019; Van Westen et al., 2008).
In analogy to the IC models, debris flow susceptibility modelling was first conducted in the Stolla catchment (Figure 3f), whereas subsequent transferability tests (i.e. direct transfer of the results; recalibration with local data) were performed also within the Pfitsch and Sulden areas (Figure 3g). Analogous to the binary maps of connectivity, the continuous probability map of debris flow release susceptibility was classified into a binary map according to the ROC analysis and the Youden index (susceptible vs not susceptible terrain; Figures 3f and g).

| Model evaluation techniques
The two separate models (IC and susceptibility; Sections 3.2 and 3.3) were evaluated separately by assessing: (i) their discrimination power and (ii) their non-spatial and spatial prediction performance. The AUROC was used as the primary error measure for the continuous model outputs (i.e. probability estimates) while confusion matrix measures (e.g. sensitivity, specificity) were used to evaluate the subsequent categorized results (i.e. binary IC and susceptibility maps, final combined map). In summary, the calculation of the discrimination power focused on the fitting performance of the two single models. AUROCs between 0.7 and 0.8 can be considered acceptable. AUROC values between 0.8 and 0.9 refer to an excellent discrimination, and values > 0.9 to an outstanding discrimination (cf. AUROC interpretation guidelines in Hosmer et al., 2013).
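As a small illustration of the metric and of the interpretation guidelines quoted above, the AUROC can be computed as the share of presence-absence pairs in which the presence receives the higher score, with ties counted as one half. The function names and the example data are hypothetical, and the treatment of values falling exactly on a class boundary is an assumption.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def interpret_auroc(value):
    """Verbal class following the guidelines cited from Hosmer et al. (2013)."""
    if value >= 0.9:  # boundary handling is an assumption
        return "outstanding"
    if value >= 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    return "below acceptable"


# Invented example data (not from the study).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
example_auroc = auroc(scores, labels)
print(example_auroc, interpret_auroc(example_auroc))
```

The pairwise form is quadratic in the number of observations, which is acceptable for a sketch; rank-based implementations are preferable for large samples.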
The predictive performance estimation was also based on the AUROC metric and assessed by iteratively confronting the predicted probability scores with independent test data using k-fold CV and k-fold SCV (Brenning, 2012; Schratz et al., 2019). Thus, predictive per-

| Map combination and model transfer
Both binary maps (i.e. the classified IC and release susceptibility maps) were spatially overlaid to obtain the final joint susceptibility-connectivity maps that reflect the four classes mentioned above (Figure 3h).
A first exploratory data analysis showed that the IC values sampled within the debris flow release zones of the Stolla and Pfitsch catchments were considerably higher (−2.5 and −2.6, respectively; Figures 4a and b) for connected debris flows compared to their disconnected counterparts (−3.7 and −3.5, respectively; Figures 4a and b). Their significant difference was confirmed by a p-value of <0.001 as measured via the non-parametric Wilcoxon-Mann-Whitney rank sum test. In contrast, no statistically significant difference (p-value > 0.1) was observed between connected and disconnected debris flows for the Sulden basin, where median values for connected debris flows (−1.9, Figure 4c) were very similar to median values for disconnected debris flows (−2.1, Figure 4c).
The logistic regression model trained for the Stolla catchment performed exceptionally well to discriminate connected from disconnected debris flow release areas (AUROC 0.93; Table 2).
The high generalizability and robustness of the Stolla IC model were confirmed with independent test data based on non-spatial (CV median AUROC 0.95, IQR 0.055) and spatial (SCV median AUROC 0.88, IQR 0.096) data partitioning (Table 3).
The optimal IC value cutpoint that separates connected from disconnected debris flows in the Stolla catchment was found to be −3.21 (i.e. values > −3.21 mean connected areas), which corresponds to a portion of correctly classified connected debris flow releases of 95%, and to 85% of correctly classified disconnected debris flows (Table 2, Figure 5).
The superimposition of mapped debris flows on the spatial model predictions (Figure 6a) and their binary derivative (Figure 6b) visually confirms the good agreement between model outputs and the mapped debris flows. The data-driven thresholding led to a binary map that depicts around 30% (70%) of the total basin area as connected to (disconnected from) the main channel in terms of debris flow release (Figure 6b).
A direct spatial transfer of the Stolla IC threshold to the Pfitsch and Sulden basins led to divergent performance statistics (transf models in Table 2). For instance, the computed AUROC scores reveal that the direct spatial transfer of the Stolla IC model to the Pfitsch area was associated with an excellent separability of connected and disconnected debris flow release observations (AUROC 0.82), whereas a below-acceptable discrimination was found for the Sulden area (AUROC 0.57) (Hosmer et al., 2013). These results are evident when visually comparing actual debris flow release zones with the respective spatial pattern of the IC models (Figures 7a and c). For Pfitsch, the true positive rate of 0.81 and the lower true negative rate of 0.68 indicate that the spatially transferred Stolla threshold was more accurate to correctly classify connected debris flows compared to disconnected ones (transf models in Table 2, Figure 7b) and that a higher IC threshold is required to balance classification errors for connected and disconnected observations (Figure 8a). For Sulden, the spatial model transfer led to a classification that failed to correctly classify all disconnected debris flows (true negative rate 0) while almost all connected debris flows were classified correctly (true positive rate of 0.96; Table 2, Figure 7d). These numbers revealed that a local optimization is required for the Sulden basin in order to balance misclassification rates.
The recalculation of the IC thresholds for Pfitsch and Sulden based on basin-specific debris flow observations resulted in slightly higher (Pfitsch: −3.10, Figure 5) and considerably higher (Sulden: The associated sensitivity and specificity scores indicate a more balanced distribution between correctly classified connected and disconnected debris flow observations when using optimized, basin-specific thresholds (Table 2). In fact, a higher (Pfitsch) or much higher (Sulden) portion of correctly classified disconnected debris flow observations was obtained, at the cost of a slightly lower (Pfitsch) or much lower true positive rate (Sulden) (Table 2). For the Pfitsch basin, the two binary maps are similar, with 18% (transferred) and 15% (optimized) of the total area classified as connected (Figures 8a and b). In contrast, considerable differences between the two maps can be observed for the Sulden basin, as the proportion of the area covered by connected areas decreased from 67% (transferred) to 16% (optimized) (Figures 8c and d).
FIGURE 5 Comparisons of optimized IC thresholds (points) and the underlying IC probability score for the three catchments. The underlying IC logistic regression coefficients were 2.71 (Stolla), 1.88 (Pfitsch) and 0.24 (Sulden).
points could be achieved in Pfitsch only, whereas the classification for Sulden can be considered unacceptable, despite the recalibration of the model using local data.

| Debris flow release susceptibility
Similar to the IC models, debris flow susceptibility modelling was first conducted in the Stolla catchment. The GAM was built on six significant (p-value < 0.001) model terms that allowed us to separate debris flow release zones from debris flow absences associated with the 2017 event (slope angle, normalized height, planform curvature, profile curvature, land cover, interaction between surface roughness and wetness index). Modelled relationships, visualized in the form of component smooth functions, revealed that the chances of debris flow release are highest for medium inclined slopes (peak at around 40°; Figure 9a), at medium to high relative slope positions below the prevalent rocky faces (normalized heights around 0.5; Figure 9b) exhibiting a concave-shaped terrain form (negative profile and plan curvature values; Figures 9c and d).
Furthermore, bare surface areas, covered with loose debris, were associated with the highest chances of releasing debris flows, followed by grasslands and shrubland, whereas forested terrain was associated with the lowest chance of debris flow release (Figure 9e).
Higher debris flow release likelihoods were also predicted for rough terrain that spatially coincides with a high topographic wetness index (Figure 9f).
The comparison of the continuously scaled spatial prediction pattern (Figure 10a) with debris flow observations showed a very high separability of the two classes in the training data (debris flow release zones vs absence observations) with an AUROC of 0.92 (Table 4).
Validation of the model with independent test observations confirmed a high predictive performance and generalization capacity of the debris flow release susceptibility model for Stolla with median AUROC scores of 0.92 (CV, Table 5) and 0.89 (SCV, Table 5).
Binary thresholding of the continuously scaled debris flow release susceptibility map was conducted in analogy to the approach implemented for the IC map.
The maximization of the sum of sensitivity and specificity led to a probability cutpoint of 0.31, a true positive rate of 0.83 and a true negative rate of 0.87 (Table 4). Inspection of the subsequent binary map for Stolla (Figure 10b) highlighted that those areas classified as susceptible to debris flow release frequently corresponded to medium inclined impluvium, hollows and low hierarchical order channels at medium to high unvegetated slope positions.
A particularly good spatial agreement between mapped debris flow initiation areas and terrain classified as susceptible to debris flow release was observed for the western part of the Stolla catchment (Figure 10).
Training of new GAMs within these two test areas increased the discrimination power of the models, resulting in AUROC scores of 0.87 for Pfitsch and 0.92 for Sulden (opt models in Table 4; Figures 11b and d). An evaluation of the prediction performance of these local models confirmed a high ability of the models to correctly classify independent test observations with median AUROC scores of 0.85 (CV) and 0.86 (SCV) for Pfitsch and 0.86 (CV) and 0.89 (SCV) for Sulden (Table 5).
Subsequent ROC-driven thresholding of the maps led to binary maps that contain 88 and 87% of mapped debris flow releases within areas classified as susceptible for Pfitsch and Sulden, respectively. In both test catchments, differences in the spatial prediction pattern exist when visually comparing the spatially transferred models with the locally trained ones. Areas classified as susceptible to release debris flows correspond to higher elevation impluvium, ridges and upper parts of hillslopes in the transferred maps (Figures 12a and c) and to branches of the drainage network and hillslopes at lower elevations in the optimized maps (Figures 12b and d).

| Joining debris flow release susceptibility with its connectivity
Basin areas predicted to be: (i) disconnected and not susceptible to  6).
In analogy, 76% of the disconnected debris flows were correctly classified as lying in areas 'susceptible but disconnected' (areal extent 12%). The resulting FR score of 6.33 highlights the high density of the disconnected debris flow observations within this zone (Table 6). The very low FR values indicate that very few zones in the Stolla area were affected by a double misclassification, such as observed connected debris flow releases lying within zones classified as not susceptible and disconnected (cf. FR of 0.07 and 0.09 in Table 6).
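The FR scores quoted here are consistent with a simple ratio between the share of observations falling into a class and the areal share of that class. The sketch below assumes this definition, which is not spelled out in this passage; the function name is hypothetical, and the inputs are the percentages reported in the text.

```python
def frequency_ratio(share_of_observations, share_of_area):
    """FR = share of observations in a class / areal share of that class.

    Both arguments must use the same unit (here: percent). Values above 1
    mean the class holds more observations than its area alone would suggest.
    """
    if share_of_area <= 0:
        raise ValueError("share_of_area must be positive")
    return share_of_observations / share_of_area


# Stolla: 76% of disconnected debris flows within 12% of the basin area.
fr_stolla_disconnected = round(frequency_ratio(76, 12), 2)
print(fr_stolla_disconnected)
```

An FR of 1 would indicate that observations are distributed in proportion to area, which is why values clearly below 1 for doubly misclassified zones support the plausibility of the maps.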
In the Pfitsch catchment, areas classified as 'susceptible and connected' mainly correspond to hillslopes close to the bottom of confined reaches. Comparing post-event debris flow classification for Pfitsch with the transferred and optimized connectivity-susceptibility maps showed that local optimization improves the classification (Table 7).
For instance, 42% of the observed disconnected debris flows were located within the comparably small area (16% areal extent) classified as susceptible and disconnected (FR 2.62). In analogy, 36% of the observed connected debris flow releases were located within the very small zone (1% areal extent) classified as susceptible and connected, leading to a very high FR of 36. Double misclassifications are under-represented within the four-class map, as confirmed by the respective FR scores below 1. Local optimization led to a different spatial classification, with larger zones classified as connected and susceptible (5% areal extent, Figure 14a) compared to the spatially transferred model combination (1%, Figure 14b). For the optimized Pfitsch models, 63% of the disconnected and 61% of the connected debris flow observations were located within susceptible areas that were labelled as disconnected (24% areal extent, FR 2.63) and connected (5% extent, FR 12.2), respectively. FRs clearly below 1 for the double misclassifications (cf. FR of 0.4 and 0.11, Table 7) provide further quantitative evidence for a meaningful, but not outstanding, classification quality.
For Sulden, the plausibility of the joint connectivity-susceptibility map was considered unacceptable for the transferred model, with higher, but still limited explanatory power for the optimized one (Table 8).
Comparing the labelled debris flow release observations with the spatially transferred connectivity-susceptibility model confirmed a comparably low portion of correct classification. Even though 64% of the connected debris flow release points were observed within their 'correct' zone (susceptible and connected; FR 3.04), a large portion of the area was affected by misclassifications. For instance, none of the observed disconnected debris flow releases was located within areas classified as susceptible and disconnected (FR 0), while 63% of these observations were located in areas classified as susceptible and connected (correct in terms of susceptibility, but wrong in terms of connectivity). The map based on the optimized joint classification for Sulden (Figure 14d) showed substantial deviations from the map based on the spatially transferred models (Figure 14c). Table 8 confirms an improved and more balanced classification when the underlying models were optimized with local input data. For instance, the portion of labelled debris flow observations located within wrongly classified areas in terms of both connectivity and susceptibility decreased from 19% (8 + 1 for disconnected and connected debris flows, respectively) to 9% (1 + 3) when locally optimizing the underlying models. Within the optimized connectivity-susceptibility map for Sulden, 56% of the connected debris flows were located within the comparably small zone classified as 'susceptible and connected' (areal extent 5%), leading to a high FR of 11.2. In analogy, 54% of the disconnected debris flow releases were observed for 'susceptible but disconnected' terrain (areal extent 12%; FR 4.5). Wrong classifications, in which both labels (connectivity and susceptibility) were incorrectly assigned to an area, were clearly below the FR threshold of 1, with an FR of 0.36 for disconnected debris flows and an FR of 0.17 for connected debris flows (Table 8).

| (Dis)agreement between modelled and observed debris flow susceptibility/connectivity
The results highlight that the class labels of the combined connectivity-susceptibility maps for Stolla and Pfitsch frequently agree with observed process dynamics (Tables 6 and 7), despite considerable differences in geomorphology and lithology between the two basins. As shown by the geomorphological analysis carried out by Scorpio et al. (2022), in the Stolla basin sediment cascades are hampered by landscape features acting as buffers, such as upper hanging valleys, narrow floodplains and talus slopes (e.g. Fryirs & Brierley, 1999; Harvey, 2002; Lane et al., 2017; Mancini & Lane, 2020; Schrott et al., 2003). Particularly the upper and western parts of the basin that are prone to debris flow initiation were regularly predicted as susceptible but disconnected by the final map (Figure 13). Most of the active talus slopes, which compose the dominant source of debris flow material, were therefore assessed to be decoupled from the sediment cascade. The combined connectivity-susceptibility map for Pfitsch and associated performance scores (Figures 14a and b, Table 7) highlight that confined reaches were observed and modelled to be the main entry point of debris flow material to the channel, whereas the almost continuous floodplain and the wide alluvial fans at the valley bottom act as buffers that impede an effective connection to the main channel.

FIGURE 9 Modelled relationships for the Stolla debris flow release susceptibility model. The centred component smooth functions (a-d) depict how the likelihood of debris flow release changes with respect to the topographic variables shown. Values > 0 depict an above-average likelihood of debris flow release for the respective variable value and vice versa. Bare surfaces associated with the categorical land use variable (e) represent the reference class (= 0) and the other classes have to be interpreted relative to this class (e.g. the odds of forest being a location of debris flow release are lowest). The interaction term (f) depicts that the modelled likelihood of debris flow release (from red: low to white: high) is highest at locations with a high surface roughness that spatially coincides with a high wetness index.
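The results for the Sulden basin (Figures 14a and d, Table 8), however, and particularly those associated with the spatially transferred model, exhibit low statistical performance and plausibility (cf. columns 'transferred' in Table 8 and associated description). Model evaluations revealed challenges in separating connected from disconnected observations for the Sulden case, particularly because similar IC values were observed for connected and disconnected debris flows for the investigated storm event (Figure 4c). Several additional reasons for the poor spatial transferability of the Stolla model to Sulden can be invoked, but we believe that the relatively low intensity of the studied triggering event, as well as the specific geomorphological setting, played a crucial role. Indeed, the event analysed in Stolla was characterized by considerably higher rainfall accumulations (up to 84 mm in 6 h), while the 2014 Sulden storm event was related to lower rainfall intensities with 48.5 mm in 17 h. Notably, in the Sulden basin, a previously documented storm in summer 1987 caused comparable morphological effects along the slopes, while most debris flows triggered in 1987 were reactivated in 2014 (Schiona, 1994). Comparison of the two events reveals that precipitation magnitudes were equivalent, with about 50 mm of rain depth in 24 h in 1987 (Schiona, 1994). Additionally, in the Sulden basin, geomorphological analysis of the orthomosaic and DTM, and comparison with the map of Buter et al. (2020), show that most debris flows originating in the upper basin areas are currently decoupled from the main river system by the presence of large buffering landforms such as terminal and lateral moraines and bedrock valley steps. Hence, we argue that the low performance indicators observed for the spatially transferred Sulden model are due both to differences in the geomorphological setting (in comparison with the catchment where the model was initially devised) and to external forcing (relative magnitude-frequency of the triggering events). As expected, the classification accuracy of the combined map in the Sulden basin increased to an acceptable level as soon as the model was trained with event-specific local data (cf. columns 'optimized' in Table 8 and associated description), confirming the utility of the general methodical approach while highlighting the need for site-specific calibrations in case the environmental setting in the training area differs substantially from the application study site. It can be assumed that a very severe storm event in Sulden, comparable to the one studied for the Stolla basin, may entail a situation in which a higher proportion of debris flows that are currently labelled as disconnected would reach the channel (Buter et al., 2022). This in turn may be reflected by improved model performance estimates. However, the potential presence of non-linear associations between storm severity and its impact on debris flow connectivity has to be taken into account within such considerations.

Notes: AUROC values depict the overall discriminatory power of the classification (0.5 = random, 1 = perfect) based on training data; PT relates to the probability threshold (susceptible vs not susceptible) that maximizes the sum of sensitivity (Sens) and specificity (Spec); Spec relates to the specificity (true negative rate) at the optimal PT cutpoint (i.e. the portion of observed debris flow absences correctly classified as not susceptible); Sens shows the sensitivity (true positive rate) at the optimal PT cutpoint (i.e. the portion of observed debris flow release correctly classified as susceptible); transf = the susceptibility model trained for Stolla (*) was directly transferred and tested for Pfitsch and Sulden; opt = the underlying models were optimized (trained) with local data.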
Finally, it is worth mentioning that the proposed method does not always explicitly account for the disconnecting effects of sediment control works (e.g. check dams), unless they produce significant changes in the DEM, and thus must be viewed as a precautionary approach.

| Novel quantitative perspectives in the modelling of debris flow connectivity and remaining challenges
Debris flows are crucial processes for sediment transfer in mountain environments, as they represent key drivers of hillslope-channel connectivity (Brardinoni & Hassan, 2006; Hoffmann et al., 2013; Messenzehl et al., 2014). This study highlights the primary role of landscape morphologies and their spatial distribution on sediment cascades during extreme events, in analogy with recent publications (Cossart & Fressard, 2017; Cossart et al., 2018; Llena et al., 2019). In accordance with several authors who highlighted the importance of negative feedback on the hillslope-valley bottom linkages (e.g. Fryirs & Brierley, 1999; Harvey, 2002; Lane et al., 2017; Mancini & Lane, 2020; Schrott et al., 2003), our study demonstrates that hillslopes and steep tributaries prone to slope instability (landslides and/or debris flows) may not necessarily supply sediment to subjacent channels, even during extreme events. Lack of structural connectivity [i.e. presence of buffers or barriers, sensu Fryirs et al. (2007)] and/or functional aspects (e.g. insufficient event intensity and duration) co-determine that mobilized debris flow sediments may not reach downslope channels. In other words, this work highlights how important it is to consider debris flow susceptibility along with associated sediment connectivity when elaborating the effects on the main channel networks. In fact, large connected areas may not always be susceptible to slope instability, while vast susceptible terrain may as well be disconnected from downslope channels.
We believe this aspect needs to be considered in future analyses and in the development of tools and frameworks for sediment cascade and hazard assessment. Nonetheless, the fact that landslide occurrence does not necessarily result in sediment transfer to the fluvial system has already been highlighted by recent studies (e.g. Schopper et al., 2019; Scorpio et al., 2018; Surian et al., 2016), but not yet formalized via spatially explicit modelling procedures.
In terms of sediment connectivity research, this study is the first to propose a quantitative approach to derive objective thresholds for

When the IC values are analysed quantitatively, they show a systematic decrease with increasing DTM resolution, and different weighting factors lead to different IC patterns and values (Cantreul et al., 2018; Cavalli et al., 2020; Heckmann et al., 2018). This means that caution is needed when comparing IC thresholds identified from computations based on DTMs with different resolutions. In this work, the proposed IC thresholds are based on computations on a 5 m LiDAR DTM using the surface roughness-based weighting factor.

| Insights into data-driven debris flow release susceptibility modelling and the importance of geomorphic settings
The debris flow release susceptibility model for Stolla produced predictions that were in very high spatial agreement with observed debris flow presence and absence observations, also when validated with independent random test data (CV median AUROC 0.92, Table 5) or across several subregions (SCV median AUROC 0.89, Table 5). Inspection of associated modelled relationships revealed geomorphologically plausible results (Steger et al., 2016a). For instance, the chance of debris flow initiation was modelled to increase from flat terrain up to a peak at a slope angle of 40° before starting to decrease again. Such a trend appears plausible since gravitational forcing increases with increasing slope angles, while debris material may not, or be less likely to, accumulate on very steep terrain. In this context, the GAM proved useful to account for non-linear behaviour and was therefore able to overcome the reported challenges of non-linear associations in debris flow susceptibility modelling (Heckmann et al., 2014).

From a methodological viewpoint and in contrast to many landslide susceptibility studies, several crucial modelling decisions, such as the selection of variables or modelling parameters, were not solely based on a maximization of statistical performance estimates (Reichenbach et al., 2018; Steger et al., 2016a). We further opted to create relatively simple parsimonious models that come with a considerable amount of generalization with respect to the complex phenomena under investigation (Coelho et al., 2019). These decisions were taken to ensure computational feasibility and to enhance model transferability and interpretability. The strong focus on model generalization was further considered useful to offer some protection against a direct propagation of possible input data inaccuracies on the results (Steger et al., 2016b). For instance, we opted for a simple GLM to translate the initial IC map into probabilities associated with the labelled debris flow data, because a monotonic association between the occurrence of connected debris flows and the IC values can be expected from a geomorphic viewpoint. In contrast, non-linear associations between debris flow release susceptibility and specific variables, such as slope angle, can be expected from a geomorphic viewpoint (Heckmann et al., 2014), which is why we opted for a more flexible GAM (Bordoni et al., 2020; Goetz et al., 2015; Knevels et al., 2020; Petschko et al., 2014; Vorpahl et al., 2012).
However, in this case we also strove to avoid too close a description of local particularities and a reproduction of input data flaws by restricting the maximum flexibility of the smoothing functions (Hastie & Tibshirani, 1999; Wood, 2017). Variable selection went beyond the pure focus on statistical criteria. For example, specific proxy variables, such as slope orientation, were excluded from modelling in case the modelled relationships were observed to be valid only for single basins, even though the respective variable statistically improved the local model. Similarly, to enhance the model's spatial transferability, we opted to include the normalized height index, instead of basin-specific elevation values, to describe the relative position of debris flow release zones. In summary, it was considered crucial to complement the numerous quantitative validations with continuous geomorphic plausibility checks to ensure meaningful results that agree with the knowledge on local process dynamics, that are spatially transferable to a certain degree and that do not explicitly reflect or reproduce input data flaws (Steger et al., 2016a, 2021).
how potential sediment source areas, such as debris flow release zones, are laterally connected to the main channel network, thus actually supplying sediments to the latter [with the exception of the very recent work by Spiekermann et al. (2022), who developed a morphometric landslide connectivity model using logistic regression].Within such a context, this paper presents a data-driven approach to identify areas which are most likely to supply debris flow sediments to the main channel network.The work aims to address several shortcomings of previous research through an explicit integrated analysis of debris flow release susceptibility and lateral connectivity via: (i) an objective data-driven definition of binary IC thresholds to separate connected from disconnected areas for high-magnitude reference events; (ii) the development of debris flow release susceptibility models then joined with the IC; and (iii) testing the transferability of the developed approach across Alpine catchments that represent different geological and geomorphological characteristics.For this purpose, three catchment-specific debris flow-triggering events in the Italian Eastern Alps (Stolla, Pfitsch/Vizze and Sulden/Solda) were analysed.

FIGURE 1 Location map of the Stolla, Pfitsch and Sulden catchments (the latter includes the Trafoi subcatchment) (a). Land cover maps for Stolla (b), Pfitsch (c) and Sulden (d), obtained from the Land-use Information System South Tyrol (LISS, 2013). Background for all images is the hillshade derived from DTMs.
First, we tested a direct transfer of the reference Stolla susceptibility model and the reference Stolla IC model (including their thresholds) to the Pfitsch and Sulden basins, by directly applying the Stolla-derived relationships and thresholds to the local environmental maps (i.e. predisposing factor maps, IC map). The resulting two binary maps for each test basin were then intersected to produce a combined connectivity-susceptibility map. Evaluations of these maps provided insights into the direct transferability of the results in case no local recalibration is conducted. Second, a recalibration of susceptibility models and IC thresholds within the two test basins based on local data (i.e. debris flow release points, predisposing factors, IC map) was carried out, allowing us to investigate the transferability of the workflow to other geomorphic settings.
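The intersection of the two binary maps can be illustrated with a short sketch. It assumes that both maps are available as equally shaped NumPy arrays (1 = connected or susceptible, 0 = otherwise); the function name `combine_maps`, the integer class codes and the tiny 2 × 2 rasters are hypothetical choices for illustration and do not reproduce the GIS workflow actually used in the study.

```python
import numpy as np

# Hypothetical integer codes for the four classes of the combined map
CLASS_NAMES = {
    0: "disconnected and not susceptible",
    1: "connected but not susceptible",
    2: "susceptible but disconnected",
    3: "connected and susceptible",
}

def combine_maps(connected, susceptible):
    """Intersect two binary rasters into a four-class raster (codes 0-3)."""
    connected = np.asarray(connected, dtype=bool)
    susceptible = np.asarray(susceptible, dtype=bool)
    if connected.shape != susceptible.shape:
        raise ValueError("both rasters must share the same grid")
    return connected.astype(np.uint8) + 2 * susceptible.astype(np.uint8)

# Toy rasters: one cell for every possible combination
connected = np.array([[0, 1], [0, 1]])
susceptible = np.array([[0, 0], [1, 1]])
combined = combine_maps(connected, susceptible)

# Areal share of each class, as needed later for frequency ratios
areal_share = {CLASS_NAMES[k]: float(np.mean(combined == k)) for k in CLASS_NAMES}
print(combined)
print(areal_share)
```

Encoding the two labels as a sum keeps the operation cell-wise and makes the areal extent of each class straightforward to derive.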
orthomosaics and digital terrain models (DTMs) acquired before (orthomosaics in 2011 for Pfitsch and Sulden and in 2014 for Stolla; DTMs acquired in 2005 for Pfitsch and Sulden and in 2010 for Stolla) and immediately after the storm events (orthomosaics and DTMs in 2014 for Pfitsch and Sulden and in 2017 for Stolla).
The IC was computed by investigating the connectivity between the catchment and the main channels assumed as a target (Stolla, Pfitsch, Sulden and Trafoi rivers), and thus this scenario focused on the evaluation of the lateral connectivity (i.e. from the hillslopes to the main channels found in the valley bottoms; Figure 3b). The IC values, which are dimensionless, were computed by means of the SedInConnect software (Crema & Cavalli, 2018). The IC map was divided into two groups (connected vs disconnected zones) by training a generalized linear model (GLM), namely logistic regression (Hosmer et al., 2013). The response variable of the logistic regression model was represented by the binary label of the previously sampled debris flow points (connected vs disconnected; Figure 3c, n = 612 for Stolla), while the corresponding IC values were treated as the predictor variable. The fitted model permits us to predict the probability of an area being connected in terms of debris flow initiation as a function of the corresponding IC value. The binary label at the point locations and the associated probability values were then used to create and analyse a receiver operating characteristic (ROC) curve (Figure 3c). ROC curves illustrate the global outcome of a binary classifier by plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity) for each possible classification threshold and thus is not based on model-independent test data. The presented predictive performance estimates, in contrast, were calculated via multiple independent test sets based on a repeated non-spatial random selection (cross validation, CV) or on multiple spatially disjoint subregions (spatial cross validation, SCV). Finally, a confusion matrix was used to compare the combined connectivity-susceptibility map (i.e. not a newly trained model of its own) with debris flows that actually occurred and were labelled (connected vs disconnected) during the post-event mapping.

In detail, the discrimination power metric compares the continuous model predictions (probability scores between 0 and 1) to the observations of the binary response, to determine how well the underlying model separates the binary observations of the training data (i.e. connected vs disconnected; debris flow presence vs absence) (Murillo-García et al., 2019; Steger et al., 2020). For instance, an associated AUROC of 1 would indicate that the model predictions perfectly separate connected from disconnected debris flow release observations, with each connected debris flow observation having a higher probability score than each disconnected debris flow observation. An AUROC of 0.5 would point to a random classification, while were calculated by repeatedly splitting the original data into training data, which was used to fit the model, and test data, which was used to calculate the AUROC metric. In this context, CV is based on a repeated non-spatial random splitting of training and test data, while SCV is built upon a k-means cluster algorithm to create multiple spatially disjoint training and test areas. In total, 125 AUROCs were calculated (25 repetitions, 5 folds) for each model. The associated interquartile range (IQR) of AUROCs allowed us to additionally evaluate model robustness (Steger et al., 2017).

Finally, the four classes of the combination maps, namely (i) disconnected and not susceptible; (ii) connected but not susceptible; (iii) susceptible but disconnected; and (iv) connected and susceptible, were quantitatively compared to the label of the debris flow source zones in the form of a confusion matrix to highlight the level of spatial (dis)agreement between debris flow observations and the four classes of the map. Absolute numbers (i.e. number of observations within a class), relative numbers (i.e. proportions) and frequency ratio (FR) scores are shown in the confusion matrix. FRs were computed to additionally account for the areal extent of the four zones within the combination map. The FR metric relates the proportion of observed debris flows within one of the four zones (e.g. 82% of connected debris flows were observed within the zone 'susceptible and connected') to the proportion of area covered by the respective zone (e.g. 'susceptible and connected' covers 5% of the total area). The resulting FR score (in this example, 82/5 = 16.4) is > 1 in case an over-proportional density of debris flow observations is observed within the respective zone. FRs < 1 (e.g. 5/56 = 0.09) indicate that, compared to the size of the zone of interest (e.g. 'not susceptible and disconnected' covers 56% of the area), a lower proportion (in this case 5%) of actual debris flows was observed in this class.
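The FR arithmetic described above can be written as a one-line ratio. The following sketch reproduces the two worked examples from the text; the function name `frequency_ratio` is a hypothetical label introduced here, and both inputs are assumed to be expressed in the same unit (percent in this case).

```python
def frequency_ratio(share_of_observations, share_of_area):
    """Frequency ratio: share of debris flow observations falling in a zone
    divided by the share of the total area covered by that zone.
    Both shares must use the same unit (e.g. percent)."""
    if share_of_area <= 0:
        raise ValueError("the zone must cover a positive share of the area")
    return share_of_observations / share_of_area

# Worked examples from the text
fr_high = frequency_ratio(82, 5)   # 'susceptible and connected' zone
fr_low = frequency_ratio(5, 56)    # 'not susceptible and disconnected' zone
print(round(fr_high, 1), round(fr_low, 2))
```

Values above 1 mark zones in which observations are over-represented relative to the zone's areal extent, and values below 1 mark under-represented zones.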
The spatial transferability tests followed a two-step procedure. In the first step, the reference models fitted with data from the Stolla catchment and associated thresholds were directly transferred to the environmental data produced for the other catchments (i.e. IC map and the predisposing variable maps for the susceptibility model). Comparing the mapped debris flow inventory data with these maps allowed us to evaluate the generalizability of the reference model and the direct spatial transferability of the Stolla results to catchments with dissimilar environmental conditions. The second step focused on investigating the transferability of the general workflow by creating and validating newly calibrated models for the test areas in analogy to the procedure applied for the reference Stolla model. Thus, local debris flow data was used to refit the models and recalculate the associated thresholds.

4 | RESULTS

4.1 | Connectivity of debris flow release areas to main channels

In the Stolla catchment, a total of 587 debris flows (represented via 612 point locations) were identified for the triggering event in August 2017. Based on field surveys and aerial photo interpretation, only 39 of them (6%) were classified as connected to the main channel. In the Pfitsch basin, 180 debris flows (193 point locations), caused by the August 2012 rainfall event, were delineated. A total of 59 (31%) of these debris flows were categorized as connected. In the Sulden basin, 46 debris flows (47 point locations) were identified following the August 2014 storm event and more than half of them (53%, n = 25) were classified as connected to the main channels (Sulden and Trafoi). In the Stolla and Pfitsch basins, most of the disconnected debris flows were initiated at elevations > 2000 m a.s.l. and stopped at hanging valleys, on talus slopes and on debris flow fans, or in a few cases on alluvial plains located on the valley floors. In the Sulden catchment, debris flows stopped mainly at debris flow fans located at the edges of the valley bottom.

−2.03, Figure 5) cutpoint values compared to the Stolla reference cutpoint of −3.21 (Table 2, Figure 5). As a result, a decrease of the area classified as connected is observed for the former basins (Figure 8a vs b; Figure 8c vs d).

FIGURE 6 Probability of an area to be connected to the main channel in terms of debris flow release (a) and derived classified binary map of connectivity (b) for the Stolla catchment.

The Stolla modelling results (i.e. the fitted model) were then directly transferred to the Pfitsch and Sulden catchments to produce first debris flow release susceptibility models based on the identical spatial explanatory variables (Figures 11a and c). For Pfitsch, a first quantitative evaluation of the spatially transferred susceptibility model from Stolla (Table 4) showed that the observed debris flow release zones were frequently located on terrain classified as above-average susceptible, whereas debris flow absence observations regularly coincided with below-average susceptibility scores, leading to an acceptable AUROC of 0.76. The direct transfer of the Stolla susceptibility model to the Sulden area led to a lower, but still marginally acceptable, classification accuracy (AUROC 0.71) according to ROC interpretation guidelines (Table

highlight that the debris flow release susceptibility models trained within the respective areas performed outstandingly (Stolla) to excellent (Pfitsch, Sulden) to separate model-independent debris flow release observations from absence data. A direct spatial transfer of the Stolla model to both test areas led to lower and acceptable (Pfitsch) to marginally acceptable (Sulden) model performances. The spatial extent of areas susceptible to release debris flow equalled 17% in the Stolla catchment. In the Pfitsch basin, these areas decreased from 31% (transf model) to 17% (optimized map), whereas in Sulden they increased from 17% (transf model) to 29% (opt model).

FIGURE 7 Probability of an area to be connected to the main channel in terms of debris flow release for the Pfitsch and Sulden catchments. The probability values observed at the debris flow release point locations were used to derive optimized thresholds based on the ROC curve. Maps A and C are based on the spatial transfer of the Stolla IC logistic regression model (transferred model). The maps in B and D depict the results of a logistic regression model trained with local data (optimized model).
Figures 13 and 14. A comparison of the combined connectivity-susceptibility map for Stolla with the label of the previously sampled debris flow presence and absence locations revealed that 82% of connected debris flow release locations were located within areas classified as 'susceptible and connected' (areal extent 5%), leading to a very high FR score of 16.4 (Table 6).

FIGURE 8 Classified binary maps of debris flow release area connectivity: transferred model in the Pfitsch catchment (a); optimized model in the Pfitsch catchment (b); transferred model in the Sulden catchment (c); optimized model in the Sulden catchment (d).

under-represented within the four-class map, as confirmed by the respective FR scores below 1. Local optimization led to a different spatial classification, with larger zones classified as connected and

siderable differences in geomorphology and lithology between the two basins. As shown by the geomorphological analysis carried out by Scorpio et al. (2022), in the Stolla basin sediment cascades are

FIGURE 9 Modelled relationships for the Stolla debris flow release susceptibility model. The centred component smooth functions (a

FIGURE 10 Unclassified debris flow release susceptibility map (a) and derived classified binary map (b) for the Stolla catchment.

TABLE 4 Discriminatory power for the debris flow release susceptibility model (i.e. capability to separate susceptible from not susceptible terrain) and associated optimal threshold

TABLE 5 Non-spatial (CV) and spatial (SCV) cross-validation results for the debris flow release susceptibility models. Median AUROC values for model-independent non-spatial (CV) and spatial (SCV) test data. The interquartile range (IQR) depicts prediction performance variability in response to changes in the data partition

model, exhibit low statistical performance and plausibility (cf. columns 'transferred' in Table

basin, geomorphological analysis of the orthomosaic and DTM, and comparison with the map of Buter et al. (2020), show that most debris flows originating in the upper basin areas are currently decoupled from the main river system by the presence of large buffering landforms such as terminal and lateral moraines and bedrock valley steps. Hence, we argue that the low performance indicators observed for the spatially transferred Sulden model are due both to differences in the geomorphological setting (in comparison with the catchment where the model was initially devised) and to external forcing (relative magnitude-frequency of the triggering events). As expected, the classification accuracy of the combined map in the Sulden basin increased to an acceptable level as soon as the model was trained with event-specific local data (cf. columns 'optimized' in Table 8 and associated description), confirming the utility of the general methodical approach while highlighting the need for site-specific calibrations in case the environmental setting in the training area differs substantially from the application study site. It can be assumed that a very severe storm event in Sulden, comparable to the one studied for the Stolla basin, may entail a situation in which a higher proportion of debris flows that

FIGURE 11 Unclassified debris flow release susceptibility map for Pfitsch (a, b) and Sulden (c, d). The maps in A and C are based on the spatial transfer of the Stolla landslide susceptibility model (transferred model). The maps in B and D depict the results of the GAM trained with local debris flow data (optimized model).

Figure 5, Table 2). In this context, testing the spatial transferability of the workflow (optimized models) and the results (transferred models) to other basins proved particularly useful to uncover the strengths and limitations of the presented approach. Before this research, despite the high interest in the literature for application of the IC in different contexts, very few studies investigated IC values in relation to geomorphic processes from a quantitative viewpoint to define optimized thresholds. Among these, it is worth mentioning the work of Messenzehl et al. (2014), who applied the IC in a formerly glaciated alpine valley (Val Müschauns, Switzerland) and obtained lower values of IC (< −2.37) within the glacial cirques, talus slopes, pronival ramparts and exposed bedrocks and intermediate values (< −1) within colluvial deposits, debris flow moraine deposits, debris cones and alluvial deposits. In this context, the definition of one consistent

A similar non-linear trend was observed for relative slope positions represented by the normalized height variable. Inspection of the associated map confirmed the meaningfulness of the result, since most debris flow initiation takes place at medium relative slope positions while higher relative elevation zones were predominantly represented by steep rock faces without relevant debris accumulation. Figures 8c and d depict that concave-shaped terrain was also associated with higher chances of debris flow release, which is in line with the assumption that both debris and surface water may more likely accumulate within such landforms. The introduced statistically significant interaction term between topographic wetness index and roughness revealed that typical debris flow release zones can be found in zones where rough terrain spatially coincides with a high topographic wetness index. Similar patterns were found within all basins, describing that the chance of debris flow initiation is highest in case potential water accumulation zones overlap with terrain representative of the presence of coarse material. In fact, tests showed that a separate consideration of these two variables led to implausible spatial predictions of relatively high probability scores at the flood plains (high wetness index but low roughness) and at rough rock surfaces without debris material (high roughness index but low wetness index).

TABLE 7 Confusion matrix for Pfitsch that confronts the debris flow point classification (observations) with the spatially coinciding class label of the combined connectivity-susceptibility map (predicted class). The frequency ratio (FR) scores relate the portion of classified debris flow points (connected or disconnected) to the portion of area covered by a predicted class. The columns 'transferred' belong to the combined map based on the transfer of the Stolla IC threshold and the Stolla susceptibility model. The columns 'optimized' relate to the map based on local IC thresholding and local susceptibility modelling

Finally, the results about debris flow susceptibility highlight the effectiveness of data-driven modelling, but they also reveal the importance of integrating reliable model inputs, of ensuring model generalization and of proposing a geomorphological interpretation with respect to local process knowledge and field-based information. When interpreting the results, it should be kept in mind that the underlying data-driven models come with a considerable level of abstraction. Thus, even though it is known that generalizing models tend to make more accurate predictions on unseen data, they deliberately ignore the complexity of the phenomena of interest. Still, in this study, the numerous quantitative and qualitative evaluations provided insights into the explanatory power of the results (Good & Hardin, 2006; Oreskes et al., 1994; Steger et al., 2016a).

TABLE 8 Confusion matrix for Sulden that confronts the debris flow point classification (observations) with the spatially coinciding class label of the combined connectivity-susceptibility map (predicted class). The frequency ratio (FR) scores relate the portion of classified debris flow points (connected or disconnected) to the portion of area covered by a predicted class. The columns 'transferred' belong to the combined map based on the transfer of the Stolla IC threshold and the Stolla susceptibility model. The columns 'optimized' relate to the map based on local IC thresholding and local susceptibility modelling.

TABLE 1 Main physiographic and morphologic characteristics of the study catchments.

Box and whisker plots presenting IC values for connected and disconnected debris flows with respect to the areas of Stolla (a), Pfitsch (b) and Sulden (c). The number shown relates to the median IC value.
Table 2 highlights that the classification balance has been improved in both test areas via the local optimization (i.e. Spec and Sens for each model are more similar compared to the transferred counterparts). Still, the AUROC scores of the optimized models (Pfitsch 0.82, Sulden 0.57) show that an excellent discrimination between connected and disconnected debris flow release [...]

FIGURE 4

Notes: AUROC values depict the overall discriminatory power of the classification (0.5 = random, 1 = perfect) based on training data; IC shows the optimal IC cutpoint (connected vs disconnected) that maximizes the sum of sensitivity (Sens) and specificity (Spec); Spec relates to the specificity (true negative rate) at the optimal IC cutpoint (i.e. the portion of observed disconnected debris flow releases correctly classified as disconnected); Sens shows the sensitivity (true positive rate) at the optimal IC cutpoint (i.e. the portion of observed connected debris flow releases correctly classified as connected); transf = the IC cutpoint value was directly transferred from the original Stolla model (*) and tested for Pfitsch and Sulden; opt = the underlying models were optimized (trained) with local data.

TABLE 3 Non-spatial (CV) and spatial (SCV) cross-validation results for the IC logistic regression models. Median AUROC values for model-independent non-spatial (CV) and spatial (SCV) test data.
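The cutpoint criterion described in the notes to Table 2 (the IC value that maximizes the sum of sensitivity and specificity) can be sketched as a brute-force search over candidate thresholds. This is only an illustration under stated assumptions: it works directly on raw IC values rather than on fitted logistic regression probabilities, it assumes that a release is classified as connected when its IC is at or above the cutpoint, and the function name and sample data are hypothetical.

```python
def optimal_cutpoint(ic_values, connected):
    """Return (cutpoint, sensitivity, specificity) maximizing Sens + Spec.

    ic_values : IC value at each mapped debris flow release point
    connected : True where the release was observed as connected
    A point is predicted 'connected' when its IC >= cutpoint (assumption).
    Ties are resolved in favour of the lowest candidate cutpoint.
    """
    n_pos = sum(1 for c in connected if c)
    n_neg = len(connected) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both connected and disconnected observations are required")

    best = None
    for cut in sorted(set(ic_values)):
        # True positives: connected releases at or above the cutpoint.
        tp = sum(1 for v, c in zip(ic_values, connected) if c and v >= cut)
        # True negatives: disconnected releases below the cutpoint.
        tn = sum(1 for v, c in zip(ic_values, connected) if not c and v < cut)
        sens = tp / n_pos
        spec = tn / n_neg
        if best is None or sens + spec > best[1] + best[2]:
            best = (cut, sens, spec)
    return best


# Hypothetical IC values and observed connectivity labels.
ic = [-4.0, -3.5, -3.2, -2.9, -2.5, -2.1, -1.8, -1.5]
obs = [False, False, False, True, False, True, True, True]
cutpoint, sens, spec = optimal_cutpoint(ic, obs)
```

With these made-up values the search selects a cutpoint of −2.9, at which all connected releases are recovered (Sens = 1.0) and three of the four disconnected releases are correctly classified (Spec = 0.75). Transferring a cutpoint to another basin, as in the 'transf' columns, would amount to evaluating Sens and Spec at a fixed threshold instead of searching for a new one.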