Do brachycephaly and nose size predict the severity of obstructive sleep apnea (OSA)? A sample‐based geometric morphometric analysis of craniofacial variation in relation to OSA syndrome and the role of confounding factors

Obstructive sleep apnea is a common disorder that leads to sleep fragmentation and is potentially bidirectionally related to a variety of comorbidities, including an increased risk of heart failure and stroke. It is often considered a consequence of anatomical abnormalities, especially in the head and neck, but its pathophysiology is likely to be multifactorial in origin. With geometric morphometrics, and a large sample of adults from the Study for Health in Pomerania, we explore the association of craniofacial morphology to the apnea–hypopnea index used as an estimate of obstructive sleep apnea severity. We show that craniofacial size and asymmetry, an aspect of morphological variation seldom analysed in obstructive sleep apnea research, are both uncorrelated to apnea–hypopnea index. In contrast, as in previous analyses, we find evidence that brachycephaly and larger nasal proportions might be associated to obstructive sleep apnea severity. However, this correlational signal is weak and completely disappears when age‐related shape variation is statistically controlled for. Our findings suggest that previous work might need to be re‐evaluated, and urge researchers to take into account the role of confounders to avoid potentially spurious findings in association studies.


| INTRODUCTION
Obstructive sleep apnea (OSA) is a sleep disorder characterized by recurrent events of partial or complete collapse of the upper airways, leading to frequent episodes of apnea and hypopnea. Its symptoms include snoring, sleep interruption and excessive daytime sleepiness, which can lead to significant impairment in the quality of life (Fietze et al., 2019;Vogler et al., 2022). Previous studies on different populations (Benjafield et al., 2019;Fietze et al., 2019) have also reported higher prevalence over the past years, which can be attributed mainly to using more sensitive recording techniques and the recent change of scoring definitions. In most of those studies, OSA severity has been indicated by the frequency of apnea and hypopnea events per hour of sleep (apnea-hypopnea index; AHI), as determined by an overnight polysomnography (see Malhotra et al., 2021 andPevernagie et al., 2020, for a discussion of the limitations of AHI).
Although the pathophysiology of the disorder is not thoroughly understood and a multifactorial origin is suggested, there is growing evidence that craniofacial anatomy is a predisposing risk factor for OSA (Agha & Johal, 2017;Neelapu et al., 2017). Among the most reported craniofacial features in OSA are maxillary and mandibular deficiencies.
The relative positioning of the maxilla and mandible is also an important factor, where patients with OSA demonstrate more retro positioning of the mandible to the maxilla, which in turn will cause a reduction in the pharyngeal airway size and an increase in its collapsibility (Lee et al., 2010). The anterior cranial base (measured as nasion-sella), which plays a critical role in the development of the upper and middle face (Nie, 2005), is also found to be shorter in patients with OSA, and can be an influencing factor resulting in midface and maxillary deficiencies (Liu et al., 2000). Recent studies also reported an increase in the anterior facial height in subjects with OSA, and it has been theorized that different growth mechanisms explain the consequentially higher pharyngeal collapsibility (Sforza et al., 2000).
With the development of different imaging modalities and techniques, a comprehensive assessment of craniofacial morphology might become possible. Over recent years, magnetic resonance imaging (MRI) has become an attractive modality to evaluate the craniofacial features in patients with OSA. Although computed tomography (CT) provides superior images for bony structures, MRI offers a noninvasive, ionization-free alternative to CT that provides a sufficiently clear definition of bony structure boundaries and adjacent soft tissue.
Geometric morphometrics has been used in the assessment of craniofacial morphology in patients with OSA (Cetinoglu et al., 2019;Monna et al., 2022). Modern applications of geometric morphometrics typically employ anatomical landmarks on 2D or 3D images to quantify, analyse, and visualize morphological variation (Adams et al., 2004). Compared with traditional morphometrics (Marcus, 1990), which uses linear measurements and angles between landmarks, geometric morphometrics offers several advantages in the characterization of craniofacial morphology, as it preserves the geometric relationships among the anatomical landmarks (Rohlf & Marcus, 1993) and offers a variety of visualization techniques to investigate changes in shape (Klingenberg, 2013). Nonetheless, literature on craniofacial morphology analysis in patients with OSA using geometric morphometrics is still relatively limited (Cetinoglu et al., 2019;Fakhry et al., 2013;Monna et al., 2022;Pae et al., 1998;Singh et al., 2007). Therefore, in the present study, we aim to explore craniofacial morphology using Procrustes-based 3D geometric morphometrics on a configuration of craniofacial landmarks measured on MRI, and assess how it might be associated to the presence and severity of OSA (as measured by the AHI), and in relation to covariates such as sex, age and body mass index (BMI) in a large sample from the Study for Health in Pomerania (SHIP) dataset (Völzke et al., 2011). 2 | METHODS

| Study sample and MRI
A subsample of SHIP was used in this study. SHIP is a populationbased study conducted to estimate the prevalence and incidence of risk factors and clinical diseases in the general population of northeast of Germany (Völzke et al., 2011). For the present study the cohort SHIP-Trend (n = 4420) was chosen. An overnight polysomnography (PSG) was offered to all SHIP-Trend participants, whereby 1264 chose to partake. PSG data were obtained for 1209 participants. Of these, 559 (46%) were females. Additionally, all participants of SHIP-Trend were offered a whole-body MRI examination in which 1800 participants agreed and went through. After matching the participation in both the sleep examination and the whole-body MRI, and removing individuals with missing data for one or more of the study variables, the final sub-cohort of this study was 702 participants.The MRI scans were performed on a 1.5-T system (Magnetom Avanto; Siemens Medical Solutions, Erlangen, Germany) as a part of a whole-body MRI protocol. Details on the analysed MRI sequences and the ethical approval are included in the Supplementary Information.

| Sleep examination
Participants attended a single-night, laboratory-based PSG (Alice

| Covariates, groups and landmark configuration
To investigate the association between craniofacial morphology and OSA, we use AHI as a proxy for the severity of the disorder and, in relation to this main analysis, we also explore the potential effect of covariates such as sex, age and BMI, as well as body height and weight (Table S1). As in previous studies (Cakirer et al., 2001;Schwab et al., 2017;Sutherland et al., 2019), to aid normality, AHI + 1 is logtransformed (lnAHI). One is added before logging to avoid zero values.
The OSA severity groups and age classes are defined as in Fietze et al. (2019). For OSA, groups are: physiological, AHI < 5; mild, AHI ≥ 5 but < 15; moderate, AHI ≥ 15 but < 30; severe, AHI ≥ 30 per hour of sleep. For age, groups are individuals in their 20s, 30s, etc. with the oldest group being 70 years old or older. Table 1 illustrates the sample composition in relation to OSA severity; and Figure 1 and Table S2 summarize the sample composition in relation to age.

| Geometric morphometrics
We apply Procrustean geometric morphometrics (see Supplementary Information for details) to the Cartesian coordinates of the landmarks to compute size and shape variables (Adams et al., 2004). Size is measured by the centroid size of a landmark configuration, which quantifies the dispersion of the landmarks around their "baricenter". Shape is a matrix of Procrustes shape coordinates, with differences between individuals (or group means) quantified as Procrustes shape distances (approximately equal to the length of a straight line connecting two observations in the multivariate shape space). Although we perform most of the analyses on total shape, including left-right asymmetries, we also calculate the magnitude of individual asymmetries in shape (i.e. the Procrustes shape difference between the symmetric and asymmetric shape components of an individual-see Supplementary Information for more information). As with other covariates, the correlation between lnAHI and individual asymmetry is assessed. Thus, overall, geometric morphometrics provides three sets of variables: (1) centroid size, measuring how large or small a cranium is; (2) individual asymmetry, corresponding to the magnitude of left-and right-side shape differences; (3) cranial shape, capturing the relative proportions of the different cranial regions. Centroid size and individual asymmetry are univariate. Cranial shape, in contrast, is a multivariate matrix of 702 rows (the individuals) by 54 variables (the 3D shape coordinates of 18 landmarks).
For all geometric morphometric analyses (superimposition, computation of asymmetric shape) we use the R statistics (R Core Team, 2022) package Morpho (Schlager, 2017), but we employ Morpheus et al. (Slice, 1999) to illustrate shape differences with wireframe diagrams (Fig. 2b). These diagrams are drawn by connecting specific pairs of landmarks with straight lines to aid the visualization (Klingenberg, 2013).

| Statistical analyses
We subdivide this section into three parts: (1) analyses of association between variables; (2) group discrimination using shape data; (3) estimates of sensitivity, specificity (Parikh et al., 2008) and the area under the curve (AUC) of a receiver-operator curve (ROC; Park et al., 2004).
To control for sexual dimorphism, all the main statistical analyses are conducted with separate sexes. Analysing females and males separately also allows to corroborate findings in one sex with those in the other. The alternative approach, when sex differences are significant, is to employ "sex-corrected" data. Sex-correcting the data is useful to increase sample size, but generally not recommended because statistical sex corrections require further assumptions (mainly, that the interactions between sex and any other factor in the analysis is not significant and accounts for a negligible amount of variance) and, thus, make the study design more complex, while potentially introducing inaccuracies. For instance, when we repeat the analyses after controlling for age-related variation, we would need to simultaneously sexand age-correct data. As female and male samples are large, we avoid this further numerical manipulation of the data. The only exception where we both sex-and age-correct data to increase sample size is when we estimate directional asymmetry within OSA groups (see Supplementary Information). However, we clearly state that the T A B L E 1 Sample composition by sex and OSA severity (absolute numbers and corresponding percentages in the sample) F I G U R E 1 Box and violin plot of age (years) by sex analysis of directional asymmetry is a very preliminary, exploratory analysis, which might stimulate future studies, but requires much larger and more balanced samples.

| Correlations and multivariate regressions
With univariate data (centroid size and individual asymmetry, as well as the covariates-age, weight, height, BMI), we explore and visualize correlations with lnAHI using the chart.Correlation function of Perfor-manceAnalytics (Peterson & Carl, 2020). Although we mainly focus on correlations with lnAHI, we compute pairwise correlations among all variables and briefly discuss significant results. We employ Spearman's r, because some variables deviate from normality. The chart.Correlation function provides a parametric test for significance with results marked with *p < 0.05, **p < 0.01, ***p < 0.001. However, in this and other analyses, we choose a conservative threshold for significance (0.005 or 0.001), and we therefore discuss only correlations marked F I G U R E 2 Landmark configuration (a) and wireframe (b) with ***. The choice of a conservative significance threshold is advocated for new discoveries (Benjamin et al., 2018), but also mitigates against the inflation of type I error in multiple tests and helps to focus on results with the largest effect size (r or, see below, multivariate R-squared).
To assess the association of shape with other variables, we use multivariate regressions in which all shape coordinates simultaneously regressed onto a predictor. Significance of the F-ratio is tested using 10,000 permutations (adonis function in vegan; Oksanen et al., 2011), which is equivalent to a permutational approach using the multivariate regressions also to preliminarily assess their potential association with shape. If age or BMI is significant, we compute regression residuals to statistically control for their effect on shape before retesting the association between shape and lnAHI. Thus, regressions of shape coordinates on lnAHI are done using total shape, as well as age-(or BMI-) corrected shapes. When results are significant using total shape, but are not significant using age-(or BMI-) corrected shape data, we conclude that there is no evidence for the association of craniofacial shape with lnAHI beyond what is expected because of the known risk factors (i.e. potential confounders) represented by age and BMI.

| Group ordinations and predictions using shape
To visualize differences in relation to age and OSA groups, as well as to assess group predictive accuracy, we use craniofacial shape in a series of between-group principal component analyses (bgPCA; see Supplemen- tary Information for more information). This method is analogous to, but computationally simpler than, a discriminant analysis and also less restrictive in terms of assumptions (Cardini et al., 2019). As with the multivariate regression of shape on lnAHI, bgPCAs are done both using total shape and age-(or BMI-) corrected shape. For the computation of bgPCAs, we use the function groupPCA of Morpho (Schlager, 2017). The main axis of between-group differences (bgPC1) is then visualized using box and violin plots (Hintze & Nelson, 1998) in ggplot2 (Wickham, 2011).
Group prediction in bgPCAs is leave-one-out cross-validated. To assess whether the resulting estimate of classification accuracy (hit rate) is better than expected by random chance, we use a random baseline approach (Kovarovic et al., 2011). Thus, we randomize groups before computing cross-validated hit rates, so that any correct hit is due to random chance. After replicating this randomization 100 times, we compute the median, and the 95th and 99th percentiles of the distribution of random chance hit rates.
• If the observed hit rate is less than the median random baseline, craniofacial shape has no group predictive power and is therefore irrelevant to assess OSA severity.
• If the hit rate is above the median random baseline, but below the 95th percentile, the evidence for group separation is weak.
• If the observed hit rate is above either the 95th or 99th percentiles, there is, respectively, fair or strong evidence that craniofacial shape predicts groups better than expected by chance.

| Sensitivity, specificity and ROC-AUC analysis
We pool mild, moderate and severe OSA (AHI ≥ 5) for the last series and ideally as close as possible to 1 (Park et al., 2004). As with other analyses, also the ROC-AUC analysis is done both using total and agecorrected shapes. BMI-corrected shapes, however, are not analysed, as the bgPCA using the four OSA severity groups (see Sections 2.5.2 and 3.2) demonstrates that controlling for BMI has a negligible effect on discriminating OSA severity using shape data.

| RESULTS
3.1 | Correlations of covariates and regressions of shape onto predictors

| Bivariate correlations: lnAHI and covariates
LnAHI is correlated with sex (p < 0.0001, r = 0.28) and, within sex, positively correlated with age and BMI ( Figure 3; Table 2). It is also correlated positively with weight and negatively with height, but correlations are smaller. Thus, we mainly focus on age and BMI, which are the main known risk factors for OSA.
Compared with females, males tend to have higher AHI/lnAHI ( Figure 4; Table S1). In both sexes, lnAHI increases with age. In females (Figure 3a), this happens mainly in their 50s, when there is a pronounced increment followed by a more gradual increase after 60 years of age. In males (Figure 3b), the increase is minimal between 20 and 39 years of age; sharp for individuals in their 40s; and gradual but constant at 50 years old or older. Within sex, older individuals with higher BMI are more at risk of OSA (p < 0.0001, r ranging from 0.41 to 0.54; Table 2). In contrast, neither cranial size nor the degree of craniofacial shape individual asymmetry are significantly correlated to lnAHI (p > 0.1, r < 0.1).
Age and BMI (Figure 3) covary (p < 0.0001, with r = 0.27 and 0.31 in, respectively, females and males). However, the increase in weight in older people (non-significant correlation with age) is partly masked by younger individuals being on average taller (correlations between height and age: p < 0.0001, with r = À0.41 in females and À 0.38 in males) and, therefore, heavier. In both sexes, taller and heavier individuals tend

| Multivariate shape: association with age, BMI and lnAHI
Age and BMI have a significant statistical effect on shape (p < 0.0001), even if this only accounts for, respectively, 3.3%-2.1% (females) and 3.3%-1.2% (males) of variance in the multivariate regressions (Table S3). As age is the factor with a stronger effect, we briefly describe this component of craniofacial shape variation in the sample. Older individuals generally have broader and deeper, but shorter, faces, with larger noses ( Figure 5). The pattern is similar in both sexes, although more pronounced in males. This congruence is confirmed by the small angles (< 36 ) formed by the female and male vectors of regression coefficients (Table S4), which implies a similar direction of shape change in relation to the predictor. Multivariate regressions of shape onto lnAHI (Table 3) are also significant in both sexes (p = 0.0001) and account for 1.5%-1.0% of total variance in, respectively, females and males. AHI-related shape changes are similar in the two sexes (angle between vectors = 35 ), although now (unlike with age) it is females who display the most pronounced variation ( Figure 6): the face is broader, and the nose larger and slightly taller, in individuals with higher lnAHI; the cranium seems also, to a very moderate extent, shorter and deeper. These differences, which suggest a propensity towards a modest brachycephaly in individuals with severe OSA, are reminiscent of, but less evident than, similar changes in age-related shape trajectories of females and males ( Figure 5).
Using age-corrected shape, regressions on lnAHI are no longer significant (p > 0.3, Rsq = 0.3% in both sexes). Controlling the effect of BMI on cranio-facial shape, in contrast, does not change the conclusions of the tests for the association between shape and lnAHI, even if there is a small reduction in Rsq (Table S5). Within sex, quasicollinearity between age-and lnAHI-related shape trajectories is supported by small angles between vectors of regression coefficients (age versus lnAHI < 32 in both sexes), which are about half of those formed by either of them and BMI-related trajectories (range = 52-67 ). Thus, angles between shape trajectories predicted in the regressions (Table S4) confirm that OSA-related shape variation is similar to changes in relation to age, but largely independent from those related to BMI. Overall, age, BMI and lnAHI are all significant predictors of craniofacial shape, but age has the larger effect and its effect (unlike that of BMI) is mostly collinear with that of lnAHI, so that controlling for age-related shape variation completely removes the small association with lnAHI.

| Group ordinations and group prediction using shape
A subtle but gradual and consistent trend in shape change with age (seen in the regressions described in Section 3.1.3) is supported also when shape is used to predict age groups (Figure 7). The leave-oneout bgPCA cross-validated average accuracy is 26% for females and 30% for males. This is better than a random chance accuracy of approximately 1 in 6 (i.e. 17%), with six a priori groups. In fact, the observed hit rates of 26%-30% exceed in both sexes the 99th percentile of random chance accuracy, which is 24% in females and 22% in males. Thus, despite a modest accuracy, craniofacial shape predicts age groups significantly better than expected by chance. Cross-validated predictions for the four OSA groups (Table 4) have average hit rates of 39% (females) and 34% (males), when based on total shape. This is more than 1 in 3 individuals being correctly classified, compared with about 1 in 4 expected by chance, and also better than the 99th percentiles of random chance accuracy, which are 38% in females and 32% in males. In contrast, using age-corrected shape, cross-validated average predictive accuracy (Table 4) drops to 22% in females and 25% in males, which is lower than the median baseline for random chance accuracy (28% in females and 26% in males). Therefore, craniofacial shape is moderately predictive of OSA severity but, if the effect of age on shape is statistically removed, the prediction of OSA groups becomes no better than expected by chance. Using BMI-corrected shapes (Table S5), on the other hand, results are virtually identical to those using full shape.
To summarize, combining the results of multivariate regressions and group predictions, we find a small but non-negligible covariation between craniofacial shape and OSA severity. This pattern is mostly unchanged when the effect of BMI on shape is controlled for. However, when the effect of age is statistically removed, the association between shape and OSA disappears. Thus, craniofacial shape features associated to OSA are mostly the same as those accounted for by age, whereas they are different from BMI-related shape changes.
F I G U R E 6 Apnea-hypopnea index (AHI)-related shape changes: for females (left) and males (right), the shape predicted by the regression trajectory for the individuals with the highest AHI (72 for females, 88 for males) is shown magnified 20 times relative to the unmodified shape (light grey) predicted for the healthy ones (AHI = 0) 3.3 | Sensitivity, specificity and ROC-AUC analysis Table 5 shows the results of cross-validated estimates of sensitivity, specificity and the ROC-AUC analysis of craniofacial shape using a binary classification (i.e. individuals affected by OSA versus healthy ones). Using full shape, sensitivity and specificity are virtually the same with about 60% of true-positives or true-negatives in both sexes. Sixty percent is a rather low predictive accuracy for two groups, but it turns out to be better than the 95 th -99 th random chance baselines (Table 5). AUC is also low (0.65 in both females and males), but above the 0.5 random chance baseline.
When age-corrected shape is used, sensitivity drops in both sexes to just above 50% and specificity is even lower (45%-50% in females and males, respectively). Predictive accuracies of agecorrected shape are well below the 95 th -99 th random chance baselines, even though they are generally marginally better than the median baseline (Table 5). Consistently with these results, that are about the same as random chance classification, also AUC drops to 0.52-0.54 (in females and males, respectively), which is very close to 0.5.

| DISCUSSION
Using a large sample of adults, we show that craniofacial morphology is associated, although very modestly, to the severity of OSA. The small indication of a significant covariance we find between lnAHI and shape, however, cannot be separated from age-related variation: if the effect of age is statistically removed, that completely erases any signal of association between the severity of the disorder and craniofacial shape. Overall, therefore, craniofacial proportions are a poor predictor of the severity of OSA. This is an important conclusion, although one whose external validity will have to be tested in future studies.
The vast majority of, if not all, the research on potential associations between OSA and morphology is correlational. In these types of analyses, it is crucial to carefully control for potential confounders, as we exemplified. A confounder is a factor that influences both sets of variables in an association study, so that one cannot exclude that a significant correlation is in fact spurious.
Thus, a researcher might find aspects of morphological variation that correlate with the severity of a disease such as OSA, but the evidence is weak if a confounder largely accounts for that same type of phenotypic change, which is exactly what happens with age in our study. In the next sections, we focus on this specific result and its implications, but also briefly discuss the possible reasons for the weak association between craniofacial shape and OSA severity.
F I G U R E 7 Box and violin plot of between-group principal component analysis (bgPC1) scores (using shape to predict age groups), suggesting an age-related trend in shape changes T A B L E 4 Average cross-validated classification accuracy using shape to predict OSA groups .

| Age-related shape variation in adults as a confounder in assessing OSA
Age is highly significant as a predictor of craniofacial shape in our study, but explains only 3% of total variance. In terms of variance accounted for, this is small, but about the same as for craniofacial shape sexual dimorphism (2.8% in our sample), which represents a much better understood aspect of variation in human morphology (Bigoni et al., 2010;Franklin et al., 2012;Kimmerle et al., 2008). Larger noses in elders have in fact been reported by Smith et al. (2021), who also provide a review of the literature on aging and head morphology showing an expansion of the surface area of the nose as a dominant feature in elders.
What are, however, the causes of age-related shape change in adults? The study of Smith et al. (2021), like ours, employs crosssectional data, and thus cannot conclusively discriminate causal factors. Change could be partly genuine, because of development and bone remodelling, and partly spurious and related to demographic and/or lifestyle changes in people of different ages. For instance, if we consider another morphological variable, such as body height, elders in our sample are approximately 10 cm shorter on average than individuals in the younger groups. Height in adults slightly decreases (approximately 2-5 cm on average) with age (de Groot et al., 1996;Dey et al., 1999;Fernihough & McGovern, 2015;Gavriilidou et al., 2015), but it seems improbable that a large mean difference, such as the one we find, is fully explained by aging alone. More likely, it is a consequence of several interacting factors. It is well known that there has been a worldwide increase in adult body height over the last decades (Cavelaars et al., 2000;NCD Risk Factor Collaboration, 2016;Perkins et al., 2016). Thus, when we compare younger and older generations, elders are shorter not just because of a physiological reduction in height with age, but also because they belong to a generation that was shorter as young adults. As a consequence, cross-sectional data on body height tend to inflate the apparent reduction in body height due to aging.
As mentioned, age-related variation in craniofacial shape, while small, has been reported before. However, as with body height, without longitudinal data, we cannot be sure whether it is mostly developmental or due to other factors. Regardless of its explanation, shape change in relation to age is mostly the same as change in relation to OSA. Some of the craniofacial differences we find between healthy individuals and individuals with OSA have been previously observed by others. As reviewed by Agha and Johal (2017), an association between a brachycephaly and OSA severity, supported by our analysis in females mainly (Figure 6), has been reported in several studies. Nasal inter-landmark distances also are, on average, larger in patients with severe OSA (Cetinoglu et al., 2019), a trend we find in association with severe OSA (Figure 6). However, brachycephaly and larger nasal proportions are also found simply in relation to age, regardless of OSA ( Figure 5; Smith et al., 2021). Brachycephaly and a larger nose might contribute to increase the severity of OSA.
However, as controlling for the effect of age on shape removes the association, the relationships between OSA and these specific aspects of craniofacial morphology remain most uncertain.

| Why such a weak association between craniofacial shape and OSA?
Even if the covariation between craniofacial shape and OSA severity is genuine, the signal for an association is weak (small Rsq and poor accuracy in predicting groups). There are several potential explanations for this result. The easiest is that craniofacial shape has little relevance for OSA. Alternatively, maybe we simply did not measure the truly relevant anatomical regions. MRIs on the lower face were not available in our dataset, which is why we did not digitize the mandible, tongue, hyoid bone and neck, often considered of importance in relation to OSA (Agha & Johal, 2017;Laharnar et al., 2021;Rizzatti et al., 2020). In particular, we opted out of including the mandible in this analysis, and focused on the cranium and upper facial area, as the mandible position  Table 4 with a classification based on four groups, we are here using a binary classification where we simply compare healthy and OSA (any degree of severity) individuals.
was not standardized and many participants had jaw movements during the scan time, which resulted in the mandible either partially not appearing in the head scan or deviating from the closed mouth position.
Further studies are needed to establish the relation between craniofacial morphology and soft oropharyngeal structures, as multiple subtle differences in craniofacial morphology could also contribute to a more collapsible airway (e.g. a retrognathic maxilla or mandible might cause the tongue to be placed more dorsally pushing against the pharyngeal walls). It is, nevertheless, possible that controlling for confounders, such as age, might remove some of the associations reported in previous research.
Finally, because AHI is a proxy for OSA severity, even if it is probably the most commonly used index, it may be inaccurate (Pevernagie et al., 2020). This adds more uncertainty to analyses of OSA-related phenotypic variation, but seems to be rarely acknowledged by researchers. Robust associations will require the development of a more accurate index to estimate OSA severity. More likely, instead of trying to capture the complexity of a multifactorial disease using a single number such as AHI, researchers will have to adopt a multivariate quantitative description of OSA symptoms.

| Conclusions
Obstructive sleep apnea is a most common sleep-related breathing disorder, which is prevalent in older men who are overweight, but can occur also in women and the young regardless of weight. Causes of the disorder are likely to be multiple and remain poorly understood.
The association of specific aspects of craniofacial anatomy with the severity of OSA is an important area of research. Our study suggests that the proportions of the upper face and cranium have modest relevance for OSA. As in previous research (Agha & Johal, 2017;Cetinoglu et al., 2019;Rizzatti et al., 2020), there is an association between OSA severity and a brachycephalic phenotype with a larger nose, but this association is weak and disappears completely if agerelated shape variance is statistically removed. Age-related shape change is small, although slightly larger than OSA-related variation.
Yet, the latter is part of the former, so that one cannot be separated from the other. Thus, we cannot say with confidence whether the correlation between morphology and OSA-severity is accurate, and has relevance for the study of the disorder, or it is instead a spurious consequence of covariance with a potential confounder. Previous work claiming associations between morphology and OSA might have to be reassessed if results did not take confounders into account, and future research must be done using a rigorous design that reduces the risk of potentially spurious findings. We opened the paper asking whether brachycephaly and nose size predict the severity of OSA. For now, the answer is mostly negative, but important in showing how crucial it is to include confounders in studies of phenotypic association.

AUTHOR CONTRIBUTIONS
AC, AD, MK, RB contributed to the conception and design of the work.
AC performed the analysis. TI, AO, RE, BS, IF, TP and NH contributed to the acquisition, analysis or interpretation of the data for the work. AC and AD drafted the manuscript. TI, MK, IF critically revised the manuscript. All coauthors gave final approval and agreed to be accountable for all aspects of work ensuring integrity and accuracy.