Assessing Decoding Ability

Tools for assessing decoding skill in students attending elementary grades are of fundamental importance for guaranteeing an early identification of reading disabled students and reducing both the primary negative effects (on learning) and the secondary negative effects (on the development of the personality) of this disability. This article presents results obtained by administering existing standardized tests of reading and a new screening procedure to about 1,500 students in the elementary grades in Italy. It is found that variables measuring speed and accuracy in all administered reading tests are not Gaussian, and therefore the threshold values used for classifying a student as a normal decoder or as an impaired decoder must be estimated on the basis of the empirical distribution of these variables rather than by using the percentiles of the normal distribution. It is also found that the decoding speed and the decoding accuracy can be measured in either a 1-minute procedure or in much longer standardized tests. The screening procedure and the tests administered are found to be equivalent insofar as they carry the same information. Finally, it is found that speed and accuracy act as complementary effects in the measurement of decoding ability. On the basis of this last finding, the study introduces a new composite indicator aimed at determining the student’s performance, which combines speed and accuracy in the measurement of decoding ability.

heterogeneous because of their different level of maturity and their different cognitive and linguistic ability. In kindergarten, Italian children are not trained systematically in reading acquisition and phonology, and this leads to a heterogeneous situation among children at the beginning of the first grade. Nevertheless, because of the consistency of Italian orthography, by the end of the first grade the majority of children read and decode successfully. Decoding ability in elementary grades in Italy is currently assessed with tests requiring the students to read aloud a selected list of words, a selected list of nonwords, or a text. These lists are taken from standardized tests. The most widely used standardized tests have been introduced and illustrated by Tressoldi (1995, 2007). Recently, a new screening procedure for identifying impaired decoders in elementary grades in Italy was proposed by Stella, Scorza, and Morlini (2011). This screening requires the students to read a selected text for exactly 1 minute. What is important in the use of tests and screening procedures is the way the results are interpreted. One of the defining characteristics of a skilled decoder is that he or she not only is able to spell written words (or nonwords) accurately, but also does so rapidly and automatically (Bowers & Wolf, 1993). An individual who spells accurately but very slowly cannot be considered a skilled decoder. A slow rate of word reading is thus as characteristic of impaired decoding as low accuracy (Compton & Carlisle, 1994), especially in transparent languages (e.g., Jimènez & Hernàndez-Valle, 2000; Jimènez & Ramìrez, 2002; Jimènez, Rodrìguez, & Ramìrez, 2009; Serrano & Defior, 2008; Sprenger-Charolles, Colé, Lacert, & Serniclaes, 2000; Sprenger-Charolles, Siegel, Jimènez, & Ziegler, 2011; Ziegler, Perry, Ma-Wyatt, Ladner, & Schulte-Korne, 2003).
In Italy, some examiners assessing decoding ability do not take both factors into account, and an individual can be classified as not impaired because he or she is able to read words (or nonwords) very rapidly, even though he or she misspells a fairly large number of words (or nonwords). Some other examiners do not consider speed, and individuals with weak decoding skills who are able to read a large number of words, provided they are given ample time, are erroneously classified as adequate decoders by these examiners. Many authors have outlined the necessity of considering both speed and accuracy for a valid assessment of decoding skills and have recommended including both factors in standardized test procedures in learning disabilities research (e.g., Dennis & Evans, 1996; Torgersen et al., 1997; Vernon, 1990). Therefore, studying the role of speed and accuracy in assessing decoding ability and developing a composite indicator of decoding efficiency that incorporates measures of speed as well as of accuracy are two main goals in reading research. In the literature, only a few indicators have been proposed (Dennis & Evans, 1996; Joshi & Aaron, 2002; Olson, Gillis, Rack, DeFries, & Fulker, 1991; Sinatra & Royer, 1995). Another goal is to define suitable threshold values for the variables measuring speed and accuracy, for classifying a student as a normal decoder or as a "too slow" or "too inaccurate" decoder. At present in Italy, these thresholds have been defined assuming a Gaussian distribution of the variables. Our study shows that none of the variables measuring speed and accuracy in the administered tests is Gaussian and that the threshold values should therefore be estimated on the basis of the empirical distribution of these variables rather than on the basis of the percentiles of the Gaussian distribution. In summary, the present study has the following aims:
1. The first aim is to examine the empirical distribution of the variables measuring speed and accuracy in two widely used Italian standardized tests and in a new screening procedure proposed by Stella et al. (2011) and to see whether the normative threshold values for the variables measuring speed and accuracy in the standardized tests lead to a percentage of students classified as impaired readers in agreement with the expected one. Although there have been no studies to indicate an accurate percentage, the Italian Dyslexia Association (http://www.aiditalia.org) believes that learning disability can affect between 3% and 4% of the Italian scholastic population.
2. The second aim is to study the bivariate relationships among variables to investigate whether speed and accuracy are correlated or act as independent elements in the assessment of the decoding ability.
3. The third aim is to investigate whether the speed and the accuracy measured in the new screening procedure (which is exactly 1 minute long) are correlated with the speed and the accuracy measured in the much longer standardized tests.
4. The fourth aim is to propose a new composite indicator of decoding efficiency that incorporates a measure of speed as well as a measure of accuracy. This indicator can be used in any test or screening procedure, both in countries with transparent languages, such as Italian, and in countries with deep languages. Indeed, even though a shallow orthography facilitates decoding, Paulesu et al. (2001) found that dyslexic readers in shallow orthography read better than dyslexic readers in deep orthography, but they perform significantly worse than their nondyslexic counterparts in shallow orthography. The indicator requires only the noncorrelation of the two measures. It may be used for classifying children as impaired or not impaired on the basis of a score that takes into account both the speed and the accuracy. When a measure of speed and a measure of accuracy are used separately, a person may be classified as impaired with one measure but as not impaired with the other measure.
The article is organized as follows. In the second section we illustrate the study conducted. Drawing from the results obtained by administering the standardized tests and the screening procedure to about 1,500 students attending elementary grades in Italy, in the third section we report univariate and bivariate analyses of the variables measuring speed and accuracy, examine the empirical distribution of these variables, and discuss the choice of the threshold values for classifying the students' performance. In the fourth section we propose the new composite indicator. Finally, in the fifth section we give some concluding remarks.

The Study
A new screening procedure and two standardized tests were administered to 1,469 students in elementary grades in the Lombardia and Emilia Romagna regions of northern Italy.
The tests and the procedure were administered to students attending Grades II, III, IV, and V in February and to students attending Grade I in May. Since Italian is a language with transparent or shallow orthography, where the letters of the alphabet, alone or in combination, are in most instances uniquely mapped to each of the speech sounds occurring in the language, in general at the end of the first school year students are able to read (Goswami et al., 1998;Seymour et al., 2003). The tests and the screening procedure were administered to all students attending the schools selected for the study, but results relative to foreign students and to students with a certified learning disability were dropped from the analysis. Information about the participants is shown in Table 1. The administered screening procedure is called SPILLO (Stella et al., 2011). In this procedure, the student is asked to read aloud a text for exactly 1 minute. The text presented tells a story and is composed of 181 words. When reading a text, semantic and morphosyntactic information can enhance the reader's understanding, whereas reading a list of words or nonwords is primarily a measure of intact or impaired lexical or sublexical process. Reading a text allows the evaluation of the efficiency of the so-called instrumental reading, that is, the ability to quickly and correctly recognize and spell words present in a text. The chosen text tells a story. This story has a score of 71 on the Gulpease Index, which is similar to the Flesch Index (Flesch, 1948). It compares the words in the text with those in the base vocabulary of the Italian language of De Mauro (1997;www.eulogos.net/it/censor). The Gulpease Index uses a predefined scale from 0 (minimum readability) to 100 (maximum readability). This scale is based on the evaluation of real understanding of a body of text by readers of different school ages. 
A score of 71 indicates that the text is "difficult to read" for students in primary school and that children in elementary grades are not capable of reading and understanding the text by themselves. The ability to decode is measured by the following variables, computed by the examiner at the end of the procedure (after 1 minute):
Y 1 : number of words read in a minute
Y 2 : number of syllables per second read in a minute
Y 3 : number of incorrect pronunciations in a minute
The standardized tests are the Batteries for the Diagnosis of Reading and Spelling Disabilities of Sartori et al. (1995, 2007). These tests are the most widely used diagnostic tests in Italy. While the student reads, the examiner times the reading and makes a note of the mistakes. In the first test the student is asked to read a list of words and in the second test a list of nonwords. The list of words contains the same proportion of words as a function of familiarity (one third of the words are "very familiar," one third are "familiar," and one third are "not familiar") and as a function of the number of syllables (one third of the words have two syllables, one third have three syllables, and one third have four syllables). The list of nonwords contains the same proportion of nonwords as a function of the number of syllables. In the first test the variables measuring decoding performance are the following:
X 1 : time in seconds needed to read the list of words
X 2 : number of syllables per second read on the list of words
X 3 : number of errors on the list of words
In the second test the variables are these:
X 4 : time in seconds needed to read the list of nonwords
X 5 : number of syllables per second read on the list of nonwords
X 6 : number of errors on the list of nonwords
Although the number of syllables per second seems to be more common in text or sentence reading tests whereas the time in seconds seems to be preferred for word and nonword reading tests, the two scores are provided interchangeably for the Italian standardized tests. Intuitively, both measures have exactly the same meaning.
However, when one derives the syllables per second or the time in seconds from the same test performance, the pronounced nonlinearity of the function relating the two measures can produce important discrepancies between results obtained with each score, both in clinical practice and in experimental research. This fact is illustrated in the next sections. The time required for administering the tests depends on the student's ability and usually varies between 1 and 20 minutes. The screening procedure and the tests were administered in a quiet environment outside the classroom. The text and the lists of words and nonwords were printed in Times New Roman font. Each student was asked to read aloud as quickly as possible but to be as accurate as possible at the same time. The procedure and the tests required the examiner not to intervene when the student made a mistake.
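The nonlinearity mentioned above can be made concrete with a minimal sketch. It assumes that syllables per second is obtained by dividing the number of syllables in the list by the reading time; the list length used here (`TOTAL_SYLLABLES = 300`) is a hypothetical value chosen for illustration, not the length of the actual standardized lists.

```python
# Hypothetical list length: an assumption for illustration only,
# not the number of syllables in the actual standardized word list.
TOTAL_SYLLABLES = 300


def syllables_per_second(time_seconds: float,
                         total_syllables: int = TOTAL_SYLLABLES) -> float:
    """Convert a reading time (seconds) into a syllables-per-second score."""
    if time_seconds <= 0:
        raise ValueError("time_seconds must be positive")
    return total_syllables / time_seconds


# The same 30-second difference in reading time produces very different
# differences in syllables per second, depending on where it occurs.
fast_gain = syllables_per_second(60) - syllables_per_second(90)
slow_gain = syllables_per_second(270) - syllables_per_second(300)

print(f"60 s vs 90 s:   {fast_gain:.3f} syllables/s")
print(f"270 s vs 300 s: {slow_gain:.3f} syllables/s")
```

Because the relation is a reciprocal, equal differences in time among slow readers are compressed on the syllables-per-second scale, which is one way a student can fall beyond a threshold on one score but not on the other.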

Speed: Univariate Analyses
Speed is measured by the variables X 1 , X 2 , X 4 , X 5 , Y 1 , and Y 2 . In Tables 2, 3, and 4 we list the values of some univariate statistics computed for these variables. In Figure 1 we show the empirical distributions through histograms. Performances in the decoding speed improve from Grade I to Grade V: Both the median (indicated with x 0.50 in the tables) and the mean values of X 1 and X 4 decrease, whereas the mean and the median values of X 2 , X 5 , Y 1 , and Y 2 increase. Variables measuring the number of words and the number of syllables read in a second have a similar pattern: The average values of Y 1 and Y 2 across the five grades behave similarly to the average values of X 2 and X 5 . Dispersion, measured by the coefficient of variation (CV), always decreases with the grade level. The larger variability in Grades I, II, and III may be explained by considering that many covariates (e.g., cultural level, experiences in day nursery, etc.) have a great influence on decoding performances. Beginning with Grade III, these covariates become less relevant and the scholastic population becomes more homogeneous.
Skewness. Variables measuring the decoding speed on the tests (X 1 , X 2 , X 4 , X 5 ) have a positive skew and present outlying values higher than x 0.75 + 1.5(x 0.75 - x 0.25 ) in all grades. These characteristics are desirable for X 1 and X 4 , which have a positive direction of pathology (impaired readers are children with high values for these variables), but not for X 2 and X 5 , which have a negative direction of pathology (impaired readers are children with low values for these variables). Y 1 has a positive skew in Grades I, II, and III but a negative skew in Grades IV and V. Y 2 has a positive skew in Grades I and II and a negative skew in Grades III, IV, and V. In Grades IV and V these variables also have outlying values smaller than x 0.25 - 1.5(x 0.75 - x 0.25 ). These features show that Y 1 and Y 2 have more discriminative power in the last years of primary school, when the students read more fluently.
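The outlier rule used in this paragraph is the usual box-plot rule: a value is outlying if it lies more than 1.5 interquartile ranges above the third quartile or below the first quartile. The following sketch shows the computation with the Python standard library; the reading times are invented for illustration, and the quartile interpolation method (`"inclusive"`) is an assumption, since the article does not state which one was used.

```python
import statistics


def tukey_fences(values):
    """Return (lower, upper) fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """Split outlying values into those below and those above the fences."""
    lower, upper = tukey_fences(values)
    low = [v for v in values if v < lower]
    high = [v for v in values if v > upper]
    return low, high


# Invented reading times in seconds (positive direction of pathology).
times = [50, 52, 55, 58, 60, 61, 63, 66, 70, 140]
low_out, high_out = outliers(times)
print("fences:", tukey_fences(times))
print("low outliers:", low_out, "high outliers:", high_out)
```

For a variable with a positive direction of pathology, such as reading time, only the upper fence identifies candidates for impairment; for syllables per second the lower fence plays that role.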
Normality. Determining whether variables measuring speed are normally distributed is an important goal since the current normative threshold values on the standardized tests, used for discriminating impaired and normal decoders, are based on this assumption (Sartori et al., 1995, 2007). Table 5 reports the p values of four nonparametric tests for normality: the Shapiro-Wilk (Shapiro & Wilk, 1965), Anderson-Darling (Anderson & Darling, 1952), Lilliefors (Lilliefors, 1967), and Jarque-Bera (Jarque & Bera, 1980, 1981, 1987) tests; p values greater than .01 are presented in bold. With the exception of X 2 in Grades III, IV, and V and Y 2 in Grade III, all variables seem far from the normal distribution. The null hypothesis of Gaussian distribution is rejected with all tests considered (α = .01) for all the following variables: X 1 in all grades, X 2 in Grades I and II, X 4 in all grades, X 5 in Grades III, IV, and V, Y 1 in Grades I, II, III, and V, and Y 2 in Grades I, II, IV, and V. Regarding X 5 in Grades I and II, the null hypothesis of Gaussian distribution is rejected, for α = .01, with the Shapiro-Wilk, Anderson-Darling, and Jarque-Bera tests, whereas it is not rejected with the Lilliefors test. For Y 1 in Grade IV, the null hypothesis is rejected with the Shapiro-Wilk and Jarque-Bera tests, whereas it is not rejected with the Anderson-Darling and Lilliefors tests.
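Of the four tests, the Jarque-Bera test is the simplest to reproduce by hand, because it depends only on the sample skewness and kurtosis. The sketch below is a minimal standard-library implementation that uses the asymptotic chi-square approximation with 2 degrees of freedom for the p value; it is not the software used in the study, and the two samples are synthetic (quantile grids of an exponential and of a normal distribution).

```python
import math
from statistics import NormalDist


def jarque_bera(values):
    """Return the Jarque-Bera statistic and its asymptotic p value."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


n = 200
grid = [(i + 0.5) / n for i in range(n)]
skewed_sample = [-math.log(1.0 - u) for u in grid]           # right-skewed
symmetric_sample = [NormalDist().inv_cdf(u) for u in grid]   # bell-shaped

jb_skewed, p_skewed = jarque_bera(skewed_sample)
jb_symmetric, p_symmetric = jarque_bera(symmetric_sample)
print(f"skewed sample:    JB = {jb_skewed:.2f}, reject at .01: {p_skewed < 0.01}")
print(f"symmetric sample: JB = {jb_symmetric:.2f}, reject at .01: {p_symmetric < 0.01}")
```

The asymptotic p value can be inaccurate in small samples, which is one reason to compare several tests, as done in Table 5.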
Discussion of the normative threshold values on the standardized tests. The currently used thresholds for X 1 , X 2 , X 4 , and X 5 are based on the assumption of normality. The thresholds are used for classifying students as normal readers or impaired readers. They have been specified on the basis of the mean and the variance, assuming a normal distribution (Sartori et al., 1995, 2007). The thresholds have been obtained as µ + 2σ (for X 1 and X 4 ) and as µ - 2σ (for X 2 and X 5 ), where µ indicates the mean and σ the standard deviation, considering that in a Gaussian distribution these values exclude about 2% of the population. The values of µ and σ reported in Sartori et al. (2007) and currently used as normative values in Italy (also for obtaining the z scores) have been estimated only for Grades II, III, IV, and V. According to the t test for the means and the nonparametric test of Levene for the variances, the means and the variances reported in Sartori et al. (2007) are significantly different (α = .05) from the means and the variances obtained in our study (see Table 4). The normative thresholds in our sample lead to percentages of students classified as impaired readers that vary greatly not only across the grades but also across the variables (see Table 6). In Grade II, for example, 30 students are classified as impaired decoders with X 1 but only 2 with X 2 . With the normative threshold, variable X 5 classifies as impaired decoders in each grade a percentage of students far from 2% and also from the expected value (between 3% and 4%). Students with a mixed profile in the word reading speed (students who are classified as impaired with X 1 but are in the normal range with X 2 or vice versa) number 28 in Grade II, 6 in Grade III, and 4 in Grades IV and V. Students classified as impaired with both measures of word reading speed are only 2 in Grade II, zero in Grade III, 3 in Grade IV, and 2 in Grade V.
Students with a mixed profile in the nonword reading speed (students who are classified as impaired with X 4 but are in the normal range with X 5 or vice versa) are numbered 22 in Grade II, 9 in Grade III, 2 in Grade IV, and 3 in Grade V. Students classified as impaired with both measures of nonword reading speed are numbered zero in Grade II, 1 in Grade III, 2 in Grade IV, and zero in Grade V. Because of the nonnormality of the variables, the presence of outliers, and the level of asymmetry, which is different from one grade to another, more accurate thresholds should be defined in terms of the percentiles. The thresholds for Y 1 and Y 2 have been set equal to x 0.05 to discriminate a percentage of people higher than the expected one. The procedure is not intended as a diagnostic test for learning disorders, but it is a screening procedure for detecting students with heavy difficulties in decoding. The causes of these difficulties are to be defined by subsequent more detailed analyses. In our sample, the percentages of students classified as impaired readers with Y 1 and Y 2 are similar in all grades and are not far from 5%. Students with a mixed profile in the text reading speed (students who are classified as impaired with Y 1 but not with Y 2 or vice versa) are only 5 in Grade II, zero in Grade III, 3 in

Accuracy: Univariate Analyses
In Tables 7 and 8 we list the values of some univariate statistics computed for the variables measuring the decoding accuracy (X 3 , X 6 , Y 3 ). Figure 2 shows the empirical distribution of these variables through histograms. Like the variables measuring speed, the variables X 3 , X 6 , and Y 3 have an empirical distribution that is asymmetric and far from the Gaussian distribution. Although the mean and median values of X 3 and X 6 have a decreasing pattern from Grade I to Grade V, the mean and the median values of Y 3 are roughly constant across grades. This different pattern is a result of the fact that the time in the screening procedure is always equal to 1 minute, whereas on the standardized tests it depends on the ability of the student. In the screening, if a student improves his or her performance from one grade to the next, he or she increases the speed of reading without being penalized on reading accuracy. Outliers are all in the "direction of pathology," and this is a desirable property. The normative threshold values for X 3 and X 6 are the 95th percentiles obtained in the study of Sartori et al. (2007). These values are similar to x 0.95 obtained in our sample (and reported in Table 3). The threshold for Y 3 is the 95th percentile as well. The percentages of students classified as impaired readers with the currently used thresholds are reported in Table 9. These percentages (with the exception of the percentage of students classified as impaired with X 6 in Grade IV) are in agreement with the expected value.
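The contrast between a threshold of the form µ + 2σ, which presupposes normality, and a threshold defined as an empirical percentile can be sketched on synthetic data. The sample below is a quantile grid of an exponential distribution, used only as a stand-in for a right-skewed variable with a positive direction of pathology; it is not the study data, and the function names are illustrative.

```python
import math
import statistics
from statistics import NormalDist


def gaussian_threshold(values, k=2.0):
    """Threshold mean + k * sd, as used under the normality assumption."""
    return statistics.fmean(values) + k * statistics.pstdev(values)


def percentile_threshold(values, percent=95):
    """Empirical percentile of the sample (inclusive interpolation)."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[percent - 1]


def share_above(values, threshold):
    """Proportion of observations strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)


n = 1000
skewed = [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]

nominal_share = 1.0 - NormalDist().cdf(2.0)   # about 2.3% if Gaussian
actual_share = share_above(skewed, gaussian_threshold(skewed))
percentile_share = share_above(skewed, percentile_threshold(skewed, 95))

print(f"nominal share beyond mean + 2 sd: {nominal_share:.3f}")
print(f"actual share in the skewed sample: {actual_share:.3f}")
print(f"share beyond the 95th percentile:  {percentile_share:.3f}")
```

In a skewed sample the µ + 2σ rule flags a share of observations that differs from its nominal value, whereas a percentile threshold flags, by construction, the share it was chosen for.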

Bivariate Analyses
Tables 10, 11, 12, 13, and 14 report the matrices of the Pearson correlation coefficients between pairs of variables, in each grade. In all grades, the pairwise correlations between variables measuring speed are all significantly different from zero (α = .01). Even though the transformation from X 1 to X 2 and from X 4 to X 5 is not linear, these pairs of variables are highly correlated. The transformation from Y 1 to Y 2 is not a perfect linear transformation (since the words have a different number of syllables), but the correlation is nevertheless approximately equal to 1 in all grades. The fact that Y 1 and Y 2 are highly correlated with X 1 , X 2 , X 4 , and X 5 provides evidence that all these variables are a measure of the same aspect of the decoding skill. Although X 3 and X 6 are highly correlated, Y 3 is correlated with neither X 3 nor X 6 . This is a result of the fact that the time is fixed in the screening procedure. The number of mistakes has a different pattern from the number of mistakes on a reading test where the time depends on the ability of the respondent. To further investigate whether Y 3 is a measure of the same aspect measured by X 3 and X 6 , we analyzed the association between the classifications of the students' performances. Even though two variables are not correlated, they may both lead to the conclusion that a student is a normal decoder (or to the conclusion that he or she is an impaired decoder). The values of the χ 2 statistic in the contingency tables obtained by considering, for each grade and for each pair of variables, Category 1 (the student has a value equal to or bigger than x 0.95 and is classified as an impaired decoder) and Category 0 (the student has a value smaller than x 0.95 and is classified as a normal decoder) all have p values smaller than .05.
Therefore, for an alpha of 5%, we may reject the null hypothesis of independence between pairs of categories in each grade. To summarize results, Table 15 reports the contingency table obtained by considering students in all grades. These bivariate statistical analyses show that decoding speed and decoding accuracy can be measured with either the 1-minute procedure or the much longer standardized tests. The screening procedure and the tests seem to be equivalent insofar as they carry the same information.
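The test of independence between two binary classifications (Category 1 versus Category 0) can be sketched as follows. This is a minimal standard-library version of the Pearson χ² test for a 2 × 2 table, without continuity correction (the article does not say whether one was applied); the counts are hypothetical and are not those of Table 15.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p value (1 df) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts. Rows: impaired / normal on one accuracy measure;
# columns: impaired / normal on another accuracy measure.
associated = [[40, 30], [30, 1300]]
independent = [[10, 90], [90, 810]]

chi2_assoc, p_assoc = chi_square_2x2(associated)
chi2_indep, p_indep = chi_square_2x2(independent)
print(f"associated table:  chi2 = {chi2_assoc:.1f}, p < .05: {p_assoc < 0.05}")
print(f"independent table: chi2 = {chi2_indep:.1f}, p = {p_indep:.2f}")
```

With small expected counts in the "impaired on both" cell, an exact test may be preferable to the asymptotic approximation used in this sketch.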
The diagnosis of developmental dyslexia and learning disability requires measures of decoding ability encompassing accuracy and speed. However, the use of a measure of speed and a measure of accuracy even when these are derived from a same single test or procedure may lead to diagnostic decisions that can be very different if one relies on one measure or the other. Considering the time in seconds and the number of errors on the list of words (X 1 and X 3 ), 55 children have a score in the pathological range for one measure but are in the normal range for the other one (25 in Grade II, 9 in Grade III, 11 in Grade IV, 10 in Grade V) and only 15 children are classified as impaired with both measures (8 in Grade II, 0 in Grade III, 4 in Grade IV, 3 in Grade V). Considering the syllables per second and the number of errors in the list of words (X 2 and X 3 ), the number of mixed profiles is 39 (13 in Grade II, 3 in Grade III, 13 in Grade IV, 10 in Grade V), whereas the number of children classified as impaired with both measures is 2 (0 in Grades II and III and 1 in Grades IV and V). Using the list of nonwords and considering the time in seconds and the number of errors (X 4 and X 6 ), the children classified as impaired with one measure but not with the other one are numbered 95 (33 in Grade II, 18 in Grade III, 30 in Grade IV, 14 in Grade V), whereas the number of students classified as impaired with both measures is only 3 (3 in Grade II and 0 in all the other grades). Considering the syllables per second and the number of errors (X 5 and X 6 ), 70 students have a score in the pathological range for one measure but are in the normal range for the other one (17 in Grade II, 9 in Grade III, 30 in Grade IV, 14 in Grade V) and none of the students are impaired with both measures. 
Using the screening procedure and considering the number of words and the number of errors (Y 1 and Y 3 ), the number of mixed profiles is 76 (24 in Grade II, 13 in Grade III, 19 in Grade IV, 20 in Grade V) and the number of children classified as impaired with both measures is 5 (2 in Grade II, 1 in Grades III, IV, and V). Using the screening procedure and considering the number of syllables and the number of errors (Y 2 and Y 3 ), the number of mixed profiles is 86 (29 in Grade II, 13 in Grade III, 20 in Grade IV, 24 in Grade V) and the number of children classified as impaired with both measures is 5 (2 in Grade II, 1 in Grades III, IV, and V). These serious discrepancies emerge since speed and accuracy are two different aspects of decoding skill, as is illustrated in the next section. Rather than choosing one measure over the other, the solution we suggest is to use a composite indicator combining a measure of speed and a measure of accuracy.

Multivariate Analyses
An exploratory factor analysis was performed to investigate the multivariate relationships among variables used on the tests and in the screening procedure. The analysis, performed on the correlation matrix (computed from the values of all variables in all grades), shows the presence of two main latent orthogonal factors, both with the principal components (PC) method and with the common factors (CF) method. The first factor is highly correlated with variables measuring speed and, to a lesser degree, with X 3 and X 6 . With the PC method, the eigenvalue of this factor is equal to 6 and the percentage of explained variance is 66.7%. With the CF method, the eigenvalue is 5.83 and the percentage of variance is 64.8%. The second factor is highly correlated with Y 3 and, to a lesser degree, with the other variables measuring accuracy (X 3 and X 6 ). With the PC method, the eigenvalue of this factor is 1.28 and the percentage of explained variance is 14.3%. With the CF method, the eigenvalue is 0.86 and the percentage of explained variance is 9.6%. Figure 3 and Table 16 summarize the results obtained with the PC method. Rotating the factors does not improve the percentage of variance explained by the first two factors. These results indicate that speed and accuracy are two different components of decoding skill. We estimate the degree to which the set of variables X 1 , X 2 , X 4 , X 5 , Y 1 , Y 2 measures a single unidimensional latent construct (the decoding speed) and the set of variables X 3 , X 6 , Y 3 measures another unidimensional latent construct (the decoding accuracy). We estimate the internal consistency of each set of variables by means of the coefficient ω (McDonald, 1999; Zinbarg, Revelle, Yovel, & Li, 2005; Zinbarg, Yovel, Revelle, & McDonald, 2006), considering the correlation matrix. For the variables regarding speed, ω = .86. For the variables regarding accuracy, ω = .64.
Since these variables all have positive pairwise correlations, we may also calculate the α coefficient of Cronbach (1951) and the ρ* coefficient (Brown, 1910; Spearman, 1910). We obtain α = .85 and ρ* = .70. Regarding the decoding speed, if we select variables having positive pairwise correlations (viz., X 2 , X 5 , Y 1 , Y 2 ), ω = .94 and both the α coefficient and the ρ* coefficient are .98. All indexes show a high intercorrelation among variables belonging to each set. These findings show that the number of words and the number of syllables per second read in the 1-minute procedure are measures of the same feature measured by the variables regarding decoding speed on the standardized (much longer) tests. In addition, the number of errors in the screening procedure is a measure of the same feature measured by the number of errors on the lists of words and nonwords.
To evaluate the reliability of the word reading test, the nonword reading test, and the new screening procedure, we may consider the percentages of variance explained by the first two factors of the correlation matrices, since all these reading tasks measure two latent constructs of the decoding skill. In the word reading test, the first two factors (extracted with both the PC and the CF method) of the correlation matrix of variables X 1 , X 2 , and X 3 explain 92.7% of the variance. In the nonword reading test, the first two factors (extracted with both the PC and the CF method) of the correlation matrix of variables X 4 , X 5 , and X 6 explain 93.2% of the variance. In the screening procedure, the first two factors (extracted with both the PC and the CF method) of the correlation matrix of variables Y 1 , Y 2 , and Y 3 explain 99.97% of the variance. These percentages indicate a very high reliability of all three reading tasks.
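The percentages of explained variance reported in this section follow directly from the eigenvalues: when a factor analysis is performed on a correlation matrix, the total variance equals the number of variables, so the share explained by a factor is its eigenvalue divided by that number. The short sketch below checks the figures quoted for the analysis of all nine variables (X 1 to X 6 and Y 1 to Y 3 ); the function name is illustrative.

```python
def explained_share(eigenvalue: float, n_variables: int) -> float:
    """Share of total variance explained by a factor of a correlation matrix."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return eigenvalue / n_variables


N_VARIABLES = 9  # X1..X6 and Y1..Y3

# Eigenvalues quoted in the text (rounded to two decimals there).
reported = {
    "PC, first factor": 6.0,
    "CF, first factor": 5.83,
    "PC, second factor": 1.28,
    "CF, second factor": 0.86,
}

for label, eigenvalue in reported.items():
    share = explained_share(eigenvalue, N_VARIABLES)
    print(f"{label}: {100 * share:.1f}%")
```

The results agree with the reported percentages up to the rounding of the eigenvalues (for example, 1.28 / 9 gives 14.2%, against the reported 14.3%).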

The New Composite Indicator
In each grade, the scatterplot matrix of Y 1 , Y 2 , and Y 3 clearly shows noncorrelation and also independence between the variable measuring accuracy and any of the variables measuring speed. Figures 4, 5, 6, 7, and 8 show that for the pairs of variables Y 1 , Y 3 and Y 2 , Y 3 , the sample points are distributed around the center of mass in a spherical manner. The fact that speed and accuracy are two different factors of decoding ability was also highlighted in the previous section, by means of the factor analysis performed over all the variables. An exploratory factor analysis on the three variables involved in the screening procedure confirms this idea: The first factor is highly correlated with Y 1 and Y 2 and explains 67% of the total variance; the second factor is uniquely correlated with Y 3 and explains 33% of the total variance (both with the PC and the CF method). Drawing from these results, we may choose Y 1 for speed (but we may choose Y 2 as well), and we propose the following composite indicator:
$$MD = Z_1^2 + Z_3^2$$
Here Z 1 is the z score of Y 1 and Z 3 is the z score of Y 3 . For each observation, MD is the (squared) Mahalanobis distance from the center of mass of the sample points in the two-dimensional Euclidean space spanned by Y 1 and Y 3 . Since Y 1 and Y 3 are noncorrelated, MD coincides with the (squared) Euclidean distance calculated on the z scores. Observations with high values of MD can be considered atypical, without making any assumptions on the bivariate distribution of Y 1 and Y 3 . However, if Y 1 and Y 3 were Gaussian, observations with the same value of MD would have the same density in the bivariate normal distribution and, asymptotically, $MD \sim \chi^2_{g}$ with $g = 2$ degrees of freedom. Therefore, the value of 6 would discriminate 5% of the population, since $P(\chi^2_{2} \geq 6) \approx 0.05$. Our aim is to determine a cutoff value discriminating nearly 5% of the population.
Since learning disability can affect between 3% and 4% of the Italian scholastic population, a screening procedure should identify a somewhat higher percentage of students as impaired; subsequent analysis will then determine the causes of the impairment. To identify atypical observations, we must consider that the variables are not Gaussian and that a high value of MD may identify not only impaired readers but also readers with very good performance. To detect atypical values in the direction of pathology, we have to observe the signs of Z1 and Z3. If Z1 is positive and Z3 is negative, the reader is atypical because he or she performs very well. In all other cases, the reader is atypical because he or she is impaired. In our sample, a threshold value of 6 classifies as impaired readers 2.7% of students in Grade I, 4.4% in Grade II, 4.5% in Grade III, 6.1% in Grade IV, and 6.5% in Grade V. We are currently conducting a new study to investigate the validity of this threshold for the screening procedure. For this study we are administering the screening procedure to a large number of children, computing the index MD, estimating the empirical distribution of MD, and estimating the 95th percentile of this distribution. The composite indicator MD may be used in any reading test where the variables measuring speed have high pairwise correlations and the variables measuring accuracy have high pairwise correlations, but there is only a slight correlation between a variable measuring speed and a variable measuring accuracy. Thus, the indicator can also be used on tests requiring students to read a selected list of words or nonwords. In our sample the indicator gives stable results on these tests as well. Using X2 and X3, with a threshold equal to 6, it classifies as impaired readers 3.9% of children in Grade I, 3.1% in Grade II, 4.5% in Grade III, 4.7% in Grade IV, and 5.1% in Grade V.
Using X5 and X6, the percentages of students classified as impaired readers are as follows: 4.2% in Grade I, 4.4% in Grade II, 4.0% in Grade III, 4.0% in Grade IV, and 5.5% in Grade V. All these percentages are similar across grades and are not far from the expected value (5%). Moreover, the percentages remain similar when the test changes. The use of this indicator reduces the discrepancies between results obtained with the list of words and the list of nonwords, which is a desirable property in clinical practice.
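The classification rule described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: MD is taken to be the sum of the squared z scores, the speed variable is assumed to be one where higher values are better (so a positive z score is favorable), the error variable is one where higher values are worse, and the function name `classify_with_md` and the synthetic scores are hypothetical.

```python
import numpy as np

def classify_with_md(speed, errors, threshold=6.0):
    """Return (md, impaired) for paired speed and error scores.

    Assumes higher `speed` is better, higher `errors` is worse,
    and that the two variables are uncorrelated in the sample.
    """
    speed = np.asarray(speed, dtype=float)
    errors = np.asarray(errors, dtype=float)
    z_speed = (speed - speed.mean()) / speed.std(ddof=1)
    z_errors = (errors - errors.mean()) / errors.std(ddof=1)
    md = z_speed**2 + z_errors**2  # assumed form of the composite indicator
    atypical = md >= threshold
    # Atypical in the favorable direction: fast (z > 0) and accurate (z < 0).
    very_good = (z_speed > 0) & (z_errors < 0)
    # In all other atypical cases the reader is classified as impaired.
    impaired = atypical & ~very_good
    return md, impaired

# Synthetic illustration only (hypothetical scores, not the study's data).
rng = np.random.default_rng(1)
speed = rng.normal(3.0, 0.8, size=300)   # e.g. syllables per second
errors = rng.poisson(2.0, size=300)      # e.g. number of errors
md, impaired = classify_with_md(speed, errors)
print(f"{impaired.mean():.1%} of the synthetic sample is flagged as impaired")
```

In practice the threshold would be replaced by the 95th percentile of the empirical distribution of MD once that estimate is available, since the value of 6 rests on the Gaussian reference case.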

Conclusions
In this study we have proposed some statistical analyses of the variables measuring speed and accuracy in two standardized tests of reading and in a new screening procedure used in Italy to detect impaired decoders in elementary grades. We believe that our findings have important implications for both research and clinical assessment. We also think that most of the findings may be extended to reading tests used in different countries. For clinical assessment, the most important findings can be summarized as follows:
• Variables measuring decoding speed (both in terms of time or number of words and in terms of syllables read in a second) are not Gaussian, are asymmetric, and present many outliers. The skew is always positive in variables measuring the time and syllables per second on tests where the time depends on the ability of the reader. Variables measuring the syllables per second are bounded in the pathology direction and assume a small range of values in this direction. On the contrary, variables measuring time in seconds have no limits in the deficit direction and assume a wide range of values in the pathology domain. For these reasons, the time in seconds has more discriminative power than the number of syllables and should be preferred in clinical practice. For variables measuring the number of words or syllables read in a second in a procedure where the time is fixed, the skew is positive in the first school years and becomes negative in later school years. The negative skewness and the presence of outliers indicate that these variables have more discriminative power in the later years of primary school, when students read more fluently.
• Variables measuring the number of errors always have a positive skew. This is desirable since the direction of pathology in these variables is positive (impaired readers are students with high values in these variables). Although the average number of errors on a test where the time depends on the ability of the student decreases from Grade I to Grade V, the number of errors on a procedure where the time is fixed has a roughly constant pattern. This means that, for a screening with fixed time, a student who improves from one grade to the next increases decoding speed without diminishing decoding accuracy.
• Because of the nonnormality of the variables measuring speed and accuracy, the threshold values must be estimated on the basis of the percentiles of the empirical distribution of these variables, rather than by using the percentiles of the normal distribution. The thresholds currently used in Italy (based on the assumption of normality) give contradictory results. For example, in Grade II, 30 students are classified as impaired on time (in seconds) in reading the list of words, whereas only 2 are classified as impaired on the number of syllables per second in reading the same list. Using the same test (the list of words or the list of nonwords), many students are classified differently when using time in seconds or the number of syllables per second. This is because the variables are not Gaussian and are asymmetric, so their z scores are not Gaussian either. The 95th percentile of the z scores of these variables does not coincide with the 95th percentile of the standard normal distribution (and the same holds for the 5th percentile). This finding raises serious doubts about the validity of the normative thresholds currently used in Italy. This is an important issue for clinicians, given that a diagnosis of decoding disability can provide access to treatment and to other facilities that are reserved for individuals with dyslexia at school.
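The difference between a normal-theory threshold and an empirical percentile can be illustrated with a minimal sketch on a skewed synthetic variable. The lognormal sample, the function names, and the one-sided multiplier z = 1.645 are assumptions made for illustration; they are not the normative thresholds or the data discussed above.

```python
import numpy as np

def empirical_cutoff(values, percentile=95.0):
    """Cutoff taken directly from the empirical distribution."""
    return float(np.percentile(values, percentile))

def normal_cutoff(values, z=1.645):
    """Cutoff that assumes normality: mean + z * standard deviation."""
    values = np.asarray(values, dtype=float)
    return float(values.mean() + z * values.std(ddof=1))

# Synthetic, positively skewed stand-in for reading time in seconds
# (hypothetical values, not the study's data).
rng = np.random.default_rng(2)
reading_time = rng.lognormal(mean=4.0, sigma=0.5, size=2000)

emp = empirical_cutoff(reading_time)
nor = normal_cutoff(reading_time)
share_flagged_empirical = float((reading_time > emp).mean())
share_flagged_normal = float((reading_time > nor).mean())

print(f"empirical 95th percentile: {emp:.1f} s, flags {share_flagged_empirical:.1%}")
print(f"normal-theory cutoff:      {nor:.1f} s, flags {share_flagged_normal:.1%}")
```

The empirical cutoff flags 5% of the sample by construction, whereas the share flagged by the normal-theory cutoff depends on the shape of the distribution, which is why two skewed measures of the same performance can classify different numbers of students as impaired.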
The following findings are relevant for researchers:
• Speed and accuracy are two orthogonal latent factors of decoding skill. The consequence is that scores obtained from the same reading performance will necessarily be incongruent.
• Rather than using a measure of speed and a measure of accuracy, we may consider a composite indicator. A person may be impaired for speed but in the normal range for accuracy, or vice versa. This can have undesirable consequences: Studies using different scores may support different theories of decoding ability, and decoding ability classifications may change according to the measure used. In this work we have proposed a new composite indicator, MD, that considers a variable measuring speed and a variable measuring accuracy. For each observation, the value of MD is the squared Mahalanobis distance from the center of mass of the sample points in the two-dimensional Euclidean space spanned by the two variables. Observations with high values on this indicator can be considered atypical, without making any assumptions about the bivariate distribution of the variables. If these variables were found to be Gaussian, observations with the same value of MD would have the same density in the bivariate normal distribution and, asymptotically, MD would follow a chi-square distribution with g = 2 degrees of freedom. The use of this indicator requires only that the two measures be uncorrelated. Therefore, MD can be used in any test and in any screening procedure where speed and accuracy are found to be orthogonal factors.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.