Inefficiency in Childcare Production: Evidence from Italian Microdata

The paper provides an empirical analysis of production technology in the childcare sector and offers a comparative analysis of inefficiency between public and private day-care centres. Estimates of multi-output production technology and technical inefficiency are obtained in a stochastic frontier model by using cross-section micro-data from a region of northern Italy for the period 2007/8. We find that production exhibits increasing returns to scale and that separability between inputs and outputs is rejected. The average estimate of technical inefficiency is about 10% and public centres are more inefficient than private centres by 4.1 percentage points.


Introduction
The availability of childcare services may have relevant effects on several socio-economic problems, as underlined by a large and growing body of theoretical and empirical literature. 1 Childcare may improve female labour market participation and gender equality by reconciling work and family life. It may counteract the negative trend of fertility rates by making a child less costly in terms of income and career opportunities and, from a socio-pedagogical perspective, it may contribute to child development and socio-economic integration. In recent decades the importance of childcare has been increasingly recognized and most developed countries, including Italy, have undertaken policies aimed at increasing the availability of the service. 2 In Italy, childcare is primarily managed by public local authorities and specifically by municipalities. The fast growth in the provision of childcare experienced since the end of the 1990s has been mostly achieved by opening the childcare sector to private subjects. This pattern of development has been accompanied by increasing regulation by central and regional authorities aimed at ensuring similar quality standards between public and private providers. Following a tightening of budgetary constraints on public funding in those years, municipalities have either turned to private providers for the supply of the whole service or outsourced the provision of auxiliary services to private subjects. As a result, municipalities have been able to extend the availability of childcare at lower costs, because private providers take advantage of more favourable employment contracts than those in the public sector.
The policy choice of expanding childcare services by resorting to private subjects poses an interesting question, which is whether the lower cost of the service is also associated with efficiency gains in production or whether it is just the result of lower input prices and notably cheaper labour services. 3 In other words, while having lower costs, are private providers also more efficient than public ones? The aim of this paper is to provide empirical estimates of production technology and efficiency in the childcare sector and answer the above question by offering a comparative analysis of inefficiency between public and private providers.
We deal with childcare services for young children, defined as infants, from 3 to 12 months of age, and toddlers, from 12 to 36 months. The analysis covers the years 2007/8 and refers to day-care centres in Emilia Romagna, one of the most developed regions of northern Italy, which is also known for the high share of children served and the quality of childcare. 4 Emilia Romagna ranks first among Italian regions for territorial coverage of the service, which is available in more than 80% of the municipalities, and it is the only Italian region to meet the Barcelona 2002 target of providing childcare for more than 33% of children under 3 years of age. Quality standards are set by regional legislation and are enforced by municipalities on private as well as public facilities. 5 The main quality requirements are concerned with health and safety conditions, per child space for green and covered area, the proportion of children to staff and teachers' professional training and education. The regional childcare system consists of a number of formal arrangements including day-care centres, micro-centres and age-integrated institutions. For the age group 0-2 years, day-care centres are the most widespread form of arrangement and are the focus of our analysis.
1 See, among others, the contributions by Blau and Currie (2006), Del Boca et al. (2009), Elango et al. (2016).
2 For an assessment of childcare facilities development programmes in European countries see European Commission (2018), while for a recent evaluation of Italian programmes see Giorgetti and Picchio (2018).
3 As shown in Mari (2010) and Fortunati et al. (2012), labour costs in the private sector are, on average, about 20% lower than those in the public sector.
4 Reggio Children, a centre recognized worldwide for promoting innovative pedagogical projects for early childhood, is active in Emilia Romagna; see Hewett (2001).
Day-care slots are mainly administered by local public authorities, meaning that the access to the service is managed by municipalities. The provision of the service can be either public, when the center is directly operated by the municipality or private, when the center is operated by private subjects such as cooperatives or other social organizations. 6 Our analysis will be confined to childcare administered by municipalities and the case of 'fully private childcare', where both the access and the provision of the service are directly managed by a private subject, will not be considered. 7 The aim of our analysis is to provide estimates of production technology and input-oriented technical inefficiency, which is inefficiency related to potential savings of inputs given outputs. We will also carry out a comparative analysis of public and private day-care centres by identifying the major determinants of inefficiency and by measuring their marginal effects.
The empirical analysis refers to a sample of 482 day-care centres in Emilia Romagna. The regional survey was carried out between 2007 and 2008 and contains very detailed information about the characteristics and the organization of the service at the level of the single center. We have modelled a multi-output technology of production by using an input-distance function approach with a translog functional form specification. Stochastic frontier analysis with a half-normal one-sided error representing technical inefficiency has been applied to specify the empirical model. Three nested models have been estimated by using the maximum likelihood method. The first two models are used for comparative purposes, while the third provides the final results. In the final model, heteroscedasticity of all error terms is parametrized by a number of exogenous variables which are potential determinants of inefficiency.
Empirical studies investigating the characteristics of production technology in day-care service are few in number and provide mixed evidence about efficiency comparisons between public or non for profit centres and private or for profit centres. Most of these studies are based on costs data and recover the properties of technology indirectly from the dual relationship between production and costs.
Using a sample of 182 centres in 35 states of the US conducted in 1989, Powell and Cosgrove (1992) estimate a cost function with a single output by controlling for quality differential as well as other centre characteristics. The empirical results show that technology exhibits increasing returns to scale and that private for profit centres are more cost efficient (by 9.1% points) than other centres, once quality has been controlled for. Mukerjee and Witte (1993) compare the costs of Non For Profit (NFP) and For Profit (FP) centres in Massachusetts using cross-section data collected in 1987 and 1988. They conclude that there are not significant differences as for the method of operation of the two types of centres and that differences in costs are due to differences in wages paid by the two types of employers. A similar conclusion is reached by Preston (1993), who uses a large data set on costs from a National survey in US in 1976-77. Differences in costs between FP and NFP centres are associated with differences in the quality of the service, however there is no evidence that the two types of centres have different level of efficiency. Mocan (1997) studies a multi-product cost function and estimates several properties of the production technology. Mocan's empirical analysis is based on data from a sample of 400 observations collected in different states of the US in 1993. The main results are that there are no significant differences either in the quality of the service or in the level of efficiency between FP and NFP centres. Finally, a different approach is used in Bjurek et al. (1992) who study productive efficiency in public day-care centres in Sweden by using the non parametric method of the deterministic production frontier, or data envelopment analysis (DEA). 
They estimate a multi-output production frontier with a data set of 194 observations collected in 1988 and 1989 and obtain measures of (output-oriented) technical efficiency ranging from 0.89 to 0.91, meaning that, holding input fixed, actual day-care services in public centres could have been increased by 9-11%.
In Italy, empirical studies have mainly focused on the analysis of costs of the service rather than on production technology. After the 'pioneering' study by Fazioli and Filippini (1997), who estimated a cost function using a sample of data at the level of municipalities, empirical research has addressed the analysis of standard costs, which is the basis of the state funding mechanism introduced in the legislation after 2010 (Antonelli and Grembi 2011; Sose 2016). 8 There are no empirical studies on technical efficiency in childcare production, except the work by Destefanis and Maietta (2001, 2003), who estimate a production function in the different but related sector of old people care and nursing and address the interesting question of the relationship between technical efficiency and proprietary forms. Our analysis departs from the existing international and Italian literature in many respects. First, we estimate a multi-output technology directly by using data on production rather than costs. Second, our study seems to be the first piece of research focusing on the infant and toddler day-care sector which uses one of the largest and most systematic sets of micro data covering a population of centres with a high degree of homogeneity in technology and regulation. Third, we estimate a stochastic rather than a deterministic production frontier. In deterministic frontier analysis little or no account is usually taken of measurement errors or other sources of statistical noise, so that all the deviations from the frontier are considered as the result of inefficiency. By contrast, stochastic frontier models separate statistical noise from inefficiency and so avoid this risk of overestimating inefficiency. Fourth, heteroscedasticity of the error terms has been explicitly considered and incorporated in our research, which increases the reliability of our findings. Indeed, the computation of inefficiency is based on residuals of the frontier estimates, which are particularly sensitive to specification errors. If the presence of heteroscedasticity is neglected, estimators are not only no longer efficient but also biased, as shown by Kumbhakar and Lovell (2000), and estimates of efficiency measures are seriously affected.
8 Other empirical studies have focused on related aspects such as the determinants of demand for childcare (Zollino 2008; Del Boca et al. 2009), the effects of availability of childcare on female labour market participation (Del Boca and Vuri 2007) and on child development (Brilli et al. 2016). More recent empirical work has also investigated the effects of different eligibility criteria and participation fees (Bucciol et al. 2016; Del Boca et al. 2016).
Our work also has a few similarities with other contributions to the literature on the measurement of inefficiency in production. In fact, the multi-output representation of technology is similar to the model used by Coelli and Perelman (2000) in their study of the railways sector, and we apply a single-step procedure for investigating the determinants of inefficiency as in Caudill et al. (1995) and Hadri (1999).
The rest of the paper is organized as follows. The next section provides information about production technology of day-care centres and introduces the measure of technical inefficiency. In Sect. 3 the stochastic frontier model is specified. Section 4 illustrates the main characteristics of the sample, the definition of the variables and provides some descriptive statistics. Section 5 contains the empirical results and finally Sect. 6 offers a summary and some conclusions. Further details on the specification of the empirical model are in Appendix A, while Appendix B contains further descriptive statistics and details of the econometric results.

Day-Care Centre Technology and Inefficiency
Day-care centres have a production technology with multiple inputs and outputs. The most important types of inputs are labour services provided by teaching staff and service personnel, i.e. workers involved in the provision of auxiliary services such as food, janitorial and laundry services. Capital input mainly consists of the services of physical buildings and green areas in addition to variable capital such as energy and heating. Two separate outputs can be jointly produced which are childcare services for infants and for toddlers. The production of these outputs involves the use of different combinations of labour and different arrangements of auxiliary services. Moreover, the amount of inputs used in production is not completely allocable between the two outputs. Technology is constrained by a regulatory framework imposed by regional legislation which sets out 'structural' quality requirements concerning, among other things, health and safety conditions of facilities, space available for green and covered area per child, maximum proportions of children to teachers, professional training and education of teaching staff. Within this regulatory framework several organization modes and different input-output combinations are possible, which are the prerogative of choice of day-care centre managers, possibly in conjunction with municipal officers. 9 The most salient choices for the organization of childcare in daycare centres are briefly discussed below. 10 In the first place centre managers have to decide the composition by age of classrooms. The day-care service can be organized by age group or by mixed-age group classrooms, where children of different ages, usually toddlers, are cared for together. The legislation places requirements on the number of children per teacher differentiated by age category and, specifically, it requires that there are no more than 5 infants per teacher and 7 toddlers (or 10 older toddlers) per teacher. 
Mixed-age group classrooms must comply with the more stringent requirement.
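The staffing rule described above can be illustrated with a short sketch. The function name `min_teachers`, the age labels and the reading of 'the more stringent requirement' as 'the lowest child-per-teacher limit among the ages present in the classroom' are assumptions made here for illustration; they are not taken from the wording of the regional legislation.

```python
import math

# Maximum children per teacher by age category, as reported in the text.
MAX_CHILDREN_PER_TEACHER = {"infant": 5, "toddler": 7, "older_toddler": 10}


def min_teachers(children_by_age: dict) -> int:
    """Minimum number of teachers for one classroom.

    Assumption: a mixed-age classroom follows the lowest
    child-per-teacher limit among the age categories present.
    """
    present = {age: n for age, n in children_by_age.items() if n > 0}
    if not present:
        return 0
    limit = min(MAX_CHILDREN_PER_TEACHER[age] for age in present)
    return math.ceil(sum(present.values()) / limit)


# Hypothetical enrolment: 6 toddlers and 10 older toddlers.
separate_classes = min_teachers({"toddler": 6}) + min_teachers({"older_toddler": 10})
mixed_class = min_teachers({"toddler": 6, "older_toddler": 10})
```

Under this reading, pooling the two groups in one mixed-age classroom can require more teachers than two age-homogeneous classrooms, which is one way the organizational choice feeds into input use.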
A second organizational choice is whether to have classrooms arranged on a fulltime basis and/or classrooms arranged on a part-time basis. The daily opening hours of full-time service are at least 8 h, while part-time hours must be no less than 6. Centres with no part-time classes may as well provide part-time service accommodating children in full-time classes, although this involves a suboptimal use of resources.
Another choice concerns the extension of opening hours. In order to meet the needs of working parents in the first or the final hours of the day, centres may either decide to extend the standard hours of full-time or part-time service, or to provide extra opening hours. Usually, children attendance in the extra hours is limited and the service can be arranged with a smaller number of classes.
Furthermore, centre managers have discretionary control over the organization of auxiliary activities such as food preparation, laundry, janitorial activities. These activities can either be carried out in-house or can be externalized. Usually, outsourcing food service means buying meals outside, while outsourcing of other auxiliary services is carried out by hiring external workers who supply their labour services internally.
Finally, centre managers are charged with the management of personnel. Within the limits set by employment contracts, which differ across public and private centres, managers have to decide on the composition of staff and how to combine personnel working hours to cope with the activities of the centre. Typically, the use of personnel in private centres is more flexible and this aspect may affect the relative measurements of inefficiency between public and private facilities. 11 One of the assumptions underlying our analysis of technology is that, as a first approximation, there are no significant differences in 'structural' quality between public and private centres. In fact, in order to provide the service, private centres must be awarded an authorization by the municipality which ensures compliance with quality requirements set by the regional legislation. Although this assumption seems to be consistent with our data (see Sect. 4 below), we are aware that structural characteristics may not be sufficient to capture the overall quality of the service. In fact, there are also other aspects, such as the variety and quality of group activities and the interaction between child and teacher, that may affect the quality of childcare as pointed out, among others, by Blau (1999). However, due to the lack of information in our data set, we are not able to fully take into account possible differences in overall quality between public and private centres. 12 In order to introduce a measure of inefficiency and to specify the empirical model we need a slightly more formal description of technology. A production technology is a set of feasible plans of production, i.e. input-output combinations, (x, y), where the positive vector of input x can be transformed into the positive vector of output, y. A production plan is efficient if there are no other input-output combinations with higher amounts of output and/or lower amounts of input. 
The set of all the efficient input-output combinations is the frontier of production. Feasible production plans which do not belong to the frontier are inefficient, meaning that either a higher level of output can be obtained with the same input or a lower quantity of inputs can be used to produce the same output or both.
The analysis of efficiency conducted in the present investigation is input-oriented, i.e. it refers to quantities of input needed to produce given vectors of output, because the focus is on the potential saving of physical inputs to produce a given combination of physical outputs. This choice is also motivated by the fact that the quantities of output are not under complete control of centre managers. Indeed, the number of enrolled infants and toddlers depends, to a large extent, on the decision of municipalities and, similarly, the number of children served on a part-time or full-time basis is the result of parents' choices. On the other hand, centre managers have relatively greater control over input choices. For example, they have some discretion about the number, the working hours and the qualification of workers to employ in teaching activities and in auxiliary services. They decide whether or not to organize mixed-age classrooms and whether or not some classrooms are arranged on a part-time basis, which are choices that in turn influence the staffing needs. They also decide whether and to what extent externalize auxiliary services. Thus, it seems sensible to assume that the objective of the centre manager, who is usually under a tight budget constraint, is to economize on the use of inputs, so that the analysis of efficiency is best conducted in terms of input-oriented measures.
In order to illustrate the measure of inefficiency adopted here, let us consider Fig. 1, which shows the input requirement set $V(y)$ associated with output $y$ in the case with only two inputs. 13 Given the inefficient input combination $x$, the efficient counterpart, denoted by $\tilde{x}$, is obtained as a proportional contraction of all inputs. The input vector $\tilde{x}$, which lies on the isoquant and is a scalar multiple of $x$, is taken as the basis for the measurement of technical inefficiency associated with $x$. Specifically, the scalar coefficient defining $\tilde{x}$ indicates the proportion of inputs needed to efficiently produce the given output vector, holding the relative amounts of inputs constant. Graphically, the scalar coefficient is given by the ratio of the distance $0\tilde{x}$ to the distance $0x$ in Fig. 1. Writing $\tilde{x}$ as $\tilde{x} = e^{-u} x$, the scalar coefficient $e^{-u}$ can be taken as a measure of efficiency and the scalar $u$ as a measure of inefficiency. Indeed, if $x$ lies on the isoquant, the production plan $(x, y)$ is efficient and $\tilde{x} = x$, so that $e^{-u} = 1$ and $u = 0$, i.e. efficiency is equal to 1 and inefficiency is equal to zero. On the other hand, if the plan $(x, y)$ is inefficient, the input vector is in the interior of $V(y)$ and $e^{-u} < 1$ implies $u > 0$, i.e. efficiency is less than 1 and inefficiency is strictly positive. Accordingly, we will define 14 Technical Efficiency associated with the feasible production plan $(x, y)$ by
$$TE = e^{-u}, \qquad (1)$$
where $e^{-u}$ is the coefficient of the input vector contraction lying on the isoquant associated with $y$. TE ranges from 0 to 1 and its value indicates the proportion of the observed inputs needed to efficiently produce the given level of output.
On the other hand, the measure of technical inefficiency is given by the proportion of inputs used in excess of the efficient vector $\tilde{x}$, which is $(1 - e^{-u})$. Technical Inefficiency associated with the feasible production plan $(x, y)$ is defined by
$$TI = 1 - e^{-u} \approx u, \qquad (2)$$
where we used the first-order approximation at $u = 0$, i.e. $e^{-u} \approx 1 - u$. The scalar $u$ is non-negative and indicates the rate at which all inputs can be proportionally deflated holding output constant.
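As a small numerical illustration of the two measures, the sketch below computes efficiency as e^(-u) and inefficiency as 1 - e^(-u) and compares the latter with its first-order approximation u. The function names and the value u = 0.10 are hypothetical and chosen only for the example.

```python
import math


def technical_efficiency(u: float) -> float:
    """Share of observed inputs needed on the frontier, exp(-u)."""
    if u < 0:
        raise ValueError("u must be non-negative")
    return math.exp(-u)


def technical_inefficiency(u: float) -> float:
    """Share of observed inputs used in excess, 1 - exp(-u)."""
    return 1.0 - technical_efficiency(u)


u = 0.10  # hypothetical inefficiency level
te = technical_efficiency(u)
ti = technical_inefficiency(u)
approximation_gap = u - ti  # error of the first-order approximation TI ~ u
```

For small values of u the gap is of second order, which is why u itself can be read approximately as a percentage of inputs that could be saved.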
A convenient way to model technical inefficiency in multiple output production technologies is by means of an input-distance function, a notion originally put forward by Shephard (1970). The (input) distance function is defined at all production plans and is given by
$$D(x, y) = \max\{\lambda > 0 : x/\lambda \in V(y)\}. \qquad (3)$$
The distance $D(x, y)$ is the reciprocal of the proportional contraction or expansion of inputs $x$ needed to reach the isoquant associated with $y$. If $x$ lies on the isoquant, no contraction of inputs is needed so that $D(x, y) = 1$. When $x$ is an interior point of $V(y)$, as depicted in Fig. 1, a contraction of inputs is required to reach the boundary and $D(x, y) > 1$. If $(x, y)$ is not a feasible plan, then greater quantities of inputs are needed to produce $y$ and an expansion of the input vector is required to reach the boundary of $V(y)$, thus $D(x, y) < 1$. The production technology can be specified in terms of a distance function by the inequality $D(x, y) \geq 1$; the set of solutions to $D(x, y) = 1$ is the production frontier. The function $D(x, y)$ is linearly homogeneous in inputs, i.e. $D(tx, y) = t D(x, y)$ for $t > 0$. Other properties of the distance function follow from standard assumptions on technology. In particular, $D(x, y)$ is increasing and concave in $x$ and decreasing and quasi-convex in $y$. 15 For our purposes, the most useful aspect of the distance function is its relationship with technical inefficiency; indeed, as can be seen from (3), $D(x, y)$ is the reciprocal of the technical efficiency coefficient, i.e. $e^{-u} = 1/D(x, y)$. Hence, technical inefficiency is equal to the log of the distance function, i.e.
$$u = \log D(x, y). \qquad (4)$$
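The properties listed above can be checked on a toy example. The sketch below assumes a hypothetical single-output, two-input technology whose frontier output is x1^0.4 * x2^0.6; this functional form, the parameter `ALPHA` and the numbers are illustrative assumptions and are unrelated to the technology estimated in the paper. For such a technology the maximum in the definition of the distance function has a closed form.

```python
import math

ALPHA = 0.4  # hypothetical input elasticity, chosen only for illustration


def frontier_output(x1: float, x2: float) -> float:
    """Largest output producible from (x1, x2) under the toy technology."""
    return x1 ** ALPHA * x2 ** (1.0 - ALPHA)


def input_distance(x1: float, x2: float, y: float) -> float:
    """Input distance D(x, y) = max{lam > 0 : x / lam can still produce y}.

    The toy technology is homogeneous of degree one in inputs, so the
    maximum equals frontier_output(x) / y.
    """
    return frontier_output(x1, x2) / y


x1, x2, y = 8.0, 12.0, 6.0          # hypothetical inefficient plan
d = input_distance(x1, x2, y)        # greater than 1: inputs can be contracted
u = math.log(d)                      # technical inefficiency
x_efficient = (x1 / d, x2 / d)       # proportional contraction onto the isoquant
```

Scaling both inputs by a factor t scales the distance by the same factor, and the contracted vector `x_efficient` produces exactly y, in line with the definitions in the text.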

Empirical Model Specification
We assume that the production technology for infant-toddler centres has two outputs and three inputs. By homogeneity of the distance function, the input variables can be normalized by input $x_3$, so that we have
$$D(x, y) = x_3 \, D(x^*, y),$$
where $x^* = (x_1/x_3, x_2/x_3, 1)$ is the normalized input vector. Taking the logs, replacing $\log D(x, y)$ from (4) and rearranging yield
$$-\log x_3 = \log D(x^*, y) - u.$$
Appending to the right-hand side a stochastic, zero-mean and symmetric error term, $v$, which accounts for measurement errors and other sources of statistical noise, yields the following stochastic production frontier model
$$-\log x_3 = \log D(x^*, y) + v - u. \qquad (5)$$
The empirical model is specified once a parametric functional form for the distance function is chosen. The translog is a particularly convenient functional form because it is linear in the parameters and provides a local second-order approximation to any arbitrary function. Therefore, we assume that $\log D(x^*, y)$ in (5) has a translog functional form in the logs of outputs and normalized inputs, as given in Eq. (6). 16
15 For a detailed analysis of properties of D(x, y) see Färe and Primont (1995).
16 Further details regarding the specification of the empirical model can be found in Appendix A.
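To make the structure of the normalized translog model concrete, the following sketch builds the dependent variable and the regressors for a single observation with two outputs and three inputs, the third input being the normalizing one. The function name `translog_row`, the term labels and the numerical values are hypothetical; the sketch assumes the usual symmetric translog, in which squared terms carry a factor of one half.

```python
import math
from itertools import combinations_with_replacement


def translog_row(x, y):
    """Return (dependent variable, regressors) for one observation.

    x: input quantities, the last one used for normalization
    y: output quantities
    """
    *x_head, x_last = x
    logs = {f"ly{m}": math.log(v) for m, v in enumerate(y, 1)}
    out_names = list(logs)
    in_names = [f"lx{k}" for k in range(1, len(x_head) + 1)]
    logs.update({n: math.log(v / x_last) for n, v in zip(in_names, x_head)})

    terms = {"const": 1.0, **logs}
    # Second-order terms within outputs and within normalized inputs.
    for group in (out_names, in_names):
        for a, b in combinations_with_replacement(group, 2):
            scale = 0.5 if a == b else 1.0
            terms[f"{a}*{b}"] = scale * logs[a] * logs[b]
    # Input-output interaction terms.
    for a in in_names:
        for b in out_names:
            terms[f"{a}*{b}"] = logs[a] * logs[b]
    return -math.log(x_last), terms


# Hypothetical centre: three input quantities and two output quantities.
dep, terms = translog_row(x=(10.0, 4.0, 2.0), y=(3.0, 5.0))
```

With two outputs and two normalized inputs the frontier has fifteen coefficients including the constant. Scaling all inputs by a common factor leaves the regressors unchanged and shifts only the dependent variable, which is how linear homogeneity in inputs is imposed.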
To estimate the parameters of the model (5) and (6), further assumptions about the unobserved terms $v$ and $u$ are needed. For any observation $i$, the inefficiency term $u_i$ is treated as a random variable and specifically as a one-sided error with a distribution in the non-negative domain, i.e. with $u_i \geq 0$. We assume that inefficiency terms are independently distributed across observations and follow a half-normal distribution,
$$u_i \sim N^+(0, \sigma_u^2), \qquad (7)$$
where 0 and $\sigma_u^2$ are respectively the mean and the variance of the normal distribution before truncation. 17 The random errors $v_i$ are supposed to be independently and normally distributed with zero mean and variance $\sigma_v^2$,
$$v_i \sim N(0, \sigma_v^2). \qquad (8)$$
As noted by several authors, Coelli (2000) and Kumbhakar (2013) among others, this empirical model may suffer from endogeneity problems. In particular, it is argued that if firms are aware of their inefficiency and take decisions about the level of inputs as the result of an optimization process, then a relationship between input variables and the composite error term exists.
Indeed, we are assuming that while outputs are demand determined so that they can be taken as exogenously given, the level of inputs is set by day-care managers with the aim of minimizing costs given input prices and technology, which means that the input variables are correlated with the inefficiency error terms u i . However, by using the same kind of arguments as in Coelli (2000) and Kumbhakar (2013), it can be shown that by contrast the input ratios do not depend on inefficiency. Therefore, as the error terms v i are supposed to be uncorrelated with explanatory variables, no endogeneity problem in the estimate of the parameters of model (5) arises (see Appendix A).
The stochastic frontier model specified by (5), (6) and the distributional assumptions on the unobservable terms, (7) and (8), is referred to as the half-normal model. Our aim is to estimate the production frontier coefficients $\alpha$, $\beta$ and $\gamma$, the variance $\sigma_v^2$ and, in particular, the parameter $\sigma_u^2$ of the inefficiency distribution. Indeed, the estimate of $\sigma_u^2$ together with the distributional hypothesis on $u_i$ yield observation-specific estimates of technical efficiency and inefficiency, which is the focus of our analysis. The parameters of the half-normal model are estimated by using the Maximum Likelihood (ML) method. The assumptions about $u_i$ and $v_i$ are used to derive the distribution of the composed error terms, $\varepsilon_i = v_i - u_i$, and thus the log-likelihood for each observation. The ML estimates are obtained by numerical optimization of the sum of the log-likelihood of each observation. Two cases are considered. The first case is the estimate of the half-normal model with homoscedastic error terms, i.e. with constant $\sigma_v^2$ and $\sigma_u^2$. To guarantee that the variance estimates are positive, the parametrization by means of an exponential function is used, i.e.
$$\sigma_v^2 = \exp(\theta_0), \qquad \sigma_u^2 = \exp(\delta_0), \qquad (9)$$
where $\delta_0$ and $\theta_0$ are unrestricted scalars. Variance estimates are recovered by substituting parameter estimates into the above formulae.
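The objective function maximized in the homoscedastic case can be sketched as follows. The code uses the standard density of the composed error of a normal/half-normal model, with the composed error equal to noise minus inefficiency, and it assumes that the scalar with subscript zero named delta governs the inefficiency variance while the one named theta governs the noise variance. The function names and the residual values are hypothetical, and the sketch evaluates the log-likelihood only; the numerical optimization step is not shown.

```python
import math


def norm_logpdf(z: float) -> float:
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def halfnormal_loglik(eps, delta0: float, theta0: float) -> float:
    """Sum of log densities of composed errors eps_i = v_i - u_i.

    The exponential parametrization keeps both variances positive for
    any real values of delta0 and theta0.
    """
    sigma_u2 = math.exp(delta0)   # pre-truncation variance of u
    sigma_v2 = math.exp(theta0)   # variance of v
    sigma = math.sqrt(sigma_u2 + sigma_v2)
    lam = math.sqrt(sigma_u2 / sigma_v2)
    total = 0.0
    for e in eps:
        total += (math.log(2.0) - math.log(sigma)
                  + norm_logpdf(e / sigma)
                  + math.log(norm_cdf(-e * lam / sigma)))
    return total


# Hypothetical frontier residuals and parameter values.
residuals = [-0.21, -0.05, 0.03, -0.12, 0.08]
ll = halfnormal_loglik(residuals, delta0=-3.0, theta0=-4.0)
```

In practice the residuals depend on the frontier coefficients, so the same function would be evaluated inside a numerical optimizer over all parameters jointly.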
In the second case, a half-normal model with heteroscedasticity in the error terms is considered. As shown by Caudill and Ford (1993) and Kumbhakar and Lovell (2000), ignoring heteroscedasticity may severely bias the estimate of the frontier and thereby the estimates of technical inefficiency. Following the literature, 18 we let the variance of $v_i$ and the pre-truncation variance of $u_i$ depend on exogenous observable variables. Therefore, $\sigma_v^2$ and $\sigma_u^2$ are parametrized by using exponential functions and are given by
$$\sigma_{v,i}^2 = \exp(w_i'\theta), \qquad \sigma_{u,i}^2 = \exp(z_i'\delta), \qquad (10)$$
where $z_i$ and $w_i$ are vectors of observable variables and $\delta$ and $\theta$ are the associated vectors of parameters. The measure of technical inefficiency for each observation $i$ is computed as the expected value of $u_i$ conditional on the composed error $\varepsilon_i$, i.e. $E(u_i \mid \varepsilon_i)$, according to the formula provided by Jondrow et al. (1982). Similarly, the observation-specific technical efficiency is given by $E(e^{-u_i} \mid \varepsilon_i)$ and it is computed as in Battese and Coelli (1995).
In the heteroscedastic half-normal model, the examination of exogenous determinants of inefficiency is conducted through the analysis of the variables $z_i$ affecting the pre-truncation variance of $u_i$ in (10). The marginal effect of each exogenous variable $z_{ik}$ on the expected value of observation-specific inefficiency is given by the derivative $\partial E(u_i)/\partial z_{ik}$, whose sign is the same as the sign of the $\delta_k$ coefficient (see Appendix A).
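The observation-specific measures can be sketched with the standard closed-form expressions for the normal/half-normal model with composed error equal to noise minus inefficiency. The marginal effect below differentiates the unconditional mean of a half-normal variable, which is the pre-truncation standard deviation times the square root of 2/pi. All names, the choice of a constant plus a public-ownership dummy as the variance regressors, and the parameter values are hypothetical illustrations, not estimates from the paper.

```python
import math


def norm_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def exp_variance(regressors, params) -> float:
    """Exponential variance function exp(regressors' params)."""
    return math.exp(sum(r * p for r, p in zip(regressors, params)))


def conditional_measures(eps: float, sigma_u2: float, sigma_v2: float):
    """Return (E[u | eps], E[exp(-u) | eps]) for eps = v - u."""
    sigma2 = sigma_u2 + sigma_v2
    mu_star = -eps * sigma_u2 / sigma2
    sigma_star = math.sqrt(sigma_u2 * sigma_v2 / sigma2)
    r = mu_star / sigma_star
    e_u = mu_star + sigma_star * norm_pdf(r) / norm_cdf(r)
    te = (math.exp(-mu_star + 0.5 * sigma_star ** 2)
          * norm_cdf(r - sigma_star) / norm_cdf(r))
    return e_u, te


def marginal_effect(z, delta, k: int) -> float:
    """Derivative of E(u) = sqrt(2/pi) * sigma_u with respect to z[k]."""
    sigma_u = math.sqrt(exp_variance(z, delta))
    return 0.5 * delta[k] * math.sqrt(2.0 / math.pi) * sigma_u


# Hypothetical observation: z = (constant, public dummy), w = (constant,).
z, delta = (1.0, 1.0), (-4.0, 0.5)
w, theta = (1.0,), (-5.0,)
s_u2, s_v2 = exp_variance(z, delta), exp_variance(w, theta)
e_u, te = conditional_measures(-0.1, s_u2, s_v2)
effect_public = marginal_effect(z, delta, k=1)
```

Since the remaining factors in the derivative are positive, the sign of the marginal effect coincides with the sign of the corresponding coefficient, as stated in the text.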

Sample and Data
In their activity of licensing and monitoring the quality of childcare, regional authorities carry out periodic surveys collecting information at the level of individual day-care centres. The manager of each day-care facility is asked to answer a questionnaire providing information about characteristics of children, teaching staff, service personnel and the outsourcing of auxiliary services. Our empirical analysis is based on cross-section micro-data collected in the regional survey carried out in 2007-8. 19 Different kinds of childcare facilities, such as day-care centres, micro-centres and age-integrated institutions, are kept distinct in the survey. From the point of view of production technology, these kinds of facilities differ from each other in many respects including the scale of operation and the output mix as well as the provision of complementary services and the use of the work force. In order to deal with a more homogeneous technology, we have decided to concentrate our attention on the core service represented by day-care centres, which covers nearly 68% of childcare services in the region (in terms of enrolled children). 20 The database contains 626 observations (day-care centres). There are 495 centres with access administered by a municipality, three quarters of which are public, i.e. directly operated by the municipality, while the remaining quarter are private, i.e. entrusted by the municipality to cooperatives or other private subjects. There are 131 non-municipal centres, where both access and the provision of the service are privately managed; they cover about 20% of the children enrolled in day-care centres in the region. They are mainly private firms (38%), cooperatives (33%) and religious organizations (12%) and most of them (75%) have fewer than 35 children.
Dropping 131 non municipal centres and 13 incomplete observations gave us a sample size of 482 municipal centres, 361 of which are public and 121 are private. 21 The final selected sample covers 98.6% of the total number of children enrolled in municipal day-care centres in the region.
Some descriptive statistics of the sample composition, which also show the main significant differences between public and private centres, are presented in Appendix B, Table 5. Centre size is defined in terms of capacity, which is the maximum number of children that the facility is authorized to accept by local authorities according to the regional legislation. A centre is considered small if capacity does not exceed 35, medium if capacity is between 35 and 60 and large if capacity exceeds 60. Medium or large size prevails in public centres (77%), while more than 80% of private centres have medium or small size. More than 62% of centres provide a joint service for infants and toddlers, while the remaining 181 centres (38%) offer only toddler service. Most private centres (above 60%) provide only toddler service, while more than 70% of public centres jointly serve infants and toddlers. There are no centres with only infants.
On average, 30% of day-care centres have mixed-age classrooms, although they are not uniformly distributed between public and private facilities. In fact, about 45% of private centres have mixed-age classrooms, which is twice the percentage for public centres. Not all centres have children with disabilities, and children with special needs are typically more concentrated in public facilities. Three-quarters of centres have only full-time classes and more than 70% provide extra daily opening hours of service. This is true for both public and private facilities, with no significant differences.
Table 5 also shows that the proportion of public centres producing all auxiliary services in-house is 30%, as compared to only 12% in private centres. Most services are produced in-house by both kinds of centres except for food services which are externalized by 80% of private facilities and by only 40% of public centres. Therefore, on average, public centres seem to provide better quality food which is usually associated with in-house service.
In short, the main differences between public and private day-care centres can be summarised as follows. Public centres have medium to large size, typically serve both infants and toddlers, mostly do not have mixed-age classrooms, and half of them host children with disabilities. A large majority supplies better quality food because it prepares meals in-house. On the other hand, private centres have medium to small size, typically serve only toddlers and almost half of them have mixed-age classrooms. A large majority externalizes the food service.
The main average characteristics of the day-care service by type of provider are presented in Appendix B, Table 6. Public centres have a significantly higher number of children and classrooms than private centres, while average group size per class is the same (17 children). Public centres also have significantly higher proportions of infants and of children with disabilities. The child-staff ratio, which is given by the ratio of children to full-time equivalent teachers, is 6.96 in private centres against 6.20 in public centres. The difference is statistically significant and is certainly due in part to the higher proportions of infants and children with disabilities, which impose more stringent teaching staff requirements on public facilities. Since the child-staff ratio and the group size of classrooms are usually considered as indices of the 'structural' quality of the service, the data of our sample do not seem to support the view of large differences in structural quality between public and private facilities.
The average proportion of children attending part-time classes is higher in private centres than in their public counterparts, although the difference is not statistically significant. Daily hours of full time service are almost the same for both types of centres, while extra hours of service are higher in private facilities.
A difference between public and private centres is also found in the use of labour input. The weekly working time is shorter in the public sector, where part-time labour contracts are also more frequent. Our data reveal this difference: although 85% of staff in both public and private centres work more than 24 h per week, more than 30% of workers in private centres work more than 36 h, while this figure is only 2% for workers in public centres. These differences are also reflected in a higher number of teachers and a higher fragmentation index, which is the ratio of teachers to full-time equivalent teachers, in public centres.
To sum up, the main differences between public and private services are that public centres have a larger number of children and classes, a higher proportion of infants and of children with disabilities. There is no significant difference as to the proportion of part-time children, while the structural quality of the service, measured by group size and child-staff ratio, does not appear to greatly differ. Private centres have a larger number of extra hours of service, while public facilities have a higher fragmentation of the labour-force.
We end this section with a description of the variables used in our empirical investigation. Their main descriptive statistics are shown in Table 1.

Input variables
Two labour inputs are considered: the labour of teaching staff and the labour of service personnel employed by the manager of the centre or by external contractors. Teaching staff also includes assistant teachers, staff used to extend opening hours and staff used to care for children with disabilities. Service personnel includes workers providing auxiliary services such as food, janitorial and laundry services. The two labour input variables are measured by taking the sum of weekly hours of teaching staff and the sum of weekly hours of service personnel. Since the labour embodied in the outsourced food service is not directly observed, a figurative amount of labour input has been imputed to the centres purchasing catered food service by using the following method. We first computed the per child average of labour inputs employed in the production of auxiliary services in large centres with internal food preparation. Next, we computed the per child average of labour inputs in auxiliary services for the centres externalizing food preparation. The difference between these two averages is the labour input per child that centres externalizing the service are assumed to bear. We then imputed to these centres the extra labour input for food service obtained by multiplying this difference by the number of enrolled children. In applying this imputation method, we are implicitly assuming that food preparation exhibits increasing returns to scale and that large day-care centres benefit from the same economies as the firms specialized in the supply of food services to the externalizing centres. This imputation method may somewhat underestimate the efficiency gains that centres externalizing the food service may attain. The two types of labour input, for teaching and for services, are respectively denoted by edu and ser.
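The imputation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys (`children`, `ser_hours`, `food_in`, `large`) and the toy figures are hypothetical, and the per child averages are taken as simple averages of centre-level ratios, which the text does not specify.

```python
from statistics import mean

def impute_food_labour(centres):
    """Impute weekly service hours to centres that outsource food.

    Each centre is a dict with hypothetical keys:
      'children'  - enrolled children
      'ser_hours' - observed weekly hours of service personnel
      'food_in'   - True if meals are prepared in-house
      'large'     - True if the centre is classified as large
    """
    # Per child service labour in large centres with in-house food preparation.
    ref = mean(c["ser_hours"] / c["children"]
               for c in centres if c["food_in"] and c["large"])
    # Per child service labour in centres that outsource food preparation.
    out = mean(c["ser_hours"] / c["children"]
               for c in centres if not c["food_in"])
    gap = ref - out  # labour per child assumed to be embodied in catered food
    result = []
    for c in centres:
        extra = 0.0 if c["food_in"] else gap * c["children"]
        result.append({**c, "ser": c["ser_hours"] + extra})
    return result, gap

# Toy data, purely illustrative.
centres = [
    {"children": 70, "ser_hours": 140.0, "food_in": True, "large": True},
    {"children": 80, "ser_hours": 160.0, "food_in": True, "large": True},
    {"children": 40, "ser_hours": 60.0, "food_in": False, "large": False},
    {"children": 30, "ser_hours": 45.0, "food_in": False, "large": False},
]
imputed, gap = impute_food_labour(centres)
```

In the toy data the outsourcing centres are credited with the gap in per child hours multiplied by their enrolment, while centres with in-house food keep their observed hours.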
Capital input mainly consists of buildings and green areas, furniture, equipment and variable capital such as heating and energy. The survey does not record data such as square meters of buildings and green areas, so we had to resort to a proxy in order to measure capital input. As the regional legislation fixes minimum physical standards for space and playing area per child, the maximum number of children that each centre is authorized to host is proportional to the physical size of the facility. Therefore, we take as a proxy of capital input the capacity of the day-care facility, i.e. the maximum number of children that the centre is authorized to host, and denote it by cap.

Output variables
Infant output and toddler output are separately measured by the respective total number of weekly child-hours. Daily opening hours are computed as the weighted average of hours of service of full-time and part-time classes and include extra hours of service. Weekly hours are computed over a five-day working week and outputs are then calculated by multiplying average weekly opening hours by the number of children in each age group. Infant output and toddler output are respectively denoted by yi and yt.
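A minimal sketch of the output calculation follows. It assumes that the weights of the daily average are the shares of full-time and part-time children and that extra hours are simply added to the average; neither detail is stated in the text, and the function and argument names are hypothetical.

```python
def weekly_outputs(n_infants, n_toddlers, share_part_time,
                   hours_full, hours_part, extra_hours=0.0, days=5):
    """Return (yi, yt): weekly child-hours for infants and toddlers."""
    # Weighted average of daily hours across full-time and part-time service
    # (assumption: weights are the shares of children in each regime).
    daily = ((1.0 - share_part_time) * hours_full
             + share_part_time * hours_part
             + extra_hours)
    weekly = days * daily  # five-day working week by default
    return weekly * n_infants, weekly * n_toddlers

# Illustrative centre: 10 infants, 30 toddlers, 20% part-time children.
yi, yt = weekly_outputs(10, 30, 0.2, hours_full=8.0, hours_part=5.0,
                        extra_hours=0.5)
```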
The following variables, representing particular aspects of the organization of the day-care service or relevant attributes of the centres, will be used to capture efficiency differences across centres.
Other variables
pub is a dichotomous variable which takes the value of 1 if the centre is public, i.e. operated by the municipality.
small is a dummy indicating centres with capacity less than 35.
pro is a dichotomous variable indicating whether the centre is located in a provincial capital, where the survey is likely to have been conducted more accurately.
ptr_ft is the ratio of part-time children to total enrolled children in centres with no part-time classes; this variable is equal to zero in centres with both part-time and full-time classes.
mix is a dummy indicating centres with mixed-age classrooms.
disa is a dichotomous variable which takes the value of 1 if children with disabilities are enrolled at the centre.
exh is the number of daily extra hours of service provided by the centre.
food_in is a dummy variable identifying the centres which do not externalize the food service, i.e. the most important auxiliary service.
frag is an index of fragmentation of the labour force, given by the ratio of teachers to full-time equivalent teachers (normalized at 36 h/week).
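As an illustration of the frag definition, the sketch below computes the index from a list of individual weekly hours. The function name and the staffing figures are hypothetical; only the 36 h/week normalization comes from the text.

```python
FULL_TIME_HOURS = 36.0  # weekly hours defining one full-time equivalent

def fragmentation_index(weekly_hours):
    """Ratio of the head count of teachers to full-time equivalent teachers."""
    fte = sum(weekly_hours) / FULL_TIME_HOURS
    return len(weekly_hours) / fte

# Ten full-time teachers versus nine full-time plus two half-time teachers.
frag_full = fragmentation_index([36.0] * 10)
frag_mixed = fragmentation_index([36.0] * 9 + [18.0] * 2)
```

Both staffing patterns supply the same ten full-time equivalents, but the second uses eleven people, so the index rises from 1 to 1.1.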

Empirical Results
The day-care service technology is estimated using the data described in Sect. 4. The model is defined with three input variables, $x_1$ = cap/mean(cap), $x_2$ = ser/mean(ser) and $x_3$ = edu/mean(edu), and two output variables, $y_1$ = yi/mean(yi) and $y_2$ = yt/mean(yt), so that all variables are rescaled to have unit means. Inputs are then normalized by $x_3$ and, finally, natural logs are taken. 22 We denote by nledu the negative of the log of teaching staff hours, by lcap the log of normalized capacity and by lser the log of normalized service personnel hours; lyi and lyt are, respectively, the logs of infant and toddler output.

The ML estimates of the parameters for three different nested models are presented in Table 2. 23 As a benchmark we first estimated the half-normal model defined by (5), (6), (7) and (8) under the assumption of no one-sided error, i.e. with the restriction $\sigma_u^2 = 0$. This model, which is denoted by OLS, reduces to the standard linear regression model with a symmetric, normally distributed error term and can be estimated by the OLS method. The estimated coefficients are shown in the third column of Table 2. The second column shows the estimates of the unrestricted half-normal model, where $\sigma_v^2$ and $\sigma_u^2$ are parametrized as in (9). This model, which is denoted by HN, allows us to conduct a first check of the correct specification of the stochastic frontier model with technical inefficiency. Finally, the first column of Table 2 presents the estimates of the half-normal model with heteroscedasticity in both error terms, which is denoted by HNH. The variance of $v_i$ and the pre-truncated variance of $u_i$ are parametrized by the exponential function as in (10) and (11). The models HN and HNH are nested: specifically, HN is the HNH model under the restriction that the coefficients $\delta$ and $\theta$ are equal to zero.
The two models are compared to show the effects of heteroscedasticity on the estimated parameters and efficiency measures. The HNH model is also used to examine the determinants of technical inefficiency.
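The variable transformations described at the start of this section can be sketched as follows. The function name is hypothetical, and the sketch assumes strictly positive data: how zero infant output in toddler-only centres is treated before taking logs is not described in this section, so that case is deliberately left out.

```python
import math
from statistics import mean

def prepare_regressors(cap, ser, edu, yi, yt):
    """Rescale to unit means, normalize inputs by teaching hours, take logs."""
    def unit_mean(values):
        m = mean(values)
        return [v / m for v in values]

    x1, x2, x3 = unit_mean(cap), unit_mean(ser), unit_mean(edu)
    y1, y2 = unit_mean(yi), unit_mean(yt)
    return {
        "nledu": [-math.log(c) for c in x3],                 # dependent variable
        "lcap": [math.log(a / c) for a, c in zip(x1, x3)],   # log(x1 / x3)
        "lser": [math.log(b / c) for b, c in zip(x2, x3)],   # log(x2 / x3)
        "lyi": [math.log(a) for a in y1],
        "lyt": [math.log(a) for a in y2],
    }

# Two illustrative centres with positive values for every variable.
data = prepare_regressors(cap=[40, 60], ser=[100, 140], edu=[300, 500],
                          yi=[200, 600], yt=[900, 1100])
```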
The OLS model fits the data quite well, with an R-squared exceeding 94%. All the first-order coefficients have the correct sign and are significantly different from zero, as are several second-order parameters. However, the OLS residuals exhibit a strongly significant negative skewness, which is a symptom of the presence of a one-sided error. In fact, a comparison between the OLS and the HN models reveals that the specification of a stochastic production frontier model with technical inefficiency is supported by empirical evidence. A Likelihood Ratio (LR) test for the null hypothesis of no one-sided error was conducted by comparing the log-likelihood values of the 'restricted' model, OLS, and the 'unrestricted' stochastic frontier HN model. 24 The LR test for the null hypothesis $\sigma_u^2 = 0$ has a mixed $\chi^2$ distribution with 1 degree of freedom and its critical values for hypothesis testing are tabulated in Kodde and Palm (1986). The critical value at the 1% significance level is 5.412. The computed value of the LR statistic is 51.396, indicating a strong rejection of the null hypothesis of no one-sided error. Therefore, there is empirical evidence that justifies the use of the stochastic frontier model for the analysis of technical inefficiency in day-care centres.
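The boundary test above can be reproduced numerically. The sketch below assumes the usual result that, with a single variance restricted to zero, the mixed distribution is an equal-weight mixture of a point mass at zero and a $\chi^2$ with 1 degree of freedom; the function name is hypothetical.

```python
import math

def mixed_chi2_pvalue(lr_stat):
    """P-value under the 50:50 mixture of chi2(0) and chi2(1)."""
    if lr_stat <= 0:
        return 1.0
    # P(chi2_1 > x) = erfc(sqrt(x / 2)); the point mass at zero adds nothing
    # to the upper tail, so the mixture simply halves this probability.
    return 0.5 * math.erfc(math.sqrt(lr_stat / 2.0))

p_at_critical = mixed_chi2_pvalue(5.412)  # tabulated 1% critical value
p_observed = mixed_chi2_pvalue(51.396)    # LR statistic reported in the text
```

Evaluating the function at the tabulated critical value returns a tail probability of about 0.01, which is a quick check that the mixture weights are the ones behind the Kodde and Palm table for this case.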
The estimates of technical inefficiency in the HN model are obtained by using the Jondrow et al. (1982) formula (see (18) in Appendix A). The average value of inefficiency is 0.15 (with a standard deviation of 0.11), which means that actual inputs can be proportionally reduced by 15% without reducing output. We also obtained the estimates of observation-specific technical efficiency, $E(e^{-u_i} \mid \varepsilon_i)$, by using the Battese and Coelli (1995) formula (see formula (19) in Appendix A). The estimated average technical efficiency is 0.87, with a standard deviation of 0.09, meaning that the day-care service can be provided by employing, on average, only 87% of actual inputs. The distribution of observation-specific technical efficiency estimates of the HN model is plotted in Fig. 3 in Appendix B.
An unsatisfactory aspect of the estimated HN model is revealed by the comparison of the frontier coefficient estimates between the OLS and HN models, which shows a number of quite sizeable differences. Since OLS is known to produce consistent estimates of slope parameters, 25 these discrepancies cast some doubt on the reliability of the frontier estimates in the HN model and, as a result, of the inefficiency estimates. Moreover, since neglecting heteroscedasticity may result in biased estimates, as pointed out by Kumbhakar and Lovell (2000), we decided to introduce heteroscedasticity of the error terms in the half-normal model. Heteroscedasticity has been modelled by using the exogenous variables listed in Table 1. In the variance equation of the statistical error, (11), two exogenous variables $w_h$ have been considered, pro and small. In fact we noticed that data measurements are more accurate and systematic in centres located in urban areas and that small centres exhibit greater variability in the data because they are usually located in rural areas, where the use of capacity is more vulnerable to changes in demographic factors. In the variance equation of the inefficiency term, (10), we included as exogenous variables, $z_k$, those which are supposed to be the major sources of technical efficiency or inefficiency: the public or private nature of the centre and the characteristics of the organization of the service, such as the share of part-time children when part-time classes are absent, the extra hours of service, the presence of mixed-age classes, the degree of externalization of auxiliary services and the fragmentation of the labour force. The estimates of the HNH model are shown in column one of Table 2.
We performed an LR test to check whether the model with heteroscedasticity is preferred to the HN model. Since the two models are nested, we tested the null hypothesis that $\theta = 0$ and $\delta = 0$. The computed LR statistic is 187.84, while the critical value at the 1% significance level of the $\chi^2$ distribution with 9 degrees of freedom is 21.67. Therefore, the homoscedastic half-normal model is strongly rejected and we shall focus our analysis on the HNH model.
First of all, it can be noticed that the slope coefficients of the production frontier are not noticeably different from those of the OLS model and, in particular, the first-order input coefficients are positive and significant. 26 The first-order coefficients of outputs are significant and negative, as expected, and are used to obtain an estimate of returns to scale. A local measure of returns to scale is the elasticity of scale, which is given, at input and output means, by $\eta_s = -1/(\beta_1 + \beta_2)$. 27 The estimated elasticity of scale, obtained by substituting the first-order output coefficients, is 1.21, which means that the day-care production technology exhibits locally increasing returns to scale. 28 A 1% proportional increase in all inputs may produce a proportional increase in all outputs of more than 1%, and precisely of 1.21%. This conclusion, which agrees with the results of other empirical studies on day-care services, 29 indicates that the average centre size of 47 children is smaller than the minimum efficient scale of production.
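The elasticity of scale is a one-line calculation once the first-order output coefficients are known. In the sketch below the function name and the example coefficients are hypothetical (the estimates themselves are in Table 2); the only figure taken from the text is the reported elasticity of 1.21, which implies a coefficient sum of about −0.826.

```python
def elasticity_of_scale(beta_1, beta_2):
    """Elasticity of scale at the sample means: -1 / (beta_1 + beta_2)."""
    total = beta_1 + beta_2
    if total >= 0:
        raise ValueError("first-order output coefficients should sum to a negative number")
    return -1.0 / total

# Sum of output coefficients implied by the reported elasticity of 1.21.
implied_sum = -1.0 / 1.21

# Hypothetical coefficients, for illustration only.
example = elasticity_of_scale(-0.30, -0.50)
```

A coefficient sum between −1 and 0 gives an elasticity above one, i.e. locally increasing returns to scale.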
The estimated model allows us to perform a test of separability between inputs and outputs. If the marginal rate of transformation between inputs is not affected by the amount of outputs, then an aggregate measure of outputs can be used in the representation of technology, which can also be described by a single-product technology. Separability obtains if all the cross input-output coefficients $\gamma_{hm}$ are equal to zero. Although almost none of these coefficients taken individually is significantly different from zero, the joint hypothesis that all $\gamma$ parameters are equal to zero can be rejected. Indeed, an LR test for the null hypothesis that $\gamma_{hm} = 0$ for h = 1, 2 and m = 1, 2 was conducted. The value of the LR statistic is 9.78, while the critical value at the 5% (1%) level of significance of the $\chi^2$ distribution with 4 degrees of freedom is 9.49 (13.28). Therefore the hypothesis of separability between inputs and outputs is rejected at the 5% level of significance (but not at 1%). Our empirical results suggest that the use of aggregate output and a production function in the analysis of day-care technology is not justified and should be avoided.

26 The estimated coefficient of teaching staff input in (12) is recovered from the homogeneity restriction (13) in Appendix A and is $\alpha_3 = 0.286$. The other parameters in (12) can be similarly recovered from homogeneity restrictions (13), (14) and (15).
27 The elasticity of scale for multiple output technologies was introduced by Panzar and Willig (1977). See also Färe and Primont (1995) and Appendix A.
28 The null hypothesis of constant returns to scale is rejected by an LR test against the alternative of increasing returns at any of the usual levels of significance.
29 See, for example, Powell and Cosgrove (1992) and Mocan (1997).
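The p-value of the separability test can be checked without statistical libraries, because the upper tail of a $\chi^2$ distribution with an even number of degrees of freedom has a closed form. The function name below is hypothetical; the statistic 9.78 and the 4 degrees of freedom are those reported above.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square with an even number of df."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    # exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

p_separability = chi2_sf_even(9.78, 4)  # LR statistic reported in the text
```

The resulting p-value lies between 0.01 and 0.05, in line with a rejection at the 5% level but not at the 1% level.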
The estimates of technical inefficiency and efficiency in the HNH model are computed by using formulae (18) and (19) in Appendix A. 30 The average inefficiency is about 0.10, with a standard deviation of 0.09, and the average technical efficiency is 0.91, with a standard deviation of 0.08. As can be noted, there are noticeable differences in the estimated values of efficiency and inefficiency between the HNH and the HN model. The average inefficiency falls from 14% to 10% while efficiency increases from 87% to 91%. These more favourable measures are to be preferred since they are based on more accurate estimates of the production frontier. The distribution of technical efficiency estimates is plotted in Fig. 2. As can be seen, the distribution is highly concentrated since almost three-quarters of the centres have technical efficiency above 90%. 31 Table 3 shows the distribution of average inefficiency by type of provider and by type of service. Public centres are, on average, more inefficient than private ones; in fact, inefficiency in private centres is over 6% while it is about 11% in public centres. 32 Moreover, Table 3 shows that centres with only toddler service are more inefficient than centres jointly providing services for infants and toddlers, the difference in score being greater than 2% points. Therefore, the specialization of production in the childcare sector does not seem to result in efficiency gains. Table 3 also reveals that the lowest level of average inefficiency is 5%, which is achieved in private day-care centres that provide a joint service for infants and toddlers.

30 The measures are computed by using the observation-specific estimates of $\sigma_{v,i}^2$ and $\sigma_{u,i}^2$, which in turn are obtained from (10) and (11) by substituting the estimated values of the parameters.
31 The distribution of estimated technical inefficiency is plotted in Fig. 4 in Appendix B.
The HNH model has been used to address the issue of exogenous determinants of inefficiency. In particular, we have examined which of the $z_k$ variables that parametrize the variance of $u_i$ can be regarded as sources of inefficiency and we have evaluated the magnitude of their effects. As explanatory variables we have considered the public or private nature of the day-care centre and the other variables in Table 1 related to the organization of the service and the labour force.
The estimates of the $\delta$ coefficients in the parametrization equation of $\sigma_{u,i}^2$, (10), are presented in the first column of Table 2. The estimated value of $\delta_k$ gives the effect of the variable $z_k$ on the variance of $u_i$, but it does not directly provide the marginal effect of $z_k$ on inefficiency, because the dependence of the unconditional mean $E(u_i)$ on $\sigma_{u,i}^2$ is non-linear. Although the sign of the coefficient $\delta_k$ reveals the sign of the effect of the variable $z_k$, its magnitude is given by the marginal effect, which is computed by using formula (20) in Appendix A and is shown in Table 4.
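The distinction between a $\delta$ coefficient and the corresponding marginal effect can be illustrated with a short sketch. It assumes the exponential parametrization $\sigma_{u,i}^2 = \exp(z_i'\delta)$ and the half-normal mean $E(u_i) = \sigma_{u,i}\sqrt{2/\pi}$, so that the derivative with respect to $z_{ik}$ is $\delta_k \sigma_{u,i} \phi(0)$. The function names and the coefficient values are hypothetical and are not the estimates in Table 2.

```python
import math

PHI0 = 1.0 / math.sqrt(2.0 * math.pi)  # standard normal density at zero

def sigma_u(z, delta):
    """Pre-truncation standard deviation under sigma_u^2 = exp(z'delta)."""
    return math.exp(0.5 * sum(d * v for d, v in zip(delta, z)))

def expected_inefficiency(z, delta):
    """Unconditional mean of a half-normal inefficiency term."""
    return sigma_u(z, delta) * math.sqrt(2.0 / math.pi)

def marginal_effect(z, delta, k):
    """Derivative of E(u_i) with respect to z_k: delta_k * sigma_u * phi(0)."""
    return delta[k] * sigma_u(z, delta) * PHI0

# Hypothetical values: a constant, a dummy and a continuous regressor.
z = [1.0, 0.0, 0.5]
delta = [-4.0, 0.8, -1.2]
effects = [marginal_effect(z, delta, k) for k in range(len(z))]
```

Each marginal effect equals half the coefficient times the expected inefficiency at that observation, so it has the sign of $\delta_k$ but a much smaller magnitude when average inefficiency is low.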
All the estimated $\delta$ coefficients have the expected sign and most of them are significantly different from zero at the 5% level. The coefficient of pub is positive and significantly different from zero, meaning that public day-care centres are more inefficient. The point estimate of the marginal effect is 0.041, which means that, on average, public centres have a technical inefficiency that is 4.1% points higher than that of private centres, all else being equal. Another positive and highly significant effect on inefficiency is given by the presence of part-time children in centres with no classrooms arranged on a part-time basis. The marginal effect of ptr_ft is 0.174, meaning that if the ratio of part-time children to total children increases by 10% points, inefficiency rises by about 1.7% points. Since more than three-quarters of centres have only full-time classes and since the average proportion of children served on a part-time basis is about 20%, without significant differences between public and private centres, we conclude that this determinant is a major source of inefficiency and equally affects public and private facilities.
A negative effect on inefficiency, and therefore a pro-efficiency effect, is provided by extra hours of service, i.e. hours in excess of the standard opening hours. From Table 2 it is seen that exh has a negative and highly significant coefficient and, from Table 4, that its marginal effect is −0.064. An increase of 10 min in the extra hours of service is thus associated with a potential proportional reduction of inputs of about 1.1% points. About 70% of centres supply extra hours of service, and they are found equally among public and private facilities.
The mix variable has a positive and significant coefficient. The marginal effect on inefficiency of having mixed-age classes is 0.033, which means that this type of organization of the service increases the inefficiency of the day-care centre by 3.3% points on average. Indeed, if children of different age groups are put together in the same classroom, regulations require that the child-staff ratio of the younger children is met, so that older children are served with more teachers than required. As shown in Table 5, this factor of inefficiency affects private centres more heavily than public ones.
The frag variable has a positive and significant effect on inefficiency, although its marginal effect, which is 0.069, is of limited magnitude. For example, if a centre with 10 full-time teachers moves to 9 full-time teachers and two half-time teachers, the frag index increases by 0.1 and the average predicted increase in inefficiency is only about 0.7% points. The last two variables, food_in and disa, might have a positive effect on inefficiency; however, their coefficients are not significantly different from zero. Empirical evidence suggests that in-house production of the food service (and other auxiliary services) does not seem to be a key determinant of inefficiency, although, as noted in Sect. 4, its effect may be underestimated due to the method we used to impute labour input to the externalized food service. Children with disabilities do not seem to absorb extra resources, perhaps because their number is relatively small and they are evenly distributed across centres.
We close this section with a few remarks about the inefficiency difference between public and private centres that emerges from our empirical investigation. An average difference in efficiency of 4.1% points is not negligible if one considers that this value is net of all the other factors which are usually introduced to explain the differences in efficiency between public and private providers, such as the greater flexibility in the use of labour in private employment contracts, as captured by the fragmentation index, or the higher degree of externalization of auxiliary services. Our empirical analysis suggests that the inefficiency gap that we have estimated may originate from other factors that have not been explicitly considered, such as better managerial abilities in private centres and/or the stronger incentive to minimize costs deriving from the pursuit of profit. Nevertheless, it is also worth noting that the greater inefficiency of public providers may be partly explained by a better structural quality. In fact, as we have seen in Sect. 4, the child-staff ratio in public centres tends to be lower than in private centres.
Finally, a global comparative evaluation of efficiency should also take into account the overall quality of childcare services in public and private centres. We have not examined this delicate issue here due to the lack of information in our data set, although a few elements in favour of a greater quality of the public service can be identified. Most notable are the higher wages for educational staff in the public sector, which are supposed to attract more experienced and qualified teachers, and the preparation of meals in-house, which is quite widespread among public centres and should guarantee a better quality of food for children.
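The worked figures quoted in the discussion of the marginal effects follow from multiplying each effect by the change in the corresponding variable. The sketch below uses the marginal effects reported in the text; the dictionary and function names are hypothetical, and exh is taken to be measured in hours, as in its definition.

```python
# Marginal effects on expected inefficiency as reported in the text (Table 4).
marginal_effects = {
    "pub": 0.041,
    "ptr_ft": 0.174,
    "exh": -0.064,
    "mix": 0.033,
    "frag": 0.069,
}

def predicted_change(variable, delta_z):
    """First-order change in expected inefficiency for a change delta_z."""
    return marginal_effects[variable] * delta_z

part_time = predicted_change("ptr_ft", 0.10)   # +10 percentage points of part-time children
extra_time = predicted_change("exh", 10 / 60)  # +10 minutes of extra service
fragmentation = predicted_change("frag", 0.1)  # frag index rising from 1.0 to 1.1
```

These are first-order approximations: the marginal effects are averages of observation-specific derivatives, so the predicted changes are indicative rather than exact for any single centre.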

Summary and Conclusions
In this work we estimated the production technology in the childcare sector of an important region of northern Italy and we obtained measurements of technical inefficiency of day-care centres. We also identified the major determinants of inefficiency and we provided estimates of the magnitude of their effects, carrying out a comparative analysis between public and private centres. We found that technology exhibits locally increasing returns with an estimated elasticity of scale of 1.21. This result indicates that the technically efficient size of productive units in the childcare sector is greater than the average centre size of 47 children. Moreover, data do not seem to support the hypothesis of a technology with separable inputs and outputs, therefore the production of day-care services is best described by a multiple input and output model, while the use of aggregate output with a production function does not seem to be justified.
Technical inefficiency in the day-care sector is about 10% indicating that, on average, centres can proportionally reduce inputs by that amount without reducing the amount of service provided. Also, 10% can be considered as a measure of potential cuts in the costs of providing the service. On average, public centres are more inefficient than their private counterparts as technical inefficiency is about 11% in public centres and ranges from 6 to 7% in private centres. The inefficiency gap between public and private provision of the service is estimated at 4.1% points, other things being equal. In part, this estimate may reflect a higher structural quality of public childcare as compared to private.
Other important determinants of inefficiency are related to specific modes of organization of the day-care service. Centres with no classes arranged on a part-time basis and serving part-time children are more inefficient; specifically, increasing the proportion of part-time children by 10% points raises inefficiency by 1.7% points. On the other hand, centres offering more extra hours of service are more efficient, because they economize on hours of standard full-time service; 10 more minutes of extra service are associated with an efficiency gain of 1.1% points. Centres with mixed-age classrooms are, on average, more inefficient by 3.3% points. The fragmentation of the labour force increases inefficiency, although the effect is of limited magnitude, and the choice of outsourcing or producing in-house auxiliary services, such as food, does not seem to have a sizeable effect on efficiency. Finally, most of the above determinants do not seem to affect differently public and private facilities, because they are evenly spread over the two types of providers. For example, the practice of supplying part-time service without classes organized on a part-time basis, which is one of the major sources of inefficiency, is equally shared by more than 70% of both public and private centres. A notable exception, however, is the presence of mixed-age classes, which is an important source of inefficiency characterizing private centres.
There are a few policy recommendations stemming from our analysis. First, there is certainly room for improving efficiency in the day-care sector in Italy, given that inefficiencies turn out to be non-negligible even in one of the regions with the highest standards in the provision of childcare. Second, expanding childcare services would be desirable not only from a socio-economic point of view, but also on efficiency grounds. In fact, the relatively small average size of centres in the childcare sector limits the opportunity to take full advantage of scale economies. It follows that the expansion of the service should be pursued by increasing the actual size of day-care centres and, in particular, by increasing the presence of private providers, which have a smaller average size and suffer most from efficiency losses due to a more intensive use of mixed-age classes. Third, it must be stressed that simply increasing the number and the size of private centres will not remove some of the major sources of inefficiency, which are shared by both public and private providers. Hence, other interventions in the modes of organizing the childcare service are needed in order to improve the efficiency of the sector.

Compliance with Ethical Standards
Funding This study has not benefited from funding.

Conflict of interest
The authors declare that they have no conflict of interest.
In order to estimate the parameters of the model, further assumptions are introduced. The inefficiency terms $u_i$ are treated as random variables with a half-normal distribution $N^+(0, \sigma_u^2)$, where $\sigma_u^2$ is the variance of the normal distribution before truncation. 34 Moreover, the inefficiency terms are supposed to be independently distributed across observations. The random errors $v_i$ are independently and normally distributed with zero mean and variance $\sigma_v^2$ and are uncorrelated with the explanatory variables.

The empirical model (5) does not suffer from an endogeneity problem, as the regressors (the input ratios) are not correlated with the inefficiency error term. Indeed, the first-order conditions of the cost minimization problem can be written in terms of the input ratios, for h = 1, 2, where $p_h$ and $p_3$ are input prices. The resulting system of two equations can be solved for the input ratios $x_1/x_3$ and $x_2/x_3$ as functions of exogenous variables only, i.e. input prices $p_1$, $p_2$, $p_3$ and outputs $y_1$ and $y_2$. Hence, input ratios are not affected by inefficiency.

The parameters of the half-normal model are estimated by using the Maximum Likelihood (ML) method. The assumptions about $u_i$ and $v_i$ are used to derive the distribution of the composed error term, $\varepsilon_i$, and thus the log-likelihood for each observation, which is 35

$$\ln L_i = -\ln\tfrac{1}{2} - \tfrac{1}{2}\ln(\sigma_v^2 + \sigma_u^2) + \ln\phi\!\left(\frac{\varepsilon_i}{\sqrt{\sigma_v^2 + \sigma_u^2}}\right) + \ln\Phi\!\left(\frac{\mu_i^*}{\sigma^*}\right)$$

where $\phi$ and $\Phi$ are, respectively, the probability density and the probability distribution functions of the standard normal and

$$\mu_i^* = -\frac{\sigma_u^2\,\varepsilon_i}{\sigma_v^2 + \sigma_u^2} \tag{16}$$

$$\sigma^{*2} = \frac{\sigma_v^2\,\sigma_u^2}{\sigma_v^2 + \sigma_u^2} \tag{17}$$

34 Notice that, although the distribution has zero mode, the mean of $u_i$ is different from zero and $\mathrm{Var}(u_i)$ is not equal to $\sigma_u^2$. Indeed, $E(u_i) = \sigma_u\sqrt{2/\pi}$ and $\mathrm{Var}(u_i) = \sigma_u^2(\pi - 2)/\pi$.
35 See Stevenson (1980) or Kumbhakar et al. (2015).
The ML estimates are obtained by numerical optimization of the sum of the loglikelihood of each observation.
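A minimal sketch of the objective function that the numerical optimizer maximizes is given below. It assumes the standard normal/half-normal log-likelihood with composed error $\varepsilon_i = v_i - u_i$ (the sign convention is an assumption here), takes the frontier residuals as given, and uses hypothetical function names. An actual estimation would also optimize over the frontier coefficients and guard against numerical underflow in the tails.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def loglik_obs(eps, sigma_u, sigma_v):
    """Log-density of eps = v - u with v normal and u half-normal."""
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -sigma_u ** 2 * eps / s2
    sigma_star = sigma_u * sigma_v / math.sqrt(s2)
    return (math.log(2.0) - 0.5 * math.log(s2)
            + math.log(norm_pdf(eps / math.sqrt(s2)))
            + math.log(norm_cdf(mu_star / sigma_star)))

def loglik(residuals, sigma_u, sigma_v):
    """Sample log-likelihood: sum of the observation-level contributions."""
    return sum(loglik_obs(e, sigma_u, sigma_v) for e in residuals)

# Illustrative residuals and variance parameters (not estimates from the paper).
value = loglik([-0.20, -0.05, 0.03, 0.10], sigma_u=0.2, sigma_v=0.1)
```

In the heteroscedastic model the two standard deviations would be replaced by observation-specific values obtained from the exponential variance equations.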
The measure of technical inefficiency for each observation i is computed as the expected value of $u_i$ conditional on the composed error $\varepsilon_i$, according to the formula provided by Jondrow et al. (1982)

$$E(u_i \mid \varepsilon_i) = \mu_i^* + \sigma^*\,\frac{\phi(\mu_i^*/\sigma^*)}{\Phi(\mu_i^*/\sigma^*)} \tag{18}$$

where $\mu_i^*$ and $\sigma^*$ are as in (16) and (17). Similarly, the observation-specific technical efficiency, computed as in Battese and Coelli (1995), is given by

$$E(e^{-u_i} \mid \varepsilon_i) = \exp\!\left(-\mu_i^* + \tfrac{1}{2}\sigma^{*2}\right)\frac{\Phi(\mu_i^*/\sigma^* - \sigma^*)}{\Phi(\mu_i^*/\sigma^*)} \tag{19}$$

In the heteroscedastic half-normal model, the examination of the exogenous determinants of inefficiency is conducted through the analysis of the variables $z_i$ affecting the pre-truncated variance of $u_i$ in (10). The marginal effect of each exogenous variable $z_{ik}$ on the expected value of observation-specific inefficiency is given by the derivative

$$\frac{\partial E(u_i)}{\partial z_{ik}} = \delta_k\,\sigma_{u,i}\,\phi(0) \tag{20}$$

Since $\phi(0) > 0$, the sign of the marginal effect is the same as the sign of the $\delta_k$ coefficient.

Finally, a local measure of economies of scale can be computed by using the first-order estimated parameters of D(x, y). In fact, as shown in Färe and Primont (1995) (pp. 39-40), the elasticity of scale at (x, y) can be written in terms of elasticities of the distance function as

$$\eta(x, y) = -\frac{1}{\varepsilon_{D,y_1}(x, y) + \varepsilon_{D,y_2}(x, y)}$$

where

$$\varepsilon_{D,y_m}(x, y) = \frac{\partial \log D(x, y)}{\partial \log y_m}, \qquad m = 1, 2.$$

As is easily seen from (12), $\varepsilon_{D,y_m}(x, y)$ evaluated at the variable means, i.e. at (x, y) = (1, 1, 1, 1, 1), is equal to the first-order coefficient of $\log y_m$, so that the local value of the elasticity of scale is given by $\eta_s = -1/(\beta_1 + \beta_2)$.
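The two observation-specific measures discussed in this appendix can be sketched as follows, using the standard forms of the Jondrow et al. and Battese and Coelli estimators for the normal/half-normal model. The sketch assumes a composed error $\varepsilon_i = v_i - u_i$; the function names and parameter values are hypothetical.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def conditional_moments(eps, sigma_u, sigma_v):
    """Mean and std dev of the pre-truncation distribution of u given eps."""
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -sigma_u ** 2 * eps / s2
    sigma_star = sigma_u * sigma_v / math.sqrt(s2)
    return mu_star, sigma_star

def jlms_inefficiency(eps, sigma_u, sigma_v):
    """E(u | eps): conditional mean of the inefficiency term."""
    mu, s = conditional_moments(eps, sigma_u, sigma_v)
    r = mu / s
    return mu + s * norm_pdf(r) / norm_cdf(r)

def bc_efficiency(eps, sigma_u, sigma_v):
    """E(exp(-u) | eps): conditional mean of technical efficiency."""
    mu, s = conditional_moments(eps, sigma_u, sigma_v)
    r = mu / s
    return math.exp(-mu + 0.5 * s * s) * norm_cdf(r - s) / norm_cdf(r)

# Illustrative residuals; with heteroscedasticity, sigma_u and sigma_v
# would be observation-specific.
residuals = [-0.20, -0.05, 0.03, 0.10]
ineff = [jlms_inefficiency(e, 0.2, 0.1) for e in residuals]
eff = [bc_efficiency(e, 0.2, 0.1) for e in residuals]
```

More negative residuals map into higher estimated inefficiency, and each efficiency score is at least as large as the exponential of minus the corresponding inefficiency estimate, which explains why the two averages do not sum exactly to one.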