Investing in formal on-the-job training: are SMEs lagging much behind?

In a modern economy, the investment in human capital by firms is crucial to foster technological adoption and productivity growth. This paper analyzes the correlation between firm size and the investment in job training by employers. Using a large firm level data set across 99 developing countries, we show that a strong and positive correlation between the investment in job training and firm size is a robust statistical finding both within and across countries with very different institutions and levels of development. Even though we cannot fully disentangle correlation from causality, we show that the size-training gap is not fully explained by differences across firms in market imperfections or institutional failures impeding the development of smaller firms. Our findings call for the urgent collection of better panel data sets to understand how cost-effective on-the-job training programs are in fostering firm productivity and growth in developing countries. JEL codes: J24, D24


Motivation
The international community has long recognized the important role of the small and medium enterprise sector (SMEs) in the economies of the developing world. Policymakers around the world worry about how to foster productivity and growth among this group of firms. In a modern economy, the investment in human capital is crucial to foster technological adoption and, thus ultimately, achieve higher productivity growth. This paper explores a large firm level survey across 99 countries to document the differences in the job training provided by employers across firms of different sizes. Our findings show that a strong and positive correlation across the investment in job training and firm size is a robust empirical finding within and across countries with different institutions and income levels. Furthermore, proxies for some market imperfections and institutional failures impeding SME development do not explain most of the differences in the training intensity across small and large firms. It is thus a very robust finding that large firms conduct more on the job training. Unfortunately, with our data, we cannot fully disentangle correlation from causality. It is possible that size is not driving training and that on-the-job training is productive and itself drives firm productivity and growth. We highlight that more and better data sets are needed to rigorously tackle this identification problem and thus shed more light on this important policy question in developing countries.
The importance of the SME sector throughout the developing world is undeniable. First, SMEs account for more than half of manufacturing employment in many countries (e.g., Ayyagari et al. 2008). Second, there is also a growing recognition of the role that SMEs play in sustained global and regional economic growth, higher employment and poverty alleviation. Moreover, few economists disagree that SMEs face greater constraints to their growth than large firms. Access to finance usually ranks high among these constraints and is often pointed to as the main reason behind SMEs having a smaller capacity to invest. Documenting and understanding the main binding constraints on firms' investment and growth in developing countries is crucial for the design of policies that promote long run productivity growth.
The investment in human capital has been widely documented as a core component of each individual's human development, firm growth and aggregate productivity growth. For example, Heckman et al. (1998) estimate that individuals invest in human capital over the whole life-cycle, but more than one half of lifetime human capital is accumulated through post-schooling investments taking place on the job. Moreover, differences in total factor productivity account for approximately half of the differences in income across countries and are generally associated with differences in technological progress (Hall and Jones, 1999). These differences are also large between firms within a single country (Hsieh and Klenow, 2007), and technology adoption and investment in human capital are shown to be core factors in explaining how firms catch up to the technology frontier. Moreover, these factors are also important for designing policies to enhance growth and development. Surprisingly, very little research has been done on the differences within countries and across firms in the investment in job training around the developing world, and the reason also relates mainly to lack of data. This paper explores a unique cross sectional firm level data set across 99 countries in the developing world to document differences in the investment in job training across firm sizes. The Enterprise Surveys, collected by the World Bank, have unique information to study this topic. First, the surveys explore an almost standardized questionnaire across countries and thus collect information that is comparable across and within countries. Second, the surveys are available for 99 developing countries covering all the geographical regions of the world and income levels. This wide range of countries covered allows us to test the extent to which the existing differentials are explained by differences across countries in their institutions and policies. 
Third, the survey collects detailed firm characteristics including variables that are good proxies for the firm's access to information and external finance, measures of the degree of openness and technological innovation, measures of the human capital composition of the workforce and on the perceptions regarding the firm's investment climate. The availability of several firm characteristics will allow us to analyze the role of different factors in explaining the correlation between the investment in job training and firm size.
Unfortunately, survey limitations beyond our control affect the scope of the analysis. First, in most countries, these surveys are representative of the formal sector, and particularly of the manufacturing sector. Since in developing countries the services and/or informal sector can account for more than half the workforce, this naturally limits how representative the analysis is of the informal or non-manufacturing sectors. Still, we expect most of the job training taking place in the informal sector to be of the learning-by-doing or apprenticeship type rather than formal training programs (see Johanson and Van Adams 2004). Second, the surveys collect only information on formal training programs, leaving undocumented any informal training taking place while workers are on the job (e.g., learning by doing) 1 . For this reason, we are likely underestimating the overall investment in training, especially among the smaller enterprises, where informal training or apprenticeship schemes are likely to be more important (e.g., Frazer, 2006, Velenchik, 1995, Teal 1996, Monk et al. 2008) 2 . However, our findings are still relevant and important for formal training programs. Apprenticeships usually focus on hands-on learning that enables employees, typically youth, to perform a given task. They often have limited or no linkages to more academic training. Formal training programs, instead, tend to be more broadly determined by the needs of firms, often when they start new tasks and explore new production processes or upgrades/changes in knowledge and procedures. This training is therefore very different from apprenticeship training. In this paper, we focus only on formal training programs and on how different the patterns of investment are across firm sizes.
The paper documents several interesting findings. First, we find robust evidence of a large and statistically significant positive correlation between firm size and the investment in job training. In particular, we find that small (11-50 permanent employees), medium (51-250 employees) and large firms (more than 250 employees) are approximately 13, 30 and 40 percentage points more likely to train than micro firms (with 10 or fewer employees).
Second, our findings show that these differences are robust across countries in different geographical regions and income levels and, thus, also with different institutional and economic backgrounds. Interestingly our findings show that the disadvantage of micro firms relative to large firms in offering formal training opportunities is greater for firms operating in the Middle East and North Africa and in Africa, as well as for low income countries where formal training incidence is very low. Moreover, within countries, this pattern is also robust to firms with similar patterns of investment in innovation and technology adoption, operating in the same sector of activity and located in the same city.
Third, we find robust evidence that the differences across firm sizes in the investment in job training are not fully explained by differences across small and large firms in the access to information and external finance, facility in the coordination with workers or in the degree of perceived economic uncertainty. Even though the disadvantage of SMEs across all these factors may partly explain their lower training provision across the developing world, these differences do not fully explain their gap in investment in training with respect to larger firms 3 . It is reassuring to see that the differential in the investment across firm sizes is robust to several cuts in our sample as well as to the control of many firm observed characteristics (e.g., managerial education, contract structure within the firm or perceptions of the economic uncertainty) or unobserved country-sector and city variables. The small explanatory power of these market imperfections is again a robust finding across geographical regions and countries with different levels of development.
Even though our main empirical findings survive a battery of tests, because we are exploring a cross sectional data set, we cannot ultimately rule out that the positive correlation between firm size and the investment in job training is driven by a reverse causality argument. In fact, it is very possible that on-the-job training is itself productive and impacts firm productivity and growth, ultimately leading to larger firm sizes (and not the other way around). It is thus difficult to interpret this causally. In order to do so convincingly, one would need an instrumental variable for training, or a panel data set observing over time the investments in formal on-the-job training programs and direct measures of firm productivity. Only if such data were available would it be possible to investigate whether training is productive and the extent to which it impacts firm productivity growth.
Nevertheless, it is reassuring that our results are quantitatively large and very robust across a wide set of specifications and after controlling for several different firm characteristics. And even though it is likely that part of the identified correlation is explained by the productivity impacts of training, it is unlikely that all the correlation is accounted for by this reverse causality argument. These findings are, however, important to help shape the data and analytical agenda moving forward. They highlight the importance of determining more rigorously whether or not the investment in formal training is productive in developing countries, and which types/modalities of formal training are most cost-effective and for which types of firms 4 . Possible future avenues for data collection and for research in developing countries involve collecting longer panel data sets with firm level and time series information on the human resources practices of firms (including investments in formal training programs, apprenticeships or other informal learning), together with information on direct measures of firm productivity, several firm characteristics, including technological adoption practices, and also the direct and indirect costs of these investments. For instance, Dearden et al. (2006) and Almeida and Carneiro (2009) illustrate for the UK and Portugal, respectively, how long panels can be used to estimate the impacts of job training on productivity. In addition, there is also a need for more research rigorously evaluating the impact of specific policy reforms incentivizing the investment in job training and analyzing their impacts on firm growth and productivity (see, e.g., Leuven and Oosterbeek, 2004).
It is worth stressing that we do not directly address the problem of the determinants of the firm size distribution at the country level. Instead, we take the distribution of firm size as given within countries and investigate the impact of this predetermined size structure on the investment in job training at the firm level. Nevertheless, our data show substantial differences in the distribution of firm sizes across countries. A significant part of these differences is likely to relate to differences across countries in institutions, like product and factor regulations, or in fiscal policy (e.g., Kumar et al. 1999) 5 . However, our empirical analysis conditions on country and sector heterogeneity and thus explores within country and sector variation across firm size. Moreover, we will also test the robustness of our findings when exploring variation within country, sector and city.
Our paper relates to three strands of the literature. First, we relate to the literature analyzing the patterns and determinants of the investment in job training. In spite of the importance of the topic for both individuals and firms, the systematic empirical evidence based on micro data in the developing world is still scant. Exceptions include the work by Frazer (2006), Teal (1996), Velenchik (1995), Lopez-Acevedo and Tan (2003), Rosholm et al. (2007), Pierre and Scarpetta (2006), Almeida and Aterido (2011) 6 . Some interesting patterns have been documented at the firm level for developed countries, including the advantage of large firms in this investment (e.g., Black and Lynch, 2001; Lillard and Tan, 1992; Leuven and Oosterbeek, 1999; Royalty, 1996; and Bassanini et al. 2005, for OECD countries). However, to our knowledge, no empirical work to date has documented the robustness of the differences in the intensity of offering formal training within and across developing countries with very different institutional and economic backgrounds.
Second, we relate to the literature investigating whether training is productive. To our knowledge, the literature exploring direct firm level productivity measures is quite limited, partly due to the lack of good data in developing settings. One exception is Lopez-Acevedo and Tan (2003), who explore firm panel data for Mexico between 1993 and 1999 and show that training had a large and statistically significant impact on productivity. Conducting joint training and R&D yielded larger returns than either investment alone. In the developed world, evidence has shown that even though public job training has low returns, private job training can be quite productive and have heterogeneous returns. The empirical findings overwhelmingly show a strong and positive impact of employer provided training on different measures of firm productivity, both in a cross section (e.g., Bartel, 1995) and with longitudinal panels (e.g., Dearden et al., 2006; Zwick, 2006; Almeida and Carneiro, 2009; Colombo and Stanca, 2014). More recently there is also evidence from a randomized experiment supporting positive impacts of training on firm performance (De Grip and Sauermann, 2012) 8 . Almeida and Carneiro (2009) find that rates of return are very heterogeneous but that for those firms providing some training, employer provided training is productive. To the extent that this pattern is also true in developing countries, we may expect that part of the estimated correlation between firm size and training could be driven by a productivity effect of training. However, it is unlikely that this explains all of the correlation as it is quantitatively large.
In addition, some studies have also looked at the impact of training on wages, assuming this impact is a lower bound of the impact on firm productivity. As summarized recently in Almeida and Faria (2014), the point estimates are generally positive, but magnitudes are quite diverse. Chung (2000) and Johanson and Van Adams (2004) explore cross sectional data and find evidence of large returns (between 20% and 38%) for Malaysia and Tanzania, respectively. On the other hand, Frazer (2006) finds that in Ghana, during the 90s, the returns to apprenticeship training were not statistically different from zero. Monk et al. (2008) find, in addition, some heterogeneity within country and across education levels. They show that the returns to apprenticeships are 50% for individuals with no education but decline as education rises. Rosholm et al. (2007) and Almeida and Faria (2014) both explore a matched employer and employee data set and a propensity score matching methodology. Rosholm et al. (2007) find that the wage returns to training are on average 21% for Kenya and that in Zambia training is not associated with higher wages, while Almeida and Faria (2014) show that the average wage returns to on-the-job training are 7.7% for Malaysia and 4.5% for Thailand.
Third, we relate to the empirical literature looking at the growth constraints facing SMEs in the developing world. Some papers analyze the differences between small and large firms in their growth and productivity and how these relate to differences in the general business environment (e.g., Van Biesebroeck, 2005; Ibarrarán et al. 2009). A particularly large strand of this literature looks at differences across firm sizes in the access to external finance (e.g., Beck et al. 2005). Some papers have also analyzed differences across small and large firms in other performance indicators like the investment in innovation and technological adoption (De Mel et al. 2009). While it is unquestionable that SMEs play an important role in the developing world, most analyses do not lend foundation for policies supporting SMEs (e.g., through subsidizing SMEs' investments; see Beck et al. 2008 and Ibarrarán et al. 2009) 9 . However, most micro analyses have been criticized for being country or region specific. To our knowledge, no previous work has investigated empirically the differences across small and large firms in the investment in job training.
The rest of the paper proceeds as follows. Section 2 briefly describes the data set used. Section 3 discusses alternative reasons why SMEs could be less likely to invest in job training than larger firms. Section 4.1 documents the differences across firm sizes in the intensity to train, and section 4.2 analyzes the heterogeneity of these findings across alternative samples within and across countries. Section 5 analyzes the extent to which training differences across firm size are fully explained by differences in market imperfections and institutional failures impeding SME development. Section 6 concludes and draws policy implications.

Data
Our analysis explores a large firm level data set, the Enterprise Survey, collected by the World Bank across several developing countries. For each of the 99 countries in our sample, we select the most recent wave of data available. The only exception relates to a few countries where we have included a previous wave instead of the most recent one to ensure a more comprehensive coverage of the relevant variables explored in our analysis. The final data set covers more than 48,000 firms operating in 99 developing countries and surveyed between 2002 and 2007.
The Enterprise Surveys are one of the best data sets to analyze employer provided job training across developing countries 10 . First, the surveys collect comparable information for several firm characteristics across all the countries. This comparability allows us to document cross country and within country profiles of firms offering job training. Second, the survey collects information on training intensity at the firm level as well as several other firm and workforce characteristics. These include the firm size and age, human capital composition of the workforce, measures of R&D and technology adoption and firm openness. In addition, there is also detailed information on the firm's geographical location and its sector of activity (2-digit-ISIC classification). Third, the surveys reach a substantial number of countries across all the regions of the world: 22% of the sample is in Africa, 20.7% in East Asia, 17.4% in Eastern Europe, 21% in Latin America, 9% in MENA and 9% in South Asia. This wide coverage allows us to document several cross country correlations between the investment in job training and country level indicators such as the country's level of development, institutional quality or general educational attainment. Finally, the surveys have the advantage of collecting information on training flows. This information is likely to be a more accurate measure of recent job training than in surveys attempting to measure the stock of training at the firm level (see, e.g., Bassanini et al. 2005) 11 . We measure training intensity at the firm level by computing a dummy variable that equals one when the firm reports having supplied formal training to its workers. The exact question in the survey is: "Do you offer formal training to your permanent employees?" Additional file 1 defines all the variables used in the paper.
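The construction of the training dummy described above can be illustrated with a minimal sketch. The answer coding ("yes"/"no", with anything else treated as missing) and the function names are assumptions made for illustration and are not taken from the Enterprise Survey documentation.

```python
from typing import Iterable, Optional


def training_dummy(response: Optional[str]) -> Optional[int]:
    """Map a raw survey answer to 1 (offers formal training), 0 (does not),
    or None (missing). The "yes"/"no" coding is a hypothetical assumption."""
    if response is None:
        return None
    answer = response.strip().lower()
    if answer == "yes":
        return 1
    if answer == "no":
        return 0
    return None  # refusals, "don't know", blanks


def training_incidence(responses: Iterable[Optional[str]]) -> float:
    """Share of firms offering formal training among non-missing answers."""
    dummies = [d for d in map(training_dummy, responses) if d is not None]
    if not dummies:
        raise ValueError("no non-missing training responses")
    return sum(dummies) / len(dummies)


# Hypothetical answers from six firms; two are treated as missing.
sample = ["Yes", "no", None, "YES", "don't know", "No"]
incidence = training_incidence(sample)
print(f"Training incidence among respondents: {incidence:.2f}")
```

Dropping missing answers before averaging mirrors the use of a sample with non-missing training data; it does not address any selection in who fails to respond.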
Some firms in the data report missing information on job training. This could raise some concerns regarding some biases on misreporting 12 .
It is worth discussing how firms could differ significantly in how they define a formal job training program across countries, sectors and possibly even firm sizes. While all the training events refer to formal programs supplied by the firm, they are likely to differ in content, organization and financing 13 . For example, even though all programs are provided by the firm, the training is not necessarily financed equally by the firm and their workers. In some firms, workers might be willing to support part of the cost, either directly or indirectly (e.g., through lower wages). The training might also have a more general or firm specific content, or it can differ depending on whether it is delivered by the firm itself or by a (private or public) training institute. This heterogeneity in the service provided illustrates well the complexity of comparing training incidence across countries. We assume that these differences in training content become less relevant, and eventually vanish, when comparing firms within the same country, city and sector of activity. This will be our main strategy throughout the empirical work.
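The idea of comparing firms only within the same country, city and sector can be sketched as a within-cell comparison. This is a simplified illustration with hypothetical records and field names, not the fixed-effects probit estimated in the paper: it averages the large-minus-micro difference in training incidence over cells that contain both firm types.

```python
from collections import defaultdict


def within_cell_gap(firms):
    """firms: iterable of (country, city, sector, size_class, trains) tuples,
    where trains is 0 or 1. Returns the mean, over country-city-sector cells
    containing both micro and large firms, of the difference in training
    incidence (large minus micro)."""
    cells = defaultdict(lambda: {"micro": [], "large": []})
    for country, city, sector, size, trains in firms:
        if size in ("micro", "large"):
            cells[(country, city, sector)][size].append(trains)

    gaps = []
    for groups in cells.values():
        if groups["micro"] and groups["large"]:
            large_rate = sum(groups["large"]) / len(groups["large"])
            micro_rate = sum(groups["micro"]) / len(groups["micro"])
            gaps.append(large_rate - micro_rate)
    if not gaps:
        raise ValueError("no cell contains both micro and large firms")
    return sum(gaps) / len(gaps)


# Hypothetical firm records.
firms = [
    ("A", "capital", "textiles", "micro", 0),
    ("A", "capital", "textiles", "micro", 1),
    ("A", "capital", "textiles", "large", 1),
    ("B", "small_city", "food", "micro", 0),
    ("B", "small_city", "food", "large", 1),
    ("B", "small_city", "food", "large", 0),
    ("B", "capital", "food", "large", 1),  # no micro firm in this cell: ignored
]
gap = within_cell_gap(firms)
print(f"Average within-cell gap: {gap:.2f}")
```

Cells without both firm types contribute nothing, which is the sense in which identification comes only from variation inside a country, city and sector.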
The final sample covers more than 48,000 firms with non-missing training data across all the regions of the world. The final sample covers both manufacturing (78%) and non-manufacturing (22%) sectors. Within manufacturing, the sample covers: Food and Beverages (17%), Chemicals and Plastics (14%), Electronics (8%), Textiles (10%), Garments and Leather (20%), Metals and Machinery (19%), Paper, Wood and Furniture (10%) and Other (3%). On average, 39% of the firms in the sample report offering job training to their employees. However, Figure 1 illustrates well the wide dispersion across regions of the world and countries. The dispersion in this figure is striking. Countries in South Asia, the Middle East and in Africa are among the lowest providers of on-the-job training to their workers. In particular, only 11% and 15% of the firms in Pakistan and Senegal report offering training, respectively. In contrast, in Eastern Europe, East Asia and Latin America, larger shares of firms train. At least 70% of the firms in Slovakia, Chile and Thailand offer job training programs to their workforce. Nevertheless, one must be cautious when comparing these cross country incidences as they may be affected by measurement error. Table 1 reports the summary statistics for the main variables used in the paper. On average, 30% of firms are micro firms (up to 10 permanent employees), 37% are small firms (between 11 and 50 permanent employees), 19% are medium firms (between 50 and 250 permanent employees) and 7.7% are large firms (more than 250 permanent employees). The average firm in the sample is 16.5 years old and has a 51% probability of being located in the capital or in a large city. On average, workers have 7 years of schooling (equivalent to incomplete secondary). In addition, 24.5% of the firms in the sample export, 11.8% have at least 10% foreign capital and 54% have recently adopted new technology.
Finally, most of the firms in the sample are in the manufacturing sector. Among these, only 23% of the firms operate in high technology sectors like Electronics, Chemicals and Pharmacy, Auto Equipment and Machinery. The remaining 77% operate in low technology sectors like Textiles, Garments, Agro-industry, Wood and Furniture, and Plastics. Table 2 reports interesting correlations across firm profiles and training incidence. We divide the sample into training and non-training firms and summarize the main variables of interest. The evidence supports the common view that training firms tend to be larger, more open and older than non-training firms. They also operate in more capital-intensive sectors, have higher labor productivity and pay higher wages. Furthermore, they have a more skilled workforce and invest more in technology than non-training firms.
The differential in job training by firm size: modeling the determinants
We consider next a simple empirical model to document the size differential in the provision of formal training programs. Assume that firm i in industry j and in country c decides to train its workers if the present value of expected profits from this investment (future benefits minus costs) is positive:
$$ Train_{ijc} = \begin{cases} 1 & \text{if } \pi^{*}_{ijc} > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (1) $$
Even though we cannot observe the expected profit $\pi^{*}_{ijc}$ (a latent variable) in our data, we do observe whether the firm offers job training to its employees. We assume that the firm's profit from investing in job training is a linear function of firm size, of other observable firm characteristics $X_{ijc}$ and of its country-sector of activity $\mu_{cj}$:
$$ \pi^{*}_{ijc} = \beta\, Size_{ijc} + \gamma\, X_{ijc} + \mu_{cj} + \varepsilon_{ijc} \qquad (2) $$
where $\gamma$ denotes the coefficients on $X_{ijc}$ and $\varepsilon_{ijc}$ captures the firm unobservable characteristics correlated with the return to investing in job training. Given this latent functional form, the probability that firm i offers job training ($Train_{ijc} = 1$) is given by:
$$ \Pr(Train_{ijc} = 1) = \Pr(\beta\, Size_{ijc} + \gamma\, X_{ijc} + \mu_{cj} + \varepsilon_{ijc} > 0) \qquad (3) $$
Equation (2) can be estimated by maximum likelihood assuming that the error term is normally distributed (probit model). Variable $Size_{ijc}$ includes four firm size dummies: micro (up to 10 employees), small (11-50 employees), medium (51-250 employees) and large (more than 250 employees) 14 . In $X_{ijc}$ we include proxy measures for the degree of firm openness (dummy variable when the firm exports more than 10% of total sales or has more than 10% foreign owned capital), public capital ownership, intensity of technology adoption and for the human capital of the workforce. $\mu_{cj}$ are the country and 3-digit ISIC interaction sector dummy variables 15 . Standard errors are clustered at the country and sector level to capture any auto-correlation of the residuals across firms within countries and sectors.
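The latent-profit probit described above can be sketched in a few lines. Everything numeric here is hypothetical: the coefficient values, the single control (an exporter dummy) and the country-sector effect are illustrative assumptions, not estimates from the paper, and the sketch only evaluates the probability and log-likelihood rather than maximizing it.

```python
import math


def size_class(permanent_employees: int) -> str:
    """Size classes as defined in the text: micro (up to 10), small (11-50),
    medium (51-250), large (more than 250)."""
    if permanent_employees <= 10:
        return "micro"
    if permanent_employees <= 50:
        return "small"
    if permanent_employees <= 250:
        return "medium"
    return "large"


def normal_cdf(z: float) -> float:
    """Standard normal CDF, the link function of a probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def train_probability(employees, x, beta_size, gamma, mu_cj):
    """Pr(firm trains) = Phi(size effect + gamma'x + country-sector effect)."""
    index = beta_size[size_class(employees)]
    index += sum(g * v for g, v in zip(gamma, x))
    index += mu_cj
    return normal_cdf(index)


def log_likelihood(firms, beta_size, gamma, mu_cj):
    """Probit log-likelihood for (employees, x, trains) records."""
    total = 0.0
    for employees, x, trains in firms:
        p = train_probability(employees, x, beta_size, gamma, mu_cj)
        total += math.log(p) if trains else math.log(1.0 - p)
    return total


# Hypothetical parameters: micro firms are the omitted category (0.0).
beta_size = {"micro": 0.0, "small": 0.4, "medium": 0.9, "large": 1.3}
gamma = [0.3]   # coefficient on a hypothetical exporter dummy
mu_cj = -0.8    # hypothetical country-sector effect

firms = [(5, [0], 0), (30, [1], 0), (120, [1], 1), (400, [0], 1)]
for employees, x, _ in firms:
    p = train_probability(employees, x, beta_size, gamma, mu_cj)
    print(f"{size_class(employees):>6}: Pr(train) = {p:.3f}")
print(f"log-likelihood = {log_likelihood(firms, beta_size, gamma, mu_cj):.3f}")
```

In practice the parameters would be chosen to maximize this log-likelihood, typically with a statistics package that also handles the fixed effects and clustered standard errors.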
The main coefficients of interest in equation (3) are the β's for each of the size dummies. In the empirical work, the omitted category will always be micro firms; therefore, the β reports the percentage point difference in the training incidence for small, medium and large firms relative to micro firms. It is worth highlighting that we always control for country and sector fixed effects. Accounting for country fixed effects is important since countries differ in the strength of their institutions and investment climate, and this is likely to simultaneously influence the incentives of firms to provide on-the-job training as well as the firm size distribution in the economy. For example, we expect countries with regulatory institutions favoring larger firms to have a size distribution that is more skewed towards bigger firms, which in turn might offer more training. Furthermore, allowing for country and 3-digit sector fixed effects accounts for differences across firms (within countries) in the capital intensity of their technology, which will also simultaneously affect training intensity and the size distribution in the economy. Therefore, we are confident that our results are not driven simply by the sector composition within countries or by the cross country variation in institutions. Rather, with our reduced form, our findings will be driven by comparing differences across firms in the training intensity within the same country and detailed (3-digit ISIC) sector of activity. This decision reflects the fact that both firm size and the intensity to train (as well as the several dimensions of the business environment that we will explore in the next section) have substantial variation within countries and sectors. Moreover, controlling for a set of interaction dummies by country and sector also accounts for potential omitted variables and possible measurement errors across countries.
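Because a probit coefficient is measured on the latent index scale, a percentage point difference relative to micro firms is usually obtained as a discrete change in the predicted probability. The sketch below shows one common way to compute it, averaging the change over firms; the baseline index values and the coefficient are hypothetical, and this is not necessarily the exact procedure behind the reported estimates.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def size_gap_pp(beta: float, baseline_indices) -> float:
    """Average discrete change in Pr(train), in percentage points, from
    switching a firm from the omitted micro category to a size class with
    index coefficient beta, holding the rest of its index fixed."""
    gaps = [normal_cdf(idx + beta) - normal_cdf(idx) for idx in baseline_indices]
    return 100.0 * sum(gaps) / len(gaps)


# Hypothetical baseline indices (controls plus country-sector effects, with the
# size dummies switched off) and a hypothetical coefficient for large firms.
baseline_indices = [-1.2, -0.8, -0.5, 0.0]
gap_large = size_gap_pp(1.3, baseline_indices)
print(f"Large vs. micro: {gap_large:.1f} percentage points")
```

The same index coefficient translates into different percentage point gaps depending on where the baseline indices sit, which is why the gap is averaged over firms rather than read directly off β.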
We stress that we do not attempt to claim any causality between firm size and the likelihood of investing in job training. The correlation may be driven by the fact that larger firms have larger benefits or lower costs of investing in training. In spite of the large number of firm characteristics included in our reduced form, it is impossible to fully disentangle correlation from causality in our cross sectional data and in the absence of an instrumental variable. It is possible that, if training is productive, it is itself leading to firm growth and to larger firm sizes. Indeed, some of the empirical evidence in developing country contexts summarized in the introductory section of the paper is consistent with this idea. In the next sections we, therefore, attempt to estimate β after controlling for different observable and unobservable characteristics and for different samples but never attempt to fully disentangle correlation from causality.
Table 3 reports the point estimates for the different variables in equation (2). The specifications across different columns differ in the controls included. In column (1), we just control for firm size and country dummies, and in column (2), we add a proxy for the degree of firm openness (dummy variable for whether firm exports and has some foreign owned capital), age of the firm, public ownership and share of skilled workers in the workforce. Column (3) includes, in addition to the variables in column (2), a proxy for technological innovation, and column (4) includes country-city fixed effects 16 .

Main results
The findings in column (1) show that there is a statistically significant and quantitatively important positive correlation between the investment in job training and firm size. In particular, small, medium and large firms are 16.1, 36.7 and 49.0 percentage points more likely to invest in job training than micro firms, respectively. These differences are slightly reduced, although they remain quantitatively important and statistically significant when we include additional control variables, like firm observable characteristics in columns (2) and (3) and country-city and sector unobservable characteristics in column (4).
One reason why larger firms could be more likely to train their workers in column (1) could relate to their larger integration into global markets (both through exports and FDI) or to differences in the shares of capital publicly owned, age of the firm, or in the education of the workforce. Moreover, larger firms are also more likely to invest in new technology than smaller firms and this could also lead them to invest more in job training. The findings in columns (2) and (3) of Table 3 show that the investment in job training is complementary both to the skills of the workforce and to the degree of firm openness and technological innovation. Most interestingly, the magnitude and statistical significance of the size-job training premium remain important after controlling for differences across firms in these characteristics.
Table 3 notes. Source: Author's calculations are based on the Enterprise Surveys (World Bank). Note: *significant at 10%; **significant at 5%; ***significant at 1%. Dependent variable is a dummy variable that equals 1 when the firm reports providing on-the-job training. Standard errors are clustered at the country and sector level. Column (1) includes country fixed effects, column (2) includes country and sector interactions and column (4) includes country-sector and city interactions. We consider 3 city categories: capital cities and those with a population above one million, and cities below 1 million people. Column (1) includes size dummies; column (2) includes in addition to the variables in column (1), firm openness, log age of the firm, the share of capital owned by public sources and the share of skilled workers in the firm. Column (3) adds a technological innovation dummy, and column (4) replicates column (3) but controls for country-city-sector dummies. We refer to the specification in column (3) of this table as the baseline specification. All the variables are defined in Additional file 3.
Large firms could also systematically differ in their intensity to train due to their geographical location. We conjecture that firms located in the capital city or in very large cities (with more than 1 million inhabitants) could benefit from more developed institutions, better supply of training, or better access to information than firms located in smaller cities (even within the same country). To the extent that small firms disproportionately locate outside the capital city, as we actually see in our sample, location could be partly driving the differences across small and large firms in their training intensity. The findings in column (4) of Table 3 control for these within country differences. Reassuringly, the point estimates in column (4) are almost unchanged from those in column (3). Therefore, we will take the specification in column (3) as our baseline specification.
Additional file 2 tests the robustness of the findings reported in Table 3 by exploring alternative proxies for the stock of human capital of the workforce and for the degree of innovation and technological adoption. In column (1) we add to the baseline specification reported in column (3) of Table 3 the mean years of schooling of the workforce, in column (2) we add the share of investment in R&D as a share of total sales, in column (3) we add whether the firm has an International Organization for Standardization (ISO) certification and in column (4) we add the firm capacity utilization. Reassuringly, the point estimates associated with the different firm sizes remain quantitatively similar and statistically significant.
It is interesting to discuss the signs of the estimates for the control variables in Table 3. First, the findings suggest a strong complementarity between the investment in job training of the workforce and the stock of human capital of the workforce, measured by the share of skilled workers in the firm. This finding is supportive of the idea that the more educated the workforce in the firm is, the higher the return to this investment, regardless of firm size. This is fully aligned with the empirical evidence exploring household level surveys, where more educated workers have higher returns and are more likely to receive job training than less educated workers (see, e.g., Lopez-Acevedo and Tan 2003; Johanson and Van Adams 2004). Second, the findings also suggest a strong complementarity between the investment in job training and the degree of firm openness on the one hand and between the investment in job training and technological adoption on the other hand 17 . Third, we do not find strong support for the view that, all else constant, older and publicly owned firms are more likely to invest in job training.
Table 4 tests whether the positive correlation across firm size and the investment in job training is driven by other factors related to the firm's geographical location and the sector of activity. First, we test whether within each country, the positive correlation across firm size and the investment in job training could be driven by the fact that larger firms disproportionately locate in the country capital or in other very large cities 18 . We conjecture that firms located in the capital or in large cities could have better institutions, better access to information on the quality of training, or possibly face lower training costs due to the proximity to training centers. Therefore, larger firms could be, all else constant, more likely to invest in on-the-job training simply because of their geographical location.
Column (1) of Table 4 reports the estimates for the baseline specification when restricting the sample only to firms located outside the capital city. Reassuringly, the point estimates remain positive and statistically stronger than the ones reported in column (3) of Table 3.
We investigate next whether the positive correlation between firm size and the investment in job training could be driven by the sector of activity. This is plausible since small and large firms are likely to differ in the technology they use, which in turn is also likely to determine the demand for investing in job training. Even though in the baseline specification reported in column (3) of Table 3 we already control for differences across firms in the sector of activity (with country-sector dummies), it is possible that the returns to the investment differ by sector of activity. If the latter holds, we would expect the results to be different across samples with differing technologies, levels of technological adoption and/or capital/labor ratios. Moreover, since in our sample micro and small firms are more concentrated in non-manufacturing sectors, this sector composition could also be driving the results 19 . Column (2) reports the findings for the baseline specification when restricting the sample only to manufacturing firms, which account for 73% of the sample. Reassuringly, the magnitude and significance of the point estimates are almost unaffected. In columns (3) and (4), we also analyze whether our findings are systematically different across low-tech and high-tech sectors, respectively. In our sample, the low-tech manufacturing sectors have a larger share of smaller firms than the high-tech sectors, probably due to the large fixed costs of starting up high-tech activities. Reassuringly, restricting the sample to each of these two groups still yields quantitatively large and statistically significant differences in the intensity to train across firm sizes in the two manufacturing sectors. However, the point estimates in columns (3) and (4) show that the differences across firm sizes are slightly larger for firms operating in the low-tech sectors than for those firms in the high-tech sectors.
Finally, one could argue that the differences across firms in the technology used are still not adequately captured by the sector composition of firms. In particular, there is robust evidence of wide dispersion in productivity even within narrowly defined sectors (e.g., Eslava et al. 2004; Foster et al. 2008), so that the returns to the investment in job training could be more directly related to the stock of physical capital. We test the robustness of our baseline specification to controlling for differences across firms in the capital intensity of their technology (captured by the capital labor ratio). The findings, reported in column (5) of table A4, again show that the main coefficients of interest remain robust 20 . Also interestingly, our findings do not differ significantly if we break the sample by the age cohort of firms (not reported but available upon request). The latter suggests that the differences across small and large firms in the intensity to train are also not explained by the fact that larger firms tend to be, on average, older.

Heterogeneity around the developing world
In this section, we discuss whether the positive correlation across firm size and the investment in job training holds across different geographical regions of the world and income levels. We discussed in section 2 that there is a large heterogeneity in the intensity to provide job training across countries and regions of the world. Figure 1 has shown that Africa and South Asia tend to have a lower share of firms offering job training than other regions. In particular, a firm in Thailand or in Brazil is, on average, more than four times more likely to offer job training than a firm in Mozambique or in Gambia. Table 3 showed that some of this variation is explained by differences in the observable characteristics of firms across countries. However, there is still a large unexplained variation related to the firm's geographical location 21 . Table 5 reports the baseline specification when restricting the sample to firms operating in Africa in column (1), in East Asia in column (2), in Eastern Central Europe in column (3), in Latin America in column (4), in the Middle East and North Africa in column (5) and in South Asia in column (6). Firms operating within each of these broad regions of the world arguably have a more similar institutional environment, including cultural and socio-economic characteristics. In particular, we are interested in understanding how the results could be driven by the specific training policy for the manufacturing sector in Africa. The importance of apprenticeships, particularly in West Africa, has been well documented and is predominant among smaller firms (see e.g. Frazer, 2006; Bas, 1989; or Velenchik, 1995) 22 . Even though we believe that this type of more informal training scheme is outside the scope of our data, it is important to test the extent to which results could be driven by these institutional differences across regions 23 .
Again, our point estimates show quantitatively important differences across all regions of the world. Two facts are worth highlighting, though. First, differences in the intensity to train in Africa and in the Middle East and North Africa (MENA henceforth) are slightly larger than those in our baseline specification. In particular, large firms are 47 percentage points and 45.7 percentage points more likely to train than micro firms (omitted category) in Africa and MENA, respectively. These are regions where training intensity is, according to our sample, among the lowest in the world. Second, East Asia is the region of the world with the smallest dispersion across firm sizes.
There, small, medium and large firms are 6.8, 23.6 and 32.1 percentage points more likely to train than micro firms, respectively.
Finally, Table 6 analyzes the heterogeneity of the main findings by the country's income level. Column (1) reports the baseline specification for firms operating in low income countries, column (2) for firms in middle-low and column (3) in middle-high income countries 24 . Interestingly, the point estimates still show statistically significant and quantitatively important differences in the intensity to train by firm size across all income levels. However, when comparing the results to the baseline specification reported in Table 3 for all income groups, we find that the dispersion in training associated with larger firms is greater for low income countries, and the one associated with small and medium sized firms is greater for middle-high income countries.
Our findings in Tables 3, 4 and 5 show that large firms provide more on-the-job training than smaller firms around the developing world and that quantitatively the differences are important. The findings also show that these differences in training intensity across firm sizes are not fully explained by differences across firms in the average human capital of their workforce, the degree of firm openness and technological adoption. Moreover, they are also prevalent across all regions of the world and income levels, although differences tend to be larger for firms operating in low-tech sectors, in regions with overall lower incidences and in low income countries.
Table 5 The size-job training differential: Heterogeneity around the developing world
Source: Author's calculations are based on the Enterprise Surveys (World Bank). Note: *significant at 10%; **significant at 5%; ***significant at 1%. Dependent variable is a dummy variable that equals 1 when the firm reports providing on-the-job training. Standard errors are clustered at the country and sector level. Columns (1) through (6) report the results of estimating the baseline specification (column 3 of Table 3) for different regional samples. Column (1) includes only firms in Africa, column (2) includes firms in East Asia, column (3) includes firms in Eastern Europe, column (4) includes firms in Latin America, column (5) includes firms in the Middle East and North Africa and column (6) includes firms in South Asia. All the variables are defined in Additional file 3.
Source: Author's calculations are based on the Enterprise Surveys (World Bank). Note: *significant at 10%; **significant at 5%; ***significant at 1%. Dependent variable is a dummy variable that equals 1 when the firm reports providing on-the-job training. Standard errors are clustered at the country and sector level. Columns (1) through (3) report the results of estimating the baseline specification (in column (3) of Table 3) for different samples. Column (1) includes only firms in low income countries, column (2) includes only firms in low-middle income countries and column (3) includes only firms in high-middle income countries. All the variables are defined in Additional file 3.
One point is worth highlighting. Even though we estimate a strong and positive differential in the intensity to train across firm sizes, we are not able to ultimately disentangle correlation from causality. In particular, it is possible that the positive coefficients are driven by reverse causality where firm size is a leading indicator, not a causal indicator, of the firm's investment in job training. It is reassuring to see, however, that the differential in the investment in job training is robust across several cuts in our sample, as well as to the control of several observable and unobservable characteristics, like the degree of firm openness and technological adoption, the skills of the workforce or country-sector unobservable characteristics.

Robustness checks
In this section, we present some robustness checks to help us further validate the strong correlation found between firm size and the investment in job training. Specifically, we consider some of the main factors that could prevent SMEs from engaging in this investment and investigate whether the correlation holds after trying to account for them. We test the robustness of our findings when controlling for different factors that are likely linked to alternative market imperfections impeding this investment.
First, we conjecture that SMEs may invest less than larger firms in training because they have reduced access to information on available training programs and their quality. This could be driven by SMEs having a more remote location relative to the city center than larger firms or by lower human capital of their management. The managerial education may also be correlated with the set of information available to the firm. Second, we conjecture that differences in training intensity across firm sizes could be related to differences across firms in how easy the coordination of this investment is with their workers or in the time during which firms can recover this investment on their average worker. We proxy the flexibility in the contractual arrangements with the share of the workforce with temporary contracts and the difficulties in coordination across workers and firms with the share of unionized workforce. Third, we investigate whether the fact that smaller firms could have reduced access to external finance relative to larger firms, or their smaller reinvestment of profits, could be driving the results. To look at this, we explore differences across firms in the share of reinvested profits and the access to external finance. Finally, we investigate whether SMEs are less likely to invest in job training due to the greater uncertainty on the quality of the investment and thus on their returns. If uncertainty is a problem hitting particularly SMEs, then they could be less likely to invest in spite of the potentially large ex-post returns. To account for differences across firms in economic uncertainty, we explore information on how managers rank the importance of the uncertainty in the economic and regulatory policy relative to other obstacles 25 . Our prior is that firms reporting that economic uncertainty is more of an obstacle than other factors will be less likely to invest in job training.
And, because firms differ significantly in their optimism or pessimism, we control for the differences across firms in the perception about how different indicators constrain growth (including perceptions regarding the economic and regulatory policy uncertainty, macroeconomic instability, corruption, crime, theft and disorder, anti-competitive or informal practices, legal system/conflict resolution).
The findings of all these robustness checks are reported in columns (1) through (6) of Table 7. They show that, after including proxies for all these reasons across all specifications, the size differentials associated with job training remain positive, quantitatively important and statistically significant. Moreover, with only one exception, the magnitude of the point estimates remains very similar to our baseline specification (column 3 of Table 3) 26 .

Conclusion
In a modern economy, the investment in human capital by firms is crucial to foster technological adoption and ultimately achieve higher productivity growth. This paper analyzes the correlation between firm size and the investment in job training by employers. Using a large cross sectional firm level data set across 99 developing countries, we show that a strong and positive correlation in the investment in job training and firm size is a robust statistical finding both within and across countries with very different institutions and levels of development. We also show that the size-training gap is not fully explained by differences across firms in market imperfections or institutional failures impeding SME development.
It is thus a very robust finding that in the developing world, larger firms undertake more formal training. Unfortunately, our data set does not allow us to fully disentangle correlation from causality. If human capital in the form of training is driving productivity, then it seems inevitable that it is driving size as well. Without convincing instruments or better panel data sets capturing the investments in training and direct measures of productivity, one cannot disentangle how much training impacts size and how much firm size, presumably in the form of lower average costs, increases the job training. Therefore, our analysis calls for the importance of collecting more longitudinal data sets to better understand how cost-effective on-the-job training programs are in fostering firm productivity and growth in developing countries.

Endnotes
forms. First, the technical and vocational training; second, the apprenticeship system, which is mostly informal and has more relevance in Western Africa; and third, any other formal manufacturing sector training still carried out within firms. In spite of the importance of all these strategies for skills development, there is little research in each of these topics. This paper contributes to this literature by focusing on the third channel.
3 Promoting the access to information and finance, improving the coordination across firms and workers and mitigating economic uncertainty are likely to incentivize the investment in job training, regardless of the firm size. Moreover, institutional and policy reforms fostering the integration of firms in the global markets and leading to more technological innovations (e.g., through lower regulations on firm entry/exit) will also foster the investment in job training. The fact that this type of intervention does not overwhelmingly benefit SMEs relative to larger firms has been supported by others (e.g., Ibarrarán et al. 2009).
Source: Author's calculations are based on the Enterprise Surveys (World Bank). Note: *significant at 10%; **significant at 5%; ***significant at 1%. Dependent variable is a dummy variable that equals 1 when the firm reports providing on-the-job training. Standard errors are clustered at the country level. Column (1) reports the baseline specification as in column (3) of Table 3. Columns (2) through (6) report different robustness checks based on alternative firm characteristics. In addition to column (1), we have controls for the managerial tertiary education (column 2), share of temporary contracts (column 3), share of unionized workers (column 4), share of profits reinvested (column 5), access to external finance (column 6) and uncertainty on the regulatory environment (column 7). All the variables are defined in Additional file 3.
4 As we summarize later in this section, some of the more recent empirical work shows that the investment in job training is positively correlated with direct measures of firm productivity and with average wages at the firm level (see Faria, 2014 or Almeida and Cho, 2012 for recent comprehensive reviews).
5 In developing countries, several regulations tend to favor the existence of micro or small firms because they apply only to firms above a certain employment threshold. Kumar et al. (1999) explain differences in firm size across countries with the role institutions, such as the judicial and the financial systems, play.
6 Middleton et al. (1993)  ing program to a selected group of workers and finds that participation in training yields a 10 percent increase in performance.
9 There is a large debate around SME based policies. The micro evidence does not provide much support for the view that SMEs have a greater effect on productivity, employment and growth than large firms. In particular, the bulk of the firm-level evidence does not support the contention that SMEs are particularly effective job creators (Rosenzweig, 1988; Little et al. 1987). Furthermore, research also does not universally support the claim that SMEs particularly foster innovation (Pagano and Schivardi, 2003; Pack and Westphal, 1986). Finally, while some firm-level studies find that SMEs intensify competition, there is little direct evidence on the positive effects of SME policy on productivity growth. One reason could relate to firm size not being an exogenous determinant of growth, so that SME policies could distort firm size and hurt economic efficiency (Kumar et al. 1999; Caves and Pugel 1980). Alternatively, policies promoting a sound business environment (through low entry/exit barriers, well defined property rights, and effective contract enforcement) could be more conducive to market competition, but the benefits are not restricted to SMEs.
10 The Enterprise Surveys are one of the most comprehensive sources of comparable firm level data in the developing world (see Aterido et al. 2011; Ibarraran et al. 2009). They have been used to study closely related topics (Pierre and Scarpetta, 2006; Almeida and Aterido, 2011; Almeida and Fernandes, 2008; and Rosholm et al. 2007). A previous version of this survey was used by Frazer (2006) and Teal (1996) for Ghana and by Kahyarara and Teal (2008) for Tanzania. They have also been used to link quality of business environment and firm growth (e.g., Dollar et al. 2005; Aterido et al. 2011).
11 The Enterprise Surveys also collect information on the extensive training margin for some countries, including the percentage of skilled and unskilled workers trained, and training hours. We have also found robust size training premiums for the share of both skilled and unskilled workers (not reported but available on request).
12 Additional file 3 presents a simple model documenting whether the firms with missing data on the question on whether they provide job training differ systematically from other firms. The findings show that firms with missing data are, on average, smaller, younger and have some share of public ownership. Firms are also located in the capital city and have a more skilled workforce than firms reporting non-missing information on job training. Firms operating in non-manufacturing sectors (like retail or services) are also more likely to report missing data than firms in manufacturing. Even though this could be a source of concern, we have analyzed our main findings assuming two extreme scenarios: first, assuming all the missing reporting is associated with no investments in job training; second, assuming all the missing reporting is associated with investing in job training. Reassuringly, the point estimates in these two scenarios would remain qualitatively and quantitatively very similar to our main results (Tables 3 through 5). Under the first scenario, the point estimates for the baseline specification (Table 3, column 3) would become 11.5, 29.2 and 38.2 for small, medium and large sized firms, while in the second scenario, the point estimates would become 15.3, 33.8 and 43.1 percentage points, respectively.
13 Since we explore a firm level survey, we assume this question relates to the acquisition of skills required for work. However, the characteristics of the training supplied by each firm are likely to differ. First, training could take place inside or outside the normal working period. Second, it can be formally or informally accredited. Third, it can be offered within or outside the firm. From the precise question being asked, the type of training being offered is unclear. However, since the survey covers mostly formal sector firms and the question refers to formal training, we assume that the survey captures mostly formal training episodes taking place outside of the working period. This stands in contrast to informal training, which tends to take place during the working period and usually does not result in a specific qualification. The latter is more difficult to measure. Nevertheless, and independently of the training type, we do expect the training to jointly benefit firms and workers through higher productivity and/or wages. Otherwise, one of the parties would not be willing to engage in it.
14 The thresholds for firm size dummies are somewhat arbitrary and often differ across countries and studies. Our definition follows Aterido et al. (2011) and differentiates small firms from micro firms. It is reassuring to see that our results go through with alternative definitions.
15 We have a total of 1,089 dummy variables for the 99 countries and eleven three-digit ISIC sectors of activity (including Food and Agro-industry, Textiles, Garments and Leather, Metal and Machinery, Electronics, Chemicals and Plastics, Construction, Retailing, Services, Wood, Furniture and Paper and Other industries).
16 The Enterprise Surveys report information on whether the firm is located in the capital city or in cities with more than 1 million workers. The country-city fixed effects included in column (4) of Table 3 are the interaction of this city dummy with the country dummies. This specification also controls for sector 3 digit fixed effects.
17 It is also interesting to note that, in column (3) of Table 3, the positive correlation between the investment in job training on the one hand and the degree of firm openness on the other hand is not solely explained by differences across firms in their technological innovation. Notice that these point estimates are still large and statistically significant in columns (3) and (4) after controlling for differences across firms in the degree of technological adoption.
18 In particular, in our sample, 40% of the micro and 45% of the small firms are located in the capital or in another very large city, while 50% of the large firms are located there.
19 We also worry more about the possible non-representativeness of the Enterprise Surveys in the non-manufacturing sectors than in the manufacturing sectors. Moreover, in most countries, the Enterprise Surveys do not cover the non-manufacturing sectors as exhaustively.
20 The robustness of our results reported in table A4 in the appendix is reassuring, but some of the specifications have significantly fewer observations than our baseline specification because some measures have not been consistently collected across all countries.
21 Running a regression of training intensity on firm size, not accounting for country fixed effects, yields an R-squared of 0.09 and larger point estimates than the ones reported in column (1) of Table 3. In particular, we find that small, medium and large firms are 17.3, 38.3 and 48.4 percentage points more likely to train than micro firms. Thus, differences across firms in their geographical location explain approximately half of the observed variation in training intensity across firm size.
22 Frazer (2006) describes that apprenticeships are periods of approximately three years during which an apprentice learns a trade. At the end of this period, the apprentice may be hired by the firm where the apprenticeship took place, or start a job elsewhere (including self-employment). Apprenticeships occur most often, but not exclusively, in smaller firms, and the master is often the owner of the firm.
23 Apprenticeships are often criticized for offering informal training methods and lacking theoretical foundations that are needed in the complex technical demands of the formal sector. The main advantages are that they are not government funded (with cost being borne by the apprentice and/or firm) and that they provide an option for youth who would otherwise be unemployed.
24 The group of middle-high income countries includes: Botswana, Chile, Costa Rica, Croatia, Czech Republic, Estonia, Hungary, Latvia, Lebanon, Lithuania, Malaysia, Mauritius, Oman, Poland, Russia, Slovakia and South Africa.
25 The exact question in the Enterprise Surveys is "Please tell us if any of the following issues are a problem for the operation and growth of your business. If an issue poses a problem, please judge its severity as an obstacle on a four-point scale: Telecommunications, Electricity, Transportation, Access to Land, Tax rates, Tax administration, Customs and Trade Regulations, Labor Regulations, Skills and Education of Available Workers, Business Licensing and Operating Permits, Access to Financing, Cost of Financing, Economic and Regulatory Policy Uncertainty, Macroeconomic Instability, Corruption, Crime, Theft and Disorder, Anti-competitive or Informal Practices, Legal System/Conflict Resolution."
26 The smaller point estimates in column (2) are in part explained by the lack of data on the managers' education for several countries. Replicating our baseline specification in column (1) but for the sample of countries in column (2) would yield coefficients for small, medium and large firms of 10.3, 30.6 and 38.3, respectively.
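The two extreme scenarios for missing training responses described in endnote 12 can be illustrated with a minimal sketch. It assumes raw differences in training shares rather than the full baseline regression, and the toy responses and function names (`share_trained`, `premium_bounds`) are hypothetical, not drawn from the Enterprise Surveys.

```python
def share_trained(responses, fill):
    """Share of firms training when every missing answer (None) is set to `fill`."""
    filled = [fill if r is None else r for r in responses]
    return sum(filled) / len(filled)


def premium_bounds(base, group):
    """Size premium in percentage points under the two extreme imputations.

    Returns (premium if all missing firms do NOT train,
             premium if all missing firms DO train).
    """
    return tuple(
        100 * (share_trained(group, fill) - share_trained(base, fill))
        for fill in (0, 1)
    )


# Hypothetical responses: 1 = trains, 0 = does not train, None = missing.
micro = [1, 0, 0, 0, None, 0, 0, 1, None, 0]
large = [1, 1, 0, 1, 1, None, 1, 0, 1, 1]

no_training, all_training = premium_bounds(micro, large)
print(f"Missing treated as no training: {no_training:.1f} percentage points")
print(f"Missing treated as training:    {all_training:.1f} percentage points")
```

In this toy example, micro firms have more missing responses than large firms, so treating missing answers as training narrows the measured gap. The sign and size of that shift depend on how missing answers are spread across size classes.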