FACTOR ANALYSIS OF THE CO2 EMISSIONS IN THE UNITED STATES WITH IMPLICATIONS FOR THE ENVIRONMENTAL POLICY

This paper applies the Generalized Divisia Index to decompose the CO2 emissions into eight components and uses Factor Analysis to determine their clusters: the combinations of related components that play the leading role. Economic analysis of these clusters allows for the determination of the main drivers of the CO2 emissions. As a case study, we used the data of the United States from 1950 through 2040 separated into three periods: 1950-1980, 1981-2012, and 2013-2040, each characterized by a specific type of socioeconomic development: industrial, post-industrial, and information, respectively. Data for the last period are projections. As a result, we gained an insight into the typology of the CO2 emissions and obtained recommendations on environmental policy aimed at their mitigation.


INTRODUCTION
As the report of the Intergovernmental Panel on Climate Change, IPCC (2014), states, human intervention aimed at reducing the sources of greenhouse gases (GHG) is strongly needed to prevent global warming and climate change. As this report also mentions, the opportunities for the mitigation of the CO2 emissions are limited due to the existence of economic, societal, and cultural differences among countries. The GHG mitigation policy should be consistent with sustainable development, equity, value judgements, and ethical considerations. Also, all individual agents should be prepared to forego some of their own interests. Finally, the climate policy is subject to the ability of individuals, organizations, and countries to perceive related risks and uncertainties.
According to the same report, CO2 emissions from fossil fuel combustion and industrial processes contributed about 78% of the total increase in GHG emissions from 1970 to 2010. Between 2000 and 2010, the increase in anthropogenic GHG emissions came from the energy supply (47%), industry (30%), transport (11%), and buildings (3%) sectors. Among the most important drivers of the increase in GHG emissions were economic development and population growth. Without relevant mitigation policies, this trend may begin to threaten human wellbeing and even existence.
The instruments available to policymakers differ among societies at different stages of economic development because of the differences in the main sources of economic growth. In this paper, we distinguish among three main types of societies: industrial, post-industrial, and information. In the industrial society, most of the gross domestic product is generated in the industrial sectors of the economy. When the service sectors collectively take the lead, the society is referred to as post-industrial, Bell (1973). Finally, when the creation, use, and manipulation of information become the main factor of international competitive advantage, the society reaches the information stage, Beniger (1986). This paper covers the period of 1950 through 2040, during which the United States passed the first two stages and is expected to enter the third and highest stage, the information society. We demonstrate how the CO2 factor structure changes depending on the level of economic development. The investigation of this change is important in view of the practical significance of the mitigation of the CO2 emissions. The main objective is to help determine the driving forces behind the CO2 emissions and to suggest changes in environmental policies that are relevant to the level of economic development.
The tools of investigation used in this paper are the Generalized Divisia Index Method (GDIM), Vaninsky (2014), and Factor Analysis, Thompson (2004). The paper is organized as follows. Section 2 presents the mathematical model, Section 3 presents the results and their discussion, and Section 4 provides concluding remarks and outlines possible next steps.

MATHEMATICAL MODEL
In this section, we follow Vaninsky (2014) in the description of the mathematical means used in this paper. The basic tool is the factorial decomposition of the CO2 emissions by factors as suggested by the Kaya identity, Kaya (1990), extended to include the interconnected factors, Vaninsky (1983, 2014). Methodologically, the Kaya identity may be traced to the seminal publications of Laspeyres (1871) and Paasche (1874). These publications suggested the additive decomposition of the resultant indicator Z given in the multiplicative form
$$Z = X_1 X_2 \cdots X_n \tag{1}$$
as
$$\Delta Z = \sum_i \Delta Z[X_i], \tag{2}$$
where ∆Z and ∆Z[Xi] stand for the change in Z and its parts corresponding to the factorial indicators Xi, respectively. The idea was to change the factorial indicators Xi one at a time and to assign at each step the partial change in the level of Z to the corresponding factor Xi. One of the disadvantages of this approach is the necessity of a priori ordering of the indicators Xi. This weakness was overcome in the publication of Divisia (1925), which suggested the statement of the problem in continuous time. This publication assumed that the factorial indicators change continuously in time as Xi = Xi(t), so that the necessity to put the factorial indicators in order was eliminated. The continuous-time factorial decomposition is as follows:
$$\Delta Z = \sum_i \int \frac{Z}{X_i}\, X_i'\, dt, \tag{3}$$
where Σ is the summation symbol, the symbol ∫ stands for the integration, Xi' is the derivative of Xi with respect to time t, and the integration is done over time t as well. With this approach, the impact of a factor Xi on the change in the resultant indicator Z is as follows:
$$\Delta Z[X_i] = \int \frac{Z}{X_i}\, X_i'\, dt. \tag{4}$$
This approach was further extended in Sheremet et al. (1971) to include any continuously differentiable functions, rather than products of the factorial indicators only. Assuming
$$Z = f(X_1, \dots, X_n), \tag{5}$$
the authors obtained the factorial decomposition as
$$\Delta Z[X_i] = \int f_i'\, X_i'\, dt, \tag{6}$$
where fi' is the partial derivative of f with respect to the i-th argument, and Xi' = dXi/dt.
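The contrast between the one-at-a-time approach and the continuous-time approach can be illustrated numerically. The sketch below is a minimal Python illustration, not the authors' program. The two-factor data, the function names `chain_substitution` and `divisia`, the midpoint integration rule, and the exponential paths Xi(t) = Xi0 (Xi1/Xi0)^t (the form this paper adopts later for the quantitative factors) are assumptions made for the example.

```python
import math

def chain_substitution(x0, x1, order):
    """One-at-a-time decomposition: switch factors from base to final
    values in the given order and credit each step to the factor changed."""
    current = list(x0)
    contrib = [0.0] * len(x0)
    for i in order:
        before = math.prod(current)
        current[i] = x1[i]
        contrib[i] = math.prod(current) - before
    return contrib

def divisia(x0, x1, steps=10000):
    """Continuous-time decomposition of Z = X_1 * ... * X_n along
    exponential paths, integrated over 0 <= t <= 1 by the midpoint rule."""
    rates = [math.log(b / a) for a, b in zip(x0, x1)]
    contrib = [0.0] * len(x0)
    h = 1.0 / steps
    for k in range(steps):
        t = (k + 0.5) * h
        z = math.prod(a * math.exp(r * t) for a, r in zip(x0, rates))
        for i, r in enumerate(rates):
            # (Z / X_i) * X_i' equals Z * ln(X_i1 / X_i0) on an exponential path
            contrib[i] += z * r * h
    return contrib

x0 = [2.0, 3.0]  # hypothetical base values of two factors
x1 = [3.0, 5.0]  # hypothetical final values
forward = chain_substitution(x0, x1, order=[0, 1])
backward = chain_substitution(x0, x1, order=[1, 0])
continuous = divisia(x0, x1)
print(forward, backward, continuous)
```

With these hypothetical values, the two orderings give [3.0, 6.0] and [5.0, 4.0] for the same total change of 9, which is the ordering problem mentioned above. The continuous-time version needs no ordering, and its two contributions also sum to the total change.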
Formula (6) may be rewritten in the vector form as
$$\Delta Z[X] = \int \nabla Z^{T} \cdot dX,$$
where ∆Z[X] is the vector of the contributions ∆Z[Xi], ∇Z is a column gradient vector of the function f(X1, …, Xn), upper index T stands for the transposition, the dot-symbol stands for the dot-product of two vectors, and dX is a diagonal matrix with elements dX1, dX2,…,dXn.
Publications of Meerovich (1974) and Vaninsky and Meerovich (1978) introduced a new class of decomposition problems related to structural change; see Maital and Vaninsky (2000) for details. The Divisia-Sheremet approach was extended further in Vaninsky (1983, 1986) by the introduction of factorial indicators that are not included in the model directly. This approach was applied in Vaninsky (2014) to the decomposition of the CO2 emissions. Below, we refer to it as the Generalized Divisia Index Method (GDIM).
In the framework of the GDIM, the resultant indicator Z is a function of the factorial indicators X1, X2,…,Xn that are interconnected by a system of equations:
$$Z = f(X_1, \dots, X_n), \qquad \Phi_j(X_1, \dots, X_n) = 0, \quad j = 1, \dots, k. \tag{10}$$
The second equation may be written in matrix form as
$$\Phi(X) = 0.$$
The following formula was proved in Vaninsky (1984):
$$\Delta Z[X \mid \Phi] = \int \nabla Z^{T} \left(I - \Phi_x \Phi_x^{+}\right) dX, \tag{12}$$
where Φx is the Jacobian matrix of the functions Φ(X), upper index "+" denotes the generalized inverse matrix, and I is the identity matrix. It is known that if the columns of the matrix Φx are linearly independent, then
$$\Phi_x^{+} = \left(\Phi_x^{T} \Phi_x\right)^{-1} \Phi_x^{T}.$$
See Albert (1972) for details. It should be mentioned that since formula (12) uses an operator of projection on a surface, the factors should be measured in relative units; see Vaninsky (1984, 1987) for details.
Vaninsky (2013, 2014) applied this approach to the decomposition of the CO2 emissions by factors of GDP, energy, population, their carbonization intensities, and other factors by extending the Kaya identity, Kaya (1990). This identity is a particular case of index model (1) adapted to environmental studies. It expresses the CO2 emissions as a product of the carbon intensity of energy (CO2/E), the energy intensity of economic activity (E/GDP), GDP per capita (GDP/P), and population (P):
$$\mathrm{CO_2} = \frac{\mathrm{CO_2}}{E} \cdot \frac{E}{\mathrm{GDP}} \cdot \frac{\mathrm{GDP}}{P} \cdot P. \tag{15}$$
The impact of each of the factors can be computed by using either the discrete Laspeyres-Paasche approach or the continuous-time approach of Divisia. The Kaya identity is a useful practical tool for finding ways of reducing the CO2 emissions. For example, the Kaya-identity-based decomposition is available as a part of the statistical data published by the U.S. Energy Information Administration on its website www.eia.gov. This approach, however, may be critiqued from two viewpoints. Firstly, only the population indicator is included as a quantitative indicator; neither energy nor GDP is considered within the framework of the factorial model (15). Secondly, different factor models similar to (15) may be offered that lead to different factorial decompositions.
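The role of the projection operator in formula (12) can be illustrated with the simplest case of a single interconnection equation. In that case the Jacobian reduces to one column φ, its generalized inverse is φ^T/(φ^T φ), and the matrix I − φφ^+ removes the component of any vector along φ. The Python sketch below checks two properties of such a projector. The vector `phi` and the helper names `projector`, `matvec`, and `matmul` are hypothetical and chosen only for illustration.

```python
def projector(phi):
    """Return I - phi * phi^+ for a single column phi,
    where phi^+ = phi^T / (phi^T phi)."""
    n = len(phi)
    norm2 = sum(p * p for p in phi)
    return [[(1.0 if i == j else 0.0) - phi[i] * phi[j] / norm2
             for j in range(n)] for i in range(n)]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

phi = [1.0, 2.0, -2.0]    # hypothetical gradient of one constraint
P = projector(phi)
residual = matvec(P, phi)  # the projector annihilates the constraint gradient
PP = matmul(P, P)          # applying the projector twice changes nothing
print(residual)
```

Up to rounding, `residual` is the zero vector and `PP` equals `P`, which is what makes I − φφ^+ a projection onto the directions that do not violate the constraint.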
Keeping this in mind, we follow Vaninsky (2014) in this paper and transform the Kaya identity into factor model (10), which allows for the expansion of the analytical base of the Kaya identity by the inclusion of different quantitative and relative indicators. To do that, we begin with the observation that CO2 emissions may be presented in one of three ways:
$$\mathrm{CO_2} = \frac{\mathrm{CO_2}}{\mathrm{GDP}} \cdot \mathrm{GDP} = \frac{\mathrm{CO_2}}{E} \cdot E = \frac{\mathrm{CO_2}}{P} \cdot P. \tag{16}$$
Our objective is to incorporate all of them symmetrically into the factorial analysis. For the sake of readability, we use the following notation: Z = CO2, X1 = GDP, X3 = Energy consumption, X5 = Population; X2, X4, and X6 are the corresponding carbon intensities: X2 = (CO2/GDP), X4 = (CO2/Energy), X6 = (CO2/Population). Following Vaninsky (2014), we included two more relative indicators in the model to increase its explanatory power: X7 = (GDP/Population) and X8 = (Energy/GDP).
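In terms of the newly defined variables, formula (16) becomes
$$Z = X_1 X_2 = X_3 X_4 = X_5 X_6, \qquad X_7 = X_1 / X_5, \qquad X_8 = X_3 / X_1.$$
To apply the GDIM, we separate these equations into a factor model and equations of the factors' interconnections as follows:
$$Z = X_1 X_2; \qquad X_1 X_2 = X_3 X_4, \quad X_1 X_2 = X_5 X_6, \quad X_1 = X_5 X_7, \quad X_3 = X_1 X_8, \tag{18}$$
and rewrite the equations (18) in the form (10):
$$Z = X_1 X_2; \qquad X_1 X_2 - X_3 X_4 = 0, \quad X_1 X_2 - X_5 X_6 = 0, \quad X_1 - X_5 X_7 = 0, \quad X_3 - X_1 X_8 = 0.$$
As shown in Vaninsky (2014), the gradient of the function Z(X) and the Jacobian matrix Φx are as follows:
$$\nabla Z = (X_2, X_1, 0, 0, 0, 0, 0, 0)^{T}, \qquad \Phi_x^{T} = \begin{pmatrix} X_2 & X_1 & -X_4 & -X_3 & 0 & 0 & 0 & 0 \\ X_2 & X_1 & 0 & 0 & -X_6 & -X_5 & 0 & 0 \\ 1 & 0 & 0 & 0 & -X_7 & 0 & -X_5 & 0 \\ -X_8 & 0 & 1 & 0 & 0 & 0 & 0 & -X_1 \end{pmatrix}.$$
Below, the quantitative factors X1 = GDP, X3 = Energy consumption, and X5 = Population are considered exponential functions of a model time t, 0 ≤ t ≤ 1. The range of the model time does not affect the final result; see Vaninsky (1983, 1987) for details. By doing so, we obtain all of the remaining factorial indicators and the resultant indicator Z as functions of the model time t as well, in the form
$$Q(t) = Q_0 \left(\frac{Q_1}{Q_0}\right)^{t},$$
where Q stands for a quantitative or relative indicator Xi, or the resultant indicator Z, and 0 and 1 are the lower indexes corresponding to the base and final values, respectively. The derivatives with respect to time t are
$$Q'(t) = Q(t) \ln\frac{Q_1}{Q_0}.$$
Vaninsky (2014) presents a computer program in the R language, R Development Core Team (2011), that performs the calculations. As a result, we obtain the decomposition of the chain rate of change in CO2 emissions into the 8 factors mentioned above.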
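The computation described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the R program of Vaninsky (2014). The input data are hypothetical, and the function names `solve` and `gdim` and the midpoint integration rule are choices made here. The interconnection equations are taken as X1X2 − X3X4 = 0, X1X2 − X5X6 = 0, X1 − X5X7 = 0, and X3 − X1X8 = 0, as implied by the definitions of X1 through X8. The contributions are computed as the integral of ∇Z^T (I − Φx Φx^+) dX with Φx^+ = (Φx^T Φx)^(-1) Φx^T, and all indicators are measured relative to their base values.

```python
import math

def solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def gdim(gdp, energy, pop, co2, steps=2000):
    """Each argument is a (base, final) pair. Returns the 8 contributions
    of X1..X8 to the relative change Z1/Z0 - 1."""
    g, e, p, z = (v[1] / v[0] for v in (gdp, energy, pop, co2))
    # Growth ratios of X1..X8 in relative units (every base value equals 1)
    ratios = [g, z / g, e, z / e, p, z / p, g / p, e / g]
    rates = [math.log(r) for r in ratios]
    contrib = [0.0] * 8
    h = 1.0 / steps
    for k in range(steps):
        t = (k + 0.5) * h
        x = [math.exp(r * t) for r in rates]  # exponential paths X_i(t)
        x1, x2, x3, x4, x5, x6, x7, x8 = x
        grad = [x2, x1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
        cols = [  # columns of the Jacobian: gradients of the four constraints
            [x2, x1, -x4, -x3, 0.0, 0.0, 0.0, 0.0],
            [x2, x1, 0.0, 0.0, -x6, -x5, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0, -x7, 0.0, -x5, 0.0],
            [-x8, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, -x1],
        ]
        gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
        rhs = [sum(a * b for a, b in zip(ci, grad)) for ci in cols]
        coef = solve(gram, rhs)
        # Projected gradient: grad - Phi (Phi^T Phi)^(-1) Phi^T grad
        w = [grad[i] - sum(coef[j] * cols[j][i] for j in range(4)) for i in range(8)]
        for i in range(8):
            contrib[i] += w[i] * rates[i] * x[i] * h  # w_i * X_i'(t) * dt
    return contrib

# Hypothetical base and final values of GDP, energy, population, and CO2
contributions = gdim((100.0, 130.0), (50.0, 55.0), (10.0, 11.0), (20.0, 21.0))
total = sum(contributions)
print(contributions, total)
```

Because the paths satisfy the interconnection equations at every t, the eight contributions add up to the relative change in Z, here 21/20 − 1 = 0.05 up to the integration error. The individual values depend on the hypothetical inputs and carry no empirical meaning.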
Our objective is to study the structure of the factorial decomposition obtained at different stages of economic development.
To do so, we apply the technique of factor analysis; see Thompson (2004) for details. Factor analysis aims to represent a set of n variables as linear combinations of a smaller number k of factors. The factors are assumed to be independent random variables with zero mean and unit standard deviation. The coefficients of the linear combinations are called factor loadings. Factor analysis uses the rotation of the factors to make the factor loadings clearly separated by the variables. This allows for the interpretation of the variables having the largest loadings on the i-th factor as belonging to one cluster related to this factor.
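The interpretation step described above, grouping variables by the factor on which they load most strongly, can be sketched as follows. This is a simplified Python illustration, not the clustering procedure used later in the paper. The loading values, the variable labels, and the function name `assign_clusters` are hypothetical.

```python
def assign_clusters(loadings, names):
    """Assign each variable to the factor with the largest absolute loading.
    loadings[i][j] is the loading of variable i on factor j."""
    clusters = {}
    for name, row in zip(names, loadings):
        best = max(range(len(row)), key=lambda j: abs(row[j]))
        clusters.setdefault(best, []).append(name)
    return clusters

# Hypothetical rotated loadings of four variables on two factors
names = ["V1", "V2", "V3", "V4"]
loadings = [
    [0.9, 0.1],
    [0.2, -0.8],
    [0.7, 0.3],
    [0.1, 0.6],
]
clusters = assign_clusters(loadings, names)
print(clusters)  # {0: ['V1', 'V3'], 1: ['V2', 'V4']}
```

The absolute value is used because a large negative loading ties a variable to a factor as strongly as a large positive one.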
In matrix notation, the factor analysis model is as follows:
$$X = \Lambda F + U,$$
where X is an n × N matrix representing N observations over n variables, F is a k × N matrix of k factors, Λ is an n × k matrix of factor loadings, and U is an n × N matrix of unique components. In this model, k < n < N, and the matrix product ΛF is interpreted as the communality of the variables. Factor analysis aims to make the matrix U as small as possible. Below, we use the 8 elements of the factorial decomposition as variables that are observed during the time periods corresponding to different stages of economic development.
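The model can be related to the covariance structure that factor-analysis software typically fits. The following is a brief sketch, writing the model as X = ΛF + U and assuming, beyond what is stated above, that the unique components in U are uncorrelated with the factors and with each other. The symbol Ψ, denoting the diagonal covariance matrix of the unique components, is introduced here only for illustration. Since the factors are independent with unit variance, Cov(F) = I and

$$\operatorname{Cov}(X) = \Lambda \operatorname{Cov}(F)\, \Lambda^{T} + \Psi = \Lambda \Lambda^{T} + \Psi, \qquad h_i^{2} = \sum_{j=1}^{k} \lambda_{ij}^{2}, \qquad \psi_i = \operatorname{Var}(X_i) - h_i^{2},$$

where λij are the elements of Λ, h_i^2 is the communality of the i-th variable (the part of its variance explained by the k common factors), and ψi is its uniqueness. A variable with a small ψi is almost fully explained by the common factors.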

RESULTS AND DISCUSSION
In this section, we use statistical data on the U.S. economy for the period of 1950-2040, available at the website of the U.S. Energy Information Administration, www.eia.gov, to obtain a deeper insight into the structure of the CO2 emissions. The data include indicators of CO2 equivalent emissions, GDP, energy consumption, and population. The data beyond 2013 are projections. The data were divided into three sets: 1950-1980, 1981-2012, and 2013-2040, roughly corresponding to the three different stages of the U.S. economy: industrial, post-industrial, and information. We expect that the environmental policy, technology, and the use of energy are quite different in these periods, and we aim to detect and clarify these differences.
Quantitative data for 1950-2040 are given in Table 1. Figures 1 and 2 and Table 2 present the corresponding trends. Figure 3 presents the average contributions of each component to the rate of change in the CO2 emissions. As follows from these data, the GDP remains the main factor of the increase in the CO2 emissions across the periods, with decarbonization of the GDP as the main factor of their decrease. The role of energy is essential in the first period only; its impact strongly decreases after that. Carbonization of the population is essential but reverses its effect from positive in the first period to negative in the two following ones.
To further investigate the structure of the decomposition of the CO2 rates of increase, we applied the technique of factor analysis (see Thompson (2004) for details), implemented in the R language, version 3.3.1. We used the library PSYCH of the package MASS. We began with the determination of the number of factors by using the procedure VSS with the rotation parameter varimax. For all three time periods, the number of factors varied from two to four depending on the criteria embedded in the procedure. For the factor analysis and for finding the clusters among the decomposition variables, we applied the procedure ICLUST with up to four factors. The results are shown in Table 3 and Figure 4.
As follows from the obtained results, the first two clusters include, depending on the period, the quantitative indicators of Energy consumption and GDP (industrial, 1950-1980), Energy consumption (post-industrial, 1981-2012), and GDP (information, 2013-2040). This means that at the industrial stage, both GDP and energy are CO2 drivers since the production processes are energy intensive. At the post-industrial stage, when a greater part of the GDP is produced by using low-energy technology, only the total amount of energy matters. As a result, the role of the GDP as a CO2 driver decreases, and it moves to the less significant cluster 2. As the society moves to the information era, all sources of energy become less CO2-emitting, and, thus, the scale of the economy becomes the primary quantitative factor. This leads to an increase in the rank of the GDP, while the energy indicator moves to cluster 3.
Observations of the relative indicators reveal that the carbonization of the population (CO2 per capita) keeps its important role in all three types of economies. It is in cluster 1 in the industrial and information economies and in cluster 2 in the post-industrial economy. At the same time, there is a difference between the industrial and more advanced economies. In the industrial and information economies, a factor of economic development, GDP per capita, plays a role as a part of cluster 2, while at the post-industrial stage it is replaced by the energy intensity of the GDP, the factor of industrial technology. It may also be mentioned that the carbonization of the GDP is important at both advanced stages of economic development. It is a part of cluster 1 in both the post-industrial and information economies.
These changes in the composition of the clusters should be taken into account in the formulation of environmental policy aimed at the mitigation of the CO2 emissions and in finding ways of its implementation via economic restructuring.

CONCLUSIONS
In this paper, we analyzed the CO2 emissions for the periods of 1950-1980, 1981-2012, and 2013-2040, with data beyond 2013 being projections made by the U.S. Energy Information Administration. These periods roughly correspond to the