Test of Data Normality, Return Similarity and Variance Analysis in South Asian Stock Markets


ABSTRACT
This paper analyzes the distribution of stock market returns in the SAARC nations (Bhutan, India, Bangladesh, Nepal, Sri Lanka and Pakistan), using weekly data from January 2006 to December 2011, to see whether market returns are normally distributed. Secondly, we test whether returns are similar across markets using paired-sample t-tests. While comparing returns, we also compare the associated risks for each pair to see whether there is an opportunity for similar returns at lower risk or higher returns at a given risk. Finally, we conduct variance analysis using one-way ANOVA with multiple comparisons to find out whether a time-varying effect is present in any of the stock market returns. Our findings suggest that the return distributions of all the markets in the region are not normal. We observe high skewness and kurtosis, and the hypothesis of normal distribution is rejected by the Jarque-Bera test for the full sample of 2006 to 2011 for all countries, although the data for Bangladesh and India show lower skewness and Jarque-Bera statistics, indicating a lesser degree of non-normality. When the sample was split annually, the distribution was normal for most years in the majority of markets, which suggests that sample size affects the observed distribution. We cross-checked the results with the non-parametric Kolmogorov-Smirnov (K-S) test, one of the most widely used tests of normality. Under the K-S test, the Indian and Bangladeshi stock returns are normally distributed and the rest are not. Analyzing return similarities and differences with paired-sample t-tests, we found no statistically significant differences in average returns between pairs of markets, except for a few pairs when the sample was split annually. We did, however, observe differences in the levels of risk (standard deviation). This indicates an opportunity for investors to earn similar returns at lower risk by changing their investment destinations. We conducted multiple comparisons using annual, weekly and seasonal codes and found an annual time effect in some stock returns. However, we found no week-of-the-month effect and no season-of-the-year effect. The timing of entry into and exit from the stock market does not, per se, provide extra benefits.

INTRODUCTION
Normal distribution of stock returns is a fundamental assumption of the efficient market hypothesis. It is this assumption that underlies the claim that returns follow a random walk. Time series data, particularly stock returns, rarely follow a random walk, and so the assumption leads to unrealistic conclusions in studies of stock market relationships and integration. If data are normally distributed, we expect skewness close to zero and kurtosis close to 3. More often than not, however, stock return data have a long tail either to the right, producing positive skewness, or to the left, producing negative skewness. Many studies, particularly those concentrating on the stock markets of emerging and least developed economies, report very high levels of non-normality in stock returns. Khan and Huq (2012) and Sharma and Bodla (2012) are the most recent studies to reach this conclusion for stock markets in the South Asian region. It is therefore important to test data normality before applying any econometric model to compare returns across markets or to examine relationships and integration. Several studies have conducted normality tests of stock market returns, risk and return comparisons, variance analysis, relationship analysis, and tests of causality and integration among stock markets. However, we find very few in the context of the South Asian region, and the literature is almost nonexistent for Bhutan, one of the least developed nations, with a very small, highly inactive and young stock market.

OBJECTIVES
This paper analyzes the stock return distribution of the Bhutanese stock market vis-a-vis the Indian and other markets in the region. Comparing statistically calculated risks and returns across stock markets makes sense only if they are read in relation to the pattern of the data distribution. In this context, the paper tests data normality and compares stock returns and variances across stock markets in the South Asian region. It also conducts one-way variance analysis to gain insight into the time-varying effect on each market's returns.

SIGNIFICANCE AND LIMITATIONS
This paper differs from past studies in that it includes the Bhutanese stock market in the sample. Several studies in the literature address data normality, tests of return similarity and variance analysis of returns, but most focus on the stock markets of developed economies. Comparatively few studies focus on South Asian stock markets, and hardly any has included the Bhutanese stock market in its sample. By including it, this paper makes the study of stock markets in the South Asian region more inclusive and comprehensive. However, the inclusion of the Bhutanese stock market also imposes some limitations. The Bhutanese stock market did not have a stock index until very recently (April 2012), although the stock exchange was established several years earlier. The Bhutanese stock exchange has only about 20 listed companies, and trading is very thin for most of them. The stocks of about six companies trade quite actively, and these were purposely selected to construct the index. Since trading data in Bhutan were published only once a week, the study had to use weekly data for the other stock markets too, which limits the frequency of the observations.

LITERATURE
Generally, one would believe that securities markets are efficient enough to reflect market information in stock prices. Several decades ago, Fama (1970) discussed efficient capital markets, in which market returns are assumed to follow random walk behavior. This means that the distributions of risk and return are normal. Under the assumption of a normal risk-return distribution, any news that arises spreads very quickly through the market and everyone has access to it. As a result, everyone has an equal opportunity to realize a given return for a certain level of risk by holding a randomly selected portfolio. Malkiel (2003) argues that neither technical analysis, which studies past stock prices in an attempt to predict future prices, nor fundamental analysis would enable an investor to achieve returns greater than those obtainable by holding a randomly selected portfolio of individual stocks with comparable risk. He states that stock prices in an efficient market are characterized by a random walk, and that all subsequent prices represent random departures from the previous series. As a result, prices fully reflect all known information, and even uninformed investors buying a diversified portfolio at prevailing market prices will obtain a rate of return similar to that of experts who invest after thorough technical and financial analysis. Following this argument, in an efficient market one can earn a higher return only by bearing a higher level of risk. However, stock returns do not necessarily follow a random walk, and return distributions are not normal at all times in all markets. The efficient market hypothesis (EMH) rests on the assumption that returns are normally distributed. There are situations in which stock market returns show high levels of skewness and kurtosis, indicating a non-normal distribution or a form of market inefficiency.
In such cases the average returns for different stocks may be similar while the risk levels, normally measured by standard deviation, differ, or vice versa. Under such a scenario investors can enjoy a given return at much lower risk, which goes against the EMH. Pandey (2005) provides rich insight into estimating and forecasting the volatility of asset returns using different approaches. He examines which approach performs better in terms of statistical properties such as efficiency, bias and predictive power. He modeled the volatility of the S&P CNX Nifty, an index of 50 stocks of NSE Mumbai, using different classes of estimators and models. His results show that conditional volatility models perform well in estimating past volatility in terms of statistical bias, whereas extreme value estimators perform well on statistical efficiency criteria. For forecasting volatility, the author reports that extreme value estimators are better. He concludes that the final verdict depends on the data distribution and on the appropriateness of the models chosen for analysis. Naqvi (2004) studied the behavior of data for the Pakistani stock market (Karachi Stock Exchange) using weekly and monthly data. He tested data normality and autocorrelation, and tested for a random walk using the Dickey-Fuller test. He found that both the weekly and the monthly data of the Karachi Stock Exchange were far from normal, confirming a very weak form of market efficiency. Aggrawal (2005) studied the normality of stock returns for both small and large samples of the Nifty and Sensex in the Indian stock market. He used ten years of daily returns for the Nifty (November 3, 1995 to July 31, 2005) and eight years for the Sensex (July 1, 1997 to July 31, 2005).
He analyzed the data with the Kolmogorov-Smirnov (K-S), Anderson-Darling (A-D) and Jarque-Bera (J-B) tests and found that large samples do not follow a normal distribution. It is important to note that most studies using statistical tools assume that the data are normally distributed irrespective of sample size. However, if stock returns are hit by systematic risks, which cannot be avoided, an increase in sample size leads to an increase in error or risk. As the sample size increases, therefore, the error increases, making larger samples more non-normal than smaller ones. The EMH's assumption of normally distributed data does not hold in many cases. Kumar and Dhankar (2011) studied the normality of risk and return for three stock indices of the Bombay Stock Exchange (BSE 100, BSE 500 and Sensex) from 1996 to 2006 using daily, weekly, monthly and annual data. They applied parametric and non-parametric tests in examining the data. They repeated the tests after splitting the data into three sub-samples (January 1999 to December 1999), (January 2000 to December 2002) and (January 2003 to December 2006). They found that the distributions of risk and return are not normal for daily and weekly returns, but the distributions of monthly and annual returns were normal for all three indices. Subhani, Hassan, Mehar and Osman (2011) analyzed cointegration among Asian stock markets using stock indices from four countries (India, Pakistan, Bangladesh and Nepal). They tested each index for a unit root with the Dickey-Fuller model and reported that the data were non-stationary both with and without differencing (first lag). Since the data were non-stationary, Johansen cointegration was applied to see whether the markets were integrated.
They analyzed multivariate cointegration between the Pakistani stock market and the rest and rejected the hypothesis of no cointegration in the equity markets of the South Asian region. However, when cointegration was analyzed on a one-to-one basis between the Pakistani market and each of the others, they found that the Pakistani and Bangladeshi markets were cointegrated but that there was no cointegration with the Indian and Nepalese markets. Saha and A. Bhunia (2012) studied the relationship between the Indian stock market and leading South Asian markets between August 2002 and August 2011. They first looked at the correlation matrix among the stock markets in the region; since the Indian market was observed to be related to the others, they reasoned that, as the more proficient market in the region, it has some influence on them. They tested each variable for a unit root and applied bivariate and multivariate cointegration (the Johansen approach) and the Granger causality test to see whether the South Asian stock markets are integrated. They concluded that investors have ample opportunity to broaden the horizon of their investment in the region's capital markets and take advantage of the poor integration. M.M.H Khan and U.R. Huq (2012) focused on the risk and return behavior of different stock indices of Bangladesh. They used three stock indices of the Bangladesh stock exchange covering the period 2002-2010 to analyze the risk-return pattern. Using daily, weekly and monthly data, they analyzed descriptive statistics and variances for each index and found inconsistency between risk and return, indicating that an investor can achieve better returns without any additional risk. This suggests that even within the same country, different stock indices are not integrated or closely related.

METHODOLOGY
The stock return for each index in this study is calculated as $R_t = (P_t / P_{t-1}) \times 100$, where $R_t$ is the rate of return for period $t$, and $P_{t-1}$ and $P_t$ are the index values in two successive periods. To compare returns against risk (volatility), we used the basic statistical risk measure, the standard deviation. To test the normality of each return series, we computed descriptive statistics: the average return, the associated standard deviation, skewness, kurtosis and the Jarque-Bera test. We also cross-checked normality using the non-parametric Kolmogorov-Smirnov (K-S) test. Paired-sample t-tests were applied to compare average returns between Bhutan, India and the rest of the SAARC region.
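The descriptive normality measures named above can be illustrated with a short sketch. It assumes the textbook definitions (moment-based skewness, raw kurtosis equal to 3 for a normal distribution, and the Jarque-Bera statistic compared against a chi-square distribution with 2 degrees of freedom); the function name `normality_summary` and the sample data are hypothetical, and the statistical package actually used in the study may apply slightly different small-sample corrections.

```python
import math

def normality_summary(returns):
    """Mean, sample std dev, skewness, kurtosis and Jarque-Bera for one return series."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    # Central moments (population form, divisor n)
    m2 = sum(d ** 2 for d in dev) / n
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                      # 3.0 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of chi-square with 2 degrees of freedom is exp(-x / 2)
    p_value = math.exp(-jb / 2.0)
    std = math.sqrt(m2 * n / (n - 1))        # sample standard deviation (risk measure)
    return {"mean": mean, "std": std, "skewness": skew,
            "kurtosis": kurt, "jarque_bera": jb, "p_value": p_value}

# Hypothetical weekly returns with one extreme week, for illustration only
example = [0.0] * 50 + [10.0]
summary = normality_summary(example)
print(summary["skewness"] > 0, summary["p_value"] < 0.05)
```

A small p-value leads to rejection of the hypothesis of normality; a single extreme observation, as in the hypothetical series above, is enough to produce large positive skewness and a very large Jarque-Bera statistic.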

RESULTS AND DISCUSSIONS
This section presents the findings, and a discussion of them, for each of the research objectives of the study. We first present the test of data normality, followed by the risk-return comparison, the test of mean differences and finally the variance analysis. Table 1 presents the results for risk (standard deviation) and return (mean), along with the indicators used to assess data normality. Skewness, kurtosis and the Jarque-Bera statistic are used to test the normality of the return distribution in each country under study. The findings suggest that the stock returns of all SAARC nations are skewed either to the right or to the left. Indian and Pakistani stock returns are skewed to the left (-0.1843 and -0.8329 respectively). The Indian value is closer to zero, indicating a lower level of non-normality than in the Pakistani returns. The stock returns of the other markets are positively skewed. Nepal has the highest skewness (7.689), indicating the longest right tail. Among the markets with positive skewness, Bangladesh appears closest to zero, which indicates a lower level of non-normality. Bhutanese and Sri Lankan data are fairly close to each other in terms of skewness (2.494 and 1.216 respectively). To assess the flatness or peakedness of the distributions, we used kurtosis. Sri Lankan data are also quite peaked (32.349). Indian data have the lowest kurtosis (4.670). Bangladeshi and Pakistani stock returns show 7.140 and 5.048 respectively. A kurtosis above 3.00 indicates a leptokurtic distribution (more peaked and heavier-tailed than the normal). Further, the Jarque-Bera statistics for all the stock returns are very high and statistically significant at the 5% level or lower. This indicates that the hypothesis that the data are normal is rejected for all indices.
We conducted the non-parametric Kolmogorov-Smirnov (K-S) test and found that all indices except the Indian and Bangladeshi ones are not normally distributed. Indian and Bangladeshi stock returns have K-S statistics of 1.126 and 1.076 respectively, so the hypothesis of normality is not rejected for them. For the rest of the indices, the calculated K-S value is high enough to reject the hypothesis of normality (statistical table not reported). We repeated the non-parametric test after splitting the sample into annual periods (about 52 data points each) to see whether smaller samples are normally distributed. We found that the smaller samples are closer to normal than the full sample. Most of the indices that were non-normal in the full sample were found to be normal in various annual sub-samples (see Annex 1). This finding supports the conclusion of Aggrawal (2004), who studied the impact of sample size for the Nifty and Sensex of India and reported that smaller samples were more nearly normal than larger ones.
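As a rough check on how K-S values of this size translate into a decision, the sketch below assumes that the reported statistics are on the scale Z = sqrt(n) x D, as some statistical packages print them, and uses the asymptotic Kolmogorov distribution for the p-value. Both function names are hypothetical, and because the mean and standard deviation are estimated from the same sample, the p-value is only approximate.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_z_statistic(sample):
    """Kolmogorov-Smirnov Z = sqrt(n) * D against a normal fitted to the sample."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    d = 0.0
    for i, x in enumerate(sorted(sample)):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical and fitted CDF on either side of x
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return math.sqrt(n) * d

def kolmogorov_p_value(z, terms=100):
    """Asymptotic two-sided p-value: 2 * sum((-1)^(k-1) * exp(-2 k^2 z^2))."""
    if z <= 0:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
                for k in range(1, terms + 1))
    # Clamp because the truncated series is inaccurate for very small z
    return max(0.0, min(1.0, 2.0 * total))

# Reported K-S values for the Indian and Bangladeshi returns
for z in (1.126, 1.076):
    print(z, round(kolmogorov_p_value(z), 3))
```

Under these assumptions the two reported values correspond to p-values of roughly 0.16 and 0.20, both above 0.05, which is consistent with not rejecting normality for the Indian and Bangladeshi returns.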

Risk-return Comparison
We have seen that higher returns are not necessarily associated with higher risks. For instance, Nepal has the highest risk but not the highest return. Sri Lanka and Bangladesh have much higher returns, but these are not associated with the highest risk levels. To make the risk-return relationship clearer, we present annualized returns in Figure 1 and standard deviations (calculated from weekly returns) in Figure 2.

Test of Return Similarities between Bhutan, India and Other SAARC Nations
This section provides the paired sample t-tests for the mean return differences between Bhutan, India and the rest. Firstly we compared mean return of Bhutanese stock against others for the full sample and secondly we repeated the tests year-wise with each stock return individually. From the full sample results presented in (

Table 2. Test of similarities of Bhutanese returns with the rest (2006-2011)
As stated we have compared year-wise average returns in the following tables.

Table 3. Test of year-wise stock return similarities between Bhutan and India
As with the year-wise comparison of average returns between Bhutanese and Indian stocks, we tested whether Bhutanese stock returns are similar to the returns of the other stock markets in the region. In general, we found no differences in mean returns between the Bhutanese market and the others, except in 2010 and 2008. In 2010, Bhutanese stock returns differed from those of other markets in the region, particularly Sri Lanka, Nepal and Bangladesh. During this period the Bhutanese average return was lower than the returns in Bangladesh and Sri Lanka (difference significant at the 5% level) but higher than the Nepalese return, again significant at 5%. In 2008, the Bhutanese stock return was higher than the Sri Lankan and Pakistani stock returns. The t-statistics were 2.271 and 2.493 respectively, both significant at 5%, confirming statistical differences. Detailed statistical tables on the year-wise mean comparisons are not reported in the paper for reasons of space.
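The paired-sample t-statistic behind these comparisons can be sketched as follows. This is a minimal illustration assuming two weekly return series aligned week by week; the function name `paired_t` and the numbers are hypothetical, not the study's data.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """Paired-sample t-statistic and degrees of freedom for two aligned series."""
    if len(x) != len(y):
        raise ValueError("paired series must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Standard error of the mean weekly difference
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1

# Hypothetical weekly returns (in %) for two markets over the same weeks
market_a = [1.0, 2.0, 3.0, 4.0]
market_b = [0.0, 0.0, 2.0, 2.0]
t_stat, df = paired_t(market_a, market_b)
print(round(t_stat, 3), df)
```

If an annual sub-sample has about 52 weekly observations, the two-sided 5% critical value is close to 2.01, so t-statistics such as 2.271 and 2.493 fall in the rejection region.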

Comparison of Returns Variances
The t-test is not sufficient when we need to compare the means of several categories at once, since it compares them only pair-wise. In our analysis, we wanted to examine whether the mean stock return in Bhutan, India or any other SAARC nation differs from the regional (overall) mean. Further, we wanted to see whether the mean returns in Bhutan, India and the others differ across time periods. Analysis of variance (one-way ANOVA) tests the hypothesis that several means are equal, rather than comparing just a pair of means at a time. One-way ANOVA is an extension of the two-sample t-test in the sense that it compares the means of several groups at once rather than on a pairwise basis. In addition to determining whether differences exist among the means of different stock returns, the procedure allows us to pinpoint which mean (by category) differs from the means of the other categories. Generally, the procedure provides two types of tests for comparing means: a priori contrasts and post hoc tests. ANOVA computes the F-statistic, the ratio of the variance among the group means to the variance within the samples. Through the F-statistic we conclude whether the group means, taken as a whole, differ statistically. But unless we conduct a post hoc test and produce multiple comparisons, it is not possible to confirm which groups differ from the others. We have conducted both a priori contrasts and post hoc tests wherever necessary in comparing the means and variances. Firstly, we categorized the returns by country and tested the hypothesis "mean returns among the SAARC stock markets are not different from each other". Results for this hypothesis are presented in Table 4. We find that there is no difference between the group means of stock returns.
This is confirmed by the F-statistic (0.236) and its significance value (0.947). However, one-way ANOVA assumes that the variances of the groups being compared are similar. Testing the homogeneity of variance with the Levene statistic, we found that the variances differ very significantly (Levene statistic 13.205, significance value 0.000); hence the assumption of ANOVA is violated. In such a situation, the "Robust Tests of Equality of Means" have to be considered for confirmation. We conducted the robust test of equality of means and found that the mean returns across the countries in the region do not differ (Welch statistic 0.467, significance value 0.801). We therefore conclude that mean returns are equal across the stock markets in the region. Since the equality of returns is confirmed, a post hoc test (multiple comparison) to identify which specific group means differ is irrelevant in this case. We now present the same analysis with time (years) as the category variable. This enables us to see whether mean returns are similar across time periods. We conducted the variance analysis first for the whole region together and then for specific indices, using time as the category variable. We found that average stock returns in the SAARC region as a whole vary between years: the F-statistic is 6.075 with a significance level of 0.000. As the variances of stock market returns differ across time periods, this indicates that the returns are volatile. Since the variances were not homogeneous in this case either, we looked at the robust test for equality of means, which confirmed the difference in mean returns across time periods. When differences exist, the next step in the analysis of variance is to determine which groups (which time periods in this case) differ from the others.
We conducted a multiple comparison test using the Tukey post hoc approach, and the results are presented in Table 5. This analysis produces the "year of the sample period effect", that is, it shows which year's returns differ from those of the other years in the sample period. According to the findings, the average return of 2008 is statistically lower than the returns of most other years (2006, 2007, 2009 and 2010). The return of 2011 is lower than the return of 2009. In the table, (*) indicates a difference in means that is significant at the 5% level.
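The three statistics used in this sequence (the one-way F-statistic, the Levene statistic for homogeneity of variance, and the Welch robust statistic for equality of means) can be sketched with the standard textbook formulas. The function names and the toy groups are hypothetical; the Levene version shown uses absolute deviations from group means, which is an assumption about the variant applied in the study, and p-values are omitted because they require the F distribution.

```python
from statistics import mean, variance

def one_way_f(groups):
    """F = between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum(sum((v - m) ** 2 for v in g)
                    for g, m in zip(groups, group_means))
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def levene_w(groups):
    """Levene statistic: one-way F on absolute deviations from each group mean."""
    abs_dev = [[abs(v - mean(g)) for v in g] for g in groups]
    return one_way_f(abs_dev)

def welch_f(groups):
    """Welch statistic for equality of means without assuming equal variances."""
    k = len(groups)
    weights = [len(g) / variance(g) for g in groups]
    w_sum = sum(weights)
    group_means = [mean(g) for g in groups]
    weighted_mean = sum(w * m for w, m in zip(weights, group_means)) / w_sum
    numerator = sum(w * (m - weighted_mean) ** 2
                    for w, m in zip(weights, group_means)) / (k - 1)
    lam = sum((1 - w / w_sum) ** 2 / (len(g) - 1)
              for w, g in zip(weights, groups))
    return numerator / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))

# Hypothetical returns grouped by year (or by country), for illustration only
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_f(groups), levene_w(groups), welch_f(groups))
```

The order of use mirrors the text: the F-statistic is read first, the Levene statistic checks the equal-variance assumption, and the Welch statistic is consulted when that assumption fails. A post hoc comparison such as Tukey's is then needed only when the means are found to differ.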

Table 5. Multiple Comparisons - Year of the Sample Period Effect
Multiple comparison of returns using time as the category variable confirms that stock returns in the SAARC region differ between years. Since analysis of variance by time category reveals the time-varying effect on returns, we were interested in examining the time effect at a finer level than the year of the sample period. Barrak (2009) analyzed the time effect on the stock returns of three stock markets in the Gulf Cooperation Council (GCC). His study looked at the day-of-the-week effect and found that returns on Saturdays are significantly higher than on other days except Tuesday. Since we are dealing with weekly data, owing to the unavailability of daily data for Bhutan, we chose to analyze the week-of-the-month effect. This allows us to find out whether the returns of any particular week differ from those of the other weeks. Similarly, to find the season-of-the-year effect, we repeated the procedure after recoding the data into four seasons: November to January, February to April, May to July, and August to October. We analyzed both the week-of-the-month effect and the season-of-the-year effect for the returns of each individual index. We found that, for each index, average returns were similar across the weeks of the month and across the seasons of the year. This suggests that time effects, at least in the weekly and seasonally classified data, were non-existent in the regional stock markets. Multiple comparisons for each index, for both weekly and seasonal effects, were analyzed, but the tables are not presented here for reasons of space.
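The recoding of weekly observations into category variables might look like the sketch below. The season grouping follows the four periods listed above; the week-of-the-month rule (days 1-7 are week 1, days 8-14 are week 2, and so on) and the function names are assumptions made for illustration, since the exact coding rule is not stated.

```python
from datetime import date

# Season codes follow the four periods named in the text
SEASON_BY_MONTH = {
    11: 1, 12: 1, 1: 1,   # November to January
    2: 2, 3: 2, 4: 2,     # February to April
    5: 3, 6: 3, 7: 3,     # May to July
    8: 4, 9: 4, 10: 4,    # August to October
}

def season_code(observation_date):
    """Season of the year (1-4) for a weekly observation."""
    return SEASON_BY_MONTH[observation_date.month]

def week_of_month(observation_date):
    """Week of the month (1-5), assuming days 1-7 form week 1, 8-14 week 2, etc."""
    return (observation_date.day - 1) // 7 + 1

# Hypothetical observation dates, for illustration only
for d in (date(2008, 1, 4), date(2008, 5, 16), date(2008, 10, 31)):
    print(d.isoformat(), week_of_month(d), season_code(d))
```

Returns grouped by either code can then be passed to a one-way analysis of variance in the same way as returns grouped by year.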

CONCLUSION
To conclude, our findings show that stock returns are not normally distributed in any of the countries in the SAARC region, at least when the full sample (2006-2011) is considered. However, when the data were split into smaller (annual) samples, we observed normality for most countries (except Bhutan) in most of the annual samples. This indicates that there is a sample size effect on data normality. The finding is similar to that of Aggrawal (2004), who analyzed daily and monthly returns for different sample sizes of the Nifty and Sensex of the Indian stock market. Although we found numerical differences in annualized returns across the countries, the differences in risk were not proportionately associated with them. When we tested differences in weekly mean returns between the Bhutanese market and the other markets, we found no statistical differences. Across the stock markets in the region, the mean returns of 2008 were lower than the returns of other years. Although annualized returns differed between countries (when the full sample was considered), the differences in risk levels were not in the same proportion as the differences in returns. Comparing returns and risks in absolute terms, Nepal has one of the highest risks, but its return ranks only about third among the SAARC countries. On the other hand, Bangladeshi and Sri Lankan returns are the highest, with moderate risk levels. This gives investors an opportunity to gain extra return without having to take proportionately higher risk, which is in contrast to what the EMH advocates.
The opportunity seems to exist, but investors in South Asia are constrained by legal restrictions on capital mobility in their own countries. Economic integration in the region is very poor, as stated by Dubey (2007), and there are many restrictions on cross-border trade and investment. The one-way ANOVA test reveals that there was no week-of-the-month effect and no season-of-the-year effect in any of the stock markets in the region. It may therefore be stated that a time effect on return differences is non-existent in the stock markets of the South Asian region. However, one should keep in mind that an analysis of the day-of-the-week effect is more appropriate for drawing conclusions about time effects. At least from our findings, we can state that investors who enter the stock market by selecting a particular week or season, and exit by similar logic, will not generate extra gain, as the time effect is absent.