MONITORING PARETO TYPE IV SRGM USING SPC

The reliability of a software process can be monitored efficiently using Statistical Process Control (SPC). SPC is the application of statistical techniques to control a process: it studies the best ways of describing and analysing data and then drawing conclusions or inferences from the available data. With the help of SPC, the software development team can identify the software failure process and determine the actions to be taken to assure better software reliability. This paper provides a control mechanism based on the cumulative observations of interval domain data using the mean value function of the Pareto type IV distribution, which is based on a Non-Homogeneous Poisson Process (NHPP). The unknown parameters of the model are estimated using the maximum likelihood estimation approach. The paper also presents an analysis of failure data sets at a particular point and compares the Pareto Type II and Pareto Type IV models.


INTRODUCTION
Software reliability is the most dynamic quality characteristic; it can measure and predict the operational quality of a software system during its intended life cycle. Software reliability is the probability of failure-free operation of software in a specified environment for a specified duration [Musa, 1998; Satya Prasad, 2007]. Statistical Process Control concepts and methods are the best choice for identifying and eliminating human errors in the software development process and for improving software reliability. The NHPP based models are the most important models because of their simplicity, convenience and compatibility. The NHPP based software reliability growth models have proved quite successful in practical software reliability engineering [Musa et al., 1987].
SPC is concerned with quality of conformance. SPC can be divided into control charting and process capability study. Control charts provide a means of determining the type of variation (common cause or assignable cause) that is present in a process. A process capability study determines the ability of the "in control" process to produce a product which meets specifications. The origin of SPC dates back to the 1920s and 1930s at the Western Electric Company and Bell Telephone Laboratories, where Walter Shewhart (1891-1967) recognized that variation in a production process can be understood and controlled through the use of statistical methods.
Here SPC concepts and methods are used to monitor the performance of a software process over time in order to verify that the process remains in a state of statistical control. This helps in finding assignable causes and in making long-term improvements to the software process. Software quality and reliability can be achieved by eliminating the causes or by improving the software process or its operating procedures [4]. The most popular technique for maintaining process control is control charting. The control chart is one of the seven tools for quality control. Software process control is used to ensure that the quality of the final product will conform to predefined standards.
A process is said to be statistically "in-control" when it operates with only chance causes of variation. When assignable causes are present, on the other hand, the process is said to be statistically "out-of-control". SPC provides a real-time analysis to establish controllable process baselines; to learn, set and dynamically improve process capabilities; and to focus on business areas needing improvement. The early detection of software failures will improve software reliability. The selection of proper SPC charts is essential to effective statistical process control implementation and use. The SPC chart selection is based on data, situation and need [5].
This paper presents the Pareto type IV model for analysing a software system using SPC. The layout of the paper is as follows: Section 2 describes the formulation and interpretation of the model for the underlying NHPP; Section 3 describes the Pareto type II software reliability growth model; Section 4 describes the proposed Pareto type IV software reliability growth model; Section 5 discusses parameter estimation of the Pareto type IV model based on interval domain data; Section 6 describes the techniques used for software failure data analysis on live data; and Section 7 gives the conclusion. The conclusion shows that both models indicate the failure situation at the same point.

NHPP MODEL
There are numerous software reliability growth models available for use according to probabilistic assumptions. The Non-Homogeneous Poisson Process (NHPP) based software reliability growth models have proved to be quite successful in practical software reliability engineering [1]. The NHPP model formulation is described in the following lines.
A software system is subjected to failures at random times caused by errors present in the system.

Let $\{N(t),\ t > 0\}$ be a counting process representing the cumulative number of failures by time $t$. Since there are no failures at $t = 0$, we have $N(0) = 0$. It is reasonable to assume that the numbers of software failures during non-overlapping time intervals do not affect each other. In other words, for any finite collection of times $t_1 < t_2 < \dots < t_n$, the $n$ random variables $N(t_1),\ N(t_2) - N(t_1),\ \dots,\ N(t_n) - N(t_{n-1})$ are independent. This implies that the counting process $\{N(t),\ t > 0\}$ has independent increments. Let $m(t)$ represent the expected number of software failures by time $t$. The mean value function $m(t)$ is finite valued, nondecreasing, non-negative and bounded, with the boundary conditions

$$m(t) = \begin{cases} 0, & t = 0 \\ a, & t \to \infty \end{cases}$$

where "a" is the expected number of software errors to be eventually detected.
Suppose $N(t)$ is known to have a Poisson probability mass function with parameter $m(t)$, i.e.,

$$P[N(t) = n] = \frac{[m(t)]^{n}\, e^{-m(t)}}{n!}, \qquad n = 0, 1, 2, \dots$$

Then $N(t)$ is called an NHPP. Thus the stochastic behaviour of software failure phenomena can be described through the $N(t)$ process. Various time domain models that describe the stochastic failure process by an NHPP have appeared in the literature (Kantam and Subbarao, 2009).

Sept 30, 2013
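The Poisson probability mass function of $N(t)$ can be evaluated numerically once a mean value function is chosen. The following is a minimal sketch only: the function names `mean_value` and `nhpp_pmf`, the exponential form of $m(t)$ and the parameter values are assumptions made for illustration, not the Pareto models studied in this paper. Any $m(t)$ with $m(0) = 0$ and $m(t) \to a$ could be substituted.

```python
import math


def mean_value(t, a=100.0, b=0.05):
    """Placeholder mean value function m(t) = a * (1 - exp(-b * t)).

    Hypothetical form and parameter values, chosen only because they
    satisfy the boundary conditions m(0) = 0 and m(t) -> a as t grows.
    """
    return a * (1.0 - math.exp(-b * t))


def nhpp_pmf(n, t, m=mean_value):
    """P[N(t) = n] for an NHPP whose mean value function is m."""
    mu = m(t)
    return math.exp(-mu) * mu ** n / math.factorial(n)


if __name__ == "__main__":
    # Probability of observing exactly 40 cumulative failures by t = 10.
    print(nhpp_pmf(40, 10.0))
```

Summing `nhpp_pmf(n, t)` over all `n` gives a value close to 1, which is a quick sanity check on any mean value function plugged into the sketch.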

MODEL DESCRIPTION: PARETO TYPE II SRGM
We consider m (t) as given by The parameter estimation of the above mean value function is already derived [7].
Using these parameters, we have previously proposed an approach based on Statistical Process Control (SPC) for monitoring software reliability [16]. This is also a Poisson model with mean "a".

THE PROPOSED PARETO TYPE IV SRGM
Let N (t) be the number of errors remaining in the system at time "t"

PARAMETER ESTIMATION BASED ON INTERVAL DOMAIN DATA
In this section we develop expressions to estimate the parameters of the Pareto type IV model based on interval domain data. Parameter estimation is of primary importance in software reliability prediction.
A set of failure data is usually collected in one of two common ways: as time domain data or as interval domain data. In this paper, parameters are estimated for interval domain data.
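For reference, the log-likelihood of an NHPP observed as interval domain data has a standard form, sketched here under stated assumptions: $n_i$ is the cumulative number of errors detected in $(0, t_i)$ for $i = 1, 2, \dots, n$, with the conventions $t_0 = 0$ and $n_0 = 0$, and the additive constant that does not depend on the parameters is omitted. The factorisation $m(t) = a\,F(t)$ used in the second line is an assumption for illustration, with $F$ denoting a distribution function that does not depend on $a$.

```latex
\begin{align*}
\log L &= \sum_{i=1}^{n} (n_i - n_{i-1}) \,
          \log\bigl[\, m(t_i) - m(t_{i-1}) \,\bigr] \;-\; m(t_n), \\
\frac{\partial \log L}{\partial a} &= \frac{n_n}{a} - F(t_n) = 0
  \quad\Longrightarrow\quad a = \frac{n_n}{F(t_n)}
  \qquad \text{(assuming } m(t) = a\,F(t)\text{)}.
\end{align*}
```

Under that assumption the estimate of "a" follows directly once the remaining parameters are known, because the increments $n_i - n_{i-1}$ sum to $n_n$.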
The mean value function of the Pareto type IV model is given by

In order to assess the software reliability, the unknown parameters "a", "b" and "c" are estimated using the Newton-Raphson method. Expressions are now derived for estimating "a", "b" and "c" for the Pareto type IV model.
Assume that the given failure data set consists of the cumulative number of detected errors $n_i$ in a given time interval $(0, t_i)$, where $i = 1, 2, \dots, n$ and $0 < t_1 < t_2 < \dots < t_n$. The logarithmic likelihood function (LLF) for interval domain data [12] is then given by

Differentiating Log L with respect to "a" we have

The parameter "b" is estimated by the Newton-Raphson iterative method using the formula $b_{k+1} = b_k - \dfrac{g(b_k)}{g'(b_k)}$, where $g(b)$ and $g'(b)$ are obtained as follows.
Similarly, the parameter "c" is estimated by using the formula $c_{k+1} = c_k - \dfrac{g(c_k)}{g'(c_k)}$, where $g(c)$ and $g'(c)$ are obtained as follows.
Solving the above equations simultaneously yields the point estimates of the parameters "b" and "c". These equations are solved iteratively, and their solutions, when substituted in turn, give the value of "a".
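The Newton-Raphson update used for "b" and "c" can be sketched generically as follows. This is an illustration under assumptions, not the estimation code of this paper: the function name `newton_raphson`, the tolerance and the iteration cap are hypothetical, and the demonstration solves a toy equation rather than the likelihood equations $g(b) = 0$ and $g(c) = 0$.

```python
def newton_raphson(g, g_prime, x0, tol=1e-10, max_iter=100):
    """Solve g(x) = 0 by iterating x_{k+1} = x_k - g(x_k) / g'(x_k).

    g and g_prime are callables supplied by the caller; x0 is the
    starting value. Raises an error if the derivative vanishes or the
    iteration does not converge within max_iter steps.
    """
    x = x0
    for _ in range(max_iter):
        slope = g_prime(x)
        if slope == 0.0:
            raise ZeroDivisionError("derivative is zero; choose another start")
        x_next = x - g(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("Newton-Raphson did not converge")


if __name__ == "__main__":
    # Toy example only: root of x^2 - 2, starting from x0 = 1.
    root = newton_raphson(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
    print(root)
```

In practice the starting value matters: a poor initial guess for "b" or "c" can make the iteration diverge, which is why the sketch raises an error rather than returning the last iterate.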

DATA ANALYSIS
In this section, we present the analysis of one software failure data set. The set of software errors analysed here is borrowed from a real software development project as published in Pham (2005).

The control limits are calculated by the following equations taking the standard values 0.00135, 0.99865 and 0.5.

These limits are converted to form. They are used to find whether the software process is within control limits or not. Graphical representations of both the models are given below.
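As a hedged illustration of how the three standard probabilities translate into limits on the mean value scale, the sketch below assumes that the mean value function factorises as $m(t) = a\,F(t)$, so that a limit defined by $F(t) = p$ corresponds to $m(t) = a\,p$. The function name `control_limits` and the value of "a" in the demonstration are hypothetical; the estimates obtained for the data set are not reproduced here.

```python
def control_limits(a):
    """Return lower, centre and upper limits on the mean value scale.

    Assumes m(t) = a * F(t), so F(t) = p maps to m(t) = a * p.
    The probabilities 0.00135, 0.5 and 0.99865 are the standard values
    quoted in the text.
    """
    probabilities = {"LCL": 0.00135, "CL": 0.5, "UCL": 0.99865}
    return {name: a * p for name, p in probabilities.items()}


if __name__ == "__main__":
    # Hypothetical estimate a = 100, for illustration only.
    for name, value in control_limits(100.0).items():
        print(name, value)
```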

Fig 2. Mean Value Chart for Dataset1
The Mean Value Chart is obtained by plotting the successive differences on the y axis and the failure week on the x axis, together with the values of the control limits. The Mean Value Chart in Fig 1 shows that at the 14th point the failure data has fallen below the lower control limit. Fig 2 shows that at the 28th point the failure data has fallen below the lower control limit. This indicates that the failure process has been identified. Thus Mean Value Charts are significant for the early detection of failures.
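The charting step described above can be sketched as follows: take successive differences of the plotted values and flag the points that fall outside the control limits. The function names, the sample values and the limits in the demonstration are made up for illustration and are not the data set analysed in this paper.

```python
def successive_differences(values):
    """Differences between consecutive entries of a sequence."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def out_of_control_points(diffs, lcl, ucl):
    """1-based positions of differences below lcl or above ucl."""
    return [i for i, d in enumerate(diffs, start=1) if d < lcl or d > ucl]


if __name__ == "__main__":
    # Hypothetical cumulative values and limits, for illustration only.
    values = [1.0, 3.0, 6.0, 6.5, 12.0]
    diffs = successive_differences(values)
    print(diffs)
    print(out_of_control_points(diffs, lcl=1.0, ucl=5.0))
```

A point below the lower limit is reported in the same way as a point above the upper limit; the caller can split the two cases if only early-failure signals are of interest.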

CONCLUSION
In this paper, a Pareto type IV software reliability growth model with SPC is proposed. A comparison between Pareto Type II and Pareto Type IV shows that the failure points for both models are the same, with the failure data falling below the lower control limit at the 14th point and the 28th point respectively, as shown in Figures 1 and 2. We conclude that the control charts give a positive recommendation for their use in identifying a preferable control process or a desirable out-of-control signal. The early detection of software failures will improve software reliability. Therefore, we may conclude that both models are equally good choices for the early detection of software failures.