A NOVEL APPROACH OF JOB ALLOCATION USING MULTIPLE PARAMETERS IN CLOUD ENVIRONMENT

Cloud computing is Internet ("cloud") based development and use of computer technology ("computing"): a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. This research deals with balancing the workload in a cloud environment. Load balancing is one of the essential factors in enhancing the performance of a cloud service provider. (Grid computing, by comparison, utilizes distributed heterogeneous resources to support complicated computing problems, and grids can be classified into two types: computing grids and data grids.) In this research work, a multi-objective load balancing algorithm is proposed to avoid deadlocks and to ensure proper utilization of all the virtual machines (VMs) while processing user requests, using VM classification. The capacity of each virtual machine is computed from multiple parameters: MIPS, RAM and bandwidth. Heterogeneous virtual machines of different MIPS and processing power, spread across multiple data centers with different hosts, are created in a cloud simulator. The VMs are divided into two clusters using the K-Means clustering mechanism, in terms of processor MIPS, memory and bandwidth. The cloudlets are divided into two categories, high QoS and low QoS, based on instruction size: a cloudlet whose task size is greater than a threshold value enters the high-QoS category, and one whose task size is smaller enters the low-QoS category. The user's job is submitted to the datacenter broker, which first finds a suitable VM according to the requirements of the cloudlet and matches a VM depending on its availability. Multiple parameters are evaluated: waiting time, turnaround time, execution time and processing cost.
This modified algorithm has an edge over the original approach in that each cloudlet builds its own individual result set, which is later combined into a complete solution.


INTRODUCTION
Cloud is a term used as a metaphor for wide area networks (like the Internet) or any such large networked environment. It came partly from the cloud-like symbol used to represent the complexities of networks in schematic diagrams. It represents all the complexities of the network, which may include cables, routers, servers, data centers and all such other devices. Cloud computing is an on-demand service in which shared resources, information, software and other devices are provided according to the client's requirements at a specific time. It is a term generally used in connection with the Internet; the whole Internet can be viewed as a cloud. Capital and operational costs can be cut using cloud computing [36]. With traditional desktop computing, we run copies of software programs on our personal computer, and the documents we create are stored on our own PC. Although documents can be accessed from other computers on the network, they cannot be accessed by computers outside the network. This is PC-centric. With cloud computing, the software programs one uses are not run from one's personal computer but are instead stored on servers accessed via the Internet. If a computer crashes, the software is still available for others to use. The same goes for the documents one creates: they are stored on a collection of servers accessed through the Internet, and anyone with permission can not only access the documents but also edit and collaborate on them in real time. Two essential characteristics of the cloud model are:
• Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.
• Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.

LOAD BALANCING
It is the process of redistributing the total load among the individual nodes of the collective system so as to make resource utilization effective and to improve the response time of the job, while simultaneously removing the condition in which some nodes are overloaded and others are underloaded. It is used to achieve high user satisfaction and a high resource utilization ratio, hence improving the overall performance of the system. Proper load balancing helps in utilizing the available resources optimally, thereby minimizing resource consumption. It also helps in implementing fail-over, enabling scalability, avoiding bottlenecks and over-provisioning, reducing response time, etc. Load balancing is a method to distribute workload across one or more servers, network interfaces, hard drives, or other computing resources [13]. Typical datacenter implementations rely on large, powerful (and expensive) computing hardware and network infrastructure, which are subject to the usual risks associated with any physical device, including hardware failure, power and/or network interruptions, and resource limitations in times of high demand. Load balancing in the cloud differs from classical thinking on load-balancing architecture and implementation by using commodity servers to perform the load balancing. This provides new opportunities and economies of scale, while presenting its own unique set of challenges. Load balancing is used to make sure that none of the existing resources are idle while others are being utilized. Modern high-traffic websites must serve hundreds of thousands, if not millions, of concurrent requests from users or clients and return the correct text, images, video, or application data, all in a fast and reliable manner. To cost-effectively scale to meet these high volumes, modern computing best practice generally requires adding more servers [28].
A load balancer acts as the "traffic cop" sitting in front of your servers and routing client requests across all servers capable of fulfilling those requests in a manner that maximizes speed and capacity utilization and ensures that no one server is overworked, which could degrade performance. If a single server goes down, the load balancer redirects traffic to the remaining online servers. When a new server is added to the server group, the load balancer automatically starts to send requests to it.

Figure 1. Model of Load Balancing [28]
In order to distribute the necessary tasks, load balancers go through a series of steps. First, the load balancer will query the available servers to ensure their availability. The load balancer pings a server, and if the expected response occurs, it will be included in the available list. If the server fails to respond, it will not be used until another test is performed and it returns with the appropriate response.
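The availability-polling step described above can be sketched as follows. This is an illustrative stand-in, not code from any particular load balancer: the `Probe` interface is a hypothetical placeholder for a real health check (ICMP ping, TCP connect, HTTP GET, etc.).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the health-check step: servers that answer the probe join
// the available list; the rest are skipped until a later re-test succeeds.
public class HealthChecker {

    // Placeholder for a real probe (ICMP ping, TCP connect, HTTP GET, ...).
    public interface Probe {
        boolean isAlive(String server);
    }

    // Returns only the servers that responded to the probe.
    public static List<String> availableServers(List<String> all, Probe probe) {
        List<String> available = new ArrayList<>();
        for (String server : all) {
            if (probe.isAlive(server)) {
                available.add(server);
            }
        }
        return available;
    }
}
```

A server that fails the probe simply stays off the list; it rejoins the rotation only after a subsequent polling round returns the expected response.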

RELATED WORK
We have studied and analyzed many load balancing techniques for cloud computing. This detailed literature review shaped the theme of our work, including the techniques incorporated in its various phases, their implementation, and the evaluation of our results and conclusions. One study proposed a new technique to achieve load balancing called the Load Balancing with Optimal Cost Scheduling Algorithm. The user selects a list of services available from the service pack. The scheduler assigns these tasks to virtual machines (VMs) based on the configuration and computing power of each VM. The task is completed with minimum execution cost, which is a profit for the service provider, and minimum execution time, which is an advantage for both the service provider and the user. Two limitations of this approach are:
• While computing the midpoint, a single virtual machine can be assigned multiple tasks of large instruction size.
• No suitable criteria have been defined for handling faulty virtual machines and migrating their tasks at that time.

OBJECTIVES
The primary purpose of the cloud system is that its clients can utilize the resources for economic benefit. A resource allocation management process is required to avoid underutilization or overutilization of the resources, which may affect the services of the cloud.
• To implement and study the performance of existing load balancing algorithms.
• To compute the capacity of a VM based on multiple parameters like MIPS, RAM and bandwidth.
• To optimize the overhead time required for processing the task list and VM list.
• To implement a fault tolerance mechanism for virtual machines.
• To develop the proposed algorithm and compare its performance with the existing algorithm.
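The second objective, computing VM capacity from MIPS, RAM and bandwidth, can be illustrated with a simple weighted score. This is a minimal sketch under assumed weights; the text does not fix a particular combining formula, so the weight values below are purely illustrative.

```java
// Hedged sketch: one plausible way to combine MIPS, RAM and bandwidth
// into a single capacity score. The normalising weights are assumptions
// chosen for illustration, not values prescribed by the algorithm.
public class VmCapacity {

    // Weighted sum of the three parameters; a larger score means a stronger VM.
    public static double score(double mips, double ramMb, double bwMbps) {
        final double W_MIPS = 0.5;   // assumed: processing power dominates
        final double W_RAM  = 0.3;   // assumed weight for memory
        final double W_BW   = 0.2;   // assumed weight for bandwidth
        return W_MIPS * mips + W_RAM * ramMb + W_BW * bwMbps;
    }
}
```

Such a score gives each VM a single number that the clustering step can operate on.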

RESEARCH METHODOLOGY
The proposed algorithm is implemented through the CloudSim simulation software; the application is implemented in the Java language. This research work uses the Datacenter, VM, Host and Cloudlet components from CloudSim to implement the proposed algorithm. Clustering is done at the VM level, on the basis of VM capacity in terms of processor, memory and bandwidth, using the K-Means clustering mechanism. The steps are:
• Create multiple user cloudlets of different requirements and sizes. Cloudlets are divided into two categories: high processing and low processing tasks.
• The decision is based on the length of the cloudlets: a cloudlet whose task size is greater than the threshold value enters the high processing category, and one whose task size is smaller enters the low processing category.
• Submit the user's job to the datacenter broker (DCB). The DCB first finds a suitable VM according to the requirements of the cloudlet and matches a VM depending on its availability.
• Create a backup task for each cloudlet that is going to be executed by a VM.
• If task T1 executes successfully on its VM, the backup of task T1 is no longer required and is deleted.
• If the VM executing the current task becomes faulty, the backup task of T1 is executed by another VM, depending on availability.
• The failed VM is added to a blacklist table so that no further cloudlets are assigned to the faulty VM.
• Change the status of the chosen virtual machine from idle to busy, dispatch the cloudlet to that virtual machine, and update the rating of that particular virtual machine.
• Repeat the same procedure for all the remaining cloudlets.
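The dispatch-with-fault-tolerance steps above can be condensed into the following sketch. It uses plain-Java stand-ins rather than CloudSim's actual `DatacenterBroker`, `Vm` and `Cloudlet` classes, and failures are simulated by a set of faulty VM ids; it shows the blacklist behaviour only, not the full backup-task bookkeeping.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the broker loop described above: try VMs in turn, skip and
// blacklist faulty ones, and record which VM each cloudlet ran on.
public class BrokerSketch {

    // Returns a map from cloudlet index to the id of the VM that ran it.
    public static Map<Integer, Integer> dispatch(
            int cloudletCount,
            List<Integer> vmIds,
            Set<Integer> faultyVms) {          // simulated failures
        Map<Integer, Integer> assignment = new HashMap<>();
        Set<Integer> blacklist = new HashSet<>();
        int next = 0;
        for (int c = 0; c < cloudletCount; c++) {
            for (int tries = 0; tries < vmIds.size(); tries++) {
                int vm = vmIds.get(next % vmIds.size());
                next++;
                if (blacklist.contains(vm)) continue;  // never reuse a failed VM
                if (faultyVms.contains(vm)) {
                    // VM failed: blacklist it; the backup task is retried elsewhere
                    blacklist.add(vm);
                    continue;
                }
                assignment.put(c, vm);  // task ran; its backup can be discarded
                break;
            }
        }
        return assignment;
    }
}
```

Once a VM lands on the blacklist it is never offered another cloudlet, matching the blacklist-table step above.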

K-MEANS CLUSTERING
K-Means clustering aims to partition n observations into k clusters (assume k clusters) in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centers, one for each cluster. These centers should be placed carefully, because different locations cause different results; the better choice is to place them as far away from each other as possible. The next step is to take each point belonging to the given data set and associate it with the nearest center. When no point is pending, the first step is completed and an early grouping is done. At this point, re-calculate k new centroids as the barycenters of the clusters resulting from the previous step. After obtaining these k new centroids, a new binding has to be done between the same data set points and the nearest new center. A loop has thus been generated, and as a result of this loop the k centers change their location step by step until no more changes are made; in other words, the centers do not move any more. Finally, this algorithm aims at minimizing an objective function known as the squared error function, given by:

J(V) = Σ_{i=1}^{c} Σ_{j=1}^{c_i} ( ||x_i − v_j|| )^2

where ||x_i − v_j|| is the Euclidean distance between x_i and v_j,
c_i is the number of data points in the i-th cluster, and
c is the number of cluster centers.
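As a concrete illustration, a minimal K-means over one-dimensional VM capacity scores (k = 2, matching the two clusters used in this work) might look like the following. This is a generic textbook implementation, not the exact code used in the simulator; the far-apart seeding follows the advice above to place initial centers as far from each other as possible.

```java
// Minimal 1-D K-means (k = 2) following the loop described above:
// assign each point to the nearest center, then recompute centers as
// the mean of their members, until the centers stop moving.
public class KMeans1D {

    // Returns the cluster index (0 or 1) of each point.
    public static int[] cluster(double[] points, int maxIter) {
        double c0 = points[0];                       // far-apart seeds:
        double c1 = points[points.length - 1];       // first and last point
        int[] label = new int[points.length];
        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step: nearest center wins.
            for (int i = 0; i < points.length; i++) {
                label[i] = Math.abs(points[i] - c0) <= Math.abs(points[i] - c1) ? 0 : 1;
            }
            // Update step: each center moves to the mean of its members.
            double s0 = 0, s1 = 0;
            int n0 = 0, n1 = 0;
            for (int i = 0; i < points.length; i++) {
                if (label[i] == 0) { s0 += points[i]; n0++; }
                else               { s1 += points[i]; n1++; }
            }
            double new0 = n0 > 0 ? s0 / n0 : c0;
            double new1 = n1 > 0 ? s1 / n1 : c1;
            if (new0 == c0 && new1 == c1) break;     // centers stopped moving
            c0 = new0;
            c1 = new1;
        }
        return label;
    }
}
```

In this work the "points" would be VM capacity values (MIPS, memory, bandwidth combined), and the two resulting clusters correspond to high-capacity and low-capacity VMs.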

PARAMETERS USED

TOTAL PROCESSING TIME
It is defined as the time interval between the request being sent and the response being received by the cloud user/consumer. From the corresponding bar chart, it is clear that the total processing cost has been reduced.

TOTAL WAITING TIME
Waiting time = allocation time − generation time.
From the corresponding bar chart, it is clear that the total waiting time has been reduced.
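Under these definitions the evaluated metrics reduce to simple differences of timestamps. A minimal sketch follows; times are plain doubles in simulator time units, and the turnaround-time definition is an assumption (finish time minus generation time), since the text states only the waiting-time formula.

```java
// Sketch of the evaluation metrics. Times are doubles in simulator
// time units (e.g., seconds).
public class Metrics {

    // Waiting time = allocation time - generation time (as defined above).
    public static double waitingTime(double generationTime, double allocationTime) {
        return allocationTime - generationTime;
    }

    // Turnaround time = finish time - generation time
    // (assumed definition; the text does not spell this one out).
    public static double turnaroundTime(double generationTime, double finishTime) {
        return finishTime - generationTime;
    }
}
```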

CONCLUSION
The performance of the improved load balancing algorithm has been studied in this research work. In order to evaluate the performance of the proposed model, the simulation study has been put through various test conditions. It has been found that the model works well in ensuring an even distribution of the workload. In the current work, it has been assumed that all the incoming requests are independent of each other. The experiment conducted is compared with the previous algorithm, and the results indicate that the approach outperforms the previous algorithm in terms of total processing time, total processing cost and waiting time. To obtain a better solution, the model should be made more realistic by considering further load balancing issues such as data locality, communication cost and flow time, and the results should be tested in a real cloud environment. Within a computational cloud environment, high throughput is often of greater interest than load balancing alone; to address this, the new dynamic load balancing algorithm has been proposed. The experimental results obtained by applying the proposed algorithm in the CloudSim simulator show that the new work outperforms the existing scheduling algorithms in large-scale distributed systems.