A Survey of Multimedia Videoconferencing Systems and a Proposal for a Novel Hybrid Cloud and P2P Architecture

Advances in Internet and network technology have enabled the development and deployment of new multipoint multimedia services such as long-distance education, IPTV, distributed games and videoconferencing systems. In this paper, we focus on videoconferencing systems, which represent the most complex type of video communication. These systems have been used for several years in business and, more recently, in everyday life; they allow interactive communication and facilitate joint work among users regardless of their geographical location. We describe a variety of proposed videoconferencing solutions and classify them according to characteristics such as network architecture and video coding technology, and we propose a novel hybrid cloud and P2P architecture that introduces a function for QoS adaptation.


I. INTRODUCTION
The Internet has become an essential part of our daily life. Many people now stay connected through email and instant messaging services. With the development of multimedia and network technologies, real-time transfer of multimedia data can now be supported, and there is increasing demand for real-time group communication applications (e.g., videoconferencing, online gaming, and long-distance education). Internet videoconferencing belongs to the category of group communication, unlike applications that consist of point-to-point conversations or file transfers. This type of application has several distinguishing characteristics.
First, it is well suited to meetings, discussions, seminars and workshops. For this reason, the group is typically small, with fewer than ten participants, and the membership usually changes rapidly: any member may join, leave or invite other members to the conference at any time.
Second, there are often a handful of sources and a large number of receivers. For each source, at least two types of media streams are involved, voice and video, both of which are highly bandwidth intensive. On the other hand, most Internet users have very limited bandwidth, and Internet connections are highly diverse, including dial-up, DSL, cable modem and LAN. Serving all these types of users is very challenging. Third, videoconferencing is a real-time application involving two-way communication, so it has very stringent requirements on end-to-end latency. This differs from media streaming, which involves only one-way data transmission and tolerates a few seconds of buffering at the receiver.
Finally, providing a multipoint videoconferencing service is challenging because of its high bandwidth demand and strict streaming quality requirements. Table 1 presents the constraints on this kind of system.

Scalability
The system must scale with the number of users connected to the service (multi-conference service).

Bandwidth constraint
The video streaming rate should not exceed the channel capacity.

Real-time constraint
The delay in video packet delivery should not exceed the play-out deadline of a video frame at reception time.

Quality of Service (QoS)
QoS must guarantee a minimum decoded video quality and a maximum transmission error rate over the duration of the streaming session despite the variation in channel conditions.
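The bandwidth and real-time constraints in Table 1 can be read as two simple admission checks. The following is a minimal Python sketch under stated assumptions: the function names, the units and the interpretation of the play-out deadline as a delay budget are illustrative and do not come from any surveyed system.

```python
def satisfies_bandwidth(stream_rate_kbps: float, channel_capacity_kbps: float) -> bool:
    """Bandwidth constraint: the streaming rate must not exceed the channel capacity."""
    return stream_rate_kbps <= channel_capacity_kbps


def satisfies_deadline(send_time_ms: float, arrival_time_ms: float,
                       playout_deadline_ms: float) -> bool:
    """Real-time constraint: the delivery delay must not exceed the play-out deadline.

    Assumption: the deadline is expressed as a maximum tolerated delay in ms,
    and both timestamps come from the same clock.
    """
    return (arrival_time_ms - send_time_ms) <= playout_deadline_ms
```

A packet that fails the second check arrives too late to be displayed, so a receiver would typically discard it rather than play it out.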
In this paper, we present an overview of videoconferencing solutions and compare them. The remainder of this paper is organized as follows. Section 2 presents different solutions for videoconferencing services. Section 3 surveys the current state of the art in overlay network solutions, namely ALM, P2P and CDN, and provides a qualitative comparison. Section 4 presents a comparative study of the surveyed work. Section 5 presents a new architecture for videoconferencing systems that introduces a function for QoS adaptation. We conclude in Section 6.

II. STATE OF THE ART: VIDEOCONFERENCING SYSTEMS
Generally, videoconferencing services are provided in two different ways. The first type is the centralized system. It uses a high-quality videoconferencing room with professional equipment and dedicated bandwidth, and is usually implemented by means of either a centralized client-server architecture or a high-performance device called a Multipoint Control Unit (MCU) [1]. An MCU is a central device with a larger Internet connection bandwidth than a regular participant. Such systems use a centralized server to distribute the video signal and can serve N participants in a multipoint conference. Each participant needs software such as WebEx 2003 from Cisco, Polycom ViewStation, NetMeeting from Microsoft or Radvision to log in to the conference and communicate directly with other participants. The main drawbacks of the centralized system are its lack of scalability and the complexity and operating cost of the MCUs. In addition, it requires more bandwidth to disseminate a single video signal among participants.
Aug 20, 2013
The second type runs on personal computers. It is often free of charge and easy to install and use, although quality cannot be guaranteed. This is the multicast videoconferencing system. Multicasting is one of the most efficient mechanisms for distributing data: a sender can transmit its information to a large number of receivers without sending multiple copies of the same data over a physical link. It is another way to reduce the bandwidth demands of videoconferencing whenever the underlying network supports it.
The evolution of multicast technology has gone through two stages. The earlier one is known as IP multicast. The idea was first introduced by Steve Deering [2], who suggested that multicast functionality should be implemented at the network layer, where a delivery tree composed of routers is usually employed and data packets are replicated only at branching nodes. Several MBone applications use this approach for multimedia conferencing, such as RAT [3] and VIC [4]. This approach has not been fully deployed because of concerns about its complexity, scalability and security, and the lack of multicast support in many organizations.
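The saving from replicating packets only at branching nodes can be illustrated by counting how many packet copies cross physical links. The sketch below is illustrative only: the topology, the node names and the helper functions are hypothetical, and each receiver is assumed to be reached along one fixed router path.

```python
def unicast_link_load(paths):
    """Copies crossing links when the source sends one unicast copy per receiver."""
    return sum(len(path) - 1 for path in paths.values())


def multicast_link_load(paths):
    """Copies crossing links when routers replicate only at branching nodes.

    Each distinct link of the delivery tree carries the packet exactly once.
    """
    links = set()
    for path in paths.values():
        links.update(zip(path, path[1:]))
    return len(links)


# Hypothetical topology: source "s", routers "a", "b", "c", receivers "r1".."r3".
paths = {
    "r1": ["s", "a", "b", "r1"],
    "r2": ["s", "a", "b", "r2"],
    "r3": ["s", "a", "c", "r3"],
}

print(unicast_link_load(paths))    # 9 link transmissions
print(multicast_link_load(paths))  # 6 link transmissions
```

In this toy case the shared links s-a and a-b carry the packet once instead of two or three times, which is where the bandwidth saving comes from.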
After a decade of research into the various issues of IP multicast, such as routing, group management, address allocation, authorization and security, Quality of Service (QoS) and scalability, widespread deployment of IP multicast on the global Internet has been dogged by technical, administrative and business-related issues [5]. These issues are detailed in El-Sayed's PhD thesis, which presents many technical and marketing reasons, such as the lack of a scalable inter-domain multicast routing protocol, the requirement for global deployment of multicast-capable IP routers and the lack of practical pricing models. Therefore, there have been proposals for alternative group communication services that either grow out of the IP multicast model and still support IP multicast or offer a competing model.
El-Sayed et al. [6] survey multicasting approaches that are alternatives to classic IP multicast. These include using reflectors, permanent tunneling (e.g., MBone), relying on specific routing services such as IPv6, and Application Layer Multicast or automatic tunneling. The most important solution is the overlay network. Given its advantages, which include simple configuration, flexible implementation and the ability to customize attributes such as data transcoding, error recovery, flow control, scaling, management and message security, the overlay technique has become the subject of much research and is being used as a basis for the development of new models.

III. OVERLAY NETWORKS
An overlay network is a computer network built on top of another network. Its nodes are considered to be connected by virtual or logical links, each corresponding to a path, perhaps through many physical links, in the underlying network. Three kinds of overlay are used to transfer content: the Application Layer Multicast (ALM) network, the Peer-to-Peer (P2P) network and the Content Distribution Network (CDN).

A. ALM Solutions
The major difference between IP multicast and ALM, also termed End-System Multicast (ESM), is that in the former packets are replicated at routers, whereas in the latter packets are replicated at end hosts. Specifically, in ALM, members of a multicast group communicate via an overlay network in which each edge corresponds to a direct unicast path between two group members. ALM has been successfully adopted by many applications to reduce network transmission latency and to operate in networks without network-layer multicast. In the literature, we found six representative pieces of work on ALM protocol design for videoconferencing applications.
Chu et al. [7] explore the use of ESM, one of the first application-layer multicast protocols, for conferencing applications. In ESM, end systems self-organize into an overlay structure using a fully distributed protocol named Narada. Narada adopts a mesh-first strategy in constructing multicast trees: it forms a richly connected graph (called a mesh) and then generates source-specific data distribution trees on the mesh using the multicast routing protocol DVMRP (Distance Vector Multicast Routing Protocol). The disadvantage of this protocol is that there is no control over the resulting spanning tree for a given mesh.
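The mesh-first idea can be sketched as a two-step computation: take the mesh as given, then derive one shortest-path tree per source. The sketch below is a simplification under stated assumptions: Narada runs a distributed distance-vector protocol among the members, whereas here a centralized Dijkstra computation is used as a stand-in, and the mesh, its latencies and the function name `source_tree` are hypothetical.

```python
import heapq


def source_tree(mesh, source):
    """Return {node: parent} for the shortest-path tree rooted at `source`.

    `mesh` maps each member to {neighbor: latency_ms} for its overlay links.
    """
    dist = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, w in mesh[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent


# Hypothetical four-member mesh with symmetric link latencies in ms.
mesh = {
    "A": {"B": 10, "C": 40},
    "B": {"A": 10, "C": 15, "D": 50},
    "C": {"A": 40, "B": 15, "D": 10},
    "D": {"B": 50, "C": 10},
}

print(source_tree(mesh, "A"))  # tree used when A is the sender
print(source_tree(mesh, "D"))  # a different tree when D is the sender
```

Because every tree is a subgraph of the mesh, the quality of the trees is bounded by the quality of the mesh, which is the limitation noted above.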
Unlike Narada, ALMI [8] is a centralized protocol. Each session has a session controller that is fully responsible for membership registration and multicast tree maintenance. The multicast tree is a shared tree constructed with a tree-first strategy. The session controller periodically recalculates a new tree based on the end-to-end measurements collected by session members. Although a shared tree is easy to manage, its delay properties are not as good as those of source-specific trees. The centralized design also causes two problems: (1) if the controller fails, the multicast tree has to stay unchanged and is thus vulnerable to network changes; (2) during the switch between multicast trees, there is noticeable turbulence in performance.
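The controller's periodic recalculation could be sketched as follows. The text above does not say how the shared tree is computed; this sketch assumes a minimum spanning tree over the measured pairwise latencies, built with Prim's algorithm. The function name and the latency values are hypothetical.

```python
def shared_tree(latency_ms):
    """Build one shared tree over all members with Prim's algorithm.

    `latency_ms[a][b]` is the measured latency between members a and b;
    a full matrix of pairwise measurements is assumed.
    Returns the tree as a list of (parent, child) edges in insertion order.
    """
    members = sorted(latency_ms)
    in_tree = {members[0]}
    edges = []
    while len(in_tree) < len(members):
        # Pick the cheapest edge that connects the tree to a new member.
        u, v = min(
            ((a, b) for a in in_tree for b in members if b not in in_tree),
            key=lambda edge: latency_ms[edge[0]][edge[1]],
        )
        in_tree.add(v)
        edges.append((u, v))
    return edges


# Hypothetical measurements reported by four session members (ms).
latency = {
    "A": {"B": 10, "C": 40, "D": 45},
    "B": {"A": 10, "C": 15, "D": 50},
    "C": {"A": 40, "B": 15, "D": 12},
    "D": {"A": 45, "B": 50, "C": 12},
}

print(shared_tree(latency))
```

With a single shared tree, data from D to A travels D, C, B, A (37 ms) even though a direct path exists, which illustrates why shared trees are easy to manage but not delay-optimal for every source.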
The protocol for multi-sender 3D videoconferencing [9] uses a hybrid approach of the above two systems. It adopts a centralized approach similar to ALMI for tree management and uses a mesh-first strategy similar to Narada for tree generation.
The novel idea in this protocol is to use a two-algorithm approach for participant joining: if the local algorithm fails to attach a new receiver, the global algorithm is used to investigate a rearrangement of all trees. However, it still has the following shortcomings that prevent practical deployment: (1) it suffers from the single-point-of-failure problem, just as ALMI does; (2) it does not take the dynamic nature of the Internet into consideration and assumes static available bandwidth when calculating multicast trees; (3) the global algorithm rearranges all trees without considering their original topologies, so there will inevitably be jitter and long latencies while the trees are switched.
DigiMetro [10] is a fully distributed protocol. All members are logically equal: each maintains a complete member list and takes full charge of its own multicast tree. DigiMetro makes a clear distinction between the concept of a conference and that of a multicast session. While a conference is made up of a group of members, a multicast session is composed of a single data source and a number of receivers. Thus, in a multi-party conference, there are multiple multicast sessions, since every conference member is a data source.
Lim et al. [11] proposed another approach named N-Tree, a bandwidth-fair application-layer multicast for multi-party videoconferencing. It builds a distribution tree for each source and aims to satisfy latency and multicast bandwidth fairness requirements. The N-Tree algorithm is shown to be convenient for videoconferencing with a small number of participants.
In [12], a new application layer multicast algorithm using distributed service architecture and scalable video coding is proposed for scalable videoconferencing services. The proposed algorithm considers the limitations of the human perception while participating in a videoconference so as to minimize traffic that is not necessary for the communication session. The newly proposed algorithm can effectively reduce the total traffic load of the scalable videoconferencing service.

B. P2P solutions
P2P is another type of overlay network used to transfer video stream chunks. It is a powerful platform for a variety of multimedia streaming applications over the Internet, such as video-on-demand, videoconferencing and live broadcasting. A P2P system is extremely cost-effective since it uses the resources (CPU cycles, storage space and uplink bandwidth) of peer machines. Every node can directly access the data and computing resources of the others; such systems constitute the distributed computing model shown in Figure 2. Because the nodes, called peers, offer resources such as bandwidth, processing and storage capacity to other nodes, the global resources of the network grow as the number of users increases. However, using this technology in a videoconferencing system greatly reduces the video quality experienced by the user. In the literature, we found four representative pieces of work on P2P videoconferencing applications.
Vanets [13] is a P2P videoconferencing system that distinguishes between active and passive participants (active participants produce video streams, whereas passive participants are viewers only). Vanets takes advantage of transcoding (see [14]) to allocate streaming rates optimally for all peers participating in the conference. In other words, transcoding can change the bit rate to meet the requirements of peers, as explained in [15]. In transcoding, the relaying peer changes the video signal to a lower encoding rate through either re-encoding or changing key parameters such as the quantization values of Perlman.
Akkus et al. [16] use layered video in a P2P multiparty videoconferencing system under the assumption that each peer can send and receive at least one full-quality video stream, i.e., could participate in a one-to-one video conference. They consider an optimization problem in which the number of base-layer video receivers is minimized. In this approach, they use periodic round-trip time (RTT) measurements between any two peers and assume that the delay in each direction is the same and depends only on the network latency.
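Under the symmetric-delay assumption, a one-way delay estimate follows directly from the RTT samples. The sketch below is illustrative: averaging the periodic samples is an assumption of this example, and the function name is hypothetical.

```python
def one_way_delay_ms(rtt_samples_ms):
    """Estimate one-way delay as half the mean RTT.

    Assumes the forward and reverse paths have equal delay and that
    processing time at the peers is negligible.
    """
    mean_rtt = sum(rtt_samples_ms) / len(rtt_samples_ms)
    return mean_rtt / 2


print(one_way_delay_ms([80, 100, 120]))  # 50.0
```

On asymmetric access links such as DSL, the two directions can differ noticeably, so this estimate should be read as an approximation.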
The multi-rate, multi-party P2P videoconferencing system [17] allows different receivers in the same group to receive a video at different rates. In this system, an optimal set of tree structures is determined for routing multi-rate content using scalable video coding [18]. The system divides peers into groups; each group can be represented by a tree, and each peer in the tree can receive a different video rate rather than a single rate [19].
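With scalable video coding, a receiver's rate is determined by how many layers it subscribes to. The following minimal sketch shows only that per-receiver layer selection; it does not reproduce the tree optimization of [17]. The layer rates, the function name and the greedy rule are assumptions of this example.

```python
def select_layers(layer_rates_kbps, downlink_kbps):
    """Return (layer_count, total_rate) for the layers that fit the downlink.

    Layers are listed base layer first; an enhancement layer is only
    useful if all lower layers are also received, so selection stops at
    the first layer that does not fit.
    """
    total = 0
    count = 0
    for rate in layer_rates_kbps:
        if total + rate > downlink_kbps:
            break
        total += rate
        count += 1
    return count, total


layers = [128, 256, 512]  # hypothetical base + two enhancement layers (kbps)
print(select_layers(layers, 400))   # (2, 384)
print(select_layers(layers, 1000))  # (3, 896)
```

Two receivers of the same source can therefore end up with different video rates without the source encoding the stream twice.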
Celerity [20] is an approach to multiparty videoconferencing with an emphasis on low end-to-end delay. Like our approach, it uses multicast trees of depth at most 2, but it removes all paths that would violate a delay bound. Its model assumes that the capacity bottleneck can be anywhere in the network, unlike our approach, in which only the uplink bandwidth of peers is considered. The authors present a primal subgradient algorithm based on packet loss rate to solve an optimization problem similar to ours with an added delay bound.
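The delay-bound pruning on depth-2 delivery can be sketched by enumerating, for each receiver, the direct path and all one-relay paths from the source and keeping only those within the bound. This is a simplified illustration working on individual paths, not Celerity's actual tree-packing or rate-allocation algorithm; the delay matrix and the function name are hypothetical.

```python
def feasible_paths(delay_ms, source, bound_ms):
    """Map each receiver to the depth-1 and depth-2 paths within the delay bound.

    `delay_ms[a][b]` is the one-way delay from a to b.
    """
    receivers = [node for node in delay_ms if node != source]
    paths = {}
    for r in receivers:
        options = []
        if delay_ms[source][r] <= bound_ms:
            options.append((source, r))  # depth 1: direct delivery
        for relay in receivers:
            if relay == r:
                continue
            if delay_ms[source][relay] + delay_ms[relay][r] <= bound_ms:
                options.append((source, relay, r))  # depth 2: one relay
        paths[r] = options
    return paths


# Hypothetical one-way delays in ms.
delay = {
    "S": {"A": 40, "B": 120},
    "A": {"S": 40, "B": 50},
    "B": {"S": 120, "A": 50},
}

print(feasible_paths(delay, "S", bound_ms=100))
```

Here B cannot be served directly within 100 ms but can be reached through A in 90 ms, while relaying through B to reach A is pruned.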

C. CDN solutions
In the late 1990s, in order to ease the pressure on the server side, Content Delivery Networks (CDNs) developed rapidly and became an effective scheme for solving multicast problems, as shown in Figure 3. A CDN is a virtual network overlaid on the Internet and composed of groups of server nodes located in different regions. It is responsible for publishing the content of the origin servers efficiently and stably to the locations nearest to the clients, so that user requests are served in a highly efficient way.

Figure 3: CDN (content distribution network) deployment
The basic idea of a CDN is to avoid, as far as possible, the Internet links that may affect the speed and stability of data transmission, allowing users to obtain the desired data from a nearby node, which reduces latency and can also alleviate network congestion. However, deploying a large number of proxy servers at the edge of each region is very costly, and overall a CDN does not reduce the bandwidth wasted by transmitting the same content repeatedly over the network. Likewise, it does not meet the growing demand from video users.
There are generally two types of content distribution models, as classified in [21]: the fluid model and the chunk content distribution model. The fluid model provides continuous transfer of the content from the source to multiple receivers. It has a tightly coupled connection between adjacent peers (content is distributed continuously, bit by bit, from source to destination); it is therefore considered an optimal distribution model for using the bandwidth of fast peers, while causing congestion for slow peers. The second type, the chunk content distribution model, chops the content into equally sized pieces (called chunks) and then distributes each chunk. A peer does not distribute a piece until it has fully received that piece. The chunk model is considered a loosely coupled connection (it stores chunks prior to their distribution). Consequently, this model is considered an optimal distribution model for slow peers.
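The store-and-forward behaviour of the chunk model can be sketched in a few lines. This is a minimal illustration, not the model of [21]: the class and function names are hypothetical, and the final chunk is allowed to be shorter than the others when the content length is not a multiple of the chunk size.

```python
def split_into_chunks(content: bytes, chunk_size: int):
    """Chop content into pieces of chunk_size bytes (the last may be shorter)."""
    return [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]


class ChunkPeer:
    """Peer that makes a chunk available only after receiving it in full."""

    def __init__(self, chunk_size: int):
        self.chunk_size = chunk_size
        self.buffer = b""  # bytes of the chunk currently being received
        self.ready = []    # complete chunks that may now be forwarded

    def receive(self, data: bytes) -> None:
        self.buffer += data
        while len(self.buffer) >= self.chunk_size:
            self.ready.append(self.buffer[:self.chunk_size])
            self.buffer = self.buffer[self.chunk_size:]


peer = ChunkPeer(chunk_size=3)
peer.receive(b"ab")   # partial chunk: nothing can be forwarded yet
peer.receive(b"cd")   # first chunk completes, one byte is left over
print(peer.ready)     # [b'abc']
```

A fluid-model peer would instead forward each byte as soon as it arrives, which couples its output rate tightly to its input rate.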
In [22], the server-based infrastructure is converted into a peer-to-peer videoconferencing system while preserving the functionality and features of the existing MCS. This conversion is achieved using a hybrid content distribution model, a combination of the fluid and chunk content distribution models, to distribute parts of the video stream fairly among participants. The hybrid content distribution model offers a better way of handling heterogeneous networks because it can distinguish between a fast peer and a slow peer and deal with each according to its capabilities.
A recent study [24] surveys free multi-party videoconferencing systems and reports measurements comparing four representative systems, including Nefsis [25], in terms of performance, mechanisms and quality of experience.
Nefsis provides dedicated cloud computing resources for video conferencing. Users automatically connect to geographically close servers distributed on the Internet to have a low-latency experience (Nefsis, online).
Many other popular online chat applications (such as Skype, MSN, Yahoo Messenger and Google Talk) support only multi-party audio conferences and two-party videoconferences, and are therefore not considered here.

IV. COMPARATIVE STUDY
We consider eleven videoconferencing applications, for which we list the maximum bandwidth they can support, the maximum delay, the maximum number of simultaneous conference participants, the category of architecture and the video coding technology. Table 2 presents this comparison. From Table 2, we observe that the maximum bandwidth is 14 Mbps, which corresponds to cloud computing solutions, and that the second-highest is 2 Mbps, which corresponds to scalable coding; the solutions based on P2P architectures feature the best bandwidth and the best delay (<100 ms).
Based on these observations, we make the following inferences on the applicability of the approaches to different applications:
• The ALM approach can disseminate video signals faster, but its main disadvantage is that it copes badly with heterogeneous networks.
• The P2P architecture does not need any special hardware or network infrastructure support. There is no central server, and the conference needs to be organized and managed by its participants, which reduces the chance of a single point of failure but increases the complexity of conference management. All audio/video streams are transferred among peers rather than between clients and a server, which alleviates stress on the network. However, because each participant's network situation differs, QoS cannot be guaranteed as it is in an MCU-based solution.
• The CDN approach can disseminate video signals faster and gives more control over asset delivery and network load, but its main disadvantage is the fees associated with the service: many of the larger CDNs have high setup fees and other hidden fees.

V. OPEN ISSUES AND FUTURE WORK
The list of videoconferencing systems presented in this article covers the diversity of approaches and serves to illustrate their characteristics, but it is not exhaustive, since it focuses on relatively early efforts to the exclusion of protocols that have emerged more recently. In recent years, cloud computing has been widely applied in e-business, e-education and similar fields. On the one hand, cloud computing is Internet-based computing in which resources, software and information are provided to computers on demand, like a public utility; it is emerging as a platform for sharing resources such as infrastructure, software and various applications. On the other hand, P2P networking has favorable characteristics, such as high scalability, self-configuration and self-organization, and many people consider P2P networks a suitable infrastructure for supporting real-time streaming. However, P2P networks possess dynamic characteristics that can drastically decrease the performance of these real-time applications. In this paper, we propose a novel hybrid cloud and P2P architecture for videoconferencing systems that operates in both a centralized and a peer-to-peer distributed manner. We expect cloud technology to be the key to bringing the centralized architecture back into multimedia communication, and we believe that expanding it with P2P streaming support could benefit both cloud service providers and end users.
In order to take advantage of cloud technology and make multimedia streaming more efficient, we introduce APIs (application programming interfaces) in the cloud containing built-in functions for automatic QoS adaptation. These functions calculate QoS parameters such as bandwidth, jitter and latency between a cloud service provider and its potential clients, and can effectively reduce the total traffic load. We suggest that, by extending this feature, two automatic functions can be implemented: one that calculates the QoS parameters (bandwidth, jitter and latency), and one that performs adaptation based on human perception to minimize unnecessary traffic on the P2P overlay network during the communication session. The functions would be part of a web service and would be presented to the cloud provider's clients through a user-friendly interface. By connecting to the provider's page, clients could consult the current status of a streaming content item, the connected clients using the service and their QoS status. Figure 4 gives a general view of our proposed architecture. It depicts the cloud containing the multimedia streaming servers (their number depends on the provider of the streaming service and can scale as the number of customers or requests rises). The service has first-level, or directly connected, clients (Client A) and higher-level clients (Clients B and C). The idea is that first-level clients, after logging in, consult and choose one of three price packages (high quality, medium quality or low quality). They then contract directly with the provider of the service and enjoy high streaming quality.
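The two functions could look roughly like the sketch below. It is an illustration under stated assumptions, not the proposed API: the function names, the packet-trace format and the tier thresholds are hypothetical, sender and receiver clocks are assumed to be synchronized, and jitter is computed with the running interarrival-jitter estimator defined for RTP in RFC 3550. The perceptual model behind the adaptation is not specified in the text, so a simple threshold rule on the three quality levels stands in for it.

```python
def measure_qos(packets):
    """Compute latency, jitter and bandwidth from a packet trace.

    `packets` is a list of (send_time_s, recv_time_s, size_bytes) tuples in
    arrival order, with at least two entries.
    """
    transit = [recv - send for send, recv, _ in packets]
    latency_ms = 1000 * sum(transit) / len(transit)

    # RFC 3550 running estimate: J += (|D| - J) / 16, where D is the
    # change in transit time between consecutive packets.
    jitter = 0.0
    for prev, cur in zip(transit, transit[1:]):
        jitter += (abs(cur - prev) - jitter) / 16

    # Throughput: bits received after the first packet, divided by the
    # time between the first and last arrival.
    duration = packets[-1][1] - packets[0][1]
    bits = 8 * sum(size for _, _, size in packets[1:])
    bandwidth_kbps = bits / duration / 1000 if duration > 0 else 0.0

    return {"latency_ms": latency_ms,
            "jitter_ms": 1000 * jitter,
            "bandwidth_kbps": bandwidth_kbps}


# Hypothetical thresholds: (tier, minimum kbps, maximum latency in ms).
TIERS = [("high", 1500, 150), ("medium", 600, 250)]


def choose_quality(qos):
    """Pick the best quality tier whose requirements the measured QoS meets."""
    for name, min_kbps, max_latency_ms in TIERS:
        if qos["bandwidth_kbps"] >= min_kbps and qos["latency_ms"] <= max_latency_ms:
            return name
    return "low"


trace = [(0.00, 0.05, 1000), (0.02, 0.07, 1000), (0.04, 0.10, 1000)]
qos = measure_qos(trace)
print(qos, choose_quality(qos))
```

Separating measurement from adaptation keeps the decision rule replaceable: a perception-based model could be substituted for `choose_quality` without changing how the parameters are collected.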

VI. CONCLUSION
In this article, the latest achievements, techniques and models in the area of videoconferencing systems have been presented. In Section 4, we showed the benefits of using a P2P architecture and scalable coding to improve the characteristics of solutions and satisfy many of the constraints of this type of application. In Section 5, we proposed a novel hybrid cloud and P2P architecture for videoconferencing systems that combines centralized and distributed architectures. A hybrid distribution network can be an efficient solution for videoconferencing services. This improvement can be obtained by strategically placing certain distribution network nodes in the cloud provider's infrastructure, taking advantage of the reduced packet loss and low latency among its datacenters.

REFERENCES
[1] ITU-T Study Group XV, Recommendation H.231: Multipoint control units for audiovisual systems using digital channels up to 1920 kbit/s, 1993.