Smallest Univalue Segment Assimilating Nucleus approach to Brain MRI Image Segmentation using Fuzzy C-Means and Fuzzy K-Means Algorithms

Image segmentation remains an important task in image processing and analysis. The preprocessing carried out on images prior to segmentation has a great effect on the accuracy of the segmentation task. This paper therefore lays emphasis on the preprocessing stage of brain Magnetic Resonance Imaging (MRI) images using the Smallest Univalue Segment Assimilating Nucleus (SUSAN) and bias field correction algorithms. Subsequently, a brain tissue extraction tool was employed to remove non-brain tissues from the brain image. Afterwards, the Fuzzy K-Means (FKM) and Fuzzy C-Means (FCM) segmentation algorithms were employed to segment brain MRI images acquired from four different MRI databases into their White Matter (WM), Gray Matter (GM) and Cerebrospinal Fluid (CSF) constituents. Evaluation metrics, namely cluster validity functions (partition coefficient and partition entropy); area error metrics (false positive (FP), true positive (TP), true negative (TN) and false negative (FN)); and similarity index, sensitivity and specificity, were used to evaluate the performance of both techniques. A comparative analysis of the experimental results revealed that, in most instances, the FKM segmentation technique is preferable to the FCM segmentation technique for the brain MRI segmentation task.


INTRODUCTION
For accurate elicitation of information from medical images, image segmentation is a crucial task that cannot be overlooked. It is an essential routine because the eventual outcome of the analysis depends on the success of the image segmentation stage. Such analysis involves measuring and visualizing the brain's anatomical structures, analyzing brain changes, delineating pathological regions, surgical planning and image-guided interventions [1]. Generally, image segmentation entails partitioning an image of interest into non-overlapping constituent regions whose elements share similar, semantically meaningful attributes. With focus on brain Magnetic Resonance Imaging (MRI) and iris images, segmenting an MRI image entails classifying the image into its constituent components: the white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF), while segmenting an iris image entails classifying it into iris setosa, iris versicolour and iris virginica. An efficient image segmentation technique must be able to uniquely and correctly classify these datasets into their constituent components to an acceptable degree.
Generally, image segmentation techniques can be classified into layer-based and block-based segmentation methods [2]. While the layer-based approach defines image shape masks and explains their appearance and depth, the block-based approach focuses on extracting features such as colour, texture and spatial information. The block-based approach also provides information about the pixels, regions, edges or boundaries of the extracted features. This research employs the fuzzy variants of the C-means and K-means algorithms, usually denoted as the Fuzzy C-Means (FCM) and Fuzzy K-Means (FKM) algorithms, as a block-based approach to segmenting brain MRI and iris images. The fuzzy nature of these algorithms lies in the fact that when an image is broken down into segments, each segment can belong to more than one cluster, with a membership function indicating the strength of association between the segment and each cluster. A fuzzy segmentation technique assigns membership levels to the segments and then uses these levels to assign data elements to one or more clusters. The advantages of the fuzzy segmentation approach include yielding regions that are more homogeneous than those of other methods, reducing spurious blobs, removing noisy spots, and reduced sensitivity to noise compared to other techniques [3]. However, the fuzzy approach requires prior knowledge of the number of clusters present or possible in a particular dataset, which may be difficult to obtain a priori for a new dataset. In this research work, emphasis will be laid on the Fuzzy C-Means (FCM) and Fuzzy K-Means (FKM) clustering algorithms for brain MRI segmentation.
ISSN 2277-3061, Volume 16 Number 7, International Journal of Computers and Technology, December 2017, http://cirworld.com/ DOI: 10.24297/ijct.v16i7.6170

RELATED WORKS
FCM has been widely used for image segmentation in several fields such as feature extraction, pattern and image recognition, and fuzzy identification. It is widely accepted because it preserves more information from the original image than other segmentation methods [4]. The FCM segmentation technique was adopted in [5] for the identification and segmentation of lung nodules in Computed Tomography (CT) images. Their results showed that FCM was able to effectively separate the parenchyma from the mediastinum and the thoracic wall from the lung nodules. A study to determine the most suitable cluster validity index needed to achieve optimal clustering was carried out in [3]. The study applied 18 leading validity indices in turn to synthetic images corrupted with noise of varying levels and to simulated volumetric MRI datasets. The results showed that the various indices have different outcomes at different noise levels, while some indices can help in determining the correct number of clusters in a dataset. As no algorithm is entirely flawless, several improvements to FCM have been proposed by researchers, leading to the emergence of various versions of FCM. Some of these variations reported in [6] include Bias Corrected FCM (BCFCM), among others. A comprehensive performance analysis of these variations was carried out and their performance was validated on real MRI images as well as synthetic images. Partition coefficient (Vpc), partition entropy (Vpe), time complexity and segmentation accuracy were used as the performance evaluation metrics. A high value of Vpc implies better performance while a low value of Vpe implies better clustering. SFCM, TEFCM and MDFCM yielded high Vpc and low Vpe values, which denotes better performance and clustering. When considering feature structure, RFCMK yielded an appreciable performance while WIPFCM, MDFCM and KWFLICM also produced better performance.
Fuzzy C-Means was modified for segmenting T1-T2 weighted brain MRI images in [4]. The modified algorithm, called Modified Robust Fuzzy C-Means with weight Bias Estimation (MRFCM-wBE), was employed to address the intensity inhomogeneity and noise that affect computer-assisted segmentation algorithms when differentiating the borders between tissues in medical images. Fuzzy C-Means was modified by reducing the number of iterations using the dist-max initialization algorithm, after which the algorithm was executed iteratively. The running time for obtaining clusters, the number of iterations needed to complete the clustering of the datasets, and the silhouette width for clustering accuracy were the yardsticks used to measure the performance of MRFCM-wBE. Checkerboard and lung cancer datasets were used as test data. The performance of the algorithm was compared with other segmentation algorithms such as the Gaussian Kernel based Fuzzy C-Means algorithm (GKFCM), BCFCM, the Improved Fuzzy Segmentation (IFS) algorithm and SFCM. The algorithm showed an appreciable improvement in running time over the other algorithms. In [7], microcalcification clusters in digitized mammograms were detected using the FCM and Possibilistic FCM (PFCM) segmentation techniques. The approach involved detecting and separating microcalcifications, which are small deposits of calcium in breast tissue. The results showed that the algorithms could detect and separate microcalcifications from normal tissues, though PFCM performed better, with its results depending solely on the choice of the most appropriate threshold value. FCM has also been used in [8] for the detection of leukemia from blood images. This work employs a morphological contour segmentation approach to identify the edges of nuclei in white blood cells and then separates them from the microscopic blood image. The approach yielded a better result in terms of accuracy and time consumption when compared to a hematologist's normal visual classification. In addition to leukemia detection, the FCM segmentation technique can be used in jaw tumor detection, as carried out in [9] using panoramic X-ray images as test data. The approach involved grouping and clustering the pixels of the jaw X-ray image into a background region and a lesion region. Compared with other segmentation techniques, an appreciable result was obtained in terms of specificity, similarity index, area error metrics and sensitivity.
Fuzzy K-Means (FKM), a variant of the k-means algorithm, has also been widely used for image segmentation. In the FKM segmentation technique, the data of interest is partitioned into k groups of disjoint clusters. To begin with, the k centroid points are calculated, after which the cluster with the nearest distance to each centroid point is determined. A study to determine whether FKM actually divides data into k groups and whether the chosen k clusters are accurate was carried out in [10]. Their work also looks into the colour space in which KCM segments data better. Experiments were carried out on L*a*b* colour space and RGB colour space images using 1, 2 and k-cluster colour images (k>3). Silhouette width, which is a function of a cluster's tightness and separation, as well as ground truth images were used to evaluate the adopted approach. The results showed that KCM segments images considerably better in the L*a*b* colour space than in the RGB colour space.
The initial cluster centres in FKM algorithms are generated randomly; this leads to non-repeatable clustering results that may be hard to interpret. As a result, a kernel density based clustering technique, in which the data is mapped to a high-dimensional space for classification, was proposed in [11]. This involved selecting an initial seed value with maximum variance from the data matrix, then employing a Gaussian kernel to estimate the initial density. Afterwards, subsequent seed values are selected that have densities equivalent to those of the initially selected points. In this way, k points with similar density with respect to each other are obtained. The proposed technique was able to avoid the dead centre and trapped centre limitations of clustering-based segmentation approaches.
The performance of three fuzzy algorithms, namely FCM, Gustafson-Kessel (GK) and FKM, was analyzed based on their clustering output criteria in [12]. Liver disorders and wine datasets were used to test the performance of these algorithms. With FKM and FCM using the Euclidean distance measure and GK using a fuzzy covariance matrix in its distance metric, the experimental results showed that GK produced a similar result to FCM but FKM outperformed both FCM and GK. Thus, the efficiency of FKM was reported to be better than that of the FCM and GK algorithms. Few comparative studies between FKM and FCM have been carried out so far; hence, this research work attempts to further test the FKM and FCM algorithms on brain MRI images with a view to establishing which algorithm performs better.

Materials and Methods
The proposed method consists of pre-processing and the actual segmentation stage. The result of the segmentation stage to a large extent is dependent on the success of the preprocessing stage; hence, efforts were put in place to ensure that the pre-processing stage was meticulously carried out.

Pre-processing stage
The preprocessing stage involves three steps which are: noise reduction, bias field correction and brain extraction.

Noise Reduction
To eliminate the effect of noise on the image and improve local features through local smoothing, the Smallest Univalue Segment Assimilating Nucleus (SUSAN) algorithm illustrated in [13] was used. The SUSAN algorithm is well suited to edge detection, corner detection and structure-preserving image noise reduction. It works by placing a predetermined window (mask) at each point of an image and comparing the brightness of each pixel within the mask with that of the center point, or nucleus, using equation (1):
$$c(\vec{r}, \vec{r}_0) = \begin{cases} 1 & \text{if } |I(\vec{r}) - I(\vec{r}_0)| \le t \\ 0 & \text{if } |I(\vec{r}) - I(\vec{r}_0)| > t \end{cases} \tag{1}$$
where $\vec{r}_0$ is the position of the nucleus in the image and $\vec{r}$ is the position of any other point within the mask. $I(\vec{r})$ is the brightness of any pixel, $t$ is the brightness difference threshold and $c$ is the output of the comparison. This comparison was done for each pixel within the mask, and the outputs were summed using equation (2):
$$n(\vec{r}_0) = \sum_{\vec{r}} c(\vec{r}, \vec{r}_0) \tag{2}$$
where $n$ is the number of pixels in the USAN. The comparison gives the USAN's area, which is ultimately minimized. The brightness difference threshold $t$ determines the minimum contrast of features that will be detected as well as the maximum amount of noise that will be ignored. The next step of the SUSAN algorithm is to compare $n$ with a fixed geometric threshold $g$. There would be no need for the geometric threshold if noise were absent. Nonetheless, to ensure that noise is eliminated or reduced to the barest minimum, $g$ is set to $3n_{max}/4$, where $n_{max}$ is the maximum value of $n$. The initial edge response is calculated using equation (3):
$$R(\vec{r}_0) = \begin{cases} g - n(\vec{r}_0) & \text{if } n(\vec{r}_0) < g \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
Depending on the type of edge point expected (typical straight edge or standard edge points), the direction of the edge associated with an image point must be calculated. For a typical straight edge, equation (3) can be used to compute the required edge; otherwise the direction of the edge must be calculated by first computing the center of gravity and then finding the longest axis of symmetry. The center of gravity is computed using equation (4), while the edge direction is obtained from the sums in equations (5), (6) and (7):
$$\bar{\vec{r}}(\vec{r}_0) = \frac{\sum_{\vec{r}} \vec{r}\, c(\vec{r}, \vec{r}_0)}{\sum_{\vec{r}} c(\vec{r}, \vec{r}_0)} \tag{4}$$
$$\overline{(x - x_0)^2}(\vec{r}_0) = \sum_{\vec{r}} (x - x_0)^2\, c(\vec{r}, \vec{r}_0) \tag{5}$$
$$\overline{(y - y_0)^2}(\vec{r}_0) = \sum_{\vec{r}} (y - y_0)^2\, c(\vec{r}, \vec{r}_0) \tag{6}$$
$$\overline{(x - x_0)(y - y_0)}(\vec{r}_0) = \sum_{\vec{r}} (x - x_0)(y - y_0)\, c(\vec{r}, \vec{r}_0) \tag{7}$$
where $(x, y)$ and $(x_0, y_0)$ are the coordinates of $\vec{r}$ and $\vec{r}_0$ respectively. If the computed USAN area is smaller than the mask diameter, the edge belongs to the typical edge category; however, if the USAN area is larger, it belongs to the standard edge category [1].
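The USAN area and initial edge response described above can be illustrated with a minimal sketch. It assumes a binary comparison (a pixel counts towards the USAN when its brightness differs from the nucleus by at most t), a roughly circular mask of a given radius, and an image stored as a nested list; the function names `usan_area`, `mask_size` and `edge_response` are hypothetical and are not taken from [13].

```python
def usan_area(image, row, col, radius, t):
    """Count mask pixels whose brightness is within t of the nucleus at (row, col)."""
    nucleus = image[row][col]
    n = 0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr * dr + dc * dc > radius * radius:
                continue  # outside the (roughly circular) mask
            r, c = row + dr, col + dc
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                # Pixels outside the image are simply skipped.
                if abs(image[r][c] - nucleus) <= t:
                    n += 1
    return n


def mask_size(radius):
    """Number of pixels in the mask, i.e. the largest possible USAN area n_max."""
    return sum(
        1
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    )


def edge_response(n, n_max):
    """Initial edge response with the geometric threshold g = 3 * n_max / 4."""
    g = 3 * n_max / 4
    return g - n if n < g else 0.0


# Toy image with a vertical step edge between columns 2 and 3.
step_image = [[0, 0, 0, 100, 100, 100, 100] for _ in range(7)]
n_edge = usan_area(step_image, 3, 3, radius=2, t=10)   # nucleus next to the edge
n_flat = usan_area(step_image, 3, 4, radius=2, t=10)   # nucleus one pixel further in
```

In this toy image the nucleus next to the step has a smaller USAN area than the one further inside the bright region, so only the former produces a non-zero response.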

Bias Field Correction
Bias field arises as a result of spatial inhomogeneity of the magnetic field, variations in the sensitivity of the reception coil, and interaction between the magnetic field and the human body [22]. When MRI is acquired at 0.5T, the effect of the bias field is minimal and can be neglected, but when MRI is acquired with MR scanners with a magnetic field above 1.5T, the effect of the bias field is strong and can affect MRI analysis [4]; in this instance the effect needs to be corrected. To correct it, the modified fuzzy c-means approach reported in [21] was adopted. This entails the following steps:
(a). Choosing initial cluster prototypes and setting the bias field to a very small positive value.
(b). Updating the partition matrix using equation (8):
$$u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{D_{ik} + \frac{\alpha}{N_R}\gamma_i}{D_{jk} + \frac{\alpha}{N_R}\gamma_j} \right)^{1/(m-1)}} \tag{8}$$
where $\gamma_i = \sum_{r \in N_k} \|y_r - \beta_r - v_i\|^2$ and the distance expression is $D_{ik} = \|y_k - \beta_k - v_i\|^2$. $y_k$ is the observed intensity at the $k$th pixel, $\beta_k$ is the bias field at the $k$th pixel, $v_i$ is the $i$th cluster prototype, $N_k$ is the set of neighbours that exist in a window around the true intensity $x_k$, $N_R$ is the cardinality of this set of neighbours and the effect of these neighbours is controlled by the parameter $\alpha$. $c$ is the number of clusters and $m$ is the fuzzifier.
(c). Obtaining new prototypes of the clusters in the form of weighted averages of the patterns using:
$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \left( (y_k - \beta_k) + \frac{\alpha}{N_R} \sum_{r \in N_k} (y_r - \beta_r) \right)}{(1 + \alpha) \sum_{k=1}^{N} u_{ik}^{m}} \tag{9}$$
where $N$ is the number of pixels.
(d). Computing the bias term using:
$$\beta_k = y_k - \frac{\sum_{i=1}^{c} u_{ik}^{m} v_i}{\sum_{i=1}^{c} u_{ik}^{m}} \tag{10}$$
(e). Repeating steps (b)-(d) until the termination criterion in (11) is satisfied:
$$\|V_{new} - V_{old}\| < \epsilon \tag{11}$$
where $V$ is a vector of cluster centers and $\epsilon$ is a small number that can be chosen during the initialization process.
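Steps (d) and (e) can be sketched for scalar intensities as follows. The sketch assumes the commonly used form of the bias update, in which the bias at a pixel is the observed intensity minus the membership-weighted average of the cluster prototypes, and a Euclidean norm in the termination test; the function names, the argument layout (`u[i][k]` is the membership of pixel k in cluster i) and the numbers are illustrative assumptions rather than the implementation of [21].

```python
def update_bias(y, u, v, m=2.0):
    """Bias estimate per pixel: observed intensity minus weighted mean prototype."""
    beta = []
    for k, y_k in enumerate(y):
        num = sum((u[i][k] ** m) * v[i] for i in range(len(v)))
        den = sum(u[i][k] ** m for i in range(len(v)))
        beta.append(y_k - num / den)
    return beta


def has_converged(v_new, v_old, eps):
    """Termination test: Euclidean norm of the change in cluster centers below eps."""
    change = sum((a - b) ** 2 for a, b in zip(v_new, v_old)) ** 0.5
    return change < eps


# Two pixels, two clusters, crisp memberships chosen only to keep the arithmetic simple.
observed = [12.0, 48.0]
prototypes = [10.0, 50.0]
membership = [[1.0, 0.0],
              [0.0, 1.0]]
bias = update_bias(observed, membership, prototypes)
```

With crisp memberships the bias at each pixel reduces to the difference between the observed intensity and the prototype of the pixel's own cluster, which makes the update easy to check by hand.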

Removal of Non-brain Tissue
Since emphasis is on brain tissue segmentation, removal of non-brain and extra-cerebral tissues such as the skull, neck, fat, bones, skin and eyeballs is necessary. These tissues can distort the segmentation process and affect the accuracy of the segmentation result. There are several brain tissue extraction algorithms, such as statistical parametric mapping, brain surface extractor, Minneapolis consensus strip, threshold morphologic brain extraction and the Brain Extraction Tool (BET). This research work employs BET, a widely known and publicly available brain extraction tool, to remove non-brain tissues from the brain image. The "generate binary brain mask image" option of BET was used to effect the brain tissue extraction.
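Once a binary brain mask is available, applying it amounts to keeping the pixels where the mask is set and zeroing the rest. The following minimal sketch assumes the image slice and the mask are same-sized nested lists; it does not invoke BET itself, and the function name `apply_brain_mask` and the toy values are hypothetical.

```python
def apply_brain_mask(image, mask):
    """Keep pixels where the mask is non-zero and set all other pixels to 0."""
    return [
        [pixel if keep else 0 for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy 2x2 slice: only the diagonal pixels are marked as brain tissue.
slice_2d = [[5, 7],
            [9, 11]]
brain_mask = [[1, 0],
              [0, 1]]
brain_only = apply_brain_mask(slice_2d, brain_mask)
```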

Segmentation Stage
The segmentation task was carried out using FCM and FKM clustering algorithms.

FUZZY C-MEANS Technique
With FCM, a pattern or data sample can belong to more than one cluster, while the membership value assigned to each pattern is used to indicate the similarity. In this research, FCM clustering begins by supplying a set of data and a number of clusters to the Matlab function fcm. The function in turn returns the best cluster centers and the membership value for each pattern. FCM clustering is an iterative process; hence, the cluster centers were repeatedly moved to the right location within the dataset by minimizing the objective function. The iteration terminates when the improvement in the objective function between two consecutive iterations is less than the specified minimum amount of improvement [16]. The FCM algorithm as illustrated in [3,4,17] expects a dataset of interest $x$ with $n$ data samples, such that $x = \{x_1, x_2, \dots, x_n\}$, to be partitioned into $c$ clusters. The FCM clustering task begins by minimizing the objective function $J$, which is defined as:
$$J = \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{m} \|x_j - v_i\|^2 \tag{12}$$
where $u_{ij}$ is the membership value of the dataset for each class $i$ and pixel $j$, $v_i$ is the cluster center, $c$ is the number of clusters and $m$ is the fuzzifier. Since FCM employs an iterative process, this is achieved by updating the membership value and cluster center repeatedly using equations (13) and (14):
$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\|x_j - v_i\|}{\|x_j - v_k\|} \right)^{2/(m-1)}} \tag{13}$$
$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}} \tag{14}$$
where $\|\cdot\|$ is a norm metric. The membership value must satisfy the constraint:
$$\sum_{i=1}^{c} u_{ij} = 1, \quad u_{ij} \in [0, 1] \tag{15}$$
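The alternating update of memberships and cluster centers can be sketched in a few lines for one-dimensional intensities. This is a minimal pure-Python illustration under stated assumptions (standard FCM update formulas, absolute difference as the distance, a fuzzifier of 2, and termination when the objective improves by less than a minimum amount); it is not the Matlab fcm function, and the names `fcm_1d`, `memberships` and `objective` are hypothetical.

```python
def memberships(data, centers, m):
    """Membership of every sample in every cluster (rows: clusters, columns: samples)."""
    power = 2.0 / (m - 1.0)
    u = [[0.0] * len(data) for _ in centers]
    for j, x in enumerate(data):
        dist = [abs(x - v) for v in centers]
        zero = [i for i, d in enumerate(dist) if d == 0.0]
        if zero:
            # Sample coincides with a center: share full membership among those centers.
            for i in zero:
                u[i][j] = 1.0 / len(zero)
            continue
        for i in range(len(centers)):
            u[i][j] = 1.0 / sum((dist[i] / dist[k]) ** power for k in range(len(centers)))
    return u


def objective(data, centers, u, m):
    """Sum of squared distances to the centers, weighted by membership to the power m."""
    return sum(
        (u[i][j] ** m) * (x - v) ** 2
        for i, v in enumerate(centers)
        for j, x in enumerate(data)
    )


def fcm_1d(data, init_centers, m=2.0, min_improvement=1e-5, max_iter=100):
    """Alternate membership and center updates until the objective stops improving."""
    centers = list(init_centers)
    previous = None
    for _ in range(max_iter):
        u = memberships(data, centers, m)
        centers = [
            sum((u[i][j] ** m) * x for j, x in enumerate(data))
            / sum(u[i][j] ** m for j in range(len(data)))
            for i in range(len(centers))
        ]
        current = objective(data, centers, u, m)
        if previous is not None and abs(previous - current) < min_improvement:
            break
        previous = current
    return centers, memberships(data, centers, m)


# Two well-separated groups of toy intensities and two rough initial centers.
samples = [1.0, 2.0, 3.0, 101.0, 102.0, 103.0]
centers, u = fcm_1d(samples, [0.0, 50.0])
```

Each column of the returned membership matrix sums to one, so a sample near a center receives a membership close to one for that cluster and close to zero for the other.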

FUZZY K-MEANS Technique
Integrating fuzzy logic into the K-Means clustering algorithm yields FKM [16]. FKM as illustrated in [13,15] implies that the dataset of interest will be partitioned into k clusters such that each cluster is compact and far from other clusters. This can be achieved by minimizing the distance between the cluster centers and the patterns that belong to each cluster. To establish the fuzziness of the pattern and the cluster center, the degree of belongingness was represented by a characteristic function $u_{ij}$. This function must fulfill the constraint:
$$u_{ij} \in \{0, 1\}, \quad \sum_{i=1}^{k} u_{ij} = 1 \tag{19}$$
The objective function $J$ must be minimized with respect to $u_{ij}$ and $v_i$ using equation (20), such that
$$J = \sum_{j=1}^{N} \sum_{i=1}^{k} u_{ij}\, d_{ij}^{2} \tag{20}$$
where the number of patterns is denoted by $N$, the measured Euclidean distance between $x_j$ and $v_i$ is denoted by $d_{ij}$, while the number of clusters is denoted by $k$. Typically, the patterns are not expected to overlap, but if they do, each pattern belongs to more than one cluster, i.e. $u_{ij} \in [0, 1]$. In this instance, $u_{ij}$ should be referred to as a membership function rather than a characteristic function. As a result, equation (20) was modified as
$$J = \sum_{j=1}^{N} \sum_{i=1}^{k} u_{ij}^{m}\, d_{ij}^{2} \tag{21}$$
where $m$ remains the fuzzifier parameter, which controls the degree of fuzziness. Minimizing the objective function $J$ may produce a trivial solution; hence, the constraints illustrated in equations (22) and (23) are imposed.
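The difference between a characteristic function and a membership function can be made concrete with a small sketch. It assumes one-dimensional patterns, squared Euclidean distance, and a matrix `w[i][j]` giving the degree to which pattern j belongs to cluster i; the names `crisp_assignment` and `objective` and all numbers are hypothetical.

```python
def crisp_assignment(data, centers):
    """Characteristic function: 1 for the nearest center of each pattern, 0 elsewhere."""
    w = [[0] * len(data) for _ in centers]
    for j, x in enumerate(data):
        nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
        w[nearest][j] = 1
    return w


def objective(data, centers, w, m=1.0):
    """Sum over clusters and patterns of w[i][j]**m times the squared distance."""
    return sum(
        (w[i][j] ** m) * (x - v) ** 2
        for i, v in enumerate(centers)
        for j, x in enumerate(data)
    )


patterns = [1.0, 2.0, 9.0, 10.0]
centers = [1.5, 9.5]

# Non-overlapping case: each pattern belongs to exactly one cluster.
crisp = crisp_assignment(patterns, centers)
j_crisp = objective(patterns, centers, crisp)

# Overlapping case: memberships lie between 0 and 1 and each column still sums to 1.
fuzzy = [[0.9, 0.9, 0.1, 0.1],
         [0.1, 0.1, 0.9, 0.9]]
j_fuzzy = objective(patterns, centers, fuzzy, m=2.0)
```

With the fuzzifier set to 1 and 0/1 entries the function gives the crisp k-means objective, while a fuzzifier above 1 with fractional entries gives the fuzzy objective.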

Dataset used
The segmentation approach was validated using four different brain MRI datasets. A real brain MRI dataset was downloaded from [18]. The dataset was acquired using a physical phantom on a 3T MRI scanner with a turbo spin-echo sequence. The data was acquired with a 220 mm × 292 mm field of view on a 256 × 340 Cartesian sampling grid and was collected with an array of receiver coils, leading to 16 channels of information. The data was provided in Matlab format and is referred to as "D1". A synthetic brain MRI dataset was also downloaded from [23]. Each image in the downloaded dataset (referred to as D2) has a slice thickness of 1 mm, a resolution of 1 mm × 1 mm × 1 mm, 20% intensity non-uniformity (RF), noise levels ranging from 0% to 9%, and a matrix of 181 × 217 × 181 voxels. All these parameters are specified before the final dataset is downloaded. Manually labeled brain MRI images (D3) were also downloaded from the Internet Brain Segmentation Repository (IBSR) [24]. The labelled images were not created by an algorithm but by neuroanatomical experts. The images in the dataset have different levels of difficulty, which can be used to evaluate the segmentation task under different conditions. Finally, a T1 and T2 brain MRI dataset (D4) from [25] with 1 × 1 × 1 mm³ resolution and a voxel size of 2 × 2 × 2 mm³ was also used.

Evaluation Metrics
The performance of both the FCM and FKM algorithms was measured using various performance evaluation metrics such as cluster validity functions, area error metrics, similarity index, sensitivity and specificity.

Cluster Validity Function
The cluster validity function uses the fuzzy partition to determine the efficiency of the segmentation task. The fuzzy partition is a function of the partition coefficient (Vpc) and the partition entropy (Vpe).
Partition Coefficients: The partition coefficient ranges from 0 to 1, with a high value indicating better performance.

Fig 1: Fuzzy Partition Coefficient using D1
As shown in Figure 1, both FKM and FCM could correctly classify the phantom dataset into three clusters, but at different partition coefficients. FKM attained the highest partition coefficient, 0.85, at the third cluster center, compared with 0.65 for FCM, which shows that FKM segments the dataset better. The partition coefficients of the remaining three datasets are highlighted in Table 1. The results show that both techniques have their peak partition coefficients at the 3rd cluster point across all the datasets (D2, D3 and D4) used. This shows that both algorithms could at least identify the three clusters correctly.

Partition Entropy
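For an accurate clustering, a high partition coefficient and a low partition entropy are expected. The partition entropy is computed using equation (27):
$$V_{pe} = -\frac{1}{n} \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij} \log u_{ij} \tag{27}$$
where $u_{ij}$ is the membership of sample $j$ in cluster $i$, $n$ is the number of samples and $c$ is the number of clusters.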
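Both validity measures can be computed directly from a membership matrix. The sketch below assumes the standard definitions (partition coefficient as the mean of squared memberships, partition entropy as the negated mean of membership times its natural logarithm) and a matrix `u[i][j]` holding the membership of sample j in cluster i; the function names and the toy matrices are illustrative.

```python
import math


def partition_coefficient(u):
    """Mean over samples of the sum of squared memberships; higher is better."""
    n = len(u[0])
    return sum(x * x for row in u for x in row) / n


def partition_entropy(u):
    """Negated mean over samples of sum(u * ln u); lower is better."""
    n = len(u[0])
    return -sum(x * math.log(x) for row in u for x in row if x > 0) / n


# A perfectly crisp partition and a maximally fuzzy one, for two clusters and two samples.
crisp_u = [[1.0, 0.0],
           [0.0, 1.0]]
uniform_u = [[0.5, 0.5],
             [0.5, 0.5]]
```

A crisp partition gives a coefficient of 1 and an entropy of 0, while the uniform partition over two clusters gives a coefficient of 0.5 and an entropy of ln 2, which matches the expectation of high coefficient and low entropy for a good clustering.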

Area Error Metrics
Using the manually labeled brain MRI images (D3) as the reference, the area ratios of the False Positive (FP), True Positive (TP), True Negative (TN) and False Negative (FN), as used in [9], were employed to determine how correctly FKM and FCM could classify dataset D3 into its constituent WM, GM and CSF components. The area metrics are defined in terms of the manually measured cluster area and the cluster areas computed using FCM and FKM. The results, as illustrated in Table 2, showed that FKM performed better than FCM on dataset D3. The TP of FKM, at 94%, compared to that of FCM at 90%, showed that FKM could identify more of the area of each cluster than FCM. Also, the FP of FKM being lower than that of FCM showed that FKM was able to recognize clusters more accurately than FCM. The P-value of each metric is less than 0.005, which shows that the result is statistically highly significant.
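As a rough illustration of how the four quantities arise, the sketch below counts them pixel by pixel from a reference mask and a computed mask for a single tissue class. It assumes flattened binary masks of equal length and reports raw pixel counts rather than the area ratios of [9]; the function name `area_counts` and the masks are made up for the example.

```python
def area_counts(reference, predicted):
    """Count TP, FP, TN and FN pixels for one tissue class from two binary masks."""
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1      # labelled as the tissue in both masks
        elif pred:
            fp += 1      # computed as the tissue but not in the reference
        elif ref:
            fn += 1      # in the reference but missed by the algorithm
        else:
            tn += 1      # absent from both masks
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Toy flattened masks for a single class (1 = pixel belongs to the class).
manual_mask = [1, 1, 0, 0, 1]
computed_mask = [1, 0, 0, 1, 1]
counts = area_counts(manual_mask, computed_mask)
```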

Similarity Index, Sensitivity and Specificity
To further determine the effectiveness of the algorithms, the Similarity Index (SI), sensitivity and specificity of each technique were computed using the following equations:
$$SI = \frac{2\,TP}{2\,TP + FP + FN}$$
$$Sensitivity = \frac{TP}{TP + FN}$$
$$Specificity = \frac{TN}{TN + FP}$$
SI can also be used to determine whether or not there is an overlap among the clusters. Specificity can be used to determine whether an algorithm could classify an image correctly into the expected clusters (WM, GM and CSF in this case), while sensitivity calculates the number of images that were correctly classified. The mean of the computed metrics on D3 is described in Table 3. The result showed that the FKM segmentation technique is more accurate and reliable in WM, GM and CSF segmentation when compared to FCM, though the results of FCM segmentation are not totally bad.
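The three measures can be sketched from the four area counts. The code assumes the widely used definitions (SI as twice TP over twice TP plus FP plus FN, sensitivity as TP over TP plus FN, specificity as TN over TN plus FP); the function names and the counts are made up for illustration and are not results from this study.

```python
def similarity_index(tp, fp, fn):
    """Overlap between the computed and reference regions; 1.0 means perfect overlap."""
    return 2 * tp / (2 * tp + fp + fn)


def sensitivity(tp, fn):
    """Fraction of the reference region that the algorithm recovered."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of the non-region pixels that the algorithm correctly left out."""
    return tn / (tn + fp)


# Hypothetical counts for one tissue class.
tp, fp, tn, fn = 90, 10, 90, 10
si = similarity_index(tp, fp, fn)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```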

CONCLUSION
A comparative analysis aimed at ascertaining the superior segmentation technique between the FKM and FCM algorithms has been carried out in this paper. Using the partition coefficient as a metric, both techniques have their peak partition coefficients at the 3rd cluster point across all the datasets. This showed that the techniques could correctly classify the datasets into three clusters, though FKM records a higher value than FCM across all the datasets used, which showed that FKM classifies at a faster rate when compared to FCM. Also, when the partition entropy for each technique was computed, FKM attained a lower partition entropy than FCM, which further established the superiority of the FKM segmentation technique. With the manually labeled brain MRI images as the reference, the area error metrics (FP, TN, TP, FN) of the techniques were computed to further determine which technique classifies the datasets better. FKM achieved good results, which further establishes that it is a better technique when compared to FCM. The P-value of each metric being less than 0.005 showed that the computed result is statistically highly significant. Finally, the similarity index, sensitivity and specificity of each technique were computed using dataset D3. The mean of the computed metrics revealed that the FKM segmentation technique is more accurate and reliable in WM, GM and CSF segmentation when compared to FCM. Based on the results of the various evaluation metrics used, it can be concluded that FKM is a good segmentation technique and that it is preferable to the FCM segmentation technique.