Performance Evaluation of a Pattern Storage Network of Associative Memory with a Sub-optimal GA for Handwritten Hindi 'SWARS'

In this paper we evaluate the Hopfield neural network as an associative memory for the recall of memorized patterns of handwritten 'SWARS' (vowels) of the Hindi language using a sub-optimal genetic algorithm. In this process the genetic algorithm is employed in a sub-optimal form to recall the memorized patterns corresponding to presented noisy prototype input patterns. The sub-optimal form of the GA refers to a non-random initial population or solution: rather than a random start, the GA begins its exploration from the sum of correlation weight matrices for the input patterns of the training set. The objective of this study is to determine the optimal weight matrix for correct recall corresponding to an approximate prototype input pattern of a Hindi 'SWAR'. The performance of the neural network is evaluated in terms of the rate of success in recalling memorized Hindi 'SWARS' for presented approximate prototype input patterns with the GA in two aspects: the first reflects the random nature of the GA, and the second exhibits its sub-optimal nature of exploration. The simulated results demonstrate the better performance of the network in recalling the memorized Hindi 'SWARS' when the genetic algorithm evolves the population of weights from the sub-optimal weight matrix.


INTRODUCTION
Pattern storage and recalling, i.e. pattern association, is one of the prominent methods for the pattern recognition task that one would like to realize using an artificial neural network (ANN) with an associative memory feature. Pattern storage is generally accomplished by a feedback network consisting of processing units with non-linear bipolar output functions. The Hopfield neural network is a simple feedback neural network (NN) which is able to store patterns locally in the form of connection strengths between the processing units. This network can also perform pattern completion on the presentation of partial information or a prototype input pattern. The stable states of the network represent the memorized or stored patterns. Since the Hopfield neural network with associative memory [1][2] was introduced, various modifications [3][4][5][6][7][8][9][10] have been developed for the purpose of storing and retrieving memory patterns as fixed-point attractors. The dynamics of these networks has been studied extensively because of their potential applications [21][22][23][24]; the dynamics determines the retrieval quality of the associative memory for the already stored patterns. The pattern information is encoded in an unsupervised manner as the sum of correlation weight matrices in the connection strengths between the processing units of the feedback neural network, using the locally available information of the pre- and post-synaptic units; the result is considered as the final or parent weight matrix. Hopfield [1] proposed a fully connected neural network model of associative memory in which we can store information by distributing it among neurons, and recall it from the dynamically relaxed neuron states. If we map these states to certain desired memory vectors, then the time evolution of the dynamics leads to a stable state. These stable states of the network represent the stored patterns.
Hopfield used the Hebbian learning rule [25] to prescribe the weight matrix for establishing these stable states. A major drawback of this type of neural network is that the memory attractors are constantly accompanied by a huge number of spurious memory attractors, so that the network dynamics is very likely to be trapped in them [6], thereby preventing the retrieval of the memory attractors. Hopfield-type networks are also likely to be trapped in non-optimal local minima close to the starting point, which is not desired. The presence of false minima increases the probability of error in recall of the stored pattern. The problem of false minima can be reduced by adopting an evolutionary algorithm to accomplish the search for global minima, and many researchers have applied evolutionary techniques (simulated annealing and genetic algorithms) to minimize the problem of false minima [10]. Imada and Araki [10][11][12][13][14][15][16][17][18][19] have applied evolutionary computation to Hopfield neural networks in various ways. A rigorous treatment of the capacity of the Hopfield associative memory can be found in [20]. The genetic algorithm has been identified as one of the prominent search techniques for exploring the global minima in the Hopfield neural network [24]. Developed by Holland [26], a genetic algorithm is a biologically inspired search technique. In simple terms, the technique involves generating a random initial population of individuals, each of which represents a potential solution to a problem. Each member of this population is evaluated by a fitness function against some known criteria, and the members selected by the fitness function are then used for reproduction, through the operations of the genetic algorithm, to generate the new population of potential solutions.
The process of evaluation, selection, and recombination is iterated until the population converges to an acceptable optimal solution. Genetic algorithms (GAs) require only fitness information, not gradient information or other internal knowledge of a problem, as is required in the case of neural networks. Genetic algorithms have traditionally been used in optimization but, with a few enhancements, can perform classification, prediction, and pattern association as well [27][28][29]. The GA has been used very effectively for function optimization, and it can perform an efficient search for approximate global minima. It has been observed that pattern recall in Hopfield-type neural networks can be performed efficiently with a GA [13]. The GA in this case is expected to yield alternative globally optimal values of the weight matrix corresponding to all stored patterns.
The conventional Hopfield neural network suffers from the problems of non-convergence and local minima as the complexity of the network increases. However, the GA is particularly good at performing efficient searching in large and complex spaces to find the global optima and to achieve convergence. Considerable research into the Hopfield network has shown that the model may be trapped in four types of spurious attractors. The four well-identified classes of these attractors are mixture states [4], spin glass states [58], complement states, and alien attractors [59]. As the complexity of the search space increases, the GA presents an increasingly attractive alternative for pattern storage and recalling in Hopfield-type neural networks of associative memory. Neural network applications address problems in pattern classification, prediction, financial analysis, control, and optimization [30]. In most current applications, neural networks are best used as aids to human decision makers instead of substitutes for them. Genetic algorithms have helped market researchers perform market segmentation analysis [31]. Genetic algorithms and neural networks can be integrated into a single application to take advantage of the best features of both technologies [32]. Much work has been done on the evolution of neural networks with GAs [33][34][35][36][37]. There has been considerable research applying evolutionary techniques to layered neural networks; however, their applications to fully connected neural networks remain few so far. The first attempts to conjugate evolutionary algorithms with Hopfield neural networks dealt with the training of connection weights [45], the design of the neural network architecture [46,47], or both [48][49][50][51]. Evolution has been introduced into neural networks at three levels: architectures, connection weights, and learning rules [38].
The evolution of connection weights proceeds at the lowest level, on the fastest time scale, in an environment determined by the architecture, a learning rule, and the learning tasks. The evolution of connection weights introduces an adaptive and global approach to training, especially in the reinforcement learning and recurrent network learning paradigms. Training of neural networks using evolutionary algorithms started in the beginning of the 1990s [16,52]; reviews can be found in [24, 27-29, 35]. Cardenas et al. [53] presented the architecture optimization of neural networks using parallel genetic algorithms for pattern recognition based on human faces, and compared the results of the training stage for sequential and parallel implementations. Genetic evolution has also been used to process data structures for image classification [54].
In this paper we explore the GA for efficient recall of memorized patterns, as an auto-associative memory, from the Hopfield neural network corresponding to a presented input pattern vector of handwritten Hindi 'SWARS' characters. The recall in this associative memory network is performed with the aim of reducing the effect of false minima by using an evolutionary search method, namely the genetic algorithm. In this approach the GA starts from the sub-optimal weight matrix as the initial population of solutions. The sub-optimal weight matrix reflects the pattern information of the training set encoded by the unsupervised Hebbian learning rule, i.e. the sum of correlation weight matrices, where each correlation term corresponds to an individual pattern. Hence, the GA starts from the sum of correlation matrices for the training set, which we call the parent weight matrix, and it determines the optimal weight matrix for the presented noisy prototype input patterns of the handwritten 'SWARS' of the Hindi language. The performance of the pattern storage network is evaluated as the rate of success in recalling the correct memorized pattern corresponding to the presented prototype input pattern of a handwritten 'SWAR' with a GA which starts from a sub-optimal solution, i.e. the sub-optimal GA. The simulated results indicate the better performance of the sub-optimal genetic algorithm (SGA) as compared with the Hebbian rule in the success rate for recalling the correct memorized 'SWARS' characters. In the following sections we present the description of the patterns used for training, the Hopfield neural network used for storing the patterns, the GA used for recalling the stored patterns, the experimental details, a discussion of the results obtained through simulation, and the conclusions of our investigation.

SAMPLE PATTERN REPRESENTATION
The patterns used for the simulations are shown in Figure 1. Each pattern consists of a 5 × 5 pixel matrix representing a handwritten character of a Hindi 'SWAR'. White and black pixels are assigned the values -1 and +1, respectively.
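The encoding described above can be sketched as follows (a minimal NumPy illustration; the pixel grid below is an arbitrary placeholder glyph, not an actual 'SWAR' sample from the training set):

```python
import numpy as np

def encode_pattern(pixel_rows):
    """Flatten a 5 x 5 pixel grid into a 25-element bipolar vector,
    mapping white (0) to -1 and black (1) to +1."""
    grid = np.asarray(pixel_rows)
    assert grid.shape == (5, 5)
    return np.where(grid == 1, 1, -1).ravel()

# Hypothetical glyph, for illustration only.
sample = [[0, 1, 1, 1, 0],
          [0, 0, 0, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 1, 0],
          [0, 1, 1, 1, 0]]
vector = encode_pattern(sample)
```

Each such 25-element bipolar vector serves as one training pattern for the 25-unit network described in the next section.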

THE HOPFIELD NEURAL NETWORK
The Hopfield network memorizes a set of L bipolar patterns {a^l : l = 1, 2, ..., L} in N processing units through the connection strengths

    w_ij = (1/N) * sum_{l=1}^{L} a_i^l * a_j^l,   with w_ii = 0 and i != j.

The network is initialized with the presented prototype input pattern as its initial state. The activation value and output of every unit in the Hopfield model can then be represented as

    x_i(t+1) = sum_{j=1}^{N} w_ij * s_j(t)   and   s_i(t+1) = sgn(x_i(t+1)).

Associative memory involves the retrieval of a memorized pattern in response to the presentation of some prototype input pattern as an arbitrary initial state of the network. These initial states have a certain degree of similarity with the memorized patterns and will be attracted towards them through the evolution of the network dynamics.
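The weight prescription and one update step can be sketched in NumPy as follows (the 1/N scaling and the synchronous update are assumptions consistent with the standard Hopfield formulation, not the authors' exact code):

```python
import numpy as np

def hebbian_weights(patterns, n):
    """Sum of correlation matrices w_ij = (1/N) * sum_l a_i^l a_j^l,
    with the self-connections w_ii zeroed out."""
    W = np.zeros((n, n))
    for a in patterns:
        W += np.outer(a, a)
    np.fill_diagonal(W, 0)
    return W / n

def update(W, s):
    """One synchronous step: x_i = sum_j w_ij s_j, then s_i = sgn(x_i)."""
    return np.where(W @ s >= 0, 1, -1)
```

The matrix returned by `hebbian_weights` is the parent weight matrix referred to below; iterating `update` from a prototype state realizes the attractor dynamics.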
Hence, in order to memorize 13 handwritten Hindi 'SWARS' in a 25-unit bipolar Hopfield neural network, there should be one stable state corresponding to each stored pattern. Thus each memorized pattern should be a fixed-point attractor of the network and must satisfy the fixed-point condition

    a_i^l = sgn( sum_j w_ij * a_j^l )   for all i and for each pattern l.    (10)

This activation dynamics equation must be satisfied to accomplish the storage of each pattern, and similarly for the L-th pattern. Thus, after learning all the patterns, the final parent weight matrix can be represented as the sum of the correlation weight matrices of the individual patterns,

    W = sum_{l=1}^{L} W^l,   where W^l is the correlation matrix of pattern a^l.

This square matrix is considered as the parent weight matrix because it represents the partial or sub-optimal solution for pattern recall corresponding to the presented prototype input pattern vector. The next generation of the population of solutions will be evolved from this sub-optimal weight matrix. Thus the GA will start from this sub-optimal initial solution rather than a random one, and so we call it a sub-optimal GA. By using the replica method, the maximum storage limit has been estimated as about 0.15N [3]. Wasserman [39] showed that the maximum number of memories m that can be stored in a network of n neurons and recalled exactly is less than n/(2c ln n), where c is a positive constant greater than one. It has been identified that the storage capacity strongly depends on the learning scheme, and researchers have proposed different learning schemes instead of the Hebbian rule to increase the storage capacity of the Hopfield neural network [55,56]. Gardner showed that the ultimate capacity is p = 2N when expressed as a function of the size of the basin of attraction [57]. Imada and Araki [10] applied the genetic algorithm to the Hopfield model as an associative memory and obtained a capacity of 33% of the number of neurons. It has also been observed that false minima may occur during the recall of memorized patterns.
However, the GA has been identified as being particularly good at performing efficient searching in large and complex spaces to determine the global optima or minimize the possibility of false minima. Kumar and Singh [24] found that the GA is a very suitable choice among evolutionary algorithms to reduce the effect of false minima in the Hopfield neural network during the recall of memorized patterns.

PATTERN RECALLING WITH GENETIC ALGORITHM
In the GA implementation we consider the cycle of generating a new population with better individuals and restarting the search until an optimal solution is found. In this process two fitness evaluation functions are used. The first fitness function determines the best matrices of the weight population, namely those which settle the network in a stable state corresponding to the correct memorized pattern when the presented input pattern is one of the memorized patterns. The second fitness evaluation function selects the weight matrices from the population that settle the network in a stable state corresponding to the correct memorized pattern for a presented noisy or approximate prototype input pattern. Thus the stable states of the network are used for the evaluation of the weight populations: in the recall process, the stable state of the network corresponding to the stored pattern should be retained by the selected weight matrix on presentation of the prototype input pattern.
The first fitness evaluation function determines the suitable weight matrices which are responsible for correct recall of the memorized patterns for the error-free or exact input patterns of the training set. That is, at this first level of filtering, only those weight matrices are selected which provide correct pattern auto-association for the samples of the training set; no approximate prototype input pattern is presented at this level. Only the weight matrices that exhibit pattern association during the training of the network are carried into the next generation of the population. The second fitness evaluation function is used after the crossover operator, which is applied only to chromosomes that have passed the first fitness evaluation. The second fitness evaluation function determines the population of weight matrices that are responsible for recalling the correct memorized pattern for a presented approximate or noisy prototype input pattern. Thus, the second fitness evaluation function selects the final population of chromosomes required for obtaining the optimal solution.
The parameters used for genetic algorithm are summarized in Table 1.

The Mutation Operator
The mutation operator plays a secondary role in the genetic algorithm. Mutation modifies the value of each gene of a solution with some probability p_m.
Nevertheless, the choice of p_m is critical to GA performance and has been studied by DeJong [40]. The typical value of the mutation probability is in the range 0.005-0.05. The idea of adapting mutation and crossover to improve the performance of GAs has been explored by researchers using different criteria [41][42]. Whitley et al.'s [43] idea for adaptation is based on the Hamming distance between solutions, while Srinivas et al. base the adaptation on the fitness values of the solutions.

The steps for mutation operator:
Step 1: Generate the mutation positions in the chromosome randomly.
Step 2: Modify the parent chromosome at the positions generated in step 1, using equation (19).
Step 3: Repeat step 1 and 2 until a number N of mutated chromosome populations have been created.
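The three steps can be sketched as follows. Since equation (19) is not reproduced here, a small additive perturbation is assumed in its place; the perturbation magnitude and the RNG seed are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def mutate(parent_W, p_m=0.01):
    """Step 1: choose mutation positions in the chromosome randomly
    with probability p_m. Step 2: modify the parent chromosome at
    those positions (here, by a small additive perturbation standing
    in for equation (19))."""
    W = parent_W.copy()
    mask = rng.random(W.shape) < p_m
    W[mask] += rng.normal(0.0, 0.05, int(mask.sum()))
    np.fill_diagonal(W, 0)   # keep self-connections zero
    return W

def mutated_population(parent_W, N, p_m=0.01):
    """Step 3: repeat steps 1 and 2 until N mutated chromosomes
    have been created."""
    return [mutate(parent_W, p_m) for _ in range(N)]
```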

Elitism
Elitism is used when creating each generation so that the genetic operators do not lose good solutions. This involves copying the Hebbian-encoded weight matrix, i.e. the sub-optimal solution W_L, unchanged into the new population, which gives the total number M (i.e. N + 1) of chromosomes.
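A sketch of this elitist step (the function name is illustrative; `W_L` denotes the Hebbian-encoded parent matrix):

```python
import numpy as np

def elitist_population(W_L, mutated):
    """Append the unchanged sub-optimal solution W_L to the N mutated
    chromosomes, giving M = N + 1 chromosomes in the new population."""
    return mutated + [W_L.copy()]
```

Keeping the parent unchanged guarantees that the population never performs worse than the plain Hebbian solution.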

The First Fitness Evaluation
The first fitness evaluation function assigns fitness 1 to a weight matrix if every pattern of the training set is recalled as a fixed point of the network within t_0 iterations, and fitness 0 otherwise (21). Here t_0 has been set to N (the number of processing units).
We must note that fitness 1 implies that all the initial patterns have been stored as fixed points. Thus, we consider only those generated weight matrices that have the fitness evaluation value 1. Hence, all the selected weight matrices will be considered as the new generation of the population. This new population will be used for generating the next better population of weight matrices with the crossover operator.
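A minimal sketch of this first filter, simplified to check the fixed-point condition in a single synchronous step rather than iterating for t_0 steps (the exact fitness expression (21) is not reproduced, only its selection criterion):

```python
import numpy as np

def fitness_one(W, training_patterns):
    """Return 1 only if every training pattern is a fixed point of the
    network under weight matrix W, else 0."""
    for a in training_patterns:
        s = np.where(W @ a >= 0, 1, -1)   # one synchronous update
        if not np.array_equal(s, a):
            return 0
    return 1
```

Only matrices with `fitness_one(...) == 1` survive into the population passed to the crossover operator.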

The Crossover Operator
The power of GAs arises from crossover. Crossover is a structured and randomized exchange of genetic material between solutions, with the expectation that 'good' solutions can generate 'better' ones. Thus, crossover is an operation which may be used to combine multiple parents and produce offspring. This operator is responsible for the recombination of the selected population of weight matrices. It forms a new solution by taking some parameters from one parent and exchanging them with those of another at the very same point. Here we apply recombination with uniform crossover. In this process, the network randomly selects (with uniform distribution) a string of non-zero chromosome values from one selected weight matrix and exchanges it with a string of non-zero chromosome values from another selected weight matrix. Thus, a large population of weight matrices is generated. On applying this crossover operator, with the constraint that the numbers of genes or alleles selected for exchange must be equal for the two weight matrices, the modification of the selected weight matrices is made as follows:

Steps for Crossover Operation
Step 1: Initialize the crossover population size limit with the value N × N.
Step 2: Extract two chromosomes from among the selected population.
Step 3: Obtain a sub-matrix of order t × u randomly in each extracted chromosome for exchanging the values.
Step 4: Exchange the sub matrices between the chromosomes.
Step 5: Include both chromosomes in the crossover population.
Step 6: Check whether the population size is equal to N × N.
If not, go to step 2 again.
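Steps 2-4 can be sketched as follows (the sub-matrix dimensions t and u and the RNG seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def submatrix_crossover(W1, W2, t=3, u=3):
    """Pick a random t x u sub-matrix location and exchange the
    corresponding blocks between two parent weight matrices,
    returning the two offspring."""
    n = W1.shape[0]
    r = rng.integers(0, n - t + 1)   # random row offset
    c = rng.integers(0, n - u + 1)   # random column offset
    A, B = W1.copy(), W2.copy()
    block = A[r:r+t, c:c+u].copy()
    A[r:r+t, c:c+u] = B[r:r+t, c:c+u]
    B[r:r+t, c:c+u] = block
    return A, B
```

Exchanging whole blocks at the same location satisfies the stated constraint that both parents contribute an equal number of genes.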

The Second Fitness Evaluation
In the process of recalling the stored pattern corresponding to an approximate prototype input pattern of a handwritten Hindi 'SWAR', the best suitable weight matrix is selected from the generated population of K weight matrices. Let the state of the network corresponding to the already stored pattern be the desired stable state; each weight matrix is evaluated by presenting the prototype input pattern and checking whether the network settles into this stable state. This process continues for all the weight matrices in the population of K. It is possible to obtain more than one optimal weight matrix for the recall of a prototype input pattern.
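A sketch of this second filtering stage, assuming the synchronous update dynamics and the 20-iteration success budget used in the experiments:

```python
import numpy as np

def fitness_two(W, noisy_probe, target, max_iter=20):
    """Return 1 if the network, started from the noisy prototype,
    settles into the stored target pattern within the iteration
    budget, else 0."""
    s = noisy_probe.copy()
    for _ in range(max_iter):
        s_next = np.where(W @ s >= 0, 1, -1)
        if np.array_equal(s_next, s):
            break
        s = s_next
    return 1 if np.array_equal(s, target) else 0
```

Every matrix in the K population scoring 1 is an acceptable optimal weight matrix, which is why more than one may be obtained.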

EXPERIMENTAL DETAILS
For the simulation we consider a Hopfield neural network of 25 neurons. This neural network is trained for pattern storage with the Hebbian learning rule on the given training set of handwritten 'SWARS' of the Hindi language. In the training set, every pattern consists of 25 bipolar features. The pattern information of the training set is encoded in the connection strengths of the interconnections between the neurons of the network. In this way the parent weight matrix is constructed, representing the encoded information of the memorized patterns. We then start the recall process for any presented noisy prototype of an already memorized pattern. Recall is accomplished with two different methods, namely the Hebbian rule and the sub-optimal GA, and experiments are performed with both methods to study their performance behaviour.
Recall through the Hebbian rule starts by applying the presented prototype input pattern to the Hopfield network; we then continue to iterate the network until it reaches stability. The stable state of the network reflects one of the memorized patterns. In this method false recall, i.e. a false minimum, is likely to occur in most cases when the presented input pattern is noisy. The recall process through the sub-optimal GA can be described in algorithmic form as:
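The overall sub-optimal GA recall loop can be sketched as follows. This is a compact illustration assembled from the operators described above, under stated assumptions: crossover is omitted for brevity, the mutation perturbation and RNG seed are illustrative, and the function names are not the authors':

```python
import numpy as np

rng = np.random.default_rng(42)

def sgn(x):
    return np.where(x >= 0, 1, -1)

def settle(W, s, max_iter=20):
    """Iterate synchronously until a stable state is reached or the
    20-iteration success budget is exhausted."""
    for _ in range(max_iter):
        s_next = sgn(W @ s)
        if np.array_equal(s_next, s):
            break
        s = s_next
    return s

def suboptimal_ga_recall(patterns, probe, p_m=0.01, generations=10):
    """Start from the Hebbian parent matrix (the sub-optimal solution),
    mutate with elitism, filter by the first fitness (all training
    patterns stored as fixed points), then apply the second fitness on
    the noisy probe. Returns (weight matrix, recalled pattern)."""
    n = probe.size
    W0 = sum(np.outer(a, a) for a in patterns) / n   # parent weight matrix
    np.fill_diagonal(W0, 0)
    population = [W0]
    for _ in range(generations):
        # Mutation of each member, plus an elitist copy of the parent.
        offspring = []
        for W in population:
            mask = rng.random(W.shape) < p_m
            Wm = W + mask * rng.normal(0.0, 0.05, W.shape)
            np.fill_diagonal(Wm, 0)
            offspring.append(Wm)
        offspring.append(W0.copy())
        # First fitness: keep matrices storing every pattern as a fixed point.
        selected = [W for W in offspring
                    if all(np.array_equal(sgn(W @ a), a) for a in patterns)]
        if not selected:
            selected = [W0.copy()]
        # Second fitness: does the network recall a stored pattern from the probe?
        for W in selected:
            s = settle(W, probe)
            for a in patterns:
                if np.array_equal(s, a):
                    return W, a
        population = selected
    return None, None
```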

RESULTS AND DISCUSSION
The results presented in this section demonstrate that, within the simulation framework presented above, a large and significant difference exists between the performance of the genetic algorithm and that of the Hebbian rule for recalling memorized handwritten 'SWARS' of the Hindi language stored in the Hopfield neural network.
The results indicate the difference in the rate of success between the sub-optimal GA and the Hebbian rule as a measure of the associative memory performance of the two techniques. In the experiments, the prototype input patterns used for recall contain errors generated randomly with respect to the memorized patterns. Tables 2-7 show the results of recalling patterns containing zero-, one-, two-, three-, four-, and five-bit errors with respect to the stored patterns in the Hopfield neural network. The performance of the sub-optimal GA for zero-bit and one-bit errors in the prototype input patterns is found to be much better than that of the conventional technique. The results clearly indicate that the Hebbian rule works well for a noiseless pattern in most cases, but its performance degrades substantially: the recall success falls to a maximum of 1.30% in the case of one-bit error, 0.20% for two-bit errors, 0.01% for three-bit errors, and 0.00% for four- and five-bit errors. On the other hand, the GA recalls the pattern successfully even when high noise is present in the input test pattern, i.e. four-bit and five-bit errors. It is observed that in most cases the sub-optimal GA outperforms the conventional Hebbian method. It is also observed that varying the mutation probability in the sub-optimal GA does not have any substantial impact on the performance of the algorithm. It has been claimed [3] that the capacity of the deterministic Hopfield model with the Hebbian rule is about 0.15N for noisy prototype input patterns, where N is the number of nodes in the network. If such a network is overloaded with a number of patterns exceeding its capacity, its performance rapidly deteriorates toward zero.
Here we have stored the 13 handwritten 'SWARS' of the Hindi language in a network of 25 nodes, and the performance of the GA shows that even on inducing a 5-bit error in the presented prototype input pattern the network is able to recall the stored patterns. This implies that the network capacity has increased up to 0.45N. Thus a large number of attractors exist and are successfully explored during the recall process. It is quite obvious that the GA has searched for the suitable optimal weight matrices which are responsible for generating a sufficiently large number of attractors. Hence, the weight matrix encoded with the Hebbian rule is not the optimal weight matrix for finding the global minima of the problem, owing to the limited capacity of the Hopfield model; the capacity has been increased with the GA by exploring optimal weight matrices for the encoded patterns.
The simulation program, developed in MATLAB 7 to test the Hebbian rule, the sub-optimal GA, and the random GA for the recall of handwritten 'SWARS' of the Hindi language, stores the patterns in the Hopfield neural network of 25 neurons. Note that for the sub-optimal GA, a recall is counted as a success only if it is completed within 20 iterations.

CONCLUSION
The simulation results (Tables 2-7) indicate that the sub-optimal genetic algorithm has a higher success rate than the Hebbian rule for the stated problem. It has been found that the sub-optimal GA can give more than one convergent weight matrix for any prototype input pattern, in comparison to the Hebbian rule, provided the prototype input pattern is correctly recognized. This gives the GA more chances to explore better solutions than the Hebbian rule. Further, since the sub-optimal GA starts from the sub-optimal weight matrix, it has more chances to explore a larger number of convergent weight matrices than the random GA. The proposed method uses two fitness evaluation functions, which has two basic advantages: 1. The randomness of the GA is minimized, because the population is filtered twice; hence fewer individuals are generated and the generated population is better fitted to the solution. 2. As the population size is minimized, the searching time is also reduced. Thus the GA implementation is improved because it is less random and consumes less time in searching for the optimal solution. The direct application of the GA to pattern association has been explored in this research, with the aim of introducing an alternative approach to solving the pattern association problem. The results from the experiments conducted on the algorithm are quite encouraging. Nevertheless, more work needs to be performed, especially on tests with noisy input patterns. The same concept can also be used for pattern recognition of handwritten 'VYANJANS' (consonants) of the Hindi language, overlapped alphabets, and cursive scripts. It is also proposed to undertake the following problems in a future research programme.
1. We would like to train the Hopfield neural network with the genetic algorithm and then perform the pattern recall with the genetic algorithm. 2. We would like to apply the same approach to the quantum Hopfield neural network. 3. We would like to introduce other types of sigmoid function, chromosome, fitness function, and crossover to change the weight values through evolution.