Estimation of the Optimal Regularization Parameters in Optimal Control Problems with Time Delay

In this paper we use the L-curve method, the Morozov discrepancy principle and an iterative variant of it to estimate the regularization parameter in the regularization of time-delayed optimal control computation. Zeroth-, first- and second-order differential operators are considered. Two test examples show that the L-curve method and the two discrepancy principles give close estimates of the regularization parameters.


INTRODUCTION
The control parameterization technique (CPT) belongs to the class of direct transcription methods (see [6,15]) for solving optimal control problems (OCPs) [10][11][12]. It relies on partitioning the time interval over which the OCP is to be solved, using two types of partition. First, the time interval is partitioned by switching points, which are the times at which the parameters of the control function (constant, linear, etc.) switch their values. Second, each switching interval is partitioned by a number of quadrature points, at which the state variables are evaluated. Like all direct transcription methods, the CPT transcribes an OCP into a nonlinear programming problem (NLP) [2,3,4,7,8,9]. The resulting NLP can be solved using any NLP solver, such as SQP [15], Matlab's optimization toolbox [14], FSQP, etc.
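The two-level partition described above can be illustrated with a minimal Python sketch. It assumes a piecewise-constant control and uniform grids; the function names `build_grids` and `piecewise_constant_control` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def build_grids(t0, tf, n_switch, n_quad):
    """Return the switching points and the finer quadrature grid.

    n_switch: number of switching intervals on [t0, tf]
    n_quad:   number of quadrature intervals per switching interval
    """
    switching = np.linspace(t0, tf, n_switch + 1)
    quadrature = np.linspace(t0, tf, n_switch * n_quad + 1)
    return switching, quadrature

def piecewise_constant_control(params, switching):
    """Build u(t) equal to params[k] on [switching[k], switching[k+1])."""
    params = np.asarray(params, dtype=float)

    def u(t):
        # Index of the switching interval that contains t.
        k = np.searchsorted(switching, t, side="right") - 1
        k = np.clip(k, 0, len(params) - 1)  # t == tf falls in the last interval
        return params[k]

    return u

# Hypothetical setting: 4 switching intervals, 2 quadrature intervals each.
switching, quadrature = build_grids(0.0, 1.0, n_switch=4, n_quad=2)
u = piecewise_constant_control([1.0, -0.5, 0.25, 0.0], switching)
```

In such a setting the entries of `params` would be the decision variables of the NLP, while the state variables would be evaluated on `quadrature`.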
Matlab's optimization toolbox uses quasi-Newton methods (BFGS, DFP) to solve nonlinear programming problems [14]. Starting from any initial guess for the optimal solution, the optimization process takes the identity matrix as an initial approximation of the Hessian matrix. The Hessian approximation is then updated at each iteration so that it remains positive definite, which guarantees that the search direction is always a descent direction. The optimization process terminates when the directional derivative is less than a given tolerance FunTol and the maximum constraint violation is less than another given tolerance ConTol. The default value of both tolerances FunTol and ConTol is $10^{-7}$ [14].
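The quasi-Newton idea can be sketched in Python for the unconstrained case. This is a minimal illustration and not the toolbox's implementation: the function `bfgs_minimize`, the backtracking line search and the test quadratic are assumptions made here, and the constraint tolerance ConTol plays no role because the sketch has no constraints.

```python
import numpy as np

def bfgs_minimize(f, grad, x0, fun_tol=1e-7, max_iter=200):
    """Unconstrained BFGS with the identity as initial Hessian approximation."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                 # initial Hessian approximation
    g = grad(x)
    for _ in range(max_iter):
        d = -np.linalg.solve(B, g)     # descent direction since B is positive definite
        slope = g @ d                  # directional derivative along d
        if abs(slope) < fun_tol:       # stopping rule on the directional derivative
            break
        step = 1.0
        while f(x + step * d) > f(x) + 1e-4 * step * slope:
            step *= 0.5                # Armijo backtracking line search
        s = step * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        if s @ y > 1e-12:              # curvature condition keeps B positive definite
            Bs = B @ s
            B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (s @ y)
        x, g = x_new, g_new
    return x, B

# Hypothetical convex quadratic: f(x) = 0.5 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_opt, B_final = bfgs_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                               lambda x: A @ x - b,
                               x0=[0.0, 0.0])
```

Skipping the update when the curvature condition fails is one common way to preserve positive definiteness; other safeguards exist.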
If the condition numbers associated with the computation of the optimal control solution are evaluated, they show that OCPs are ill-conditioned [19]. In [1] we computed the condition numbers resulting from the discretization of the optimal control problem using the classical fourth-order Runge-Kutta method. Those computations showed that the condition numbers of the active constraints, the projected Hessian and the whole Lagrangian system are more likely to increase with the number of switching intervals per delay interval than with the number of quadrature intervals per switching interval.
Regularization methods are used to stabilize the solutions obtained for ill-conditioned problems [19,20]. Stabilizing the solution of an optimal control problem requires moving from the optimal solution (the optimal controller) to a nearby controller with better stability properties (smoothness). Regularization is done by choosing a positive regularization parameter. Choosing very small values of the regularization parameter keeps the solution close to the optimal controller but might contribute little to the stabilization of the solution. On the other hand, choosing large values of the regularization parameter smooths the solution of the OCP but takes it away from the optimal controller.
Staying close to the optimal control and smoothing the resulting control are two competing goals that must be balanced. This requires the computation of an optimal regularization parameter, which is a difficult issue. Benyah and Jennings [19] developed methods based on the L-curve and the discrepancy principle for the computation of the regularization parameters when solving optimal control problems without time delays.
In this paper we extend their work to optimal control problems with time delays, and we propose a new discrepancy principle for the computation of the regularization parameters, based on the method found in [18] for the regularization of the solutions of linear systems. The rest of this paper is organized as follows. In Section 2, we state the time-delayed optimal control problem under consideration and discuss the concept of regularization in optimal control and the methods for computing the regularization parameters. In Section 3, we state the two test examples that are considered in the rest of the paper. In Section 4, we show the results obtained by applying the L-curve method, with discretization by the classical fourth-order Runge-Kutta method. In Sections 5 and 6, we use a Morozov discrepancy principle and an iterative Morozov discrepancy principle to compute the regularization parameter. Section 7 contains the conclusions and remarks.

METHODS OF REGULARIZATION IN TIME-DELAYED OPTIMAL CONTROL PROBLEMS
We consider an optimal control problem with time-delay arguments, in which the initial data are given piecewise continuous functions.
The system is subject to continuous state inequality constraints, to equality constraints, and to terminal conditions.
The aim of regularization in optimal control problems is to make a trade-off between the smoothness of the control and maintaining the minimum value of the objective function. In Tikhonov regularization, a penalty is applied to the objective function for unwanted properties of the control \citep*{Ben2, Ben3}.
The penalized objective function adds to the objective function a penalty on the term $\|Lu\|$, weighted by the regularization parameter $\alpha > 0$, where $L$ is a differential operator. The zeroth-order differential operator $L = I_m$ penalizes the control for deviation from the null function 0. The first-order differential operator $L = D^1$ penalizes the control for deviation from a constant, and the second-order differential operator $L = D^2$ penalizes the control for deviation from a straight line (see Benyah and Jennings [19,20]).
Let $\alpha_1$ and $\alpha_2$ be any two positive real numbers such that $\alpha_2 > \alpha_1$. Benyah and Jennings [19] and Ramlau [18] stated and proved the following properties. If a large value of the regularization parameter $\alpha$ is chosen, then a smooth control $u_\alpha$ that minimizes the term $\|Lu\|$ can be obtained, but with a high value of the objective function.
On the other hand, if a small value of the regularization parameter $\alpha$ is chosen, the smoothness of the control $u_\alpha$ will not improve significantly, but the value of the objective function will stay close to its minimum value.
An optimal value of the parameter $\alpha$ is the value at which the control seems more stable. If we move to a point in the neighbourhood of this optimal value, we will not see large increases or decreases in either the objective function or the term $\|Lu\|$.
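The trade-off described above can be observed on a toy problem. The sketch below does not solve an optimal control problem: it assumes a small linear least-squares problem as a stand-in, with the data misfit playing the role of the objective function and finite-difference matrices playing the role of the zeroth-, first- and second-order operators. All names (`difference_operator`, `tikhonov_solve`) and numerical values are hypothetical.

```python
import numpy as np

def difference_operator(n, order):
    """Discrete analogue of L: identity (0), first (1) or second (2) differences."""
    L = np.eye(n)
    for _ in range(order):
        L = np.diff(L, axis=0)   # each pass differences consecutive rows
    return L

def tikhonov_solve(A, b, L, alpha):
    """Minimise ||A u - b||^2 + alpha * ||L u||^2 via the normal equations."""
    return np.linalg.solve(A.T @ A + alpha * (L.T @ L), A.T @ b)

# Assumed toy data: a smoothing (ill-conditioned) kernel and noisy observations.
n = 20
t = np.linspace(0.0, 1.0, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * 0.1 ** 2))
rng = np.random.default_rng(0)
b = A @ np.sin(2 * np.pi * t) + 1e-2 * rng.standard_normal(n)

alphas = np.logspace(-3, 2, 11)
curves = {}
for order in (0, 1, 2):
    L = difference_operator(n, order)
    controls = [tikhonov_solve(A, b, L, a) for a in alphas]
    fits = np.array([np.sum((A @ u - b) ** 2) for u in controls])  # objective
    pens = np.array([np.linalg.norm(L @ u) for u in controls])     # ||L u||
    curves[order] = (fits, pens)
```

For each operator, `fits` does not decrease and `pens` does not increase as the regularization parameter grows, which is the behaviour described in the text.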

TEST EXAMPLES
In the rest of this paper we consider the following two test examples.

Example 1
and subject to the continuous inequality constraint and subject to the terminal equality constraint

Example 2
We consider the optimal control problem subject to the dynamics described by the delay differential equations:

Solutions of Examples 1 and 2
We ran the computations using 16 switching intervals per delay interval, with two quadrature intervals per switching interval. The solutions of Examples 1 and 2 are shown in Figures 1 and 2 and summarized in Table 1.

THE L-CURVE METHOD
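The L-curve for an optimal control problem is the parametric plot of the points formed by the objective value $J(u_\alpha)$ and the penalty term $\|Lu_\alpha\|$, with the regularization parameter $\alpha$ as the curve parameter. At the optimal value $\alpha^*$ of the regularization parameter, we obtain the best level of smoothness in the control function without completely compromising the optimality of the corresponding control. Therefore, we expect $\alpha^*$ to lie at the corner of the L-curve.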
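One common way to locate the corner numerically is to take the point of maximum curvature of the curve in the log-log scale. The sketch below uses that heuristic as an assumption; it is not necessarily how the corner is identified in this paper. The function `lcurve_corner` is hypothetical, and the data come from a closed-form toy problem (minimizing $\|u - c\|^2 + \alpha\|u\|^2$ with $\|c\| = 1$, for which $u_\alpha = c/(1+\alpha)$).

```python
import numpy as np

def lcurve_corner(alphas, objective_values, penalty_values):
    """Return the alpha of maximum curvature of the log-log L-curve."""
    s = np.log(np.asarray(alphas, dtype=float))            # curve parameter
    x = np.log(np.asarray(objective_values, dtype=float))  # log objective
    y = np.log(np.asarray(penalty_values, dtype=float))    # log penalty
    dx, dy = np.gradient(x, s), np.gradient(y, s)
    ddx, ddy = np.gradient(dx, s), np.gradient(dy, s)
    # Absolute curvature of the parametric curve (x(s), y(s)).
    curvature = np.abs(dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5
    k = int(np.argmax(curvature))
    return alphas[k], curvature

# Toy closed-form data (an assumption, not results from the paper).
alphas = np.logspace(-4, 4, 81)
objective_values = (alphas / (1.0 + alphas)) ** 2   # ||u_alpha - c||^2
penalty_values = 1.0 / (1.0 + alphas)               # ||u_alpha||
alpha_corner, curvature = lcurve_corner(alphas, objective_values, penalty_values)
```

On tabulated results, the same function could be applied to the columns of regularization parameters, objective values and penalty values, provided the grid of parameters is fine enough for finite differences.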

Test Example 1
We applied the control parameterization technique using 16 switching intervals per delay interval, with two quadrature intervals per switching interval. Using zeroth-, first- and second-order differential operators, we obtained the results reported in Tables 2-4.

Table 2. Regularization using the zeroth-order differential operator
Table 3. Regularization using the first-order differential operator
Table 4. Regularization using the second-order differential operator

The tables give the optimal regularization parameters corresponding to the zeroth-, first- and second-order operators, respectively. The curves have the same shape in the linear and the logarithmic scales, but the corner of the L-curve is easier to see in the log scale. Therefore, we plot the data in the log scale (Figure 1).

Test Example 2
We applied the control parameterization technique using 16 switching intervals per delay interval, with two quadrature intervals per switching interval. Using zeroth-, first- and second-order differential operators, we obtained the results reported in Tables 5-7.

Table 5. Regularization using the zeroth-order differential operator
Table 6. Regularization using the first-order differential operator
Table 7. Regularization using the second-order differential operator

THE DISCREPANCY PRINCIPLE
In this section we look for the most plausible acceptable control $u_\alpha$, which solves an optimization problem stated in terms of the value of the objective function for the unregularized control and a given tolerance. Now, we consider a problem in which the initial data are given piecewise continuous functions.
The system is subject to continuous state inequality constraints and to terminal conditions.
By solving the optimal control problems in Examples 1 and 2 using the Morozov discrepancy principle with a tolerance of $10^{-5}$, we computed the optimal regularization parameters and show them in Tables 7 and 8, respectively.
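A discrepancy-type selection rule can be sketched as a one-dimensional search. The sketch assumes that the objective value is nondecreasing in the regularization parameter and that the tolerance is an absolute bound on the increase over the unregularized value; both are assumptions made here for illustration. The function `discrepancy_alpha` is hypothetical, and the toy objective is the closed form $(\alpha/(1+\alpha))^2$ obtained from minimizing $\|u - c\|^2 + \alpha\|u\|^2$ with $\|c\| = 1$.

```python
import math

def discrepancy_alpha(objective_of, j_unreg, eps, lo=1e-12, hi=1e6, iters=200):
    """Largest alpha in [lo, hi] with objective_of(alpha) - j_unreg <= eps.

    Assumes objective_of is nondecreasing in alpha.
    """
    if objective_of(hi) - j_unreg <= eps:
        return hi
    if objective_of(lo) - j_unreg > eps:
        raise ValueError("no admissible alpha in the bracket")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)              # bisection on a logarithmic scale
        if objective_of(mid) - j_unreg <= eps:
            lo = mid                          # mid is still admissible
        else:
            hi = mid                          # mid violates the tolerance
    return lo

def toy_objective(alpha):
    # Objective value of the regularized toy solution (assumed closed form).
    return (alpha / (1.0 + alpha)) ** 2

# The unregularized toy objective is 0; the tolerance 1e-4 is hypothetical.
alpha_star = discrepancy_alpha(toy_objective, j_unreg=0.0, eps=1e-4)
```

In an optimal control setting, each call to `objective_of` would require solving the regularized problem for the given parameter, so the number of bisection steps would be kept much smaller in practice.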

ANOTHER DISCREPANCY PRINCIPLE
In [21], Ramlau proposed an iterative algorithm, arising from linear inverse problems, for the computation of the optimal regularization parameter in linear systems. In this section, we extend the work of Ramlau to the computation of the optimal regularization parameters in optimal control problems with time delays.
The algorithm starts from a regularization parameter 1 0    that are satisfying: The algorithm terminates at a regularization parameter We describe the method by the following algorithm:
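As a rough illustration of an iterative parameter search of this kind, the following Python sketch assumes a geometric decrease rule: the parameter starts at an initial value and is multiplied by a fixed factor between 0 and 1 until the increase of the objective over its unregularized value falls within the tolerance. The rule, the function `iterative_discrepancy`, the toy objective and all numerical values are assumptions made here, and the scheme may differ from the one in [21].

```python
def iterative_discrepancy(objective_of, j_unreg, eps,
                          alpha0=1.0, q=0.5, max_iter=100):
    """Decrease alpha geometrically until the discrepancy test is passed.

    Returns the accepted alpha and the number of reductions performed.
    """
    if not (alpha0 > 0.0 and 0.0 < q < 1.0):
        raise ValueError("need alpha0 > 0 and 0 < q < 1")
    alpha = alpha0
    for k in range(max_iter):
        if objective_of(alpha) - j_unreg <= eps:   # discrepancy test
            return alpha, k
        alpha *= q                                 # alpha_{k+1} = q * alpha_k
    raise RuntimeError("tolerance not reached within max_iter reductions")

def toy_objective(alpha):
    # Closed-form objective of a toy regularized problem (an assumption).
    return (alpha / (1.0 + alpha)) ** 2

alpha_k, n_reductions = iterative_discrepancy(toy_objective, j_unreg=0.0, eps=1e-4)
```

Compared with a bisection search, such a scheme needs no upper and lower bracket, but the accepted parameter is only accurate up to the factor `q`.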

CONCLUSIONS
We discussed methods of regularization in time-delayed optimal control computation. To estimate the regularization parameters for zeroth-, first- and second-order differential operators, we used the L-curve method and two discrepancy principle methods. Two features are common to the three methods: the regularization parameter decreases as the order of the differential operator increases, and the variation increases. Comparing the performance of the L-curve method with that of the two discrepancy principles, we recommend that the L-curve method be used to determine a trust region for the optimal regularization parameter, whereas the discrepancy principle methods can be used to search for the optimal value of the regularization parameter. That is, the discrepancy principle methods give more precise values of the optimal regularization parameter than the L-curve method.