André L. Tits
July 2011
Chapter 1
Motivation and Scope
We give some examples of design problems in engineering that can be formulated as mathematical optimization problems. Although we emphasize here engineering design, optimization is widely used in other fields such as economics or operations research. Such examples can be found, e.g., in [18].
Example 1.1 Design of an operational amplifier (opamp)
Suppose the following features (specifications) are desired
In this course, we deal with parametric optimization. This means, for this example, that we assume the topology of the circuit has already been chosen, the only freedom left being the choice of the values of a number of “design parameters” (resistors, capacitors, various transistor parameters). In the real world, once the parametric optimization has been performed, the designer will possibly decide to modify the topology of his circuit, hoping to achieve better performance. Another parametric optimization is then performed. This loop may be repeated many times. To formulate the opamp design problem as an optimization problem, one has to specify one (possibly several) objective function(s) and various constraints. We decide on the following goal:
minimize the power dissipated
subject to: gain-bandwidth product ≥ M1 (given),
frequency response ≤ M2 at all frequencies.
The last constraint will prevent too high a “peaking” in the frequency response, thereby ensuring a sufficient closed-loop stability margin. We now denote by x the vector of design parameters
and, assuming that all the Ωi's are identical, if we define φ : R^n × Ω → R^k by

φ(x, ω) = (φ^1(x, ω), ..., φ^k(x, ω))^T    (1.8)

we obtain again

min{f(x) | g(x) ≤ 0, φ(x, ω) ≤ 0 ∀ω ∈ Ω}    (1.9)
[This is called a semi-infinite optimization problem: finitely many variables, infinitely many constraints.] Note. If we define
ψi(x) = sup_{ω∈Ω} φi(x, ω)    (1.10)

then (1.9) is equivalent to

min{f(x) | g(x) ≤ 0, ψ(x) ≤ 0}    (1.11)

(more precisely, {x | φ(x, ω) ≤ 0 ∀ω ∈ Ω} = {x | ψ(x) ≤ 0}), and ψ(x) can be absorbed into g(x).
Exercise 1.1 Prove the equivalence between (1.9) and (1.11). (To prove A = B, prove A ⊂ B and B ⊂ A.)
This transformation may not be advisable, for the following reasons:

(i) some potentially useful information (e.g., what the ‘critical’ values of ω are) is lost when replacing (1.9) by (1.11);

(ii) for given x, ψ(x) may not be computable exactly in finite time, since this computation itself involves another optimization problem (see the numerical sketch after this list);

(iii) ψ may not be smooth even when φ is, as shown in the exercise below. Thus (1.11) may not be solvable by classical methods.
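To make point (ii) concrete, here is a minimal numerical sketch (not part of the original notes; Python with NumPy/SciPy assumed, and the particular φ is a hypothetical stand-in) of evaluating ψ at a single x by solving the inner maximization over a compact Ω = [0, 1]:

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Hypothetical smooth constraint function phi(x, omega); Omega = [0, 1].
    def phi(x, omega):
        return np.sin(omega * x) - 0.5

    def psi(x):
        # Inner maximization over the compact set Omega = [0, 1].  A bounded
        # scalar solver finds a local maximizer; a multimodal phi would need
        # gridding or a global method.
        res = minimize_scalar(lambda w: -phi(x, w), bounds=(0.0, 1.0),
                              method="bounded")
        return -res.fun

    print(psi(2.0))  # about 0.5: the max of sin(2*omega) - 0.5 over [0, 1]

Note that each evaluation of ψ here costs a full (inner) optimization, which is exactly the concern raised in point (ii).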
Exercise 1.2 Suppose that φ : R^n × Ω → R is continuous and that Ω is compact, so that the ‘sup’ in (1.10) can be written as a ‘max’.
(a) Show that ψ is continuous.
(b) By exhibiting a counterexample, show that there might not exist a continuous function ω(·) such that, for all x, ψ(x) = φ(x, ω(x)).
Exercise 1.3 Again referring to (1.10), show, by exhibiting counterexamples, that continuity of ψ is no longer guaranteed if either (i) Ω is compact but φ is merely continuous in each variable separately or (ii) φ is jointly continuous but Ω is not compact, even when the “sup” in (1.10) is achieved for all x.
Exercise 1.4 Referring still to (1.10), exhibit an example where φ ∈ C^∞ (all derivatives exist and are continuous), where Ω is compact, but where ψ is not everywhere differentiable.
However, in this course, we will limit ourselves mostly to classical (non-semi-infinite) problems (and will generally assume continuous differentiability), i.e., to problems of the form
min{f(x) : g(x) ≤ 0, h(x) = 0}

where f : R^n → R, g : R^n → R^m, h : R^n → R^ℓ, for some positive integers n, m and ℓ.
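For concreteness, a small hypothetical instance of this form can be solved numerically; the sketch below (not part of the original notes) assumes SciPy's SLSQP solver, whose convention is g(x) ≥ 0, so our g(x) ≤ 0 constraint is negated:

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical instance of min{ f(x) : g(x) <= 0, h(x) = 0 }, n=2, m=l=1.
    f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2
    g = lambda x: x[0] + x[1] - 2.0   # g(x) <= 0
    h = lambda x: x[0] - x[1]         # h(x) = 0

    res = minimize(f, x0=np.zeros(2), method="SLSQP",
                   constraints=[{"type": "ineq", "fun": lambda x: -g(x)},
                                {"type": "eq", "fun": h}])
    print(res.x)  # approximately [1, 1]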
Remark 1.1 To fit the opamp design problem into formulation (1.11) we had to pick one of the design specifications as objective (to be minimized). Intuitively more appealing would be some kind of multiobjective optimization problem.
Example 1.2 Design of a p.i.d. controller (proportional-integral-derivative)
The scalar plant G(s) is to be controlled by a p.i.d. controller (see Figure 1.1). Again, the structure of the controller has already been chosen; only the values of three parameters have to be determined (x = [x1, x2, x3]^T).
[Figure 1.1: standard unity-feedback loop. The reference R(s) and output Y(x, s) form the error E(x, s), which drives the p.i.d. controller x1 + x2/s + x3 s in series with the plant G(s); T(x, s) denotes the closed-loop transfer function.]
Suppose the specifications are as follows
We decide to minimize the ISE (integral of the squared error), while keeping the Nyquist plot of T(x, s) outside some forbidden region (see Figure 1.2) and keeping the rise time, settling time, and overshoot under given values.
The following constraints are also specified.
−10 ≤ x1 ≤ 10,   −10 ≤ x2 ≤ 10,   0.1 ≤ x3 ≤ 10
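As a rough illustration of how the ISE could be evaluated for a given parameter vector x, here is a numerical sketch (not from the original notes; the plant G(s) = 1/(s(s+1)) is hypothetical, and the closed-loop error transfer function below was derived by hand for that plant):

    import numpy as np
    from scipy import signal
    from scipy.integrate import trapezoid

    def ise(x, t_end=20.0):
        # Hypothetical plant G(s) = 1/(s(s+1)); controller C(s) = x1 + x2/s + x3*s.
        # With a unit-step reference, E(s) = (1/s)/(1 + C(s)G(s)) works out to
        # e(t) = impulse response of (s^2 + s) / (s^3 + (1+x3)s^2 + x1 s + x2).
        x1, x2, x3 = x
        num = [1.0, 1.0, 0.0]
        den = [1.0, 1.0 + x3, x1, x2]
        t = np.linspace(0.0, t_end, 4000)
        _, e = signal.impulse(signal.TransferFunction(num, den), T=t)
        return trapezoid(e**2, t)

    print(ise([2.0, 1.0, 1.0]))  # ISE for one feasible parameter vector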
Exercise 1.5 Put the p.i.d. problem in the form (1.6), i.e., specify f, gi, φi, Ωi.
Unlike Example 1.1 and Example 1.2, Example 1.3 is an ‘optimal control’ problem. Whereas discrete-time optimal control problems can be solved by classical optimization techniques, continuous-time problems involve optimization in infinite-dimensional spaces (a complete ‘waveform’ has to be determined).
To conclude this section we now introduce the class of problems that will be studied in this course. Consider the abstract optimization problem
(P)   min{f(x) | x ∈ S}
where S is a subset of a vector space X and where f : X → R is the cost or objective function. S is the feasible set. Any x in S is a feasible point.
Definition 1.1 A point x̂ is called a (strict) global minimizer for (P) if x̂ ∈ S and

f(x̂) ≤ f(x) ∀x ∈ S    (respectively, f(x̂) < f(x) ∀x ∈ S, x ≠ x̂)
Assume now X is equipped with a norm.
Definition 1.2 A point x̂ is called a (strict) local minimizer for (P) if x̂ ∈ S and there exists ε > 0 such that

f(x̂) ≤ f(x) ∀x ∈ S ∩ B(x̂, ε)    (respectively, f(x̂) < f(x) ∀x ∈ S ∩ B(x̂, ε), x ≠ x̂)
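For instance (a simple illustration, not from the original text): take X = R with the usual norm, S = [−2.5, 3], and f(x) = x^3 − 3x. Then x̂ = 1 is a strict local minimizer, with f(1) = −2, but it is not a global minimizer, since the feasible boundary point x = −2.5 yields f(−2.5) = −8.125.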
(i) Finite-dimensional
unconstrained
equality constrained
inequality [and equality] constrained
linear, quadratic programs, convex problems
multiobjective problems
discrete optimal control
(ii) Infinite-dimensional
calculus of variations (no “control” signal) (old: 1800)
optimal control (new: 1950’s)
Note: most types in (i) can be present in (ii) as well.
Essentially, solve the problem. The steps are
Chapter 2
Linear Optimal Control: Some (Reasonably) Simple Cases
References: [24, 1]. Consider the linear control system
ẋ(t) = A(t)x(t) + B(t)u(t),  x(t0) = x0    (2.1)

where x(t) ∈ R^n, u(t) ∈ R^m and A(·) and B(·) are matrix-valued functions. Suppose A(·), B(·) and u(·) are continuous. Then (2.1) has the unique solution

x(t) = Φ(t, t0)x0 + ∫_{t0}^{t} Φ(t, σ)B(σ)u(σ) dσ

where the state transition matrix Φ satisfies the homogeneous differential equation

(∂/∂t) Φ(t, t0) = A(t)Φ(t, t0)

with initial condition Φ(t0, t0) = I. Further, for any t1, t2, the transition matrix Φ(t1, t2) is invertible and

Φ(t1, t2)^{-1} = Φ(t2, t1).
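Numerically, Φ(t, t0) can be approximated by integrating the matrix differential equation above. A minimal sketch (not from the notes; SciPy assumed, the time-varying A(t) hypothetical), which also checks the semigroup property Φ(t2, t0) = Φ(t2, t1)Φ(t1, t0):

    import numpy as np
    from scipy.integrate import solve_ivp

    n = 2
    A = lambda t: np.array([[0.0, 1.0], [-1.0, -0.1 * t]])  # hypothetical A(t)

    def Phi(t, t0):
        # Integrate (d/dt)Phi = A(t)Phi, Phi(t0, t0) = I, as a flattened ODE.
        rhs = lambda s, y: (A(s) @ y.reshape(n, n)).reshape(-1)
        sol = solve_ivp(rhs, (t0, t), np.eye(n).reshape(-1),
                        rtol=1e-10, atol=1e-12)
        return sol.y[:, -1].reshape(n, n)

    # Check the semigroup property Phi(t2, t0) = Phi(t2, t1) Phi(t1, t0):
    print(np.allclose(Phi(2.0, 0.0), Phi(2.0, 1.0) @ Phi(1.0, 0.0), atol=1e-6))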
Throughout these notes, though, we will typically merely assume that the components of the control function u are piecewise continuous. The reason for this is that, as will become clear later on (e.g., see “bang-bang control”), in many important cases, optimal controls are “naturally” discontinuous, i.e., the problem has no solution if minimization is carried out over the set of continuous functions.
Definition 2.1 A function u : R → R^m is piecewise continuous (or p.c.) if it is right-continuous and, given any a, b ∈ R with a < b, it is continuous on [a, b] except for possibly finitely many points of discontinuity.
Note that some authors do not insist on right-continuity. The reason we do is that without such a requirement there typically will not be a unique optimal control: changing the value of an optimal control at, say, a single point t̂ does not affect its optimality. (A first instance of this appears in the next section, in connection with equations (2.9)-(2.10).) Also note that, if u is allowed to be discontinuous, solutions to differential equation (2.1) must be understood as satisfying it “almost everywhere”. Thus, unless otherwise specified, in the sequel, the set of admissible controls will be the set U defined by

U := {u : [t0, t1] → R^m, p.c.}.
Consider the optimal control problem

minimize  J(u) ≡ (1/2) ∫_{t0}^{t1} ( x(t)^T L(t)x(t) + u(t)^T u(t) ) dt + (1/2) x(t1)^T Q x(t1)

subject to  ẋ(t) = A(t)x(t) + B(t)u(t),    (2.2)
            x(t0) = x0,    (2.3)
            u ∈ U,    (2.4)
where x(t) ∈ R^n, u(t) ∈ R^m and A(·), B(·) and L(·) are matrix-valued functions; minimization is with respect to u and x. The initial and final times t0 and t1 are given, as is the initial state x0. The mappings A(·), B(·), and L(·), defined on the domain [t0, t1], are assumed to be continuous. Without loss of generality, L and Q are assumed symmetric. The problem just stated is, in a sense, the simplest continuous-time optimal control problem. Indeed, the cost function is quadratic and the dynamics linear, and there are no constraints (except for the prescribed initial state). While a linear cost function may be even simpler than a quadratic one, in the absence of (implicit or explicit) constraints on the control, such a problem would have no solution (unless the cost function is constant). In fact, the problem is simple enough that it can be solved without much advanced mathematical machinery, by simply “completing the squares”. Doing this of course requires that we add to and subtract from J(u) a quantity that involves x and u. But doing so would likely modify our problem! The following fundamental lemma gives us a key to resolving this conundrum.
Lemma 2.1 (Fundamental Lemma) Let A(·), B(·) be continuous matrix-valued functions and let K(·) = K^T(·) be a matrix-valued function with continuously differentiable entries, so that K̇(t) exists on [t0, t1]. Then, if x(t) and u(t) are related by

ẋ(t) = A(t)x(t) + B(t)u(t),    (2.5)

it holds that

x(t1)^T K(t1)x(t1) − x(t0)^T K(t0)x(t0) = ∫_{t0}^{t1} ( x(t)^T [ K̇(t) + A(t)^T K(t) + K(t)A(t) ] x(t) + 2u(t)^T B(t)^T K(t)x(t) ) dt,

as follows by integrating (d/dt)[x(t)^T K(t)x(t)] over [t0, t1] and substituting (2.5).
Consider the case of scalar, time-independent values a = 0, b = 1, l = −1, q = 0. The corresponding Riccati equation is

k̇ = 1 + k²,  k(T) = 0.

We get atan(k(t)) − atan(k(T)) = t − T, yielding

k(t) = tan(t − T),

with a finite escape time at t = T − π/2. For the time being, we assume that a solution exists on [t0, t1], and we denote its value at time t by K(t) = Π(t, Q, t1).
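The escape is easy to observe numerically; a brief sketch (SciPy assumed, not part of the notes) integrates k̇ = 1 + k² backward from k(T) = 0 with T = 0:

    import numpy as np
    from scipy.integrate import solve_ivp

    # kdot = 1 + k^2, k(0) = 0, integrated backward toward the escape time.
    sol = solve_ivp(lambda t, k: 1.0 + k**2, (0.0, -1.5), [0.0],
                    rtol=1e-10, atol=1e-10)
    print(sol.y[0, -1], np.tan(-1.5))  # both about -14.1; |k| blows up as
                                       # t decreases toward -pi/2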
From (2.7) and (2.8), we get

J̃(u) = (1/2) ∫_{t0}^{t1} ‖B^T(t)Π(t, Q, t1)x(t) + u(t)‖^2 dt.    (2.9)

Since u is free, J(u) attains its minimal value when the first term is set to zero, yielding the optimal control law

u∗(t) = −B^T(t)Π(t, Q, t1)x(t).    (2.10)
Remark 2.1 By “closed-loop” it is meant that the right-hand side of (2.10) does not depend on the initial state x0 nor on the initial time t0, but only on the current state and time. Such formulations are of major practical importance: if, for whatever reason (modeling errors, perturbations), the state at time t is not what it was predicted to be (when the optimal control u∗(·) was computed, at time t0), (2.10) is still optimal over the remaining time interval (assuming no modeling errors or perturbations between times t and t1).
To formally verify that u∗^ is optimal, simply substitute (2.10) into (2.1) and note that the resulting autonomous system has a unique solution x, which is continuous. Plugging this solution into (2.10) yields an “open-loop” expression for u∗, which also shows that it is continuous, hence belongs to U. In view of (2.9), the optimal value is given by
V(x0, t0) := J(u∗) = (1/2) x0^T Π(t0, Q, t1) x0,
where V is known as the value function. Now suppose that, starting from x(t0) at time t0, the state reaches x(τ) at time τ ∈ (t0, t1). The remaining portion of the minimal cost is the minimum, over u ∈ U, subject to ẋ = Ax + Bu with x(τ) fixed, of

Jτ(u) := (1/2) ∫_τ^{t1} ( x(t)^T L(t)x(t) + u(t)^T u(t) ) dt + (1/2) x(t1)^T Q x(t1)    (2.11)

       = (1/2) ∫_τ^{t1} ‖B(t)^T Π(t, Q, t1)x(t) + u(t)‖^2 dt + (1/2) x(τ)^T Π(τ, Q, t1)x(τ),    (2.12)
i.e., the cost-to-go is Jτ(u∗) = (1/2) x(τ)^T Π(τ, Q, t1)x(τ). Hence, the “cost-to-go” from any time t < t1 and state x ∈ R^n is

V(x, t) = Jt(u∗) = (1/2) x^T Π(t, Q, t1) x.    (2.13)
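As a sanity check (a numerical sketch with hypothetical data, not part of the original development), one can integrate the Riccati equation backward from K(t1) = Q, apply the feedback (2.10), and verify that the accrued cost equals V(x0, t0). The Riccati right-hand side below is written in the form consistent with the scalar example above:

    import numpy as np
    from scipy.integrate import solve_ivp

    # Hypothetical time-invariant data (n = 2, m = 1).
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    L = np.eye(2); Q = np.eye(2)
    t0, t1 = 0.0, 3.0
    x0 = np.array([1.0, 0.0])

    # Differential Riccati equation: Kdot = -KA - A^T K + K B B^T K - L, K(t1) = Q.
    def rdot(t, k):
        K = k.reshape(2, 2)
        return (-K @ A - A.T @ K + K @ B @ B.T @ K - L).reshape(-1)

    ric = solve_ivp(rdot, (t1, t0), Q.reshape(-1), dense_output=True, rtol=1e-10)
    Pi = lambda t: ric.sol(t).reshape(2, 2)

    # Closed loop under u*(t) = -B^T Pi(t) x(t); the third state accumulates
    # the running cost (1/2)(x^T L x + u^T u).
    def cl(t, y):
        x = y[:2]
        u = -B.T @ Pi(t) @ x
        return np.concatenate([A @ x + B @ u, [0.5 * (x @ L @ x + u @ u)]])

    sim = solve_ivp(cl, (t0, t1), np.concatenate([x0, [0.0]]), rtol=1e-10)
    xf = sim.y[:2, -1]
    print(sim.y[2, -1] + 0.5 * xf @ Q @ xf)  # accrued cost J(u*)
    print(0.5 * x0 @ Pi(t0) @ x0)            # value V(x0, t0); should match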
Remark 2.2 We have not made any positive definiteness (or semi-definiteness) assumption on L(t) or Q. The key assumption we have made is that the stated Riccati equation has a solution Π(t, Q, t1). Below we investigate conditions (in particular, on L and Q) which ensure that this is the case. At this point, note that, if L(t) ≥ 0 for all t and Q ≥ 0 (in the sense of symmetric matrices), then J(u) ≥ 0 for all u ∈ U, and expression (2.13) of the cost-to-go implies that Π(t, Q, t1) ≥ 0 whenever it exists.
Exercise 2.1 Investigate the case of the more general cost function

J(u) = ∫_{t0}^{t1} ( (1/2) x(t)^T L(t)x(t) + u(t)^T S(t)x(t) + (1/2) u(t)^T R(t)u(t) ) dt,

where R(t) > 0 (positive definite) for all t. Hints: (i) define a new inner product on U; (ii) let v(t) = u(t) + M(t)x(t) with M(t) judiciously chosen.
Returning to the question of existence/uniqueness of the solution to the differential Riccati equation, first note that the right-hand side of (2.8) is Lipschitz continuous over bounded subsets of R^{n×n}, and that this implies that, for any given Q, there exists τ < t1 such that the solution exists, and is unique, on [τ, t1]. Now, for t ∈ [τ, t1], define p(t) = Π(t, Q, t1)x(t), so that the optimal control law (2.10) satisfies u∗(t) = −B^T(t)p(t). Then x and p together satisfy the linear system

[ ẋ(t) ]   [  A(t)   −B(t)B^T(t) ] [ x(t) ]
[ ṗ(t) ] = [ −L(t)   −A^T(t)     ] [ p(t) ]    (2.14)

evolving in R^{2n}.
Exercise 2.2 Verify that x and p satisfy (2.14).
We first note that the Riccati equation (2.8) can be solved via the linear system (2.14).
Theorem 2.1 Let Ψ(·, ·) be the 2n × 2n state transition matrix for system (2.14), and let

[ X(t) ]             [ I ]
[ P(t) ] = Ψ(t, t1)  [ Q ].

Then

Π(t, Q, t1) = P(t)X(t)^{-1}

solves the Riccati equation (2.8) for t ∈ [τ, t1], for any τ < t1 such that X(t)^{-1} exists on [τ, t1].
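A numerical check of this construction (again a sketch with hypothetical constant data): propagate [X; P] = Ψ(t, t1)[I; Q] by integrating (2.14) backward, form P(t)X(t)^{-1}, and compare with direct backward integration of the Riccati equation:

    import numpy as np
    from scipy.integrate import solve_ivp

    # Hypothetical constant data, n = 2.
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    L = np.eye(2); Q = np.eye(2)
    t1, tau = 3.0, 1.0
    M = np.block([[A, -B @ B.T], [-L, -A.T]])   # matrix of system (2.14)

    # Propagate [X; P] = Psi(t, t1)[I; Q] backward from t1 to tau.
    rhs = lambda t, y: (M @ y.reshape(4, 2)).reshape(-1)
    sol = solve_ivp(rhs, (t1, tau), np.vstack([np.eye(2), Q]).reshape(-1),
                    rtol=1e-10)
    XP = sol.y[:, -1].reshape(4, 2)
    X, P = XP[:2], XP[2:]

    # Compare P X^{-1} against direct backward integration of the Riccati
    # equation (same form as in the previous sketch).
    rdot = lambda t, k: (-k.reshape(2, 2) @ A - A.T @ k.reshape(2, 2)
                         + k.reshape(2, 2) @ B @ B.T @ k.reshape(2, 2)
                         - L).reshape(-1)
    ric = solve_ivp(rdot, (t1, tau), Q.reshape(-1), rtol=1e-10)
    print(np.allclose(P @ np.linalg.inv(X),
                      ric.y[:, -1].reshape(2, 2), atol=1e-6))  # expected: True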
where x(t) satisfies (2.2) with initial condition x(τ) = x. Letting x̂(t) be the solution to (2.2) corresponding to u(t) = 0 for all t and x̂(τ) = x, i.e., x̂(t) = Φ(t, τ)x, and using (2.15), we can write

0 ≤ x^T Π(τ, Q, t1)x ≤ ∫_τ^{t1} x̂(t)^T L(t)x̂(t) dt + x̂(t1)^T Q x̂(t1) = x^T F(τ)x    (2.16)

with

F(τ) = ∫_τ^{t1} Φ(t, τ)^T L(t)Φ(t, τ) dt + Φ(t1, τ)^T Q Φ(t1, τ),

a continuous function. Since this holds for all x ∈ R^n and since Π(τ, Q, t1) is symmetric, it follows that, using, e.g., the spectral norm,

‖Π(τ, Q, t1)‖ ≤ ‖F(τ)‖ ∀τ ∈ (t̂, t1].

Since F(·) is continuous on (−∞, t1], hence bounded on [t̂, t1], a compact set, it follows that Π(·, Q, t1) is bounded on (t̂, t1], completing the proof by contradiction.
Exercise 2.3 Prove that, if A = A^T and F = F^T, and 0 ≤ x^T Ax ≤ x^T Fx for all x, then ‖A‖2 ≤ ‖F‖2, where ‖·‖2 denotes the spectral norm. (I.e., ‖A‖2 = max{‖Ax‖2 : ‖x‖2 = 1}.)
Thus, when L(t) is positive semi-definite for all t and Q is positive semi-definite, our problem has a unique optimal control given by (2.10).
Pontryagin’s Principle In connection with cost function J with u^T u generalized to u^T Ru (although we will still assume R = I for the time being), let H : R^n × R^n × R^m × R → R be given by

H(ξ, η, v, θ) = (1/2) v^T Rv + (1/2) ξ^T L(θ)ξ + η^T (A(θ)ξ + B(θ)v),

and let H : R^n × R^n × R → R be defined by

H(ξ, η, θ) = min_{v∈R^m} H(ξ, η, v, θ),

where clearly (when R = I) the minimum is achieved when v = −B^T(θ)η. The function of four arguments is the pre-Hamiltonian¹ (sometimes called control Hamiltonian or pseudo-Hamiltonian) and the function of three arguments the Hamiltonian (or true Hamiltonian). Thus

H(ξ, η, t) = (1/2) ξ^T L(t)ξ + η^T A(t)ξ − (1/2) η^T B(t)B^T(t)η

           = (1/2) [ ξ ]^T [ L(t)    A^T(t)      ] [ ξ ]
                   [ η ]   [ A(t)   −B(t)B^T(t)  ] [ η ].
¹ This terminology is borrowed from P.S. Krishnaprasad.
The gradient of H with respect to the first 2n arguments is given by

∇H(ξ, η, t) = [ L(t)    A^T(t)      ] [ ξ ]
              [ A(t)   −B(t)B^T(t)  ] [ η ].
System (2.14) can be equivalently written in the canonical form

ẋ(t) = ∇pH(x(t), p(t), t) = ∇pH(x(t), p(t), u∗(t), t)
ṗ(t) = −∇xH(x(t), p(t), t) = −∇xH(x(t), p(t), u∗(t), t),

i.e.,

ż(t) = J∇H(x(t), p(t), t) = J∇H(x(t), p(t), u∗(t), t)    (2.17)

where

z(t) = [ x(t) ]    and    J = [  0   I ]
       [ p(t) ]               [ −I   0 ].

The auxiliary conditions can be written

x(t0) = x0,  p(t1) = Qx(t1),

yielding a two-point boundary-value problem (a numerical sketch follows below). Note that, equivalently, p(t1) = ∇φ(x(t1)), where φ is the terminal cost φ(x) = (1/2) x^T Qx. Also, the optimal cost J(u∗) can be equivalently expressed as (1/2) x(t0)^T p(t0). This is one instance of Pontryagin’s Principle; see Chapter 5 for more details. In general, however, H may be nonsmooth (due to the “min” operation), so the expressions involving its gradients may not be valid.
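The two-point boundary-value problem can be handed to a general-purpose BVP solver; a minimal sketch (hypothetical data, SciPy's solve_bvp assumed, not part of the original notes):

    import numpy as np
    from scipy.integrate import solve_bvp

    # Hypothetical data (n = 2, m = 1).
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    L = np.eye(2); Q = np.eye(2)
    t0, t1 = 0.0, 3.0
    x0 = np.array([1.0, 0.0])
    M = np.block([[A, -B @ B.T], [-L, -A.T]])

    # z = (x, p) obeys zdot = M z (system (2.14)) with x(t0) = x0 and
    # p(t1) = Q x(t1) as the two-point boundary conditions.
    fun = lambda t, z: M @ z
    bc = lambda za, zb: np.concatenate([za[:2] - x0, zb[2:] - Q @ zb[:2]])

    tgrid = np.linspace(t0, t1, 50)
    sol = solve_bvp(fun, bc, tgrid, np.zeros((4, tgrid.size)))
    u_star = -B.T @ sol.sol(tgrid)[2:]   # open-loop u*(t) = -B^T p(t)
    print(sol.status, u_star[0, :3])     # status 0 means convergence

Since the problem is linear, the solver's Newton iteration converges even from the zero initial guess used here.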
Remark 2.3 Along trajectories of (2.14),

(d/dt) H(x(t), p(t), t) = ∇H(x(t), p(t), t)^T ż(t) + (∂/∂t) H(x(t), p(t), t) = (∂/∂t) H(x(t), p(t), t),

since

∇H(x(t), p(t), t)^T ż(t) = ∇H(x(t), p(t), t)^T J∇H(x(t), p(t), t) = 0    (since J^T = −J).

In particular, if A, B and L do not depend on t, H(x(t), p(t), t) is constant along trajectories of (2.14).
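This conservation property is also easy to confirm numerically; a brief sketch (hypothetical time-invariant data) evaluates H along a trajectory of (2.14):

    import numpy as np
    from scipy.integrate import solve_ivp

    # Hypothetical time-invariant data; z = (x, p) evolves by (2.14).
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    L = np.eye(2)
    M = np.block([[A, -B @ B.T], [-L, -A.T]])

    def Ham(z):
        # H(x, p) = (1/2) x^T L x + p^T A x - (1/2) p^T B B^T p
        x, p = z[:2], z[2:]
        return 0.5 * (x @ L @ x) + p @ A @ x - 0.5 * (p @ B @ B.T @ p)

    sol = solve_ivp(lambda t, z: M @ z, (0.0, 5.0), [1.0, 0.0, 0.3, -0.2],
                    dense_output=True, rtol=1e-10, atol=1e-12)
    vals = [Ham(sol.sol(t)) for t in np.linspace(0.0, 5.0, 6)]
    print(np.ptp(vals))  # spread of H along the trajectory: ~0 (conserved)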
We now turn our attention to the case of an infinite horizon (t1 = ∞). To simplify the analysis, we also assume that A, B and L are constant. We further simplify the notation by translating the origin of time to the initial time t0, i.e., by letting t0 = 0. Assuming (as above) that L = L^T ≥ 0, we write L = C^T C, so that the problem can be written as