1. Introduction
In practice, the system under consideration is modeled by learning an operator that can reproduce the system from data sampled from it. Typical operator learning problems are formulated on finite grids, using finite-difference methods that approximate the domain of the operator under investigation. Recovering the continuous limit is a challenging undertaking, particularly since irregularly sampled data may alter the evaluation of the learned operator. The use of differential equation solvers to learn dynamics through continuous deep learning models of neural networks, called “Neural Ordinary Differential Equations” (NODE), has been introduced by Chen et al. [1]. As demonstrated by various applications [1,2,3,4,5,6,7,8,9], NODE models provide an explicit connection between deep feed-forward neural networks and dynamical systems, offering flexible trade-offs between efficiency, memory costs and accuracy while bridging modern deep learning and traditional numerical modelling. However, NODE models are limited to describing systems that are instantaneous, since each time-step is determined locally in time, without contributions from the state of the system at other times.
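The local-in-time property noted above can be made concrete with a deliberately minimal sketch: the state advances using only its current value. The forward-Euler loop, tanh dynamics, and weight matrix `W` below are illustrative assumptions, not the adaptive-solver implementation of Chen et al. [1].

```python
import numpy as np

def f(h, t, W):
    """Illustrative 'neural' dynamics: a single tanh layer with weights W."""
    return np.tanh(W @ h)

def node_forward(h0, W, t0=0.0, tf=1.0, n_steps=100):
    """Integrate dh/dt = f(h, t, W) from t0 to tf with forward Euler.
    Each step uses only the current state h(t) -- the 'instantaneous',
    local-in-time property of NODE models noted in the text."""
    h = np.asarray(h0, dtype=float).copy()
    t = t0
    dt = (tf - t0) / n_steps
    for _ in range(n_steps):
        h = h + dt * f(h, t, W)   # no dependence on states at other times
        t += dt
    return h

W = np.array([[0.0, -1.0], [1.0, 0.0]])   # illustrative (rotation-like) weights
h_final = node_forward(np.array([1.0, 0.0]), W)
```

In a trained NODE, `W` would be the optimized weights and the Euler loop would be replaced by an adaptive ODE solver; the local-in-time structure of the update is the point here.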
In contradistinction to differential equations, integral equations (IE) model global spatio-temporal relations, which are learned through an IE-solver [see, e.g., 10] which samples the domain of integration continuously. Due to their non-local behavior, IE-solvers are suitable for modeling complex dynamics. The problem of learning dynamics from data through integral equations has been addressed by Zappala et al. [11], who have introduced the Neural Integral Equation (NIE) and the Attentional Neural Integral Equation (ANIE). The NIE and the ANIE can be used to generate dynamics and can also be used to infer the spatio-temporal relations that generated the data, thus enabling the continuous learning of non-local dynamics with arbitrary time resolution [11,12]. Often, ordinary and/or partial differential equations can be recast in integral-equation forms that can be solved more efficiently using IE-solvers, as exemplified in scattering theory [13], fluid flow [14], and integral neutron and photon transport [15].
Zappala et al. [16] have also developed a deep learning method called the Neural Integro-Differential Equation (NIDE), which “learns” an integro-differential equation (IDE) whose solution approximates data sampled from given non-local dynamics. The motivation for using NIDE stems from the need to model systems that present spatio-temporal relations which transcend local modeling, as illustrated by the pioneering works of Volterra on population dynamics [17]. Combining the properties of differential and integral equations, IDEs also present properties that are unique to their non-local behavior [18,19,20], with applications in computational biology, physics, engineering and applied sciences [18,19,20,21,22,23].
All neural nets are trained by minimizing a “loss functional” which aims at representing the discrepancy between a “reference solution” and the output produced by the respective net’s decoder. The neural net is optimized to reproduce the underlying physical system as closely as possible. However, the physical system modeled by a neural net comprises parameters that stem from measurements and/or computations which are subject to uncertainties. Therefore, even if the neural net modeled the system’s parameters perfectly, the uncertainties inherent in these parameters would propagate to the subsequent results of interest, which are various functionals of the net’s decoder output rather than some “loss functional.” Hence, it is important to quantify the uncertainties induced in the decoder’s output by the uncertainties that afflict the parameters/weights underlying the physical system modeled by the respective neural net. The quantification of the uncertainties in the net’s decoder and derived results (called “responses”) of interest requires the computation of the sensitivities of the decoder’s response with respect to the optimized weights/parameters comprised within the neural net.
Neural nets comprise not only scalar-valued weights/parameters but also scalar-valued functions (e.g., correlations, material properties, etc.) of the model’s scalar parameters. It is convenient to refer to such scalar-valued functions as “features of primary model parameters.” Cacuci [24] has recently introduced the “nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N),” which enables the most efficient computation of the exact expressions of arbitrarily high-order sensitivities of model responses with respect to the model’s “features.” Subsequently, the sensitivities of the responses with respect to the primary model parameters are determined, analytically and trivially, by applying the “chain-rule” to the expressions obtained for the response sensitivities with respect to the model’s features/functions of parameters.
Based on the general framework of the nth-FASAM-N methodology [24], Cacuci has developed specific sensitivity analysis methodologies for NODE-nets, as follows: the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (1st-FASAM-NODE)” [25] and the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (2nd-FASAM-NODE)” [26]. The 1st-FASAM-NODE and the 2nd-FASAM-NODE are pioneering sensitivity analysis methodologies which enable the computation, with unparalleled efficiency, of exactly-determined first-order and, respectively, second-order sensitivities of the decoder response with respect to the optimized/trained weights involved in the NODE’s decoder, hidden layers, and encoder.
Two important families of IDEs are the Volterra and the Fredholm equations. In a Volterra IDE, the interval of integration grows linearly during the system’s dynamics, while in a Fredholm IDE the interval of integration is fixed during the dynamic history of the system, but at any given time instance within this interval, the system depends on the past, present and future states of the system. By applying the general concepts underlying the nth-FASAM-N methodology [24], Cacuci [27,28] has also developed the general methodologies underlying the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Fredholm-Type (2nd-FASAM-NIE-F)” and the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (2nd-FASAM-NIE-V).” The 2nd-FASAM-NIE-F encompasses the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Fredholm-Type (1st-FASAM-NIE-F),” while the 2nd-FASAM-NIE-V encompasses the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (1st-FASAM-NIE-V).” The 1st-FASAM-NIE-F and 1st-FASAM-NIE-V methodologies, respectively, enable the computation, with unparalleled efficiency, of exactly-determined first-order sensitivities of the decoder response with respect to the NIE-parameters, requiring a single “large-scale” computation for solving the 1st-Level Adjoint Sensitivity System (1st-LASS), regardless of the number of weights/parameters underlying the NIE-net. The 2nd-FASAM-NIE-F and 2nd-FASAM-NIE-V methodologies, respectively, enable the computation (with unparalleled efficiency) of exactly-determined second-order sensitivities of the decoder response with respect to the NIE-parameters, requiring only as many “large-scale” computations as there are first-order sensitivities with respect to the feature functions.
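The distinction between the two integration domains can be illustrated numerically: a Volterra-type integral term integrates the state history over the growing interval [t0, t], whereas a Fredholm-type term integrates over the fixed interval [t0, tf], so future states also contribute. The kernel, state history, and trapezoidal quadrature below are illustrative assumptions, not constructs taken from the cited methodologies.

```python
import numpy as np

def trapezoid(vals, s):
    """Trapezoidal quadrature of sampled values `vals` over nodes `s`."""
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(s)))

def integral_term(y, t_grid, t_index, kernel, kind):
    """Integral term of an IDE at global time t = t_grid[t_index].
    Volterra: the local-time domain grows with t (integrate over [t0, t]).
    Fredholm: the domain is the fixed interval [t0, tf], so past, present
    and future states of the system all contribute."""
    t = t_grid[t_index]
    if kind == "volterra":
        s = t_grid[: t_index + 1]
        return trapezoid(kernel(t, s) * y[: t_index + 1], s)
    s = t_grid
    return trapezoid(kernel(t, s) * y, s)

t_grid = np.linspace(0.0, 1.0, 201)
y = np.sin(np.pi * t_grid)                     # illustrative state history
kern = lambda t, s: np.exp(-(t - s) ** 2)      # illustrative smooth kernel

mid = 100                                      # t = 0.5, mid-interval
fredholm_term = integral_term(y, t_grid, mid, kern, "fredholm")
volterra_term = integral_term(y, t_grid, mid, kern, "volterra")
```

With this positive integrand, the Fredholm term exceeds the Volterra term at mid-interval precisely because the Fredholm domain also covers the "future" half of the interval.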
This work presents the “First- and Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type,” abbreviated as “1st-FASAM-NIDE-F” and “2nd-FASAM-NIDE-F,” respectively. These methodologies are also based on the general framework of the nth-FASAM-N methodology [24]. The 1st-FASAM-NIDE-F is presented in Section 2, while the 2nd-FASAM-NIDE-F is presented in Section 3.
Section 4 presents an illustrative application of the 1st-FASAM-NIDE-F and 2nd-FASAM-NIDE-F methodologies to a heat transfer model. This illustrative model has been chosen because it can be formulated either as a first-order integro-differential equation of Fredholm type or as a conventional second-order “neural ordinary differential equation (NODE),” while admitting exact closed-form solutions/expressions for all quantities of interest, including state functions and first-order and second-order sensitivities. The availability of these alternative formulations, either as a NIDE-F or as a NODE, of the illustrative paradigm heat conduction model makes it possible to compare the detailed, step-by-step applications of the 1st-FASAM-NIDE-F versus the 1st-FASAM-NODE methodologies (for computing most efficiently the exact expressions of the first-order sensitivities of the decoder response with respect to the model parameters) and, subsequently, to compare the applications of the 2nd-FASAM-NIDE-F versus the 2nd-FASAM-NODE methodologies (for computing most efficiently the exact expressions of the second-order sensitivities of the decoder response with respect to the model parameters).
The discussion offered in Section 5 concludes this work by highlighting the unparalleled efficiency of the 1st-FASAM-NIDE-F and 2nd-FASAM-NIDE-F methodologies for computing exact first- and second-order sensitivities, respectively, of decoder responses to model parameters in optimized NIDE-F networks. Ongoing work aims at developing the “First- and Second-Order Features Adjoint Sensitivity Analysis Methodologies for Neural Integro-Differential Equations of Volterra-Type” (1st-FASAM-NIDE-V and 2nd-FASAM-NIDE-V, respectively), which will enable, for the first time, the most efficient computation of the exact expressions of the first- and second-order sensitivities of decoder responses with respect to the optimized network’s weights/parameters for NIDE-V neural nets.
2. First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type (1st-FASAM-NIDE-F)
The mathematical expression of the network of nonlinear Fredholm-type Neural Integro-Differential Equations (NIDE-F) considered in this work generalizes the NIDE-net model introduced in [16] and is represented in component form by the following system of Nth-order integro-differential equations:
The boundary conditions, imposed at the “initial time” and/or “final time” on the functions and their time-derivatives associated with the encoder of the NIDE-F net represented by Equation (1), are represented in operator form as follows:
The quantities appearing in Equations (1) and (2) are defined as follows:
- (i)
The real-valued scalar quantities and , , are time-like independent variables which parameterize the dynamics of the hidden/latent neuron units. Customarily, the variable is called the “global time” while the variable is called the “local time.” The initial time-value is denoted as while the stopping time-value is denoted as . Thus, the dynamics modeled by Equation (1) depends both on non-local effects and on instantaneous information.
- (ii)
The components of the -dimensional vector-valued function represent the hidden/latent neural networks; denotes the total number of components of . In this work, the symbol “” will be used to denote “is defined as” or, equivalently, “is by definition equal to.” The various vectors will be considered to be column vectors. Typically, vectors will be denoted using bold lower-case letters. The dagger “” symbol will be used to denote “transposition.”
- (iii)
The components of the column-vector represent the “primary” network parameters, namely scalar learnable/adjustable parameters/weights, in all of the latent neural nets, including the encoder(s) and decoder(s), where denotes the total number of adjustable parameters/weights.
- (iv)
The scalar-valued components , , of the vector-valued function represent the “features/functions of the primary model parameters.” The quantity denotes the total number of such feature functions comprised in the NIDE-F. In particular, all of the model parameters that might appear solely in the boundary and/or initial conditions are considered to be included among the components of the vector . In general, is a nonlinear vector-valued function of . The total number of feature functions must necessarily be smaller than the total number of primary parameters (weights), i.e., . When the NIDE-F comprises only primary parameters, it is considered that for all .
- (v)
The functions model the dynamics of the neurons in a latent space where the local time integration occurs, while the functions map the local space back to the original data space. The functions model additional dynamics in the original data space. In general, these functions are nonlinear in their arguments.
- (vi)
The functions are coefficient-functions, which may depend nonlinearly on the functions and , associated with the order,, of the time-derivatives of the functions .
- (vii)
The operators , , represent boundary conditions associated with the encoder and/or decoder, imposed at and/or at on the functions and on their time-derivatives; the quantity “BC” denotes the “total number of boundary conditions.”
Customarily, the NIDE-F net is “trained” by minimizing a user-chosen loss functional representing the discrepancy between a reference solution (“target data”) and the output produced by the NIDE-F decoder. The “training” process produces “optimal” values for the primary parameters , which will be denoted in this work by using the superscript “zero,” as follows: . Using these optimal/nominal parameter values to evaluate the NIDE-F net yields the optimal/nominal solution , which will satisfy the following form of Equation (1):
subject to the following optimized/trained boundary conditions:
After the NIDE-F net is optimized to reproduce the underlying physical system as closely as possible, the subsequent responses of interest are no longer “loss functionals” but become specific functionals of the NIDE-F’s “decoder” output, which can be generally represented by the functional defined below:
The function models the decoder. The scalar-valued quantity is a functional of and , and represents the NIDE-F’s decoder-response. At the optimal/nominal parameter values, i.e., at , the decoder response takes on the following formal form:
The physical system modeled by the NIDE-F net comprises parameters that stem from measurements and/or computations. Consequently, even if the NIDE-F net models perfectly the underlying physical system, the NIDE-F’s optimal weights/parameters are unavoidably afflicted by uncertainties stemming from the parameters underlying the physical system. Hence, it is important to quantify the uncertainties induced in the decoder output, , by the uncertainties that afflict the parameters/weights underlying the physical system modeled by the NIDE-F net. The relative contributions of the uncertainties afflicting the optimal parameters to the total uncertainty in the decoder response are quantified by the sensitivities of the NIDE-F decoder-response with respect to the optimized NIDE-F parameters. The general methodology for computing the first-order sensitivities of the decoder output, , with respect to the components of the feature function , and with respect to the primary model parameters , will be presented in this Section.
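The role that these first-order sensitivities play in uncertainty quantification can be sketched with the standard first-order (“sandwich”) propagation formula, var(R) ≈ sᵀCs, where s collects the response sensitivities and C is the parameter covariance matrix. The numerical values below are illustrative placeholders, not quantities from this work.

```python
import numpy as np

# First-order propagation of parameter uncertainties to the decoder response:
#   var(R) ~= s^T C s,
# where s_j = dR/dalpha_j are the first-order sensitivities and C is the
# covariance matrix of the (optimized) parameters.  Illustrative numbers only.
s = np.array([2.0, -1.0, 0.5])            # sensitivities dR/dalpha_j
C = np.array([[0.04, 0.01, 0.00],         # parameter covariance matrix
              [0.01, 0.09, 0.00],
              [0.00, 0.00, 0.01]])

var_R = float(s @ C @ s)                  # induced response variance
std_R = float(np.sqrt(var_R))             # induced response standard deviation
```

This is why the sensitivities (rather than the loss functional itself) are the quantities of interest after training: they weight each parameter’s uncertainty contribution to the response uncertainty.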
The known nominal values of the primary model parameters (“weights”) characterizing the NIDE-F net will differ from the true but unknown values of the respective weights by variations denoted as . The variations will induce corresponding variations , , in the feature functions. The variations and will induce, through Equation (1), variations around the nominal/optimal functions . In turn, the variations and will induce variations in the NIDE-F decoder’s response.
The “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm-Type (1st-FASAM-NIDE-F)” aims at obtaining the exact expressions of the first-order sensitivities (i.e., functional derivatives) of the decoder’s response with respect to the feature function and the primary model parameters, followed by the most efficient computation of these sensitivities. The 1st-FASAM-NIDE-F will be established by applying the same principles as those underlying the 1st-FASAM-N methodology [24]. The fundamental concept for defining the sensitivity of an operator-valued quantity with respect to variations in a neighborhood around the nominal values has been shown in 1981 by Cacuci [29] to be provided by the first-order Gateaux (G-) variation of , which is defined as follows:
for a scalar and for arbitrary vectors in a neighborhood around . When the G-variation is linear in the variation , it can be written in the form , where denotes the first-order G-derivative of with respect to , evaluated at .
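The defining limit of the G-variation can be checked numerically for a simple functional: it is the ordinary derivative of R(e⁰ + εv) with respect to the scalar ε at ε = 0. The quadratic functional below is an illustrative assumption, chosen because its G-variation is linear in the direction v and known in closed form.

```python
import numpy as np

def g_variation(R, e0, v, eps=1e-6):
    """Numerical first-order Gateaux variation of a functional R at e0 in
    the direction v: d/d(eps) R(e0 + eps*v) at eps = 0, approximated here
    by a central difference in the scalar eps."""
    return (R(e0 + eps * v) - R(e0 - eps * v)) / (2.0 * eps)

# Illustrative functional R(e) = sum_i e_i^2.  Its exact G-variation in the
# direction v is 2 * (e0 . v), which is linear in v, so the G-derivative
# exists and equals the gradient 2*e0 paired with v.
R = lambda e: float(np.dot(e, e))
e0 = np.array([1.0, 2.0])
v = np.array([0.5, -1.0])

numeric = g_variation(R, e0, v)
exact = 2.0 * float(np.dot(e0, v))   # closed-form G-variation for this R
```

For this quadratic functional the central difference is exact up to rounding, since the O(ε²) terms cancel; for a general nonlinear functional the agreement holds only in the ε → 0 limit.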
Applying the definition provided in Equation (7) to Equation (5) yields the following expression for the first-order G-variation of the response :

where the “direct effect term” arises directly from variations and is defined as follows:

and where the “indirect effect term,” which arises indirectly through the variations in the hidden state functions , is defined as follows:
The direct-effect term can be quantified using the nominal values but the indirect-effect term can be quantified only after determining the variations , which are caused by the variations through the NIDE-F net defined in Equation (1).
The first-order relationship between the variations and is obtained from the first-order G-variations of Equations (1) and (2), which are obtained, by definition, as follows:
Carrying out the operations indicated in Equations (11) and (12) yields the following NIDE-F net of Fredholm-type for the function :
where:
The NIDE-F net represented by Equations (13) and (14) is called [24] the “1st-Level Variational Sensitivity System” (1st-LVSS) and its solution, , is called [24] the “1st-level variational function.” All of the quantities in Equations (13) and (14) are to be computed at the nominal parameter values, but the respective indication has not been shown explicitly in order to simplify the notation.
It is important to note that the 1st-LVSS is linear in the variational function . Therefore, the 1st-LVSS represented by Equation (13) can be written in matrix-vector form as follows:
where the -dimensional rectangular matrix comprises as components the quantities defined in Equation (15), while the components of the square matrix are operators (algebraic, differential, integral) defined below, for :
Note that the 1st-LVSS would need to be solved anew for each variation , , in order to determine the corresponding function , which is prohibitively expensive computationally if is a large number. The need for repeatedly solving the 1st-LVSS can be avoided if the variational function could be eliminated from appearing in the expression of the indirect-effect term defined in Equation (10). This goal can be achieved [24] by expressing the right-side of Equation (10) in terms of the solutions of the “1st-Level Adjoint Sensitivity System (1st-LASS)” to be constructed next. The construction of this 1st-LASS will be performed in a Hilbert space comprising elements of the same form as , defined on the domain . This Hilbert space is endowed with an inner product of two elements and , denoted as and defined as follows:
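A quadrature sketch of such an inner product is given below. The precise form is fixed by Equation (19); here it is assumed, as is standard for such constructions, to be the sum over components of the time integral of the pointwise products over [t0, tf], with trapezoidal quadrature as an illustrative discretization.

```python
import numpy as np

def inner_product(a, b, t_grid):
    """Assumed Hilbert-space inner product of two vector-valued functions
    sampled on t_grid:  <a, b> = sum_i  int_{t0}^{tf} a_i(t) b_i(t) dt.
    a, b: arrays of shape (n_components, len(t_grid))."""
    prod = np.sum(a * b, axis=0)                 # pointwise sum over components
    dt = np.diff(t_grid)
    return float(np.sum(0.5 * (prod[1:] + prod[:-1]) * dt))  # trapezoid rule

t_grid = np.linspace(0.0, 1.0, 1001)
a = np.vstack([np.ones_like(t_grid), t_grid])    # components (1, t)
b = np.vstack([t_grid, np.ones_like(t_grid)])    # components (t, 1)
ip = inner_product(a, b, t_grid)                 # int(t) + int(t) over [0,1]
```

For the linear integrands above the trapezoid rule is exact, so the discrete inner product equals the analytic value 1/2 + 1/2 = 1.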
The next step is to construct the inner product of Equation (13) with a vector , where the superscript “(1)” indicates “1st-Level”, to obtain the following relationship:
The terms appearing in Equation (20) are to be computed at the nominal values but the respective notation has been omitted for simplicity.
Using the definition of the adjoint operator in , the term on the left-side of Equation (20) is integrated by parts and the order of summations is reversed to obtain the following relation:

where the operator denotes the formal adjoint of the operator and where represents the scalar-valued bilinear concomitant evaluated on the boundary and/or . Note that the matrix-valued operator acts linearly on the vector . The “star” superscript (*) will be used in this work to denote “formal adjoint operator.”
It follows from Equations (20) and (21) that the following relation holds:
The term on the left-side of Equation (22) is now required to represent the indirect effect term defined in Equation (10) by imposing the following relation:
Using Equations (22) and (23) in Equation (10) yields the following expression for the indirect effect term:
The boundary conditions accompanying Equation (23) for the function are now chosen at the time values and/or so as to eliminate all unknown values of the 1st-level variational function from the bilinear concomitant which remain after implementing the initial conditions provided in Equation (2). These boundary conditions for the function can be represented in operator form as follows:
The Fredholm-like NIDE net represented by Equations (23) and (25) will be called the “1st-Level Adjoint Sensitivity System” and its solution, , will be called the “1st-level adjoint sensitivity function.” The 1st-LASS is solved using the nominal/optimal values for the parameters and for the function , but this fact has not been indicated explicitly in order to simplify the notation. Notably, the 1st-LASS is independent of any parameter variations, so it needs to be solved just once to obtain the 1st-level adjoint sensitivity function . The 1st-LASS is linear in but is, in general, nonlinear in .
Adding the result obtained in Equation (24) for the indirect-effect term to the result obtained in Equation (9) for the direct-effect term yields the following expression for the first-order G-differential of the response :
where denotes the first-order sensitivity of the response with respect to the components of the “feature.” Each sensitivity is obtained by identifying the expression that multiplies the corresponding variation and can be represented formally in the following integral form:
The functions will be subsequently used for determining the exact expressions of the second-order sensitivities of the response with respect to the components of the feature function of model parameters.
In the following subsections, the detailed forms of the 1st-LASS will be provided for first-order (n=1) and, respectively, second-order (n=2) Fredholm-like NIDE.
2.1. First-Order Neural Integro-Differential Equations of Fredholm-Type (1st-NIDE-F)
The representation of the first-order neural integro-differential equations of Fredholm-type (1st-NIDE-F) is provided below, for :
The typical boundary conditions provided at (“encoder”) are as follows:

where the scalar values are known, albeit imprecisely, since they are considered to stem from experiments and/or computations. Equations (28) and (29) are customarily considered an “initial value (NIDE-F) problem,” although the independent variable t could represent some other physical entity (e.g., space, energy, etc.) rather than time.
The 1st-LVSS for the function is obtained by G-differentiating Equations (28) and (29), and has the following particular forms of Equations (13) and (14) for :
where:
The 1st-LASS is constructed by using Equation (19) to form the inner product of Equation (30) with a vector to obtain the following relationship:
Examining the structure of the left-side of Equation (33) reveals that the bilinear concomitant will arise from the integration by parts of the first term on the left-side of Equation (33), which yields the following relation:
where the bilinear concomitant has, by definition, the following expression:
The second term on the left-side of Equation (33) will be recast in its “adjoint form” by reversing the order of summations so as to transform the inner product involving the function into an inner product involving the function , as follows:
The third term on the left-side of Equation (33) is now recast in its “adjoint form” by reversing the order of summations and integrations so as to transform the inner product involving the function into an inner product involving the function , as follows:
The fourth term on the left-side of Equation (33) will be recast in its “adjoint form” by reversing the order of summations and integrations so as to transform the inner product involving the function into an inner product involving the function , as follows:
Using the results obtained in Equations (34)‒(38) in the left-side of Equation (33) yields the following relation:
The relation in Equation (39) is rearranged as follows:
The term on the right-side of Equation (40) is now required to represent the “indirect-effect” term defined in Equation (10), which is achieved by requiring the components of the function to satisfy the following system of first-order NIDE-F equations:
The relation obtained in Equation (41) is the explicit form of the relation provided in Equation (23) for the particular case when , i.e., when considering first-order neural integral equations of Fredholm-type (1st-NIDE-F).
The unknown values in the bilinear concomitant in Equation (40) are eliminated by imposing the following final-time conditions:
It follows from Equations (33)‒(42) and (31) that the indirect-effect term defined in Equation (10) has the following expression in terms of the 1st-level adjoint sensitivity function :
The first-order NIDE-F obtained in Equations (41) and (42) represents the explicit form, for the particular case n = 1, of the 1st-LASS represented in general by Equations (23) and (25). To obtain the 1st-level adjoint sensitivity function , the 1st-LASS is solved backwards in time (globally) using the nominal/optimal values for the parameters and for the function , but this fact has not been indicated explicitly in order to simplify the notation. Notably, the 1st-LASS is independent of any parameter variations, so it needs to be solved just once to obtain the 1st-level adjoint sensitivity function . The 1st-LASS is linear in but is, in general, nonlinear in .
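The backwards-in-time character of the 1st-LASS can be illustrated on a deliberately simple scalar surrogate (an illustrative assumption, not the NIDE-F net itself): for dh/dt = −λh with response R = h(tf), the formal adjoint of L = d/dt + λ is L* = −d/dt + λ, so the adjoint function ψ satisfies dψ/dt = λψ with a final-time condition at tf and is integrated from tf back to t0.

```python
import numpy as np

# Scalar surrogate problem: forward model dh/dt = -lam*h on [0, tf],
# response R = h(tf).  The adjoint function psi satisfies dpsi/dt = lam*psi
# with the FINAL-time condition psi(tf) = dR/dh(tf) = 1, so it is marched
# backwards in time, mirroring the backward solve of the 1st-LASS.
lam, tf, n = 0.5, 1.0, 4000
dt = tf / n

psi = 1.0                       # final-time condition imposed at t = tf
for _ in range(n):              # march backwards: t -> t - dt
    psi -= dt * (lam * psi)     # Euler step on dpsi/dt = lam*psi, reversed

# For this surrogate, the sensitivity of R with respect to the initial
# condition h(0) is psi(0), known analytically to be exp(-lam*tf).
analytic = float(np.exp(-lam * tf))
```

The backward march recovers ψ(0) ≈ e^(−λ·tf), the sensitivity of the response to the initial condition, without ever solving a variational (forward) system per parameter.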
Using the results obtained in Equations (43) and (9) in Equation (8) yields the following expression for the G-variation , which is seen to be linear in the variations , , in the model’s feature functions (induced by variations in the model’s primary parameters) and in the variations , , in the encoder’s initial conditions:
The expression in Equation (44) is to be satisfied at the nominal/optimal values for the respective model parameters, but this fact has not been indicated explicitly in order to simplify the notation.
Identifying in Equation (44) the expressions that multiply the variations yields the following expressions for the decoder response sensitivities with respect to the encoder’s initial conditions:
It is apparent from Equation (45) that the sensitivities are functionals of the form predicted in Equation (27). It is also apparent from Equation (45) that the sensitivities are proportional to the values of the respective component of the 1st-level adjoint function evaluated at the initial-time . This relation provides an independent mechanism for verifying the correctness of solving the 1st-LASS from to (backwards in time) since the sensitivities can be computed independently of the 1st-LASS by using finite differences of appropriately high-order in conjunction with known variations and the correspondingly induced variations in the decoder response. Special attention needs to be devoted, however, to ensure that the respective finite-difference formula is accurate, which may need several trials with different values chosen for the variation .
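The verification mechanism described above can be sketched on a scalar surrogate model (an illustrative assumption, not the NIDE-F net): the sensitivity of the response to the initial condition, written here in closed form in place of the adjoint value at the initial time, is compared against central finite differences computed with several trial step sizes, as recommended.

```python
import numpy as np

def response(w0, lam=0.5, tf=1.0, n=2000):
    """Toy stand-in for the decoder response: integrate dh/dt = -lam*h from
    h(0) = w0 with forward Euler and return R = h(tf)."""
    h, dt = float(w0), tf / n
    for _ in range(n):
        h += dt * (-lam * h)
    return h

lam, tf, w0 = 0.5, 1.0, 2.0
# For this surrogate, dR/dw0 is known in closed form; it stands in here for
# the initial-time adjoint value that the 1st-LASS route would deliver:
adjoint_route = float(np.exp(-lam * tf))

# Central differences with several trial step sizes, since the accuracy of a
# finite-difference formula must itself be checked:
fd_estimates = [(response(w0 + dw) - response(w0 - dw)) / (2.0 * dw)
                for dw in (1e-2, 1e-3, 1e-4)]
```

Agreement of the finite-difference estimates with the adjoint-route value, across several step sizes, is the independent consistency check described in the text.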
It also follows from Equations (44) and (32) that the sensitivities of the response with respect to the components of the feature function have the following expressions, written in the form of Equation (27):
where
The subscript “1” attached to the quantity indicates that this quantity refers to a “first-order” NIDE-F net, while the superscript “(1)” indicates that this quantity refers to “first-order” sensitivities.
The sensitivities with respect to the primary model parameters can be obtained by using the result shown in Equation (46) together with the “chain rule” of differentiating compound functions, as follows:
When there are only model parameters (i.e., there are no feature functions of model parameters), then for all , and the expression obtained in Equation (46) yields directly the first-order sensitivities , for all . In this case, all of the sensitivities , for all , would be obtained by computing integrals (using quadrature formulas). In contradistinction, when features of parameters can be established, only integrals would need to be computed (using quadrature formulas) to obtain the , ; the sensitivities with respect to the model parameters would subsequently be obtained analytically using the chain-rule provided in Equation (48).
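The chain-rule step amounts to one matrix-vector product: the feature sensitivities are multiplied by the Jacobian of the features with respect to the primary parameters. The two feature functions, three parameters, and feature-sensitivity values below are illustrative assumptions.

```python
import numpy as np

# Illustrative features of primary parameters alpha = (a1, a2, a3):
#   F1 = a1*a2,  F2 = a3**2,
# i.e., two feature functions of three primary parameters/weights.
def feature_jacobian(alpha):
    """Analytic Jacobian dF_i/dalpha_j (shape 2 x 3)."""
    a1, a2, a3 = alpha
    return np.array([[a2,  a1,  0.0],
                     [0.0, 0.0, 2.0 * a3]])

# Suppose the adjoint computation delivered the two feature sensitivities
# dR/dF_i (illustrative numbers standing in for the quadratures of Eq. (46)):
dR_dF = np.array([3.0, -1.5])

alpha0 = np.array([1.0, 2.0, 0.5])
# Chain rule:  dR/dalpha_j = sum_i (dR/dF_i) * (dF_i/dalpha_j)
dR_dalpha = dR_dF @ feature_jacobian(alpha0)
```

Note that only the two feature sensitivities require quadratures; the three (or, in general, many more) parameter sensitivities then follow analytically from the product above, which is the computational advantage emphasized in the text.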
Occasionally, the boundary conditions may be provided through a measurement at the boundary (“decoder”), as follows:
where the scalar values are known, albeit imprecisely, since they are considered to stem from experiments and/or computations. In such a case, the determination of the first-order sensitivities of the response with respect to the components of the feature function follows the same steps as in Section 2.1.2, above, yielding the following results:
- (i)
The 1st-LASS will become an “initial value problem” comprising Equation (41), subject not to the conditions shown in Equation (42) but to the following “initial conditions”:
- (ii)
The sensitivities of the response with respect to the components of the feature function will have the same formal expressions as in Equation (46) but the components of the 1st-level adjoint function will be the solution of Equations (41) and (50).
- (iii)
The sensitivities of the response with respect to the boundary conditions at will have the following expressions:
2.2. Second-Order Neural Integro-Differential Equations of Fredholm-Type (2nd-NIDE-F)
The representation of the second-order neural integro-differential equations of Fredholm-type (2nd-NIDE-F) is provided below, for :
There are several combinations of boundary conditions that can be provided, either for the function and/or for its first-derivative , , at either (encoder) or at (decoder), or a combination thereof. For illustrative purposes, consider that the boundary conditions are as follows:
The 1st-LVSS is obtained by taking the G-variations of Equations (52) and (53) to obtain the following system, comprising the forms taken on for by Equations (13) and (14), respectively:
where for and :
The 1st-LASS is constructed by using Equation (19) to form the inner product of Equation (54) with a vector to obtain the following relationship:
Examining the structure of the left-side of Equation (57) reveals that the bilinear concomitant will arise from the integration by parts of the first and third terms on the left-side of Equation (57), as follows:
where the bilinear concomitant has the following expression:
The remaining terms on the left-side of Equation (57) will be recast into their corresponding “adjoint form” by using the results obtained in Equations (34)‒(38). Using these results together with the results obtained in Equations (58) and (59) yields the following expression for the left-side of Equation (57):
Using Equation (58) and rearranging the terms on the right-side of Equation (60) yields the following relation:
The term on the right-side of Equation (61) is now required to represent the “indirect-effect” term defined in Equation (10), which is achieved by requiring the components of the function to satisfy the following 1st-LASS:
The relation obtained in Equation (62) is the explicit form of the relation provided in Equation (23) for the particular case when , i.e., when considering second-order neural integro-differential equations of Fredholm-type (2nd-NIDE-F).
The unknown values involving the function
in the bilinear concomitant
defined in Equation (59) are eliminated by imposing the following conditions:
It follows from Equations (33)‒(42) and (31) that the indirect-effect term defined in Equation (10) has the following expression in terms of the 1st-level adjoint sensitivity function
:
where the boundary quantity
contains the known remaining terms after having implemented the known boundary conditions given in Equations (55) and (63), and has the following explicit expression:
Using the results obtained in Equations (64), (65), (56) and (9) in Equation (8) yields the following expression for the G-variation
, which is seen to be linear in the variations
,
(
) and
(
):
The expression in Equation (66) is to be satisfied at the nominal/optimal values for the respective model parameters, but this fact has not been indicated explicitly in order to simplify the notation.
It also follows from Equations (66) and (56) that the sensitivities
of the response
with respect to the components
of the feature function
have the following expressions, written in the form of Equation (27):
where
The subscript “2” attached to the quantity indicates that this quantity refers to a “second-order” NIDE-F net, while the superscript “(1)” indicates that this quantity refers to “first-order” sensitivities. As expected, the expression of reduces to the expression of when the “second-order NIDE-F net” reduces to the “first-order NIDE-F net” in the case when .
Identifying in Equation (66) the expressions that multiply the variations
yields the following expressions for the decoder response sensitivities with respect to the encoder’s initial-time conditions:
Identifying in Equation (66) the expressions that multiply the variations
yields the following expressions for the decoder response sensitivities with respect to the final-time conditions:
If the boundary conditions imposed on the forward functions and/or the first-derivatives , , differ from the illustrative ones selected in Equation (53), then the corresponding boundary conditions for the 1st-level adjoint function would also differ from the ones shown in Equation (63), as would be expected. The components of would consequently have different values; therefore, all of the first-order sensitivities would have values different from those computed using Equation (68), even though the formal mathematical expressions of the respective sensitivities would remain unchanged. Of course, the sensitivities and would have expressions that would differ from those in Equations (69) and (70), respectively, if the boundary conditions in Equation (53), and consequently those in Equation (63), were different, since the residual bilinear concomitant would have a different expression from that shown in Equation (65).
3. Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations of Fredholm Type (2nd-FASAM-NIDE-F)
The second-order sensitivities of the response
defined in Equation (5) will be computed by conceptually using their basic definitions as being the “first-order sensitivities of the first-order sensitivities.” Recall that the generic expression of the first-order sensitivities,
,
, of the response with respect to the components of the feature function
is provided in Equation (46). It follows that the second-order sensitivities of the response with respect to the components of the feature function will be provided by the first-order G-differential
of
, which is by definition obtained as follows:
where the indirect-effect term
comprises all dependencies on the vectors
and
of variations in the state functions
and
, around the respective nominal values denoted as
and
, respectively, which are computed at the nominal parameter values
. This indirect-effect term is defined as follows:
The variational function
is the solution of the system of equations obtained by G-differentiating the 1st-LASS defined in Equations (23) and (25), which is by definition obtained as follows:
Carrying out the operations indicated in Equations (73) and (74) yields the following relations:
For subsequent derivations, it is convenient to represent the relations in Equation (75) in matrix-vector form, as follows:
where
As indicated by Equation (78), the variational functions
and
are the solutions of the system of matrix equations obtained by concatenating the 1st-LVSS defined by Equations (14) and (16) with Equations (77) and (78). The concatenated system thus obtained will be called the 2nd-Level Variational Sensitivity System (2nd-LVSS) and has the block-matrix form provided below:
To distinguish block-matrices from block-vectors, two bold capital-letters have been used (and will henceforth be used) to denote block-matrices, as in the case of “the second-level variational matrix” . The “2nd-level” is indicated by the superscript “(2)”. The argument “”, which appears in the list of arguments of , indicates that this matrix is a -dimensional block-matrix comprising four submatrices, each of dimensions . The structure of the block-matrix is provided below:
The argument “2” which appears in the list of arguments of the vector
and of the “variational vector”
in Equation (80) indicates that each of these vectors is a 2-block column vector, each block comprising a column-vector of dimension
; the vectors
and
are defined as follows:
The 2-block vector
is defined as follows:
The 2-block column vector in Equation (81) represents the concatenated boundary/initial conditions provided in Equations (14) and (77), evaluated at the nominal parameter values. The argument “2” in the expression in Equation (81) indicates that this expression is a two-block column vector comprising two vectors, each of which has -components, all of which are zero-valued.
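Although the operators entering the 2nd-LVSS are problem-specific, the block structure described above can be sketched numerically as follows (NumPy; every submatrix is a placeholder stand-in, not one of the actual G-differentiated operators):

```python
import numpy as np

# Sketch of the 2nd-LVSS block structure; every submatrix below is a
# placeholder (the actual blocks are the problem-specific G-derivatives).
n = 3  # dimension of each block

A11 = np.eye(n)              # block acting on the first variational vector
A21 = 0.5 * np.ones((n, n))  # coupling block from the G-differentiated 1st-LASS
A22 = np.eye(n)

# "Second-level variational matrix": (2n) x (2n), built from four n x n blocks.
VM = np.block([[A11, np.zeros((n, n))],
               [A21, A22]])

# 2-block variational vector: two concatenated n-dimensional column vectors.
v = np.concatenate([np.ones(n), 2.0 * np.ones(n)])

# 2-block boundary/initial-condition vector: two n-component zero vectors.
bc = np.zeros(2 * n)

assert VM.shape == (2 * n, 2 * n)
assert v.shape == (2 * n,) and np.all(bc == 0.0)
```

The block lower-triangular coupling shown here is only illustrative of how two n-dimensional systems concatenate into one 2n-dimensional system.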
The need for solving the 2nd-LVSS is circumvented by deriving an alternative expression for the indirect-effect term
defined in Equation (72), in which the function
is replaced by a 2nd-level adjoint function that is independent of variations in the model parameter and state functions. This 2nd-level adjoint function will be the solution of a 2nd-Level Adjoint Sensitivity System (2nd-LASS), which will be constructed by using the same principles as employed for deriving the 1st-LASS. The 2nd-LASS is constructed in a Hilbert space
,
, comprising block-vectors having the same structure as
that can generically be represented as follows:
, with
, for
. The Hilbert space
is endowed with the following inner product of two vectors
and
:
The inner product defined in Equation (85) will be used to construct the 2nd-Level Adjoint Sensitivity System (2nd-LASS) for a 2nd-level adjoint function
,
,
, by implementing the following sequence of steps, which are conceptually similar to those implemented in
Section 2 for constructing the 1st-FASAM-NIDE-F methodology:
Using Equation (85), construct the inner product of the yet undetermined function
with Equation (80) to obtain the following relation:
Use the definition of the operator adjoint to
in the Hilbert space
to transform the inner product on the left-side of Equation (86) as follows:
where the quantity
denotes the corresponding bilinear concomitant on the domain’s boundary, evaluated at the nominal values for the parameters and respective state functions, and where the operator
denotes the formal adjoint of the matrix-valued operator
, comprising
block-matrices, each of dimensions
, having the following block-matrix structure:
-
Require the inner product on the right-side of Equation (87) to represent the indirect-effect term
defined in Equation (72) by imposing the following relation:
where
Since the source-term on the right-side of Equation (89) is a distinct quantity for each value of the index , this index has been added to the list of arguments of the function in order to emphasize that a distinct function will correspond to each index . Of course, the adjoint operator that acts on the function is independent of the index and could, in principle, be inverted just once and stored for subsequent repeated applications to the -dependent source terms for computing the corresponding functions .
-
The definition of the function
is completed by requiring it to satisfy adjoint boundary/initial conditions represented in operator form as follows:
The boundary/initial conditions represented by Equation (91) are determined by imposing the following requirements:
(a) they must be independent of unknown values of ;
(b) the substitution of the boundary and/or initial conditions represented by Equations (81) and (91) into the expression of the bilinear concomitant must cause all terms containing unknown boundary/initial values of to vanish.
The NIDE-F net comprising Equations (89) and (91) is called the “2nd-Level Adjoint Sensitivity System (2nd-LASS)” and its solution, , , is called the “2nd-level adjoint sensitivity function.” The unique properties of the 2nd-LASS will be highlighted below.
Using in Equation (72) the relations defining 2nd-LASS together with the 2nd-LVSS and the relation provided in Equation (87) yields the following alternative expression for the indirect-effect term, involving the 2nd-level adjoint sensitivity function
instead of the 2nd-level variational function
:
where
denotes known residual (non-zero) boundary terms which may not have vanished after having used the boundary and/or initial conditions represented by Equations (81) and (91).
Replacing the expression obtained in Equation (92) into Equation (71) yields the following expression:
The expressions of the second-order sensitivities of the response with respect to the components of the feature function are obtained by performing the following sequence of operations:
- (i)
Use Equation (84) to recast the second term on the right-side of Equation (93) as follows:
- (ii)
Recall that
, where the quantities
were defined in Equation (15). Recall that
where the quantities
were defined in Equation (76). Insert these expressions in Equation (94) to obtain the following relation:
- (iii)
Insert into Equation (93) the equivalent expression obtained in Equation (95), and subsequently identify the quantities that multiply the variations
, to obtain the following expression for the second-order sensitivities
:
It is important to note that the 2nd-LASS is independent of parameter variations and of variations in the respective state functions. It is also important to note that the -dimensional matrix is independent of the index . Only the source-term depends on the index . Therefore, the same solver can be used to invert the matrix in order to solve numerically the 2nd-LASS for each -dependent source, thereby obtaining the corresponding -dependent -dimensional 2nd-level adjoint function . Computationally, it would be most efficient to store, if possible, the inverse matrix , and to multiply it directly with the corresponding source term , for each index , to obtain the corresponding -dependent -dimensional 2nd-level adjoint function .
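The invert-once/reuse strategy described above can be sketched as follows (NumPy; the operator and the index-dependent source terms are random stand-ins for the actual adjoint operator and sources):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 40

# Stand-in for the discretized adjoint operator (well-conditioned by construction).
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))

# One distinct source term per index i (stand-ins for the i-dependent sources).
sources = [rng.standard_normal(n) for _ in range(6)]

# Invert (or factorize) the operator once ...
A_inv = np.linalg.inv(A)

# ... and apply it repeatedly to each i-dependent source term.
adjoint_functions = [A_inv @ s for s in sources]

# Each result solves the corresponding adjoint system A psi = s.
for psi, s in zip(adjoint_functions, sources):
    assert np.allclose(A @ psi, s)
```

In practice an LU factorization would typically be stored instead of the explicit inverse, but the computational pattern (one expensive factorization, many cheap applications) is the same.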
Since the adjoint matrix is block-diagonal, solving the 2nd-LASS is equivalent to solving two 1st-LASS, with two different source terms. Thus, the “solvers” and the computer program used for solving the 1st-LASS can also be used for solving the 2nd-LASS. The 2nd-LASS was designated as the “second-level” rather than the “second-order” adjoint sensitivity system, since the 2nd-LASS does not involve any explicit 2nd-order G-derivatives of the operators underlying the original system but involves the inversion of the same operators that need to be inverted for solving the 1st-LASS.
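The equivalence between one block-diagonal 2nd-LASS solve and two independent 1st-LASS solves can be verified on a small stand-in system (the matrix below is a hypothetical discretized 1st-LASS operator, not the model's actual operator):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30

# Stand-in for the (discretized) 1st-LASS operator.
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))

# Block-diagonal 2nd-LASS operator: two copies of the 1st-LASS operator.
Z = np.zeros((n, n))
A2 = np.block([[A, Z], [Z, A]])

# Two different source terms, concatenated into one 2-block right-hand side.
s1, s2 = rng.standard_normal(n), rng.standard_normal(n)

# Solving the 2nd-LASS in one shot ...
z = np.linalg.solve(A2, np.concatenate([s1, s2]))

# ... is equivalent to two independent 1st-LASS solves with different sources.
z1, z2 = np.linalg.solve(A, s1), np.linalg.solve(A, s2)
assert np.allclose(z, np.concatenate([z1, z2]))
```

This is why the same solver and computer program used for the 1st-LASS suffice for the 2nd-LASS.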
If the 2nd-LASS is solved -times, the 2nd-order mixed sensitivities will be computed twice, in two different ways, in terms of two distinct 2nd-level adjoint functions. Consequently, the symmetry property provides an intrinsic (numerical) verification that the 1st-level adjoint function and the components of the 2nd-level adjoint function and are computed accurately.
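The symmetry-based verification can be illustrated on a hypothetical two-parameter response (a stand-in for the actual decoder response): the mixed second-order sensitivity is computed in both differentiation orders and the two results are compared.

```python
import numpy as np

def response(w1, w2):
    # Hypothetical two-parameter response (stand-in for the decoder response).
    return np.exp(-w1) * np.sin(w2) + w1 * w2 ** 2

def mixed(f, a, b, h=1e-4, order="12"):
    # Central finite differences for the mixed second derivative,
    # taken in either of the two possible orders of differentiation.
    if order == "12":
        dfdb = lambda x: (f(x, b + h) - f(x, b - h)) / (2 * h)
        return (dfdb(a + h) - dfdb(a - h)) / (2 * h)
    dfda = lambda y: (f(a + h, y) - f(a - h, y)) / (2 * h)
    return (dfda(b + h) - dfda(b - h)) / (2 * h)

s12 = mixed(response, 0.7, 1.3, order="12")  # d/dw1 of (dR/dw2)
s21 = mixed(response, 0.7, 1.3, order="21")  # d/dw2 of (dR/dw1)

# The two orders agree, mirroring the intrinsic symmetry-based verification.
assert abs(s12 - s21) < 1e-6

# Cross-check against the analytic mixed derivative.
analytic = -np.exp(-0.7) * np.cos(1.3) + 2 * 1.3
assert abs(s12 - analytic) < 1e-4
```

In the adjoint setting the two computations use two distinct 2nd-level adjoint functions rather than two finite-difference orderings, but the agreement criterion is the same.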
The second-order sensitivities of the decoder-response with respect to the optimal weights/parameters
, are obtained analytically by using the chain rule in conjunction with the expressions obtained in Equations (46) and (96), as follows:
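The second-order chain rule invoked here can be sketched on hypothetical (illustrative, not the model's) feature and response functions, verifying the chain-rule Hessian against a finite-difference Hessian of the composite response:

```python
import numpy as np

# Hypothetical feature function h(alpha) and response R(h).
def h(a):
    return np.array([a[0] * a[1], a[0] + a[1] ** 2])

def R(hv):
    return hv[0] ** 2 * hv[1]

def g(a):  # response as a function of the primary parameters
    return R(h(a))

a0 = np.array([0.8, 1.2])
h0 = h(a0)

# Analytic ingredients of the chain rule for these illustrative functions.
dR_dh = np.array([2 * h0[0] * h0[1], h0[0] ** 2])          # gradient of R w.r.t. h
HR = np.array([[2 * h0[1], 2 * h0[0]], [2 * h0[0], 0.0]])  # Hessian of R w.r.t. h
J = np.array([[a0[1], a0[0]], [1.0, 2 * a0[1]]])           # Jacobian dh/dalpha
Hh1 = np.array([[0.0, 1.0], [1.0, 0.0]])                   # Hessian of h1
Hh2 = np.array([[0.0, 0.0], [0.0, 2.0]])                   # Hessian of h2

# Second-order chain rule:
#   d^2R/(da_i da_j) = J^T (HR) J + sum_k (dR/dh_k) * Hessian(h_k)
H_chain = J.T @ HR @ J + dR_dh[0] * Hh1 + dR_dh[1] * Hh2

# Finite-difference Hessian of the composite response, for verification.
eps = 1e-5
H_fd = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        e_i, e_j = np.eye(2)[i] * eps, np.eye(2)[j] * eps
        H_fd[i, j] = (g(a0 + e_i + e_j) - g(a0 + e_i - e_j)
                      - g(a0 - e_i + e_j) + g(a0 - e_i - e_j)) / (4 * eps ** 2)

assert np.allclose(H_chain, H_fd, atol=1e-4)
```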
4. Illustrative Application of the 1st-FASAM-NIDE-F and 2nd-FASAM-NIDE-F Methodologies to a Heat Transfer Model
The application of the 1st-FASAM-NIDE-F Methodology will be illustrated in this Section by considering a model of linear steady-state heat conduction through a homogeneous slab of thickness
, having a constant thermal conductivity denoted as
and involving a distributed heat source that is proportional to the temperature distribution within the slab; the proportionality constant is denoted as
. The slab is considered to be insulated on one side, which is held at a temperature
. The temperature distribution within the slab, denoted as
, is thus modeled by the following linear heat conduction equation:
Consider that the model response of interest, denoted as
, is the average temperature within the slab, which is defined as follows:
The model’s primary parameters are
, which can be subject to uncertainties, but their nominal/optimal values
are considered to be known. These parameters are considered to be components of the following (column) “vector of model parameters”:
The solution of Equation (98) has the following expression:
The quantity
is a “feature function” of the primary model parameters. Using in Equation (99) the result obtained in Equation (101) yields the following closed form expression for the model response:
At the nominal parameter values, the nominal value of the temperature distribution and of the average temperature response, respectively, have the following expressions:
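Since the displayed equations are not reproduced here, the following sketch assumes a concrete form of the paradigm model, namely k*T''(x) + q*T(x) = 0 on [0, L] with T(0) = T0 and T'(0) = 0 (insulated side held at T0), whose solution is T(x) = T0*cos(x*sqrt(q/k)); these assumptions are illustrative only:

```python
import numpy as np

# Illustrative-only assumptions: k*T''(x) + q*T(x) = 0 on [0, L], with
# T(0) = T0 and T'(0) = 0, so that T(x) = T0*cos(x*sqrt(q/k)) and the
# feature of the primary parameters is f = q/k.
k, q, L, T0 = 2.0, 1.0, 1.0, 300.0
f = q / k
omega = np.sqrt(f)

x = np.linspace(0.0, L, 2001)
T = T0 * np.cos(omega * x)  # closed-form temperature distribution

# Response: average temperature over the slab (composite trapezoidal rule).
dx = x[1] - x[0]
R_numeric = dx * (T[0] / 2 + T[1:-1].sum() + T[-1] / 2) / L

# Closed-form response: R = T0*sin(L*omega)/(L*omega).
R_exact = T0 * np.sin(L * omega) / (L * omega)
assert abs(R_numeric - R_exact) < 1e-4
```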
4.1. Applying the 1st-FASAM-NIDE-F Methodology to Obtain the First-Order Response Sensitivities to the Primary Model Parameters
The heat conduction equation presented in Equation (98) can be recast into the following equivalent NIDE-F form:
The first-order sensitivities of the response
will be determined from the first-order Gateaux- (G-) differential, denoted as
, of
, which is obtained by applying the definition of the G-differential to Equation (99), as follows:
The variation
is the solution of the 1st-Level Variational Sensitivity System (1st-LVSS) obtained by G-differentiating Equation (105), which yields the following NIDE-F for arbitrary variations
and
around the nominal values
:
Performing the operations indicated in Equation (107) yields the following form for the 1st-LVSS:
For subsequent reference, it is noted that the solution of the above 1st-LVSS has the following expression:
The 1st-LVSS would need to be solved repeatedly, using every possible parameter variation, in order to determine the corresponding value of the temperature variation . These repeated computations can be avoided by eliminating the appearance of the variation in Equation (106); this aim can be achieved by deriving an alternative expression for the response variation that would not involve the variation . This alternative expression for will be constructed in terms of the first-level adjoint function, which is, in turn, obtained as the solution of the 1st-Level Adjoint Sensitivity System (1st-LASS) to be constructed next by using the inner product defined in Equation (10) for the single-component function . Forming the inner product of Equation (109) with a yet undefined function yields the following relation:
The relation obtained in Equation (111) is satisfied at the nominal/optimal parameter values but this fact has not been explicitly indicated in order to simplify the notation. Integrating by parts the first term on the left-side of Equation (111) and rearranging the second term on the left-side of Equation (111) yields the following relation:
The function
will now be determined as follows: (i) require that the last term on the right-side of Equation (112) be identical to the G-differential
defined in Equation (106); and (ii) eliminate the unknown quantity
in Equation (112). These requirements lead to the following NIDE-F for the function
:
The NIDE-F-net represented by Equations (113) and (114) constitutes the 1st-Level Adjoint Sensitivity System (1st-LASS) for the 1st-level adjoint sensitivity function . The 1st-LASS is satisfied at the nominal parameter values but this fact has not been explicitly indicated in order to simplify the notation.
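The computational rationale for the 1st-LASS (one adjoint solve replacing one variational solve per parameter variation) can be sketched on a generic linear algebraic stand-in A(p)u = b with response r = c^T u; the matrices below are hypothetical, not the model's operators:

```python
import numpy as np

# Generic sketch of the adjoint trick: for A(p) u = b with response r = c^T u,
# solve the single adjoint system A^T lam = c and use dr/dp_j = -lam^T (dA/dp_j) u,
# instead of one variational solve per parameter.
rng = np.random.default_rng(1)
n, n_params = 20, 5

A0 = np.eye(n) + 0.1 * rng.standard_normal((n, n))
dA = [0.01 * rng.standard_normal((n, n)) for _ in range(n_params)]  # dA/dp_j
b = rng.standard_normal(n)
c = rng.standard_normal(n)

def response(p):
    A = A0 + sum(pj * dAj for pj, dAj in zip(p, dA))
    return c @ np.linalg.solve(A, b)

p0 = np.zeros(n_params)
u = np.linalg.solve(A0, b)      # one forward solve
lam = np.linalg.solve(A0.T, c)  # one adjoint solve

# All n_params sensitivities follow from the single adjoint solution.
sens_adj = np.array([-lam @ (dAj @ u) for dAj in dA])

# Finite-difference verification (this route needs n_params extra solves).
eps = 1e-6
sens_fd = np.array([(response(eps * np.eye(n_params)[j]) - response(p0)) / eps
                    for j in range(n_params)])
assert np.allclose(sens_adj, sens_fd, atol=1e-5)
```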
Using Equations (111)‒(114) in conjunction with Equation (106) yields the following alternative expression for the G-differential in terms of :
Using in Equation (115) the expression provided for
in Equation (108) and identifying the expressions that multiply the variations
and
yields the following expressions for the first-order sensitivities of the response with respect to the initial condition
and the feature function
, respectively:
The expressions obtained in Equations (116) and (117) can be evaluated after having determined the 1st-level adjoint sensitivity function
. Also, these expressions are to be evaluated using the nominal/optimal parameter values, but this fact has not been explicitly indicated in order to simplify the notation. Notably, the 1st-LASS is independent of parameter variations, so it needs to be solved only once to determine
. The closed-form explicit expression of the solution of the 1st-LASS represented by Equations (113) and (114) is provided below:
Using the expression obtained in Equation (118) into Equations (116) and (117), respectively, and performing the respective integrations yields the following closed-form expressions:
As expected, the expressions obtained in Equations (119) and (120) coincide with the expressions that would be obtained by the direct differentiation of the expression for the model response obtained in Equation (102) with respect to and , respectively. Of course, the closed-form exact expression for the model response in terms of the model’s primary parameters and/or feature functions is not available in practice.
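Both the verification against direct differentiation and the chain-rule conversion to primary-parameter sensitivities can be sketched numerically, assuming (hypothetically) the feature f = q/k and the closed-form response R(f) = T0*sin(L*sqrt(f))/(L*sqrt(f)):

```python
import numpy as np

# Hypothetical closed-form ingredients (illustrative only): feature f = q/k
# and response R(f) = T0*sin(L*sqrt(f))/(L*sqrt(f)).
T0, L = 300.0, 1.0

def R_of_f(f):
    w = np.sqrt(f) * L
    return T0 * np.sin(w) / w

k0, q0 = 2.0, 1.0
f0 = q0 / k0
eps = 1e-6

# First-order sensitivity with respect to the feature (central difference,
# standing in for the adjoint-based expression).
dR_df = (R_of_f(f0 + eps) - R_of_f(f0 - eps)) / (2 * eps)

# Chain rule: f = q/k, so df/dq = 1/k and df/dk = -q/k**2.
dR_dq_chain = dR_df / k0
dR_dk_chain = dR_df * (-q0 / k0 ** 2)

# Direct finite differences in the primary parameters, for verification.
dR_dq_fd = (R_of_f((q0 + eps) / k0) - R_of_f((q0 - eps) / k0)) / (2 * eps)
dR_dk_fd = (R_of_f(q0 / (k0 + eps)) - R_of_f(q0 / (k0 - eps))) / (2 * eps)

assert abs(dR_dq_chain - dR_dq_fd) < 1e-5
assert abs(dR_dk_chain - dR_dk_fd) < 1e-5
```

Only the assumed closed forms are hypothetical; the chain-rule pattern itself is the one invoked in the text.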
The sensitivities of the model response with respect to the primary parameters are obtained by using the result obtained in Equation (120) in conjunction with the following “chain-rule of differentiation”:
4.2. Applying the 1st-FASAM-NODE Methodology to Obtain the First-Order Response Sensitivities to the Primary Model Parameters
The traditional form of the heat conduction model provided in Equation (98) is a neural ordinary differential equation (NODE) which can be analyzed directly by using the “First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (1st-FASAM-NODE)” introduced by Cacuci [
25]. The G-differential of Equation (98) yields the following 1st-LVSS in NODE-form satisfied by the temperature variation
:
The 1st-LASS corresponding to the above 1st-LVSS is obtained by implementing the same steps as outlined in the previous Subsection, by constructing the inner-product of a yet undetermined function
with Equation (123) to obtain the following relation:
The relation obtained in Equation (125) is to be evaluated at the nominal/optimal parameter values but this fact has not been explicitly indicated in order to simplify the notation.
Integrating by parts the first term on the left-side of Equation (125) yields the following relation:
Identifying the last term on the right-side of Equation (126) with the G-differential provided in Equation (106), using the conditions provided in Equation (124) and eliminating the unknown boundary values on the right-side of Equation (126) yields the following expression for the G-differential in terms of the function :
where the 1st-level adjoint sensitivity function
is the solution of the following 1st-Level Adjoint Sensitivity System (1st-LASS):
Identifying the quantities that multiply the variations
and
in Equation (127) yields the following expressions for the sensitivities of the model response with respect to
and
:
The 1st-LASS can be readily solved to obtain the following expression for the 1st-level adjoint sensitivity function
:
Using in Equations (130) and (131) the expression for
obtained above yields the following expressions:
All of the results obtained in Equations (125)‒(134) are to be evaluated at the nominal parameter values but this fact has not been explicitly indicated in order to simplify the notation.
4.3. Comparison: Applying the 1st-FASAM-NODE Methodology Versus Applying the 1st-FASAM-NIDE-F Methodology
In cases where the model can be equivalently expressed in either NODE or NIDE-F form, as shown in Equation (98) or Equation (105), respectively, it is important to highlight the similarities and differences between applying the 1st-FASAM-NODE methodology versus applying the 1st-FASAM-NIDE-F methodology for determining the first-order response sensitivities to the underlying model parameters. Evidently, the final results obtained in Equations (133) and (134) by treating the heat conduction model as a NODE, cf. Equation (98), are identical to the corresponding results obtained in Equations (119) and (120) by having treated the heat conduction model as a NIDE-F, cf. Equation (105). Furthermore, even though the form of the 1st-LVSS produced by the NODE methodology, namely Equations (123) and (124), differs from the form of the 1st-LVSS produced by the NIDE-F methodology, namely Equations (108) and (109), the solutions to these 1st-LVSS are identical to each other, having the expression provided in Equation (110).
However, the 1st-LASS corresponding to the NODE heat conduction model differs from the 1st-LASS corresponding to the NIDE-F heat conduction model, so that the corresponding 1st-level adjoint sensitivity function for the NODE-model, namely Equation (132), differs from the 1st-level adjoint sensitivity function for the NIDE-F heat conduction model, which is provided in Equation (118). Consequently, the expressions obtained in terms of the respective 1st-level adjoint sensitivity functions of the sensitivities of the model response with respect to the primary parameters and feature function for the NODE-representation, namely Equations (130) and (131), differ from those obtained for the NIDE-F representation, namely Equations (116) and (117). The structure of the 1st-LASS and expressions for sensitivities appear to be simpler in the NODE-representation than in the NIDE-F representation, but the choice of representation/framework will be largely influenced by the neural-net software available to the individual user.
4.4. Illustrative Application of the 2nd-FASAM-NIDE-F Methodology Versus the 2nd-FASAM-NODE Methodology for Computing the Second-Order Response Sensitivities to Model Features and Parameters
The general principles underlying the 2nd-FASAM-NIDE-F methodology presented in
Section 3 will be applied to the paradigm heat conduction model considered in this Section in order to highlight the salient issues arising when applying this methodology to determine the second-order sensitivities of model responses to model features and parameters.
4.4.1. Application of the 2nd-FASAM-NIDE-F methodology
When applying the 2nd-FASAM-NIDE-F, the second-order sensitivities arise from the first-order sensitivities obtained in Equations (116) and (117). Thus, the second-order sensitivities arising from Equation (116) are provided by its G-differential for arbitrary variations around the nominal parameter and function values (indicated by the use of the superscript “zero”). Using in Equation (116) the result obtained in Equation (118) and applying the definition of the G-differential to the resulting expression yields the relation below:
where the expressions for the above direct-effect and, respectively, indirect-effect terms are obtained as shown below:
The direct-effect term can be evaluated immediately. The indirect-effect term depends on the variational function
, which is the solution of the G-differentiated 1st-LASS, comprising Equations (113) and (114), obtained by definition as follows:
Performing the operations indicated in Equation (138) yields the following NIDE-F, to be evaluated at the nominal parameter values:
Since the indirect-effect term only depends on the variational function
but does not depend on the variational function
, the relations presented in Equations (139) and (140) constitute the 2nd-LVSS for the function
, which is dependent on parameter variations and would need to be solved anew for each parameter variation of interest. The need for computing
can be avoided by expressing the indirect-effect term defined by Equation (137) in terms of a 2nd-level adjoint sensitivity function that is independent of parameter variations. This adjoint function will be denoted as
, where the argument “1” indicates that this adjoint function corresponds to the first-order sensitivity
, which was chosen in this case to be the “first” 1st-order sensitivity to be considered. The 2nd-LASS to be satisfied by
will be constructed by applying the 2nd-FASAM-NIDE-F, which commences by forming the inner product of
with Equation (140), to obtain the following relation:
Integrating by parts the first term on the left-side of Equation (141) and reversing the order of integrations in the remaining terms yields the following relation:
The first term on the left-side of Equation (142) is now required to represent the indirect-effect term defined in Equation (137) to obtain the relation below:
The unknown quantity is eliminated from Equation (142) by imposing the following condition:
Replacing the results obtained in Equations (139), (143) and (144) into Equation (142) yields the following alternative expression for the indirect-effect term:
where the 2nd-level adjoint sensitivity function
is the solution of the 2nd-Level Adjoint Sensitivity System (2nd-LASS) comprising Equations (143) and (144). The 2nd-LASS is a NIDE-F net that does not depend on parameter variations and needs to be solved once only at the nominal parameter values; its solution,
, is used in Equation (145).
Adding the expressions obtained in Equations (145) and (136) yields the following expression:
It follows from Equation (146) that:
The 2nd-LASS represented by Equations can be solved to obtain the following closed-form expression, to be evaluated at the nominal parameter values, for its solution:
Inserting the above expression for
into Equation (147) and performing the respective integrations yields the following closed-form expression for the mixed second-order sensitivity:
The validity of the above expression can be readily verified by taking the appropriate derivative of either of the first-order sensitivities provided in Equations (133) and (134).
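This verification can be sketched numerically, assuming (hypothetically) the closed-form response R(q, k) = T0*sin(L*sqrt(q/k))/(L*sqrt(q/k)) as a stand-in for the model, and differentiating either first-order sensitivity to obtain the mixed second-order sensitivity:

```python
import numpy as np

# Verifying a mixed second-order sensitivity by differentiating either of the
# two first-order sensitivities; R(q, k) is a hypothetical stand-in response.
T0, L = 300.0, 1.0

def R(q, k):
    w = np.sqrt(q / k) * L
    return T0 * np.sin(w) / w

q0, k0, h = 1.0, 2.0, 1e-4

# d/dk of (dR/dq): differentiate the first-order q-sensitivity w.r.t. k.
dR_dq = lambda k: (R(q0 + h, k) - R(q0 - h, k)) / (2 * h)
mixed_a = (dR_dq(k0 + h) - dR_dq(k0 - h)) / (2 * h)

# d/dq of (dR/dk): differentiate the first-order k-sensitivity w.r.t. q.
dR_dk = lambda q: (R(q, k0 + h) - R(q, k0 - h)) / (2 * h)
mixed_b = (dR_dk(q0 + h) - dR_dk(q0 - h)) / (2 * h)

# Agreement of the two computations provides the intrinsic verification.
assert abs(mixed_a - mixed_b) < 1e-6
```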
The second-order sensitivities arising from Equation (117) are provided by its G-differential for arbitrary variations around the nominal parameter and function values (indicated by the use of the superscript “zero”), which is by definition obtained as follows:
where the direct-effect and, respectively, indirect-effect terms have the following expressions:
The variational function
is the solution of Equations (108) and (109) while the variational function
is the solution of Equations (139) and (140). Altogether, these four equations constitute the 2nd-LVSS for the two-component vector-valued variational function
. The need for repeatedly solving this 2nd-LVSS for all parameter variations of interest is circumvented by eliminating the appearance of
in the expression of the indirect-effect term defined in Equation (152), by constructing an alternative expression for this term using the solution of the 2nd-LASS, to be constructed by applying the steps outlined in
Section 3, as follows:
Consider the two-component vector function
, where the first argument denotes the component number and the second argument (“2”) indicates that this function will correspond to the “second” first-order sensitivity
. Using the inner product defined in Equation (85), construct the inner product of
with Equations (108) and (140), respectively, to obtain the following relation:
Integrate by parts the first and third terms on the left-side of Equation (153) and rearrange the terms to obtain the following relation:
Require the third and fourth terms on the left-side of Equation (154) to represent the indirect-effect term defined in Equation (152) by imposing the following relations:
Eliminate the unknown terms
on the left-side of Equation (154) by imposing the following boundary conditions:
Insert the boundary conditions represented by Equations (109) and (139) into Equation (154) and use the relations underlying the 2nd-LASS to obtain the following expression for the indirect-effect term defined in Equation (152):
Add the expression obtained in Equation (158) to the expression of the direct-effect term provided in Equation (151) to obtain the following expression:
Insert the expression of
into the second term on the right-side of Equation (159) and collect the terms multiplying the variations
and
, respectively, to obtain the following expressions:
The algebraic manipulations involved in obtaining the closed-form expressions of the second-order sensitivities presented in Equations (160) and (161) are straightforward but involve a large amount of algebra stemming from the fact that the 2nd-LASS involves the two-component 2nd-level adjoint sensitivity function
. The reason for needing such a two-component adjoint function stems from the expression of the first-order sensitivity
provided in Equation (117), which involves both the original function
and the 1st-level adjoint sensitivity function
. A significant amount of algebraic manipulations could be avoided by eliminating the appearance of either
or
in the expression of
. If either of these functions were eliminated from appearing in the expression of
, then the G-differential of
would depend either just on
or just on
, which are “single-component” (as opposed to “two-component”) variational sensitivity functions. In such a case, the corresponding 2nd-LASS would also comprise just a single-component (as opposed to a “two-component”) 2nd-level adjoint sensitivity function. These considerations will be illustrated in the following by using Equation (101) to eliminate the appearance of the function
in the expression provided in Equation (117) for
, which would consequently take on the following simplified expression:
Applying the definition of the G-differential to Equation (162) yields the following expression:
where the direct-effect and the indirect-effect terms are defined below:
The appearance in Equation (165) of the variational function
is eliminated by following the same procedure as followed in the foregoing for the indirect-effect term
. Ultimately, the indirect-effect term
will have the following expression in terms of a 2nd-level adjoint sensitivity function denoted as
:
where the 2nd-level adjoint sensitivity function
is the solution of the following 2nd-LASS:
Adding the expressions obtained in Equations (164) and (166) yields the following expression for the G-differential
:
It follows from Equation (169) that the respective second-order sensitivities have the following expressions:
The mixed second-order sensitivity
in Equation (170) does not depend on the 2nd-level adjoint sensitivity function
and was therefore evaluated immediately. Solving Equations (167) and (168) yields the following expression, to be evaluated at the nominal parameter values, for the 2nd-level adjoint sensitivity function
:
Inserting the result obtained in Equation (172) into Equation (171) and performing the respective operations yields the following expression:
It is evident from Equation (147) and Equation (160) or, alternatively, Equation (173) that the mixed second-order sensitivity is computed twice, employing distinct expressions involving distinct 2nd-level adjoint sensitivity functions. This mechanism provides a stringent verification of the accuracy of the computation of the respective adjoint sensitivity functions.
In practice, the closed-form analytical expressions of the original functions, such as those provided in Equation (101), are seldom available. Nevertheless, if such expressions are available, they can be advantageously used to reduce the amount of computation involved in determining the response sensitivities, as shown in the foregoing.
4.4.2. Alternative derivation of the second-order sensitivities by applying the 2nd-FASAM-NODE-F methodology
When applying the 2nd-FASAM-NODE methodology, the second-order sensitivities arise from the first-order sensitivities obtained in Equations (130) and (131). Thus, the second-order sensitivities arising from Equation (130) are provided by its G-differential for arbitrary variations around the nominal parameter and function values (indicated by the use of the superscript “zero”), which is by definition obtained as follows:
where
denotes the derivative of the Dirac-delta functional. The variational function
is the solution of the following 2nd-LVSS, obtained by G-differentiating Equations (128) and (129):
The above 2nd-LVSS for the function
is to be satisfied at the nominal parameter values, but the superscript “zero” (which has been used to denote this fact) has been omitted to simplify the notation. The need for repeatedly solving this 2nd-LVSS for all parameter variations of interest is circumvented by eliminating the appearance of
in Equation (174). This aim will be accomplished by expressing
in terms of the solution of the 2nd-LASS to be constructed by applying the steps outlined in
Section 3. Thus, consider an adjoint function that will be denoted as
, where the argument “1” indicates that this adjoint function corresponds to the first-order sensitivity
, which is chosen in this case to be the “first” 1st-order sensitivity to be considered. The 2nd-LASS to be satisfied by
will be constructed by applying the 2nd-FASAM-NODE, which commences by forming the inner product of
with Equation (175), to obtain the following relation:
Integrating by parts the first term on the left-side of Equation (177) and rearranging the terms yields the following relation:
The last term on the left-side of Equation (178) is now required to represent the G-differential defined in Equation (174) to obtain the relation below:
The unknown boundary terms are eliminated from Equation (178) by imposing the following conditions:
The system of equations comprising Equations (179) and (180) constitutes the 2nd-LASS for the 2nd-level adjoint sensitivity function.
Replacing the results obtained in Equations (176), (179) and (180) into Equation (178) yields the following alternative expression for the indirect-effect term:
where the 2nd-level adjoint sensitivity function
is the solution of the 2nd-Level Adjoint Sensitivity System (2nd-LASS) comprising Equations (179) and (180). The 2nd-LASS is a NODE net that does not depend on parameter variations and needs to be solved once only at the nominal parameter values; its solution,
, is used in Equation (181) to determine the respective second-order response sensitivities.
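The solve-once-and-reuse character of an adjoint sensitivity system can be sketched, at first order for brevity, on a toy linear ODE; the equation, response, and parameter values below are illustrative assumptions, not the 2nd-LASS constructed in this section. A single adjoint solution, obtained at the nominal parameter values, yields the sensitivities with respect to every parameter through inner products:

```python
import numpy as np

def trapezoid(y, t):
    # Composite trapezoidal quadrature (avoids version-dependent NumPy names)
    return np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0

a, b, u0, T = 0.7, 0.3, 1.0, 1.0   # illustrative nominal values
t = np.linspace(0.0, T, 2001)

# Forward solution of du/dt = -a*u + b, u(0) = u0 (exact, for simplicity)
u = (u0 - b / a) * np.exp(-a * t) + b / a
# Adjoint solution of dlam/dt = a*lam, lam(T) = 1, solved only once
lam = np.exp(-a * (T - t))

# The single adjoint solve serves all parameters: each sensitivity of the
# response R = u(T) is an inner product of lam with the corresponding
# partial derivative of the right-hand side, df/da = -u and df/db = 1
dR_da = trapezoid(lam * (-u), t)
dR_db = trapezoid(lam * np.ones_like(t), t)

# Analytic check: dR/db = (1 - exp(-a*T))/a for this toy problem
assert abs(dR_db - (1 - np.exp(-a * T)) / a) < 1e-6
```

The design point mirrors the text: the adjoint system does not depend on the parameter variations, so its cost is incurred once, after which each additional sensitivity costs only a quadrature.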
Identifying in Equation (181) the quantities that multiply the respective parameter variations yields the following expressions:
Solving the 2nd-LASS represented by Equations (179) and (180) yields the following expression for the 2nd-level adjoint sensitivity function
:
where
denotes the Heaviside functional. Using in Equation (182) the results obtained in Equations (183) and (132) yields the following expression:
As expected, the expression obtained in Equation (184) is identical to the expressions obtained in Equations (170) and (149).
The second-order sensitivities arising from the first-order sensitivity represented by Equation (131) are obtained from its G-differential for arbitrary variations around the nominal parameter and function values. Thus, applying the definition of the G-differential to Equation (131) yields the following expression:
where the direct-effect and indirect-effect terms have the expressions below:
The indirect-effect term will be recast in terms of an alternative expression that will not involve the variational functions and by applying the principles of the 2nd-FASAM-NODE, which are fundamentally the same as those underlying the 2nd-FASAM-NODE-F, as follows:
Adding the expression obtained in Equation (194) to the expression of the direct-effect term provided in Equation (186) yields the following expression for the G-differential
:
It follows from the expression obtained in Equation (195) that:
Solving the 2nd-LASS represented by Equations (191)‒(193) yields the following expressions for the components of
:
Using in Equation (196) the result obtained in Equation (198) yields the following expression:
As expected, the above expression coincides with the expression obtained, successively, in Equations (149), (170) and (184). Evidently, the expression of the mixed second-order sensitivity can be determined in several distinct ways, using distinct adjoint sensitivity functions, thus providing alternatives for verifying the computational accuracy of the respective adjoint functions, when these functions are computed numerically, as is the case in practice.
Inserting the results obtained in Equations (101), (183), (198) and (199) into Equation (197) and performing the respective integrations yields the following expression:
As expected, the above expression coincides with the expression obtained in Equation (173).