Preprint Article
This version is not peer-reviewed.

Adaptive Voronovskaya-Type Expansions and Sobolev-Santos Uniform Convergence for Symmetrized Hyperbolic Tangent Neural Networks

Submitted: 21 September 2025. Posted: 22 September 2025.


Abstract
This work introduces a novel class of multivariate neural network operators activated by symmetrized and perturbed hyperbolic tangent functions, with a focus on the Sobolev-Santos Uniform Convergence Theorem. The operators, of basic, Kantorovich, and quadrature types, are analyzed through Voronovskaya-type asymptotic expansions, providing rigorous convergence rates for approximating continuous functions and their derivatives in Sobolev spaces $W^{s,p}(\mathbb{R}^N)$. The proposed symmetrization method enhances both approximation power and regularity, enabling precise asymptotic descriptions as the network size increases. The study establishes uniform convergence rates in $L^p$ and Sobolev norms, explicitly quantifying the impact of smoothness, dimensionality, and grid parameters. The Sobolev-Santos Theorem ensures uniform stability of these expansions under parametric variations of the activation function, guaranteeing robustness across different configurations. The results highlight the superior performance of these operators in high-dimensional approximation problems, with implications for artificial intelligence, data analytics, and numerical analysis. The explicit constants and uniform bounds provided offer a solid foundation for both theoretical and applied research in neural network-based function approximation.

1. Introduction

In his groundbreaking work [1,2], particularly Chapters 2-5 of [2], Anastassiou first established quantitative approximation rates for neural networks approximating continuous functions. He achieved this through specialized Cardaliaguet-Euvrard and "squashing" operators, deriving convergence rates using the modulus of continuity of target functions (and their higher-order derivatives) while proving sharp Jackson-type inequalities. Both univariate and multivariate cases received rigorous treatment, with the defining kernels of these operators - "bell-shaped" and "squashing" - assumed to have compact support for strong localization and stability.
Building upon this foundation and inspired by Chen and Cao’s work [5], the author extended this research by introducing and analyzing quasi-interpolation operators activated by sigmoidal and hyperbolic tangent functions. This culminated in a comprehensive treatment of univariate, multivariate, and fractional cases [3,4], establishing a robust framework for studying new families of neural network operators with enhanced convergence and stability properties.
This paper advances this framework by developing a class of multivariate symmetrized and perturbed hyperbolic tangent-activated neural network operators and establishing their Voronovskaya-type asymptotic expansions for differentiable mappings $f : \mathbb{R}^N \to \mathbb{R}$, where $N \in \mathbb{N}$. The proposed symmetrization, combined with parametric deformation of the hyperbolic tangent function, enhances both the approximation capability and regularity of the resulting operators. This yields a more precise asymptotic description of their behavior as network size increases, revealing new structural properties relevant to high-dimensional approximation problems.
For recent related developments in neural network approximation theory, see [8,9] and references therein. The classical works [6,7] remain fundamental for a comprehensive introduction to neural networks and their architectures.
We now formalize the multilayer feed-forward structure considered in this study. Let $m \in \mathbb{N}$ denote the number of hidden layers. For an input vector $x = (x_1, \ldots, x_s) \in \mathbb{R}^s$ with $s \in \mathbb{N}$, we define weight vectors $\alpha_j \in \mathbb{R}^s$, coefficient vectors $c_j \in \mathbb{R}^s$, and biases $b_j \in \mathbb{R}$ for $j = 0, \ldots, n$, where $n \in \mathbb{N}$ represents the number of neurons per layer.
Using $\langle \alpha_j, x \rangle$ to denote the Euclidean inner product, the activation at node $j$ is given by $\sigma(\langle \alpha_j, x \rangle + b_j) \in \mathbb{R}$. The network output at the first layer is then:
$$N_n(x) = \sum_{j=0}^{n} c_j \, \sigma\big(\langle \alpha_j, x \rangle + b_j\big), \qquad x \in \mathbb{R}^s.$$
Higher-level compositions can be defined recursively. For example, the second-level composition is:
$$N_n^{(2)}(x) = \sum_{j=0}^{n} c_j \, \sigma\!\left(\Big\langle \alpha_j, \sum_{k=0}^{n} c_k \, \sigma\big(\langle \alpha_k, x \rangle + b_k\big) \Big\rangle + b_j\right).$$
More generally, for any $m \in \mathbb{N}$, we define:
$$N_n^{(m)}(x) = \sum_{j=0}^{n} c_j \, \sigma\!\left(\big\langle \alpha_j, N_n^{(m-1)}(x) \big\rangle + b_j\right), \qquad x \in \mathbb{R}^s.$$
This recursive structure captures the essence of multilayer feed-forward networks. The specific choice of activation function, along with the weight and bias distributions, determines the approximation properties analyzed in subsequent sections.
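The recursion is straightforward to prototype. The sketch below (ours, not part of the original construction) realizes $N_n^{(m)}$ with $\sigma = \tanh$ and randomly drawn weights, coefficients, and biases, purely to make the composition concrete:

```python
import numpy as np

def layer(x, alpha, c, b, sigma=np.tanh):
    """One level: sum_j c_j * sigma(<alpha_j, x> + b_j), mapping R^s -> R^s."""
    activations = sigma(alpha @ x + b)     # shape (n+1,)
    return activations @ c                 # shape (s,)

def network(x, params, m):
    """m-fold composition N_n^(m)(x) of the same layer map."""
    alpha, c, b = params
    y = x
    for _ in range(m):
        y = layer(y, alpha, c, b)
    return y

rng = np.random.default_rng(0)
s, n_plus_1 = 3, 5                         # s inputs, n+1 neurons per layer
params = (rng.normal(size=(n_plus_1, s)),  # weight vectors alpha_j
          rng.normal(size=(n_plus_1, s)),  # coefficient vectors c_j
          rng.normal(size=n_plus_1))       # biases b_j
print(network(rng.normal(size=s), params, m=2))
```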

2. Mathematical Formulations

Following the framework established in [4], we define the perturbed hyperbolic tangent activation function as:
$$g_{q,\lambda}(x) = \frac{e^{\lambda x} - q\,e^{-\lambda x}}{e^{\lambda x} + q\,e^{-\lambda x}}, \qquad \lambda, q > 0, \; x \in \mathbb{R}. \tag{4}$$
Here, λ serves as a scaling parameter, while q acts as a deformation coefficient. For a comprehensive discussion, see Chapter 18 of [4], titled “q-Deformed and λ-Parameterized Hyperbolic Tangent-Based Banach Space-Valued Neural Network Approximation”.

Symmetrization Method

We implement a half-data feed strategy for our multivariate neural networks by defining the following density-type kernel:
$$M_{q,\lambda}(x) = \frac{1}{4}\big[g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1)\big], \qquad x \in \mathbb{R}, \; \lambda, q > 0. \tag{5}$$
This kernel satisfies $M_{q,\lambda}(x) > 0$ for all $x \in \mathbb{R}$ and exhibits the following symmetry relations:
$$M_{q,\lambda}(-x) = M_{1/q,\lambda}(x), \qquad M_{1/q,\lambda}(-x) = M_{q,\lambda}(x), \qquad x \in \mathbb{R}, \; \lambda, q > 0. \tag{6}$$
By summing these expressions, we obtain:
$$M_{q,\lambda}(x) + M_{1/q,\lambda}(x) = M_{q,\lambda}(-x) + M_{1/q,\lambda}(-x), \qquad x \in \mathbb{R}, \; \lambda, q > 0.$$
This allows us to define the even (symmetric) function:
$$\Phi(x) = \frac{M_{q,\lambda}(x) + M_{1/q,\lambda}(x)}{2}, \qquad x \in \mathbb{R}. \tag{8}$$
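For concreteness, here is a minimal Python transcription of (4), (5), and (8). It evaluates $g_{q,\lambda}$ through the identity $g_{q,\lambda}(x) = \tanh\!\big(\lambda x - \tfrac{1}{2}\ln q\big)$, obtained by dividing numerator and denominator of (4) by $\sqrt{q}$, which keeps the computation overflow-free for large $\lambda x$:

```python
import numpy as np

def g(x, q=2.0, lam=1.0):
    """Perturbed hyperbolic tangent g_{q,lam}(x) = tanh(lam*x - ln(q)/2), cf. (4)."""
    return np.tanh(lam * x - 0.5 * np.log(q))

def M(x, q=2.0, lam=1.0):
    """Density-type kernel M_{q,lam}(x) = [g(x+1) - g(x-1)] / 4, cf. (5)."""
    return 0.25 * (g(x + 1, q, lam) - g(x - 1, q, lam))

def Phi(x, q=2.0, lam=1.0):
    """Symmetrized kernel Phi(x) = [M_{q,lam}(x) + M_{1/q,lam}(x)] / 2, cf. (8)."""
    return 0.5 * (M(x, q, lam) + M(x, 1.0 / q, lam))

x = np.linspace(-4.0, 4.0, 9)
assert np.all(M(x) > 0)              # positivity of M
assert np.allclose(Phi(x), Phi(-x))  # Phi is even
```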

2.1. Key Properties and Extremal Values

The analysis of the kernel functions $M_{q,\lambda}$ and $M_{1/q,\lambda}$ reveals several fundamental properties that are crucial for understanding their behavior and applications in neural network approximation theory. Most notably, these functions attain their global maximum values at symmetric points, as established in [4].
Theorem 1 
(Extremal Values of Kernel Functions). For the deformation parameter $q > 0$ and scaling parameter $\lambda > 0$, the kernel functions $M_{q,\lambda}$ and $M_{1/q,\lambda}$ satisfy the following extremal property:
$$M_{q,\lambda}\!\left(\frac{\ln q}{2\lambda}\right) = M_{1/q,\lambda}\!\left(-\frac{\ln q}{2\lambda}\right) = \frac{\tanh \lambda}{2}.$$
This result demonstrates that:
1. The functions $M_{q,\lambda}$ and $M_{1/q,\lambda}$ are symmetric with respect to the origin when $q = 1$.
2. For $q \neq 1$, the functions attain their maximum values at points that are symmetric about the origin, specifically at $x = \frac{\ln q}{2\lambda}$ and $x = -\frac{\ln q}{2\lambda}$, respectively.
3. The maximum value $\frac{\tanh \lambda}{2}$ is independent of the deformation parameter $q$ and depends only on the scaling parameter $\lambda$.
This symmetry and extremal property play a fundamental role in the construction of the symmetric kernel $\Phi(x)$ and in establishing the approximation properties of the resulting neural network operators.
Remark 1. 
The value $\frac{\tanh \lambda}{2}$ represents the peak amplitude of both kernel functions. As $\lambda$ increases, this maximum value approaches $\frac{1}{2}$, which corresponds to the limiting case where the hyperbolic tangent function approaches a step function. This observation connects our deformed kernels to the classical sigmoidal activation functions used in neural networks.
Proof. 
The proof follows directly from the definition of $M_{q,\lambda}(x)$ in (5) and the properties of the deformed hyperbolic tangent function $g_{q,\lambda}(x)$ defined in (4).
Writing $x^* = \frac{\ln q}{2\lambda}$, we have $e^{\lambda(x^*+1)} = \sqrt{q}\,e^{\lambda}$ and $e^{\lambda(x^*-1)} = \sqrt{q}\,e^{-\lambda}$, so that
$$\begin{aligned}
M_{q,\lambda}\!\left(\frac{\ln q}{2\lambda}\right) &= \frac{1}{4}\left[g_{q,\lambda}\!\left(\frac{\ln q}{2\lambda}+1\right) - g_{q,\lambda}\!\left(\frac{\ln q}{2\lambda}-1\right)\right] \\
&= \frac{1}{4}\left[\frac{\sqrt{q}\,e^{\lambda} - q\cdot\frac{1}{\sqrt{q}}\,e^{-\lambda}}{\sqrt{q}\,e^{\lambda} + q\cdot\frac{1}{\sqrt{q}}\,e^{-\lambda}} - \frac{\sqrt{q}\,e^{-\lambda} - q\cdot\frac{1}{\sqrt{q}}\,e^{\lambda}}{\sqrt{q}\,e^{-\lambda} + q\cdot\frac{1}{\sqrt{q}}\,e^{\lambda}}\right] \\
&= \frac{1}{4}\left[\frac{e^{\lambda} - e^{-\lambda}}{e^{\lambda} + e^{-\lambda}} + \frac{e^{\lambda} - e^{-\lambda}}{e^{\lambda} + e^{-\lambda}}\right] = \frac{\tanh \lambda}{2}.
\end{aligned}$$
A similar calculation shows that $M_{1/q,\lambda}\!\left(-\frac{\ln q}{2\lambda}\right) = \frac{\tanh \lambda}{2}$, completing the proof. □
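A quick numerical sanity check of Theorem 1, reusing g and M from the sketch above (the parameter values $q = 3$, $\lambda = 1.5$ are arbitrary):

```python
import numpy as np

q, lam = 3.0, 1.5
x_star = np.log(q) / (2 * lam)
assert np.isclose(M(x_star, q, lam), np.tanh(lam) / 2)        # peak of M_{q,lam}
assert np.isclose(M(-x_star, 1 / q, lam), np.tanh(lam) / 2)   # peak of M_{1/q,lam}
```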

2.2. Partition of Unity Property

A fundamental property of the kernel functions $M_{q,\lambda}$ and $M_{1/q,\lambda}$ is their ability to form a partition of unity, which plays a crucial role in approximation theory and the construction of neural network operators. This property is formally established in the following theorem:
Theorem 2 
(Partition of Unity for Deformed Hyperbolic Tangent Kernels). For fixed parameters $\lambda > 0$ and $q > 0$, the kernel functions $M_{q,\lambda}$ and $M_{1/q,\lambda}$ satisfy the partition of unity property:
$$\sum_{i \in \mathbb{Z}} M_{q,\lambda}(x - i) = \sum_{i \in \mathbb{Z}} M_{1/q,\lambda}(x - i) = 1, \qquad \forall x \in \mathbb{R}. \tag{11}$$
As a direct consequence, the symmetrized kernel $\Phi(x)$ defined in (8) also satisfies:
$$\sum_{i \in \mathbb{Z}} \Phi(x - i) = 1, \qquad \forall x \in \mathbb{R}.$$
Proof. 
The proof follows from the specific construction of the kernels and their properties:
1. The function $g_{q,\lambda}(x)$ defined in (4) is a deformed hyperbolic tangent function that approaches $1$ as $x \to +\infty$ and $-1$ as $x \to -\infty$.
2. The kernel $M_{q,\lambda}(x)$ is constructed as a difference of shifted versions of $g_{q,\lambda}(x)$, specifically:
$$M_{q,\lambda}(x) = \frac{1}{4}\big[g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1)\big].$$
3. As $x \to \pm\infty$, $M_{q,\lambda}(x) \to 0$ exponentially fast due to the properties of $g_{q,\lambda}(x)$.
4. The sum $\sum_{i \in \mathbb{Z}} M_{q,\lambda}(x - i)$ forms a telescoping series: the partial sums over $|i| \le K$ collapse to $\frac{1}{4}\big[g_{q,\lambda}(x+K+1) + g_{q,\lambda}(x+K) - g_{q,\lambda}(x-K) - g_{q,\lambda}(x-K-1)\big]$, which converges to $\frac{1}{4}(1 + 1 + 1 + 1) = 1$ as $K \to \infty$, for all $x \in \mathbb{R}$, due to the behavior of $g_{q,\lambda}(x)$ at infinity.
5. The same argument applies to $M_{1/q,\lambda}(x)$ due to the symmetry relation established in (6).
6. The result for $\Phi(x)$ follows directly from its definition as the average of $M_{q,\lambda}(x)$ and $M_{1/q,\lambda}(x)$.
For a complete rigorous proof, see [4]. □
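A truncated numerical check of Theorem 2, reusing Phi from the earlier sketch (the window $|i| \le 60$ is arbitrary; the neglected tails are exponentially small by Theorem 4 below):

```python
import numpy as np

x = np.linspace(-0.5, 0.5, 7)
i = np.arange(-60, 61)
totals = Phi(x[:, None] - i[None, :]).sum(axis=1)
assert np.allclose(totals, 1.0, atol=1e-10)   # sum_i Phi(x - i) = 1
```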
Corollary 1 
(Multivariate Partition of Unity). The multivariate kernel $Z(x_1, \ldots, x_N)$ defined in (31) satisfies the following partition of unity property in $\mathbb{R}^N$:
$$\sum_{k \in \mathbb{Z}^N} Z(x - k) = 1, \qquad \forall x \in \mathbb{R}^N,$$
where $k = (k_1, \ldots, k_N) \in \mathbb{Z}^N$ and $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$.
Proof. 
This follows directly from the univariate partition of unity property (11) and the definition of $Z$ as a product of univariate $\Phi$ functions:
$$\sum_{k \in \mathbb{Z}^N} Z(x - k) = \sum_{k_1 \in \mathbb{Z}} \cdots \sum_{k_N \in \mathbb{Z}} \prod_{i=1}^{N} \Phi(x_i - k_i) = \prod_{i=1}^{N} \left(\sum_{k_i \in \mathbb{Z}} \Phi(x_i - k_i)\right) = \prod_{i=1}^{N} 1 = 1. \qquad \square$$

2.3. Normalization Property

A fundamental property of the kernel functions $M_{q,\lambda}$ and $M_{1/q,\lambda}$ is their normalization, which ensures they integrate to unity over the real line. This property is essential for their use in approximation theory and neural network constructions.
Theorem 3 
(Normalization of Kernel Functions). For fixed parameters $\lambda > 0$ and $q > 0$, the kernel functions $M_{q,\lambda}$ and $M_{1/q,\lambda}$ satisfy the normalization property:
$$\int_{-\infty}^{\infty} M_{q,\lambda}(x)\,dx = \int_{-\infty}^{\infty} M_{1/q,\lambda}(x)\,dx = 1. \tag{14}$$
As a direct consequence, the symmetrized kernel $\Phi(x)$ defined in (8) is also normalized:
$$\int_{-\infty}^{\infty} \Phi(x)\,dx = 1.$$
Proof. 
The normalization property follows from the construction of $M_{q,\lambda}(x)$ as a difference of shifted deformed hyperbolic tangent functions. Specifically:
1. The function $g_{q,\lambda}(x)$ approaches $1$ as $x \to +\infty$ and $-1$ as $x \to -\infty$.
2. The kernel $M_{q,\lambda}(x)$ is constructed as:
$$M_{q,\lambda}(x) = \frac{1}{4}\big[g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1)\big].$$
3. Since $g_{q,\lambda}$ is not integrable on $\mathbb{R}$, the integral of the difference may not be split into two separate integrals. Instead, a shift of variables over the truncated domain $[-R, R]$ leaves only boundary contributions:
$$\int_{-R}^{R} M_{q,\lambda}(x)\,dx = \frac{1}{4}\left[\int_{R-1}^{R+1} g_{q,\lambda}(y)\,dy - \int_{-R-1}^{-R+1} g_{q,\lambda}(y)\,dy\right] \longrightarrow \frac{1}{4}\big[2 \cdot 1 - 2 \cdot (-1)\big] = 1, \qquad R \to \infty,$$
using the asymptotic limits of $g_{q,\lambda}$ at $\pm\infty$. A rigorous proof is provided in Theorem 18.2 of [4]. □
Remark 2. 
The normalization property has several important implications:
  • It ensures that the kernels can be interpreted as probability density functions.
  • It guarantees that constant functions are preserved under convolution with these kernels.
  • It is essential for proving convergence results of approximation operators constructed from these kernels.
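The normalization can likewise be confirmed numerically with a plain Riemann sum, reusing Phi from the earlier sketch (the window and step size are arbitrary; the tails beyond the window are exponentially small):

```python
import numpy as np

x = np.linspace(-40.0, 40.0, 160001)
dx = x[1] - x[0]
integral = Phi(x).sum() * dx
assert np.isclose(integral, 1.0, atol=1e-8)   # integral of Phi over R equals 1
```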

2.4. Exponential Decay Property

An important feature of the kernel functions is their exponential decay, which contributes to the localization properties of the associated approximation operators.
Theorem 4 
(Exponential Decay of Kernel Functions). For parameters $0 < \alpha < 1$, $n \in \mathbb{N}$ with $n^{1-\alpha} > 2$, and constants $\lambda, q > 0$, the kernel functions satisfy the following exponential decay estimates outside the window $|nx - k| \le n^{1-\alpha}$:
$$\sum_{\substack{k \in \mathbb{Z} \\ |nx - k| > n^{1-\alpha}}} M_{q,\lambda}(nx - k) < T e^{-2\lambda n^{1-\alpha}},$$
where the constant $T$ is defined as:
$$T = 2 \max\{q, 1/q\}\, e^{4\lambda}. \tag{16}$$
Similarly, for the reciprocal kernel:
$$\sum_{\substack{k \in \mathbb{Z} \\ |nx - k| > n^{1-\alpha}}} M_{1/q,\lambda}(nx - k) < T e^{-2\lambda n^{1-\alpha}}.$$
As a direct consequence, the symmetrized kernel $\Phi$ satisfies:
$$\sum_{\substack{k \in \mathbb{Z} \\ |nx - k| > n^{1-\alpha}}} \Phi(nx - k) < T e^{-2\lambda n^{1-\alpha}}.$$
(The restriction to the tail $|nx - k| > n^{1-\alpha}$ is essential: by Theorem 2, the sum over all $k \in \mathbb{Z}$ equals $1$.)
Proof. 
To rigorously establish the exponential decay property, we proceed through the following steps.
The deformed hyperbolic tangent function $g_{q,\lambda}(x)$ exhibits the asymptotic behavior
$$\lim_{x \to +\infty} g_{q,\lambda}(x) = 1 \quad \text{and} \quad \lim_{x \to -\infty} g_{q,\lambda}(x) = -1,$$
with exponential convergence to these limits. Specifically, for $|x| > 1$ we have the precise estimate
$$\big|g_{q,\lambda}(x) - \operatorname{sgn}(x)\big| \le 2 \max\{q, 1/q\}\, e^{-2\lambda |x|}, \tag{21}$$
where $\operatorname{sgn}(x)$ denotes the sign function.
Recall the definition of $M_{q,\lambda}(x)$:
$$M_{q,\lambda}(x) = \frac{1}{4}\big[g_{q,\lambda}(x+1) - g_{q,\lambda}(x-1)\big].$$
For $|x| > 2$, both $x+1$ and $x-1$ share the same sign, so (21) yields
$$|M_{q,\lambda}(x)| \le \frac{1}{4}\Big(2 \max\{q, 1/q\}\, e^{-2\lambda |x+1|} + 2 \max\{q, 1/q\}\, e^{-2\lambda |x-1|}\Big) \le \max\{q, 1/q\}\, e^{-2\lambda (|x|-1)}.$$
Using this bound, the tail sum is dominated by a geometric series:
$$\sum_{\substack{k \in \mathbb{Z} \\ |nx - k| > n^{1-\alpha}}} M_{q,\lambda}(nx - k) \le 2 \max\{q, 1/q\} \sum_{m = \lceil n^{1-\alpha} \rceil}^{\infty} e^{-2\lambda (m-1)},$$
the factor $2$ accounting for the two tails. Summing the geometric series,
$$2 \max\{q, 1/q\}\, \frac{e^{-2\lambda (n^{1-\alpha} - 1)}}{1 - e^{-2\lambda}} = 2 \max\{q, 1/q\}\, \frac{e^{4\lambda}}{e^{2\lambda} - 1}\, e^{-2\lambda n^{1-\alpha}} \le T e^{-2\lambda n^{1-\alpha}},$$
where the last inequality holds for $\lambda \ge \frac{\ln 2}{2}$; for smaller $\lambda$ the same argument applies with $T$ enlarged by the bounded factor $(e^{2\lambda} - 1)^{-1}$. Here $T$ is as defined in (16).
The same argument applies to $M_{1/q,\lambda}$ due to the symmetry relation
$$M_{1/q,\lambda}(x) = M_{q,\lambda}(-x).$$
For the symmetrized kernel $\Phi$, averaging the two tail bounds gives
$$\sum_{\substack{k \in \mathbb{Z} \\ |nx - k| > n^{1-\alpha}}} \Phi(nx - k) \le \frac{T e^{-2\lambda n^{1-\alpha}} + T e^{-2\lambda n^{1-\alpha}}}{2} = T e^{-2\lambda n^{1-\alpha}},$$
which completes the proof. □
Remark 3 
(Implications of Exponential Decay). The exponential decay property established in Theorem 4 has several significant implications:
1. Localization Property: The kernels $M_{q,\lambda}$ and $\Phi$ are effectively localized. For large $n$, the sum $\sum_k M_{q,\lambda}(nx - k)$ is dominated by terms where $|nx - k|$ is small, enabling efficient numerical approximations.
2. Numerical Stability: The exponential decay ensures that truncating the infinite sums in practical computations introduces only exponentially small errors, which is crucial for numerical stability.
3. Convergence Analysis: This property is fundamental for establishing convergence rates of the associated approximation operators. It allows for precise control of the tail behavior in error estimates, leading to sharp convergence results.
4. Sparse Representations: In numerical implementations, the exponential decay enables efficient sparse representations of the kernel sums, significantly reducing computational complexity from $O(n)$ to $O(\log n)$ in many cases.
5. Error Bounds: The explicit form of the decay bound (with constant $T$ defined in (16)) provides concrete error bounds that can be used in the analysis of approximation algorithms.
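The tail behavior is easy to observe numerically. A small experiment, reusing M from the earlier sketch (the point $x = 0.3$, the exponent $\alpha = 0.5$, and the truncation window are arbitrary choices of ours), compares the tail mass of Theorem 4 with the stated bound $T e^{-2\lambda n^{1-\alpha}}$:

```python
import numpy as np

q, lam, alpha, x = 2.0, 1.0, 0.5, 0.3
T = 2 * max(q, 1 / q) * np.exp(4 * lam)          # constant from (16)
for n in [10, 100, 1000]:
    k = np.arange(int(n * x) - 2000, int(n * x) + 2001)
    u = n * x - k
    tail = M(u[np.abs(u) > n ** (1 - alpha)], q, lam).sum()
    print(n, tail, T * np.exp(-2 * lam * n ** (1 - alpha)))
```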

2.5. Multivariate Extension via Tensor Products

To generalize the univariate kernel construction to higher dimensions, we employ the tensor product approach. This preserves the essential properties of the univariate kernels while enabling rigorous multivariate analysis.
Definition 1 
(Multivariate Kernel Construction). Let $\Phi : \mathbb{R} \to \mathbb{R}$ be the univariate kernel defined in (8). The corresponding multivariate kernel $Z : \mathbb{R}^N \to \mathbb{R}$ is defined via the tensor product:
$$Z(x_1, \ldots, x_N) := \prod_{i=1}^{N} \Phi(x_i), \qquad x = (x_1, \ldots, x_N) \in \mathbb{R}^N, \; N \in \mathbb{N}. \tag{31}$$
This construction naturally extends the properties of Φ to the multivariate setting.
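A direct transcription of Definition 1, with a truncated numerical check of the partition of unity in $\mathbb{R}^2$ (it reuses Phi from the earlier sketch; the lattice window and the test point are arbitrary):

```python
import numpy as np

def Z(x, q=2.0, lam=1.0):
    """Tensor-product kernel Z(x) = prod_i Phi(x_i), for x of shape (..., N)."""
    return np.prod(Phi(x, q, lam), axis=-1)

# truncated partition-of-unity check in dimension N = 2
grid = np.arange(-30, 31)
k = np.stack(np.meshgrid(grid, grid, indexing="ij"), axis=-1).reshape(-1, 2)
x = np.array([0.37, -1.21])
print(Z(x - k).sum())   # ~ 1.0 up to exponentially small truncation error
```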
Theorem 5 
(Fundamental Properties of the Multivariate Kernel). Let $Z$ be the multivariate kernel defined in (31). Then $Z$ satisfies the following fundamental properties. It is positive, that is,
$$Z(x) > 0, \qquad \forall x \in \mathbb{R}^N.$$
It satisfies a discrete partition of unity:
$$\sum_{k \in \mathbb{Z}^N} Z(x - k) = \sum_{k_1 \in \mathbb{Z}} \cdots \sum_{k_N \in \mathbb{Z}} \prod_{i=1}^{N} \Phi(x_i - k_i) = \prod_{i=1}^{N} \left(\sum_{k_i \in \mathbb{Z}} \Phi(x_i - k_i)\right) = 1, \tag{33}$$
and it is dilation-invariant in the sense that, for any $n \in \mathbb{N}$,
$$\sum_{k \in \mathbb{Z}^N} Z(nx - k) = \prod_{i=1}^{N} \left(\sum_{k_i \in \mathbb{Z}} \Phi(n x_i - k_i)\right) = 1. \tag{34}$$
Finally, $Z$ is normalized:
$$\int_{\mathbb{R}^N} Z(x)\,dx = \int_{\mathbb{R}^N} \prod_{i=1}^{N} \Phi(x_i)\,dx = \prod_{i=1}^{N} \int_{\mathbb{R}} \Phi(x_i)\,dx_i = 1. \tag{35}$$
Proof. 
(P1) Positivity: Each $\Phi(x_i) > 0$ implies $Z(x) = \prod_{i=1}^{N} \Phi(x_i) > 0$.
(P2) Discrete Partition of Unity: Follows directly from (33) and the univariate property (11).
(P3) Dilation Invariance: Follows from (34) and the univariate dilation invariance.
(P4) Normalization: Follows from (35) and the univariate normalization (14), using Fubini’s theorem. □
Theorem 6 
(Exponential Decay of the Multivariate Kernel). Let $Z$ be the multivariate kernel defined in (31), and define the sup-norm $\|x\|_\infty := \max_{1 \le i \le N} |x_i|$. Then, for any $0 < \beta < 1$ and $n \in \mathbb{N}$ such that $n^{1-\beta} > 2$, the kernel satisfies the exponential decay estimate
$$\sum_{\substack{k \in \mathbb{Z}^N \\ \|k/n - x\|_\infty > 1/n^{\beta}}} Z(nx - k) \le T e^{-2\lambda n^{1-\beta}},$$
where $T := 2 \max\{q, 1/q\}\, e^{4\lambda}$ is the constant (16) appearing in the univariate decay bound. This result shows that the contributions of $Z(nx - k)$ are exponentially small for lattice points $k$ lying outside a sup-norm neighborhood of $x$ of radius $1/n^{\beta}$, reflecting the strong localization of the multivariate kernel.
Proof. 
Let
$$S := \big\{k \in \mathbb{Z}^N : \|k/n - x\|_\infty > 1/n^{\beta}\big\}.$$
For each $k \in S$, there exists at least one index $i_0 \in \{1, \ldots, N\}$ such that $|k_{i_0}/n - x_{i_0}| > 1/n^{\beta}$, that is, $|n x_{i_0} - k_{i_0}| > n^{1-\beta}$. Splitting $S$ according to the first such index and using the tensor-product structure of $Z$, we have
$$\sum_{k \in S} Z(nx - k) \le \sum_{i_0 = 1}^{N} \left(\sum_{\substack{k_{i_0} \in \mathbb{Z} \\ |n x_{i_0} - k_{i_0}| > n^{1-\beta}}} \Phi(n x_{i_0} - k_{i_0})\right) \prod_{i \ne i_0} \left(\sum_{k_i \in \mathbb{Z}} \Phi(n x_i - k_i)\right).$$
Each factor with $i \ne i_0$ equals $1$ by the partition of unity (11), while each tail factor is bounded by $T e^{-2\lambda n^{1-\beta}}$ by Theorem 4. Hence
$$\sum_{k \in S} Z(nx - k) \le N\, T e^{-2\lambda n^{1-\beta}},$$
and since $n^{1-\beta} > 2$, the dimensional factor $N$ can be absorbed into the exponential by adjusting $T$ if necessary. Therefore, we conclude
$$\sum_{\substack{k \in \mathbb{Z}^N \\ \|k/n - x\|_\infty > 1/n^{\beta}}} Z(nx - k) \le T e^{-2\lambda n^{1-\beta}},$$
proving the claimed exponential decay. □

2.6. Neural Network Operators

Using the multivariate kernel Z, we define several types of neural network operators that serve as primary tools for function approximation in high-dimensional settings.
Let $f \in C^m(\mathbb{R}^N)$, where $m, N \in \mathbb{N}$. For a multi-index $\alpha = (\alpha_1, \ldots, \alpha_N) \in \mathbb{Z}_+^N$ with $|\alpha| = \sum_{i=1}^{N} \alpha_i$, denote the partial derivative of order $|\alpha|$ by
$$f_\alpha(x) = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}}(x), \qquad x \in \mathbb{R}^N.$$
The maximum norm of the derivatives of order $m$ is defined as
$$\|f\|_{\alpha,m}^{\max} = \max_{|\alpha| = m} \|f_\alpha\|_{\infty}.$$
Let $C_B(\mathbb{R}^N)$ denote the space of continuous and bounded functions on $\mathbb{R}^N$.
Definition 2 
(Neural Network Operators). Let $f \in C_B(\mathbb{R}^N)$ be a continuous and bounded function, $n \in \mathbb{N}$, and $x \in \mathbb{R}^N$. Using the multivariate kernel $Z$, we define the following classes of neural network operators, which generalize classical approximation operators to the multivariate setting:
1. Quasi-interpolation operator:
The quasi-interpolation operator $A_n$ approximates $f$ by evaluating it at the scaled integer lattice points $k/n$ and weighting these evaluations by the kernel:
$$A_n(f, x) = \sum_{k \in \mathbb{Z}^N} f\!\left(\frac{k}{n}\right) Z(nx - k). \tag{48}$$
This operator is simple and computationally efficient, relying only on pointwise values of $f$. Its accuracy is controlled by the smoothness of $f$ and the decay properties of $Z$.
2. Kantorovich-type operator:
The Kantorovich operator $K_n$ generalizes $A_n$ by replacing pointwise evaluations with local integrals over small hypercubes $[k/n, (k+1)/n]$:
$$K_n(f, x) = \sum_{k \in \mathbb{Z}^N} \left(n^N \int_{k/n}^{(k+1)/n} f(t)\,dt\right) Z(nx - k). \tag{49}$$
This construction improves approximation properties for functions that are less regular or only integrable, and it preserves linear functionals, making it suitable for analysis in $L^p$ spaces.
3. Quadrature-type operator:
To further enhance flexibility and numerical implementation, one can define local quadrature approximations of the integrals in $K_n$. Let
$$\delta_{n,k}(f) = \sum_{r=0}^{\theta} w_r\, f\!\left(\frac{k}{n} + \frac{r}{n\theta}\right), \tag{50}$$
where $\theta = (\theta_1, \ldots, \theta_N) \in \mathbb{N}^N$ specifies the number of quadrature points in each dimension, $r = (r_1, \ldots, r_N) \in \mathbb{Z}_+^N$ indexes the points, and $w_r = w_{r_1, \ldots, r_N} \ge 0$ are the corresponding quadrature weights satisfying $\sum_{r=0}^{\theta} w_r = 1$. The quadrature-type operator is then defined as
$$Q_n(f, x) = \sum_{k \in \mathbb{Z}^N} \delta_{n,k}(f)\, Z(nx - k). \tag{51}$$
This operator interpolates between pointwise and integral-based approximations, providing a practical scheme for high-dimensional problems and allowing for flexible choice of quadrature rules.
Remarks:
1. All three operators rely on the tensor-product kernel $Z$, which satisfies positivity, partition of unity, and exponential decay properties.
2. $A_n$ is computationally simplest, $K_n$ is more robust for rough functions, and $Q_n$ allows numerical integration with controlled accuracy; a one-dimensional sketch of $A_n$ and $K_n$ follows below.
3. These operators provide the foundation for convergence analysis in pointwise, uniform, and $L^p$ senses, as shown in subsequent theorems.
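As announced in the remarks, here is a one-dimensional ($N = 1$) sketch of $A_n$ and $K_n$: the lattice sums in (48)–(49) are truncated, and the cell integrals of $K_n$ are approximated by a midpoint rule; both simplifications are ours and are harmless thanks to the exponential decay of the kernel. Phi is the function from the earlier sketch.

```python
import numpy as np

def A_n(f, x, n, K=50, q=2.0, lam=1.0):
    """Quasi-interpolation operator (48) in 1-D, truncated to |n*x - k| <= K."""
    k = np.arange(int(n * x) - K, int(n * x) + K + 1)
    return np.sum(f(k / n) * Phi(n * x - k, q, lam))

def K_n(f, x, n, K=50, q=2.0, lam=1.0, quad=20):
    """Kantorovich operator (49) in 1-D: cell averages n * integral over
    [k/n, (k+1)/n], approximated by a midpoint rule with `quad` nodes."""
    k = np.arange(int(n * x) - K, int(n * x) + K + 1)
    t = (np.arange(quad) + 0.5) / (quad * n)        # midpoints inside one cell
    cell_means = f(k[:, None] / n + t[None, :]).mean(axis=1)
    return np.sum(cell_means * Phi(n * x - k, q, lam))

for n in [10, 100, 1000]:
    print(n, abs(A_n(np.cos, 0.7, n) - np.cos(0.7)),
             abs(K_n(np.cos, 0.7, n) - np.cos(0.7)))
```

Both operators reproduce constants exactly up to truncation, reflecting the partition of unity; the printed errors shrink as $n$ grows.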
Theorem 7 
(Convergence of Multivariate Neural Network Operators). Let $f \in C^m(\mathbb{R}^N)$ with bounded derivatives up to order $m$, and let $A_n, K_n, Q_n$ be the operators defined in (48)–(51). Then, for each $x \in \mathbb{R}^N$, we have the following convergence results:
$$\lim_{n \to \infty} A_n(f, x) = f(x), \qquad \lim_{n \to \infty} K_n(f, x) = f(x), \qquad \lim_{n \to \infty} Q_n(f, x) = f(x),$$
with the quantitative error estimates
$$|A_n(f, x) - f(x)| \le \frac{C}{n^m}\, \|f\|_{\alpha,m}^{\max}, \qquad |K_n(f, x) - f(x)| \le \frac{C}{n^m}\, \|f\|_{\alpha,m}^{\max}, \qquad |Q_n(f, x) - f(x)| \le \frac{C}{n^m}\, \|f\|_{\alpha,m}^{\max},$$
where $C > 0$ is a constant depending only on the kernel $Z$ and the dimension $N$.
Proof. 
The proof follows standard arguments using Taylor expansions with integral remainders and the tensor-product structure of $Z$.
For $A_n(f, x)$, write
$$f\!\left(\frac{k}{n}\right) - f(x) = \sum_{|\alpha| < m} \frac{f_\alpha(x)}{\alpha!} \left(\frac{k}{n} - x\right)^{\alpha} + R_m\!\left(f; x, \frac{k}{n}\right),$$
where $R_m$ is the remainder term. Summing against $Z(nx - k)$ and using the moment properties of the kernel yields
$$|A_n(f, x) - f(x)| \le \frac{C}{n^m}\, \|f\|_{\alpha,m}^{\max}.$$
For $K_n(f, x)$, the Kantorovich integral averages satisfy
$$n^N \int_{k/n}^{(k+1)/n} \big[f(t) - f(x)\big]\,dt = \sum_{|\alpha| < m} \frac{f_\alpha(x)}{\alpha!} \frac{1}{n^{|\alpha|}} \sum_{|\beta| = |\alpha|} c_\beta + R_m,$$
where $R_m$ is again bounded by $C n^{-m} \|f\|_{\alpha,m}^{\max}$.
Similarly, for $Q_n(f, x)$, the local quadrature averages approximate the integral to order $n^{-m}$ under standard assumptions on the weights $w_r$.
Combining these estimates for all operators concludes the proof. □
Theorem 8 
(Fundamental Properties of Neural Network Operators). Let $f, g \in C_B(\mathbb{R}^N)$ and $a, b \in \mathbb{R}$. For all $n \in \mathbb{N}$ and $x \in \mathbb{R}^N$, each operator $T_n \in \{A_n, K_n, Q_n\}$ defined in (48)–(51) satisfies the following properties:
The operators are linear, i.e.,
$$T_n(af + bg, x) = a\,T_n(f, x) + b\,T_n(g, x).$$
They are positive: if $f(x) \ge 0$ for all $x \in \mathbb{R}^N$, then
$$T_n(f, x) \ge 0, \qquad \forall x \in \mathbb{R}^N.$$
Moreover, they reproduce constants: if $f(x) \equiv c \in \mathbb{R}$, then
$$T_n(f, x) = c, \qquad \forall x \in \mathbb{R}^N.$$
Proof. Linearity: By definition, each operator is a linear combination of function evaluations or integrals. For $A_n$:
$$A_n(af + bg, x) = \sum_{k \in \mathbb{Z}^N} (af + bg)\!\left(\frac{k}{n}\right) Z(nx - k) = a \sum_{k \in \mathbb{Z}^N} f\!\left(\frac{k}{n}\right) Z(nx - k) + b \sum_{k \in \mathbb{Z}^N} g\!\left(\frac{k}{n}\right) Z(nx - k) = a\,A_n(f, x) + b\,A_n(g, x).$$
Analogous arguments hold for $K_n$ and $Q_n$ using linearity of integrals and sums.
Positivity: Since $Z(nx - k) > 0$ and the quadrature weights satisfy $w_r \ge 0$, each operator applied to $f \ge 0$ is a sum of non-negative terms, hence $T_n(f, x) \ge 0$.
Reproduction of Constants: If $f \equiv c$, then
$$A_n(f, x) = \sum_{k \in \mathbb{Z}^N} c\, Z(nx - k) = c \sum_{k \in \mathbb{Z}^N} Z(nx - k) = c,$$
$$K_n(f, x) = \sum_{k \in \mathbb{Z}^N} \left(n^N \int_{k/n}^{(k+1)/n} c\,dt\right) Z(nx - k) = c, \qquad Q_n(f, x) = \sum_{k \in \mathbb{Z}^N} \Big(\sum_r w_r\, c\Big) Z(nx - k) = c,$$
where we used the partition of unity property of $Z$: $\sum_{k \in \mathbb{Z}^N} Z(nx - k) = 1$. □
Remark 4 
(Approximation Order). The operators defined above exhibit specific approximation properties:
  • If $f \in C^m(\mathbb{R}^N)$ with bounded derivatives, then by a standard Taylor expansion of $f$ about the grid nodes $k/n$, we obtain:
$$A_n(f, x) = f(x) + O\!\left(\frac{1}{n}\right), \qquad K_n(f, x) = f(x) + O\!\left(\frac{1}{n}\right),$$
    and similarly for $Q_n(f, x)$, depending on the quadrature rule.
  • Higher-order moment conditions on $\Phi$ or higher-degree quadrature rules can improve this rate, leading to asymptotic Voronovskaya-type expansions as discussed in later sections.
Remark 5 
(Interpretation of Kantorovich Operator). The integral in (49) can also be expressed as:
$$\int_{k/n}^{(k+1)/n} f(t)\,dt = \int_{[0, 1/n]^N} f\!\left(t + \frac{k}{n}\right) dt,$$
indicating that $K_n(f, x)$ represents a shifted average of $f$ over cubes of side length $1/n$ anchored at $k/n$.

3. Main Results

In this section, we rigorously analyze the approximation properties of the multivariate neural network operators $A_n$, $K_n$, and $Q_n$ by deriving Voronovskaya-type asymptotic expansions. Our approach relies on refined multivariate Taylor expansions and precise estimates of the remainder terms.

3.1. Voronovskaya-Type Expansion for Basic Operators

Theorem 9 
(Voronovskaya-Type Asymptotic Expansion). Let $0 < \beta < 1$, $n \in \mathbb{N}$ sufficiently large, $x \in \mathbb{R}^N$, and $f \in C^m(\mathbb{R}^N)$, where $m, N \in \mathbb{N}$. Assume that all partial derivatives of order $m$ are bounded, i.e., $f_\alpha \in C_B(\mathbb{R}^N)$ for all multi-indices $\alpha = (\alpha_1, \ldots, \alpha_N)$ with $|\alpha| = \sum_{i=1}^{N} \alpha_i = m$. Let $0 < \varepsilon \le m$. Then, for the neural network operator $A_n$ we have
$$A_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\, A_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + o\!\left(\frac{1}{n^{\beta(m - \varepsilon)}}\right), \tag{69}$$
which equivalently implies
$$n^{\beta(m-\varepsilon)}\left[A_n(f, x) - f(x) - \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\, A_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x)\right] \to 0, \qquad n \to \infty. \tag{70}$$
Furthermore, if $f_\alpha(x) = 0$ for all $|\alpha| = 1, \ldots, m$, then
$$n^{\beta(m-\varepsilon)}\,\big|A_n(f, x) - f(x)\big| \to 0, \qquad n \to \infty. \tag{71}$$
Proof. 
We begin by defining, for $x_0, z \in \mathbb{R}^N$, the univariate path function
$$g_z(t) := f\big(x_0 + t(z - x_0)\big), \qquad t \in [0, 1].$$
By the chain rule, its $j$-th derivative satisfies
$$g_z^{(j)}(t) = \left[\left(\sum_{i=1}^{N} (z_i - x_{0i}) \frac{\partial}{\partial x_i}\right)^{\! j} f\right]\big(x_0 + t(z - x_0)\big), \qquad j = 0, 1, \ldots, m.$$
Using the standard multivariate Taylor formula with integral remainder:
$$f(z) = \sum_{j=0}^{m} \frac{g_z^{(j)}(0)}{j!} + \frac{1}{(m-1)!} \int_0^1 (1 - \theta)^{m-1} \big[g_z^{(m)}(\theta) - g_z^{(m)}(0)\big]\,d\theta,$$
where
$$g_z^{(j)}(0) = \sum_{|\alpha| = j} \frac{j!}{\prod_{i=1}^{N} \alpha_i!} \prod_{i=1}^{N} (z_i - x_{0i})^{\alpha_i}\, f_\alpha(x_0), \qquad j = 0, 1, \ldots, m.$$
Applying the operator $A_n$ and setting $z = k/n$, $k \in \mathbb{Z}^N$, we obtain
$$A_n(f, x) - f(x) = \sum_{k \in \mathbb{Z}^N} f\!\left(\frac{k}{n}\right) Z(nx - k) - f(x) = \sum_{1 \le |\alpha| \le m} \frac{f_\alpha(x)}{\prod_i \alpha_i!}\, A_n\!\left(\prod_i (\cdot_i - x_i)^{\alpha_i}\right)(x) + \theta_n,$$
where the remainder term $\theta_n$ satisfies the estimate
$$|\theta_n| \le \frac{2\, \|f\|_{\alpha,m}^{\max}\, N^m}{m!\, n^{m\beta}} = O\big(n^{-m\beta}\big) = o(1).$$
Consequently, for $0 < \varepsilon \le m$,
$$|\theta_n| = o\big(n^{-\beta(m-\varepsilon)}\big),$$
yielding the desired asymptotic expansion (69)–(70). The case $f_\alpha = 0$ directly implies (71). □
Remark 6. 
The above theorem rigorously establishes the rate of convergence for $A_n(f, x)$ in terms of the smoothness $m$ of $f$, the dimensionality $N$, and the parameter $\beta$. The remainder estimate is uniform with respect to $x \in \mathbb{R}^N$ under the bounded derivative condition, providing a clear path toward the derivation of analogous expansions for Kantorovich-type operators.

3.2. Preparatory Observations

For any multi-index $\alpha$ with $|\alpha| = m$, we adopt the standard multinomial notation
$$\frac{m!}{\prod_{i=1}^{N} \alpha_i!},$$
and define the remainder integrand in the multivariate Taylor expansion as
$$\prod_{i=1}^{N} (z_i - x_{0i})^{\alpha_i}\, f_\alpha\big(x_0 + \theta(z - x_0)\big), \qquad 0 \le \theta \le 1,$$
which plays a central role in estimating the remainder term.
Under the uniform grid assumption
$$\left|\frac{k_i}{n} - x_i\right| \le \frac{1}{n^{\beta}}, \qquad i = 1, \ldots, N,$$
we can bound the remainder $R$ as
$$|R| \le \frac{2\, \|f\|_{\alpha,m}^{\max}\, N^m}{m!\, n^{m\beta}}.$$
Moreover, summing over all nodes with the weight function $Z(nx - k)$ yields the global estimate
$$|\theta_n| := \left|\sum_{k \in \mathbb{Z}^N} R\, Z(nx - k)\right| = O\big(n^{-m\beta}\big) = o\big(n^{-\beta(m-\varepsilon)}\big), \qquad n \to \infty.$$
These estimates complete the proof of Theorem 9 and provide a solid foundation for deriving analogous Voronovskaya-type expansions for Kantorovich-type operators.
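The rate predicted by Theorem 9 can be probed empirically. The following sketch (reusing A_n from the code in Section 2.6; the test function $\sin$ and the point $x = 0.4$ are arbitrary) estimates the observed decay exponent of $|A_n(f, x) - f(x)|$ between successive doublings of $n$:

```python
import numpy as np

f, x = np.sin, 0.4
ns = [50, 100, 200, 400, 800]
errs = np.array([abs(A_n(f, x, n) - f(x)) for n in ns])
orders = np.log(errs[:-1] / errs[1:]) / np.log(2)   # observed exponents
print(errs)
print(orders)
```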

4. Voronovskaya-Type Expansions for Kantorovich and Quasi-Interpolation Operators

We now extend the previous results to Kantorovich-type operators $K_n$ and quasi-interpolation neural network operators $Q_n$. Our goal is to obtain refined multivariate asymptotic expansions, including higher-order remainder estimates.
Theorem 10 
(Refined Multivariate Voronovskaya Expansion). Let $f \in C^{m+2}(\mathbb{R}^N)$ with bounded derivatives up to order $m+2$, and let $0 < \beta < 1$. For $n \in \mathbb{N}$ sufficiently large and $x \in \mathbb{R}^N$, the following expansion holds for the Kantorovich operator $K_n$:
$$K_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\, K_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + \sum_{|\alpha| = m+1} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\, K_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + R_n(x),$$
where the remainder $R_n(x)$ satisfies the uniform estimate
$$|R_n(x)| \le \frac{C\, \|f\|_{\infty, m+2}^{\max}\, N^{m+2}}{(m+1)!\; n^{(m+1)\beta}}, \tag{78}$$
with a constant $C > 0$ independent of $x$ and $n$. The same expansion holds for $Q_n$ with the corresponding quasi-interpolation weights.
Proof. 
Define the path function for $x_0, z \in \mathbb{R}^N$:
$$g_z(t) := f\big(x_0 + t(z - x_0)\big), \qquad t \in [0, 1].$$
By the chain rule, its derivatives are
$$g_z^{(j)}(t) = \left[\left(\sum_{i=1}^{N} (z_i - x_{0i}) \frac{\partial}{\partial x_i}\right)^{\! j} f\right]\big(x_0 + t(z - x_0)\big), \qquad j = 0, 1, \ldots, m+1.$$
Applying the Taylor expansion with integral remainder gives
$$f(z) = \sum_{j=0}^{m} \frac{g_z^{(j)}(0)}{j!} + \frac{1}{m!} \int_0^1 (1 - \theta)^m\, g_z^{(m+1)}(\theta)\,d\theta.$$
For a grid point $z = k/n$, we can write
$$f\!\left(\frac{k}{n}\right) - f(x) = \sum_{j=1}^{m+1} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\prod_i \alpha_i!} \prod_i \left(\frac{k_i}{n} - x_i\right)^{\alpha_i} + R_{k,n}(x),$$
with the remainder term
$$R_{k,n}(x) = \frac{1}{m!} \int_0^1 (1 - \theta)^m \sum_{|\alpha| = m+1} \frac{(m+1)!}{\prod_i \alpha_i!} \prod_i \left(\frac{k_i}{n} - x_i\right)^{\alpha_i} f_\alpha\big(x + \theta(k/n - x)\big)\,d\theta.$$
Using the grid uniformity assumption
$$\left|\frac{k_i}{n} - x_i\right| \le \frac{1}{n^{\beta}}, \qquad i = 1, \ldots, N,$$
we obtain the bound
$$|R_{k,n}(x)| \le \frac{\|f\|_{\infty, m+2}^{\max}\, N^{m+2}}{(m+1)!\; n^{(m+1)\beta}}.$$
Finally, applying the operator $K_n$ (or $Q_n$) and summing over all grid points yields the uniform remainder estimate for $R_n(x)$ in (78), completing the proof. □
Remark 7 
(Uniform Convergence and Higher-Order Terms). The above expansion shows that for sufficiently smooth $f$, the Kantorovich and quasi-interpolation neural network operators achieve the same asymptotic behavior as classical Voronovskaya expansions, with a fully explicit bound for the remainder term. This allows rigorous estimates of the rate of convergence in terms of $n$, $\beta$, $N$, and the smoothness $m+2$ of $f$.

4.1. Implications for Approximation Rates

Let $f \in C^m(\mathbb{R}^N)$ and $0 < \varepsilon \le m$. From the Voronovskaya-type expansion of Theorem 10, we obtain the asymptotic estimate
$$n^{\beta(m-\varepsilon)}\big[K_n(f, x) - f(x)\big] = O\big(n^{-\beta\varepsilon}\big), \qquad n \to \infty, \tag{89}$$
which provides the precise rate of convergence in terms of the grid parameter $\beta$ and the smoothness index $m$.
Furthermore, if all partial derivatives of order $|\alpha| \le m$ vanish at $x$, i.e.,
$$f_\alpha(x) = 0, \qquad |\alpha| \le m, \tag{90}$$
then the remainder term dominates the expansion, yielding the higher-order estimate
$$K_n(f, x) - f(x) = O\big(n^{-(m+1)\beta}\big), \qquad n \to \infty, \tag{91}$$
which explicitly illustrates the gain in the convergence rate due to higher-order smoothness.
These estimates highlight two important aspects:
1. The general rate (89) depends explicitly on the balance between the smoothness $m$ and the chosen parameter $\varepsilon$, giving a controlled decay of the approximation error.
2. The higher-order vanishing condition (90) demonstrates that additional smoothness beyond order $m$ can further accelerate convergence, as shown in (91).

5. Second-Order Multivariate Voronovskaya Expansions and $L^p$ Estimates

We refine the previous results by including second-order multivariate terms and deriving convergence estimates in $L^p$-norms. This allows a precise comparison among the neural network operators $A_n$, $K_n$, and $Q_n$.
Theorem 11 
(Second-Order Multivariate Voronovskaya Expansion). Let $f \in C^{m+2}(\mathbb{R}^N)$ with bounded derivatives up to order $m+2$, $1 \le p \le \infty$, and $0 < \beta < 1$. For sufficiently large $n \in \mathbb{N}$ and $x \in \mathbb{R}^N$, the following expansion holds for the Kantorovich-type operator $K_n$:
$$K_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\, K_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + \sum_{|\alpha| = m+1} \frac{f_\alpha(x)}{\prod_{i=1}^{N} \alpha_i!}\, K_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + \frac{1}{2} \sum_{i,j=1}^{N} \frac{\partial^2 f}{\partial x_i \partial x_j}(x)\, K_n\big((\cdot_i - x_i)(\cdot_j - x_j)\big)(x) + R_n(x),$$
where the remainder $R_n(x)$ satisfies the uniform and $L^p$ estimate:
$$\|R_n\|_p \le \frac{C\, \|f\|_{\infty, m+2}^{\max}\, N^{m+2}}{(m+1)!\; n^{(m+1)\beta}}.$$
The same expansion holds for quasi-interpolation operators $Q_n$ with corresponding weights.
Proof. 
We extend the path function technique: for $x_0, z \in \mathbb{R}^N$, define
$$g_z(t) := f\big(x_0 + t(z - x_0)\big), \qquad t \in [0, 1],$$
with derivatives
$$g_z^{(j)}(t) = \left[\left(\sum_{i=1}^{N} (z_i - x_{0i}) \frac{\partial}{\partial x_i}\right)^{\! j} f\right]\big(x_0 + t(z - x_0)\big), \qquad j = 0, \ldots, m+2.$$
Applying the multivariate Taylor expansion with integral remainder gives
$$f(z) = \sum_{j=0}^{m} \frac{g_z^{(j)}(0)}{j!} + \frac{1}{m!} \int_0^1 (1 - \theta)^m\, g_z^{(m+1)}(\theta)\,d\theta.$$
Decomposing the second-order terms explicitly for cross derivatives yields
$$\frac{1}{2} \sum_{i,j=1}^{N} \frac{\partial^2 f}{\partial x_i \partial x_j}(x)\,(z_i - x_i)(z_j - x_j).$$
For a uniform grid $z = k/n$ with $|k_i/n - x_i| \le n^{-\beta}$, the remainder satisfies
$$|R_{k,n}(x)| \le \frac{\|f\|_{\infty, m+2}^{\max}\, N^{m+2}}{(m+1)!\; n^{(m+1)\beta}}.$$
Finally, summing over the grid points with Kantorovich or quasi-interpolation weights, we obtain the uniform and $L^p$ bound
$$\Big\|\sum_k w_k R_{k,n}\Big\|_{L^p} = O\big(n^{-(m+1)\beta}\big),$$
which completes the proof. □
Corollary 3 
(Convergence Rate in $L^p$ Norm). Under the hypotheses of Theorem 11, for $0 < \varepsilon \le m$, the following $L^p$ convergence rates hold:
$$\|K_n(f) - f\|_{L^p} = O\big(n^{-\beta\varepsilon}\big), \qquad \|Q_n(f) - f\|_{L^p} = O\big(n^{-\beta\varepsilon}\big), \qquad n \to \infty.$$
Moreover, if all derivatives vanish up to order $m$, i.e., $f_\alpha(x) = 0$ for $|\alpha| \le m$, the remainder dominates and we obtain the higher-order estimate:
$$\|K_n(f) - f\|_{L^p} = O\big(n^{-(m+1)\beta}\big), \qquad \|Q_n(f) - f\|_{L^p} = O\big(n^{-(m+1)\beta}\big), \qquad n \to \infty,$$
highlighting the gain due to additional smoothness of $f$.
Remark 8 
(Comparison of Operators). The asymptotic expansions show that $A_n$, $K_n$, and $Q_n$ share the same principal term up to order $m$, with differences manifesting only in the remainder term. Kantorovich-type operators typically provide better $L^p$-stability due to integration over the cells, whereas quasi-interpolation operators preserve pointwise accuracy and allow explicit control over cross-derivative contributions.
Theorem 12 
(Voronovskaya-type for Multivariate Kantorovich Operators). Let $f \in C_b^m(\mathbb{R}^N)$ and let $K_n$ be the multivariate Kantorovich operator of Definition 2. Then, for $x \in \mathbb{R}^N$, we have
$$K_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{\substack{\alpha \in \mathbb{Z}_+^N \\ |\alpha| = j}} \frac{1}{\alpha!}\, f_\alpha(x)\, K_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + o\!\left(\left(\frac{1}{n} + \frac{1}{n^{\beta}}\right)^{m - \varepsilon}\right),$$
as $n \to \infty$, for any $0 < \varepsilon \le m$, where $\alpha! := \prod_{i=1}^{N} \alpha_i!$ and $f_\alpha$ denotes the partial derivative of order $\alpha$.
Proof. 
Using the multivariate Taylor expansion with integral remainder, for $t \in [0, 1/n]^N$ and $k \in \mathbb{Z}^N$, we write
$$f\!\left(t + \frac{k}{n}\right) - f(x) - \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x) \prod_{i=1}^{N} \left(t_i + \frac{k_i}{n} - x_i\right)^{\alpha_i} = R_{n,k}(t),$$
with the remainder
$$R_{n,k}(t) := m \int_0^1 (1 - \theta)^{m-1} \sum_{|\alpha| = m} \frac{1}{\alpha!} \prod_{i=1}^{N} \left(t_i + \frac{k_i}{n} - x_i\right)^{\alpha_i} \left[f_\alpha\!\left(x + \theta\Big(t + \frac{k}{n} - x\Big)\right) - f_\alpha(x)\right] d\theta.$$
Using the supremum norm of the $m$-th derivatives, we obtain the estimate
$$|R_{n,k}(t)| \le \frac{2\, \|f\|_{\alpha,m}^{\max}\, N^m}{m!} \prod_{i=1}^{N} \left(|t_i| + \left|\frac{k_i}{n} - x_i\right|\right)^{\alpha_i}.$$
Integrating over $[0, 1/n]^N$ and summing over $k$, we define
$$U_n^*(x) := \sum_{k \in \mathbb{Z}^N} \left(n^N \int_{[0, 1/n]^N} R_{n,k}(t)\,dt\right) Z(nx - k),$$
which represents the total remainder contribution to $K_n(f, x)$. By the above estimate, we have
$$|U_n^*(x)| \le \frac{2\, \|f\|_{\alpha,m}^{\max}\, N^m}{m!} \left(\frac{1}{n} + \frac{1}{n^{\beta}}\right)^{m},$$
implying
$$|U_n^*(x)| = O\!\left(\left(\frac{1}{n} + \frac{1}{n^{\beta}}\right)^{m}\right) = o(1),$$
and, for any $0 < \varepsilon \le m$,
$$\frac{|U_n^*(x)|}{\left(\frac{1}{n} + \frac{1}{n^{\beta}}\right)^{m - \varepsilon}} \to 0, \qquad n \to \infty.$$
The result follows by splitting the operator $K_n(f, x)$ into the main polynomial part and the remainder $U_n^*(x)$. □
Theorem 13 
(Voronovskaya-type Expansion for Quadrature Operators). Let $Q_n$ be the multivariate quadrature-type operator of Definition 2, and let $f \in C_b^m(\mathbb{R}^N)$. Then, for each $x \in \mathbb{R}^N$,
$$Q_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x)\, Q_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + o\!\left(\left(\frac{1}{n} + \frac{1}{n^{\beta}}\right)^{m - \varepsilon}\right),$$
as $n \to \infty$, for any $0 < \varepsilon \le m$.
Proof. 
We proceed in a series of detailed steps.
For a fixed quadrature node $(k, r)$, we expand $f$ around $x$:
$$f\!\left(\frac{k}{n} + \frac{r}{n\theta}\right) = f(x) + \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x) \prod_{i=1}^{N} \left(\frac{k_i}{n} + \frac{r_i}{n\theta_i} - x_i\right)^{\alpha_i} + R_{n,k,r},$$
where the remainder $R_{n,k,r}$ admits the integral form (with integration variable $s$, to avoid confusion with the quadrature parameter $\theta$)
$$R_{n,k,r} = m \int_0^1 (1 - s)^{m-1} \sum_{|\alpha| = m} \frac{1}{\alpha!} \prod_{i=1}^{N} \left(\frac{k_i}{n} + \frac{r_i}{n\theta_i} - x_i\right)^{\alpha_i} \left[f_\alpha\!\left(x + s\Big(\frac{k}{n} + \frac{r}{n\theta} - x\Big)\right) - f_\alpha(x)\right] ds.$$
Using the uniform boundedness of the derivatives $f_\alpha$ on $\mathbb{R}^N$, we get
$$|R_{n,k,r}| \le \frac{2 N^m}{m!}\, \|f\|_{\alpha,m}^{\max} \prod_{i=1}^{N} \left|\frac{k_i}{n} + \frac{r_i}{n\theta_i} - x_i\right|^{\alpha_i}.$$
The operator $Q_n$ involves weighted sums over the quadrature points $r$ with weights $w_r$. Thus, for each $k$, we have
$$\left|\sum_r w_r R_{n,k,r}\right| \le \frac{2 N^m}{m!}\, \|f\|_{\alpha,m}^{\max} \left(\frac{1}{n} + \left\|\frac{k}{n} - x\right\|_\infty\right)^{m}.$$
Next, summing over $k \in \mathbb{Z}^N$ against the kernel $Z(nx - k)$, which decays sufficiently fast, we define
$$E_n^*(x) := \sum_{k \in \mathbb{Z}^N} \left(\sum_r w_r R_{n,k,r}\right) Z(nx - k), \qquad |E_n^*(x)| \le \frac{2 N^m}{m!}\, \|f\|_{\alpha,m}^{\max} \sum_{k \in \mathbb{Z}^N} Z(nx - k) \left(\frac{1}{n} + \left\|\frac{k}{n} - x\right\|_\infty\right)^{m}.$$
Using the decay property of $Z$ and standard estimates for lattice sums, we get
$$|E_n^*(x)| = O\!\left(\left(\frac{1}{n} + \frac{1}{n^{\beta}}\right)^{m}\right) = o(1), \qquad n \to \infty.$$
The parameter $\beta$ reflects the trade-off between the kernel truncation error and the Taylor remainder (the choice $\beta = 1/2$ balances the two contributions). In any case, for any $0 < \varepsilon \le m$,
$$\frac{|E_n^*(x)|}{\left(\frac{1}{n} + \frac{1}{n^{\beta}}\right)^{m - \varepsilon}} \to 0 \quad \text{as } n \to \infty,$$
providing the claimed convergence rate for the remainder.
Combining the main term from the Taylor expansion with the remainder estimate gives
$$Q_n(f, x) = f(x) + \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{1}{\alpha!}\, f_\alpha(x)\, Q_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + E_n^*(x),$$
which establishes the claimed Voronovskaya-type expansion. □

6. Voronovskaya-Type Expansion in Sobolev $W^{s,p}$

Theorem 14 
(Voronovskaya Expansion in $W^{s,p}$). Let $m \in \mathbb{N}$, $0 \le s \le m$, and $1 \le p < \infty$. Assume $f \in W^{m,p}(\mathbb{R}^N)$ and that the moments
$$M_\alpha := \int_{\mathbb{R}^N} u^{\alpha} Z(u)\,du$$
exist for all multi-indices $|\alpha| \le m$. Then, for the operator $A_n$,
$$A_n f(x) - f(x) = \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) + r_{n,s}(x),$$
where $r_{n,s} \in W^{s,p}(\mathbb{R}^N)$ and
$$\|r_{n,s}\|_{W^{s,p}} = o\big(n^{-(m-s)}\big), \qquad n \to \infty. \tag{123}$$
Proof. 
For each lattice point $k \in \mathbb{Z}^N$, the multivariate Taylor expansion with integral remainder gives
$$f\!\left(\frac{k}{n}\right) = \sum_{|\alpha| \le m-s} \frac{\partial^{\alpha} f(x)}{\alpha!} \left(\frac{k}{n} - x\right)^{\alpha} + R_{n,m}(x, k),$$
where the remainder term admits the integral representation
$$R_{n,m}(x, k) = \sum_{m-s+1 \le |\alpha| \le m} \frac{m!}{\alpha!\,(m - |\alpha|)!} \int_0^1 (1 - t)^{m - |\alpha|}\, \partial^{\alpha} f\!\left(x + t\Big(\frac{k}{n} - x\Big)\right) \left(\frac{k}{n} - x\right)^{\alpha} dt.$$
By linearity of $A_n$ and the definition
$$A_n f(x) = \sum_{k \in \mathbb{Z}^N} f\!\left(\frac{k}{n}\right) Z(nx - k),$$
we have
$$A_n f(x) - f(x) = \sum_{1 \le |\alpha| \le m-s} \frac{\partial^{\alpha} f(x)}{\alpha!} \sum_{k \in \mathbb{Z}^N} \left(\frac{k}{n} - x\right)^{\alpha} Z(nx - k) + \sum_{k \in \mathbb{Z}^N} R_{n,m}(x, k)\, Z(nx - k).$$
Using the definition of the moments $M_\alpha$ and the scaling properties of $Z$, we obtain
$$\sum_{k \in \mathbb{Z}^N} \left(\frac{k}{n} - x\right)^{\alpha} Z(nx - k) = \frac{M_\alpha}{n^{|\alpha|}} + O\big(n^{-|\alpha| - 1}\big),$$
recovering the main term of the expansion:
$$\sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x).$$
The remainder can be expressed as a discrete convolution:
$$r_{n,s}(x) = \sum_{k \in \mathbb{Z}^N} R_{n,m}(x, k)\, Z(nx - k).$$
Differentiating under the summation (allowed since $Z$ is smooth and rapidly decaying) and applying Minkowski’s inequality, we get, for all multi-indices $|\beta| \le s$,
$$\|\partial^{\beta} r_{n,s}\|_{L^p} \le \sum_{m-s+1 \le |\alpha| \le m} \frac{C_\alpha}{\alpha!} \sum_{k \in \mathbb{Z}^N} \int_0^1 \left\|\partial^{\alpha} f\!\left(\cdot + t\Big(\frac{k}{n} - \cdot\Big)\right)\right\|_{L^p} \left|\frac{k}{n} - x\right|^{|\alpha|} Z(nx - k)\,dt.$$
Passing to the Riemann sum and applying Young’s inequality for discrete convolutions yields
$$\|\partial^{\beta} r_{n,s}\|_{L^p} \le \frac{C}{n^{m-s}}\, \|\nabla^m f\|_{L^p}.$$
Hence the remainder satisfies (123), and combining it with the main term above completes the proof. □

7. Quantitative $L^p$ Rate with Explicit Constants

Theorem 15 
(Quantitative $L^p$ Approximation Rate). Let $f \in W^{m,p}(\mathbb{R}^N)$, $1 \le p \le \infty$, and assume the kernel moments
$$M_\alpha := \int_{\mathbb{R}^N} u^{\alpha} Z(u)\,du$$
exist and are finite for all multi-indices $|\alpha| \le m$. Then there exists a constant
$$C := C(N, m, p, Z) := \sum_{|\gamma| = m} \frac{1}{\gamma!} \int_{\mathbb{R}^N} |u|^{|\gamma|}\, |Z(u)|\,du \tag{134}$$
such that
$$\left\|A_n f - f - \sum_{1 \le |\alpha| \le m-1} \frac{M_\alpha}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f\right\|_{L^p} \le \frac{C\, \|f\|_{W^{m,p}}}{n^m}. \tag{135}$$
Proof. 
For each lattice point $k \in \mathbb{Z}^N$, the multivariate Taylor expansion with integral remainder yields
$$f\!\left(\frac{k}{n}\right) = f(x) + \sum_{1 \le |\alpha| \le m-1} \frac{\partial^{\alpha} f(x)}{\alpha!} \left(\frac{k}{n} - x\right)^{\alpha} + R_m(x, k),$$
with the remainder
$$R_m(x, k) = \sum_{|\gamma| = m} \frac{m}{\gamma!} \int_0^1 (1 - t)^{m-1}\, \partial^{\gamma} f\!\left(x + t\Big(\frac{k}{n} - x\Big)\right) \left(\frac{k}{n} - x\right)^{\gamma} dt.$$
By linearity of $A_n$,
$$A_n f(x) - f(x) - \sum_{1 \le |\alpha| \le m-1} \frac{M_\alpha}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) = \sum_{k \in \mathbb{Z}^N} R_m(x, k)\, Z(nx - k) + E_n(x),$$
where
$$E_n(x) = \sum_{1 \le |\alpha| \le m-1} \frac{\partial^{\alpha} f(x)}{\alpha!} \left[\sum_{k \in \mathbb{Z}^N} \left(\frac{k}{n} - x\right)^{\alpha} Z(nx - k) - \frac{M_\alpha}{n^{|\alpha|}}\right].$$
Since $Z$ is rapidly decaying, one has $E_n(x) = O(n^{-m})$.
Change variables $u = nx - k$, so that $\frac{k}{n} - x = -\frac{u}{n}$. Then
$$\sum_{k \in \mathbb{Z}^N} |R_m(x, k)|\, |Z(nx - k)| \le \sum_{|\gamma| = m} \frac{1}{\gamma!} \sum_{u} \int_0^1 (1 - t)^{m-1} \left|\partial^{\gamma} f\!\left(x - \frac{u}{n}\, t\right)\right| \frac{|u|^m}{n^m}\, |Z(u)|\,dt.$$
Taking the $L^p$-norm and applying Minkowski’s inequality:
$$\left\|\sum_{k \in \mathbb{Z}^N} R_m(x, k)\, Z(nx - k)\right\|_{L^p} \le \sum_{|\gamma| = m} \frac{1}{\gamma!\, n^m} \int_0^1 (1 - t)^{m-1} \left\|\sum_{u} \left|\partial^{\gamma} f\!\left(\cdot - \frac{u}{n}\, t\right)\right| |u|^m\, |Z(u)|\right\|_{L^p} dt.$$
The lattice sum over $u$ is comparable to the corresponding integral; by translation invariance of the $L^p$-norm, its contribution is bounded by
$$\left(\int_{\mathbb{R}^N} |u|^m\, |Z(u)|\,du\right) \|\partial^{\gamma} f\|_{L^p}.$$
Hence,
$$\left\|\sum_{k \in \mathbb{Z}^N} R_m(x, k)\, Z(nx - k)\right\|_{L^p} \le \frac{C\, \|f\|_{W^{m,p}}}{n^m},$$
with $C$ given explicitly by (134).
Combining the estimates for $R_m$ and $E_n$, we obtain the desired quantitative $L^p$ rate (135). □

8. Unified Voronovskaya-Type Expansion in Sobolev $W^{s,p}$ with Explicit Constants

Theorem 16 
(Voronovskaya-Type Expansion in $W^{s,p}$ with Explicit Constants). Let $f \in W^{m,p}(\mathbb{R}^N)$, with $m \in \mathbb{N}$, $0 \le s \le m$, and $1 \le p \le \infty$. Assume the kernel $Z$ has finite moments
$$M_\alpha := \int_{\mathbb{R}^N} u^{\alpha} Z(u)\,du, \qquad |\alpha| \le m,$$
and is rapidly decaying. Then for the operator $A_n$ we have
$$A_n f(x) - f(x) = \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) + r_{n,s}(x),$$
with remainder $r_{n,s} \in W^{s,p}(\mathbb{R}^N)$ satisfying the explicit quantitative bound
$$\|r_{n,s}\|_{W^{s,p}} \le \frac{C(N, m, p, Z)\, \|f\|_{W^{m,p}}}{n^{m-s}},$$
where the constant is given explicitly by
$$C(N, m, p, Z) := \sum_{|\gamma| = m} \frac{1}{\gamma!} \int_{\mathbb{R}^N} |u|^{|\gamma|}\, |Z(u)|\,du. \tag{147}$$
Proof. 
For each lattice point $k \in \mathbb{Z}^N$, consider the multivariate Taylor expansion around $x$:
$$f\!\left(\frac{k}{n}\right) = \sum_{|\alpha| \le m-1} \frac{\partial^{\alpha} f(x)}{\alpha!} \left(\frac{k}{n} - x\right)^{\alpha} + R_m(x, k),$$
with integral remainder
$$R_m(x, k) = \sum_{|\gamma| = m} \frac{m}{\gamma!} \int_0^1 (1 - t)^{m-1}\, \partial^{\gamma} f\!\left(x + t\Big(\frac{k}{n} - x\Big)\right) \left(\frac{k}{n} - x\right)^{\gamma} dt.$$
By linearity of $A_n$, we obtain
$$A_n f(x) - f(x) - \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) = \sum_{k \in \mathbb{Z}^N} R_m(x, k)\, Z(nx - k) + E_n(x),$$
where $E_n(x)$ accounts for the approximation error in replacing discrete sums by the exact moments. Rapid decay of $Z$ ensures $E_n(x) = O(n^{-m})$.
Perform the change of variables $u = nx - k$, so that $k/n - x = -u/n$. Then
$$\sum_{k \in \mathbb{Z}^N} |R_m(x, k)|\, |Z(nx - k)| \le \sum_{|\gamma| = m} \frac{1}{\gamma!\, n^m} \sum_{u} \int_0^1 (1 - t)^{m-1} \left|\partial^{\gamma} f\!\left(x - \frac{t u}{n}\right)\right| |u|^m\, |Z(u)|\,dt.$$
Applying the discrete Minkowski inequality and standard Sobolev shift estimates, we deduce
$$\|r_{n,s}\|_{W^{s,p}} \le \frac{C(N, m, p, Z)\, \|f\|_{W^{m,p}}}{n^{m-s}},$$
with $C(N, m, p, Z)$ given explicitly in (147).
Combining all these steps, the Voronovskaya-type expansion in the Sobolev space $W^{s,p}$ with an explicit quantitative rate is established. □

9. Uniform Stability of Voronovskaya-Type Expansions under $(q, \lambda)$ Variations

Definition 3 
(Uniform $C^m$ Kernel with Controlled Moments). Let $U \subset (0, \infty)^2$ be compact. We say that the family $\{Z_{q,\lambda}\}_{(q,\lambda) \in U}$ is uniformly $C^m$ with controlled moments if
$$\sup_{(q,\lambda) \in U} \int_{\mathbb{R}^N} |u|^{|\gamma|}\, \big|\partial^{\beta} Z_{q,\lambda}(u)\big|\,du < \infty, \qquad |\gamma|, |\beta| \le m. \tag{153}$$
Theorem 17 
(Uniform $W^{s,p}$ Stability). Let $\{Z_{q,\lambda}\}_{(q,\lambda) \in U}$ be uniformly $C^m$ with controlled moments on a compact set $U$, and let $f \in W^{m,p}(\mathbb{R}^N)$. Denote by $A_n^{(q,\lambda)}$ the corresponding operators. Then the Voronovskaya-type expansions
$$A_n^{(q,\lambda)} f(x) - f(x) = \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha(q,\lambda)}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) + r_{n,s}^{(q,\lambda)}(x)$$
are uniform in $(q,\lambda) \in U$, in the sense that
$$\sup_{(q,\lambda) \in U} \big\|r_{n,s}^{(q,\lambda)}\big\|_{W^{s,p}} = o\big(n^{-(m-s)}\big), \qquad n \to \infty, \tag{155}$$
with constants in the estimates depending only on the suprema in (153).
Proof. 
For each $(q,\lambda) \in U$ and lattice point $k \in \mathbb{Z}^N$, consider the multivariate Taylor expansion of $f$ around $x$:
$$f\!\left(\frac{k}{n}\right) = \sum_{|\alpha| \le m-1} \frac{\partial^{\alpha} f(x)}{\alpha!} \left(\frac{k}{n} - x\right)^{\alpha} + R_m^{(q,\lambda)}(x, k),$$
where $R_m^{(q,\lambda)}(x, k)$ is the integral remainder.
By linearity of the operator $A_n^{(q,\lambda)}$, we write
$$A_n^{(q,\lambda)} f(x) - f(x) - \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha(q,\lambda)}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) = \sum_{k \in \mathbb{Z}^N} R_m^{(q,\lambda)}(x, k)\, Z_{q,\lambda}(nx - k) + E_n^{(q,\lambda)}(x),$$
where $E_n^{(q,\lambda)}(x)$ accounts for the discrete-to-continuum moment approximation error.
By assumption (153), the integrals
$$\int_{\mathbb{R}^N} |u|^{|\gamma|}\, \big|\partial^{\beta} Z_{q,\lambda}(u)\big|\,du$$
are uniformly bounded over $(q,\lambda) \in U$. Therefore, each term in the remainder estimate satisfies
$$\big\|r_{n,s}^{(q,\lambda)}\big\|_{W^{s,p}} \le \sum_{|\gamma| = m} \frac{1}{\gamma!\, n^{m-s}} \int_{\mathbb{R}^N} |u|^m \sup_{(q,\lambda) \in U} \big|\partial^{\gamma} Z_{q,\lambda}(u)\big|\,du\; \|f\|_{W^{m,p}}.$$
The uniform integrability of $\partial^{\beta} Z_{q,\lambda}$ allows the application of the dominated convergence theorem as $n \to \infty$, yielding the uniform bound (155) and establishing the uniformity of the Voronovskaya expansions in $W^{s,p}$. □

10. Unified Voronovskaya-Type Expansion with Explicit Constants and Uniform Stability

10.1. Setup

Let $f \in W^{m,p}(\mathbb{R}^N)$, $m \in \mathbb{N}$, $0 \le s \le m$, $1 \le p \le \infty$. Let $Z \in C^m(\mathbb{R}^N)$ (or a parametric family $\{Z_{q,\lambda}\}_{(q,\lambda) \in U}$) with moments
$$M_\alpha := \int_{\mathbb{R}^N} u^{\alpha} Z(u)\,du, \qquad |\alpha| \le m,$$
and assume rapid decay and, for parametric families, uniform boundedness:
$$\sup_{(q,\lambda) \in U} \int_{\mathbb{R}^N} |u|^{|\gamma|}\, \big|\partial^{\beta} Z_{q,\lambda}(u)\big|\,du < \infty, \qquad |\gamma|, |\beta| \le m. \tag{161}$$
Define the multivariate quasi-interpolation operators
$$A_n^{(\cdot)} f(x) := \sum_{k \in \mathbb{Z}^N} f\!\left(\frac{k}{n}\right) Z^{(\cdot)}(nx - k), \qquad \sum_{k} Z^{(\cdot)}(nx - k) = 1.$$

10.2. Unified Voronovskaya-Type Theorem

Theorem 18 
(Compact Voronovskaya Expansion with Explicit Constants). Under the above assumptions, the operator $A_n^{(\cdot)}$ satisfies
$$A_n^{(\cdot)} f(x) - f(x) = \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha^{(\cdot)}}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) + r_{n,s}^{(\cdot)}(x), \tag{163}$$
where the remainder $r_{n,s}^{(\cdot)} \in W^{s,p}(\mathbb{R}^N)$ satisfies the uniform quantitative estimate
$$\big\|r_{n,s}^{(\cdot)}\big\|_{W^{s,p}} \le \frac{C(N, m, p, Z)}{n^{m-s}}\, \|f\|_{W^{m,p}}, \qquad C(N, m, p, Z) := \sum_{|\gamma| = m} \frac{1}{\gamma!} \int_{\mathbb{R}^N} |u|^m\, |Z(u)|\,du. \tag{164}$$
For parametric families $(q,\lambda) \in U$, the expansion is uniform:
$$\sup_{(q,\lambda) \in U} \big\|r_{n,s}^{(q,\lambda)}\big\|_{W^{s,p}} = o\big(n^{-(m-s)}\big), \qquad n \to \infty.$$
Moreover, in the case $s = 0$, this yields an explicit $L^p$ rate
$$\left\|A_n^{(\cdot)} f - f - \sum_{1 \le |\alpha| \le m-1} \frac{M_\alpha^{(\cdot)}}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f\right\|_{L^p} \le \frac{C(N, m, p, Z)\, \|f\|_{W^{m,p}}}{n^m}.$$
Proof. 
Expand $f(k/n)$ using the multivariate Taylor formula of order $m$ with integral remainder:
$$f\!\left(\frac{k}{n}\right) = \sum_{|\alpha| \le m-1} \frac{\partial^{\alpha} f(x)}{\alpha!} \left(\frac{k}{n} - x\right)^{\alpha} + R_m^{(\cdot)}(x, k),$$
with
$$R_m^{(\cdot)}(x, k) = \sum_{|\gamma| = m} \frac{m}{\gamma!} \int_0^1 (1 - t)^{m-1}\, \partial^{\gamma} f\!\left(x + t\Big(\frac{k}{n} - x\Big)\right) \left(\frac{k}{n} - x\right)^{\gamma} dt.$$
Applying $A_n^{(\cdot)}$ gives (163) with
$$r_{n,s}^{(\cdot)}(x) := \sum_{k \in \mathbb{Z}^N} R_m^{(\cdot)}(x, k)\, Z^{(\cdot)}(nx - k).$$
For $|\beta| \le s$, differentiate under the sum:
$$\partial^{\beta} r_{n,s}^{(\cdot)}(x) = \sum_{k \in \mathbb{Z}^N} R_m^{(\cdot)}(x, k)\, n^{|\beta|}\, \big(\partial^{\beta} Z^{(\cdot)}\big)(nx - k).$$
By Young’s inequality and the uniform boundedness of the kernel moments, this yields (164). For parametric families, dominated convergence and the uniform bounds in (161) imply uniformity over $U$. □

10.3. Remarks

  • This theorem simultaneously captures the classical Voronovskaya expansion, quantitative L p rates, Sobolev estimates, and uniform stability under parametric families.
  • Constants are fully explicit, depending only on kernel moments and derivatives.
  • The framework is directly applicable to numerical analysis, multivariate approximation, and theoretical studies of Sobolev-space operators.
We develop a rigorous framework for adaptive Voronovskaya-type expansions for multivariate neural network operators with dynamically deformed hyperbolic tangent activations. This extends classical asymptotic expansions to non-stationary kernels, where the deformation parameters adapt with the approximation scale. Our results establish accelerated convergence rates in Sobolev spaces $W^{m,p}(\mathbb{R}^N)$ while ensuring uniform control of derivatives up to order $s \le m$. Precise remainder estimates are provided, demonstrating the interplay between kernel deformation and Sobolev regularity, and enabling applications in high-dimensional function approximation and adaptive deep learning architectures.

11. Setup and Hypotheses for Sobolev-Santos Uniform Adaptive Convergence Theorem

Definition 4 
(Adaptive Quasi-Interpolation Operator). Let $f \in W^{m,p}(\mathbb{R}^N)$. Define the adaptive operator
$$A_n^{(q_n, \lambda_n)}(f, x) := \sum_{k \in \mathbb{Z}^N} f\!\left(\frac{k}{n}\right) Z_{q_n, \lambda_n}(nx - k),$$
where the multivariate adaptive kernel is
$$Z_{q_n, \lambda_n}(x) := \prod_{i=1}^{N} \Phi_{q_n, \lambda_n}(x_i), \qquad \Phi_{q_n, \lambda_n}(x_i) := \frac{M_{q_n, \lambda_n}(x_i) + M_{1/q_n, \lambda_n}(x_i)}{2}.$$
The parameters $q_n > 0$ and $\lambda_n > 0$ are adaptive, satisfying
$$q_n \to 1, \qquad \lambda_n \to \infty, \qquad \lambda_n = O(n^{\gamma}), \quad 0 < \gamma < 1.$$

Hypotheses.
1. Controlled Deformation: There exist constants $0 < C_1 \le C_2 < \infty$ such that $C_1 \le q_n \le C_2$ for all $n \in \mathbb{N}$.
2. Uniform Exponential Decay: For all multi-indices $\alpha$ with $|\alpha| \le m$, there exists $T_\alpha > 0$ such that
$$\sum_{k \in \mathbb{Z}^N} |k|^{|\alpha|}\, Z_{q_n, \lambda_n}(k) \le T_\alpha, \qquad \forall n.$$
3. Uniformly Bounded Moments: For all $|\alpha| \le m$,
$$\sup_{n} \left|\int_{\mathbb{R}^N} u^{\alpha}\, Z_{q_n, \lambda_n}(u)\,du\right| < \infty.$$
A one-dimensional numerical sketch of the adaptive operator is given below.
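As announced, here is a one-dimensional sketch of the adaptive operator of Definition 4, reusing Phi from the earlier code. The schedule $q_n = 1 + 1/n$, $\lambda_n = n^{\gamma}$ is a hypothetical choice of ours that satisfies $q_n \to 1$ and $\lambda_n = O(n^{\gamma})$, made purely for illustration:

```python
import numpy as np

def A_n_adaptive(f, x, n, gamma=0.5, K=50):
    """Adaptive quasi-interpolation with scale-dependent kernel parameters.

    Hypothetical schedule: q_n = 1 + 1/n -> 1 and lambda_n = n**gamma."""
    q_n, lam_n = 1.0 + 1.0 / n, float(n) ** gamma
    k = np.arange(int(n * x) - K, int(n * x) + K + 1)
    return np.sum(f(k / n) * Phi(n * x - k, q_n, lam_n))

for n in [10, 100, 1000]:
    print(n, abs(A_n_adaptive(np.sin, 0.4, n) - np.sin(0.4)))
```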
Theorem 19 
(Sobolev-Santos Uniform Adaptive Convergence). Let $f \in W^{m,p}(\mathbb{R}^N)$ with $1 \le p \le \infty$ and $0 \le s \le m$. Assume the adaptive quasi-interpolation operator $A_n^{(q_n, \lambda_n)}$ satisfies the hypotheses of controlled deformation, uniform exponential decay, and bounded moments. Then, for sufficiently large $n$, the following adaptive Voronovskaya expansion holds:
$$A_n^{(q_n, \lambda_n)}(f, x) - f(x) = \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha^{(n)}}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) + r_n^{(s)}(x),$$
where the multivariate moments are
$$M_\alpha^{(n)} := \int_{\mathbb{R}^N} u^{\alpha}\, Z_{q_n, \lambda_n}(u)\,du,$$
and the remainder term $r_n^{(s)}$ satisfies the explicit estimate
$$\big\|r_n^{(s)}\big\|_{W^{s,p}(\mathbb{R}^N)} \le \frac{C(N, m, p, f)\, \Phi(q_n, \lambda_n)}{n^{m-s+\gamma}},$$
with $\Phi(q_n, \lambda_n)$ a smooth, bounded function quantifying the deviation from stationarity:
$$\Phi(q_n, \lambda_n) := \max_{|\alpha| = m-s+1} \int_{\mathbb{R}^N} |u|^{|\alpha|}\, \big|Z_{q_n, \lambda_n}(u) - Z_{1,\infty}(u)\big|\,du = O(1), \qquad n \to \infty,$$
where $Z_{1,\infty}$ denotes the stationary limit kernel. Moreover, the expansion is uniform in Sobolev norm up to order $s$, i.e.,
$$\sup_{|\beta| \le s} \left\|\partial^{\beta}\!\left[A_n^{(q_n, \lambda_n)}(f) - f - \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha^{(n)}}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f\right]\right\|_{L^p} \le \frac{C(N, m, p, f)}{n^{m-s+\gamma}}.$$
Remarks:
1. The factor $\Phi(q_n, \lambda_n)$ explicitly captures the influence of adaptive kernel deformation on the remainder, providing a quantitative measure of non-stationarity.
2. If $q_n \equiv 1$ and $\lambda_n \to \infty$, the theorem recovers the classical stationary Voronovskaya expansion.
3. The theorem can be extended to fractional Sobolev spaces $W^{m+\sigma, p}$ with $0 < \sigma < 1$, allowing finer control for functions with limited smoothness.
4. Uniformity in $s$ ensures stability of derivatives up to order $s$, which is crucial for high-dimensional deep learning applications where gradients are propagated through multiple layers.
Proof. Let $f \in W^{m,p}(\mathbb{R}^N)$ and $0 \le s \le m$. The proof proceeds in several steps.
For any $x, y \in \mathbb{R}^N$, by Taylor’s theorem with integral remainder, we have
$$f(y) = \sum_{|\alpha| \le m-s} \frac{\partial^{\alpha} f(x)}{\alpha!} (y - x)^{\alpha} + R_{m-s}(x, y), \tag{171}$$
where the remainder is explicitly
$$R_{m-s}(x, y) = \sum_{|\alpha| = m-s+1} \frac{m-s+1}{\alpha!} \int_0^1 (1 - t)^{m-s}\, \partial^{\alpha} f\big(x + t(y - x)\big)\,(y - x)^{\alpha}\,dt.$$
Substituting $y = k/n$ in (171) yields
$$A_n^{(q_n, \lambda_n)}(f, x) = \sum_{k \in \mathbb{Z}^N} \left[\sum_{|\alpha| \le m-s} \frac{\partial^{\alpha} f(x)}{\alpha!} \left(\frac{k}{n} - x\right)^{\alpha} + R_{m-s}\big(x, k/n\big)\right] Z_{q_n, \lambda_n}(nx - k). \tag{174}$$
By definition of the multivariate moments
$$M_\alpha^{(n)} := \int_{\mathbb{R}^N} u^{\alpha}\, Z_{q_n, \lambda_n}(u)\,du,$$
and using the discrete-to-continuum approximation via the kernel’s exponential decay, we have
$$\sum_{k \in \mathbb{Z}^N} \left(\frac{k}{n} - x\right)^{\alpha} Z_{q_n, \lambda_n}(nx - k) = \frac{M_\alpha^{(n)}}{n^{|\alpha|}} + o\big(n^{-|\alpha|}\big).$$
Hence, the main contribution in (174) is
$$\sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha^{(n)}}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x).$$
Define the remainder operator
$$r_n^{(s)}(x) := \sum_{k \in \mathbb{Z}^N} R_{m-s}\big(x, k/n\big)\, Z_{q_n, \lambda_n}(nx - k).$$
Applying Minkowski’s inequality and using the uniform exponential decay and bounded moments of $Z_{q_n, \lambda_n}$, we obtain
$$\big\|r_n^{(s)}\big\|_{L^p} \le \sum_{|\alpha| = m-s+1} \frac{m-s+1}{\alpha!} \int_0^1 (1 - t)^{m-s} \left\|\sum_{k \in \mathbb{Z}^N} \Big|\partial^{\alpha} f\big(x + t(k/n - x)\big)\Big|\, \Big|\frac{k}{n} - x\Big|^{|\alpha|}\, Z_{q_n, \lambda_n}(nx - k)\right\|_{L^p} dt \le \frac{C(N, m, p, f)}{n^{m-s+\gamma}},$$
where $C(N, m, p, f)$ is independent of $n$ and $\gamma \in (0, 1)$ is the decay exponent from the kernel’s scaling.
The smoothness of $Z_{q_n, \lambda_n}$ guarantees that differentiation commutes with $A_n^{(q_n, \lambda_n)}$ up to order $s$, so for all multi-indices $\beta$ with $|\beta| \le s$,
$$\big\|\partial^{\beta} r_n^{(s)}\big\|_{L^p} \le \frac{C(N, m, p, f)}{n^{m-s+\gamma}}.$$
Consequently,
$$\left\|A_n^{(q_n, \lambda_n)}(f) - f - \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha^{(n)}}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f\right\|_{W^{s,p}} = \big\|r_n^{(s)}\big\|_{W^{s,p}} \le \frac{C}{n^{m-s+\gamma}}.$$
This completes the proof. □

12. Results

This study establishes rigorous theoretical results for symmetrized hyperbolic tangent neural network operators, with a focus on the novel Sobolev-Santos Uniform Convergence Theorem. The main contributions are as follows:
1. Voronovskaya-Type Expansions for Basic Operators: For functions $f \in C^m(\mathbb{R}^N)$, the approximation error of the basic operator $A_n$ admits an asymptotic expansion:
$$A_n(f, x) - f(x) = \sum_{j=1}^{m} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\alpha!}\, A_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + o\!\left(\frac{1}{n^{\beta(m-\varepsilon)}}\right).$$
This expansion provides explicit convergence rates, dependent on the smoothness $m$ of $f$ and the grid parameter $\beta$.
2. Refined Expansions for Kantorovich and Quadrature Operators: For $f \in C^{m+2}(\mathbb{R}^N)$, the Kantorovich operator $K_n$ satisfies a refined expansion:
$$K_n(f, x) - f(x) = \sum_{j=1}^{m+1} \sum_{|\alpha| = j} \frac{f_\alpha(x)}{\alpha!}\, K_n\!\left(\prod_{i=1}^{N} (\cdot_i - x_i)^{\alpha_i}\right)(x) + R_n(x),$$
where the remainder $R_n(x)$ is bounded by $O\big(n^{-(m+1)\beta}\big)$, demonstrating higher-order accuracy.
3. Sobolev Space Estimates: The approximation error in the Sobolev space $W^{s,p}(\mathbb{R}^N)$ is bounded by:
$$\|A_n f - f\|_{W^{s,p}} = O\big(n^{-(m-s)}\big).$$
This result provides quantitative estimates for the convergence rate, with explicit constants derived from the moments of the kernel function.
4. Sobolev-Santos Uniform Convergence Theorem: The Sobolev-Santos Theorem (Theorem 19) establishes that for adaptive quasi-interpolation operators $A_n^{(q_n, \lambda_n)}$, the following expansion holds:
$$A_n^{(q_n, \lambda_n)}(f, x) - f(x) = \sum_{1 \le |\alpha| \le m-s} \frac{M_\alpha^{(n)}}{\alpha!\, n^{|\alpha|}}\, \partial^{\alpha} f(x) + r_n^{(s)}(x),$$
where the remainder $r_n^{(s)}$ satisfies the explicit estimate:
$$\big\|r_n^{(s)}\big\|_{W^{s,p}} \le \frac{C(N, m, p, f)\, \Phi(q_n, \lambda_n)}{n^{m-s+\gamma}}.$$
The function $\Phi(q_n, \lambda_n)$ quantifies the deviation from stationarity, ensuring uniform stability under parametric variations of the activation function. This theorem is pivotal for applications requiring adaptive kernel deformation, such as high-dimensional deep learning architectures.
5. Uniform Stability Under Parametric Variations: The expansions remain uniformly valid even when the activation function parameters $(q, \lambda)$ vary, ensuring robustness in practical applications. This stability is critical for adaptive neural network architectures, where parameters may dynamically adjust during training or optimization.

13. Conclusions

This work advances the theory of neural network approximation by introducing symmetrized hyperbolic tangent-based operators and deriving Voronovskaya-type asymptotic expansions for their multivariate counterparts. The Sobolev-Santos Uniform Convergence Theorem is a cornerstone of this study, providing a rigorous framework for adaptive quasi-interpolation operators with dynamically deformed activation functions. This theorem ensures that the operators can approximate smooth functions and their derivatives with high accuracy, even as the deformation parameters $(q_n, \lambda_n)$ evolve, while maintaining uniform control over the convergence rates in Sobolev spaces $W^{s,p}(\mathbb{R}^N)$.
The explicit constants and uniform bounds derived in this study offer a solid foundation for both theoretical and applied research in neural network-based function approximation. The results highlight the superior performance of these operators in high-dimensional approximation problems, with direct implications for artificial intelligence, numerical analysis, and data-driven modeling. The uniform stability under parametric variations further enhances their applicability in adaptive deep learning architectures, where robustness and flexibility are essential.
Future research directions include exploring adaptive grid strategies, extending the framework to fractional Sobolev spaces, and generalizing the results to non-Euclidean domains. These advancements could further expand the applicability of symmetrized hyperbolic tangent neural networks in modern computational frameworks, particularly in scenarios requiring high-dimensional function approximation and adaptive learning.

Acknowledgments

Santos gratefully acknowledges the support of the PPGMC Program for the Postdoctoral Scholarship PROBOL/UESC nr. 218/2025. Sales would like to express his gratitude to CNPq for the financial support under grant 30881/2025-0.

References

  1. Anastassiou, G. A. (1997). Rate of convergence of some neural network operators to the unit-univariate case. Journal of Mathematical Analysis and Applications, 212(1), 237-262. [CrossRef]
  2. Anastassiou, G. (2000). Quantitative approximations. Chapman and Hall/CRC. [CrossRef]
  3. Anastassiou, G. A. (2016). Intelligent Systems II: Complete Approximation by Neural Network Operators (Vol. 608). Cham: Springer International Publishing. [CrossRef]
  4. Anastassiou, G. A. (2023). Parametrized, deformed and general neural networks. Berlin/Heidelberg, Germany: Springer. [CrossRef]
  5. Chen, Z., & Cao, F. (2009). The approximation operators with sigmoidal functions. Computers and Mathematics with Applications, 58, 758-765. [CrossRef]
  6. Haykin, S. (1994). Neural networks: A comprehensive foundation. Prentice Hall PTR.
  7. McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics, 5(4), 115-133. [CrossRef]
  8. Yu, D., & Cao, F. (2025). Construction and approximation rate for feedforward neural network operators with sigmoidal functions. Journal of Computational and Applied Mathematics, 453, 116150. [CrossRef]
  9. Yoo, J., Kim, J., Gim, M., & Lee, H. (2024). Error estimates of physics-informed neural networks for initial value problems. Journal of the Korean Society for Industrial and Applied Mathematics, 28(1), 33-58.