Computer Science and Mathematics

Article
Computer Science and Mathematics
Security Systems

Prasert Teppap

,

Wirot Ponglangka

,

Panudech Tipauksorn

,

Prasert Luekhong

Abstract: In the contemporary cybersecurity landscape, the detection of code-mixed malicious scripts embedded within high-trust domains (e.g., governmental and academic websites) constitutes a critical defensive challenge. Traditional Transformer-based models, while effective in natural language processing, often exhibit "Structural Bias": they erroneously interpret the benign complexity of legacy HTML structures as malicious obfuscation, resulting in elevated false positive rates. To address this limitation, this study proposes an XAI-Driven Hybrid Architecture that synergizes context-aware semantic embeddings from WangChanBERTa with outlier-robust structural features. Validated on a rigorously curated high-fidelity corpus of 5,000 samples, our model achieves a state-of-the-art F1-Score of 0.9908. Beyond standard metrics, Explainable AI (XAI) diagnosis reveals that the architecture functions as a "Dual-Validation" mechanism: structural features effectively veto semantic hallucinations triggered by benign complexity, acting as a crucial safety net. This integration yields a 50% reduction in the False Positive Rate (FPR), from 0.024 in baseline scenarios to 0.012, confirming the operational significance of Selective Integration. The method effectively reduces alert fatigue, providing a scalable solution for SOC analysts tasked with protecting critical infrastructure from advanced code-mixed threats.
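
The abstract does not include an implementation; as a loose sketch of the late-fusion idea (all names, features, and the classifier choice here are our assumptions, not the authors'), frozen semantic embeddings and outlier-robust structural features could be concatenated as follows:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import RobustScaler  # outlier-robust scaling, per the abstract's framing

# Hypothetical inputs, one row per script sample:
# sem_emb  : (n, 768) sentence embeddings, e.g. mean-pooled WangChanBERTa outputs
# struct_f : (n, k) structural features (tag depth, identifier entropy, eval count, ...)
# y        : 1 = malicious, 0 = benign
rng = np.random.default_rng(0)
sem_emb = rng.normal(size=(1000, 768))
struct_f = rng.normal(size=(1000, 6))
y = rng.integers(0, 2, size=1000)

# RobustScaler uses median/IQR, so legacy-HTML outliers do not dominate the feature scale.
X = np.hstack([sem_emb, RobustScaler().fit_transform(struct_f)])

clf = LogisticRegression(max_iter=1000).fit(X, y)
fpr = ((clf.predict(X) == 1) & (y == 0)).sum() / max((y == 0).sum(), 1)
print("train FPR:", fpr)
```

The structural block can then "veto" a semantic false alarm simply because both feature groups must jointly push the decision over the threshold.
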
Article
Computer Science and Mathematics
Information Systems

Amir Hameed Mir

Abstract:

We derive an operationally defined lower bound on the physical time \( \Delta t \) required to execute any information-processing task, based on the total entropy produced \( \Delta\Sigma \). The central result, \( \Delta t \geq \tau_{\Sigma} \Delta\Sigma \), introduces the Process-Dependent Dissipation Timescale \( \tau_{\Sigma} \equiv 1/\langle \dot{\Sigma} \rangle_{\text{max}} \), where \( \langle \dot{\Sigma} \rangle_{\text{max}} \) is the maximum achievable entropy production rate for a given physical platform. We derive \( \tau_{\Sigma} \) from microscopic system-bath models and validate our framework against experimental data from superconducting qubit platforms. Crucially, we obtain a Measurement Entropic Time Bound: \( \Delta t_{\text{meas}} \geq \tau_{\Sigma} k_{\text{B}}[H(P) - S(\rho)] \), relating measurement time to information gained. Comparison with IBM and Google quantum processors shows agreement within experimental uncertainties. This framework provides a thermodynamic interpretation of quantum advantage as reduced entropy production per logical inference and suggests concrete optimization strategies for quantum hardware design.
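
As a quick numerical illustration of the measurement bound, here is a toy evaluation of \( \Delta t_{\text{meas}} \geq \tau_{\Sigma} k_{\text{B}}[H(P) - S(\rho)] \); the state, the measurement basis, and the value of \( \tau_{\Sigma} \) are invented for the example, not taken from the paper:

```python
import numpy as np

k_B = 1.380649e-23  # J/K

def shannon(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()      # nats

def von_neumann(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return -(w * np.log(w)).sum()      # nats

# Toy qubit: a slightly mixed state measured in the computational basis.
rho = np.array([[0.9, 0.1], [0.1, 0.1]])
P = np.real(np.diag(rho))              # Born-rule outcome distribution

tau_sigma = 1e-7                       # assumed platform timescale, NOT from the paper
dt_min = tau_sigma * k_B * (shannon(P) - von_neumann(rho))
print(f"measurement time bound: {dt_min:.3e} s")
```

Since \( H(P) \geq S(\rho) \) for any projective readout of \( \rho \), the bracket is non-negative and the bound is non-trivial exactly when the measurement gains information.
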

Article
Computer Science and Mathematics
Applied Mathematics

Silvia Cristina Dedu

,

Florentin Șerban

Abstract: Traditional mean–variance portfolio optimization is ill-suited to cryptocurrency markets, where extreme volatility, fat-tailed distributions, and unstable correlations undermine variance as a risk measure. To overcome these limitations, this paper develops a unified entropy-based framework for portfolio diversification grounded in the Maximum Entropy Principle (MaxEnt). Within this formulation, Shannon entropy, Tsallis entropy, and Weighted Shannon Entropy (WSE) emerge as complementary specifications derived analytically via the method of Lagrange multipliers, ensuring mathematical tractability and interpretability. Empirical validation is conducted on a portfolio of four leading cryptocurrencies—Bitcoin (BTC), Ethereum (ETH), Solana (SOL), and Binance Coin (BNB)—using weekly return data from January to March 2025. Results reveal that Shannon entropy converges to near-uniform diversification, Tsallis entropy (q = 2) penalizes concentration more strongly and enhances robustness against tail risk, while WSE integrates asset-specific informational priorities, aligning allocations with investor preferences or market characteristics. Comparative analysis confirms that all three models yield allocations more resilient and structurally balanced than variance-driven portfolios, mitigating estimation risk and concentration effects. Beyond its immediate empirical scope, this work also opens several avenues for future research. First, entropy-based portfolio construction can be extended to dynamic, multi-period settings with transaction costs and liquidity frictions, which are particularly relevant in cryptocurrency markets. Second, the framework may be generalized to incorporate alternative entropy measures such as Rényi or Kaniadakis entropy, enabling more refined sensitivity to tail risks and nonlinear dependencies. The proposed framework thus provides a flexible foundation for dynamic, multi-period portfolio optimization under uncertainty.
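
A minimal sketch of the MaxEnt step under the Shannon specification, with illustrative return figures (the paper derives closed forms via Lagrange multipliers; here we solve the same constrained program numerically):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical weekly mean returns for BTC, ETH, SOL, BNB (illustrative numbers only).
mu = np.array([0.012, 0.010, 0.015, 0.008])
target = 0.011                      # required expected portfolio return

def neg_shannon(w):
    w = np.clip(w, 1e-12, 1.0)
    return (w * np.log(w)).sum()    # minimizing -H(w) maximizes Shannon entropy

cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0},       # fully invested
        {"type": "eq", "fun": lambda w: w @ mu - target}]     # return constraint
res = minimize(neg_shannon, x0=np.full(4, 0.25), bounds=[(0, 1)] * 4,
               constraints=cons, method="SLSQP")
print("MaxEnt weights:", res.x.round(4))
```

Swapping the objective for the Tsallis q = 2 form, \( 1 - \sum_i w_i^2 \), reproduces the stronger concentration penalty the abstract describes; a WSE variant would weight each \( w_i \log w_i \) term by an asset-specific priority.
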
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Sebastian Raubitzek

,

Sebastian Schrittwieser

,

Georg Goldenits

,

Alexander Schatten

,

Kevin Mallinger

Abstract: We present a supervised method to estimate two local descriptors of time-series dynamics, the mean-reversion rate θ and a heavy tail estimate α, from short windows of data. These parameters summarize recovery behavior and tail heaviness and are useful for interpreting stochastic signals in sensing applications. The method is trained on synthetic, dimensionless Ornstein–Uhlenbeck processes with α-stable noise, ensuring robustness for non-Gaussian and heavy-tailed inputs. Gradient-boosted tree models (CatBoost) map window-level statistical features to discrete (α, θ) categories with high accuracy and predominantly adjacent-class confusion. Using the same trained models, we analyze daily financial returns, daily sunspot numbers, and NASA POWER climate fields for Austria. The method detects changes in local dynamics, including shifts in financial tail structure after 2010, weaker and more irregular solar cycles after 2005, and a redistribution in clear-sky shortwave irradiance around 2000. Because it relies only on short windows and requires no domain-specific tuning, the framework provides a compact diagnostic tool for signal processing, supporting characterization of local variability, detection of regime changes, and decision making in settings where long-term stationarity is not guaranteed.
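
A hedged sketch of the synthetic training setup as we read it: simulate a dimensionless OU process driven by symmetric α-stable noise, then extract a few plausible window-level statistics (the actual feature set is not specified in the abstract):

```python
import numpy as np
from scipy.stats import levy_stable, kurtosis

def simulate_ou_stable(theta=1.5, alpha=1.7, n=500, dt=0.01, seed=0):
    """Euler-Maruyama for dX = -theta * X dt + dL_alpha (symmetric alpha-stable noise);
    the increment scale dt**(1/alpha) is the stable-law analogue of sqrt(dt)."""
    rng = np.random.default_rng(seed)
    noise = levy_stable.rvs(alpha, 0.0, size=n, scale=dt ** (1 / alpha), random_state=rng)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = x[t - 1] - theta * x[t - 1] * dt + noise[t]
    return x

def window_features(x):
    """A few of the kinds of window-level statistics such a model might use (our guesses)."""
    dx = np.diff(x)
    return {
        "lag1_autocorr": np.corrcoef(x[:-1], x[1:])[0, 1],  # proxy for exp(-theta*dt)
        "increment_kurtosis": kurtosis(dx),                 # proxy for tail heaviness
        "robust_scale": np.subtract(*np.percentile(dx, [75, 25])),
    }

print(window_features(simulate_ou_stable()))
```

A gradient-boosted classifier such as CatBoost would then map vectors like these to the discrete (α, θ) bins described in the abstract.
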
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Elias Koorambas

Abstract: Following Livadiotis G. and McComas D. J. (2023) [1], we propose a new type of DNA frameshift mutation that occurs spontaneously due to information exchange between a DNA sequence of length (n) bases and a mutation sequence of length (m) bases, and respects the kappa-addition symbol ⊕κ. We call these proposed mutations Kappa-Frameshift Background (KFB) mutations. We find that entropy defects originate in the interdependence of the two information-length systems induced by the proposed KFB mutation, that is, in the interconnectedness whereby systems with a significant number of constituents (information-length bases) depend on, or are connected with, each other. We also quantify the correlation between the DNA information lengths (n) and (m) due to information exchange. In the presence of entropy defects, Landauer's bound and the minimal metabolic rate for a biological system are modified. We observe that the different n and κ scales are manifested in the double evolutionary emergence of the proposed biological system through subsystem correlations. For specific values of the kappa parameter we can expect deterministic laws associated with a single biological polymer in the short term, before the polymer explores over time all the possible ways it can exist.
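
For orientation, the κ-addition of Livadiotis and McComas composes subsystem entropies with an interdependence term, the entropy defect; up to normalization conventions, which vary across their papers, it reads:

```latex
S_{A \oplus_{\kappa} B} \;=\; S_A + S_B \;-\; \frac{1}{\kappa}\, S_A S_B ,
\qquad
S_{\mathrm{defect}} \;=\; \frac{1}{\kappa}\, S_A S_B \;\longrightarrow\; 0
\quad (\kappa \to \infty).
```

The defect term is what couples the length-n DNA sequence to the length-m mutation sequence in the proposed KFB picture, with the Boltzmann-Gibbs (independent-subsystem) limit recovered as κ grows without bound.
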
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Ning Lyu

,

Feng Chen

,

Chong Zhang

,

Chihui Shao

,

Junjie Jiang

Abstract: This paper addresses the challenge of efficiently identifying and classifying resource contention behaviors in cloud computing environments by proposing a deep neural network method based on multi-scale temporal modeling and attention-based feature enhancement. The method takes time-series resource monitoring data as input and first applies a Multi-Scale Dilated Convolution (MSDC) module to extract features from resource usage patterns at different temporal resolutions, allowing the model to capture the multi-stage dynamic evolution of resource contention behaviors. An Attention-based Feature Weighting (AFW) module is then introduced to learn attention weights along both the temporal and feature dimensions, enabling the model to emphasize key time segments and core resource metrics through saliency modeling and feature enhancement. The overall architecture supports end-to-end modeling and can automatically learn temporal patterns of resource contention without relying on manual feature engineering. To evaluate the effectiveness of the proposed method, this study constructs a range of contention scenarios based on real-world cloud platform data and assesses the model under different structural configurations and task conditions. The results show that the proposed model outperforms existing mainstream temporal classification models across multiple metrics, including accuracy, recall, F1-score, and AUC, demonstrating strong feature representation and classification capabilities, especially in handling high-dimensional, multi-source, and dynamic data. The proposed approach offers practical support for resource contention detection, scheduling optimization, and operational management in cloud platforms.
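
A compact PyTorch sketch of how the two modules might look; this is our interpretation of the abstract, not the authors' code, and all widths are placeholders:

```python
import torch
import torch.nn as nn

class MSDCBlock(nn.Module):
    """Multi-scale dilated 1-D convolutions over a (batch, channels, time) series."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d) for d in dilations
        )

    def forward(self, x):
        # Each branch sees a different temporal resolution; concatenate along channels.
        return torch.relu(torch.cat([b(x) for b in self.branches], dim=1))

class AFW(nn.Module):
    """Attention-based feature weighting along both time and feature dimensions."""
    def __init__(self, ch, t_len):
        super().__init__()
        self.feat_gate = nn.Sequential(nn.Linear(ch, ch), nn.Sigmoid())
        self.time_gate = nn.Sequential(nn.Linear(t_len, t_len), nn.Sigmoid())

    def forward(self, x):                      # x: (B, C, T)
        fw = self.feat_gate(x.mean(dim=2))     # (B, C) channel saliency
        tw = self.time_gate(x.mean(dim=1))     # (B, T) temporal saliency
        return x * fw.unsqueeze(2) * tw.unsqueeze(1)

x = torch.randn(8, 4, 64)                      # 8 windows, 4 resource metrics, 64 steps
h = MSDCBlock(4, 16)(x)                        # (8, 48, 64) after 3 branches of 16 channels
print(AFW(48, 64)(h).shape)
```
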
Article
Computer Science and Mathematics
Security Systems

Devharsh Trivedi

,

Aymen Boudguiga

,

Nesrine Kaaniche

,

Nikos Triandopoulos

Abstract: Federated Learning (FL) and Split Learning (SL) maintain client data privacy during collaborative training by keeping raw data on distributed clients and only sharing model updates (FL) or intermediate results (SL) with the centralized server. However, this level of privacy is insufficient, as both FL and SL remain vulnerable to security risks like poisoning and various inference attacks. To address these flaws, we introduce SplitML, a secure and privacy-preserving framework for Federated Split Learning (FSL). SplitML generalizes and formalizes FSL using IND-CPA-D secure Fully Homomorphic Encryption (FHE) combined with Differential Privacy (DP) to actively reduce data leakage and inference attacks. This framework allows clients to use different overall model architectures, collaboratively training only the top (common) layers while keeping their bottom layers private. For training, clients use multi-key CKKS FHE to aggregate weights. For collaborative inference, clients can share gradients encrypted with single-key CKKS FHE to reach a consensus based on Total Labels (TL) or Total Predictions (TP). Empirical results show that SplitML significantly improves protection against Membership Inference (MI) attacks, reduces training time, enhances inference accuracy through consensus, and incurs minimal federation overhead.
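
TenSEAL supports only single-key CKKS, so the following is a simplified single-key illustration of the homomorphic weight-aggregation step; SplitML itself uses multi-key CKKS plus DP noise, both of which this sketch omits:

```python
import tenseal as ts
import numpy as np

# CKKS context: approximate arithmetic over encrypted real vectors.
ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

clients = [np.random.randn(16) for _ in range(3)]       # per-client top-layer weights
encrypted = [ts.ckks_vector(ctx, w.tolist()) for w in clients]

agg = encrypted[0]
for e in encrypted[1:]:
    agg = agg + e                                       # ciphertext-ciphertext addition
avg = agg * (1.0 / len(clients))                        # plaintext scalar multiplication

# The server never sees plaintext weights; only the key holder(s) can decrypt.
print(np.allclose(avg.decrypt(), np.mean(clients, axis=0), atol=1e-3))
```
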
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Parul Tiwari

,

Malavika Smitha

,

Hammed Olawale Fatoyinbo

Abstract: Highly pathogenic avian influenza (HPAI) has expanded its host range with recent detections in dairy cattle, raising critical concerns regarding within-herd persistence and cross-species spillover. This study develops a stochastic \( SEI_sI_aR\text{-}B \) compartmental model to analyse HPAI transmission, explicitly accounting for environmental pathogen reservoirs and noise intensities through Wiener processes. The positivity and boundedness of solutions are established, and the disease-free and endemic equilibria are analytically derived. The basic reproduction number is determined using the next-generation matrix method. Numerical simulations confirm that the model dynamics are consistent with theoretical analysis and illustrate how stochastic fluctuations significantly influence disease persistence. Furthermore, sensitivity analysis using Latin Hypercube Sampling (LHS) and Partial Rank Correlation Coefficients (PRCC) identifies the transmission rate from asymptomatic infectious cattle (\( \beta_a \)) as the primary driver of transmission. The model effectively captures the dynamics of environmental variability affecting HPAI spread, suggesting that effective control strategies must prioritise the early detection and isolation of asymptomatic carriers alongside environmental management.
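
A deliberately simplified Euler-Maruyama sketch of such a model, with multiplicative Wiener noise on the force of infection and an environmental reservoir B; all rates below are illustrative placeholders, not the paper's calibrated values:

```python
import numpy as np

beta_s, beta_a, beta_b = 0.20, 0.35, 0.05   # transmission: symptomatic, asymptomatic, environment
sigma_n = 0.08                               # Wiener noise intensity
kappa, gamma, shed, decay = 0.3, 0.1, 0.5, 0.2
p_sym = 0.4                                  # fraction of infections that become symptomatic
dt, T = 0.05, 200.0
rng = np.random.default_rng(1)

S, E, Is, Ia, R, B = 990.0, 5.0, 3.0, 2.0, 0.0, 0.1
N = S + E + Is + Ia + R
for _ in range(int(T / dt)):
    dW = rng.normal(0.0, np.sqrt(dt))
    force = (beta_s * Is + beta_a * Ia) / N + beta_b * B / (1.0 + B)
    new_inf = max(force * S * dt + sigma_n * force * S * dW, 0.0)  # stochastic incidence
    S -= new_inf
    E += new_inf - kappa * E * dt
    Is += p_sym * kappa * E * dt - gamma * Is * dt
    Ia += (1 - p_sym) * kappa * E * dt - gamma * Ia * dt
    R += gamma * (Is + Ia) * dt
    B += (shed * (Is + Ia) - decay * B) * dt                       # environmental reservoir

print(f"final: S={S:.0f}, R={R:.0f}, B={B:.2f}")
```

Repeating the loop over many noise realizations is what exposes the persistence-versus-extinction behaviour the abstract attributes to stochastic fluctuations.
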
Article
Computer Science and Mathematics
Computer Vision and Graphics

Wei Chen

,

Jiing Fang

Abstract: Video Frame Interpolation (VFI) is critical for generating smooth slow-motion and increasing video frame rates, yet it faces significant challenges in achieving high fidelity, accurate motion modeling, and robust spatiotemporal consistency, particularly for large displacements and occlusions. This paper introduces TemporalFlowDiffuser (TFD), a novel end-to-end latent space diffusion Transformer designed to overcome these limitations with exceptional efficiency and quality. TFD employs a lightweight Video Autoencoder to compress frames into a low-dimensional latent space. A Spatiotemporal Transformer models complex spatiotemporal dependencies and motion patterns, augmented by auxiliary latent optical flow features. Leveraging Flow Matching as its diffusion scheduler, TFD achieves high-quality frame generation with remarkably few denoising steps, making it highly suitable for real-time applications. Our extensive experiments on a challenging high-motion dataset demonstrate that TFD significantly outperforms state-of-the-art methods like RIFE across metrics such as PSNR, SSIM, and VFID, showcasing superior visual quality, structural similarity, and spatiotemporal consistency. Furthermore, human evaluation confirms TFD's enhanced perceptual realism and temporal smoothness, validating its efficacy in generating visually compelling and coherent video content.
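
The flow-matching objective the scheduler relies on is compact enough to sketch; here a toy MLP stands in for TFD's spatiotemporal Transformer, and the latents are random placeholders rather than autoencoder outputs:

```python
import torch

# Conditional flow matching: learn v_theta(x_t, t) to match the straight-path
# velocity x1 - x0, which enables generation in very few integration steps.
x0 = torch.randn(32, 64)                    # noise latents
x1 = torch.randn(32, 64)                    # target middle-frame latents
t = torch.rand(32, 1)

x_t = (1 - t) * x0 + t * x1                 # linear interpolation path
target_v = x1 - x0                          # its (constant) velocity

v_theta = torch.nn.Sequential(
    torch.nn.Linear(65, 128), torch.nn.SiLU(), torch.nn.Linear(128, 64))
pred_v = v_theta(torch.cat([x_t, t], dim=1))
loss = torch.mean((pred_v - target_v) ** 2)  # flow-matching regression loss
loss.backward()
print(float(loss))
```

At inference, integrating the learned velocity field from t = 0 to t = 1 (e.g., a handful of Euler steps) produces the interpolated latent, which the autoencoder then decodes to a frame.
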
Article
Computer Science and Mathematics
Mathematics

Artyom M. Grigoryan

Abstract:

In this work, we discuss a method of QR-factorization based on transformations called discrete signal-induced heap transformations (DsiHTs). These transformations are generated by given signals and can be composed of elementary rotations. The order in which the data are processed, or the path of the transformation, is an important characteristic, and the correct choice of path can lead to a significant reduction in the number of operations when calculating the factorization of large matrices. Such paths are called fast paths of the DsiHTs, and they define sparse matrices with more zero coefficients than the traditional path, that is, processing the data in the natural order x0, x1, x2, …. For example, in the first stage of the factorization of a 512×512 matrix, the fast paths use a matrix with 257,024 zero coefficients out of a total of 262,144, whereas the calculations in the natural order require a 512×512 matrix with only 130,305 zero coefficients at this stage. The effectiveness of the proposed method is illustrated in comparison with the QR-factorization based on a sequence of Householder reflections (or transformations). Examples with 4×4, 5×5, and 8×8 matrices are described in detail. The QR-factorization of a 256×256 complex matrix is also described and compared with the method of Householder reflections used in the MATLAB programming language.
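
The fast paths themselves are the paper's contribution and are not reproduced here; for reference, a standard QR-factorization by elementary Givens rotations in the natural order looks like this:

```python
import numpy as np

def qr_givens(A):
    """QR by elementary (Givens) rotations, processing data in the natural order
    x0, x1, x2, ...; the DsiHT fast paths reorder these rotations for sparsity."""
    A = A.astype(float)
    m, n = A.shape
    Q, R = np.eye(m), A.copy()
    for j in range(n):
        for i in range(m - 1, j, -1):          # zero the below-diagonal entries of column j
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])    # 2x2 rotation acting on rows i-1, i
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T
    return Q, R

A = np.random.default_rng(0).normal(size=(5, 5))
Q, R = qr_givens(A)
print(np.allclose(Q @ R, A), np.allclose(np.tril(R, -1), 0))
```
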

Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Manish Shukla

Abstract: The rapid emergence of generative and agentic artificial intelligence (AI) has outpaced traditional evaluation practices. While large language models excel on static language benchmarks, real-world deployment demands more than accuracy on curated tasks. Agentic systems use planning, tool invocation, memory, and multi-agent collaboration to perform complex workflows. Enterprise adoption therefore hinges on holistic assessments that include cost, latency, reliability, safety, and multi-agent coordination. This survey provides a comprehensive taxonomy of evaluation dimensions, reviews existing benchmarks for generative and agentic systems, identifies gaps between laboratory tests and production requirements, and proposes future directions for more realistic, multi-dimensional benchmarking.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Pengcheng Zhao

,

Chengcheng Han

,

Kun Han

Abstract: Legal judgment prediction (LJP) increasingly relies on large language models whose full fine-tuning is memory-intensive and susceptible to catastrophic forgetting. We present LawLLM-DS, a two-stage Low-Rank Adaptation (LoRA) framework that first performs legal knowledge pre-tuning with an aggressive learning rate and subsequently refines judgment relations with conservative updates, using dedicated LoRA adapters, 4-bit quantization, and targeted modification of seven Transformer projection matrices to keep only 0.21% of parameters trainable. From a structural perspective, the twenty annotated legal elements form a symmetric label co-occurrence graph that exhibits both cluster-level regularities and asymmetric sparsity patterns, and LawLLM-DS implicitly captures these graph-informed dependencies while remaining compatible with downstream GNN-based representations. Experiments on 5,096 manually annotated divorce cases show that LawLLM-DS lifts macro F1 to 0.8893 and achieves an accuracy of 0.8786, outperforming single-stage LoRA and BERT baselines under the same data regime. Ablation studies further verify the contributions of stage-wise learning rates, adapter placement, and low-rank settings. These findings demonstrate that curriculum-style, parameter-efficient adaptation provides a practical path toward lightweight yet structure-aware LJP systems for judicial decision support.
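
A sketch of the parameter-efficient setup as described, using the Hugging Face peft and bitsandbytes APIs; the base checkpoint, ranks, and learning rates are our assumptions, not the paper's release:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base model.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B",  # assumed checkpoint
                                             quantization_config=bnb)

# LoRA adapters on the seven Transformer projection matrices named in the abstract
# (the exact module names depend on the architecture; these are the common ones).
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # a fraction of a percent of parameters trainable

# Stage 1 (legal knowledge pre-tuning) might use an aggressive lr around 2e-4;
# stage 2 (judgment-relation refinement) would reload the adapter and continue
# with conservative updates, e.g. lr around 2e-5.
```
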
Review
Computer Science and Mathematics
Computational Mathematics

Bouchaib Bahbouhi

Abstract: Goldbach’s strong conjecture, asserting that every even integer greater than two can be expressed as the sum of two prime numbers, remains one of the oldest unresolved problems in mathematics. Despite overwhelming numerical verification and powerful partial results, a complete analytic proof has remained elusive. At the same time, extensive computations have revealed a striking empirical phenomenon known as Goldbach’s comet: the rapidly growing number of Goldbach representations as a function of the even integer E, forming a characteristic comet-like structure when plotted. This review article provides a comprehensive synthesis of classical analytic number theory, modern distributional results on primes, and recent structural insights in order to explain the existence, shape, and persistence of Goldbach’s comet. We introduce and develop a unified framework based on three complementary quantities: the dominance ratio Ω(E), measuring the growth of available prime density relative to local obstructions; the density field λ, encoding the smooth asymptotic behavior of primes; and the obstruction constant Κ, bounding the maximal effect of local gaps and covariance. We show that Ω(E) diverges, reflecting a fundamental scale separation between global prime density and local irregularities, and that λ-weighted obstructions remain bounded while density grows without bound. This framework explains why no gap-based or covariance-based mechanism can suppress Goldbach representations and why the number of representations necessarily increases. We argue that Goldbach’s conjecture is thereby reduced to a single, well-identified uniform realization problem within existing analytic methods. Rather than claiming a final proof, this article aims to clarify the conceptual structure underlying Goldbach’s conjecture, explain Goldbach’s comet as a necessary consequence of prime density dominance, and position the conjecture within a sharply defined analytic frontier. The result is a coherent, literature-grounded explanation of why Goldbach’s conjecture must be true and what precise technical step remains to complete its proof.
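
The comet itself is easy to reproduce: counting the representations r(E) for even E already shows the growth, and the banding by small-prime divisors of E, that the review sets out to explain.

```python
from sympy import isprime

def goldbach_representations(E):
    """Count unordered prime pairs (p, q), p <= q, with p + q = E."""
    return sum(1 for p in range(2, E // 2 + 1) if isprime(p) and isprime(E - p))

# Plotting r(E) against E over a large range of even E yields the comet shape.
for E in range(10, 101, 10):
    print(E, goldbach_representations(E))
```
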
Article
Computer Science and Mathematics
Computer Vision and Graphics

Rajarshi Karmakar

,

Ciaran Eising

,

Rekha Ramachandra

,

Sahil Zaidi

Abstract: We propose SuperSegmentation, a unified, fully-convolutional architecture for semantic keypoint correspondence in dynamic urban scenes. The model extends SuperPoint’s self-supervised interest point detector–descriptor backbone with a DeepLab-style Atrous Spatial Pyramid Pooling head for semantic segmentation and a lightweight sub-pixel regression branch. Using Cityscapes camera intrinsics and extrinsics to construct geometry-aware homographies, SuperSegmentation jointly predicts keypoints, descriptors, semantic labels (e.g., static vs. dynamic classes), and sub-pixel offsets from a shared encoder. Our experiments are conducted on Cityscapes, where a backbone pretrained on MS-COCO with strong random homographies over approximately planar images is fine-tuned with deliberately attenuated synthetic warps, as we found that reusing the aggressive COCO-style homographies on Cityscapes produced unrealistically large distortions. Within this controlled setting, we observe that adding semantic masking and sub-pixel refinement consistently improves stability on static structures and suppresses keypoints on dynamic or ambiguous regions.
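
A minimal reading of the ASPP segmentation head grafted onto the shared encoder; shapes, channel widths, and dilation rates are assumptions, not the authors' configuration:

```python
import torch
import torch.nn as nn

class ASPPHead(nn.Module):
    """DeepLab-style Atrous Spatial Pyramid Pooling over a shared encoder map."""
    def __init__(self, in_ch, n_classes, rates=(1, 6, 12, 18)):
        super().__init__()
        # Rate-1 branch is a 1x1 conv; larger rates use dilated 3x3 convs at the same resolution.
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, 64, kernel_size=3 if r > 1 else 1,
                      padding=r if r > 1 else 0, dilation=r) for r in rates
        )
        self.classifier = nn.Conv2d(64 * len(rates), n_classes, kernel_size=1)

    def forward(self, feat):
        return self.classifier(torch.cat([torch.relu(b(feat)) for b in self.branches], dim=1))

feat = torch.randn(1, 128, 30, 40)           # SuperPoint-style shared encoder features
logits = ASPPHead(128, n_classes=2)(feat)    # e.g., static vs. dynamic
print(logits.shape)                          # torch.Size([1, 2, 30, 40])
```

Masking keypoint detections with the dynamic-class logits is one natural way to realize the keypoint suppression the abstract reports.
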
Article
Computer Science and Mathematics
Mathematics

Wojciech M Kozlowski

Abstract: The objective of this paper is to rigorously define the Kadec-Klee property for modular spaces endowed with a sequential convergence structure, and to demonstrate that this property leads to the normal structure in such spaces. Consequently, we establish that the Kadec-Klee property defined herein implies the corresponding fixed point property for these spaces. These results are new in the modular space setting. Furthermore, given that the examined class of spaces encompasses Banach spaces, modular function spaces, and various other types, our theory offers a comprehensive, unified framework for exploring the interconnections between the Kadec-Klee property, normal structure, and the fixed point property.
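
For orientation, in the classical Banach-space setting the Kadec-Klee property reads as follows; the paper's contribution is to transplant an analogue of this into modular spaces equipped with a sequential convergence structure:

```latex
% Classical Kadec-Klee (H-) property in a Banach space X:
x_n \rightharpoonup x
\quad\text{and}\quad
\|x_n\| \to \|x\|
\;\;\Longrightarrow\;\;
\|x_n - x\| \to 0,
% i.e., weak convergence together with convergence of norms forces norm convergence.
```
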
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

M. Farzam Hussain

,

Noor Amin

Abstract: Planning a vacation is not easy, and choosing a destination is itself a difficult task. With modern machine learning technology, however, we can predict user preferences and recommend suitable vacation destinations. This research aims to analyze public preferences between two popular types of vacation destination, mountains and beaches, using machine learning techniques. By considering demographic factors such as age, gender, income, and education, together with lifestyle choices, this study explores the influences on vacation destination preferences. A unique dataset containing over 52,000 instances is used to predict whether individuals prefer mountains or beaches, employing algorithms such as Decision Tree, Random Forest, Gradient Boosting, Deep Learning, and Ensemble Methods. The study concludes that Deep Learning models achieved the highest accuracy of 99.81%, followed by Gradient Boosting at 98.85%. The results suggest that machine learning can enhance personalized travel recommendations and contribute to more efficient tourism marketing.
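
The comparison pipeline is standard enough to sketch with scikit-learn; the stand-in data below only illustrates its shape, since the 52,000-row dataset is not reproduced here:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Fabricated stand-in columns mimicking the demographic/lifestyle features.
rng = np.random.default_rng(42)
n = 2000
df = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "income": rng.normal(50_000, 15_000, n),
    "gender": rng.integers(0, 2, n),
    "education": rng.integers(0, 4, n),
})
y = (df["age"] + rng.normal(0, 10, n) > 45).astype(int)  # 1 = mountains, 0 = beaches (toy rule)

X_tr, X_te, y_tr, y_te = train_test_split(df, y, test_size=0.2, random_state=0)
for name, clf in [("DecisionTree", DecisionTreeClassifier(max_depth=6)),
                  ("RandomForest", RandomForestClassifier(n_estimators=200)),
                  ("GradientBoosting", GradientBoostingClassifier())]:
    print(name, round(clf.fit(X_tr, y_tr).score(X_te, y_te), 4))
```
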
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Huajun Zhang

,

Lin Zhu

,

Chong Peng

,

Jiasen Zheng

,

Junjiang Lin

,

Runyuan Bao

Abstract: This study proposes a multi-scale LoRA fine-tuning recommendation algorithm based on large language models to address the limitations of traditional recommender systems in semantic understanding, feature redundancy, and parameter transfer efficiency. The method preserves the semantic representation ability of large models while achieving unified modeling of global preferences and local interests through multi-scale semantic decomposition, low-rank parameter adaptation, and cross-scale fusion mechanisms. The model first inputs user-content interaction sequences into a pre-trained language model to obtain context-aware semantic embeddings. Then, a multi-scale semantic pooling structure extracts hierarchical feature information to capture multi-granularity preference relations. Based on this, a multi-scale LoRA module performs low-rank decomposition and cross-scale alignment of weight matrices, significantly reducing parameter size and improving fine-tuning efficiency. Finally, a cross-scale attention fusion layer dynamically reconstructs global and local features to optimize recommendation ranking. Systematic experiments conducted on the MovieLens-1M dataset validate the effectiveness of the proposed method across multiple evaluation metrics. The results show that the model outperforms several baseline algorithms in Precision@K, NDCG@K, Recall@K, and Coverage, demonstrating the advantages of multi-scale structure and LoRA parameterization in enhancing recommendation accuracy, diversity, and generalization. Overall, this research provides a feasible solution for structural optimization and parameter-efficient fine-tuning of large language models in efficient recommendation tasks.
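
For reference, the ranking metrics reported in the abstract are straightforward to compute; a minimal version over a toy ranked list is:

```python
import numpy as np

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return len(set(ranked[:k]) & relevant) / k

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted hits normalized by the ideal ordering."""
    dcg = sum(1.0 / np.log2(i + 2) for i, item in enumerate(ranked[:k]) if item in relevant)
    idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / idcg if idcg > 0 else 0.0

ranked = ["m101", "m7", "m42", "m3", "m55"]   # model's top-5 for one user
relevant = {"m42", "m55", "m9"}                # held-out positives
print(precision_at_k(ranked, relevant, 5), round(ndcg_at_k(ranked, relevant, 5), 4))
```
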
Article
Computer Science and Mathematics
Mathematics

Ward Blondé

Abstract: This paper proposes an idealized, philosophical axiomatization of the absolute infinite in a meta-formal class theory, called MK\( ^{meta} \), that can be back-translated to the formal Morse–Kelley theory with the axiom of global choice (GC). First, class ordinals and class cardinals are introduced, which avoid the Burali-Forti paradox. Second, GC is assumed to make class cardinals well-orderable. Third, the Hamkinsian multiverse \( M_h \) is defined as the meta-formal collection of all the models \( v \) of any relatively consistent, formal theory. Fourth, a meta-formal theory is rigorously defined by ranging over all the sets \( x\in v\in M_h \). Fifth, \( V^{meta} \) is the unique model of any meta-formal theory. At last, the absolute infinite \( \Omega^{meta}_{card} \) is the proper class cardinality of \( V^{meta} \). Moreover, truth relativism can be countered in a GC-consistent branch of \( M_h \), by accepting the axioms that maximize \( V^{meta} \). Consequently, the definition of \( M_h \) can be used as a rebuttal of both height and width potentialism, when combined with the argument that only the meta-formal level can capture the entire mathematical reality.
Review
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Kadhim Hayawi

,

Sakib Shahriar

Abstract: Text-to-video (T2V) generation has recently emerged as a transformative technology within the field of generative AI, enabling the creation of realistic, temporally coherent videos based on natural language descriptions. This paradigm provides significant added value in many domains such as creative media, human-computer interaction, immersive learning, and simulation. Despite its growing importance, systematic discussion of T2V is still limited compared with adjacent modalities such as text-to-image and image-to-video. To alleviate the scarcity of discussions in the T2V field, this paper provides a systematic review of works published from 2024 onward, consolidating fragmented contributions across the field. We survey and categorize the selected literature into three principal areas, namely, T2V methods, datasets, and evaluation practices, and further subdivide each area into subcategories that reflect recurring themes and methodological patterns in the literature. Emphasis will then be placed on identifying key research opportunities and open challenges that need further investigation.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Hua-Lin Xu

,

Xiu-Jun Gong

,

Hua Yu

,

Ying-Kai Wang

Abstract: Accurate identification of promoters is essential for deciphering gene regulation but remains challenging due to the complexity and variability of transcriptional initiation signals. Existing deep learning models often fail to simultaneously capture long-range dependencies and precise local motifs in DNA sequences. To address this, we propose DNABERT2-CAMP, a hybrid deep learning framework that integrates global sequence context with localized feature extraction for enhanced promoter recognition in Escherichia coli. The model leverages a pre-trained DNABERT-2 Transformer to encode evolutionary conserved patterns across extended contexts, while a novel CAMP (CNN-Attention-Mean Pooling) module detects fine-grained promoter motifs through convolutional filtering, multi-head attention, and mean pooling. By fusing global embeddings with high-resolution local features, our approach achieves robust discrimination between promoter and non-promoter sequences. Under 5-fold cross-validation, DNABERT2-CAMP attained an accuracy of 93.10% and a ROC AUC of 97.28%. It also demonstrated strong generalization on independent external data, achieving 89.83% accuracy and 92.79% ROC AUC. These results underscore the advantage of combining global contextual modeling with targeted local motif analysis for accurate and interpretable promoter identification, offering a powerful tool for synthetic biology and genomic research.
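
Our reading of the CAMP block as a PyTorch sketch; filter counts, kernel sizes, and head counts are assumptions, since the abstract specifies only the CNN, multi-head attention, and mean-pooling order:

```python
import torch
import torch.nn as nn

class CAMP(nn.Module):
    """CNN -> multi-head Attention -> Mean Pooling over token embeddings."""
    def __init__(self, d_model=768, n_filters=128, heads=4):
        super().__init__()
        self.conv = nn.Conv1d(d_model, n_filters, kernel_size=7, padding=3)  # local motif filter
        self.attn = nn.MultiheadAttention(n_filters, heads, batch_first=True)

    def forward(self, h):                     # h: (B, L, d_model), e.g. DNABERT-2 outputs
        local = torch.relu(self.conv(h.transpose(1, 2))).transpose(1, 2)     # (B, L, n_filters)
        ctx, _ = self.attn(local, local, local)                              # motif interactions
        return ctx.mean(dim=1)                # (B, n_filters) pooled representation

h = torch.randn(2, 80, 768)                   # 2 sequences, 80 tokens
print(CAMP()(h).shape)                        # torch.Size([2, 128])
```

Concatenating this pooled vector with the global [CLS]-style embedding before a final classifier is one plausible realization of the fusion the abstract describes.
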
