Reinforcement Threshold controlled ModeRNN tuned with j* and Q_max Bilingual Spatiotemporal Attention Fusion for Inclusive Real-Time Sign Language Interpretation

Harapriya Kar, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India
P Viswanathan, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India

Abstract

Deaf communities still face communication barriers, partly because current sign language recognition systems are inefficient, generalize poorly, and cannot handle regional and linguistic variations. This work proposes a novel architecture, RTC-ModeRNN-BSF, that combines a reinforcement threshold-controlled ModeRNN with attention-based spatiotemporal processing to address these problems. The model adapts its computation to the complexity of the input gesture, using between 2 and 8 attention slots, while gradually reducing exploration during training ($\varepsilon$: 0.9 → 0.1). Dual-stream memory pathways are optimized using joint log-likelihood maximization (J-Star) and computational pruning (Q-Max) to capture both immediate sequential patterns ($C_t$) and hierarchical spatiotemporal dependencies ($M_t$). Hybrid gradient descent with the AdamW optimizer ensures dependable convergence while avoiding feature memorization. The proposed system converges 47% faster than conventional techniques and achieves an average classification accuracy of 99% across American Sign Language (ASL), Indian Sign Language (ISL), and Chinese Sign Language (CSL) datasets. Furthermore, it shows notable cross-lingual adaptation, reaching 78.5% accuracy on unseen sign languages without retraining, and consistently maintains 93–97% performance under real-world challenges such as partial occlusion, changing lighting, and increasing signing speeds.
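
As an illustration only, the two adaptive quantities named in the abstract (the exploration rate $\varepsilon$ falling from 0.9 to 0.1, and the 2 to 8 attention slots) can be sketched as below. The abstract does not state the decay schedule or how gesture complexity is measured, so the linear schedule, the complexity score in [0, 1], and the function names `epsilon_at` and `attention_slots` are all assumptions made for this sketch, not the authors' method.

```python
def epsilon_at(step, total_steps, eps_start=0.9, eps_end=0.1):
    """Exploration rate at a given training step.

    Assumption: a linear anneal from eps_start to eps_end. The source
    only gives the endpoints (0.9 and 0.1), not the schedule shape.
    """
    if total_steps <= 1:
        return eps_end
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def attention_slots(complexity, min_slots=2, max_slots=8):
    """Number of attention slots for a gesture.

    Assumption: `complexity` is a hypothetical score in [0, 1] that is
    mapped linearly onto the 2..8 slot range given in the abstract.
    """
    c = min(max(complexity, 0.0), 1.0)  # clamp out-of-range scores
    return min_slots + round(c * (max_slots - min_slots))


if __name__ == "__main__":
    for step in (0, 50, 99):
        print(step, epsilon_at(step, total_steps=100))
    for score in (0.0, 0.5, 1.0):
        print(score, attention_slots(score))
```

Under these assumptions, simple gestures use the minimum of 2 slots and the most complex use all 8, while exploration shrinks steadily as training proceeds.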

Keywords

Attention, Bilingual, Memory transition, Reinforcement, Spatiotemporal, Unified threshold

Subject Area

Computer Science

Article Type

Article

First Page

1694

Last Page

1710

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite this Article

Kar, Harapriya and Viswanathan, P (2026) "Reinforcement Threshold controlled ModeRNN tuned with j* and Q_max Bilingual Spatiotemporal Attention Fusion for Inclusive Real-Time Sign Language Interpretation," Baghdad Science Journal: Vol. 23: Iss. 5, Article 12.
DOI: https://doi.org/10.21123/2411-7986.5296
