“Optimizing Large‑Scale Language Model Inference via Firmware‑Level and Architectural Attention Sparsity.” International Journal of Modern Medicine 4, no. 10 (October 31, 2025): 14–20. Accessed January 15, 2026. https://intjmm.com/index.php/ijmm/article/view/78.