DR. ADRIAN M. THORNE. Optimizing Large‑Scale Language Model Inference via Firmware‑Level and Architectural Attention Sparsity. International Journal of Modern Medicine, [S. l.], v. 4, n. 10, p. 14–20, 2025. Available at: https://intjmm.com/index.php/ijmm/article/view/78. Accessed on: 26 Apr. 2026.