Publications
(*) Equal Contribution. (†) Corresponding Author.
2026
- KNN-SSD: Enabling Dynamic Self-Speculative Decoding via Nearest Neighbor Layer Set Optimization. In Findings of EACL, 2026
2025
- Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning. In arXiv, 2025
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs. In EMNLP, 2025
2024
- Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding. In Findings of ACL, 2024