Talks


  • [2025.06] Sharing Panel: Efficient Reasoning in Large Language Models at NICE and MLNLP. [video]
  • [2025.05] Stop Overthinking: Towards Efficient Reasoning in Large Language Models at Theory Lab, Huawei Hong Kong Research Center. [slides]
  • [2025.01] Speculative Decoding for Efficient LLM Inference at COLING 2025. [homepage] [slides] [video]
  • [2024.03] Unlocking the Efficiency of LLM Inference: A Comprehensive Survey of Speculative Decoding at NICE and CIP Group @CASIA. [slides] [video]