When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs' Toxicity
Shiyao Cui,Xijia Feng,Yingkang Wang,Junxiao Yang,Zhexin Zhang,Biplab Sikdar,Hongning Wang,Han Qiu,Minlie Huang
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
Junxiao Yang,Jinzhe Tu,Haoran Liu,Xiaoce Wang,Chujie Zheng,Zhexin Zhang,Shiyao Cui,Caishun Chen,Tiantian He,Hongning Wang,Yew-Soon Ong,Minlie Huang
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen
Zhexin Zhang,Yuhao Sun,Junxiao Yang,Shiyao Cui,Yuanchao Zhang,Hongning Wang,Minlie Huang
Trust-Region Adaptive Policy Optimization
Mingyu Su,Jian Guan,Yuxian Gu,Minlie Huang,Hongning Wang