D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving

Publication
In the 31st Annual International Conference on Mobile Computing and Networking (MobiCom) (CCF-A)