Introduction to “AI Computing Cyberinfrastructure”
The unprecedented impact of foundation-model technology, exemplified by ChatGPT, is driving a revolutionary paradigm shift in AI and bringing new opportunities and challenges to many industries. However, the high training, inference, and maintenance costs of foundation models limit their widespread adoption.
Introduction of "AI Computing Cyberinfrastructure"
(EuroSys2024) Warming Serverless ML Inference via Inter-function Model Transformation
Serverless ML inference is an emerging cloud computing paradigm for low-cost, easy-to-manage inference services. In serverless ML inference, each call is executed in a container; however, container cold starts result in long inference delays.
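To make the cold-start problem concrete, the minimal Python sketch below simulates a serverless handler in which the first invocation must pay the container startup and model-loading cost, while later invocations reuse the warm container. The handler name, timings, and loading logic are illustrative assumptions for exposition only, not the warming mechanism proposed in the paper.

```python
# Hypothetical sketch: why cold starts dominate serverless ML inference latency.
# A cold invocation starts the container and loads the model; a warm one reuses them.
import time

CONTAINER_START_SECONDS = 1.0  # assumed container startup cost
MODEL_LOAD_SECONDS = 2.0       # assumed model-loading cost

_loaded_model = None  # persists only while the container stays warm


def _load_model():
    time.sleep(CONTAINER_START_SECONDS + MODEL_LOAD_SECONDS)  # simulate startup + load
    return lambda x: sum(x)  # stand-in for a real inference model


def handle_request(inputs):
    """Serverless handler: pays the cold-start cost only on the first call."""
    global _loaded_model
    if _loaded_model is None:        # cold-start path
        _loaded_model = _load_model()
    return _loaded_model(inputs)     # warm path: inference only


if __name__ == "__main__":
    for i in range(3):
        start = time.time()
        handle_request([1, 2, 3])
        print(f"call {i}: {time.time() - start:.2f}s")  # first call is slow (cold)
```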
(INFOCOM2024) An Elastic Transformer Serving System for Foundation Model via Token Adaptation
Transformer-based architectures have become a pillar of the cloud services that keep reshaping our society. However, dynamic query loads and heterogeneous user requirements severely challenge current transformer serving systems, which rely on pre-training multiple variants of a foundation model to accommodate varying service demands.
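As a rough intuition for token adaptation, the sketch below serves the same attention computation with fewer tokens per request, trading accuracy for latency because self-attention cost grows with sequence length. The importance score (token L2 norm) and all parameters are illustrative assumptions, not the adaptation policy used in the paper.

```python
# Hypothetical sketch: elastic serving by adapting the number of tokens per request.
import numpy as np

def self_attention(x):
    """Single-head self-attention over x of shape (seq_len, dim)."""
    scores = x @ x.T / np.sqrt(x.shape[1])                     # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def adapt_tokens(x, keep_ratio):
    """Keep only the top tokens by L2 norm (a stand-in importance score)."""
    k = max(1, int(len(x) * keep_ratio))
    keep = np.argsort(-np.linalg.norm(x, axis=1))[:k]
    return x[np.sort(keep)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((512, 64))
    for ratio in (1.0, 0.5, 0.25):   # lighter variants under heavier load
        out = self_attention(adapt_tokens(tokens, ratio))
        print(f"keep {ratio:.0%}: attended over {out.shape[0]} tokens")
```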