AI Computing Cyberinfrastructure

Introduction to "AI Computing Cyberinfrastructure"

The unprecedented impact of foundation model technology, exemplified by ChatGPT, is driving a revolutionary paradigm shift in AI and bringing new opportunities and challenges to many industries. However, the high training, inference, and maintenance costs of foundation models limit their widespread adoption.

(EuroSys2024) Warming Serverless ML Inference via Inter-function Model Transformation

Serverless ML inference is an emerging cloud computing paradigm for low-cost, easy-to-manage inference services. In serverless ML inference, each call is executed in a container; however, container cold starts, during which the runtime and model must be loaded from scratch, result in long inference delays.
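As a rough illustration of the cold-start problem, the sketch below mimics a serverless handler that caches its model in container memory: the first (cold) call pays the full model-loading cost, while later (warm) calls in the same container do not. The handler, loader, and timings are hypothetical placeholders for illustration only, not the mechanism proposed in the EuroSys 2024 paper, which instead warms inference by transforming models across functions.

```python
# Sketch of a serverless-style inference handler, showing why container
# cold starts dominate latency. The model and loader are hypothetical
# placeholders, not the EuroSys 2024 system.
import time

_MODEL = None  # survives across warm invocations of the same container


def _load_model():
    # Hypothetical stand-in for pulling weights from storage and
    # initializing the runtime; in practice this can take seconds.
    time.sleep(2.0)
    return lambda x: sum(x)  # trivial "model" for illustration


def handler(request):
    """Entry point invoked once per serverless call."""
    global _MODEL
    start = time.perf_counter()
    if _MODEL is None:          # cold start: model must be loaded first
        _MODEL = _load_model()
    result = _MODEL(request["inputs"])
    latency = time.perf_counter() - start
    return {"result": result, "latency_s": round(latency, 3)}


if __name__ == "__main__":
    print(handler({"inputs": [1, 2, 3]}))  # cold call: pays ~2 s of loading
    print(handler({"inputs": [4, 5, 6]}))  # warm call: sub-millisecond
```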

(INFOCOM2024) An Elastic Transformer Serving System for Foundation Model via Token Adaptation

Transformer-based architectures have become a pillar of the cloud services that keep reshaping our society. However, dynamic query loads and heterogeneous user requirements severely challenge current transformer serving systems, which rely on pre-training multiple variants of a foundation model to accommodate varying service demands.
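A minimal sketch of the token-adaptation idea under assumed details: when load rises, the server keeps only the highest-scoring tokens before the expensive transformer layers run, trading a small amount of accuracy for latency. The L2-norm scoring rule and the load-to-budget mapping below are illustrative assumptions, not the actual policy of the INFOCOM 2024 system.

```python
# Sketch of token adaptation for elastic transformer serving. Scoring rule
# and budget mapping are illustrative assumptions, not the paper's policy.
import numpy as np


def adapt_tokens(hidden_states: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Keep the top `keep_ratio` fraction of tokens by L2-norm importance."""
    seq_len = hidden_states.shape[0]
    k = max(1, int(seq_len * keep_ratio))
    scores = np.linalg.norm(hidden_states, axis=-1)   # per-token importance
    keep = np.sort(np.argsort(scores)[-k:])           # preserve token order
    return hidden_states[keep]


def serve(hidden_states: np.ndarray, load_factor: float) -> np.ndarray:
    """Map current load (0 = idle, 1 = saturated) to a token budget."""
    keep_ratio = 1.0 - 0.5 * load_factor  # drop up to half the tokens at peak
    reduced = adapt_tokens(hidden_states, keep_ratio)
    # The reduced token set would now flow through the transformer layers,
    # cutting attention cost roughly quadratically in the sequence length.
    return reduced


if __name__ == "__main__":
    x = np.random.randn(128, 768)           # 128 tokens, hidden size 768
    print(serve(x, load_factor=0.0).shape)  # (128, 768) under light load
    print(serve(x, load_factor=1.0).shape)  # (64, 768) under heavy load
```

Because attention cost grows roughly quadratically with sequence length, shrinking the token budget at peak load lets a single foundation model serve varying demands elastically instead of pre-training multiple model variants.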