Introduction
CosmicAC is a self-hosted GPU compute platform. You deploy it onto your own Kubernetes cluster, and it gives your team two ways to use GPU hardware:
- GPU Container jobs — isolated KubeVirt VMs with dedicated GPUs and full shell access.
- Managed Inference — serve open-source language models without building a serving stack yourself.
Unlike a managed cloud service, you run the control plane. CosmicAC schedules jobs, exposes GPUs to VMs, and manages the inference serving layer — all on infrastructure you own.
How it works
CosmicAC runs as a set of services on your Kubernetes cluster. It relies on:
- Kubernetes to schedule workloads.
- KubeVirt to run GPU Container jobs as virtual machines.
- GPU device plugins (e.g. the NVIDIA GPU Operator) to expose GPU hardware to scheduled jobs.
Users interact with the platform through the CosmicAC CLI or the web UI. Shell access to a GPU Container connects over Hyperswarm SSH, a peer-to-peer SSH tunnel.
For a deeper breakdown of the components and how they connect, see Architecture.