Covalent Cloud Documentation

Iterate on, build, deploy, and scale AI more easily than ever with the Covalent compute orchestration platform.

What is Covalent Cloud?

Covalent Cloud simplifies building and scaling complex AI and high-performance computing (HPC) applications. You define your compute needs (CPUs, GPUs, storage, deployment) directly within your local Python code, and Covalent Cloud handles server management and cloud configuration behind the scenes. This lets you concentrate on what matters: developing AI and HPC applications.

Serverless Cost Savings: Cuts costs, often by 50-60%, by using high-compute resources such as GPUs only when they are needed. This eliminates wasteful practices such as keeping GPU VMs running for Jupyter Notebook development while they sit idle.
Simplified AI Development: Lets software and machine learning engineers develop scalable AI applications without needing to know the details of the underlying high-compute infrastructure.
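
To make the serverless saving concrete, here is a minimal arithmetic sketch. The hourly rate and hour counts are purely illustrative assumptions, not Covalent Cloud pricing, and the sketch assumes the same hourly rate in both cases. It only shows that paying for GPU time actually used, rather than for an always-on VM, saves the idle fraction.

```python
def serverless_savings(reserved_hours, active_hours, hourly_rate):
    """Return (always_on_cost, on_demand_cost, fractional_saving).

    reserved_hours: hours an always-on GPU VM would be billed for
    active_hours:   hours the GPU is actually doing work
    hourly_rate:    price per GPU hour (assumed equal in both cases)
    """
    if reserved_hours <= 0:
        raise ValueError("reserved_hours must be positive")
    if not 0 <= active_hours <= reserved_hours:
        raise ValueError("active_hours must be between 0 and reserved_hours")
    always_on_cost = reserved_hours * hourly_rate
    on_demand_cost = active_hours * hourly_rate
    saving = 1 - on_demand_cost / always_on_cost
    return always_on_cost, on_demand_cost, saving


# Hypothetical week: 40 hours reserved, 18 hours of real GPU work, $4.00/hour.
always_on, on_demand, saving = serverless_savings(40, 18, 4.0)
print(f"always-on: ${always_on:.2f}, on-demand: ${on_demand:.2f}, saving: {saving:.0%}")
```

With these made-up numbers the idle time accounts for 55% of the always-on bill; your own saving depends entirely on how much of the reserved time is actually idle.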

Featured Examples:

Generative AI/LLM: Multi-Agent LLMs
Generative AI/LLM: Deploying Llama 3 inference for text generation

How It Works

Write your Python code and specify computational requirements. Covalent automates the rest, from containerization and resource provisioning to task scheduling and scaling.

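Fine Tune

import covalent_cloud as cc
import covalent as ct

# Request 8 H100 GPUs, 150 CPUs, and 200 GB of memory in the "huggingface" environment.
resource = cc.CloudExecutor(env="huggingface",
                            gpu_type="h100",
                            num_gpus=8,
                            num_cpus=150,
                            memory="200GB")

@ct.electron(executor=resource)
def fine_tune_model(data):
    # Training logic goes here and should set model_path to the saved model's location.
    model_path = ...  # placeholder
    return model_path

Inference and serve
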
import covalent_cloud as cc
from vllm import LLM, SamplingParams

# Request 4 H100 GPUs and 4 CPUs in the "vllm" environment.
gpu_h100x4 = cc.CloudExecutor(env="vllm",
                              num_cpus=4,
                              num_gpus=4,
                              gpu_type="h100")

@cc.service(executor=gpu_h100x4, name="llama3b", auth=True)
def vllm_llama(model):
    # Load the model; the returned "llm" is supplied to the endpoint below.
    llm = LLM(model=model)
    return {"llm": llm}

@vllm_llama.endpoint("/generate")
def generate(llm, full_prompt, num_tokens=1500):
    # Cap the completion length, then generate text for the given prompt.
    sampling_params = SamplingParams(max_tokens=num_tokens)
    return llm.generate(full_prompt, sampling_params)
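
Both snippets above rely on the same idea: a decorator attaches a resource request to an ordinary Python function. The following is a toy, dependency-free sketch of that pattern only. It is not how Covalent is implemented, and `ResourceSpec` and `task` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical stand-in for an executor's resource request."""
    num_cpus: int = 1
    num_gpus: int = 0
    gpu_type: str = ""
    memory: str = "1GB"


def task(executor):
    """Attach a ResourceSpec to a function without changing how it runs locally."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        # Metadata that a scheduler could read before placing the task.
        wrapper.resources = executor
        return wrapper
    return decorator


gpu_spec = ResourceSpec(num_cpus=4, num_gpus=4, gpu_type="h100", memory="200GB")


@task(executor=gpu_spec)
def double(x):
    return 2 * x


# The function still behaves like plain Python...
result = double(21)
# ...while its resource request travels with it.
requested = double.resources
print(result, requested.num_gpus, requested.gpu_type)
```

Keeping the resource request next to the function definition is what lets the code stay local and readable while an orchestrator decides where each function actually runs.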

Features

On-Demand Scalable GPU Access: Access a variety of GPU resources on demand. Deploy Python functions to GPUs at scale without worrying about the underlying hardware.
Real-Time Monitoring: Keep tabs on your deployments with real-time monitoring of outputs and errors through our intuitive UI.
Inference and Serving: Beyond simple functions, deploy and serve high-compute functions such as LLMs as secure, authenticated web endpoints with just a few lines of code, using GPUs as needed.
Beyond Infrastructure: Enhance your workflows with shared volumes, manage secrets safely, and use custom container images to meet your unique application needs.
Bring Your Own Compute: Connect your own cloud or on-premises systems to our platform. Covalent Cloud orchestrates and manages your jobs seamlessly, letting you concentrate on innovation instead of infrastructure.

Next Steps

Getting Started Guide: Dive into our detailed guide to get up and running quickly.
Explore Examples: Check out various examples that cover a wide range of applications, from simple tasks to complex AI deployments.
Explore Basics: Learn the fundamentals of creating and managing tasks, workflows, and more in our Basics section.
