Migrate from AWS Batch

If you're running batch embeddings, transcription, OCR, or other GPU-heavy inference on AWS Batch + GPU instances, you can move the inference half to Common Compute in an afternoon. The orchestration patterns map cleanly; only the runtime changes.

Three ways to migrate: (1) a drop-in boto3 shim with zero refactoring, (2) a rewrite to the native SDK, (3) a one-call replacement at each call site.

Mental model — what maps to what

AWS Batch concepts → Common Compute concepts:

Job queue → workload (we route to capacity, you don't pick a queue)
Job definition → workload_id + model_id (declarative; no container to maintain)
Compute environment → none (we own the fleet — Apple Silicon, sandboxed)
submit_job → cc.jobs.submit() / OpenAI-compatible call (returns a job id)
describe_jobs → cc.jobs.get(id) (same shape: state, attempts, result_uri)
cancel_job / terminate_job → cc.jobs.cancel(id)
S3 input bucket → R2 (presigned upload URL returned at submit time)
CloudWatch Logs → /v1/me/logs (filterable by job id, time range)
IAM role → API key with `inference` scope (per-key, revocable)
Reserved capacity → priority tier (`batch` is the default)

Path 1 — Drop-in shim (zero refactor)

If your code calls boto3.client('batch').submit_job(...), install the optional shim, then swap the import and the client constructor. The rest of your call sites stay as they are. The shim translates submit_job/describe_jobs/list_jobs/cancel_job/terminate_job into the equivalent Common Compute calls.

bash

pip install 'common-compute[aws-batch]'

python

# before
import boto3
batch = boto3.client("batch", region_name="us-east-1")

# after — same API, talks to Common Compute
from commoncompute.aws_batch import client as boto3_compat
batch = boto3_compat()

resp = batch.submit_job(
    jobName="nightly-embeddings",
    jobQueue="ml-gpu",                    # → mapped to workload routing
    jobDefinition="bge-large:1",          # → workload_id:model_id
    containerOverrides={
        "environment": [
            {"name": "INPUT_S3_URI", "value": "s3://bucket/corpus.jsonl"},
        ],
    },
)
print(resp["jobId"])  # the Common Compute job id, returned in AWS-Batch shape

The shim is a thin adapter: it does not require boto3 to be installed and does not call AWS. It returns response dicts shaped exactly like AWS Batch responses (jobId / jobName / status / startedAt / stoppedAt / statusReason).
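
To make "shaped exactly like AWS Batch responses" concrete, here is a minimal sketch of the kind of reshaping such an adapter performs. It is not the shim's actual code. The native field names (id, name, state, reason, started_at, stopped_at) and the state-to-status mapping are assumptions made for illustration; the one thing taken from AWS is that Batch reports startedAt and stoppedAt as Unix epoch milliseconds.

```python
from datetime import datetime, timezone

# Assumed mapping from the native states named in this guide to AWS Batch
# status strings. The real shim may map them differently.
STATE_TO_BATCH_STATUS = {
    "queued": "RUNNABLE",
    "running": "RUNNING",
    "completed": "SUCCEEDED",
    "failed": "FAILED",
}


def to_epoch_ms(ts):
    """Convert an aware datetime to Unix epoch milliseconds (AWS Batch style)."""
    return None if ts is None else int(ts.timestamp() * 1000)


def to_batch_shape(job):
    """Reshape a native job record (a plain dict here) into describe_jobs form."""
    shaped = {
        "jobId": job["id"],
        "jobName": job["name"],
        "status": STATE_TO_BATCH_STATUS[job["state"]],
        "statusReason": job.get("reason", ""),
    }
    # Omit timestamps that are not set yet (for example, a queued job).
    for src, dst in (("started_at", "startedAt"), ("stopped_at", "stoppedAt")):
        ms = to_epoch_ms(job.get(src))
        if ms is not None:
            shaped[dst] = ms
    return shaped


# Hypothetical native record, used only to exercise the function.
example = to_batch_shape({
    "id": "job_123",
    "name": "nightly-embeddings",
    "state": "completed",
    "started_at": datetime(2026, 4, 24, 0, 0, 0, tzinfo=timezone.utc),
    "stopped_at": datetime(2026, 4, 24, 0, 5, 0, tzinfo=timezone.utc),
})
```

If your existing code branches on status strings such as SUCCEEDED or FAILED, check which values the shim actually emits before relying on a mapping like this one.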

Path 2 — Native SDK (cleanest)

If you can spend an hour, the native SDK is more idiomatic Python and gives you typed responses, streaming, and async support.

python

from commoncompute import Client

cc = Client()  # reads CC_API_KEY

# OpenAI-compatible: works against existing pipelines.
# Read one input per line; the context manager closes the file.
with open("corpus.jsonl") as f:
    texts = f.read().splitlines()
embeddings = cc.embeddings.create(
    model="bge-large",
    input=texts,
)

# Batch-style — get a job id back, poll or webhook
job = cc.jobs.submit(
    workload_id="coreml_embed",
    model_id="bge-large",
    payload={"texts": [...]},
    priority="batch",                     # default — cheaper than realtime
    max_spend_usd=50.00,                  # hard cap, like AWS Budgets
    idempotency_key="run-2026-04-24",     # safe to retry
)
status = cc.jobs.get(job.id).state        # 'queued' | 'running' | 'completed' | 'failed'
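
If you poll rather than use a webhook, a loop with capped exponential backoff keeps request volume low. The sketch below is one possible approach, not part of the SDK: wait_for_job and its parameters are hypothetical names, and the only thing it relies on from the SDK is the state values listed above. It takes a callable so it can be exercised offline.

```python
import time

# Terminal states, taken from the state list shown above.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(get_state, job_id, timeout_s=300.0, initial_delay_s=1.0,
                 max_delay_s=30.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal state or the timeout elapses."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        state = get_state(job_id)
        if state in TERMINAL_STATES:
            return state
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"job {job_id} still {state!r} after {timeout_s}s")
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)


# Offline demo: a scripted state sequence stands in for cc.jobs.get(id).state.
_scripted = iter(["queued", "running", "completed"])
final_state = wait_for_job(lambda _job_id: next(_scripted), "job_123",
                           sleep=lambda _seconds: None)
```

Against the real client, the call would look like wait_for_job(lambda job_id: cc.jobs.get(job_id).state, job.id).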

Path 3 — One-call replace (per file)

If you only have a handful of submit_job call sites, you can skip the shim and call the SDK directly at each one. Each replacement is an ordinary Python function call: build the payload, call cc.jobs.submit(), and keep the returned job id.
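
As a rough illustration, a single call site could be wrapped like this. The keyword arguments are the ones shown in the Path 2 example; submit_embeddings itself and the recording stand-in for the client are hypothetical and exist only so the sketch runs without network access.

```python
from types import SimpleNamespace


def submit_embeddings(cc, texts, run_key, max_spend_usd=50.00):
    """Replacement for one boto3 submit_job call site; returns the job id."""
    job = cc.jobs.submit(
        workload_id="coreml_embed",
        model_id="bge-large",
        payload={"texts": list(texts)},
        priority="batch",
        max_spend_usd=max_spend_usd,
        idempotency_key=run_key,   # reuse the same key when retrying a run
    )
    return job.id


class _RecordingJobs:
    """Offline stand-in for cc.jobs that records the keyword arguments."""

    def __init__(self):
        self.calls = []

    def submit(self, **kwargs):
        self.calls.append(kwargs)
        return SimpleNamespace(id=f"job_{len(self.calls)}")


# Demo with the stand-in; in real code, pass commoncompute.Client() instead.
fake_cc = SimpleNamespace(jobs=_RecordingJobs())
job_id = submit_embeddings(fake_cc, ["hello", "world"], run_key="run-2026-04-24")
```

Passing the client in as an argument keeps each call site easy to test and makes it simple to switch between the shim and the native SDK later.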

Video pipelines (the most common AWS Batch + GPU use case)

Video transcoding and AI processing are the workloads that disproportionately drive AWS Batch + GPU instance bills. Common Compute handles them via two workloads: vt_transcode (hardware H.264/HEVC via Apple VideoToolbox) and the AI workload of your choice (whisper_ane for transcription, coreml_vision for frame analysis, mlx_image for generation, vision_ocr for screen-grab text).

python

from commoncompute.aws_batch import client
import json

batch = client()

# Per-file pipeline: one job per video, same as AWS Batch.
# iter_videos() is your own generator of (video_id, source_uri) pairs.
for video_id, source_s3_uri in iter_videos():
    # Step 1: hardware transcode 4K → 1080p HEVC
    transcode = batch.submit_job(
        jobName=f"transcode-{video_id}",
        jobQueue="video-gpu",                     # informational
        jobDefinition="vt_transcode",
        containerOverrides={"environment": [
            {"name": "INPUT_PAYLOAD", "value": json.dumps({
                "input_uri": source_s3_uri,        # s3:// or r2://
                "codec": "hevc",
                "bitrate_kbps": 8000,
                "resolution": "1080p",
            })},
        ]},
    )

    # Step 2: extract audio + transcribe
    audio = batch.submit_job(
        jobName=f"transcribe-{video_id}",
        jobQueue="video-gpu",
        jobDefinition="whisper_ane",
        containerOverrides={"environment": [
            {"name": "INPUT_PAYLOAD", "value": json.dumps({
                "input_uri": source_s3_uri,
                "model": "whisper-large-v3",
            })},
        ]},
    )

    # Wait for both, download outputs.
    transcode_done = batch.wait_until_done(jobId=transcode["jobId"], timeout=300)
    audio_done = batch.wait_until_done(jobId=audio["jobId"], timeout=300)

    batch.download_result(jobId=transcode["jobId"], dest=f"{video_id}.mp4")
    batch.download_result(jobId=audio["jobId"],     dest=f"{video_id}.transcript.json")

Automatic S3 → R2 input transfer is on the roadmap. Today, either upload your inputs to R2 or pass a public-read S3 URL that the provider can pull. Long-term, we'll mirror buckets so the migration is byte-identical.
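
For the upload step, a presigned URL can usually be written to with a plain HTTP PUT and no extra credentials. The sketch below uses only the Python standard library. It assumes the presigned URL accepts PUT, as S3-compatible presigned upload URLs typically do; how you obtain the URL from the submit response is not shown because the field name is not documented here. The helper names and the example URL are hypothetical.

```python
import urllib.request


def build_upload_request(presigned_url, data,
                         content_type="application/octet-stream"):
    """Build (but do not send) an HTTP PUT carrying the input bytes."""
    return urllib.request.Request(
        presigned_url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def upload(presigned_url, data, timeout_s=60):
    """Send the bytes and return the HTTP status code."""
    req = build_upload_request(presigned_url, data)
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.status


# Offline demo: construct the request without sending it.
req = build_upload_request("https://example.invalid/upload?sig=abc",
                           b'{"text": "hello"}\n')
```

Some presigned URLs are signed together with a specific Content-Type; if an upload is rejected, check that the header you send matches what was signed.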

AI batch jobs (embeddings, OCR, classification)

If your pipeline already uses AWS Batch to run embeddings or vision over a corpus on g5/p5 instances, the OpenAI-compatible path is even shorter than the shim.

python

from commoncompute import Client

cc = Client()

# Embeddings: drop-in OpenAI-compatible. Results come back inline for
# batches of <= 256 inputs; larger batches return an async job id.
with open("corpus.jsonl") as f:
    texts = f.read().splitlines()
result = cc.embeddings.create(
    model="bge-large",
    input=texts,
)
# This loop assumes the inline case (<= 256 inputs).
# save_embedding() is your own persistence function.
for item in result["data"]:
    save_embedding(item["index"], item["embedding"])
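
If you would rather stay on the inline path for a large corpus, one option is to split the inputs into chunks of at most 256 and call the endpoint once per chunk. This is a sketch under two assumptions: the response has the data/index/embedding shape used above, and index is relative to each request, as in the OpenAI embeddings API. embed_all, chunked, and the fake create function are hypothetical names for illustration.

```python
INLINE_LIMIT = 256  # inline threshold stated in the example above


def chunked(items, size=INLINE_LIMIT):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(create, texts, model="bge-large"):
    """Embed every text via repeated inline calls, preserving input order."""
    vectors = []
    for chunk in chunked(texts):
        result = create(model=model, input=chunk)
        # Sort by the per-request index so order within the chunk is kept.
        for item in sorted(result["data"], key=lambda d: d["index"]):
            vectors.append(item["embedding"])
    return vectors


def _fake_create(model, input):
    """Offline stand-in for cc.embeddings.create; returns items reversed."""
    data = [{"index": i, "embedding": [float(len(t))]}
            for i, t in enumerate(input)]
    return {"data": list(reversed(data))}


texts = [f"doc {i}" for i in range(600)]
vectors = embed_all(_fake_create, texts)
```

With the real client, the call would be embed_all(cc.embeddings.create, texts). For very large corpora, the async job path is likely the better fit, since it avoids many sequential round trips.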

What you don't have to manage anymore

Container images — no ECR, no Dockerfile, no base-image patches
Compute environments — no scaling policies, no spot interruption handling
GPU instance availability — no waiting on p5/g5 reservations
Cost dashboards — quotes are returned before the job runs

What changes about your pipelines

S3 → R2: input bytes upload to a presigned URL we return; you don't manage buckets
IAM → API keys with scopes; rotate via /app/api-keys
VPC peering → not applicable; we expose a public HTTPS API only
Failure semantics: tasks retry up to max_attempts (default 3), then dead_letter — same shape as AWS Batch state transitions
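
The failure semantics in the last item can be pictured with a small client-side simulation. This is not how the platform implements retries; it only models the rule stated above, that a task is attempted up to max_attempts times (default 3) and then lands in dead_letter. The function name and the returned dict layout are hypothetical.

```python
def run_with_retries(task, max_attempts=3):
    """Model: retry a task up to max_attempts, then report dead_letter."""
    attempts = 0
    last_error = None
    while attempts < max_attempts:
        attempts += 1
        try:
            result = task()
        except Exception as exc:
            # Remember the failure and try again if attempts remain.
            last_error = repr(exc)
            continue
        return {"state": "completed", "attempts": attempts, "result": result}
    return {"state": "dead_letter", "attempts": attempts, "error": last_error}


_attempt_log = []


def flaky():
    """Fails twice, then succeeds on the third call."""
    _attempt_log.append(1)
    if len(_attempt_log) < 3:
        raise RuntimeError("transient")
    return "ok"


def always_fails():
    raise RuntimeError("boom")


recovered = run_with_retries(flaky)
exhausted = run_with_retries(always_fails)
```

In practice this means downstream code should treat dead_letter as the terminal failure to alert on, and should not assume a single failed attempt is final.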

Cost comparison (typical)

Embeddings (bge-large) on g5.xlarge through AWS Batch: ~$0.045 per 1M tokens, including instance idle time. The same workload on Common Compute: $0.009 per 1M tokens, about one fifth of the cost. The savings come from amortizing across otherwise idle Apple Silicon instead of dedicated cloud GPUs.
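
To apply these rates to your own volume, the arithmetic is a straight multiplication. The sketch below uses the two per-token rates quoted above; the 2,000M tokens per month figure is a made-up example volume, and project is a hypothetical helper. Decimal is used so the dollar amounts do not pick up floating-point noise.

```python
from decimal import Decimal

# Rates quoted above, in USD per 1M tokens.
AWS_BATCH_PER_M = Decimal("0.045")
COMMON_COMPUTE_PER_M = Decimal("0.009")


def project(tokens_millions):
    """Project cost on both platforms for a volume given in millions of tokens."""
    tokens = Decimal(str(tokens_millions))
    aws = tokens * AWS_BATCH_PER_M
    cc = tokens * COMMON_COMPUTE_PER_M
    return {
        "aws_usd": aws,
        "cc_usd": cc,
        "saved_usd": aws - cc,
        "saved_fraction": (aws - cc) / aws,
    }


# Hypothetical volume: 2,000M (2B) tokens per month.
monthly = project(2000)
```

At that volume the projection is $90 on AWS Batch versus $18 on Common Compute, a saving of $72, or 80%. Your actual AWS figure depends on how much instance idle time your pipeline carries.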

Bring your last AWS Batch invoice to [email protected] and we'll send a per-line projection — the math is deterministic.