Overview

GPT-4o Image combines perception and generation. A single API lets you create new images, edit existing assets, or extract structured metadata from uploads, and all three operations share the same authentication and response schema.

Generation

curl -X POST "https://api.transendai.net/v1/images/gpt4o/generation" \
  -H "Authorization: Bearer $TRANSEND_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "High-end sneaker on a marble pedestal with volumetric lighting",
    "size": "1024x768",
    "guidance": 6.5
  }'

Field	Description
`prompt`	Natural language description.
`size`	Output dimensions as `WIDTHxHEIGHT` in pixels (max 2048 in either dimension).
`guidance`	0–10 float controlling adherence to the prompt.
`reference_images`	Optional array of URLs to guide style/composition.
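
The limits in the field table can be checked client-side before a request is sent. The following is a minimal Python sketch, not part of the API: `build_generation_payload` is a hypothetical helper, and it assumes that both ends of the 0–10 `guidance` range are accepted and that `size` is serialized as `WIDTHxHEIGHT`, as in the curl example.

```python
import json

MAX_DIMENSION = 2048  # from the field table: max 2048 in either dimension


def build_generation_payload(prompt, width, height, guidance=None, reference_images=None):
    """Return a request body for the generation endpoint, enforcing the documented limits.

    Hypothetical helper: the service may apply further rules not listed in the table.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    for name, value in (("width", width), ("height", height)):
        if not isinstance(value, int) or not 1 <= value <= MAX_DIMENSION:
            raise ValueError(f"{name} must be an integer between 1 and {MAX_DIMENSION}")
    payload = {"prompt": prompt, "size": f"{width}x{height}"}
    if guidance is not None:
        # Assumption: the 0-10 range is inclusive at both ends.
        if not 0 <= guidance <= 10:
            raise ValueError("guidance must be between 0 and 10")
        payload["guidance"] = float(guidance)
    if reference_images:
        payload["reference_images"] = list(reference_images)
    return payload


if __name__ == "__main__":
    body = build_generation_payload(
        "High-end sneaker on a marble pedestal with volumetric lighting",
        1024, 768, guidance=6.5,
    )
    print(json.dumps(body, indent=2))
```

Validating locally avoids a round trip for requests that would be rejected anyway; the server remains the source of truth for what is accepted.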

Editing

curl -X POST "https://api.transendai.net/v1/images/gpt4o/edit" \
  -H "Authorization: Bearer $TRANSEND_API_KEY" \
  -F "[email protected]" \
  -F "[email protected]" \
  -F 'payload={
    "prompt": "Swap the background to black marble and add cyan accent lighting.",
    "size": "1024x1024"
  }'

Masks are optional; if omitted, GPT-4o automatically infers editable regions.
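
When a mask is supplied, the Error Reference notes that its dimensions must match the source image (`422`). A local check before upload might look like the sketch below. It assumes both files are PNGs and reads the width and height directly from the PNG header using only the standard library; `png_dimensions` and `mask_matches` are hypothetical helper names, not part of the API.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk at the start of PNG bytes."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then width and
    # height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def mask_matches(image_bytes, mask_bytes):
    """True when the mask has the same pixel dimensions as the source image."""
    return png_dimensions(image_bytes) == png_dimensions(mask_bytes)
```

Only the first 24 bytes of each file are needed, so the check is cheap even for large assets.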

Understanding

curl -X POST "https://api.transendai.net/v1/images/gpt4o/analyze" \
  -H "Authorization: Bearer $TRANSEND_API_KEY" \
  -F "[email protected]" \
  -F 'payload={"tasks":["caption","objects","text"]}'

Response excerpt:

{
  "analysis": {
    "caption": "Coffee shop receipt totaling $18.50",
    "objects": [
      { "label": "receipt", "confidence": 0.99 },
      { "label": "latte", "confidence": 0.81 }
    ],
    "text": [
      { "content": "Total $18.50", "bounding_box": [42, 128, 310, 156] }
    ]
  }
}
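
A client will usually want to filter this response rather than use it raw. The sketch below, in Python, assumes the response has exactly the shape shown in the excerpt. The 0.9 threshold and the helper names are illustrative choices, and the `bounding_box` values are left uninterpreted because the excerpt does not state their coordinate convention.

```python
import json

SAMPLE_RESPONSE = json.loads("""
{
  "analysis": {
    "caption": "Coffee shop receipt totaling $18.50",
    "objects": [
      { "label": "receipt", "confidence": 0.99 },
      { "label": "latte", "confidence": 0.81 }
    ],
    "text": [
      { "content": "Total $18.50", "bounding_box": [42, 128, 310, 156] }
    ]
  }
}
""")


def confident_labels(response, threshold=0.9):
    """Return labels of detected objects whose confidence meets the threshold."""
    objects = response.get("analysis", {}).get("objects", [])
    return [o["label"] for o in objects if o.get("confidence", 0.0) >= threshold]


def text_contents(response):
    """Return the recognized text strings in the order the response lists them."""
    return [t["content"] for t in response.get("analysis", {}).get("text", [])]


if __name__ == "__main__":
    print(confident_labels(SAMPLE_RESPONSE))
    print(text_contents(SAMPLE_RESPONSE))
```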

Streaming

Set `stream: true` in the request payload to receive partial outputs as server-sent events (SSE). Each event includes step metadata such as `denoise`, `upscale`, and `final`.
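
On the client side, consuming the stream comes down to parsing SSE framing. The following Python sketch works on any iterable of text lines, such as a streamed HTTP response body. It assumes each event's `data` field carries a JSON object and that the step name appears under a `step` key; neither detail is confirmed above, so adjust to the actual event payload.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of each SSE event from an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Lenient choice: flush a trailing event that lacks its closing blank line.
        yield json.loads("\n".join(data_parts))


if __name__ == "__main__":
    # Hypothetical event payloads; the "step" key is an assumption.
    stream = [
        'data: {"step": "denoise"}\n', "\n",
        ": keep-alive\n",
        'data: {"step": "final"}\n', "\n",
    ]
    for event in iter_sse_events(stream):
        print(event["step"])
```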

Error Reference

Code	Meaning	Fix
`400`	Unsupported task combination.	Send `generation`, `edit`, and `analyze` as separate requests.
`413`	Upload too large.	Compress assets to 25 MB or less, or provide signed URLs instead.
`422`	Mask mismatch.	Ensure mask dimensions match the source image.
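
A client can translate these status codes into actionable messages and catch oversized uploads before sending them. The Python sketch below is illustrative only: the helper names are hypothetical, and the byte limit assumes "25 MB" means 25,000,000 bytes, which is the stricter of the two common readings.

```python
# Assumption: decimal megabytes; the service may count in MiB instead.
MAX_UPLOAD_BYTES = 25 * 1000 * 1000

# Remediation hints paraphrased from the error table.
ERROR_FIXES = {
    400: "Send generation, edit, and analyze as separate requests.",
    413: "Compress the upload to 25 MB or less, or provide a signed URL.",
    422: "Make the mask dimensions match the source image.",
}


def suggest_fix(status_code):
    """Return the documented fix for a status code, or a generic fallback."""
    return ERROR_FIXES.get(status_code, "No documented fix; inspect the response body.")


def upload_too_large(num_bytes, limit=MAX_UPLOAD_BYTES):
    """True when an asset exceeds the assumed upload limit."""
    return num_bytes > limit
```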

Tips

Combine perception with generation by running `analyze` first, then piping the results into a follow-up generation request.
Use `response_format: { "type": "json_schema" }` to enforce structured metadata when generating descriptions or labels.
Monitor GPU-intensive operations via the observability dashboards.
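
The first tip, analyze and then generate, can be sketched as a small Python function that turns an analysis response into a generation request body. It assumes the response shape shown in the Understanding section; the function name, the way the prompt is assembled, and the default `size` are illustrative choices rather than anything the API prescribes.

```python
def generation_payload_from_analysis(analysis_response, style_hint="", size="1024x1024"):
    """Build a generation request body from the output of an analyze call."""
    analysis = analysis_response.get("analysis", {})
    subject = (analysis.get("caption") or "").rstrip(".")
    labels = [o["label"] for o in analysis.get("objects", []) if "label" in o]
    if labels:
        prefix = subject + " featuring " if subject else "Scene featuring "
        subject = prefix + ", ".join(labels)
    # Join the described subject and the optional style hint as two sentences.
    sentences = [s for s in (subject, style_hint.rstrip(".")) if s]
    if not sentences:
        raise ValueError("analysis contained nothing to build a prompt from")
    return {"prompt": ". ".join(sentences) + ".", "size": size}


if __name__ == "__main__":
    analyzed = {
        "analysis": {
            "caption": "Coffee shop receipt totaling $18.50",
            "objects": [{"label": "receipt", "confidence": 0.99}],
        }
    }
    print(generation_payload_from_analysis(analyzed, "Flat vector illustration"))
```

The returned dictionary can be serialized as the JSON body of a generation request.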

GPT-4o Image API