Create a video (streaming)

curl --request POST \
  --url https://api.boson.ai/v1/videos/stream \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "ref_image": "<string>",
  "model": "higgs-avatar",
  "input": "<string>",
  "size": "640x640"
}
'

Authorizations

Authorization

string

header

required

Your Boson API key, sent as Authorization: Bearer $BOSON_API_KEY.

Body

Provide a ref_image plus exactly one driving input: input (audio-to-video) or input_tts (text-to-video).
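
The "exactly one driving input" rule can be checked on the client before the request is sent. The following is a minimal sketch, not an official client: the helper name build_video_body and its audio / tts parameter names are hypothetical, and only the JSON keys come from this page.

```python
from typing import Any, Dict, Optional


def build_video_body(
    ref_image: str,
    audio: Optional[str] = None,
    tts: Optional[Dict[str, Any]] = None,
    model: str = "higgs-avatar",
    size: str = "640x640",
) -> Dict[str, Any]:
    """Build the JSON body, enforcing exactly one driving input.

    `audio` maps to the `input` field and `tts` to `input_tts`; both
    parameter names are illustrative, not part of the API.
    """
    if not ref_image:
        raise ValueError("ref_image is required")
    # True when both are missing or both are present: either case is invalid.
    if (audio is None) == (tts is None):
        raise ValueError("provide exactly one of input / input_tts")

    body: Dict[str, Any] = {"ref_image": ref_image, "model": model, "size": size}
    if audio is not None:
        body["input"] = audio
    else:
        body["input_tts"] = tts
    return body
```

Calling build_video_body("https://example.com/face.png", audio="https://example.com/speech.wav") yields an audio-to-video body; passing neither or both driving inputs raises ValueError locally instead of costing a round trip.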

ref_image

string

required

Reference image (the face to animate): an http(s) URL, data URI, or base64-encoded raw image bytes. Supported formats: PNG, JPEG, WEBP. Inline (base64 / data-URI) payloads: max 10 MB.
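
A local image can be sent inline as a data URI. The sketch below is one way to produce it, under two assumptions that this page does not confirm: that "10 MB" means 10,000,000 bytes, and that the limit applies to the encoded payload (the stricter reading). The helper name image_to_data_uri is hypothetical.

```python
import base64
from pathlib import Path

# Assumption: 10 MB = 10,000,000 bytes, measured on the encoded payload.
MAX_INLINE_BYTES = 10_000_000

# Formats listed on this page, keyed by common file extensions.
EXT_TO_MIME = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def image_to_data_uri(path: str) -> str:
    """Encode a local PNG/JPEG/WEBP file as a data URI for `ref_image`."""
    file_path = Path(path)
    mime = EXT_TO_MIME.get(file_path.suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported image format: {file_path.suffix!r}")

    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    uri = f"data:{mime};base64,{encoded}"
    if len(uri) > MAX_INLINE_BYTES:
        raise ValueError("inline image exceeds the 10 MB limit; use a URL instead")
    return uri
```

Base64 grows the payload by roughly one third, so an image close to the limit is usually better served from an http(s) URL.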

model

enum<string>

default:higgs-avatar

Avatar model ID / public alias.

Available options:

higgs-avatar

input

string | null

Audio-to-video: the driving speech audio as an http(s) URL, data URI, or base64-encoded raw audio bytes. Supported formats: AAC, WAV, MP3, FLAC, OPUS. Max duration: 60 s; the audio's duration sets the output video length. Provide exactly one of input / input_tts.
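
The 60 s limit can be checked locally before uploading. This sketch covers only uncompressed PCM WAV files, since that is what Python's standard wave module reads; the other listed formats would need a third-party decoder. The function names are hypothetical.

```python
import wave

MAX_AUDIO_SECONDS = 60.0  # limit stated on this page


def wav_duration_seconds(source) -> float:
    """Return the duration of a PCM WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_audio_length(source, limit: float = MAX_AUDIO_SECONDS) -> float:
    """Raise ValueError if the WAV is longer than `limit` seconds."""
    duration = wav_duration_seconds(source)
    if duration > limit:
        raise ValueError(f"audio is {duration:.1f} s; the maximum is {limit:.0f} s")
    return duration
```

Because the audio duration sets the video length, the returned value also gives the expected length of the output video.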

input_tts

object

Text-to-video: a speech request (the same body as POST /v1/audio/speech). The gateway synthesizes the voice and the avatar lip-syncs to it. The nested stream field is not supported. Provide exactly one of input / input_tts.
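
When an existing speech request body is reused as input_tts, the nested stream field has to go. The sketch below drops it from a copy of the request. The helper name to_input_tts is hypothetical, and the fields of the speech request itself are not documented on this page, so none are assumed here.

```python
from typing import Any, Dict


def to_input_tts(speech_request: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a speech request body without the `stream` field.

    The remaining keys are passed through untouched; their names and
    meanings are defined by POST /v1/audio/speech, not by this helper.
    """
    return {key: value for key, value in speech_request.items() if key != "stream"}
```

Dropping the field silently is a design choice made for this sketch; raising an error when stream is present would be an equally reasonable way to surface the mismatch to the caller.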

size

enum<string>

default:640x640

Output video size (WxH): square 640x640, landscape 640x480, or portrait 480x640.

Available options:

640x640

640x480

480x640
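
Since size accepts only three values, a client can validate it up front. A minimal sketch follows; the helper names parse_size and orientation are hypothetical, and the allowed values are the ones listed above.

```python
from typing import Tuple

ALLOWED_SIZES = ("640x640", "640x480", "480x640")


def parse_size(size: str = "640x640") -> Tuple[int, int]:
    """Validate a WxH size string and return (width, height) in pixels."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {ALLOWED_SIZES}, got {size!r}")
    width, height = size.split("x")
    return int(width), int(height)


def orientation(size: str) -> str:
    """Describe a valid size as square, landscape, or portrait."""
    width, height = parse_size(size)
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"
```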

Response

The fragmented-MP4 (fMP4) byte stream.

The response is of type file.
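
Because the response is a byte stream, it is usually written to disk in chunks rather than read into memory at once. The sketch below uses only the Python standard library. The endpoint URL and headers come from this page; the function names, the 64 KiB chunk size, and the 300 s timeout are assumptions chosen for illustration.

```python
import json
import urllib.request
from typing import Any, BinaryIO, Dict

API_URL = "https://api.boson.ai/v1/videos/stream"


def copy_stream(source: BinaryIO, sink: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy bytes from `source` to `sink` in chunks; return the byte count."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means the stream has ended
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def create_video_stream(
    body: Dict[str, Any], out_path: str, api_key: str, timeout: float = 300.0
) -> int:
    """POST the request body and save the fMP4 stream to `out_path`."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(out_path, "wb") as out:
            return copy_stream(response, out)
```

A typical call reads the key from the environment, for example create_video_stream(body, "avatar.mp4", os.environ["BOSON_API_KEY"]) after importing os, where body is the JSON object described in the Body section.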