The resources (image, video, voice) generated by our API are valid for 7 days. Please save the relevant resources as soon as possible to prevent expiration.
Experience our talking photo technology in action by exploring our interactive demo on GitHub: AKool Talking Photo Demo.

API Endpoints

Talking Photo Operations

Getting Started

Basic Workflow

  1. Create Talking Photo Video:
    • Prepare your talking photo image URL
    • Prepare your audio URL
    • Optionally provide a prompt
    • Call the Create By Talking Photo API with your resources
  2. Check Results:
    • Use the Get Video Info Result API to check the status of your video
    • Download the result URL when the status is “Success” (video_status = 3)
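
The "Check Results" step can be sketched as a polling loop. This is a minimal sketch, not an official client: `get_info` is a hypothetical callable that you supply to wrap the Get Video Info Result API, and it is assumed to return a dict containing the `video_status` field. The endpoint URL and authentication headers are not given on this page, so they are left to that callable.

```python
import time
from typing import Callable


def wait_for_video(
    get_info: Callable[[], dict],
    interval: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until video_status is 3 (Success) or 4 (Failed).

    get_info is a hypothetical wrapper around the Get Video Info Result
    API; it is assumed to return a dict with a "video_status" key.
    """
    for _ in range(max_attempts):
        info = get_info()
        status = info.get("video_status")
        if status == 3:
            # Success: the result URL is available in the response.
            return info
        if status == 4:
            raise RuntimeError(
                "Video generation failed; check your input resources"
            )
        # Status 1 (In Queue) or 2 (Processing): wait and try again.
        sleep(interval)
    raise TimeoutError(f"Video not ready after {max_attempts} attempts")
```

The `sleep` parameter exists only so the loop can be exercised without real delays; in normal use the default is enough. If you pass a `webhookUrl` when creating the video, polling may not be needed at all.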

Response Code Description

Any response code other than 1000 means the request failed or encountered an error.
Code  Description
1000  Success
1003  Parameter error or Parameter cannot be empty
1008  The content you get does not exist
1009  You do not have permission to operate
1015  Create video error, please try again later
1101  Invalid authorization or The request token has expired
1102  Authorization cannot be empty
1200  The account has been banned
1201  Create audio error, please try again later
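
One way to apply the table above in client code is to turn any code other than 1000 into an exception. This is a minimal sketch that assumes the parsed response body is a dict and that the response code sits under a `code` key; the key name is an assumption, since this page does not show a full response. `ApiError` and `check_response` are illustrative names, not part of the API.

```python
# Descriptions copied from the response code table.
RESPONSE_CODES = {
    1000: "Success",
    1003: "Parameter error or Parameter cannot be empty",
    1008: "The content you get does not exist",
    1009: "You do not have permission to operate",
    1015: "Create video error, please try again later",
    1101: "Invalid authorization or The request token has expired",
    1102: "Authorization cannot be empty",
    1200: "The account has been banned",
    1201: "Create audio error, please try again later",
}


class ApiError(Exception):
    """Raised when the response code is anything other than 1000."""

    def __init__(self, code, description):
        super().__init__(f"{code}: {description}")
        self.code = code
        self.description = description


def check_response(body: dict) -> dict:
    """Return the body unchanged on success, raise ApiError otherwise.

    Assumes the response code is stored under the "code" key.
    """
    code = body.get("code")
    if code != 1000:
        raise ApiError(code, RESPONSE_CODES.get(code, "Unknown response code"))
    return body
```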

Talking Photo Status Codes

When checking results, the video_status field indicates the current state:
Status  Description
1       In Queue - Your request is waiting to be processed
2       Processing - Talking photo video is currently being generated
3       Success - Video completed, result URL is available
4       Failed - Video generation failed, please check your input resources
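
In client code, the four status values can be given names so that comparisons read more clearly than bare integers. The enum below is an illustrative sketch; the class name `VideoStatus` and the `is_terminal` property are not part of the API, only the numeric values come from the table above.

```python
from enum import IntEnum


class VideoStatus(IntEnum):
    """Named values for the video_status field."""

    IN_QUEUE = 1
    PROCESSING = 2
    SUCCESS = 3
    FAILED = 4

    @property
    def is_terminal(self) -> bool:
        # Success and Failed are final; the other states are still pending.
        return self in (VideoStatus.SUCCESS, VideoStatus.FAILED)
```

For example, `VideoStatus(info["video_status"]).is_terminal` tells you whether to stop checking, assuming `info` is the parsed result of the Get Video Info Result API.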

Best Practices

Image Requirements

  • Quality: Use high-resolution images for better results
  • Face Visibility: Ensure faces are clearly visible and not obscured
  • Lighting: Well-lit images produce better talking photo results
  • Angle: Frontal or slight angle faces work best

Audio Requirements

  • Quality: Use clear, high-quality audio files
  • Format: Standard audio formats (MP3 recommended)
  • Duration: Audio duration determines video length

API Usage Tips

  • Webhook: Use the webhookUrl parameter to receive notifications when processing is complete
  • Prompt: Use the prompt parameter to control hand gestures for more natural-looking videos
  • Resolution: Choose appropriate resolution (720 or 1080) based on your needs - higher resolution may take longer to process
  • Result Cleanup: Save generated videos promptly as they expire after 7 days

Common Use Cases

Basic Talking Photo Video

Create a simple talking photo video from an image and audio:
{
  "talking_photo_url": "https://example.com/photo.jpg",
  "audio_url": "https://example.com/audio.mp3",
  "resolution": "720",
  "webhookUrl": ""
}

Talking Photo with Prompt

Use the prompt parameter to control hand gestures for professional presentations:
{
  "talking_photo_url": "https://example.com/photo.jpg",
  "audio_url": "https://example.com/audio.mp3",
  "prompt": "Throughout the entire video, maintain natural and smooth hand movements. When speaking, use appropriate hand gestures to emphasize key points, such as opening your hands to express welcome or explanation, pointing your fingers forward to emphasize, and placing your hands together to summarize. The gestures should be coherent, not stiff, with moderate amplitude, and the frequency should be coordinated with the speaking speed. During pauses, the gestures naturally return to a relaxed state, presenting a professional and friendly presentation style.",
  "resolution": "1080",
  "webhookUrl": ""
}
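
Both request bodies above share the same fields, so they can be built by one small helper. This is a sketch under stated assumptions: `build_payload` is a hypothetical name, the field names are taken from the JSON examples, and the resolution check reflects the two values (720 and 1080) mentioned under API Usage Tips. Sending the request is left out because the endpoint URL and authorization header are not shown on this page.

```python
import json
from typing import Optional

# Resolutions mentioned in the API usage tips, as strings like the examples.
ALLOWED_RESOLUTIONS = ("720", "1080")


def build_payload(
    talking_photo_url: str,
    audio_url: str,
    resolution: str = "720",
    prompt: Optional[str] = None,
    webhook_url: str = "",
) -> dict:
    """Build a request body shaped like the JSON examples above."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    payload = {
        "talking_photo_url": talking_photo_url,
        "audio_url": audio_url,
        "resolution": resolution,
        "webhookUrl": webhook_url,
    }
    # The prompt is optional, so include it only when one is given.
    if prompt:
        payload["prompt"] = prompt
    return payload


if __name__ == "__main__":
    body = build_payload(
        "https://example.com/photo.jpg",
        "https://example.com/audio.mp3",
    )
    print(json.dumps(body, indent=2))
```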

Support

For additional help and examples, check out our: