This is the main endpoint for face detection. It automatically detects whether the input is an image or video and processes accordingly.
Key Features
- Auto Media Type Detection: Automatically determines if the input is an image or video
- 5-Point Landmarks: Detects 5 key facial landmarks for each face
- Bounding Boxes: Provides precise face region coordinates
- Video Face Tracking: Tracks faces across frames and identifies removed faces
- Async Processing: Downloads and processes media asynchronously for better performance
How It Works
- Media Type Detection: The API inspects the URL's file extension to determine whether the media is an image or a video
- Media Download: Downloads the media from the provided URL asynchronously
- Processing:
  - For images: Loads and analyzes the single image
  - For videos: Extracts frames at regular intervals and analyzes each frame
- Face Detection: Uses the InsightFace model to detect faces and landmarks
- Face Tracking (videos only): Tracks faces across frames and marks previous positions as removed
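The media type detection step can be approximated on the client, for example to reject unsupported URLs before sending a request. The sketch below is a minimal illustration that assumes detection depends only on the file extension in the URL path; `guess_media_type` is a hypothetical helper, not part of the API, and the server's actual logic may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension lists taken from the "Supported formats" list on this page.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".webm"}


def guess_media_type(url: str) -> str:
    """Return "image", "video", or "unknown" from the URL path's extension.

    Hypothetical client-side helper; the server's real logic may differ.
    """
    # Look only at the path so query strings such as "?token=abc" are ignored.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return "unknown"


print(guess_media_type("https://example.com/media.mp4"))  # video
```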
Request Parameters
url (required)
The URL of the media file to process. Must be publicly accessible. Supported formats:
- Images: .jpg, .jpeg, .png, .bmp, .webp
- Videos: .mp4, .mov, .avi, .webm
num_frames (optional)
Number of frames to extract and analyze from videos. Default: 5
Recommendations for videos:
- Short videos (< 10s): 5-10 frames
- Medium videos (10-30s): 10-20 frames
- Long videos (> 30s): 20-50 frames
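If the video duration is known in advance, the recommendations above can be turned into a small lookup. This is only a sketch: `recommend_num_frames` is a hypothetical helper, and returning the lower bound of each range is an arbitrary choice that favors speed over coverage.

```python
def recommend_num_frames(duration_seconds: float) -> int:
    """Pick a num_frames value from the recommendations above.

    Hypothetical helper: returns the lower bound of each recommended range.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    if duration_seconds < 10:
        return 5    # short videos: 5-10 frames
    if duration_seconds <= 30:
        return 10   # medium videos: 10-20 frames
    return 20       # long videos: 20-50 frames


print(recommend_num_frames(45))  # 20
```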
Response Format
Success Response
Response Fields
error_code
- Type: integer
- 0: Success
- 1: Error occurred (check error_msg)
error_msg
- Type: string
- Success: "SUCCESS"
- Error: Detailed error message
faces_obj
- Type: object
- Dictionary keyed by frame index (as string)
- For images: Only the "0" key is present
- For videos: Multiple keys like "0", "5", "10", etc.
landmarks
- Type: array
- Array of 5-point landmarks for each detected face
- Format: [[[x1, y1], [x2, y2], [x3, y3], [x4, y4], [x5, y5]], ...]
- Order: Left Eye, Right Eye, Nose, Left Mouth Corner, Right Mouth Corner
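Because the five points arrive as a positional list, it can help to attach names to them before further processing. The sketch below follows the documented order and also computes the tilt of the line between the eyes, which is a common first step for face alignment. The names in `LANDMARK_NAMES` and both helper functions are illustrative assumptions, not part of the API.

```python
import math

# Landmark order as documented: left eye, right eye, nose,
# left mouth corner, right mouth corner. The names are illustrative.
LANDMARK_NAMES = ("left_eye", "right_eye", "nose", "mouth_left", "mouth_right")


def name_landmarks(points):
    """Map one face's five [x, y] pairs to a dict keyed by landmark name."""
    if len(points) != len(LANDMARK_NAMES):
        raise ValueError("expected exactly 5 landmark points")
    return {name: tuple(point) for name, point in zip(LANDMARK_NAMES, points)}


def eye_roll_degrees(points):
    """Angle of the line between the eyes, in degrees (0 means level eyes)."""
    named = name_landmarks(points)
    (lx, ly), (rx, ry) = named["left_eye"], named["right_eye"]
    return math.degrees(math.atan2(ry - ly, rx - lx))


face = [[100, 120], [150, 120], [125, 150], [110, 180], [140, 180]]
print(name_landmarks(face)["nose"])  # (125, 150)
print(eye_roll_degrees(face))        # 0.0
```

Rotating the image by the negative of this angle around the midpoint between the eyes would level them; the rotation itself is left to whichever image library is in use.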
region
- Type: array
- Bounding boxes for each detected face
- Format: [[x, y, width, height], ...]
- (x, y) is the top-left corner of the bounding box
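Many image libraries expect crop rectangles as corner coordinates rather than as a top-left point plus size. A minimal conversion, using the documented [x, y, width, height] layout, might look like this; `box_to_corners` and `box_center` are hypothetical helper names.

```python
def box_to_corners(box):
    """Convert [x, y, width, height] to (left, top, right, bottom)."""
    x, y, width, height = box
    return (x, y, x + width, y + height)


def box_center(box):
    """Return the (cx, cy) center of an [x, y, width, height] box."""
    x, y, width, height = box
    return (x + width / 2, y + height / 2)


print(box_to_corners([80, 100, 100, 120]))  # (80, 100, 180, 220)
print(box_center([80, 100, 100, 120]))      # (130.0, 160.0)
```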
removed
- Type: array
- Bounding boxes of faces that were present in previous frames but are no longer visible
- Only applicable for video processing
- Format: [[x, y, width, height], ...]
frame_time
- Type: number or null
- Time in seconds for this frame in the video
- null for images
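Because the faces_obj keys are strings, sorting them directly would place "10" before "5". The sketch below walks the frames in numeric order and collects a short summary per frame. It assumes each frame entry carries the region, removed, and frame_time fields described above; `summarize_frames` is a hypothetical helper.

```python
def summarize_frames(faces_obj):
    """Return one summary dict per frame, ordered by numeric frame index."""
    summaries = []
    # Keys are strings, so sort by integer value ("10" must come after "5").
    for key in sorted(faces_obj, key=int):
        frame = faces_obj[key]
        summaries.append({
            "frame_index": int(key),
            "frame_time": frame.get("frame_time"),  # None for images
            "num_faces": len(frame.get("region", [])),
            "num_removed": len(frame.get("removed", [])),
        })
    return summaries


image_result = {"0": {"landmarks": [], "region": [[80, 100, 100, 120]],
                      "removed": [], "frame_time": None}}
print(summarize_frames(image_result)[0]["num_faces"])  # 1
```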
Examples
Example 1: Image Face Detection
Request:
For image detection, the num_frames parameter is not required and will be ignored if provided.
Example 2: Video Face Detection
Request:
Error Responses
Invalid URL
No Faces Detected
Processing Error
Use Cases
1. Face Swap Preprocessing
Detect face landmarks to prepare images for face swapping operations.
2. Face Recognition
Extract face regions and landmarks for face recognition systems.
3. Video Analysis
Track faces across video frames for content analysis or editing.
4. Face Alignment
Use landmarks to align faces for consistent processing.
5. Facial Animation
Use landmarks as control points for facial animation.
Best Practices
URL Requirements
- Use HTTPS URLs for better security
- Ensure URLs are publicly accessible (no authentication required)
- Use direct links to media files (avoid redirects)
Performance Optimization
- For videos, use an appropriate num_frames value
  - More frames = higher accuracy but longer processing time
  - Fewer frames = faster processing but may miss faces
- Cache results if processing the same media multiple times
Error Handling
Always check the error_code before processing results.
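A minimal sketch of that check, operating on an already decoded JSON response, is shown here. `FaceDetectionError` and `extract_faces` are hypothetical names introduced for illustration.

```python
class FaceDetectionError(Exception):
    """Raised when the API reports error_code != 0 (hypothetical helper)."""


def extract_faces(response: dict) -> dict:
    """Return faces_obj from a decoded response, or raise on an API error."""
    if response.get("error_code") != 0:
        raise FaceDetectionError(response.get("error_msg", "unknown error"))
    return response.get("faces_obj", {})


ok = {"error_code": 0, "error_msg": "SUCCESS", "faces_obj": {"0": {}}}
print(list(extract_faces(ok)))  # ['0']
```

Raising an exception keeps the failure path out of the normal processing code; returning an empty result instead would make an API error indistinguishable from a response with no frames.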
Rate Limits
Authorizations
Your API key, used for request authorization. If both Authorization and x-api-key have values, Authorization takes precedence and x-api-key is ignored.
Body
application/json
url
URL of the video or image to process. The media type will be auto-detected based on the file extension.
Example: "https://example.com/media.mp4"
num_frames
Number of frames to extract and analyze (only used for videos, ignored for images)
Required range: 1 <= x <= 100
Example: 5
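The body fields and the num_frames range can be validated on the client before the request is sent. The sketch below only builds the JSON body; the endpoint URL and the exact authorization header format are not given on this page, so they are left out. `build_request_body` is a hypothetical helper.

```python
import json
from typing import Optional


def build_request_body(url: str, num_frames: Optional[int] = None) -> str:
    """Serialize the JSON body, enforcing the documented 1-100 range."""
    if not url:
        raise ValueError("url is required")
    body = {"url": url}
    if num_frames is not None:
        if not 1 <= num_frames <= 100:
            raise ValueError("num_frames must be between 1 and 100")
        body["num_frames"] = num_frames
    return json.dumps(body)


print(build_request_body("https://example.com/media.mp4", 5))
# {"url": "https://example.com/media.mp4", "num_frames": 5}
```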
Response
Face detection completed successfully
error_code
Error code (0: success, 1: error)
Example: 0
error_msg
Error message or success message
Example: "SUCCESS"
faces_obj
Dictionary of face detection results keyed by frame index (as string). For images, only frame "0" will be present. For videos, multiple frames will be present (e.g., "0", "5", "10", etc.)
Example:
{
  "0": {
    "landmarks": [
      [
        [100, 120],
        [150, 120],
        [125, 150],
        [110, 180],
        [140, 180]
      ]
    ],
    "region": [[80, 100, 100, 120]],
    "removed": [],
    "frame_time": null
  }
}