Capabilities
Transcription
Speech-to-text with speaker detection and timed segments.
Transcription converts a video's audio track to text. It detects the spoken language automatically and returns timed segments with optional speaker diarization.
curl -X POST https://api.netraflow.com/v1/jobs \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: sk_live_your_key_here" \
  -d '{
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "capabilities": ["transcription"]
  }'

{
  "data": {
    "job_id": "job_abc123",
    "status": "completed",
    "results": {
      "transcription": {
        "text": "We're no strangers to love. You know the rules and so do I...",
        "segments": [
          {
            "start": 0.0,
            "end": 4.8,
            "text": "We're no strangers to love",
            "speaker": 0
          },
          {
            "start": 4.8,
            "end": 8.2,
            "text": "You know the rules and so do I",
            "speaker": 0
          }
        ],
        "language": "en",
        "duration_seconds": 212.0,
        "word_count": 423,
        "speakers_detected": 1
      }
    }
  }
}

Response fields
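Prop               Type
text               string
segments           array of segment objects
language           string
duration_seconds   number
word_count         number
speakers_detected  number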
Segment fields
Prop     Type
start    number
end      number
text     string
speaker  number
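
As an illustration of how the timed segments might be consumed, the following is a minimal Python sketch that turns the segments from the sample response into SRT-style captions. It assumes that `start` and `end` are offsets in seconds, which the sample values suggest but the text does not state. The helper names `format_timestamp` and `segments_to_srt` are hypothetical and are not part of the API.

```python
import json

# Sample response copied from the example above, trimmed to the fields used here.
SAMPLE_RESPONSE = """
{
  "data": {
    "job_id": "job_abc123",
    "status": "completed",
    "results": {
      "transcription": {
        "segments": [
          {"start": 0.0, "end": 4.8, "text": "We're no strangers to love", "speaker": 0},
          {"start": 4.8, "end": 8.2, "text": "You know the rules and so do I", "speaker": 0}
        ]
      }
    }
  }
}
"""


def format_timestamp(seconds: float) -> str:
    """Render a second offset as an SRT timestamp (HH:MM:SS,mmm).

    Assumption: segment times are expressed in seconds.
    """
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT caption text with one numbered cue per segment."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text']}"
        )
    return "\n\n".join(blocks) + "\n"


response = json.loads(SAMPLE_RESPONSE)
segments = response["data"]["results"]["transcription"]["segments"]
srt = segments_to_srt(segments)
print(srt)
```

The same loop could group text by the `speaker` value instead of emitting captions. The sketch keeps to the four segment fields shown in the sample response.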