The <track> element represents a timed text file to provide users with multiple languages or commentary for videos. You can use multiple tracks and set one as default to be used when the video starts.
You can provide a transcript of the video in one of the following forms:
- Subtitles are what you might expect to see while watching a foreign-language film: a transcription or translation of the video’s dialogue.
- Captions are for viewers who can’t hear the audio of the video, and include descriptions of non-dialogue sound. For example, if a character in a video slams a door off-camera, the captions would include something like [door slams]. Both subtitles and captions are displayed by the browser as text overlays on top of the playing video.
- Descriptions are not displayed visually, but are rather spoken out loud by a screen reader, benefitting viewers who can’t see the video. Not surprisingly, descriptions describe what’s happening visually in the scene.
The text is displayed in the lower portion of the video player. At this time the position and color can’t be controlled, but you can retrieve text through script and display it in your own way.
Text tracks use a simplified version of the Web Video Text Tracks (WebVTT) or Timed Text Markup Language (TTML) timed text file formats.
You define the track inside the video element.
The attributes are:
- kind. Defines the type of text content. Possible values are: subtitles, captions, descriptions, chapters, metadata.
- src. URL of the timed text file.
- srclang. Language of the timed text file. For information purposes; not used in the player.
- label. Provides a label that can be used to identify the timed text. Each track must have a unique label.
- default. Specifies the default track element. If not specified, no track is displayed.
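As a rough illustration of how these attributes fit together, the sketch below builds the markup for a video with two subtitle tracks. The helper names (escapeAttr, trackTag, videoMarkup) and the file names (video.mp4, subs-en.vtt, subs-fr.vtt) are made up for this example; in a real page you would normally write the tags by hand.

```javascript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build one <track> tag from a plain description object.
// The object shape ({ kind, src, srclang, label, isDefault }) is an assumption of this sketch.
function trackTag(t) {
  const attrs = [
    `kind="${escapeAttr(t.kind)}"`,
    `src="${escapeAttr(t.src)}"`,
    `srclang="${escapeAttr(t.srclang)}"`,
    `label="${escapeAttr(t.label)}"`,
  ];
  if (t.isDefault) attrs.push("default"); // boolean attribute, no value needed
  return `<track ${attrs.join(" ")}>`;
}

// Wrap the tracks in a <video> element; tracks are children of the video.
function videoMarkup(videoSrc, tracks) {
  const lines = tracks.map((t) => "  " + trackTag(t));
  return [`<video controls src="${escapeAttr(videoSrc)}">`, ...lines, "</video>"].join("\n");
}

// Hypothetical file names; only one track should carry the default attribute.
const markup = videoMarkup("video.mp4", [
  { kind: "subtitles", src: "subs-en.vtt", srclang: "en", label: "English", isDefault: true },
  { kind: "subtitles", src: "subs-fr.vtt", srclang: "fr", label: "Français", isDefault: false },
]);
console.log(markup);
```

Each track gets its own label, and only the English track is marked as default, so it is the one shown when the video starts.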
WebVTT files are UTF-8 (8-bit Unicode Transformation Format) text files that look like the following.

    WEBVTT

    00:00.000 --> 00:10.000
    This text is related to the first ten seconds of the video

    00:10.000 --> 00:20.000
    This text is related to the next ten seconds of the video
The file starts with the tag WEBVTT on the first line, followed by a line feed. The timing cues are in the format HH:MM:SS.sss; the hours field may be omitted, as in the sample above. The start and end cues are separated by a space, two hyphens and a greater-than sign (-->), and another space. The timing cues are on a line by themselves with a line feed. Immediately following the cue is the caption text. Text captions can be one or more lines. The only restriction is that there must be no blank lines between lines of text.
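To make the format rules above concrete, here is a minimal parsing sketch. It is not how a browser parses WebVTT: it handles only the header, the timing line, and the caption text, and it ignores cue settings, styling, and other parts of the full format. The function names parseTimestamp and parseVtt are made up for this example.

```javascript
// Convert "HH:MM:SS.sss" or "MM:SS.sss" into a number of seconds.
function parseTimestamp(stamp) {
  const match = /^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
  if (!match) throw new Error("Bad WebVTT timestamp: " + stamp);
  const hours = match[1] ? Number(match[1]) : 0; // hours are optional
  return hours * 3600 + Number(match[2]) * 60 + Number(match[3]) + Number(match[4]) / 1000;
}

// Split WebVTT source text into { start, end, text } cue objects.
// Simplified: any line containing "-->" is treated as a timing line.
function parseVtt(source) {
  const lines = source.replace(/^\uFEFF/, "").replace(/\r\n?/g, "\n").split("\n");
  if (!/^WEBVTT/.test(lines[0])) throw new Error("Missing WEBVTT header");
  const cues = [];
  let i = 1;
  while (i < lines.length) {
    const line = lines[i];
    if (line.includes("-->")) {
      const [startPart, rest] = line.split("-->");
      const endPart = rest.trim().split(/\s+/)[0]; // drop anything after the end time
      const text = [];
      i++;
      // Caption text runs until the first blank line.
      while (i < lines.length && lines[i] !== "") {
        text.push(lines[i]);
        i++;
      }
      cues.push({
        start: parseTimestamp(startPart.trim()),
        end: parseTimestamp(endPart),
        text: text.join("\n"),
      });
    } else {
      i++; // header, blank line, or a line this sketch does not interpret
    }
  }
  return cues;
}

const sample = [
  "WEBVTT",
  "",
  "00:00.000 --> 00:10.000",
  "This text is related to the first ten seconds of the video",
  "",
  "00:10.000 --> 00:20.000",
  "This text is related to the next ten seconds of the video",
].join("\n");

console.log(parseVtt(sample));
```

Because a blank line ends a cue, caption text that contains an empty line would be cut short, which is why blank lines are not allowed between lines of text.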
The MIME type for WebVTT files is text/vtt, which you will need to set in IIS.
Getting the Captions
Here is a sample that reads the captions in the WebVTT file and puts them on the page as they are being read. It reads the text from the active track to get the cues and inserts them into the display paragraph.
Note: The track must be loaded to read the cues.
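The approach described above could look roughly like the following sketch. It is not the original sample. It uses the standard textTracks, activeCues, and cuechange members of the text track API; the function names and the element ids in the usage comment are assumptions for illustration.

```javascript
// Return the track that is currently showing, or the first track as a fallback.
function findActiveTrack(video) {
  const tracks = video.textTracks;
  for (let i = 0; i < tracks.length; i++) {
    if (tracks[i].mode === "showing") return tracks[i];
  }
  return tracks.length > 0 ? tracks[0] : null;
}

// Join the text of every cue that is active right now.
function activeCueText(track) {
  const cues = track.activeCues; // may be null while the track is not loaded
  if (!cues) return "";
  const parts = [];
  for (let i = 0; i < cues.length; i++) {
    parts.push(cues[i].text);
  }
  return parts.join("\n");
}

// Copy the active cue text into a display element whenever the cues change.
function showCaptions(video, display) {
  const track = findActiveTrack(video);
  if (!track) return;
  // "hidden" keeps the cues loading and firing events without the player
  // drawing them; this is a choice made for this sketch.
  track.mode = "hidden";
  track.addEventListener("cuechange", () => {
    display.textContent = activeCueText(track);
  });
}

// Usage in a page (hypothetical element ids):
// showCaptions(document.getElementById("video1"), document.getElementById("display"));
```

Setting the mode to hidden avoids showing the same text twice, once in the player and once in the display paragraph. Leave the mode as showing if you want both.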
For more samples, including one where you can interact with the transcript to take your user to a particular part of the video, see Video (Windows).
Not all browsers support WebVTT directly. IE10 and Windows Store applications do support WebVTT.
All of these solutions support video subtitles, and some offer additional features.
IIS supports serving Ogg, WebM, and MP4, but you will need to add the MIME type for the timed text file used by your track element.
For more information about setting MIME type, see Add a MIME Type (IIS 7).
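Whatever server you use, the idea is the same: each file extension is mapped to a content type. The sketch below shows that mapping for the file types mentioned in this article. It is only an illustration of the lookup, not IIS configuration, and the names MIME_TYPES and contentTypeFor are made up for this example.

```javascript
// Extension-to-MIME-type table for the media and timed text files discussed here.
const MIME_TYPES = {
  ".vtt": "text/vtt",
  ".mp4": "video/mp4",
  ".webm": "video/webm",
  ".ogg": "video/ogg",
};

// Look up the content type for a file name; unknown extensions fall back
// to a generic binary type (an assumption of this sketch).
function contentTypeFor(fileName) {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  return MIME_TYPES[ext] || "application/octet-stream";
}

console.log(contentTypeFor("subs-en.vtt"));
```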
Apache support is explained in <video> on Mozilla.
- Working with HTML5 multimedia components – Part 1: Video
- WebVTT Living Standard and Media Multiple Text Tracks API
- Video (Windows) for how you can use tracks in IE10.
- What’s new in HTML5: The Track Element
Sample code is available in the DevDays GitHub repository. See https://github.com/devdays/html5-tutorials