The future of HTML5 video

This article first appeared in issue 224 of .net magazine – the world's best-selling magazine for web designers and developers.

HTML5 brings native multimedia to browsers. In ye olde days, video and audio were handed off to third-party plug-ins (which may not be available for every device or operating system). Communication between a browser and a plug-in is limited and therefore the multimedia was very much a black box.

<track src=subtitles.vtt kind=subtitles srclang=en label="English">

The src attribute points to the external timed text file. The kind attribute tells the browser what that file contains: subtitles (the dialogue is transcribed and possibly translated, because otherwise it wouldn’t be understood); captions (transcription or translation of the dialogue, sound effects, musical cues and other relevant audio information, suitable for when sound is unavailable or not clearly audible); descriptions (textual descriptions of the video component of the media resource, intended for audio synthesis when the visual component is obscured, unavailable or not usable, for example because the user is interacting with the application without a screen while driving, or because the user is blind); chapters; or metadata. The default is subtitles.

<video controls>
  <source src=movie.mp4 type=video/mp4>
  <source src=movie.webm type=video/webm>
  <track kind=subtitles srclang=en src=subtitles-en.vtt label="English">
  <track kind=subtitles srclang=de src=subtitles-de.vtt label="German">
  <!-- fallback content, eg a Flash movie or YouTube embed code -->
</video>
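With more than one track on a video, a script can choose which one is displayed. The sketch below assumes the browser exposes the tracks through the video element's textTracks list, with each track carrying kind, language and mode properties; showTrackForLanguage is a made-up helper name, not part of any API.

```javascript
// Show the subtitle or caption track matching a language code and
// switch the other subtitle/caption tracks off.
// `video` is assumed to be a <video> element; anything exposing a
// `textTracks` list will do, which keeps the function easy to try out.
function showTrackForLanguage(video, lang) {
  let shown = null;
  for (const track of Array.from(video.textTracks)) {
    const isSubtitleLike =
      track.kind === 'subtitles' || track.kind === 'captions';
    if (!isSubtitleLike) continue; // leave chapters, metadata etc. alone
    if (shown === null && track.language === lang) {
      track.mode = 'showing';
      shown = track;
    } else {
      track.mode = 'disabled';
    }
  }
  return shown; // null when no track matches the language
}
```

Calling showTrackForLanguage(video, 'de') on a video with English and German tracks would display the German one and turn the English one off.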

You can change text size, for example S:150% increases the size to 150 per cent of the default. It’s possible to have subtitles appear incrementally (for example, with karaoke lyrics in which the line appears one word at a time, but the previous word doesn’t disappear when a new one is displayed). You can style different speakers’ words with different colours, and there is basic support for styling different words with different colours. For more information visit delphiki.com/webvtt/#cue-settings.
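To make the incremental and per-speaker ideas concrete, here is a rough sketch that generates a WebVTT file from script. It assumes the format as currently specified: a WEBVTT header line, cues written as "start --> end" followed by the text, a voice tag such as <v Singer> to mark the speaker, and inline timestamp tags to reveal words one at a time. The function names and the shape of the words array are invented for illustration.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function vttTime(seconds) {
  const ms = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, '0');
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms % 1000, 3)}`;
}

// Build one karaoke-style cue. Every word after the first is preceded by
// a timestamp tag, so words appear one at a time and stay on screen.
// `words` is a hypothetical input shape: [{ text, at }], `at` in seconds.
function karaokeCue(start, end, speaker, words) {
  const body = words
    .map((w, i) => (i === 0 ? w.text : `<${vttTime(w.at)}>${w.text}`))
    .join(' ');
  return `${vttTime(start)} --> ${vttTime(end)}\n<v ${speaker}>${body}</v>`;
}

// Join cues into a complete file: header first, blank line between cues.
function toWebVTT(cues) {
  return ['WEBVTT', ...cues].join('\n\n') + '\n';
}

const cue = karaokeCue(1, 4, 'Singer', [
  { text: 'Never', at: 1 },
  { text: 'gonna', at: 2 },
  { text: 'give', at: 3 },
]);
console.log(toWebVTT([cue]));
```

The voice tag is what lets a stylesheet give each speaker a different colour; the timestamps inside the cue text do the karaoke reveal.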

In May 2011, WebKit announced it would implement Mozilla’s own flavour of a full-screen API. This API allows any element to go full-screen (not only <video>) – you might want full-screen <canvas> games or video widgets embedded in a page via an <iframe>. Scripts can also opt in to having alphanumeric keyboard input enabled during full-screen view, which means that you could create your super spiffing platform game using the <canvas> API and it could run full-screen with full keyboard support.
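As a rough illustration, the helper below asks for full-screen on any element. It assumes the method ends up being called requestFullscreen, and falls back to the vendor-prefixed names that WebKit, Mozilla and Microsoft browsers have used; enterFullscreen and the #go-fullscreen button id are made up for this sketch. Browsers generally only honour the request when it comes from a user action such as a click.

```javascript
// Ask the browser to show `element` full-screen, trying the unprefixed
// method first and then the vendor-prefixed names older browsers used.
// Returns true if a method was found and called, false otherwise.
function enterFullscreen(element) {
  const request =
    element.requestFullscreen ||
    element.webkitRequestFullscreen ||
    element.mozRequestFullScreen ||
    element.msRequestFullscreen;
  if (typeof request !== 'function') return false;
  request.call(element);
  return true;
}

// Browser-only wiring: send the first <canvas> full-screen on click.
// '#go-fullscreen' is a hypothetical button id.
if (typeof document !== 'undefined') {
  const canvas = document.querySelector('canvas');
  const button = document.querySelector('#go-fullscreen');
  if (canvas && button) {
    button.addEventListener('click', () => enterFullscreen(canvas));
  }
}
```

Because the helper takes any element, the same call works for a <video>, a <canvas> game or a wrapper <div>.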

There are two main use cases for synchronising media elements. Imagine a site that shows videos of sporting events: there might be multiple video elements, each from a different camera angle – for example, one on each goal, one in the air and one tracking the ball. A site showing a concert might have one camera on the bass guitar, one on the guitar and one on the Peruvian noseflute. Moving the seek bar, or changing the playback rate to slow motion, on one video should affect each of the other videos.

<video mediagroup="jedward" src="bass-guitar.webm">..</video>
<video mediagroup="jedward" src="lead-guitar.webm">..</video>
<video mediagroup="jedward" src="idiot-1.webm">..</video>
<video mediagroup="jedward" src="idiot-2.webm">..</video>

<video mediagroup="described-film" src="mankini-magic.webm">..</video>
<audio mediagroup="described-film" src="describe.ogg">..</audio>
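Browser support for mediagroup can't be taken for granted, so it may help to see roughly what the attribute is meant to do for you. The sketch below is a hand-rolled approximation, not the mediagroup mechanism itself: it copies seeking, playback rate, play and pause from one "leader" element to the rest, using the standard media events. syncMedia is an invented name, and a real implementation would also need to correct drift between the elements.

```javascript
// Mirror seeking, playback rate, play and pause from one "leader" media
// element onto a list of "followers". This approximates the behaviour
// described for mediagroup; it does not keep the elements frame-accurate.
function syncMedia(leader, followers) {
  const each = (fn) => followers.forEach(fn);
  leader.addEventListener('seeking', () =>
    each((m) => { m.currentTime = leader.currentTime; }));
  leader.addEventListener('ratechange', () =>
    each((m) => { m.playbackRate = leader.playbackRate; }));
  leader.addEventListener('play', () => each((m) => m.play()));
  leader.addEventListener('pause', () => each((m) => m.pause()));
}

// Browser-only wiring: treat the first video in the group as the leader.
if (typeof document !== 'undefined') {
  const [leader, ...followers] =
    document.querySelectorAll('video[mediagroup="jedward"]');
  if (leader) syncMedia(leader, followers);
}
```

The same function covers the audio description case: pass the film as the leader and the described-audio element as its only follower.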

Access to a device’s camera and microphone was traditionally the territory of the Flash plug-in; HTML5 now adds this facility. Previously known as HTML5 <device>, this functionality is now wrapped in an API called getUserMedia. To tell the device what type of media we wish to get, we pass audio or video as arguments. Because many devices have a front-facing camera, which captures the user’s image, and a rear camera, we can also pass in the tokens user or environment.

if (navigator.getUserMedia) {
  navigator.getUserMedia('audio, video user', successCallback, errorCallback);
}

var video = document.getElementsByTagName('video')[0];
...
function successCallback(stream) { video.src = stream; }
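The string-based call above follows the early draft of the API. For comparison, here is a sketch of the same idea in the promise-based form that browsers now expose as navigator.mediaDevices.getUserMedia, where the audio/video choice and the user/environment token become a constraints object and the stream is attached through srcObject. buildConstraints and startCamera are invented names, and the mediaDevices parameter exists only so the function can be tried without a real camera.

```javascript
// Build a constraints object: audio plus video from the chosen camera.
// facingMode 'user' is the front camera, 'environment' the rear one.
function buildConstraints(facing) {
  return { audio: true, video: { facingMode: facing } };
}

// Request the stream and attach it to a <video> element.
// `mediaDevices` defaults to the browser's own object; passing it in is
// an assumption of this sketch, made so the code can run with a stand-in.
async function startCamera(
  video,
  facing = 'user',
  mediaDevices = navigator.mediaDevices
) {
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== 'function') {
    throw new Error('getUserMedia is not available');
  }
  const stream = await mediaDevices.getUserMedia(buildConstraints(facing));
  video.srcObject = stream; // replaces the older video.src = stream
  return stream;
}
```

In a page you would call startCamera(document.querySelector('video')) and handle a rejected promise where the older code used errorCallback, for example when the user refuses permission.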

Of course, it’s just possible that the designers of the getUserMedia API had other uses in mind, besides drawing moustaches. It could be used for browser-based QR/bar code readers. Or, more excitingly, augmented reality. The HTML5 Working Group is currently specifying a peer-to-peer API which will allow you to hook up your camera and microphone to the <video> and <audio> elements of someone else’s browser, making it possible to do video conferencing.


