With the introduction of HTML5, developers have acquired a new tool for delivering video to their audiences. Alongside the juggernaut Flash Player – installed in 99 per cent of internet-enabled personal computers – HTML5 is poised to be a major player in the video world.
However, for better or worse, the HTML5 specification doesn’t rigidly define all aspects of video delivery, and the inevitable format wars that arise in such cases are making life a little difficult for developers and content creators everywhere. This multi-part series looks at how to navigate these ever-changing waters. This first installment will discuss how to use HTML5 in conjunction with Flash Player to reach the widest possible audience, and how to create video assets in all the forms required to achieve this goal.
What is HTML5?
The first thing that really needs to be said is that HTML5 is not a standard. It’s still in flux and that can have a big impact on early adopters. The draft HTML5 specification was started in 2004 by the Web Hypertext Application Technology Working Group (WHATWG), a collection of industry players founded by individuals from Apple, the Mozilla Foundation, and Opera Software. The World Wide Web Consortium (W3C) then adopted it as a working draft in 2007. The spec’s Last Call (seeking peer and public review of the spec in its near-final form) was issued in May 2011 but, though no major new features are expected, it isn’t even feature-complete. Experts don’t expect the specification to be ratified by the W3C until anywhere from 2014 to 2022, depending on whose guess you’re listening to, and when it was made.
Fun Fact: Although the W3C is currently still using HTML5 as the spec’s moniker, the WHATWG – considered by many to be the principal force behind the evolution of HTML – has dropped numbering from future efforts. It plans to stick with “HTML” and dubs it a “living standard.”
What does all this mean? At this point, it really only means that early adopters need to be prepared for changes as the specification goes through its final, albeit multi-year, phase. Many HTML5 features are available today, and most experts agree that radical changes in the spec are not likely.
The trouble is, we’ve only been talking about the official relatively-soon-to-be standard thus far. Implementation of that standard is another matter. Each browser, and every version of each browser, implements HTML5 features to a varying degree, and with varying success. That makes things more difficult for developers, and we’re just getting started. Here’s a quick snapshot of browser and mobile operating system (OS) compatibility when it comes to basic HTML5 video features:
What is HTML5 video?
In its simplest form, HTML5 video markup is a single tag:
<video src="vid.mp4" width="320" height="240" controls></video>
But before we can get to much better uses of the markup, we need to recognise that this is only a small part of the picture. Arguably more important (and certainly more complex) than the markup, is the format of the content itself. Although the term format is used fairly freely, it most often collectively represents not only the algorithms for compressing and decompressing video and audio data, but also the container that delivers this data in a cohesive, usable form. Simply put, a container is a wrapper for video, audio, and related content.
One of the biggest challenges facing HTML5 developers is that the current HTML5 draft specification doesn’t recommend any one video format browsers should support. The teams behind each browser are left to determine which video format is best suited to the task. This has resulted in a splintered implementation of video features and poses a significant hurdle that content creators must overcome. In fact, to support HTML5 video delivery in all of the most widely used browsers, a developer must either create three separate files for each video, or rely on a server solution that transcodes (converts from one format to another) on the fly.
For this article, we’ll confine our discussion to HTML5- and Flash-compatible containers and, for the most part, the video and audio portions therein. However, these wrappers can also support additional features such as subtitles, chapter information, and meta-data, as well as how all of these data types are synchronised. In some cases, containers support further options such as digital rights management (DRM), basic 3D rendering options, and hardware rendering support.
There are four primary containers supported by HTML5 and/or Flash. The pros and cons of each format are largely down to the video and audio features they include (which I’ll talk about in a moment), but some container-specific features warrant mentioning here, too.
MPEG-4 (more accurately MPEG-4, Part 14) is a proprietary container controlled by the Moving Picture Experts Group. Using the familiar file extension .mp4 (though .m4v is sometimes also used), the MPEG-4 container is arguably the most recognised. It has its roots in Apple’s QuickTime container (which typically uses the .mov extension) and is currently the most common choice for video playback on desktop and mobile devices. It is, for example, the container used by the iTunes Mac/Windows/iPhone/iPad/iPod touch ecosystem. However, due to patent and licensing issues, this is slowly changing and the HTML5 initiative is accelerating that change. As a container, MPEG-4 is notable because it supports digital rights management (an essential feature for some content creators), and is also attractive because hardware acceleration (chip-based, or assisted, encoding and/or decoding at runtime) is widely available for the format. In addition to being supported by several HTML5 browsers, the MPEG-4 container is also supported by Flash Player.
Ogg is a free, open container format maintained by the Xiph.Org Foundation. Though the Xiph.Org Foundation recommends use of the .ogv file extension, it’s quite common to see Ogg video files with the .ogg extension. Ogg is native in the Linux OS, and supported in the Mac and Windows operating systems by QuickTime components. The Windows OS can also use a Windows Media Player extension or DirectShow filters to display Ogg content.
WebM is a free, open container format sponsored by Google, and uses the .webm file extension. Hardware acceleration is already available for WebM files, and Adobe announced that future versions of Flash Player would support the WebM container.
Flash Video makes use of two containers. FLV is an older, proprietary format that is supported by Flash Player versions going back to version 7, introduced in 2003. Flash Player version 9 update 3 introduced support for the F4V format, which is based on MPEG-4. (As mentioned previously, valid MPEG-4 containers are supported by Flash Player, when using version 9 update 3 and later.)
Flash Player does not rely on file extensions to play compatible video files, but the extensions commonly used for each container are .flv and .f4v, respectively. (The F4V format also includes the .f4p extension for protected video.) Flash Player 10.2 and later supports hardware acceleration for Flash Video.
The main task of a container is to bring the video and audio content together. Manipulating these assets is the responsibility of codecs.
Codecs are algorithms used for encoding assets at authortime and decoding them for playback at runtime. (The term codec is a portmanteau derived from “coder”/“decoder.”) Software and hardware makers (such as the companies behind browsers and mobile devices) must decide which containers and codecs to support with technologies such as HTML5 and Flash Video. Factors such as quality, file size, bandwidth and similar issues all play a part in this decision.
One of the biggest such factors is the impact of patents and licensing costs, and a push for free and open codecs is much of the reason for the video format fragmentation currently faced by web video developers. There are many codecs but, for this discussion, we'll focus on the primary codecs for HTML5 and Flash delivery. First up are the four video codecs: H.264, Theora, VP8, and VP6. H.264 and VP6 are restricted by patents and subject to licensing fees, while Theora and VP8 are free and either unencumbered by patents or maintained under a mandate not to enforce these patents.
Also referred to as MPEG-4 Part 10 or Advanced Video Coding (AVC), H.264 was developed by MPEG and became a standard in 2003. It’s a high-quality codec that is optimised for low- and high-bandwidth/CPU devices, from phones to Blu-ray players. Its specification includes "profiles," ranging from Baseline to High, that dictate varying degrees of quality and optional features. It also includes scalable profiles that allow a single file to adjust quality based on playback capabilities. It’s supported through both software and hardware acceleration and is widely used in everything from mobile devices to high definition video in broadcast, DVD, and similar environments. H.264 is part of the MPEG-4 container, and is also widely used in Flash Video, either directly or as part of the F4V container.
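In practice, the H.264 profile often surfaces in the codecs string of a MIME type, for example avc1.42E01E. The following is a minimal sketch of how such a string can be read, assuming the avc1.PPCCLL layout described in RFC 6381 (two hex digits each for profile, constraint flags, and level). The helper name describeAvc1 and the three-entry profile table are illustrative only and far from exhaustive.

```javascript
// Profile names keyed by the H.264 profile_idc value (decimal).
// Illustrative subset only; the standard defines more profiles.
const H264_PROFILES = { 66: "Baseline", 77: "Main", 100: "High" };

// Hypothetical helper: decodes a codecs string of the form "avc1.PPCCLL",
// where PP is the profile, CC the constraint flags, and LL the level (all hex).
function describeAvc1(codec) {
  const match = /^avc1\.([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(codec);
  if (!match) return null; // not an avc1 codecs string

  const profileIdc = parseInt(match[1], 16);
  const levelIdc = parseInt(match[3], 16);
  return {
    profile: H264_PROFILES[profileIdc] || "Other",
    level: levelIdc / 10, // a level_idc of 30 means level 3.0
  };
}

const baseline = describeAvc1("avc1.42E01E");
```

Here 42 is hexadecimal for 66, the Baseline profile, and 1E is hexadecimal for 30, meaning level 3.0, which is why this particular string is a common choice when targeting low-powered devices.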
Theora is a free, open video codec. While other containers can use Theora as a video codec, it is most often associated with the Ogg format. On2 Technologies originally developed what would become Theora, as VP3. Theora was derived from VP3 after On2 released VP3 into the public domain. The Xiph.Org Foundation now maintains the codec. It requires no licensing agreements and is unencumbered by known patents other than the original VP3 patents, which have been licensed royalty-free. Theora is native on Linux, and supported on Mac and Windows operating systems through open source encoders/decoders.
Also developed by On2 Technologies, VP8 is known for quality similar to the H.264 High profile, but with a low decoding complexity similar to the H.264 Baseline profile. Google acquired On2, and with it VP8, which became the video codec of its WebM container. Google provided an irrevocable promise not to enforce its related patents, which makes VP8 royalty-free and an attractive alternative to H.264. It’s typically supported by software encoding and decoding, but hardware acceleration is already in use and growing.
VP6 is the codec most commonly used in the FLV Flash Video Format. It’s a reasonably high quality codec that also includes attractive features such as alpha channel support.
Contributing to an impressive running theme in this article, On2 Technologies also developed VP6. Licensed to Macromedia (and then Adobe), VP6 is restricted by patents and licensing issues. This, and the fact that Flash Player is required for playback, make the FLV Flash Video Format a less attractive long-term solution for many developers, but it still serves a case-by-case need – particularly when it comes to the use of alpha channels.
As with video, there are many audio codecs, but here we’ll focus on AAC, Vorbis, and MP3, as they are most commonly used with HTML5 and Flash video assets.
Advanced Audio Coding (AAC) was engineered as a possible successor to the MP3 format. It’s known to have better quality than MP3 at the same bitrate. It’s also possible to encode an AAC file using any bitrate, while MP3 encoding requires one of several predetermined bit rates. AAC is one of the codecs used in the MPEG-4 container, and the audio codec most commonly used for this container for HTML5 compatibility. The format supports digital rights management, but is encumbered by patents and licensing issues. It’s typically decoded with software, but some hardware decoding implementations do exist.
Vorbis is most commonly associated with the Ogg container, but can be used in MP4, WebM, and other formats. It’s the only free, open audio codec among those widely used in Flash and HTML5, and is unencumbered by patents. In fact, development on Vorbis was started when licensing fees were announced for the MP3 format. It’s known for audio quality comparable or superior to that of MP3 or AAC, with smaller file sizes, and is popular in the gaming industry. The format is native on Linux and Android, and supported in Mac and Windows operating systems through QuickTime Components and DirectShow filters, as well as dedicated software players.
You know what MP3 is, right? It’s pretty much the standard for portable digital audio devices, and is also used by the FLV Flash Video format. It’s patented, requires licensing fees, and must be encoded at one of several pre-determined bit rates.
Container/Codec browser compatibility
As you can see, even a one-paragraph overview of the video codecs, audio codecs, and containers available to HTML5 and Flash video still requires a little time and attention to digest. Still, it’s time well spent, as this material is important if you’re to deliver video assets to the widest possible audience. Why? Because, as mentioned briefly before, among the major browsers and mobile operating systems, no one format is universally supported.
So, that leaves us with a time-machine trip back to the heady days of 1997, when the bulk of our work as developers was dealing with inconsistencies between browsers and operating systems. The following table is a snapshot of compatibility between video formats (containers, video codecs, and audio codecs) and the major desktop browsers and mobile operating systems.
|Format|Internet Explorer|Firefox|Safari|Chrome|Opera|iOS|Android|
|MP4: H.264/AAC|9.0+| |3.1+|3.0 - 11.0 †| |3.0+|2.0+|
|WebM: VP8/Vorbis|*|4.0+|*|6.0+|10.6+| |2.3.3+ local,|
The table shows a few things worthy of note. First, an empty cell indicates no support, while a populated cell shows the version at which support for a video format was added. (Asterisks are used to indicate that support is available through third-party software such as plugins.)
Second, while Chrome supported all three video formats at the time of this writing, that may change in the future. † Google announced that Chrome would drop support for the H.264 video codec with the release of version 11.0 — a great example of the aforementioned changing HTML5 landscape — although it appears that decision was postponed.
And finally, no format is supported by all browsers. When dealing with modern browsers, it’s possible to reach all HTML5 users with a combination of MPEG-4 and WebM assets. However, if (as one example) a user has version 3.6 of Firefox, only an Ogg video will fit the bill.
Furthermore, if a user hasn’t upgraded to an HTML5-compatible version of his or her preferred browser, using Flash Player for video playback is likely the best solution for that user. As a result, we end up with three or more versions of each video to reach the widest possible audience.
But the fun doesn’t stop there. Beyond policy decisions, like Google’s dropping of H.264 support, bugs, idiosyncrasies and the lack of parity across all platforms mean that even a single browser or operating system doesn’t always handle video the same way. Some of these issues will be discussed later in this article but, in a nutshell, successfully delivering video to multiple platforms takes some planning and workarounds. Which leads us to look at an improved version of the HTML5 markup introduced briefly earlier in this article. By introducing the HTML5 video <source> tag to the mix, and keeping an eye on known potential problems, one can effectively handle most of the quirks.
Improved HTML5 markup
The first version of HTML5 video markup discussed looked like this:
<video src="vid.mp4" width="320" height="240" controls></video>
The key issue here is that the video source is identified using the src attribute of the <video> tag. This allows only one source for the <video> element. This approach is not recommended because of the lack of universal support for one video format.
If, however, you omit the src attribute and, instead, add HTML5 <source> tags as seen below, a browser will start with the first source and, if unsupported, move on to the next source, and so on.
<video width="320" height="240" poster="vid.png" controls>
  <source src="vid.webm" type='video/webm; codecs="vp8, vorbis"'>
  <source src="vid.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
  <source src="vid.ogv" type='video/ogg; codecs="theora, vorbis"'>
</video>
Additional attributes for the video element include:
- preload: suggests how much of the video asset should be preloaded. Options include none, metadata (for fetching things like dimensions, duration, first frame, etc.), and auto (to allow the browser to preload as much of the asset as is practical)
- autoplay: a Boolean that determines if the video plays immediately upon loading
- muted: a Boolean that sets the initial audio state to silent (early drafts of the spec expressed this as an audio attribute, for which muted was the only supported value)
- loop: a Boolean that determines if the video returns to the beginning to play again after playing its full duration
The first source tag in the sample markup identifies a WebM video file. If that’s not supported, an MPEG-4 file is used. And, if MPEG-4 is incompatible, an Ogg video file is used. In all cases, the type attribute states the container type and codecs required for playback.
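The walk through the source list can be approximated in script. The sketch below is only a rough model of what a browser does internally, assuming the canPlayType method of the media element, which returns an empty string, "maybe", or "probably" for a given type string. The pickSource helper and the oggOnly stand-in are hypothetical names used for illustration.

```javascript
// Hypothetical helper: returns the src of the first source the player claims
// it might be able to handle. `canPlayType` stands in for the media element's
// canPlayType method, which returns "", "maybe", or "probably".
function pickSource(sources, canPlayType) {
  for (const source of sources) {
    // A source without a type cannot be ruled out in advance, so it is tried.
    if (!source.type || canPlayType(source.type) !== "") {
      return source.src;
    }
  }
  return null; // no listed source matched
}

const sources = [
  { src: "vid.webm", type: 'video/webm; codecs="vp8, vorbis"' },
  { src: "vid.mp4", type: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' },
  { src: "vid.ogv", type: 'video/ogg; codecs="theora, vorbis"' },
];

// Stand-in for a browser that only decodes Ogg (an assumption for illustration).
const oggOnly = (type) => (type.startsWith("video/ogg") ? "probably" : "");

const chosen = pickSource(sources, oggOnly);
```

In a real page, the second argument could be built from a probe element, for example by creating a video element with document.createElement("video") and passing a function that calls its canPlayType method.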
Depending on how comprehensive you want to be when targeting your audience, you may run into a few known issues. Fortunately, their solutions are pretty simple.
- First and foremost, be sure your server supports the MIME types required to play each video format you use. One simple way to achieve this on an Apache server is to create an .htaccess file in the same directory as your HTML file and use the AddType directive to add the needed MIME types:
AddType video/ogg .ogv
AddType video/mp4 .mp4
AddType video/webm .webm
- iOS 3.x doesn’t like the poster attribute. Omit it from the <video> tag
- On iPad iOS 3.x, a video will not play unless the mp4 <source> is listed first
- Android < 2.3 doesn’t like the type attribute. Omit it from the MP4 source tag (the only container Android currently supports)
So, to circumvent known issues with HTML5 delivery, and reach older versions of iOS and Android, use this revised version of the HTML5 markup we’ve evolved. Note that the poster attribute of the video element has been removed, the MP4 source has been listed first, and the latter’s type attribute has been removed.
<video width="320" height="240" controls>
  <source src="vid.mp4">
  <source src="vid.webm" type='video/webm; codecs="vp8, vorbis"'>
  <source src="vid.ogv" type='video/ogg; codecs="theora, vorbis"'>
</video>
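If this markup is generated rather than written by hand, the same three workarounds can be applied in code. What follows is a minimal sketch under that assumption; buildVideoMarkup is a hypothetical helper, and it identifies the MP4 source purely by its .mp4 file extension.

```javascript
// Hypothetical helper that applies the three workarounds described above.
function buildVideoMarkup(sources, width, height) {
  const isMp4 = (s) => s.src.endsWith(".mp4");

  // Workaround: iPad on iOS 3.x only plays when the MP4 source comes first.
  const ordered = [...sources].sort((a, b) => Number(isMp4(b)) - Number(isMp4(a)));

  const tags = ordered.map((s) =>
    // Workaround: Android before 2.3 dislikes a type attribute on the MP4 source.
    isMp4(s) || !s.type
      ? `  <source src="${s.src}">`
      : `  <source src="${s.src}" type='${s.type}'>`
  );

  // Workaround: no poster attribute, because iOS 3.x mishandles it.
  return [
    `<video width="${width}" height="${height}" controls>`,
    ...tags,
    "</video>",
  ].join("\n");
}

const markup = buildVideoMarkup(
  [
    { src: "vid.webm", type: 'video/webm; codecs="vp8, vorbis"' },
    { src: "vid.mp4", type: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' },
    { src: "vid.ogv", type: 'video/ogg; codecs="theora, vorbis"' },
  ],
  320,
  240
);
```

Keeping the quirks in one function means they can be dropped in a single place once the affected iOS and Android versions no longer matter to your audience.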
We’ve finished the work needed to reach the widest HTML5 audience possible, but what should we do about browsers that are not HTML5 compatible?
Flash Video integration
Despite rampant propaganda, Flash Player is still an impressive force and still widely used. In many cases, you may either prefer or require Flash Player video delivery to support a desired feature. I’ll itemise a few of those features in the next installment of this article. Much more important, however, is the fact that HTML5 video won’t even work for a very large portion of users. It’s just too new to reach the kind of penetration that Flash Player enjoys.
A quick glance at NetMarketShare’s desktop browser version statistics indicates that close to 50 per cent of the browsers in use today are not HTML5 compatible. This will continue to change as more users upgrade their browsers, and the mobile market will continue to make HTML5 video delivery more and more essential. Today, however, ignoring Flash Player inevitably means a much smaller audience for your content.
Fortunately, you can code your pages to deliver Flash content when desired – either as a fallback from HTML5, or as a first choice, reverting to HTML5 when necessary.
Fall back to Flash
The simplest approach to delivering video using both HTML5 and Flash Player is to capitalise on how browsers parse HTML. You may recall that, when using the <video> and <source> tags, an HTML5-compatible browser will walk through the provided sources until it finds a compatible video format. This process can be thought to continue, in a way, because if a browser doesn’t support the HTML5 <video> element, these tags will be ignored and whatever else is nested inside the <video> element, such as a Flash <object>, is rendered instead.
Here’s the final example of the HTML5 markup used throughout this article. This very simple example uses the Open Source Media Framework Strobe player developed by Adobe to serve as the Flash video player.
<video width="320" height="240" controls>
  <source src="vid.mp4">
  <source src="vid.webm" type='video/webm; codecs="vp8, vorbis"'>
  <source src="vid.ogv" type='video/ogg; codecs="theora, vorbis"'>
  <!-- Rendered only by browsers that do not recognise the video element -->
  <object type="application/x-shockwave-flash" data="StrobeMediaPlayback.swf" width="320" height="240">
    <param name="movie" value="StrobeMediaPlayback.swf">
    <param name="flashvars" value="src=http://yourdomain.com/vid.mp4">
    <param name="allowFullScreen" value="true">
  </object>
</video>
That’s all there is to it. As you can see, the HTML5 markup accounts for the simple list of known issues discussed herein, and falls back to Flash Player when HTML5 Video is not supported.
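The nested markup needs no script at all. If a page also needs to know which path a given browser will take, a common feature test can be sketched as follows. This is an illustrative helper rather than part of the markup solution: supportsHtml5Video is a hypothetical name, and the document object is passed in as a parameter so the function can be exercised outside a browser.

```javascript
// Hypothetical check: does this document object understand the video element?
// Browsers without HTML5 video create an unknown element that lacks canPlayType.
function supportsHtml5Video(doc) {
  const probe = doc.createElement("video");
  return typeof probe.canPlayType === "function";
}

// Stand-ins for a modern and a legacy browser (assumptions for illustration).
const modernDoc = { createElement: () => ({ canPlayType: () => "" }) };
const legacyDoc = { createElement: () => ({}) };

const modernResult = supportsHtml5Video(modernDoc);
const legacyResult = supportsHtml5Video(legacyDoc);
```

In a page, the call would simply be supportsHtml5Video(document).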
Fall back to HTML5
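The reverse approach makes Flash the first choice, commonly by using the SWFObject JavaScript library to replace a <div> with Flash content when a suitable version of Flash Player is installed; when it isn’t, the contents of the <div> (for example, the HTML5 video markup) are left in place. Perhaps the biggest advantage of this approach is that you can use the contents of the <div> to elegantly suggest that the user acquire the Flash Player if it’s really better for a specific task. In fact, SWFObject supports Flash Player’s Express Install option to automate this process for you, if desired.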
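As a rough sketch of this Flash-first approach, the snippet below assumes SWFObject 2.x and its embedSWF method, which replaces the element with the given id when the requested Flash Player version is present and otherwise leaves that element's contents alone. The embedFlashFirst wrapper, the player id, and the expressInstall.swf path are assumptions for illustration; the library object is passed in as a parameter so the sketch can be run without a browser.

```javascript
// Sketch only: `swf` is expected to be the global `swfobject` from SWFObject 2.x.
function embedFlashFirst(swf, targetId, videoUrl) {
  const flashvars = { src: videoUrl };           // read by the Strobe player
  const params = { allowFullScreen: "true" };

  swf.embedSWF(
    "StrobeMediaPlayback.swf", // Flash video player file
    targetId,                  // id of the <div> holding the HTML5 fallback
    "320",
    "240",
    "9.0.115",                 // Flash Player 9 update 3, the first with MPEG-4 support
    "expressInstall.swf",      // assumed path to SWFObject's Express Install file
    flashvars,
    params
  );
}
```

In a page that has loaded SWFObject, the call might look like embedFlashFirst(swfobject, "player", "http://yourdomain.com/vid.mp4"), where "player" is the id of a <div> wrapping the HTML5 video markup.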
I hope this article has provided the fundamentals required to understand both how to deliver video assets to the widest possible audience and some of the technical background needed to optimise assets for the task. In the next installment, we’ll put these fundamentals to work by discussing:
- Select encoding software packages for creating the video files.
- Examples of when Flash Player is the best tool for the job and when HTML5 may be preferred.
See you then!