The MPEG family of video compression standards is by far the most common today, and it is only becoming more common, squeezing out the proprietary and one-off standards of old. Cable, satellite, and over-the-air digital TV are all based on MPEG-2, and even a lot of IPTV is some sort of MPEG. DVDs are MPEG-2. Two of the three video formats allowed on Blu-ray discs are MPEG variants. Even video cameras are lately moving to MPEG-based formats (e.g. AVCHD), even though MPEG is inimical to editing. That matters little for home movies, most of which probably never get edited despite all the good intentions to do so, but the same shift is happening in a lot of pro video cameras (e.g. XDCAM), even though most pro video does get edited. MPEG is everywhere.
No surprise, then, that people frequently find themselves with an MPEG file that they need to edit. The problem is, MPEG was designed to be used only with already-edited video, as a final delivery format for video that is only played back as-is.
The process of editing video involves cutting the video between two frames, removing some sections of video, rearranging others, and splicing new video in. To make this easy, each frame of video must stand alone, with no dependency on its neighbors. There are many video encoding techniques with this property: DV, MJPEG, Apple ProRes...
The thing about most video, though, is that each frame is usually very much like the next one. Take the simple case of video of a person talking. The background might not be changing at all, the person’s head might move very slightly from one frame to the next, and their face only slightly more so. This is the key realization behind the MPEG video encoding formats: most of the changes from one frame to the next are slight, and frequently there are large sections that change either not at all, or so slightly that a person does not notice.
By taking advantage of this property, the creators of the MPEG video encoding standards opened a door to higher compression that must remain closed when ease of editing is a primary goal.
MPEG video, in a nutshell, has one full “reference” frame (called an I frame, technically) for every 14 or so “difference” frames. These 15 or so frames are called a “group of pictures,” or GOP. (If you’re a Democrat, there’s no need to feel uneasy, because it’s pronounced as a word, /gawp/, not spelled out, gee-oh-pee.) The GOP length can vary, but it’s always a single I frame followed by one or more difference frames. There’s a lot more to MPEG video than this, but that’s all we need to go into to explain why editing MPEG video is hard.
There are two types of difference frames.
The obvious type is the P frame, which depends only on the previous frame in MPEG-2, or on up to 3 frames back in MPEG-4. The P frame following an I frame simply describes what happened in the video since that I frame. The easiest change to describe is no change: areas of the frame that are substantially the same as in the previous frame just get copied. Next easiest is simple linear motion, so the decoder can just copy a block of pixels from one area in the previous frame to its new position in the current frame. Finally, any areas of the frame that are simply different relative to the previous one are re-encoded from scratch, using the same technique used for the pieces of the I frame. (Those pieces are called macroblocks.)
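To make those three cases concrete, here is a toy sketch in Python. It is not how a real decoder is written: the block names, the `decode_p_frame` function, and the instruction format are all made up for illustration, and plain strings stand in for the coded pixel data a real macroblock would hold.

```python
# Toy model: a frame is a dict mapping block position -> block contents.
# Strings stand in for real pixel data; nothing here is actual MPEG syntax.

def decode_p_frame(previous, instructions):
    """Rebuild a frame from the previous frame plus per-block instructions."""
    current = {}
    for position, (kind, payload) in instructions.items():
        if kind == "same":
            # No change: copy the block from the same position.
            current[position] = previous[position]
        elif kind == "moved":
            # Simple linear motion: copy the block from where it used to be.
            current[position] = previous[payload]
        elif kind == "new":
            # Too different to predict: the block is carried in full.
            current[position] = payload
        else:
            raise ValueError(f"unknown instruction: {kind}")
    return current


i_frame = {(0, 0): "wall", (0, 1): "face", (1, 0): "wall", (1, 1): "desk"}
p_instructions = {
    (0, 0): ("moved", (0, 1)),   # the face drifted one block to the left
    (0, 1): ("new", "window"),   # newly revealed area, coded from scratch
    (1, 0): ("same", None),
    (1, 1): ("same", None),
}
p_frame = decode_p_frame(i_frame, p_instructions)
```

Notice that `p_frame` cannot be rebuilt without `i_frame` in hand. That dependency is the whole editing problem in miniature.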
You could encode the entire video as a single I frame followed by nothing but P frames. They did try this in the experimental stage of MPEG development, but it didn’t work because MPEG only encodes the approximate appearance of video, in a further effort to keep the file size low. The errors in each approximation would thus build up until they became obvious, then ugly, then mud. They found that the best balance between video quality degradation and the high cost of I frames is to have one nice, clean I frame every half second or so. You will thus usually find that the GOP size is 15 frames when the frame rate is 30 per second, 12 for 24 fps video, etc. There are exceptions, such as truncated GOPs when the encoder does scene-change detection, or “long-GOP” encoding, where they deliberately put more difference frames between the reference I frames to get higher compression rates. The important thing is that a GOP is almost always a big fraction of a second long.
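The half-second rule of thumb is simple arithmetic. As a small sketch (the function name and the 0.5 second default are my own choices for illustration, not anything taken from an MPEG specification):

```python
def gop_length(frames_per_second, seconds_between_i_frames=0.5):
    """Frames per GOP if you want one I frame every half second or so."""
    return max(1, round(frames_per_second * seconds_between_i_frames))


gop_at_30 = gop_length(30)   # 15 frames
gop_at_24 = gop_length(24)   # 12 frames
```

A real encoder is free to deviate from this, for the reasons given above.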
The other type of difference frame is called a B frame, which stands for “bidirectional.” In MPEG-2 a B frame can describe differences relative to both the previous and the next frame. MPEG-4 gives B frames a little more reach, but the idea is the same. This ability to reach both forward and backward in time makes B frames even smaller than P frames, on average, a big win for file size. For technical reasons, though, you don’t want too many of them, so the most common GOP frame pattern is IPBBPBB... (Actually, it’s stored in the file as IBBPBBP... to make decoding easier.)
From an MPEG editing standpoint, the interesting question, once you introduce B frames, is what happens at the GOP boundary? The most common frame pattern looks like ...PBBIPBB..., so can that final B frame in one GOP refer to the I frame starting the next GOP? Yes, it can, but it doesn’t have to. An MPEG video encoder is allowed to write out a B frame that only describes differences relative to the previous frame, or it can change the frame pattern to force a P frame to be the last one in a GOP. This is called “closed GOP” encoding, because each GOP is self-contained, closed off from the others. The problem with closed GOP encoding is that it requires a slightly larger file to achieve the same quality level, because you lose out on the benefit of that final B frame. Because it’s less efficient and more complex, closed GOP encoding isn’t even an option in a lot of encoders, and is never the default when it is available.
The simplest type of MPEG editor only works with closed GOP files, and is called a “GOP-accurate editor.” Such a program only allows you to make cuts right before an I frame. With the typical half-second GOP size, a GOP-accurate editor can thus be forced to make a cut nearly half a second away from where you might wish to make the cut if you were able to cut between any two frames. This is adequate for cutting commercials out of a television program you recorded, but not accurate enough for serious video editing.
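Here is a rough sketch of what a GOP-accurate editor has to do with your requested cut point. It assumes the simplest policy, backing up to the last I frame at or before the frame you asked for; a real editor might round forward or to the nearest I frame instead. The function name and the frame list are invented for the example.

```python
def gop_accurate_cut(frame_types, wanted):
    """Return the index of the last I frame at or before `wanted`."""
    for index in range(wanted, -1, -1):
        if frame_types[index] == "I":
            return index
    raise ValueError("no I frame at or before the requested cut")


FPS = 30
# Four 15-frame GOPs: one I frame followed by 14 difference frames each.
frame_types = (["I"] + ["diff"] * 14) * 4

wanted = 44                                    # where you would like to cut
actual = gop_accurate_cut(frame_types, wanted) # where the cut really lands
error_seconds = (wanted - actual) / FPS
```

With I frames at 0, 15, 30, and 45, asking for frame 44 lands the cut at frame 30, which is 14 frames or a bit under half a second early.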
The more powerful sort of MPEG editor is called “frame-accurate,” because it allows cuts between any two frames. These come in two flavors.
The cleverest sort will only re-encode those GOPs containing cut points, copying unedited GOPs from the original file straight into the new output file. Thus, you pay the re-encoding penalties of time and quality loss only at the cut points.
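The bookkeeping behind this can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any particular product's algorithm: it assumes fixed-length closed GOPs, so each GOP can be copied independently of its neighbors, and it treats a cut at frame n as falling between frames n-1 and n. The function name and the plan format are hypothetical.

```python
def plan_smart_render(total_frames, gop_length, cut_points):
    """Label each GOP 'copy' or 're-encode' for the given cut points."""
    plan = []
    for start in range(0, total_frames, gop_length):
        end = min(start + gop_length, total_frames)
        # A cut exactly on a GOP boundary splits no GOP, so only cuts
        # strictly inside the GOP force it to be re-encoded.
        split = any(start < cut < end for cut in cut_points)
        plan.append((start, end, "re-encode" if split else "copy"))
    return plan


plan = plan_smart_render(total_frames=60, gop_length=15, cut_points=[20, 45])
```

Here only the second GOP (frames 15 through 29) gets re-encoded. The cut at frame 45 sits right before an I frame, so it costs nothing. With open GOPs the neighbors of a cut GOP may need attention as well, which this sketch ignores.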
Not all frame-accurate MPEG editors are this clever, however. Some simply let you build up an “edit decision list,” then re-encode the entire video when you save out the new file, creating a whole new set of GOPs. This is especially common with NLE programs that happen to also be MPEG-aware, even at the high end. Apple’s Final Cut Pro works this way, for instance.
Despite the rising popularity of MPEG as a whole, the clever sort of frame-accurate MPEG editor has gone down in popularity, for a bunch of reasons. First, few people really want to have two video editing packages around, one a clever frame-accurate MPEG editor, and another more widely useful package for general-purpose editing. Second, the rise of MPEG-based video cameras means that every general-purpose NLE these days does have at least limited ability to read and thus edit MPEG. Third, computers are getting so fast that the sting of re-encoding the entire file isn’t as painful as it used to be. Fourth, the quality hit isn’t as bad as a lot of people imagine it will be; see my re-DCT tests for details. Finally, you often want to re-encode the whole file anyway, if only to get a different bit rate. Video cameras usually capture at a much higher bit rate than is typical for final delivery, since high video quality is more valuable at the start of the video editing pipeline than at the end.
If you still think you want to try the dedicated MPEG editing path, I can give you the names of a few products to look at. Before I do, be warned that I haven’t used any of these in years, some of them not at all. Take these as references, not recommendations.
The classic, and the only remaining program I know of that is a true dedicated MPEG editing program, is Womble’s MPEG2VCR. On revisiting this article to update it, I found that they’ve since added more products with this capability, but I’ve not tried any of the new ones. The last time I used MPEG2VCR was a demo version back in 2000 or 2001, which was okay, but I didn’t find it worth purchasing. I would hope it’s gotten a lot better since then, but I couldn’t say.
The high-quality MPEG encoding program TMPGEnc has some basic MPEG editing abilities. I tried to use them long ago, but never was very impressed. However, maybe you already have TMPGEnc for other reasons, and so might have a basic MPEG editor without knowing it.
Finally, a product I do use regularly, Elecard’s XMuxer, has MPEG editing abilities. I’ve never actually tried them, because I only use XMuxer for its primary purpose, remultiplexing whole files. The Pro version claims to have frame-accurate editing, while the Lite version only does GOP-accurate editing.
There are many ways to avoid the need to edit MPEG entirely.
First, you might have an alternate way to capture the video. There are many video capture products for computers that take analog video in and save files out in some edit-friendly video format, like DV or Apple ProRes. PCI capture cards often present fully uncompressed video to the operating system’s video layer, letting you choose any real-time software encoder you like. USB and Firewire products usually use some sort of compression, often MPEG, but sometimes not. DV is popular in the Firewire case, and is a decent choice for video editing, as long as you’re not doing heavy color work. The most exciting development in this line is that most video equipment these days has HDMI outputs, which carry fully uncompressed video. All you need is an HDMI capture card, like the Blackmagic Intensity. There’s a Pro version that also has analog inputs.
If you have no choice but to start from an MPEG file, you might prefer to do a fast re-encode on it, transforming it to DV, ProRes, or some other edit-friendly file format. There’s a lot of software that will do this. Basically, you want something that will read in MPEG but can write out AVI, if you’re running Windows, or will write out QuickTime, if you’re running OS X. You can then feed the result into any NLE you like. There are too many such programs to even list here. I’ve used a bunch, but they keep changing for various reasons, so every year or two I end up seeking out yet another one. Just Google both your input file format and either “AVI” or “QuickTime” together.
The high road of MPEG editing is to not do it at all. When making MPEG files from AVI or QuickTime files, always keep the source file until you’re sure you are happy with the final MPEG. Editing the MPEG is hard, and restricts your choice of tools. If you absolutely must edit an MPEG file, don’t expect miracles. Always remember that by editing MPEGs, you’re swimming upstream.
This article is copyright © 2001-2010 by Warren Young, all rights reserved.
Updated Wed May 06 2009 15:05 MDT