I got into the IPTV space over 6 years ago when I helped co-found Arkstream. We had a grand vision to enable live video ingest and playback as well as commercial insertion - but with IPTV streams. It's a long story about why we couldn't bring our product to market. Our CEO was Mike Hafferty and I learned so much about sales from him. Everything is a sale - even this blog is a sales tool!
I joined Premier Retail Networks over 5 years ago because it was clear that IPTV was the right solution for their in-store video delivery system. I've spent these last years building and honing that system into the award-winning advertising platform it is today. If you want to see it, don't worry - it will be in a Wal-Mart store near you soon!
Along the way I've gotten a lot of practical experience with IPTV systems. There are other folks who are more academically oriented and can certainly argue their points, and there are obviously a bunch of folks doing big telco-type deployments who have a different set of perceptions. But my perceptions are around using IPTV for a media playback solution on a local network (not a WAN) where you play many clips back to back to create a seamless show. Some of those clips are ads, some are content. The system does not have a user holding an IR remote control. It's an autonomous system that has to self-heal and stay operational. After all, if it's not playing the ads, it's not making money.
One of the things I've come to strongly believe is that *TODAY*, if you want to deliver an IPTV stream, you *must* use the Real-time Transport Protocol (RTP) to do it. But then I diverge from the standard approach. The extremely smart folks at the IETF who volunteer their time to work out these protocol issues take the position that the best way to make it work is to split the audio and video into separate streams and then sync the streams at the playback device (the Set Top Box, or STB). Of course, this is not how the broadcast world does it - they mix the audio and video into a single stream (called an MPEG2 Transport Stream) and then broadcast that over the radio waves or cable to get to the STB. For some reason the IETF folks want to make it a lot more complicated by splitting up the streams.
I think that the MPEG2 Transport Stream (TS) is fantastic. It's already packetized into 188 byte chunks. It has a standard way to convey information about the stream content using Program Specific Information (PSI) tables and, in ATSC broadcasts, the Program and System Information Protocol (PSIP). You can carry alternate language audio or even different video streams. It's proven, it's standard, it's well documented. If you are not using it, you should be. In the streaming world, many of the streaming server vendors saw the stream splitting and the associated complexities (like the RTP Control Protocol - RTCP) as just too complicated and unnecessary for applications that are just going to put a stream onto the network. After all, it works on cable, why do something different on an IP network, right? Heck, just toss the already packetized TS frames into UDP and send it! Right?
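To make the "just toss it into UDP" idea concrete, here is a minimal sketch of cutting a buffer into 188 byte TS packets and grouping them into datagram-sized payloads. The grouping of 7 packets per datagram (1316 bytes, which fits under a 1500 byte Ethernet MTU) is a common convention and an assumption here, not something from my own system, and the function names are made up for illustration.

```python
TS_PACKET_SIZE = 188   # fixed MPEG2 TS packet length in bytes
SYNC_BYTE = 0x47       # first byte of every TS packet
TS_PER_DATAGRAM = 7    # assumption: 7 * 188 = 1316 bytes per UDP payload


def split_ts_packets(data: bytes) -> list[bytes]:
    """Split a byte buffer into 188-byte TS packets, checking each sync byte."""
    if len(data) % TS_PACKET_SIZE != 0:
        raise ValueError("buffer is not a whole number of TS packets")
    packets = []
    for offset in range(0, len(data), TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"missing sync byte at offset {offset}")
        packets.append(packet)
    return packets


def group_into_payloads(packets: list[bytes],
                        per_datagram: int = TS_PER_DATAGRAM) -> list[bytes]:
    """Concatenate TS packets into datagram-sized payloads."""
    return [b"".join(packets[i:i + per_datagram])
            for i in range(0, len(packets), per_datagram)]


def make_null_packet() -> bytes:
    """Build a null TS packet (PID 0x1FFF) to use as stand-in data."""
    header = bytes([SYNC_BYTE, 0x1F, 0xFF, 0x10])
    return header + b"\xff" * (TS_PACKET_SIZE - len(header))


if __name__ == "__main__":
    packets = split_ts_packets(make_null_packet() * 10)
    payloads = group_into_payloads(packets)
    print([len(p) for p in payloads])
```

Each payload could then be handed to a UDP socket's `sendto`. Notice that nothing in the payload tells the receiver whether a whole datagram went missing, which is where the trouble starts.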
Wrong. IP networks are different because the transport stream goes into UDP packets that are delivered best-effort. Sometimes the network will drop a few packets, accidentally or on purpose, if the instantaneous packet rate is higher than it can handle. Sometimes a switch or a router will hold a packet in a buffer, and a few packets arrive out of order. I could go on with some real-world stories but I probably should not - besides, that's boring. The fact is, packets get lost sometimes. None of that happens across the data link layer of RF broadcast over the air or onto a coax cable. But it does on IP networks, and if you don't have a way to know how much it happens and when, you are doomed to letting your customers tell you your video is bad - at least until they find someone else to stream to them and they are not your customers anymore.
RTP has a field in its packet header for a sequence number. That 16-bit sequence number increments by one with each packet sent. A receiver can use it to put the packets back into the right order and to detect packets that are lost in transit. RTCP, the control protocol that accompanies RTP, provides a standard way for receivers to report packet loss rates (and other data) back to the source. This capability is crucial to be able to remotely detect, diagnose and hopefully repair an issue that is affecting the video playback quality. If the network has a mis-configured switch that is dropping 10% of your packets, the playback will be terrible. Without a means to know that the packet loss is happening you may not know you have a problem until a customer calls to complain - and then all you know is that you have a problem.
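Here is a rough sketch of how a receiver might turn the RTP sequence number into a loss count. It follows the general "expected minus received" idea, handling the 16-bit wraparound and late (reordered) packets, but it is a simplification: duplicates are not handled, and the class and function names are my own. The payload type 33 is the value registered for MPEG2 transport streams in RTP.

```python
import struct

RTP_HEADER_LEN = 12    # fixed RTP header, without CSRC entries or extensions
PAYLOAD_TYPE_MP2T = 33 # RTP payload type registered for MPEG2 transport streams


def build_rtp_packet(sequence: int, payload: bytes,
                     timestamp: int = 0, ssrc: int = 0) -> bytes:
    """Prepend a minimal RTP version 2 header to a payload (test helper)."""
    header = struct.pack("!BBHII", 0x80, PAYLOAD_TYPE_MP2T,
                         sequence & 0xFFFF, timestamp, ssrc)
    return header + payload


def parse_rtp_sequence(packet: bytes) -> int:
    """Return the 16-bit sequence number from an RTP packet header."""
    if len(packet) < RTP_HEADER_LEN:
        raise ValueError("packet too short for an RTP header")
    first_byte, _payload_type, sequence = struct.unpack("!BBH", packet[:4])
    if first_byte >> 6 != 2:
        raise ValueError("not RTP version 2")
    return sequence


class LossCounter:
    """Estimate lost packets as (packets expected) - (packets received)."""

    def __init__(self):
        self.base = None      # first sequence number seen
        self.highest = None   # highest sequence seen, extended past 16 bits
        self.received = 0

    def update(self, sequence: int) -> None:
        self.received += 1
        if self.base is None:
            self.base = self.highest = sequence
            return
        delta = (sequence - self.highest) & 0xFFFF
        if delta < 0x8000:
            # Packet is ahead of the highest seen, possibly across a wrap.
            self.highest += delta
        # Otherwise it arrived late; 'highest' is unchanged and the
        # increment of 'received' fills the gap it left earlier.

    @property
    def lost(self) -> int:
        if self.base is None:
            return 0
        expected = self.highest - self.base + 1
        return expected - self.received


if __name__ == "__main__":
    counter = LossCounter()
    # 65535 never arrives; 1 and 2 arrive swapped.
    for seq in (65533, 65534, 0, 2, 1):
        counter.update(parse_rtp_sequence(build_rtp_packet(seq, b"")))
    print(counter.received, counter.lost)
```

In the example run the counter sees five packets and reports one lost, even though the sequence number wrapped and two packets arrived out of order.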
The streaming vendors that just put transport stream frames into UDP packets without using RTP have no reliable way to get a packet loss count. It's somewhat possible to use the continuity counter in the MPEG stream itself, but that is just a 4-bit counter that goes from 0 to 15 and then repeats. It's not sufficient to get a real packet loss count that's useful for diagnostics.
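A small sketch shows why a 4-bit counter falls short. It compares the loss a receiver would infer from a 4-bit counter against a 16-bit one for the same burst. The helper names are hypothetical, and the sketch ignores that the real continuity counter is kept separately per PID.

```python
def continuity_counter(ts_packet: bytes) -> int:
    """Return the 4-bit continuity counter from a TS packet header."""
    return ts_packet[3] & 0x0F


def counter_after(start: int, packets_sent: int, bits: int) -> int:
    """Counter value on the packet sent 'packets_sent' packets after 'start'."""
    return (start + packets_sent) % (1 << bits)


def apparent_loss(prev_counter: int, cur_counter: int, bits: int) -> int:
    """Packets a receiver would infer were lost between two received packets."""
    return (cur_counter - prev_counter - 1) % (1 << bits)


if __name__ == "__main__":
    start = 5
    for truly_lost in (3, 16, 19):
        # The next packet that arrives is the one after the lost burst.
        four_bit = apparent_loss(start, counter_after(start, truly_lost + 1, 4), 4)
        sixteen_bit = apparent_loss(start, counter_after(start, truly_lost + 1, 16), 16)
        print(truly_lost, four_bit, sixteen_bit)
```

With a 4-bit counter, losing 19 packets looks the same as losing 3, and losing exactly 16 looks like no loss at all. The 16-bit counter reports the true figure in all three cases.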
Now, back to RTCP. If you do split your video and audio into separate streams you must use RTCP to communicate the data to sync the streams. But if you use transport streams, you don't need that. RTCP does also share back the packet loss data... but that's a lot of protocol/software complexity for such a simple thing. I said before that *today* I think RTP is the right answer - purely because it can measure lost packets. Yes, yes, I know, there's also a timestamp field in RTP. I think it's bogus. The whole notion of separation of layers is to let a lower layer carry a payload without having to know about the payload. That's actually a big problem with RTP and MPEG. To properly set the timestamp field using the MPEG payload you have to decode the stream enough to derive the MPEG clock. Now you really are into complexity, not to mention paying for both the MPEG system patent *and* the video compression patent licensing. I know that some systems use the timestamp field, but most that I know of don't use it at all. After all, the clock data is already in the MPEG stream. If you do want to measure jitter you still can - you don't actually need to duplicate data in the stream. It's almost as if the smart IETF folks were being TOO smart!
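To give a feel for what "decode the stream enough to derive the MPEG clock" means, here is a sketch of digging the Program Clock Reference (PCR) out of a single TS packet. The 33-bit PCR base ticks at 90 kHz, the same rate the RTP timestamp uses for MPEG2 transport streams. This only looks at one packet; a real sender would also have to find which PID carries the PCR, so treat it as an illustration with made-up helper names and a hypothetical PID.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def extract_pcr_base(packet: bytes):
    """Return the 33-bit, 90 kHz PCR base from a TS packet, or None if absent."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a TS packet")
    adaptation_control = (packet[3] >> 4) & 0x03
    if adaptation_control not in (2, 3):   # no adaptation field present
        return None
    adaptation_length = packet[4]
    if adaptation_length < 7:              # too short for flags + 6-byte PCR
        return None
    if not packet[5] & 0x10:               # PCR_flag not set
        return None
    b = packet[6:11]
    # Top 33 bits of the 48-bit PCR field.
    return (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)


def build_pcr_packet(pcr_base: int) -> bytes:
    """Build a stand-in TS packet (hypothetical PID 0x100) carrying only a PCR."""
    # 33-bit base, 6 reserved bits set to 1, 9-bit extension left at 0.
    pcr_field = (pcr_base << 15) | (0x3F << 9)
    header = bytes([SYNC_BYTE, 0x01, 0x00, 0x20])   # adaptation field only
    adaptation = bytes([183, 0x10]) + pcr_field.to_bytes(6, "big")
    stuffing = b"\xff" * (TS_PACKET_SIZE - len(header) - len(adaptation))
    return header + adaptation + stuffing


if __name__ == "__main__":
    packet = build_pcr_packet(90000 * 10)   # ten seconds of 90 kHz ticks
    print(extract_pcr_base(packet))
```

That is a fair amount of bit-twiddling inside the payload just to fill in a header field at a layer that is supposed to be payload-agnostic.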
I think that for the broadcast-like streaming of tomorrow we may see transport streams carried in non-RTP streams. I specifically think that the Datagram Congestion Control Protocol (DCCP) is a great place to start.
Why couldn't we just put MPEG Transport Stream frames into DCCP packets? We'd get the lost packet detection, stream control, and the added benefit that the network core can deal with the streams and avoid congestion. It's simple, it avoids some of the complexity of RTP and all the complexity of RTCP. We'd still need a standard way to collect packet loss statistics, but heck, we could use a REST API for that. It's just simple enough to change the game.
Of course that's not where it's headed in the IETF. They want to carry RTP in DCCP packets. It's going to be interesting to see where the industry goes with this. Personally, I think most of it is too complicated as it is.
What do you think?