2.7.1 Audio/Video Transport (avt)

NOTE: This charter is a snapshot of the 40th IETF Meeting in Washington, DC. It may now be out-of-date. Last Modified: 24-Oct-97

Chair(s):

Stephen Casner <casner@precept.com>

Transport Area Director(s):

Scott Bradner <sob@harvard.edu>
Allyn Romanow <allyn.romanow@eng.sun.com>

Transport Area Advisor:

Allyn Romanow <allyn.romanow@eng.sun.com>

Mailing Lists:

General Discussion: rem-conf@es.net
To Subscribe: rem-conf-request@es.net
Archive: ftp://nic.es.net/pub/mailing-lists/mail-archive/rem-conf

Description of Working Group:

The Audio/Video Transport Working Group was formed to specify experimental protocols for real-time transmission of audio and video over UDP and IP multicast. The focus of this group is near-term and its purpose is to integrate and coordinate the current AVT efforts of existing research activities. No standards-track protocols are expected to be produced because UDP transmission of audio and video is only sufficient for small-scale experiments over fast portions of the Internet. However, the transport protocols produced by this working group should be useful on a larger scale in the future in conjunction with additional protocols to access network-level resource management mechanisms. Those mechanisms, research efforts now, will provide low-delay service and guard against unfair consumption of bandwidth by audio/video traffic.

Similarly, initial experiments can work without any connection establishment procedure so long as a priori agreements on port numbers and coding types have been made. To go beyond that, we will need to address simple control protocols as well. Since IP multicast traffic may be received by anyone, the control protocols must handle authentication and key exchange so that the audio/video data can be encrypted. More sophisticated connection management is also the subject of current research. It is expected that standards-track protocols integrating transport, resource management, and connection management will be the result of later working group efforts.

The AVT Working Group may design independent protocols specific to each medium, or a common, lightweight, real-time transport protocol may be extracted. Sequencing of packets and synchronization among streams are important functions, so one issue is the form of timestamps and/or sequence numbers to be used. The working group will not focus on compression or coding algorithms, which are the domain of higher layers.

Goals and Milestones:

Done    Conduct a teleconference working group meeting using a combination of packet audio and telephone. The topic will be a discussion of issues to be resolved in the process of synthesizing a new protocol.

Done    Define the scope of the working group, and who might contribute. The first step will be to solicit contributions of potential protocols from projects that have already developed packet audio and video. From these contributions the group will distill the appropriate protocol features.

Done    Review contributions of existing protocols, and discuss which features should be included and tradeoffs of different methods. Make writing assignments for first-draft documents.

Done    Post an Internet-Draft of the lightweight audio/video transport protocol.

Done    Post a revision of the AVT protocol addressing new work and security options as an Internet-Draft.

Jun 93  Submit the AVT protocol to the IESG for consideration as an Experimental Protocol.

Internet-Drafts:

Request For Comments:

RFC       Status  Title

RFC1889   PS      RTP: A Transport Protocol for Real-Time Applications
RFC1890   PS      RTP Profile for Audio and Video Conferences with Minimal Control
RFC2035   PS      RTP Payload Format for JPEG-compressed Video
RFC2032   PS      RTP payload format for H.261 video streams
RFC2038   PS      RTP Payload Format for MPEG1/MPEG2 Video
RFC2029   PS      RTP Payload Format of Sun's CellB Video Encoding
RFC2190   PS      RTP Payload Format for H.263 Video Streams
RFC2198   PS      RTP Payload for Redundant Audio Data

Current Meeting Report

Minutes of the Audio/Video Transport (avt) Working Group

Reported by Steve Casner

1. Introduction and Status

The AVT working group produced the Real-time Transport Protocol, which was published in January 1996 as Proposed Standard RFC1889 along with the companion RTP profile for audio/video conferencing RFC1890. AVT met for two busy sessions at the 40th IETF meeting in Washington, DC to discuss continuing work on RTP and auxiliary documents. At this meeting, a presentation was given on changes to the RTP specifications as detailed in new Internet-Drafts revising the specs towards advancement to Draft Standard status. There was significant discussion of a topic raised earlier on the mailing list that impacts the profile spec: how to get payload formats defined and named for hundreds of existing codecs, perhaps using a generic format. Presentations were also given on the revisions to the RTP MIB and on several new payload format specifications. It is expected that the RTP specifications and a number of the payload specifications will be ready for publication before the next IETF meeting.

Since the last IETF meeting, two payload format specifications were published as Proposed Standards: RFC2190 for H.263 video, and RFC2198 for redundant audio. Note that a newer payload format for the 1998 revision of H.263 is nearing completion as described below.

Just after the meeting, the IESG approved publication of the Internet-Draft revising the payload format for MPEG-2 video in RFC2038. Also since the meeting, a request was sent to the IESG to issue Last Call on a package of three drafts, including the IP/UDP/RTP header compression specification, following the PPPEXT working group's approval at this meeting of a set of proposed number assignments needed for these specifications.

2. Changes to the RTP Specification and Audio/Video Profile

The next goal of the AVT working group is to advance the RTP specification to Draft Standard status. Toward that end, Steve Casner reviewed revisions of the RTP spec and profile that have been prepared to incorporate clarifications to the text and changes to expand the applicability of RTP based on experience during the Proposed Standard stage. These revisions have been posted as Internet-Drafts:

draft-ietf-avt-rtp-new-00.ps, .txt
draft-ietf-avt-profile-new-02.ps, .txt

Readers are encouraged to see the PostScript versions of these drafts because they mark the changes with change bars.

A question raised at this meeting suggests there may be some confusion about the standardization process. An IETF standards-track RFC may be advanced to the next level without republication if no changes are required. However, when changes are needed, the mechanism is to post an Internet-Draft containing the revisions. That draft will then be published as a new RFC with a new number. Part of the IESG approval process is to judge whether the changes are acceptable while advancing to the next level or require another pass at the same level.

The changes to the RTP spec were briefly reviewed in this meeting and are listed in the introduction section of the draft. Although the changes are numerous, they do not affect the protocol format or introduce any incompatibilities, so approval of these changes for advancement to Draft Standard is expected. The changes are primarily new rules and enhancements to the algorithms governing how RTP is used. In particular, several enhancements were added in the management of RTCP bandwidth to avoid scaling problems in very large groups with many simultaneous joins and leaves. Also added was a sampling method to reduce SSRC storage requirements when the number of participants is very large. These enhancements have been discussed in detail in the past three AVT meetings.

At this meeting, Jonathan Rosenberg described the latest of these enhancements: a "BYE reconsideration" algorithm to avoid a flood of RTCP BYE packets when many participants leave a session at the same time. This algorithm is included in the revised RTP spec. In addition, a more detailed description of the algorithm including simulation results is given in draft-ietf-avt-byerecon-00.ps and .txt.
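
As a rough illustration of the idea (not the algorithm text from the draft), the following Python sketch shows how a leaving participant might delay and reconsider its BYE: the deadline is computed from the number of BYE packets seen so far and a share of the RTCP bandwidth, with randomization, and when the timer fires the deadline is recomputed before deciding whether to actually send. The class name, constants, and structure are illustrative assumptions.

    import random
    import time

    class ByeReconsideration:
        """Illustrative-only sketch of delaying a BYE when many BYEs are seen."""

        def __init__(self, rtcp_bw_bytes_per_sec, avg_bye_size):
            self.bye_members = 1                  # count ourselves as a leaver
            self.rtcp_bw = rtcp_bw_bytes_per_sec  # bandwidth share for BYEs
            self.avg_size = avg_bye_size          # average BYE packet size, bytes
            self.start = time.time()
            self.deadline = self.start + self._interval()

        def _interval(self):
            # Deterministic interval grows with the number of BYEs observed, so
            # the aggregate BYE rate stays within the allotted RTCP bandwidth.
            det = max(self.bye_members * self.avg_size / self.rtcp_bw, 1.0)
            return det * random.uniform(0.5, 1.5)  # randomize to avoid bursts

        def bye_received(self):
            self.bye_members += 1                  # other leavers push us back

        def timer_expired(self, send_bye):
            # Reconsider: recompute the deadline with the current count; send
            # only if the (possibly later) deadline has already passed.
            now = time.time()
            new_deadline = self.start + self._interval()
            if new_deadline <= now:
                send_bye()
            else:
                self.deadline = new_deadline       # reschedule instead of sending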

This algorithm completes the planned set of scaling enhancements, except for some small refinements being considered for three of the algorithms. The algorithms have been validated through simulation, but it is important to make sure they work well in real implementations on a large scale. Anyone who can perform such a test is highly encouraged to do so and to report the results.

In addition to the changes already included in the RTP spec, there are two changes tentatively accepted in previous meetings but not yet entered in the spec: allowing the RTCP sender and receiver bandwidths to be specified explicitly rather than being fixed at 5%, and scaling the minimum interval for RTCP messages inversely with bandwidth. These changes will be added in the next draft. Lastly, there are several other proposed small changes listed in the draft that received some discussion in this meeting and will be discussed further on the mailing list to decide whether they should be made.
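
For concreteness, here is a hedged Python sketch of how those two changes might look in the RTCP interval computation: the sender and receiver RTCP bandwidths are passed in explicitly rather than derived from a fixed 5% split, and the minimum interval shrinks as session bandwidth grows. The 360/kb/s scaling rule and the other constants are assumptions for illustration, not values from the draft.

    import random

    def rtcp_interval(members, senders, session_bw_kbps,
                      rtcp_sender_bw_bps, rtcp_recv_bw_bps,
                      avg_rtcp_size, we_sent, initial):
        # Assumed scaling rule: minimum interval of 360 s divided by the session
        # bandwidth in kb/s, never larger than the traditional 5-second minimum.
        t_min = min(5.0, 360.0 / max(session_bw_kbps, 1))
        if initial:
            t_min /= 2.0

        # Senders and receivers draw on separately configured RTCP bandwidths.
        if we_sent:
            n, bw = senders, rtcp_sender_bw_bps / 8.0      # bytes per second
        else:
            n, bw = members - senders, rtcp_recv_bw_bps / 8.0

        t = max(t_min, n * avg_rtcp_size / bw)
        return t * random.uniform(0.5, 1.5)                # randomized as in RFC1889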

The primary change for the RTP audio/video profile draft is a more complete list of encodings (primarily audio) and better descriptions for both old and new ones. Of course, the profile can't include all encodings. Therefore, the main open issue for the profile is to better define the namespace for encodings, the procedures for registering additional names, and how dynamic payload type mappings should be expressed using those names. That is the topic of the next section.

Slides: ftp://hydra.precept.com/pub/rtp/ietf40-avt.ppt
http://www.cs.columbia.edu/~jdrosen/ietf_byerecon95.ppt
http://www.cs.columbia.edu/~jdrosen/ietf_byerecon.ps
http://www.cs.columbia.edu/~jdrosen/ietf_byerecon.pdf

3. Generic Payload Type Mapping and Fragmentation

Before this meeting, a discussion began on the mailing list regarding how to enable RTP transport of the hundreds of existing codecs for which no payload formats have been defined. Steve Casner described the perceived problems.

One answer is to define a generic payload format that introduces another level of indirection for the namespace (perhaps using another existing namespace) rather than using the RTP payload type field. The data would be packetized without optimization for a particular encoding, in some cases treating the data as opaque. Two drafts proposing such schemes have been submitted, the first in June for QuickTime media streams (draft-ietf-avt-qt-rtp-00.txt), and more recently a second for ASF streams (draft-klemets-asf-rtp-00.txt). Since in many cases the same encoding might be carried in either scheme but the two schemes would not interoperate, multiple people noted on the mailing list that if a generic scheme were to be defined, there should be only one.

At this meeting, two presentations were made to set the stage for discussion of whether a generic payload format is needed, and if so, what it should be. The first was by Anders Klemets, who described a minimalist generic payload format and then compared it to the ASF payload format. He defined the problem as one of extensibility for large video-on-demand servers. Files containing new encodings might be added, but it would not be practical to upgrade the server to add a new packetization function for each. Generic payload formats are already being used in some commercial products as a solution to this problem. Should a generic payload format be standardized by AVT, or should market forces decide?

Klemets described a simple generic payload format in which the encoding actually carried is specified out-of-band, e.g., using dynamic payload type mapping in SDP. The payload format itself would include a minimal payload header to preserve frame boundaries; it would implement fragmentation of large frames into multiple packets, and grouping of multiple small frames into one packet. He described the objectives for the ASF payload format as largely the same, but optimized for content stored in ASF files. In addition to fragmentation and grouping, the ASF payload format includes a frame sequence number, a key-frame flag, and a send-timestamp. If AVT does define a generic payload format, should some or all of the ASF payload format features be folded into it?
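
As a sketch of what such a minimalist packetization might do, fragmentation and grouping could work roughly as follows. The 4-octet per-fragment header (a 16-bit length plus first/last-fragment flags) is a hypothetical layout chosen for illustration, not the format in Klemets' proposal.

    import struct

    def packetize(frames, max_payload):
        """Fragment large frames and group small ones, preserving boundaries."""
        packets, current = [], b""
        for frame in frames:
            off = 0
            while off < len(frame):
                room = max_payload - len(current) - 4   # 4-octet toy header
                if room <= 0:
                    packets.append(current)             # flush the full packet
                    current = b""
                    continue
                chunk = frame[off:off + room]
                first = 1 if off == 0 else 0
                off += len(chunk)
                last = 1 if off == len(frame) else 0
                current += struct.pack("!HBB", len(chunk), first, last) + chunk
            # Small frames accumulate; flush once the packet is nearly full.
            if len(current) >= 0.75 * max_payload:
                packets.append(current)
                current = b""
        if current:
            packets.append(current)
        return packets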

In the second presentation, Henning Schulzrinne aimed to disentangle the multiple, distinct issues that have sometimes been confused in the discussion; some are easy to solve, and others need new work.

There was substantial group discussion of these issues. Several people identified the name space issue as the most important. There was at least rough consensus that the right way to handle the multiple existing namespaces was to map them into the audio and video types of the MIME namespace registered with IANA. This includes the encoding names in the RTP profile RFC1890, the Microsoft WAV and AVI registries, and Apple QuickTime. Harald Alvestrand emphasized that even though some of the registrations in the existing namespaces may be obsolete, they should all be mapped into the MIME namespace where they could be marked as obsolete if appropriate. David Singer pointed out that mapping in the names would be straightforward, but the real work is in determining which names from separate spaces identify the same encoding and should therefore map to the same MIME type. Otherwise, nothing is gained in the re-registration.

The second part of the mapping problem is to specify how the MIME types would be associated with dynamic RTP payload type numbers, for example in the SDP "rtpmap" attribute. Previously, the payload type was mapped to an encoding name that implied a particular packetization format as well. If that is not the case for the transferred registrations, this mapping may need to be extended to identify separately the encoding and packetization format. Eric Fleischman also expressed concern that many of the encodings may have various parameters that need to be specified in the payload type mapping, whereas currently only sample rate and number of channels are defined in RFC1890 and rtpmap. SDP does have another attribute "fmtp" which may be appropriate for carrying these parameters. Singer noted that sometimes the codec parameters or other meta information might be too large to fit comfortably in SDP and might not be human readable as is the norm for SDP. Casner observed that the length constraint is primarily a concern when SDP is delivered via SAP, but is less a problem with RTSP. If some of the meta info is constant and reusable, it might be defined elsewhere and referenced by a short name in SDP.
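
As a small illustration of the mapping being discussed, the sketch below formats an SDP "rtpmap" line binding a dynamic payload type to an encoding name and clock rate, with codec-specific parameters carried in a companion "fmtp" line. The codec name and parameters shown are purely hypothetical.

    def sdp_dynamic_mapping(pt, encoding, clock_rate, channels=None, fmtp=None):
        """Format a=rtpmap (and optionally a=fmtp) lines for one payload type."""
        params = "/%d" % channels if channels else ""
        lines = ["a=rtpmap:%d %s/%d%s" % (pt, encoding, clock_rate, params)]
        if fmtp:
            opts = ";".join("%s=%s" % (k, v) for k, v in sorted(fmtp.items()))
            lines.append("a=fmtp:%d %s" % (pt, opts))
        return "\n".join(lines)

    # Hypothetical example: an audio codec bound to dynamic payload type 97.
    print(sdp_dynamic_mapping(97, "X-EXAMPLE-AUDIO", 8000, channels=1,
                              fmtp={"bitrate": "5300"}))
    # a=rtpmap:97 X-EXAMPLE-AUDIO/8000/1
    # a=fmtp:97 bitrate=5300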

Given the dynamic payload type mapping, there is no need for a packetization format to carry subtype information. However, Singer pointed out that a generic packetization is still needed to do blind fragmentation for historic content when no information is available on how to packetize. That's not ideal, but better than a blank screen. Don Hoffman asserted that if a generic format is to be defined, there should be only one and not many. We have the QuickTime and ASF proposals now, and MPEG4 may run into the same problem. Bob Webber expressed concern about the increased overhead of a generic payload format. Casner suggested a small family of generic formats that implement just the features that an encoding needs, rather than use a bitmask to indicate which features are present in each packet. Regarding the "send" timestamp in the ASF proposal, Singer asserted that if it is useful, it belongs at RTP level rather than at the packetization level. This function was considered and omitted when RTP was designed; we should either decide that was a mistake and change RTP, or leave it out. We should not put it in for some formats and not others.

The issues for generic packetization are less clear than for name mapping. Mark Handley explained that for many encodings, there is a big gain in robustness to loss if one designs a payload format optimized for each encoding to include a small amount of predictor state in the payload format header. Yet Fleischman said the existing file formats have methods to express fragmentation and reassembly of frames, and he wants a generic packetization mechanism that can enable all of this data, originally targeted for other means of communication, to be used with RTP. Anup Rao asserted that packetization belongs where it is in RTP, not in the file format. A codec provider should tell the media server how to put the data into RTP, either through a document, or code, or a set of rules defining what to do with objects in the file. The tradeoff of optimized performance versus the reduced effort of a generic mechanism may be the most contentious issue. The working group clearly has work to do both on name mapping and on packetization, which should be continued in discussion on the mailing list.

Slides: ftp://hydra.precept.com/pub/rtp/ietf40-avt.ppt
http://www.cs.columbia.edu/~hgs/papers/Schu9712:RTP.ps.gz

4. Revision of RTP MIB Specification

A new draft of the RTP MIB was posted prior to this meeting as draft-ietf-avt-rtp-mib-01.txt. Mark Baugher gave a presentation on the changes in this version and the status of the MIB implementation. The main difference from the -00 draft is the removal of RTP translator functions because there were no immediate plans for those functions to be implemented and thereby validated. A few other changes were made as a result of implementation experience, plus there were several clarifications in response to a review of the MIB by Fred Baker. A dozen or so variables have been added across the various tables in the MIB.

The MIB consists of two modules: the RTP-SYSTEM module, for end systems, reports current values of variables associated with streams being sent or received, whereas the RTP-IS module, for RTP monitor systems, records information from RTCP sender and receiver reports for access via network management applications. The MIB allows monitoring of RTP sessions that are initiated by some means, but cannot initiate or control sessions itself.

One variable of note that was added to the RTP-IS module is the round-trip time (RTT). This variable will only have a meaningful value if the RTP monitor's clock is synchronized with that of the sender of the particular stream; however, since it is expected that the usual practice for the common "broadcast" scenario will be to locate an RTP monitor on the sending host, the clocks will be the same.
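
For reference, the round-trip time the monitor would compute follows the usual RTCP calculation: the arrival time of the reception report, minus the "last SR" (LSR) timestamp echoed in the report, minus the "delay since last SR" (DLSR), all expressed as the middle 32 bits of the NTP timestamp (units of 1/65536 second). A minimal sketch, with a function name of our own choosing:

    def rtcp_rtt_seconds(arrival_ntp32, lsr_ntp32, dlsr_ntp32):
        """RTT from an RTCP reception report: A - LSR - DLSR, in 1/65536 s units."""
        rtt_units = (arrival_ntp32 - lsr_ntp32 - dlsr_ntp32) & 0xFFFFFFFF
        return rtt_units / 65536.0

This is why the clocks must agree when the monitor is not co-located with the sending host: LSR is taken from the sender's clock, while the arrival time is taken from the monitor's.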

The expected application for the RTP MIB is a help-desk scenario. Intel has implemented the MIB for use in conjunction with IP Multicast and RSVP MIBs to monitor operation of RTP applications across their multicast backbone. Integration into standard network management applications eases the monitoring task for the IT division. The remote monitoring capability simplifies "dry run" testing before broadcast events to assure quality. The Intel implementation will also be evaluated by HP. The next steps include an independent implementation by 3Com and fitting the RTP MIB into the ITU-T H-Media MIB framework. It is expected that there will be one more revision of the draft, then at the conclusion of the testing underway at Intel, it is planned that the RTP MIB will be submitted for publication as a Proposed Standard.

Slides: http://www.rdrop.com/users/mbaugher/rtpmibv01/

5. New RTP Payload Format Proposals

Several new payload format proposals have been introduced over the past one or two AVT meetings. Some of these are now complete and ready for publication, though others need further work.

5.1 RTP use in MPEG4 and the Role of DMIF

Vahe Balabanian chairs the Delivery Multimedia Integration Framework (DMIF) subgroup within MPEG and gave a presentation to AVT to explore how MPEG-4 can make optimal use of RTP through DMIF. The DMIF specification ISO/IEC 14496-6 is concerned primarily with control plane (signaling) functions including QoS, but works in conjunction with the MPEG-4 Systems specification ISO/IEC 14496-1 which defines the data plane.

The generic MPEG-4 architecture is composed of three layers: a compression layer, a systems layer that manages synchronization and hierarchy among elementary streams, and a delivery layer. The DMIF Application Interface (DAI) between the systems and delivery layers is intended to allow playback of streams transparent to the source location and delivery technology. Steve Casner observed that one challenge in attempting to map MPEG-4 to RTP is that RTP provides some of the function of the systems layer, so the DAI may not fit RTP well. [This is one aspect of RTP's Application Level Framing design philosophy in which processing is integrated across layers; attempting to keep upper layers unaware of networking issues can make optimization of the overall system more difficult. --Ed.]

In MPEG-4, images are rendered from multiple elementary streams of primitive audio, video and graphical objects according to scene description information that tells how the primitive objects are to be composed. The scene description is itself another elementary stream.

Across the (informative) interface between the compression layer and the systems layer, each elementary stream is delivered as a sequence of access units (AUs). The systems layer is responsible for fragmenting the AUs into packets (called AL-PDUs) which are passed across the (normative) DAI to the delivery layer. The header of the first AL-PDU of an AU contains the full timing information passed with the AU, but subsequent AL-PDUs of that AU have a smaller header. MPEG-4 Systems also defines a sublayer of the delivery layer to multiplex multiple elementary streams (in AL-PDUs) with different QoS and priority requirements into a "FlexMux stream" before transmission over a network transport channel, but the FlexMux sublayer may be bypassed. The format of the AL-PDUs would be the starting point for generating RTP packets at the transmitter. Conversely, at the receiver AL-PDUs would have to be regenerated from the incoming RTP packets.
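
To make the data path concrete, here is a hedged sketch of the access-unit fragmentation just described: the first AL-PDU of an access unit carries the full timing information, and subsequent AL-PDUs of the same AU carry only a reduced header. The header fields shown are placeholders, not the MPEG-4 Systems syntax.

    def fragment_access_unit(es_id, timestamp, au_data, max_pdu):
        """Split one access unit into AL-PDU-like (header, chunk) pairs."""
        pdus, off, first = [], 0, True
        while off < len(au_data):
            chunk = au_data[off:off + max_pdu]
            off += len(chunk)
            header = {"es_id": es_id,
                      "start_of_au": first,
                      "end_of_au": off == len(au_data)}
            if first:
                header["composition_time"] = timestamp  # full timing, first PDU only
                first = False
            pdus.append((header, chunk))
        return pdus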

Balabanian presented a strawman proposal for how RTP might be used for MPEG-4 to get the AVT group's feedback before taking the proposal to the MPEG committee. One goal of the design is to transform parameters from the MPEG-4 headers into RTP fields (such as sequence number and timestamp) as much as possible in order to minimize duplicate information. The RTP data packets could contain multiple AL-PDUs using a multiplexing header borrowed from the FlexMux concept. There were several comments on the proposal to use the Elementary Stream ID as the SSRC ID. These are unique within one MPEG-4 session, but perhaps not in a multi-sender multicast scenario. There were also questions about how the information other than audio and video might be carried; some may be static for the session and suitable for out-of-band transmission in SDP or RTSP, while for the dynamic information such as scene descriptions RTP/UDP might not be enough.

The strawman also proposed that only the RTCP SR and RR packets be used since existing MPEG-4 descriptors and signaling carry the other RTCP information. However, Jonathan Rosenberg commented that the SDES CNAME is used in SSRC collision detection. Don Hoffman suggested that the use of BYE should not be precluded since MPEG-4 might also be used in lightweight sessions with no additional signaling.

Casner suggested that there were many details in the presentation that the group probably didn't understand sufficiently to make comments in real time, and that the right way to proceed would be to write an Internet-Draft describing the proposal. Comments will be solicited on the mailing list when that draft is prepared.

5.2 H.263+ Video Payload Format

As noted in the introduction, the payload format specification for H.263 video has been published as Proposed Standard RFC2190. However, enhancement of the encoding itself continues under the label H.263+ to improve loss resilience and to utilize increased processing power to achieve higher quality. As agreed in previous AVT meetings, a separate RTP payload format is being designed to support the enhanced encoding. In Munich, Stephan Wenger presented two different approaches proposed in separate drafts. As promised then, the two approaches have since been merged into a new Internet-Draft named draft-ietf-avt-rtp-h263-video-00.txt that was submitted in November. The presentation at this meeting was based upon that draft, with some modifications proposed as a result of the ITU H.263+ meeting the week before IETF. Those modifications and feedback from this meeting were incorporated into draft-ietf-avt-rtp-h263-video-01.txt, posted since the meeting.

The H.263+ enhancements include several error resilience mechanisms designed to scale over a packet loss probability from 0 to 20%. The new RTP payload format supports all of those mechanisms. The basic payload header is 16 bits with two optional extensions: an 8-bit Video Redundancy Coding header, and an 80-bit back-channel message discussed below. The basic header specifies the bit length of the redundant Picture Header that may be included in packets after the first packet of a picture. This is one of the primary error resilience mechanisms. Details are in the draft.

Two usage examples were given: H.263+ can achieve "very good" quality with 10 frames per second in QCIF size at 112kb/s. That's an average of 1400 bytes per frame, which means most frames fit in a single packet. The second example was a layered encoding scheme with a 20kb/s base layer producing 7.5 frames/sec QCIF, a 90kb/s first enhancement layer improving the display to 15 frames/sec CIF, and a 60kb/s second enhancement layer increasing the frame rate to 30 frames/sec CIF. At the end of the meeting, Christian Maciocco provided an on-site demonstration of this second example implemented at Intel. In addition, AVT participants were invited to download from http://kbs.cs.tu-berlin.de/payload a test implementation of the vic tool with a public-domain H.263+ codec and the new packetization scheme added.

There are two open issues. The first issue is whether fragmentation may be allowed at byte boundaries within a macroblock so that it is not necessary to add payload header fields to indicate unused bits at the start and end of the payload. The concern is that decoder performance will be reduced by having to check for end-of-buffer in the innermost loop. This question will be answered by simulations.

The second issue was the subject of numerous comments in the meeting: should the back-channel message be included, and if so, how should it be used? The purpose of the back-channel message is to tell the encoder after an error occurs what was the most recent picture successfully received. For point-to-point communication, this is the most efficient error-resilience mechanism. However, it does not scale to large multicast groups. Furthermore, several participants said that control information such as this should be carried in RTCP rather than as an optional header in RTP data packets. Jonathan Rosenberg noted that the RTCP packet rate can be scaled to be more frequent for unicast applications so the interval need not be a problem. Carsten Bormann said the concern with using RTCP is the additional overhead of separate RTCP packets in scenarios where they are sent frequently, in particular when the back-channel message is used as an ACK rather than a NACK. He also pointed out that back-channel signaling in RTP packets would only be used when video was already being sent in the backward direction (continuous presence). Scott Petrack said he'd been considering a proposal for a generic mechanism to compress RTCP packets and piggy-back them on RTP packets to reduce overhead in very low bandwidth scenarios, but that the idea needed further work.

There was also a discussion of whether there would be too much latency for the back-channel mechanism to be useful. However, Gary Sullivan pointed out that in fixed-camera scenarios such as a teleconference, even an extremely old reference frame can be useful because much of the background does not change.

Bormann said the proposal should define a means to carry the information in RTCP as well as RTP, and specify when each of these mechanisms is to be used. Wenger noted that when H.263+ is used in the H.323 framework, the back-channel information would be carried in the H.245 control connection, not in RTP anyway, so he would not be too concerned if it were left out of the payload format proposal entirely. Steve Casner expressed concern about having multiple ways to send the back-channel information, leading to incompatibilities. Joerg Ott suggested that the RTP and RTCP options should be simulated to determine the optimum operating point for both mechanisms.

Bob Webber asked whether this new H.263+ payload format would obsolete the H.263 payload format in RFC2190, such that the latter would not be advanced beyond Proposed Standard. Wenger expects that the industry will move quickly to H.263+. Casner replied that the decision not to advance RFC2190 would be based on a response from the constituency that it is not needed. The problem is to determine how to contact that constituency. This question was deferred to discussion with the authors of RFC2190.

Bormann said that the authors intended to update the draft soon after the meeting, and requested prompt feedback from the working group so that the payload format could be put up for working group last call in January. This is important in order to meet some ITU deadlines; otherwise, a draft version of the spec might get incorporated into an ITU document such that AVT would no longer be able to change it.

Slides: http://kbs.cs.tu-berlin.de/payload/slices/

5.3 BT-656 Video Payload Format

Dermot Tynan presented a revised proposal for carrying ITU-R BT.656-3 uncompressed video over RTP in draft-tynan-rtp-bt656-01.txt. BT.656 is studio-quality digital video sampled according to BT.601-5 (formerly CCIR601) at 13.5 or 18 MHz. At the normal, lower rate, each scan line contains 720 samples occupying 1440 bytes in the 4:2:2 chrominance encoding. At the "high definition" rate, each line contains 1144 samples for NTSC or 1152 samples for PAL. The RTP payload format consists of a fairly simple 32-bit header followed by one scan line of samples (or a fragment thereof).

The changes since the previous version presented in Munich are that the flags to indicate NTSC/PAL and 13.5/18 MHz sampling rate have been combined into a single type field of 4 bits; an extra flag has been added to indicate 10-bit quantization vs. the default 8-bit; and per the recommendation of the working group in Munich, an 11-bit scan offset field has been added to support fragmentation of scan lines that are too long for the network MTU. Fragmentation must occur only on a sample-pair boundary (that is, between Cb,Y,Cr,Y units) and the offset is expressed in units of sample-pairs. For the 10-bit quantization, sample-pairs occupy 40 bits (5 octets) so the fragmentation does occur on an octet boundary but not necessarily on a 32-bit boundary.
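
A small sketch of the alignment rule described above: each fragment starts and ends on a (Cb,Y,Cr,Y) sample-pair boundary, and the scan offset counts sample-pairs rather than octets. Only the alignment arithmetic is shown; the 32-bit payload header itself, and its exact field layout, are omitted.

    def bt656_fragments(line_bytes, bits_per_sample, max_payload):
        """Split one scan line into (scan_offset_in_pairs, data) fragments."""
        pair_octets = 4 if bits_per_sample == 8 else 5    # 10-bit pairs = 40 bits
        pairs_per_fragment = max_payload // pair_octets   # whole pairs only
        total_pairs = len(line_bytes) // pair_octets
        fragments, pair_offset = [], 0
        while pair_offset < total_pairs:
            n = min(pairs_per_fragment, total_pairs - pair_offset)
            start = pair_offset * pair_octets
            fragments.append((pair_offset,
                              line_bytes[start:start + n * pair_octets]))
            pair_offset += n
        return fragments

    # Example: one 8-bit 720-sample line (1440 octets) split for a 1000-octet MTU
    # yields two fragments with scan offsets 0 and 250 sample-pairs.
    frags = bt656_fragments(bytes(1440), 8, 1000)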

The scan line and scan offset fields are large enough to accommodate growth beyond the picture sizes currently defined in BT.601-5, but type field values are only defined for the currently defined sizes. Michael Speer asked about support for progressive scan, but that's not covered because it's not defined in BT.656.

This draft is now ready for working-group last call and then Last Call for publication as a Proposed Standard.

Slides: ftp://xk10.wfa.digital.ie/pub/ietf/bt656-2.ppt

5.4 Revised JPEG Video Payload Format

The payload format for JPEG video was published in RFC2035 in October 1996 as a Proposed Standard. A proposal to revise the payload format has recently been posted as draft-ietf-avt-jpeg-new-00.txt and .ps based on implementation experience since then. Bill Fenner reviewed the changes which fall into three areas: interlace support, restart markers, and in-band quantization table support.

Early drafts of the payload format before RFC publication included support for interlaced video, but it was removed because problems with interlaced formats on progressive-scan displays meant there was no consensus on its inclusion. This revision resurrects support for interlaced scan since it is now seen to be needed for systems using interlaced displays. Instead of dedicating bits in the "type" field to indicate interlace, codes are defined in the "type-specific" field for types for which interlace is useful. An added feature is the ability to indicate a single field which should be line-doubled for display.

Support for restart markers was a late addition before publication of RFC2035, and the method chosen has some disadvantages. It is believed that no implementations are using type codes for restart markers as currently specified. The new draft dedicates one bit of the type field to indicate the presence of restart markers in the data since their use is orthogonal to the type selection. That would simplify the addition of new types, e.g., for non-square pixels. Also, the additional information required to support restarts is carried in a 4-byte optional extension to the payload format header rather than in a 6-byte JPEG DRI segment in order to keep the data 32-bit aligned.

To allow dynamic quantization tables to be included in-band, previously unspecified values of the Q-factor field indicate that an optional quantization table header follows the payload format header (or restart marker header, if present).
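
The resulting header chain could be parsed roughly as sketched below: the main payload header, then the 4-byte restart-marker header when the type field's restart bit is set, then an in-band quantization table header when the Q field takes one of the previously unspecified values. The bit position of the restart flag and the Q-value range are assumptions for illustration, not quotations from the draft.

    import struct

    RESTART_BIT = 0x40          # assumed position of the restart-marker bit
    INBAND_QTABLE_MIN_Q = 128   # assumed start of the "dynamic table" Q range

    def parse_jpeg_headers(payload):
        """Parse main header plus optional restart and quantization headers."""
        ts, frag_hi, frag_lo, jtype, q, width, height = struct.unpack(
            "!BBHBBBB", payload[:8])
        frag_offset = (frag_hi << 16) | frag_lo
        off, restart, qtable = 8, None, None
        if jtype & RESTART_BIT:
            restart = struct.unpack("!HH", payload[off:off + 4])
            off += 4
        if q >= INBAND_QTABLE_MIN_Q:
            mbz, precision, length = struct.unpack("!BBH", payload[off:off + 4])
            qtable = payload[off + 4:off + 4 + length]
            off += 4 + length
        return {"type": jtype, "q": q, "width": width, "height": height,
                "fragment_offset": frag_offset, "restart": restart,
                "qtable": qtable, "data_offset": off}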

These changes are backward compatible with the subset of RFC2035 believed to be in use. Anyone who has information to the contrary is requested to comment on the mailing list. The new functions have been implemented to the extent that available codecs will support them. This draft should be ready for working group last call.

5.5 Forward Error Correction Payload Format

Jonathan Rosenberg presented the draft proposal for an FEC payload format given in draft-ietf-avt-fec-01.txt, which was revised from the first version based on feedback from the presentation in Munich. This is a "meta" payload format to apply forward error correction using parity-like mechanisms independent of the base media type and format.

In this draft, rather than use an RTP header extension, FEC packets are identified by an FEC payload type which indicates that an FEC payload header is inserted between the RTP header and any base payload format header. The media packets are transmitted unmodified with their normal payload type; consequently, receivers that do not implement the FEC payload format can ignore the FEC packets and just process the media packets. The RTP timestamp on an FEC packet is now the minimum timestamp of the media packets covered by the FEC packet rather than the XOR of those timestamps. While this means the timestamp for a lost packet may not be recovered exactly, it avoids what would be essentially random timestamps that impair RTP header compression and preclude the use of RFC2198 to combine FEC packets with subsequent data packets for reduced packet overhead. Similarly, the sequence number of FEC packets now has the standard meaning (one more than the previous packet) rather than being an XOR. The sequence number of a lost packet can be determined easily from its predecessor or successor.

In most cases, the FEC payload header is only 32 bits, but it may be extended to 64 bits if the pattern of packets covered by the FEC code is longer than 8 (allowing patterns up to length 40). The header also includes fields to recover the length and payload type of the missing packet.
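
To illustrate how such a parity scheme recovers a loss, the sketch below XORs the FEC packet's protected fields and payload with those of the covered media packets that did arrive; when exactly one covered packet is missing, the result is that packet's payload type, length, and payload. Header parsing and the coverage mask are omitted, and the inputs are assumed to be already-extracted values rather than the draft's exact field layout.

    def xor_bytes(a, b):
        """XOR two byte strings, zero-padding the shorter one."""
        out = bytearray(max(len(a), len(b)))
        for i, x in enumerate(a):
            out[i] ^= x
        for i, x in enumerate(b):
            out[i] ^= x
        return bytes(out)

    def recover_missing(fec_pt, fec_len, fec_payload, received):
        """Reconstruct the one missing covered packet from the FEC packet."""
        pt, length, payload = fec_pt, fec_len, fec_payload
        for r_pt, r_len, r_payload in received:   # covered packets that arrived
            pt ^= r_pt
            length ^= r_len
            payload = xor_bytes(payload, r_payload)
        return pt, length, payload[:length]       # the reconstructed media packet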

Implementation of this payload format is underway, and should be done well before the next IETF meeting. The next steps are to try some interoperability testing and to specify a generic recovery algorithm.

Slides: http://www.cs.columbia.edu/~jdrosen/ietf_fec95.ppt
http://www.cs.columbia.edu/~jdrosen/ietf_fec.ps
http://www.cs.columbia.edu/~jdrosen/ietf_fec.pdf

5.6 Options for Repair of Streaming Media

In Munich, Colin Perkins gave a presentation reviewing the several error recovery mechanisms that have been used or proposed for RTP media streams. The recently updated draft-ietf-avt-info-repair-01.txt includes additional references to papers on these mechanisms and expanded recommendations for scenarios in which the algorithms might be used. At this meeting, he briefly reviewed the open issues in the draft and asked for comments.

The primary issue is congestion control and adaptivity to variations in loss rates and available bandwidth. What is a sensible operating point for loss mitigation algorithms? It is probably not reasonable to try to apply these mechanisms to recover from 60% packet loss. Is it reasonable for this draft to make specific recommendations for acceptable target and maximum overhead levels? It may be possible to define the sensible (fair) operating point based on the TCP equivalence function which gives an approximation of TCP's throughput for a given loss rate.
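
As an illustration of the TCP equivalence idea, the sketch below uses the common approximation that a TCP connection with packet size S, round-trip time RTT, and loss probability p achieves roughly 1.22 * S / (RTT * sqrt(p)) bytes per second; a repair scheme whose total rate (media plus redundancy) stays below this bound is operating at a roughly TCP-fair point. The constant is the usual approximation from the literature, not a value from the draft.

    from math import sqrt

    def tcp_fair_rate(packet_size_bytes, rtt_seconds, loss_prob):
        """Approximate TCP-equivalent throughput in bytes per second."""
        if loss_prob <= 0:
            return float("inf")              # no loss: no TCP-derived bound
        return 1.22 * packet_size_bytes / (rtt_seconds * sqrt(loss_prob))

    # Example: 1000-byte packets, 200 ms RTT, 5% loss -> roughly 27 kB/s.
    print(tcp_fair_rate(1000, 0.2, 0.05))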

The goal for this draft is publication as an Informational RFC.

5.7 New GSM Formats

Scott Petrack has requested the inclusion of two new GSM audio encodings in the revised draft of the RTP A/V Profile. One encoding, GSM 6.20, consumes half the data rate of the original GSM 6.10 format, and the other, GSM 6.60, provides improved quality at the same data rate. This addition to the profile is straightforward.

6. Recording RTP in ASF

Eric Fleischman submitted draft-fleischman-asf-rtp-record-00.txt in November to describe how MBone sessions conforming to RTP/AVP (RFC1890) may be transparently recorded into Microsoft's Advanced Streaming Format (ASF) files. Such a recording could optionally be replayed into a session to simulate the exact sequence of what happened, including identification of the speakers. This is part of a larger effort to define how ASF recordings may be done for a variety of contexts.

So far, only minimal private comments on the draft have been received. The working group is requested to read the draft and provide feedback on any improvements that might be needed.

7. Wrapup

There was a lot of good discussion at this meeting, and more would have been valuable, but we ran out of time. Steve Casner will collect the action items from this meeting and bring them up on the working group mailing list, where it is critical for this important discussion to continue. The goal is to have most of these issues resolved by the next meeting in March.

Slides

None Received
