Audio/Video Transport (avt) Charter

2.8.1 Audio/Video Transport (avt)

In addition to this official charter maintained by the IETF Secretariat, there is additional information about this working group on the Web at:

http://www.cs.columbia.edu/~hgs/rtp/faq.html -- RTP FAQ Page

NOTE: This charter is a snapshot of the 56th IETF Meeting in San Francisco, California USA. It may now be out-of-date.

Last Modified: 2003-01-21

Chair(s):

Stephen Casner <casner@acm.org>
Colin Perkins <csp@isi.edu>

Transport Area Director(s):

Scott Bradner <sob@harvard.edu>
Allison Mankin <mankin@psg.com>

Transport Area Advisor:

Allison Mankin <mankin@psg.com>

Mailing Lists:

General Discussion: avt@ietf.org
To Subscribe: avt-request@ietf.org
Archive: ftp://ftp.ietf.org/ietf-mail-archive/avt

Description of Working Group:

The Audio/Video Transport Working Group was formed to specify a protocol for real-time transmission of audio and video over UDP and IP multicast. This is the Real-time Transport Protocol, RTP, together with its associated profile for audio/video conferences and payload format documents.

The current goals of the working group are to revise the main RTP specification and the RTP profile ready for advancement to draft standard stage (including the sampling algorithms for use with very large groups, which have been broken out into a separate document), to complete the RTP MIB, to produce a guidelines document for future developers of payload formats and to continue development of new payload formats.

The payload formats currently under discussion include a number of media specific formats (MPEG-4, DTMF, PureVoice) and FEC techniques applicable to multiple formats (parity FEC, Reed-Solomon coding).

Archive before July 2001: ftp://ftp.es.net/pub/mail-archive/rem-conf/

Goals and Milestones:

Done		Working group last call on parity FEC draft (standards track)
Done		Post revised RTP MIB and issue working group last call (stds track)
Done		Working group last call on guidelines for payload format writers (BCP)
Done		Post revised RTP spec and audio/video profile
Done		Post revised DTMF payload format draft, ready for WG last call
Done		Post RTP implementation checklist draft
Done		Post payload format for MPEG-4 based on MPEG/IETF joint meetings
Done		Post revised RTP membership (SSRC) sampling draft
Done		Post revised draft on PureVoice (qcelp) payload format to address WG last call comments
Done		Submit RTP MIB to IESG for publication as Proposed Standard RFC
Done		Submit guidelines for payload format writers for publication as a BCP
Done		New working group last call on PureVoice payload format
Done		Working group last call on revised SSRC sampling draft (experimental)
Done		Analysis/simulation of multiplexing payload format proposals
Done		Post final revision of RTP spec and A/V profile drafts
Done		Revise MPEG-4 payload format document after implementation experience
Done		Decide how to proceed with multiplexing protocol: one generic payload format or a number of application specific formats
Done		Working group last call on RTP and A/V profile (for Draft Standard)
Done		Prepare MPEG4 implementation results ready for WG last call
Done		Post final revisions of selected multiplexing protocol draft(s)
Done		Working group last call on multiplexing payload format (stds track)

Internet-Drafts:

- draft-ietf-avt-profile-new-13.txt

- draft-ietf-avt-rtp-new-12.txt

- draft-ietf-avt-rtp-mime-06.txt

- draft-ietf-avt-rtcp-bw-05.txt

- draft-ietf-avt-tcrtp-07.txt

- draft-ietf-avt-smpte292-video-08.txt

- draft-ietf-avt-crtp-enhance-07.txt

- draft-ietf-avt-ulp-07.txt

- draft-ietf-avt-rtp-selret-05.txt

- draft-ietf-avt-uxp-05.txt

- draft-ietf-avt-srtp-05.txt

- draft-ietf-avt-rtcp-feedback-05.txt

- draft-ietf-avt-mpeg4-simple-07.txt

- draft-ietf-avt-dsr-05.txt

- draft-ietf-avt-evrc-smv-03.txt

- draft-ietf-avt-mwpp-midi-rtp-06.txt

- draft-ietf-avt-rtcpssm-03.txt

- draft-ietf-avt-rtp-retransmission-06.txt

- draft-ietf-avt-rtp-interleave-00.txt

- draft-ietf-avt-rtp-jpeg2000-02.txt

- draft-ietf-avt-rfc2833bis-02.txt

- draft-ietf-avt-rtp-ac3-00.txt

- draft-ietf-avt-ilbc-codec-01.txt

- draft-ietf-avt-rtcp-report-extns-03.txt

- draft-ietf-avt-uncomp-video-02.txt

- draft-ietf-avt-rfc3119bis-01.txt

- draft-ietf-avt-rtp-h264-01.txt

- draft-ietf-avt-rtp-ilbc-01.txt

Request For Comments:

RFC	Status	Title
RFC1889	PS	RTP: A Transport Protocol for Real-Time Applications
RFC1890	PS	RTP Profile for Audio and Video Conferences with Minimal Control
RFC2035	PS	RTP Payload Format for JPEG-compressed Video
RFC2032	PS	RTP payload format for H.261 video streams
RFC2038	PS	RTP Payload Format for MPEG1/MPEG2 Video
RFC2029	PS	RTP Payload Format of Sun's CellB Video Encoding
RFC2190	PS	RTP Payload Format for H.263 Video Streams
RFC2198	PS	RTP Payload for Redundant Audio Data
RFC2250	PS	RTP Payload Format for MPEG1/MPEG2 Video
RFC2343	E	RTP Payload Format for Bundled MPEG
RFC2354	I	Options for Repair of Streaming Media
RFC2429	PS	RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)
RFC2431	PS	RTP Payload Format for BT.656 Video Encoding
RFC2435	PS	RTP Payload Format for JPEG-compressed Video
RFC2508	PS	Compressing IP/UDP/RTP Headers for Low-Speed Serial Links
RFC2733	PS	An RTP Payload Format for Generic Forward Error Correction
RFC2736	BCP	Guidelines for Writers of RTP Payload Format Specifications
RFC2762	E	Sampling of the Group Membership in RTP
RFC2793	PS	RTP Payload for Text Conversation
RFC2833	PS	RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals
RFC2862	PS	RTP Payload Format for Real-Time Pointers
RFC2959	PS	Real-Time Transport Protocol Management Information Base
RFC3009	PS	Registration of parityfec MIME types
RFC3016	PS	RTP payload format for MPEG-4 Audio/Visual streams
RFC3047	PS	RTP Payload Format for ITU-T Recommendation G.722.1
RFC3119	PS	A More Loss-Tolerant RTP Payload Format for MP3 Audio
RFC3158	I	RTP Testing Strategies
RFC3189	PS	RTP Payload Format for DV Format Video
RFC3190	PS	RTP Payload Format for 12-bit DAT, 20- and 24-bit Linear Sampled Audio
RFC3267	PS	RTP payload format and file storage format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs
RFC3389	PS	RTP Payload for Comfort Noise

Current Meeting Report

Audio/Video Transport Working Group Minutes

Reported by Stephen Casner and Colin Perkins

   The avt working group met in two sessions at the 56th IETF meeting in San 
Francisco. In the first session, the group discussed the status of 
several documents in progress including the profile for Secure RTP, plus RTP 
framing over TCP, multiple RTCP extensions, and two video payload 
formats.  In the second session, the discussion covered seven payload 
formats related to audio (including MIDI and distributed speech 
recognition).

Introduction, Document Status, and Open Issues

   This meeting began with an update by Steve Casner on document 
publication status.  Two momentous steps were achieved in the days 
preceding the meeting:  The revised RTP specification and A/V profile 
(revisions of RFC 1889 and 1890) were approved by the IESG for 
publication as Draft Standards, and the payload format for MPEG-4 was 
submitted by the working group to the IESG with a request for 
publication as a Proposed Standard.  The RTP spec and A/V profile had been 
tentatively approved before the previous meeting, but there were several 
"RFC Editor note" clarifications requested by the IESG.  In addition, a 
number of questions regarding the text had been posted to the WG 
mailing list during the IESG review.  Consequently, Steve Casner wrote 
clarifications to address these questions and WG participants 
discussed them on the mailing list after the previous meeting.  
Ultimately, there were enough changes that the IESG agreed that the 
drafts could be revised before submission to the RFC editor: 
draft-ietf-avt-rtp-new-12 and 
draft-ietf-avt-profile-new-13.  The IESG gave final approval of these 
revised drafts and sent them to the RFC Editor.  Steve briefly reviewed the 
list of changes at this meeting and asked those who made comments in the 
mailing list discussion to verify that the results are 
satisfactory.

   We continued the pattern of publishing at least one RFC since the last 
meeting (RFC 3497 on the payload format for SMPTE-292M video), and four 
other drafts are in the RFC editor queue (the MIME registration for the 
payload formats in the RTP profile and the SDP bandwidth modifiers for RTCP 
bandwidth, both waiting for the RTP specification, and the payload 
formats for EVRC/SMV and ETSI ES 201 108 DSR).  Four drafts are with the 
IESG (ECRTP which is in Last Call, TCRTP, SRTP, and MPEG-4).
   
   Several drafts are either in WG last call or may be ready.  The draft on 
RTCP Feedback profile AVPF 
(draft-ietf-avt-rtcp-feedback-05) has been in last call for some time and is 
awaiting a final revision.  The companion draft which reports on 
simulations to validate the protocol design has been revised 
(draft-burmeister-avt-rtcp-feedback-sim-01) and will be published as 
Informational at the same time.  The RTP Retransmission draft, which uses 
the AVPF, has been revised based on WG comments and is ready for WG last 
call when the AVPF draft is updated.  Two drafts on Uneven Level 
Protection and Unequal Erasure Protection are being considered together as 
alternative methods; a new approach is proposed for ULP to revise RFC 2733 to 
fix its non-standard use of RTP header bits, but this was not 
completed before the meeting.  Lastly, 
draft-ietf-avt-rtcp-report-extns-03 on RTCP Extended Reports (XR) was 
discussed at this meeting to consider whether it is ready for last call.


Secure Real-time Transport Protocol

   Mark Baugher reported that SRTP 
(draft-ietf-avt-srtp-05) was submitted to the IESG in mid-2002 and IETF 
Last Call was issued, but the Security and Transport Area Directors 
expressed concerns regarding the interaction between the encryption 
ciphers and authentication.  Those concerns were discussed at the 
previous IETF AVT meeting and subsequently with the ADs.  Mark 
summarized the agreed modifications: authentication using an 80-bit MAC is 
mandated for SRTCP and the default for SRTP, but users may choose null 
authentication for SRTP.  Precise language will be crafted for the draft 
about when this is acceptable and to make clear the risks.  
Discussion of error tolerance will be removed from the draft, even though it 
was one of the original design requirements, because it has become a point of 
controversy due to its dependence on changes at several other layers of the 
protocol stack.  Other considerations are sufficient to justify the 
design decisions.  Automatic key management is mandated to ensure that the 
key streams are not repeated in order to avoid a serious failure case with 
the default counter mode cipher.  The authors also requested 
permission to make some small changes/clarifications reflecting recent 
feedback from implementers; these will be sent in a separate note to the 
list.  Mark stated the authors' intention to produce a new draft in a 
couple of weeks.  Steve Casner reported that our AD Allison Mankin had 
mentioned in a side conversation at this meeting that she wants this draft to 
go ahead quickly.
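
   For illustration only (this sketch was not presented at the meeting),
the failure case behind the key management requirement can be shown with
a toy stream cipher.  The keystream generator, key, and payloads below
are hypothetical stand-ins and are not the SRTP counter mode cipher; the
point is only that two payloads encrypted with the same keystream leak
their XOR to anyone who sees both ciphertexts.

```python
import hashlib


def toy_keystream(key: bytes, n: int) -> bytes:
    """Toy keystream built from SHA-256 over a counter; NOT the SRTP cipher."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


key = b"hypothetical session key"   # assumption: any fixed key
p1 = b"first payload  "             # hypothetical plaintexts of equal length
p2 = b"second payload "

ks = toy_keystream(key, len(p1))
c1 = xor(p1, ks)
c2 = xor(p2, ks)                    # same keystream reused: the failure case

# The keystream cancels out, so the attacker learns p1 XOR p2 without the key.
leak = xor(c1, c2)
```

   Knowing or guessing either payload then reveals the other, which is
why repeated key streams must be avoided.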

    
Framing RTP over Connection-Oriented Transport

   John Lazzaro presented 
draft-lazzaro-avt-rtp-framing-contrans-00, a new draft that restates the 
description of RTP framing over TCP that was removed when the RTP/AVP 
Profile (RFC 1890) was revised since no report of 
interoperability of the feature was obtained.  It is now claimed that 
there are implementations, so it would be appropriate to have it 
documented.  As this draft was prepared, two related problems became 
apparent:  1) the MMUSIC comedia draft specifies SDP session 
descriptions using TCP transport, but does not specify a format 
parameter to indicate RTP/AVP inside the TCP; and 2) RTSP (RFC 2326) 
specifies a single TCP connection carrying interleaved control and data, but 
not separate data streams carried in TCP.  This draft includes sections to 
address these last two topics, but it is an open issue for the WG 
whether these should be kept or abandoned.

   As an alternative to this "classical" framing of RTP in TCP using a 
16-bit frame length field between frames, some people have suggested on the 
mailing list that the multiplexed framing specified in RTSP be used 
instead.  John asked for input from the group whether one or the other (or 
both or neither) approach should be followed.  Steve Casner noted that the 
TCP framing was removed in the profile revision due to lack of interop, and 
asked whether there really is sufficient interest now to warrant 
publishing a spec.  John replied that this is needed in MIDI over RTP (see 
MWPP, later) because much of the user community is uncomfortable with UDP.  
Ross Finlayson spoke in favor of using the RTSP framing which is 
implemented in several RTSP servers including Apple's and his own 
(Live.com).  Henning Schulzrinne countered that it is only really 
applicable if there's a control protocol in the same stream, and 
suggested that we just specify the "classical" framing now and leave for the 
future a specification of how RTSP framing can be used separately from RTSP 
(requiring a different format parameter in SDP to indicate that 
framing).  One reason is that RTSP is undergoing update now.  Ross 
suggested that specifying a default "channel ID" would suffice for using 
RTSP framing without a control protocol, but others were concerned that 
there would be a desire to use multiplexing with other control 
protocols where the syntax of the RTSP framing would conflict, so it is not 
general.  Colin Perkins summarized the discussion by saying that this 
draft should restrict itself to the classical framing and leave the RTSP 
framing to be specified elsewhere if there is interest, or dropped otherwise.  The 
SDP format parameter "TCP RTP/AVP" would indicate this simple case; other 
parameters could be defined for multiplexed framing as needed.
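
   As a minimal sketch of the "classical" framing discussed above, each
packet can be preceded on the TCP byte stream by a 16-bit length field.
The function names are hypothetical, and two details are assumptions not
stated in these minutes: that the length is in network byte order, and
that it counts only the packet and not the length field itself.

```python
import struct
from typing import List


def frame_packet(packet: bytes) -> bytes:
    """Prefix one RTP or RTCP packet with a 16-bit big-endian length."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for a 16-bit length field")
    return struct.pack("!H", len(packet)) + packet


def deframe(buffer: bytearray) -> List[bytes]:
    """Remove and return every complete packet from a receive buffer.

    A trailing partial frame is left in the buffer until more bytes arrive,
    since TCP does not preserve packet boundaries.
    """
    packets = []
    while len(buffer) >= 2:
        (length,) = struct.unpack_from("!H", buffer, 0)
        if len(buffer) < 2 + length:
            break  # frame not complete yet
        packets.append(bytes(buffer[2:2 + length]))
        del buffer[:2 + length]
    return packets
```

   A receiver would append each TCP read to the buffer and call
deframe() after every read.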

   Regarding the other problem of specifying how data can be carried in 
separate TCP streams under RTSP, John concluded and Magnus Westerlund 
agreed that this should be left to the revised RTP specification.

RTCP Extended Reports (XR)

   Timur Friedman reviewed the changes to the RTCP reporting 
extensions in 
draft-ietf-avt-rtcp-report-extns-02 with minor additional updates in -03.  
The RTCP packet type number is changed to 207 to avoid the conflict with the 
RTCP Feedback draft, and the numbering and structure of a few of the 
report blocks have been simplified.  There are several open issues 
regarding the -03 revision, the first of which is that the title was 
changed to avoid the need to spell out the RTCP acronym.  Steve Casner 
feels that the title should remain RTCP because that is the accurate 
topic.  The title would be "RTP Control Protocol (RTCP) Extended 
Reports", which is not too long.  Section 4.3 specifies how packet 
arrival timestamps in RTP timestamp format may be reported back; Magnus 
Westerlund has suggested that these timestamps should be converted to some 
fixed units to avoid problems if the RTP timestamp clock rate changes.  
Steve Casner said that is not usually done because it causes a number of 
other complications, and that you should be able to use the sequence 
numbers that are returned with the timestamps to select the correct rate.  
However, Steve asked for clarification in the text that these are 
arrival timestamps, not the sender's timestamps.

   In Section 4.4, the definition of standard deviation and TTL will be 
clarified.  The variable geometry of this packet type, indicated by a bit 
field, was simplified to a single format.  Some people would prefer the 
space savings of the variable format, while others prefer the 
simplicity.  Steve suggested that if particular subsets are most useful, 
then those subsets should be defined as distinct block types.

   Alan Clark discussed the changes to the VoIP metrics in Section 4.7.  In 
-02, echo level measurements were added since a number of 
implementations produce that data.  In -03, to support 
sample-based codecs as well as frame-based codecs, the jitter buffer 
metric was changed to units of time (5ms) rather than frames.  A couple of 
people raised the question whether the 5ms unit should be 1ms instead.  
Alan responded that 5ms seemed a better tradeoff between resolution and 
range for the performance effects to be measured.  It could be done, but 
would require increasing three fields from 8 to 16 bits.  Anwar ??? asked 
about the requirement that all fields of the VoIP report block be 
supported.  Alan responded that although the block structure is fixed, some 
of the fields allow an "undefined" value for use when the metric does not 
apply.  Steve Casner asked for the use of the undefined value to be 
clarified.  Anwar also asked if there would be a XR block type for 
one-way delay.  Timur responded that this could be added in the future if 
someone develops a method that proves useful.  Philippe Gentric wants to get 
reports of instantaneous fill level rather than average; Alan said the 
collection of measures should allow this to be determined.
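
   The resolution versus range tradeoff for the jitter buffer fields can
be worked through with simple arithmetic.  This sketch assumes every code
point of an unsigned field is usable for a duration; if a value is
reserved (for example as "undefined"), the range is slightly smaller.

```python
def max_reportable_ms(field_bits: int, unit_ms: int) -> int:
    """Largest duration an unsigned field can express, in milliseconds."""
    return (2 ** field_bits - 1) * unit_ms


# Compare the field width and unit combinations raised in the discussion.
for bits, unit in [(8, 5), (8, 1), (16, 1)]:
    print(f"{bits}-bit field, {unit} ms units: up to "
          f"{max_reportable_ms(bits, unit)} ms")
```

   Under that assumption an 8-bit field covers 1275 ms in 5 ms units but
only 255 ms in 1 ms units, which is why moving to 1 ms without losing
range would mean widening the fields to 16 bits.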

   Steve Casner raised a few issues regarding the completeness of the 
draft and the merging of the VoIP metrics with the other sections.  The 
VoIP metrics include a measure of round trip time, but as specified, this is 
only possible if data is being transmitted in both directions.  The draft 
needs to explain that, or to explain what to do if that's not the case.  The 
VoIP section could refer to the non-sender RTT measurement method in 
Section 4.5, but it doesn't.  The draft needs an applicability 
statement at the top to explain which of the tools in this toolbox should be 
used in which situations, and how they would be used.  The draft should be 
careful to avoid unstated assumptions about the use cases.  Another 
problem is that Section 4.7 gives rules for how often the VoIP metric 
blocks must be sent.  This draft is not allowed to say that.  The RTP spec 
defines the basic RTCP packet timing, and the RTCP Feedback profile 
defines alternate timing under a more restrictive use case.  This draft 
could refer to use under the RTCP Feedback profile, but it can't specify 
timing of its own.  Alan responded that this would be done, and 
commented that in some cases the more useful information in the VoIP 
report block would allow less frequent reporting, saving bandwidth.  Steve 
agreed that it would be good to add a sentence pointing this out because the 
increase in RTCP packet size caused by the inclusion of the VoIP 
metrics will mean that the interval between RTCP packets is 
increased.
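
   The effect of packet size on the reporting interval can be seen in a
deliberately simplified model of the RTCP timing rule, in which the
interval is the number of members times the average RTCP packet size
divided by the RTCP bandwidth, subject to a minimum.  This sketch omits
the randomization, the sender/receiver bandwidth split, and timer
reconsideration defined in the RTP spec, and all numbers are
hypothetical.

```python
def rtcp_interval_s(members: int,
                    avg_rtcp_size_octets: float,
                    rtcp_bw_octets_per_s: float,
                    t_min_s: float = 5.0) -> float:
    """Simplified deterministic RTCP interval in seconds."""
    computed = members * avg_rtcp_size_octets / rtcp_bw_octets_per_s
    return max(t_min_s, computed)


# Hypothetical session: 100 members sharing 400 octets/s of RTCP bandwidth.
without_voip = rtcp_interval_s(100, 100, 400)   # 100-octet average packet
with_voip = rtcp_interval_s(100, 140, 400)      # larger packet with extra block
print(without_voip, with_voip)
```

   In this model the interval grows in proportion to the average packet
size once the computed value exceeds the minimum.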

   A more complicated issue raised by Magnus Westerlund is that the draft 
should specify SDP signaling for its use.  This is a valid request, but 
might introduce too much delay in completing the draft.  Colin Perkins 
pointed out the need to update the document quickly in order to meet a May 
deadline for consideration by ITU H.460.9.  The specification of SDP 
signaling could be a separate draft, but it appears feasible to include a 
specification of SDP for point-to-point applications and leave more 
complicated scenarios for a separate draft.


RTCP Extensions for SSM with unicast feedback

   Julian Chesterfield discussed updates in 
draft-ietf-avt-rtcpssm-03 which introduces some additional summary 
statistics, some clarifications and examples of security 
requirements and how the security methods should be used for RTSP and SIP 
sessions, and updates to the IANA considerations.  The new summary 
sub-blocks are IPv6 feedback address, BYE list, and RTCP receiver 
bandwidth.  This revision is now aligned with the RTCP XR format that 
Timur just presented, as was suggested in a previous AVT meeting.  Steve 
Casner clarified that the reason for the suggestion was not to save an RTCP 
packet type code point, rather to ask if the same reporting 
mechanisms could be shared for both purposes rather than defining 
different ones.  If there is no feasible merging of 
functionality, then there is not a requirement to fit the SSM reports into 
the XR block structure with extra overhead.  The authors will 
reconsider this issue.

   Some open issues remain for the draft: clarification that the sender 
must forward a group size report whenever the size changes to allow 
correct operation of RTP timer reconsideration rules; reporting in the 
summary what the data corresponds to in terms of sample group size and 
receiver report age; and the alignment with XR.  The authors would like to 
know whether there are any other implementations of this draft, and 
whether it will be considered ready for WG last call when these open 
issues have been addressed.

   Timur Friedman suggested using XR block type 8 rather than 10.  John 
Lazzaro asked about summarization of the "extended highest sequence 
number" field from RTCP reports.  MWPP uses this value; he would want the 
summary to be the minimum value reported by receivers.  He will think 
about the details and make a suggestion.  Steve Casner asked how the 
summarization methods in this draft compare with the summarization 
methods in the XR draft.  Eve Schooler responded that the nature of the 
summary methods in the two drafts is different.  The XR draft reports 
sampling of data, while SSM reports mathematical distributions.  The 
authors of both drafts agreed to meet after the session to discuss this in 
more detail.  Steve commented that the reason for raising the point was 
that if there is no common applicability of the two methods, then 
binding them together does not make sense.

    
RTP Payload Format for JVT Video

   Stephan Wenger discussed 
draft-ietf-avt-rtp-h264-01.  The spec for the JVT codec itself was 
finalized in a meeting the previous week with acceptance for ISO FDIS 
status and with "Consent" in the ITU-T due on March 28, so it is now time to 
finalize the packetization here and Stephan hopes to accelerate the 
process.  A -02 revision is to be produced shortly, and perhaps one more 
revision if there are comments.  The hope is to go to WG last call before 
the next meeting.  There were many editorial changes from -00, plus a few 
technical additions.  Fragmentation was added, but needs more 
description and an example.  A "decoding order number" was added to 
facilitate some special handling of the packets without needing to decode 
the bitstream, but the description in this revision is inadequate.  A 
section was added on MIME registration and SDP usage, but this may also 
need refinement.

   The main issue for this payload format is interoperability with the 
MPEG-4 Simple payload format.  A new informational draft will be written to 
describe how this interoperation may be done.  Colin Perkins asked 
whether interoperation with RFC 3016 is possible and whether there is any 
pressure for this.  Stephan replied that he thought not, but will 
investigate for the interop draft.  As the h264-01 draft now stands, it is 
possible to transmit a subset of JVT in MPEG-4 Simple access unit 
fragments, although the utility of this is not clear.  It was also agreed at 
the JVT meeting to add a new form of STAP to allow more useful 
interoperation.  But it has also been suggested that this payload format 
should re-model the interleaving of MPEG-4 Simple and that the 
STAP/MTAP syntax should be aligned with MPEG-4 Simple.  This would 
achieve syntactic alignment allowing some re-use of code, but there would be a 
huge semantic difference.  Dave Singer expressed the position that this 
draft should be made the best it can be for H.264 first, and then 
consider interop.  The only solid requirement for interop is that it must be 
possible for a video stream using this payload format to be part of an 
MPEG-4 presentation.  That is already possible.  It was agreed not to 
introduce mythical similarities.

   At the previous AVT meeting, Stephan was resisting the addition of 
media-unaware fragmentation, but has added it in this revision because it 
may be necessary to transport a NALU of size greater than 64KB which is the 
most that IPv4 fragmentation can handle.  In addition, there may be 
content pre-recorded with a NALU size that does not fit the MTU size of a 
delivery network.  Doing the fragmentation at the application layer 
allows application of tools such as RFC 2733 FEC for better 
protection efficiency.
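
   A generic sketch of media-unaware fragmentation at the application
layer follows.  It is not the fragmentation syntax of the h264 draft:
the start and end flags, the function names, and the 1400-octet payload
limit are all hypothetical, chosen only to show a unit larger than 64KB
being split into MTU-sized pieces and rebuilt.

```python
from typing import List, Tuple

# (is_first, is_last, data): a hypothetical fragment representation
Fragment = Tuple[bool, bool, bytes]


def fragment(unit: bytes, max_payload: int) -> List[Fragment]:
    """Split one large unit into pieces of at most max_payload octets."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [unit[i:i + max_payload]
              for i in range(0, len(unit), max_payload)]
    last = len(chunks) - 1
    return [(i == 0, i == last, c) for i, c in enumerate(chunks)]


def reassemble(fragments: List[Fragment]) -> bytes:
    """Rebuild the unit, assuming all fragments arrived in order."""
    return b"".join(data for _, _, data in fragments)


big_unit = bytes(range(256)) * 300        # 76800 octets, larger than 64KB
pieces = fragment(big_unit, 1400)         # assumed payload budget per packet
```

   Because each piece travels in its own RTP packet, packet-level tools
such as FEC can then be applied to the pieces individually.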

   The security section of the document needs to be expanded beyond the 
minimal text in -01.  The main issue is the vulnerability of Parameter Sets 
when transmitted in-band.  This is addressed in the codec spec, and Colin 
Perkins said a normative reference could be made to that.  Steve Casner 
asked why the in-band transmission of Parameter Sets should be allowed at 
all, given that the draft says it is a bad idea.  Stephan replied that it is 
necessary in some scenarios involving gateways.  Philippe Gentric 
expressed serious concern that if this practice is allowed then people will 
use it when it shouldn't be used, and it will cause endless headaches 
similar to those already seen with RFC 3016.  Stephan said the draft will 
say SHOULD NOT to discourage its use.  The codec spec also includes the 
means to transmit data of unknown type that may or may not be active 
information, for example a set-top box software update that could 
contain a virus.  Stephan asks for help on security language to address 
this problem.

   Another open issue is video conferencing support.  Stephan would like to 
specify MIME codepoints for levels of operation beyond those included in the 
codec spec, e.g. 704x576 images at 7.5 fps.  Colin asked whether these 
modes will be added in future revisions of the codec spec, which would 
result in two ways to specify the same mode.  Stephan said no.  There is 
also a question whether this payload format is the right place to 
specify whether the response to a request for a full intraframe is 
mandatory.  It was agreed that this payload format cannot specify that 
because it is subject to congestion control.  Colin said this topic will be 
discussed separately in some combination of AVT and MMUSIC.

   The last open issue is regarding timestamps for Field mode, where there 
may be separate sampling instants for the two fields of a picture.  
Discussion requires detailed knowledge of H.264, so was deferred.
    
RTP Payload Format for uncompressed video

   Ladan Gharai reviewed the modifications, additions, and remaining open 
issues for draft-ietf-avt-uncomp-video-02.  The payload header now 
extends the sequence number in the RTP header to 32 bits to 
accommodate high data rates.  Additional color codings for 4:1:1 and 4:2:0 
interlaced and progressive were added, and separate timestamps are 
specified on interlaced fields to accommodate reversing 3:2 pulldown (an 
issue that was raised at the previous meeting).  The draft now 
specifies required and optional SDP parameters (rate, pgroup, 
color-mode, etc.) and covers congestion control in Security 
Considerations (a serious issue for very high data rates).
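
   The motivation for the 32-bit sequence number can be sketched as
follows.  The assumption here, not stated in these minutes, is that the
payload header carries the high-order 16 bits and the RTP header the
low-order 16 bits; the packet rate is a hypothetical figure.

```python
def extended_seq(ext_high16: int, rtp_seq16: int) -> int:
    """Combine payload-header and RTP-header fields into a 32-bit number."""
    return ((ext_high16 & 0xFFFF) << 16) | (rtp_seq16 & 0xFFFF)


def wrap_time_s(bits: int, packets_per_s: float) -> float:
    """Seconds until a sequence number of the given width wraps around."""
    return 2 ** bits / packets_per_s


# Hypothetical rate: 1 Gbit/s in 1250-octet packets is 100,000 packets/s.
rate = 1_000_000_000 / (1250 * 8)
print(wrap_time_s(16, rate))   # 16-bit RTP sequence number alone
print(wrap_time_s(32, rate))   # with the 16-bit extension
```

   At such a rate the 16-bit number wraps in well under a second, while
the extended number lasts for hours.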

   There are some open issues.  At the previous IETF it was suggested that a 
planar video mode be added to the draft, i.e., sending Y, Cb and Cr 
planes separately.  This requires a 2-bit field, which could be stolen from 
the length field, to indicate which plane.  The main concern is that a 
substantial portion of the draft related to pgroups and color 
subsampling would be irrelevant to this mode.  Does it make sense to 
combine these disparate modes in one draft?  Stephan Wenger 
recommended keeping them separate.  Two bits may not be enough since there 
are some color models with more than 4 planes, and the plane numbering 
would need to be specified for all these models.  He also felt this would 
not be used because display systems are built to receive data in that 
form.  Philippe Gentric countered that systems doing processing, rather 
than display, do operate on the planes separately, and since they 
operate at gigabit rates, saving memory shuffling is important.  
However, after some study he concluded that the implementations differ so 
much that a single format won't help.  It was agreed to drop planar mode.

   On the other hand, the group felt it was worthwhile to keep the 
detailed listing of pgroup values rather than just the text 
explaining the rules for determining the values.  Ladan asked if anyone has 
additional color subsampling or colorimetry values that should be 
included in the draft.  Stephan Wenger said he had several that he would 
discuss after the meeting.  Steve Casner said the draft needs to specify the 
means for defining and registering new values after the draft is 
published.

   The last question was whether there should be a means to 
independently represent SMPTE timecodes, or whether the RTP and RTCP SR 
timestamps are sufficient.  Dave Singer said the SMPTE timecodes are 
required for some video applications because they can have gaps in edited 
programs, but it is orthogonal to the data format of the video.  Colin 
Perkins asked whether the NPT mapping in RTSP solves this problem or 
whether some means is needed in the payload format.  Philippe Gentric said 
this is really needed for studio work, and recommended that there be a 
separate payload format defined for carrying just the SMPTE timecodes that 
could then be synchronized with RTP mechanisms to any payload format.  
Colin recalled a proposal for that a couple of years ago, and 
suggested that we resurrect it.  Stephan advised we need input on real 
usage cases in order to produce anything useful.  John Lazzaro said that 
MIDI timecodes are SMPTE timecodes encoded in MIDI, and MWPP provides a 
means to transport these with resiliency.  This would be a solution, 
perhaps a gross one.


RTP Payload Format for a 64 kbit/s transparent call

   The second session started with a discussion of the RTP Payload format 
for 64 kbps transparent calls 
(draft-kreuter-avt-rtp-clearmode-02.txt) by Ruediger Kreuter. This draft 
describes a single 64kbps clearmode connection, and MIME type 
registration. It is distinct from Nx64kbps transport 
(draft-vainshtein-cesopsn-XX.txt) work which is done in the PWE3 working 
group. 

   The MIME type "audio/clearmode" is registered. Ruediger asked whether this is 
appropriate. Flemming Andreason said yes, using "audio" as the MIME type 
makes things easier in SDP since it can be included in the same media 
stream as other codecs. Ruediger also noted that the ITU use "audio" for 
this, as do existing products, so that helps compatibility. Ruediger noted 
that Magnus Westerlund sent a number of comments on the draft. Due to 
these he will include the "maxptime" parameter, and an explanation of the 
mapping from MIME parameters to SDP parameters. 

   The chairs asked if there were objections to making this a work item. 
None were raised, so the next version will be submitted as an AVT work 
item, subject to area director approval.  Colin Perkins noted that it is 
necessary to avoid overlap with work in PWE3 (i.e. limit this to a single 
64kbps channel) if we are to accept it as a work item.


The MIDI Wire Protocol Packetization (MWPP)

   John Lazzaro discussed MWPP 
(draft-ietf-avt-mwpp-midi-rtp-06.txt) and coding guidelines 
(draft-lazzaro-avt-mwpp-coding-guidelines-02.txt). He noted that he has been in 
discussion with the MMA and AMEI (the MIDI manufacturers 
associations for the US and Japan), and is incorporating their feedback as 
appropriate. He also noted that IEEE P1639 has related work, solving the 
problem at layer 2, and outlined differences between the two 
approaches. 

   John noted that equipment today uses MIDI cables, asynchronous serial 
cables, which do not use timestamps. Instead, the timing of the signal on 
the wire denotes the media timing. People want pseudo-wire emulation in 
RTP, so they can use the precise packet times when operating on a LAN or 
other low-jitter environment.  John wants a parameter to indicate that the 
sender is "precision", so the last command in the packet indicates the 
sending time. 

   The other issue is synchronization between direct digital audio output 
and synth MIDI output. Options are to use an external clock sync, and 
slave everything to it, or to specify packet buffer latency and use 
manual calibration.

   There was much discussion around this subject between John, Magnus 
Westerlund, Steve Casner and Colin Perkins. This leant towards the first 
option: somehow convey the mapping between RTP timestamp and an 
external sync clock, and use the external clock to determine when to 
play out the media. Steve Casner gave a reference to the BBN 
synchronization protocol (J. Escobar, C. Partridge and D. Deutsch, Flow 
Synchronization Protocol, IEEE/ACM Transactions on Networking, Volume 2, 
Number 2, April 1994), which is an example of this approach. 
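
   The first option can be illustrated with a minimal sketch, assuming the 
sender has signalled one reference pair (an RTP timestamp and the matching 
external clock reading, in seconds) together with the RTP clock rate. The 
function and parameter names are hypothetical and are not taken from the 
MWPP draft or from the cited paper.

```python
def rtp_to_sync_time(rtp_ts, ref_rtp_ts, ref_sync_time, clock_rate):
    """Map a 32-bit RTP timestamp onto an external sync clock (seconds).

    ref_rtp_ts and ref_sync_time are an assumed signalled reference pair:
    the RTP timestamp ref_rtp_ts corresponds to ref_sync_time on the
    external clock.
    """
    # RTP timestamps wrap modulo 2**32, so take a signed 32-bit difference.
    delta = (rtp_ts - ref_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return ref_sync_time + delta / clock_rate


def playout_time(rtp_ts, ref_rtp_ts, ref_sync_time, clock_rate, latency):
    """External-clock time at which to play out, after a fixed latency."""
    return rtp_to_sync_time(rtp_ts, ref_rtp_ts, ref_sync_time, clock_rate) + latency
```

   With such a mapping, every renderer slaved to the same external clock 
can schedule a given RTP timestamp for the same instant.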
   
   Colin Perkins noted that RTSP has a similar signalling mechanism, to 
convey a mapping between RTP timestamps and a SMPTE timecode. John noted 
that MWPP may also be used with SIP, so a more general parameter may be 
useful, but is nervous about splitting this out into a separate 
document due to feature creep. It may be appropriate for the MWPP 
document to outline the form of the solution, but to leave the actual 
solution to a different document (since the solution to 
synchronizing playout from multiple sources is more general than MWPP).


RTP Payload Format for ATRAC-X

   Matthew Romaine discussed 
draft-hatanaka-avt-rtp-atracx-01.txt, the RTP payload format for 
ATRAC-X. A number of questions were raised at the last meeting: 
ambiguity in the timestamp definition, concern that the reasoning behind the 
multiplexing was not convincing, the reason why RFC 2198 was not used for 
redundancy, and concern about possible decoding ambiguity when 
fragmentation is used. Since then, the timestamp and sample rate have been 
clarified, the redundant data framework has been modified, a new method of 
multi-channel data decomposition has been introduced, and the reasons for 
multiplexing have been solidified.  

   The timestamp now corresponds to the presentation time in 
milliseconds and the sample rate must now be identical for all streams. 
Steve Casner asked why milliseconds are used for the presentation time 
rather than the sample clock rate? The authors will revisit this.

   The RFC 2198 redundancy format has a lot of overhead, because of the 
payload headers, and its block length field (10 bits) is too short, so 
redundancy is included in the payload format.
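
   For reference, the 10-bit limit comes from the 32-bit header that RFC 
2198 places before each redundant block: an F bit, a 7-bit block payload 
type, a 14-bit timestamp offset and a 10-bit block length. The sketch 
below packs one such header to show where the 1023-octet ceiling arises; 
the function name is hypothetical and this is not code from the ATRAC-X 
draft.

```python
import struct

MAX_BLOCK_LENGTH = (1 << 10) - 1  # 10-bit field: at most 1023 octets


def pack_redundant_header(block_pt, ts_offset, block_len):
    """Pack an RFC 2198 redundant block header (F bit set) into 4 octets."""
    if not 0 <= block_pt < (1 << 7):
        raise ValueError("block payload type must fit in 7 bits")
    if not 0 <= ts_offset < (1 << 14):
        raise ValueError("timestamp offset must fit in 14 bits")
    if not 0 <= block_len <= MAX_BLOCK_LENGTH:
        raise ValueError("block length must fit in 10 bits")
    # Layout, most significant bit first: F | PT(7) | offset(14) | length(10)
    word = (1 << 31) | (block_pt << 24) | (ts_offset << 10) | block_len
    return struct.pack("!I", word)
```

   Each redundant block therefore costs four octets of header and cannot 
describe more than 1023 octets of data.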

   Multi-channel decomposition is now included, allowing rate 
adaptation at the sender, which can drop the less important channels based on 
feedback from the receiver, and allowing for multi-channel 
presentations to be split across multiple ATRAC-X streams if they use more 
channels than are supported by the codec. 

   There was some confusion over the terminology, with Steve Casner and 
Colin Perkins asking for clarification on whether the multiple streams 
are from one source or from multiple sources. The draft will be 
updated to clarify the distinction between ATRAC-X streams and RTP 
streams.

   Steve Casner expressed concern that the payload format is 
inflexible in the way streams are multiplexed: for example, layered coding 
is typically implemented by layering data across several RTP sessions, but 
the payload format sends the multiple layers within a single RTP stream. It 
would be appropriate to consider scenarios other than unicast 
streaming to ensure the design does not unduly constrain future 
applications.

RTP Payload Format for iLBC Speech

   Alan Duric discussed iLBC speech 
(draft-ietf-avt-rtp-ilbc-01.txt and 
draft-ietf-avt-ilbc-codec-01.txt). Since the last meeting, the codec and 
payload format have been tested at the SIPit 12 interop event, with 
several interoperable implementations; a 20ms frame size mode has been 
introduced into the codec (with 4 blocks of 160 samples); and the PLC has 
been enhanced.

   The SDP now includes a "mode=" attribute to indicate if a 20ms or 30ms 
frame size is preferred (the default, if the attribute is not present, is 
30ms for backwards compatibility). Steve Casner asked if the frame size can 
be distinguished in-band? Not without breaking the backwards 
compatibility. Steve asked how the zero value - support both - is used? Not 
clear, maybe just list it as "reserved". Colin Perkins asked if the 
"a=ptime" attribute can be used instead?  No, due to ambiguity with 60ms 
packets (2x30ms or 3x20ms).
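
   The ambiguity can be shown with a short sketch that lists every way a 
packet duration divides into whole frames of one size. The function name 
is hypothetical; only the 20ms and 30ms frame sizes come from the 
discussion above.

```python
def frame_layouts(packet_ms, frame_sizes_ms=(20, 30)):
    """Return (frame_count, frame_ms) pairs that exactly fill packet_ms."""
    return [
        (packet_ms // size, size)
        for size in frame_sizes_ms
        if packet_ms % size == 0
    ]


# A 60ms packet has two valid interpretations, so ptime alone cannot
# tell the receiver which frame size is in use.
layouts_60 = frame_layouts(60)  # [(3, 20), (2, 30)]
```

   Durations such as 20ms, 30ms or 40ms give a single layout, but any 
multiple of 60ms does not, which is why a separate mode indication is 
needed.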

   The current open issue is VAD and comfort noise generation. No 
interest at present, so they plan to drop this. Steve Casner asked if 
comfort noise frames would need to be indicated in-band? Yes; how they can 
be distinguished needs to be considered. It may be possible to use the 
existing comfort noise payload format. 

   See also http://www.iLBCfreeware.org/ for source code and more info.

RTP Payload Format for the Speex Codec

   Greg Herlein was scheduled to discuss the Speex codec and payload 
format (draft-herlein-speex-rtp-profile-00.txt), but couldn't reach the 
meeting due 
to traffic disruption caused by anti-war protests in the city. Steve 
Casner briefly outlined the codec and payload format, and asked for any 
comments to be sent to the mailing list.

RTP Payload Format for RGL Codec

   Michael Ramalho discussed the RGL codec 
(draft-ramalho-rgl-desc-01.txt) and payload format 
(draft-ramalho-rgl-rtpformat-01.txt). He started with a brief 
introduction to the codec, which provides lossless compression for G.711. 
The output of the RGL codec is variable rate. Arbitrary input can be 
compressed, and the output will be at most one octet larger than the input 
(and usually will be much smaller).

   The first octet of the compressed output will never match one of a set 
of reserved values (0x3e, 0x5e, 0x7e, 0x9e, 0xbe, 0xde and 0xfe). These 
reserved values are used in the payload format to indicate a 
particular bundled format: bundling of one, two or three RGL frames per 
packet is signalled by one of these reserved values, and a less 
efficient, but generic, format is signalled by another value, and allows an 
arbitrary number of frames per packet. 
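
   A receiver-side check for this scheme might look like the sketch 
below. Only the set of reserved values comes from the discussion; which 
value selects which bundled format is not stated here, so the sketch 
stops at detecting a marker. The function name is hypothetical.

```python
# Reserved first-octet values listed in the discussion above.
RESERVED_FIRST_OCTETS = frozenset({0x3E, 0x5E, 0x7E, 0x9E, 0xBE, 0xDE, 0xFE})


def starts_with_format_marker(payload: bytes) -> bool:
    """True if the payload begins with a reserved format octet.

    False means the first octet could be the start of a compressed RGL
    frame, since the codec never emits a reserved value in that position.
    """
    if not payload:
        raise ValueError("empty payload")
    return payload[0] in RESERVED_FIRST_OCTETS
```

   This kind of in-band test is what signalling the bundling at session 
setup, or using separate payload types, would make unnecessary.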

   Steve Casner asked if the codec is likely to switch between single- and 
multi-frame modes? It may be possible to distinguish the bundling at 
session setup time, to avoid the need for reserved byte codes (as is done by 
the EVRC payload format). Magnus Westerlund also noted that one can 
always distinguish the format using different RTP payload types, so it may 
not be necessary to use the reserved codes.

   Magnus Westerlund asked if the generic format, or something similar with a 
count of the number of frames in the packet, can be used in all cases? This 
is possible, but the overhead may be too large. Magnus noted that, if the 
type octet was removed, the size would be the same in almost all cases, and 
the format would be simpler. 

   Steve Casner noted that this is rather complex, with several 
different formats.  That is a drawback, which could be fixed using two 
payload types (one frame per packet; multiple frames per packet).

   Michael noted that it is desirable that the number of samples in the 
payload headers matches the "a=ptime" attribute in the SDP. Also, he noted 
that there is a missing parameter for a-law or u-law.

   More information is available at 
http://www.vovida.org/
    
RTP Payload Format for ETSI ES 202 050 DSR

   Qiaobing Xie discussed 
draft-xie-avt-dsr-es202050-00.txt, which is a new encoding scheme for 
distributed speech recognition. The draft uses exactly the same format as 
that used in the previous DSR payload format RFC-to-be, although the 
contents of the frame pairs, and the MIME type, are different.  Accept as a 
work item? Yes, subject to area director approval.

   Steve Casner asked if this draft could be simplified further, being just a 
reference to the previous DSR payload format and registration of the new 
MIME type? Qiaobing wished to clarify a couple of points that were 
unclear in the previous draft, and so copied the text and added a 
diagram for explanation. Steve said it would be appropriate to insert the 
change into the other draft, as an RFC editor note, to ensure that both are 
clear.

File format for EVRC/SMV vs QCP

   Steve Casner noted that there is a proposal to change the file format for 
the EVRC/SMV payload format (currently with the RFC Editor) to be that 
defined in draft-garudadri-qcp-00.txt. Steve asked for an "ego-free" 
evaluation of which format makes most sense, or if we should accept both 
formats?

   It was asked if the QCP file format has IPR associated with it, and if 
the implementations are independent. Randall Gellens replied, noting that he 
is still researching these issues and cannot give a definitive answer, but 
that he thinks the QCP implementations are independent and is not aware of 
any IPR on QCP. 

   Randall also noted that he was unaware of the QCP format when working on 
the EVRC/SMV payload format.  The QCP format has been in use for several 
years, by other applications, but the MIME type has not been 
registered.  It may be necessary to register this, for use with QCELP for 
example, even if it is not used for the EVRC/SMV payload format. There may be 
implementations using this with EVRC/SMV though.

   It is unclear if the EVRC/SMV format is used as currently 
specified, but 3GPP-2 may be considering this. Steve Casner asked if we can 
get clarification from 3GPP-2 on their plans for the file format? We may be 
able to get a liaison statement from 3GPP-2, stating their needs?

   No decision was made, but it is hoped that input from 3GPP-2 can help to 
resolve the issues.

Other discussion

   Since there was time left at the end of the session - due to missing the 
Speex discussion - Steve Casner asked if there were any other topics that 
the group wished to discuss?

   It was asked if it is possible to add new items to the RTCP 
reporting extensions draft (e.g. for video quality assessment)? Not at this 
time, because that draft is complete and we don't wish to delay it. 
Extensions may be written as new drafts though.

   John Lazzaro asked when the AVT group might close, since the charter is 
open ended due to the addition of new payload formats? Steve noted that we 
have fairly broad license to work on payload formats, but need area 
director approval for other new work items. Colin Perkins noted that the 
current charter lists getting RTP to Draft Standard as the main goal of the 
group. This is now complete, so we should be considering future work 
items.
   
   Stephan Wenger asked if it is appropriate to consider moving 
existing payload formats to draft standard? Yes, or to historic if they are 
no longer useful. There was some discussion of how the H.263 format can 
move to draft standard. 

   Magnus Westerlund noted that NAT boxes present a challenge. Should the 
AVT group work to define "symmetric RTP"? Steve replied that it should 
probably be defined in the control protocols that use RTP, since they may 
have differing interpretations of the term. Magnus noted that there is a lot 
of commonality between the protocols, so a separate draft may be 
warranted, and there is also the issue of address binding. This is 
something to be discussed offline.

   Colin Perkins noted that the group may also wish to consider mapping RTP 
onto DCCP as a future work item. Transport over PR-SCTP was suggested as 
another possible work item.  These are long term goals, not something we 
should rush into, but something to consider as those other protocols are 
developed.