2.8.16 Speech Services Control (speechsc)

NOTE: This charter is a snapshot of the 64th IETF Meeting in Vancouver, British Columbia, Canada. It may now be out-of-date.

Last Modified: 2005-09-20

Chair(s):

David Oran <oran@cisco.com>
Eric Burger <eburger@brooktrout.com>

Transport Area Director(s):

Allison Mankin <mankin@psg.com>
Jon Peterson <jon.peterson@neustar.biz>

Transport Area Advisor:

Jon Peterson <jon.peterson@neustar.biz>

Mailing Lists:

General Discussion: speechsc@ietf.org
To Subscribe: speechsc-request@ietf.org
In Body: subscribe
Archive: http://www.ietf.org/mail-archive/web/speechsc/index.html

Description of Working Group:

Many multimedia applications can benefit from having Automated Speech
Recognition (ASR), Text to Speech (TTS), and Speaker Verification (SV)
processing available as a distributed, network resource. To date,
there are a number of proprietary ASR, TTS, and SV APIs, as well as
two IETF drafts, that address this problem. However, there are serious
deficiencies in the existing drafts relating to this problem. In
particular, they mix the semantics of existing protocols yet are close
enough to other protocols as to be confusing to the implementer.

The speechsc Work Group will develop protocols to support distributed
media processing of audio streams. The focus of this working group is
to develop protocols to support ASR, TTS, and SV. The working group
will only focus on the secure distributed control of these servers.

The working group will develop an informational RFC detailing the
architecture and requirements for distributed speechsc control. In
addition, the requirements document will describe the use cases
driving these requirements. The working group will then examine
existing media-related protocols, especially RTSP, for suitability as
a protocol for carriage of speechsc server control. The working group
will then propose extensions to existing protocols or the development
of new protocols, as appropriate, to meet the requirements specified
in the informational RFC.

The protocol will assume RTP carriage of media. Assuming
session-oriented media transport, the protocol will use SDP to
describe the session.

The working group will not be investigating distributed speech
recognition (DSR), as exemplified by the ETSI Aurora project. The
working group will not be recreating functionality available in other
protocols, such as SIP or SDP. The working group will offer changes to
existing protocols, with the possible exception of RTSP, to the
appropriate IETF work group for consideration. This working group will
explore modifications to RTSP, if required.

It is expected that we will coordinate our work in the IETF with the
W3C Multimodal Interaction Work Group; the ITU-T Study Group 16
Working Party 3/16 on SG 16 Question 15/16; the 3GPP TSG SA WG1; and
the ETSI Aurora STQ.

Once the current set of milestones is completed, the speechsc charter
may be expanded, with IESG approval, to cover additional uses of the
technology, such as the orchestration of multiple ASR/TTS/SV servers
or the accommodation of additional types of servers, such as
simultaneous translation servers.

Goals and Milestones:

Done  Requirements ID submitted to IESG for publication (informational)
Done  Submit Internet Draft(s) Analyzing Existing Protocols (informational)
Done  Submit Internet Draft Describing New Protocol (if required) (standards track)
Done  Submit Drafts to IESG for publication
Jun 2005  Submit MRCPv2 specification to IESG

Internet-Drafts:

  • draft-ietf-speechsc-reqts-07.txt
  • draft-ietf-speechsc-mrcpv2-08.txt

    No Request For Comments

    Current Meeting Report

    DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT 
    
    
    
    SpeechSC Meeting Minutes
    IETF 64 Vancouver
    November 10, 2005
     
    Started 3:19pm
    Jeff Haynie, scribe
     
     
    Eric Burger opened with note well and agenda bashing
     
    Sarvi discussed status of MRCPv2 based on presentation
     
    discussed Sarvi's proposal to Andrew Wahbe's email to add 
    two new timers: 
     
    014(partial-match-maxtime) and 015(no-match-maxtimeout) and 
    rename Recognition-timeout to "hotword-timeout" and clarify text
     
    discussion by Dan Burnett and Dave Burke about maxspeech timer
     
    *ACTION: we (dan, ken, myself or dave) need to submit a 
    CR to VBWG for completetimeout clarification in VoiceXML
     
    Dave asked about adding a resource type to MRCP so that sip 
    registration lets an mrcp client discover the capabilities
    of an mrcp engine
     
    Sarvi noted that we should address this in a separate document
    not in MRCPv2 via RFC 3840
     
    request to add a milestone to the charter (hint hint john) to 
    deal with MRCP server resource discovery, etc.
     
    chairs agreed to move last draft to last call status
     
    Sarvi/Dan commit to having last call draft in 2 weeks
     
    Eric and Sarvi discussed possible next steps 
     
    Dave agreed that infrastructure control is higher priority 
    than multimodal control
     
    Ken encouraged everyone to join mrcp conformance effort, 
    talked about voiceprint/si features based on working group
     
    Magnus questioned media server control, eric clarified use 
    of media server control (not to be confused with RTSP as 
    in content)
     
    Magnus asked about playback resources for mrcp
     
    David discussed possible work around playback resources, but
    looking at needs in a broader detail to determine what needs 
    to be done
     
    Sarvi clarified the capability of playback resource using 
    simple synth resource, so we have a good amount of detail 
    already built
     
    David didn't want to design this now since not everything 
    is covered like time control you have in RTSP
     
    Magnus wants to sit down and clarify what would need to be 
    done, possibly work outside of group
     
    Eric asked about DMCP submission
     
    David asked if anyone would champion DMCP, in the absence 
    of input will not do anything. David/Eric are willing to help
     
    agenda done, open mic
     
    Dan asked about procedural clarification
     
    "snotty answer" from Eric - just because we're done doesn't 
    mean we're done, real answer from Jon Peterson
     
    Jon: i have to read it (give feedback), then IESG (give 
    feedback), then RFC editor (editorial review comments)
     
    Jon: process could take a couple of months to a couple of years
     
    David thanked everyone for hard work
     
    Adjourned at 4:03pm (second shortest meeting)
    
    
    
    
    DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT 

    Slides

    Chairs Slides (PDF)
    Chairs Slides (PPT)
    MRCPv2 Status