2.1.16 MIME Encapsulation of Aggregate HTML Documents (mhtml)

NOTE: This charter is a snapshot of the 40th IETF Meeting in Washington, DC. It may now be out-of-date. Last Modified: 27-Oct-97

Chair(s):

Einar Stefferud <stef@nma.com>

Applications Area Director(s):

Keith Moore <moore@cs.utk.edu>
Harald Alvestrand <Harald.T.Alvestrand@uninett.no>

Applications Area Advisor:

Keith Moore <moore@cs.utk.edu>

Mailing Lists:

General Discussion: mhtml@segate.sunet.se
To Subscribe: listserv@segate.sunet.se
In Body: subscribe mhtml <full name>
Archive: ftp://segate.sunet.se/lists/mhtml/

Description of Working Group:

World Wide Web documents are most often written using HyperText Markup Language (HTML). HTML is notable in that it contains "embedded content"; that is, HTML documents often contain pointers or links to other objects (images, external references) which are to be presented to the recipient. Currently, these compound structured Web documents are transported almost exclusively via the interactive HTTP protocol. The MHTML working group has developed three Proposed Standards (RFCs 2110, 2111 and 2112) which permit the transport of such compound structured Web documents via Internet mail in MIME multipart/related body parts.
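As an illustration of the encapsulation described above, the following minimal sketch builds a multipart/related message that carries an HTML page together with one inline image. It uses Python's standard email package. The addresses, the URL and the function name build_mhtml_message are assumptions made for this example and are not taken from the RFCs or from this charter.

```python
from email.message import EmailMessage


def build_mhtml_message(html: str, image_bytes: bytes, image_url: str) -> EmailMessage:
    """Wrap an HTML page and one image in a multipart/related message."""
    msg = EmailMessage()
    msg["Subject"] = "Aggregate HTML document"
    msg["From"] = "sender@example.com"       # placeholder address
    msg["To"] = "recipient@example.com"      # placeholder address

    # The HTML text becomes the root (first) body part.
    msg.set_content(html, subtype="html")

    # Adding a related part converts the message to multipart/related.
    msg.add_related(image_bytes, maintype="image", subtype="gif")

    # Label the image part with the URL that the HTML uses to refer to it.
    image_part = list(msg.iter_parts())[-1]
    image_part["Content-Location"] = image_url

    # Declare the media type of the root part on the multipart/related header.
    msg.set_param("type", "text/html")
    return msg


if __name__ == "__main__":
    page = '<html><body><img src="http://www.example.com/logo.gif"></body></html>'
    message = build_mhtml_message(page, b"GIF89a", "http://www.example.com/logo.gif")
    print(message.get_content_type())
```

A receiving agent would match the URL in the IMG tag against the Content-Location of the second body part instead of fetching it over HTTP.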

The Proposed Standards are intended to support interoperability between separate HTTP-based systems and Internet mail systems, as well as being suitable for combined mail/HTTP browser systems.

It is beyond the scope of this working group to come up with standards for document formats other than HTML Web documents. However, the Proposed Standards so far produced by the working group have been designed to allow other such formats to use similar strategies.

The MHTML WG is currently INACTIVE while first implementations are under way. To support implementation efforts, the WG Editor maintains an Informational Internet-Draft ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-info-06.txt that provides additional useful information for implementors. This Informational Draft also discusses Web page formatting choices that affect their efficient use through disconnected channels such as mail. It will become an Informational RFC after implementation experience has been collected. Until then, this informational draft will be kept current and available in the IETF Internet-Drafts library.

The MHTML Mailing List remains open for discussion of any issues that may arise during implementation, and to collect information about successful interoperable and interworkable implementations in anticipation of progression to Draft-Standard Status.

From May to October 1997, the working group will monitor implementation progress, discuss issues, and periodically update the draft of the Informational document.

The editors of this group are:

Main editor: Jacob Palme <jpalme@dsv.su.se>
Associate editor: Alex Hopmann <alex.hop@resnova.co

Goals and Milestones:

Mar 96   Clarify issues and submit first Internet-Draft.
Jun 96   Submit first Internet-Draft for MIME encapsulation of HTML.
Sep 96   Submit MHTML specification to IESG for consideration as a Proposed Standard.
Oct 96   Submit Internet-Draft for guidelines in created documents for disconnected access.
Dec 96   Submit guidelines Internet-Draft to IESG for consideration as an Informational RFC.
Aug 97   Meet at Munich to review Implementation progress.
Oct 97   Submit Implementation Progress Internet-Draft to IESG for publication as an Informational RFC.

Internet-Drafts:

Request For Comments:

RFC       Status   Title
RFC2112   PS       The MIME Multipart/Related Content-type
RFC2111   PS       Content-ID and Message-ID Uniform Resource Locators
RFC2110   PS       MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)

Current Meeting Report

Minutes of the MIME Encapsulation of Aggregate HTML Documents (MHTML) WG

These minutes are submitted by Einar Stefferud <stef@nma.com>, using meeting notes taken by Eric Berman <ericbe@microsoft.com>.

Eric Berman volunteered to take meeting notes for the minutes. There were no revisions to the meeting agenda as presented.

I (a). Status of Implementations

Are there any features in the standard which no one implements, and which thus might be cut off when publishing as an IETF Draft Standard?

Some questions arose about granularity. For example, if we have specified a boundary condition, does it have to actually be generated by anyone to get our standard to Draft?

Resolution: Ask this question on the list and check out Jacob's questionnaire, and get an opinion from the area directors.

A suggestion was made for an IMC interoperation event, which drew no dissent. It would seem that IMC testing of MHTML fits naturally into the framework of any of the IMC EMail and MIME testing events. This might also be done off-line, using just a repository of messages and ad-hoc testing. No special WG action should be required.

I (b). Questionnaire on implementation status, which can be found at:

<http://www.dsv.su.se/~jpalme/ietf/mhtml-impl-status-v1.html>
<http://www.dsv.su.se/~jpalme/ietf/mhtml-impl-status-v1.rtf>

Should this ask about future implementation plans? Is there any problem asking for this kind of info from vendors? Should the questionnaire have a repository of sample messages?

Nobody expressed problems with asking about future plans with regard to MHTML. But, when is the right time to ask people to fill it out? Doing it now is not a problem, but does it have any value now? The questions really need to be answered as part of the process of going to Draft Status. If we ask now, we will have to ask again later.

Jacob will post a message asking for WG inspection of his questionnaire. Comments about it should go to the mailing list.

II. What should an implementor do with features they do not support?

Is it permitted to just ignore things they do not support? Example: An implementation which cannot show graphics just ignores them, or an implementation which cannot handle a particular kind of link just ignores the objects referred to by this link. Any conformance requirements on this?

These are both MHTML and HTML issues (e.g., showing ALT text for images, or a UA that cannot resolve links). But it is a principle of HTML to just ignore stuff that is not understood; does this apply here? It seems to, and we do have our own MUST/SHOULD areas for the MHTML part of it. Consensus appeared to be that we don't need to say anything about this.

III. Allow multiple Content-Location statements in the same content heading?

Argument for: There are times when something has multiple valid names. For example, file names in an HTML document may be treated as case-insensitive: two spellings differ according to URL comparison rules, yet both usually work against forgiving HTTP servers or file name lookups. Thus, binding to the URL could give different results from looking it up in the MHTML message. Nobody at the meeting expected to implement this any time soon, but still, there seems to be no reason to prohibit it.

The problem, though, is that Content-Location can be used as a base, and thus this would lead to ambiguity.

Two proposals:

(a) if you do this, you must use a content-base at the same level, or
(b) if you do this, all of the content-locations must be valid as a base.

One implementation concern involves implementations that hash a URL for easier management. The tradeoff is a bit of extra flexibility at the expense of a bit of extra work. If we allow this, we MUST require that you do no harm. Rough consensus was that, of the two options, (a) is preferred, and everyone seemed agreeable to allowing multiple Content-Locations.

This then raised the question of whether we should prohibit relative content-locations: Should we say that they have to be absolute?

Tradeoffs: Prohibition would certainly simplify things, but would add a few extra bytes on the wire (probably irrelevant). Worded that way, the tradeoff seems clear, but are there any scenarios which this would make worse? (Need to get the list's consensus on this; no strong opinions here). Alex, Jacob, and Nick will be word-smithing this in an editors meeting at IETF, and Jacob will submit the revised Internet-Draft to the list for review and approval.
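Proposal (a) from this section can be sketched as follows. This is only an illustration of the rule as discussed at the meeting, not text from the specification: the function names choose_base and resolve are hypothetical, header values are passed in as plain lists of strings, and relative references are resolved with urljoin from the Python standard library.

```python
from typing import List, Optional
from urllib.parse import urljoin


def choose_base(content_bases: List[str], content_locations: List[str]) -> Optional[str]:
    """Pick the base URL for one content heading under proposal (a)."""
    if content_bases:
        # An explicit Content-Base always wins.
        return content_bases[0]
    if len(content_locations) > 1:
        # Several Content-Locations and no Content-Base: the base is ambiguous.
        raise ValueError("multiple Content-Location headers require a Content-Base")
    # A single Content-Location may serve as the base; otherwise there is none.
    return content_locations[0] if content_locations else None


def resolve(reference: str, base: Optional[str]) -> str:
    """Resolve a possibly relative reference against the chosen base."""
    return urljoin(base, reference) if base else reference


if __name__ == "__main__":
    base = choose_base(
        ["http://www.example.com/docs/"],
        ["http://www.example.com/docs/Page.html", "http://www.example.com/docs/page.html"],
    )
    print(resolve("images/logo.gif", base))
```

Under proposal (b) the error branch would instead require every listed Content-Location to be usable as a base, which is more work for implementations that index parts by a hash of the URL.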

IV (a). Web browsers today typically have two save formats, save as text and save as source.

Save as source saves only the HTML text, not the linked objects. The user can thus not use the saved HTML to display the message with inline objects again, unless the user has some means to retrieve these inline parts. Perhaps there should be a third option "save as message" which saves in message/rfc822 format, and thus really saves both the HTML text and the inline objects.

Save as text, save as source, save as message?

Not a controversial idea (some early implementations already do this), but the more interesting question is whether or not the document should say anything here or if it is already perfectly well specified.

Consensus was that this does not seem to require any document change!

IV (b). MHTML is not only a format for sending HTML in e-mail.

It is also an archiving format, where a whole document with its inline linked objects can be saved in one file. Does this fact require some special action from us?

Use of MHTML as a document archive format will be a useful addition to its use for sending compound documents via EMail.

V. Caching Issues

The consensus on the list seems to be that an MHTML message shows one instance of a web document at one time, and should not show a later version if a user retrieves and views the received message a few days later. (Except when Content-Type: message/external-body is used.)

Observation: archiving is not caching! Both are useful and this fact is non-controversial. But, it is important to note that they have very different purposes and behaviors.

Is interaction with message/external-body anything we need to worry about? Probably not; the value of message/external-body for archiving is pretty low. Since this isn't caching, the cache issues do not apply.

Plus, with IV (b), if we don't treat this as an archive, then we would be inconsistent: saving as an archive to disk would yield different results than saving to an inbox, which would be quite bizarre. Note that you are not forbidden to use disaggregation, but then you need to be aware of the security/caching issues. This is already covered in our specs.

Question: Can someone do:

multipart/related
   text/html (a)
   multipart/related
      image (b)
      text/html (c)
      multipart/related (d)
         image (e)

So, 1: can (a) point to (b)?
2: can (c) point to (a)? and
3: can (c) point to (e)?

Answer to 1 is no and would be ambiguous if we did allow it. Answer to 2 seems more interesting, probably allowed today. Answer to 3 would not work.

Current language seems to reflect this.

Consensus: current language is correct but not clear enough, nor does it have examples (both positive and negative).
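The three answers can be turned into the kind of positive and negative examples the consensus asks for. The sketch below encodes one reading of them, namely that a body part may refer to the direct members of any multipart/related that encloses it, but not to parts nested deeper. Both that rule and the assumed nesting ((a) and an inner multipart/related at the top level; (b), (c) and (d) inside the inner one; (e) inside (d)) are interpretations of these minutes, not quotations from the specification, and the Part class and can_reference function are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Part:
    """A body part in a MIME tree; multipart parts hold children."""
    label: str
    content_type: str
    children: List["Part"] = field(default_factory=list)
    parent: Optional["Part"] = None

    def add(self, child: "Part") -> "Part":
        child.parent = self
        self.children.append(child)
        return child


def can_reference(source: Part, target: Part) -> bool:
    """True if target is a direct member of a multipart that encloses source."""
    container = source.parent
    while container is not None:
        if any(child is target for child in container.children if child is not source):
            return True
        container = container.parent  # widen the search one level outward
    return False


# Assumed nesting of the example from the minutes.
outer = Part("outer", "multipart/related")
a = outer.add(Part("a", "text/html"))
inner = outer.add(Part("inner", "multipart/related"))
b = inner.add(Part("b", "image/gif"))
c = inner.add(Part("c", "text/html"))
d = inner.add(Part("d", "multipart/related"))
e = d.add(Part("e", "image/gif"))

if __name__ == "__main__":
    for src, dst in [(a, b), (c, a), (c, e)]:
        print(f"({src.label}) -> ({dst.label}):", can_reference(src, dst))
```

With this reading, question 1 and question 3 come out negative and question 2 comes out positive, which matches the answers recorded above.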

VI. Is there any need for discussion of interaction between MHTML and Content-Type: message/external-body?

We do not know if this is a problem, since we have never discussed it. First question is why you'd want to have such interaction? What does it do for you? What do you do with an unresolved reference?

Consensus: Already covered because it's undefined if it doesn't resolve.

VII. Additional Proposals

I may have time to submit two small additional proposals to the informational document, one about readable HTML and one about combinations of multipart/alternative where one of the alternatives is an HTML form. If I have time to do this, we might wish to discuss it at the meeting.

Nothing about this has yet been sent to the mailing list. The idea is to have a simple form of text-based mail forms, employing options, text boundaries, etc. This appears to be an interesting idea, but not really in the scope of WG MHTML. Perhaps it is best for Jacob to prepare it as a separate Informational RFC.

VIII. Any other issues?

Does XML have or cause any problems for MHTML? Probably not. But should we add it to the list of stuff like PDF that should work (in non-normative fashion)? Yes, we should allow this and mention it.

Procedurally, we (or someone) can develop a Proposed Standard for how to do XML with the MHTML tools (Multipart-related, etc.), and perhaps this might be included when MHTML goes to draft if the timing is right. But, for now, we should just work on getting our current drafts onto the Standards Track!

IX. Time schedule for publication of new proposed standard and progressing it to draft standard.

Revise the current draft text in an editors meeting at IETF. Publish the new Internet-Draft to the list.

Then we need to discuss the revised draft and relative URL simplification on the mailing list for about a week starting 5 January 1998.

Then a 2-week working group last-call, followed by IETF last call.

Slides

None Received

Attendees List

Roster not received
