IESG Narrative Minutes
Narrative Minutes of the IESG Teleconference on 2013-05-16. These are not an official record of the meeting.
Narrative scribes: John Leslie and Susan Hares (The scribes were sometimes uncertain who was speaking.)
Corrections from: Pete, Barry, Benoit
2. Protocol Actions
2.1 WG Submissions
2.1.1 New Items
2.1.2 Returning Items
2.2 Individual Submissions
2.2.1 New Items
2.2.2 Returning Items
2.3 Status Changes
2.3.1 New Items
2.3.2 Returning Items
3. Document Actions
3.1 WG Submissions
3.1.1 New Items
3.1.2 Returning Items
3.2 Individual Submissions Via AD
3.2.1 New Items
3.2.2 Returning Items
3.3 Status Changes
3.3.1 New Items
3.3.2 Returning Items
3.4 IRTF and Independent Submission Stream Documents
3.4.1 New Items
3.4.2 Returning Items
3.4.3 For Action
1231 EDT break
1236 EDT back
4 Working Group Actions
4.1 WG Creation
4.1.1 Proposed for IETF Review
4.1.2 Proposed for Approval
4.2 WG Rechartering
4.2.1 Under evaluation for IETF Review
4.2.2 Proposed for Approval
5. IAB News

We can use
6. Management Issues
7. Agenda Working Group News
Amy: we need from Martin how to split the Transport AD load
Martin: working on that tomorrow afternoon
1309 EDT Adjourned
(at 2013-05-16 07:45:39 PDT)
The terms "Intermediate Flow Selection Process" and "Intermediate Selection Process" are so similar that I had to read the glossary entry for the former several times in order to catch the difference. If possible, it would be better to use a different name to refer to this process. I realize this is a central bit of terminology in this draft, so the request may seem a bit extreme, but it looks like it's been newly introduced in this particular draft, so it's not too late to do something about it. I'm not convinced that fixing it is worth the trouble, but I raise the issue because it tripped me up; it will probably trip up other new readers of the document.

In section 6.2.1, I assume that the flow key is substantially smaller than the flow cache entry, but this is a bit surprising. I'm assuming the flow cache entry is somehow a heavier-weight thing, but it's not obvious what that extra weight is. I went looking for a definition of "flow cache" and didn't find one in any of the referenced RFCs. It might be helpful to have a glossary entry that briefly describes the flow cache. Presumably it's just the set of all flow records; if so, the definition of flow record in 5101 doesn't give me a basis for thinking that it's much larger than a flow key. None of this is intended to imply that the text is wrong; just that it might help to have a bit more exposition on the topic.

220.127.116.11: what's a flow position?

Aside from these observations, which may or may not actually be helpful, the document looks good—thanks for doing the work!
Shouldn't Section 10 discuss the security implications of revealing to an external party more detailed and finer grained information about what is happening within a network? This might be what Stephen is asking for in the second part of his Discuss.
The shepherd writeup says:

   This draft raised IPR concerns, in the same manner as the PSAMP
   selection draft had done. Nick Duffield (AT&T) commented that the
   AT&T IPR claim relates only to statistical sampling, and PSAMP
   handled this by saying "at least one of the sampling techniques
   must be implemented." In this draft, we have tightened that up a
   little by saying "a conforming implementation MUST implement at
   least the Property Match Filtering."

So you're saying that somehow the compliance requirement in section 6.1 helps with the IPR issue? I've got to say that even setting aside my general dislike of compliance requirements, it makes me very uneasy to have requirements for the purpose of addressing IPR issues. Can you explain a bit further?
Section 2, "Hash-based Flow Filtering": I'm always a bit concerned to see protocol MUST requirements in a definitions section (as I am with IANA Considerations or Appendices). It's not clear to me that these requirements will be noticed. Is there nowhere else in the document these make sense to go? (6.1.2? Somewhere in section 7?)
(1) This is probably a discuss-discuss. Section 9.1.1 allows addition of new flow selection algorithms via expert review. Would/should a flow selection algorithm that was counter to RFC 2804 and didn't have any e.g. network management functionality be approved by the experts? I think the answer ought to be "no", so maybe guidance to that effect could be added to 9.1.1.

(2) section 10, I think you should add text (or a pointer to text elsewhere) that recognizes that exported flows can be sensitive for security or privacy reasons and need to be protected. There's probably text in some other document but if not, happy to help generate that.
- I had similar questions to Pete about the write up and the IPR situation. When I went to look in the wg list archive I didn't see where this IPR declaration was announced to the list or discussed - can someone send a pointer? (I probably missed it.)

- section 2: last 2 sentences of "hash-based flow filtering" definition seem mangled

- 7.1 seems to say that property match filtering for all entries in the [iana-ipfix-assignments] registry (which is expert review) MUST be supported. Is that right? That seems odd since the registry changes (with expert review) but the code that'd be written for this spec won't necessarily change. The IANA registry is also currently an informative reference, so the "MUST be supported" interpretation above is also a bit weird, but I've no idea which subset of the IANA registered things need to be supported to get interop. Or maybe I'm confused? (As usual:-) That would have been a discuss but section 8 seems to fix the problem by listing the IANA registered things that MUST be supported. Maybe add a forward pointer from 7.1 to 8 to clarify?

- section 10, 3rd para: I suggest replacing "a user" with "an adversary" since we're only concerned here about flows involving an adversary and not with monitoring innocent users, right?

- section 10, I'm not sure that a CBC IV is the right way to say what you want - do you just want random numbers? If so then maybe NIST 800-90A is a better reference or even RFC 4086?
0) Full-on support Stephen's discusses.

1) s2: Missing a word (maybe "can be"):

   Hash-based Flow Filtering can already applied at packet level, in
   which case the Hash Domain MUST contain the Flow Key of the packet.

2) s7: 1st para 'bout info model should point to s8?
I have no objection to the publication of this document, but I do see a few issues that I would strongly suggest you resolve before handing the text to the RFC Editor.

---

Section 4 could probably be renamed since, I assume, this is no longer a proposal, but a real protocol solution. Changes to the text as well.

---

In 6.4 I think the term "reject" is mis-applied because there is no rejection involved.

---

Section 6.5 does not clearly explain how to build a rejection Reconfigure-Reply message despite the fact that section 9 says it does. You should add a statement about how a rejection is conveyed and which status code to use.

---

Section 6.6

   Depending on the status code enclosed in a received
   RECONFIGURE-REPLY message, the relay may decide to terminate the
   request or try a different corrected Reconfigure-Request.

Undoubtedly true, but which codes get which treatment?

---

Please decide whether the names of the new messages are in upper case or not.
6.4/6.5: "MUST be configurable"? Aside from this being completely unverifiable, I don't see any justification for such a requirement in this document. 6.5: "the server MAY use its content", "The server MAY use the content". Lowercase the MAYs. Better yet, use "might" or "can". These aren't protocol options.
I agree with Sean's discuss.
Thanks for quickly addressing my comment in the new draft version. Regards, Benoit
Version -07 has resolved my concerns, and thanks for addressing them.
I have no problems with the publication of this document, but I do have some non-blocking comments...

1. Section 6 : The first sentence in 6.2.1 ("Clients MUST silently discard any received RECONFIGURE-REQUEST message") is useless. A client implementation won't know about this protocol that works between relays and servers.

2. Section 6.3 : I see "The relay agent MAY supply the updated configuration in the RSOO" and wonder why this is a MAY. If the updated configuration is not included, what is the DHCP server to do? The discussion in Section 6.5 does not seem to cover the situation from the server's perspective.
s6.9: Because the Reconfigure option requires that the client authenticate it, I'm happy (s15.11 of RFC 3315); but why wouldn't you also require that the relay agent & server support the authentication option, as opposed to "A control policy based on the content of received Relay Identifier Option MAY be enforced by the DHCPv6 server"?
<gripe> s9: Points to s21.1 of RFC 3315. That section points to RFC 2401. That RFC has been obsoleted. Is there any chance that DHCPv6 is going to be looking at updating that?

<gripe> s6.5: Worth adding RFC 5007 after the RFC 3315 for status codes because that's where Not Configured is defined.

s9: Worth adding a reference to RFC 6925 for the relay identifier option? I had to go hunt that one down.
minor nit: section 4

OLD:
   If the relay has no client to reconfigured, it stops sending
   Reconfigure-Request messages.
NEW:
   If the relay has no client to be reconfigured, it stops sending
   Reconfigure-Request messages.
or
   If the relay has no client to reconfigure, it stops sending
   Reconfigure-Request messages.
The regular expression for domain names does not cover all possible domain names. Domain names can contain any ASCII character, not just the ones that RFC1034 recommends—RFC1034 doesn't exclude those other characters, but merely recommends which ones should be used in names that will be used by certain protocols. This is alluded to in the comments for the declaration, so I assume it's been thought about, but I'm a bit concerned that if the domain name data format is used to fetch a domain name that's been configured, for example in a DNS server, but that domain name contains characters not in the set defined in this document, an implementation might exhibit unexpected behavior; in any case, this wouldn't actually _work_. Is there some strategy documented elsewhere for dealing with this problem? Or is this only ever expected to be used in situations where non-conforming domain names can't occur (i.e., not for fetching configuration state from a DNS server)? I realize that this question really pertains to 6201, so it may be difficult to answer now, but it became apparent when I reviewed the document, so I thought it worth asking even if there's nothing to be done about it at the moment.
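The point above can be illustrated with a small sketch. The regex here is my own approximation of RFC 1034's "preferred name syntax" (not the declaration from the draft), and `_sip._tcp.example.com` is a real-world counterexample: SRV owner names (RFC 2782) use underscore labels that are legal in DNS but outside the preferred syntax.

```python
import re

# Approximation (mine, not the draft's) of RFC 1034's preferred name
# syntax for a single label: starts with a letter, ends with a letter
# or digit, with letters/digits/hyphens in between.
PREFERRED_LABEL = re.compile(r'^[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?$')

def matches_preferred_syntax(name: str) -> bool:
    """True if every label of the domain name fits the preferred syntax."""
    return all(PREFERRED_LABEL.match(label) for label in name.split('.'))

# An ordinary hostname fits the preferred syntax...
print(matches_preferred_syntax('www.example.com'))        # True
# ...but a perfectly legal DNS owner name, such as the underscore
# labels used by SRV records, does not.
print(matches_preferred_syntax('_sip._tcp.example.com'))  # False
```

A data format that validates against only the preferred syntax would reject names like the second one even though a DNS server can happily serve them.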
Internationalized domain names MUST be encoded in punycode as described in RFC 3492 You could simplify a bit if you wanted and simply say, "Internationalized domain names MUST be A-labels as per RFC 5890". Then you can skip the reference to 3492 and 5891, since 5890 makes the forward references.
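For illustration, the A-label conversion is mechanical. Note the hedge: Python's stdlib "idna" codec implements the older IDNA 2003 rules (RFC 3490/3492), not RFC 5890/5891, but the Punycode encoding and the "xn--" prefix are the same, so it shows the shape of the result.

```python
# Convert a Unicode domain name to its ASCII (A-label) form using the
# stdlib "idna" codec (IDNA 2003 semantics; RFC 5890-style A-labels
# use the same Punycode encoding with the "xn--" prefix).
name = "bücher.example"
a_label_form = name.encode("idna")
print(a_label_form)  # b'xn--bcher-kva.example'
```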
I'm curious - what's a yang-identifier used for? Wouldn't a hint here be nice?
Regarding the derived-type issue raised for IP address types in the Gen-ART review: I would probably have reacted similarly to what Joel said. Please consider adding a comment that helps clarify the issue to new readers.
In the shepherd writeup, the answer to Q7 is non-responsive to the question. That said, I know that the author is well aware of BCPs 78 and 79, and this is a bis of his earlier document, so I'm not concerned about that point.
My question is for the INT and OPS ADs, not for the shepherd or authors. This isn't REMOTELY a blocking comment for draft-ietf-netmod-rfc6021-bis, but while looking at this draft, I noticed text that was carried over from RFC 6021, that said this (and it appears a couple of times):

   The canonical format of IPv6 addresses uses the compressed format
   described in RFC 4291, Section 2.2, item 2 with the following
   additional rules: the :: substitution must be applied to the
   longest sequence of all-zero 16-bit chunks in an IPv6 address. If
   there is a tie, the first sequence of all-zero 16-bit chunks is
   replaced by ::. Single all-zero 16-bit chunks are not compressed.
   The canonical format uses lowercase characters and leading zeros
   are not allowed. The canonical format for the zone index is the
   numerical format as described in RFC 4007, Section 11.2.";

I'm not asking about it in a draft-ietf-netmod-rfc6021-bis context (not remotely worth changing from the previous RFC). I am wondering if these additional rules for the compressed format are used outside NETMOD? If so, I wonder if it's worth describing them in a doc that might get more attention.
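[Scribe note: as an illustration of the quoted rules, they match the text representation that was later standardized as RFC 5952, which Python's stdlib `ipaddress` module happens to implement, so the behavior can be checked directly:]

```python
import ipaddress

# The quoted rules: compress the longest run of all-zero groups, pick
# the first run on a tie, never compress a single zero group, and use
# lowercase hex. ipaddress.IPv6Address.compressed follows the same rules.
examples = {
    "2001:DB8:0:0:0:0:2:1":  "2001:db8::2:1",         # longest run, lowercased
    "2001:0db8:0:0:1:0:0:1": "2001:db8::1:0:0:1",     # tie: first run wins
    "2001:db8:0:1:1:1:1:1":  "2001:db8:0:1:1:1:1:1",  # single zero group kept
}
for raw, canonical in examples.items():
    assert ipaddress.IPv6Address(raw).compressed == canonical
print("all canonical forms match")
```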
I have to ask why the canonical representation doesn't end in Z (i.e., why is it different)?
I am entering these notes as Comments because I am actually not too bothered whether you address them or not. I do not believe that failure to address my points will result in a less stable or useful internet. I think that you might make your document more easily used and clearer for people trying to understand the technology if you find ways to answer the points in your text.

---

I found that the Abstract and Introduction are not clear on the motivation for this work. The number of IDs supported is not a direct statement of the granularity of labeling. Thus, the motivation needs to be clarified. Is the requirement to increase the number of labels available? Or is the requirement to enable subdivision of existing labels to increase granularity?

---

I am trying to understand why it is necessary to include the Ethertype twice in order to stack the label. Surely the first 893b provides sufficient information that what follows will be two label values. I shouldn't be surprised to find that the answer is in bullet 2 of section 2.1, or maybe it is to do with 32 bit alignment, but this sort of design choice probably needs to be called out if you don't want future generations to become confused or even break Trill. But the way things are constructed means that you should also handle the case where the second Ethertype has a different value.

---

Like Stewart, I am surprised that you have chosen to limit the label stack to exactly two components: a high part and a low part. This seems remarkably un-forward-looking given the experience we have of the value of deeper label stacks in other technologies.

---

The description of the fields in the two parts of the label in Section 2.3 seems to be incomplete. I find:

   The two bytes following each 0x893B have, in their low order 12
   bits, fine-grained label information. The upper 4 bits of those two
   bytes are used for a 3-bit priority field and one drop eligibility
   indicator (DEI) bit.
   The priority field of the Inner.Label High Part is the priority
   used for frame transport across the TRILL campus from ingress to
   egress. The label bits in the Inner.Label High Part are the high
   order part of the FGL and those bits in the Inner.Label Low Part
   are the low order part of the FGL.

This omits:
- In what order are the priority field and DEI arranged?
- What is the meaning of the priority field in the low part?
- What is the meaning of the DEI in the low part?
- What is the link between the label format here and that in 6325 where the 3-bit priority field is accompanied by a "C" bit?

I know that some of the answers appear later in the document, but it seems odd that you describe some fields, but not all of them, and it looks like not all questions do actually get answered.

---

Section 3 has:

   It is the responsibility of the network manager to properly
   configure the TRILL switches in the campus to obtain the desired
   mappings.

This seems to leave quite an opening for fat fingers. What mechanisms are provided to assist the operator in detecting her mistakes? Do you have defaults you can recommend for "out of the box" behavior?
Throughout: I assume that "campus" is a well-understood term of art in this area? This was not a familiar use to me.

4.1:

   It MUST be possible to configure the ports of an FGL-edge TRILL
   switch to ingress native frames as FGL.

I don't understand this requirement. Are you simply saying, "An FGL-edge TRILL switch MUST ingress native frames. It MAY ingress them as FGL or MAY ingress them as VL, depending on local configuration."? If so, say that. Otherwise, it sounds like you're saying, "The purchase order for buying an FGL switch MUST say that it is configurable to...", which is silly.

   FGL-edge TRILL switches MUST support configurable per port mapping
   from the C-VLAN of a native frame, as reported by the ingress port,
   to an FGL.

See above. What is the protocol requirement you are trying to express?
I balloted Yes because I like it. The document seems well-written and complete, gives thought to operational aspects and security aspects, and provides good background for readers who might not have been the implementers for TRILL VL, but are now implementing FGL on the next release. I do have some comments, but they're not blocking.

Is this draft about coexistence or migration? I'm seeing both words used in the text. We spent some time talking about this recently ("if you're migrating for decades, you're coexisting").

In this text:

   2.2 Base Protocol TRILL Data Labeling

   This section provides a brief review of the [RFC6325] TRILL Data
   packet VL Labeling and changes the description of the TRILL Header
   by moving its end point. This descriptive change does not involve
   any change in the bits on the wire or in the behavior of VL TRILL
   switches.

I found "this descriptive change" confusing on first read. Perhaps "this change in description" might be clearer.

Thanks to the WG for describing the things that could happen when you send FGL frames to a switch that's not FGL-safe.

In Appendix A, I found a "unicat" - guessing that's a typo.
My discuss originally said:

   "I see that Ethertype 893B is owned by IETF TRILL Working Group,
   155 Beaver Street, Milford MA 01757, UNITED STATES. It is unusual
   for an IETF WG to own codepoints in other SDOs, and in any case I
   do not recognize that address as a normal IETF address. Who owns
   change control and IPR for Ethertype 893B? Why is this not clearly
   an IETF assignment at the IETF main office?"

From the ensuing discussion I think that the policy of the IETF should be that the Ethertypes of its protocols point to a formal IETF contact address. What we do about historic Ethertypes is a different matter to what we do going forward. I will clear this when the IESG has agreed the best way to move forward on this and that policy is enacted.
I think a more precise definition of Ethertype 893B is required than is provided by this draft. Specifically, it looks like the structure 893B/lower/893B/upper is the only structure allowed, and the draft should be more emphatic on this.
Seems like another bullet to be added to s5.3 is that the switch needs to check for the Ethertype?
I am sympathetic to Stewart's discussion point. As I noted, there is historical precedent for setting the registration of an IEEE registered resource to IANA (rather than to a working group or individual).
P. 14: informative reference to HHGTG almost, but not entirely, helpful. :)

P. 15: why start at version 1 when you can only have a total of 4 versions? Wouldn't version 0 be a better choice?

P. 19: I'm a little uncomfortable with this text:

   There is no expectation and no need to perform normalization within
   a CoAP implementation (except where Unicode strings that are not
   known to be normalized are imported from sources outside the CoAP
   protocol).

I think what's intended here is right, but you've mentioned what amounts to a strong suggestion, if not a requirement, as a parenthetical note. It seems like what you intended was something like this:

   There is no expectation and no need to perform normalization within
   a CoAP implementation, since senders are expected to be implemented
   with pre-normalized strings. However, strings received from
   non-CoAP sources SHOULD be normalized where possible.

Of course, there's actually no value to normalization in this case if it can't be depended on, and I suspect that you don't want to make that a MUST. So this might be a better way to do it:

   There is no expectation and no need to perform normalization within
   a CoAP implementation, since senders are expected to be implemented
   with pre-normalized strings. Strings received from non-CoAP sources
   and then forwarded by CoAP senders cannot be assumed to have been
   normalized, and may not compare correctly with normalized strings
   representing the same text.

I don't have a strong opinion about how this should be done, but it seems like the text as written doesn't give clear guidance. It seems that cross-proxies ought to be able to do normalization, and maybe other proxies could as well, but that's a much bigger change.

Section 4.1: Even though I think it doesn't make any sense to do this, it might be worth stating how a receiver should behave if it receives an ACK with a request.
Section 4.2: Wouldn't this:

   More generally, Acknowledgement and Reset messages MUST NOT elicit
   any Acknowledgement or Reset message from their recipient.

be better stated this way?

   More generally, recipients of Acknowledgement and Reset messages
   MUST NOT respond with either Acknowledgement or Reset messages.

Might be worth grouping for operator precedence here:

OLD:
   ACK_TIMEOUT * (2 ** MAX_RETRANSMIT - 1) * ACK_RANDOM_FACTOR
NEW:
   ACK_TIMEOUT * (2 ** (MAX_RETRANSMIT - 1)) * ACK_RANDOM_FACTOR

OLD:
   2 * MAX_LATENCY + PROCESSING_DELAY
NEW:
   (2 * MAX_LATENCY) + PROCESSING_DELAY

Section 5.8: The use of the word "safe" to describe methods is really confusing because of safe/unsafe options. It would help ease of comprehension if you used a different word--e.g., read-only. I realize that this goes somewhat against the idea of sharing nomenclature with HTTP, but I think the clash between safe/unsafe options and safe/unsafe methods is confusing enough that you aren't really benefiting from that anyway.

In 5.9: as a user and implementor of RESTful systems who has learned by doing more than by reading, the term "action result" is somewhat opaque to me. I think I know what it means, but it might be nice to explain what it means before using it like this:

   Like HTTP 201 "Created", but only used in response to POST and PUT
   requests. The payload returned with the response, if any, is a
   representation of the action result.

18.104.22.168: I think you mean "first" rather than "previously":

   The client is not authorized to perform the requested action. The
   client SHOULD NOT repeat the request without previously improving
   its authentication status to the server. Which specific mechanism
   can be used for this is outside this document's scope; see also
   Section 9.

5.10.1: how does the client know that an endpoint hosts multiple virtual servers in time to leave out the Uri-Host option?
Is this literally just in the case where the hostname appears in the URI to be decomposed as an IP address literal?

   are sufficient for requests to most servers. Explicit Uri-Host and
   Uri-Port Options are typically used when an endpoint hosts multiple
   virtual servers.

In 22.214.171.124: I can't quite understand what is meant here. I could see it meaning either or both of the following:

1. If If-None-Match appears in a query with one or more If-Match options, and none of the If-Match options match, the condition is fulfilled.

2. If If-None-Match appears in a query, no If-Match options may appear in that query; the condition is met if the target doesn't exist.

I think that the text means just (2), but because of the name of the option, I want it to mean (1), even though the text doesn't say that, and since the text doesn't say that If-None-Match and If-Match are mutually exclusive, I can easily imagine someone reading the text and carelessly assuming that it means (1) or (1 or 2).

In 6.4, why /url/? This is really confusing--I was halfway through this section, and kind of confused, before I realized that the slashes were for emphasis, and weren't path separators.

Also in 6.4, the text does not account for the case where there's a user part to the authority section of the URI.
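[Scribe note: the operator-precedence point above can be seen by evaluating both groupings with the draft's default transmission parameters (ACK_TIMEOUT = 2 s, MAX_RETRANSMIT = 4, ACK_RANDOM_FACTOR = 1.5). The two readings give different values, which is exactly why explicit parentheses were requested:]

```python
# Default CoAP transmission parameters from the draft.
ACK_TIMEOUT = 2          # seconds
MAX_RETRANSMIT = 4
ACK_RANDOM_FACTOR = 1.5

# As written (** binds tighter than - in Python, as in most languages):
as_written = ACK_TIMEOUT * (2 ** MAX_RETRANSMIT - 1) * ACK_RANDOM_FACTOR
# With the grouping the suggested parentheses would impose:
regrouped = ACK_TIMEOUT * (2 ** (MAX_RETRANSMIT - 1)) * ACK_RANDOM_FACTOR

print(as_written)  # 45.0  (2 * 15 * 1.5)
print(regrouped)   # 24.0  (2 * 8 * 1.5)
```

Since the two groupings are not equivalent, whichever one the authors intend needs to be the one made explicit.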
I want to thank the authors and WG for producing this specification. I think it may be one of the more important pieces of work in recent years. As might be expected with a document of over 100 pages reviewed by a pedant, I have a list of nits and worries. These are all entered as Comments: I hope you will find time to address them and make the resulting RFC more polished, but none of them comes even close to a requirement that you change the text, and I am balloting Yes.

---

How quickly we forget! I looked up the spec of a typical home computer around the time of HTTP v1.0 and discovered that it was not so very different from what you are scoping your environment to today. That is not to say that your work is not valid. Far from it! But I do believe it might be helpful to explore in more detail why you don't simply run an early, low-function version of HTTP. FWIW, I believe the answer is that you want some of the advanced functions that have been added to HTTP over the years, but do not want the full family of rich features that may chew memory and bandwidth.

---

What about the use of CoAP on the wider Internet?

---

A very minor terminology point from Section 1.2

   Confirmable Message
      Some messages require an acknowledgement. These messages are
      called "Confirmable". When no packets are lost, each Confirmable
      message elicits exactly one return message of type
      Acknowledgement or type Reset.

   Non-confirmable Message
      Some other messages do not require an acknowledgement. This is
      particularly true for messages that are repeated regularly for
      application requirements, such as repeated readings from a
      sensor.

You have picked a form "confirmable" and "non-confirmable" that expresses the ability to do something: this message can be confirmed. But you have mapped the form to a description that requires action: an acknowledgement must be sent.
"Can be confirmed" != "A confirmation must be sent"
"Cannot be confirmed" != "A confirmation does not need to be sent"

I don't believe this is too important because you are defining the terms, but I do think that the casual reader will not embed your redefinitions of normal English words, and so will be confused by these terms in the text.

---

Section 1.2. Also a very minor point.

   Acknowledgement Message
      An Acknowledgement message acknowledges that a specific
      Confirmable Message arrived. It does not indicate success or
      failure of any encapsulated request.

The fact that an Acknowledgement Message can carry a response is lost in this definition. Maybe you need (including fixes to your typos)...

   Acknowledgement Message
      An Acknowledgement Message acknowledges that a specific
      Confirmable Message arrived. Of itself, an Acknowledgement
      Message does not indicate success or failure of any request
      encapsulated in the Confirmable Message, but the Acknowledgement
      Message may also carry a Piggy-backed response (q.v.).

---

My feeling was that the Message IDs shown in figures 2 through 6 were confusing in their randomness. For example, you could spend a lot of time staring at figures 2 and 3 trying to work out how CON is encoded as 0x7d34 while NON is encoded as 0x01a0. Since you want to show Message IDs to show how they correspond on different parts of the flow, you could have written, e.g.,

   Client             Server
      |                  |
      |   CON [MsgID1]   |
      +----------------->|
      |                  |
      |   ACK [MsgID1]   |
      |<-----------------+
      |                  |

---

Section 2.1

   As CoAP is based on UDP, it also supports the use of multicast IP

I don't think it is based on UDP. I think it runs over UDP.

---

Section 2.3

   As CoAP was designed according to the REST architecture and thus

Maybe insert another pointer to [REST].

---

Section 4.2

   A Confirmable message always carries either a request or response
   and MUST NOT be empty, unless it is used only to elicit a Reset
   message.

That is a bit of an over-use of "MUST NOT".
I think

   A Confirmable message always carries either a request or response,
   unless it is used only to elicit a Reset message in which case it
   is empty.

---

Section 4.2

   A CoAP endpoint that sent a Confirmable message MAY give up in
   attempting to obtain an ACK even before the MAX_RETRANSMIT counter
   value is reached: E.g., the application has canceled the request as
   it no longer needs a response, or there is some other indication
   that the CON message did arrive. In particular, a CoAP request
   message may have elicited a separate response, in which case it is
   clear to the requester that only the ACK was lost and a
   retransmission of the request would serve no purpose. However, a
   responder MUST NOT in turn rely on this cross-layer behavior from a
   requester, i.e. it SHOULD retain the state to create the ACK for
   the request, if needed, even if a Confirmable response was already
   acknowledged by the requester.

This paragraph is giving me some worries. I think the initial MAY confuses the CoAP implementation with the endpoint containing the CoAP implementation. The endpoint may give up, and may instruct the CoAP implementation to stop retransmitting. But I think a CoAP implementation must not give up unless it is told to.

The drop from MUST NOT to SHOULD in the final sentence seems odd. My understanding was that a CoAP implementation MUST always retain the state to create the ACK for the request. Is this use of SHOULD a relaxation, and how does it square with the MUST NOT?

---

Section 4.6

   Messages larger than an IP fragment result in undesired packet
   fragmentation.

s/undesired/undesirable/ ?

---

Section 4.6

   A CoAP message, appropriately encapsulated, SHOULD fit within a
   single IP packet (i.e., avoid IP fragmentation) and (by fitting
   into one UDP payload) obviously MUST fit within a single IP
   datagram.

s/MUST/will/

---

Section 12

I am surprised that IANA was relaxed about the use of "reserved" in Section 12.
For example, in 12.1 you have two ranges marked "Reserved" without any clue to what this means. For example, does it mean that allocations can be made, that an RFC can dedicate them to a new use, or that they must never be allocated?

In 12.1.2 you have

   The Response Codes 96-127 are Reserved for future use. All other
   Response Codes are Unassigned.

I take "Unassigned" to mean available for assignment according to the policy for the registry. But "Reserved for future use" means what?

In 12.2 you have

   | 0 | (Reserved) |          |

This is the meaning of "reserved" I think we are used to, and means will not be made available for allocation. (Although I am puzzled that you don't include the pointer to this RFC.)
The items here are very borderline in my book for DISCUSS items; I'm happy to be talked out of them. But I would like to hear from the authors and/or chairs before I give my "YES" (which is my plan once these are resolved).

4.1:

   An empty message has the Code field set to 0. The Token Length
   field MUST be set to 0 and no bytes MUST be present after the
   Message ID field. If there are any bytes, they MUST be processed as
   a message format error.

If you insist on the MUSTs, make the second one "bytes of data MUST NOT be present". The current construction is ambiguous. That said, I find the combination of MUSTs to be a bit problematic. MUST NOT send data, but MUST receive as a format error will lead to some sender saying, "A conformant receiver MUST reject with an error, so no need for me to validate on the way out" and for a receiver to say, "A conformant sender MUST NOT send data, so no need for me to validate on the way in." That's a recipe for non-interoperability. If it were me, I'd drop the last sentence.

4.3:

   rejecting a Non-confirmable message MAY involve sending a matching
   Reset message, and apart from the Reset message the rejected
   message MUST be silently ignored.

See comment on 2.1. But if you're going to allow this, I don't understand the MAY: Doesn't rejecting the message require sending a Reset? Otherwise, the message has not been rejected; it's simply been ignored. The second part is either redundant or confusing: What else might I do with a rejected message other than send the Reset and ignore it? I think this needs rewriting.

5.2.2: It is probably worth saying somewhere in here: "Once the server sends back an empty Acknowledgement, it MUST NOT send back the response in another Acknowledgement, even if the client retransmits another identical request. If a retransmitted request is received (perhaps because the original Acknowledgement was delayed), another empty Acknowledgement is sent and any response MUST be sent as a separate response."
5.4.2: Cache-Key is undefined, here or in any other document I can find. It probably needs an explanation somewhere in this document. 5.5: Again, I don't like the combination of MUST NOT include/MUST ignore. I would drop the MUST ignore part. 5.10.4: The server SHOULD return the preferred Content-Format if available. If the preferred Content-Format cannot be returned, then a 4.06 "Not Acceptable" SHOULD be sent as a response. What are the exceptions to the above two SHOULDs? If the preferred format is available, when would a server not return it? If it's not available, when would the server return other than "Not Acceptable"? Also, since Accept is not marked as "Critical", why isn't it *always* treated as elective and therefore ignored if the server can't satisfy the request? (In other words, shouldn't you also have a "Critical Accept" option defined?)
By section number: 1. The reference to [I-D.ietf-lwig-terminology] worries me, given that it is not even in LC yet. 2.1: When a recipient is not able to process a Non-confirmable message, it may reply with a Reset message (RST). Why is this? If a NON message can't be ACKed, why can it be RST? This seems like additional machinery for the client. 2.2: the response to a request carried in a Confirmable message is carried in the resulting Acknowledgement (ACK) message. This is called a piggy-backed response, detailed in Section 5.2.1. [...] If the server is not able to respond immediately to a request carried in a Confirmable message, it simply responds with an empty Acknowledgement message so that the client can stop retransmitting the request. When the response is ready, the server sends it in a new Confirmable message (which then in turn needs to be acknowledged by the client). It took me a bit to figure out why things worked this way, and I think a sentence or two of explanation would be useful. A piggy-backed response to a Confirmable request doesn't itself need to be confirmable because if the ACK gets lost, the client will re-transmit the request until it gets the answer. When the response to a Confirmable request is not piggy-backed, the response should itself be Confirmable, since a Confirmable request will normally want a guaranteed response. Likewise, if a request is sent in a Non-confirmable message, then the response is usually sent using a new Non-confirmable message, although the server may send a Confirmable message. "Likewise" seems wrong here. A Non-confirmable request can *not* get an Acknowledgement message, and therefore can *not* get a piggy-backed response. Additionally, the response MAY be Confirmable or MAY be Non-confirmable, though certainly Non-Confirmable is the more likely case. This should probably be reworded. 3: Token Length (TKL): 4-bit unsigned integer. Indicates the length of the variable-length Token field (0-8 bytes). 
Lengths 9-15 are reserved, MUST NOT be sent, and MUST be processed as a message format error. Why not make TKL a 3-bit unsigned integer and have a reserved padding bit before it? Is this protocol likely to be extended to 9-15 byte Tokens? Code: 8-bit unsigned integer. Indicates if the message carries a request (1-31) or a response (64-191), or is empty (0). As above, why not make a 3-bit field for Code Type (000=request, 001=reserved, 010=success response, 100=client error response, 101=server error response, 110/111=reserved), and then a 6-bit Code? It would also make the registry much easier to read. 4.2: A Confirmable message always carries either a request or response and MUST NOT be empty, unless it is used only to elicit a Reset message. I don't understand the requirement not to be empty, especially when there is an exception at the end of the sentence. Shouldn't this be "and contains data bytes unless it is used only to elicit a Reset message."? The Acknowledgement message MUST echo the Message ID of the Confirmable message, and MUST carry a response or be empty (see Section 5.2.1 and Section 5.2.2). I don't understand the second MUST: What other choice is there besides carrying a response or being empty? Aren't those the only two options? The Reset message MUST echo the Message ID of the Confirmable message, and MUST be empty. Why MUST a Reset be empty? What harm is there if there is data in there? Rejecting an Acknowledgement or Reset message is effected by silently ignoring it. I don't understand the above. As far as I can tell, neither an Acknowledgement nor a Reset can be rejected; the side that sent them will never know it is rejected. 4.5: All of the MUSTs and MAYs in the section are used rather terribly. I'm glad to suggest text if you need. 4.6: The phrase "and (by fitting into one UDP payload) obviously MUST fit within a single IP datagram" is unnecessary, but even if you do use it, s/MUST/needs to. 
The MUST is not a requirement on an implementation of CoAP. That said, I fear this section is really nothing more than an implementation note: Because it is a layer violation, it's not clear to me that any implementer has the ability to figure out much of this. (For example, the idea that an implementer would be willing to -- or even know how to -- set the Do Not Fragment bit or figure out the Path MTU is a bit hopeful.) I have no objection to this section, but it might be better as an implementation note rather than a set of requirements. 5.2: I found "indicating a value of 4*32+3, hexadecimal 0x83 or decimal 131" just adds confusion rather than clarifying. Perhaps instead: "even though if taken as an 8-bit unsigned, the entire Response Code field would have a value of hexadecimal 0x83 or decimal 131 (4*32+3)". But personally, I would simply drop it; it doesn't add anything. See also comments above on section 3 and below on 12.1. Also: The response codes are designed to be extensible: Response Codes in the Client Error and Server Error class that are unrecognized by an endpoint MUST be treated as being equivalent to the generic Response Code of that class (4.00 and 5.00, respectively). However, there is no generic Response Code indicating success, so a Response Code in the Success class that is unrecognized by an endpoint can only be used to determine that the request was successful without any further details. First, I suggest changing the first sentence after the colon: "Response Code details that are unrecognized by an endpoint when the class is Client Error or Server Error are treated as equivalent to the...". Much clearer, and the MUST is wrong. Second, I don't see why you don't simply define a 2.00 so that this can be a bunch simpler. 5.2.3: Potential confusion: s/the client may send a Reset message/the client will send a Reset message 5.4.1: I don't understand the need for the third bullet. Isn't this already said in the second bullet? 
The fourth bullet has the same issue as 2.1 and 4.3. 5.4.4 If the option is not present, the default value MUST be assumed. Don't you really mean, "If an option is present with no value, the value is the default."? The way you have it written sounds like a completely missing option should be assumed to be present and have a default value. I don't think that's what you mean. (The MUST is just superfluous anyway.) 5.4.5: If a message includes an option with more occurrences than the option is defined for, the additional option occurrences MUST be treated like an unrecognized option (see Section 5.4.1). I think you want to be specific about the order: If a message includes an option with more occurrences than the option is defined for, any additional option occurrences that appear subsequently in the message MUST be treated like an unrecognized option (see Section 5.4.1). That is to say, you can't choose to keep later ones and discard earlier ones. 5.4.6: If it were me, I would have put the NoCacheKey bits in the high four bits so that you could simply do a <224 test for cache-matching options. I suppose this ship has sailed. 5.5.1: The implementation note notwithstanding, I don't understand why Content-Format is not a SHOULD. 5.7.1: If a response is generated out of a cache, it MUST be generated with a Max-Age Option that does not extend the max-age originally set by the server The reverse would be clearer: If a response is generated out of a cache, the generated Max-Age Option MUST NOT extend the max-age originally set by the server Also: "a proxy SHOULD be conservative" s/SHOULD/should. There's nothing to implement here. 5.10.5 s/MUST/is. The MUST is not meaningful. 8.1: A server SHOULD be aware that a request arrived via multicast, e.g. by making use of modern APIs such as IPV6_RECVPKTINFO [RFC3542], if available. That SHOULD is not meaningful. It is useful, not required with exceptions (as SHOULD indicates), for the server to know that it is using multicast. 
This also gives a reason not to allow RST on *any* NON messages. 8.2: "the server MAY always pretend"? We are getting sillier with our 2119 usage later in the document. Did you really mean, "the server MAY ignore the request"? Isn't that true of any NON request, not just multicast ones? (I did not do a significant review of sections 9 or 11.) 12.1: As mentioned regarding sections 3 and 5.2 above, I think it would be much easier to figure these out if you separated out the bit fields of code type and code, and then had sub-registries for request codes, success codes, client error codes, and server error codes.
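The code-structure points above (the Code field is class*32 + detail, e.g. 4.03 is 4*32+3 = 0x83 = 131, and unrecognized Client/Server Error codes fall back to the generic 4.00/5.00) can be sketched as follows. The `KNOWN_CODES` set is a hypothetical stand-in for a real registry, not anything from the draft.

```python
# Hedged sketch of the response-code structure discussed above. The 8-bit
# Code field encodes class*32 + detail; unrecognized 4.xx/5.xx codes are
# treated as the generic 4.00/5.00. KNOWN_CODES is an illustrative stand-in.

KNOWN_CODES = {0x84}  # e.g. 4.04 "Not Found"; a stand-in registry

def split_code(code: int) -> tuple:
    """Split an 8-bit Code field into (class, detail)."""
    return code >> 5, code & 0x1F

def effective_code(code: int) -> int:
    """Map an unrecognized Client/Server Error code to its generic form."""
    cls, _ = split_code(code)
    if code not in KNOWN_CODES and cls in (4, 5):
        return cls << 5               # generic 4.00 or 5.00
    return code
```

With this framing, Pete's suggestion of separate sub-registries for each class amounts to keying the registry on (class, detail) pairs rather than raw 8-bit values.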
I will (still:-) end up balloting yes for this. Just one DISCUSS point remaining. (1) Cleared, but see comment below. (2) Cleared. (3) Cleared, but see comment below. (4) Cleared, but see comment below. (5) Cleared. (6) Cleared. (7) Cleared. (8) Cleared. (9) Cleared. (10) 10.1 - what does https mean here? If it means that the request/response are in the clear between the source and proxy and then encrypted then a) you really really need to say that clearly and b) why is that even acceptable and c) what if the destination resource requires client auth? It just seems broken to pretend to use https this way. Going via a cross-proxy breaks security. Similarly, what does coaps mean in 10.2? We had some mail exchanges about that, but I'm not sure I'm ok with the outcome so I'd like to DISCUSS this some more. (And did any of that get into -16? Not sure.)
--- former discuss points (1) I think you made a change to 5.6 for this but I still think (now at COMMENT level) that it'd be really good to say that CoAP is currently well defined only for URI schemes like coap(s) and http(s) but maybe not others. Basically, you need a scheme://host:port syntax or else you have to do some more specification work to get CoAP interop. (3) You now say "SHOULD use 32 bits of randomness" which is ok. I think it might be worth adding that CoAP nodes that might be targeted with many guesses SHOULD also detect and react to that. Text of discuss (3) was: 4.2, last para: this creates an attack that can work from off-path - just send loads of faked ACKs with guessed Tokens and some of 'em will be accepted with probability depending on Token-length and perhaps the quality of the RNG used by the sender of the CON. That could cause all sorts of "interesting" application layer behaviour. Why is that ok? (Or put another way - was this considered and with what conclusion?) I suspect you need to have text trading off the Token length versus use of DTLS or else CoAP may end up too insecure for too many uses. (Note: the attack here is made worse because the message ID comparison can be skipped. Removing that "feature" would help a lot here.) 5.3.1's client "may want to use a non-trivial, randomized token" doesn't seem to cut it to me. How does this kind of interaction map to DTLS sessions/epoch? Basically, I'd like to see some RECOMMENDED handling of token length that would result in it not being unsafe to connect a CoAP node to the Internet. (And please note recent instances where tens of thousands of insecure devices have been found via probing the IPv4 address space. These are real attacks.) (4) 4.4, implementation note - this seems unwise since it means that once Alice has interacted with Bob, then Bob can easily guess the message IDs that Alice will use to talk to Charlie. 
This is no longer a DISCUSS because you said that the WG figured it's ok and given you say to randomise at the start (of what?) then it's marginal. --- old comments below, sorta checked against -16 intro, 2nd para: better to not talk about the WG name and its work really, but about the resulting protocol intro, last para: more sales pitch language 3: Message ID - with 16 bits that imposes a rate limit on how often you can send. I don't think that's described and I'm curious as to whether it'd set a max goodput for CoAP that'd be way less than otherwise possible with e.g. HTTP. - I think in a mail Carsten said it's 250/second max or something, I still think this'd be great information to say explicitly early on in the spec since it might prevent someone spending a lot of effort before they find out that CoAP doesn't work for their use-case. 7.1 - what if I want to only do discovery via DTLS? What does "support" mean for port 5683 then? Carsten said that you do need to still listen on 5683 even if you only want to do work on <TBD>. I'm not so happy about that but it's not DISCUSS level. 12.7 - as it turns out I also don't see why this needs two ports - the cost is two more bytes for security which is significantly-enough less than the current cost (in terms of message size) for security. Am I wrong? Carsten responded: "yep, that's what we want" and I'm ok with that, if not convinced.
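The Message ID rate limit mentioned above (Carsten's "250/second max or something") follows from simple arithmetic on the numbers already in the discussion: a 16-bit Message ID space and the draft's EXCHANGE_LIFETIME of 247 seconds, within which an ID must not be reused toward the same endpoint.

```python
# Back-of-envelope check of the rate limit implied by the 16-bit Message ID
# and the 247-second EXCHANGE_LIFETIME discussed above: 65536 IDs that must
# not be reused within 247 s cap a sender at roughly 265 messages/second
# to any one endpoint, consistent with the "250/second" figure quoted.

EXCHANGE_LIFETIME = 247          # seconds, the draft's default parameter
MESSAGE_ID_SPACE = 2 ** 16       # 16-bit Message ID

max_rate = MESSAGE_ID_SPACE / EXCHANGE_LIFETIME
print(f"max ~{max_rate:.0f} messages/second to one endpoint")
```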
A well-written document and I have a few points to discuss. The congestion avoidance mechanisms look ok, but I assume we will get feedback from implementers and deployments on the parameters and mechanisms. It would be good to get this feedback documented at some point. Here are the issues (based on my own review and input from Joe Touch and Michael Scharf): 1) IPv6 UDP checksum calculation It is not clear if zero UDP checksums are permitted or not permitted with CoAP? (UDP zero checksums: https://datatracker.ietf.org/doc/draft-ietf-6man-udpzero/) That should be specified. 2) Handling of UDP-lite Can UDP-lite (RFC 3828) be used in conjunction with CoAP or not? 3) Fragmentation of messages The recommendation in Section 4.6 about the path MTU is generally valid only for IPv6. For IPv4, 576 bytes is the safe area to work without fragmentation, though in today's WANs 1280 bytes works perfectly, but I am not so sure about the networks envisioned for CoAP. These 576 bytes for IPv4 are mentioned in the implementation note, but deserve text on the same level as for IPv6. 4) Ensuring no fragmentation with IPv4 The implementation note in Section 4.6 states that for IPv4 it is 'harder to ensure that there is no IP fragmentation'. This neglects the possibility of using the Don't Fragment (DF) flag in the IPv4 header and also that there is possibly feedback from a node en route that the MTU is too big if the DF flag is set, i.e., by means of an ICMP error message. Should there be any recommendation or protocol machinery to deal with path probing? E.g., referencing RFC 4821 (PLPMTUD). 5) Reaction to network errors that are signalled I wonder why the draft is not discussing any reaction to network failures signalled through ICMP messages. This relates also to my DISCUSS issue no 4. 6) Idempotency: done. 
7) Protocol reactions to reserved or prohibited values Regarding reserved or prohibited values in the IANA section, it would be useful to be clear about what happens when those values are seen. I.e., should they be ignored, generate an error, etc. 8) Flow Control/Receiver Buffer The protocol does not have any real means for the receiver to control the amount of data that is being sent to it. I can understand the attempt to provide a simple protocol, but adding a very basic flow control mechanism will not prohibitively increase the complexity of the protocol, while improving robustness. According to Section 2.1, a node can always return a RST if the message cannot be processed for whatever reason. I propose to add an option to the RST message that allows the message receiver to state how much data it is willing to accept from a particular sender or in general (up to the implementation). 9) Handling of wrapping message IDs According to Section 4.4.: "The same Message ID MUST NOT be re-used (in communicating with the same endpoint) within the EXCHANGE_LIFETIME (Section 4.8.2)" with EXCHANGE_LIFETIME of 247s. For now it is unrealistic that the message ID of 16 bits will wrap around in that time frame, but protocols live long and at some later time it could become possible. However, the protocol doesn't have any means to detect wrapped message IDs.
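The fragmentation points above (items 3 and 4) reduce to simple arithmetic on the minimum MTUs. Assuming the usual header sizes (20 bytes for IPv4, 40 for IPv6, 8 for UDP), the largest CoAP message guaranteed to avoid IP fragmentation would be:

```python
# Hedged arithmetic for the fragmentation discussion above: the safe CoAP
# message size at the minimum MTUs (576 bytes for IPv4 reassembly per
# RFC 791, 1280 bytes for IPv6). Header sizes are the usual fixed values;
# note the draft's own 1152-byte message-size figure leaves extra headroom.

UDP_HEADER = 8

def max_coap_message(min_mtu: int, ip_header: int) -> int:
    """Largest CoAP message that fits one unfragmented datagram."""
    return min_mtu - ip_header - UDP_HEADER

print("IPv4:", max_coap_message(576, 20))    # 548 bytes
print("IPv6:", max_coap_message(1280, 40))   # 1232 bytes
```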
1) Endpoint vs. host This document uses the term "endpoint" to refer to the combination of address and port, and possibly also security association, that is local to one end of an association. I would have expected the more common term "socket", as originated in TCP parlance, to be used instead (even though here the term is used in a connectionless context). 2) Reaction to network errors due to local link errors Link layers can give some hints if the link is up, down, etc. Traditionally, this has not been taken into account too much when designing transport protocols, but wouldn't it make sense to take it into account for CoAP, as it operates much more in constrained environments? 3) Short messages Section 3., paragraph 1: > CoAP is based on the exchange of short messages which, by default, > are transported over UDP (i.e. each CoAP message occupies the data > section of one UDP datagram). CoAP may also be used over Datagram What are short messages in terms of bytes? Is this a hidden protocol requirement? 4) randomization of message IDs Section 4.4., paragraph 3: > Implementation Note: Several implementation strategies can be > employed for generating Message IDs. In the simplest case a CoAP > endpoint generates Message IDs by keeping a single Message ID > variable, which is changed each time a new Confirmable or Non- > confirmable message is sent regardless of the destination address > or port. Endpoints dealing with large numbers of transactions > could keep multiple Message ID variables, for example per prefix > or destination address (note that some receiving endpoints may not > be able to distinguish unicast and multicast packets adressed to > it, so endpoints generating Message IDs need to make sure these do > not overlap). The initial variable value should be randomized. the initial variable SHOULD be randomized, just to avoid blind off path attacks, right? 5) In Section 4.6.: > larger than an IP fragment result in undesired packet fragmentation. 
should read larger than an 'IP packet' instead of 'IP fragment'. 6) Section 5.4.1., paragraph 7: > Critical/Elective rules apply to non-proxying endpoints. A proxy > processes options based on Unsafe/Safe classes as defined in > Section 5.7. I suggest moving this statement to the beginning of this subsection, as it provides important information that shouldn't be missed. 7) Dependency between application layer and CoAP Section 5.2.2., paragraph 2: > The server maybe initiates the attempt to obtain the resource > representation and times out an acknowledgement timer, or it > immediately sends an acknowledgement knowing in advance that there > will be no piggy-backed response. The acknowledgement effectively is > a promise that the request will be acted upon. This may or may not be an issue: Assuming that the server did send an ACK for a request but never fulfills its promise to send any real 'response'. The request/response initiated from the client is done on the CoAP level, but not for the application on top. Is there any recommendation for the application on top of CoAP on how to handle such cases?
Thank you for this important work and well written specification. While there are aspects that I would personally have done differently and some fine-tuning of the spec could continue, I believe the document is ready to move to an RFC. I also believe that it is a much-awaited spec and very useful to the Internet community in its current state. I do agree with some of the points raised in other reviews, and those need to be addressed. I did have one specific additional suggestion worth bringing up here. Dan Romascanu did a Gen-ART review and raised the issue that the parameter changes discussed in S4.8.1 are security sensitive, i.e., changes in the parameters may cause security/denial-of-service issues. This should be noted somewhere in S11. I'd make a brief observation that it is security sensitive and should be addressed in any system that allows configuration of these parameters. Here's what Dan wrote: 3. Section 4.8 defines a number of CoAP protocol parameters and derived parameters that according to 4.8.1 may be changed. Some of these parameters have limitations and their changes may affect the basic functionality of the nodes, the interaction between nodes or between nodes and servers, as well as the functioning in constrained environments. However there is no risk analysis in Section 11 (Security Considerations) about the threats related to mis-configuration of the modes and un-appropriate or malevolent changes in these parameters, and recommendations of security counter-measures on this respect.
Overall, this is a very nicely written specification. Thanks! In Section 2.2., are requests and responses in 1-1 correspondence? Or can a single request receive more than one response? In Section 3, why is version number 1 and not 0? What's the plan here, do we get 3 or 4 versions out of this? In Section 4.3, would it make sense to have something stronger than MAY for cases where future messages are likely to be screwed up, e.g., where CoAP syntax is malformed? (A "STFU RST"?) From Section 4.2 and 4.3, I generated a table mapping message types to request/response/empty:

           CON  NON  ACK  RST
Request     X    X    ?    !
Response    X    X    X    !
Empty       !    !    X    X

Might be helpful to include something like that as a summary. This might be a bad idea, but: Did the WG consider allowing an ACK to contain a request? In the case where a CON contains a response and the client wants to send another request, it would save a message to put the request in the ACK to the response.
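The reviewer's message-type table can be transcribed directly into a lookup, which is roughly what an implementation's message validator would consult. This is a purely illustrative sketch of the reviewer's own summary ("X" = allowed, "?" = questionable/special case, "!" = never), not normative text from the draft.

```python
# The reviewer's table mapping message types (CON/NON/ACK/RST) to what they
# may carry (request/response/empty), transcribed as a lookup. Illustrative
# only; markers follow the reviewer's notation.

VALID = {
    #            CON   NON   ACK   RST
    "request":  ("X",  "X",  "?",  "!"),
    "response": ("X",  "X",  "X",  "!"),
    "empty":    ("!",  "!",  "X",  "X"),
}

COLUMN = {"CON": 0, "NON": 1, "ACK": 2, "RST": 3}

def allowed(payload_kind: str, msg_type: str) -> str:
    """Look up the reviewer's marker for a (payload, message type) pair."""
    return VALID[payload_kind][COLUMN[msg_type]]
```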
I lacked some time to review the draft details. However, after a discussion with Joel Jaeggli, and with the OPS DIR review from Mehmet, I trust that the OPS aspects have been taken care of.
Note that I plan to ballot Yes, after we resolve these questions. I have three points - the first one is the one I'm most curious about. In 4.8.1. Changing The Parameters It is recommended that an application environment use consistent values for these parameters. I'm thinking about this in an IOT/M2M context where it's somewhere between inconvenient and impossible to change parameters on all the deployed devices at once. I understand that configuring these parameters is out of scope for the doc, so assume changing the parameters is out of scope as well. If you start deploying new devices into that environment with significantly different parameters, is it more likely that performance would suffer, or that something would break? (I don't care what the answer is, I'd just like for the reader to have one - do you HAVE to get the parameters right the first time, or do you WANT to get them right, but you can deploy new devices with different parameters and let the old devices be removed/replaced over time?) This one is on the edge of being a Comment: In 5.10.5. Max-Age The value is intended to be current at the time of transmission. Servers that provide resources with strict tolerances on the value of Max-Age SHOULD update the value before each retransmission. Will servers know that resources they serve have strict tolerances? The answer may be "yes", I'm just asking. If not, I'm wondering if this should be a MUST. This one is on the edge of being a comment: In 7.2. Resource Discovery The discovery of resources offered by a CoAP endpoint is extremely important in machine-to-machine applications where there are no humans in the loop and static interfaces result in fragility. A CoAP endpoint SHOULD support the CoRE Link Format of discoverable resources as described in [RFC6690]. Is it obvious that this is a SHOULD? Is CoRE Link Format necessary for resource discovery, or can you also accomplish this with humans if they're in the loop? 
I'm just trying to wrap my head around "it's extremely important but implementations might not do it".
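The 5.10.5 point above (servers updating Max-Age before each retransmission so the value is current at transmission time) can be sketched as a freshness calculation. The expiry bookkeeping and function name are illustrative assumptions, not mechanisms from the draft.

```python
# Hedged sketch of the Max-Age retransmission point discussed above: a
# server re-derives the Max-Age value at each (re)transmission so that it
# reflects the freshness remaining at send time. Illustrative names only.

def max_age_for_transmission(expires_at: float, now: float) -> int:
    """Seconds of freshness remaining at transmission time (floored at 0)."""
    return max(0, int(expires_at - now))

# First transmission at t=0 for a resource valid 60 s; retransmit at t=4:
print(max_age_for_transmission(60.0, 0.0))   # 60
print(max_age_for_transmission(60.0, 4.0))   # 56
```

A server that instead resent the original datagram unchanged would ship a stale Max-Age, which is exactly the case where Spencer asks whether the SHOULD ought to be a MUST.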
I think this specification is well-written, it's important, and a lot of people will need to read it - that's why I'm being picky on comments. Martin already has a DISCUSS about some of the more transport-ish topics (support for UDP-lite, etc.). I'm sympathetic, but didn't restate them. If Martin is happy, I'll be happy. In this text: Constrained networks like 6LoWPAN support the expensive fragmentation of IPv6 packets into small link-layer frames. Is "support" the right word here? I'm not understanding "support the expensive fragmentation". In this text: Although CoAP could be used for compressing simple HTTP interfaces Is "compressing ... interfaces" the right way to say this? I've seen other reviewers mention "short messages" in "CoAP is based on the exchange of short messages", but it may also be worth clearly distinguishing "short message" from "SMS" ("short message service") - as I understand it, the two phrases have nothing in common, but they are both used in the document (at the beginning of Section 3, and even in the same paragraph) without qualification. Response codes correspond to a small subset of HTTP response codes with a few CoAP specific codes added, as defined in Section 5.9. I get this, but I'm wondering if it's worth thinking about whether these similar but unrelated namespaces can semi-collide (if HTTP is extended to include a 328 response code, is it OK for CoAP to define a 3.28 response code that means nothing like what HTTP 328 means?) Given that 404 and 4.04 are similar, for example, I'd expect some implementers to guess what less common CoAP response codes are, based on HTTP response codes, rather than check carefully. That's an obscure comment, but I thought I should ask. In 6.4. Decomposing URIs into Options, is "fail this algorithm" clear? It might be a term of art for HTTP folk, but I'm not familiar with it. 4. If |url| has a <fragment> component, then fail this algorithm. In 8.1. 
Messaging Layer When a server is aware that a request arrived via multicast, it MUST NOT return a RST in reply to NON. If it is not aware, it MAY return a RST in reply to NON as usual. Doesn't this tell me that the MUST NOT is not required for interoperability? I'm only quibbling about the use of 2119 language. On a related point, if there was a sentence that started out "to keep Bad Thing X from happening, ..." that would be helpful. There's similar language in 8.2. Request/Response Layer When a server is aware that a request arrived via multicast, the server MAY always ignore the request, in particular if it doesn't have anything useful to respond (e.g., if it only has an empty payload or an error response). but MAY is pretty weak anyway (maybe "can always ignore the request", to avoid the 2119 question?). In 11.3. Risk of amplification This is particularly a problem in nodes that enable NoSec access, that are accessible from an attacker and can access potential victims (e.g. on the general Internet), as the UDP protocol provides no way to verify the source address given in the request packet. An attacker need only place the IP address of the victim in the source address of a suitable request packet to generate a larger packet directed at the victim. Such large amplification factors SHOULD NOT be done in the response if the request is not authenticated. I don't understand what the SHOULD NOT means in practice. Is this saying the server shouldn't return large resources for NoSec requests (whatever "large" means), or ? If this is saying the same thing as the text on using "slicing/blocking mode" two paragraphs down, it would be clearer to combine these points in a single paragraph.
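One concrete reading of the 11.3 "SHOULD NOT" questioned above is a cap on the ratio of response size to request size for unauthenticated requests. This is a hypothetical interpretation, not the draft's mechanism; the threshold value is an illustrative assumption.

```python
# Hedged sketch of one possible reading of the amplification guidance
# discussed above: before replying to an unauthenticated (NoSec) request,
# check that the response would not amplify beyond some cap. The draft
# sets no numeric limit; MAX_AMPLIFICATION here is purely illustrative.

MAX_AMPLIFICATION = 3  # illustrative assumption, not from the draft

def response_too_large(request_len: int, response_len: int,
                       authenticated: bool) -> bool:
    """True if an unauthenticated reply would exceed the amplification cap."""
    if authenticated:
        return False          # authenticated requesters get full responses
    return response_len > MAX_AMPLIFICATION * request_len
```

A server hitting this limit could then fall back to the "slicing/blocking mode" the draft mentions two paragraphs later, which is exactly the combination Spencer suggests merging into one paragraph.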
Thank you for a well-written document. I do support Pete's DISCUSS points.
I am entering no-objection on the basis that this has no impact on the routing layer and I am confident that the applications layer ADs will ensure the quality of the design.
Thanks for a clear draft. Makes things so much easier to review. Just want to discuss a couple of things before moving to yes (hopefully). 0) s126.96.36.199: Before I get into the nitty-gritty of the specific concerns I have with DTLS use of Raw Public Keys, I have to ask whether use of the DTLS client_certificate_url extension was considered? If raw public keys are just about size then that'd work, wouldn't it? If it's about dumping processing of paths, then that's another thing. 1) s188.8.131.52.1: What's really hung me up on this draft is the raw public key option. Primarily, the TLS draft is not quite done yet; I'm still in discussions with the authors. Two issues that affect your draft: a) <soap box> By draft-ietf-tls-oob-pubkey taking a pass on specifying all of the ways an identity can be bound to a public key, it leaves it up to the application to specify that mechanism. This binding is important because you can't get peer authentication without it; I'm really worried that if this mode gets widely deployed people will say they have "DTLS security" but few (if any) are actually doing the work necessary to bind the identity to the key. </soap box> So, you specify that binding in the provisioning section (good ;) but I want to make sure that it's clear who's doing what to whom: i) s184.108.40.206.1: For machines it's perfectly appropriate to generate the key and install it because we doubt it'll be able to do that well enough itself (i.e., crummy entropy sources). I want to make it clear that that's been done by the manufacturer. In this mode the device has an asymmetric key pair but without an X.509 certificate (called a raw public key). to: In this mode the device has an asymmetric key pair but without an X.509 certificate (called a raw public key); the asymmetric key pair is generated by the manufacturer and installed on the device. 
ii) s220.127.116.11.1: This draft does that binding using Stephen's naming thing with hashes, but I want to make sure that it's clear who calculates the identifier; it now says: An identifier is calculated from the public key as described in Section 2 of [RFC6920]. Is it the endpoint? I.e.: An identifier is calculated by the endpoint from the public key as described in Section 2 of [RFC6920]. b) draft-ietf-tls-oob-pubkey is likely to take a pass on specifying a mechanism for revoking the public key and identity binding. Note that ocsp-/multi-ocsp stapling won't work because there's no way to request information about a certificate that you don't have information about. I'm not trying to gold plate the security mechanism here but I think we need some words on how revocation for this mode will be handled. However, I suspect you'll want to use the ACLs....there's a mechanism for including ACLs during provisioning but is there a way to update them later? What happens if a new node gets installed or removed? Is there a requirement for ACLs to be supported? The text has a SHOULD but that seems to be about ACL provisioning support, not ACL support. 2) s18.104.22.168: Another TLS-related issue: When referring to TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 please include the MTI curve(s). Ever so glad that conservative curves are being used. In the following, I assumed you'd want the one MUST and the two MAYs but I can understand if you don't. I'd argue you do have algorithm agility with DTLS so you could get away with just the MUST and not the MAYs. Unrelated to you, but I thought I'd let you know: the curve requirements will almost certainly be removed from the mcgrew draft at my direction because no other D/TLS EC cipher suite specifies an MTI curve. There's support for conservative curves, but not enough interest in starting to add MTI curves instead of the application picking them. 
Note the Zigbee folks also point to the mcgrew draft, but it seems their drafts already include the curves, so we ought to be good to go on both fronts. I think we need to be clear whether choosing this particular cipher suite means an implementation needs to support the extensions defined in RFC 4492 - or whether it doesn't. I'm assuming you want it to, so I'm going to propose some tweaking: OLD: Implementations in RawPublicKey mode MUST support the mandatory to implement cipher suite TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 as specified in [I-D.mcgrew-tls-aes-ccm-ecc], [RFC5246], [RFC4492]. NEW: Implementations in RawPublicKey mode MUST support the mandatory to implement cipher suite TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 as specified in [I-D.mcgrew-tls-aes-ccm-ecc]. The key used MUST be ECDSA-capable; the curve secp256r1 MUST be supported, and the curves secp384r1 and secp521r1 MAY be supported [RFC4492]; these curves are equivalent to the NIST P-256, P-384, and P-521 curves. Implementations MUST use/support (?) the Supported Elliptic Curves Extension and Supported Point Format extensions [RFC4492]; the uncompressed point format MUST be supported; [RFC6090] can be used as an implementation method. The mcgrew draft had the following instead of the last sentence, so I'm open to whichever, but I think something like the following needs to be added:
o The uncompressed point format MUST be supported. Other point formats MAY be used.
o The client SHOULD offer the elliptic_curves extension and the server SHOULD expect to receive it.
o The client MAY offer the ec_point_formats extension, but the server need not expect to receive it.
o [RFC6090] MAY be used as an implementation method.
And then, I think we need to specify how the MTI would look: namely by adding the following on to the end of the paragraph. 
When the mandatory to implement DTLS cipher suite and curve are used, the SubjectPublicKeyInfo indicates an algorithm of id-ecPublicKey with the namedCurves object identifier set to secp256r1 [RFC5480]. If secp384r1 or secp521r1 are used, then those object identifiers [RFC5480] are included instead. That way everybody knows what values go in the SPKI of the oob-pubkey draft. Note they tried to change that field recently and I had to remind them not to. 3) s9: I know we're all about being liberal in what you accept, but in this context that might be challenging; this bit: ... all modes of DTLS may not be applicable. Some DTLS cipher suites can add significant implementation complexity as well as some initial handshake overhead needed when setting up the security association. Made me wonder whether you considered which other DTLS extensions might be useful in addition to the EC ones and SNI, as well as what extensions should be profiled out? For example, max_fragment_length looks pretty useful in this context, as does certificate_url. But does heartbeat make any sense? 4) s8: I think you need to make it pretty darn clear in s8 that multicast is an unsecured "mode" as specified in this document. It's kind of buried in s9. 5) s9.1.2: (worried about a DoS attack) Do you mean that responses to secured-CoAP messages returned unsecured are silently discarded/ignored, or rejected, which then kicks off an error code response? 6) s22.214.171.124: Did you consider whether there should be an application profile for the psk_identity_hint (see Section 5 of [RFC4279]) - i.e., is there a standard format for CoAP that should be defined? 7) s126.96.36.199: When you say "the Certificate must be validated" I'm just checking that you think there's going to be a certificate chain? If there's no chain, the validation rules in 5280 don't apply. 
8) s9: If you're going to allow more than two entities to share the preshared keys I think it's worth pointing out you really can't get peer authentication with either CoAP or DTLS. The description in s9 and elsewhere seems to imply that more than one peer might share the same key. 9) Either in s9 or s11 we need to say something about devices with bad entropy sources not bothering to make keys because they won't be of any use. If they've got bad entropy sources the manufacturer or whoever should be making the keys.
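For concreteness on the identifier-binding point above: Section 2 of [RFC6920] boils down to hashing the public key bytes with SHA-256 (possibly truncated) and base64url-encoding the digest without padding. A minimal sketch, assuming the input is the DER-encoded SubjectPublicKeyInfo (the sample key bytes are made up purely for illustration):

```python
import base64
import hashlib

def ni_identifier(spki_der: bytes, bits: int = 256) -> str:
    """Compute an RFC 6920 name for a public key.

    The digest is SHA-256 over the DER-encoded SubjectPublicKeyInfo,
    truncated to `bits`, then base64url-encoded with padding stripped.
    """
    digest = hashlib.sha256(spki_der).digest()[: bits // 8]
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    suite = "sha-256" if bits == 256 else f"sha-256-{bits}"
    return f"ni:///{suite};{value}"

# Hypothetical key bytes, just to show the shape of the output.
example_spki = bytes(range(64))
print(ni_identifier(example_spki))
print(ni_identifier(example_spki, bits=128))
```

Whether the endpoint or the provisioning system runs this calculation is exactly the "who derives the identifier" question raised above.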
0) s1.2: Is the "CoAP-to-CoAP proxy" definition missing? s5.7 refers to s1.2 for the definition, but that term is not in s1.2. 1) s7.1: I think this is munged: A server is discovered by a client by the client knowing or 2) s9: In the NoSec option, is it worth also pointing to the link layer security (IEEE 802.15.4)? 3) s9.1: Because there's no s9.2 (I assume that might have been an IPsec-secured CoAP section at one point) you can drop the header. 4) s188.8.131.52: The 1st paragraph is about certificates, but it only indicates the TLS cipher suite that needs to be supported. It should say what values go in the certificate to support the cipher suite. Basically, we need to point to RFC 5480 and say include values X. I'd suggest: OLD: Implementations in Certificate Mode MUST support the mandatory to implement cipher suite TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 as specified in [RFC5246]. NEW (adding some more stuff + references): Implementations in Certificate Mode MUST support the mandatory to implement cipher suite TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 as specified in [RFC5246]. Namely, the certificate includes a SubjectPublicKeyInfo that indicates an algorithm of id-ecPublicKey with namedCurves secp256r1, secp384r1, or secp521r1 [RFC5480]; the public key format is uncompressed [RFC5480]; if included, the key usage extension indicates digitalSignature. 5) s184.108.40.206: Going with the thought that there's a chain, we also need to say what algorithms can be used to sign the certificate. I assume we want to use the same algorithm for both the TLS_ECDHE_PSK and TLS_ECDHE_ECDSA certificates, so text is needed at the end of the paragraph to indicate that: Certificates MUST be signed with ECDSA [RFC6090], and it MUST use SHA-256, SHA-384, or SHA-512 [SHS]. It is RECOMMENDED that the hash function's output length match the length of the elliptic curve order. On the first bit: You could lock it down to just one curve if that's what you decide to do based on comment #5. 
On the second bit: if the curve and the hash size line up then you can use RFC 6090 and we - really - want that. Not sure if the above is too cryptic ;) 6) s220.127.116.11: Maybe a typo MUST instead of must: If there is no SubjectAltName in the certificate, then the Authoritative Name must match the CN found in the certificate using the matching rules defined in [RFC2818] with the exception that certificates with wildcards are not allowed. 7) s18.104.22.168: r/further work./further study.
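As a side note on point 4: the named-curve OIDs from RFC 5480 appear in the certificate's SubjectPublicKeyInfo as DER-encoded OBJECT IDENTIFIERs. A small sketch of that encoding (not from the draft, just to make the expected byte values concrete for implementers):

```python
def encode_oid(dotted: str) -> bytes:
    """DER-encode an OBJECT IDENTIFIER given in dotted-decimal form (X.690)."""
    arcs = [int(a) for a in dotted.split(".")]
    body = bytearray([40 * arcs[0] + arcs[1]])  # first two arcs share one byte
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append((arc & 0x7F) | 0x80)  # continuation bit on all but last byte
            arc >>= 7
        body.extend(reversed(chunk))
    return bytes([0x06, len(body)]) + bytes(body)  # tag 0x06, short-form length

# id-ecPublicKey and the three named curves discussed above (RFC 5480).
for name, dotted in [("id-ecPublicKey", "1.2.840.10045.2.1"),
                     ("secp256r1", "1.2.840.10045.3.1.7"),
                     ("secp384r1", "1.3.132.0.34"),
                     ("secp521r1", "1.3.132.0.35")]:
    print(name, encode_oid(dotted).hex())
```

These are the values a validator would match against when checking that the key in the SubjectPublicKeyInfo is one of the permitted curves.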
I support some of the points raised by Mehmet against draft 13 (addressed by Carsten), which ultimately are not likely to be resolved by this draft alone. > ** Concerning the migration path, and future versions of CoAP in the same network: > > - It would be useful to state clearly, in which cases it is dangerous to change any of the recommended default values or parameters in future versions of CoAP and would potentially impact the co-existence of two CoAP versions. Thus such a statement would support an incremental deployment to be successful. Again, I believe this is future work, which also applies to the configuration and management data model. Protocol debugging is aided by the self-describing nature of the protocol messages. This has worked pretty well in the (informal and formal) interops so far. Future work will have to address management interfaces for CoAP nodes, including management of security related events. I think some of this will have to integrate with resource discovery, an active subject in the working group. Regards, Carsten --- This is not a DISCUSS because frankly I think at some point you have to draw a line under work that has been done so far and progress work from there.
I don't think it adds value to have the ExID be of variable length. It's highly implausible that there could be 64k TCP experiments in the reasonable lifetime of this policy, particularly now that the code space is being managed by the IANA. Two bytes in the TCP header is a lot, and alignment of 4-byte integers is easy to fix. I think that using a 4-byte ExID runs a real risk of overflowing the available space in the TCP header in real-world circumstances. I realize that I may be stepping in it here by making this observation, because I know that the compromise of doing an IANA registry was not popular in TCPM. Nevertheless I feel compelled to request that the authors consider that, now that this compromise has been reached, it's pointless to maintain support for 4-byte ExIDs. That is, I ask the authors to change the specification to make the ExID explicitly two bytes. Please feel free to point me at the post-IESG-review discussion where this was decided to be unnecessary, rather than reiterating the arguments, if you have pointers to it. I wasn't able to find this in reviewing the history, but the discussion got rather long, so I may have missed it.
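A quick back-of-the-envelope on the overhead at stake (my own arithmetic, not from the draft): TCP option space is capped at 40 bytes, and every experimental option spends kind + length + ExID before any payload:

```python
TCP_OPTION_SPACE = 40  # maximum bytes of option space in a TCP header

def exp_option_size(exid_len: int, payload: int) -> int:
    """Bytes consumed by one experimental option: kind + length + ExID + payload."""
    return 1 + 1 + exid_len + payload

# With, say, an 8-byte experimental payload per option:
for exid_len in (2, 4):
    used = exp_option_size(exid_len, payload=8)
    fit = TCP_OPTION_SPACE // used
    print(f"{exid_len}-byte ExID: {used} bytes per option, {fit} fit in {TCP_OPTION_SPACE}")
```

Two extra bytes per option may look small, but in a header already carrying timestamps, SACK blocks, etc., it is exactly the kind of margin that overflows.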
Note that you have 3692 transposed as 3962! I find this symptomatic :-( --- I am Abstaining on this document. I understand it as a pragmatic extension of the experimental codepoint space, but I believe that RFC 3692 (sic) is deliberately precise about the value of limiting the experimental space and gives good reasons. I cannot support this document going against the philosophy of 3692, and I think it would have been far better to encourage the use of non-experimental options and possibly make it easier to register non-experimental options. However, I do not believe I should block the pragmatic solution in this document that clearly has consensus support.
I am convinced by the IESG discussion that this is not appropriate for a standards track document.
I have not decided whether to recommend the publication of this document yet. I want to hear the discussion on the IESG telechat first.
Hi Joe, Thanks for your effort on this draft. I'll clear my DISCUSS now. The draft improved a lot during our discussion. Discussing with Martin and Wes during the IETF last week, we concluded that it would be appropriate to restart the ballot on this draft. Indeed, IMHO, the draft improved so much that I believe that some IESG members might clear their ABSTAIN ... which is always a preferred outcome if possible. Regards, Benoit
I think the changes in -04 are a fine compromise, providing an FCFS registry for those willing to use it, and giving advice for dealing with situation when the registry isn't used. Thanks very much to the authors and the working group for sorting this out!
Given the discussions during the IESG call, I am moving to Abstain.
This is a significant improvement over the version that I previously reviewed. Although significant steps have been taken to improve the document, the risk of collision due to unregistered use or registration of a collision is still acknowledged. I would find this less objectionable in an Informational or Experimental document, but in a Standards Track document I find it difficult to accept anything other than required registration (which may be confidential to IANA) to ensure the absence of a collision.
In light of the more recent drafts, I'm fine with logging my position as No Objection. I'm not sure that I interpret 3692 quite as strongly as Adrian's DISCUSS does. While there is certainly a temptation to squat on experimental code points due to the size of the space, the potential for less pollution of the registered code point range seems like it has fewer consequences than what is otherwise expected to continue to happen.
- So colour me puzzled - the write up says "nothing new here, just a way to use 6378" but there are two RAND+fee IPR declarations. Sigh. (But no more than a sigh since the WG are ok with it.)
- Is the syntax on p6 for describing label stacks not more generic than this? I assume it's too late (or not worth the bother) to take this out into its own informational RFC, as it might be more broadly useful. If that text is replicated from elsewhere, then I'd suggest you reference that source and not include the text here again.
- The tables/figures/whatever between figures 7 and 9 have no captions.
Probably the RFC editors would correct this. Anyway, here it is: "Contributing Authors" section title -> "Contributors" https://www.rfc-editor.org/rfc-editor/instructions2authors.txt :
1. First-page header [Required]
2. Status of this Memo [Required*]
3. Copyright Notice [Required*]
4. IESG Note [As requested by IESG*]
5. Abstract [Required]
6. Table of Contents [Required for large documents]
7. Body of the Memo [Required]
7a. Contributors
7b. Acknowledgments
7c. Security Considerations [Required]
7d. IANA Considerations
7e. Appendixes
7f. References
8. Author's Address [Required]
9. IPR Boilerplate [Required*]
I don't object to the Informational status of this document, but I have to ask, in a non-blocking and non-confrontational way: From the shepherd writeup: This document does not specify a protocol but describes how to use the MPLS-TP linear protection as specified in RFC 6378 for ring topologies, the document is thus intended to be published as an informational RFC. This really *is* what applicability statements are for, the title even calls itself an applicability statement, and it does make recommendations (not just give information). I wonder, then, why it's Informational, rather than Standards Track (see RFC 2026, Section 3.2).
This document seems unclear as to the exact scenario it is addressing. If it is essentially addressing the scenario described in RFC 4192, the document needs to be reviewed to make sure that the gaps identified are real gaps that could actually happen in an RFC 4192 scenario. If that is not the scenario that this document is intended to address, the document needs to clarify what scenario it does intend to address. I've mentioned in the comments a number of cases where the document doesn't appear to be referring to use cases that could actually happen in an RFC 4192 scenario; I'm concerned that as it is written, it will lead readers to misunderstand how renumbering ought to work, and will actually lead them to do things that make their lives worse during a renumbering event. At the very least, I think the document needs to strongly state that it presumes the reader already has a clear understanding of RFC 4192, because if they read this document without reading 4192, I really think they will get the wrong impression. I think the document is going in a good direction, and I would support publication if this problem can be addressed.
Suggestion in section 2: OLD: o Hosts that are configured through DHCPv6 [RFC3315] can reconfigure addresses by starting RENEW actions when the current addresses' lease time are expired or they receive the reconfiguration messages initiated by the DHCPv6 servers. NEW: o Hosts that are configured through DHCPv6 [RFC3315] obtain new addresses through the renewal process or when they receive the reconfiguration messages initiated by the DHCPv6 servers. The reason for the proposed change is that renewal doesn't happen at expiry time, and addresses aren't changed through the Renew process. The current text is less specific, but probably specific enough for the context, and I suspect more helpful to the reader at this point in the document. In section 4.1, I'm really skeptical that this paragraph bears any relationship to reality for any enterprise setting larger than a SOHO configuration: Usually, the new short prefix(es) comes down from the operator(s) and is received by DHCPv6 servers or routers inside the enterprise networks (or through off-line human communication). The short prefix(es) could be automatically delegated through DHCPv6-PD. Then the downlink DHCPv6 servers or routers can begin advertising the longer prefixes to the subnets. Are you actually seeing this in practice? I would expect an enterprise-level site to have a negotiated prefix which is statically configured and controlled via contract, not via protocol. This scenario certainly could happen in a home environment or a home office environment, but this document doesn't seem to be addressing that use case. You do mention off-line communication in parentheses, but I really think that should be what you lead in with—it doesn't make sense to me otherwise. I am of course willing to be persuaded on this question—I think it's a bit speculative at the moment since I don't know of a lot of enterprises that do production IPv6 internally yet. Did you ask the Google guys what they do? 
In 4.2: When subnet routers receive the longer prefixes, they can directly assign them to the hosts. Host address configuration, rather than routers, is the primary concern for prefix assignment which is described in the following section 5.1. What does it mean for a router to assign a prefix to a host? Do you mean "advertise a prefix on a link to which hosts are connected?" In 5.1: Another limitation of DHCPv6 reconfiguration is that it only allows the messages to be delivered to unicast addresses. So if we want to use it for bulk renumbering, stateless DHCPv6 reconfiguration with multicast may be needed. However, this may involve protocol modification. This is not accurate. The DHCPv6 server has a complete list of all clients on any given link. It can issue unicast reconfigure messages to each client, if desired. Multicast DHCPv6 reconfigure is not an option, because it's too easy to use it for a DoS attack. I notice also that you don't mention the 'A' bit in router advertisements. I think this bit also affects the behavior of various stacks; I'm not sure, but I think it probably should be discussed. Section 6.1 talks about DDNS as a way to update IP addresses on names during a renumbering event. I'm not at all clear on what the use model for this would be. If we're renumbering servers, then certainly you could configure each server with its own key to use for doing updates, so that it could poke its new SLAAC- or DHCPv6-derived address into the DNS. I'm not sure this is a good _idea_, but it's eminently doable. The document seems to mention RFC 4704 in passing, without citing it. Possibly the authors aren't actually familiar with RFC 4704, but just thought that some commercial servers might have custom solutions to this problem? I think RFC 4704 entirely addresses this problem, at least for hosts that can be numbered using DHCPv6. 
The combination of RFC 4704 and server-oriented DDNS could probably address most of the problems one might care about with respect to the problem 6.1 is trying to address. Of course, there is still a gap here, since servers have to somehow notice that their address has changed and trigger the DDNS update. The document also mentions A6 records here, which are deprecated (RFC 6563), and therefore ought not to be mentioned. In 6.2: (Addresses of DHCPv6 servers do not need to be updated. They are dynamically discovered using DHCPv6 relevant multicast addresses.) While this could be true in principle, it isn't true in practice, because most relay agents have the option of being configured with DHCPv6 server addresses rather than sending to a multicast address, and I think it's more common to do unicast than multicast for this step. So the document shouldn't assume that this is a solved problem. Even in the case of multicast, it's necessary for multicast routing to be configured and working in order for DHCP messages to find their way to servers. In theory a renumbering event shouldn't break multicast routing, but in practice it might. The DNS server addresses for hosts could be configured by DHCPv6. In stateless DHCPv6 mode [RFC3736], [RFC4242] allows the server to specify valid time for the DNS configuration. But in stateful DHCPv6, current protocols could not indicate hosts the valid time of DNS configuration. If the DNS server has been renumbered, and the DHCP lease time has not expired yet, then the hosts would still use the old DNS server address(es). It might be better that the hosts could renew the DHCP DNS configuration before the lease time, especially there might be some extreme situations that the lease time need to be long. In this case how the DHCP server could learn the proper DNS configuration valid time is also an issue. There are a bunch of problems with this. 
First, the stateful DHCP T1 and T2 times are effectively equivalent to the stateless DHCP Information Refresh Timer (DHCPv6 doesn't talk about leases or lease times, so this term should not be used; rather, addresses have lifetimes, and IAs have T1 and T2 timers that indicate to the client when to renew or rebind, respectively). So there is no need for an additional timer in stateful DHCPv6; indeed, it doesn't make sense to add one just for renumbering. How would you know what value to set it to? Why wouldn't you just set T1 to that value? Next problem: in a typical renumbering scenario, the old and new prefixes are both valid. The old prefix is deprecated, but still works. So the DNS server is still reachable at the old address: there is no interruption of service. Deprecated addresses and prefixes decay gracefully, so that by the time they are gone, everybody has renumbered. Hence, the scenario being described here should never happen unless someone unconfigures the deprecated address on the DNS server before all the hosts have stopped using it. That would be an administrative error. Section 6.3 appears to be a pair of solutions, not a gap. It is described as a gap, in the sense that the solutions are lacking, but it's not describing the actual gap. This seems to be putting the cart before the horse. The right thing to do is to identify the gap that this solution would fill, rather than describing the absence of these solutions as a gap. I think the gap you are talking about is that there isn't a way to trigger configuration updates for all the subsystems on a server, and all the external databases that refer to that server, on the basis of a change to the server's IP address. 
I am not positive that you need to change the way you are approaching this, because I think the solutions you are talking about are interesting, but this section really doesn't feel like it's a gap analysis as written, so starting from what the gap is and talking about how it could be addressed might help. More examples in this section might help: I think the tunnel endpoint example is a really good one to use. There are probably some other examples that would be good here—e.g., I have static IP addresses configured in my nginx configuration. This isn't really the gap—I ought to just put domain names in. But unless the tunnel endpoint or nginx notices that the addresses returned for that domain name have changed, it will keep blindly using the old address until something triggers it to do a refresh or a restart. You mention this in 5.2, but it should be mentioned here as well, since as far as I know most services do not actually refresh DNS information until they are triggered somehow to do so. In section 7.2, what do you mean by "no mechanisms?" RFC 4192 talks about how to address this issue, if I understand the issue correctly. Or is the gap you are talking about the lack of a way to set all the TTLs for a zone to no more than a particular value? Section 10.2 mentions the A6 record again. I don't think this is helpful, because it died for a reason. I think you should leave this out. Also, I think you need to lead in with the authority problem—my first read of this section really puzzled me, because it seemed to be talking about a problem that is easy to solve, and it was only when I saw the acknowledgement in section 13 that I went and read the chown draft and then reread section 10.2 and understood what the gap was that was being documented. It looks like 10.3, second bullet, is talking about the problem I mentioned in 6.3 with services not refreshing DNS information. 
I think the techniques you talk about in 6.3 can address this problem, so I don't think it's an unaddressable gap. If this isn't the gap you are talking about, a bit more exposition might be required...
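On the point about services blindly holding on to stale addresses: the missing piece is some trigger for re-resolution. A toy sketch of what such a check could look like in a long-running service (the resolver is injected as a callable, so nothing here depends on a real DNS setup; all names are illustrative):

```python
from typing import Callable, Set

def needs_reload(cached: Set[str],
                 resolve: Callable[[str], Set[str]],
                 name: str) -> bool:
    """Re-resolve `name` and report whether the address set has changed.

    A tunnel endpoint, reverse proxy, or similar service could run this
    periodically and re-read its configuration whenever it returns True.
    """
    return resolve(name) != cached

# Simulated renumbering event: the address set for the peer changes.
cached = {"2001:db8:1::10"}
assert not needs_reload(cached, lambda n: {"2001:db8:1::10"}, "peer.example")
assert needs_reload(cached, lambda n: {"2001:db8:2::10"}, "peer.example")
print("stale-address check behaves as expected")
```

The hard part, as the review notes, is not the comparison but getting every subsystem that caches an address to actually run something like this.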
- The write-up notes that this is similar to RFC 6879 which is recent, on the same topic and has overlapping authors. I was surprised the intro didn't say why this is different. Are we sure there is no conflicting advice in the two documents?
- Surely there's a gap in re-numbering when both v4 and v6 numbers have to change at once - why isn't that mentioned here? (I assume because of the wg charter or something.)
- p5: "performed integrally" isn't very clear but I think I get that you mean as an atomic operation or something
- section 2 last para, not clear to me what you're saying about SLAAC - sentence is odd
- p6: "like [cfengine]" is not so nice as a way to reference whatever that might be, same for others
- p8, what is "M/O"? you should say
- p11, what is "a gao"?
- p12, presumably automated approaches like LEROY increase the risk since a bad actor with the right permission could cause havoc - is that noted?
- p13, presumably changing DNS RRs means signing things if DNSSEC is in use - is that noted?
- p14, ingress filtering - if I could tell an ISP to no longer drop packets with sources from prefix X (because I claim to be renumbering) then I would defeat anti-spoofing measures - is that noted?
- 7.2 - do DNSSEC RRSIG validity periods play into this too?
- general: if addresses for mail servers are exposed via SPF, then presumably those will need renumbering too; renumbering might also trigger false positives or negatives for anti-spam features (not sure what v6 stuff is being done for DNS RBLs)
- I think section 11 should note that if you do attempt to fill some of these gaps, then you may create new threats and those will also need to be addressed; and some of those will be *very* hard problems to solve
In section 7, you miss a discussion regarding the host/router unique id in NMS applications. Let's start with syslog, SNMP notifications, and IPFIX. For all of these protocols, the host or router id is the UDP message source IP address. Sometimes this matches the device loopback IP address: indeed, we can force the loopback IP address as the source IP address for IPFIX/syslog messages/SNMP notifications. If the source IP address is changed (whether it matches the loopback is irrelevant), the syslogd, trapd, or IPFIX collector will consider that a new device has appeared in the network. This is a problem. I don't want to say that the IP address mapping must be sent in band (i.e. in SNMP, syslog, or IPFIX); maybe updating DNS is sufficient, maybe querying a UUID is the solution, or maybe the solution depends on the protocol (sending the MAC or a UUID in IPFIX would help, as an Options Template Record). So maybe there are different solutions for the different protocols. A discussion/some text is required concerning the host and router unique IDs in NMS applications: these applications should be aware of the IP address mapping. This might lead to an extra bullet point in section 9.4
I agree with Stephen's comment: - The write-up notes that this is similar to RFC 6879 which is recent, on the same topic and has overlapping authors. I was surprised the intro didn't say why this is different. Are we sure there is no conflicting advice in the two documents? The relationship with RFC 6879 should be clearly mentioned. The relationship between the various documents was already my feedback for RFC 6879. See https://datatracker.ietf.org/doc/draft-ietf-6renum-enterprise/ballot/#benoit-claise - When I arrived at section 4, I was not too sure whether you intended to match sections 4 to 7 with: The automation can be divided into four aspects as follows.
o Prefix delegation and delivery should be automatic and accurate in aggregation and coordination.
o Address reconfiguration should be automatically achieved through standard protocols with minimum human intervention.
o Address-relevant entry updates should be performed integrally and without error.
o Renumbering event management is needed to provide the functions of renumbering notification, synchronization, and monitoring.
I had to carefully match those four points with the section titles. It was not too clear whether "managing prefixes" covers "Prefix delegation and delivery":
4. Managing Prefixes ............................................ 7
5. Address Configuration ........................................ 8
6. Updating Address-relevant Entries ........................... 10
7. Renumbering Event Management ................................ 13
You should make it easier for the reader. Either match the section titles and/or use references: The automation can be divided into four aspects as follows.
o Prefix delegation and delivery should be automatic and accurate in aggregation and coordination. See section 4.
o Address reconfiguration should be automatically achieved through standard protocols with minimum human intervention. See section 5.
o Address-relevant entry updates should be performed integrally and without error. See section 6.
o Renumbering event management is needed to provide the functions of renumbering notification, synchronization, and monitoring. See section 7.
I support the publication of this document and only have a few non-blocking comments/questions... 1. This document is clearly focused on planned renumbering events within a network and does not address the issues surrounding emergency renumbering events. Did the WG consider the two distinct cases? Would it make sense to add some text to the Abstract/Intro highlighting the focus so people don't think this document covers emergency renumbering events? 2. It may be useful to point out in either 3.1 or 3.3 that administrators can leverage the address selection policy distribution mechanism in draft-ietf-6man-addr-select-opt to update the address selection policies on hosts during renumbering. 3. The 2nd paragraph in 5.1 is rather obtuse. It says that combining SLAAC and DHCP-based address configuration "would add more or less additional complexity". I would think that it would add complexity, period. It might be useful to reword this to make the meaning clear. 4. In 7.1, it mentions a possible notification mechanism to signal a change in the DNS system related to a renumbering event. It may be worth mentioning that such a notification mechanism will need a robust security model.
Just checking on whether these are things people think about when renumbering: 0) s11, prefix validation: Is there any reason it's not: "Prefixes from the ISP need authentication to prevent prefix fraud." In other words, what's up with the "may"; when wouldn't you need authentication? 1) s11: Do you also need to discuss issues with long-lived sessions and how to keep them alive or not (e.g., ssh connections)? 2) s11, influence on security controls: a) If there are DHCPv6 authentication keys associated with an IP address, they'll need to be changed for it to continue working - no? Addresses in SEND certificates are going to need to get updated. Are these further examples of what you were thinking, or is this more about keeping the security up and running during the transition? b) More generally, you can include IP addresses in certificates, and if you go and renumber, those protocols might - well, really will - stop working until you reissue a new certificate with a new address. Is this covered someplace else, does everybody know to do this reissue dance, or should there be a new section "Influence on Security Protocols"?
1. This document recommends filtering all IPv6 link-local traffic. Link-local IPv6 traffic is widely used on notionally IPv4-only networks. For example Apple's Bonjour protocol relies on IPv6 link-local addressing. So filtering this traffic would create significant operational problems even in supposedly IPv4-only networks. Bonjour is widely used even by non-Apple devices, including most IP printers and Microsoft Windows printer drivers, so even a non-Apple shop would have a problem with this recommendation. Proposed action: remove the first paragraph of section 2.1. Remove this text from the next paragraph: However, neither RA-Guard nor DHCPv6-Shield can mitigate attack vectors that employ IPv6 link-local addresses, since configuration of such addresses does not rely on Router Advertisement messages or DCHPv6- server messages. Remove this paragraph: If native IPv6 traffic is filtered at layer-2, local IPv6 nodes would only get to configure IPv6 link-local addresses. As an alternative, the authors could consider a much more restricted recommendation similar to the first paragraph of 2.1, but referring only to RA traffic and DHCPv6 traffic. This requires a fancier switch, but doesn't break Bonjour. A much less satisfactory alternative would be to simply say that breaking IPv6 at layer two breaks Bonjour. In combination with a very strong applicability statement (see point 2), I would feel compelled to clear this item in the DISCUSS, even though I would not actually be happy with the outcome. If you go this route, you should include all but the first sentence of the first paragraph of this DISCUSS point so that network operators who may not be aware of this realize what's at stake. 2. This document makes broad recommendations about filtering tunnel traffic. 
These recommendations are appropriate in an enterprise environment where IPv6 is not yet deployed, but are inappropriate in most other contexts (e.g., it would be very damaging if ISPs started filtering all tunnel traffic). Proposed action: This document needs a very clear applicability statement at the top, before it even talks about any sort of recommendation of this type. There is a single sentence in the introduction about applicability, but this isn't sufficient. I would suggest something like the following: Most general-purpose operating systems implement and enable native IPv6 [RFC2460] support and a number of transition/co-existence technologies by default. Support of IPv6 by all nodes is intended to become best current practice [RFC6540]. Some enterprise networks might, however, choose to delay active use of IPv6. This document describes operational practices for enterprise networks to prevent security exposure resulting from unplanned use of IPv6 on such networks. This document is only applicable to enterprise networks: networks where the network operator is not providing a general-purpose internet, but rather a business-specific network. The solutions proposed here are not practical for home networks, nor are they appropriate for provider networks such as ISPs, mobile providers, Wifi hotspot providers or any other public internet service. In scenarios in which the IPv6-capable devices are deployed on enterprise networks that are intended to be IPv4-only, native IPv6 support and/or IPv6 transition/ co-existence technologies could be leveraged by local or remote attackers for a number of (illegitimate) purposes. For example, Proposed alternative action: Some kinds of tunnel traffic are okay to filter in all environments, because their use is deprecated. If this document only referred to those types of traffic, this would also satisfy my concern. E.g., I don't think it's a problem to have a blanket recommendation to filter Teredo. 3. 
This recommendation breaks DNS: For this reason, networks attempting to prevent IPv6 traffic from traversing their devices should consider configuring their local recursive DNS servers to respond to queries for AAAA DNS records with a DNS RCODE of 3 (NXDOMAIN) [RFC1035] or to silently ignore such queries, and should even consider filtering AAAA records at the network ingress point to prevent the internal hosts from attempting their own DNS resolution. This will ensure that hosts which are on an IPv4-only network will only receive DNS A records, and they will be unlikely to attempt to use (likely broken) IPv6 connectivity to reach their desired destinations. Suppose I do a AAAA query and an A query for a name that has non-AAAA records. Returning NXDOMAIN tells the resolver that there is no such domain name, not that there is no such record. The correct answer in this case is NOERROR, and an answer that contains no records. Silently ignoring these queries is bad advice as well, since it could potentially result in 90 second delays in accessing web pages on dual-stack servers. Happy Eyeballs is at this point a widely-deployed solution, present in all modern web browsers, so filtering AAAA records to route around brokenness is probably not needed. However, if the working group really has consensus to make a recommendation like this, it should at least do it in a way that doesn't break the DNS. Proposed action: Delete this paragraph. Alternative proposed action: specify that when filtering AAAA queries, the filtering entity needs to actually do the query, and then return no records if it receives an answer. This is a fairly easy thing to do on a DNS proxy. Explain why just returning NXDOMAIN or dropping the query will break DNS. If you go with the alternative proposed action, whatever text you put in should be vetted by the DNS directorate, because I'm just an amateur DNS geek and can't promise that I got the advice exactly right.
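The distinction in this DISCUSS point can be sketched as code. Below is a minimal, hypothetical illustration (my own sketch, not text from the draft) of the response-code logic a filtering DNS proxy would need under the alternative proposed action: the proxy must still distinguish a nonexistent name (NXDOMAIN) from an existing name whose AAAA records are being suppressed (NOERROR with an empty answer section). The zone layout and the `decide_response` helper are invented for illustration.

```python
# Hypothetical response-code logic for a DNS proxy that filters AAAA
# records, per the alternative proposed action above. RCODE values are
# from RFC 1035.
NOERROR, NXDOMAIN = 0, 3

def decide_response(zone, qname, qtype, filter_aaaa=True):
    """Return (rcode, records) for a query against simple zone data.

    zone: dict mapping owner name -> dict of rrtype -> list of records.
    """
    rrsets = zone.get(qname)
    if rrsets is None:
        # The name does not exist at all: NXDOMAIN is correct here.
        return NXDOMAIN, []
    if qtype == "AAAA" and filter_aaaa:
        # The name exists: the answer must be NOERROR with an empty
        # answer section. Returning NXDOMAIN would wrongly deny the
        # name's A records (and every other type) as well.
        return NOERROR, []
    return NOERROR, rrsets.get(qtype, [])

zone = {"www.example.com.": {"A": ["192.0.2.1"], "AAAA": ["2001:db8::1"]}}

# AAAA for a name that has A records: NOERROR with no answers.
assert decide_response(zone, "www.example.com.", "AAAA") == (NOERROR, [])
# The A query is unaffected by the filter.
assert decide_response(zone, "www.example.com.", "A") == (NOERROR, ["192.0.2.1"])
# A truly nonexistent name still gets NXDOMAIN.
assert decide_response(zone, "nope.example.com.", "A") == (NXDOMAIN, [])
```

As the DISCUSS notes, any real implementation of this should be vetted by the DNS directorate; this sketch only shows why NXDOMAIN is the wrong code for an existing name.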
The buffer overflow referred to in [Core2007] is six years old at the time of review, applies to an operating system that no one uses, and is therefore a poor basis for recommending link-local filtering. I think protecting against RA-based and DHCPv6-based attacks is important, but can't be cleanly achieved by filtering the IPv6 ethertype. I support Stephen Farrell's DISCUSS relating to DNSSEC, as well as Barry Leiba's comment. I think this document is trying to do something worthwhile, so please don't take these DISCUSS points as general disapproval.
(1) Only just about a discuss: The title is overly broad - are you claiming to have identified *all* security implications? I doubt that. Truth in advertising would suggest pre-pending "Some" to the title or perhaps appending "considered in a bath." Same goes for the abstract. (2) section 4, 3rd para seems to recommend something that'd make DNSSEC deployment harder. Even in an informational document that seems like a bad plan. Why is it ok to recommend that? Or am I wrong that this practice would make it harder to start deploying DNSSEC?
- end of section 2: is it really true that the exact same policies should be enforced for v6 and v4? I guess that depends on what you call policy - some might consider the use of NATs and private address space as a policy but that doesn't seem to be something worth recommending does it? I'm happy to let the INT or OPSEC folks say this however they like but as stated this seems overly broad. - 2.1 - is it really wise to recommend blocking v6 at layer 2? Would that not possibly get hard to turn off when you do want v6 packets? Seems fragile to me fwiw. I know you don't strongly recommend that but for things like this where you point at something it'd be better if you also called out any significant downsides.
I agree with the issues Ted has raised.
This document is far enough outside of my expertise that I would normally ballot NO OBJECTION, especially because it's Informational. But between Brian's ABSTAIN, Stephen's DISCUSS, and the fact that I can't determine how this fits into OPSEC's charter, I will also ABSTAIN. Since this is Informational, this makes no practical difference; if Joel wants to go ahead with the document, my abstention won't affect approval. But I thought I ought to register my concern.
I found the draft interesting to read. However, I wish the draft were clearer in distinguishing: 1. the security mechanisms that should be in place while waiting for the network to be IPv6-enabled, and then disabled (example: "layer-2 device filtering"), versus 2. the measures that must remain in place once the network is IPv6-enabled (RA-Guard [RFC6105]). Within 2., there are actually two cases: 2.1 when the network is both IPv6- and IPv4-enabled, and 2.2 when the network is IPv6-only. Note that this is just one example; the same question applies to the next section, "tunneling mechanisms" (some tunneling mechanisms are no longer valid in the 2.2 case), and the two previous examples are from the same section, "filtering native IPv6 traffic". Having different sections for points 1 and 2 may help, or having an introduction covering generic security mechanisms (point 2), and then the content of this draft (point 1). If there are more reviews in that direction, it might be worth seriously considering.
From the shepherd writeup: The Document Shepherd has had a number of discussions with the authors on this topic, and has followed the progression of the draft through revisions and the WG. The Document Shepherd also sat in the bath with a highlighter and carefully reviewed the document. One can always rely on Warren for a good chuckle....
I have issues with this document, but do not see any benefit in arguing for changes. This document: - Only discusses approaches that have been well-known for years and have been discussed on numerous IPv6-related mailing lists - Encourages filtering of nascent/experimental testing of IPv6 in legacy networks rather than encouraging controlled testing/deployment/monitoring of IPv6 traffic - Engendered only a smattering of positive support in the WG and a large amount of disinterest from the silent majority.
My shaky notes from the roads around the Cliffs of Moher read like this: ---- 0) Should we rename this "Filtering IPv6 from IPv4 to avoid some IPv6 Security Issues"? 1) I guess there's the recommendation that the same security policy be applied to both IPv4 and IPv6 traffic, but that's really what, like 2 paragraphs in the whole draft? 2) Do we REALLY want to recommend filtering IPv6? ---- I think the first and second bits are kind of covered by Stephen's discuss, but maybe we should go a step further in truth in advertising and just title this draft what it's mostly about. Funny that the 3rd one Jari and Joel discussed while at the IESG retreat last week (not security related but more like unexpected traffic loads), so I know these surprises happen on mixed networks. But I'd hate for this kind of recommendation to stick around for a while. That is, once we no longer need to filter for immature products, we ought to get this document gone.
I wondered if it'd be worth adding some security consideration text about possible attacks that might be enabled by this if e.g. there's a load balancer on the NAT'd end with different devices behind the NAT having the same master key - presumably a bad actor might be able to re-direct or replay some traffic, with a low probability that it'd be detected unless a large amount of traffic is re-directed or replayed. Not sure if the attack is practical though, I guess it'd need a sequence number collision.
This is the lamest comment in the history of conflict reviews, but I found the title, "A TCP Authentication Option NAT Extension", confusing (we're extending TCP-AO, not NATs). The abstract is much clearer: This document describes an extension to the TCP Authentication Option (TCP-AO) to support its use over connections that pass through network address and/or port translators (NATs/NAPTs). Not a big deal, but perhaps this could be considered as the document moves through the process.
I checked with the IDR and SIDR WGs and no concerns were raised. I have hence cleared my Discuss. There was one technical comment which I pass to the author for his consideration: >> TCP-AO-NAT SHOULD NOT be used with both flags set in IPv4, however, as the result would rely entirely on the ISNs alone. The preceding paragraph says that the ISNs alone provide most of the randomness ("KDF input randomness is thus expected to be dominated by that of the ISNs") so the justification for the sentence quoted above isn't obvious. - RFC5389 is all very well, and broadly related to the topic. But the citation is provided without context, or more accurately, it's cited out-of-context. I was expecting to go look at RFC5389 to find out something useful about localNAT and remoteNAT, but no.
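The technical comment above concerns which connection-identity fields still feed the TCP-AO KDF once the NAT flags are set. As a hedged illustration (my own sketch, not text from the document; the real KDF input format is defined in RFC 5925 and the NAT flag handling in RFC 6978), the following shows why, with both flags set, the ISNs are the only per-connection fields left varying:

```python
# Hypothetical sketch of TCP-AO-NAT flag handling: the localNAT flag
# zeroes the local (source) address/port in the KDF input, and the
# remoteNAT flag zeroes the remote (destination) address/port, since a
# NAT may rewrite those fields. Field layout is simplified.
def kdf_connection_block(src_ip, src_port, dst_ip, dst_port,
                         isn_local, isn_remote,
                         local_nat=False, remote_nat=False):
    """Return the connection-identity tuple as seen by the KDF."""
    if local_nat:           # local address/port may be rewritten by a NAT
        src_ip, src_port = "0.0.0.0", 0
    if remote_nat:          # remote address/port may be rewritten
        dst_ip, dst_port = "0.0.0.0", 0
    return (src_ip, src_port, dst_ip, dst_port, isn_local, isn_remote)

# With both flags set in IPv4, only the ISNs still differ between
# connections -- hence the draft's caution that the KDF input randomness
# would rest on the ISNs alone.
block = kdf_connection_block("192.0.2.1", 49152, "198.51.100.2", 179,
                             0x12345678, 0x9abcdef0,
                             local_nat=True, remote_nat=True)
assert block == ("0.0.0.0", 0, "0.0.0.0", 0, 0x12345678, 0x9abcdef0)
```

The commenter's point stands either way: if the ISNs already dominate the KDF input randomness, the extra caution against setting both flags needs a clearer justification.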
No objection to the draft, but I am curious whether the clients will know they're behind a NAT or whether all clients will end up setting this all the time.