IESG Narrative Minutes
Narrative Minutes of the IESG Teleconference on 2017-05-25. These are not an official record of the meeting.
Narrative scribe: John Leslie and Ignas Bagdonas (The scribe was sometimes uncertain who was speaking.)
Corrections from: (none)
1. Administrivia
2. Protocol actions
2.1 WG submissions
2.1.1 New items
2.1.2 Returning items
2.2 Individual submissions
2.2.1 New items
2.2.2 Returning items
2.3 Status changes
2.3.1 New items
2.3.2 Returning items
3. Document actions
3.1 WG submissions
3.1.1 New items
3.1.2 Returning items
3.2 Individual submissions via AD
3.2.1 New items
3.2.2 Returning items
3.3 Status changes
3.3.1 New items
3.3.2 Returning items
3.4 IRTF and Independent Submission stream documents
3.4.1 New items
3.4.2 Returning items
4. Working Group Actions
4.1 WG Creation
4.1.1 Proposed for IETF Review
4.1.2 Proposed for Approval
4.2 WG Rechartering
4.2.1 Under evaluation for IETF Review
4.2.2 Proposed for Approval
5. IAB News We can use
6. Management Issues
7. Working Group News
7a. Other Business
1031 EDT Adjourned
(at 2017-05-25 06:00:07 PDT)
draft-ietf-mpls-tp-aps-updates
The acronym table lists two one-letter acronyms, which appear to actually be keys to codes in a table instead of actual acronyms. I would propose moving the definition of "i" and "N" from section 3 into section 4.2. Also, section 3 contains expansions for "PF:DW:R," "PF:W:L," and "PF:W:R," but not "SA:MP:R," and "SA:MW:R." This seems oddly inconsistent; I would suggest adding entries for the two "SA:..." acronyms.
draft-ietf-anima-grasp
In this text,

   T6. The protocol must be capable of supporting multiple simultaneous operations with one or more peers, especially when wait states occur.

I understand every word, but I'm not sure what this requires the protocol to do. Are you asking that the protocol be non-blocking? But that's a guess.

In this text,

   A GRASP implementation will be part of the Autonomic Networking Infrastructure in an autonomic node, which must also provide an appropriate security environment. In accordance with [I-D.ietf-anima-reference-model], this SHOULD be the Autonomic Control Plane (ACP) [I-D.ietf-anima-autonomic-control-plane].

I wonder what happens if the security environment isn't the ACP. Is that obvious?

In this text,

   An implementation MUST support use of TCP. It MAY support use of another transport protocol. However, GRASP itself does not provide for error detection or retransmission. Use of an unreliable transport protocol is therefore NOT RECOMMENDED.

just to educate me, is the strategy here, that (for instance) if synchronization fails over an unreliable transport protocol, that eventually it will be attempted again, just because the two ACAs know they aren't synchronized?

I'm really confused by this text.

   Nevertheless, when running within a secure ACP on reliable infrastructure, UDP MAY be used for unicast messages not exceeding the minimum IPv6 path MTU; however, TCP MUST be used for longer messages. In other words, IPv6 fragmentation is avoided. If a node receives a UDP message but the reply is too long, it MUST open a TCP connection to the peer for the reply. Note that when the network is under heavy load or in a fault condition, UDP might become unreliable. Since this is when autonomic functions are most necessary, automatic fallback to TCP MUST be implemented. The simplest implementation is therefore to use only TCP.

We've been having quite the discussion about how well Path MTU Discovery works, even in IPv6. Because GRASP could be running over virtual interfaces, I suspect there's a chance that you'll be running in a tunnel that will give you a Path MTU that's smaller than the IPv6 minimum. But ignoring that for now ...

IIRC, we've had poor experiences with protocols that are expected to switch from UDP transport to TCP transport in the middle of a request/response pair. But, setting THAT aside for now ...

This text correctly points out that UDP transport is most likely to fail under heavy network load or in a fault condition, when autonomic functions are most necessary. If TCP is mandatory to implement, and implementations will need to switch from UDP to TCP at the most awkward times, and that's been a problem area for other protocols in the past, why not just require TCP in the first place?

I see that the UDP/TCP question was listed as an open issue before it was closed, so I'm not balloting Discuss, because I assume I'm missing something that people will help me understand, but I thought about it for a while ...

Thanks for this text,

   If no discovery response is received within a reasonable timeout (default GRASP_DEF_TIMEOUT milliseconds, Section 3.6), the Discovery message MAY be repeated, with a newly generated Session ID (Section 3.7). An exponential backoff SHOULD be used for subsequent repetitions, to limit the load during busy periods. Frequent repetition might be symptomatic of a denial of service attack.

and especially for the warning about DoS attacks.

I found Appendix D and E useful. Thanks for including both of them.
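The retry behaviour in the quoted Discovery text (repeat with a newly generated Session ID, back off exponentially) can be sketched as follows. This is only an illustration, not text from the draft: the function name, the 32-bit Session ID width, the attempt limit, and the cap are assumptions made for the example.

```python
import secrets


def discovery_schedule(base_timeout_ms, max_attempts, cap_ms):
    """Return a list of (session_id, timeout_ms) pairs, one per Discovery attempt.

    Assumptions for this sketch (not from the draft): Session IDs are
    32-bit values, the timeout doubles after every attempt, and the
    caller chooses a retry limit and an upper bound on the timeout.
    """
    schedule = []
    timeout = base_timeout_ms
    for _ in range(max_attempts):
        # Each repetition carries a freshly generated Session ID.
        session_id = secrets.randbits(32)
        schedule.append((session_id, timeout))
        # Exponential backoff, bounded so the wait cannot grow without limit.
        timeout = min(timeout * 2, cap_ms)
    return schedule


if __name__ == "__main__":
    for sid, wait in discovery_schedule(1000, 5, 8000):
        print(f"session {sid:#010x}: wait up to {wait} ms")
```

A bounded attempt count is included because the quoted text does not state a retry limit; whether one is needed is a question for the authors.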
The comparison text to routing protocols is outdated, as it ignores TE, which can support any link/node attribute desired (bandwidth, availability, latency, etc.), discovery, bidirectional negotiation for use, and autoconfiguration (e.g. RFC 5340). When first discussing automatic networks, it may have been useful to compare with routing, as at a very high level view, it may look similar, but I think it is no longer relevant, and very confusing for a routing person. Suggest, instead of an "I'm more complex than you" approach, removing these paragraphs. A few minor edits will fix this.

Suggest to remove the first paragraph of Section 2.2. Or edit:

1. Links are no longer simple: s/consider simple link/consider link/

2. Delete from "nodes need a consistent, although partial, view of the network topology in order for the routing algorithm to converge. Also, routing is mainly based on simple information synchronization between peers, rather than on bi-directional negotiation."

I think what you want to imply by "partial" is for a protocol instance/region. But there is support today for multi-layer and multi-region networks. And convergence scale is an implementation matter. But none of this is relevant to anima, so it is best to delete rather than try to fix.

Appendix E: Remove the paragraph on routing or preface it with "Early routing protocols..". And the paragraph on RSVP is really not relevant for this comparison, unless you want to edit it, as RSVP-TE does do "discovery".
Firstly, thank you for addressing Joel's OpsDir review. As others have noted, this is a long document :-) I think that, in spite of this, it is very well written.... These comments were written against v-11, but I think are still applicable to -12.

Section 2.1, D1: "... the protocol can represent and discover any kind of technical objective ..."
While the document *does* say that readers should be familiar with RFC7575, RFC7576, and I-D.ietf-anima-reference-model, I think it would still be helpful to (briefly) describe an objective here, or simply mention that "technical objective" is a term of art and point to the Terminology section (or Sec. 3.10). When I initially read this it sounded incredibly broad; once I found the Terminology section it all made more sense...

S2.2. Requirements for Synchronization and Negotiation Capability
"SN5. ... It follows that the protocol’s resource requirements must be appropriate for any device that would otherwise need human intervention."
I found this sentence confusing / hard to parse. I *think* that you are saying that the protocol should not require so many resources that it cannot be deployed on devices (and so humans would still need to manually manage them)? If so, I think that this could be clearer, but, unfortunately I cannot provide better text...

3.2. High Level Deployment Model
"A more common model is expected to be a multi-purpose device capable of containing several ASAs."
I'm sure you are right... but for a reader new to the topic this is not obvious (nor clear) - would it be possible to provide some sort of examples of such devices (or brief description of why a more common model would have several ASAs?) E.g: "multi-purpose device capable of containing several ASAs (such as a router or large switch)" (or whatever...)

"..it is essential that every implementation is as robust as possible." -- this sounds suspiciously like "Don't write bad code...". What is the purpose of this statement? Do you think that it will somehow make people write better / more robust code? If so, shouldn't this be in our standard boilerplate? This whole paragraph feels like it is not actionable / is something that all code for all implementations of everything should follow... (I have a horrible feeling that I'm heading off on a soapbox rant / that this is a pet-peeve...)
1) Use of transport protocols is not sufficiently defined

Especially the following text in section 3.5.3 seems not to reflect later assumptions correctly; it seems to be assumed that TCP is used for all messages other than the discovery and therefore reliable transport is provided for these messages (see sections 3.5.5, 3.8.4 and 3.8.5):

"All other GRASP messages are unicast and could in principle run over any transport protocol. An implementation MUST support use of TCP. It MAY support use of another transport protocol but the details are out of scope for this specification. However, GRASP itself does not provide for error detection or retransmission. Use of an unreliable transport protocol is therefore NOT RECOMMENDED."

In general the usage of the transport protocols is not well enough specified, see also Spencer's comments and this part of Martin's tsv-art review (Thanks!):

"* Usage of UDP: This document is not discussing any of the aspects in RFC 8085. Every usage of UDP is required by IETF consensus to review RFC 8085 and to address at least the applicable subset of issues listed in RFC 8085 (or the predecessor RFC 5405).
* Starting with UDP and switching to TCP for the data transfer looks like the right thing to do. However, UDP should be really only used to discover other devices, but not piggy back further protocol mechanics. However, this document is not really specific on how to make use of TCP, for instance, how long are TCP connections kept open or closed down after a protocol exchange (persistent vs temporary connections). What happens if a TCP connection is shutdown by one end or is forcefully closed, e.g., by a reset?"

I would recommend, as assumed in the rest of the document, to update section 3.5.3 to only use UDP for the initial discovery message and open a TCP connection for the discovery response and require that all other messages be sent over TCP (also removing any option to use any other reliable transport because TCP seems to be the right choice here.) Further, additional guidance is needed when to open and close a TCP connection (or keep it alive for later use) and what to do if the connection is interrupted.

2) Time-out handling

section 3.5.4.4: "Since the relay device is unaware of the timeout set by the original initiator it SHOULD set a timeout at least equal to GRASP_DEF_TIMEOUT milliseconds."
Should a relay really maintain its own time-out? Wouldn't it be sufficient to just relay again if another discovery message is received? Otherwise this can lead to an amplification, when the relay's own time-out expires and another relay message is sent when another discovery message is received due to the time-out of the originating peer.

Further, in relation to the point above, this should be more specific:
section 3.5.4.4: "Also, it MUST limit the total rate at which it relays discovery messages to a reasonable value, in order to mitigate possible denial of service attacks."

3) Version and extensibility

section 3.5.4.5: "A possible future extension is to allow multiple objectives in rapid mode for greater efficiency."
How can this extension be defined if there is no version mechanism?
Other mostly editorial comments:

- ASA needs to be spelled out in the intro.

- I would recommend to move section 2 and 3.3 into the appendix.

- section 3.5.4.2: "A neighbor with multiple interfaces will respond with a cached discovery response if any."
"cached response" is explained in the next section and not clear in this paragraph.

- section 3.5.4.3: "After a GRASP device successfully discovers a locator for a Discovery Responder supporting a specific objective, it MUST cache this information, including the interface index via which it was discovered. This cache record MAY be used for future negotiation or synchronization, and the locator SHOULD be passed on when appropriate as a Divert option to another Discovery Initiator."
Not sure why the first is a MUST and the later is a SHOULD. I guess a SHOULD for caching would be sufficient.

- section 3.8.6: "If a node receives a Request message for an objective for which no ASA is currently listening, it MUST immediately close the relevant socket to indicate this to the initiator."
How is that indicated? Should really be further clarified.

- Also section 3.8.6: "In case of a clash, it MUST discard the Request message, in which case the initiator will detect a timeout."
Why don't you send an error message instead? How does the initiator know that it should retry (assuming there is a TCP connection underneath that provides reliable transport)?

- Also section 3.8.9: "If not, the initiator MUST abandon or restart the negotiation procedure, to avoid an indefinite wait."
How does the initiator decide between abandoning and restarting? Needs clarification!

- Could be useful to include an optional reasoning field in the Invalid Message and make copying the received message up to the maximum message size of this message a SHOULD (section 3.8.12.).

- Not sure I fully understand the purpose of the No Operation Message (section 3.8.13.). If you just want to open a socket for probing, you perform a TCP handshake and send a RST right after. No need for further application layer interactions. And should there also be an optional reasoning phrase?

- Not sure why the objectives flag is needed. I assume that unknown objectives are ignored anyway and if an objective is known the receiver should know if that objective is valid for the respective message type (section 3.10.2).

- section 3.10.4: "An issue requiring particular attention is that GRASP itself is a stateless protocol."
It's not. It caches information and needs to remember previous messages sent to reply correctly.

- section 5: "Generally speaking, no personal information is expected to be involved in the signaling protocol, so there should be no direct impact on personal privacy."
I don't think this is true because the protocol is so generic that you cannot say anything about the services it is used for.

Please see also further comments from Martin's tsv-art review (Thanks again!)!
Substantive:

- 3.5.2.1: "Messages MUST be authenticated and encryption MUST be implemented."
Should the latter be "... MUST be used"? It seems odd for authentication to be MUST use, but crypto to only be MTI.

- 3.5.4.3: "An exponential backoff SHOULD be used for subsequent repetitions, to limit the load during busy periods."
Why not MUST? Also, is there a retry limit? (Comment applies to the other sections that mention retries with exponential backoff)

- 3.5.6.2: "To ensure that flooding does not result in a loop, the originator of the Flood Synchronization message MUST set the loop count in the objectives to a suitable value"
I assume this is true for discovery and negotiation as well? I don't think it was mentioned in those sections (although I think I saw a related mention in the message format sections.)

- 3.10.5: "SHOULD NOT be used in unmanaged networks such as home networks."
Why not MUST?

- 5, Privacy and Confidentiality: Did people consider IP Addresses and other potentially persistent identifiers as impacting privacy?

- 7, Grasp Message and Options table: Why "Standards Action"? Would you expect some harm to be done if this were only Spec Required?

Editorial:

- Is section 2 expected to be useful to implementers once this is published as an RFC? Unless there's a reason otherwise, I would suggest moving this to an appendix, or even removing it entirely. As it is, you have to wade through an unusual amount of front material before you get to the meat of the protocol.

- Along the lines of the previous comment, I found the organization a bit hard to follow. I didn't find actual protocol details until around page 21. Procedures are split (and sometimes repeated) between the procedure sections and the message format sections. I think that will make this more difficult and error prone than necessary for implementors to read and reference. I fear readers will read one section and think they understand the procedures, and miss a requirement in the other.

- 3.5.2.2: First bullet: Please consider a "MUST NOT" construction. "MUST only" can be ambiguous. It would be helpful to explain why the loop count must not be more than one. I can infer that from the later sections on relays, but it was not obvious when reading this section. And unless I missed something, there's no text that puts the two ideas together.

- 3.5.4.5: This section seems redundant to the similar sections under negotiation. Since those sections have more information, would it make sense to consolidate them there?
I have a small list of issues that I would like to discuss before recommending approval of this document: 1) The first reference to UTF-8 needs a Normative reference to RFC 3629. 2) In Section 3.10.1, you say: The names of generic objectives MUST NOT include a colon (":") and MUST be registered with IANA (Section 7). In Section 7 you only say: GRASP Objective Names Table. The values in this table are UTF-8 strings. Future values MUST be assigned using the Specification Required policy defined by [RFC5226]. IANA is not going to review section 3.10.1 and there is no back reference in Section 7. IANA needs to know that values with ":" are not to be registered.
Martin's ART Review comments seem to be addressed (other than some possible cleanup of text about TLS use). As a general comment, the document has several SHOULD/MUST level requirements which are sometimes addressed at people deploying the protocol, sometimes at UI designers and sometimes at designers of new objectives. I generally don't mind, but the document doesn't always make it clear what is the intended audience for different requirements. Other smaller things: "Fully Qualified Domain Name" probably needs a Normative Reference. 3.5.4.3. Discovery Procedures In 6th para: The cache mechanism MUST include a lifetime for each entry. The lifetime is derived from a time-to-live (ttl) parameter in each Discovery Response message. Cached entries MUST be ignored or deleted after their lifetime expires. In some environments, unplanned address renumbering might occur. In such cases, the lifetime SHOULD be short compared to the typical address lifetime and a mechanism to flush the discovery cache MUST be implemented. How can the discovery cache be flushed? 3.9.5.4. Locator URI option In fragmentary CDDL, the URI option follows the pattern: uri-locator = [O_URI_LOCATOR, text] I suggest inclusion of optional transport protocol here to match other locators and to follow best practices for not encoding transport information in URIs.
Thanks for addressing the SecDir review, as well as Ben's questions on the WG decisions for authentication & encryption and Spencer's on running in a secure ACP. Clarifying the text for the latter would be helpful.
ISSUE 1

The security situation is pretty unspecified here, in at least two respects:

1. In terms of communication security, you seem to have two modes:
   (a) Punt it to ACP
   (b) Use TLS as specified in S 3.5.2.1
I'm not reviewing ACP here (though I have some comments on that too) but S 3.5.2.1 doesn't (for instance) explain how to do certificate validation, which it clearly needs to do. Finally, I don't understand the security story for the multicast packets. This is especially relevant for Rapid mode, where you are attaching real work to these multicast packets.

2. I didn't find the security model very clear. As I understand things, basically anyone on the network who has ACP credentials is trusted to engage in negotiation with you, so, for instance, if you want to get parameter X, then you basically just trust whoever on the network offers you X. Is that correct? That seems like it needs to be very explicitly called out. And if that's not true, then I don't understand the spec.

ISSUE 2

This document seems like it provides incomplete guidance on how to actually implement it. For instance:

   discovery messages to a reasonable value, in order to mitigate possible denial of service attacks. It MUST cache the Session ID value and initiator address of each relayed Discovery message until

What's "reasonable"?

ISSUE 3

I don't think I understand how the transition from UDP multicast to TCP/TLS unicast works. Maybe I'm just misreading the spec, so could you point me to the section that describes this?

Finally, I don't see a spec for how you map CBOR onto the wire. Do you just shove them on? Something else?

I see that Martin Thomson raised a number of these issues in his review in more detail.
S 3.5.4.3.

   After a GRASP device successfully discovers a locator for a Discovery Responder supporting a specific objective, it MUST cache this information, including the interface index via which it was discovered. This cache record MAY be used for future negotiation or synchronization, and the locator SHOULD be passed on when appropriate as a Divert option to another Discovery Initiator.

What's an "interface index"?

S 3.5.4.4.

   Since the relay device is unaware of the timeout set by the original initiator it SHOULD set a timeout at least equal to GRASP_DEF_TIMEOUT milliseconds.

I'm not sure I'm following here. Does the relay instance retransmit with its own timeout?

   It MUST cache the Session ID value and initiator address of each relayed Discovery message until any Discovery Responses have arrived or the discovery process has timed out.

How does this behave if the original initiator's timeout is longer than GRASP_DEF_TIMEOUT?

S 3.5.5.

   A negotiation procedure concerns one objective and one counterpart. Both the initiator and the counterpart may take part in simultaneous negotiations with various other ASAs, or in simultaneous negotiations about different objectives. Thus, GRASP is expected to be used in a multi-threaded mode. Certain negotiation objectives may have restrictions on multi-threading, for example to avoid over-allocating resources.

"multi-threaded" is an odd word here. I assume you mean that you are doing multiple stuff at once, but you might actually write the system using non-multi-threaded techniques.

S 3.7.

You seem to be going to a lot of trouble to deal with session ID collisions. Why don't you just make session IDs 128-bit random values, and then you won't have to worry about collisions?

   The Session ID SHOULD have a very low collision rate locally. It MUST be generated by a pseudo-random algorithm using a locally generated seed which is unlikely to be used by any other device in the same network [RFC4086].

Why don't you just require a cryptographically secure PRNG? That will be required to implement the rest of this protocol.

S 3.8.2.

You seem to introduce a normative dependency on CDDL here. I see that it's in your changelog here, but what are your intentions about this document, given that CDDL seems to not even be a WG document?

S 3.8.5.

   It MUST contain a time-to-live (ttl) for the validity of the response, given as a positive integer value in milliseconds. Zero is treated as the default value GRASP_DEF_TIMEOUT (Section 3.6).

Why do this, rather than just forbidding 0?

S 3.8.6.

   If a node receives a Request message for an objective for which no ASA is currently listening, it MUST immediately close the relevant socket to indicate this to the initiator. This is to avoid unnecessary timeouts if, for example, an ASA exits prematurely but the GRASP core is listening on its behalf.

This is not secure. You need a secure indication of non-knowledge, not a transport-level close.

S 3.9.5.4.

What are the semantics of a Divert URI? What do I do with the path part?

S 3.10.4.

The semantics of "dry run" seem pretty unclear. Is it just "tell me if you would be sad about doing this"?
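To make the S 3.7 suggestion concrete, the sketch below draws Session IDs from the operating system's cryptographically secure generator and uses the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/2N), to compare collision odds for two ID widths. It is an illustration only: the function names are hypothetical, and the 32-bit width used for comparison is an assumption about the draft's current Session ID size.

```python
import math
import secrets


def new_session_id(bits=128):
    """Draw a Session ID from the OS CSPRNG (hypothetical helper)."""
    return secrets.randbits(bits)


def collision_probability(n_sessions, bits):
    """Birthday approximation of at least one collision among n random IDs."""
    space = 2 ** bits
    exponent = n_sessions * (n_sessions - 1) / (2 * space)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for width in (32, 128):
        p = collision_probability(10_000, width)
        print(f"{width}-bit IDs, 10000 live sessions: p = {p:.3e}")
```

With ten thousand concurrent sessions, the approximation gives a collision chance on the order of one percent for 32-bit IDs, and a value far below any practical concern for 128-bit IDs.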
The document includes a couple of instances of "reasonable" in normative statements (e.g., "reasonable timeout"). I would strongly recommend having specific recommendations in the document where this happens. The CBOR definition has constants for IP_PROTO_TCP and IP_PROTO_UDP, but no way to register additional values with IANA. This does not seem future-proof. Section 3.8.4 talks about behavior when a node has a "globally unique address," but provides no guidance for detecting this. Are nodes expected to check for link-local, zeroconf, RFC 1918, and RFC 6598 addresses? Any others?
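The kind of check the "globally unique address" question implies might look like the sketch below. It is only an illustration under stated assumptions: the list of excluded ranges is the reviewer's list plus IPv6 link-local, unique local, and loopback, it is not exhaustive, and neither the list nor the function name comes from the draft.

```python
import ipaddress

# Ranges treated as NOT globally unique in this sketch (an assumed,
# non-exhaustive list; the draft gives no such list).
NON_GLOBAL_NETWORKS = [
    ipaddress.ip_network(prefix)
    for prefix in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918 private
        "100.64.0.0/10",                                   # RFC 6598 shared space
        "169.254.0.0/16",                                  # IPv4 link-local (zeroconf)
        "127.0.0.0/8", "::1/128",                          # loopback
        "fe80::/10",                                       # IPv6 link-local
        "fc00::/7",                                        # IPv6 unique local
    )
]


def looks_globally_unique(text):
    """Return True if the address falls outside every excluded range."""
    addr = ipaddress.ip_address(text)
    return not any(
        addr.version == net.version and addr in net
        for net in NON_GLOBAL_NETWORKS
    )


if __name__ == "__main__":
    for candidate in ("192.168.1.10", "100.64.0.1", "fe80::1", "8.8.8.8"):
        print(candidate, looks_globally_unique(candidate))
```

Even this short list shows why explicit guidance would help: two implementations that pick different exclusion lists will disagree about which nodes have a "globally unique address".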
draft-ietf-mpls-tp-shared-ring-protection
I want to thank the authors for a very readable draft. It was a pleasure to review, and that's a high bar for the subject. I have loads of questions, but my first set of questions is an expansion of Alvaro's comment that I think rises to the level of a Discuss. Please note that I'm asking questions, not proposing text changes, so I really do want to discuss it. ---------- my first set of questions In this text, Three typical ring protection mechanisms are described in this section: wrapping, short wrapping and steering. All nodes on the same ring MUST use the same protection mechanism. I would like to understand what happens if they aren't - and I'm asking, mostly as a way of encouraging guidance for operators in debugging cases where they're not all using the same mechanism. I'm not asking for a full mesh of possible misconfigurations, only for a sentence or two ("If they aren't all using the same protection mechanism, the following things may happen"). More broadly, I'd like to understand why wrapping and short wrapping are both defined. It seems like the only functional difference is that short wrapping doesn't give you as much latency. Is that right? 24 pages in, I see this: o In rings utilizing the wrapping protection, each node detects the failure or receives the RPS request as the destination node MUST perform the switch from/to the working ring tunnels to/from the protection ring tunnels if it has no higher priority active RPS request. o In rings utilizing the short wrapping protection, each node detects the failure or receives the RPS request as the destination node MUST perform the switch only from the working ring tunnels to the protection ring tunnels. so I'm pretty sure there are differences beyond what I was seeing, earlier in the document. And, of course, I'm not sure what the effect of choosing steering over wrapping/short wrapping would be, for my users, but that can wait until we talk about wrapping and short wrapping ... 
At a minimum, I'd like to see guidance for operators in choosing among the three protection mechanisms. Why would they choose any one of the three? I also note that this MUST seems to be repeated using different words in section 5.1, as All nodes in the same ring MUST use the same protection mechanism, Wrapping, steering or short-wrapping. If that's saying the same thing, one MUST is all you need.
---------- all the other questions In this text, When the service LSP passes through the interconnected rings, the direction of the working ring tunnels used on both rings SHOULD be the same. For example, if the service LSP uses the clockwise working ring tunnel on Ring1, when the service LSP leaves Ring1 and enters Ring2, the working ring tunnel used on Ring2 SHOULD also follow the clockwise direction. I'm not understanding why this is a SHOULD, and not a MUST. If the direction of the working ring tunnels used on both rings is not the same, does this still work? If it still works, why does this matter? But, either way, you might usefully say something about why this isn't always the right thing to do, even if you just give one example. The point of SHOULD is that implementers make their own informed decisions, so providing information that will inform those decisions seems important. I wanted to call out Ring switches MUST be preempted by higher priority RPS requests. For example, consider a protection switch that is active due to a manual switch request on the given link, and another protection switch is required due to a failure on another link. Then an RPS request MUST be generated, the former protection switch MUST be dropped, and the latter protection switch established. MSRP mechanism SHOULD support multiple protection switches in the ring, resulting in the ring being segmented into two or more separate segments. This may happen when several RPS requests of the same priority exist in the ring due to multiple failures or external switch commands. as really good examples of the kind of text I think would help the places in this document ("For example", "This may happen when") where no examples are given. Thanks for providing those examples! Ouch. Do I understand from o Protection Switching Mode (M): This 2-bit field indicates the protection switching mode used by the sending node of the RPS message. 
This can be used to check that the ring nodes on the same ring use the same protection switching mechanism. The defined values of the M field are listed as below:

+------------------+-----------------------------+
| Bits (MSB-LSB)   | Protection Switching Mode   |
+------------------+-----------------------------+
| 0 0              | Reserved                    |
| 0 1              | Wrapping                    |
| 1 0              | Short Wrapping              |
| 1 1              | Steering                    |
+------------------+-----------------------------+

that you already have three protection mechanisms, and have only one possible codepoint to allocate for any future optimizations? Assuming that "0 0" can be unReserved ...

Could you clarify what "anyway" means in this text?

   When multiple MS RPS requests exist at the same time addressing different links and there is no higher priority request on the ring, no switch SHOULD be executed and existing switches MUST be dropped. The nodes MUST signal, anyway, the MS RPS request code.

I'm seeing that the commands like LP described in section 5.2.1.1 are used in the document before these (I'm serious) helpful and clear explanations appear. If it's possible to move section 5.2.1.1 up in the document, that would be great, but if it isn't possible, a forward pointer would be helpful to readers who don't already know what the command abbreviations mean.

I'm really confused by this SHOULD:

   The PSC protocol [RFC6378] is designed for point-to-point LSPs, on which the protection switching can only be performed on one or both of the end points of the LSP. The RPS protocol is designed for ring tunnels, which consist of multiple ring nodes, and the failure could happen on any segment of the ring, thus RPS SHOULD be capable of identifying and handling the different failures on the ring, and coordinating the protection switching behavior of all the nodes on the ring.

I suspect that's because it's not a 2119 SHOULD, but if people think it is, I wouldn't mind understanding why.
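The consistency check that the 2-bit M field is meant to enable (all nodes on a ring using the same protection switching mode) could be sketched as below. The value-to-mode mapping is taken from the quoted table; the function names are hypothetical, and the sketch assumes the M value has already been extracted from each received RPS message.

```python
# Mapping from the quoted M field table (2 bits, MSB first).
PROTECTION_MODES = {
    0b00: "Reserved",
    0b01: "Wrapping",
    0b10: "Short Wrapping",
    0b11: "Steering",
}


def decode_mode(m_field):
    """Map a 2-bit M field value to its mode name."""
    if not 0 <= m_field <= 0b11:
        raise ValueError(f"M field must fit in 2 bits, got {m_field}")
    return PROTECTION_MODES[m_field]


def ring_modes_consistent(m_fields):
    """Return True if every reported M field names the same mode."""
    return len({decode_mode(m) for m in m_fields}) <= 1


if __name__ == "__main__":
    reported = [0b01, 0b01, 0b10]  # one node configured for short wrapping
    print(ring_modes_consistent(reported))
```

A check like this detects a mismatch but does not say what the nodes should do next, which is the operator guidance requested earlier in these comments.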
Section 5.3, "RPS and PSC Comparison on Ring Topology" is really helpful, but it appears 43 pages in. Given that I'd expect people to be asking why they should implement a new protection switching protocol when they've already implemented PSC, I'd think this would be much more useful, early in the document.

I'm somewhat confused about the code point allocation strategy in this text:

   The RPS Request Field is 8 bits, the allocated values are as follows:

   Value    Description                  Reference
   -------  ---------------------------  ---------------
   0        No Request (NR)              this document
   1        Reverse Request (RR)         this document
   2        unassigned
   3        Exercise (EXER)              this document
   4        unassigned
   5        Wait-To-Restore (WTR)        this document
   6        Manual Switch (MS)           this document
   7-10     unassigned
   11       Signal Fail (SF)             this document
   12       unassigned
   13       Forced Switch (FS)           this document
   14       unassigned
   15       Lockout of Protection (LP)   this document
   16-254   unassigned
   255      Reserved

My first question is, why the highest priority RPS value is 15, given that the field is 8 bits wide. If anyone ever needs to add a code point higher than the highest priority code point, will that work well? I can imagine code that says "if operation_priority is greater than highest_priority, it's an error", for example. I may have other questions depending on your answer, but let's start there.
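The concern about code that hard-wires the highest priority value can be illustrated with a short sketch. The request codes come from the quoted table; both functions are hypothetical, and treating the numeric value as the priority is an assumption drawn from the comment, not from the draft text quoted here.

```python
# Allocated RPS request codes from the quoted table.
RPS_REQUESTS = {
    0: "NR", 1: "RR", 3: "EXER", 5: "WTR",
    6: "MS", 11: "SF", 13: "FS", 15: "LP",
}
RESERVED_VALUE = 255


def classify_request(value):
    """Table-driven classification of an 8-bit RPS request value."""
    if not 0 <= value <= 255:
        raise ValueError(f"RPS request must fit in 8 bits, got {value}")
    if value == RESERVED_VALUE:
        return "reserved"
    return RPS_REQUESTS.get(value, "unassigned")


def naive_is_valid(value, highest_priority=15):
    """The fragile check the comment imagines: anything above 15 is an error."""
    return value <= highest_priority


if __name__ == "__main__":
    for code in (15, 16, 255):
        print(code, classify_request(code), naive_is_valid(code))
```

A value such as 16 is merely unassigned in the table, yet the naive comparison rejects it outright, so a later allocation above 15 would break implementations written that way.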
Some nits and a question:

3. MPLS-TP Ring Protection Criteria and Requirements a. The number of OAM entities... "Each ring-node requires only one instance of the RPS protocol." --- not super important, but is this "Each ring-node requires only one instance of the RPS protocol (regardless of the number of rings)" or "Each ring-node requires only one instance of the RPS protocol per ring"? -- if a node participates in multiple rings, does it need an instance for each ring? (I suspect that this is somewhat of an implementation choice, but am not sure).

4. Shared Ring Protection Architecture 4.1. Ring Tunnel "... ring tunnels which provides a server layer for the LSPs traverse the ring." I think "for the LSPs traversing the ring." (or perhaps "which traverse the ring.")
Substantive: - The abbreviation "MSRP" is already used by RFC 4975. Please avoid overloading it if at all possible. (And you probably want to collide with "Manufacturer's Suggested Retail Price" even less.) -4.4.2: "When the service LSP passes through the interconnected rings, the direction of the working ring tunnels used on both rings SHOULD be the same. " Would it ever make sense for the directions to be different? (That is, why not MUST?) If so, a few words about that would be helpful. -5.1, 3rd bullet: "Determination of the affected traffic SHOULD be performed by examining the RPS requests (indicating the nodes adjacent to the failure or failures) and the stored ring map (indicating the relative position of the failure and the added traffic destined towards that failure)." Would it ever make sense to violate that SHOULD? (That is, why not MUST?) -6.2: Why "standards action"? That's a high bar. Are there reasons why a lower bar like "specification required" would not be appropriate? For example, are we in danger of running out of code points? Is this registry at unusual risk for poor quality registrations? Editorial: -3: Is this section expected to be useful to implementors? It reads more like evidence to the WG that this meets the requirements. I suspect people won't much care about that once this is published as an RFC. Please consider moving it to an appendix, or even removing it entirely. -4.4.2: "For example, if the service LSP uses the clockwise working ring tunnel on Ring1, when the service LSP leaves Ring1 and enters Ring2, the working ring tunnel used on Ring2 SHOULD also follow the clockwise direction." Please avoid repeating the 2119 "SHOULD" in the example. - 5.1: "The MSRP protection operation MUST be controlled with the help of the Ring Protection Switch protocol (RPS)." That seems like a statement of fact, rather than an implementation requirement. 
Starting around 5.1, I notice several uses of the word "source" as a verb, where from context it seems like you mean "to send" or "to originate". Is that a term of art? I usually think of "source" as a verb to mean "acquire", "find" or "find a source for". -5.3: "... thus RPS SHOULD be capable of identifying and handling the different failures on the ring ..." That seems like a statement of fact.
Two technical comments that I think are important to address but do not warrant a discuss: 1) section 5.2: "As shown in Figure 14, when no protection switching is active on the ring, each node MUST send RPS requests with No Request (NR) to its two adjacent nodes periodically." What does periodically mean here? Can you maybe give a number or even a normative statement like "and MUST NOT send more often than every X seconds" to avoid unnecessary congestion...? 2) section 5.1.1: "A ring node which is not the destination of the received RPS message MUST forward it to the next node along the ring immediately." Why would you forward these? I thought you only send messages to your neighbors? Maybe I missed this but is there a use case for this scenario? Otherwise it might be safer to not forward, to avoid messages with a wrong destination node ID circling around forever. If you forward, maybe you also need a hop-count to decrease or at least say that messages that are received and have the node's own ID as source node ID MUST be dropped...? Further, as mentioned by Ben for a couple of cases, some of the uses of normative language in section 5 seem not to be appropriate as they don't specify a concrete implementation action. Please check carefully and change some to lower case instead, e.g. "The MSRP protection operation MUST be controlled with the help of the Ring Protection Switch protocol (RPS). " "The RPS protocol MUST carry the ring status information and RPS requests,.." (this sounds like a requirement on the protocol design but when you implement the protocol as specified there is no way to not do it, so this MUST is unnecessary) "Each node on the ring MUST be uniquely identified by assigning it a node ID." (also requirement-like; the MUST in the next sentence is the important one) "When a node detects a failure and determines that protection switching is required, it MUST send the appropriate RPS request in both directions to the destination node." 
"MSRP mechanism SHOULD support multiple protection switches in the ring, resulting in the ring being segmented into two or more separate segments. " "The first three RPS protocol messages carrying new RPS request SHOULD be transmitted as fast as possible." (Again the later SHOULD is the more important one) There may be more…
I'd like to see the discussion with gen-art reviewer conclude and the associated changes folded into the next version of the document.
The security considerations of this document seem unacceptably incomplete, as they basically just point to other documents. The RPS protocol defined in this document is carried in the G-ACh [RFC5586], which is a generalization of the Associated Channel defined in [RFC4385]. The security considerations specified in these documents apply to the proposed RPS mechanism. The security considerations of those documents don't seem that great either. However, I believe that they miss a new security issue raised by the mechanism in this draft, which is that a member of the ring appears to be able to forge reports of errors at other parts of the ring. Specifically, S 5.1.3.3 says: When a node is in a pass-through state, it MUST transfer the received RPS Request in the same direction. When a node is in a pass-through state, it MUST enable the traffic flow on protection ring tunnels in both directions. This seems not to involve any filtering, which suggests that node B can send a forged SF from C->D and from D->C, which at least potentially temporarily breaks the link there, causing traffic diversion. More generally, this system assumes that every node trusts every other node completely. That must at least be stated. Incidentally, the text above appears to contain a bug in that it doesn't talk about processing incoming RPS requests intended for the receiving node, but I may just have missed the section where it says that.
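Several of the comments above touch on nodes relaying RPS messages that are not addressed to them (forwarding in section 5.1.1, the pass-through state in S 5.1.3.3). The sketch below illustrates the two loop-prevention ideas floated there: drop a message that comes back to its own source, and bound forwarding with a hop count. The message shape, field names and hop counter are hypothetical and are not the draft's wire format. The sketch does nothing about forged requests, which is the separate trust issue raised in the security comment.

```python
from dataclasses import dataclass


@dataclass
class RpsMessage:
    """Hypothetical message shape for illustration only."""
    source_id: int
    dest_id: int
    hops_left: int  # assumed hop counter; not present in the quoted text


def handle(node_id: int, msg: RpsMessage) -> str:
    """Decide what a ring node does with a received message."""
    if msg.dest_id == node_id:
        return "process"
    if msg.source_id == node_id:
        # Our own message went all the way around: the destination ID was wrong.
        return "drop"
    if msg.hops_left <= 0:
        # Hop budget exhausted: stop the message from circling forever.
        return "drop"
    msg.hops_left -= 1
    return "forward"
```

Either check alone stops a message with a bad destination node ID from circulating indefinitely; the source-ID check needs no extra field, while the hop count also covers a message whose source node has left the ring.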
S 4.1.1. protect these LSPs that traverse the ring, a clockwise working ring tunnel (RcW_D) via E->F->A->B->C->D, and its anticlockwise protection ring tunnel (RaP_D) via D->C->B->A->F->E->D are established, Also, an anti-clockwise working ring tunnel (RaW_D) via C->B->A->F->E->D, and its clockwise protection ring tunnel (RcP_D) via D->E->F->A->B->C->D Why does the protection tunnel include D on both ends whereas the working tunnel does not? S 4.2. packets are periodically exchanged between each pair of MEPs to monitor the link health. Three consecutive lost CC packets will be interpreted as a link failure. Is this a normative statement (i.e., does it need a MUST). S 4.3.2.1. Why do you ever not use short wrapping? S 5.1.4.1 A node MUST revert from pass-through state to the idle state when it detects NR codes incoming from both directions. Both directions revert simultaneously from the pass-through state to the idle state. incoming within what time frame?
This document describes 3 different protection mechanisms and it specifies that all nodes "MUST use the same protection mechanism". When should these mechanisms be used? What are the conditions that an operator should take into account when selecting between them? I would like to see operational considerations explained.
draft-ietf-idr-shutdown
(So obviously the right thing to do that even TSV ADs ballot Yes - thanks!)
Having had to deal with many instances of "<ring ring>Hey, my BGP session with you just went down, whatsup?!", "Yes, it's a maintenance. I sent you mail about it last month, then last week, then this morning, then 5 minutes before pulling the session. You even generated a ticket for me, it's # [1432323] 'kthnxbye...<click>" I think that this is the best thing since sliced bread (of course, I also thought jabber over BGP was cool). Some nits: 2. Shutdown Communication Shutdown Communication: to support international characters, the Shutdown Communication field MUST be encoded using UTF-8. perhaps: "MUST be encoded using UTF-8 "Shortest Form" encoding"? (from Security Considerations) - or Alexey Melnikov's suggestion... Also, *perhaps* it is worth noting that it might be possible for someone to send: 'BGP going down\nMay 22 11:19:12 rtr1 mib2d[42]: SNMP_TRAP_LINK_TYPE: ifIndex 501, ifOperStatus "Interface is a small turnip", ifName ge-1/2/3' and that logging of these should strip control characters. This may already be covered in syslog...
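The log-injection concern can be illustrated with a short sketch. It assumes the receiving implementation logs the Shutdown Communication as text; `sanitize_for_log` is a hypothetical helper name, and replacing control characters with "?" is just one possible policy.

```python
import unicodedata


def sanitize_for_log(text: str) -> str:
    """Replace control characters (Unicode category Cc) before logging.

    This covers newline, carriage return, tab, DEL and the C1 controls,
    so a peer cannot start a fake log line inside its message.
    """
    return "".join(
        "?" if unicodedata.category(ch) == "Cc" else ch
        for ch in text
    )
```

Applied to the example above, the embedded "\n" becomes "?" and the forged syslog entry stays on the same line as the genuine one. Format characters such as bidirectional overrides are in a different category (Cf) and would need their own handling.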
I am wondering why the addition to the copyright notice cited below is needed, given that there is no IPR declared. Can you please explain? "This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English."
Since this notification message is only defined for two of the subcodes, shouldn't the error handling in Section 4 also check for invalid subcodes? Why is the length limited to 128 (instead of the possible 255)? Could be useful to clarify.
The portion of Section 6 (Security Considerations) that discusses confusable characters is describing a problem that isn't obvious on first reading. As these strings are human-produced and human-consumed, it's not clear what harm would arise through the use of spoofing. If there is a real risk here that the authors are aware of, it should be described in more detail to allow implementors to more adeptly steer around it. If not, the statement around spoofing should probably be removed so as to avoid implementors scratching their heads regarding what mitigating actions they might take.
In Section 2: Shutdown Communication: to support international characters, the Shutdown Communication field MUST be encoded using UTF-8. A receiving BGP speaker MUST NOT interpret invalid UTF-8 sequences. Note that when the Shutdown Communication contains multibyte characters, the number of characters will be less than the length value. This field is not NUL terminated. I think you should stick a reference to RFC 5198, which talks about subset of UTF-8 intended for human consumption. I was also thinking about language tagging (RFC 5646) for human readable text, but I suspect that nobody will implement it in your extension.
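The rules quoted in these comments (UTF-8 only, no interpretation of invalid sequences, a 128-octet limit, no NUL terminator) can be sketched as a small parser. The layout assumed here, one length octet followed by the text, and the choice to raise `ValueError` on length problems are assumptions for illustration; the draft's Section 4 defines the actual error handling.

```python
MAX_LEN = 128  # limit mentioned in the ballot comments


def parse_shutdown_communication(data: bytes):
    """Parse an assumed layout: one length octet, then UTF-8 text.

    Returns the decoded text, or None when the bytes are not valid UTF-8
    (the message is then simply not interpreted).
    Raises ValueError when the length octet is missing or inconsistent.
    """
    if len(data) < 1:
        raise ValueError("missing length octet")
    length = data[0]
    if length > MAX_LEN:
        raise ValueError("length exceeds 128 octets")
    if len(data) - 1 < length:
        raise ValueError("field is truncated")
    raw = data[1:1 + length]  # not NUL terminated; the length octet delimits it
    try:
        # Python's strict decoder rejects overlong (non-shortest) forms too.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None
```

For a message containing multibyte characters, the decoded string has fewer characters than the length octet indicates, which matches the note in the quoted text.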
draft-ietf-dnsop-nsec-aggressiveuse
I should ballot Discuss, so we can all tell Warren how awesome this draft is on the telechat itself. More seriously, I'm pretty sure I was Gen-ART reviewer for the RFC being updated, and this update seems very much like the right thing to do.
One smallish, unimportant editorial comment: In section 5, e.g.: "If the negative cache of the validating resolver has sufficient information to validate the query, the resolver SHOULD use NSEC, NSEC3 and wildcard records aggressively." it seems like the word "aggressive" has some meaning which was at least not clear to me. Is there a difference between negative caching and aggressive negative caching? If this word is meant to provide additional information on what to do, could you maybe explain further?
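One possible reading of "aggressive" is that the resolver answers a new query from an NSEC range it already holds, rather than only reusing a cached answer for the exact same name. A minimal sketch of the range check follows, assuming names are ordered by comparing lowercased labels from right to left. The function names are illustrative, and NSEC3 hashing, wildcards and DNSSEC validation are all left out.

```python
def canonical_key(name: str):
    """Sort key: lowercased labels as bytes, most significant (rightmost) first."""
    labels = name.rstrip(".").lower().split(".")
    return [label.encode("ascii") for label in reversed(labels)]


def nsec_covers(owner: str, next_name: str, qname: str) -> bool:
    """True if qname falls strictly between an NSEC owner and its next name."""
    o = canonical_key(owner)
    n = canonical_key(next_name)
    q = canonical_key(qname)
    if o < n:
        return o < q < n
    # The last NSEC in a zone points back to the apex, so the range wraps.
    return q > o or q < n
```

With a cached NSEC record for a.example. whose next name is c.example., a later query for b.example. is covered and could be answered negatively without contacting the authoritative server, while a query for d.example. is not covered and would still go upstream.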
I agree with Adam's comment about the parenthetical phrasing in the abstract. I see the intent for text in square brackets to be removed. Did I miss instructions to the RFC Editor to that effect? Most likely they will figure it out, but explicit instructions would be better.
It would have been nice to use a AAAA record in the examples.
I'm an author, recusing myself. But if I weren't, I'd ballot "Awesome" :-)
This seems like a good change; the description is well written and easy to understand; and the logic seems sound and well-explained. The abstract should remove the parentheses from the second paragraph, as they form an important (as opposed to incidental) part of the description of the update.
Specially for Warren: "Awesome" :-)
draft-ietf-oauth-native-apps
Thanks to Zitao Wang (Michael) for the OpsDir review, and William for addressing the comments...
I agree with Adam's general sentiment about detection of bad behavior vs asking people not to be bad. -8 and its children: There seems to be a lot of duplication (including duplication of normative language) between the security considerations and the rest of the document. - 8.7: This section seems to argue against using in-app browser tabs in the first place. If there is no good way for the user to tell the difference between that and an embedded UA, then maybe we should train users to be suspicious of any in-app presentation of the authorization request? The last paragraph seems to be founded on a mismatch between user needs and typical user sophistication.
A couple of nits: 8.2. OAuth Implicit Grant Authorization Flow The OAuth 2.0 implicit grant authorization flow as defined in Section 4.2 of OAuth 2.0 [RFC6749] generally works with the practice of performing the authorization request in the browser, and receiving the authorization response via URI-based inter-app communication. However, as the Implicit Flow cannot be protected by PKCE (which is a required in Section 8.1), the use of the Implicit Flow with native apps is NOT RECOMMENDED. NOT RECOMMENDED is not actually a construct allowed by RFC 2119, I think you should reword it using "SHOULD NOT". It would be good to add RFC reference for HTTPS URIs.
Document: draft-ietf-oauth-native-apps-11.txt S 7. To fully support this best practice, authorization servers MUST support the following three redirect URI options. Native apps MAY use whichever redirect option suits their needs best, taking into account platform specific implementation details. It's not entirely clear from this text what "support" means. Would just echoing whatever redirect URI the client provided count as support? S 7.2. App-claimed HTTPS redirect URIs have some advantages in that the identity of the destination app is guaranteed by the operating system. For this reason, they SHOULD be used in preference to the other redirect options for native apps where possible. You should probably be clearer on who this guarantee is provided to. And I assume this SHOULD is directed to app authors? Claimed HTTPS redirect URIs function as normal HTTPS redirects from the perspective of the authorization server, though as stated in Section 8.4, it is REQUIRED that the authorization server is able to distinguish between public native app clients that use app-claimed HTTPS redirect URIs and confidential web clients. S 8.4 doesn't seem clear on how one makes this distinction. Is it just a matter of remembering what the app author told you? S 8.1. As most forms of inter-app URI-based communication send data over insecure local channels, eavesdropping and interception of the authorization response is a risk for native apps. App-claimed HTTPS redirects are hardened against this type of attack due to the presence of the URI authority, but they are still public clients and the URI is still transmitted over local channels with unknown security properties. I'm probably missing something, but I'm not sure what this last sentence means. Is the channel here the one that kicks off the native app with the HTTPS URI as the target?
General
=======

The thesis of this document seems to be that bad actors can access authentication information that gives them broader or more durable authorization than is intended; and appears to want to mitigate this predominantly with a single normative statement in a BCP telling potential bad actors to stop doing the one thing that enables their shenanigans. For those familiar with the animated series "The Tick," it recalls the titular character yelling "Hey! You in the pumps! I say to you: stop being bad!" -- which, of course, is insufficient to achieve the desired effect.

I see that there is nevertheless "strong consensus" to publish the document; in which case, I would encourage somewhat more detail around what the rest of the ecosystem -- and the authentication server in particular -- can do to mitigate the ability of such bad actors. Specifically, section 8.1 has a rather hand-wavy suggestion that authorization endpoints "MAY take steps to detect and block authorization requests in embedded user agents," without offering up how this might be done. The problem is that the naïve ways of doing this (UA strings?) are going to be easy to circumvent, and the more advanced ones (say, instructing users to log in using a non-OAuth flow if the auth endpoint detects absolutely no cookies associated with its origin) will have interactions that probably warrant discussion in this document. (For example, such an approach -- while potentially effective -- would interact very poorly with the "SSO mode" described in section B.3; although I think that recommending the use of "SSO mode" should be removed for other reasons, described below).

________

Specific comments follow

The terminology section makes distinctions about cookie handling and content access in generic definitions (embedded versus external UAs, for example) but doesn't do the same for specific technologies.
It is probably worthwhile noting that the "in-app browser tab" prevents apps from accessing cookies and content, while the "web-view" does not (I had to infer these facts from statements much later in the document).

Section 7.3 gives examples of IPv4 and IPv6 addresses for loopback. While I'm sympathetic to the deployment challenges inherent in getting entire network paths to upgrade to IPv6, this text discusses loopback exclusively, which means that only the local operating system needs to support IPv6. Since all modern operating systems have supported IPv6 for well over a decade, I suggest that the use of IPv4 addresses for this purpose should be explicitly deprecated, so as to avoid unnecessary transition pain in the future. Minimally, the example needs to be replaced or supplemented with an IPv6 example, as per <https://www.iab.org/2016/11/07/iab-statement-on-ipv6/>: "We recommend that existing standards be reviewed to ensure they... use IPv6 examples."

Section 8.1 makes the statement that "Loopback IP based redirect URIs may be susceptible to interception by other apps listening on the same loopback interface." That's not how TCP listener sockets work: for any given IP address, they guarantee single-process access to a port at any one time. (Exceptions would include processes with root access, but an attacking process with that level of access is going to be impossible to defend against). While mostly harmless, the statement appears to be false on its face, and should be removed or clarified.

Section 8.4 indicates that loopback redirect URIs are allowed to vary from their registered value in port number only. If you decide not to deprecate the use of IPv4 loopback, I imagine that servers should also treat [::1] identically to 127.0.0.1 for this purpose as well.
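The Section 8.4 matching rule, combined with the suggestion to treat [::1] and 127.0.0.1 alike, could look roughly like the sketch below on the authorization server side. The function names are hypothetical, and comparing scheme, path and query exactly is an assumption about what "vary in port number only" implies.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_host(host: str) -> bool:
    """True for loopback IP literals such as 127.0.0.1 or ::1 (not hostnames)."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def loopback_redirect_matches(registered: str, presented: str) -> bool:
    """Compare two loopback redirect URIs, ignoring the port and address family."""
    reg, got = urlsplit(registered), urlsplit(presented)
    if not (reg.hostname and got.hostname):
        return False
    if not (is_loopback_host(reg.hostname) and is_loopback_host(got.hostname)):
        return False
    # Everything except host and port must match exactly.
    return (reg.scheme, reg.path, reg.query) == (got.scheme, got.path, got.query)
```

A client registered with http://127.0.0.1/cb would then be accepted when it presents http://[::1]:51004/cb, while a different path or a non-loopback host is rejected.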
Section 8.7 claims that users are likely to be suspicious of a sign-in request when they should have already been signed in, and goes on to claim that they will distinguish between completely-logged-out states and logged-in-but-needing-reauth states, and may even take evasive action based on associated suspicion. Based on what I know of user research for security indicators, the chances of these statements being true for any non-trivial portion of any user population is basically zero. I propose that this section simply highlight that this is effectively an intractable problem from the client end, without any illusions that users have the ability to distinguish between the two circumstances, and that authentication servers must be extra vigilant in detecting and avoiding these kinds of attacks. Section 8.11, third paragraph talks about keystroke logging; in practice, the attack here is far easier than that, as I believe that applications that embed a web view can simply extract authentication-related material directly from the DOM. Section B.2 uses the phrase "Android Implicit Intends" where I believe it means "Android Implicit Intents." Section B.3 describes the use of a "Web Authentication Broker" in SSO mode, which provides an isolated authentication context. If the section 8.7 text regarding user detection of nefarious application behavior in the form of web-view embedding is not removed, this needs a very clear treatment of how users might be expected to distinguish between that behavior and the SSO mode behavior. On casual examination, it seems that there would be no way to do so. I'll note that this BCP also promotes the "already logged in" behavior as being a key benefit to OAuth (cf. the third paragraph of Section 4), which the described behavior seems to mostly defeat. I would strongly suggest either removing discussion of using this mode, or deprecating it in favor of the user's preferred web browser, so as to obtain the advantages described in section 4.
Quick question just to double-check: should this document update RFC6749?
draft-ietf-isis-l2bundles
The shepherd raises multiple concerns about this document including the need for additional discussions in rtgwg, more operator feedback, technical concerns, and no discussion about the filed IPR. Further, the security considerations section only says 'None', which also does not seem appropriate. I didn't follow the working group, nor am I an expert on this work, therefore I abstain. However, for me this document does not seem to fulfill the needed processing requirements to be published but I'll leave the judgement to the responsible AD.
* I would like to suggest rewriting the examples with IPv4 addresses from the documentation block(s) defined in RFC5737 instead of using random IP addresses. * It would have been nice to include an example with IPv6 addresses. * Appendix A: The length value for "L2 Bundle Attribute Descriptors" under "TLV for Adjacency #2" is wrong. It says 29 but it needs to be 32 * I would like to see Adam's DISCUSS points addressed as well.
The main motivation (from the Introduction) for this extension seems to be to provide information to "entities external to IS-IS such as Path Computation Elements". I would like to see some discussion related to the "interface" with these external entities. [Note: I am not asking for a comparison with other potential solutions like BGP-LS.] I would also like to see the conversation with the OPS DIR reviewer reflected in the document. I also support the DISCUSS points about the Security Considerations. About other potential or protocol agnostic solutions... I read the related discussion [1] that took place before the WG adoption of the document, and trust that the subsequent adoption and WGLC reflect the consensus of the WG. As has been put in evidence by the development of multiple routing protocols and other solutions, one size doesn't always fit all. [1] https://mailarchive.ietf.org/arch/msg/isis-wg/cK6dBtjvFZgNnZsQGZBgrpGG8jc/?qid=0a77495659ebc27956fe54d200bf3f33
I support the various DISCUSS points concerning the security considerations. I note that the remaining authors have made their IPR statements, so that discussion is moot. I share some of the discomfort concerning the shepherd report. I'm willing to accept that the shepherd is in the rough, but it would be nice to have stronger evidence of that, perhaps in the form of an opinion from the other chair. To quote a wise area director: I leave it to the responsible AD to do the right thing, whatever that might be.
1. "IPR declarations from Clarence Filsfils and Ebben Aries are missing.". Adam is right. That's a showstopper. 2. Reading the write-up: (5) Do portions of the document need review from a particular or from broader perspective, e.g., security, operational complexity, AAA, DNS, DHCP, XML, or internationalization? If so, describe the review that took place. An operational review on the question how should management of L3 bundles be handled would be a good recommendation. This is exactly what Mahesh's OPS DIR feedback is about: https://datatracker.ietf.org/doc/review-ietf-isis-l2bundles-04-opsdir-telechat-jethanandani-2017-04-20/ I've seen Les' answers. I believe those should be documented in the draft.
I agree that there needs to be security considerations and have the following suggestions to help fill in that section. I think I caught the added considerations, but please expand on it if I've missed something.

The draft seems to enable methods to gather information on connected links and the available bandwidth. That should be mentioned as a vulnerability, exposing path information (connections/links and bandwidth). This is a consideration in other IS-IS RFCs and is specific to the TLVs and subTLVs of this draft as well as far as I can tell, but please correct me if I am missing something. The use of the Sub-TLV identifiers provide path information that should be a security consideration in the write up:

o IPv4 Interface Address (sub-TLV 6 defined in [RFC5305])
o IPv6 Interface Address (sub-TLV 12 defined in [RFC6119])
o Link Local/Remote Identifiers (sub-TLV 4 defined in [RFC5307])

Within a single operator environment, the concerns are mitigated, but not eliminated since it does not appear that encryption is used. The following text from RFC7917 seems like a useful addition to these security considerations along with an explanation of what is possibly exposed with this draft (above):

Security concerns for IS-IS are already addressed in [ISO10589], [RFC5304], and [RFC5310] and are applicable to the mechanisms described in this document. Extended authentication mechanisms described in [RFC5304] or [RFC5310] SHOULD be used in deployments where attackers have access to the physical networks, because nodes included in the IS-IS domain are vulnerable.
I'm in agreement with the current DISCUSSes raised. (watching carefully)
I agree with Adam's discuss and in particular the point about Security Considerations. I am holding this discuss independently so I can review those when they exist.
[removed issue regarding author IPR declarations based on <https://mailarchive.ietf.org/arch/msg/isis-wg/Du-FujLleUPhkSbo_Ud00rvVEi4> and <https://mailarchive.ietf.org/arch/msg/isis-wg/hQGriZR12khwiX7NR53lJULqwWk>] Blocking Issue: This document is at odds with BCP 72, and is inappropriate for publication with its current security considerations section.
If the Discuss objections I lay out can be addressed, I plan to abstain for many of the same reasons Mirja cites in her abstention. I find the shepherd's write-up to contain an alarming number of red flags indicating a lack of WG consensus, a lack of proper review by parties who should be involved, claims that operator input has been ignored (for a routing protocol no less), and an indication that IPR disclosures have apparently not been brought to the WG's attention. These overarching process problems seem large enough that any comments I may have on actual content -- such as an apparent lack of IPv6 support (or, at least, a complete omission of IPv6 from the examples) -- would seem like rearranging deck chairs on the Titanic.
(1) I support Adam and Ekr's DISCUSSes about the security considerations. (2) I also agree that this document should not go forward until Clarence confirms that all appropriate IPR disclosures have been filed. (3) Given the point raised by the shepherd about protocol-agnosticism, it seems like the existence of a single mail (all I could find, but maybe there is more?) from another WG participant who works for the same vendor as several of the authors saying that he believes the encodings are adaptable to OSPF is not quite sufficient justification for putting the shepherd's concerns aside. OTOH, perhaps there is more context that is not present in the ballot text, list archives, and minutes I reviewed, so just flagging this in case there is further explanation that could be provided to the IESG (don't think this would imply changes to the draft).
draft-ietf-nfsv4-xattrs
I share EKR's concerns.
For the ADs - this draft is still in Last Call (ends 2017-05-16). I'm creating the ballot now, because it's easier to juggle the three NFSv4 drafts on the 2017-05-25 telechat agenda if you read -versioning first, and then read -xattrs and -umask together.
Jürgen Schönwälder, in his OPS DIR review flagged the same comment as Alexey: I was a bit surprised to see that numeric values for NFSv4 protocol extensions etc appear to be somewhat magically managed (section 8.6) instead of using IANA but this likely has its own history and reasons.
I agree with the SecDir review about removing the nested references in the security considerations section. https://www.ietf.org/mail-archive/web/secdir/current/msg07386.html The security considerations section does exist and states that file attribute extensions add no new concerns beyond those of file data and named attributes. It defers to the security considerations of application data in NFSv4.2 (RFC 7862), which refers to NFSv4.1 (RFC 5661). 5661 discusses possible MITM and down-grade attacks and how to mitigate them with RPCSEC_GSS (integrity or privacy services). I agree with this assertion, though I'd rather have the draft reference 5661 directly or RFC 7530. I also support EKR's discuss.
Since xattrs are application data, security issues are exactly the same as those relating to the storing of file data and named attributes. These are all various sorts of application data and the fact that the means of reference is slightly different in each case should not be considered security-relevant. As such, the additions to the NFS protocol for supporting extended attributes do not alter the security considerations of the NFSv4.2 protocol [RFC7862]. This seems inadequate. The issue is that if machine A writes some extended attribute which is security relevant (i.e., this file is only readable under certain conditions) and then machine B doesn't know about the attribute, then you have a security problem on B because it will not enforce it. It seems like FreeBSD uses extended attributes for this purpose, so this isn't just theoretical.
My understanding of draft-ietf-nfsv4-versioning is that extensions are tightly bound to precisely one minor version of NFS, and become part of the mandatory-to-understand (but not necessarily implement) XDR for the next minor version. Based on this, I would expect any extension based on the new scheme defined by draft-ietf-nfsv4-versioning to be exceedingly clear about which minor version it applies to (e.g., in the abstract, introduction, and/or title -- ideally all three). I can infer that this extension applies to 4.2 based on the timing of its publication request and a few sidelong mentions of 4.2 in the text, but I think this really needs to be more prominent. Editorial: - The code which is extracted from this document contains a copyright date of 2012. Is that intentional? - The "RFCTBD10" string in the code block in section 7.1 is highly likely to be overlooked by the RFC production center in its final production steps. I would recommend a note (either inline or as an RFC editor note to accompany the document) that explicitly calls out a need to replace this string. - Please expand the acronym "ACE" on first use.
Comment is Section 8.6 made me wonder: is there an IANA registry of all NFS extensions? Having a single place would avoid problems of conflicting allocations. Nit: "xattrs" is sometimes mistyped as "xatrrs"
draft-ietf-nfsv4-versioning
Russ Housley has provided a Gen-ART review for the -09 version of this document, and the author is responding to those comments. I did have one question that came up during AD Evaluation that I wanted to mention. The first two drafts that used this mechanism (umask and xattrs) used two different idioms for discovering support. The xattrs draft defines an xattr_support attribute, while umask does not. In conversations with the working group, the reasoning was that xattrs defines a number of operations, so discovering that the complete mechanism is supported before you start trying to use the attributes makes sense, while umask defines only one attribute, and for any attribute, you can find out if it is supported within a given file system by interrogating the appropriate bit position in the REQUIRED attribute supported_attrs, so there is no advantage to adding a umask_support attribute. That all made perfect sense to me, but the explanation was helpful enough to me that I wonder if it's worth a sentence or two, pointing out that some protocol designers may choose one idiom, while other protocol designers choose the other, and saying that's not a problem. If the answer is "that explanation isn't needed", that won't change my ballot position from Yes, of course.
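The bit-position idiom described in this comment can be sketched as follows. This is a minimal illustration only, assuming the usual NFSv4 bitmap layout (a list of 32-bit words in which attribute number n occupies bit n mod 32 of word n div 32). The attribute numbers and helper names are hypothetical and are not taken from either draft.

```python
# Sketch of "interrogating the appropriate bit position" in a supported_attrs
# style bitmap. Attribute numbers below are hypothetical placeholders.

ATTR_MODE_UMASK = 81      # hypothetical number for the umask attribute
ATTR_XATTR_SUPPORT = 82   # hypothetical number for the xattr support attribute


def bitmap_from_attrs(attr_nums):
    """Build a list of 32-bit words with one bit set per attribute number."""
    words = []
    for n in attr_nums:
        word, bit = divmod(n, 32)
        while len(words) <= word:
            words.append(0)
        words[word] |= 1 << bit
    return words


def attr_supported(bitmap, attr_num):
    """Return True if the bit for attr_num is set in the bitmap."""
    word, bit = divmod(attr_num, 32)
    return word < len(bitmap) and bool(bitmap[word] & (1 << bit))


# A server that advertises the umask attribute but not xattr support.
supported_attrs = bitmap_from_attrs([0, 1, 33, ATTR_MODE_UMASK])
```

With this idiom a client needs no separate "umask_support" attribute: `attr_supported(supported_attrs, ATTR_MODE_UMASK)` answers the question directly. A multi-operation feature such as xattrs may still prefer a single explicit support attribute.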
Out of curiosity, why isn't this material more appropriate as a BCP?
(Updated): Please have a look at the comments raised in the ARTART Directorate review: <https://datatracker.ietf.org/doc/review-ietf-nfsv4-versioning-09-artart-lc-miller-2017-05-12/> This is one of the most comprehensive documents on versioning/extensibility that I've seen. It is probably overkill for many other protocols, but I am glad that you wrote it for NFSv4+. I only have a few minor comments: In Section 6: Extensions to the most recently published NFSv4 minor version may be made by publishing the extension as a Proposed Standard, unless the minor version in question has been defined as non-extensible. A document need not update the document defining the minor version, which remains a valid description of the base variant of the minor version in question. Do you mean "need not update" in the sense of not using the "Updates" header in the resulting RFC? If yes, I think you should make it clearer. In Section 8: This section addresses issues related to rules #11 and #13 in the minor versioning rules in [RFC5661]. It would be good to add a section number reference here (Section 2.7), to save readers the trouble of figuring out what #11 and #13 mean.
I suspect this was discussed as part of the document's development, but it's clear that this new approach to versioning presents new challenges for preventing collisions of numeric constants and bitmap bit meanings. Previously (by my understanding; this isn't my area), minor versions included a full XDR, and therefore effectively carried their own complete and hermetically-sealed registry with them. With the new approach, additional documents may extend the XDR independently. I understand that IANA registration of the various codepoints in NFS is probably too daunting a task to consider reasonable, and that there is effectively an understanding in the working group that future extensions to a minor version are responsible for checking that they don't conflict with any published or pending extensions prior to publication. I don't have an issue with this approach per se, but I think it should be more clearly spelled out in this document. Editorial: - The document uses both "interversion" and "inter-version" -- please choose one and stick with it. - As section 8 is targeted at an audience that may not be concerned with the remainder of the document, I would suggest that the introduction specifically point implementors to it. - The first bullet under "Based on the type of server" in section 9.2 says older servers can only interoperate with older clients; when, in fact, they can clearly operate with newer clients described by the third bullet of "Based on the type of client:". Recommend: "...interoperate with clients implementing the older version. However, clients that do not implement the older version of the feature..."
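The pre-publication check described in this comment (confirming that a new extension does not reuse a numeric constant already claimed by a published or pending extension) could be mechanized along the following lines. This is a sketch under stated assumptions: the extension names, constant names, and values are invented for illustration, and no such tool is referenced in the document.

```python
from collections import defaultdict


def find_collisions(extensions):
    """Return {value: [(extension, name), ...]} for values claimed more than once.

    `extensions` maps a (hypothetical) extension name to a dict of
    constant name -> numeric value that the extension adds to the XDR.
    """
    claims = defaultdict(list)
    for ext_name, constants in extensions.items():
        for const_name, value in constants.items():
            claims[value].append((ext_name, const_name))
    return {value: who for value, who in claims.items() if len(who) > 1}


# Hypothetical data: two pending extensions both claim attribute number 82.
pending = {
    "ext-a": {"ATTR_A": 81},
    "ext-b": {"ATTR_B": 82},
    "ext-c": {"ATTR_C": 82},
}
collisions = find_collisions(pending)
```

Here `collisions` reports that value 82 is claimed by both `ext-b` and `ext-c`, which is the kind of conflict the working group would need to resolve before publication in the absence of an IANA registry.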
It looks like the author has proposed a number of edits in response to the gen-art review, and I'd like to see those incorporated into the next version. I do not think this document needs to formally update RFC 7530.
draft-ietf-nfsv4-umask
I agree with Alexey - an example would be helpful. As someone who has run into this issue (and differences in behavior!), this is a useful document. Nits: Abstract: "In many important environments, inheritable NFSv4 ACLs can be rendered ineffective by the application of the per-process umask." s/important// 1: (personal peeve) - every environment is important to someone... 2: this makes it sound like inheritable ACLs would NOT be ineffective if the environment is not important :-) Sec 2. Problem Statement "As a result, inherited ACEs describing".... First use of ACE, please expand / reference.
Without being "fluent" in NFSv4, it would be nice to have an example of how this fits into the larger picture, e.g., by showing a file create request.
Editorial: Page 2: As a result, inherited ACEs describing Suggest expanding "ACE" or adding a reference where the term first appears.
Please expand "ACL" and "ACE" on first use and in the title. Section 5 uses an all-caps "RECOMMENDATION," which is confusable with (but not) an RFC2119 term. If this is intended to invoke RFC2119 terminology, please rephrase with "RECOMMENDED" or "SHOULD." If not, please remove the capitalization or change to a synonym that is less confusable with "RECOMMENDED."
Not sure I understand the reason given in section 3 for why this document does not update RFC7862. But I guess either choice (updating or not updating) is fine.
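As a rough illustration of the problem the umask draft addresses, the following sketch contrasts the classic behavior, where the client applies the umask before sending the mode, with a server that receives the mode and umask separately. It assumes, as a simplification of the draft's intent and not a quotation of it, that such a server ignores the umask when the parent directory has inheritable ACEs. The function names are hypothetical.

```python
def client_side_mode(requested_mode, umask):
    """Classic behavior: the client masks the mode before sending it,
    so the server cannot tell what the application originally asked for."""
    return requested_mode & ~umask & 0o7777


def server_side_mode(requested_mode, umask, parent_has_inheritable_acl):
    """Sketch: the server receives mode and umask separately and applies
    the umask only when no inheritable ACEs govern the new file."""
    if parent_has_inheritable_acl:
        # Permissions come from the inherited ACL; the umask is not applied.
        return requested_mode & 0o7777
    return requested_mode & ~umask & 0o7777


# An application creates a file with mode 0666 under a umask of 022.
masked = client_side_mode(0o666, 0o022)
with_acl = server_side_mode(0o666, 0o022, parent_has_inheritable_acl=True)
without_acl = server_side_mode(0o666, 0o022, parent_has_inheritable_acl=False)
```

In the classic case the group and other write bits are stripped before the request leaves the client, which can override what an inheritable ACL on the directory would have granted. Sending the two values separately lets the server decide which one governs.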
draft-ietf-tls-ecdhe-psk-aead
Ciphersuite drafts for TLS are usually above my pay grade, but I understand most of EKR's Discuss, and agree with Adam's suggestion to change the document title to "ECDHE_PSK with AES-GCM and AES-CCM Cipher Suites for Transport Layer Security Version 1.2 (TLS 1.2)" at an absolute minimum.
I support Ekr's DISCUSS position.
The following text appears to have been added in -04: A server receiving a ClientHello and a client_version indicating (3,1) "TLS 1.0" or (3,2) "TLS 1.1" and any of the cipher suites from this document in ClientHello.cipher_suites can safely assume that the client supports TLS 1.2 and is willing to use it. The server MUST NOT negotiate these cipher suites with TLS protocol versions earlier than TLS 1.2. Not requiring clients to indicate their support for TLS 1.2 cipher suites exclusively through ClientHello.client_hello improves the interoperability in the installed base and use of TLS 1.2 AEAD cipher suites without upsetting the installed base of version-intolerant TLS servers, results in more TLS handshakes succeeding and obviates fallback mechanisms. This is a major technical change from -03, which, AFAIK, prohibited the server from negotiating these algorithms with TLS 1.1 and below and maintained the usual TLS version 1.2 negotiation rules. I don't consider it wise, but in any case it would absolutely need WG consensus, which I don't believe that it has, given the recent introduction. The discussion of dictionary attacks here seems inferior to that in RFC 4279. In particular, you only need to actively attack one connection to capture the data you need for a brute force attack, despite the text there referring to trying "different keys". Please correct that.
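For contrast with the quoted -04 text, the conventional behavior this comment attributes to -03 can be sketched as below: the negotiated version is the lower of the client's offered version and the server's maximum, and the new cipher suites are simply skipped when that version is below TLS 1.2. This is an illustrative sketch only. The suite names are placeholders, not registered identifiers, and the selection logic is far simpler than a real TLS stack.

```python
TLS_1_0 = (3, 1)
TLS_1_1 = (3, 2)
TLS_1_2 = (3, 3)

# Placeholder names standing in for the cipher suites defined by the draft.
ECDHE_PSK_AEAD_SUITES = {"ECDHE_PSK_AES_128_GCM", "ECDHE_PSK_AES_128_CCM"}


def select_suite(client_version, client_suites, server_suites,
                 server_max=TLS_1_2):
    """Pick a version and the first mutually supported suite, in server
    preference order, never using the AEAD suites below TLS 1.2."""
    negotiated = min(client_version, server_max)
    for suite in server_suites:
        if suite not in client_suites:
            continue
        if suite in ECDHE_PSK_AEAD_SUITES and negotiated < TLS_1_2:
            continue  # these suites are only defined for TLS 1.2
        return negotiated, suite
    return negotiated, None


server_prefs = ["ECDHE_PSK_AES_128_GCM", "LEGACY_SUITE"]
offer = ["ECDHE_PSK_AES_128_GCM", "LEGACY_SUITE"]
old_client = select_suite(TLS_1_1, offer, server_prefs)
new_client = select_suite(TLS_1_2, offer, server_prefs)
```

Under these rules a client advertising TLS 1.1 gets TLS 1.1 with a legacy suite, even though it listed one of the new suites. The server does not infer TLS 1.2 support from the cipher suite list, which is the inference the -04 text introduces.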
The citations to TLS 1.3 still seem pretty muddled. I think you should just stop referencing and discussing 1.3. S 2. I'm not sure that the discussion of the PRF is helpful here in mandating the non-use of these cipher suites with TLS 1.1 and below.
I agree with EKR's discuss -- specifying semantics for these ciphersuites with TLS 1.0 and 1.1 is a material change, and the proposed mechanism (in which servers are encouraged to infer 1.2 support even in the absence of explicit indication) is a bit baffling. Given the scope this document covers, I recommend adding "1.2" to the title of the document. (e.g.: "ECDHE_PSK with AES-GCM and AES-CCM Cipher Suites for Transport Layer Security Version 1.2 (TLS 1.2)")
draft-ietf-dnssd-mdns-dns-interop
I think that this is a useful document -- I think that it would be more useful if it made DNS-SD be LDH only, or somehow made all DNS deployments be UTF-8 (without any sort of homograph issues), but seeing as both of these would require magic, I'm balloting Yes. :-) 2 nits: 1: ... "the so-called LDH rule" -- I think that it would be useful to expand this - the tone is introductory, and so I think helpful to new readers. 2: "cannot be used in the DNS unless they cleave to the LDH rule." - I would suggest "adhere to" or "follow" - 'cleave to', while cooler, is likely confusing to those who a: don't have English as a first language, or b: were born after 1886. :-)
I do have one comment, and it's only for consideration by the responsible AD. This document is great, and the shepherd thinks it's received sufficient review for publication as Informational, but I wonder if it might - make sense to publish as a BCP, which would generate additional review from other communities, OR - make sense to publish as Experimental, which might signal that this document is probably the right thing to do, but the jury is still out, OR - include "You are not expected to understand this" in the Introduction, crediting Dennis Ritchie for prior art (*) I'm MOSTLY kidding ... (*) https://en.wikipedia.org/wiki/Lions%27_Commentary_on_UNIX_6th_Edition,_with_Source_Code#.22You_are_not_expected_to_understand_this.22
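For readers new to the term, the LDH ("letters, digits, hyphen") rule mentioned above can be sketched as a simple label check. This is a minimal sketch that assumes the traditional hostname constraints: ASCII letters, digits, and hyphens only, 1 to 63 characters per label, and no leading or trailing hyphen. The function name is hypothetical.

```python
import string

LDH_CHARS = set(string.ascii_letters + string.digits + "-")


def is_ldh_label(label):
    """Return True if a single DNS label follows the LDH rule."""
    if not 1 <= len(label) <= 63:
        return False
    if label[0] == "-" or label[-1] == "-":
        return False
    return all(ch in LDH_CHARS for ch in label)


examples = {
    "printer-2": is_ldh_label("printer-2"),
    "My Printer": is_ldh_label("My Printer"),
    "café": is_ldh_label("café"),
}
```

A typical DNS-SD service instance name such as "My Printer" fails this check because of the space, and "café" fails because of the non-ASCII character, which illustrates the interoperability gap the document describes.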