***DRAFT*** IETF 77 ICCRG Minutes – still in progress

Session 1 - Tuesday

RG status was presented by RG co-chair Wes Eddy:
- There are strong presentations and attendance at meetings, but mailing list traffic is light; please use the mailing list more.
- ICCRG will be meeting at IETF 78.
- RFC 5783, "Congestion Control in the RFC Series", was published in February.
- Would like to have the capacity sharing architecture work written up as an I-D.
- MulTFRC work is coming to this RG.
- Lots of TCP congestion controls have been reviewed.

Wes presented "Around the IETF", on other IETF activities related to congestion control:
http://www.ietf.org/proceedings/10mar/slides/iccrg-9.ppt

There was disagreement about ALTO; it was clarified that ALTO improves P2P performance through localization, which lowers the RTT and will therefore raise congestion (a second-order effect).
Aaron Falk – In the apps area, is overload protection the only issue?
Lars Eggert – Clarification: admission control is the issue.
Aaron Falk – There is a lot of real-time traffic on the Internet now, which needs bandwidth management.
Lars – It would be helpful if someone with a congestion-control mindset could apply it to admission control for applications and generalize it, with provably stable behavior.
LISP may cause delays or reordering (look at the data probe concept in LISP+ALT).
Bob Briscoe – Looking at LISP and others regarding congestion control.
Lars – In BFD's latest draft, the congestion control mechanism was taken out because it hadn't been vetted. There is a potential research topic there.
Aaron Falk – Is this tracking of activity related to congestion control a good topic for an IESG talk? (Action: revise the talk and go over it with the ADs.)

=============================

Bob Briscoe and Matt Mathis presented on Internet Capacity Sharing Architecture

There is a BoF on congestion exposure (ConEx). If it goes through, it would raise a number of questions. ICCRG could take one of two mutually incompatible views of the world:
- Keep the existing guidelines on congestion control, and check/verify new protocols against them.
- Develop new guidelines; if everyone agrees that we need to move on from TCP-friendliness, we can have new congestion controls as long as they state their capacity-sharing assumptions (for example, Relentless TCP).

Matt and Bob are still trying to work out the core dilemma of whether or not to use flow isolation, and can't say anything until they resolve the issue between them. They may agree not to agree.

ConEx in brief: the network helps limit the congestion that different users/networks cause, so we can relax assumptions about what flows have to do. We can count the volume of congestion a user causes, as the bytes of packets marked with ECN or dropped. We should incentivize not causing congestion collapse and sharing capacity, rather than worrying about instantaneous flow-rate sharing. There could then be more diversity in individual responses to congestion. We can think about flow arrivals, traffic patterns, and lengths of flows rather than how two flows interact: it is about traffic rather than just flows.

Research questions that would be raised:
1. Scaling transport performance – more dynamic range – what ICCRG is already doing
2. Diversity within ranges of congestion responses
3. Whole-Internet work – what patterns and structure might emerge

Scaling transport performance:
The problem in TCP, as identified in the HS-TCP paper, is that the rate relies on the square root of the loss rate (a back-of-the-envelope sketch follows below). Greedy flows create certain loss rates to fill the capacity.
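As a rough illustration of that square-root relationship, a minimal back-of-the-envelope sketch in Python; the 1500-byte segment size and 100 ms RTT are assumptions for illustration, not figures from the presentation:

    # Simple Reno model:  rate ~ (MSS/RTT) * sqrt(3/2) / sqrt(p),
    # equivalently  p ~ 1.5 / W^2  where W is the window in segments.
    # MSS and RTT below are illustrative assumptions, not from the talk.

    MSS = 1500 * 8          # segment size in bits
    RTT = 0.1               # round-trip time in seconds (assumed)

    def reno_stats(rate_bps):
        W = rate_bps * RTT / MSS              # window needed, in segments
        p = 1.5 / W ** 2                      # loss fraction falls with 1/W^2
        secs_between_losses = (1 / p) * MSS / rate_bps
        return W, p, secs_between_losses

    for rate in (10e6, 100e6, 1e9, 10e9):
        W, p, t = reno_stats(rate)
        print(f"{rate / 1e6:8.0f} Mb/s  W ~ {W:7.0f} seg  p ~ {p:.1e}  "
              f"{t / 60:7.1f} min between losses")

With these assumed numbers, 1 Gb/s gives roughly the nine minutes between losses quoted below, and the interval keeps growing linearly with the rate; a response that scales with 1/p rather than 1/sqrt(p), as discussed later in the presentation, keeps that interval roughly constant.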
The loss rate goes as the inverse square of the window: as the window grows, the loss fraction goes down by the inverse square, and the recovery time between losses goes up (at 1 Gbps, about 9 minutes between losses).
To show "God is on our side", look at what Van Jacobson says (in the tech-report version he would have written, if it could have been done in time for SIGCOMM). This is like the millennium bug in TCP.

What is the real performance scaling problem? Not flow rates alone, but three aspects: flow rate, the number of flows increasing, and flow size. If two are held the same, see how the other varies.
- Holding the size constant: new flows bursting in slow start hit other continuing flows in congestion avoidance. We can grow capacity, but a flow needs longer to recover, and new flows keep beating it down, so we are bounded by the arrival rate of other flows.
- Holding the number of flows constant: a model from 2000 by Neal Cardwell shows smaller flows can get stuck in slow start.

Phillip Eardley – How does the graph tell you that? (Bob clarified.)
Tim Shepherd – So further investment in capacity does not let people go faster?
David McDysan – There are other things going on, such as CDNs, that lower the RTT and would counter this. It is not just transport, but router ports too.
Bob – But those are just step functions; we are still on the same scaling curve.

Imagine keeping the control frequency constant – feedback every 2 seconds – which implies a congestion control whose rate goes as 1/p. The amount of data between signals is proportional to the data rate, so the loss rate has to be inversely proportional to the data rate. If we want to scale geometrically, we need congestion control with a 1/p response, not 1/sqrt(p).
Tim Shepherd – We could also scale the segment size to go faster.
Matt – If segments are constant time rather than constant size, then we are back to constant frequency.

Delay sensing works better than loss.
Lachlan Andrew – This is exactly what Tom Kelly was pushing with Scalable TCP.
We are not saying this is new, but can we get consensus on it? We are trying to unify some things. The footnote is important: later work shows that with drop-tail queues, losses synchronize and come in bursts when the sawtooth hits capacity, and short RTTs pull down long RTTs. Things can be fine with AQM, but with drop tail there is a problem.
Delay sensing is not a panacea: as we scale up, the signal from delay becomes smaller relative to the control, so it cannot be done without network support as you go faster.
The challenge is not flows in equilibrium, but getting flows started (changing the equilibrium point).

Do we need flow isolation too? Bob and Matt have agreed not to agree on this.
Matt conjectures that if you partition flows in the network (maybe with WFQ), then different signals go to different flows, and the performance of flows is more predictable/dependable. But this fundamentally clashes with some of the things ECN does with weighted congestion control.
An alternative view to flow isolation: it is not just that flow isolation is really expensive to do in machines; you also have to determine what partition each flow should have, and how do you know what is fair between two ends?

Avoiding overshoot: rather than isolating flows, give end systems incentives not to cause congestion, and information to avoid congestion. Even with ECN, congestion signals are not enough, because you only get them once it is too late: no drop happens, but there is still an overshoot.
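Returning to the idea above of counting "the volume of congestion a user causes" as bytes of ECN-marked or dropped packets, a minimal sketch of that accounting; the packet fields and the meter class are purely illustrative, not from any ConEx specification:

    # Toy congestion-volume accounting: a user's congestion volume is the
    # number of bytes in their packets that were ECN-marked or dropped,
    # accumulated over time.  All names and fields here are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Packet:
        user: str
        size: int          # bytes
        ecn_marked: bool   # packet carried an ECN congestion mark
        dropped: bool      # packet was dropped at a bottleneck

    class CongestionVolumeMeter:
        def __init__(self):
            self.volume = {}     # user -> bytes of congestion caused

        def observe(self, pkt):
            if pkt.ecn_marked or pkt.dropped:
                self.volume[pkt.user] = self.volume.get(pkt.user, 0) + pkt.size

    meter = CongestionVolumeMeter()
    for pkt in (Packet("alice", 1500, ecn_marked=True,  dropped=False),
                Packet("alice", 1500, ecn_marked=False, dropped=False),
                Packet("bob",   1500, ecn_marked=False, dropped=True)):
        meter.observe(pkt)

    print(meter.volume)    # {'alice': 1500, 'bob': 1500}

The design point is that congestion volume accumulates over any timescale and over any mix of flows, so it can compare very different traffic patterns, which matches the presentation's emphasis on traffic rather than individual flows.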
The paper by Ion Stoica, "One bit is enough", makes a useful proposal: a utilization signal can be obtained rather than a congestion signal. Being in a high or low stat-mux queue affects the signals. Assuming ConEx is deployed, weighted congestion controls need work.
Dave McDysan – Would the scope include the overall cost/economics of the system?
Matt Mathis – Yes, but the IETF doesn't go there. There is a huge amount of work to be done; it is usually reported in research journals, which ends up being distilled into RFCs. In ConEx this could be a problem: we need to find economic models that can be used without implying the direction the Internet will go.
Question – ECN marking is difficult in some cases. In wireless, it is hard to get this kind of information, and the guidelines are rough.
Bob Briscoe – Congestion could be on lots of things, such as power; vendors can implement it differently.
Mathis – Flow isolation allows the network to control congestion, but changes how TCP operates.
Phil Eardley – On the cascaded-ISP slide: can you do shaping?
Mathis – It is hard to get the information, and it doesn't scale. You can't shape on ingress.
Question – Every link is connected to devices that have no knowledge of capacity. How can one do early marking if the capacity is unknown? The bit rate may not be stable over time.

==================

Jerry Chu and Nandita Dukkipati presented on increasing TCP's initial window.
http://www.ietf.org/proceedings/10mar/slides/iccrg-4.pdf

Jamshid Mahdavi – An initial burst of 10 segments will impact devices with small queues; we still see them.
Briscoe – We actually want to encourage short queues!
Jana – The slide says average bandwidth reached 1.7 Mbps worldwide. What does this mean? We need more detail: we need to know the bandwidth per connection, not per IP address, since clients could be behind NATs with many users.
Lars – Question on slide 11: what is "bandwidth"? (Answer: a running throughput average.)
Gorry – It seems a little strange to talk about a 10-segment window on ten flows for 5% of users. What happens to VoIP, etc.? The burst is of concern.
Jana – If Google switches its initial window to 10, that might be OK, but extrapolating to the entire Internet is another issue. Whether TCP in general should switch to an initial window of 10 segments is not clear.
Mathis – The results are counter-intuitive. RFC 2018 is not actually implemented fully, so a lot of thought experiments are not correct.
Randall Stewart – If the IW goes to 10 segments, will browsers open fewer connections?
Lars – It's nice to see actual data; thanks for bringing it here. (Big applause.) We must be careful with the results, as we must understand the data (agrees with Jana).
Jana – Make sure we read the data correctly.
Aaron Falk – (Cut off by Wes; take it to the mailing list.)
Wes – Everyone needs to read the paper and take discussion to the mailing list, so we can get to the other presentations.
Lachlan Andrew relinquished his slot to open discussion, so this talk could continue.
Aaron Falk – We need others to run similar experiments. This is looking at aggregated data; we may need to look at outlier cases, such as users with small buffers, etc. Who is getting poor performance? Are they always the same type of users?
Jamshid – Does the implementation contain a host cache? (Answer: no caching, as it was not considered useful.)
Chris Morrow – Nobody offered to help test this better. Google has the infrastructure and is interested in help to test this better.
Tim Shepherd – We still don't know if this is OK for Google to do. The critical question is how this affects non-TCP traffic and other users. How is this tested?
Bob Briscoe – Perhaps see when damage occurs by increasing the window, then go backward.
(Answer – Google is looking at some of this.)
Christian – This looks like a good optimization for Google search. (Answer – not just search; it is also good for Maps, etc.)
Gorry – You need to set this up on the client side and view what happens in the client-side stack.
Murali Sridharan – Microsoft is trying similar experiments and sees similar results. Perhaps Microsoft and Google can share data.
Pekka – You only have one vantage point; observations from the network are needed.
Lars – The community needs to get together to help out.
Question on the latency definition: three-way handshake data should be included in the measurements (it may already be in the data).
Mathis – Also measure subsequent connections, and things like SACK.
Uganda IETF Fellow – The proposal is probably bad for Uganda: 30,000 students share 40 Mbps of bandwidth, and the pipe is always full.
Mathis – If there is a single bottleneck, it will not change the goodput. Utilization is high, but the variance explodes.
Jana – I have seen 400 users behind one IP address. It happens all over the world in developing countries.
Lars – Maybe ICCRG should look at developing-country issues. All developing regions have purchasing power that lags content, so the need for content is greater than the ability to purchase bandwidth.
Mathis – This may be bad for a lot of people. The human impact may be in a part of the network "we" do not see often.
Bob Briscoe – This is better handled by the end system than by the network; the router can give information and incentives.

Session 2 – Wednesday

Mayutan Arumaithurai presented on draft-mayutan-ledbat-congestionarchitecture-00

The idea is to modularize congestion control into:
- Congestion detection
- Flow control
- Bandwidth estimation

Gorry Fairhurst – This is an interesting thought process; how far do you wish to really go with specifications?
Mayutan – This is a start; the main focus in the past has been on specific aspects, so modularizing may improve this.
Gorry – In TFRC we worked on loss events; by splitting this apart and thinking about it, you can show which things are the same. As an academic, I'm interested in this work.
Matt Mathis – You don't mention the Congestion Manager; it should be cited.
Mayutan – The CM complements our approach; we modularize on a per-flow basis, and it shares information between/across flows.
Matt – It should still be cited; some lessons apply here (RFC 3124).

Kevin Mills presented on "Study of Proposed Internet Congestion Control Protocols"

The study applies techniques from the physical sciences / physical systems to large distributed systems: reduced-scale discrete-event simulation, statistical experiment design, and multidimensional data analysis. It looks at switching Internet congestion control algorithms.
The parameter space is huge. A reduced-scale model (down to 20 parameters, 2 values per parameter) brings it down to running a million simulations 7 times (once per algorithm). 6 important parameters driving system behavior were identified. Techniques for reducing the space to cover make this large study possible.
The study looked at how algorithms recover from spatiotemporal congestion in large, fast networks and small, slower ones. The initial slow-start threshold made a big difference.
Euclidean-distance clustering shows which algorithms stand out; it tells us where there are big differences rather than what they are, and is used to determine what to look at in more detail. For example, it shows that Algorithm 3 induces a large retransmission rate as congestion increases. One can see outliers and confidence levels.
Some algorithms do considerably better than TCP in goodput when competing as congestion increases (BIC, HS-TCP, and Scalable).
C-TCP and H-TCP rank well (friendly to TCP and also perform well in general), as does FAST with alpha-tuning enabled.
The findings were not driven by comparing scores; the data were allowed to identify which questions made sense to distinguish the algorithms.
Scalable, BIC, and HS-TCP tend to reduce less than TCP (aside from BIC under some circumstances); they are unfair to TCP and unfair to new flows.
Matt Mathis – One of the problems with standard TCP is that it is unfair to itself; it can't fill a fast network. The real question is how much standard TCP loses compared to not having these other algorithms around.
C-TCP and H-TCP are fair to TCP; Scalable, BIC, and FAST are unfair to TCP and to themselves.
Consider studying FAST and FAST-AT more before deploying; they can have issues with loss, including loss of SYNs.
BIC is a lot of code; the others are not, but they may have periodic functions that have to run.
FAST and FAST-AT get to their maximum rate quickly. Recovery latency was best for C-TCP, next FAST, but FAST-AT was not so good; BIC was worst.
Recommendations:
- Users could benefit from these algorithms under certain circumstances, but they may not do better for any particular flow.
- C-TCP seems to provide the best overall properties.
- FAST and FAST-AT have appealing properties, but intense spatio-temporal congestion sends them into oscillation.
Matt – I pulled down the final report; it is awesome.
Kevin – It is 500 pages (mostly pictures).
Matt – A slightly different experiment would be nice: set up a network with a control category and an experimental category, swap the TCPs from standard in the experimental one, and compare results.
Kevin – We didn't run this, but the code, report, and models are freely available.

John Leslie presented on "Congestion Definitions"

This started as a rant: I don't believe we mean the same thing when talking about congestion. I am only concerned with what the word means, not with whether what we are talking about is good or bad.
Congestion originally meant everything was fine until we hit some capacity limit and all hell broke loose. Van Jacobson is famous for figuring out how to fix that and giving us the algorithm we have loved ever since (slow start, congestion avoidance, fast retransmit, fast recovery).
Random Early Detection, introduced later, changed what we are talking about: packets dropped can still be counted, but it is not the same thing. ECN changed this even more; now we are talking about marks and not drops. And then there is the Datagram Congestion Control Protocol.
Look at two fundamentally different things that look as if they are the same: more packets coming in than are forwarded at any given instant, versus the queue arrival rate exceeding the service rate. These are not the same thing.
There is a queueing-theory definition, a networking-textbook definition, an operator's definition, and an economic definition.
Would the group like to go through the exercise of finding different names for the different kinds of definitions on the list? A hand-raise was taken to see if this should happen on the list; nobody would be bothered by it. John will take it up on the list.

Lachlan Andrew gave a quick talk asking whether we believe Reno, for a given network, will always converge to a given average rate. What reasons are there for getting non-unique rates?
Matt – Things like turbulence have not been answered.
Lachlan – That averages out.
Matt – It is chaotic.
Lachlan – It averages out; though not predictable, there is a long-term average.
Matthew Kaufman – Differences in RTT?
Lachlan – If the RTTs are fixed, then there is an average rate.
Lachlan found a particular network (shown on the second-to-last slide) where flows can have different long-term rates depending on which starts first.
Matthew – Does the addition of noise fix this?
Lachlan – To a slight degree.
People would be very interested in hearing the results of this on the list.
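For context on the question Lachlan raised, a toy sketch of the kind of long-term average under discussion. This is a deliberately simple fluid AIMD model on a single bottleneck with synchronized losses (the capacity, increase step, and start offset are arbitrary illustrative choices, and this is not Lachlan's example); in this simple case the long-run averages come out essentially the same whichever flow starts first, which is exactly what makes the counterexample on Lachlan's slide interesting:

    # Two AIMD flows sharing one bottleneck with synchronized losses.
    # Each flow adds ALPHA per tick; when the combined rate exceeds the
    # capacity, both flows halve.  Averages are taken over the second half
    # of the run, long after both flows are active.  All constants are
    # arbitrary illustrative choices.
    CAPACITY = 100.0
    ALPHA = 1.0
    TICKS = 100_000

    def long_run_averages(start_offset):
        """Flow 1 starts at tick 0; flow 2 joins at `start_offset`."""
        r1, r2 = 1.0, 0.0
        total1 = total2 = 0.0
        samples = 0
        for t in range(TICKS):
            if t == start_offset:
                r2 = 1.0
            r1 += ALPHA
            if r2 > 0:
                r2 += ALPHA
            if r1 + r2 > CAPACITY:      # synchronized congestion event
                r1 /= 2.0
                r2 /= 2.0
            if t >= TICKS // 2:         # average over the second half only
                total1 += r1
                total2 += r2
                samples += 1
        return total1 / samples, total2 / samples

    print(long_run_averages(start_offset=0))       # flows start together
    print(long_run_averages(start_offset=5_000))   # flow 2 starts late

In this single-bottleneck model the two printed pairs are essentially identical, so the start order does not matter; Lachlan's point is that in some networks it can.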