Dropbox excessive outbound packets, Q: how to bandwidth limit

Discussion in 'Tomato Firmware' started by friedgreen, Sep 4, 2013.

  1. friedgreen

    friedgreen Networkin' Nut Member

    Getting alarms from syslog for 2500+ outbound packets when Dropbox uploads are underway. Does anyone have an idea how to slow those transfers down to a reasonable level that does not overwhelm the upstream path? Is there some kind of bandwidth limit that could be used?
    Looking forward to your ideas. thanks.
  2. Porter

    Porter LI Guru Member

    I've never heard of kernel messages that appear when your upstream gets saturated...

    Could you post a few lines of your log, please?
  3. friedgreen

    friedgreen Networkin' Nut Member

    It's syslog output; it just says the traffic went to dropbox.com, nothing to see. But the syslog server raises an alarm at over 2500 outbound packets, which can show when a LAN device has been hacked and is trying to send massive amounts of data. It has helped many times to catch this when the users were unaware it was even happening. I don't want to shut off the alarm or raise the threshold to, say, 3500, because an internal port scan runs about 2600 packets.
    I just want to limit the amount of packets going out to dropbox.com, keep it under 2500 a minute.

    Thanks for your reply.
  4. Porter

    Porter LI Guru Member

  5. friedgreen

    friedgreen Networkin' Nut Member

    Thanks for the link Porter. Connlimit does limit connections, but not packet streams as far as I can see. Maybe I'm not seeing something so please straighten me out promptly if I'm wrong, just like my wife ;-)
  6. koitsu

    koitsu Network Guru Member

    From your description of what you want, you're trying to find what's called rate-limiting (this is not the same as QoS, and anyone who tells you otherwise is wrong) -- you effectively want to be able to limit the speed/rate of traffic to a specific value (ex. 200kbit/sec). Rate-limiting is absolutely doable on Linux. Let me be clear here: I am talking about actual throughput rate-limiting, not connection count rate-limiting. Sometimes the term rate-limiting in this context is called "traffic shaping" (but this isn't entirely correct either, because that term is murky and starts to relate to QoS).

    The most commonly-used thing I've seen on Linux for rate-limiting of this sort is called tc (Traffic Control): http://linux-ip.net/articles/Traffic-Control-HOWTO/

    However, to my knowledge Tomato does not have support for tc. The QoS layer in Tomato, however, may have this capability -- but I do not use it so I do not know. I also know there are some known complexities (i.e. router reboots) with the QoS code but only when used with other features.

    connlimit does not rate-limit throughput (and this is where @Porter misunderstands) -- connlimit limits the total number of concurrent connections from an IP address or network range (e.g. a given address on the Internet is only allowed to have a total of 10 sockets connected to you; any more and you reject or drop new connection requests).

    I have no familiarity with Dropbox, so the question becomes "how" the kernel would be able to detect packets going to Dropbox. This is all I could find on Dropbox: https://www.dropbox.com/help/23/en

    If it uses a specific TCP or UDP port number, then it's doable without much issue. If it uses port 80 (HTTP) or port 443 (HTTP+SSL) then that's a problem, because rate-limiting rules applied to those ports would also affect your normal web browsing traffic. The only thing I see that's static there is the "LAN Sync" option on "port 17500" (it's nice that they don't say TCP or UDP, sigh... people...), but given previous context I'll assume TCP.

    The only other solution -- and it is half-ass because if Dropbox changes their network (expands or shrinks), the rules become outdated immediately -- is to apply rate-limiting rules to specific network ranges. Meaning, you would need to find out all the network blocks (ranges) that Dropbox uses for their service and make rules that apply to those ranges only (for TCP port 80 and 443).

    There is no way in rate-limiting to say "rate limit *.dropbox.com" (as in a domain name or wildcard address) -- IP networking does not work like that.

    If the core of the problem is that you have a single person on your network who is causing massive network problems for other people using the router due to their excessive usage, then "limiting Dropbox" is not the solution -- limiting that person is the solution, or solving it via social means (approach them and explain that what they are doing is destroying the network for others).

    I cannot help past this point.
    Last edited: Sep 5, 2013
  7. friedgreen

    friedgreen Networkin' Nut Member

    Thanks Koitsu for your help; yes, tc is what I'm trying to do. The users were warned, but the software for Apple devices automatically sets up auto-uploads to Dropbox, so every time an Apple device is plugged in, it removes any manual settings for the new Apple device. In the early morning when users show up and plug in, it creates an unusable upstream as they are all updating at the same time. The only way I can see to solve this is using packet limits to dropbox.com. I'll check out QoS a little closer.

  8. koitsu

    koitsu Network Guru Member

    You're going to need the following:

    1. tc support, which I do not think Tomato offers at this time,

    2. Combined with tc support, full knowledge of all of the dropbox.com IP address ranges that the Apple devices talk to. Trust me: Dropbox is so gigantic/huge that it won't just be a single IP address, they probably have lots of network ranges all over the place, and probably geographically balanced. And if they use something like AWS or a "cloud" service, then it's nearly impossible to tc this "for just the Apple devices" because you would be rate-limiting any traffic going to that "cloud" service. Maybe someone online has a list of the network ranges (CIDRs) that Dropbox.com uses, I don't know.

    tc + layer 7 filtering (for heuristics) would work, however L7 filtering is extremely slow, and causes a lot of headaches. I cannot stress this point enough -- a little residential router can easily lose 98% of its throughput due to all the CPU overhead when it comes to L7 filtering. So please try to stay away from using L7 filtering if possible.

    Or if the Apple devices can be configured to always send their traffic through a proxy server, you could set up a proxy server on the router and then rate-limit communication in/out of the proxy server's connections (meaning anything going through the proxy would be rate-limited, but anything not going through the proxy wouldn't be limited).

    Welcome to how tricky all of this is. It's a lot easier to solve this at a social level, if you ask me. Trying to solve social problems like this with technology never works, and if it does work, it never scales. That's the sad truth, I'm sorry to say.
  9. Marcel Tunks

    Marcel Tunks Networkin' Nut Member

    Somewhat related question:
    Does the Tomato QoS support wildcards in MAC address-based rules? If so, and you know that the Apple devices are the culprit, could you establish MAC - based rules using data from an OUI/MAC address database?

    Second, more related question:
    Does anyone know what tool/application ISPs use to throttle traffic in this way?
  10. gutsman7

    gutsman7 Networkin' Nut Member

    tc support is included in most Linux kernels and definitely in Tomato. First, if you want to address this problem, find out what port Dropbox is using and put restrictions on that port. If you want to limit the number of connections it uses, then use this rule:
    iptables -t nat -I PREROUTING -p tcp --dport (your port here) -m connlimit --connlimit-above 100 -j DROP
    This will limit the connections. Then, if you want to rate-limit the port, use tc:
    #limit inbound
    tc qdisc del dev br0 root
    tc qdisc add dev br0 root handle 1: htb
    tc class add dev br0 parent 1: classid 1:1 htb rate 10000kbit
    tc class add dev br0 parent 1:1 classid 1:10 htb rate 1000kbit ceil 2000kbit prio 2
    tc filter add dev br0 parent 1:0 prio 2 protocol ip handle 10 fw flowid 1:10
    iptables -t mangle -A POSTROUTING -p tcp --sport (port here) -j MARK --set-mark 10
    #limit outbound
    insmod imq
    insmod xt_IMQ
    ip link set imq0 up
    tc qdisc del dev imq0 root
    tc qdisc add dev imq0 root handle 1: htb
    tc class add dev imq0 parent 1: classid 1:1 htb rate 2000kbit
    tc class add dev imq0 parent 1:1 classid 1:10 htb rate 100kbit ceil 255kbit prio 2
    tc filter add dev imq0 parent 1:0 prio 2 protocol ip handle 10 fw flowid 1:10
    iptables -t mangle -A PREROUTING -p tcp --dport (port here) -j MARK --set-mark 10
    iptables -t mangle -A PREROUTING -j IMQ --todev 0
  11. Porter

    Porter LI Guru Member

    I'm going to bed now and therefore this is very short:

    Tomato's QoS system in fact does use tc.

    Please use the QoS-system and see how it goes. Use a recent Toastman for this. Put in your _measured_ bandwidth minus 15-30%. Don't fine tune anything on the Classification page.
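
    As a rough illustration of the "measured bandwidth minus 15-30%" rule, the arithmetic can be sketched in shell. The function name qos_limit and the 1000 kbit/s measured figure are made up for the example; plug in your own measurement.

```shell
#!/bin/sh
# Compute a QoS limit as measured bandwidth minus a safety margin.
# qos_limit is a hypothetical helper; 1000 kbit/s is an example value only.
qos_limit() {
    measured_kbit=$1   # measured upstream or downstream rate in kbit/s
    margin_pct=$2      # safety margin in percent (15 to 30 per the advice above)
    echo $(( measured_kbit * (100 - margin_pct) / 100 ))
}

qos_limit 1000 15   # conservative end of the range
qos_limit 1000 30   # aggressive end of the range
```

    The two results bracket the value to enter as the QoS bandwidth limit; starting near the 30% end and raising it while watching latency is one reasonable approach.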

    When hearing about your problem, synflood protection came into my mind. You can easily do this with iptables. Just brainstorming here, but could you substitute the SYN-packets with ACK-packets?
  12. koitsu

    koitsu Network Guru Member

    I wasn't aware tc was available on Tomato, nor was I aware the QoS implementation on Tomato used it. (I was under the impression the QoS stuff was some custom/proprietary thing written by some third party)

    Thank you, @Porter and @gutsman7, for correcting me. I appreciate it. I hope you guys can assist friedgreen going forward.

    But remember what I said above: from the look of it, Dropbox doesn't use a special port number, it just relies on TCP port 80 and 443 (as a destination port), which means the OP would need some way to determine "Dropbox traffic" vs. "non-Dropbox traffic".

    Layer 7 rules are not the way to go because of the processing overhead and some other reasons. So the only thing I can think of is for the Dropbox-identifying classification to be accomplished by a series of network ranges (e.g. -- I got that by doing a DNS lookup on dropbox.com and asked whois.arin.net who owned that netblock. But I can almost guarantee you there are way, way more network ranges than just that for their stuff, especially given how major they are as a provider; so I hope someone somewhere has a list of them all...) and then use -s and -d respectively to match against those ranges (not ports).
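
    A minimal sketch of the range-based idea, assuming the mark value 10 from the tc script earlier in the thread. It only prints the iptables commands rather than applying them. The two networks are RFC 5737 documentation ranges used as placeholders, not Dropbox's real netblocks, and gen_mark_rules is a hypothetical helper name.

```shell
#!/bin/sh
# Print (do not apply) mangle rules that mark traffic headed to a list of networks.
# PLACEHOLDER_NETS holds RFC 5737 documentation ranges, NOT real Dropbox ranges.
PLACEHOLDER_NETS="203.0.113.0/24 198.51.100.0/24"
MARK=10   # assumed to match a tc filter of the form "handle 10 fw"

gen_mark_rules() {
    nets=$1
    mark=$2
    for net in $nets; do
        for port in 80 443; do
            # Match on destination network plus destination port, then set the mark.
            echo "iptables -t mangle -A PREROUTING -p tcp -d $net --dport $port -j MARK --set-mark $mark"
        done
    done
}

gen_mark_rules "$PLACEHOLDER_NETS" "$MARK"
```

    Printing the rules first makes it easy to review them before pasting them into a firewall script. The list would still have to be maintained by hand whenever the provider's ranges change.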
  13. Porter

    Porter LI Guru Member

    tc will only bandwidth-limit your connection and not directly packet-limit your connection. It counts bytes and not packets.
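
    Since tc counts bytes, a packets-per-minute target has to be translated into a bit rate first. A rough sketch of that conversion, assuming every packet is the same size (1500 bytes is a typical Ethernet MTU; ppm_to_kbit is a hypothetical helper name):

```shell
#!/bin/sh
# Convert a packets-per-minute budget into an approximate kbit/s rate for tc.
# Assumption: all packets are pkt_bytes long; real traffic mixes packet sizes.
ppm_to_kbit() {
    ppm=$1
    pkt_bytes=${2:-1500}
    # packets/min * bytes * 8 bits, divided by 60 seconds, then by 1000 for kbit
    echo $(( ppm * pkt_bytes * 8 / 60 / 1000 ))
}

ppm_to_kbit 2500        # budget expressed with full-size packets
ppm_to_kbit 2500 576    # same budget with smaller packets
```

    The first figure is the rate at which full-size packets stay at 2500 per minute. Smaller packets at that same byte rate would exceed the packet count, so a byte-based cap can only approximate a packet-based alarm threshold.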

    I agree with you that it is very likely that you won't be able to distinguish dropbox-traffic from other port 80/443 traffic. Especially because I think dropbox uses https and you won't be able to use layer 7 filtering anyway.

    What I don't agree with is your very negative image of layer 7 filtering. It is possible to write fast patterns but it is a trial and error process because the libraries being used by the kernel and iptables have some "performance quirks" and you can't predict what's fast and what's slow. You will have to benchmark them. The tools for this are provided with the layer 7 source. But it's true, layer 7 filters are a last resort.

    Is the dropbox-traffic crippling your router or your connection? Please take a look at the Status site and look for "load". If it's just crippling your connection, I'd say QoS is the way to go and you probably don't need to use iptables.

    I'm still intrigued by the synflooding analogy.
    Take a look at those links, you should be able to cook something up:

    But please keep in mind that you could run into bigger problems: Since you probably won't be able to distinguish the dropbox-traffic from other port 80/443 traffic, packet-limiting it would most likely interfere with people's usual browsing habits, too. Maybe there's a way to instate those limits on a source IP basis, so the ones with dropbox are the only ones being slowed down.
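
    The synflood-style pattern mentioned above can be sketched with the iptables limit match: packets within the rate return to normal processing, the rest are dropped. This only prints the commands. The chain name DBX_LIMIT, the helper name, and the RFC 5737 placeholder network are assumptions for the example, and whether a given Tomato build ships the limit match is also an assumption.

```shell
#!/bin/sh
# Print (do not apply) a packet-rate limit in the style of classic synflood rules.
# DBX_LIMIT is a made-up chain name; 203.0.113.0/24 is a documentation range,
# not a real Dropbox netblock.
gen_pkt_limit_rules() {
    net=$1     # destination network to police
    rate=$2    # e.g. 2500/minute
    burst=$3   # packets allowed in a burst before the rate applies
    echo "iptables -N DBX_LIMIT"
    echo "iptables -A DBX_LIMIT -m limit --limit $rate --limit-burst $burst -j RETURN"
    echo "iptables -A DBX_LIMIT -j DROP"
    echo "iptables -I FORWARD -p tcp -d $net -j DBX_LIMIT"
}

gen_pkt_limit_rules 203.0.113.0/24 2500/minute 100
```

    Note that the limit match keeps one counter for the whole rule, so all clients share the 2500 packets per minute. A per-source-IP limit would need something like the hashlimit match, if the firmware build provides it. Dropped TCP packets get retransmitted, so expect uploads to slow down rather than fail outright.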
  14. friedgreen

    friedgreen Networkin' Nut Member

    Thanks a lot Porter and all you guys.
    Been messing around with QoS and now have a 1% 1% QoS class called ExcessOut. I created a QoS classification with dropbox.com as the IP entry and assigned it to that class. I have no idea if it will work, but it took the URL entry. I'll be keeping my eye on the charts to see if it works, but I'm guessing it won't.
    Since this is my first dive into QoS, I find it somewhat stupid that all port 80/8080 traffic to the config page is unclassified and cannot be classified after multiple attempts, so the graphs show unclassified data without a way to fix it as far as I can see. What good is a graph that you can't view without skewing the data?
    Also, my SSH into the site is shown as the p2p/bulk default class and also cannot be classified, i.e. all connections used to administer the site cannot be classified.
    FYI: Shibby v110 AIO on an RT-N16, if you're wondering.

    I'll take a look at the synflooding and post what I come up with, thanks.
  15. Porter

    Porter LI Guru Member

    Tried the URL entry myself and to my surprise this works:
    CONNMARK udp -- anywhere v-www-2b.sjc.dropbox.com multiport ports www,webcache,433 CONNMARK set-return 0x3100009/0xff
    CONNMARK tcp -- anywhere v-www-2b.sjc.dropbox.com multiport ports www,webcache,433 CONNMARK set-return 0x3100009/0xff

    BUT: DNS-lookups tend to change so this is nothing reliable. Letting your router reconnect once a day might help a bit.

    You don't need an extra class for this to work. It would have been sufficient, if you just classified it as p2p or crawl (which should be classes that have been readily configured).
    Keep in mind that you essentially kill dropbox for your users now. I certainly don't think this is a good idea.

    Please don't try to classify traffic from your pc to your router. This doesn't work since QoS only works with traffic that goes _through_ the router. It's just cosmetics that this traffic appears as unclassified.
  16. koitsu

    koitsu Network Guru Member

    My point about L7 filtering, re: that it's slow, is because L7 filtering requires that the entire payload of every packet be examined (which also means fragmented packets have to be reassembled first before they can be analysed properly -- unless the L7 filter algorithm actually doesn't defragment first, in which case that's fine but it still has to look at the full payload). There is no hardware-level support for this; it's done entirely CPU-side.

    To be clear: the entire payload of the packet has to be examined and compared to a regular expression, which is very expensive CPU-wise. These consumer-grade routers are not CPU powerhouses; even the expensive "high-end" Asus routers can easily be swamped by L7 filtering. It's always expensive (CPU-wise) to do this kind of filtering -- it doesn't matter how "good" or "accurate" the filter regex is, because it still has to examine the entire payload of every packet.

    This is why ISPs doing L7 filtering end up buying dedicated hardware devices (ex. Sandvine) to put on their network. Many of those have a combination of dedicated hardware to offload payload examination as well as hardware-level regex support.

    P.S. -- I don't want this thread to get OT (e.g. into a debate about the efficiency or lack thereof of L7 filtering). Just want to make that clear. If we differ on opinion, cool, no problem with that. But L7 filtering is a very expensive operation CPU-time-wise.
  17. friedgreen

    friedgreen Networkin' Nut Member

    Porter, I found that URLs and IPs both failed to catch and classify traffic unless the rules were placed above the layer 7 rules. I also found that a URL entry does not show up as a URL in my iptables via "iptables -t mangle -vnL" as in your output; instead it is converted to an IP. I thought it was not working at all when I tried to grep for the URL, so it was a surprise to find the converted IP instead. The DNS name also varies among smarthosts, so all known URLs would need to be entered. The only way I see to do this is by netrange, and that will change constantly just like DNS, so it's a no-go for me: too much work to keep up. I'm going to try some simple Netfilter settings that lower the timeouts so packets are cleared quicker and fewer connections sit idle. Will post results.

    SYN sent 20
    SYN received 20
    FIN wait 20
    Time Wait 20
    Close 20
    Close Wait 20
    Last Ack 20
    UDP Timeout - Unreplied 10
    Generic 20

  18. friedgreen

    friedgreen Networkin' Nut Member

    Koitsu, if my understanding of layer 7 is correct, it limits by packet type, not by URL. Since the packets are all HTTP/HTTPS, a layer 7 rule would limit all browsing. If layer 7 works differently, please explain how it could be used in this application. I am working with RT-N16s, so I would think there should be enough horsepower for just one little extra layer 7 rule.

  19. jerrm

    jerrm Network Guru Member

    Public DNS names are almost never a good idea in iptables rules for dest/source addresses. The firewall deals with IPs, not DNS names. Adding a rule for mydomain.com will only match the A records returned for the exact query "mydomain.com" when the rule is added, it does not necessarily block www.mydomain.com or mail.mydomain.com, etc. There is no guarantee the client PC will get the same answers for mydomain.com as your router did when the rule was added. Too many load balancing schemes use dns at some level and IPs can jump around. If the IPs involved are part of some CDN, then there is no telling what you will end up impacting.

    There really isn't a way for the firewall to use DNS names reliably. Even if iptables did a dns/rdns lookup for every connection it would still be flaky, and think of the performance issues.
  20. Porter

    Porter LI Guru Member

    What I don't agree with is that you make it sound like there is no scenario in which you should use layer 7 filters. There is no denying the fact that layer 7 filtering will always make your connection slower than a port filter. But that's it. How slow your connection will get depends on other circumstances, too. It essentially boils down to deciding whether layer 7 filtering is worth it in your specific scenario. For most people, being able to match media streams in HTTP is a real advantage because the streams won't have to compete with the downloads. Loading webpages takes a bit longer, but your streams actually don't stutter.

    Be careful with the timeouts! This can have unintended consequences.
    I'm still recommending default QoS settings and just monitoring how they improve your connection speed. We still don't know if your router can actually handle that many packets (in which case the syslog entry is basically irrelevant to you) or if it's just your connection that's getting saturated.
    Don't think about the layer 7 filters for now. Only think about them if your load gets too high while people are using Dropbox or are generally using your internet connection heavily.