Using QOS - Tutorial and discussion

Discussion in 'Tomato Firmware' started by Toastman, Dec 24, 2008.

  1. Gagan

    Gagan New Member Member

    Good day,
    I am new to networking,
    and it would be a great help if you could advise me on the matter below.

    I have bought a Netgear WNR3500LV2 and flashed it with the open-source Tomato firmware.
    It is working perfectly, but we have 30 users.
    Is it possible to set a data limit (quota) of 2GB per user on the router?
    I appreciate your help in advance.
     
  2. kthaddock

    kthaddock Network Guru Member

    One post is enough! Don't triple post!
     
  3. xUnDONE

    xUnDONE Network Newbie Member

    Toastman,

    Thank you! Very informative for a newbie like me! Will surely give it a go at this.

    A couple of questions:
    1. Will there be any issues with connections that have the same upload and download speed? Let's say 10Mbps for up and down.
    2. I want to divide the 10Mbps link between 2 school buildings. Would that mean I just have to set MAX INBOUND & MAX OUTBOUND to 5Mbps on each Tomato router?
    Thank you very much and looking forward to your response!
     
  4. pegasus123

    pegasus123 Addicted to LI Member

    I haven't really tried it myself; I'm currently on DSL running 5mb down / 1mb up. The way I understand it, by controlling upload you essentially control download as well.

    Since DSL has fairly low upload speed, this isn't usually a problem with QOS. However, what you raised is a valid question and I would like to find the answers as well.

    When setting the same amount of download and upload in QOS, if I understood it correctly, it's actually much easier to control now, since the latest versions of Tomato have a "true limit", meaning that whatever you set in QOS, Tomato will try to adhere to.

    This isn't usually a problem with normal use, but I can think of someone torrenting and causing havoc in the network with that kind of upload. You can still manage it by setting the default class to 1% upload, or maybe by limiting the number of UDP/TCP connections it can open.

    I am in no way an expert; this is just based on my understanding. Someone please correct me and add what I might not have thought of.

    You also asked about the BANDWIDTH limiter: you can use it to divide the link equally, or just use QOS class limits. It's up to you.
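
    For the two-building case asked about above, the even split is simple arithmetic. A minimal sketch, assuming 1 Mbps = 1000 kbit/s and an equal share per router; `split_link_kbit` is a hypothetical helper name, not a Tomato command:

```shell
# split_link_kbit TOTAL_MBPS ROUTERS -> per-router limit in kbit/s
# Assumption: 1 Mbps = 1000 kbit/s and every router gets an equal share.
split_link_kbit() {
    echo $(( $1 * 1000 / $2 ))
}

# 10 Mbps link shared by 2 routers: value to enter per direction on each one
split_link_kbit 10 2
```

    A fixed split like this means neither building can borrow the other's idle bandwidth, which is the trade-off of dividing the link at two separate routers.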
     
  5. lboucher

    lboucher Reformed Router Member

    Hello,
    first post here, sorry if my question is a rookie one, but I didn't find an answer in the forum.

    I'm running the latest Toastman ARM version (great!) with the default rules (I just added minor ones) on an Asus RT-N18U.

    It's about the DNS rule (for example), the first one at the top (dest port 53, small size, ...).

    First, I was confused by seeing some packets in the details view classified as P2P (with flag 255, which means default class), and others classified as service. So I modified the DNS rule: removed the size limit and replaced "dstport" with "port" to make it broader...

    Then, after two hours, a small light went on in my brain: classification is only for outbound rules! :rolleyes:

    But why on earth can I see some inbound packets in the details list? And why are they not unclassified?
    I can see some DNS responses, from OpenDNS port 53 to a random port on my router; some are marked P2P/bulk and others are marked service, and I cannot see any difference between them... I'm confused... :confused:

    It's not an issue, I just want to understand ;)

    regards
     
  6. pegasus123

    pegasus123 Addicted to LI Member

    Your router has to send an acknowledgement for each packet; you might think it's inbound only, but your router is always doing both.
    That is correct: QoS only works on outbound packets, as this is the only thing you can control. You cannot control what others are sending you.

    Changing the rule from dest port to port would not help, since your router will not respond from port 53 (WAN).

    Also try to analyze why port 53 is being put in the default class. You might have missed something.
     
    Last edited: Jan 8, 2016
  7. pegasus123

    pegasus123 Addicted to LI Member

    All, I have upgraded my line to a 50/50 fiber connection and QoS has now become irrelevant to me.

    I also found out that the Tenda N60 cannot handle 50mbps sustained throughput with QOS on without becoming a bottleneck. The load average skyrocketed, so it's better to leave QOS off.
     
  8. Porter

    Porter LI Guru Member

    That's incorrect. Most of the rules are simply designed to only catch traffic that's being initiated from inside the LAN.

    Could it be that your router works as a DNS cache for your local clients and intercepts their DNS requests? That might be one explanation. But just to put things into perspective: a few stray packets won't hurt your connection. If you are not seeing lots of connections classified incorrectly, I wouldn't worry.

    That's incorrect. I already explained above that iptables also marks inbound packets.
    When it comes to actual shaping of traffic, outbound traffic can be controlled much better. Inbound traffic is less well controlled which doesn't mean you can't control it.
     
  9. lboucher

    lboucher Reformed Router Member

    Hello,
    yes Porter, I'm using DNSmasq.
    OK, I found here:
    http://linksysinfo.org/index.php?threads/using-qos-tutorial-and-discussion.28349/page-9#post-204051
    You were saying "iptables matches whole connections, not just packets. A connection is a two way process, therefore you see both inbound and outbound packets."

    Yesterday, I restored the default rule, with "dst port 53". And today, I can see the same situation: some inbound packets are marked. That's logical, if I understand what you are saying: if the query is marked, the response is marked. OK.

    I can see something interesting: my inbound packets come in threes (same port, same size):
    src             sport  dst          dport  class    bo/bi
    208.67.222.220  53     192.168.1.2  13961  P2P      204/64
    208.67.220.220  53     192.168.1.2  13961  P2P      204/64
    208.67.222.222  53     192.168.1.2  13961  service  204/64
    208.67.222.220  53     192.168.1.2  19020  P2P      174/59
    208.67.220.220  53     192.168.1.2  19020  service  174/59
    208.67.222.222  53     192.168.1.2  19020  P2P      174/59
    208.67.222.220  53     192.168.1.2  48635  P2P      302/59
    208.67.220.220  53     192.168.1.2  48635  service  302/59
    208.67.222.222  53     192.168.1.2  48635  P2P      302/59

    In each triplet, only one is service and two are P2P...
    I don't see the outbound ones in the list.
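
    The one-service-two-P2P pattern can be tallied mechanically from rows copied out of the details list. A minimal sketch, assuming the rows are saved as whitespace-separated text in the column order shown above (source IP, source port, destination IP, destination port, class, bytes) without a header line; `count_by_port_and_class` is a hypothetical helper name:

```shell
# count_by_port_and_class: reads rows on stdin and prints
# "destination-port class count", one line per combination, sorted.
# Assumption: field 4 is the destination port and field 5 is the class.
count_by_port_and_class() {
    awk 'NF >= 5 { n[$4 " " $5]++ } END { for (k in n) print k, n[k] }' | sort
}

# Example with the first triplet from the list:
printf '%s\n' \
    '208.67.222.220 53 192.168.1.2 13961 P2P 204/64' \
    '208.67.220.220 53 192.168.1.2 13961 P2P 204/64' \
    '208.67.222.222 53 192.168.1.2 13961 service 204/64' |
    count_by_port_and_class
```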

    I'm using DNSmasq without options (so without strict-order, I think), so there must be times when the dnsmasq query is sent to all three OpenDNS resolvers, and the fastest one becomes the default for a while: http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2009q3/003295.html

    I don't understand why only one is marked as service, but no problem, it doesn't hurt :)

    Thank you.
     
  10. pegasus123

    pegasus123 Addicted to LI Member

    ^ tnx for correction. appreciate it Porter ^^
     
  11. cloneman

    cloneman Addicted to LI Member

    Can anyone comment on ICMP/Ping classification? When I go to view details page I can't locate ICMP traffic. Is it because ICMP is not a "connection"?

    How is it being treated if it doesn't appear on this page?
     
  12. pegasus123

    pegasus123 Addicted to LI Member

    Noticed this too; however, you can see the ICMP count in the IP Traffic details under the ICMP column.
     
  13. cloneman

    cloneman Addicted to LI Member

    Alright, I see them in the IP traffic page, but that doesn't provide any insight as to how QoS handles it
     
  14. menot2016

    menot2016 New Member Registered

    Can anyone help me set up QoS properly? There are people gaming, streaming, and downloading on a 2mbps down connection, so I really need to be efficient here because there's not a lot of bandwidth to work with.

    Here are my settings:
    [IMG: QoS settings screenshot]
    I put game traffic on High, TCP 80,443 from 0-512KB on Medium for websites and whatnot, TCP 80,443 from 512KB+ on Low, and Crawl as default.

    The problem is that the games still lag even though game traffic is in the top class. I tested this with a download on my PC and a game on the other. The game definitely gets the ~40kbps it needs according to the pie chart, and everything is getting classified correctly, but the torrent still gets around 120+kbps, which is probably too much. If I put an upper limit of around 50% on the Low class, I can game properly.

    I'm on
    Tomato Firmware v1.28.0508 MIPSR2Toastman-RT-N K26 Std
     
  15. cloneman

    cloneman Addicted to LI Member

    If you're on DSL, have you enabled the DSL overhead value? Just choose any of the values from the dropdown.

    You shouldn't need to have a high minimum for your #1 class, although it's not related to the issue you are having.
     
  16. WaLLy3K

    WaLLy3K Networkin' Nut Member

    The torrenting issue appears to be a concern, hinting that something might not be configured correctly.

    I'm presuming you're not using anything to classify torrent traffic (recommended, unless you specifically set a static port in the torrent client), which means it should fall under the crawl class, which should be capped at 750kbit/s (93.75 KB/s) but is instead getting 120+ KB/s.

    What does your rules page look like?
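
    The unit conversion behind those figures, as a minimal sketch. It assumes the class limits are entered in kilobits per second while the download client reports kilobytes per second, with 8 bits per byte; both function names are hypothetical:

```shell
# kbit_to_kbyte KBIT -> same rate in kilobytes per second (two decimals)
kbit_to_kbyte() {
    awk -v kbit="$1" 'BEGIN { printf "%.2f\n", kbit / 8 }'
}

# kbyte_to_kbit KBYTE -> same rate in kilobits per second (integer input)
kbyte_to_kbit() {
    echo $(( $1 * 8 ))
}

kbit_to_kbyte 750    # the 750 kbit/s cap expressed in KB/s
kbyte_to_kbit 120    # an observed 120 KB/s expressed in kbit/s
```

    If the observed figure really is in kilobytes per second, it is above the cap; if it is in kilobits per second, it is well below it, so the units need to be settled first.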
     
  17. menot2016

    menot2016 New Member Registered

    I want to correct myself,
    "but the torrent still gets around 120+kbps which is probably too much."
    Not the torrent, but the download I started, which falls under the "Low" class; it's classified correctly and gets the appropriate ~120 kbps limit I set for it. The problem is that when I start my game the ping is pretty whack at 300-500ms.
    Also yes, I am on DSL, but I have no idea what these DSL overhead options mean! I've read through the guide and didn't find any info on it, unless I skimmed past it. My modem is a ZyXEL P600, if that matters.
    [IMG]
     
  18. WaLLy3K

    WaLLy3K Networkin' Nut Member

    TCP/UDP and Port Src/Dst for Steam Games would be your biggest concern that I can see (To be fair though, I probably can't see much as it's 12am here ;)). Also compared to my Steam entry, you could probably add a few more ports.

    My QoS entry FWIW:
    Code:
    1200,3478,4379-4380,27000-27100
    As for DSL settings, I find it's a bit hit and miss. With DSL overhead disabled, entering a QoS ceiling based off a web speedtest result will be accurate (That is, Mbps converted to Kbit). To each their own, however!
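
    To check whether a port seen in QoS > View Details is covered by a port list like the one above, a small helper can do the comparison. A minimal sketch, assuming the list uses commas between entries and a hyphen for ranges, as in the entry above; `port_in_list` is a hypothetical name and has nothing to do with how the firmware itself matches ports:

```shell
# port_in_list PORT LIST -> exit status 0 if PORT is covered by LIST
# LIST is comma-separated; an entry is either a single port or LOW-HIGH.
port_in_list() {
    port=$1
    for entry in $(echo "$2" | tr ',' ' '); do
        case $entry in
            *-*) low=${entry%-*}; high=${entry#*-} ;;
            *)   low=$entry; high=$entry ;;
        esac
        if [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]; then
            return 0
        fi
    done
    return 1
}

steam_ports='1200,3478,4379-4380,27000-27100'
port_in_list 27015 "$steam_ports" && echo "27015 is covered"
port_in_list 27101 "$steam_ports" || echo "27101 is not covered"
```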
     
  19. menot2016

    menot2016 New Member Registered

    Thanks, but I only play Source games and these have the clientport command to specify which port I want them to use which works, I think lol. I'm getting around 40kbps on the high class whenever I game so I think it's captured in the rule.

    Is anyone familiar with the options for DSL overhead?
     
  20. WaLLy3K

    WaLLy3K Networkin' Nut Member

    I believe I understand where you're coming from, but the Source engine's client-server network architecture utilises both TCP and UDP. Your traffic isn't being classified correctly, because your rules specify only UDP and only the destination port.

    It could definitely account for why you're seeing 300ms+ ingame. To be fair, I feel I should have better explained this though. :)
     
  21. menot2016

    menot2016 New Member Registered

    Hmm, I see. I never expected it to use TCP. Good idea.
     
  22. lboucher

    lboucher Reformed Router Member

    Hi,
    I think you can diagnose this by clicking QoS > View Details during the game. You will see exactly which port, protocol and QoS class your game is using.
     
  23. menot2016

    menot2016 New Member Registered

    Thanks for the tip. The ports do check out correctly.
     
  24. cloneman

    cloneman Addicted to LI Member

    Try any option for the DSL overhead value. In my experience, it doesn't matter which one you choose, as long as you turn it on. For example, choose the first one, 32-PPPoE.

    This will reduce your bandwidth somewhat but might fix the problem you are having.
     
  25. 1activegeek

    1activegeek Connected Client Member

    Evening from US-CT. I'm hoping the collective geniuses here can help me iron out my issue, or more importantly, point me in the directions to be able to solve my problem or identify the root cause. I'm assuming I may get a handful of stupid responses, so I take those in jest. I may have missed some clearly important things, but I've read A LOT of threads on QOS and specifically focus on the answers from folks like Toastman, Shibby, and the folks who make these firmwares. That said, here is my debacle:
    • Set up QOS to smooth out operations with all the things I'm running
    • Configured it following most of the rulesets and general guiding principles as Shibby outlined at the start of this thread - I've read the updates, which seem to be alterations mostly for his specific uses and focused on handling of P2P traffic
    • Some priority traffic types since I work from home: GoToMeeting, Google Voice, Skype, and HTTPS traffic
    • Bandwidth (measured over different timeframes and without QOS active): the minimum I ever see is about 125mb down and 10-11mb up (skipping the normal ramp-up during tests) - basically it hits the spike and comes down sometimes, but usually no lower than 125/10
      • Based on this, I set QOS bandwidth to 8500kbit up and 100000kbit down, which should be well within the suggested 15%-below setting, since I ordinarily see even higher rates like 150/15
      • Screenshots below with % settings, which I believe I've set correctly
      • Also included screenshots of the Classification section highlighting all the affected pieces of my setup and the most important items, including HTTPS, which is above all the other heavy consumers
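
    The 15% margin mentioned in the list above works out as follows. A minimal sketch, assuming 1 Mbps = 1000 kbit/s and whole-number inputs; `qos_limit_kbit` is a hypothetical helper, not part of Tomato:

```shell
# qos_limit_kbit MEASURED_MBPS [MARGIN_PCT] -> suggested QoS limit in kbit/s
# Assumption: the limit is the measured minimum rate minus a margin (default 15%).
qos_limit_kbit() {
    margin=${2:-15}
    echo $(( $1 * 1000 * (100 - $margin) / 100 ))
}

qos_limit_kbit 10     # upload: measured minimum of 10 Mbps
qos_limit_kbit 125    # download: measured minimum of 125 Mbps
```

    The upload result matches the 8500kbit setting; the 100000kbit download setting is below what the formula gives, so it is more conservative than the 15% rule requires.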
    Most of the time this works fine and there are no issues. My Skype, GTM, and Voice work flawlessly even while large files are being uploaded/downloaded at times. I also serve up Plex content to my family, which can now happen during the day without disruption either. It's great ... until ...

    The problem: If I have something large consuming my Outbound pipe connection (think backups, or multiple Plex streams, etc) and I additionally have a large consumption start of my Inbound pipe with large file downloads for example NNTP/NZB downloads, the whole thing goes to hell in a hand basket. The network becomes very slow and "unresponsive", the router doesn't respond until the whole situation is over (even though download/upload is happening fine, aka it's not crashing I don't believe), and it even affects wireless clients trying to connect or getting disconnected as the "storm" rolls through the network.

    Any direction that someone can provide to help understand a way to fix this if they've had it happen, or a way I can start digging into specific logs to find errors of importance to try and nail this down would be greatly appreciated.

    [IMG x6: screenshots of the QOS % settings and the Classification section]
     
    Last edited: Mar 24, 2016
  26. WaLLy3K

    WaLLy3K Networkin' Nut Member

    If your Internet connection sees 100Mbit down and you're seeing issues when you're initiating large file downloads, it's very possible that you're pushing the limits of your router hardware under Tomato. The easiest way to test this is to jump into an SSH session (Or use Tools > System Commands), and use the following command to check the CPU's sirq percentage:

    Code:
    top -bn 1 | head -2
    If you're finding that sirq is hitting 100% during large downloads, then you definitely are hitting the limits of your router. Unfortunately, there's not much you can do to resolve this - except for upgrading router hardware (and keep using Tomato), or use QoS on stock firmware that may have better optimisations for your router.
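
    To pull just the sirq figure out of that output, the line can be filtered. A minimal sketch, assuming the BusyBox-style summary line of the form `CPU: ... N% sirq` (other builds of top format this line differently); `sirq_pct` is a hypothetical helper name:

```shell
# sirq_pct: reads top output on stdin and prints the number in front of "sirq".
# Lines without a "N% sirq" field (such as the Mem: line) print nothing.
sirq_pct() {
    sed -n 's/.*[^0-9]\([0-9][0-9]*\)% sirq.*/\1/p'
}

# Example with a captured line rather than a live router:
echo 'CPU:   0% usr   0% sys   0% nic   0% idle   0% io   0% irq  99% sirq' | sirq_pct
```

    On the router itself the input would come from the command above, for example `top -bn 1 | head -2 | sirq_pct`.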
     
  27. 1activegeek

    1activegeek Connected Client Member

    Is there a way to record this or view it retroactively? When it hits, I can't do much of anything, including open SSH sessions let alone the browser interface. I've tried multiple times to login as I notice it happening, but I'm always too late. Any type of way to record this output over a period, and then I can try to force the condition to happen?

    I apologize as I'm aware of what the goal is, but I'm not familiar with using CLI tools like top for monitoring/viewing resources.
     
  28. WaLLy3K

    WaLLy3K Networkin' Nut Member

    You could set a custom task via Administration > Scheduler for say 5pm, and run the following command to log the result to your syslog:

    Code:
    logger `top -bn 1 | head -2`
    You would start up the download maybe one minute beforehand, and stop it 30s after 5pm (or whatever time you choose). However, I'm concerned that if you can't use OpenSSH or browse the web GUI when you're running a large download, then that basically confirms what the issue is.

    Still, irrefutable confirmation is good. You'd then test again with QoS disabled and compare the two.
     
  29. koitsu

    koitsu Network Guru Member

    The easiest way I can think of to do this is to log the data to a file on the filesystem. Put this into /tmp/sirq.sh and make it executable:

    Code:
    #!/bin/sh
    while true; do
      echo `date; top -b -n 1 | head -2 | tail -1`
      sleep 1
    done
    
    Easiest way to do that:

    Code:
    root@gw:/tmp/home/root# cat > /tmp/sirq.sh
    {copy paste above here}
    {press Enter, then Ctrl-D}
    root@gw:/tmp/home/root# chmod 755 /tmp/sirq.sh
    
    Then run it as a nohup'd process that's been backgrounded (&) so that it'll stay running once you log out of the router, as well as writing its output to /tmp/sirq.log:

    Code:
    root@gw:/tmp/home/root# nohup /tmp/sirq.sh > /tmp/sirq.log &
    
    Be aware that top sometimes takes multiple seconds to work/run, so you may find that the output doesn't happen once every second -- nothing anyone can do about that, sorry.

    The whitespace/alignment within top will be lost using this method (because of the echo `commands` methodology to get date + interrupt usage percentage on a single line). But that shouldn't matter -- all you care about is the last "XXX% sirq" portion.

    To kill this background process off, log back into the router and do:

    Code:
    root@gw:/tmp/home/root# killall sirq.sh
    
    IMPORTANT: Keep in mind that /tmp is RAM on Tomato, so you should not let this run indefinitely or for long periods of time (it's going to output roughly 8KBytes every 2 minutes). Be sure to remove the log file when you're done (rm /tmp/sirq.log). You can remove the script (rm /tmp/sirq.sh) too.

    IMPORTANT: DO NOT forget to kill off the process (i.e. do not just remove the log file and think that's enough -- it isn't!). Failure to do so, even if removing the log file, will not free up /tmp (free up RAM) because the file descriptor is held open until the process closes it or exits. This is how *IX filesystems and kernels work. So just make sure you kill the process off when done and remove the log file too, don't just remove the log file. :)
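
    One way to reduce the risk of filling /tmp is to make the loop stop by itself after a fixed number of samples. A minimal sketch of that variation, not the script above: `cpu_line` and `log_bounded` are hypothetical names, and the sample count and interval are arbitrary values to be chosen by whoever runs it:

```shell
#!/bin/sh
# cpu_line: the same top pipeline as in the script above, wrapped in a function.
cpu_line() {
    top -b -n 1 | head -2 | tail -1
}

# log_bounded MAX_SAMPLES INTERVAL_SECONDS
# Prints one "date + CPU line" per sample, then exits on its own,
# so the log cannot grow past MAX_SAMPLES lines.
log_bounded() {
    max_samples=$1
    interval=$2
    i=0
    while [ "$i" -lt "$max_samples" ]; do
        echo "$(date) $(cpu_line)"
        i=$((i + 1))
        if [ "$i" -lt "$max_samples" ]; then
            sleep "$interval"
        fi
    done
}
```

    Adding a final line such as `log_bounded 300 1` to the script would cap the log at 300 lines, roughly five minutes of samples if top returns promptly. The log file still has to be removed afterwards.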
     
  30. 1activegeek

    1activegeek Connected Client Member

    Thanks wally3k and koitsu, I'll work on testing this out this weekend when I have a moment. I'm guessing it is likely a limitation of the hardware as well. I'm running this on an Asus RT-N66U currently, which has served me well through its life. It may just be time to look at upgrading.

    Is there something in particular I should be looking into if this is a hardware issue? More RAM, more internal storage, a stronger chipset (more GHz)? Or will this SIRQ output help provide me the insight into what hardware specifically is failing me or maxing out? Alternative thought as well: if it's simply CPU, is there any way to push some overclocking?
     
  31. koitsu

    koitsu Network Guru Member

    Your question is kind of strange -- it's like you're treating your purely embedded router as a PC. It's not like you're going to be able to replace (upgrade) some particular component on it. I think what you might be asking is "what is sirq", which I've answered in the past.

    The issue is not "a hardware problem", it's that these devices really aren't designed to be doing all of the "stuff" that Tomato does. QoS happens to be the worst of the bunch (IMO). So if you're looking for something that works better, is more reliable, etc., then I'd suggest looking at devices that advertise this from the get-go and sticking with the firmware/OS that the vendor provides natively. You may want to look into products like Ubiquiti's stuff, or just use Asus's native firmware, or possibly consider the AsusWRT-Merlin firmware and see if it performs better for you. We cannot upgrade the Linux kernel in Tomato, which could potentially relieve some of these issues, because the switching/Ethernet and wireless drivers provided by Broadcom are binary blobs (i.e. proprietary) and tied directly to that (ABI breakage otherwise would be pretty much guaranteed).

    In other words: everyone's needs/wants are different. My RT-N66U, for example, isn't experiencing any of these issues -- because I intentionally avoid/do not use features like QoS and many other things/features. That's just how I work, it's not necessarily how other people work. For example, QoS seems to be your main focus/need, while for me it's not a focus/need at all.

    Do not overclock. I cannot stress this enough. There are several posts on this forum showing signs of actual router instability (kernel panics or spontaneous reboots/power loss -- impossible to determine which -- or issues with USB reliability, etc.) that were caused by overclocking. In fact, there are even a couple cases of people having stability problems unless they underclock, which is much more indicative of an actual hardware problem (possibly router overheating due to improper HSF mounting at the factory, or router is placed in a physical location that doesn't provide sufficient airflow). Again: do not overclock.
     
  32. WaLLy3K

    WaLLy3K Networkin' Nut Member

    Interestingly, I read his post a little differently. "If I were to upgrade router since this has seen me through some time now, what aspect of hardware (RAM, CPU, etc) should I be looking at in a new router, to get better QoS performance?"

    In short, a dual-core ARM CPU above 800MHz would fit the bill - a Netgear R7000 or better would give 1activegeek the headroom to use QoS on a (roughly) 200-300Mbit connection before maxing the hardware out.

    They could however (as mentioned) look into Merlin's AsusWRT firmware and see if it suits, since the build should be better optimised for that hardware.

    There's also the option of using a dedicated (low power?) PC as a router, but that can be overkill depending on one's needs.
     
  33. 1activegeek

    1activegeek Connected Client Member

    Thank you again wally3k and koitsu - this just confirms it. Tested it out finally and it seems it's definitely the sirq value maxing out. I appreciate the inputs koitsu on the hardware piece - but wally3k supplied what I was thinking of. I wasn't necessarily thinking I could just upgrade something internally to make it better, I was looking more into what hardware could potentially support it as an upgrade. Looking into the R7000, I started thinking about some of the other stuff I wanted to start doing as well (security guys love snort/splunk) - so I'm actually going to look at and think about potentially setting up a pfSense box and let that handle the QOS instead. Leave the basic networking and such to the router (wifi as well), and then that way I can take the load processing off the router completely and gain some other benefits I don't currently have yet.

    As a benefit to help anyone else identify if potentially this could be their problem, below is the output of a few lines from the logging suggested above, to see the ramp up into the non-responsive network I spoke of:

    Tue Mar 29 00:37:29 EDT 2016 CPU: 0% usr 0% sys 0% nic 90% idle 0% io 0% irq 9% sirq
    Tue Mar 29 00:37:31 EDT 2016 CPU: 9% usr 18% sys 0% nic 54% idle 0% io 0% irq 18% sirq
    Tue Mar 29 00:37:32 EDT 2016 CPU: 0% usr 0% sys 0% nic 100% idle 0% io 0% irq 0% sirq
    Tue Mar 29 00:37:33 EDT 2016 CPU: 9% usr 0% sys 0% nic 81% idle 0% io 0% irq 9% sirq
    Tue Mar 29 00:37:38 EDT 2016 CPU: 0% usr 0% sys 0% nic 0% idle 0% io 0% irq 99% sirq <<<<<<<<
    Tue Mar 29 00:37:55 EDT 2016 CPU: 0% usr 0% sys 0% nic 0% idle 0% io 0% irq 100% sirq <<<<<<<
    Tue Mar 29 00:38:09 EDT 2016 CPU: 0% usr 1% sys 0% nic 0% idle 0% io 0% irq 98% sirq <<<<<<<<

    You can clearly see the time gap as well, and the process itself actually somehow got killed as it stopped logging after the last line, and when I was able to get back into the router, killall command didn't work and I couldn't find the process in top at all. Removed the files and restarted for good measure. Interesting to note though is the router uptime is still strong, so I know the router itself didn't actually reboot in the process.
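
    For a longer log, the saturated samples can be counted instead of read by eye. A minimal sketch, assuming log lines in the format shown above, where the value in front of the word `sirq` is a percentage; `count_saturated` is a hypothetical helper and the default threshold of 90 is an arbitrary choice:

```shell
# count_saturated [THRESHOLD] : reads log lines on stdin and prints how many
# of them have a sirq percentage at or above THRESHOLD (default 90).
count_saturated() {
    awk -v limit="${1:-90}" '
        {
            for (i = 2; i <= NF; i++) {
                if ($i == "sirq") {
                    pct = $(i - 1)
                    sub(/%/, "", pct)      # strip the percent sign
                    if (pct + 0 >= limit + 0) hits++
                }
            }
        }
        END { print hits + 0 }
    '
}
```

    Usage would look like `count_saturated 90 < /tmp/sirq.log`. Trailing markers such as the `<<<<` arrows added by hand above do not affect the count, because only the field directly in front of `sirq` is read.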
     
  34. koitsu

    koitsu Network Guru Member

    Yeah, I can assure you that snort/splunk is a complete/entire massive overkill for any router, including ARM. You need an actual PC for this. I don't think you realise how absolutely insanely CPU-intensive snort and splunk are. In fact, just the mention of splunk makes me cringe, thinking back to my past two jobs. God what a nightmare.

    Yeah, you need a full on PC.
     
  35. 1activegeek

    1activegeek Connected Client Member

    koitsu - thanks for the confirmation, ya I do realize the overhead those will require. I may not go the Splunk route, but def the SNORT. I never went down that route, because I knew the router couldn't handle it. I'm thinking now though if I potentially need new hardware to handle QOS properly for my connection anyways, why not separate out from router. Did I mention I'm also running OpenVPN on it? ;) Moving all this to a single dedicated box I think makes more sense since my network requirements/desires are outgrowing the hardware capabilities of my RT-N66.

    Nice thing is I have an ESXi box with 4 NICs and I can easily dedicate 2 to being the in/out traffic ports and configure the rest of the VMs (which this backup traffic and NZB traffic is running from) to route right through vSwitches instead of having to ring them around on the actual network. Easy DMZ, more security, all by switching 3 wires.
     
  36. Paul Shingledecker

    Paul Shingledecker New Member Member

    I'm not exactly a newbie to Tomato and QOS, as we have been using it for 10 years or so. I have read a lot of the QOS posts over the years and feel I have something of a handle on how to set it up for basic usage. However, there is one issue we have that I can't find mentioned in any of the posts. I work for a radio station in Haiti where we are using limited Internet connections for normal browsing, downloads, etc. as well as for streaming our audio from the station to a server to be distributed on the internet. Protecting this upload stream is my challenge.

    The details are that we have a limited connection, 4 meg down and 1 meg up at best, sometimes a little less. I have the outbound limit set at 600 and the inbound at 3000.

    I have a Streaming class as the second class after Service and give it 40% min and 100% max. Everything else including VOIP, Remote, and browsing are lower and pretty much follow the norms recommended by Toastman.

    Everything works pretty well including fairly decent browsing latency and acceptable file transfer etc. And the Streaming normally gets the 130 kbits it needs but every few minutes when there is a large demand and downloads spike for an instant, suddenly the available upload bandwidth for the streaming drops to 80 or 90 kbits and the audio stream drops out for a second or two. Then everything rights itself and the audio is OK again.

    My question is, is there any way to protect that 130 kbps audio streaming upload from these temporary dropouts - which are very annoying to the thousands of listeners following the stream! (I also have a similar problem protecting my VOIP telephones but I'm most concerned about the streaming.)
     
    Last edited: Apr 13, 2016
  37. cloneman

    cloneman Addicted to LI Member

    You should stress-test your setup to determine where your QoS is being ineffective. A properly tuned QoS should survive an aggressive situation where, for instance, you start 3-4 downloads at once or 3-4 uploads at once, while running a ping or jitter test. An FTP server is useful here. My favorite test is the VoIP test @ visualware. http://myspeed.visualware.com/index.php. You'll want to create a rule that places the test in the same class as the bandwidth you want to protect for the simulation to work.

    If you notice you can affect the jitter and latency a lot (or introduce packet loss) with aggressive file transfers, then something is wrong with your QoS. When I had a 5/1 wireline connection, I was able to maintain 0.2% packet loss and 10ms jitter during high stress/congestion situations. (normal unstressed, 1ms jitter)

    A few things:

    - Make sure your version of Tomato is recent enough (~2013ish) to have the new ingress (download) speed management. If you have a Min and Max in your download section, you have the right version. If you have only one column, you have the old download QoS manager.

    - If downloading is somehow affecting your upload speed, try selecting a DSL overhead value. I think this reserves more bandwidth to offset unknowns. When I use it, my bulk traffic moves out of the way quite a bit.

    - As long as you've set a minimum that gives it enough, you have nothing else to configure; the maximum isn't really important. If you set minimum 40%, that class can always make a claim to 40% of the bandwidth, and that traffic should not be dropped or queued.

    Finally, if you find out that despite your best efforts it still drops sometimes, it might be due to the bandwidth fluctuating too much. In that case, your only option is a 2-class system where you only have 2 classes - your radio and "bulk". You set everything else to "bulk" with a 50% cap to ensure everything else will never exceed 300kbps, which allows you to survive a situation where your connection suddenly becomes unable to deliver more than 500kbps. In that case though, you might be better off using the bandwidth limiter and turning off QoS. Try the other options first to find the root cause.
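
    The percentages above translate into absolute rates as follows. A minimal sketch using the 600 kbit/s outbound limit and the 40% streaming minimum quoted earlier in this exchange; `class_kbit` is a hypothetical helper name and the arithmetic is integer-only:

```shell
# class_kbit LIMIT_KBIT PERCENT -> that percentage of the configured limit, in kbit/s
class_kbit() {
    echo $(( $1 * $2 / 100 ))
}

class_kbit 600 40    # guaranteed minimum for the streaming class
class_kbit 600 50    # ceiling for a "bulk" class capped at 50%
```

    With a 600 kbit/s limit, the 40% minimum is 240 kbit/s, comfortably above the roughly 130 kbit/s the stream needs, and the 50% cap is the 300 kbit/s figure mentioned above.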
     
  38. Paul Shingledecker

    Paul Shingledecker New Member Member

    Cloneman, thank you for your answer. I haven't had time yet to do the tests you suggest. It is not quite as urgent as before because we now have another ISP back up and running that handles the streaming exclusively. However, I still plan to work on the QOS on the other connection for the eventuality that we again have to use it. (ISPs in Haiti are notorious for being off for days or even weeks at a time!) So I will do the tests you suggest and will post back when I know more.
     
  39. Scubasteve2365

    Scubasteve2365 New Member Member

    Hello,

    I'm not sure if I should be posting this in this thread or a separate one. I am running Advanced Tomato on a Netgear R7000. For at least a year, maybe longer, I have had QoS set up and working well. My ISP (cable) package was a 30/5 package. With QoS off, I'd regularly get 35/6 due to over-provisioning. With QoS enabled I would get about 28/4.5. I was very happy with this.

    They have recently installed some upgrades in our area, conveniently timed after Google announced that my city is on their short list to be one of the upcoming google fiber locations.

    I am now on a 200/20 service, and with QoS off I get 238/24. This is repeatable regardless of time of day.

    Herein lies my problem (if you can call it that), with the exact same QoS settings, changing only the rates for my new speeds, I get 140/19.5. The upload seems fine but the download is taking quite a large hit. Proportionally I thought I would be in the 180 to 190 range on the downlink.

    I've checked through the system commands and the sirq does not get past 45. The CPU load and everything else, as far as I can see, looks fine.

    Any suggestions of what might limit this higher throughput under QoS?

    I was on build 130, but upgraded to build 136 today due to this problem and now the QoS performance is worse (I was using the fq-codel, perhaps this is why, and that is now not an option)
     
    Last edited: May 20, 2016
  40. cloneman

    cloneman Addicted to LI Member

    Shot in the dark here, maybe 45% Sirq on a dual-core router means 1 core is 90% busy, and therefore overwhelmed?

    If you set your QoS download max to something low (e.g. 70 Mbps) you could compare the sirq and try to figure out if it's a bottleneck.
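
    One way to check the "one core is 90% busy" theory is to look at softirq time per core rather than the averaged figure. Below is a rough Python sketch that compares two snapshots of /proc/stat text (for example, copied off the router a few seconds apart). The sample numbers are made up to illustrate the 90% + 0% = 45% average case; the function names are hypothetical.

```python
def parse_proc_stat(text):
    """Map 'cpu0', 'cpu1', ... to (softirq_ticks, total_ticks)."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines are 'cpuN ...'; skip the aggregate 'cpu' line.
        if len(parts) >= 8 and parts[0].startswith("cpu") and parts[0] != "cpu":
            ticks = [int(v) for v in parts[1:]]
            # Field order: user nice system idle iowait irq softirq ...
            cores[parts[0]] = (ticks[6], sum(ticks))
    return cores

def softirq_percent(before, after):
    """Per-core softirq share (%) between two /proc/stat snapshots."""
    a, b = parse_proc_stat(before), parse_proc_stat(after)
    result = {}
    for core in a:
        if core not in b:
            continue
        d_soft = b[core][0] - a[core][0]
        d_total = b[core][1] - a[core][1]
        result[core] = 100.0 * d_soft / d_total if d_total else 0.0
    return result

# Made-up snapshots: cpu0 spends 90 of 100 new ticks in softirq, cpu1 none.
BEFORE = """cpu  200 0 100 1640 0 15 45
cpu0 100 0 50 800 0 10 40
cpu1 100 0 50 840 0 5 5
"""
AFTER = """cpu  207 0 107 1733 0 18 135
cpu0 102 0 52 803 0 13 130
cpu1 105 0 55 930 0 5 5
"""

per_core = softirq_percent(BEFORE, AFTER)
for core, pct in sorted(per_core.items()):
    print(f"{core}: {pct:.0f}% softirq")
```

    If one core sits near 100% while the other idles, the averaged sirq figure hides the bottleneck.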
     
  41. yasavvy

    yasavvy Reformed Router Member

    I was doing all right on my Asus RT-N66U with Shibby Tomato and had QoS sort of set up decently with my 105Mbps/12Mbps connection, but I just upgraded to 150Mbps/20Mbps and everything's gone to hell in a hand cart.

    I use VoIP (all cell phone calls go through my computer), girlfriend uses youtube and skype, and I run a torrent seedbox and also play a lot of games. My games need to have 50-90ms and no higher, but like others have said, as soon as I download a single torrent, I get only 7MB/sec no matter what % I set the classification to (instead of 15MB/sec which is the max I get with QoS disabled), and when I SSH into the router to look at the CPU usage, I see it's fine but the IRQ or SIRQ is trucked.

    So my solution is to use something like SeriousBit NetBalancer software for Windows to limit the bandwidth, but the software seems like a piece of crap. I haven't found good software yet that limits bandwidth reliably. Even qBittorrent, which is what I use, doesn't seem to have a very good working bandwidth limiter.

    I find it hilarious that there doesn't seem to be a solution for anybody and it's hit or miss--with an emphasis on miss.

    If there is no single router out there that can handle a 150Mbps connection with QoS, perhaps someone can suggest a piece of Windows software that actually works to limit bandwidth on that machine at the very least.
     
  42. 1activegeek

    1activegeek Connected Client Member

    @yasavvy - in defense of Tomato, all the folks behind it, and the other open source options: these devices were never originally designed for efficient QOS on high-speed networks. How the hardware responds to the onslaught varies with the specifics of your device, as was indicated a few posts before mine. It's a tricky balancing act to get exactly what you're looking for in these devices, again simply because most were not designed for it.

    I would suggest using a different box if you really have that much QOS requirement. And if you're graduating to that type of device, you can get a lot of the other features involved here. For me, it was time to upgrade to a pfSense box. I was running VPN server, QOS, Guest Wifi isolated, dual band wifi, etc. I sort of peaked its potential with the QOS onslaught. Now that I've moved to pfSense, I can also do some better QOS, I've got multiple VLANs and better firewall rules configured, virtualized environment now has direct access to DMZ interfaces for my external facing functions, I'm setting up VPN client to connect a secondary location to mine for simplified access. The world is your oyster when you step up to a device intended to handle all the things you're doing. Unfortunately I'm barely using Tomato functionality anymore, but until now (aka about 6yrs) I've been running on Tomato with ZERO issues and had nothing but great things to review about it.

    Don't know if this necessarily answers a question, but provides a different angle of approach as I did when I found I had hit the limits of what hardware can do.
     
  43. cloneman

    cloneman Addicted to LI Member

    If you lower your max QoS download speed back to what it was before (e.g. 105 Mbps or a little less) it will behave like it did before, taxing the sirq less, and you can keep the new upload speed. This could be a good temporary solution (or a permanent one if you don't care about leaving a bunch of bandwidth unused)

    You could also try an ARM router which apparently can handle more than double the speed of the MIPS RT-N66U.

    QoS is very important to me as well. I have an RT-N66U, but 'only' 30Mbps, so I don't hit those limits. The other reason I haven't tried pfSense or something more powerful is that I have no idea how to set up QoS on these platforms. There's no documentation for it that I can understand, and I especially see no references to proper inbound IMQ/IFB QoS with bandwidth borrowing between the classes like Tomato does.
     
  44. yasavvy

    yasavvy Reformed Router Member

    Okay I got QoS under control a bit now. The only drawback--and it's a big one--I can't seem to get the download speed beyond 88Mbps. I make a change and run speedtest. Make another change speedtest etc. I wanted to make sure the download speed didn't affect my Diablo III ping and I found the sweet spot doing that.

    Inbound Rates / Limits
    100000 @ 1% - 85% "bigxfer" = 83Mbit on speedtest + 95ms ping d3

    If I change that 85% to 100% it won't go beyond 88Mbps on speedtest but it will raise my D3 ping to 115ms. Other than that, good. Right now I am downloading a torrent at 9.1Mbps with 150KB/sec up and still getting a 90ms ping in game. This is what I wanted. Now if only I could figure out why my bandwidth is stuck at 88Mbps yet I get 130Mbps with QoS disabled and it barely affects my ping even at full speed.

    I swear once I figure this out I'm making a real video tutorial. Nobody should have to go through this over and over again.
     
  45. Mark Barabus

    Mark Barabus Serious Server Member

    I've been using Tomato with QoS for going on 2 years now and until recently i thought my settings were bomb proof. I recently started to get back into online gaming and i'm having issues with lag. My latency tends to jump up and down at a constant rate ingame causing my movements to lag. Everything non-game-related still runs perfectly fine including VOIP calls. I can even download at full throttle (or upload) and i'm still able to watch YouTube or browse the web with minimal impact which tells me my QoS is still working.

    To summarise my setup:
    I'm on an ADSL1 line with 7mbps download/ 768kbps upload.
    My QOS limits are 5950kbps download/653kbps upload.
    I'm using a modem in bridge mode connected to an Asus RT-N66U router in PPPOE mode.
    My Tomato version is 1.28.0000 MIPSR2-3.1-132 K26 USB VPN-64K
    Everything is wired using CAT6 cable.
    My QOS settings in Tomato are all default with less than a dozen of my own rules added to catch VOIP/Game traffic and a few more REMOTE rules for my work requirements.

    What i have tried so far:
    Reduced my QOS bandwidth limits to as low as 4500/350
    My theory was VOIP/Game wasnt getting enough of a constant bandwidth so i changed the min-max for VOIP/Game to 20-30% which assures it gets a more or less constant rate but this hasnt improved anything.
    Lowered other classes to 70% to stop them taking up unnecessary bandwidth
    Disabled all L7 rules

    I even tried different firmwares including ASUS WRT Merlin and OpenWRT with the new SQM codec which supposedly combats bufferbloat but neither improved my issue. And to my knowledge i'm not experiencing bufferbloat because i'm never maxing out my connection during my sessions it always indicates me using 3000/200 which leaves plenty of room for traffic. The bufferbloat test at DSLreports does however report my bufferbloat as grade F though i'm not sure if i should be concerned about this as again i never max out my connection while i'm playing. Besides my understanding is that on ADSL1 line it would be very difficult to prevent bufferbloat as there is not a lot of bandwidth to work with in the first place.

    After trying all this i'm at my wits end and dont really know what else i can try to effectively resolve my issue. Am i simply expecting too much from QoS and Gaming on an ADSL1 line?

    Any help or advice appreciated.

    Thanks
     
  46. cloneman

    cloneman Addicted to LI Member

    You could be having a line issue, if like you said, you lag even when there's no traffic. With a good line your ping should not go up too much during a speedtest. With a slow upload , your 'max' value for voip/game should be higher, close to 100%.

    Don't go crazy changing your rules if you're getting lagging even with no traffic on the line. Most games should do fine in that amount of upload speed. Try to rule out a line issue...

    on adsl1 you need to turn on the DSL overhead value, try 32. Also how are you using 3000 down during a gaming session? That doesn't make a whole lot of sense.
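
    The reason the overhead setting matters on ADSL is that the line carries everything in 53-byte ATM cells with 48 bytes of payload each, so small packets cost far more on the wire than their IP size suggests. Here is a minimal Python sketch of that arithmetic. The 32-byte overhead is the value suggested above; the right figure depends on your encapsulation, and this models generic ATM cell framing, not necessarily what Tomato does internally.

```python
ATM_CELL_PAYLOAD = 48  # payload bytes carried per ATM cell
ATM_CELL_SIZE = 53     # bytes on the wire per cell (48 payload + 5 header)

def atm_wire_bytes(ip_packet_bytes, overhead_bytes=32):
    """Bytes actually sent on an ATM-based DSL line for one IP packet."""
    total = ip_packet_bytes + overhead_bytes
    # Round up to whole cells; the last cell is padded.
    cells = -(-total // ATM_CELL_PAYLOAD)
    return cells * ATM_CELL_SIZE

for size in (64, 100, 1492):
    wire = atm_wire_bytes(size)
    extra = 100.0 * (wire - size) / size
    print(f"{size:5d}-byte packet -> {wire:5d} bytes on the wire (+{extra:.0f}%)")
```

    Under these assumptions a 100-byte game or VoIP packet takes 159 bytes of line capacity, which is why a QoS limit computed from IP bytes alone can still overrun a slow upload.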
     
  47. pegasus123

    pegasus123 Addicted to LI Member

    @Mark Barabus I had been implementing qos for an Internet Cafe before when I was on a 5/1 mbps line, and it sucks to manage so many clients who game, who surf and who download stuff on a very slow line.

    In my experience, the key to implementing qos for gaming is to catch every connection and ensure each one is properly classified.
    I had an issue with one of the games before that was sporadically laggy for several months, until I found out I wasn't actually marking all the right connections with the gaming classification.

    Once I found that out, boom, it solved all the lagging issues. The online game was then stable at 30ms.
    Connection Details under qos is your friend here and will be an invaluable tool for analysis.

    I might also add that those gaming connections that I didnt catch were thrown to Bulk class since it's using UDP on my setup and I didn't properly classify it.

    I'm just throwing it out since you didn't mention. My experience and yours might vary and might not necessarily be applicable :)

    PS: Many games change port ranges during updates so be sure connections are classified accordingly.
     
    Last edited: Jul 1, 2016
  48. Mark Barabus

    Mark Barabus Serious Server Member

    The line appears to be stable. Noise values (SNR 16.0/12.3) are in acceptable ranges in my modem stats and i'm getting stable pings in both speedtests and ping commands. My pings tend to average around 30ms with little to no traffic. What i cant figure out is when i'm lagging in games if i ALT+TAB out to my ping command that is running it will show my ping stable (around 30ms) yet ingame it will show 300-500ms. Could this be because the ping command (outside game) is prioritised as ICMP or is there some other reason for this? I had a theory that my ISP was classifying VOIP/Game traffic incorrectly as all my other traffic appears perfect but after checking Wireshark and calling them out i'm assured its all correct, by my ISP of course so i cant verify this.

    One thing i have yet to try is increasing the max value for VOIP/Game so i will give this a try later. I believe currently i have it set to around 200kbps which when i consider i almost always have a voice chat open it may not be enough to maintain a stable ping. I suppose 400kbps might be a more realistic figure.

    The 3000down/200up is whats reported on average for the entire network while i'm gaming. VOIP/Game tends to take up around 1000/150 when i check iptraffic and the remaining traffic is web browsing so nothing substantial and plenty of bandwidth left over.

    Also, i'm fairly sure all my traffic is classified correctly but i will double check this later, as you have said its quite possible something has changed in an update.

    Thanks for all the advice will give it a try over the weekend and report back.
     
  49. cloneman

    cloneman Addicted to LI Member

    Yeah on a line with slow uploads you definitely want to be very liberal on your Maximums. (90% or 100%).
    The game is probably running out of capacity at 400kbps

    you could always separate voip from game... move all the game related rules to a lower class but with a higher maximum value.

    To see if a class is being taxed, you can try

    Code:
    tc -s qdisc show dev ppp0
    In my case, sfq 20 corresponds to class 2 (voip). You can see if your packets are being dropped (running out of upload capacity in that class).
     
    Last edited: Jul 3, 2016
  50. Mark Barabus

    Mark Barabus Serious Server Member

    Okay so i did some tinkering over the last couple weekends with my QOS...

    Firstly i went through all my QOS rules checking they are still relevant and working as intended. The majority were fine but i managed to get rid of a few rules that i'm no longer using.
    Secondly i raised the VOIP/GAME limit to 100% as well as raised the min limit to 20%- probably unnecessary but i'll try anything.
    Thirdly and finally i changed my modem from a generic ISP modem to a Huawei HG612- although i never had any problems with the generic one this HG612 should perform better and syncs slightly faster.

    After going through all that it pains me to say i'm still having the same issues, constant rubber banding and otherwise erratic pings shown (ingame only). I average on 30ms in most game servers that are nearby but it will frequently jump to 150-300ms every other minute which i believe is the cause of the rubber banding.
    I should mention most if not all the older online games are fine its only those released in the last 2-3 years i seem to have issues with. So perhaps they just have greater bandwidth requirements that my line is incapable of providing.

    Saying that however running the command

    tc -s qdisc show dev ppp0

    If i am reading the results correctly the VOIP/GAME class is not taxed and has no drops at all. In fact the only classes that appear to be dropping packets are the WWW and P2P classes.

    Code:
    qdisc htb 1: root r2q 10 default 90 direct_packets_stat 110
     Sent 68660438 bytes 704467 pkt (dropped 2929, overlimits 675655 requeues 0)
     rate 0bit 0pps backlog 0b 272p requeues 0
    qdisc sfq 10: parent 1:10 limit 127p quantum 1492b perturb 10sec
     Sent 827373 bytes 13300 pkt (dropped 0, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 0b 0p requeues 0
    qdisc sfq 20: parent 1:20 limit 127p quantum 1492b perturb 10sec
     Sent 37586 bytes 406 pkt (dropped 0, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 0b 0p requeues 0
    qdisc sfq 30: parent 1:30 limit 127p quantum 1492b perturb 10sec
     Sent 12782 bytes 227 pkt (dropped 0, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 0b 0p requeues 0
    qdisc sfq 40: parent 1:40 limit 127p quantum 1492b perturb 10sec
     Sent 11190090 bytes 45622 pkt (dropped 153, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 30277b 44p requeues 0
    qdisc sfq 50: parent 1:50 limit 127p quantum 1492b perturb 10sec
     Sent 2403 bytes 22 pkt (dropped 0, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 0b 0p requeues 0
    qdisc sfq 60: parent 1:60 limit 127p quantum 1492b perturb 10sec
     Sent 736141 bytes 10626 pkt (dropped 0, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 0b 0p requeues 0
    qdisc sfq 70: parent 1:70 limit 127p quantum 1492b perturb 10sec
     Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 0b 0p requeues 0
    qdisc sfq 80: parent 1:80 limit 127p quantum 1492b perturb 10sec
     Sent 32581332 bytes 586062 pkt (dropped 1967, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 6604b 127p requeues 0
    qdisc sfq 90: parent 1:90 limit 127p quantum 1492b perturb 10sec
     Sent 23266676 bytes 48092 pkt (dropped 809, overlimits 0 requeues 0)
     rate 0bit 0pps backlog 73120b 101p requeues 0
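
    For anyone who wants to read this kind of output without squinting, here is a small Python sketch that pulls the sent and dropped packet counts out of `tc -s qdisc show` text. The sample is two of the entries from the output above; which handle maps to which QoS class depends on your setup, and the function name is just for illustration.

```python
import re

def qdisc_drops(tc_output):
    """Return {handle: (sent_packets, dropped)} from `tc -s qdisc show` text."""
    stats = {}
    handle = None
    for line in tc_output.splitlines():
        line = line.strip()
        # Header line, e.g. "qdisc sfq 40: parent 1:40 limit 127p ..."
        m = re.match(r"qdisc \S+ (\S+):", line)
        if m:
            handle = m.group(1)
            continue
        # Counter line, e.g. "Sent 37586 bytes 406 pkt (dropped 0, ..."
        m = re.match(r"Sent \d+ bytes (\d+) pkt \(dropped (\d+),", line)
        if m and handle is not None:
            stats[handle] = (int(m.group(1)), int(m.group(2)))
    return stats

SAMPLE = """qdisc sfq 20: parent 1:20 limit 127p quantum 1492b perturb 10sec
 Sent 37586 bytes 406 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 80: parent 1:80 limit 127p quantum 1492b perturb 10sec
 Sent 32581332 bytes 586062 pkt (dropped 1967, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 6604b 127p requeues 0
"""

stats = qdisc_drops(SAMPLE)
for handle, (sent, dropped) in stats.items():
    print(f"sfq {handle}: {sent} pkt sent, {dropped} dropped")
```

    A class with zero drops is not running out of its own allowance, though that alone does not rule out delay caused elsewhere on the line.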
    
    Any suggestions welcome as once again i have hit rock bottom and i am out of ideas. I have attached my QOS settings in hope it will bring some light or show up any issues.

    Thanks again

    qos-basic.gif
     
  51. AaronCompNetSys

    AaronCompNetSys LI Guru Member

    Maybe I missed it, but is your problem occurring during load of the other QoS categories or with zero load?

    I've always had problems with modem connection limit and huge ping drop off with any load. Currently I run a detect script to completely cut off any other bandwidth uses during gameplay.
     
  52. Mark Barabus

    Mark Barabus Serious Server Member

    Well i have the issue during moderate to high load as well as very minimal load, so it would appear to be a problem with any amount of activity yes.

    Is it possible the problem isnt QOS related at all? I'm not familiar with the connection limits but i can see how it would be possible for the modem connections to be maxed out so maybe i need to look at that. How would i confirm it being maxed out?
     
  53. cloneman

    cloneman Addicted to LI Member

    Start a continuous ping in terminal window. Examine how the ping is impacted by a variety of activities. Use an ftp server to simulate different types of bandwidth load, eg. 60%. (filezilla lets you set caps)

    A properly functioning connection shouldn't alter ping too much until you reach maybe 90%
     
  54. Mark Barabus

    Mark Barabus Serious Server Member

    Ok so i did a constant ping -t while uploading and downloading through filezilla. Ping didnt alter much even at max load with QOS on i saw average pings of 37ms with lows of 24ms and highs of 45ms. Lowering the bandwidth in zilla didnt change the ping results much at all. As a matter of fact i get the same ping results at zero network load maybe just 5ms lower.
    I did happen to notice a few high pings where it jumped to 300ms for a brief moment, although it's probably nothing to worry about- possibly just google delaying my request.

    With QOS off things got chaotic rather fast as you would expect, at max 100% i'm getting average pings of 620ms with lows of 245ms and highs of 840ms. Its not until i turn the upload down to 40% that things "calm down" and even then its averaging at 75ms with lows of 24ms and highs of 300ms so extremely jittery.

    So i'm not sure what to take away from these tests other than my QOS seems pretty stable and that i cant use the connection stably without QOS. As far as my gaming issues i'm not sure if these results bring any light on the issue- could the random spikes of 300ms be anything or perhaps even the smaller jumps of 24ms to 45ms could be the problem as those are quite consistent at happening every few seconds.
     
  55. Toastman

    Toastman Super Moderator Staff Member Member

    Looks to me like your QOS is working. It may be a problem with the game, the gameserver, the path in between you etc.

    To eliminate QOS as a cause, turn off QOS, remove all other machines, leaving only the games machine running only the game. Does the game ping still show huge variations? If it is OK, turn on QOS and see what happens.
     
  56. Mark Barabus

    Mark Barabus Serious Server Member

    Ok some more test results...

    With QOS off the ping stays roughly the same with no traffic (only the ping and game running). I'm getting 20ms-21ms with no jumps at all. Same results with QOS on.

    If i introduce other devices while QOS is off pings stay much the same with little to no traffic with just the odd blip where ping jumps to 50ms.

    Moderate to normal web browsing and youtube traffic with QOS off ping really starts to take a hit averaging at 130ms jumping around erratically from 20ms to 400ms.

    Bringing QOS back on while still under moderate to normal load ping calms down averaging around 100ms but still quite erratic jumping up and down from 20ms to 150ms.

    I'm immediately noticing these results are quite different from my results last night (from my FTP testing) so just to confirm i fired up zilla to repeat those tests and sure enough pings are showing much more erratic today.

    Code:
    Reply from 216.58.198.99: bytes=32 time=26ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=20ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=39ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=37ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=126ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=45ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=95ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=20ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=35ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=58ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=177ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=27ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=44ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=39ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=37ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=143ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=525ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=26ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=31ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=25ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=35ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=26ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=22ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=27ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=42ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=22ms TTL=55
    Reply from 216.58.198.99: bytes=32 time=38ms TTL=55
    
    So with that maybe i have found the potential cause of my problems?
    If under moderate load it's jumping around from 20ms to 150ms all the time that would surely account for my lag problems in game.
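
    Eyeballing a long ping log is error-prone, so here is a small Python sketch that summarises Windows-style `ping -t` output into min/avg/max and a simple jitter figure. "Jitter" here is an assumption of this sketch: the mean absolute difference between consecutive replies, which is one common definition but not the only one. The sample is the first five replies from the log above.

```python
import re
import statistics

def ping_summary(ping_output):
    """Summarise 'Reply from ...: bytes=32 time=26ms TTL=55' style lines."""
    times = [int(t) for t in re.findall(r"time[=<](\d+)ms", ping_output)]
    if not times:
        return None
    # Jitter here = mean absolute difference between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(times, times[1:])]
    return {
        "count": len(times),
        "min": min(times),
        "max": max(times),
        "avg": statistics.mean(times),
        "jitter": statistics.mean(diffs) if diffs else 0,
    }

SAMPLE = """Reply from 216.58.198.99: bytes=32 time=26ms TTL=55
Reply from 216.58.198.99: bytes=32 time=20ms TTL=55
Reply from 216.58.198.99: bytes=32 time=39ms TTL=55
Reply from 216.58.198.99: bytes=32 time=37ms TTL=55
Reply from 216.58.198.99: bytes=32 time=126ms TTL=55
"""

summary = ping_summary(SAMPLE)
print(summary)
```

    Comparing the jitter figure between an idle run and a loaded run gives a single number to track while changing QoS settings.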
     
  57. Mark Barabus

    Mark Barabus Serious Server Member

    I think i may have found my issue... I couldnt figure out why i was getting inconsistent and erratic ping results under moderate to high load and then it hit me- i have a HTPC system with transmission (torrents) and IP cameras running 24/7 that just maybe is maxing connections. The torrent speeds are limited to 10kbps throughout the day and switch to full throttle at midnight but even with that measly 10kbps i thought it has to be taking up connections right. Sure enough i close transmission and my pings steady out at 20-21ms under normal usage and at max throttle 24-50ms. I havent actually tested any games yet but i figure this has to be it.
     
  58. Mark Barabus

    Mark Barabus Serious Server Member

    Update- Unfortunately even with transmission closed the ping fluctuations are causing havoc with games so although i think transmission was a big part of the problem its not the solution to my problem. Closing it certainly reduced the pings but they are still erratic. Disconnecting every other device but the gaming machine and gaming is fine so it appears any amount of load however small from other devices is influencing ping too much causing the game to lag. At zero load with just the gaming machine pings are 21-24ms and gaming is fine but add more devices and the ping will fluctuate between 24-50ms and i believe this is what causes my problems with games as its 20ms of jitter.

    EDIT: Just to get a better picture of what is going on here i am attaching a screenshot of my tomato graphs and ping commands. This is with utorrent and filezilla running downloads and uploads. I even reduced the limits in tomato qos to 300/3000kbits so this is my line at roughly 40% saturation and the pings are still all over the place. As i said before any amount of traffic even 100kbits and the pings go crazy. At idle (zero load) its 21-24ms. FYI this is the same with qos on or off. I honestly cannot make any sense of whats going on.

    ping-qos.gif

    Would anyone happen to have any thoughts as to what could be going on here- specifically why does my ping fluctuate so much with very little traffic ~more than 1 device less than 300kbps out/1mbit in yet the pings are stable at zero load.
     
    Last edited: Jul 23, 2016
  59. pegasus123

    pegasus123 Addicted to LI Member

    Perhaps post your config?
     
  60. koitsu

    koitsu Network Guru Member

    Is there a reason you hid the graphs/latency data in Pingplotter Pro for all hops leading up to the destination (i.e. hops #1 to hops #X-1)? Showing hops #1 through #3 would at least be helpful in combination with the ones you did show.

    If the graphs/latency data for the hop on the other side of your Internet connection (i.e. your ISP's router) shows a different pattern than the destination then I would chalk this up to "problems on the Internet" (i.e. not your issue).

    Also don't forget it's perfectly possible for "ping traffic" (which is not necessarily the same as network hop traversal/discovery; please see how traceroute works) to be classified as non-prioritised, thus see latency there that isn't actually seen in other types of packets. (I'm not saying this is the case here, just saying it's perfectly possible and does happen on the Internet regularly (ICMP directed *at* a router is often deprioritised)). I've talked about this at length (including how to properly set up PingPlotter) (note the year of the post -- it's nothing new). Richard Steenbergen's NANOG presentation also goes over the complexities of network troubleshooting, covering this too.

    Finally: you're testing from a wired device, not wireless, yes?
     
  61. Toastman

    Toastman Super Moderator Staff Member Member

    Normally I would always choose to ping a site that is as close to me, network wise, as possible. i.e. the ISP's own server or gateway IP. This removes internet paths from the equation. If I ping my local ISP I get a constant 23 - 26mS with almost no variation when under load.

    I tried pinging the same IP as you from here, (S.E.Asia) which has 15 hops, and I get very little variation in the return. Running almost standard Toastman QOS rules, and with currently about 17 users online, bw utilization of approx. 60-70 %. Thus even with the variability of the path, my results are better than yours.

    Capture.JPG
     
  62. ghoffman

    ghoffman LI Guru Member

    @Toastman -
    is there a way to reset qos configuration to stock without erasing all settings?
     
  63. Mark Barabus

    Mark Barabus Serious Server Member

    @koitsu - I didnt have the hops showing in my previous attachment, it was only showing the summary screen of all traces i was running and results of the last hops. I have now attached a screen of all my hops to the destination address bbc.co.uk which is said to be one of the best to test against over here in the UK. Note my minimum pings are somewhat higher here 35ms because i have had interleaving enabled on the line.

    The first attachment is the trace at idle bandwidth and the second is at roughly 20-40% bandwidth. I have provided a third attachment showing the tomato bandwidth graphs at this time.

    idle-qos-on.png traffic-qos-on.png traffic-ping-qos-on-with-bandwidth.png

    Very interesting read on how traceroute works, i will need to read it a couple more times before i fully understand the workings but its certainly brought me up to speed. I do have TCP enabled in pingplotter but it appears the destination (or ISP) is blocking TCP probes on all ports other than HTTP on port 80 so i am limited to using port 80 here. Doesnt matter too much but it would have been nice to set it to my VOIP or SERVICE class.

    And finally yes i am running wired. I have also tried different cables and running the ping from different machines for what its worth.

    @Toastman - Those ping results are phenomenal and to think you are at 60 - 70% bandwidth i am now really starting to wonder if this is a problem with my ISP provided line outside my property rather than a QOS issue. Even at idle i get huge variations and if i add 20 - 40% bandwidth i am looking at 20 - 1000ms.
     
  64. koitsu

    koitsu Network Guru Member

    I didn't need to see individual graphs for each and every hop, just hops #1, #2, #3, and the destination, as well as all the Avg/Min/Max/etc. information box at the top (for all hops). More graphs = less vertical room = harder to see "scale" of problem. Also, PingPlotter by default has a cap of 30% packet loss (I tend to increase that to 100%). There are several "defaults" in PingPlotter (including what it considers to be low/medium/high latency) that are (IMO) not very ideal (they all can be tweaked though). That said:

    Based on the 2nd graph above, I would say it looks (to me) like the connection between your router and ISP is being saturated somehow -- specifically for the time periods 14:40 to 14:43 or so. It looks like your latency actually gets ~2x worse (idle = latency ~35-46ms (see below for my concern about this), saturated = ~81-88ms). This is also supported by a "max" of 508ms at hop #2. I don't really care about the jitter column (this matters more for VoIP and some other things). Supposedly there is some packet loss (vertical red lines at hop #2), but what's interesting is the information box at the top doesn't show that for hop #2.

    Hop #1 looks good all around, which means your traffic between your device/PC and your router is at least good. So we can rule that out.

    Could router CPU utilisation or QoS (possibly misconfigured?) be responsible for this problem (specifically extremely high latency during that 14:40 to 14:43 window)? Absolutely.

    I might suggest a test endpoint somewhere other than the BBC, BTW. Does plus.net have a website? If so, is www.plus.net (or wherever) closer to you, hop-count-wise? If so, that'd make for a better destination test (i.e. less reliance on the Internet). Just remember that destination should be a PC, server, or whatever and not a router -- Internet backbone or ISP edge routers (de-)prioritise ICMP responses, so latency/packet loss/etc. shown at those is often misleading (the Richard Steenbergen PDF goes over this). What you have to look for is a pattern of behaviour that "trickles down" through subsequent hops -- and if it does, try to narrow it down to which hop it begins at. (In your case, it looks like during saturation it happens at hop #2, so that's why I said what I did in my above paragraph).

    I can't tell you how many times in my life I've seen people show a mtr/traceroute that looks like this, claiming "there's a problem!! Look at the packet loss!" (gamers state this kind of thing all the time) -- and sorry about the formatting (the forum here messes up alignment/content in code blocks when editing a post):

    Code:
    === Tue Jul 26 17:30:00 PDT 2016  (1469579400)
    Start: Tue Jul 26 17:30:00 2016
    HOST: icarus.home.lan                                                  Loss%   Snt   Rcv  Last   Avg  Best  Wrst
      1.|-- gw.home.lan (192.168.1.1)                                         0.0%    75    75   0.2   0.2   0.2   0.4
      2.|-- 96.120.89.145                                                     0.0%    75    75   8.2   8.4   7.8  10.8
      3.|-- be-20003-sur04.santaclara.ca.sfba.comcast.net (68.86.249.249)     0.0%    75    75   8.7   8.7   8.3   9.4
      4.|-- hu-0-3-0-6-ar01.hayward.ca.sfba.comcast.net (68.87.192.249)       0.0%    75    75   9.9   9.9   9.4  11.2
      5.|-- hu-0-2-0-0-ar01.santaclara.ca.sfba.comcast.net (68.85.154.249)    0.0%    75    75  10.1  10.0   9.3  12.7
      6.|-- be-33651-cr01.sunnyvale.ca.ibone.comcast.net (68.86.90.93)       66.7%    75    25   9.9  10.0   9.5  11.1
      7.|-- be-10925-cr01.9greatoaks.ca.ibone.comcast.net (68.86.87.158)      0.0%    75    75  13.7  12.8  11.5  17.7
      8.|-- hu-0-11-0-0-pe03.11greatoaks.ca.ibone.comcast.net (68.86.85.238)  0.0%    75    75  11.8  11.7  10.9  13.0
      9.|-- ae12.sjc12.ip4.gtt.net (173.205.58.169)                           0.0%    75    75  11.5  11.4  10.8  15.4
     10.|-- xe-1-0-1.sjc20.ip4.gtt.net (89.149.184.250)                       0.0%    75    75  11.6  11.7  10.9  13.9
     11.|-- as20473-gw.sjc20.ip4.gtt.net (69.22.130.90)                       0.0%    75    75  14.5  27.9  11.1 186.4
     12.|-- ???                                                              100.0    75     0   0.0   0.0   0.0   0.0
     13.|-- mambo.koitsu.org (104.238.183.73)                                 0.0%    75    75  11.9  11.9  11.1  18.9
    
    They'd be referring to the loss shown at hops #6 and #12. This is completely normal. It's just ICMP de-prioritisation (hop #6) and a router choosing not to respond with ICMP TTL exceeded (hop #12). You can see the destination (mambo.koitsu.org) has 0% loss and virtually no added latency in comparison to other hops.

    But what about this?

    Code:
    === Sat Jul 23 12:12:00 PDT 2016  (1469301120)
    Start: Sat Jul 23 12:12:00 2016
    HOST: icarus.home.lan                                                  Loss%   Snt   Rcv  Last   Avg  Best  Wrst
      1.|-- gw.home.lan (192.168.1.1)                                         0.0%    75    75   1.5   3.6   0.5  16.3
      2.|-- 96.120.89.145                                                     1.3%    75    74  71.1  79.0  31.3 141.7
      3.|-- be-20003-sur04.santaclara.ca.sfba.comcast.net (68.86.249.249)     1.3%    75    74  74.6  80.4  21.9 144.8
      4.|-- hu-0-3-0-6-ar01.hayward.ca.sfba.comcast.net (68.87.192.249)       1.3%    75    74  61.2  78.5  16.1 140.5
      5.|-- hu-0-2-0-0-ar01.santaclara.ca.sfba.comcast.net (68.85.154.249)    0.0%    75    75  71.0  75.9  19.9 135.3
      6.|-- be-33651-cr01.sunnyvale.ca.ibone.comcast.net (68.86.90.93)        0.0%    75    75  70.7  78.3  17.8 154.9
      7.|-- be-10925-cr01.9greatoaks.ca.ibone.comcast.net (68.86.87.158)      0.0%    75    75  74.8  80.8  25.7 148.2
      8.|-- hu-0-11-0-0-pe03.11greatoaks.ca.ibone.comcast.net (68.86.85.238)  0.0%    75    75  86.2  80.8  17.1 141.1
      9.|-- ae12.sjc12.ip4.gtt.net (173.205.58.169)                           0.0%    75    75  83.6  81.3  21.8 146.1
     10.|-- xe-1-0-1.sjc20.ip4.gtt.net (89.149.184.250)                       1.3%    75    74  78.0  81.3  28.1 148.4
     11.|-- as20473-gw.sjc20.ip4.gtt.net (69.22.130.90)                       0.0%    75    75  68.2  83.8  27.1 156.2
     12.|-- ???                                                              100.0    75     0   0.0   0.0   0.0   0.0
     13.|-- mambo.koitsu.org (104.238.183.73)                                 0.0%    75    75  86.6  82.2  28.5 141.3
    === END
    
    This is where it gets complicated (but I know definitively what happened because I'm the one who induced it). There are scattered degrees of loss (1 packet at several routers), but more importantly, a correlating increase in average and worst (maximum) latencies. So what happened there? I was downloading a game off Steam (~67GBytes), so my connection became fairly saturated for about 20 minutes. I do not use QoS (by choice).

    So why didn't the loss show up at every router? Because when you have a saturated connection, it's pretty much chance (specifically, timing) as to whether or not you'll receive the response packet within the window of time (internal timeout within the tool). What also matters is the longevity of the test -- I test for 75 seconds (75 probes) and run the test once every 2 minutes. If I had done this for, say, 10 seconds, the information may have been a lot less useful/definitive. If I had done it for, say, 300 seconds, the information would have been more definitive but less "accurate" (as far as what exact time of day the problem happened).
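
    The "trickles down" reading of the two reports above can be sketched in a few lines of Python. This is only an illustration: the regular expression, the function names and the 20 ms threshold are my own assumptions, and it compares an idle report against a suspect one instead of looking at loss on a single hop.

```python
import re

# One hop line of an mtr report: hop number, host, Loss%, Snt, Rcv, Last, Avg, Best, Wrst.
HOP_RE = re.compile(
    r"^\s*(\d+)\.\|--\s+(.+?)\s+(\d+(?:\.\d+)?)%?\s+(\d+)\s+(\d+)"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)

# A few hops copied from the two reports above (column spacing shortened).
IDLE = """\
  1.|-- gw.home.lan (192.168.1.1)  0.0%  75  75  0.2  0.2  0.2  0.4
  2.|-- 96.120.89.145  0.0%  75  75  8.2  8.4  7.8  10.8
  6.|-- be-33651-cr01.sunnyvale.ca.ibone.comcast.net (68.86.90.93)  66.7%  75  25  9.9  10.0  9.5  11.1
 12.|-- ???  100.0  75  0  0.0  0.0  0.0  0.0
 13.|-- mambo.koitsu.org (104.238.183.73)  0.0%  75  75  11.9  11.9  11.1  18.9
"""

SATURATED = """\
  1.|-- gw.home.lan (192.168.1.1)  0.0%  75  75  1.5  3.6  0.5  16.3
  2.|-- 96.120.89.145  1.3%  75  74  71.1  79.0  31.3  141.7
  6.|-- be-33651-cr01.sunnyvale.ca.ibone.comcast.net (68.86.90.93)  0.0%  75  75  70.7  78.3  17.8  154.9
 12.|-- ???  100.0  75  0  0.0  0.0  0.0  0.0
 13.|-- mambo.koitsu.org (104.238.183.73)  0.0%  75  75  86.6  82.2  28.5  141.3
"""


def parse_report(text):
    """Turn the hop lines of a report into a list of dicts; other lines are skipped."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append({
                "hop": int(m.group(1)),
                "host": m.group(2),
                "loss": float(m.group(3)),
                "avg": float(m.group(7)),
                "worst": float(m.group(9)),
            })
    return hops


def first_persistent_hop(baseline, suspect, min_increase_ms=20.0):
    """Return the hop where a latency increase starts AND carries on to the
    last responding hop, or None if the increase does not reach the end."""
    base_avg = {h["hop"]: h["avg"] for h in baseline if h["loss"] < 100.0}
    answer = None
    for h in suspect:
        if h["loss"] >= 100.0 or h["hop"] not in base_avg:
            continue  # silent hop: nothing to compare
        if h["avg"] - base_avg[h["hop"]] >= min_increase_ms:
            if answer is None:
                answer = h["hop"]
        else:
            answer = None  # the increase did not persist, so start over
    return answer


idle = parse_report(IDLE)
saturated = parse_report(SATURATED)
print(first_persistent_hop(idle, saturated))
```

    Loss at a single middle hop (like #6 or #12 in the idle report) never enters the decision; only a latency increase that survives all the way to the destination counts.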

    The point here is: network troubleshooting is painful and complicated. Long gone are the days of just running a simple ping test against somewhere and having it be definitive (that type of troubleshooting ended around 2001 or so, when we started introducing more asymmetric routing and vendors began defaulting to (de-)prioritising ICMP).

    Anyway, my example aside, I definitely think you're going in the right direction with concerns over QoS and whether or not it's actually working/doing the right thing.

    ** -- The (idle) latency seen seems awfully high to begin with. What kind of ISP connection is this? Satellite? It's interesting that literally the latency between you and the next upstream hop is 35-46ms. I haven't seen something that high in a long time; I'm used to literal Ethernet links, ADSL, and cable (coax). Those tend to be ~1ms (Ethernet), and anywhere between 10-20ms (ADSL and cable). I see that PPP (not sure if PPPoE or PPPoA) is involved, but that just seems awfully high to me as a default. Is this normal for your ISP / area? If you use the equipment (router/etc.) your ISP gave you, do you see this kind of latency when the connection is idle?

    The reason I ask is that 35-46ms can correlate with a fairly "long" distance. For example, my default latency is about 15ms. I'm in Silicon Valley (northern California) using Comcast. Packets sent to a destination in Dallas, Texas (specifically a destination within Comcast's network in Dallas) have an RTT (round-trip time) of 50ms. I can't see the return path (so I don't know how the response packet makes it back across the network to me, network-wise or geographically), but the physical distance between here and Dallas is over 1700 miles (2735 km). So when I see an "idle latency of ~35-46ms", I start to wonder what the network connection actually is/how it's provisioned. But if it's normal (using ISP equipment, etc.), then okay, we know what the baseline is!
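
    To put rough numbers on the distance argument, here is a back-of-the-envelope sketch. The 0.67 velocity factor for light in fibre is an assumption (a commonly quoted ballpark), and real paths are longer than the straight-line distance, so treat the results as lower/upper bounds and not as measurements.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBRE_VELOCITY_FACTOR = 0.67  # assumption: signals in fibre travel at roughly 2/3 of c


def min_rtt_ms(distance_km, velocity_factor=FIBRE_VELOCITY_FACTOR):
    """Lowest possible round-trip time over a straight fibre run of this length."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor)
    return 2 * one_way_s * 1000.0


def max_path_km(rtt_ms, velocity_factor=FIBRE_VELOCITY_FACTOR):
    """Longest one-way fibre run that could fit in this RTT if nothing else added delay."""
    return (rtt_ms / 1000.0) / 2 * SPEED_OF_LIGHT_KM_S * velocity_factor


# Silicon Valley to Dallas, using the 2735 km figure from the post.
print(round(min_rtt_ms(2735), 1), "ms propagation floor")
# How far a 35 ms round trip could reach if it were pure propagation delay.
print(round(max_path_km(35)), "km")
```

    The propagation floor for the Dallas example comes out well under the 50ms measured, which fits the idea that access latency and routing make up the rest. By the same arithmetic, 35ms to the very first upstream hop would cover thousands of kilometres of fibre if distance were the only cause, which is why it looks odd.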
     
    Last edited: Jul 27, 2016
  65. cloneman

    cloneman Addicted to LI Member

    If QoS is turned off and you simulate an FTP-based line saturation of 40% or less, and this causes erratic pings to several destinations, the problem is that your ISP/line sucks. Large numbers of connections would not cause problems unless your router CPU is struggling (e.g. 1000s of connections).

    With no load your connection should have 1ms typical jitter and with some light load it should increase to at most 5ms with occasional outliers (no spikes of consecutive pings of 200ms, etc).

    Above ~50% load (in both directions simultaneously) on ADSL, there's potential for ATM overhead congestion without QoS, but that's not what we're seeing here so far.
     
  66. Mark Barabus

    Mark Barabus Serious Server Member

    Thank you, really helpful and informative. I am still learning PingPlotter, but as I get familiar with it, as you have said, I have found I needed to tweak a few more settings. FYI, I now realise the packet loss shown in graph 2 wasn't shown in the top information box because it was set to focus "auto" -- like I said, I'm still learning, but at least that explains it.

    To answer your questions:
    1) Believe it or not, this is an ADSL1 type connection, although I can understand how you might think it was something else. I connect using PPPoA on the ISP modem, which is bridged to my Asus N66U running Tomato, where PPPoE mode is selected.
    2) I can't say whether it's normal latency for the area, but I have used the same ISP in a previous area of the UK and always had good latency (20-21ms), so I can at least say this is not normal for the ISP. To be honest, there is not a lot of choice for ISPs as I don't have the option of cable here; only ADSL is available, and one company (BT Openreach) basically owns the entire telecom market, so even if I go through another ISP it will still be going through BT's network. For instance, I am currently with the provider PlusNet, but they run off the backbone of BT so it's all the same.
    3) If I use the ISP-provided modem/router, I get the same results, with latency varying at idle.

    Testing other endpoints or targets pretty much ends up with the same results. I typically have several open so I can compare the results and rule out any false spikes. plus.net is indeed closer but still 9 hops. One of the closest ones I can find is BT.co.uk, which is only 7 hops, so based on your response I will use this one for my primary testing. What I am noticing from my tests is that the latency starts to vary after hop #2, immediately after my router, so that would seem to suggest an issue outside the router and with the line or ISP.

    To back this up, I would also like to post some final PingPlotter graphs to various destinations. These were all recorded at idle load. Tomato shows the max recorded traffic during this period as 525kbit/s (down) and 120kbit/s (up), so roughly 5-10% load. All the tests showed latency variations of 5-20ms and some spiked as high as 1000ms. However, the test to bbc.co.uk stayed somewhat stable at 35-55ms, so it would appear those other target spikes are irrelevant. Still, 20ms is not acceptable at these loads; based on your responses I suspect an issue with the ISP, and there is very little, if anything, QoS can do for me here.

    bbc.co.uk.png bt.co.uk.png plus.net.png

    I also had a ping command running alongside these tests, again just to verify the results, and indeed the same latency variation is happening.

    Code:
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=46ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=55ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=55ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=62ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=46ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=48ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=44ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=47ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=46ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=47ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=57ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=44ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=44ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=53ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=47ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=47ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=52ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=46ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=46ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=48ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=50ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=55ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=46ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=47ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=51ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=44ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=48ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=44ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=44ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=47ms TTL=56
    Reply from 212.159.8.2: bytes=32 time=48ms TTL=56
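
    For anyone who wants a number instead of eyeballing a ping log like the one above, here is a minimal sketch that pulls the times out of Windows-style ping output and reports min/avg/max plus a simple jitter figure. "Jitter" here means the mean absolute difference between consecutive replies, which is only one of several definitions in use; the function names are made up for this example.

```python
import re
from statistics import mean

# Windows ping prints "time=45ms" (or "time<1ms" for sub-millisecond replies,
# which this sketch counts as 1 ms).
TIME_RE = re.compile(r"time[=<](\d+)ms")

# First five replies from the log above.
SAMPLE = """\
Reply from 212.159.8.2: bytes=32 time=45ms TTL=56
Reply from 212.159.8.2: bytes=32 time=46ms TTL=56
Reply from 212.159.8.2: bytes=32 time=55ms TTL=56
Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
Reply from 212.159.8.2: bytes=32 time=49ms TTL=56
"""


def ping_times(text):
    """Extract the round-trip times (ms) from ping output, in order."""
    return [int(m.group(1)) for m in TIME_RE.finditer(text)]


def summarize(times):
    """Return min, avg, max and mean consecutive-difference jitter in ms."""
    diffs = [abs(b - a) for a, b in zip(times, times[1:])]
    return {
        "min": min(times),
        "avg": mean(times),
        "max": max(times),
        "jitter": mean(diffs) if diffs else 0.0,
    }


stats = summarize(ping_times(SAMPLE))
print(stats)
```

    Running the same summary on an idle capture and on a lightly loaded one gives two jitter figures that can be compared directly against the ~1ms / ~5ms expectations mentioned earlier in the thread.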
     
  67. Toastman

    Toastman Super Moderator Staff Member Member

    Did you try connecting with Windows PPPoE direct to the modem? It does not sound like a Tomato issue.
     
  68. Mark Barabus

    Mark Barabus Serious Server Member

    Yes, I tried connecting directly using Windows PPPoE and the issue persists. I'm currently trying to raise it as a fault with my ISP, but they are saying there is no fault as the line has no noise and the sync rate etc. is all fine, which I have to admit is true, but something has to be wrong somewhere. I think it will take a while to get them to admit a fault, let alone fix it, so I will post back once I make progress.
     
  69. Mr9v9

    Mr9v9 Serious Server Member

    https://www.netlimiter.com/ worked for me a long time ago. They have since changed their look a bit, but it was pretty stable if you use a Windows machine.
     
  70. koitsu

    koitsu Network Guru Member

    Speaking strictly about HTTP/HTTPS traffic: if you use Chrome, it has a built-in rate-limiter you can use. It's intended for developer usage though (you'll find it under the F12 developer tools somewhat deep in there), but it does in fact work reliably.

    Otherwise yeah, NetLimiter tends to work pretty well on Windows. It has some GUI quirks/issues, but they're things you can work around.
     
  71. cloneman

    cloneman Addicted to LI Member

    If you download all your torrents through a single computer you have the option of classifying that device's MAC address with a more punitive QoS rule.

    @Mark Barabus : If your line stats look good despite jitter, there could also be congestion on the uplink that feeds the nearby DSL hardware. This would affect mostly your neighbors and therefore would go unreported as an ISP-wide issue (unlike a lack of transit link bandwidth, which would affect a larger number of customers).

    You could try testing a neighbor's DSL, etc., as your next step, but for anyone reading this in the future, it's a low-percentage troubleshooting step for someone already deep in the rabbit hole :D
     
  72. Mark Barabus

    Mark Barabus Serious Server Member

    @cloneman - Pretty sure I'm wasting my time with the ISP; even if it was an ISP-wide issue they just don't want to know. Even the guys over at thinkbroadband are basically telling me it's a waste of time due to the limited capacity inherent in the 20CN backhaul.

    So if I'm being told to forget it by a so-called expert, I may just have to accept it for what it is and make do until fibre makes it to my area in a few years.

    It's just infuriating to think my pings are perfect, within 1ms, at complete idle, but as soon as I introduce any traffic, even a tiny 500kbit/s, the pings go doolally. It makes no sense to me at all, but I guess it is what it is and there is no fixing it, or so I'm told.
     
  73. Mark Barabus

    Mark Barabus Serious Server Member

    Can anyone tell me how I can check that ICMP is being classified by QoS correctly? I have "prioritise ICMP" ticked, but I don't see it listed on the view details page. I see @cloneman mentioned it further up in the thread, but there was no answer other than it being on the IP traffic page.
     
  74. cloneman

    cloneman Addicted to LI Member

    I classify ICMP manually on the classification page, because I don't know how that checkmark works. You just change the protocol dropdown from the default TCP/UDP to ICMP.

    I believe I can see ICMP usage on the graphs if I do a continuous ping...

    Changing the DSL modem/ethernet cable also couldn't hurt.
     
  75. tvlz

    tvlz LI Guru Member

    With the ICMP checkbox ticked, it is hard-coded to use the Service class; check the source code (rc/qos.c) to be sure.
     
  76. pegasus123

    pegasus123 Addicted to LI Member

    I classify my ICMP manually under the Games category, which is about 3rd from the highest; if I use the checkbox, it goes straight to the highest classification.

    My purpose is that when I ping game servers, I can see the current state of the gaming class, i.e. what the ping/jitter looks like.
     
  77. misuercarriere

    misuercarriere Reformed Router Member

    @Toastman
    If it is possible, could the images found in this post be re-linked as they are no longer accessible? :)
     
  78. Toastman

    Toastman Super Moderator Staff Member Member

    I don't know what those images were, it is all lost in history. That was quite an old post, now out of date anyway.
     
  79. Mark Barabus

    Mark Barabus Serious Server Member

    Thanks guys, I have created a rule for my ICMP traffic and I can see it in "view details" when pinging outside, but I can't seem to get anything to show for when someone is pinging me.

    I have set up a 24/7 monitor using the service here http://www.thinkbroadband.com/ping which pings my IP address every second, but from what I can see in "view details" nothing is shown for incoming ICMP traffic. I have checked a few other ping services such as mxtools just to be certain, and sure enough I can't find any traffic in Tomato "view details" for incoming pings.

    My guess is it's classed correctly using the rule I created, but it would be nice if we could see exactly what is happening with "view details", as it's possible incoming ICMP is still going to the SERVICE category, whereas ideally I want it going to a lower class so I can see how my regular traffic class looks.
     
  80. misuercarriere

    misuercarriere Reformed Router Member

    Could anyone point me to an updated QOS example for Tomato? I could only find this one from 2013 posted here and here. I'm looking for a good recent QoS configuration to adapt for my needs.
    Any help is greatly appreciated. Thanks :)
     
  81. cloneman

    cloneman Addicted to LI Member

    There are 3 main schools of thought on the topic:

    1) Enterprise Network neckbeards who think all inbound QoS is ineffective/futile (let's ignore those)
    2) Those who suggest leaving most of the settings to default on Shibby/Toastman, and using that
    3) My ideas from this post, which involve using only a small set of rules.

    Before getting too deep, be aware that if your connection is very fast (80 Mbps), some older routers will run out of CPU / sirq% juice and be unable to provide effective QoS. There may also be some differences between ARM and MIPS QoS implementations. Be sure you also have a relatively recent Shibby/Toastman build, not something random from 2011. With those caveats in mind, good luck.
     
  82. Toastman

    Toastman Super Moderator Staff Member Member

    Flash an up-to-date version of Tomato. Check the button to erase NVRAM while flashing. The default QOS rules will appear. Now set up your router again, don't replace config with old saved config. The example rules are still pretty much the same, so don't get too excited!
     
  83. Mr9v9

    Mr9v9 Serious Server Member

    Try this link and follow along, I think it will help you! :)
     
  84. Toastman

    Toastman Super Moderator Staff Member Member

    I wouldn't do that personally, as the information on that page is very misleading.
     
  85. cloneman

    cloneman Addicted to LI Member

    I flashed Advanced Tomato 1.38 on a spare router yesterday and noticed there was both fq_codel and codel available in the QoS Dropdown list. I wasn't aware this had been added to MIPS, I thought it was just ARM according to shibby's changelog.

    - If QoS is turned off, does it use codel by default now? Or does it need to be switched on for that scheduler to be enabled?

    EDIT: I figured it out. The dropdown option shouldn't be there. You can't select codel or fq_codel on MIPS, it gives an error. It's not implemented.
     
    Last edited: Nov 13, 2016
  86. cloneman

    cloneman Addicted to LI Member

    Note: This is not meant to stir up drama or to advocate for QoS. I just thought I'd try to explain all the noise behind the "QoS stigma" with a theory I cooked up.

    I've been thinking about the issue of QoS as it compares to other issues, and I've found a parallel to another confusing concept: Nutrition and Fitness, which leads to "Biohackers" and "Bro Science".

    Slightly Long explanation:
    There is a lot of misinformation surrounding nutrition and while there are some universal truths (e.g. vegetables are good for you), for the most part the information and diets available are such a mess and it's hard for most people to figure out the ideal diet for humans, let alone for specific humans with different needs, genes, etc.

    Even though real science is important, a large number of people turn to "health gurus" and anecdotal experimenters. Many people have great success by mostly following unproven or anecdotal "biohackers", trying experimental supplements and diets, and following the advice of people who are not educated or "qualified" to talk about nutrition. I would suggest that many people have been more successful in taking control of their diet and fitness by using unproven (but plausible) techniques. Sometimes there's outright quackery, sometimes the plausible techniques are wrong, but by experimenting, people gain some feedback on what works and can build a nutrition and fitness plan that is far more practical and effective than one purely based on proven medical research. Another point: traditional scientists and doctors typically don't like these ideas and almost universally call them untrue and even dangerous.

    This is similar to QoS and traffic control. There isn't a universally accepted way on how to manage traffic and there is a lot of misinformation and debate, and products which advertise QoS do not have any instructions that are useful for solving real-world situations. Most people have given up in light of so much conflicting information, no turnkey solutions, and complicated packet flow theory that very few people understand.

    As such, I believe that people turn to "bro science" for their QoS implementation. They pull on levers to see what they do, use modified firmwares that are considered inappropriate for large networks, and invent things like CoDel and IMQ. This is people experimenting with things that the "establishment" of network professionals considers, at least at first, to be useless and taboo. Yet, these professionals do not agree on a solution - you might hear things like "QoS doesn't work on the open internet".

    There is so much misinformation about QoS that 99% of people do the same as they do with their diet: give up and assume full control is impossible. The "bro scientists" in this case are the people that design these consumer-oriented QoS features, and the people who write up guides for them (like this thread) which are based on "best overall result for a use case" instead of "how networks should work according to professionals".

    The reason it's similar to nutrition and fitness is that:

    - Slow internet and confusing food choices are problems that affect everyone, leading to many ideas and perspectives
    - There is no real consensus or practical implementation on the "best" diet or the "best QoS"
    - The medical community and the network professionals community frown upon untested techniques and advice
    - There's money or faith to be lost if Cisco/Juniper/etc. implement untested "bro" QoS techniques and lose customers or confuse them. There's money to be lost in the nutrition industry (or people get sick) if "bro nutrition" turns out to be inaccurate, or if people get confused and stop buying your food
    - Doctors rarely try "bro science" vitamins and diets; network professionals rarely even try Tomato QoS

    Closing Thoughts

    The reason many network professionals consider concepts like inbound QoS to be rubbish is, I believe, that it is a "bro science", whereas most other networking concepts are rigid and unquestionable. Those concepts are implemented by many people, have authoritative standards to back them up and are safe to use in mass quantity by large businesses (like headache medicine).

    However, I believe that "bro science" has its place in both QoS and Health, and I continue to believe in my Tomato QoS ideas and in Dave Asprey's diets high in saturated fat. My Doctor will never like Dave Asprey, and people who work in enterprise may never like Tomato QoS, and that's fine with me. But, these people are wrong to dismiss ideas completely and rudely. In the absence of authoritative information for concepts with many variables (no one knows who is 100% right), give people choice and keep an open mind.
     
    Last edited: Nov 15, 2016
  87. Monk E. Boy

    Monk E. Boy Network Guru Member

    This is the fundamental problem. Nobody expects to actually invest time required to learn anything, they just expect to buy something, put it on the network, and to have it work without any effort on their part. If it doesn't work they throw it out and buy something else. Rinse, wash, repeat. Most of the people trying out third party firmware are basically trying to find another option to throwing it out and buying something else, they're still not willing to learn.

    People posting questions online represent the tip of the iceberg, and they should be commended for being willing to learn, but there are a lot of people who just try firmware as a last ditch effort before throwing out their hardware.

    This mentality is why IoT has become a tragedy of the commons. I've even seen idiots threatening lawsuits if their internet connection gets shut off because of some misconfigured/misengineered IoT device that's on their network got infected and is spewing DDoS traffic.
     
  88. JustinChase

    JustinChase Networkin' Nut Member

    I'm wondering, is there any way to force my Verizon Network Extender to have a higher priority over everything else no matter what; with or without turning on QoS, or without trying to setup QoS for other traffic?

    I use satellite internet, and once I use all my allotted data it slows to a crawl. I wouldn't know how to set limits that are dependent on whether or not I'm in data restriction.

    I really just want to ensure that ALL traffic passing thru the network extender takes #1 priority over any/all other traffic. As it is now, if there is any traffic while in data restriction, I can't make or receive phone calls with usable quality.

    Thanks in advance!
     
  89. cloneman

    cloneman Addicted to LI Member

    You can prioritize traffic from that 1 device by setting it to highest priority based on its MAC address. If you'd like to not touch QoS for your other traffic, you could optionally remove all the other classifiers on the classification page.

    However, knowing the usable speed of your internet connection is not optional. You would need to figure out the speed at which your internet connection behaves when it's throttled and change the global max upload and download on the QoS page. Potentially, you could write a script that runs a speedtest periodically to detect whether you're in throttled mode, and adjust qos_ibw and qos_obw in nvram to the predetermined throttled speeds minus ~15%, followed by 'service qos-restart'.
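
    A rough sketch of the decision logic in such a script, with the speed test itself left out. Everything numeric here is a made-up placeholder, I'm assuming qos_ibw/qos_obw are expressed in kbit/s, and the restart command is copied as written in this post. Stock Tomato does not ship Python as far as I know, so on the router this logic would need Entware or a rewrite as a shell script; the sketch only builds the command strings and does not run them.

```python
# Hypothetical measured speeds in kbit/s as (download, upload); replace with your own.
PROFILES = {
    "full": (10_000, 1_000),
    "throttled": (128, 64),
}


def detect_mode(measured_down_kbit, threshold_kbit):
    """Guess the mode from one speed test result (threshold is an assumption)."""
    return "throttled" if measured_down_kbit < threshold_kbit else "full"


def limits_with_headroom(down_kbit, up_kbit, headroom_pct=15):
    """Take ~15% off the measured speeds, using integer arithmetic."""
    keep = 100 - headroom_pct
    return down_kbit * keep // 100, up_kbit * keep // 100


def qos_commands(ibw, obw):
    """Build the commands described in the post; nothing is executed here."""
    return [
        f"nvram set qos_ibw={ibw}",
        f"nvram set qos_obw={obw}",
        "service qos-restart",
    ]


mode = detect_mode(measured_down_kbit=110, threshold_kbit=1_000)
ibw, obw = limits_with_headroom(*PROFILES[mode])
for cmd in qos_commands(ibw, obw):
    print(cmd)
```

    Keeping the limits as predetermined profiles, and using the speed test only to pick between them, avoids chasing every fluctuation of a noisy measurement.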
     
  90. JustinChase

    JustinChase Networkin' Nut Member


    Thanks for the advice. I've added a new classification for the extender, based on the MAC of the device, and I set it to be the highest classification. I set artificially high bandwidth limits, to effectively not throttle anything, since I really just want to force the extender traffic to have priority over everything, and don't want to try messing with changing the limit based on the speed I'm receiving at any time from the constantly fluctuating satellite connection.

    I also set the class for the extender to Service, thinking that would help force it to get highest priority. I don't want to mess with all the other classes, since someday I may get a real internet connection, and don't want to have to re-set them all up again. I also don't want to set bandwidth limit to the low speed required for when I'm throttled, since it will be next-to-useless when I have "full speed" (about 10MB/sec unrestricted). Also, I get unrestricted from midnight to 5am, so I'd have to automate the switching twice/day, which seems like it will add far too much complexity to what I'm hoping can be a fairly simple setup.

    I then made a call, which was fine, then started playing a video, and the call quality became useless (my efforts didn't work).

    I'm sure I'm missing something, since this is all still pretty much a black art to me. What more/else do I need to do?

    Will screenshots of my setup help?
     
  91. cloneman

    cloneman Addicted to LI Member

    You can't set artificially high bandwidth limits. You have to take a chunk of bandwidth off the table or it doesn't work. (Maximums have to be ~10-15% less than your connection's speed test.) If your satellite has totally unpredictable speed tests, you have to be even more conservative in your max limits.

    Try to not think of it as prioritizing important traffic. It's more like dropping unimportant traffic and leaving important traffic untouched.

    Once you know what the numbers are that work you can probably use the scheduler admin-sched.asp to modify qos_ibw and qos_obw and 'service qos-restart'.
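
    As a sketch of what the scheduled switch would decide: the midnight-to-5am unrestricted window comes from the earlier post, while the limit pairs and the function name are placeholders. It ignores whether the data allowance has actually been used up, so it errs on the side of the throttled limits outside the free window.

```python
from datetime import time

FREE_START = time(0, 0)  # unrestricted from midnight...
FREE_END = time(5, 0)    # ...until 5am, per the earlier post


def limits_for(now, full, throttled):
    """Return the (qos_ibw, qos_obw) pair to apply at wall-clock time `now`."""
    return full if FREE_START <= now < FREE_END else throttled


# Placeholder (download, upload) limits; units assumed to match qos_ibw/qos_obw.
FULL = (8_500, 850)
THROTTLED = (100, 50)

print(limits_for(time(3, 0), FULL, THROTTLED))   # inside the free window
print(limits_for(time(18, 30), FULL, THROTTLED)) # outside it
```

    In practice this boils down to two scheduler entries, one at each edge of the window, each setting the two values and restarting QoS.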


    In an ideal world you could leave the bandwidth artificially high and not 'lose' any, and then drop it suddenly when the router detects an inbound call. I don't know how to do that, though. It's possible, but I don't think anyone has done it before.
     
  93. JustinChase

    JustinChase Networkin' Nut Member

    Darn; that's too bad.

    Is there any way (with or without QoS on the router) to ensure my extender always has a minimum amount of data available? I really only need to ensure about 8 KB/s to get quality phone calls.

    I have to physically turn off all computers in my house to ensure nothing uses my bandwidth if I want to make a simple phone call :(

    I wish I didn't have to play games to ensure 8 KB/s, but that's how terrible satellite internet is :(
     
  94. cloneman

    cloneman Addicted to LI Member

    In theory you could setup a QoS system that, upon a trigger method of your choice, heavily punishes other computers without disabling internet access outright.

    Step 1. Detecting that a call is taking place (hard)
    Step 2. Enabling/disabling the QoS that punishes all traffic to a very low amount during phone calls, by modifying qos_ibw or qos_irates (easy)

    Something like iptables-save -c | grep <mac address> could tell you how many bytes/packets are being transferred by the iptables rule that your VoIP traffic is running on. In theory, you could query continuously and subtract to see if a call is taking place :confused:. This is entirely a rabbit-hole suggestion because I'm not at all qualified to discuss using iptables or scripting, or whether it even makes sense to have something checking continuously for traffic in this convoluted manner that I've cooked up, which is probably too error-prone or limited to be useful in real life.
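
    The "query and subtract" idea can be sketched like this. It only parses text that looks like iptables-save -c output, where each rule is prefixed with [packets:bytes]; the sample rules, the MAC address and the 4000 bytes/s threshold are invented for illustration, and whether your firmware has a rule that matches the extender's MAC at all is something you would have to check.

```python
import re

# iptables-save -c prefixes every rule with "[packets:bytes]".
COUNTER_RE = re.compile(r"^\[(\d+):(\d+)\]")

MAC = "AA:BB:CC:DD:EE:FF"  # hypothetical extender MAC

# Two made-up snapshots taken 10 seconds apart.
SNAPSHOT_1 = """\
[500:40000] -A FORWARD -i br0 -j ACCEPT
[1200:96000] -A FORWARD -m mac --mac-source AA:BB:CC:DD:EE:FF -j ACCEPT
"""
SNAPSHOT_2 = """\
[650:52000] -A FORWARD -i br0 -j ACCEPT
[2200:176000] -A FORWARD -m mac --mac-source AA:BB:CC:DD:EE:FF -j ACCEPT
"""


def bytes_for_mac(iptables_save_output, mac):
    """Sum the byte counters of every rule line that mentions this MAC."""
    total = 0
    for line in iptables_save_output.splitlines():
        if mac.lower() in line.lower():
            m = COUNTER_RE.match(line.strip())
            if m:
                total += int(m.group(2))
    return total


def call_in_progress(prev_bytes, curr_bytes, interval_s, min_rate_bytes_s=4000):
    """True if the byte counter grew faster than the (assumed) threshold."""
    if curr_bytes < prev_bytes:
        return False  # counters were reset (e.g. firewall restart); skip this sample
    return (curr_bytes - prev_bytes) / interval_s >= min_rate_bytes_s


before = bytes_for_mac(SNAPSHOT_1, MAC)
after = bytes_for_mac(SNAPSHOT_2, MAC)
print(call_in_progress(before, after, interval_s=10))
```

    Requiring a few consecutive "true" samples before flipping QoS on, and a few "false" ones before flipping it off, would probably make this less twitchy.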

    You could also maybe use the WPS button to turn the QoS on and off.

    If someone competent (i.e. not me) can suggest a simple way to detect traffic flow / call-in-progress for that MAC address, then turning on a 'punitive temporary' QoS is the trivial step #2.

    Of course, before going any further, test with a punitively low QoS value and make sure it actually helps.

    EDIT: I've located a Perl script that may help in measuring whether VoIP traffic is occurring.
    https://web-beta.archive.org/web/20070925231745/http://snaj.ath.cx:80/tc-viewer/tc-viewer.html

    This rabbit-hole project might have some hope yet! You'd have to get Entware set up as a prerequisite. PM me if you want to discuss this any further.
     
    Last edited: Dec 16, 2016
  95. ruggerof

    ruggerof Network Guru Member

    Just a quick question to the experts.

    If I am on a DSL but in a double-nat situation, should I still have something in the DSL Overhead Value - ATM Encapsulation Type?
     
  96. cloneman

    cloneman Addicted to LI Member

    Yes (if ADSL). Next-gen VDSL is often fiber-backed, so compensation is not necessary. The ATM compensation prevents the bottlenecked DSL link from getting full. The double NAT is not a bottleneck, since it's 1 Gb/s.

    The ATM compensator's job is to make sure the letter is light enough for the child at the end to carry. In the middle, it's just big trucks and bodybuilders, but the weight limit for the child at the end must still be respected if you want QoS to work.

    --
    Background for anyone reading this in the future: the ATM compensator works by assuming the data size will increase later in the chain. With ATM's inefficiencies, small packets carry a lot of overhead, and when QoS is making a decision, it has to "count" how much space the data will end up needing when it gets sent over ATM. Without compensation it counts 'wrong', because it assumes Ethernet, cable, or VDSL, which do not have this inefficiency, so it doesn't set aside enough bandwidth, leading to potential unpredicted bandwidth exhaustion/congestion.
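To put rough numbers on that inefficiency, here is a small worked example. It relies only on the standard ATM cell layout (53-byte cells carrying 48 bytes of payload, with the last cell padded out). The per-packet overhead of 32 bytes is purely an illustrative assumption, since the real value depends on the encapsulation type, and the function name is made up. It is not a description of how Tomato's own compensation is computed.

```python
ATM_CELL_BYTES = 53     # size of one ATM cell on the wire
ATM_PAYLOAD_BYTES = 48  # usable payload per cell; the other 5 bytes are header


def atm_wire_bytes(packet_bytes, overhead_bytes=0):
    """Bytes actually sent over ATM for one packet.

    The packet plus its encapsulation overhead is split into 48-byte
    chunks, and every chunk (even a partly filled last one) costs a
    full 53-byte cell.
    """
    payload = packet_bytes + overhead_bytes
    cells = -(-payload // ATM_PAYLOAD_BYTES)  # integer ceiling division
    return cells * ATM_CELL_BYTES


if __name__ == "__main__":
    # Assumed overhead of 32 bytes, for illustration only.
    for size in (40, 200, 1500):
        wire = atm_wire_bytes(size, overhead_bytes=32)
        print(size, wire, round(wire / size, 2))
```

With these assumptions a 40-byte packet takes two cells (106 bytes on the wire), while a 1500-byte packet takes 32 cells (1696 bytes), which is why small packets such as ACKs and VoIP frames are hit hardest.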

    In Tomato's implementation of TC-ATM, turning it on lowers the effective speed of the link, both up and down. It's possible that it's overcompensating (to be safe) instead of calculating the exact inefficiency. This is why I tell people "just pick any bytes number" if they need compensation. For me, all of them produce the same result.
     
    Last edited: Jan 25, 2017
  97. ruggerof

    ruggerof Network Guru Member

    Thanks for the explanation, cloneman! My speed is 100/10, so most likely I am on VDSL2.
     
  98. sabishii

    sabishii New Member Member

    Found Toastman's QOS example "LATEST AND BEST SO FAR" (as of 2010).

    [Three images, not preserved]

    Used the Internet Wayback Machine
     
  99. cloneman

    cloneman Addicted to LI Member

    Why would something from 2010 be better than the rules he includes in his current releases?
     
  100. edusodanos

    edusodanos Serious Server Member

    The latest rules should include the ports of widely used mobile applications like WhatsApp and Facebook ...
     