
QoS, There Has To Be a Better Way......There Is

Discussion in 'Tomato Firmware' started by lightsword, Nov 1, 2013.

  1. lightsword

    lightsword Serious Server Member

    So I'm reading through the QoS sticky, and throughout the whole thing I'm thinking: this just seems wrong. I'm not saying it doesn't work to some degree, but the way traffic is classified and prioritized just seems, well, unfair. This is no fault of toastman or anyone in particular; every router I have come across has terrible QoS capability, and I have ended up turning it off very quickly. The issue I see is that you are running an apartment complex network, which puts you more or less in the position of an ISP, but you are managing it like a network under your full control by choosing which traffic/applications get through. If an application I need happens to use peer to peer, I would expect it to get the same bandwidth as my web browser. Another issue is latency. Latency is incredibly important for games and VoIP, and your system can't classify everything properly. You can cap everything a little lower, but the prioritization scheme throws that out the window to some degree if the network is fully saturated.

    I think a proper QoS system for an apartment should simply split bandwidth evenly among all residents regardless of the application. Obviously we don't want to waste any bandwidth, so the split has to be dynamic. Meet the Hierarchical Fair Service Curve (HFSC) network scheduling algorithm, an advanced algorithm that does everything we would want in situations like these. It provides dynamic limiting of all users by IP address or any other bucket, full network utilization, and latency controls. That way, if a user wants to play a game, they should have good latency as long as the game uses little bandwidth and they are not downloading anything; the same should be true for any latency-sensitive application. At the same time, someone can download a file over peer to peer without crippling the network.

    This is possible because the maximum rate per user is identical for all clients that request more speed than is available, but it changes with total network utilization. Rather than the admin manually limiting certain protocols because they use more bandwidth than is available, bandwidth per application is back in the users' control, since the algorithm can be protocol agnostic and only take IP addresses into account. Another advantage is that you can tell your users to run downloads during off-peak hours and they will get much faster speeds. By giving everyone their own dynamically allocated bandwidth bucket you can solve all these issues at once. It is also highly customizable, so if you want to allocate bandwidth differently between users that is also possible, and you can even give one user ultimate priority over all others. OpenWRT has a number of examples of how that would be done. Hopefully something like this can be integrated into the tomatousb GUI.
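As a rough illustration of the per-IP buckets described above, here is a minimal Python sketch that only builds (and does not run) `tc` commands for one HFSC class per IP address. The interface name `br0`, the class numbering, the small catch-all class, and the helper name `hfsc_per_ip_commands` are all assumptions made for the example; nothing here has been tested on Tomato, and download shaping is assumed to happen on the LAN-facing interface by matching the destination IP.

```python
def hfsc_per_ip_commands(dev, total_kbit, ips):
    """Build tc commands that give every IP an equal HFSC link-share class.

    Hypothetical helper: it returns command strings and executes nothing.
    """
    share = total_kbit // len(ips)  # equal link-share weight per IP
    cmds = [
        # Root HFSC qdisc; unclassified traffic falls into class 1:2.
        f"tc qdisc add dev {dev} root handle 1: hfsc default 2",
        # Parent class capped at the line rate.
        f"tc class add dev {dev} parent 1: classid 1:1 hfsc "
        f"sc rate {total_kbit}kbit ul rate {total_kbit}kbit",
        # Catch-all class for traffic that matches no per-IP filter.
        f"tc class add dev {dev} parent 1:1 classid 1:2 hfsc "
        f"ls rate 1000kbit ul rate {total_kbit}kbit",
    ]
    for minor, ip in enumerate(ips, start=0x10):
        # 'ls' shares bandwidth in proportion to the rate when the line is
        # contended; 'ul' lets a lone user still reach the full line rate.
        cmds.append(
            f"tc class add dev {dev} parent 1:1 classid 1:{minor:x} hfsc "
            f"ls rate {share}kbit ul rate {total_kbit}kbit"
        )
        cmds.append(
            f"tc filter add dev {dev} parent 1: protocol ip prio 1 u32 "
            f"match ip dst {ip}/32 flowid 1:{minor:x}"
        )
    return cmds


if __name__ == "__main__":
    for line in hfsc_per_ip_commands("br0", 50000, ["192.168.1.10", "192.168.1.11"]):
        print(line)
```

The equal `ls` rates are what make the split even under contention, while the shared `ul` ceiling is what keeps a single active user from being held below the line rate.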
  2. Marcel Tunks

    Marcel Tunks Networkin' Nut Member

    Funny that the OpenWRT site's example video is actually an explanation of fq_codel, which may be a better way of achieving what you describe. Unfortunately it's in kernel 3.5 and above.
  3. lightsword

    lightsword Serious Server Member

    Maybe it can be backported. I think codel can actually be combined with HFSC for certain configurations. HFSC is probably what's needed for per-IP dynamic bandwidth splitting though; I'll see if I can find the correct way to configure it. Take a look at this: http://wiki.openwrt.org/doc/howto/packet.scheduler/packet.scheduler.example4
    The big thing is that this allows dynamic bandwidth allocation. There are a number of other example scripts that may be useful as well, such as this one: http://wiki.openwrt.org/doc/howto/packet.scheduler/packet.scheduler.example3
  4. phuque99

    phuque99 LI Guru Member

    Traffic control is only necessary when demand exceeds supply. The "better way" is simply to get more bandwidth so that supply meets and exceeds demand.
  5. Toastman

    Toastman Super Moderator Staff Member Member

    It sounds wonderful. Yes! Until you try them.
  6. jerrm

    jerrm Network Guru Member

    Two problems with a strictly by-IP approach: first, a user is not necessarily a single IP. Second, with enough traffic you still eventually have to decide what type to prioritize.

    Options are always good, and different approaches are always better for different situations, but I don't think it's valid to argue it is better in all or even most instances.
  7. lightsword

    lightsword Serious Server Member

    If a user is not a single IP you have a security or authentication issue. But this should allow you to bucket all of a user's devices if set up that way. How much traffic are we talking about in toastman's apartments? The bandwidth for any single user should follow this general equation:

    threshold = ((total bandwidth available) - (sum of all traffic from users not at threshold)) / (number of users at threshold)

    This lets each user choose what they want to do within their share of the total available bandwidth. That is why it's called a fair service queue. In toastman's case, how much total bandwidth are we talking about, and how many users are active at any one time? Only with a very high volume of users should there be problems, as no user should be able to generate more traffic than a proportional share at any time. I'm somewhat experienced in algorithmic programming and I've worked with a number of somewhat sophisticated informatics algorithms, although I've only worked in python for the most part. To me this method is the only one that makes sense. If the network happens to be so congested that you absolutely have to use a classification system, it should only kick in under peak conditions when the other system does not work, and it can be put in place on top of this system. The big failure I see with most QoS systems is prioritizing the "what" instead of the "who", which in this case means "traffic type" instead of "unique user". In addition, the lack of proper latency management in the current system is a huge issue. These modern QoS algorithms I am talking about are both highly customizable and highly dynamic, and should be able to solve congestion issues better than simple traffic type based prioritization.

    I have lived in apartment complexes, even ones with very high speed lines, that failed to implement traffic splitting properly. It can be extremely annoying when a latency-sensitive application gets killed off by an HTTP download or a torrent, and equally annoying to have downloads take hours even during off-peak times when the network is at low utilization. This method ensures no single user can be more "toxic" than any other: a user running a single HTTP download will not harm the network any more than a user running a torrent that establishes 50 connections, since their IP addresses will be rate limited the same.

    I currently live in an apartment where a 50 megabit line is shared among about 5 people, and it is extremely rare for all of them to request maximum bandwidth, or anything close to it, at the same time. The issue is that a torrent or other high bandwidth download can kill latency-sensitive applications by saturating the connection, because the packets are not prioritized. In my case I want the full 50 megabits to be available as long as it doesn't interfere with any other traffic. This algorithm should split the traffic evenly, so if 2 users try to max the connection they will get 25 megabits each, and if a third user is running a low traffic latency-sensitive application such as a game, they should get ultimate priority up until the point that the application is using more bandwidth than either of the other 2 downloading users. This method also deals effectively with the difficulty of classifying latency-sensitive UDP applications, since as long as the application is low bandwidth it will be high in the buffer queue. If a user complains their game latency is high, that would mean they are downloading something in the background. The key to my proposed method is ignoring the traffic type and focusing only on providing fair service (allocation) to each user.
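The threshold equation above describes what is usually called max-min fair sharing. A minimal Python sketch of it follows; the function name `fair_shares`, the user labels, and the demand figures are illustrative assumptions, and this computes only the target rates, not how HFSC or any real scheduler would enforce them.

```python
def fair_shares(total, demands):
    """Max-min fair allocation.

    Users asking for less than the current threshold get their full demand;
    whatever is left is split evenly among the users at the threshold.
    """
    alloc = {}
    remaining = float(total)
    pending = dict(demands)
    while pending:
        threshold = remaining / len(pending)
        satisfied = {u: d for u, d in pending.items() if d <= threshold}
        if not satisfied:
            # Everyone left wants more than an even split: cap them all.
            for user in pending:
                alloc[user] = threshold
            break
        for user, demand in satisfied.items():
            alloc[user] = demand
            remaining -= demand
            del pending[user]
    return alloc


if __name__ == "__main__":
    # 50 Mbit line: two users trying to max it out, one low-rate game.
    print(fair_shares(50, {"downloader_a": 50, "downloader_b": 50, "gamer": 1}))
```

With the hypothetical 1 Mbit game in the mix, the two downloaders end up at 24.5 Mbit each, and with no third user they get the 25 Mbit each mentioned in the post.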
  8. jerrm

    jerrm Network Guru Member

    I live in a household of six. At any time there are 30+ connected devices, about half wireless. That's not a security issue, it's just the way things are. If I stay at a hotel for more than a night, I'll probably have at least three devices connected. Yes, things can be aggregated by account, but that becomes another rather significant management problem.

    5 users for 50 Mbps is a good ratio; in many cases it will be more like 30 users for 10 Mbps or worse.

    From your QOS perspective of everybody's equal no matter what they are doing, it makes sense. If I implemented QOS that way most anywhere we manage, I'd almost certainly lose the client. "What" is often more important than "Who" for my installs, but admittedly "what" can often be tied to "who" and hfsc could probably adapt to both.

    In tomato-speak what you're describing might be better classified as a bandwidth limiter replacement using hfsc instead of htb.

    My issue is not so much the underlying tool used, but the everybody's equal assumption. For an apartment, that may fit. For my house it does not, for my office it doesn't.
    Last edited: Nov 1, 2013
  9. lightsword

    lightsword Serious Server Member

    HFSC is highly configurable and should be able to make per-account buckets. IMO you should only classify traffic on a network you have full control of, unless you can't get by with device limiting. I just took a look at my network stats for the last 24 hours. On a 50 megabit line, the average total download rate is 0.5 megabits a second with a peak of 15 megabits a second and a total transfer of 5 GB; the average upload is 0.1 megabits a second, peaking at 4 megabits a second, with a total transfer of 1 GB. This is with 5 different people who use the internet extensively, it is obviously a tiny percentage of the network capacity, and it is with QoS turned off.

    The issue comes down to peak traffic management and how to manage the packet buffer properly. If a user requests full bandwidth they should get all that is available, but any other user's packets should have priority up until the transfer rates are equal. Generally, IP based buckets should be fine unless you are on a network that is significantly over capacity. A number of factors get overlooked in regard to traffic management. The thing about bulk downloads is that the user will generally transfer that data no matter what; if the transfer finishes quickly, they are no longer using network resources and those are freed up sooner. That way, rather than setting a download to run continuously overnight and into peak times, they have a good chance of finishing it while network congestion is low and their speed is high. Slowing down users when the network is not at capacity is BAD; the line should only sit below capacity when there aren't enough transfers to request full usage. Minimizing bandwidth waste is very important for getting bulk downloads finished and off the network as quickly as possible, so that fewer of them are running during peak times.

    Many types of transfers, such as web traffic, are burst transfers. These work very well with the type of QoS I am proposing, since users request high transfer rates but only over short periods. By finishing these as quickly as possible with all available bandwidth, you reduce the time the network spends at peak capacity and reduce the latency of every request. The internet is like a series of tubes (heh), and certain things will keep trying to get through until they are done; you don't want to be slowing those down when there is excess tube capacity. All that's left is to manage latency-sensitive applications, and this system is ideal for that, since traffic from a low bandwidth user has priority over traffic from a high bandwidth user. It is only the user's own fault if they are downloading in the background while playing a game or using VoIP, as their traffic has priority up until the bandwidth threshold.

    Traffic type classification only works properly on networks you fully control, where you know certain applications must have priority over everything else. But HFSC is flexible even for that, as you can place a specific traffic type above the regular traffic while still properly splitting the excess. You should never hard cap any transfers if there are unused network resources, as you are just drawing the transfer out over a longer period of time. The problem is that the current tomato bandwidth limiter does not function dynamically. All you can do is set thresholds manually, when they should be set dynamically based on network utilization; there is no way to tell it to do an even split of the full available bandwidth based on current requests and utilization. The number of devices is not as relevant as how many are in use at once while there is network congestion. You wouldn't want to limit Windows Update in the middle of the night if nobody else is using bandwidth.
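The 24-hour figures quoted in this post can be sanity-checked with a little arithmetic. The Python sketch below assumes decimal units (1 GB = 8000 megabits), and the helper name `average_mbit_per_s` is made up for the example.

```python
def average_mbit_per_s(total_gigabytes, hours):
    """Average rate in megabits per second for a total transferred over a period."""
    megabits = total_gigabytes * 8 * 1000  # decimal units assumed
    return megabits / (hours * 3600)


if __name__ == "__main__":
    down = average_mbit_per_s(5, 24)  # 5 GB downloaded in 24 hours
    up = average_mbit_per_s(1, 24)    # 1 GB uploaded in 24 hours
    print(f"average download: {down:.2f} Mbit/s")
    print(f"average upload:   {up:.2f} Mbit/s")
    print(f"share of a 50 Mbit line: {down / 50:.1%}")
```

Both averages come out close to the 0.5 and 0.1 megabits a second quoted, which puts the average load at under 1% of the 50 megabit line; only the peaks need managing.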
  10. jerrm

    jerrm Network Guru Member

    That describes every installation I have. I would not want the current traffic type classification removed; per device simply will not work without many more management headaches.

    If someone can give me a better integrated QOS/Limiter solution, I will use it, and I don't really care what the underlying modules are, but I can't see it ever working without a traffic type component. Properly integrated, I could see the approach potentially simplifying the needed classes, which would be a good thing.

    Exactly why I thought the per device approach might be better as a limiter replacement, at least as a starting point. The hfsc module is already in tomato (shibby at least) - I assume it works. See if you can work up a scripted solution and throw it out into the wild. There are folks here who will try anything. See if it goes anywhere.
  11. lightsword

    lightsword Serious Server Member

    I'm currently working on coding something. I'm fairly sure codel is going to come into play for latency handling and bandwidth detection. It's fairly complicated algorithmic code, and seems to be a good deal more complicated than any current QoS implementation. Hopefully it doesn't cause performance issues for the router's CPU. I'm going to start with it on OpenWRT since codel is currently not in tomato, and if I can make it work there I'll try to backport codel to tomatousb.
  12. Marcel Tunks

    Marcel Tunks Networkin' Nut Member

    Gargoyle would be an easier start than OpenWRT, since it can do much of what you are talking about straight from the GUI. Last time I tried it, the QoS used a combination of hfsc with sfq. Paul Bixel and Dave Taht had some back and forth in the forum about whether fq_codel would be better than sfq, but I don't know if either of them got around to testing it. Supposedly there's no difference in CPU load between the two. I think cerowrt defaults to htb instead of hfsc, but I don't have the appropriate router to double check. The bufferbloat folks may have some insight into the pros and cons.

    Maybe I'm wrong about hfsc as a standalone solution - you clearly know more about it than I do. I just don't understand how it would prevent latency spikes when the network is under a heavy load.
  13. lightsword

    lightsword Serious Server Member

    It looks like it would probably be a combination of the two, at least if you want automated bandwidth measurement. The issue I'm running into is a lack of documentation since the traffic shapers are very new still.
  14. Toastman

    Toastman Super Moderator Staff Member Member

    @lightsword - It's interesting that you lived in apartment blocks where things didn't work out for you. That is precisely how I got involved with this stuff in the first place. I haven't seen a single apartment block in this city that didn't suffer the effects of some guy hogging bandwidth, no matter what high bandwidth was available (and we do have up to 200Mbps now). In general, residents never remembered the days where things worked, they only remembered the days when they couldn't download anything or read their mail. (Just before they moved out). Tomato changed all that.

    I currently have about 70 users online (it's the weekend) and D/L pegged at 14 Mbps of a 16 Mbps line, with snappy HTTP browsing (most online pages open in around a second) and approx. 25-35 ms ping to the ISP gateway. Looks like about 20 people are presently watching videos on YouTube. There is no stutter. That is how I expect QOS to work, and with proper attention, it does so.

    None of the users here has ever been aware that he is sharing that 16Mbps line with almost 200 other people.

    QOS works well here because my particular ISP has always provided exactly what they advertise, or better. Their 1Mbps/16Mbps is just what it says.

    Be interesting to see what you come up with.
  15. lightsword

    lightsword Serious Server Member

    The apartment blocks I lived in had slightly different problems, although I suspect browsing was better than it is at your complex. One was a complex of around 400-500 people with a 400 megabit primary fiber line. The residents were mainly students and it was somewhat higher end. The issue was that during peak times torrents would saturate the line and make the connection nearly unusable; you could only get 0.1 megabits and latency was very high. Their rather annoying solution was to hard limit every IP to 5 megabits a second and cap transfers at 1 GB a day, and it took 3 months of everyone constantly complaining for them to do even that. Before the cap, I could get 30 megabits during off-peak hours and finish all my downloads then. After it, there was a ton of spare bandwidth during off-peak hours and I was still waiting hours for transfers. The cap was trivial to bypass; you just had to rotate the MAC address every time it was hit. Luckily you could pretty much always get 5 megabits a second, but there were still occasional latency issues. They went way overboard on what was necessary, IMO.

    The problem seems to be mainly that nobody has coded a proper load balancer that can prevent toxic traffic or a single user from saturating a network while still letting that traffic run at a reasonable rate and use available bandwidth if requested. From the sound of it, the connections you are working with are far more overloaded than is typical, and you would need more aggressive QoS than normal. You would likely need to limit downloads more during peak usage, and may not be able to rely completely on IP based fair sharing for a 16 megabit connection shared between 70 people, but it would seem the HFSC resource splitting could still be used. I suspect these newer schedulers aren't used all that much yet due to their high configuration complexity.

    The type of traffic management I most typically need is simply to protect latency-sensitive applications and games from getting cut off. fq_codel appears to be very useful for managing buffer queues appropriately, which is a big part of what makes a connection feel slow, and it can be used in combination with HFSC to split traffic appropriately. The problem with current implementations is that I don't want anything hard capped. If there is bandwidth available on the main line, it should be available for anything up until the network is fully utilized, but if you allow that, a torrent will then saturate the network and kill connections. If prioritization is managed correctly, then someone who wants to torrent while nobody else is downloading can do so without a problem; if someone starts a game, the torrent shouldn't have priority over the game traffic but should still be able to download using the remaining bandwidth. The latency on your network would likely still be too high for applications like games, at least for me. Games are extremely sensitive to latency spikes, latency in general, and any connection variance, and need more or less full priority in the buffer queue. The problem is that games generally use UDP traffic, which can be tricky to auto-classify. One way to prioritize it is to prioritize all traffic from an IP address based on its bandwidth utilization: since the total bandwidth for that IP should be lower than the per-user max threshold, its traffic should have a higher priority in the buffer than any downloads.
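To put the hard-cap complaint above into numbers, here is a small Python sketch comparing how long a transfer occupies the line at the 5 megabit cap versus the 30 megabits available off-peak. The 1 GB transfer size and the helper name `transfer_seconds` are assumptions for illustration, and decimal units are assumed (1 GB = 8000 megabits).

```python
def transfer_seconds(size_gigabytes, rate_mbit):
    """Time needed to move a transfer of the given size at a steady rate."""
    megabits = size_gigabytes * 8 * 1000  # decimal units assumed
    return megabits / rate_mbit


if __name__ == "__main__":
    capped = transfer_seconds(1, 5)     # hard-limited to 5 Mbit/s
    uncapped = transfer_seconds(1, 30)  # off-peak rate without the cap
    print(f"capped:   {capped / 60:.1f} minutes")
    print(f"uncapped: {uncapped / 60:.1f} minutes")
```

The same data occupies the line six times longer under the cap, which is the sense in which a hard cap on an idle line just draws a transfer out.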
  16. Gitsum

    Gitsum LI Guru Member

    I'm no coding wizard or anything, but I do like to try new things and test them. I tried for a very long time to get HFSC to work using pfSense, which is the first place I'd ever heard of it, and it would just never work right. I tried to get answers on the forums and no one really seemed to care. I gave up and quit using pfSense because of this. I have tried all sorts of Linux distro versions of a home "Firewall" and such, but NOTHING has ever been able to match the effectiveness of Ubicom's Stream Engine that was on my old D-Link DIR-655! Too bad it can't keep up with today's high speed internet connections, and too bad they got bought out.
  17. lightsword

    lightsword Serious Server Member

    Yeah, unfortunately queue management and bandwidth allocation are pretty bad right now, and it looks like fq_codel is also a vital part of the equation for getting networking working properly, possibly the more important part, at least for relatively uncongested networks that simply need reliability for low latency applications. I had a DIR-655 a few years back and recently made an attempt to port OpenWRT to it, but I need someone who knows the kernel a little better in order to do that. Maybe there is some interesting code I can pull from the old Ubicom source that might help, although I suspect the best parts were tied to the actual hardware. The DIR-655 runs Linux just like most routers do, and used OpenWRT internally as its base.
