OpenVPN swamping CPU

Discussion in 'Tomato Firmware' started by kzrssk, Aug 7, 2014.

  1. kzrssk

    kzrssk LI Guru Member


    I am trying to implement Tomato with OpenVPN on a WRT54GS, but in testing, I got really slow transfer speeds. I checked the CPU usage, and it was through the roof. I'm assuming all that usage is either encryption or compression. Is there something I can tweak to reduce the load on the CPU, or is my router too old to really do OpenVPN at all?

    Thanks, all
  2. RMerlin

    RMerlin Network Guru Member

    You need a much more modern router to be able to use OpenVPN effectively. An RT-N66U will cap at around 15-20 Mbps at best with OpenVPN, and its CPU is probably at least twice as fast as the WRT54G's, if not more.
    koitsu likes this.
  3. Almaz

    Almaz Networkin' Nut Member

    Just switch to a lighter encryption cipher, such as RC2-40-CBC. It works fine without a single hiccup and is very easy on the CPU.
  4. lancethepants

    lancethepants Network Guru Member

    And very easy to hack as well :)
    kzrssk likes this.
  5. Almaz

    Almaz Networkin' Nut Member

    For home use it's more than enough, but a corporate setup wouldn't be using a WRT router in the first place.
  6. lancethepants

    lancethepants Network Guru Member

    Newer routers can take advantage of AES asm in OpenSSL, so I'd recommend a newer router and use AES.
  7. rs232

    rs232 Network Guru Member

    Hey lancethepants, is this HW acceleration you're talking about?
    If so how can you verify if the HW supports it or not?

  8. lancethepants

    lancethepants Network Guru Member

    Not so much HW acceleration; that makes it sound like a dedicated piece of hardware just for crypto, which isn't the case. The ASM is just code written in assembly, which should run more efficiently on the CPU.
    In newer routers (probably all MIPSR2 routers) I've seen ASM enabled, but not in routers like the WRT54GL (MIPSR1), which I don't think are capable.
    @RMerlin says he's backported more advanced ASM optimizations into his firmware (I think not in Tomato). Can you confirm?
    Last edited: Aug 8, 2014
  9. RMerlin

    RMerlin Network Guru Member

    Part of the ASM optimizations (MIPS and ARM) are in OpenSSL 1.0.1 (which Tomato uses), however I did backport additional ASM code from 1.0.2 for further performance improvements into my firmware.

    The ASM code supports both MIPSr1 and r2, if I recall. Only the recent 1.0.2 backports added further MIPSr2-specific optimizations.
  10. kzrssk

    kzrssk LI Guru Member

    Okay, I figured that all might be the case. I told my higher-ups it'd probably have to be a newer router or a dedicated server, and I think the decision is we're basically putting it on the back burner for now, heh.

    Definitely eyeing the ARM routers. How much better do they do than the MIPSies for small business-scale?
  11. koitsu

    koitsu Network Guru Member

    Stop buying consumer-grade hardware for workplaces. Reach out to Juniper, Cisco, Sonicwall, and other companies to buy an actual VPN concentrator device. I can particularly recommend Juniper Netscreen devices. Your pps (packets per second) and overall throughput will be extremely high; all these devices have actual hardware-level support for crypto, and you also get support through the companies.
    Toastman likes this.
  12. kzrssk

    kzrssk LI Guru Member

    I haven't bought any; can't really stop...

    When I used to work for a big business, I heard those names. The company had 30,000 employees. I need to provide remote access for one person. The sites don't list prices, so it sounds like a "Well, if you have to ask..." situation that I'm guessing I probably won't be able to justify.
  13. RMerlin

    RMerlin Network Guru Member

    "small business-scale" doesn't really provide a context.

    If we're talking one user at a time for occasional use, it will be fine. If we're talking 3+ concurrent users on frequent occasions, look for a business-class VPN appliance instead.
    gfunkdave and koitsu like this.
  14. koitsu

    koitsu Network Guru Member

    The companies I have worked for have been of varying size; from 20 employees (small ISPs and startups) to tens of thousands (Microsoft). The products I mentioned (particularly Netscreens) have been used in both demographics. Trust me: when it comes to something as important as a VPN concentrator used by multiple employees (multiple means 2 or more, maybe 3 or more) for workplace environments, it's best to open up communications with some of the larger vendors and discuss. Seriously -- talk to them, explain to their sales guys what your budget is, what your needs are. You might be surprised what they recommend or offer, and if they can't come up with something, they certainly can direct you to another party who might. Honest: talk to Juniper.
    Toastman and kzrssk like this.
  15. kzrssk

    kzrssk LI Guru Member

    Okay, I'll check them out. Thanks :)
  16. JeffB

    JeffB Reformed Router Member

    As far as the arm routers go, I'm using the R7000 as a VPN client using Shibby v121, not overclocked. Openssl speed for aes reports:

    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128 cbc      31758.59k    35164.96k    36501.15k    36730.88k    36623.70k
    aes-192 cbc      27093.08k    29829.10k    31017.13k    31317.33k    31460.01k
    aes-256 cbc      24181.03k    26256.78k    27091.37k    27326.12k    27424.09k
    The 128 blowfish cbc encryption (commonly used with openvpn as well) is roughly the same speed as aes-192 cbc. This is sufficient for my needs for home use as my bandwidth is only 30 Mbps down / 5 Mbps up. For a business setting though it might be better to look at more professional solutions as mentioned above. If you were planning to purchase a router for the sole purpose of VPN you might look at the ZyXEL ZyWALL110 as it is not that much more expensive than the high-end consumer routers but it is specifically for VPN.
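    To set those `openssl speed` figures next to a line rate, it may help to convert them. Below is a minimal Python sketch, assuming the `k` suffix means thousands of bytes per second (which is what OpenSSL's own output note says). The function name is made up for illustration. Treat the result as a rough ceiling for raw cipher work on one core, not as expected OpenVPN throughput, since HMAC, packet handling and tunnel overhead all cost extra.

```python
def openssl_k_to_mbps(figure: str) -> float:
    """Convert an `openssl speed` figure such as '36623.70k' to Mbit/s.

    Assumption: 'k' means 1000s of bytes processed per second.
    """
    kbytes_per_s = float(figure.rstrip("k"))
    return kbytes_per_s * 1000 * 8 / 1_000_000


# 8192-byte column copied from the R7000 table above
r7000 = {
    "aes-128 cbc": "36623.70k",
    "aes-192 cbc": "31460.01k",
    "aes-256 cbc": "27424.09k",
}

for cipher, figure in r7000.items():
    print(f"{cipher}: {openssl_k_to_mbps(figure):.1f} Mbit/s raw cipher speed")
```

    Even the slowest row works out to well above a 30 Mbps line, but real VPN throughput will land far below the raw cipher number.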
  17. koitsu

    koitsu Network Guru Member

    And because it's relevant: compare those numbers to a generic FreeBSD Core2Quad Q9550 system (AESNI is not available) using OpenSSL 0.9.8za (yes you read that right):

    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128 cbc     131793.50k   144436.58k   146253.09k   146639.42k   145531.28k
    aes-192 cbc     120625.83k   127068.69k   128509.06k   128308.33k   128025.21k
    aes-256 cbc     108479.91k   113556.73k   114611.09k   114872.70k   114303.43k
    blowfish cbc     92141.56k    99253.58k   101148.36k   101634.55k   101737.80k
    rc2 cbc          27783.28k    28639.94k    28736.41k    28758.02k    28792.40k
    And now an RT-N66U (Broadcom SoC / MIPSR2), which is commonly regarded as "a workhorse" (ha!), using OpenSSL 1.0.1g:

    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128 cbc       9173.16k     9645.23k     9820.81k     9925.56k     9963.24k
    aes-192 cbc       8044.59k     8496.06k     8452.47k     8629.59k     8616.57k
    aes-256 cbc       7213.38k     7569.79k     7694.00k     7700.96k     7710.93k
    blowfish cbc     12275.63k    13018.82k    13323.86k    13322.92k    13431.58k
    rc2 cbc           4908.90k     5073.81k     5094.06k     5174.66k     5178.12k
    And a WRT54GS is going to be magnitudes slower than that.

    And now folks hopefully see why Blowfish is a common default on non-x86 systems. (And on some other platforms, the IDEA cipher is sometimes a good performer -- now that the patents have expired as of 2012 folks everywhere can use it, including the US).

    This should hopefully give readers some idea of the massive difference between x86 systems and embedded MIPSR2 or ARM devices. I sound like a broken record these days -- the performance difference is like night and day, which is exactly why I tell people who are already using VPN clients on their x86 systems to keep them there if they care at all about throughput. Otherwise consider hardware offloading for crypto, semi-common in "enterprise-grade" devices (and you most likely won't find it on consumer-grade devices -- the vendors don't see it worth the cost).
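    For anyone who wants the gap as a number, here is a small Python sketch that divides the Q9550 figures by the RT-N66U figures from the two tables above. It only uses the 8192-byte column, the dictionary and function names are made up for illustration, and it says nothing about real OpenVPN throughput, only about relative raw cipher speed in these two particular runs.

```python
# 8192-byte column, in 1000s of bytes per second, copied from the tables above
q9550 = {
    "aes-128 cbc": 145531.28,
    "aes-192 cbc": 128025.21,
    "aes-256 cbc": 114303.43,
    "blowfish cbc": 101737.80,
    "rc2 cbc": 28792.40,
}
rt_n66u = {
    "aes-128 cbc": 9963.24,
    "aes-192 cbc": 8616.57,
    "aes-256 cbc": 7710.93,
    "blowfish cbc": 13431.58,
    "rc2 cbc": 5178.12,
}


def speedup(fast: dict, slow: dict) -> dict:
    """Return fast/slow throughput ratio for every cipher present in both."""
    return {cipher: fast[cipher] / slow[cipher] for cipher in fast if cipher in slow}


ratios = speedup(q9550, rt_n66u)
for cipher, ratio in ratios.items():
    print(f"{cipher}: Q9550 is roughly {ratio:.1f}x the RT-N66U")
```

    On these numbers the AES ratios come out much larger than the Blowfish one, which lines up with Blowfish being the faster choice on the RT-N66U but not on the x86 box.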
    Last edited: Aug 12, 2014
    Toastman likes this.
  18. Marcel Tunks

    Marcel Tunks Networkin' Nut Member

    Cool, didn't realize OpenVPN made full use of SMP. Even if it didn't the difference would still be huge!
  19. kamaaina

    kamaaina Serious Server Member

    I can confirm from my own surprise that running the OpenVPN client on a consumer router is a bitch. First time I did this on my E3000 with the PIA provider I wondered what happened to the internet, as about 7-8 Mbps was all I could get down on a 50+ Mbps connection. With the N66U I managed to get about 12-15, then I started researching and it became clear that the router CPU is the limit.

    The ARM routers are doing much better. The AC68U/56U can do 25-30, and the R7000 gets 30+ at stock speed. With the R7000 overclocked to 1400 MHz I have seen 45 Mbps down on occasion, but that is rare as it depends on PIA's servers and the time of day. 30-35 is my average, I'd say. That said, if I run Tunnelblick or Viscosity on the Mac I get close to my maximum cable bandwidth, 50+. Running the VPN on the router is super convenient though, as you won't forget and every device is included.
    Last edited: Sep 1, 2014
    Toastman and koitsu like this.
  20. JeffB

    JeffB Reformed Router Member

    As a side thought, on my R7000 I max out only 1 core with VPN and I know it can't be done in parallel but, is it possible to use both vpn clients on the tomato firmware on separate cores? I haven't tested this, hence why I'm asking here first as every time I mess with the internet here all hell breaks loose from my roommates. If it is possible then what about load balancing between the two client connections? If load balancing is an issue then what about dynamic or manual routing based on mac / ip to either of the 2 vpn clients?
  21. gfunkdave

    gfunkdave LI Guru Member

    Listen to koitsu on this one. All the companies he mentions make nice stuff. I'd also be sure to check out Sonicwall, which I believe might be the most budget friendly (though that's based on nothing more than what I think I've heard).
  22. koitsu

    koitsu Network Guru Member

    It may be possible via taskset(1). This is what's known as "setting CPU affinity" (on a per-process level). However:

    1. taskset is part of the util-linux package, and neither the full package nor the individual utility is part of TomatoUSB (nor available in Entware, from what I can tell),
    2. The functionality may not actually work given that TomatoUSB is based on Linux 2.6.22. The functionality of taskset(1) and related kernel code (the important part, of course) was introduced in Linux 2.5, but there is always a very strong possibility of bugs.

    This subject matter immediately gets into a very deep, very technical discussion, as well as a philosophical one (as in whether or not one should even do this), about the CPU scheduler in the Linux kernel, and mandates discussion with actual kernel developers. The LKML is a better place for that discussion.

    Most consumer routers in use today are not multi-core, which is why this subject doesn't come up very often.

    Finally, something you need to keep in mind: go right ahead and run two VPN clients, one bound to core #0, the other bound to core #1. Now answer me this: where does the kernel get its CPU time for all other tasks (like packet processing, forwarding, bridging of interfaces, software VLANs, firewall stack, timecounting, etc.)? It's going to be fighting against what you've bound explicitly with taskset.
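    For reference, this is roughly what per-process CPU affinity looks like outside of taskset(1). It is a minimal Python sketch, assuming a general Linux box with Python 3.3 or later (which a stock Tomato router is not); `os.sched_setaffinity` wraps the same sched_setaffinity(2) syscall that taskset uses. The helper name is made up, and the sketch pins its own process purely for demonstration. Pinning two OpenVPN clients would mean passing their PIDs instead, and whether that behaves on a 2.6.22 router kernel is exactly the open question above.

```python
import os


def pin_to_core(pid: int, core: int) -> set:
    """Restrict `pid` to a single CPU core and return the resulting mask.

    Linux-only. A pid of 0 means the calling process.
    """
    os.sched_setaffinity(pid, {core})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)      # cores we may run on right now
    target = min(allowed)                  # pick the lowest-numbered one
    print("before:", sorted(allowed))
    print("after: ", sorted(pin_to_core(0, target)))
    os.sched_setaffinity(0, allowed)       # restore the original mask
```

    Note that this only controls where the pinned process may run. It does not reserve the core, so the kernel and everything else still compete for the same CPU time.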

    What I'm trying to tell you, nicely, is that basically consumer routers are not workhorses. They are not built to handle heavy cryptographic calculations. Their CPUs are not like x86 desktop CPUs -- they are significantly less powerful, and applications today are mainly optimised for x86, not MIPS or ARM. It would be possible for a router vendor to add a crypto offloading chip that handles certain kinds of ciphers, and then provide a Linux driver that allows cryptographic tie-ins to that hardware (but I do not know if such a driver exists on Linux 2.6 -- only recently was such introduced to OpenBSD for hardware crypto!), and then software that uses system library crypto calls / related crypto API could be configured to benefit from that driver and thus the hardware. Possible? Yes, but feasible? Probably not. (It would also bloat the firmware, and do not forget there are active people on this forum who are still using small routers like the WRT54G.)

    This is why I just tell people to keep their VPN clients running on a desktop -- even more so if you have multiple people behind a single router. VPNs really are (truly, I'm not kidding around) a "unique snowflake" when it comes to networking. Companies (small, medium, and large) usually have literally one or two dedicated employees who do nothing but VPNs because they understand the crypto and configuration involved. And in almost all those cases, hardware VPN concentrators are used because it's the only fast way. Dedicated device with dedicated hardware for a specific task. Really -- no company I've ever worked for (small to large, including Microsoft) has used "software VPNs" on consumer router equipment. They invest in things like Juniper NetScreens for a reason. You might ask Sonicwall though -- and if they can't help you, you can say "I understand. But hey, do you have any idea of competitors or other companies who DO offer {things you need}?" You'd be surprised how helpful sales folks can be if asked that question politely.

    The only thing that's "neat" these days, consumer-wise, is Intel implementing AES in their x86 processors. But guess what? There are operating systems that shun CPU-level AESNI because there are (possibly tin-foil-hat, but keep an open mind!) concerns that AESNI may have back doors where governments are able to bust/break it quickly, without cryptographers being able to look at the source code/design by Intel. It's been discussed before. OpenBSD, for example, tends to advocate a) open-source encryption models (so they can read the code and examine it for such things -- transparency = good), b) vendors who make encryption chips (for hardware offloading) but whom they trust or whose source code they have access to (for review). The OpenBSD guys are very security-conscious.

    So my recommendation, politely, would be for you and your roommates to just run the VPN clients on their desktops. Your router then does what it was designed to do: forward packets and do NAT. It greatly diminishes the pain for everyone, while keeping throughput high. Yes, it means administrative work on each system to get the VPN up/going, but it's up to you to decide if that trade-off is worth it.
    Toastman likes this.
  23. RMerlin

    RMerlin Network Guru Member

  24. JeffB

    JeffB Reformed Router Member

    Thanks for the insight koitsu. I agree VPN should be done on desktops or on dedicated VPN hardware; the only reason I put it on the router is to catch all network traffic without using up all the VPN licenses (I have 5 licenses and there are 7 PC's that would actively use it). Another issue is I can't always verify all PC's are using VPN; I would have to sit down and look at the router logs to see what traffic is going where etc. I may try a hybrid approach using VPN on a couple PC's that I know I can easily manage and are static (like desktops) and then allowing those connections to bypass the router VPN. I did that with the E3000 and it worked alright.

    I may also set up a VPN client and share the connection on my Archlinux box, which already hosts mediabrowser server, samba, and home automation software. It's a simple AMD A4-5300 (dual core), but I will be upgrading to the A10-6800K. It supports AESNI. I just have to figure out how to re-route traffic from my router to the PC and then allow the VPN traffic back through the router using just one NIC on that PC, something for another day...

    PS: I'm not trying to ask/demand for solutions so I hope I don't seem like it. None of this (Above) is really anything I need since I already have a working vpn client setup on the router. I just like to mess around with my equipment/network and learn about different/better ways to do things. Some things I test out/implement may be completely pointless; ie the mediabrowser server. There's already Plex server on the network; I'm the only one that really uses mediabrowser and I use it to playback videos on the same machine that hosts it.