Access restriction: domain vs path matching regexes?

Discussion in 'Tomato Firmware' started by dylanjustice, Jan 29, 2014.

  1. dylanjustice

    dylanjustice Addicted to LI Member

    I want to use the HTTP request field to block Facebook, but not news articles about Facebook. The docs say that the field uses regexes, but I've tried and failed to write a regex that matches Facebook in the domain but not elsewhere in the URL. It appears to me that the field is matching dumb strings, not regexes at all.

    Can anyone confirm a list of operators that will actually work in this field? Does anyone have a way to match strings in the URL's domain, but not in the path?
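
    To make the domain-vs-path distinction concrete, here is a minimal Python sketch of the matching being asked for. It runs on a desktop, not on the router, and says nothing about which operators the HTTP request field actually accepts. The pattern, the `is_blocked` helper, and the example URLs are all made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: matches "facebook.com" itself or any subdomain of it.
# It is applied to the hostname only, so the path never takes part in the match.
FACEBOOK_HOST = re.compile(r"(^|\.)facebook\.com$", re.IGNORECASE)


def is_blocked(url: str) -> bool:
    """Return True only when the URL's hostname is facebook.com or a subdomain."""
    # urlsplit() only finds a hostname when the URL carries a scheme ("http://...").
    host = urlsplit(url).hostname or ""
    return FACEBOOK_HOST.search(host) is not None


if __name__ == "__main__":
    for url in (
        "http://www.facebook.com/home",
        "http://news.example.com/2014/01/facebook.com-turns-ten",
    ):
        print(url, "->", "block" if is_blocked(url) else "allow")
```

    The point of the sketch is the split: the hostname is pulled out first and the regex is anchored to it, which is what a plain substring match over the whole request cannot do.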
  2. koitsu

    koitsu Network Guru Member

    You did not disclose which firmware you're using (provide the exact filename please) -- it matters.

    Some firmwares are using a different iptables module (called xt_string) for this capability, which does not support regex (nor will it ever, from the state of things). The reason for the change is to try and get better HTTPS filtering support. Details are here (thread is long, get coffee, read it in full):

    I warned people of this in advance -- I told everyone what would break, the caveats, etc... *sigh*

    There isn't much you can do other than use a firmware that sticks to using the old (custom-written by Jonathan Zarate) module called web that did offer some basic form of regex and let you match against certain fields. You'd need to read the source code to the module to understand how it works -- and the code does not read easily.

    I always recommend that people avoid the Access Restrictions feature entirely if possible (e.g. do the filtering within the browser on the system itself, using things like Adblock Plus/Adblock Edge, etc.). It's a lot more efficient to do it there anyway.
  3. jan.n

    jan.n LI Guru Member

    I'm always interested to learn - in what way is it more efficient to do the filtering in the browser?
    Using the Access Restrictions feature, I can restrict all URLs on each and every device that uses my AP... A single point of configuration seems OK for me...
  4. Marcel Tunks

    Marcel Tunks Networkin' Nut Member

    Doing all the settings on one device is efficient in terms of time if you have many devices (e.g. large networks with thousands of clients), but filtering/blocking from the browser results in less load on the router. It's a better solution for networks where you have control of all the devices.

    (Edited an error above)
    Last edited: Feb 1, 2014
  5. jan.n

    jan.n LI Guru Member

    Thank you for giving me some insight! So this means that stuff blocked using the Access Restrictions feature does get loaded from the WAN and is filtered out on the router?
  6. koitsu

    koitsu Network Guru Member

    Access Restrictions causes your router to have to look at the payload of every single packet (destined to certain TCP ports; I believe the list is 80 (HTTP) and 443 (HTTPS)), look for a string within the packet, and either drop the packet if a match is found, or forward it (allow it through) if a match isn't found.
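
    The decision described above can be modelled in a few lines of Python. This is only a sketch of the logic, not the actual iptables module: the `verdict` function, the port set, and the sample payloads are assumptions made for illustration.

```python
# Ports taken from the description above ("I believe the list is 80 and 443").
FILTERED_PORTS = {80, 443}


def verdict(dst_port: int, payload: bytes, needle: bytes) -> str:
    """Model of a plain substring filter: drop on match, forward otherwise."""
    if dst_port not in FILTERED_PORTS:
        # Traffic to other ports is never inspected in this model.
        return "forward"
    # The whole payload is scanned; the filter has no idea what is a
    # hostname and what is a path.
    return "drop" if needle in payload else "forward"


if __name__ == "__main__":
    article = b"GET /news/facebook-ipo HTTP/1.1\r\nHost: news.example.com\r\n\r\n"
    print(verdict(80, article, b"facebook"))  # prints "drop"
```

    Note that the news article request is dropped even though its host is not Facebook: a substring scan over the payload cannot tell domain from path, and it has to run over every inspected packet.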

    What sounds like a very simple/obvious/logical methodology is in fact that -- however it is very CPU intensive to look at the payload of packets. True routers, including NAT devices -- focus on the word true -- are not supposed to look at payload ever; they are supposed to only look at the layer 3 (IP header, etc.) portion of the packet to decide what to do with it.

    Any kind of "content filtering" done on a router chews up CPU time, and routers often do not have high-end CPUs as their job is not to do content filtering but rather forward packets. For example Juniper M320 routers (enterprise routers that cost hundreds of thousands of dollars) use something like a Pentium 4 or Celeron, yet can handle hundreds of thousands of packets per second. How? Because they have dedicated hardware (Juniper calls it a "PFE" or "Packet Forwarding Engine"; the general-purpose CPU sits on the "RE" or "Routing Engine") that analyses only layer 3 (ex. IP source/destination) and layer 4 (protocol (ex. TCP, UDP) and source/destination port) and does not look at the payload of packets. Cisco routers, same situation.

    For "content filtering" devices (ex. high-end gear made by Sandvine and other companies), those devices also contain dedicated hardware (specific CPUs or DSPs which can do substring matching or even regex on a dedicated chip).

    Consumer-grade routers do not have any of this. They have very cheap CPUs, and packet forwarding plus every other feature people want (that includes VPN encryption, content filtering, etc.) is done on that CPU. Meaning: the more crap you turn on, the less CPU time there is to do the things you want, and the slower your performance becomes.

    Thus, when it comes to residential content filtering, I always recommend that, if possible, people do the filtering on the client (workstation/desktop) directly using things like Adblock Plus/Adblock Edge for Firefox/Chrome. You can read about making your own filters on their site. It's very easy, and they do offer what the OP wants. Desktop CPUs have way, way more horsepower than residential router CPUs.

    If you can't do filtering on the workstation/device (such as "I want to block some websites when my daughter is using the Nintendo Browser of her Nintendo DSi via our wifi router"), then Access Restriction (content filtering on the router) is all you can do.
  7. jan.n

    jan.n LI Guru Member

    THX :)
  8. jerrm

    jerrm Network Guru Member

    In addition to what koitsu said, keep in mind that the router hardware hasn't really kept up with connection speeds. 10+ years ago when the WRT54G came out it had a 200 MHz clock, and a fast home connection was 3 Mbps, maybe 10 Mbps if you had a bleeding edge cable system. Even at what now seems like a lethargic 200 MHz clock speed, for most users those routers still had cycles to burn with the connection speeds of the time.

    Today, the fastest supported Tomato router is only 3x that original 200 MHz speed, but connection speeds have increased 10x-20x or more. Just keeping up basic routing without maxing out CPU is an issue in many cases.
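
    A rough back-of-the-envelope version of that comparison, as a Python sketch. It uses only the figures above (200 MHz and 3 Mbps then, 3x the clock and 10x-20x the line rate now) and treats "clock cycles available per bit of traffic" as a crude proxy for headroom. That proxy ignores architecture and offload differences, so take it as an illustration, not a benchmark.

```python
def cycles_per_bit(clock_mhz: float, link_mbps: float) -> float:
    """CPU clock cycles available for each bit arriving at line rate."""
    # MHz / Mbps = (cycles per second) / (bits per second) = cycles per bit.
    return clock_mhz / link_mbps


THEN = cycles_per_bit(200, 3)           # WRT54G era: about 66.7 cycles per bit
NOW_10X = cycles_per_bit(3 * 200, 30)   # 3x clock, 10x line rate: 20.0
NOW_20X = cycles_per_bit(3 * 200, 60)   # 3x clock, 20x line rate: 10.0

if __name__ == "__main__":
    print(f"then: {THEN:.1f}  now: {NOW_20X:.1f} to {NOW_10X:.1f} cycles per bit")
```

    By this crude measure the per-bit budget has shrunk to somewhere between a third and a seventh of what it was, before any content filtering is switched on.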

    We are at the beginning of a major shift from MIPS to ARM and more capability, but Tomato isn't there yet (hopefully soon).