pixelserv compiled to run on router WRT54G

Discussion in 'Tomato Firmware' started by Jedis, Sep 5, 2009.

  1. Jedis

    Jedis Networkin' Nut Member

    Can anyone please compile pixelserv so that I can run it on a router instead of keeping a PC on 24/7?

    Any help would be greatly appreciated, thank you!
  2. rhester72

    rhester72 LI Guru Member

    It can't be done, because the built-in web server listens on port 80 of all interfaces (thus blocking pixelserv).

    If you did manage to get it working, you'd lose the web interface to Tomato on your LAN (because you can't have two listeners on 80 without some serious black magic), which would be bad.

    Run pixelserv on another host and redirect the target IPs to it.

    Rodney
  3. mstombs

    mstombs Network Guru Member

    The web interface management port defaults to 80 but is configurable through the web interface.

    Looks like pixelserv is a tiny perl script - I'm pretty sure someone could write a small standalone C-program to reproduce the functionality without the overhead of perl?
  4. Jedis

    Jedis Networkin' Nut Member

    I have my router configured to only respond to secure connections on port 443. Port 80 should be open and available to use then, right?

    If anyone can write a small C program, as mstombs suggests, that does what pixelserv does and can be run on a WRT54G router (already compiled), I would be willing to donate $10 to the cause.
  5. Planiwa

    Planiwa LI Guru Member

    Or perhaps a tiny shellscript using (functional) nc?
  6. mstombs

    mstombs Network Guru Member

    OK here is a small C-program that should perform the functionality of pixelserv.

    I just took the example webserver server.c program from http://beej.us/guide/bgnet/

    http://beej.us/guide/bgnet/examples/

    and changed the "Hello, world!" to an equivalent single binary string to that generated in pixelserv.pl

    To compile the attached source for Tomato, use a toolchain previously installed to build the firmware (working binary attached):

    Code:
    export PATH=/opt/brcm/hndtools-mipsel-uclibc/bin:/opt/brcm/hndtools-mipsel-linux/bin:$PATH
    mipsel-uclibc-gcc -Os -Wall -D_GNU_SOURCE pixelserv.c -o pixelserv
    mipsel-uclibc-strip -s pixelserv -R .comment
    Change or disable the Tomato web admin on port 80, then copy the 10k binary to your Tomato /var (or /tmp) folder with, for example, WinSCP, and run it with

    Code:
    # chmod +x pixelserv
    # ./pixelserv &
    # pixelserv: waiting for connections on port 80 ...
    I suggest a real mod developer takes a look before this is ready for all (I am not too sure what system library files got used...)!

    Now normal browsing to your router (on default port 80) should get a blank page, and adblock/hosts scripts should be able to use this to avoid "broken image" symbols in some browsers.

    Note also compiles and runs fine on Ubuntu (9.04 X64 for example)

    Code:
    user@Ubuntu:~$ cc pixelserv.c -o pixelserv
    user@Ubuntu:~$ sudo ./pixelserv &
    [1] 11454
    user@Ubuntu:~$ pixelserv: waiting for connections on port 80 ...
    test browsing to

    Code:
    http://localhost/anything
    edit:minor update 32 bytes smaller...

    revised code plus binary a couple of posts on
  7. Jedis

    Jedis Networkin' Nut Member

    It doesn't seem to be working. Ads are blocked, but instead of loading the pixel, I get error messages in the browser where the ads would normally load. I tried using 192.168.1.1 and 127.0.0.1 for the router IP address and neither work.

    Admin Access is set to HTTPS only. Attempting to access http://192.168.1.1/ gives an error 'The connection has been reset'.
  8. Jedis

    Jedis Networkin' Nut Member

    An update: Hitting refresh on http://192.168.1.1 causes an error 18 out of 20 attempts. The other two attempts did correctly display the pixel, so it is working... but not very reliably.
  9. mstombs

    mstombs Network Guru Member

    Interesting - I can't reproduce the error, but did see something similar in an earlier version when I used 2 sends for the data. It was fixed for me by making sure it was sent in one packet, but a delay after the second also worked.

    Here's another attempt - just added a half second sleep before closing the connection. If you monitor what is going on in the router with "top", you may now see more than one copy of pixelserv because a new process is created for every reply - which should not hang around for more than half a second.
  10. Jedis

    Jedis Networkin' Nut Member

    Thanks. It seems to be working 100% of the time now.

    Now which Scripts section should I launch this via and how should I do it? Issuing './pixelserv &' via ssh doesn't put it into the background.
  11. Jedis

    Jedis Networkin' Nut Member

    Ok, I tried this:
    Code:
    sleep 5
    wget -P /var http://path/to/pixelserv
    chmod +x pixelserv
    /var/pixelserv &
    In the INIT section. It doesn't appear to work though. If I use those same commands via ssh, they work fine though. Any ideas? Do I need some kind of error checking?
  12. Planiwa

    Planiwa LI Guru Member

    What makes you say this?
    Perhaps you don't realize that the Tomato shell (unfortunately) is not configured for job control.
    And possibly you are not aware of the nohup command, if you want a child process to persist after the parent session terminates.

    Of course scheduled and state-triggered scripts don't have this problem.
  13. mstombs

    mstombs Network Guru Member

    Great that the little binary does work when it works - it wouldn't increase the firmware size much if included by a mod author!

    My guess is that the wget fails because the WAN is not yet up when your INIT script runs. If you put the commands in the WANUP script you need to add protection to avoid multiple copies, because the WANUP script is run every time the WAN connects or reconnects. I tested using /jffs, where I have a number of things happening on reboot. Where and when do you wget your adblock hosts or domains lists from?
  14. TexasFlood

    TexasFlood Network Guru Member

    A while back I did my own version of an ad blocking script. A number of such scripts have been posted on this forum, but I wanted to customise them to suit my needs, and it was a good scripting exercise for me. Anyway, in that script, I waited for a test file to get created when the WAN comes up before executing the rest of the script.

    I put the following on the "WAN Up" tab, which creates the temporary file then removes the code. Not sure removing the code is really necessary, but it's kinda cool, :D
    Code:
    touch /tmp/wanisup
    sed -i '/wanisup/d' /tmp/script_wanup.sh
    Then on the "Init" tab, the following waits for the WAN to come up before executing any following statements:
    Code:
    while [ ! -f /tmp/wanisup ];do ( sleep 1);done
    rm /tmp/wanisup
  15. Jedis

    Jedis Networkin' Nut Member

    How about this in WAN Up:
    Code:
    sleep 5
    rm -f /var/pixelserv*
    wget -P /var http://path/to/pixelserv
    chmod +x pixelserv
    /var/pixelserv &
    Wondering if this will work or if there's a better way to do it?

    I'm using the 'All-encompassing ad-blocking solution' by rhester72. It pulls the adservers from http://pgl.yoyo.org/adservers/serverlist.php?hostformat=dnsmasq&showintro=0&mimetype=plaintext
  16. rhester72

    rhester72 LI Guru Member

    Perhaps instead of the half-second sleep (it seems kind of wasteful) it would be better to set the TCP_NODELAY socket option, to prevent the Nagle algorithm from delaying the send until after you close() the connection?

    Rodney
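
For readers unfamiliar with the option: TCP_NODELAY is set per socket with setsockopt() rather than passed to send(). The following is only a minimal sketch of how that might look on the accepted connection. The function name set_nodelay is hypothetical and not taken from pixelserv.c, and whether disabling Nagle actually cures the reset errors in this thread is a separate question.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm on a TCP socket so small writes are sent
 * immediately instead of being held back for coalescing.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 * The name set_nodelay is hypothetical, chosen for this sketch. */
int set_nodelay(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```

It would typically be called once on the descriptor returned by accept(), before the reply is sent.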
  17. rhester72

    rhester72 LI Guru Member

    Or...

    http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendrecv

    Note the following text, specifically:

    The gist: It seems you need to loop send() until the *sum* of the return codes equals the expected sent byte count *AND* none of the return codes are -1.

    Rodney
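
The loop described in that gist can be sketched roughly as follows. This is an illustration written for this thread, not the code from Beej's guide or from pixelserv.c, and the name send_all is an assumption. It keeps calling send() until the running total matches the requested length, and gives up if any call returns -1 (other than an interrupted call, which it retries).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling send() until all len bytes have been handed to the
 * kernel, or until an error occurs.
 * Returns 0 on success, -1 on error (errno is set by send).
 * The name send_all is hypothetical, chosen for this sketch. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = send(fd, p + sent, len - sent, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: give up */
        }
        sent += (size_t)n;      /* partial write: advance and loop */
    }
    return 0;
}
```

For a reply of roughly 130 bytes a single send() will normally take everything, so the loop mainly guards against the rare partial write.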
  18. mstombs

    mstombs Network Guru Member

    May work but not as you intend: you shouldn't be able to delete the binary while it is running (stop it with "killall pixelserv"), and a second copy won't start as it will not be able to bind to the port. I was thinking more like

    Code:
    if ps | grep [p]ixelserv; then
    wget -P /var http://path/to/pixelserv
    chmod +x /var/pixelserv
    /var/pixelserv &
    fi
    The [p] is important - otherwise it detects its own grep command.
  19. mstombs

    mstombs Network Guru Member

    Yes I read that, but it seems too complicated. I assumed it was just a race condition where the socket could be closed before the data was completely transmitted. It doesn't seem correct that the program should have to buffer and try again. It's only about 130 bytes, and I'd like to think the kernel wouldn't fragment it?

    Fully agree the usleep is a hack, I also use lots of "sleeps" in bash scripts when things work better when slowed down!

    I like the sound of the TCP_NODELAY, just got to find an example to copy

    http://www.unixguide.net/network/socketfaq/2.16.shtml

    and decide where to put it, and find how to create a test environment that can tell the difference. I'm pretty sure you have identified the issue with Nagle, by the way - it seems the half second will nicely get around the 0.2sec buffer delay mentioned here

    http://www.unixguide.net/network/socketfaq/2.11.shtml

    Edit: busybox ftpd.c uses TCP_NODELAY, so I have attached a new version with a bit more error checking and TCP_NODELAY rather than a sleep.
  20. ringer004

    ringer004 Networkin' Nut Member

    make pixelserv a real daemon

    This may or may not help, but having pixelserv being a real daemon might be the more correct (in principle) thing to do (instead of '&').

    When a normal program is invoked into the background with a '&', the program still wants to use the controlling terminal for its stdin, stdout, and stderr. A true daemon will disassociate itself from the controlling terminal and will automatically put itself in the background. So when the shell disappears (i.e., someone logs out of that terminal) the process will keep running.

    I admit I'm not familiar enough with the Tomato shell to know if this is a non-issue, but I know this works with a standard Linux desktop.

    I'll try and have a sample daemon attached, I hope it works.

    The use is simple. Just add the 'daemonize()' function to your program and call it as the first executable statement in your 'main'. In the sample code, I had some 'reapchild' calls to kill the zombie processes, but it appears that is already in pixelserv.c

    If this works, you won't need to use '&' anymore.
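
Since the attachment is not reproduced here, the following is a minimal sketch of what such a daemonize() function commonly looks like. It is an assumption about the general technique, not the attached sample: fork, let the original process exit, start a new session so there is no controlling terminal, and point the standard streams at /dev/null.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Detach the calling process from its controlling terminal.
 * Returns 0 in the surviving background process, -1 on failure.
 * The original foreground process exits inside this function, which
 * is what hands control back to the shell without needing '&'. */
int daemonize(void)
{
    int fd;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);               /* parent: return to the shell */

    if (setsid() < 0)           /* child: new session, no terminal */
        return -1;
    if (chdir("/") < 0)         /* do not pin a mounted directory */
        return -1;

    /* stdin, stdout and stderr no longer have anywhere useful to go. */
    fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO)
        close(fd);
    return 0;
}
```

Because stderr is gone after this call, any error reporting has to move to something like syslog. Many C libraries also offer a daemon() call that does roughly the same in one step.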

  21. mstombs

    mstombs Network Guru Member

    Thanks, yes it should do that, but will also need to change error messages to syslog or add an option not to daemonize to test - there are plenty of examples in the Tomato code to copy - miniupnpd for example.
  22. ringer004

    ringer004 Networkin' Nut Member

    Yes, of course there are numerous samples in the source code... :) Not sure what I was thinking. Well, maybe someone can use the sample for their own project.

    As far as debug, I usually default to running the program (with no command line args) in the foreground with all the debug messages you want. Then when it's all debugged, invoke the program with a '-d' (for daemon) which will daemonize the process and disable all output messages (at least to stdout / stderr).

    Anyway, this looked like an interesting project considering the start was a perl script.
  23. mstombs

    mstombs Network Guru Member

    I have taken your idea ringer004 to daemonize and therefore use syslog. Found that miniupnpd uses a simpler call to create a daemon, it does reduce system resources used ("top") and survives logout. I now know all about zombies and orphans and don't think that is a problem, thanks to the great original server code.

    No longer need to start it with " &", non daemon mode is a compile switch "-DTEST".

    I'm pretty sure the TCP_NODELAY is working fine rather than usleep after send?

    I haven't found the need to buffer and resend packets - but code is in beej's book

    http://beej.us/guide/bgnet/output/html/multipage/advanced.html#sendall

    should anyone feel the need

    source-code is getting longer, but fortunately not the binary...

    only thing I know not done is to write a pid file somewhere and fail more gracefully if double started.
  24. Jedis

    Jedis Networkin' Nut Member

    With pixelserv3, I am getting this:

    The connection was reset
    The connection to the server was reset while the page was loading.

    This happens when an ad is blocked or if I browse to the router on port 80 directly. The previous version was working fine.

    I'll go ahead and try v4 now and see if there's any change.
  25. Jedis

    Jedis Networkin' Nut Member

    Unfortunately v4 gives the same error as v3 - the connection was reset.

    EDIT: Seems to be sporadic like the first version. Sometimes it works and sometimes it doesn't.
  26. mstombs

    mstombs Network Guru Member

    The inelegant sleep goes back in then, there's something in the docs about "linger" I will look at. This seems to be the problem discussed here

    http://blog.netherlabs.nl/articles/...-so_linger-page-or-why-is-my-tcp-not-reliable

    Maybe the usleep can be replaced with "shutdown", which seems to be pixelserv.pl? I need to find a way to reproduce the error, so I can fix it!

    It doesn't have the error here with Ubuntu or XP/IE7 on WRT54GS v1.1 or WRT54G-TM (also works on dd-wrt, but there is a warning on start-up)

    V5 attached

    Another thing - no version number on the binary...

    What versions of router/firmware/PC/browser are you using?
  27. Jedis

    Jedis Networkin' Nut Member

    pixelserv5 seems to be working flawlessly. Attempted 50 connections via port 80 directly and the pixel loaded everytime. Also tested it on Myspace and all ads are being blocked with a pixel (no errors). I even rebooted the router and downloaded it again and it seems to still be working.

    Nice work :) Now if only Victek would incorporate it into his mod... :)
  28. mstombs

    mstombs Network Guru Member

    great, working fine here too (but so was V4!), so I am not intending to tweak anymore - anyone else feel free!

    I'm intrigued how different some web pages are between XP/IE7 and Ubuntu/Shiretoko, but you don't really need router ad-blocking if you can use Firefox Adblock Plus. Pixelserv does clean up blocked ads on IE though.
  29. s0dhi

    s0dhi Addicted to LI Member

    Amen! Victek was working on the nodogsplash to do this but it seems like this might be the easier thing to incorporate.

    BTW, I'm using the following on my debian box as the replacement for the ads:

    index.html

    Code:
    <html> 
    <body> 
     
    <!-- Lets add some comments in an effort to make the page size greater than 512 bytes so that Google Chrome actually uses this 404 error page.  
    Will it work? It probably will work but we have to ensure that the size of the file is greater than 512 bytes. --> 
     
    <!-- Lets add some comments in an effort to make the page size greater than 512 bytes so that Google Chrome actually uses this 404 error page.  
    Will it work? It probably will work but we have to ensure that the size of the file is greater than 512 bytes. --> 
     
    </body> 
    </html> 
    
  30. mstombs

    mstombs Network Guru Member

    umm - nocat/dogsplash is a captive portal, quite different! Captive portal means that all users get a splash web screen and have to answer a question before being allowed access. Victek's latest did work - for a bit, but it is much more complicated than this standalone app, as it needs to be integrated into the web GUI and rc files, and it interacts with iptables (and everything else that also does!)
  31. s0dhi

    s0dhi Addicted to LI Member

    You're right.

    Even though I've moved to an ALIX board running pfSense, I did set pixelserv up on the 7 or so family & friends Tomato networks that I end up maintaining; it works quite nicely.

    It would be awesome if pixelserv could be included in a build so that it doesn't have to be downloaded every time the router is rebooted. A nice webGUI check box to enable/disable it (and flip router management to HTTPS) would be sweet.
  32. rhester72

    rhester72 LI Guru Member

    Why not just store the binary on CIFS/JFFS? No need to download every reboot.

    Rodney
  33. s0dhi

    s0dhi Addicted to LI Member

    Good point. I really need to get more sleep at night so I can wake up sharper.
  34. mstombs

    mstombs Network Guru Member

    I can't leave that usleep in there - it doesn't seem correct, so this version restricts itself to IPV4 connections, "lingers" and "shutdown" the connection before closing. So if Jedis could test please!

    I've also temporarily put a copy of the binary on my ISP website so if you want to try it from console access to get a copy onto your router

    Code:
    wget -P /var http://www.mstombs.talktalk.net/Tomato/pixelserv
    Above updated to V7, see later for source
  35. vyrticl

    vyrticl Networkin' Nut Member

    Thanks mstombs for all your hard work in getting this put together. This is something I've been wanting for a long time and now I'm getting excited. Props to you. However I'm running into a few issues with it.

    First though, I tried using the WAN UP script that was posted and it didn't work so I fixed it up a little bit. Here's the updated WAN UP script that works:

    Code:
    ps | grep [p]ixelserv
    if [ $? == 1 ]; then
        wget -P /var http://path/to/pixelserv
        chmod +x /var/pixelserv
        /var/pixelserv
    fi
    
    or if you have pixelserv on the jffs mount

    Code:
    ps | grep [p]ixelserv
    if [ $? == 1 ]; then
        /jffs/pixelserv
    fi
    
    Now, I'm having troubles getting pixelserv to work correctly for me. I've tried versions 5 and 6 and no luck with either of them. Maybe I'm doing something wrong.

    I have pixelserv in my jffs directory

    Code:
    # ls -la /jffs/
    drwxr-xr-x    1 root     root            0 Sep 11 21:32 .
    drwxr-xr-x   15 500      500           183 Aug 29 22:58 ..
    -rwxr-xr-x    1 root     root         9976 Sep 11 21:27 pixelserv
    
    And it is running

    Code:
    # ps | grep [p]ixelserv
      369 root       596 S    /jffs/pixelserv
    
    My log also shows pixelserv as listening on port 80

    Code:
    Sep 11 21:57:09 unknown user.notice root: Pixelserv loading
    Sep 11 21:57:09 unknown user.notice root: Adblock loading
    Sep 11 21:57:09 unknown daemon.notice pixerlserv[361]: waiting for connections on port 80
    Sep 11 21:57:18 unknown user.notice root: DOWNLOADED Adblock list
    Sep 11 21:57:20 unknown daemon.info dnsmasq[100]: read /etc/hosts - 15113 addresses
    Sep 11 21:57:20 unknown daemon.info dnsmasq[100]: read /etc/hosts.dnsmasq - 1 addresses
    
    And just a little snippet of my hosts file to make sure it looks correct

    Code:
    127.0.0.1  isafeantvirus.com
    127.0.0.1  isafeantivirus.com
    127.0.0.1  isaferantivirus.com
    127.0.0.1  isafe-antivirus.com
    127.0.0.1  isafeantivir.com
    127.0.0.1  isafeantiviruspro.com
    
    My WAN UP script is this

    Code:
    logger WAN UP scripts
    sleep 5
    
    logger Pixelserv loading
    ps | grep [p]ixelserv
    if [ $? == 1 ]; then
        /jffs/pixelserv
    fi
    
    logger Adblock loading
    if [ ! -f /tmp/dlhosts ]; then
        echo -e "#!/bin/sh\nwget -O - http://www.mvps.org/winhelp2002/hosts.txt | grep 127.0.0.1 | sed 's/[[:space:]]*#.*$/'\$'/' > /etc/hosts\nlogger DOWNLOADED Adblock list\nkillall -1 dnsmasq" > /tmp/dlhosts
        chmod 777 /tmp/dlhosts
        /tmp/dlhosts
    fi
    cru a DLhosts "00 8 * * 1 /tmp/dlhosts"
    
    I have my web admin access setup to HTTPS only and when I go to http://192.168.1.1 I get a blank page like I should. However all the ads on pages still show the nasty error message or missing image graphic.

    Does anything jump out at you as wrong or needing to be fixed?

    And again, thanks for all your hard work on this.
  36. mstombs

    mstombs Network Guru Member

    The localhost IP address in the hosts file is passed to your browser, which then looks on its own machine. Try changing the 127.0.0.1 to your router local IP address 192.168.1.1 - most host download scripts have an option to change the IP address to 0.0.0.0 or another of your choosing using a "sed" command.

    If those script commands work then fine - I didn't think the router shell used "=="!
  37. vyrticl

    vyrticl Networkin' Nut Member

    oh duh, thanks for looking over that for me, that was it.

    thanks!
  38. s0dhi

    s0dhi Addicted to LI Member

    I've used/tested v6 and it appears to be working well - no issues thus far.
  39. Jedis

    Jedis Networkin' Nut Member

    I've tried v6 and out of 10 refreshes of http://192.168.1.1, 4 were successful in loading the pixel and 6 times the "connection was reset".

    I'm using a WRT54G v3 and performed a complete nvram erase a couple days ago when I flashed the router. No issues with the router itself other than some of these compiles of pixelserv acting wonky. I'm using Victek's RAF 1.25.8515 ND firmware, if that helps any. I'll just go back to using v5 until this is straightened out - worst case scenario, I'll just continue to use it, it works and is much better than leaving a PC on 24/7 :)

    Perhaps you could add in some debugging info that I can monitor via syslog to isolate where it's failing?
  40. Jedis

    Jedis Networkin' Nut Member

    Just as a sanity check, I went back and tested v5 again, refreshed 20 times each in Firefox and IE8 - using F5 and CTRL+F5 to force a refresh. 20/20 times in both browsers the pixel successfully loads.
  41. s0dhi

    s0dhi Addicted to LI Member

    One thing I can note is that all of the routers I have tested on are WRT54GL and all are overclocked to 250Mhz. I'm not sure if that will make a difference with v6.
  42. Jedis

    Jedis Networkin' Nut Member

    Mine is clocked at 216mhz - the default for the firmware.
  43. mstombs

    mstombs Network Guru Member

    So some progress!, thanks for testing.

    I can't think of any useful messages I can add, the only thing that seems to work for you is for the code waiting for a bit after the message is passed to the kernel, before telling the kernel to close the connection. I could make the usleep configurable I guess, so we can have one version. Are you testing over a slow wireless connection? There's nothing in the code that waits until it knows the browser has received the message.

    Both my routers are 8MB flash versions and not overclocked at the moment but wouldn't have thought that would make an effect. I'll try loading the same firmware, maybe monitoring the connection with wireshark would be revealing - my guess is that the browser is receiving the connection closed (RST?) message before the pixel?

    Has anyone else noticed that Internet Explorer often shows a warning in the bottom bar that a script error caused the page not to load properly? This can be caused by a web page trying to load a JavaScript file from a blocked host, and not being able to interpret the blank pixel reply. Anyone know what a null JavaScript would look like?
  44. Jedis

    Jedis Networkin' Nut Member

    What do you mean by 'slow wireless connection'? My router is one room over from my laptop. 'RSSI' is -58 dBm and the 'Quality' is 41 according to the Device List page.
  45. mstombs

    mstombs Network Guru Member

    I do most testing over wired Ethernet, just wondered if sending out packets over wireless might be a reason for a delay needed.
  46. mstombs

    mstombs Network Guru Member

    Wireshark revealed that the router was sending aggressive RST connection closing messages. Nagle or linger had no effect, so I have removed them as I found a sequence of shutdown and buffer read that works. I'm so confident I will remove V5 and V6 from above.
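
The V7 source is not shown in this post, so the following is only a sketch of what a "shutdown then read" close sequence generally looks like. A commonly described cause of such resets on Linux is closing a TCP socket while received data (here, the browser's request) is still unread, which makes the kernel send RST instead of a normal FIN. The function name close_after_drain and the one-second timeout are assumptions made for this illustration.

```c
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Half-close the sending side, discard whatever the client still
 * sends, then close the socket.
 * The name close_after_drain and the 1 second timeout are assumptions
 * made for this sketch, not values taken from pixelserv.c. */
void close_after_drain(int fd)
{
    char scratch[512];
    struct timeval tv = { 1, 0 };   /* assumed upper bound per read */

    /* Tell the peer we have finished sending (queues a FIN). */
    shutdown(fd, SHUT_WR);

    /* Do not wait forever on a client that never closes its side. */
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* Read and discard until EOF (0), or an error or timeout (-1). */
    while (recv(fd, scratch, sizeof(scratch), 0) > 0)
        ;

    close(fd);
}
```

The point of the drain loop is that by the time close() runs, the receive buffer is empty and the client has usually closed its side, so nothing is left to trigger a reset.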

    This version also reports the version info to the console when run.

    Code:
    pixelserv V7 compiled: Sep 14 2009 21:58:29 from pixelserv.c

  47. Jedis

    Jedis Networkin' Nut Member

    v7 seems to be working great! 50 connections on http://192.168.1.1 and all 50 are correctly displaying the pixel. I'll keep running this one for the next few days and report back if I see any issues!
  48. Jedis

    Jedis Networkin' Nut Member

    Been noticing an issue with this. When it blocks ads on an https server, you get a popup box asking you to login to the router in order for it to render the pixel. It happens frequently, especially on a certain bank's website.

    Any way to have it ignore/not block ads coming from an https server?
  49. mstombs

    mstombs Network Guru Member

    Never noticed ads on an https server, but it's nothing to do with pixelserv - the adblock script tells dnsmasq to pretend the router is the website host. Your browser then tries to connect to the router web GUI via its https port 443. The only thing I can suggest is that you use a non-standard https port for the router GUI. I do notice that Internet Explorer reports "error in script on page"; this is because some blocked sites are asked for a script, and the browser gets the pixel instead.
  50. Jedis

    Jedis Networkin' Nut Member

    Gotcha. Good call. Changing the local access port to 8080 seems to be working fine. Thanks again.
  51. mstombs

    mstombs Network Guru Member

    A small update, prompted by feedback from dd-wrt user frater. This version is the same as V9 except that the "-i" inetd mode bit is disabled with a #define.

    This version accepts an IP address as a parameter, which means it only listens on that IP address. You can therefore use a different IP address on the router and not move the default router IP

    To listen on a specific IP address, possibly a special secondary lan bridge IP (create manually using "ifconfig br0:0 192.168.0.2" for example?), use syntax

    /path/to/pixelserv 192.168.0.2
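
As an illustration of what "only listens on that IP address" means at the socket level, here is a minimal sketch. It is not the pixelserv source, and the names make_listen_addr and open_listener are hypothetical. Without an address argument the socket binds to INADDR_ANY (all interfaces); with one, the address string is parsed and only that address is bound.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill *addr for listening on ip:port. A NULL ip means all interfaces.
 * Returns 0 on success, -1 if ip is not a valid dotted IPv4 address.
 * Both function names here are hypothetical, chosen for this sketch. */
int make_listen_addr(const char *ip, unsigned short port,
                     struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);

    if (ip == NULL) {
        addr->sin_addr.s_addr = htonl(INADDR_ANY);
        return 0;
    }
    return inet_pton(AF_INET, ip, &addr->sin_addr) == 1 ? 0 : -1;
}

/* Create a TCP socket bound to ip:port and start listening.
 * Returns the socket descriptor, or -1 on any failure. */
int open_listener(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd;

    if (make_listen_addr(ip, port, &addr) != 0)
        return -1;
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    /* Allow a quick restart while old connections linger. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(fd, 10) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

In a main() this might be used as open_listener(argc > 1 ? argv[1] : NULL, 80), which leaves port 80 on the router's primary address free for the web GUI when a secondary address is given.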

    Also this version does not create a new process for every page request, but still uses each connection only once. I reasoned that overhead in forking and cleaning up not worth the effort - and miniupnpd gets away with single thread and that does a lot more. But the high PIDs on my 'production router' do show how many ads the old version blocked since last reboot!

    Code:
    # ps
      PID USER       VSZ STAT COMMAND
        1 root      1716 S    init noinitrd
        2 root         0 SW   [keventd]
        3 root         0 SWN  [ksoftirqd_CPU0]
        4 root         0 SW   [kswapd]
        5 root         0 SW   [bdflush]
        6 root         0 SW   [kupdated]
        7 root         0 SW   [mtdblockd]
       29 root      1692 S    buttons
       34 root         0 SWN  [jffs2_gcd_mtd3]
       94 root      1956 S    syslogd -L -s 50
       95 root      1520 S    dropbear -p 22
       97 root      1940 S    klogd
       98 root      1724 S    nas /etc/nas.conf /var/run/nas.pid lan
      106 root      1968 S    crond -l 9
      111 root      1244 S    rstats
      117 root      1672 S    httpd -s
      127 nobody    1716 S    dnsmasq
      202 root      1588 S    upnp -D -L br0 -W vlan1 -I 60 -A 180
      355 root      1956 S    udhcpc -i vlan1 -s dhcpc-event -H wrt54gs -m
    18255 root       596 S    ./pixelserv
    18351 root      1588 R    dropbear -p 22
    18352 root      1980 S    -sh
    18354 root      1956 R    ps

    The attached binary built using old Linksys/Tomato toolchain for standard versions - it can be 40% smaller if we all move to teddy_bear's new kernel and toolchain! My quick test suggests this version runs on both, but the small new one won't run on old - missing library. Source and build script enclosed - haven't learnt how to write Makefiles yet!
  52. i1135t

    i1135t Network Guru Member

    Thanks, will try it and see how it goes... you rock!
  53. Jedis

    Jedis Networkin' Nut Member

    Thanks for the update. I've had no issues with v9 though, so am going to stay with v9.
  54. mstombs

    mstombs Network Guru Member

    Do you mean the V7 you helped develop above? Or maybe you are running dd-wrt? Due to similarities in kernel and compiler many mipsel binaries are transferable between routers (even with my Ti AR7 adsl routers), but it is always preferable to use one built with the 'correct' toolchain.
  55. i1135t

    i1135t Network Guru Member

    Actually, with v9, the lag is a lot longer, so I will revert back to v7. I think Jedis meant v7. Thanks anyways.
  56. mstombs

    mstombs Network Guru Member

    What router/firmware are you using i1135t? I test on a lightly loaded WRT54G-TM; I did notice my busy production WRT54GS v1.1 was a bit slower, but both have 32MB RAM. Serving each ad opens a new connection and several messages are sent between PC and router, the connection does hang around for a while, and we know lots of connections to the router cause slowdowns. V7 could answer more requests in parallel, whereas V9 serializes them, so if one stalls it would lag. I'll check again with Wireshark to see if anything can be done. Can you post or PM me with any specific problem websites?
  57. i1135t

    i1135t Network Guru Member

    I have the Asus WL500GP v1 router. It has 32MB of RAM and plenty of CPU power. The problem is not specific to any site; it just happens randomly as pixelserv serves out the pixel for blocked sites from the list downloaded here:
    Code:
    http://www.mvps.org/winhelp2002/hosts.txt
    Version 7 is definitely much better, but still a little laggy on a few occasions, when it does not respond and I have to refresh the browser.
  58. mstombs

    mstombs Network Guru Member

    Thanks for the info, I've taken the V10 down as it just stopped working for me as well! More investigation needed!

    Test version 12 attached, added a couple of TCP flags to try to improve performance.

    Also packaged a simple bash script to load test the router running pixelserv, tentative conclusion latest version 20% faster than V7, but both can serve more than 100 blank ads per second so probably not found source of "lag". Also noticed that Teddy_bear's kernel much better at cleaning up residue connections than older stock Tomato.
  59. i1135t

    i1135t Network Guru Member

    Still having issues with v11, as some blocked sites just "hang" and not timing out, so reverting back to v7.
  60. mstombs

    mstombs Network Guru Member

    Do you get multiple copies of pixelserv in the process list ("ps" or "top")?, because that would only cause a problem to one ad in V7, in V12 it would stall the program.
  61. i1135t

    i1135t Network Guru Member

    I will try to test it when I get a chance. It's getting pretty busy right now and probably for the next couple of weeks. Will keep you updated.

    --EDIT--

    OK, looks like there is only one instance of pixelserv running at a time when processing multiple requests from browser. Verified through top and ps grep...
  62. vyrticl

    vyrticl Networkin' Nut Member

    mstombs, you took down v11 because of an update but you haven't posted an update.

    Where can we get the latest version?

    This little program is awesome and I would love for it to become a part of some of the mods out there.
  63. mstombs

    mstombs Network Guru Member

    Here's my current test version, I couldn't reproduce i1135t's exact problem, so can't claim it has fixed, but I did manage to break V11 with hung process/connections, some web pages do make dozens of simultaneous connections to their adservers.

    This version accepts an IP address as a parameter, which means it only listens on that IP address. You can therefore use a different IP address on the router and not move the default router IP

    To listen on a specific IP address, possibly a special secondary lan bridge IP (create manually using "ifconfig br0:0 192.168.0.2" for example?), use syntax

    /path/to/pixelserv 192.168.0.2

    V14 is back closer to V7 (which should still be here): it forks to create a new process for each ad. I found that the alternative of doing it all in-line could stall the program and/or result in "connection is closed" browser messages if it attempted to close the connection too quickly. I also increased buffer sizes and added an extra HTTP header to try to let the browser know the connection should be closed.
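
    For readers who want to see the shape of the fork-per-request approach described above, here is a minimal sketch. It is not the pixelserv source: the function name `serve_forked`, the header-only reply and the `Connection: close` header are assumptions chosen for illustration. The drain loop follows the usual half-close pattern (shut down the write side, read until the client is done, then close).

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical reply: headers only, asking the browser to close. */
static const char *reply =
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 0\r\n"
    "Connection: close\r\n"
    "\r\n";

/* Hand one accepted connection to a child process.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t serve_forked(int conn_fd)
{
    assert(conn_fd >= 0);

    pid_t pid = fork();
    if (pid == 0) {                            /* child: answer, then exit */
        char buf[1024];
        ssize_t n = write(conn_fd, reply, strlen(reply));
        if (n >= 0 && shutdown(conn_fd, SHUT_WR) == 0) {
            /* drain whatever the client still sends before closing */
            while (read(conn_fd, buf, sizeof buf) > 0)
                ;
        }
        close(conn_fd);
        _exit(0);
    }
    close(conn_fd);           /* parent (or failed fork): drop our copy */
    return pid;
}
```

    In a real accept loop the parent would call this for every accepted socket and would also have to reap or ignore exited children (for example via SIGCHLD), which this sketch leaves out.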

    I also investigated what causes the script errors reported by Internet Explorer. It is basically a browser vulnerability: the scripts try to execute what they receive without honouring the header saying it's a gif. I didn't find an easy way round this. There's lots of abuse of http standards, presumably to get around people who block cookies: the http request is sometimes for a "name.js?long_string", sometimes clearly just a pseudo name encrypted string.

    I've also seen more instances of ads being requested with https, which attempt to log in to the router on port 443 if the default port is enabled for the router GUI.
  64. 4Access

    4Access Network Guru Member

    Just wanted give some quick feedback. I stumbled across this thread the day before yesterday so v14 is the first build I've tested. I'm running it on my WRT54GSv1 with Tomato v1.25 and so far it has performed flawlessly!

    Thank you so much for spending the time to get this working mstombs! I'd say the thing is darn near perfect!

    I actually don't see how it could get much better unless you were to find a solution for those pesky javascript errors in IE. Speaking of which, you haven't come up with anything really creative in the last few days have you?

    I imagine the main problem is the cases where "the http request is sometimes... just a pseudo name encrypted string" right? Because if I'm following you correctly, in those cases you couldn't easily parse the request to determine if it was looking for a js file, so you wouldn't know whether to send the gif or a js response. Right? Assuming that's the issue, how often are js requests sent as an encrypted string? Would it make sense to build support for responding to clear text js requests into pixelserv or do a majority of sites obfuscate their js requests?

    In closing I'll say that in spite of the js issues I'm really pleased with pixelserv as it exists now. Hats off to you & thanks again!



    And on an insignificant & totally off topic note, this post is actually my first activity on this forum in almost exactly 4 years! I've barely even visited the site during that time and then the other day I just randomly came here to search for something and your post lured me into a response and prompted me to investigate how long it has actually been. My last post was December 30th, 2005! How crazy is that?! And WOW how time flies!
  65. mstombs

    mstombs Network Guru Member

    Good that it works for you! No, I haven't been working on this, but if you look in the source code you can see some TEST code (compiled and run on a Linux PC). I found that ignoring "*.js" was not good enough; I also saw ".bs" and sometimes no file extension at all. It might be better to only send a blank pixel if an image is requested, but what to do if not?
  66. i1135t

    i1135t Network Guru Member

    Thanks for the update... so far no problems with v14. Will update if the situation changes. :)
  67. mstombs

    mstombs Network Guru Member

    Minor update, and 7k binary built using teddy_bear's K26 toolchain (rebuilt under Latest 64-bit Ubuntu).

    Adds configurable port number - mainly useful for testing.

    Code:
    root@WRT54G-TM:/tmp/var/tmp# uname -a
    Linux  WRT54G-TM 2.6.22.19 #3 Tue Mar 23 22:01:01 GMT 2010 mips GNU/Linux
    root@WRT54G-TM:/tmp/var/tmp#  ./pixelserv -h
    Unknown opt: -h
    Usage:./pixelserv [-i] [IP] [-p  80]
    i = inetd mode,IP or hostname to listen, p port No/name
    root@WRT54G-TM:/tmp/var/tmp#  ./pixelserv WRT54G-TM -p http
    pixerlserv[5897]: ./pixelserv V15  compiled: Mar 24 2010 23:24:31 from pixelserv15.c
    ...
    Mar 24 16:28:48 unknown daemon.notice pixerlserv[5899]: Listening on WRT54G-TM:http
    When Rodney's httpd mod discussed here
    http://www.linksysinfo.org/forums/showthread.php?t=64121

    incorporated, shouldn't have to move httpd off port 80 ...
  68. ulif18

    ulif18 Serious Server Member

    error "User defined signal 1" ?

    Tomato v1.27vpn3.6


    BusyBox v1.14.4 (2010-04-22 17:32:21 CEST) built-in shell (ash)
    Enter 'help' for a list of built-in commands.

    # cd /var
    # ./pixelserv
    User defined signal 1


    with v7,v14,v15
  69. mstombs

    mstombs Network Guru Member

    Not seen that error before, but I know the K26 toolchain one doesn't run on stock Tomato due to a library mismatch. Does it put anything more interesting in the Tomato syslog?

    Rodney also has compiled versions in his repository - the static version is huge!
    http://www.linksysinfo.org/forums/showthread.php?t=64245

    I have just compiled a development version using the Tomato 1.23 toolchain, and it runs on an old Tomato 1.22 I have access to, as well as on teddy_bear K26. I'll be interested to hear if it works... maybe there's a conflict with the VPN version?

    Changes are to bind only to an interface (needed help from teddy_bear via dnsmasq 2.52 - there's a kernel bug not fixed until 2.6.31 that means you can't specify a 3 character interface name "br0"). Also experimenting with an extra TCP linger2 setting to try to avoid occasional zombie processes associating with sockets stuck in FIN_WAIT2, caused by browser crash I think.

    Code:
    Tomato v1.22.1570
    
    BusyBox v1.12.2 (2008-11-16 03:35:24 PST) built-in shell (ash)
    
    # chmod +x pixelserv
    # pixelserv -h
    Unknown opt: -h
    Usage:pixelserv [-i] [IP] [-p 80] [-n IF]
    i = inetd mode, IP or hostname to listen, p = port No/name IF = interface name (def br0)
    
    # ./pixelserv
    pixerlserv[4859]: ./pixelserv V16 compiled: Jul  8 2010 22:50:18 from pixelserv16.c
    
    -------
    Tomato v1.27.9047 MIPSR1-beta15 K26 USB Ext
    BusyBox v1.16.1 (2010-06-03 16:13:56 EDT) built-in shell (ash)
    root@wrt54g-tm:/tmp/var# ./pixelserv -p 81
    pixerlserv[8352]: ./pixelserv V16 compiled: Jul  8 2010 22:50:18 from pixelserv16.c
  70. srouquette

    srouquette LI Guru Member

    I'm running it with the adblock script, and it seems there's a problem with it, it seems to fork itself. Does anyone else have this problem?
  71. rhester72

    rhester72 LI Guru Member

    It does create a new thread for each new concurrent request, reaching a limit of 16 threads.

    Rodney
  72. srouquette

    srouquette LI Guru Member

  73. mstombs

    mstombs Network Guru Member

    It also daemonizes (which includes at least one fork - 2 actually if you compare the pids of the log messages) when first started so you don't need a "&" to launch as separate process. I tried in an earlier version to queue requests and process them quickly, but that tends to result in "connection reset messages" in the browser. The current version gives you an idea how many ads have been blocked by bumping up the process id counter.

    Code:
      116 root       596 S    /jffs/pixelserv
    14030 root      1956 R    ps
  74. mstombs

    mstombs Network Guru Member

    Prompted by usage in http://www.linksysinfo.org/forums/showthread.php?t=58443 here's an update with new features

    1. Only sends the null pixel if an image file gif,png,jpx requested. This fixes some reported script errors in Internet Explorer, but not all - some sites expect variables to be set/ iframes to be drawn in javascript etc.

    2. Can now be poked to report how many ads blocked using

    Code:
    kill -SIGHUP $(pidof pixelserv)
    which puts log entry, for example
    Code:
    Nov  2 21:18:05 wrt54g-tm daemon.info pixerlserv[1115]: ./pixelserv V17 compiled: Nov  2 2010 21:06:33 from pixelserv17.c
    Nov  2 21:18:05 wrt54g-tm daemon.notice pixerlserv[1117]: Listening on 0.0.0.0:80
    Nov  2 21:18:56 wrt54g-tm daemon.info pixerlserv[1117]: 1000 ads served
    same count also reported on normal exit.

    3. Various options can now be selected at compile time. The attached was compiled with
    "-DDO_COUNT -DIF_MODE" using the stock Tomato/Linksys toolchain and should run on all
    flavours of Tomato. It can be made a couple of K smaller using the TomatoUSB toolchain,
    but will not then run on stock Tomato.

    If the previous version works for you, there's no compelling reason to change!

    V17 withdrawn due to 2 bugs, stuck processes and 'silent' crashes when no file extension found
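
    The image-only check from item 1 could look roughly like the sketch below. The function name `wants_image` and the exact extension list are assumptions, not taken from pixelserv17.c. The point is that a request with no file extension simply returns 0 instead of running past the end of the path.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* strncasecmp */

/* Return 1 if the request line asks for a path ending in a known image
 * extension, 0 otherwise (including when there is no extension at all). */
int wants_image(const char *request_line)
{
    static const char *const exts[] = { "gif", "png", "jpg", "jpeg" };
    assert(request_line != NULL);

    const char *path = strchr(request_line, ' ');   /* skip the method */
    if (path == NULL)
        return 0;
    path++;

    /* the path ends at the query string, fragment, or next space */
    size_t len = strcspn(path, " ?#\r\n");

    /* walk back to the last '.' without crossing a '/' */
    size_t dot = len;
    while (dot > 0 && path[dot - 1] != '.' && path[dot - 1] != '/')
        dot--;
    if (dot == 0 || path[dot - 1] != '.')
        return 0;                                    /* no extension */

    size_t ext_len = len - dot;
    for (size_t i = 0; i < sizeof exts / sizeof exts[0]; i++) {
        if (strlen(exts[i]) == ext_len &&
            strncasecmp(path + dot, exts[i], ext_len) == 0)
            return 1;
    }
    return 0;
}
```

    A caller would send the null pixel only when this returns 1; what to send otherwise is the open question in this thread.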
  75. srouquette

    srouquette LI Guru Member

  76. mstombs

    mstombs Network Guru Member

    I don't block ads on this site...

    The URL you gave doesn't ask for an image, so your browser just gets a 200 header reply with no data, which Google Chrome doesn't like. I see it says "Error 324 (net::ERR_EMPTY_RESPONSE): Unknown error.".

    I usually use Firefox which doesn't seem to mind getting nothing, for example

    from
    http://192.168.0.1/test

    rather than the pixel from

    http://192.168.0.1/test.gif

    So back to the drawing board! I still think it is Internet Explorer that is broken when it ignores the header reply and tries to interpret a gif as a script
  77. srouquette

    srouquette LI Guru Member

    try to send a character with the HTTP 200, like space :)
    edit: or maybe you can try to reverse the test. Return 200 if it's a .js file, and return a pixel if it's something else.
  78. dkirk

    dkirk LI Guru Member

    Nov 2 20:30:48 router daemon.info pixerlserv[764]: /tmp/mnt/sda1/pixelserv V17 compiled: Nov 2 2010 21:06:33 from pixelserv17.c
    Nov 2 20:30:48 router daemon.notice pixerlserv[766]: Listening on 192.168.1.2:80

    The log file for the pixelserv v17 is showing it starting with a name of "pixerlserv", and line #209 of v17 shows the problem.
  79. mstombs

    mstombs Network Guru Member

    This feature has moved back into TEST. I suspect it needs a proper reply string that sends at least a blank body, rather than just the first line of the null pixel string.


    Well spotted - there was an unpublished intermediate version with shorter name....

    So V18 attached, just adds the "count feature" from V17 to V16.

    [edit] same build using Teddy_bear TomatoUSB Toolchain, and same executable compressed using UPX http://upx.sourceforge.net/ on the router

    Attached Files:

  80. ~nephelim~

    ~nephelim~ Networkin' Nut Member

    I thought of moving the ongoing discussion here.

    The early test script I posted used nc -z, which suppresses the output.

    This one creates a loop using the seq command and doesn't use -z.
    The previous loop construct is supported only on bash ver >4

    Code:
    #!/bin/bash
    PXL_IP="192.168.1.2"
    CNTA=$(date +%s)
    for i in $(seq 1 500)
    do
    echo ------------------------------
    # printf is portable and ends the request with the required blank line
    printf 'GET / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$PXL_IP" | nc "$PXL_IP" 80
    echo ---$i
    done
    CNTB=$(date +%s)
    echo Runtime $(($CNTB-$CNTA)) sec
    If nc still doesn't display the output on ubuntu it might be needed to remove netcat-openbsd and install netcat-traditional (available in package manager)

    Removing both half-socket shutdowns in pixelserv19.c appears to fix the FINWAIT2/dormant pixelserv processes and the truncated "TTP" replies in the nc tests
    Code:
    			/* clean way to flush read buffers and close connection */
    			if (shutdown (new_fd, SHUT_WR) == 0) {
    				while ( read (new_fd, buf, sizeof buf) > 0 );
    			}
    
    			shutdown (new_fd, SHUT_RD);
    From what I've read, though, FINWAIT2 are originally caused by clientside issues (nc in this case) not properly acknowledging serverside (pixelserv) socket closure.

    I was rereading the fork details, which mention that the kernel limit is set by CHILD_MAX. In NoUSB this constant is equal to 999.

    "Accept-ranges: bytes\r\n" in pixelserv reply is not really needed and it advertises a feature pixelserv does not support.
    "Accept-ranges: none\r\n" would be more accurate for pixelserv purposes/implementation.
  81. srouquette

    srouquette LI Guru Member

    Can pixelserv slow down browsing by slowing the router?
    I installed adblock on a friend's router, and he mentioned he felt his browsing speed was slower since the update, and so do I.
    I'm not sure if it's pixelserv, or dnsmasq with a slightly bigger hosts list.
    But my friend mentioned the problem after enabling pixelserv. He had been using adblock for a few days without it.

    He also added one of my scripts in Init:
    Code:
    
    #!/bin/sh
    
    slp=4
    old_rx=0
    old_tx=0
    # qos from kbits to bytes, * 3/4
    qos_ibw=$(($(nvram get qos_ibw) * $slp * 96))
    qos_obw=$(($(nvram get qos_obw) * $slp * 96))
    
    while sleep $slp; do
    vlan=$(ifconfig vlan0 | grep bytes)
    rx=$(echo $vlan | sed -r "s/.*(RX bytes:)([0-9]*).*/\2/")
    tx=$(echo $vlan | sed -r "s/.*(TX bytes:)([0-9]*).*/\2/")
    if [ $(($tx - $old_tx)) -ge $qos_ibw ]; then led amber on; else led amber off; fi
    if [ $(($rx - $old_rx)) -ge $qos_obw ]; then led white on; else led white off; fi
    old_rx=$rx
    old_tx=$tx
    done
    
    This script toggles the LEDs if the download/upload is higher than 75% of the QoS limit (white for upload, amber for download).
    But the script only updates the LEDs every 4 seconds...
  82. ~nephelim~

    ~nephelim~ Networkin' Nut Member

    Using the realtime bandwidth monitor, it looks like pixelserv can increase traffic on vlan0 and br0 but causes no traffic on ppp0 (apart from a few spikes that might be unrelated).

    According to /etc/qos, ppp0 is the interface used for traffic scheduling though.


    At a glance, the impact on CPU usage appears minimal.

    It should be possible to stress test pixelserv by generating 5000 consecutive requests (it takes no more than 3 minutes) and looking at the average CPU load over the last 1 and 5 minutes while no other resource-intensive task is active (average load over the last minute can peak at 0.30, aka 30%). This test generates nearly 30 requests per second.


    I guess it could cause slowdowns while web surfing, but only while multiple web pages need pixelserv for many ads.

    EDIT: average CPU load figures I provided might not be accurate after all. It's probably better to test this at different times on multiple routers.
  83. mstombs

    mstombs Network Guru Member

    I have tested pixelserv as being able to serve 500 blanks in 4 seconds with little CPU hit, so I don't think that's a problem. Your init script above is much heavier: the shell is launching many processes for binaries such as nvram, ifconfig and led. Have you also put in protection so that only one copy of the script is created, and allowed the init script itself to exit?

    Web pages with javascript calls to pixelserv will be slowed down as they generate obscure script errors - known 'feature'.

    Good point re accept-ranges.

    I'm currently looking into the kernel code re FIN_WAIT2; something's not working right...
  84. mstombs

    mstombs Network Guru Member

    On my Ubuntu, nc defaults to some alternative rate-limited dumb output tool, but nc.traditional is still there

    Code:
    $ ls -laF /bin/nc*
    lrwxrwxrwx 1 root root    20 2010-03-23 11:12 /bin/nc -> /etc/alternatives/nc*
    -rwxr-xr-x 1 root root 31296 2010-02-21 06:32 /bin/nc.openbsd*
    -rwxr-xr-x 1 root root 27136 2008-06-21 23:51 /bin/nc.traditional*
    But on my current build of Tomatousb - I do not have any problem with multiple pixelserv processes with any combination of rate limit iptables commands.

    Exactly which version are you having trouble with ~nephelim~?

    I have K2.6 version on test router

    Code:
     /tmp/home/root # uname -an
    Linux wrt54g-tm 2.6.22.19 #7 Sun Nov 28 18:40:49 GMT 2010 mips GNU/Linux
    and also tried a 2.4 version

    Code:
    Linux wrt54g-tm 2.4.37.10 #158 2010-11-27 12:54:55 CST mips GNU/Linux
    I cannot now reproduce the stuck processes - which were on a stock Tomato with 2.4.20 kernel I think.
  85. ~nephelim~

    ~nephelim~ Networkin' Nut Member

    I carried out testing mainly using OpenBSD netcat (Debian patchlevel 1.89-3ubuntu2) on an Ubuntu 10.10 (+guest addons +updates) guest in a Virtualbox 3.2.10 VM on Windows XP SP3 (+ Updates)

    On this setup the nc script does not manage more than 30 requests/second (runtime about 17 seconds for 500 requests)

    Router Firmware K2.4 Nousb Std Build 52 for Buffalo WHR-G54S and pixelserv ver 19

    While nc test ran I continually executed

    Code:
    PXL_IP=192.168.1.2
    ps|grep $PXL_IP$
    
    in tomato web shell (tools-shell.asp)

    Truncated "TTP" replies also occurred from time to time.

    I didn't get any of this after removing both half-socket shutdowns ( close (new_fd); is still carried out before exit)

    If you cannot reproduce either of those glitches I'm really out of ideas about what is going wrong on my router. :confused:
  86. dkirk

    dkirk LI Guru Member

    Running pixelserv 18 for a while now and all of a sudden it reports in the log

    daemon.err pixelserv[808]: Abort: Address already in use

    when I reboot the router. The address declared in the script is the usual 192.168.1.2 with the router being .1, and the .2 address is empty. If I run pixelserv manually with a known empty IP from command line the log shows "Cannot assign requested address". Any thoughts?
  87. mstombs

    mstombs Network Guru Member

    @dkirk,

    The first message means pixelserv has been restarted, or another copy tried to start while the IP address was already in use.

    The IP address must exist before pixelserv can use it - with an "ifconfig br0:0 ..." command, or the "ip" equivalent.
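
    The two log messages correspond to two different `errno` values from a failed `bind()`. A small sketch of telling them apart follows. The helper name `bind_hint` and the hint wording are made up for illustration and are not pixelserv's own messages.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Map the errno left by a failed bind() to a hint for the user. */
const char *bind_hint(int err)
{
    assert(err != 0);     /* only meaningful after a failed call */

    switch (err) {
    case EADDRINUSE:      /* typically logged as "Address already in use" */
        return "another listener (perhaps an older pixelserv) already holds this IP:port";
    case EADDRNOTAVAIL:   /* typically "Cannot assign requested address" */
        return "the IP is not configured on any interface; add it first, e.g. with ifconfig br0:0";
    case EACCES:
        return "ports below 1024 need root";
    default:
        return strerror(err);
    }
}
```

    So after a reboot it is worth checking both that no other copy is already running and that the alias address was created before pixelserv started.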
  88. mstombs

    mstombs Network Guru Member

    @~nephelim~ (or anyone else who wants to test!)

    I also DID get a couple of stuck processes, but just can't reproduce them at will! I haven't noticed missing characters, so I hope it's just a display thing after receiving the binary gif?

    If you look in the Linux source, the TCP code is different between K2.4 and K2.6. The TCP stuff is in, for example, /net/tcp.c, and there are strange comments re linger2 and FIN_WAIT2. It looks to me as if only a negative value of TCP_LINGER2 is used to disable waits in FIN_WAIT2, so here's a new version 20 attempting to turn this state off.
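
    A minimal sketch of setting a negative linger2 value from user space, assuming Linux, where the socket option is named `TCP_LINGER2` and is set at the `IPPROTO_TCP` level. The helper name `disable_fin_wait2` is hypothetical, and this is not claimed to be the exact call used in version 20.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_LINGER2 (Linux-specific) */
#include <sys/socket.h>

/* Ask the kernel not to park this socket's orphaned connection in
 * FIN_WAIT2. Returns 0 on success, -1 on error (errno set). */
int disable_fin_wait2(int fd)
{
    int linger2 = -1;   /* negative: do not wait in FIN_WAIT2 */
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_LINGER2,
                      &linger2, sizeof linger2);
}
```

    It would be called on the connected socket before closing it. Whether a given 2.4 kernel actually honours the setting is exactly what is being tested here.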

    Also "bytes" change to "none" in header, and change to default interface behaviour (in response to request elsewhere, where br0 not used). To listen only on br0 as V16-V19 default use "-n br0".
  89. mstombs

    mstombs Network Guru Member

    Version 21 drops to user "nobody" by default as per dnsmasq (still requires to be started as root), slight rearrangement of messages to limit size increase.

    Code:
    Usage:./pixelserv [-i] [IP] [-p 80] [-n IF] [-u user]
    i = inetd mode, IP or hostname to listen on (all), p = port No/name, IF = interface name (all), user ("nobody")
    root@wrt54g-tm:/tmp/home/root# ./pixelserv -p 8080 -n br0 -u root
    pixelserv[5152]: ./pixelserv V21 compiled: Dec  1 2010 00:10:59 from pixelserv21.c
  90. Jedis

    Jedis Networkin' Nut Member

    Still going strong with v7. I haven't bothered to update since I haven't had any problems on my WRT54G v3.

    # ./pixelserv
    pixelserv V7 compiled: Sep 14 2009 21:58:29 from pixelserv.c
    # uname -an
    Linux 2.4.20 #1 Fri Jul 24 04:24:56 CEST 2009 mips GNU/Linux
  91. mstombs

    mstombs Network Guru Member

    HaHa, stick with it, V9-V11 was a step back (no fork), most other changes "nice to have" configurable options - latest version uses same gif and same fork() method as V7.

    But do you regularly reboot router - occasional "stuck processes" - multiple entries of "pixelserv" processes in "ps" or "top" were a feature of V7, which was bigger problem in single threaded V9-V11.
  92. Jedis

    Jedis Networkin' Nut Member

    I haven't rebooted the router in almost two months. Here's the stats. Tell me what you think:
    Code:
    # ps
      PID USER       VSZ STAT COMMAND
        1 root      1768 S    init noinitrd 
        2 root         0 SW   [keventd]
        3 root         0 SWN  [ksoftirqd_CPU0]
        4 root         0 SW   [kswapd]
        5 root         0 SW   [bdflush]
        6 root         0 SW   [kupdated]
        7 root         0 SW   [mtdblockd]
       26 root      1720 S    buttons 
       73 root      1956 S    syslogd -R 192.168.1.6:514 -L -s 50 
       74 root      1532 S    dropbear -p 22 -a 
       78 root      1932 S    klogd 
       97 root      1960 S    crond 
      100 root      1256 S    rstats 
      107 root      1688 S    httpd -s 
      440 root       596 S    /var/pixelserv 
    10658 root       596 S    /var/pixelserv 
    26016 root      1720 S    redial 
    26017 root       856 S    pppoecd vlan1 -u XXXXXXXXXXXXXXXXXXXXXXXX -p XXXXXXXX
     3678 root       596 S    /var/pixelserv 
     3679 root       596 S    /var/pixelserv 
     3684 root       596 S    /var/pixelserv 
     3685 root       596 S    /var/pixelserv 
     9040 root       596 S    /var/pixelserv 
     9041 root       596 S    /var/pixelserv 
     9268 root       596 S    /var/pixelserv 
     9398 root       596 S    /var/pixelserv 
     9399 root       596 S    /var/pixelserv 
     9449 root       596 S    /var/pixelserv 
    29328 root       596 S    /var/pixelserv 
    29330 root       596 S    /var/pixelserv 
    19051 root       596 S    /var/pixelserv 
    19052 root       596 S    /var/pixelserv 
    19054 root       596 S    /var/pixelserv 
    19056 root       596 S    /var/pixelserv 
    31979 nobody    1284 S    dnsmasq 
      362 root       596 S    /var/pixelserv 
      965 root       596 S    /var/pixelserv 
     1024 root       596 S    /var/pixelserv 
    27133 root      1596 R    dropbear -p 22 -a 
    27135 root      1972 S    -sh 
    27137 root      1948 R    ps 
    # uptime
     05:42:31 up 59 days,  3:04, load average: 0.00, 0.00, 0.00
    # 
    
  93. ~nephelim~

    ~nephelim~ Networkin' Nut Member

    I'm testing latest pixelserv (V21) on latest K2.4 No usb no vpn build 54.


    I will have to test some more with and without ipfiler rules, but using a negative value for linger2 appear to help when pixelserv is bound to br0

    EDIT2: Still got hung pixelserv -n br0 instances on some attempts without rate limiting iptable rules.

    Code:
    PXL_IP=192.168.1.2
    PXL_EXE="/tmp/pixelserv"
    ifconfig br0:0 $PXL_IP
    $PXL_EXE -n br0 $PXL_IP;

    If I have pixelserv bind the default interface (not sure what it is) instead of br0 I can get stuck processes at times (it is more likely on early tests after a router reboot):

    Code:
    PXL_IP=192.168.1.254
    PXL_EXE="/tmp/pixelserv"
    ifconfig br0:0 $PXL_IP
    $PXL_EXE $PXL_IP
    I guess ifconfig should be called with different options in such case.


    Spamming Tomato webshell (Tools > system) with ps commands might increase this chance [strike](applies only when pixelserv is bound to default IF).[/strike]


    During the same test most replies are not truncated, only few are.
    The truncated replies might be even caused by some nc bug,[strike] but removing [/strike]

    Code:
    if (shutdown (new_fd, SHUT_WR) == 0)
    [strike]alone apparently suffices to prevent this. [/strike]

    EDIT: nope. it turned out removing shutdowns is unable to affect this at all (the following script had me identify these cases more easily)


    If nc provides output it would be possible to use this script which does return only truncated replies,
    Code:
    #!/bin/bash
    PXL_IP="192.168.1.2"
    CNTA=$(date +%s)
    for i in {1..500}
    do
    echo -e "GET / HTTP/1.0\r\n"|nc $PXL_IP 80|grep -a ^TTP
    done
    CNTB=$(date +%s)
    
    echo Runtime $(($CNTB-$CNTA)) sec
    
  94. mstombs

    mstombs Network Guru Member

    You have a problem: 17 in ~30,000 haven't terminated, but you can tell from the process IDs that sometimes 10,000 requests go by without error! Each pixelserv process consumes memory and will be associated with an open connection (see "netstat -an"). But you can't use V21 above on old Tomato; I need to post one built with the old toolchain (library issue).
  95. Jedis

    Jedis Networkin' Nut Member

    Is there an elegant way of handling the zombie processes?

    Maybe with a script in the cron that will kill all of those zombie processes that have been running more than a couple minutes, checking once per day?
  96. damwill

    damwill LI Guru Member

    Here's the script I'm using, set to run every night:
    Code:
    kill $(pidof pixelserv)
    sleep 2
    /jffs/pixelserv 192.168.1.86
  97. mstombs

    mstombs Network Guru Member

    and as discussed before I would use "killall pixelserv", before restarting, as killing only the single parent just leaves children as orphans, and you have to wait for grim reaper (init) to reap them!
  98. rhester72

    rhester72 LI Guru Member

    Interesting. If you SIGHUP pixelserv to get the statistics, the children go away. Any idea why? (Maybe a different/custom signal should be used for stats, since I believe SIGHUP is also "passing through" the app to the kernel?)

    Rodney
  99. mstombs

    mstombs Network Guru Member

    Thanks Rodney, I was just looking at a captured FIN_WAIT2 on my K2.4 box with V21,

    Code:
    ps ...
    
    Dec  1 20:59:11 wrt54gs daemon.info pixelserv[16215]: 993 requests served
    Dec  1 20:59:11 wrt54gs daemon.info pixelserv[15054]: 1157 requests served
    
    # netstat -an
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    tcp        0      0 192.168.0.1:80          192.168.0.86:33152      FIN_WAIT2
    
    
    and finding many questions on the internet about it, but few answers. Here, for example, it sounds like the same problem on a completely different distro/kernel

    http://www.linuxforums.org/forum/networking/149949-connections-stuck-fin_wait2-days.html

    and this works - I know the SIGHUP signal kicks the main loop out of the "accept" state - I had to hide the "accept:" warning! The signal clearly hits the child as well because I got messages from both

    Code:
    Dec  1 20:59:11 wrt54gs daemon.info pixelserv[16215]: 993 requests served
    Dec  1 20:59:11 wrt54gs daemon.info pixelserv[15054]: 1157 requests served

    Have to admit out of my depth - but I know dnsmasq uses the SIGHUP signal to trigger cache reload, so I copied the mechanism as far as I could!

    But just scheduling a regular cron to do the SIGHUP is a work-around that appears to clean up!

    This is a cut from an old Tomato kernel source which doesn't seem to work...

    Code:
    	/*	This is a (useful) BSD violating of the RFC. There is a
    	 *	problem with TCP as specified in that the other end could
    	 *	keep a socket open forever with no application left this end.
    	 *	We use a 3 minute timeout (about the same as BSD) then kill
    	 *	our end. If they send after that then tough - BUT: long enough
    	 *	that we won't make the old 4*rto = almost no time - whoops
    	 *	reset mistake.
    	 *
    	 *	Nope, it was not mistake. It is really desired behaviour
    	 *	f.e. on http servers, when such sockets are useless, but
    	 *	consume significant resources. Let's do it with special
    	 *	linger2	option.					--ANK
    	 */
    
    	if (sk->state == TCP_FIN_WAIT2) {
    		struct tcp_opt *tp = &(sk->tp_pinfo.af_tcp);
    		if (tp->linger2 < 0) {
    			tcp_set_state(sk, TCP_CLOSE);
    			tcp_send_active_reset(sk, GFP_ATOMIC);
    			NET_INC_STATS_BH(TCPAbortOnLinger);
    		} else {
    			int tmo = tcp_fin_time(tp);
    
    			if (tmo > TCP_TIMEWAIT_LEN) {
    				tcp_reset_keepalive_timer(sk, tcp_fin_time(tp));
    			} else {
    				atomic_inc(&tcp_orphan_count);
    				tcp_time_wait(sk, TCP_FIN_WAIT2, tmo);
    				goto out;
    			}
    		}
    	}
    of course that may not even be code that gets compiled in ...
  100. rhester72

    rhester72 LI Guru Member

    Looks like the signal hits children as well. I'd recommend changing to SIGUSR1. dnsmasq uses SIGHUP because it actually takes advantage of the interruption, whereas you really can't.

    Rodney
