
Examining Tomato may Crash it

Discussion in 'Tomato Firmware' started by Planiwa, Nov 11, 2009.

  1. Planiwa

    Planiwa LI Guru Member

    One way to crash Tomato is to try to find out why Tomato crashes by logging in and monitoring it.

    Every time you log into Tomato, you increase the risk that it will crash. I will now show one particular mechanism:

    Tomato's shell ("Busybox") is configured suboptimally. As a result, processes often do not end properly when they are terminated. This is a fairly well-known bug that affects such common invocations as ping or tail -f: they are killed with an INT signal (^C), but don't really go away. The same goes for shell scripts with loops containing sleep.

    However, it can also happen if one does nothing more than ssh to the router and let the session time out. This is one reason why I have said that it is completely counterproductive to try to increase system stability by using "Conntrack emergency expire", especially doing so repeatedly.

    When a session times out, all of its processes should be terminated, including the login shell, and its parent dropbear child.

    But what can happen instead is that the connection is removed from the Conntrack table, and appears to be terminated, but netstat still shows it. Worse, if you just log in and let it time out, both the shell and its parent dropbear remain as processes, taking up 25% of "the" memory.

    When memory drops "too" low gradually, the kernel will try to kill processes to free up memory, and it may catch (some of) those defunct processes. And quite possibly some essential processes.

    (It always starts by killing dnsmasq.)

    When there is a sudden surge in memory demand while memory is critically low, that may result in a crash.

    As you can see in the data below, memory had dropped to 240k Free + 2072k Cached.

    Here is some data:
    Code:
    # netstat -t
    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State       
    tcp        0    128 ROUTER.site:ssh    xx.xxx.xx.xx:32689      ESTABLISHED 
    tcp        0    128 ROUTER.site:ssh    xx.xxx.xx.xx:33903      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:26783      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:28175      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:31495      ESTABLISHED 
    tcp        0    128 ROUTER.site:ssh    xx.xxx.xx.xx:28739      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:31660      ESTABLISHED 
    tcp        0    128 ROUTER.site:ssh    xx.xxx.xx.xx:27332      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:32470      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:34764      ESTABLISHED 
    tcp        0   1024 ROUTER.site:ssh    xx.xxx.xx.xx:32978      ESTABLISHED 
    tcp        0    128 ROUTER.site:ssh    xx.xxx.xx.xx:31516      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:34442      ESTABLISHED 
    tcp        0    128 ROUTER.site:ssh    xx.xxx.xx.xx:32602      ESTABLISHED 
    tcp        0      0 ROUTER.site:ssh    xx.xxx.xx.xx:27222      ESTABLISHED 
    # 
    # ctv -S 2222
    1110.22:57:41
    -RTR  2222/22  32978    <remote-admin tcp     1  5  E a
              SEARCH: 2222
    1110.22:57:43
    # ps
      PID USER       VSZ STAT COMMAND
        1 root      1768 S    init noinitrd 
        2 root         0 SW   [keventd]
        3 root         0 RWN  [ksoftirqd_CPU0]
        4 root         0 SW   [kswapd]
        5 root         0 SW   [bdflush]
        6 root         0 SW   [kupdated]
        7 root         0 SW   [mtdblockd]
       26 root      1720 S    buttons 
       28 root         0 SWN  [jffs2_gcd_mtd3]
       66 root      1720 S    redial 
       68 root       856 S    pppoecd vlan1 -u XXXXXXXXXXXXXXXXXXX -p XXXXXXXX -r 1492 -t 1492 -i 0 -I 30 -N 5 -T 5 -P 
       71 root      1948 S    syslogd -L -s 50 
       72 root      1532 S    dropbear -p 22 
       74 root      1932 S    klogd 
       99 root      1708 S    nas /etc/nas.conf /var/run/nas.pid lan 
      106 root      1960 S    crond -l 9 
      109 root      1256 S    rstats 
      113 root      1684 S    httpd -s 
     2828 root      1600 S    dropbear -p 22 
     3880 root      1600 S    dropbear -p 22 
     4157 root      1600 S    dropbear -p 22 
     4558 root      1600 S    dropbear -p 22 
     4661 root      1600 S    dropbear -p 22 
     4801 root      1600 S    dropbear -p 22 
     4825 root      1972 S    -sh 
     4965 root      1600 S    dropbear -p 22 
     5012 root      1972 S    -sh 
     5189 root      1600 S    dropbear -p 22 
     5204 root      1972 S    -sh 
     5371 root      1600 S    dropbear -p 22 
     5372 root      1972 S    -sh 
     5832 root      1600 S    dropbear -p 22 
     5856 root      1972 S    -sh 
     6968 root      1600 S    dropbear -p 22 
     6993 root      1972 S    -sh 
     6995 root      1600 S    dropbear -p 22 
     7031 root      1972 S    -sh 
     7391 root      1600 S    dropbear -p 22 
     7414 root      1972 S    -sh 
     8738 nobody     856 S    dnsmasq 
     8773 root      1600 S    dropbear -p 22 
     8774 root      1976 S    -sh 
     8845 root      1948 R    ps 
    # 
    # vit
    VIT: 4 222 632+2704 .1 2/44 8h 8h ppp0:781-86 eth1:807-711
    # 
    # tail -66 /var/log/*s
    Nov 10 21:59:11 ROUTER daemon.info dnsmasq[7639]: read /etc/hosts.dnsmasq - 12 addresses
    Nov 10 22:09:06 ROUTER user.info VIT: 3 45 240+2072 .5 1/54 7h 7h ppp0:674-79 eth1:698-610
    Nov 10 22:25:18 ROUTER daemon.info dnsmasq-dhcp[7639]: DHCPDISCOVER(br0) 192.168.0.113 ...
    ...
    Nov 10 22:35:53 ROUTER user.err kernel: Out of Memory: Killed process 7639 (dnsmasq).
    Nov 10 22:35:54 ROUTER daemon.info dnsmasq[8553]: started, version 2.49 cachesize 150
    Nov 10 22:35:54 ROUTER daemon.info dnsmasq[8553]: read /etc/hosts.dnsmasq - 12 addresses
    Nov 10 22:36:30 ROUTER user.err kernel: Out of Memory: Killed process 8553 (dnsmasq).
    Nov 10 22:36:31 ROUTER daemon.info dnsmasq[8579]: started, version 2.49 cachesize 150
    ...
    Nov 10 22:36:31 ROUTER daemon.info dnsmasq[8579]: read /etc/hosts - 0 addresses
    Nov 10 22:36:31 ROUTER daemon.info dnsmasq[8579]: read /etc/hosts.dnsmasq - 12 addresses
    Nov 10 22:37:01 ROUTER user.err kernel: Out of Memory: Killed process 8579 (dnsmasq).
    Nov 10 22:37:05 ROUTER daemon.info dnsmasq[8591]: started, version 2.49 cachesize 150
    Nov 10 22:37:05 ROUTER daemon.info dnsmasq[8591]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N DHCP no-TFTP
    Nov 10 22:37:05 ROUTER daemon.info dnsmasq-dhcp[8591]: DHCP, IP range 192.168.0.190 -- 192.168.0.199, lease time 2h
    Nov 10 22:37:05 ROUTER daemon.info dnsmasq[8591]: reading /etc/resolv.dnsmasq
    Nov 10 22:37:05 ROUTER daemon.info dnsmasq[8591]: using nameserver ...
    Nov 10 22:37:05 ROUTER daemon.info dnsmasq[8591]: using nameserver ...
    Nov 10 22:37:05 ROUTER user.err kernel: Out of Memory: Killed process 8591 (dnsmasq).
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq[8594]: started, version 2.49 cachesize 150
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq[8594]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N DHCP no-TFTP
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq-dhcp[8594]: DHCP, IP range 192.168.0.190 -- 192.168.0.199, lease time 2h
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq[8594]: reading /etc/resolv.dnsmasq
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq[8594]: using nameserver ...
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq[8594]: using nameserver ...
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq[8594]: read /etc/hosts - 0 addresses
    Nov 10 22:37:06 ROUTER daemon.info dnsmasq[8594]: read /etc/hosts.dnsmasq - 12 addresses
    Nov 10 22:39:04 ROUTER user.err kernel: Out of Memory: Killed process 8594 (dnsmasq).
    Nov 10 22:39:08 ROUTER user.err kernel: Out of Memory: Killed process 2829 (sh).
    Nov 10 22:39:15 ROUTER user.err kernel: Out of Memory: Killed process 3881 (sh).
    Nov 10 22:39:19 ROUTER user.err kernel: Out of Memory: Killed process 4158 (sh).
    Nov 10 22:39:24 ROUTER user.err kernel: Out of Memory: Killed process 4559 (sh).
    Nov 10 22:39:30 ROUTER user.err kernel: Out of Memory: Killed process 4562 (sh).
    Nov 10 22:39:38 ROUTER user.err kernel: Out of Memory: Killed process 8704 (awk).
    Nov 10 22:39:41 ROUTER user.err kernel: Out of Memory: Killed process 4662 (sh).
    Nov 10 22:39:46 ROUTER user.err kernel: Out of Memory: Killed process 4703 (sh).
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq[8738]: started, version 2.49 cachesize 150
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq[8738]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N DHCP no-TFTP
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq-dhcp[8738]: DHCP, IP range 192.168.0.190 -- 192.168.0.199, lease time 2h
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq[8738]: reading /etc/resolv.dnsmasq
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq[8738]: using nameserver ...
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq[8738]: using nameserver ...
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq[8738]: read /etc/hosts - 0 addresses
    Nov 10 22:39:46 ROUTER daemon.info dnsmasq[8738]: read /etc/hosts.dnsmasq - 12 addresses
    Nov 10 22:39:48 ROUTER user.info VIT: 4 148 380+2600 3.4 1/48 8h 8h ppp0:753-84 eth1:778-685
    Nov 10 22:39:52 ROUTER authpriv.info dropbear[4702]: exit after auth (root): Exited normally
    Nov 10 22:44:27 ROUTER authpriv.info dropbear[8773]: Child connection from xx.xxx.xx.xx:32978
    Nov 10 22:44:33 ROUTER authpriv.notice dropbear[8773]: password auth succeeded for 'root' from xx.xxx.xx.xx:32978
    # 
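
    A minimal sketch for summarising a log like the one above: it tallies the kernel's "Out of Memory: Killed process" lines by process name, so you can see at a glance what the kernel has been killing. The function name oom_tally is made up for illustration, and it assumes the exact message format shown in this log (process name in parentheses at the end of the line).

```shell
# oom_tally [FILE...]: count OOM kills per process name.
# Reads the named log files, or stdin if none are given.
# Assumes lines ending in "Killed process <pid> (<name>)." as in the log above.
oom_tally() {
    awk '/Out of Memory: Killed process/ {
             name = $NF                 # last field, e.g. "(dnsmasq)."
             gsub(/[().]/, "", name)    # strip the parentheses and the dot
             count[name]++
         }
         END { for (n in count) print count[n], n }' "$@" | sort -rn
}
```

    For example: oom_tally /var/log/messages (the log path is an assumption; use whatever file your syslogd writes to).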
    
    I wonder if anyone is at all interested in all this.
     
  2. mstombs

    mstombs Network Guru Member

    If there are bugs in BusyBox config - please document and they can be fixed.

    "top" is a useful tool when it displays the parent pid - you can easily see what started what.

    In Tomato, dropbear runs in daemon mode and always spawns a child to deal with a request. Maybe it could be configured to limit the number of subprocesses, but then the (n+1)th login request would fail. On other systems I have seen multiple instances of dropbear left over from broken connections. I'm sure it could be configured to time out, but I usually kill off the ones I don't want manually.

    dropbear could also be run in inetd mode, with inetd managing the number of instances. If other things could also be configured to run only on demand there's a possibility of normally having more available free ram.
     
  3. Toastman

    Toastman Super Moderator Staff Member Member

    I'm very interested but am not sufficiently savvy to be able to assist much. Certainly anything you are able to offer that may help the developers or modders to configure tomato would be terrific. Your monitor scripts are first class.

    The very common situation that I see is a reboot of a router under heavy load, shortly after login to the web GUI.

    Thanks for the information that you are posting !
     
  4. mstombs

    mstombs Network Guru Member

    Note: sshd dropbear 0.52 supports a timeout parameter, which would kill off dormant copies.

    Code:
    -I <idle_timeout>  (0 is never, default 0)
    There is no easy way to set it via the web GUI (that needs a code mod), but you could run dropbear from the command line.

    The reason dnsmasq gets killed by the kernel is probably that it runs with low privileges as user "nobody", while most other services stay running as "root". But if dnsmasq dies, the rc files restart it when they notice, and dnsmasq does a lot of option handling etc. when starting as "root". This could be the point at which the router dies.
     
  5. Planiwa

    Planiwa LI Guru Member

    Funny how not a single (technically knowledgeable) reader has said "I find that hard to believe -- someone logs in, session times out, and now there are orphaned processes?! This can't be true! Let me try this. ... (20 minutes later) ... Sure enough! This is amazing. This is so wrong. Let's see what causes this wrong behaviour."

    Instead, the problem itself (that processes are not terminated properly when the session times out) is ignored, and since one kind of timeout is already buggy, an additional layer of timeout is proposed. :)
     
  6. mstombs

    mstombs Network Guru Member

    The default dropbear timeout is a compile-time #define in options.h. The current behaviour appears deliberate: it never times out, so that you can leave some non-interactive command running, or perhaps on the assumption that you may manage to reconnect to the same session?
     
  7. Toastman

    Toastman Super Moderator Staff Member Member

    That's interesting, Michael. Could you make a compile with this default behavior changed to something a little less risky for us to test?

    I must go get a HDD to put Ubuntu back on...
     
  8. Planiwa

    Planiwa LI Guru Member

    The above were still problems in stock 1.25, IIRC.

    I am now using

    Tomato v1.25.8515
    BusyBox v1.14.2 (2009-07-02 18:01:37 CEST) built-in shell (ash)
    and the above are not problems.

    Perhaps they are no longer problems in 1.26 Beta?

    In case someone is interested in verifying this:

    # cat /proc/loadavg
    0.03 0.01 0.00 2/22 7263

    This tells you how many processes there are (22, the number after the slash).

    Please run the following commands, then interrupt them with Ctrl-C,
    then check if the number of processes has increased.
    If it has, the problem still persists.

    # ping 4.2.2.2

    # tail -f /var/log/*s

    # while sleep 2;do echo foo;done

    It would be valuable to know if these problems are gone.
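
    To make the before/after comparison less error-prone, here is a small sketch that pulls the process count out of a /proc/loadavg line. proc_count is a hypothetical helper name; it relies only on the fourth field having the running/total form shown above.

```shell
# proc_count: read one /proc/loadavg-style line on stdin and
# print the total number of processes (the part after the slash).
proc_count() {
    read -r load1 load5 load15 procs lastpid || return 1
    echo "${procs#*/}"    # "2/22" -> "22"
}
```

    Run proc_count < /proc/loadavg before the test and again after the Ctrl-C, and compare the two numbers. The count can fluctuate by one or two on a busy router, so repeat the test a few times before drawing conclusions.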


    Similarly, it would be good to know if the problem resulting from disconnections happens with versions other than Tomato v1.25.8515

    To test that do:

    (assuming the default timeout of 1200 seconds for Established TCP)

    start an ssh session.

    # ps
    Code:
      PID USER       VSZ STAT COMMAND
        1 root      1772 S    init noinitrd 
        2 root         0 SW   [keventd]
        3 root         0 RWN  [ksoftirqd_CPU0]
        4 root         0 SW   [kswapd]
        5 root         0 SW   [bdflush]
        6 root         0 SW   [kupdated]
        7 root         0 SW   [mtdblockd]
       26 root      1720 S    buttons 
       28 root         0 SWN  [jffs2_gcd_mtd3]
       71 root      1948 S    syslogd -L -s 50 
       72 root      1532 S    dropbear -p 22 
       74 root      1932 S    klogd 
       99 root      1708 S    nas /etc/nas.conf /var/run/nas.pid lan 
      106 root      1960 S    crond -l 9 
      109 root      1256 S    rstats 
      113 root      1684 S    httpd -s 
    11555 root      1720 S    redial 
    11556 root       856 S    pppoecd vlan1 -u XXXXXXXXXXXXXXXXXXX -p XXXXXXXX -r 1492 -t 1492 -i 0 -I 30 -N 5 -T 5 -P 0 -C p
    11563 nobody     884 S    dnsmasq 
     1704 root      1600 S    dropbear -p 22 
     1709 root      1976 S    -sh 
     7324 root      1948 R    ps 
    
    You should see one sh and two dropbears.

    Let the session time out (should take 20 minutes).

    Make a new ssh connection, and run ps.

    Do you see 1 sh and 2 dropbears? or
    do you see 2 sh and 3 dropbears? (because the old ones did not terminate properly).
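
    If you would rather not count by eye, this sketch counts dropbear and login-shell entries in ps output. count_sessions is a made-up name, and it assumes the five-column BusyBox ps layout shown above (PID USER VSZ STAT COMMAND); other ps variants would need a different field number.

```shell
# count_sessions: read BusyBox-style `ps` output on stdin and print
# "<number of dropbear processes> <number of -sh login shells>".
# Assumes the command name is the 5th whitespace-separated field.
count_sessions() {
    awk '$5 == "dropbear" { bears++ }
         $5 == "-sh"      { shells++ }
         END { print bears + 0, shells + 0 }'
}
```

    Usage: ps | count_sessions. With one live ssh session you would expect "2 1"; higher numbers after a timed-out session suggest that the old processes were left behind.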

    Please report back. Thanks for helping.
     
  9. mstombs

    mstombs Network Guru Member

    I know from testing on my ADSL modem that the dropbear timeout works. It is a pain, though, when you are used to leaving ssh sessions open. I will experiment first with another dropbear setting.

    Code:
    /* Ensure that data is transmitted every KEEPALIVE seconds. This can
    be overridden at runtime with -K. 0 disables keepalives */
    #define DEFAULT_KEEPALIVE 0
    I'm hoping that setting this to say 60 seconds and the timeout to 120 secs will mean only disconnected dropbear sessions will be killed.

    I most recently built a variant of teddy_bears mod, but haven't tested on anything other than my own WRT54G-TM which has working serial console and JTAG!
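
    For anyone who wants to try the same values at runtime rather than at compile time, here is a rough sketch. It only prints a dropbear command line so that it can be reviewed before being run. It uses the -p, -K and -I options mentioned in this thread; the helper name dropbear_cmd and the port 2222 in the usage note are assumptions for illustration, and whether keepalive traffic counts as activity for the idle timer should be checked on your own build.

```shell
# dropbear_cmd PORT KEEPALIVE_SECS IDLE_SECS
# Print (do not run) a dropbear command line with the given settings.
#   -p  listening port
#   -K  keepalive interval in seconds (0 disables keepalives)
#   -I  idle timeout in seconds (0 means never)
dropbear_cmd() {
    echo "dropbear -p $1 -K $2 -I $3"
}
```

    For example, dropbear_cmd 2222 60 120 prints a command for a second, experimental instance on a spare port, leaving the normal one on port 22 untouched.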
     
  10. Planiwa

    Planiwa LI Guru Member

    I have a monitoring shell script, ctw. I invoke it remotely with ssh. Then I kill the ssh session with SIGINT.

    Here is what happens:

    The connection correctly moves from state Established to state Time Wait, and times out after 120 seconds.

    The dropbear process terminates correctly.

    But its 3 child processes remain:

    Code:
    18359 18358 root     S     1968  14%   0% /bin/sh /root/ctw 
    18358     1 root     S     1960  14%   0% sh -c /root/ctw 
    19376 18359 root     S     1932  13%   0% sleep 10 
    
    These processes not only remain extant, they actually continue executing -- consuming CPU time, spawning new processes, and writing their output to who knows where.

    This bug is perfectly reproducible.
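
    A sketch for spotting such leftovers: it lists every process whose parent is PID 1, which is where orphans like the "sh -c /root/ctw" above end up. orphans_of_init is a hypothetical name, and it assumes the Name:, Pid: and PPid: lines of /proc/PID/status. Legitimate daemons also have init as their parent, so treat the output as a list of candidates, not a kill list.

```shell
# orphans_of_init [PROCROOT]: print "PID NAME" for each process whose
# parent is PID 1. PROCROOT defaults to /proc.
orphans_of_init() {
    root=${1:-/proc}
    for f in "$root"/[0-9]*/status; do
        [ -r "$f" ] || continue    # process may have exited meanwhile
        awk '$1 == "Name:" { name = $2 }
             $1 == "Pid:"  { pid = $2 }
             $1 == "PPid:" { ppid = $2 }
             END { if (ppid == 1) print pid, name }' "$f"
    done
}
```

    An unexpected sh or sleep in that list after a disconnected session is the symptom described above.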


    [Tomato v1.25.8515 ND BusyBox v1.14.2 (2009-07-02 18:01:37 CEST)]
     
  11. mstombs

    mstombs Network Guru Member

    Unfortunately, this is by design for dropbear. Apparently the bug is in your monitoring script: it needs to notice that it has lost its stdin/stdout (does it not die when the sleep times out?). See the following reply by the dropbear author to a patch request to do what you want:

    http://lists.ucc.gu.uwa.edu.au/pipermail/dropbear/2008q3/000782.html

    but the patch is in the previous post if you want to build a custom version...

    The dropbear idle timeout does work, but I find it too aggressive: it times out if no DATA is received, and is more likely to create orphan processes. I am testing a patched version of dropbear which accepts the NULL keepalives from PuTTY. I just need to move the watchdog timer reset back to where the original feature proposer put it, reversing the dropbear author's mod proposed here:

    http://lists.ucc.gu.uwa.edu.au/pipermail/dropbear/2008q3/000780.html
     
  12. Planiwa

    Planiwa LI Guru Member

    Well, thank you! :)

    To say "the bug is in your script" at least acknowledges that there *is* a problem.

    The links that you provide show that there are others who think that it is reasonable to expect that child processes of "hung-up" connections should be properly terminated. (Unless exempt from the expected hangup actions, via "nohup", etc.)

    But whether what you call design is good design or bad design, the fact remains that various activities, inspired by "random reboot crashes", and intended to monitor the system to try to understand what causes crashes, may themselves increase the risk of a crash.

    That is valuable knowledge, and to share it is the object of this thread.

    (And, "if monitoring can precipitate crashes then stop monitoring" is a possible, though perhaps not very helpful, response.)


    Yes, there are workarounds for some of these challenges, as you suggest.

    A repetitive monitor can check the return code of an echo to stdout.
    (There is a subtle irony in that -- perhaps it's time for a sequel to "The Unix and the Echo" :)

    One could run a euthanizer at the start of every interactive session (if one remembers), which will clean up after the session is disconnected, for whatever reason.

    But if someone just logs in and the session times out (for a variety of reasons, including a hyperactive expire-early script), the system will not clean up the disconnected processes. And most people won't be aware of the problem, and may log in a dozen more times, and then the router crashes "randomly".

    It seems reasonable to have multiple GUI and ssh windows open to monitor various aspects of the system simultaneously.

    But we have reports that suggest that logging in to the GUI may trigger "random reboots".

    It is probably not very well known that some Tomato web pages can themselves generate very large numbers of connections very fast.

    Problem awareness is a quality value. :)
     
  13. rkloost

    rkloost Addicted to LI Member

    Can you provide me with a copy of the script?

    BTW is the ND version the only version with problems or the regular version too?
     
  14. Planiwa

    Planiwa LI Guru Member

    As I said there is a workaround, so long as the script echoes to stdout.
    Simply follow an echo command with:
    Code:
    if [ $? -ne 0 ];then exit;fi ### $? == 141 when session is disconnected
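
    The same idea can be wrapped in a helper so that every line a monitor prints is checked. This is only a sketch: report and monitor are made-up names, and it treats any non-zero status from echo as "stdout is gone", not just 141.

```shell
# report ARGS...: print one line; if the write fails, stop the script.
report() {
    echo "$@" || exit 1
}

# monitor INTERVAL COUNT: hypothetical loop that reports COUNT times,
# sleeping INTERVAL seconds between reports.
monitor() {
    interval=$1
    count=$2
    while [ "$count" -gt 0 ]; do
        report "$(date) still alive"
        sleep "$interval"
        count=$((count - 1))
    done
}
```

    Because report runs in the script's own shell, the exit ends the whole loop as soon as one write fails, so nothing keeps running in limbo.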
    
    However, the situation where an ordinary user logs in, and the interactive session times out for inactivity, but processes keep running in limbo is not so easy to fix.

    Speaking of which, here are the log entries from a timed-out disconnected session:

    Code:
    Nov 16 15:20:31 ROUTER authpriv.info dropbear[18298]: Child connection from 74.x
    Nov 16 15:20:37 ROUTER authpriv.notice dropbear[18298]: password auth succeeded for 'root' from 74.x
    
    And from a session that terminates "normally":

    Code:
    Nov 16 15:52:59 ROUTER authpriv.info dropbear[20061]: Child connection from 74.x
    Nov 16 15:53:25 ROUTER authpriv.notice dropbear[20061]: password auth succeeded for 'root' from 74.x
    Nov 16 16:37:55 ROUTER authpriv.info dropbear[20061]: exit after auth (root): Exited normally
    
    IMHO, the absence of an "Exited abnormally" message speaks volumes.
    (But of course, that dropbear didn't exit at all. :)


    The shell problems that I listed as existing with 1.23 are fixed in the busybox I now use, and that also includes an awk bug that I had reported to the Busybox devs.

    (Really curious 1.23 users may want to try:
    Code:
    awk BEGIN{print}
    If it says 0 you have the bug)

    [Tomato v1.25.8515 ND BusyBox v1.14.2 (2009-07-02 18:01:37 CEST)]
     
  15. Planiwa

    Planiwa LI Guru Member

    Reina # dispatch disconnected interactive processes from limbo

    Here is a script that will remove all these disconnected processes, assuming that we can disregard the possibility that there are other active ssh or telnet sessions when we do this:

    Code:
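    # Dispatch disconnected interactive processes from limbo -- Planiwa 2009-11-17
    # CAUTION: kills all other (inter)active foreground telnet or ssh sessions!
    
    # Walk up the process tree from this script to init, collecting ancestor PIDs.
    ANC="";PAR=$$
    while [ $PAR != 1 ];do
      set -- $(cat /proc/$PAR/status | grep PPid)
      PAR=$2
      ANC="$PAR $ANC"
    done
    # Expect exactly 4 ancestors: init, dropbear daemon, our dropbear child, login shell.
    set -- $ANC
    if [ $# -ne 4 ];then echo FAILED: $ANC; exit; fi
    MYBEAR=$3
    MABEAR=$2
    
    # List all dropbear processes; the [r] keeps grep from matching itself.
    set -- $(ps|grep 'dropbea[r]')
    
    # Each ps line has 7 fields (PID USER VSZ STAT dropbear -p 22), so $1 is always a PID.
    while [ $# -gt 0 ];do
      case $1 in
      $MYBEAR|$MABEAR);;   # spare our own session and the listening daemon
      *) echo kill -9 $1 ; kill -9 $1;;
      esac
      shift 7
    done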
    
    [Tomato v1.25.8515 ND BusyBox v1.14.2 (2009-07-02 18:01:37 CEST)]
     
  16. Planiwa

    Planiwa LI Guru Member

    Correction: Dropbear does quit, albeit 6 hours later than expected

    I have now discovered that those dropbear processes (and their cubs) do exit eventually, about 6 hours after Conntrack times out the connection:

    Code:
    Nov 22 19:59:58 ROUTER authpriv.info dropbear[27249]: Child connection from x.x.x.x:29263
    Nov 22 20:00:02 ROUTER authpriv.notice dropbear[27249]: password auth succeeded for 'root' from x.x.x.x:29263
    Nov 23 04:22:34 ROUTER authpriv.info dropbear[27249]: exit after auth (root): error reading: Connection timed out
    
    (TCP Established Timeout == 1200)


    [Tomato v1.25.8515 ND BusyBox v1.14.2 (2009-07-02 18:01:37 CEST)]
     
  17. Toastman

    Toastman Super Moderator Staff Member Member

    Just FYI: in the instances I reported where logging in to the GUI sometimes causes a reboot, my web monitoring browser always initially opens the "status" page, nothing unusual, and if I get past that I usually don't get any trouble, at least not that I can remember! I also can't recall whether the pages had been accessed within the previous 6 hours, so that a link might be inferred. I'll keep an eye on it.
     
  18. Planiwa

    Planiwa LI Guru Member

    I had mentioned previously that some GUI pages generate lots of connections.
    The default Refresh setting of the Status/Overview page generates a new connection every 3 seconds. With a Time_Wait timeout of 120 s, this means a steady state of up to 40 connections always waiting to be buried:

    Code:
    # ctv -S 8080
    1124.11:07:01
    -RTR 8080/443  59396    <remote-admin tcp   115  5  TW a
    -RTR 8080/443  59425    <remote-admin tcp    99  5  TW a
    -RTR 8080/443  59497    <remote-admin tcp    94  5  TW a
    -RTR 8080/443  59506    <remote-admin tcp    90  5  TW a
    -RTR 8080/443  59514    <remote-admin tcp    85  5  TW a
    -RTR 8080/443  59520    <remote-admin tcp    79  5  TW a
    -RTR 8080/443  59536    <remote-admin tcp    73  5  TW a
    -RTR 8080/443  59546    <remote-admin tcp    68  5  TW a
    -RTR 8080/443  59555    <remote-admin tcp    62  5  TW a
    -RTR 8080/443  59564    <remote-admin tcp    58  5  TW a
    -RTR 8080/443  59567    <remote-admin tcp    42  5  TW a
    -RTR 8080/443  59596    <remote-admin tcp    35  5  TW a
    -RTR 8080/443  59602    <remote-admin tcp    32  5  TW a
    -RTR 8080/443  59609    <remote-admin tcp    26  5  TW a
    -RTR 8080/443  59612    <remote-admin tcp    20  5  TW a
    -RTR 8080/443  59623    <remote-admin tcp    16  5  TW a
    -RTR 8080/443  59624    <remote-admin tcp    11  5  TW a
    -RTR 8080/443  59628    <remote-admin tcp     4  5  TW a
    -RTR 8080/443  59632    <remote-admin tcp     2  5  E a
              SEARCH: 8080
    1124.11:07:03
    # 
    
    
    Real Time Traffic refreshes every 2 seconds, for a steady state of about 60 connections:

    Code:
    # ctv -S 8080
    1124.11:34:38
    -RTR 8080/443  54946    <remote-admin tcp   119  5  TW a
    -RTR 8080/443  54949    <remote-admin tcp   118  5  TW a
    -RTR 8080/443  54948    <remote-admin tcp   118  5  TW a
    -RTR 8080/443  54953    <remote-admin tcp   117  5  TW a
    -RTR 8080/443  54952    <remote-admin tcp   117  5  TW a
    -RTR 8080/443  54957    <remote-admin tcp   113  5  TW a
    -RTR 8080/443  54979    <remote-admin tcp   112  5  TW a
    -RTR 8080/443  54985    <remote-admin tcp   109  5  TW a
    -RTR 8080/443  54995    <remote-admin tcp   108  5  TW a
    -RTR 8080/443  54997    <remote-admin tcp   106  5  TW a
    -RTR 8080/443  55005    <remote-admin tcp   104  5  TW a
    -RTR 8080/443  55009    <remote-admin tcp   103  5  TW a
    -RTR 8080/443  55016    <remote-admin tcp   100  5  TW a
    -RTR 8080/443  55020    <remote-admin tcp    98  5  TW a
    -RTR 8080/443  55025    <remote-admin tcp    96  5  TW a
    -RTR 8080/443  55028    <remote-admin tcp    93  5  TW a
    -RTR 8080/443  55026    <remote-admin tcp    93  5  TW a
    -RTR 8080/443  55029    <remote-admin tcp    90  5  TW a
    -RTR 8080/443  55030    <remote-admin tcp    87  5  TW a
    -RTR 8080/443  55031    <remote-admin tcp    86  5  TW a
    -RTR 8080/443  55034    <remote-admin tcp    83  5  TW a
    -RTR 8080/443  55046    <remote-admin tcp    81  5  TW a
    -RTR 8080/443  55054    <remote-admin tcp    80  5  TW a
    -RTR 8080/443  55058    <remote-admin tcp    77  5  TW a
    -RTR 8080/443  55062    <remote-admin tcp    76  5  TW a
    -RTR 8080/443  55066    <remote-admin tcp    75  5  TW a
    -RTR 8080/443  55074    <remote-admin tcp    71  5  TW a
    -RTR 8080/443  55068    <remote-admin tcp    71  5  TW a
    -RTR 8080/443  55082    <remote-admin tcp    69  5  TW a
    -RTR 8080/443  55089    <remote-admin tcp    66  5  TW a
    -RTR 8080/443  55092    <remote-admin tcp    64  5  TW a
    -RTR 8080/443  55099    <remote-admin tcp    63  5  TW a
    -RTR 8080/443  55113    <remote-admin tcp    59  5  TW a
    -RTR 8080/443  55104    <remote-admin tcp    59  5  TW a
    -RTR 8080/443  55116    <remote-admin tcp    55  5  TW a
    -RTR 8080/443  55114    <remote-admin tcp    55  5  TW a
    -RTR 8080/443  55118    <remote-admin tcp    52  5  TW a
    -RTR 8080/443  55125    <remote-admin tcp    50  5  TW a
    -RTR 8080/443  55127    <remote-admin tcp    48  5  TW a
    -RTR 8080/443  55128    <remote-admin tcp    45  5  TW a
    -RTR 8080/443  55129    <remote-admin tcp    43  5  TW a
    -RTR 8080/443  55131    <remote-admin tcp    42  5  TW a
    -RTR 8080/443  55132    <remote-admin tcp    41  5  TW a
    -RTR 8080/443  55138    <remote-admin tcp    39  5  TW a
    -RTR 8080/443  55145    <remote-admin tcp    36  5  TW a
    -RTR 8080/443  55150    <remote-admin tcp    33  5  TW a
    -RTR 8080/443  55155    <remote-admin tcp    31  5  TW a
    -RTR 8080/443  55153    <remote-admin tcp    31  5  TW a
    -RTR 8080/443  55159    <remote-admin tcp    24  5  TW a
    -RTR 8080/443  55167    <remote-admin tcp    22  5  TW a
    -RTR 8080/443  55170    <remote-admin tcp    19  5  TW a
    -RTR 8080/443  55175    <remote-admin tcp    13  5  TW a
    -RTR 8080/443  55182    <remote-admin tcp    11  5  TW a
    -RTR 8080/443  55186    <remote-admin tcp     8  5  TW a
    -RTR 8080/443  55195    <remote-admin tcp     6  5  TW a
    -RTR 8080/443  55201    <remote-admin tcp     4  5  TW a
    -RTR 8080/443  55213    <remote-admin tcp     2  5  TW a
    -RTR 8080/443  55220    <remote-admin tcp     1  5  E a
              SEARCH: 8080
    1124.11:34:42
    #
    
    Having several GUI windows open, trying to figure out what's going on, can easily add hundreds of connections.
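
    The arithmetic behind the 40 and 60 figures, as a tiny sketch: with one new connection per refresh, the number of entries sitting in Time_Wait settles at the timeout divided by the refresh interval. tw_steady_state is a made-up name, and one connection per refresh is an assumption (the listings above show occasional extras).

```shell
# tw_steady_state TIME_WAIT_SECS REFRESH_SECS
# Approximate number of connections waiting in Time_Wait at any moment,
# assuming exactly one new connection per refresh.
tw_steady_state() {
    echo $(( $1 / $2 ))
}
```

    tw_steady_state 120 3 gives 40 for Status/Overview and tw_steady_state 120 2 gives 60 for Real Time Traffic. Multiply by the number of open GUI windows to see how quickly this adds up.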

    Also, I have found that 6 hours appears to be the minimum time to bury dropped-dead bears. It may be as much as 8 hours.

    (I've made some changes to cflp -- to skip unremarkable reports. But interest / utilisation / feedback seems to be minimal. :)
     
