Router rebooting every 2 weeks

Discussion in 'Tomato Firmware' started by Morac, Dec 20, 2009.

  1. Morac

    Morac Network Guru Member

    I can't figure out why, but my Linksys WRT54GL running Tomato 1.27 reboots at least every 2 weeks. Anyone else having this problem?
     
  2. Planiwa

    Planiwa LI Guru Member

    It is clear that these routers do have a tendency to reboot on their own.
    There is general agreement that this is unacceptable for a well-designed router.

    There is also a widespread expectation that consumer appliances are unreliable junk that fails frequently, owned by people whose major value is "cheap", who have no interest in understanding anything, who consider "reboot" the only interaction they are capable of, and who find that action perfectly acceptable.

    Inasmuch as Tomato turns a consumer appliance into something that acts more like a well-designed router, we have a problem with expectations.

    If we were able to have this choice, a few adept users would prefer a router (or any computer), if it does crash at all, to record relevant system state information in NVM, or at least "leave the kernel panic message on the console" and stop, while the vast majority of ordinary users could not care less about reasons and would want the darn thing to reboot immediately, never mind what caused it.

    I have spent many hundreds of hours making monitoring tools and monitoring various vital data, hoping to find what triggers crashes.

    Apart from connection storms, which are perfectly preventable, I have not found what causes the routine reboots that definitely appear to be triggered by monitoring activity.

    I had hoped that if enough people monitor their systems we could get enough data to learn something, but I have found that most people end up learning to live with the problem.

    Few are willing to participate in a collaborative problem solving effort on obsolete technology, trying to discover why it breaks every two weeks. If it were every two hours, it would be different. But even with an average of 2 days and a standard deviation of 2 days, most would find it more troublesome to monitor it than to just accept it.

    So, ... yes, reboots happen.

    If you want to do monitoring, there are tools, e.g. vit:

    Code:
    # vit -s60 -d
    LANClients Connections Free+Cached,Free(KiBi) Load Processes FWUptime WANUptime SysUptime Interface<Dn>Up ...
    VIT: 091219_23:26:35 br0:3  17 FC:6492 F:1328 .1 1/23 47h +16s 3d ppp0<3390>1823 eth1<3519>1881
    VIT: 091219_23:27:36 br0:3  35 FC:6492 F:1328 .1 2/23 47h +16s 3d ppp0<3397>1823 eth1<3526>1881
    VIT: 091219_23:28:38 br0:3  55 FC:6492 F:1328 .0 1/23 47h +16s 3d ppp0<3404>1823 eth1<3533>1882
    VIT: 091219_23:29:43 br0:3  74 FC:6396 F:1228 .0 2/23 47h +16s 3d ppp0<3406>1823 eth1<3536>1882
    VIT: 091219_23:30:45 br0:3 142 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3412>1823 eth1<3541>1882
    VIT: 091219_23:31:46 br0:3 116 FC:6396 F:1232 .1 2/23 47h +16s 3d ppp0<3414>1824 eth1<3543>1882
    VIT: 091219_23:32:48 br0:3  36 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3416>1824 eth1<3546>1882
    VIT: 091219_23:33:49 br0:3  56 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3421>1824 eth1<3551>1883
    VIT: 091219_23:34:51 br0:3  48 FC:6396 F:1232 .1 2/23 47h +16s 3d ppp0<3424>1824 eth1<3554>1883
    VIT: 091219_23:35:52 br0:3 118 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3426>1824 eth1<3556>1883
    VIT: 091219_23:36:53 br0:3 169 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3429>1825 eth1<3559>1883
    VIT: 091219_23:37:55 br0:3  79 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3430>1825 eth1<3560>1883
    VIT: 091219_23:38:56 br0:3  88 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3432>1825 eth1<3561>1883
    VIT: 091219_23:39:57 br0:3 108 FC:6396 F:1232 .1 1/23 47h +16s 3d ppp0<3432>1825 eth1<3562>1884
    VIT: 091219_23:40:59 br0:3 126 FC:6396 F:1232 .1 1/23 47h +16s 3d ppp0<3433>1825 eth1<3563>1884
    VIT: 091219_23:42:00 br0:3  90 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3433>1825 eth1<3563>1884
    VIT: 091219_23:43:01 br0:3  48 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3433>1825 eth1<3563>1884
    VIT: 091219_23:44:03 br0:3  79 FC:6396 F:1232 .1 1/23 47h +16s 3d ppp0<3433>1825 eth1<3563>1884
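
If the vit output is saved to a text file, it can be post-processed on a PC. The following is a minimal Python sketch, not part of vit itself. The regular expression is an assumption reverse-engineered from the sample lines above and the column order of the header line, and the names `VitSample`, `parse_vit_line` and `connection_spikes` are made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Assumption: this pattern is reverse-engineered from the sample output in the
# thread and follows the column order of vit's header line. It is not vit code.
VIT_LINE = re.compile(
    r"VIT:\s+(?P<stamp>\d{6}_\d{2}:\d{2}:\d{2})\s+"   # e.g. 091219_23:26:35
    r"\w+:(?P<lan_clients>\d+)\s+"                    # e.g. br0:3
    r"(?P<connections>\d+)\s+"                        # e.g. 17
    r"FC:(?P<free_cached>\d+)\s+F:(?P<free>\d+)\s+"   # memory columns
    r"(?P<load>\d*\.?\d+)"                            # e.g. .1
)


@dataclass
class VitSample:
    stamp: str
    lan_clients: int
    connections: int
    free_cached_kib: int
    free_kib: int
    load: float


def parse_vit_line(line: str) -> Optional[VitSample]:
    """Return a VitSample for a 'VIT:' data line, or None for anything else."""
    m = VIT_LINE.search(line)
    if m is None:
        return None
    return VitSample(
        stamp=m["stamp"],
        lan_clients=int(m["lan_clients"]),
        connections=int(m["connections"]),
        free_cached_kib=int(m["free_cached"]),
        free_kib=int(m["free"]),
        load=float(m["load"]),
    )


def connection_spikes(lines: Iterable[str], threshold: int) -> List[VitSample]:
    """Samples whose connection count is at or above a caller-chosen threshold."""
    samples = (parse_vit_line(line) for line in lines)
    return [s for s in samples if s is not None and s.connections >= threshold]


if __name__ == "__main__":
    # Two lines copied from the sample output in this thread.
    log = [
        "VIT: 091219_23:29:43 br0:3  74 FC:6396 F:1228 .0 2/23 47h +16s 3d ppp0<3406>1823 eth1<3536>1882",
        "VIT: 091219_23:36:53 br0:3 169 FC:6396 F:1232 .0 1/23 47h +16s 3d ppp0<3429>1825 eth1<3559>1883",
    ]
    for sample in connection_spikes(log, threshold=100):
        print(sample.stamp, sample.connections, sample.free_kib)
```

The threshold of 100 is arbitrary. A real connection storm would presumably show counts in the thousands, so the value should be tuned to whatever is normal for the network being watched.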
    
    There's also CFLP.

    The point would be to monitor continually for 2 weeks in the faint hope of catching a condition a few seconds before the crash.
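
One way to find the moments worth looking at in a two-week log is to search for gaps in the timestamps. This is a rough Python sketch under two assumptions: the timestamp format is `YYMMDD_HH:MM:SS` (which matches the sample output and the thread date), and a reboot shows up as a pause in sampling that is noticeably longer than the `-s60` interval. The function names and the 3-minute cutoff are hypothetical.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

STAMP = re.compile(r"VIT:\s+(\d{6}_\d{2}:\d{2}:\d{2})")


def read_stamped_lines(lines: Iterable[str]) -> List[Tuple[datetime, str]]:
    """Keep only 'VIT:' data lines and pair each with its parsed timestamp."""
    stamped = []
    for line in lines:
        m = STAMP.search(line)
        if m:
            when = datetime.strptime(m.group(1), "%y%m%d_%H:%M:%S")
            stamped.append((when, line.strip()))
    return stamped


def lines_before_gaps(
    lines: Iterable[str],
    max_gap: timedelta = timedelta(minutes=3),  # hypothetical cutoff for -s60
    context: int = 3,
) -> List[List[str]]:
    """For each gap longer than max_gap, return the last `context` lines before it."""
    stamped = read_stamped_lines(lines)
    hits = []
    for i in range(1, len(stamped)):
        if stamped[i][0] - stamped[i - 1][0] > max_gap:
            start = max(0, i - context)
            hits.append([text for _, text in stamped[start:i]])
    return hits
```

If the logger is not restarted after a reboot, the log simply ends, and the last few lines of the file are the ones to read instead.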

    Easier to live with it. :)

    Here is a list of possible crash-reboot triggers:

    [1] P2P connection storm causing Conntrack table memory crash
    [2] WiFi connection from particular clients causing driver crash
    [3] HTTPS connection attempt causing system crash
    [4] SSH connection attempt causing system crash
    [5] runaway or orphaned shell scripts causing memory depletion resulting in vital process termination

    any others?
     
  3. Morac

    Morac Network Guru Member

    What's weird is prior to them occurring, I had uptimes of many months. Now it seems I can't go more than two weeks without a restart. I'm not sure what the trigger is.
     
  4. Planiwa

    Planiwa LI Guru Member

    Most likely a different usage pattern. But who wants to hear that? :)
     
  5. karogyoker

    karogyoker Addicted to LI Member

    A trojan could be causing connection storms by using your zombie PC to DDoS sites.
     
  6. Morac

    Morac Network Guru Member

    Since it tends to reboot when my PC isn't even on and I run NIS 2010, I highly doubt that.

    The only thing that's changed recently is I got an iPhone 3GS which can use the wireless network. I wonder if that has something to do with it? Though I don't see how.

    The last time I actually saw it reboot was when I switched my laptop from wireless network to wired, but I think that was because I had both active simultaneously. Other times it's rebooted when I haven't been doing anything.
     
  7. Planiwa

    Planiwa LI Guru Member

    Where's the data?

    For example, where are the iPhone's DHCP entries in the log?

    (If your iPhone generates excessive log entries, this may be a problem -- see #2 of the list that I gave).
     
  8. Morac

    Morac Network Guru Member

    Well it rebooted again yesterday at a seemingly random time (nothing was happening), just over 14 days after it last rebooted.
     
  9. rickh57

    rickh57 LI Guru Member

    I'm seeing a similar reboot issue. I added a Roku box to my network over Christmas to be able to stream Netflix movies.

    Since that box has been added, my router (WRT54GL) has been rebooting frequently. How do I add the vit monitoring tool? I googled it and couldn't find any information about getting and installing it.
     
  10. Morac

    Morac Network Guru Member

    Well it rebooted another 2 times in 3 days. I actually saw it reboot the last time or more specifically I noticed my laptop lost connection briefly. I wasn't doing anything at the time and my iPhone was turned off (though it had been on earlier).

    I decided to clear the NVRAM from the configuration screen, flash 1.25 and then clear the NVRAM again and set up everything manually. Hopefully this solves whatever problems I'm having.

    If not I have no idea what's wrong. Edit: I do have about 15 months of statistics data which I restore after a reset. I wonder if that's causing problems?
     
  11. Morac

    Morac Network Guru Member

    Well it just rebooted again. It seems to be rebooting daily now. I'm going to try clearing out the bandwidth data and see if that helps. If not I have no idea what's wrong. Maybe it's going bad?
     
  12. Tim2k

    Tim2k Addicted to LI Member

    Maybe software isn't the issue and the RAM is defective? That's quite common for normal PCs. I don't know if there is an equivalent of Memtest86+ available for MIPS, but I think there must be some other way to determine if the RAM is okay. :)
     
  13. Morac

    Morac Network Guru Member

    If it keeps rebooting I'm going to have to replace it I guess, since at this point I've done everything I can think of short of flashing back the original Linksys firmware (which I haven't run in years). What I don't understand is that it runs fine until it just suddenly reboots on its own. About 10 seconds later it's up and running again as if nothing had happened. I had an old WRT54G fail and that one simply died. I've never had a router behave like this.

    Unfortunately if it is broken, I'll most likely upgrade to an 802.11n router. As far as I'm aware Tomato won't work on any 802.11n routers so that would mean giving up on Tomato, which is a shame because I love the ability to check bandwidth as well as the actual outgoing IP connections.
     
  14. TexasFlood

    TexasFlood Network Guru Member

    Without any logs or VIT data, it's just guessing & not really even educated guessing. Hopefully removing the bandwidth data will help. On a lighter note, teddy_bear recently announced (here) an "experimental" beta version of his USB mod based on Linux Kernel 2.6 which supports the Asus RT-N16 802.11n router. Apparently those who have loaded it have reported success so far. I actually have a RT-N16 but haven't had a chance to try it yet, but believe me I will soon.
     
  15. Morac

    Morac Network Guru Member

    If it's not the bandwidth data (i.e. it reboots again after clearing it), then I'm at a loss since, like I mentioned, I ran 1.25 for months on end and it never rebooted on its own. The first unexplained reboot actually occurred a few days prior to upgrading to 1.27 and then persisted after upgrading and eventually downgrading and wiping all settings.

    The only explanation I can come up with is a hardware problem, since clearing the NVRAM should wipe out all possible setting problems. The Admin->Configuration option to clear the NVRAM settings works, does it not? At least it appeared to wipe all settings.

    I suppose for $55 I could pick up another WRT54GL to test with, but like I mentioned if I'm going to buy a new router, I might as well get an N router (especially since my WRT54GL isn't too fond of my microwave).
     
  16. Toastman

    Toastman Super Moderator Staff Member Member

    The causes of reboots in the past have been:

    a) Certain clients associating with incompatible wireless adapters. For the most part, this was solved by the ND driver in more recent Tomato versions, and by a simple script in earlier ND versions. Non-ND versions will still have this issue, which mostly involved Intel cards - so if you are using a non-ND driver version, change it to ND. Remember it doesn't have to be your wireless card - it might be anybody's trying to connect. That won't show in your log if they don't know the encryption code, but it can crash your router.
    b) Line glitches, brownouts, which may be short in duration and not noticed. Can also cause NVRAM corruption and bricking.
    c) Faulty power supply. Old and failing capacitors in the router's onboard voltage regulators - very common problem in most consumer hardware.
    d) Lack of memory caused by various issues, not all of which are clear, but the most common one is connection storms from P2P or other badly behaved apps.

    My bet is on (d). Unless you run VIT you will not see anything happening unless you are very lucky, as many thousands of connections can open in just a few seconds, the router has no memory to do anything, and after the reboot of course there are no visible logs anyway.

    Notes on P2P clients:

    1) Some P2P clients such as STEAM are notorious for generating huge numbers of connections, from what I've read.
    2) Bit Torrent DNA is installed clandestinely by BT and can take most of your bandwidth to assist BT in making cash at your expense.
    3) DHT as a method for gaining downloads from peers seems to produce no visible benefit here, but does take a helluva lot of bandwidth.

    Trojans/Viruses can cause machines to open large numbers of connections at random. Occasionally someone here starts sending out thousands of mails using SMTP. Limiting the number of simultaneous mail connections solved that one and I haven't had problems for over 18 months now.

    Best way to at least reduce the problem is with aggressive timeout settings in Conntrack and the use of scripts in your firewall box to attempt to limit the number of connections that can be opened. This usually deals with most such occurrences in my experience. Try these settings http://www.linksysinfo.org/forums/showpost.php?p=337415&postcount=3
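
To see which LAN host is opening the connections, a copy of the router's connection tracking table can be tallied per source address. This is a minimal Python sketch for use on a PC. It assumes the table has been saved as text in the usual Linux `/proc/net/ip_conntrack` layout, where the first `src=` on each line is the originating host. The function names are made up, and the addresses in the demo are placeholders, not data from this thread.

```python
import re
from collections import Counter
from typing import List, Tuple

# The first src= field on a conntrack line is the side that opened the connection.
SRC = re.compile(r"\bsrc=(\S+)")


def connections_per_source(conntrack_text: str) -> Counter:
    """Count tracked connections per originating address."""
    counts = Counter()
    for line in conntrack_text.splitlines():
        m = SRC.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def noisy_hosts(conntrack_text: str, limit: int) -> List[Tuple[str, int]]:
    """Hosts with more than `limit` tracked connections, busiest first."""
    ranked = connections_per_source(conntrack_text).most_common()
    return [(ip, n) for ip, n in ranked if n > limit]


if __name__ == "__main__":
    # Placeholder lines in conntrack format; not taken from a real router.
    sample = "\n".join([
        "tcp      6 431999 ESTABLISHED src=192.168.1.100 dst=203.0.113.5 sport=1025 dport=80 "
        "src=203.0.113.5 dst=192.168.1.100 sport=80 dport=1025 [ASSURED] use=1",
        "udp      17 25 src=192.168.1.100 dst=203.0.113.9 sport=4000 dport=53 "
        "[UNREPLIED] src=203.0.113.9 dst=192.168.1.100 sport=53 dport=4000 use=1",
        "tcp      6 110 TIME_WAIT src=192.168.1.101 dst=203.0.113.5 sport=1030 dport=443 "
        "src=203.0.113.5 dst=192.168.1.101 sport=443 dport=1030 [ASSURED] use=1",
    ])
    print(noisy_hosts(sample, limit=1))
```

This only reports who is responsible after the fact. It does not limit anything, so it complements the timeout and firewall settings rather than replacing them.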

    While the router may be faulty, going on what you've said I don't believe that is the case.

    Hope this might give you some ideas!
     
  17. though

    though Network Guru Member

    I would rule out HEAT and the power supply. If you can, get the router in a cooler place (put a fan on it?), turn down the wireless power (maybe try 20), then try a different power supply.

    report back....
     
  18. Morac

    Morac Network Guru Member

    I can eliminate most from the list you posted simply for the fact that the problem started up very recently and I had no issues prior to that.

    a) I'm not using any new devices that I haven't already been using for years with the exception of the iPhone. Since it's rebooted when the iPhone was turned off, that's not likely the cause. I do have an Intel wireless card in my laptop, but like I said I've been using it for many years without issue and haven't upgraded the wireless card drivers in years either.

    b) The router is attached to a UPS so it's for all intents and purposes immune to power line issues.

    c) Can't eliminate this one. I have had a Linksys WRT54G router die due to a power supply problem, but it died suddenly, not like what's happening here.

    d) I don't run P2P applications, but I have done so in the past without issue. No viruses either. I normally don't have more than 70 active connections open at any time. In any case some of the reboots have occurred when my computers were physically powered off. I do have other always on devices that access the Internet (TiVo for example), but none that are new.


    What exactly does VIT (or CFLP) do and where can I get it? I tried searching for it (both) but came up empty.
     
  19. Morac

    Morac Network Guru Member

    The router is not hot to the touch and the room it is in is pretty cold as it is (I have the heat vents shut off in that room). I already have the wireless power set at 28.

    If it was a heat problem, it would have likely happened in the summer when the room gets hot.
     
  20. tomatofan

    tomatofan Addicted to LI Member

    Correction: Steam does not use any kind of P2P and does in fact open very few connections. Now searching for servers, that's another story.
     
  21. Toastman

    Toastman Super Moderator Staff Member Member

    You can see I've not used it, but the forums are full of complaints, so that's what it does, huh?

    Yuk.

    VIT is a monitoring tool which you can run on your router to check for connection storms etc. It is written by Planiwa. Search the forum for his posts and you'll find it, he may read this thread, in which case you can ask him for the latest version.
     
  22. Morac

    Morac Network Guru Member

    He actually posted in the thread already, but didn't post any links.

    Edit: Okay found the script, but how do I actually get it over on to the router?
    Edit2: I see Vi is in Tomato, I guess I can use a copy paste method, but it seems like there should be an easier way.
     
  23. rickh57

    rickh57 LI Guru Member

    Thanks for the information about getting vit. I put it into a file on a CIFS share on a Linux box. I have it running on my router with output going to a text file on the same share.
     
  24. Morac

    Morac Network Guru Member

    Well I thought removing the stats fixed the issue, but it didn't since my router rebooted about 11 days later. I had been keeping an eye on it and checked shortly beforehand and there were very few connections, plenty of memory and CPU usage was low.

    I did notice that the time of the restart was about the same time I put my laptop into standby. I'm not sure what difference that would make since I've done that tons of times in the last 11 days.

    I still have no idea why it's rebooting, but at least it's not doing it daily anymore.
     
  25. Toastman

    Toastman Super Moderator Staff Member Member

    Morac, what version of tomato? Sorry if it's in the thread, I looked but didn't see it.
     
  26. Morac

    Morac Network Guru Member

    Currently it's 1.27, but I tried reverting to 1.25 with no difference. I've also cleared the NVRAM and set everything up from scratch as well as wiped out the statistics. Nothing seems to have helped.

    I first noticed the problem on November 29, 2009. I have no idea what changed at that point since prior to that I had been running 1.25 with an uptime of about 4 months with no spontaneous restarts. The only new wireless device I got was an iPhone 3GS, but I had that for a few weeks before the restarts started and the iPhone is rarely turned on when a restart occurs.

    Since then, except for a few days where the router was rebooting daily, it has restarted anywhere from every 9 to every 14 days. I've started keeping a log of the restart times to see if I can figure out if my (old unchanged) laptop is even on at those times. If it is on, there is very little traffic happening at the time. At the moment though I have no idea what the trigger is.
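
A log of restart times like this can be turned into intervals with a few lines of Python. This is only a sketch: the function name and the timestamp format are assumptions, and apart from the first date (November 29, 2009, mentioned above) the timestamps in the demo are placeholders rather than real restart times.

```python
from datetime import datetime
from statistics import mean
from typing import Iterable, List


def restart_intervals_days(stamps: Iterable[str]) -> List[float]:
    """Days between consecutive restarts, given 'YYYY-MM-DD HH:MM' strings."""
    times = sorted(datetime.strptime(s, "%Y-%m-%d %H:%M") for s in stamps)
    return [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    # Placeholder log; only the first date comes from the post above.
    restarts = ["2009-11-29 08:00", "2009-12-13 08:00", "2009-12-22 20:00"]
    gaps = restart_intervals_days(restarts)
    print("intervals (days):", gaps)
    print("mean interval (days):", mean(gaps))
```

With enough entries, the spread of the intervals gives a hint whether the restarts look periodic (a timer or a counter overflowing) or random (an external trigger).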

    Edit:

    I found a similar report here for the same device that started about the same time. My WRT54GL's ip address is always 192.168.1.1 so I don't know if it's reverting or not.
     
  27. Toastman

    Toastman Super Moderator Staff Member Member

    Are you trying normal driver or ND?

    Considering all you have said, it does appear to be a known association glitch, but this time it is with the iPhone. I don't think this will actually help you, but in my block we have 2 supposedly identical iPhones (I don't know anything about the model number though). One always connects fine, but the other connects about 70% of the time - on other occasions, the owner gets quite steamed up and keeps complaining about it. When this happens there are no records in the log of any connection attempts. In my experience mobile devices are trouble.

    BTW - Disregard the link you mentioned - this problem is almost certainly nothing to do with failed hardware.

    Did you ever read this thread about wifi connection problems?

    http://www.linksysinfo.org/forums/showthread.php?t=60509
     
  28. Morac

    Morac Network Guru Member

    I never looked at it, but I'll mention I'm not having an association problem. All my devices (including the iPhone) associate and work with the WRT54GL running Tomato the same as they always have. The only difference now is that every week or two the router restarts for no reason whatsoever.

    I do have an Intel wireless card, but it's the same one I've had for around 5 years and never had a problem with it and the WRT54GL w/ tomato. I haven't changed the drivers either.

    I'm running the normal driver BTW, as I read that the ND drivers can brick routers they're not designed for, plus the normal driver has never caused problems in the past. I didn't see whether ND would work with the WRT54GL or not.

    Can a neighbor's wireless device cause a reboot even if it's not associating with my router?
     
  29. Toastman

    Toastman Super Moderator Staff Member Member

    Good thinking Batman. A neighbour's wireless may well try to associate with your router, and cause it to reboot. You wouldn't even know about it of course. If so the ND driver should help. I would recommend you try Victek's version RAF 8515.2 ND from his website http://victek.is-a-geek.com/tomato.html as some versions of original Tomato do not have the ND driver implemented fully. It's fine on the GL by the way.

    It might be the answer - worth a try anyway.
     
  30. Morac

    Morac Network Guru Member

    I guess I could try it. Does he have a 1.27 based version?

    Also I'm trying to figure out what the following means:
    I get 9 so I assume that means either version will work?

    Finally is the interface in Spanish or English?
     
  31. Toastman

    Toastman Super Moderator Staff Member Member

    No, he doesn't, not yet. The 1.27 version seems to me to be less stable than previously thought, there have been many posts suggesting problems, and Victek's earlier versions have been trouble-free. That's why I suggest you try the old one first. That way, you will know the problem isn't the firmware. Victek's GUI is in English. You can flash either .trx or .bin files from existing Tomato.

    Don't worry about the wl0_corerev confusion. Pretty well everybody uses ND on the GL these days.
     
  32. Morac

    Morac Network Guru Member

    Well I went 17 days this time before the router rebooted itself. I decided to install the 1.28.1806 ND beta firmware and see what happens.

    The firmware passed my ping packet flood test (I send large ping packets several hundred times a second and look for drops) so the wireless connection appears stable. Speed over time tests have been erratic with lots of high peaks and valleys instead of a flat line, but that happens somewhat regularly. Unfortunately I forgot to test this before upgrading so I don't know if it's firmware related or not.
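
Checking a flood test for drops can be automated by reading the summary line that `ping` prints at the end. This is a small Python sketch, assuming the Unix-style summary ("N packets transmitted, M received" or "M packets received"); Windows prints a different summary and is not handled. The function name is made up, and the sample text in the demo is a placeholder, not output from the test described above.

```python
import re
from typing import Tuple

# Matches the iputils style ("998 received") and the BSD/BusyBox style
# ("998 packets received").
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packet_loss(ping_output: str) -> Tuple[int, int, float]:
    """Return (sent, received, loss fraction) from a ping summary."""
    m = SUMMARY.search(ping_output)
    if m is None:
        raise ValueError("no ping summary line found")
    sent, received = int(m.group(1)), int(m.group(2))
    loss = (sent - received) / sent if sent else 0.0
    return sent, received, loss


if __name__ == "__main__":
    # Placeholder summary text for illustration only.
    text = "1000 packets transmitted, 998 received, 0.2% packet loss, time 3012ms"
    sent, received, loss = packet_loss(text)
    print(f"sent={sent} received={received} loss={loss:.1%}")
```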

    I'm going to run this for a while and see if it reboots again on its own. If it does, hopefully by that time Victek will have his newer firmware out so I can try that.
     
  33. szfong

    szfong Network Guru Member

    I'm getting Tomato reboots every couple of weeks also. I have a pfsense router which monitors the main connection and the Tomato as a wireless switch. Anyhow, the previous version I remember running for months without a reboot. Something is definitely causing Tomato 1.27 to auto reboot. I'm on a Buffalo WHR-HP-G54.
     
  34. myersw

    myersw Network Guru Member

    I had issues with reboots when running Tomato 1.27 ND on a WRT54GL, about once a week or sometimes every two days. No heavy P2P or TB, just light web browsing, and it would reboot.
    Installed Victek's RAF 8515.2 ND and have run without reboots for over a month. This would include P2P and TB traffic as well as web browsing. All clients were the same in both the 1.27 and Victek usage. Victek is much more stable for me.
    Love Tomato and have an ASUS RT-N16 on order to upgrade to.
    --bill
     
  35. GOPer

    GOPer Network Guru Member

    Same here. I thought maybe it was my routers (3): the WRT54GL, 54Gv4 and 54Gv1.1 would each show that the uptime had restarted. My uptime lasts about 3 days, maybe 5 at best.

    I put DD-WRT v24sp2 on them all to see if it's a hardware problem. My up time with DD-WRT has been 26 days so far. Last time I had good up time with Tomato was v1.23 I think.
     
