
Frequent WAN disconnects

Discussion in 'Tomato Firmware' started by woody, Mar 8, 2012.

  1. woody

    woody Networkin' Nut Member

    I am using TomatoUSB v1.28.8754 on an Asus WL-520GU router. My WAN connection is DSL using PPPoE with a dynamic IP.

    I've been running the current setup on this router for 2 years or more without problems.
    Recently, I started having problems with the WAN disconnecting several times during the day, sometimes every few minutes. When I look at the status of the router, it shows that it's still connected, but I'm unable to ping anything on the internet. If I press the disconnect button, the status changes to "Disconnected", but I am unable to reconnect to the WAN unless I reboot the router.

    I have the log levels set to 8, but nothing shows up in the logs when the disconnects occur.

    I would like some suggestions for how to troubleshoot this problem. It's possible that the problem is with the router hardware, but I have no way to know for sure. Can anybody suggest something I can try to diagnose this?
     
  2. Planiwa

    Planiwa LI Guru Member

    Might want to look here to start:

    http://www.linksysinfo.org/index.php?threads/diagnosing-ppp-errors.30073/

    What's your modem?
     
  3. woody

    woody Networkin' Nut Member

    Thanks for the suggestions. I have an Actiontec 1701D modem set up in bridge mode. I think my problem is not with the modem. When the connection drops, restarting the modem doesn't help, but the connection restarts as soon as I reboot the router. Also, unlike the discussion in the thread you provided, I don't get any indication in the router logs that the connection has dropped.
     
  4. Planiwa

    Planiwa LI Guru Member

    Did you look at this checklist?:

    http://www.dslreports.com/forum/r22585731-RFC-Connection-Speed-Problems-Checklist

    You might also want to look at this:

    http://www.dslreports.com/forum/r22548658-Torrents-crash-my-Router


    If you want to understand what is happening, I propose that you stop rebooting the router.

    Instead, try to have an intelligent conversation with the router.

    The router has the answers, and rebooting destroys those answers.

    You say that the router still responds to web GUI access?

    That you can check the message log?

    Check the Advanced IP-Conntrack page?

    And find out the load average and free RAM?

    Can you still ssh or telnet to it?

    Can you run ifconfig?

    Instead of rebooting the router or the modem, have you considered connecting a computer to the modem and connecting via PPPoE?


    Of course I telnet to the modem and ask it about the state of the DSL line.
    I've never understood why other people don't.
     
  5. Gaius

    Gaius Networkin' Nut Member

    Upgrade to the Shibby custom Tomato build.
     
  6. woody

    woody Networkin' Nut Member

    When my WAN connection drops I'm still able to get into the router using telnet or html. When I look at the router logs, there is nothing to indicate a failure. The Tomato status page shows that the WAN connection is still up, but I'm unable to ping anything on the WAN side. If I look at the number of connections, it doesn't look particularly high, maybe 200 or so.

    If I press the WAN disconnect button on the status page, the WAN shows that it's disconnected, but it takes quite a while for it to connect again. If I reboot the router, it comes back up connected to the WAN.

    When I look at the modem stats, it shows no disconnects at all. Everything looks good on the modem except that the attenuation is high on the transmit side (100 dB). But the modem doesn't show any resyncs or disconnects.

    I am not running any bittorrent clients on my LAN. However, I have run bittorrent in the past and it never crashed my router.

    At this point I'm still scratching my head. Does anybody know any status command I can use to show what's happening with the WAN connection on the router?

    Gaius: I don't know anything about the Shibby builds. Is there any reason to think they would address this problem?
     
  7. Planiwa

    Planiwa LI Guru Member

    Let's see if we can get some hard data.

    Is your modem a GT 701D?
    This modem has an event log. Does it show sync events? Does it have error/failure logs? You say that US Attenuation is reported as 100dB. Clearly that is not what it is. This means that the modem (or its counterpart) has a (design) defect. What data does the modem give you? Bitloading? SNR? SNRM? What is the DS Attenuation?

    You say that, when the "disconnect" happens you can't ping the WAN.
    Can you ping the modem? Can you telnet to it?
    What do ifconfig and netstat tell you?

    Can you ping LAN hosts?
    Can you arping them?

    You say that you can get the router to reset the PPPoE connection.
    But you imply that rebooting the router is faster?!

    How long does it take to do it properly?
    What does the router's message log say, exactly, during that time?
    Do you get a new IP address?

    You say that when you "lose connection" there are no messages in the router's log.
    What messages do you see if you disconnect the router from the modem for 5 minutes, and then reconnect?

    Do you "lose connection" even if the LAN is idle?

    Do you "lose connection" if you connect your PC directly to the modem?

    Do you "lose connection" if there is no LAN connected to the router?


    What is your Connections Limit?
    I suggest that you set it to 300 until you have found the cause of this problem.
     
  8. woody

    woody Networkin' Nut Member

    OK, I had several failures today. They seem to last about 3-4 minutes. Here's an example:

    Modem data:
    DSL Status
    VPI: 8
    VCI: 35
    DSL Mode Setting: MMODE
    DSL Negotiated Mode: G.DMT
    Connection Status: Showtime
    Speed (down/up): 8128 / 512 Kbps
    ATM QoS class: UBR
    Near End CRC Errors : 0/-15
    Far End CRC Errors : 0/327
    Near End CRC(Within last 30 mins) : 0/0
    Far End CRC(Within last 30 mins) : 0/0
    Near End RS FEC : 0/-680
    Far End RS FEC : 0/0
    Near End FEC(Within last 30 mins) : 0/0
    Far End FEC(Within last 30 mins) : 0/0
    Discarded Packets(Within last 30 mins): 0
    SNR Margin (Downstream/Upstream): 0/12
    Attenuation (Downstream/Upstream): 100/15

    Modem System log:

    181:40:00 Elapsed Time -- MARK --
    182:00:00 Elapsed Time -- MARK --
    182:20:00 Elapsed Time -- MARK --
    182:40:00 Elapsed Time -- MARK --
    183:00:00 Elapsed Time -- MARK --
    183:20:00 Elapsed Time -- MARK --
    183:40:00 Elapsed Time -- MARK --
    184:00:00 Elapsed Time -- MARK --
    184:20:00 Elapsed Time -- MARK --
    184:40:00 Elapsed Time -- MARK --
    185:00:00 Elapsed Time -- MARK --
    185:20:00 Elapsed Time -- MARK --
    185:40:00 Elapsed Time -- MARK --
    186:00:00 Elapsed Time -- MARK --
    186:20:00 Elapsed Time -- MARK --
    186:40:00 Elapsed Time -- MARK --
    187:00:00 Elapsed Time -- MARK --
    187:20:00 Elapsed Time -- MARK --
    187:40:00 Elapsed Time -- MARK --
    188:00:00 Elapsed Time -- MARK --
    188:20:00 Elapsed Time -- MARK --
    188:40:00 Elapsed Time -- MARK --
    189:00:00 Elapsed Time -- MARK --
    189:20:00 Elapsed Time -- MARK --
    189:40:00 Elapsed Time -- MARK --
    190:00:00 Elapsed Time -- MARK --
    190:20:00 Elapsed Time -- MARK --
    190:40:00 Elapsed Time -- MARK --
    191:00:00 Elapsed Time -- MARK --
    191:20:00 Elapsed Time -- MARK --
    191:40:00 Elapsed Time -- MARK --
    192:00:00 Elapsed Time -- MARK --
    192:20:00 Elapsed Time -- MARK --
    192:40:00 Elapsed Time -- MARK --
    193:00:00 Elapsed Time -- MARK --
    193:20:00 Elapsed Time -- MARK --
    193:40:00 Elapsed Time -- MARK --
    194:00:00 Elapsed Time -- MARK --
    194:20:00 Elapsed Time -- MARK --
    194:40:00 Elapsed Time -- MARK --
    195:00:00 Elapsed Time -- MARK --
    195:20:00 Elapsed Time -- MARK --
    195:40:00 Elapsed Time -- MARK --
    196:00:00 Elapsed Time -- MARK --
    196:20:00 Elapsed Time -- MARK --
    196:40:00 Elapsed Time -- MARK --
    197:00:00 Elapsed Time -- MARK --

    Can't ping anything on the internet. Can ping the modem. Did not capture ifconfig and netstat.

    Can ping LAN ip's and they show up in ARP.

    Router doesn't detect lost connection for 3-4 minutes. At that point it detects the lost connection and restarts the PPPoE session.
    I can force this by rebooting the router or by using the command line: 'service firewall start'

    I do get a new IP address after connection starts up.
    Here's the router log. Note that the connection failed around 17:50. The router notices the lost connection at 17:53:24 and logs 'Disconnected' at 17:53:30.

    Mar 13 17:00:01 Asus-Tomato syslog.info root: -- MARK --
    Mar 13 17:04:43 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPINFORM(br0) 192.168.123.54 00:26:18:76:f6:82
    Mar 13 17:04:43 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.54 00:26:18:76:f6:82 Desktop-AMD
    Mar 13 17:19:47 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPINFORM(br0) 192.168.123.54 00:26:18:76:f6:82
    Mar 13 17:19:47 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.54 00:26:18:76:f6:82 Desktop-AMD
    Mar 13 17:21:59 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPDISCOVER(br0) d8:a2:5e:51:16:80
    Mar 13 17:21:59 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPOFFER(br0) 192.168.123.59 d8:a2:5e:51:16:80
    Mar 13 17:22:00 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.59 d8:a2:5e:51:16:80
    Mar 13 17:22:00 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.59 d8:a2:5e:51:16:80 Linda-iPad
    Mar 13 17:29:34 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.50 3c:d0:f8:97:e4:80
    Mar 13 17:29:34 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.50 3c:d0:f8:97:e4:80 Linda-iPhone
    Mar 13 17:30:49 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.61 b4:07:f9:7f:b8:4b
    Mar 13 17:30:49 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.61 b4:07:f9:7f:b8:4b Android-smartphone
    Mar 13 17:34:51 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPINFORM(br0) 192.168.123.54 00:26:18:76:f6:82
    Mar 13 17:34:51 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.54 00:26:18:76:f6:82 Desktop-AMD
    Mar 13 17:35:31 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.50 3c:d0:f8:97:e4:80
    Mar 13 17:35:31 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.50 3c:d0:f8:97:e4:80 Linda-iPhone
    Mar 13 17:38:25 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.50 3c:d0:f8:97:e4:80
    Mar 13 17:38:25 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.50 3c:d0:f8:97:e4:80 Linda-iPhone
    Mar 13 17:41:34 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.50 3c:d0:f8:97:e4:80
    Mar 13 17:41:34 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.50 3c:d0:f8:97:e4:80 Linda-iPhone
    Mar 13 17:45:50 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.61 b4:07:f9:7f:b8:4b
    Mar 13 17:45:50 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.61 b4:07:f9:7f:b8:4b Android-smartphone
    Mar 13 17:49:12 Asus-Tomato user.debug ddns-update[31025]: Breaking /var/lock/ddns.lock
    Mar 13 17:49:41 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPDISCOVER(br0) d8:a2:5e:51:16:80
    Mar 13 17:49:41 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPOFFER(br0) 192.168.123.59 d8:a2:5e:51:16:80
    Mar 13 17:49:42 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPREQUEST(br0) 192.168.123.59 d8:a2:5e:51:16:80
    Mar 13 17:49:42 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.59 d8:a2:5e:51:16:80 Linda-iPad
    Mar 13 17:49:54 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPINFORM(br0) 192.168.123.54 00:26:18:76:f6:82
    Mar 13 17:49:54 Asus-Tomato daemon.info dnsmasq-dhcp[30396]: DHCPACK(br0) 192.168.123.54 00:26:18:76:f6:82 Desktop-AMD
    Mar 13 17:53:24 Asus-Tomato daemon.warn pppoe[26406]: LCP appears to be disconnected (pending: 5).
    Mar 13 17:53:30 Asus-Tomato daemon.notice pppoe[26406]: Disconnected.
    Mar 13 17:53:30 Asus-Tomato daemon.notice pppoe[26406]: Connect time 59.6 minutes.
    Mar 13 17:53:30 Asus-Tomato daemon.notice pppoe[26406]: Sent 2474873 bytes, received 2909184 bytes.
    Mar 13 17:53:37 Asus-Tomato daemon.notice pppoe[26406]: Connected.
    Mar 13 17:53:37 Asus-Tomato daemon.notice pppoe[26406]: IP Address: 74.176.159.206
    Mar 13 17:53:37 Asus-Tomato daemon.notice pppoe[26406]: DNS Address: 205.152.37.23, 205.152.150.23
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[30396]: exiting on receipt of SIGTERM
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: started, version 2.55 cachesize 1500
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N DHCP TFTP
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: asynchronous logging enabled, queue limit is 5 messages
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq-dhcp[31121]: DHCP, IP range 192.168.123.50 -- 192.168.123.61, lease time 1d
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: reading /etc/resolv.dnsmasq
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: using nameserver 205.152.37.23#53
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: using nameserver 8.8.8.8#53
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: using nameserver 8.8.4.4#53
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: read /etc/hosts - 2 addresses
    Mar 13 17:53:37 Asus-Tomato daemon.info dnsmasq[31121]: read /etc/hosts.dnsmasq - 12 addresses

    If I lose connection, disconnect, then reconnect, I see normal startup process in the router.

    Not sure if the LAN is ever completely idle, but I get disconnects all during the day whether anyone is using a device or not.

    Haven't tried connecting a PC directly to the modem. Because this is sporadic, I would have to disconnect everything else on the LAN to try that.

    Don't know whether it happens with no LAN connected; I guess I could disconnect everything at night and see how the uptime looks.

    The connections limit is set at 8192 now. I don't want to change it until I've ruled out other stuff. Otherwise, if the disconnects stop, I won't be able to see why.
     
  9. Planiwa

    Planiwa LI Guru Member

    Your modem stats are severely lacking.
    This could be (in part) the fault of its peer DSLAM card.
    I have seen IKNS cards that produce useless stats like this.
    In particular --
    There is no definite indication of Sync uptime, resyncs since boot, errors since sync.
    There is no DS Atten. and no DS SNR or SNRM.
    And no bitloading.
    I would strongly suggest getting rid of those "Marks", since they destroy any real records, such as resync events.




    Finally you are noticing or acknowledging this:

    Mar 13 17:53:24 Asus-Tomato daemon.warn pppoe[26406]: LCP appears to be disconnected (pending: 5).

    This is a very different story from:

    I have the log levels set to 8, but nothing shows up in the logs when the disconnects occur.

    Also, unlike the discussion in the thread you provided, I don't get any indication in the router logs that the connection has dropped.

    When I look at the router logs, there is nothing to indicate a failure.


    So, disregarding your previous stories, :) and considering this new fact . . .

    It would appear that your problem is either with your DSL line, or with the PPPoE connection over that line.
    The causes could range from bad line condition to faulty modem/DSLAM, to faulty AC, etc.


    You may feel that your router should be able to detect the problem immediately and correct it. But the router operates at a higher level.

    If you wanted to detect the PPP loss sooner, you could set up a process that pings the gateway every 15 seconds and restarts the WAN if it receives 3 consecutive non-echoes. Then you would "know" in less than 1 minute.
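
    The watchdog idea can be sketched roughly as follows. This is only an illustration in Python, assuming it runs on a Linux host whose ping accepts -c (count) and -W (timeout in seconds). The names watch, ping_once and restart_wan are made up for this example, and restart_wan is a placeholder for whatever actually restarts the WAN on your setup.

```python
import subprocess
import time
from typing import Callable, Optional


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo via the system ping; True if a reply arrived.

    Assumption: Linux-style ping, -c is the count, -W the timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(
    gateway: str,
    restart_wan: Callable[[], None],
    probe: Callable[[str], bool] = ping_once,
    interval_s: float = 15.0,
    threshold: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    max_cycles: Optional[int] = None,
) -> int:
    """Probe the gateway every interval_s seconds and call restart_wan
    after `threshold` consecutive failed probes.

    Returns the number of restarts triggered. max_cycles=None loops forever.
    """
    failures = 0
    restarts = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        if probe(gateway):
            failures = 0  # any reply resets the streak
        else:
            failures += 1
            if failures >= threshold:
                restart_wan()  # hypothetical hook supplied by the caller
                restarts += 1
                failures = 0
        cycle += 1
        sleep(interval_s)
    return restarts
```

    The probe and sleep functions are parameters so the counting logic can be tried with a fake probe before trusting it with a real connection. A single reply resets the count, so isolated lost pings do not trigger a restart.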

    If I were in your situation, I would obtain (borrow?) a decent modem.

    I would certainly make a precise log of these outages over a week and then try to find an intelligent rep at the ISP.
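
    One way to build such a log is to pull the pppoe events out of the router's saved syslog. The sketch below is a rough Python illustration that assumes lines in the same format as the log excerpt earlier in this thread and an English locale for month names. The function name reconnects is made up, and the year has to be supplied because syslog lines do not carry one.

```python
import re
from datetime import datetime
from typing import Iterable, List, Tuple

# Matches e.g. "Mar 13 17:53:30 Asus-Tomato daemon.notice pppoe[26406]: Disconnected."
PPPOE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ \S+ pppoe\[\d+\]: (Disconnected|Connected)\.$"
)


def reconnects(lines: Iterable[str], year: int) -> List[Tuple[datetime, datetime]]:
    """Return (disconnected_at, connected_at) pairs found in syslog lines."""
    pairs = []
    down_at = None
    for line in lines:
        match = PPPOE_RE.match(line.strip())
        if not match:
            continue  # ignore dnsmasq, MARK, "Connect time ..." and other lines
        when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if match.group(2) == "Disconnected":
            down_at = when
        elif down_at is not None:
            pairs.append((down_at, when))
            down_at = None
    return pairs
```

    Note that this only measures the time between the logged "Disconnected." and "Connected." lines. It does not capture the minutes before the router notices that the link is dead.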

    You seem to think that rebooting the router and then letting it establish a PPP connection is faster than letting the router (re-)establish a PPP connection.

    This may be worth thinking through. In particular, you may ask yourself:

    Do I believe that reboot + WAN is faster than re-WAN, and if so, what measured evidence do I have, and what possible explanation is there?
     
  10. woody

    woody Networkin' Nut Member

    So, to follow up on this thread, I continued to diagnose the problem by writing a script that pings external servers every few minutes and then logs the result. I discovered that intermittent ping failures were happening frequently, even when the disconnect was not long enough to be noted.
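
    For anyone who wants to do something similar, here is a rough Python sketch of that kind of ping logger. It is not the script I actually used. It assumes a Linux-style ping (-c count, -W timeout in seconds), and the function names and log format are made up for this example.

```python
import subprocess
from datetime import datetime
from typing import Callable, Iterable, List


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo via the system ping; True if a reply arrived.

    Assumption: Linux-style ping, -c is the count, -W the timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def format_entry(when: datetime, host: str, ok: bool) -> str:
    """One log line: timestamp, host, and ok/FAIL."""
    return f"{when:%Y-%m-%d %H:%M:%S} {host} {'ok' if ok else 'FAIL'}"


def probe_hosts(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = ping_once,
    now: Callable[[], datetime] = datetime.now,
) -> List[str]:
    """Ping each host once and return one formatted log line per host."""
    return [format_entry(now(), host, probe(host)) for host in hosts]


def append_log(path: str, lines: Iterable[str]) -> None:
    """Append the lines to a plain-text log file."""
    with open(path, "a", encoding="utf-8") as log_file:
        for line in lines:
            log_file.write(line + "\n")


def count_failures(lines: Iterable[str]) -> int:
    """Count log lines that record a failed ping."""
    return sum(1 for line in lines if line.rstrip().endswith(" FAIL"))
```

    Calling probe_hosts and append_log from a scheduler every few minutes gives a timestamped record. Pinging more than one external server helps tell a dead WAN link apart from a single server that is not answering.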

    I eventually talked to a helpful technician at AT&T who told me that my problem was with the password that my PPPoE connection was using. I've had this DSL service for 15 years or more using that password. The tech said that AT&T recently changed from using the original email password associated with the account to a separate, automatically generated password just for use with DSL. They supposedly sent this change in an email to my ancient, unused bellsouth.net email address, but I didn't see it there.

    At any rate, when they did this change, the old password continued to work, but the connection was unreliable. What a great way to implement a network change! If they had completely disabled the original password, the problem could have been resolved quickly. By making it just unreliable, they created a problem that was hard to diagnose.

    They also changed the profile my DSL account uses, but I'm not sure what effect that might have.

    So, after I started using the new password, my connection has been almost 100% reliable. I've logged no disconnects and only 1 ping failure in several days. Hope this information will be useful to anyone else who has the same problem.
     
  11. Planiwa

    Planiwa LI Guru Member

    Thanks for the update. That's bizarre.

    About the profile -- hard to know what effect unless you know what it was before and what it is now.
    They may have lowered your target data rate to get a more stable connection.
    Or they may have raised it (again) because the connection is (now) more stable.
     
  12. woody

    woody Networkin' Nut Member

    Looks to me like they lowered it. It was at 8150 and it's now showing 7100. However, both of those are higher than any speeds I can actually achieve.
     
