Tomato ND USB Mod

Discussion in 'Tomato Firmware' started by teddy_bear, Dec 17, 2008.

  1. mitek76

    mitek76 Addicted to LI Member

    Network issue

    Seems to be not only a Samba problem but a router network issue (or some filtering application on it). I've tried samba2 and samba3, and found that the issue also occurs on the FTP server (but with a different file - see attachment). I tried over both wifi and LAN and the same problem is observed. I also tried transferring from a different machine. If someone could reproduce the issue, I would be sure it is not a problem with my configuration.
     

    Attached Files:

    • f.zip (2 KB)
  2. teddy_bear

    teddy_bear Network Guru Member

    You didn't specify what filesystem you use, whether you can or can't copy other files, and whether or not the problem occurs with multiple different USB drives...

    Anyway, it seems to be a problem on your end. I can upload both of your files using either built-in Samba, or built-in FTP - to either FAT or EXT2 partition.
     
  3. mitek76

    mitek76 Addicted to LI Member

    Networkin' Nut

    TB,
    I use ASUS WL500GP (v1) with latest build (tomato-ND-USB-8634-Ext.trx)
    The problem occurs when uploading these files from a Windows machine to the router, either to USB or to the /tmp directory. Maybe the issue is not seen on other routers? Could you try this approach if you have a WL500GP?
    BTW: The files I prepared are parts of my camera's MOV files. I can copy most of the files, but it fails on some. The failing files are quite large, but I cut out most of the data and found the parts of the files that cause the issue.
     
  4. CsBubo

    CsBubo Addicted to LI Member

    I use this module from another mod where SDHC support was added, just loading it through the init script.
    Working fine with Teddy's firm on a WRT54G.

    Code:
    insmod mmc.o major=121 din=5 dout=4 clk=3 cs=7 maxsec=32 rahead=2 dbg=0001
    Of course you'd need to modify the parameters according to your SD-mod implementation. And you can store it on JFFS if you are using one of the smaller builds (no-cifs).
     

    Attached Files:

    • mmc.zip (13 KB)
  5. Snoopyee

    Snoopyee Addicted to LI Member

    Let me state the specs of my setup before I get into my question/problem:

    Router: Asus WL-520GU
    Firmware: Tomato 1.25.8634 ND USB lite version
    USB settings in tomato: first 6 radio marks checked
    Printer: HP Deskjet 3650
    OS: Mac OS X 10.6 Snow Leopard
    Drivers from HP website claiming to work with OS X 10.6. Not sure if it's CUPS compatible - it doesn't say. From the website, it's the latest one updated from Sept 09 - http://h10025.www1.hp.com/ewfrf/wc/softwareList?os=219&lc=en&dlc=en&cc=us&product=304535

    Okay, here's the dealio: how do I get this to use the printer on a Mac? PCs work just fine and can print to the printer using TCP/IP w/ RAW 9100. It's only on the Mac that I'm having trouble. So far here are the settings I've tried:

    Protocols: Line Printer Daemon - LPD
    Address: 192.168.1.1:9100
    Queue: <blank>
    Print Using: I select the print drivers from the installed drivers from hp

    Protocols: HP Jetdirect - Socket
    Address: 192.168.1.1:9100
    Queue: <blank>
    Print Using: same drivers I downloaded and installed from HP

    And finally:
    Protocols: Internet Printing Protocol - IPP
    Address: 192.168.1.1:9100
    Queue: <blank>
    Print Using: same drivers as above

    The error I get when trying to print is:
    "Opening Printer Connection Failed: Unable to open the printer connection. Please check your printer connection"

    Could someone please help, or at least direct me to a Mac step-by-step guide? The only step-by-step instructions I see are for PCs. Can someone also confirm the settings, in case I'm missing something here? Thank you.
     
  6. teddy_bear

    teddy_bear Network Guru Member

  7. weixing

    weixing Addicted to LI Member

    fuse.o and ntfs3g

    Teddy,

    Can you compile a fuse.o for this mod for use with ntfs-3g 4.4?

    Also, can you tell me how to change the USB automount script if the ntfs driver is changed to ntfs-3g instead of the original ntfs.o?
     
  8. teddy_bear

    teddy_bear Network Guru Member

    Heh ;)... Unfortunately it's not as easy as compiling the fuse module. Kernel 2.4.20 is not supported by any version of fuse. I already made some backports from more recent kernels for compatibility with fuse, and was even able to compile and load it. However, more backports are needed to make it actually work and be stable - currently it's locking up and crashing a lot.
    Maybe I'll work on it some day, but this is not a simple task, so I won't promise anything...
     
  9. baker99

    baker99 Addicted to LI Member

     
  10. ray123

    ray123 LI Guru Member

    TeddyBear,
    I've made some more enhancements & fixes.

    These began as enhancements for easy automation of the start up of a bittorrent client (located on a USB drive) when the router boots up. Along the way I discovered a few problems which I also fixed. A bittorrent client absolutely requires a swapfile/partition, which is how I ran into some bugs/problems.
    It also turns out that a USB drive might take as long as 5-10 minutes to mount on a reboot--which changed the method that I had originally used to kick things off.

    As soon as this stuff gets into a release, I'll finish up my "Guide to Running a Torrent Client and Other Software on a Tomato Router" document, and a tarball that non-Linux-expert people can drop onto a USB drive to get up and running simply.

    You can download the patchset at:
    http://www.mediafire.com/file/225gmfkwyit/rvt-v34-enhancement-patch.tgz

    1) Fixed busybox swaponoff.c to not die if one of the swapfiles listed
    in /etc/fstab doesn't exist.

    2) Configured busybox to include dirname (used by ipkg).

    3) Enhancement to kernel when mounting ext3. Log "journal recovery
    started" if that is the case. Journal recovery can take 5+ minutes!
    During this time the disc mount is still in progress.

    4) Enhancements to the usb mount script functionality:
    * Pass the path of the partition as the 1st argument to the script.
    (This will allow the user script to do special things for particular
    mounts, since it will know exactly which partition is being
    affected.)

    * When a USB device gets mounted, execute the "*.autorun" script(s)
    in the top directory of the device. At first I made it "*.usbmount",
    but then decided that "autorun" is more like the de-facto standard.


    5) Fixed the USB unmount disk process to turn off the swapfiles & swap
    partitions that are on the disk. This doesn't help if you just
    unplug the drive, but it does if you unmount the drive via the web
    GUI. (NAS and USB screen).

    6) Also fixed the unmount process so that it will unmount *all* the
    mountpoints (directories) that are on the partition(s) on the disk.
    For example, if you do something like
    "mount -o bind /tmp/mnt/usbdisk/optdir /opt", that bind mount
    gets unmounted as well.

    7) Enhanced the NAS/USB GUI screen to show if a swap partition is active.

    #1 thru #4 are in "rvt-v32.patch"

    #1 thru #7 are in "rvt-v34.patch"
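
    As a rough illustration of item 6, the shell sketch below lists every mountpoint backed by a given partition, last-mounted first, so that bind mounts and nested mounts could be unmounted before the partition itself. This is not the patch (which changes the firmware's own code and is not shown in this thread); the function name `mountpoints_of`, its optional second argument, and the device name in the usage comment are made up for the example, and it assumes bind mounts show up in /proc/mounts under the same device name as the original mount.

    ```shell
    # Print every mountpoint whose device field equals $1, last-mounted first.
    # $2 is the mounts table to read (defaults to /proc/mounts).
    mountpoints_of() {
        dev=$1
        table=${2:-/proc/mounts}
        # Field 1 is the device, field 2 the mountpoint. Collect the matches
        # and print them in reverse, so later mounts come out first.
        awk -v dev="$dev" '
            $1 == dev { mp[n++] = $2 }
            END { for (i = n - 1; i >= 0; i--) print mp[i] }
        ' "$table"
    }

    # Possible use on the router (needs root; /dev/sda1 is only an example):
    #   for mp in $(mountpoints_of /dev/sda1); do umount "$mp"; done
    ```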
     
  11. hapahopi

    hapahopi Addicted to LI Member

    can't wait to try this one out!
     
  12. teddy_bear

    teddy_bear Network Guru Member

    Great set of improvements as usual, thank you Ray - will include them into the next build. I took a quick look at some of the patches, and have a couple questions.
    Do you mean the Optware ipkg, or any other? Because I use Optware all the time, and have never seen ipkg complaining about missing dirname...
    Actually, the mountpoint was used as an argument initially... But I removed it later to avoid any confusion since the after-mount script is called only once per device, no matter how many partitions are getting mounted there. In your implementation, only the mountpoint of the last mounted partition on that device will be passed to the script. The same with .autorun - only the script located on the partition that gets mounted last will be executed. I thought of moving execution of ".autorun" scripts into the mount_partition() or mount_r() function, and maybe running "after-mount" script for each partition as well, but then I remembered why I did not do it this way in the first place - partitions can be mounted by "mount -a" command using fstab, and we have no control of that in the code... Also, I didn't want to run user scripts until all mounting is done. Looks like we need to loop through all the mounted partitions of the device after mounting is processed, and execute .autorun for all of them. What do you think?
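
    The idea in the last few sentences (once all mounting is done, loop over every mounted partition of the device and run its .autorun scripts) might look roughly like this in shell. It is only a sketch, not the firmware's code; the function name `run_autorun` is invented here, and passing the mountpoint as the script's first argument is an assumption.

    ```shell
    # Run every *.autorun script found in the top directory of each mountpoint
    # given as an argument, passing that mountpoint as the script's first argument.
    run_autorun() {
        for mp in "$@"; do
            for script in "$mp"/*.autorun; do
                # If the glob matched nothing it stays literal, so check the file.
                [ -f "$script" ] || continue
                sh "$script" "$mp"
            done
        done
    }

    # Hypothetical use, after all partitions of a device are mounted:
    #   run_autorun /tmp/mnt/part1 /tmp/mnt/part2
    ```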
     
  13. ray123

    ray123 LI Guru Member

    The ipkg I use is the one mentioned in the "ipkg and torrent for Tomato.txt" writeup that I posted & uploaded in July. It's the one that gets installed by the command "wget http://www.wlan-sat.com/boleo/optware/optware-install-oleg.sh -P /tmp"

    Many of the package install scripts call dirname. Dirname itself is provided in the coreutils package. You probably installed that early on, so you wouldn't have gotten the error messages about "dirname not found". I installed it very late, so I saw it a lot. The first packages I installed were torrent clients and their dependencies. If somebody doesn't install coreutils, they'll have problems.

    -----------------------------

    Dang, I didn't consider that you might have multiple partitions on a disk. Windows only allows 1 partition on removable devices, and I kinda implicitly assumed that there wouldn't be more than one. Clearly incorrect. I now have only one user partition and one swap partition--but in earlier testing I did have one USB stick with 3 partitions with ext3, ext2, and fat types.

    Let me think on this awhile. I'll get back with my thoughts.
     
  14. ray123

    ray123 LI Guru Member

    Ok, had some thoughts. I started writing stuff down---which usually helps clarify my thinking. As I wrote stuff down, it became clear what to do. Pretty simple, too. Once again I prove to myself that it takes more code to do something the wrong way than to do it the right way. :)

    The first part of the writeup is my developing thoughts; the proposals (and what these changes implement) are at the bottom. It was easier to modify the code than to write about the changes.

    Here's my new patchset & writeup. It completely replaces the last one. I did have one question on some of the code you put in somewhere between V24 and V32. That's in the writeup, too.

    http://www.mediafire.com/file/begznzhftiy/rvt-v34-enhancement-b-patch.tgz
     
  15. KyleChen

    KyleChen Addicted to LI Member

    Thanks teddy, thanks ray123, thank you all for your time and good work!

    Now I have a way to test the driver issue; I'll come back later to post the result.
     
  16. jnappert

    jnappert LI Guru Member

    Is there support for reiserfs in this USB mod? I am running squid as a caching proxy on my WL500gp. The cache is located on an 8GB USB stick formatted as ext3.

    I ran tests on my Linux machine, and reiserfs performs much better when caching small files.

    So I'd like to change from ext3 to reiserfs, if teddy_bear's mod supports it.
     
  17. CsBubo

    CsBubo Addicted to LI Member

    SDHC mod speed

    Never goes above 400KB/s. That's the max for this kind of interface (SD in SPI mode).
    But still ok for smaller DSL lines to utilize.
     
  18. ray123

    ray123 LI Guru Member

    You should not use ext3 on a USB flash drive, only on a hard drive. Due to the journal writes, ext3 writes 2 or 3 times more than ext2 does. Only use ext2 or fat on a flash drive.

    Due to the fact that Linux does a *very* good job with disc buffer cache, it is quite difficult to accurately measure filesystem performance on Linux. Generally, what you are actually measuring is buffer cache access, not disc access.

    In addition to that, the router's bottleneck is transferring data across the USB---it's just not that fast. So, IMHO, it's highly unlikely that there is a significant, actually noticeable speed difference between ext and reiser.

    But to answer your question: No, no reiserfs support. The fs driver isn't there.
     
  19. ray123

    ray123 LI Guru Member

    Ok, I tried to build reiserfs. No joy.
    The version in Tomato (kernel version 2.4.20) won't build.
    I copied it from 2.4.37 and built it, which did build.

    But... (Yeah, you knew there was a "but" coming, didn't you.)

    fs/reiserfs/reiserfs.o is 304743 bytes.
    whereas
    fs/ext3/ext3.o is 118076 bytes.

    Three times larger.

    FWIW, fs/ext2/ext2.o is 64599 bytes. Not that that matters---both ext2 & 3 are in the Tomato build.

    If you still want to try it, PM me and I'll upload it for you to try. It may or may not work---I can't test it. You won't be able to automount it, though. You'll have to put it in fstab. And that might not even work. You might have to do it with your own mount command.
     
  20. jnappert

    jnappert LI Guru Member

    Thanks ray123. Manually mounting isn't a problem - but 3 times larger...

    I think about "downgrading" to ext2.
     
  21. CsBubo

    CsBubo Addicted to LI Member

    iptables is broken?

    Hi TB and Ray!

    It seems the iptables executable is having some problems, since within a terminal I got this:

    Code:
    [root@tomato init.d]$ iptables -L
    Chain INPUT (policy DROP)
    target     prot opt source               destination
    Segmentation fault
    and when I try to add some rules by hand:

    Code:
    [root@tomato init.d]$ iptables -I INPUT -i ppp0 -p tcp --syn --dport 51780 -j ACCEPT
    Segmentation fault
    [root@tomato init.d]$ iptables -I INPUT -i ppp+ -p tcp --syn --dport 51780 -j ACCEPT
    Segmentation fault
    [root@tomato init.d]$ iptables -I INPUT -i vlan1 -p tcp --syn --dport 51780 -j ACCEPT
    Segmentation fault
    
    The strange thing is, that the rules can be added through the GUI, which I suppose is using the same executable.
    Can You have a look?
    Thanks
     
  22. teddy_bear

    teddy_bear Network Guru Member

    CsBubo,
    Looks like you either have another copy of iptables, or - more likely - your library path has been changed, and now it picks up the wrong version of some of the shared libraries. Have you played recently with LD_LIBRARY_PATH, or ld.so.conf?
    Make sure you are also not making any changes to the default library path via profile files: /jffs/etc/profile or /opt/etc/profile (these profiles are picked up automatically).
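
    A quick way to check for this is to look at the variable itself and to search the profile files for lines that set it. The sketch below is only an illustration; the helper name `find_ld_path_overrides` is made up, and the files in the usage comment are simply the profile locations named above.

    ```shell
    # Print "file:lineno:text" for every line mentioning LD_LIBRARY_PATH in
    # the given files; files that do not exist are skipped silently.
    find_ld_path_overrides() {
        for f in "$@"; do
            [ -f "$f" ] || continue
            grep -n 'LD_LIBRARY_PATH' "$f" | sed "s|^|$f:|"
        done
    }

    # Possible use on the router:
    #   echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
    #   find_ld_path_overrides /jffs/etc/profile /opt/etc/profile
    ```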
     
  23. Megaweapon

    Megaweapon Addicted to LI Member

    Any word on enabling the tftpd support in dnsmasq at compile?

    I'd like to at least know if this is plausible.
     
  24. Mastec

    Mastec Network Guru Member

    I have run into a WDS problem with 34, the latest fix. I searched but didn't find anything addressing this; it could be there, but maybe I suffered a brainfart. LOL. But anyway, back to my issue... When I load 34 onto my WL500g V2 and set up WDS with my WHR-HP-G54 with 1.25 loaded, it cannot connect. When I look at the Device List in the GUI on both routers, it shows a strong connection, but there is no communication between the routers, and the computers connected to the Buffalo can't access the network or the internet. If I fall back to 32, everything works like it should. I am by far not a newb to Tomato, having used it since 1.16, so I know to reset to defaults and so on. Has anyone else had this problem?
     
  25. CsBubo

    CsBubo Addicted to LI Member

    thanks

    I haven't changed the mentioned config, but my LD_LIBRARY_PATH is empty if /opt isn't connected. Of course, when /opt is mounted, /etc/profile is sourcing it.
    Once I added /lib and /usr/lib to it, it works fine.
    The question is: where should LD_LIBRARY_PATH originally be populated?
     
  26. ray123

    ray123 LI Guru Member

    It shouldn't be set at all.

    It sounds like you either have a path problem or something got set wrong in your profile. Do you have an /opt set up? 'cause this sounds like you are running a different executable depending on if you use command line or GUI.

    Two things to try: (Here's what mine says)
    #1
    root@Router:# /usr/bin/which -a iptables
    /usr/sbin/iptables

    #2
    root@Router:# ldd() { LD_TRACE_LOADED_OBJECTS=1 $*; }
    (You only need to do that once. I have it in my profile.)

    root@Router:# ldd iptables
    libdl.so.0 => /lib/libdl.so.0 (0x0x2ab07000)
    libnsl.so.0 => /lib/libnsl.so.0 (0x0x2ab49000)
    libiptc.so => //usr/lib/libiptc.so (0x0x2ab8a000)
    libc.so.0 => /lib/libc.so.0 (0x0x2abd1000)
    ld-uClibc.so.0 => /lib/ld-uClibc.so.0 (0x0x2aac0000)

    But, for example, when I do this...
    root@Router:# ldd /opt/bin/which
    libgcc_s.so.1 => /opt/lib/libgcc_s.so.1 (0x2aad7000)
    libc.so.0 => /opt/lib/libc.so.0 (0x2ab01000)
    ld-uClibc.so.0 => /opt/lib/ld-uClibc.so.0 (0x2aac0000)

    And you can see that it's picking up the libraries in /opt/lib

    Whereas,
    root@Router:# ldd /usr/bin/which
    libcrypt.so.0 => /lib/libcrypt.so.0 (0x0x2ab07000)
    libc.so.0 => /lib/libc.so.0 (0x0x2ab5c000)
    ld-uClibc.so.0 => /lib/ld-uClibc.so.0 (0x0x2aac0000)
     
  27. DCX2

    DCX2 Addicted to LI Member

    Hi. Thanks very much for making this mod; I have an Asus 520gU and I wanted samba + usb, and WRT was starting to make me twitch, and this just...worked.

    But then I started copying some files over and I noticed that they were taking up an unusually large amount of disk space...

    I read that there's some bug in Samba 2.0.10 that was fixed in 2.2.3 that caused something like this to happen http://wl500g.info/showthread.php?t=13236.

    I'm using Tomato Firmware v1.25.8634 ND USB Std. I looked at the changelog and I think you built with 2.0.10. Would it be possible to persuade you to provide a build with 2.2.3?
     
  28. teddy_bear

    teddy_bear Network Guru Member

    DCX2,
    Samba 2.0.10 is the last Samba version that can fit into 4MB flash - that's why it's used.
    As far as I understand this bug is only cosmetic - it only affects the size on disk reported by Samba clients, and not the actual size taken up by your files.

    Mastec,
    This is very strange because there were no changes between builds 32 and 34 that could affect wireless (here's the diff). I wonder if your issue is somehow related to the weird connection problems with wl500gpv2 discussed a few pages back - and only affecting some (a few) users...
    Unfortunately, there was no feedback yet on the different WL driver versions posted by Ray and me here.

    Megaweapon,
    I checked, and TFTP increases the dnsmasq code by only 7KB (uncompressed). I think I'll go ahead and enable it for the next build (but no GUI - you'll have to activate it via "Dnsmasq Custom Configuration") since I just gained about 30KB (after compression) by playing with kernel compile flags ;)...

    CsBubo,
    Even when /opt is connected, you should not change LD_LIBRARY_PATH. If it's in your /opt/etc/profile - remove it! Optware doesn't need it, and it can only cause problems for non-Optware apps.
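
    On the TFTP point above: once a build with TFTP compiled in exists, turning it on through "Dnsmasq Custom Configuration" would presumably come down to dnsmasq's standard TFTP options. The fragment below is a sketch under that assumption; the directory is only a placeholder, not a path this mod is known to use.

    ```conf
    # Enable dnsmasq's built-in TFTP server
    enable-tftp
    # Directory to serve files from (placeholder path, adjust to your setup)
    tftp-root=/tmp/mnt/usbdisk/tftp
    ```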
     
  29. DCX2

    DCX2 Addicted to LI Member

    Oh, I see. Thank you for your prompt and informative reply.

    So the bug only affects file sizes reported by Samba? If I telnet and ls, then I should get the right sizes?

    Thanks again for taking the time to make this mod. I really, really appreciate it. :thumbup:
     
  30. DCX2

    DCX2 Addicted to LI Member

    Just FYI, I confirmed with du that a 1 byte text file actually consumes only 4kB on disk.
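
    The same check can be repeated from a telnet session by comparing a file's length in bytes with the space it occupies on disk. A minimal sketch, assuming `wc` and `du` are available in the router's busybox; the function name `size_vs_usage` is invented for this example, and the figure `du` reports depends on the filesystem's block size.

    ```shell
    # Print a file's length in bytes and the disk space it occupies in KB.
    size_vs_usage() {
        bytes=$(wc -c < "$1")
        kb=$(du -k "$1" | cut -f1)
        # $((...)) also strips any padding some wc implementations add.
        echo "bytes=$((bytes)) disk_kb=$kb"
    }

    # Example: size_vs_usage /tmp/mnt/usbdisk/test.txt
    ```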
     
  31. TexasFlood

    TexasFlood Network Guru Member

    34 was the first version I tried and I don't have 32 to fall back to. I am also running this on a WL500g V2. Have seen some similar WDS issues (connection showing in device list but no communication). I can get it going by rebooting my routers a few times. This same setup is rock solid with my WRT54G v2.0 running vanilla Tomato 1.23 in the same role.
     
  32. Megaweapon

    Megaweapon Addicted to LI Member

    Awesome! You will want to update dnsmasq to 2.50 at the same time due to the TFTP security vulnerability that was found in 2.49.

    Thanks!
     
  33. ghostknife

    ghostknife Addicted to LI Member

    Yep I was waiting on that too.... guys what happened with the other WL drivers???

    Since I don't have the problem, I was going to test and see if I could cause mine to break, but didn't get to it yet. Also, I could try WDS with another WL500 and my ADSL modem when I do that, maybe on the weekend. Wireless performance on that modem is a total POS and I never use it at all, so if its WDS works with the WL500 I'll be surprised.
     
  34. hypermood

    hypermood LI Guru Member

    dnsmasq minimum ttl patch

    After messing around with dnsmasq's caching, I found this patch, http://article.gmane.org/gmane.network.dns.dnsmasq.general/1957, which forces a minimum ttl for the records in the cache. There are pros and cons to this...

    Anyway, I merged the patch into dnsmasq 2.49 and tested with dig - seems to work well. Attached is the patch file which you can apply to build your own custom TB mod :thumbup:

    After building and installing the firmware, you can add the new configuration parameter, 'pos-ttl=1200', to the dnsmasq custom configuration in the web GUI. 1200 would keep cache entries for a minimum of 20 minutes before a new query is needed.
     

  35. koszpa

    koszpa Addicted to LI Member

    I have been quite busy recently but I have managed to try all the wireless drivers you posted earlier, here are the results with the v1.25.8634 vpn3.4 ND USB VPN:

    wl_4_130_19_0.o - wlan NOT working :frown:
    wl_4_150_10_5.o - wlan NOT working :frown:
    wl_4_150_10_29.o - wlan works fine :)
    wl_4_158_4_0.o - wlan NOT working :frown:

    I also tried another driver version from v1.23.8623 OND, by wl version it is the same as the first one in the previous list, but the two files are different:

    wl_4_130_19_0.o - insmod fails with some error code

    This is the driver which is from the original Tomato, and which works fine with the v1.23.8624a OND version.
     
  36. ray123

    ray123 LI Guru Member

    Yes, it will fail. The libraries changed and a referenced--but unused--function got removed from a lib file.
    I re-built it with a faked-out definition of that function, so it could be insmod'ed in the current Tomato. That's why the files are different.
    Note that we only have the object files of the WL drivers, not the source, so we can't rebuild the object files.
     
  37. koszpa

    koszpa Addicted to LI Member

    Now I have copied the working wlan driver (wl_4_150_10_29.o) to the jffs and created a startup script with these two lines:

    rmmod $(lsmod | grep wl | cut -f1 -d" ")
    insmod /jffs/wl_4_150_10_29.o

    It swaps the wlan driver for the working one at every startup. However, when I use this method the wlan only starts working half-way. The AIR LED stays half-lit, I can only search as a client with the site survey, and the network is not discoverable.

    Where and how should I put these lines to get this driver start properly at boot?
     
  38. ray123

    ray123 LI Guru Member

    Does it work okay if you log in after bootup and do it manually?
    If so, try putting a large sleep (5-10 or more seconds) before the rmmod in your script. It might be that your script gets executed while the router is still coming up.

    I see a number of places in the code where wl is loaded. Kinda hard to figure out all the paths through the code, though.

    BTW, you don't need the grep/cut. Just "rmmod wl".
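
    Put together, the suggestion amounts to something like the following startup-script sketch. The 10-second default is just a value from the range suggested above, the function name `swap_wl_driver` is made up, and the module path in the usage comment is the one koszpa mentioned. Nothing in this thread establishes that this is enough for the radio to come up fully at boot.

    ```shell
    # Wait for the router to finish coming up, then replace the wl module.
    # $1 = path to the replacement module, $2 = delay in seconds (default 10).
    swap_wl_driver() {
        sleep "${2:-10}"
        rmmod wl
        insmod "$1"
    }

    # In the startup script (router only, needs root):
    #   swap_wl_driver /jffs/wl_4_150_10_29.o
    ```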
     
  39. JackieBrown

    JackieBrown Guest

    I read the first 10 pages but there are a lot of posts here.

    Does the linksys 54gl work for this mod? I am looking at the usb support in your mod.

    **edit

    Nevermind. I don't think that router supports the nd driver. Sorry for the noise
     
  40. hypermood

    hypermood LI Guru Member

    NAS startup race condition

    teddy_bear,

    I found a race condition in the NAS/hotplug start sequence. The init process starts the vsftpd and smbd services on startup as it should. When I have my USB hard drive attached, quite often hotplug completes detection of the drive and executes 'restart_nas_services' before the calls in the init service have completed. The result is that smbd will have a conf file which has been written to concurrently and will be left in an invalid state.

    I noticed this after rebooting my router and smbd would not start - often complaining about a bad conf file. I fixed the problem by creating a lock file that prevents concurrent execution of the vsftpd and smbd start routines.

    I added a touch of syslog messages to confirm synchronization of the NAS services and have not had any startup issues since creating the patch.

    See attached.
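
    The patch itself is not reproduced in this thread. Purely as an illustration of the idea (only one start routine at a time, enforced by a lock on disk), a shell sketch could look like this. It uses `mkdir` as the atomic lock primitive, which is a common shell idiom and not necessarily what the patch does; the name `with_lock`, the lock path, and `start_samba` in the usage comment are all made up.

    ```shell
    # Run a command while holding a lock directory. mkdir either creates the
    # directory or fails, so only one caller at a time gets past the loop.
    # $1 = lock directory, remaining arguments = command to run.
    with_lock() {
        lock=$1
        shift
        until mkdir "$lock" 2>/dev/null; do
            sleep 1    # someone else holds the lock; wait and retry
        done
        "$@"
        status=$?
        rmdir "$lock"
        return $status
    }

    # Hypothetical use: with_lock /tmp/nas.lock start_samba
    ```

    A lock left behind by a crashed process would block later callers forever, so a real implementation would need some way to detect and clear a stale lock.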
     

  41. teddy_bear

    teddy_bear Network Guru Member

    koszpa,
    Thanks for testing - looks like we have a winner! I'm running 4.150.10.29 too now without any issues. Next build will include it as a standard, and we'll see if anyone else complains ;). But hopefully it will also fix some weird WDS problems experienced by other WL500gPv2 owners.

    hypermood,
    Thanks for the patch - will adopt it for the next build.

    JackieBrown,
    If you're talking about WRTSL54GS (that has USB port) then I believe this mod will run on it. As long as your router has wl0_corerev > 5 (check it by running "nvram get wl0_corerev" command) you should be able to use ND version.
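
    That rule of thumb can be wrapped in a small check. A sketch only; the function name `nd_capable` is invented here, and since `nvram` exists only on the router, the usage line is shown as a comment.

    ```shell
    # Succeed if the given wl0_corerev value is greater than 5, the threshold
    # quoted above for running the ND version.
    nd_capable() {
        [ "$1" -gt 5 ] 2>/dev/null
    }

    # On the router:
    #   if nd_capable "$(nvram get wl0_corerev)"; then echo "ND build should work"; fi
    ```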
     
  42. hypermood

    hypermood LI Guru Member

    I'm happily running a V1.0 WRTSL54GS with this mod :thumbup:
     
  43. ray123

    ray123 LI Guru Member

    Good analysis! I've been concerned about potential race conditions w/r/t USB drives for a while now, but never could see it. This is a quite small window, it's amazing that somebody not only managed to hit it, but also can hit it repeatably.

    But I don't think it's the right fix. It's kind of a sledgehammer. And it still doesn't fix the real problem, which is that the usb subsystem messes with samba & ftp before the init process has finished bringing the system up. Fix that, and this problem goes away.

    By looking at the diffs between v32 & v34, it looks like Teddy Bear was struggling with this "start" vs. "restart" problem and made an attempted fix. BTW, IMHO the right way to solve the concurrency issues of writing the config files is the way that miniupnpd does, using mkstemp & rename. (This is the standard technique for doing this in Linux.) I actually had this coded up, but then discarded it because it didn't solve the start vs. restart issue---which is the root problem. Right now, except for startup, there is no concurrency problem here because the usb hotplug code is serialized.

    Anyway.....the attached patch fixes the startup race bug---in 4 lines of code.

    The other 2 changes are to correct a bug in the restart logic that occurs if, for some reason, the daemon process isn't active as expected. I think the right fix for this would be for do_start_stop_XXX to have "if (!start && !restart) return;" instead of the pidof() test. But I'll leave that cleanup for another time/person.
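
    For reference, the temp-file-and-rename technique mentioned above can be sketched in shell as follows. The firmware's config writers are not shell scripts, so this only shows the idea; `write_config_atomically` and `generate_smb_conf` are made-up names, and the sketch assumes `mktemp` is available and that the temporary file lands on the same filesystem as the target, so that `mv` is a plain rename.

    ```shell
    # Write stdin to a temporary file next to the target, then rename it into
    # place. Readers see either the old file or the complete new one.
    # $1 = target config file.
    write_config_atomically() {
        target=$1
        tmp=$(mktemp "$target.XXXXXX") || return 1
        if cat > "$tmp"; then
            mv -f "$tmp" "$target"
        else
            rm -f "$tmp"
            return 1
        fi
    }

    # Hypothetical use: generate_smb_conf | write_config_atomically /path/to/smb.conf
    ```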
     

  44. ghostknife

    ghostknife Addicted to LI Member

    Ok so i tested them all once i figured out the correct commands and paths.....
    and they all work

    wl_4_130_19_0.o - wlan working
    wl_4_150_10_5.o - wlan working
    wl_4_150_10_29.o - wlan working
    wl_4_158_4_0.o - wlan working

    By tested I can't say the test was extensive, I just connected to it with HP iPaq and get email and browse so no throughput or other reliability test, but connects with no problem.

    Edit: I was running v1.25.8634 ND USB Ext at the time. OK Singapore GP starting now, going back to that.

    Edit2:
    OK, so if I perform the steps manually the alternate drivers work, so I then tried adding them to a startup script to use 150_10_29 for testing, but I can't get it to work. The commands seem to work OK and the driver loads according to the log, but wireless doesn't work, i.e. I can't connect to it.
    If I remove the script, reboot to get the default WL working, and perform the steps manually, it's OK, so there's something wrong with my procedure. I don't know exactly where to put the commands. I put them in Scripts > Init; is this correct?

    sleep 20
    rmmod $(lsmod | grep wl | cut -f1 -d" ")
    insmod /cifs1/wl_4_150_10_29.o

    I tried to load from jffs and from cifs1, same result. I see ray said you don't need the grep/cut, but I can't figure out the syntax of the command without that since I don't understand this very well. Anyway, maybe the WL driver needs to be fully incorporated in the flash for it to work properly, so I'll wait for some help.
     
  45. hypermood

    hypermood LI Guru Member

    It was a typical race, sometimes it would work and sometimes it wouldn't. I was shocked when I cat'ed smbd.conf. :eek:

    The other problem I found with the stop/start/restart checks is that they rely on a static stopped variable. Hotplug and init are two separate processes, so the value is inconsistent across process boundaries. Because of this (and my unfamiliarity with the firmware), I took a drastic approach to the synchronization. I could not give myself a warm fuzzy that the restart state issue, along with hotplug races created by GUI-forced restarts, wouldn't create another mess. I could probably force it to happen if I tried by plugging in the drive and messing with the GUI. A bit contrived but seemingly possible.

    A clean alternative would be to have (create) a NAS service to manage everything and then have the hotplug code signal it with SIGHUP.
     
  46. ray123

    ray123 LI Guru Member

    Actually, I think that the original (pre-v34) code would have been fine with just the addition of my 4 line patch. As you say, there are problems with the static variables and the fact that init and each hotplug event are different processes.

    I had a heck of a time making the race condition reliably occur so I could test it. Non-repeatable bugs are danged hard to find/fix! I wound up having to insert additional testing-only code to force the race to happen. Interestingly, though, I couldn't get the bad conf file thing to happen. What did happen was that the smbd process was up okay but the nmbd process was nowhere to be found.

    Anyway, thanks for the analysis. Now I'm anxious to delve (again) into the USB storage subsystem and see if I can find the bug that requires us to serialize the hotplug code. Looks like I'm gonna have to hack my hardware so I can connect up to the serial console, so I can get to the kernel debugger.
     
  47. hypermood

    hypermood LI Guru Member

    Here's the story...

    I stumbled across this problem only after using a USB hard drive and not a USB thumb drive (which worked fine). The drive takes longer to initialize by a few seconds and may explain why you had to add code to reproduce the problem. I also witnessed the nmbd failing to start but could not figure out why nmbd was bombing. The nmbd problem was truly random and only went away after I completely locked down the NAS stuff. I suspect it is due to some side effect which we haven't identified and the lock file masks or eliminates it. If I had to guess, the issue lies in the samba code exiting when two instances are starting at the same time.

    At least we have a better handle on the problem. I'm here to help!
     
  48. door_jam

    door_jam Network Guru Member

    Apparently WL-520GU supports Multifunction Printers, i.e. the scan function as well.
    Does this tomato ND USB Mod firmware support Multifunction Printers, too?
     
  49. weixing

    weixing Addicted to LI Member

    Samba3 daemon run twice

    I'm using a WL500gP with Samba3 installed. Every time the router starts up, there are 2 instances of smbd running, which take up 20 MB of memory. Is it because the 500gP has 2 USB ports and thus the daemon is run twice, or are there issues with my config?
     
  50. hypermood

    hypermood LI Guru Member

    I haven't tested samba3's behavior but the 8634 build may attempt to launch more than one instance of samba simultaneously on startup/reboot. Try the patch.
     
  51. teddy_bear

    teddy_bear Network Guru Member

    Thank you guys for your research and suggestions!
    Precisely! As you noticed, that was not a problem prior to v32 because the start_services() call was protected by the usb lock. However, in v34 I moved usb_unlock() to execute sooner. This was done to overcome some weird behavior of Tomato 1.25 when WAN mode is set to "Static" or "Disabled" - the usb_unlock() had to run before start_wan(), otherwise it didn't work (strangely, that was not an issue in 1.23)... The real problem I should have addressed was to find out why usb_unlock() is not getting called, or doesn't work for some other reason, in this case. However, I was getting lost in the Tomato WAN initialization code, and attempted a seemingly easier approach of moving usb_unlock() up, which, as you guys noticed, created a race condition between init and hotplug. At the time I didn't realize that simultaneous writing to the config files would be an issue, so the only problem with this race I was trying to solve was preventing multiple instances of samba/ftp from starting (and I didn't solve it completely anyway).
    My first thought was to hold off samba/ftp startup until start_services() is called using the technique similar to what Ray suggested with "sys_up" nvram variable - but then I decided to go the other way because setting up nvram could be affected in the same way as usb_unlock().
    This static variable, of course, was never intended to work across process boundaries - it's there to indicate whether or not the current process actually stopped the nas application. So if the application was not running and we didn't stop it, we won't attempt to start it back on "restart"...
    That would work if there was only a single config file that gets written. However, in case of vsftpd there are several files, and to keep them consistent the whole process of preparing the config files (and better yet the whole ftp/samba startup process) needs to be serialized.
    Also, there's a very small chance of another race - between hotplug and service (i.e. "service samba restart") processes which means protecting the init may not be enough. So I think I'll stick with the file lock approach for now - just reuse slightly modified existing usb_lock/usb_unlock functions instead of creating new ones... That hopefully should solve both issues (simultaneous config writing, and starting multiple instances of the applications) with both races - between hotplug and init/service.
     
  52. teddy_bear

    teddy_bear Network Guru Member

    "Apparently"? Have you tried it? Does the scan function actually work?
    If so, please specify the Asus firmware version that has it working, and whether or not you needed to install any additional software on a client to make it work.

    Asus firmware aside, you can use scan functionality with this mod by installing Optware sane-backends package on a router, and sane client app on your client box. Search this thread for details - it has been discussed here.
     
  53. door_jam

    door_jam Network Guru Member

    No, I have not tried it. I was researching on how to do it after I have flashed my WL-520GU with your firmware. Thanks, teddy_bear.

    I am no expert in all this. I have read on Asus' Support page their description of firmware version 3.0.0.8:

    ASUS WL-520GU firmware in English/Traditional Chinese
    ASUS New EZ UI : New ASUS easy user interface. Auto detection, no more setup.
    To get better performance when using All-in-One printer sharing function, please use Utility version 4.0.2.5 with this Firmware version.
    fixed bugs:
    fixed the ajax's function.
    fixed the QIS procedure.
    fixed the Time zone.
    The WAN could detect PPPoE normally when there is MOD in LAN

    teddy_bear, I shall look into your suggestion. Thanks.
     
  54. teddy_bear

    teddy_bear Network Guru Member

    Yep, thanks! Looks like Asus added this recently. They are using Eltima USB over Ethernet software which is unfortunately not free and not open source.

    It should be possible to take the server side part (that runs on the router) from Asus GPL sources and adopt it for this mod. I don't think it's worth the efforts though - it will only work with proprietary Asus printing utility/driver (which is the OEM version of Eltima software) on Windows, and it will take a lot of extra flash space on the router. Besides, it's not like that is the only option - we already can use open source sane-backend to share scan function. But I'll research it a bit more...
     
  55. ghostknife

    ghostknife Addicted to LI Member

    Ray, t_b, I see you guys are focused on something more important, but can you tell me if these script commands should work?
    I was playing with the WL a bit more and using the different drivers for testing, it's a pain to go through everything manually every time i reboot.

    FWIW 4_150_10_29 does seem to work better, several devices connect/negotiate connection faster than with other drivers although they all work. Streamed several radio station to Roku and PC for 2 days straight without disconnect now.
     
  56. toolbox

    toolbox Addicted to LI Member

  57. teddy_bear

    teddy_bear Network Guru Member

    If they work when you enter them manually but not in the Init script, then Init may call them too early in the boot up process. Try to move them into the WANUP script, and/or possibly increase the delay in the sleep command.
     
  58. teddy_bear

    teddy_bear Network Guru Member

    Yep, it's available in the same download location as other builds. Follow the links in the 1st post, and download build that has "vpn" in the name (tomato-1.25-ND-USB-8634-vpn3.4.rar is the latest one).
     
  59. ghostknife

    ghostknife Addicted to LI Member

    Hmm, with the sleep 20 i had it doesn't execute until after NTP update which is the last thing that appears in the log (usually) so maybe it won't work, I will just leave as is and stop playing with it then, thanks.
     
  60. koszpa

    koszpa Addicted to LI Member

    I have been messing with an event triggered start for the wl driver, I have created a script which greps for a specific text given as a parameter in the logs, and when the specific text appears in the logs then quickly replaces the wl driver:

    Code:
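    #!/bin/sh
    # Wait until the text given as $1 appears in the system log,
    # then swap the loaded wl driver for the one stored in /jffs.
    LOG="/var/log/messages"
    
    # Poll once a second; grep -q exits 0 as soon as the text is found.
    until grep -qi -- "$1" "$LOG" 2>/dev/null; do
      sleep 1
    done
    
    echo "Removing current wl driver and loading the replacement..." >> "$LOG"
    rmmod wl
    insmod /jffs/wl_4_150_10_29.o
    echo "wl driver replacement done." >> "$LOG"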
    This script should be started as a new process from the init script like this:
    Code:
    /jffs/start_wl.sh "some string to find in the log" &
    In this example the script is saved as /jffs/start_wl.sh. Unfortunately the results are no better than with the sleep 20 solution: I still have to enter the replacement lines by hand after every startup. :frown:
     
  61. ray123

    ray123 LI Guru Member

    My first thought was to hold off samba/ftp startup until start_services() is called using the technique similar to what Ray suggested with "sys_up" nvram variable - but then I decided to go the other way because setting up nvram could be affected in the same way as usb_unlock().

    Hmmmmm. I don't think that there's a race with the nvram. I think that is well protected. Anyway, after shooting off my 4-liner, I realized that there is still a race condition----it's just moved and the window got smaller. Or is it a deadlock?--I don't remember just now. To solve that, the "nvram_set("sys_up", "1")" has to be moved up and then call usb_lock. Something like that.

    edit: Ah, I remember now. There's a race condition between hotplug process checking sys_up and init process setting it. The solution is: in init: usb_lock, then start_services, then set sys_up, then usb_unlock. This is a very UGLY solution and if one of my junior programmers had brought it to me I'd have tossed it back to him and told him to fix it right.

    This static variable, ... it's there to indicate whether or not the current process actually stopped the nas application. So if the application was not running and we didn't stop it, we won't attempt to start it back on "restart"...
    But it doesn't actually test that. What it does is test if it's running and then act on that. It should not be testing if it is *actually* running, but rather if it *should* be running. This code is a very complicated way of doing not much of anything useful. It doesn't actually do what it is trying to do.

    Oh, and it also has the problem that if--for some reason--either the smbd or the nmbd process isn't running, this code will fail and *never* start them. This happened to me in my testing---smbd was there but nmbd wasn't, so the nas never got restarted like it should have. This code is very fragile. If the conditions that it expects are not exactly right, then it breaks.

    IMHO, the correct description of restart_nas_services is: stop samba, update its config file(s), and then start samba if its nvram enable flag is set. Ditto for ftpd.
    The v34 mod added a complexity: ...only start it up again if both smbd/nmbd processes are currently running...


    That would work if there was only a single config file that gets written. However, in case of vsftpd there are several files, and to keep them consistent the whole process of preparing the config files (and better yet the whole ftp/samba startup process) needs to be serialized.
    Also, there's a very small chance of another race - between hotplug and service (i.e. "service samba restart") processes which means protecting the init may not be enough. So I think I'll stick with the file lock approach for now - just reuse slightly modified existing usb_lock/usb_unlock functions instead of creating new ones... That hopefully should solve both issues (simultaneous config writing, and starting multiple instances of the applications) with both races - between hotplug and init/service.

    Yeah, I thought of the multiple file thing afterwards. :frown:
    BTW, one thing I learned in my career was that there's no such thing as "a very small chance of a race condition". There are only race conditions that haven't bit you yet. And they *never* get solved by throwing complicated code at them. All that does is make them harder to find and make the code hard to understand.

    As I see it, there are 2 problems that need to be solved: P1) prevent usb hotplug from [re]starting nas/samba before all the system services have been started at system startup, and P2) prevent concurrency in stop/start of ftpd & samba. And, of course: P3) make sure that if usbhotplug wanted to restart samba before startup was complete, that samba gets started with the new/correct configuration.

    Also, I'm still trying to find the bug in usb storage that requires usb to be serialized. When/if I find that, the usb lock will go away. (I just had a terrible thought----what if it isn't in the kernel at all! What if it's somewhere in the Tomato nas restart code?)
    I think we need to fix the nas/samba race condition without depending on the usb_lock.

    Here's my current thinking:
    1) Drop back to the v32 code for stop/start samba/ftpd. The v34 mods don't truly solve the problem.

    2) Protect start samba/ftpd with a concurrency lock. As you suggested, reuse the existing usb_lock code. This solves P2. (Maybe keep the v34 do_start_stop_samba function (without the restart logic) and do the lock/unlock in the start_samba function. I don't like all the goto's in hypermood's proposed mod. Either that, or drop completely back to the v32 code and just make sure that there is an unlock before each return.)

    3) I don't like to have nested locks. That way leads to hard-to-detect deadlocks. Therefore, it's dangerous to do samba_lock in code that is inside a usb_lock. It's dangerous (and complex to understand) to depend on usb_lock and samba_lock to interact with one another to accomplish the sequencing.

    4) Assume that the usb-hotplug concurrency bug will someday get fixed, and that the usb_lock will go away.

    5) Use the nvram "sys_up" flag as it was in my 4-line patch. It gets set by init after all the services have been started. It gets tested in restart_nas_services, which returns immediately if it isn't set. This solves P1. This also may be useful in other areas of Tomato, for other applications that may need to know if startup is complete.

    6) Add a new nvram flag "hotplug_nas". This would get set in restart_nas_services, the very first thing. It would get tested in init, right after setting "sys_up". If it is set, then call restart_nas_services. This handles P3.

    There is a very small window where restart_nas might be done twice. The sequence would be: init sets sys_up, hotplug sets hotplug_nas, hotplug sees that sys_up is set and therefore calls restart_nas, init sees that hotplug_nas is set and therefore calls restart_nas. Doing it twice is harmless, though.

    The other ordering would be to check hotplug_nas, set sys_up, and then call restart_nas iff hotplug_nas was set (this means you have to use a temporary variable). Regardless of the sequencing between hotplug & init, restart_nas would get called only once, either by hotplug or init, but not both. Of course, init has to test hotplug_nas *before* setting sys_up, otherwise there's a hole where restart_nas might not get called at all. Too bad there isn't a "nvram_test_and_set" function. :)

    Okay, this is harder to explain than to code. Here is the requisite code:
    Code:
    /* In services.c */
    void restart_nas_services(int start)
    {
       /* Record that a (re)start was requested, so init can replay it later. */
       if (start)
          nvram_set("hotplug_nas", "1");
       /* Don't restart anything if the system isn't up yet. */
       if (!nvram_get_int("sys_up"))
          return;
       /* restart all NAS applications */
    #ifdef TCONFIG_SAMBASRV
       if (start && nvram_get_int("smbd_enable"))
    //...
          ;
    }
    
    
    /*********************************/
    /* In init.c */
    
    {
       nvram_set("sys_up", "0");
       nvram_set("hotplug_nas", "0");
       
       for (;;) {
          // TRACE_PT("main loop state=%d\n", state);
          //...
          
          // ...
          start_vlan();
          start_lan();
          start_wan(BOOT);
          start_services();
          nvram_set("sys_up", "1");
          if (nvram_get_int("hotplug_nas"))
             restart_nas_services(1);
          syslog(LOG_INFO, "Tomato %s", tomato_version);
          syslog(LOG_INFO, "%s", nvram_safe_get("t_model_name"));
          //...
       }
     
  62. teddy_bear

    teddy_bear Network Guru Member

    Thanks Ray - as always, a very detailed and deep analysis of the problem! I do think though that we all are overcomplicating the issue, and I was the one who started this by changing ftp/samba start/stop/restart logic in v34.
    That sequence is exactly what I was trying to get rid of in v34 (and what caused the concurrency problem) - the goal was to move usb_unlock before start_wan (and start_services since it's called after that).

    Anyway, I agree with you on all counts - the race conditions must be solved no matter how "small" the chances of them occurring, the existing code in the start_stop routines added in v34 is ugly and has to be changed to make it actually work as well as to simplify it, the nested locks are dangerous, and I too would like to avoid goto's ;)... And of course I'd like the solution to be as simple as possible - i.e. do not use both locks and nvram flags if there's a way to get by with just one of them.

    So here's how I'm planning to address this now.

    Use the same usb lock to protect not only usb hotplug but also to serialize start/stop routines for nas applications. For that, some small changes to the usb_lock/usb_unlock routines are needed to allow nested calls (i.e. don't acquire the lock if we already hold it, and don't release it unless it's the outermost "unlock" call). Roll the start/stop/restart nas applications logic back to what it was in v32 - no complex checking of the static variable and pidof(). However, protect all start_[samba|ftp]/stop_[samba|ftp]/restart_nas_services calls by the usb lock. Also, handle vsftpd the same way we have handled samba from the beginning - send SIGHUP to vsftpd on starts/restarts instead of killing it (for some reason I thought that vsftpd doesn't handle SIGHUP correctly but apparently that was solved some versions ago), and check pidof() before starting vsftpd to prevent starting multiple instances.

    That's it. No multiple instances and no concurrency in start/stop ftp/samba. On startup it wouldn't matter who starts the samba/ftp first - hotplug or init - in any case they will get SIGHUP signal from the 2nd process to make sure the new/correct settings are picked up. No potential deadlocks issues, no need for new nvram flags, and no other hard to understand logic.
    The above solution doesn't really rely on usb_lock even though for now it's going to use the same lock file. Later on, if we find a way to get rid of the usb lock, we can just remove the lock/unlock calls from around the hotplugging code, but leave them around the start/stop/restart nas routines (of course, rename the "usb" lock to "nas" in that case ;) ).
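
    The "signal it if it's running, start it otherwise" step described above could look roughly like the sketch below. It is only an illustration: reload_or_start() and demo_start() are made-up names, the pid is assumed to come from whatever lookup the firmware uses (the thread mentions pidof()), and the real start routine is stubbed out.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>

/* Stand-in for the code that really launches the daemon (hypothetical).
 * It only counts calls so the sketch can be exercised. */
static int demo_starts;
static int demo_start(void)
{
    demo_starts++;
    return 0;
}

/* pid: pid of the running daemon, or <= 0 if none was found.
 * Returns 0 if an existing instance was sent SIGHUP,
 *         1 if a new instance was started,
 *        -1 on failure. */
int reload_or_start(pid_t pid, int (*start)(void))
{
    /* Signal 0 delivers nothing; it only checks that the process exists. */
    if (pid > 0 && kill(pid, 0) == 0)
        return kill(pid, SIGHUP) == 0 ? 0 : -1;

    /* Nothing running: start exactly one instance. */
    return start() == 0 ? 1 : -1;
}
```

    Because the check and the start are two separate steps, this still needs to run under the lock; without it two processes could both see "not running" and both start the daemon.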

    Do you see any obvious flaws in this approach?
     
  63. hypermood

    hypermood LI Guru Member

    Guys, gotos aren't really evil unless you are using them to dictate logic and algorithmic flow. When they are used for error handling and all point to a single exit label, you don't have to duplicate cleanup code over and over in a necessarily long function.
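
    A minimal sketch of that single-exit pattern, assuming a made-up write_conf() helper and a counter standing in for the real lock (neither is taken from the Tomato sources):

```c
#include <stdio.h>

/* Stand-in for a real lock; a counter makes the unlock easy to verify. */
static int lock_depth;
static void demo_lock(void)   { lock_depth++; }
static void demo_unlock(void) { lock_depth--; }

/* Write `text` to `path` while holding the lock.
 * Every failure jumps to the same label, so the cleanup
 * (close the file, drop the lock) is written exactly once. */
int write_conf(const char *path, const char *text)
{
    FILE *f = NULL;
    int rc = -1;

    demo_lock();

    if (path == NULL || text == NULL)
        goto out;

    f = fopen(path, "w");
    if (f == NULL)
        goto out;

    if (fputs(text, f) == EOF)
        goto out;

    rc = 0;

out:
    if (f != NULL && fclose(f) != 0)
        rc = -1;
    demo_unlock();
    return rc;
}
```

    The alternative without goto is an unlock before each early return, which works too but is easy to forget when another return is added later.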

    teddy_bear, I think you are on the right track. The solution to the race is to implement as stateless a solution as possible by relying on serialization (only as necessary) to avoid complex decision making. If you have to evaluate a condition before you run, then you have to account for every possible pre-condition and post-condition. For this reason I would suggest defining certain serialized sections and rely on the known behaviors that they would enforce.

    With more than one process involved, I don't think you can ever get rid of the lock. In short, if you define more variables to detect/avoid a race, are you racing to define/check a variable? With a lock, the worst that I can see is that you will be left with a circumstance where a routine gets executed twice unnecessarily.

    You guys have done a great job. I'll go sit the bench and wait on the sidelines :thumbup:
     
  64. ghostknife

    ghostknife Addicted to LI Member

    koszpa, I think it's because the manual process also relies on you going into the GUI to press 'save'; without that it doesn't work. I tried having the GUI open and pressing save ASAP after it booted, and it didn't work, so there might only be a small window of opportunity. Since I am not a programmer of any sort I don't know if it's possible to implement save from the script, probably not or ray would have said to.
    Also, did you have yours in the init or WAN up script?
    I only tried init because I run with the WAN (disabled) assigned to LAN (it's behind another router) so maybe this has some effect. Have not played with it any more, just don't reboot, problem solved! TBH I only need to reboot when upgrading/playing with settings anyway so kind of causing my own problem there :)
     
  65. apparissus

    apparissus Addicted to LI Member

    Thanks very much to teddy_bear and this community for making Tomato even better.

    I found a minor bug I thought I'd report: the script that writes /etc/upnp/config seems to assume a /24 network (255.255.255.0 subnet mask) for the 'allow' line. In my case, I traditionally use 10.0.0.0/16 for some networks, and noticed that the upnp config file gets overwritten at boot with the wrong settings.

    Thanks again! ~app
     
  66. ray123

    ray123 LI Guru Member

    Um, sometimes the router reboots all by itself. Such as, when you get a power glitch.

    Doing it in a script is the right thing to do. You've got to be very wary of the timing, though. Startup timing varies all over the place, especially if you have an external USB disc.

    What I would do is something like have a loop where you use lsmod to check for "wl" once a second. When you see it, sleep a few seconds and then insmod the new wl module.
     
  67. teddy_bear

    teddy_bear Network Guru Member

    Yep, the /24 network is hardcoded :frown:... I'll fix it in the next build.

    In the meantime, if you want to permanently override the default miniupnpd config file, copy existing config into a new "/etc/upnp/config.alt" file, modify a new file as you wish, and then save it to nvram (make sure you specify the full path to the file):
    nvram setfile2nvram /etc/upnp/config.alt
    nvram commit

    Then restart miniupnpd to pick up the new settings (service upnp restart). Next time you reboot the router, the /etc/upnp/config.alt file will be automatically restored from the nvram, and will be used by miniupnpd instead of a default one.
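
    For reference, deriving the prefix length from the configured netmask instead of hardcoding /24 takes only a few lines. The sketch below is an illustration, not the actual fix: the function names are made up, and how the result ends up in the miniupnpd config is left out.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Convert a dotted-quad netmask to a prefix length.
 * Returns -1 if the string is not a valid, contiguous netmask. */
int netmask_to_prefix(const char *netmask)
{
    struct in_addr addr;
    uint32_t m;
    int bits = 0;

    if (inet_pton(AF_INET, netmask, &addr) != 1)
        return -1;

    m = ntohl(addr.s_addr);
    while (m & 0x80000000u) {   /* count the leading one bits */
        bits++;
        m <<= 1;
    }
    return m == 0 ? bits : -1;  /* leftover bits mean a non-contiguous mask */
}

/* Write the LAN network as "a.b.c.d/len" into out. Returns 0 on success. */
int format_lan_cidr(const char *ip, const char *netmask, char *out, size_t len)
{
    struct in_addr a, m, net;
    char buf[INET_ADDRSTRLEN];
    int prefix = netmask_to_prefix(netmask);

    if (prefix < 0 || inet_pton(AF_INET, ip, &a) != 1 ||
        inet_pton(AF_INET, netmask, &m) != 1)
        return -1;

    net.s_addr = a.s_addr & m.s_addr;   /* both are in network byte order */
    if (inet_ntop(AF_INET, &net, buf, sizeof(buf)) == NULL)
        return -1;

    snprintf(out, len, "%s/%d", buf, prefix);
    return 0;
}
```

    With a router address of 10.0.1.1 and a 255.255.0.0 mask this yields 10.0.0.0/16, which is the network from the bug report above.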
     
  68. ray123

    ray123 LI Guru Member

    I'd prefer to see a more general ??_lock function rather than overloading the current usb_lock. BTW, locking counts can get pretty complicated, just by themselves. :frown:

    A more general function would pass the byte to lock, so usb would be "lock_get(0)" and nas would be "lock_get(1)". Ditto for unlock. Simple, and not overloaded, and the lock id's and locks are unique.
    That's just a trivial mod of the current usb_lock.
    lock.l_start = arg1; lock.l_len = 1;
    Did I mention it's simple?
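
    Spelled out, the byte-per-lock idea could look like the sketch below. It assumes the lock is a plain fcntl() record lock; the file path and the lock_release() name are made up for the example, and this is not the existing usb_lock code.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical path for illustration; not the lock file Tomato uses. */
#define EXAMPLE_LOCK_FILE "/tmp/example_tomato.lock"

/* Apply F_WRLCK or F_UNLCK to byte `id` of the lock file. */
static int lock_byte(int fd, int id, short type)
{
    struct flock lock;

    memset(&lock, 0, sizeof(lock));
    lock.l_type = type;
    lock.l_whence = SEEK_SET;
    lock.l_start = id;   /* which lock: 0 = usb, 1 = nas, ... */
    lock.l_len = 1;      /* one byte per lock id */
    return fcntl(fd, F_SETLKW, &lock);   /* blocks until the byte is free */
}

/* Take lock `id`. Returns a descriptor for lock_release(), or -1 on error. */
int lock_get(int id)
{
    int fd = open(EXAMPLE_LOCK_FILE, O_CREAT | O_RDWR, 0666);

    if (fd < 0)
        return -1;
    if (lock_byte(fd, id, F_WRLCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Drop lock `id` and close the descriptor returned by lock_get(). */
void lock_release(int fd, int id)
{
    if (fd < 0)
        return;
    lock_byte(fd, id, F_UNLCK);
    close(fd);
}
```

    One caveat with this shape: POSIX record locks belong to the process, and closing any descriptor for the file drops every lock the process holds on it. A process that wanted to hold lock 0 and lock 1 at the same time would have to share a single descriptor rather than open the file twice.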

    Philosophically, I don't like for samba/ftpd to get started by hotplugging before system startup is completed. Hence the way I did it---even though there's a couple of new flags in nvram. I really like the idea of a "sys_up" flag, I think that perhaps it might also be useful for other applications--like maybe the guys who are struggling to write scripts to replace the built-in wl module with another one. The race conditions during Tomato startup are a bitch and a surprise. It sure was a surprise to me when I discovered that mounting an ext3 disc can take as long as 5 minutes!

    Nonetheless, I think that allowing hotplug to [re]start nas before all the services are started will work ok, as long as it's protected with a lock.

    BTW, I finally got a PC loaded with the 2.4.x kernel with the usb & scsi code from Tomato. The idea is to test hotplugging on a platform where I can use a kernel debugger to catch oops's. (What a pain! Ubuntu doesn't like to run with 2.4, nor does the latest Debian. Luckily I had a spare partition on my disk so I could load a very very old version of Debian into it.)
    So far, I can duplicate all of the hotplug functionality of Tomato, except with bash scripts for the mounting & unmounting. Unfortunately, it has never failed yet. Next step is to use the hotplug code from Tomato.
    Not sure how far I'll get before we leave on our 40 day cruise in a couple weeks. :biggrin: :drinking:

    I did find this interesting comment in the old Debian hotplug stuff:
    # ... partial workaround for 2.4 uhci/usb-uhci driver problem: they don't
    # queue control requests, so device drivers can confuse each other if
    # they happen to issue requests at the same time ... it happens easily
    # with slow HID devices and "usbmodules".
    # starting with 2.5 (DEVPATH set), all hcds must queue control traffic.
     
  69. teddy_bear

    teddy_bear Network Guru Member

    Yep, but because they are unique and are applied to different regions of the file, there is still a chance for deadlocks (not that we need to worry about it now since usb locks are never nested within the nas locks - but who knows how it might change in the future)...
    That's why I was going to make a generic ??_lock() function, but call it with the same argument (lock name/id) for usb and nas locks. And no, I'm not going to keep counts of applied locks - just write pid into the lock file, and do nothing (return -1) if it already has pid of the current process...
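
    A rough sketch of the "write the pid, return -1 if it's already ours" idea, to make the nesting behaviour concrete. Everything here is an assumption made for illustration: the names, the lock file path, and the use of an fcntl() lock underneath so that the kernel releases the lock if the owner dies.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define EXAMPLE_NAS_LOCK "/tmp/example_nas.lock"   /* hypothetical path */

/* Kept open while this process owns the lock: closing any descriptor
 * for the file would drop the process's fcntl() locks on it. */
static int lock_fd = -1;

static int set_file_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;   /* l_start = 0, l_len = 0: the whole file */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Returns 0 if the lock was newly taken, -1 if this process already
 * holds it (nested call) or if an error occurred. */
int example_lock(void)
{
    char buf[32];
    ssize_t n;
    int len;

    if (lock_fd >= 0) {
        /* We have the file open: does it carry our own pid? */
        n = pread(lock_fd, buf, sizeof(buf) - 1, 0);
        if (n > 0) {
            buf[n] = '\0';
            if (atol(buf) == (long)getpid())
                return -1;   /* nested call, nothing to do */
        }
    } else {
        lock_fd = open(EXAMPLE_NAS_LOCK, O_CREAT | O_RDWR, 0666);
        if (lock_fd < 0)
            return -1;
    }

    if (set_file_lock(lock_fd, F_WRLCK) < 0) {
        close(lock_fd);
        lock_fd = -1;
        return -1;
    }

    /* Record ownership. A pid left behind by a crashed process is
     * simply overwritten here; only our own pid is ever compared. */
    len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
    if (ftruncate(lock_fd, 0) != 0 || pwrite(lock_fd, buf, (size_t)len, 0) != len) {
        set_file_lock(lock_fd, F_UNLCK);
        close(lock_fd);
        lock_fd = -1;
        return -1;
    }
    return 0;
}

/* Pass in what example_lock() returned; a nested (-1) call is a no-op. */
void example_unlock(int token)
{
    if (token != 0 || lock_fd < 0)
        return;
    if (ftruncate(lock_fd, 0) != 0) {
        /* best effort: a stale pid in the file is harmless */
    }
    set_file_lock(lock_fd, F_UNLCK);
    close(lock_fd);
    lock_fd = -1;
}
```

    The caller keeps the return value of the lock call and hands it to the unlock call, so only the outermost pair actually releases the lock and no count has to be kept.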

    The easiest way for me to reproduce a problem in Tomato kernel was running a "cat /proc/bus/usb/devices" command in a loop while plugging and unplugging usb devices. I made sure the hotplug was not used by clearing /proc/sys/kernel/hotplug. I was getting the whole kernel to lock up this way with rebooting as the only way out... But the issue - if it exists in the kernel - may only be in a MIPS-specific part of the kernel, and if so you won't be able to reproduce it on x86...

    Anyway, hope you'll have a great trip :thumbup:. Where are you going?
     
  70. koszpa

    koszpa Addicted to LI Member

    I have checked with printouts to the log file that the init script runs before anything you can see in the logs. As far as I know there is a wl command in the router which handles everything around the wireless network. It loads the driver and starts broadcasting, but as far as I can see it is very complicated to use and it is hard to find the right order of calls, at least for me. So my idea was to quickly remove the old wl driver and then load my working driver, so that when the system gets to the step where it starts the loaded drivers it would start the driver I loaded previously. In that case there would be no need to click on the Save button in the GUI. Unfortunately this does not work. The drivers don't come up as they should; in fact if I try to disable and enable the wireless network the results are the same. Probably something is not correct in my reasoning, I am sure about that. ;) I can't simply remove and replace the old driver as I wanted to, or it still depends on something I have no idea about. That is why I tried the previously submitted script to change the drivers based on an event in the logs....

    I think the WAN UP script is way too late to change the drivers this way, because they are already running, so the required but unknown wl command, or saving manually in the GUI, cannot be skipped. This is a bit of a dark area for me; maybe someone familiar with it could help us.
     
  71. ray123

    ray123 LI Guru Member


    Okay, I just looked at the code. There seems to be exactly one place where you could reliably automatically swap out the wl driver during startup.

    During sysinit it does: load wl, start_jffs, load_files_from_nvram, run_nvscript("script_init"), start_lan, start_wan. Then long later, the USB drives & /cifs get mounted.

    In start_jffs it executes the "Execute When Mounted" script. That's the place where you could invisibly swap out the wl driver. The new driver would have to be in /jffs, also. Unfortunately, even gzipped the various wl drivers are about 450K, so they won't fit in a router with a 4MB flash.

    What is needed is for somebody to write a new program that would be able to call the start/stop_wan functions in Tomato. This would have to be built in to the rc program. Then you could easily write a script that would reside on a USB drive (or /cifsN) which would stop_wan, rmmod wl, insmod new_wl, start_wan. The latter stuff is very simple with the new mods I submitted a month back. The former is pretty easy, too, but I don't have time to do it right now.
     
  72. teddy_bear

    teddy_bear Network Guru Member

    Isn't it already built-in?
    service net stop
    service net start

    or
    service wan stop
    service wan start
     
  73. ray123

    ray123 LI Guru Member

    And hope that the locker doesn't crash before it does the unlock. :) I think. There's lots of obscure failure cases here. :mad:

    Good--another data point. Easiest way for me to make it fail was to write a script that just fired off hotplug child processes, in a loop to add then remove all 3 USB sticks that were plugged in. Plugging & unplugging the USB cable was a pain, and violated my "principle of creative laziness." This helps confirm my suspicion that the bug is in the kernel. If I don't get somewhere soon, maybe I'll try backporting some of the code from the 2.5 kernel. That comment about "all hcds must queue control traffic" has me concerned.

    And, yes, what I got were hard crashes too. As in "pull the power plug" to reset it.

    Although, this is only an issue when you have more than 1 USB drive. And now that I broke down and splurged $37 on a 160GB usb harddrive I don't really need my sticks anymore. But now it's really bugging me!
    Good thing I'm retired.
     
  74. ray123

    ray123 LI Guru Member

    Oops, forgot....

    We're going to fly to Rome, then cruise the Black Sea and Mediterranean, then transatlantic back to Miami. Greece, France, Italy, Turkey, Russia, Spain, Portugal, etc. 27 ports in all.
     
  75. ray123

    ray123 LI Guru Member

    Could be. I dunno. I don't know what all gets done inside Tomato. It does a lot of stuff hard-coded that a normal Linux system does with init scripts. And anyway, you'd want to affect only the wireless wan, not the dsl/cable wan.

    Somebody will need to experiment and report back. I mainly wanted to tell folks about some of the gotcha's that I see. I can't be a trailblazer here, but will gladly shine a flashlight down the path for others to trod.
     
  76. teddy_bear

    teddy_bear Network Guru Member

    No, shouldn't matter... Because the only value there the current process is concerned about is its own pid. Anything else - i.e. pid left over after the crashed process - is ignored.
    Great! You'll probably visit both of my favorite cities on this trip - St.Pete, Russia, and Amsterdam. But almost 40 days sailing.... Isn't it too much sea ;)?
    Then maybe:
    wl down|up
    or
    wlconf <ifname> up|down
     
  77. teddy_bear

    teddy_bear Network Guru Member

    Update - build 35.

    This is a major update, mostly because among other changes it includes a Linux kernel updated to version 2.4.37 and a new toolchain. Since I only tested it on a wl-520gu router with a DHCP WAN connection, for now this build has EXPERIMENTAL status - use it at your own risk, and until more brave testers have tried it before you, do not flash unless you know how to unbrick your router! Hopefully there will be no serious issues with this build though :)... And if you try it, please report your results here. So here's what's new:
    • Linux kernel is updated to the latest in 2.4.x series version 2.4.37.6.
    • Updated Toolchain: binutils 2.19.1, gcc 3.4.6, uClibc 0.9.29.
    • Automount/unmount improvements: support for .autorun and autostop scripts, deactivate swap on unmount, unmount all mountpoints when requested from the web GUI.
    • Added "Unmount all USB Drives" button action to "Administration -> Buttons".
    • Included FUSE 2.5.3 kernel driver. This allows installing and using Optware ntfs-3g package to mount writable NTFS partitions. Install the latest ntfs-3g Optware package 2009.4.4 on /opt, and automount, autoshare and mounting via GUI features will work for NTFS-formatted partitions.
    • Broadcom wireless driver downgraded to version 4.150.10.29 to solve issues with some Asus wl500gPv2 routers.
    • MiniUPnPd updated to the latest ver. 20090921, and compiled with enabled GENA UPnP events support (was disabled in build 27).
    • Fixed incorrect subnet mask written to MiniUPnPd configuration file.
    • SpeedMod patches updated to the latest version 118 (replaced Jenkins' lookup3 hash with faster MurmurHash 2.0).
    • CIFS is updated to version 1.49 backported from 2.6 kernel: many bug fixes, performance improvements, and security options to allow mounting NTLMv2 and LANMAN (might be required to mount some NAS disks) shares.
    • Proper (hopefully - but untested) detection of D-Link DIR-320 router.
    • Updated IMQ driver to set netfilter hooking behaviour from module parameter.
    • Busybox updated to ver. 1.14.4, with additional patches from 1.15 trunk, compiled with dirname applet, and with support for tainted module checking (required to insmod some extra kernel modules).
    • Dnsmasq is updated to ver. 2.50, enabled TFTP server in Dnsmasq (activate via "Dnsmasq Custom Configuration").
    • Added /proc/sys/net/ipv4/ip_conntrack_count.
    • Solved potential concurrency issues starting samba/ftp introduced in the previous build.
    • Various fixes: OpenSSL security patch (CVE-2008-5077), iptables fixes and enhancements from version 1.3.8, fixes for the Broadcom wireless driver, a few USB driver fixes backported from 2.6.x kernel tree, other minor fixes.
    • Display swap partitions status in the GUI, show labels for NTFS partitions and use them for automount.
    • Optware perl, if installed, should now be able to execute perl scripts without changes.
    • Old Linksys igmprt binary for multicast support is replaced by open source igmpproxy 0.1 application (the one Asus uses in their official firmwares). I had no other choice since the old binary couldn't be used with the new toolchain, and there were no sources. Hopefully the new igmp proxy is at least as good as the old one - but I have no way of testing it as I don't have IPTV or anything else that uses multicast.
    • Cosmetics and code clean-up.
    Known issues with this build (will be fixed in the next update):
    • Changing timeout values on the "Advanced -> Conntrack/Netfilter" page has no effect.
    • JFFS partition size could be determined incorrectly, reducing available JFFS space by no more than ~128 KB.
    Now for those who are wondering why the updated kernel (and still not 2.6).... Well, I got tired of backports from later kernel versions. Updating the kernel (since most work has already been done by others - OpenWRT, DD-WRT and newest Oleg's firmwares - all use 2.4.37 already) was a lot easier, and took much less time than all the previous backports I had to make to get the USB and FS layers usable. Just recently I tried to port the FUSE driver back to kernel 2.4.20 to support ntfs-3g, and although I was able to make it work, it was not stable. So I had a choice between giving up, troubleshooting incompatibility issues between FUSE and the Tomato kernel, continuing to backport code from later kernels, or just updating the whole damn thing - and I chose the easiest approach ;). I also considered 2.6 - but the new Broadcom WL drivers for kernel 2.6 only support the newest wireless chipsets and will not work for our routers, and the 2.6 kernel is huge - it won't fit in 4MB flash anyway. When everyone around here switches to RT-N16 or WNR3500L and Tomato gets dual-band "N" mode support, then we can think about 2.6 - until then we have to be happy with the latest 2.4. It still gives us huge advantages compared to the ancient 2.4.20, such as the huge number of bug fixes and improvements made in the last few years, a wider selection of optional extra modules (e.g. reiser, hfsplus and xfs filesystems, usb serial drivers etc), an easy way to apply current official kernel patches, FUSE driver support that allows using ntfs-3g, and finally the ability to use a newer toolchain (binutils/gcc) that produces a lot smaller binaries. In fact, even after all of the above changes - new kernel (read "more code"), FUSE addition, updated software, an increased size of the updated CIFS driver - the new build is smaller than the previous one - each edition even gained 1 extra available jffs block.

    Links to the firmware binaries are in the 1st post - go to "Experimental" subfolder and download one of the 8735 builds (direct links: main and mirror). The optional extra modules, and additional Samba codepages are available in "extras.tar.gz" and "samba_extra_codepages.tar.gz" (available codepages are 932, 936, 949, 950, 1251) archives in the same folder.

    The complete source code is not yet in the git - I just started uploading it, and will finish it in the next couple days, once I sort out a mess in my local git repository ;).

    USB+VPN merged build is not yet available either - although I was able to merge the VPN mod sources with the updated kernel, it doesn't compile (fails somewhere in the assembly code of the new AES cipher added by fyellin). Hopefully SgtPepperKSU will get interested in updating the kernel in his mod as well (the vanilla Tomato ND kernel and toolchain updates are in the git already), and will make changes to compile it with the new toolchain...
     
  78. mstombs

    mstombs Network Guru Member

    Big respect for that huge upgrade - I will try it on my non-usb WRT54G-TM later!

    If you want it - the source-code for igmprt is in the Linksys GPL source-code for the WRH54G_v1.01.04. I think you will find reference to this in the igmprt binary itself. Note some folk with iptv reported it didn't work anyway!
     
  79. Toastman

    Toastman Super Moderator Staff Member Member

    Ray 123 - have a great vac.
    T/B - Nice work with the newer kernel!

    8735 Lite seems OK on my WRT54GL, so far.

    ** Tried for 24 hours, stable on 54GL - test finished
     
  80. ray123

    ray123 LI Guru Member

    My my, you've been a busy little bee! No wonder you didn't have any posts for a while. Although......my heart was sinking as I read the massive additions & changes. I could just see the jffs size melting away---but then you said that it's even smaller than before. WoW!!!! :thumbup: :thumbup: (As Jim Cramer says, two thumbs up.)
     
  81. apparissus

    apparissus Addicted to LI Member

    teddy_bear: Thanks very much for the help, bug fix, and new kernel! I'll unleash R35 on my wl500gv2 and report back...
     
  82. Cyrix

    Cyrix Addicted to LI Member

    teddy_bear: thanks for the great work!

    Is it possible to put a cpu load and mem usage graph to the next release (i saw this function in open wrt) ?

    Thanks
     
  83. ghostknife

    ghostknife Addicted to LI Member

    Hey thanks for more good work!

    WL-500gPv2 , flash result OK.
    Only just done it but it didn't kill anything so far, upgraded straight from the GUI with no problem, kept all settings and didn't reset NVRAM (yet) so see how that goes.

    EDIT 1:
    as I check this the log is filling up with these errors, will check and reboot it:

    Oct 9 01:02:57 WL500 user.err igmpproxy[840]: There must be at least 2 Vif's where one is upstream.
    Oct 9 01:02:57 WL500 user.err igmpproxy[841]: There must be at least 2 Vif's where one is upstream.

    EDIT 2:
    Keeps filling up with that error only when WAN is disabled and/or assigned to LAN. Don't know what it means really so pass on that, google produced no useful info except a few DD-Wrt/OpenWRT references in serial console logs, but not multiple instances like this.
     
  84. ghostknife

    ghostknife Addicted to LI Member

    And yes I saw this bit, just reporting that obviously there is some 'problem' then when WAN disabled?
     
  85. ray123

    ray123 LI Guru Member

    Can't you get this with some opt package? I would consider this type of thing a "real computer" thing, not a "router" thing----and if one is going to be using it as a real computer, then you've really got to have an external drive and opt packages installed.

    I hate to put extraneous stuff in the router's flash, because that just consumes the extremely limited flash space and takes away from the jffs size.
     
  86. teddy_bear

    teddy_bear Network Guru Member

    Thanks for testing guys!

    I'll look into the igmpproxy problem with disabled WAN - even though mstombs pointed out where to find the igmprt sources so I can roll it back, I also did a bit of googling, and think that igmpproxy should be better overall. ghostknife - what is in your /etc/igmpproxy.conf file when WAN is disabled? By the way, does the original igmprt actually work for you (i.e. does it do something that you have use for)?
     
  87. TexasFlood

    TexasFlood Network Guru Member

    build 35

    WL-500gPv2 , flashed OK, upgraded from the GUI keeping all settings with no NVRAM reset.

    Did this just now, no problems noticed as of yet.

    The change in wireless driver caused no issues for me yet, but didn't solve my WDS sync (or lack of sync rather) either. This might be a ND / non-ND build issue. I could probably confirm this by flashing back to the official Tomato ND build, just not sure I want to right now, :biggrin:

    My main router previously was a WRT54G v2.0 and I was running non-ND official Tomato builds across the board with rock solid WDS and WDS syncing on reboot.

    Since I put in the WL-500gPv2 running teddy bear's version of Tomato as my main router if I reboot the main router then lose contact with one or more WDS satellite routers. Again, looks like a solid connection under the device listing but data isn't flowing - this hasn't changed since the previous build. My workaround for now is to reboot them all at the same time, then they all seem to come back up linked with data flowing fine.
     
  88. teddy_bear

    teddy_bear Network Guru Member

    What Tomato and wireless driver version are you running on your client routers?
    Also, what was the first version of USB mod you ever put on your Asus?
     
  89. ghostknife

    ghostknife Addicted to LI Member

    OK I reset NVRAM and started from scratch, problem occurs when 'Allow multicast' is enabled, turn that on again and the log fills up with the same error.

    quickleave
    phyint vlan1 upstream
    altnet 0.0.0.0/0
    phyint br0 downstream ratelimit 0

    Pass...so probably not :)
     
  90. teddy_bear

    teddy_bear Network Guru Member

    Hmm... Is there any sense in having multicast router running with disabled wan? Where is it going to receive the multicast traffic from?
     
  91. TexasFlood

    TexasFlood Network Guru Member

    Tomato and wireless driver version running on routers WDS connected to the main Asus:

    WR850G v2
    Tomato (beta) Version 1.26 v1.26.1778 (just to get a feel for v1.26)
    wl: 3.90 RC37.0
    wl0: Feb 24 2005 20:22:09 version 3.90.38.0

    WR850G v3
    Tomato Version 1.23 v1.23.1607
    wl: 3.90 RC37.0
    wl0: Feb 24 2005 20:22:09 version 3.90.38.0

    WRT54G v2.0
    Tomato Version 1.23 v1.23.1607
    wl: 3.90 RC37.0
    wl0: Feb 24 2005 20:22:09 version 3.90.38.0

    First version of USB mod I ever put on the Asus router was tomato-ND-USB-8634-Ext
     
  92. ghostknife

    ghostknife Addicted to LI Member

    I wasn't aware and/or didn't have it on for any reason, just was on since whenever I first used tomato probably. Off now.

    Nothing now I rebooted again , file is gone.
     
  93. teddy_bear

    teddy_bear Network Guru Member

    I've heard before about compatibility issues between different versions of wl drivers (or rather it's a problem of the old driver, and was resolved by Broadcom in one of the 4.150.X.X versions). If any of your client routers are capable of running the ND version (if the wl0_corerev nvram variable is 5 or higher, it's ND capable), can you try to flash it with my or Victek's mod with the new wl driver, and see if it resolves the WDS issue for that pair?
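The corerev check can be scripted. This is only a rough sketch: the helper name is_nd_capable is made up for illustration, and the only rule it encodes is the "5 or higher" threshold quoted above. It takes the value as an argument so it can be tried off the router as well.

```shell
# Sketch: decide ND capability from a wl0_corerev value.
# is_nd_capable is a hypothetical helper name, not part of Tomato.
is_nd_capable() {
    corerev="$1"
    # Empty or non-numeric input (e.g. variable not set) is treated as an error.
    case "$corerev" in
        ''|*[!0-9]*) return 2 ;;
    esac
    # Threshold taken from the post: corerev 5 or higher means ND capable.
    [ "$corerev" -ge 5 ]
}

# On the router the value would come from nvram, e.g.:
#   is_nd_capable "$(nvram get wl0_corerev)" && echo "ND capable"
```

A non-zero return means either "not capable" (1) or "could not read a number" (2), so check the value by hand if the result is unexpected.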
     
  94. teddy_bear

    teddy_bear Network Guru Member

    Unless someone lets me know what the multicast router can be used for with disabled wan, I will change it to not start IGMP proxy if wan is disabled or used for LAN.
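Roughly, the intended start condition looks like the sketch below. It is written in shell purely for illustration and is not the actual firmware change; the function name and both arguments are hypothetical stand-ins for whatever the firmware really checks.

```shell
# Hypothetical sketch of the planned start condition, not the real patch.
#   $1 = 1 if "Allow multicast" is enabled, anything else otherwise
#   $2 = WAN mode: "disabled", "lan" (WAN port used for LAN), or any other value
should_start_igmpproxy() {
    # Multicast must be switched on at all.
    [ "$1" = "1" ] || return 1
    # No upstream interface exists in these modes, so do not start the proxy.
    case "$2" in
        disabled|lan) return 1 ;;
    esac
    return 0
}

# Example: should_start_igmpproxy 1 dhcp && echo "start igmpproxy"
```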
     
  95. TexasFlood

    TexasFlood Network Guru Member

    I was wondering if putting ND on the other routers would solve the problem. I couldn't remember what corerev was required or what my routers were. I just checked and they're all corerev 7. If 5 is the minimum, then I should try it. I got the idea to stick with the old driver back when I was running dd-wrt. The newd version of that did work but I wasn't able to set up multiple virtual wireless networks unless I used the old driver. Anyway, gonna give this a try. Thanks for the info.

    * Update: Tried official Tomato 1.23ND first. It loaded OK on all routers but didn't resolve the problem. Guess next I try teddy bear on all and see what that does.

    * 2nd update: I got the Lite version on my WRT54G v2.0 but it didn't seem to help WDS re-establishing after rebooting the main Asus. And even though it's the same corerev, I found flashing tomato-ND-USB-8735-Lite on a Motorola WR850G router is not a good idea. The little red lights came on and they became unresponsive to wired or wireless. I had to tftp the previous firmware to revive them. Looking like I might have to live with the workaround.
     
  96. teddy_bear

    teddy_bear Network Guru Member

    BTW, with the new kernel the clock is dead on on my Asus. It's up for 2 days, and on every ntpc run after the last reboot (every 8 hrs), I'm getting:
    router user.info ntpc[1666]: Time Updated: no change needed
    with a few torrents running. Before it was 5-8 secs every 8 hrs.

    We'll see how long it stays on when I start overloading the router ;).
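For scale, the old drift of 5-8 secs per 8 hrs can be converted to parts per million with a bit of shell arithmetic. This is just a back-of-the-envelope sketch; drift_ppm is a made-up helper name and the result is truncated to a whole number.

```shell
# Sketch: clock drift in ppm from drift seconds and interval hours.
# drift_ppm is a hypothetical helper; integer division truncates the result.
drift_ppm() {
    drift_s="$1"
    interval_h="$2"
    echo $(( drift_s * 1000000 / (interval_h * 3600) ))
}

drift_ppm 5 8   # prints 173
drift_ppm 8 8   # prints 277
```

So the old kernel was off by very roughly 170-280 ppm between ntpc runs.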
     
  97. gingernut

    gingernut LI Guru Member

    This was something I commented with Victek on but never really got a reason behind these WDS reestablishing link problems. The only way for me to have a more or less stable WDS link is to use the non ND builds.

    In the device list the link is there with its RSSI value and all, but with no access.

    What I observed is that it has something to do with WPA encryption, as using no wireless encryption or WEP seems to be fine.
     
  98. teddy_bear

    teddy_bear Network Guru Member

    :frown: Not sure what else to try... Maybe try changing between WEP/WPA/WPA2 modes as gingernut suggested to see if it makes any difference. Or try setting wds_timeout to 0, or to some large value (like 60 secs) - this is controlled by nvram variables wl_wds_timeout and wl0_wds_timeout. Remember to restart wlan or to do nvram commit and reboot after this change.
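To avoid typos when setting both variables, the commands can be generated first and reviewed before running them. A minimal sketch, assuming a POSIX shell; wds_timeout_cmds is a made-up helper that only prints the nvram commands and does not change anything by itself.

```shell
# Sketch: print (not run) the nvram commands for a given WDS timeout in seconds.
# wds_timeout_cmds is a hypothetical helper name.
wds_timeout_cmds() {
    timeout="$1"
    # Accept only a non-negative whole number of seconds.
    case "$timeout" in
        ''|*[!0-9]*) echo "usage: wds_timeout_cmds <seconds>" >&2; return 1 ;;
    esac
    # Both variables named in the post get the same value.
    for var in wl_wds_timeout wl0_wds_timeout; do
        echo "nvram set ${var}=${timeout}"
    done
    echo "nvram commit"
}

wds_timeout_cmds 0
```

Run the printed lines on the router (or pipe them to sh there), then reboot so the change takes effect.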
     
  99. koszpa

    koszpa Addicted to LI Member

    Thank you so much for your work Teddy, this upgrade is really more than awesome.
    Special thanks for the wl driver downgrade, now everything works perfectly with my ASUS wl500gPv2. I have just upgraded to this version, but it seems OK from every angle. Thanks again, good job! :thumbup::thumbup::thumbup:
     
  100. TexasFlood

    TexasFlood Network Guru Member

    Thanks for the suggestions. I tried setting wl_wds_timeout and wl0_wds_timeout both to 45 on all routers and that didn't help, ran out of time before trying them set to 0 but will get to that as well. This is starting to look like a drivers issue not restricted to your version of the firmware, one that I might just have to live with.
     
