OpenWrt Forum Archive

Topic: RB-133 lockup

The content of this topic has been archived on 12 Apr 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

I am building a kernel for the RB1XX from svn HEAD. YAFFS is working and the new interrupt logic appears to be working, but periodically the kernel hangs: the console and the network both stop responding. There is no panic, and my remote syslog server records nothing.

Can anyone provide suggestions on how to debug this scenario? I suspect that this problem is related to the networking and my gut feeling is that the problem is Madwifi/Interrupt related. One way for me to determine the culprit is to remove components (Hardware and Software) until the problem goes away, but this can be very time consuming.

I have been working at revision 7272 for the last couple of weeks and the 2.19 kernel has been solid on the RB-133 and 133/C hardware.

Maybe one of the developers can give me a pointer or two to isolate the problem ... or indicate whether this is a known problem.

Thanks,

Ian

Can you try this with madwifi-0.9.3.1?

The 0.9.3.1 version does not appear to lock up. The problem was easy to reproduce with the madwifi on HEAD, which locked up every 3-4 reboots, but 0.9.3.1 has not locked up once in 20 reboots.

(Last edited by osmosis on 8 Jun 2007, 19:32)

hope nbd is reading the above report ...

Everything was working fine with 0.9.3.1, but I mistakenly went back to the version on HEAD and started to have problems again.

Maybe nbd or someone else can provide some insight as to why a snapshot is being used for Madwifi?

Thanks,

Ian

I guess one of the main goals of open source is to develop and evolve. As long as kamikaze is not yet stable, development and testing are a must, and perhaps more important than stability ... Looking at kamikaze_7.06, madwifi-0.9.2.1 could safely move to madwifi-0.9.3.1.

(Last edited by acoul on 14 Jun 2007, 12:51)

I do understand about the Open Source development process, evolution etc ... I am just trying to understand why a snapshot is being used? There has to be a reason!

Is there a patch somewhere that updates support from 0.9.2.1 to 0.9.3.1? If so, it might be worth attaching it to a Trac ticket and adding the ticket number here so that people can avoid the problem until it gets fixed in svn.

David

I have created a ticket and attached an archive that compiles 0.9.3.1 (thanks to acoul).

Hello,

A bit off-topic, but could someone test the new memory detection code on an RB133/RB133C/RB150/RB153? Florian tested it on a Cellvision CAS-771W and on an RB112, and I tested it on a ZyXEL Prestige 334WT; it works fine on these boards. I would like to know how it behaves on the other RouterBOARDs.

Thanks,
Gabor

I will check on a RB133 and get back to you, thanks

Hi Gabor ... good work!

These are the results of my testing with r7639, which is built entirely from svn (no external changes).

RB-133 with 1 PCI Card - abbreviated dmesg.

loading kernel from nand... OK
setting up elf image... OK
jumping to kernel code
mem_detect: checking for 64MB chip
mem_detect: 1st pattern at 0x200000 is 0x00
mem_detect: 1st pattern at 0x400000 is 0xff
mem_detect: 1st pattern at 0x800000 is 0x00
mem_detect: 1st pattern at 0x1000000 is 0x55
mem_detect: 2nd pattern at 0x1000000 is 0x55
mem_detect: mirrored data found at 0x1000000
mem_detect: 16MB chip found
mem_detect: 16MB memory found
Linux version 2.6.21.5 (ielbury@nbdev) (gcc version 4.1.2) #1 Fri Jun 15 13:17:13 PDT 2007
ADM5120 revision 8, running at 175MHz
Boot loader is: RouterBOOT
Booted from   : NAND flash
Board is      : RouterBOARD 133
Memory size   : 16MB
CPU revision is: 0001800b
ADM5120 board setup
Determined physical RAM map:
 memory: 00d2d000 @ 002d3000 (usable)
Wasting 23136 bytes for tracking 723 unused pages

...

PCI: mapping irq for device 0000:00:01.0, slot:1, pin:1, irq:6

...

RB-133C with 3 PCI Cards - abbreviated dmesg.

loading kernel from nand... OK
setting up elf image... OK
jumping to kernel code
mem_detect: checking for 64MB chip
mem_detect: 1st pattern at 0x200000 is 0x00
mem_detect: 1st pattern at 0x400000 is 0x7f
mem_detect: 1st pattern at 0x800000 is 0x00
mem_detect: 1st pattern at 0x1000000 is 0x00
mem_detect: 1st pattern at 0x2000000 is 0x55
mem_detect: 2nd pattern at 0x2000000 is 0x55
mem_detect: mirrored data found at 0x2000000
mem_detect: 32MB chip found
mem_detect: 32MB memory found
Linux version 2.6.21.5 (ielbury@nbdev) (gcc version 4.1.2) #1 Fri Jun 15 13:17:13 PDT 2007
ADM5120 revision 8, running at 175MHz
Boot loader is: RouterBOOT
Booted from   : NAND flash
Board is      : RouterBOARD 133
Memory size   : 32MB
CPU revision is: 0001800b
ADM5120 board setup
Determined physical RAM map:
 memory: 01d2d000 @ 002d3000 (usable)
Wasting 23136 bytes for tracking 723 unused pages

...

PCI: mapping irq for device 0000:00:01.0, slot:1, pin:1, irq:6
PCI: mapping irq for device 0000:00:02.0, slot:2, pin:1, irq:7
PCI: mapping irq for device 0000:00:03.0, slot:3, pin:1, irq:8

...

ath_pci: 0.9.4.5 (svn r2420)
PCI: Enabling device 0000:00:01.0 (0000 -> 0002)
ath_pci: switching rfkill capability off
ath_pci: ath_pci: switching per-packet transmit powe
wifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 3
wifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
wifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9
wifi0: turboA rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbp
wifi0: turboG rates: 6Mbps 12Mbps 18Mbps 24Mbps 36Mb
wifi0: H/W encryption support: WEP AES AES_CCM TKIP
wifi0: mac 10.5 phy 6.1 radio 6.3
wifi0: Use hw queue 1 for WME_AC_BE traffic
wifi0: Use hw queue 0 for WME_AC_BK traffic
wifi0: Use hw queue 2 for WME_AC_VI traffic
wifi0: Use hw queue 3 for WME_AC_VO traffic
wifi0: Use hw queue 8 for CAB traffic
wifi0: Use hw queue 9 for beacons
wifi0: Atheros 5212: mem=0x11400000, irq=6
PCI: Enabling device 0000:00:02.0 (0000 -> 0002)
wifi%d: request_irq failed

...

RB-133C with 3 PCI Cards - /proc/interrupts

root@TestGateway:/# cat /proc/interrupts
           CPU0
  2:          0            MIPS  cascade [INTC]
  6:          0            MIPS  wifi0
  7:      20767            MIPS  timer
  8:          0            INTC  wifi1
  9:        145            INTC  ADM5120 UART
 17:         29            INTC  ethernet switch

ERR:          0

RB-133C with 3 PCI Cards - iwconfig.

root@TestGateway:/# iwconfig
lo        no wireless extensions.

eth0      no wireless extensions.

eth1      no wireless extensions.

eth2      no wireless extensions.

br-lan    no wireless extensions.

imq0      no wireless extensions.

imq1      no wireless extensions.

wifi0     no wireless extensions.

ath0      IEEE 802.11g  ESSID:"XXXXXX"  Nickname:""
          Mode:Managed  Frequency:2.427 GHz  Access Point: Not-Associated
          Bit Rate:1 Mb/s   Tx-Power:0 dBm   Sensitivity=0/3
          Retry:off   RTS thr:off   Fragment thr:off
          Encryption key:XXXXXXXXX   Security mode:restricted
          Power Management:off
          Link Quality=0/127  Signal level=-256 dBm  Noise level=-256 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Results:

- The memory detection logic is working.
- PCI interrupt mapping is not correct. There is a conflict with the timer IRQ <-> PCI slot 2.
- None of the Wifi PCI cards are working anymore on both the 133 and 133C.

Cheers,

Ian

(Last edited by osmosis on 15 Jun 2007, 22:31)

I made the following change in fixup-adm5120.c and all my WiFi cards now work. As this is NOT a proper fix, I have not created a ticket. Maybe Gabor can verify?

int __init pcibios_map_irq(struct pci_dev *dev, u8 slot, u8 pin)
{
    int irq;

    /* Default: no IRQ for anything outside PCI slots 1..3. */
    irq = -1;

    /* Workaround, not a proper fix: map slots 1..3 to IRQs 11..13. */
    if (slot > 0 && slot < 4) {
        irq = 10 + slot;
    }

    printk(KERN_INFO "PCI: mapping irq for device %s, slot:%u, pin:%u, "
        "irq:%d\n", pci_name(dev), slot, pin, irq);

    return irq;
}

Hello Ian,

Results:

- The memory detection logic is working.
- PCI interrupt mapping is not correct. There is a conflict with the timer IRQ <-> PCI slot 2.
- None of the Wifi PCI cards are working anymore on both the 133 and 133C.

Thank you for the detailed information. The memory detection logic is still broken on some Edimax devices, but that doesn't matter here.

if(slot > 0 && slot < 4) {
        irq = 10 + slot;
    }

The code above assigns irq 11 to slot 1, irq 12 to slot 2, and irq 13 to slot 3. In my opinion this should be something like the code below, because the PCI interrupts are numbered from 14 with the new IRQ handler.

    if(slot > 0 && slot < 4) {
        irq = ADM5120_IRQ_PCI0 + slot-1;
    }

Anyway, there is new code for the PCI IRQ mapping in svn now.

Thanks,
Gabor

The discussion might have continued from here.