OpenWrt Forum Archive

Topic: kamikaze on WL-HDD currently broken

The content of this topic has been archived on 11 Apr 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

Hello everybody,

I've had some time again to play with the new BCM947xx code on my WL-HDD. Currently, it doesn't get far:

PMON version 5.3.22 [EL], LSI LOGIC Corp. and Broadcom Corp.
 Compiled on Thu Sep 23 15:53:24 2004
CPU type 4710.CPU clock frequency 125 MHz.Avail RAM 16384 KBytes.
NVRAM: MX29LV320T 2Mx16 TopB.
Visit www.carmel.com for updates.

~Rescue Flag disable.
Downloading os image in 3 seconds
Using specified MAC address.
et0: Broadcom BCM47xx 10/100 Mbps Ethernet Controller 3.11.19.0
rtl bug fix linkup!!
MAC Address: 00:11:d8:b7:ea:4a
Opened ethernet
Downloading from ethernet, ^C to abort
Downloading image time out
Boot os from the flash
CRC OK
Uncompressing....done
Doing command call 80001000

Exception Epc=80256f48 Cause=0000801c (DBE)
~Rescue Flag disable.
nvram not supported
PMON> set
Memory allocation error
PMON>

At first, I was suspecting the NVRAM to be broken. However, when booting directly into PMON with held-down reset button, NVRAM access works fine, so the kernel seems to be messing something up in the earliest stages of booting.

I've read through the code of PMON that was available from Asus and the exception handler is still the one from PMON itself. I don't know enough about the address space layout, though, so I can't tell what's mapped to address 0x80256f48. Is that already part of the kernel?
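For orientation: on a 32-bit MIPS CPU the virtual address space is split into fixed segments, and addresses from 0x80000000 to 0x9fffffff (kseg0) map directly onto physical memory without going through the TLB. Below is a minimal Python sketch of that split. The name `classify_mips32` is made up for illustration, and the RAM check assumes that SDRAM starts at physical address 0 on this board.

```python
# Fixed MIPS32 virtual address segments (standard architecture layout).
SEGMENTS = [
    (0x00000000, 0x7FFFFFFF, "kuseg", "mapped through the TLB"),
    (0x80000000, 0x9FFFFFFF, "kseg0", "unmapped, cached"),
    (0xA0000000, 0xBFFFFFFF, "kseg1", "unmapped, uncached"),
    (0xC0000000, 0xFFFFFFFF, "kseg2/kseg3", "mapped through the TLB"),
]

def classify_mips32(vaddr):
    """Return (segment name, description, physical address or None)."""
    for start, end, name, desc in SEGMENTS:
        if start <= vaddr <= end:
            # kseg0 and kseg1 are windows onto the low 512 MB of physical memory.
            phys = vaddr & 0x1FFFFFFF if name in ("kseg0", "kseg1") else None
            return name, desc, phys
    raise ValueError("not a 32-bit address")

# "Avail RAM 16384 KBytes" from the PMON banner.
# Assumption: SDRAM starts at physical address 0.
RAM_BYTES = 16384 * 1024

for addr in (0x80001000, 0x80256F48):
    name, desc, phys = classify_mips32(addr)
    in_ram = phys is not None and phys < RAM_BYTES
    print(f"{addr:#010x}: {name} ({desc}), physical {phys:#x}, inside RAM: {in_ram}")
```

Both the entry point used by "call 80001000" and the reported Epc land in kseg0 and, under the assumption above, inside the 16 MB of RAM. Whether the address belongs to the kernel image itself still has to be checked against the kernel's symbol table.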

When booting into PMON first, I can trigger the same exception by doing

PMON> uncmp
Uncompressing....done
PMON> call 80001000

Exception Epc=80256f48 Cause=0000801c (DBE)
~Rescue Flag disable.
PMON>

I hope this is enough information for you guys to get a grip on this bug.

@nbd: does this happen on your WL-HDD too?

Florian

Ah, I forgot: revision was 6653.

Florian

Please compare this with the information in your build_mipsel/linux/System.map file

Ah, that's where I have to look..

The relevant part is probably this:

80250590 B ssb
80251b40 b nvram_buf
80259b40 b cfe_env
80259b50 b _nvdata

A friend of mine told me that exception type DBE could mean Data Bus Error, so perhaps it's something like a misaligned word access - but I'm just guessing wildly here ;-)
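The name PMON prints in parentheses comes from the ExcCode field of the MIPS Cause register (bits 6..2). Here is a small Python sketch of that decoding, applied to the value from the log above. The table lists only a few of the standard MIPS32 codes, and `decode_cause` is an illustrative name, not something taken from PMON.

```python
# A few standard MIPS32 ExcCode values (Cause register, bits 6..2).
EXC_CODES = {
    0: ("Int", "interrupt"),
    4: ("AdEL", "address error on load or instruction fetch"),
    5: ("AdES", "address error on store"),
    6: ("IBE", "bus error on instruction fetch"),
    7: ("DBE", "bus error on data load or store"),
    8: ("Sys", "syscall"),
    9: ("Bp", "breakpoint"),
    10: ("RI", "reserved instruction"),
}

def decode_cause(cause):
    """Split a Cause value into (ExcCode, short name, meaning, pending interrupt bits)."""
    code = (cause >> 2) & 0x1F
    pending = (cause >> 8) & 0xFF  # IP7..IP0
    name, meaning = EXC_CODES.get(code, ("?", "not in this table"))
    return code, name, meaning, pending

code, name, meaning, pending = decode_cause(0x0000801C)
print(f"ExcCode={code} ({name}: {meaning}), pending interrupts={pending:#04x}")
```

Unaligned accesses are normally reported as AdEL or AdES. DBE means that the bus signalled an error for a data access, for example a load from a physical address with nothing behind it.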

P.S. In order to rule out a physically damaged NVRAM, I flashed an older, known-working image, which worked as expected.

Florian

I just realized: what is Epc doing inside nvram_buf? There shouldn't be any executable code there, right? Or does Epc simply point to the faulty address in this case?
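To check which symbol an address belongs to, one can take the last System.map entry at or below that address. Here is a sketch in Python that uses only the four entries quoted earlier in this thread; a real lookup would read the whole file. `parse_map` and `find_symbol` are illustrative helper names.

```python
import bisect

# Only the lines quoted in this thread, not a complete System.map.
SYSTEM_MAP_EXCERPT = """\
80250590 B ssb
80251b40 b nvram_buf
80259b40 b cfe_env
80259b50 b _nvdata
"""

def parse_map(text):
    """Parse 'address type name' lines into a sorted list of tuples."""
    entries = []
    for line in text.splitlines():
        addr, kind, name = line.split()
        entries.append((int(addr, 16), kind, name))
    return sorted(entries)

def find_symbol(entries, addr):
    """Return (name, type, offset) of the last symbol at or below addr, or None."""
    addrs = [entry[0] for entry in entries]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    start, kind, name = entries[i]
    return name, kind, addr - start

entries = parse_map(SYSTEM_MAP_EXCERPT)
name, kind, offset = find_symbol(entries, 0x80256F48)
print(f"0x80256f48 = {name}+{offset:#x} (type '{kind}')")
# prints: 0x80256f48 = nvram_buf+0x5408 (type 'b')
```

The type letters b and B mark symbols in the BSS (uninitialized data) section, which is consistent with the observation that no executable code should live at that address.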

Which leads me to another question: I would like to contribute a bit more to kernel development, however, reading source code (PMON is mostly uncommented) can get you only so far. Can you give me some pointers to better documentation?

Florian

Hi again,

I've just tried revision 7003 and now all that happens is

...
Boot os from the flash
CRC OK
Uncompressing....done
Doing command call 80001000

After that, nothing happens anymore. I also opened a ticket: https://dev.openwrt.org/ticket/1613

Please allow me to reiterate: I have a WL-HDD with serial console, I don't need it for any "production" tasks, I have (at least some) knowledge about kernel coding and low-level hardware programming and I would be quite willing to help with debugging - I just need some pointers on where to start.. (e.g. whether it would make sense to omit the NVRAM initialization from the kernel and try again).

Yours, Florian

Hi floe,

I'm also running kamikaze on 3 WL-HDD and checked out rev. 7004, but it builds for me fine and everything is working well!

There are still some small bugs in the generated image. For example, in /etc/hosts there is "localhost" with a trailing dot; when you configure the build for your own IP range, the setting ends up in the network file, but commented out with " ' ", so it won't work; and the nfsd script isn't working by default, so you need to copy another script and adapt it to load the nfsd module. But that's it... everything else is working for me...

Regards

Joachim

jhinsch wrote:

I'm also running kamikaze on 3 WL-HDD and checked out rev. 7004, but it builds for me fine and everything is working well!

Good news, though I'm now a bit confused as to why it doesn't work for me.. Are you using the brcm47xx-2.6 target (CONFIG_LINUX_2_6_BRCM47XX), too? If yes, then maybe I have a confused NVRAM.. could you perhaps send me a copy of your NVRAM contents?

Thanks, Yours, Florian

Hi Floe,

Sorry, but I'm on the 2.4 kernel with the WL-HDD target...

Best regards

Joachim

PS: Could it be that you have installed a 160GB HDD from Samsung? I have trouble with one WL-HDD which isn't booting anymore; inside is a brand-new 160GB HDD... so I have to remove it and try again...

Nope.. I'm using a compact flash card. Anyway, this is not a hardware problem, as earlier revisions worked (not well, but they booted at least).
As an update: I've tried a completely fresh checkout of rev. 7031, just to be on the safe side. Still the same (non-)result..

Florian

Hmmm, ok, so if you like, I could provide my NVRAM contents, just to be sure everything is ok... But maybe you could try out the 2.4 kernel with the WL-HDD target?

Thanks, no need for the NVRAM now.. as for 2.4, the USB device support is not quite what I would like.

For the developers: I've tried a kernel with just about every debug feature switched on. The result is slightly different - now I get PMON exceptions again:

Boot os from the flash
CRC OK
Uncompressing....done
Doing command call 80001000

Exception Epc=80300280 Cause=00008010 (AdEL)
~Rescue Flag disable.
PMON> 
PMON> call 80001000

Exception Epc=80304d60 Cause=00008028 (RI)
~Rescue Flag disable.
PMON> call 80001000

Exception Epc=80304d60 Cause=00008028 (RI)
~Rescue Flag disable.
PMON>

The (possibly) relevant parts from System.map:

802ff278 t __build_store_reg
802ff328 T build_clear_page
80300850 T build_copy_page
803028cc t set_ntlb
...
80304ca0 T lockdep_info
80304d60 T lockdep_init
80304ddc t lockdep_proc_init

I'm not sure the second error is meaningful, though, as I've just tried to restart the kernel after the first exception..

Florian

Hi everyone.. just curious if there are any new developments regarding the WL-HDD and brcm47xx-2.6.
Has anybody managed to get this combination to run?

Florian

floe wrote:

... Has anybody managed to get this combination to run?

Yeah, me: both brcm-2.6 (port) and brcm47xx-2.6 with kamikaze_7.06, also switched to kernel 2.6.19.7 from linux-mips-git.
My brcm-2.6 target comes with the WL-HDD RTC working.
The brcm47xx-2.6 one finished just now.
Somewhat lower performance for net (~ -300 kb) and IDE (~ -1.5 MB) with the 2.6 kernel series on this device.

Regards
b.sander

Hm, unfortunately, it still doesn't work for me..
I have taken a completely fresh checkout of revision 7770, selected target brcm47xx-2.6 (CONFIG_LINUX_2_6_BRCM47XX=y) and ran make.
I have then flashed the resulting openwrt-brcm47xx-2.6-squashfs.trx to my WL-HDD via tftp.

The result is still the same:

...
Boot os from the flash
CRC OK
Uncompressing....done
Doing command call 80001000

The device hangs after that, no serial output and no ping responses. Could you perhaps send me a .trx file you have successfully booted or upload it somewhere so that I can try if that at least works with my device?

Thanks, Yours, Florian

floe wrote:

Hm, unfortunately, it still doesn't work for me..
I have taken a completely fresh checkout of revision 7770, selected target brcm47xx-2.6 (CONFIG_LINUX_2_6_BRCM47XX=y) and ran make.
I have then flashed the resulting openwrt-brcm47xx-2.6-squashfs.trx to my WL-HDD via tftp.

Sure, because my mod is not in trunk.

floe wrote:

... Could you perhaps send me a .trx file you have successfully booted or upload it somewhere so that I can try if that at least works with my device? ...

Plz PM me your e-mail addy.

Regards
b.sander

Just wanted to mention that the WL-HDD is back ;-) Thanks, nbd and b.sander and everybody else who participated.
There still seems to be a problem with the serial port, though.. detection doesn't work, judging by the kernel log.

Florian

P.S. Out of sheer curiosity: what exactly was the problem which prevented it from booting?

(Last edited by floe on 10 Jul 2007, 21:17)

Hi all,

just to get down to earth from initial euphoria of my wl-hdd working again with kamikaze...

The serial port blocks after init spawns the serial console (see the console log I attached to ticket 1613), and syslog is flooded with messages about the ash process being killed and restarted.

I insmoded kmod-brcm for wireless and the system froze :-( One can see the last kernel messages with 'insmod <module>; logread', so it can still be investigated.

Yes, I would also like to know what the hell was going on with the previous kamikaze versions.

Andy

Hi

Does this mean that I can do a checkout of the latest version and run it on my wl-hdd? Or should I grab a specific version?

Pretty cool that you guys got it working. I pretty much thought that 4710 was dead in the water.

/R

The last one which I tried and that worked was rev. 7905, but I suppose also the latest one will do.
Some things still don't work, e.g. serial and wireless AFAICT.

Florian
