OpenWrt Forum Archive

Topic: 32MB flash in TL-MR11U/AR9331-based HW

The content of this topic has been archived between 7 Apr 2018 and 22 Apr 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

Hi,

After some reading in various topics I decided to upgrade the memory chips in my TL-MR11U (the Chinese version of the TL-MR3040). There are good topics on that, and they all stick to a 16 MB flash. I figured I could use a bigger flash, reasoning that the AR9331 only needs the first few MB for booting, after which the kernel takes over and accesses the chip through MTD (which does not rely on a hardware FTL).

Well, the mod to 64 MB DRAM went fine, but the 32 MB flash failed at first because OpenWrt (tplinkpart.c) could not find the TP-Link header.
Odd... but not really. Here's why: the m25p80 MTD driver detects the >16 MB flash and switches the chip to 4-byte addressing mode. That mode, however, is not supported by the hardware FTL in the AR9331. Normally that would not be a problem, because the kernel talks to the chip over SPI (which is how it determined the make and model of the flash chip).
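
To make the addressing problem concrete, here is a minimal sketch (an illustration only, not code taken from the m25p80 driver) of how the header of a read command differs between 3-byte and 4-byte mode. The helper names are made up for this example; 0x03 is the plain READ opcode used by common SPI NOR chips.

```c
#include <stddef.h>
#include <stdint.h>

#define OPCODE_READ 0x03u  /* plain READ command of common SPI NOR chips */

/* Smallest address width (in bytes) that can reach every byte of the chip. */
static unsigned addr_width_for(uint32_t flash_size)
{
    return flash_size > (1u << 24) ? 4u : 3u;
}

/*
 * Build the header of a read command: the opcode followed by the address,
 * most significant byte first. Returns the header length in bytes.
 * `out` must have room for at least 5 bytes.
 */
static size_t build_read_header(uint8_t *out, uint32_t addr, unsigned addr_width)
{
    size_t n = 0;

    out[n++] = OPCODE_READ;
    for (unsigned i = addr_width; i > 0; i--)
        out[n++] = (uint8_t)(addr >> (8 * (i - 1)));
    return n;
}
```

With three address bytes only 2^24 bytes (16 MB) can be addressed, so the top byte of an address such as 0x1ff0000 is simply lost. Once the chip has been switched to 4-byte mode it expects the longer header, and a controller that keeps sending three address bytes no longer reads what it thinks it reads.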

Yet the maintainer of the AR9331 SPI driver optimized SPI access: the driver determines whether a transfer is a data read and, if so, uses memcpy() to fetch the data through the (much faster) hardware FTL.

This is fixable by changing a struct member (is_flash), which makes the driver refrain from memcpy() and use SPI instead. I actually modified the driver to always use the SPI routines. Additionally, I modified the m25p80.c driver to put the chip back into 3-byte addressing mode when the driver is unregistered, so the device can boot after a reset.
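
As a rough user-space sketch of what the is_flash flag does: the driver picks one of two transfer routines per device. The types and function names below are hypothetical stand-ins, not the real spi-ath79.c symbols, and the two routines only return a marker value so the selection can be observed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two transfer paths in the driver. */
typedef int (*txrx_fn)(void *buf, size_t len);

/* Fast path: would copy from the memory-mapped flash window. */
static int txrx_mapped(void *buf, size_t len)  { (void)buf; (void)len; return 1; }

/* Slow path: would clock every bit out over SPI by hand. */
static int txrx_bitbang(void *buf, size_t len) { (void)buf; (void)len; return 2; }

struct controller_data {
    bool is_flash;
};

/* Select the routine the way the unpatched driver does: the fast path
 * is only used for devices flagged as flash. */
static txrx_fn select_txrx(const struct controller_data *cdata)
{
    return cdata->is_flash ? txrx_mapped : txrx_bitbang;
}
```

Clearing is_flash for the flash device, or always returning the bit-bang routine, keeps every access on plain SPI, at the cost of speed.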

So now I have an operational OpenWrt image with 32 MB of flash. There is just one problem: no wireless, because the ath9k driver cannot load the ART data into the hardware.

I did modify U-Boot, so my ART partition now sits where the 64 KB uboot-env used to be. I was hoping that the kernel driver would somehow receive a copy of this ART partition through MTD, but it appears it does not.

At this moment, I am considering two options to get wireless up again:

1) make the ath9k kernel driver retrieve a copy of the MTD partition, store it in kernel memory and use it as a reference, or
2) make U-Boot initialize the wireless; the code for this would be similar to dev-init in ar7240-pci.c. I will try the U-Boot version developed by pepe2k tonight...

I took a look at the ath9k kernel driver, but I can't figure out how it learns to read the EEPROM, or from what location. The "ahb.c" file appears to contain some functions that hint at memory-mapped access, but it receives the base address from somewhere else.

Any insights or suggestions on how to get the ART data into the wireless driver are highly appreciated!

Hi pepe2k,

Thanks for your hints. Yes, I've read those posts and maybe I have misread them. My OpenWrt is running happily with a 32 MB flash. The 16 MB limit applies to using the hardware FTL for accessing the SPI flash, which is why I am not using it, and OpenWrt can address the full 32 MB.

My only challenge seems to be getting the WiFi driver up and running. I can imagine it would be a problem if the radio in the AR9331 is hardwired to use the AR9331 FTL, instead of me loading the ART data into the appropriate registers (as appears to be done in U-Boot's 'ar7240-pci.c') or working from a shadow copy in RAM.

The datasheet does not suggest that the radios look for settings at a predefined memory location. The Atheros sources I have looked at so far seem to suggest the ART is transferred to a set of registers in the radio. What I cannot easily determine is how the ath9k kernel driver deals with this, or whether it would be an option to modify your U-Boot sources so they load the ART into the chip registers while the hardware FTL is still active.

gerritb wrote:

Thanks for your hints. Yes, I've read those posts and maybe I have misread them. My OpenWrt is running happily with a 32 MB flash. The 16 MB limit applies to using the hardware FTL for accessing the SPI flash, which is why I am not using it, and OpenWrt can address the full 32 MB.

I didn't know that there was another way to access SPI devices/memory. Did you run any tests to confirm that you can use all the available space, such as uploading a big file (> 16 MB) over SCP and downloading it back?

On AR9331 with OpenWrt, all I need to do to get WiFi working is to place the ART data at the end of flash, no matter what size of flash I'm using. So I don't think it's limited by the AR9331 FTL. As far as I remember, the ar7240_pci.c file isn't used in TP-Link's U-Boot for AR9331-based devices. In my version I have removed all unnecessary files, so you should probably use the original sources from the TP-Link GPL release for that.

gerritb wrote:

The datasheet does not suggest that the radios are looking for settings at a predefined memory location.

I think it needs to be done in code. I don't know OpenWrt very well, so I also don't know where you should look for the code responsible for that.

I'll give it a go. As for fully using it: jffs formatted all of the remaining 19 MB without complaints, so I have no reason to assume it would not be usable... It is a fair bit slower because it is using bit-banging to access the SPI...

Maybe I will post a question to the ath9k kernel driver developers about this, too...

Could you post the changes you made to run OpenWrt on 32 MB? :)

pepe2k wrote:

Could you post the changes you made to run OpenWrt on 32 MB? :)

Sure - I'll do that later this evening..

pepe2k wrote:

On AR9331 with OpenWrt, all I need to do to get WiFi working is to place the ART data at the end of flash, no matter what size of flash I'm using.

That was my thought, too. But I did not mark that partition 'writeable' at first, and I could only use OpenWrt to burn it. So I decided to move the ART partition to 0x10000, because U-Boot appears to refer to it as well (the BOARDCAL and WLANCAL defines in ap121.c) and U-Boot can't read beyond 16 MB. I did update tplinkpart.c to accommodate this relocation. So maybe I'm going to try just putting a copy in the last sector, too.
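
The constraint behind this relocation can be written down as a small check. This is a sketch under the assumption, taken from the discussion above, that memory-mapped access only covers the first 16 MB of the chip; the function name is made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAPPED_WINDOW_SIZE (16u << 20)  /* 16 MB limit discussed in this thread */

/* True if the region [offset, offset + size) lies entirely inside the
 * part of the flash that memory-mapped access can reach. */
static bool reachable_when_mapped(uint32_t offset, uint32_t size)
{
    return size <= MAPPED_WINDOW_SIZE && offset <= MAPPED_WINDOW_SIZE - size;
}
```

A 64 KB partition at 0x10000 passes this check, while the last 64 KB sector of a 32 MB chip (offset 0x1ff0000) does not.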

gerritb
Script using mtd4 art block - /etc/hotplug.d/firmware/10-ath9k-eeprom

Thanks. I was working on Attitude Adjustment, but if firmware loading is supported on Barrier Breaker, I'll give that a try.

I'm still working on getting the Atheros drivers to work, and I am almost there. I'm just hacking around in the OpenWrt build tree and have not yet found time to use 'quilt' for patching. I changed two files in build-dir/linux-ar71xx/linux-3.3.8/ to get OpenWrt running:

1) drivers/mtd/devices/m25p80.c
=======================================================
static int __devexit m25p_remove(struct spi_device *spi)
{
        struct m25p     *flash = dev_get_drvdata(&spi->dev);
        int             status;

        const struct spi_device_id      *id = spi_get_device_id(spi);

        struct flash_info *info = (void *)id->driver_data;
        // return 3-byte address mode so hardware FTLs do not get confused.

        set_4byte(flash, info->jedec_id, 0);

       /* Clean up MTD stuff */
         <<code continues here.. >>
====================================================
2)  drivers/spi/spi-ath79.c, tail of 'ath79_spi_setup_transfer(..)',
lines 338,340 commented out
====================================================
// if (cdata->is_flash)
//      sp->bitbang.txrx_bufs = ath79_spi_txrx_bufs;
// else
         sp->bitbang.txrx_bufs = spi_bitbang_bufs;

<< code continues here >>

What is the flash chip you use?

Here comes quilt patch with your changes against current trunk:

diff --git a/target/linux/ar71xx/patches-3.10/940-add-32mb-flash-support.patch b/target/linux/ar71xx/patches-3.10/940-add-32mb-flash-support.patch
new file mode 100644
index 0000000..7bca77e
--- /dev/null
+++ b/target/linux/ar71xx/patches-3.10/940-add-32mb-flash-support.patch
@@ -0,0 +1,31 @@
+--- a/drivers/mtd/devices/m25p80.c
++++ b/drivers/mtd/devices/m25p80.c
+@@ -1137,6 +1137,13 @@ static int m25p_remove(struct spi_device
+     struct m25p    *flash = dev_get_drvdata(&spi->dev);
+     int        status;
+ 
++    const struct spi_device_id    *id = spi_get_device_id(spi);
++
++    struct flash_info *info = (void *)id->driver_data;
++    // return 3-byte address mode so harware FTLs do not get confused.
++
++    set_4byte(flash, info->jedec_id, 0);
++
+     /* Clean up MTD stuff. */
+     status = mtd_device_unregister(&flash->mtd);
+     if (status == 0) {
+--- a/drivers/spi/spi-ath79.c
++++ b/drivers/spi/spi-ath79.c
+@@ -335,9 +335,9 @@ static int ath79_spi_setup_transfer(stru
+         return ret;
+ 
+     cdata = spi->controller_data;
+-    if (cdata->is_flash)
+-        sp->bitbang.txrx_bufs = ath79_spi_txrx_bufs;
+-    else
++    // if (cdata->is_flash)
++    //     sp->bitbang.txrx_bufs = ath79_spi_txrx_bufs;
++    // else
+         sp->bitbang.txrx_bufs = spi_bitbang_bufs;
+ 
+     return ret;

I use a Spansion 25FL256S0 in a WSON8 package, ordered from Digi-Key.

.... and it works!

By overwriting the second sector in flash I had also erased the MAC address, which is at offset 0x1FC00. Putting that address back made my AP happy.
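
For anyone who wants to check a flash dump before or after such a change, here is a minimal sketch that prints the six bytes stored at that offset. It assumes the MAC is stored as six raw bytes at 0x1FC00 of the image; the function name and buffer handling are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAC_OFFSET 0x1fc00u  /* offset quoted in the post above */
#define MAC_LEN    6u

/*
 * Format the MAC found in a flash image as "aa:bb:cc:dd:ee:ff".
 * Returns 0 on success, -1 if the image is too short or `out` is
 * smaller than the 18 bytes the text needs (17 characters plus NUL).
 */
static int format_mac(const uint8_t *image, size_t image_len,
                      char *out, size_t out_len)
{
    const uint8_t *m;

    if (image_len < MAC_OFFSET + MAC_LEN || out_len < 18)
        return -1;

    m = image + MAC_OFFSET;
    snprintf(out, out_len, "%02x:%02x:%02x:%02x:%02x:%02x",
             m[0], m[1], m[2], m[3], m[4], m[5]);
    return 0;
}
```

If the sector was erased, the result would be ff:ff:ff:ff:ff:ff, since erased NOR flash reads back as all ones.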

Thanks for the quilt patch, btw.

Wow! Congratulations! You are welcome. Can you post some pictures of how you soldered the WSON8 package?

gerritb

Can you provide a read/write speed test with this setup?
Use the coreutils-dd package (after installing it you must call "/usr/bin/dd" to get the speed measurement),
or the hdparm package (read speed only: "hdparm -Tt /dev/mtd#").

You can see more at this link

I'll share pictures and a speed test later this weekend, but I expect it to be slow because I disabled the tweak in spi-ath79.c.

I soldered the WSON8 by applying solder separately to the pads of the chip and to the PCB pads. Then I put the chip down, soldered one pin, aligned the chip, soldered the opposite pin and then the others. I bought a relatively cheap rework station on eBay (an Atten AT8586), so I used my hot-air gun for that.

16 MB is a hardware limit, and working around a hardware limit in software is hard...
Are you sure the 32 MB flash works now? Have you tested it? If you put a file larger than 16 MB on the board and read it back, is the MD5 still right?

mips wrote:

16 MB is a hardware limit, and working around a hardware limit in software is hard...
Are you sure the 32 MB flash works now? Have you tested it? If you put a file larger than 16 MB on the board and read it back, is the MD5 still right?

login as: root
Server refused our key
root@192.168.1.1's password:


BusyBox v1.19.4 (2013-07-17 23:03:38 CEST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 ATTITUDE ADJUSTMENT (Attitude Adjustment, r37378)
 -----------------------------------------------------
  * 1/4 oz Vodka      Pour all ingredients into mixing
  * 1/4 oz Gin        tin with ice, strain into glass.
  * 1/4 oz Amaretto
  * 1/4 oz Triple sec
  * 1/4 oz Peach schnapps
  * 1/4 oz Sour mix
  * 1 splash Cranberry juice
 -----------------------------------------------------
root@zeepaard:~# md5sum
^C
root@zeepaard:~# df -h
Filesystem                Size      Used Available Use% Mounted on
rootfs                   25.0M    752.0K     24.3M   3% /
/dev/root                 6.0M      6.0M         0 100% /rom
tmpfs                    30.2M    272.0K     29.9M   1% /tmp
tmpfs                   512.0K         0    512.0K   0% /dev
/dev/mtdblock3           25.0M    752.0K     24.3M   3% /overlay
overlayfs:/overlay       25.0M    752.0K     24.3M   3% /
root@zeepaard:~# cd ~
root@zeepaard:~# df -h
Filesystem                Size      Used Available Use% Mounted on
rootfs                   25.0M     16.5M      8.5M  66% /
/dev/root                 6.0M      6.0M         0 100% /rom
tmpfs                    30.2M    380.0K     29.8M   1% /tmp
tmpfs                   512.0K         0    512.0K   0% /dev
/dev/mtdblock3           25.0M     16.5M      8.5M  66% /overlay
overlayfs:/overlay       25.0M     16.5M      8.5M  66% /
root@zeepaard:~# cd root
-ash: cd: can't cd to root
root@zeepaard:~# cd /root
root@zeepaard:~# ls
python-2.7.4.amd64.msi
root@zeepaard:~# md5sum python-2.7.4.amd64.msi
7c44c508a1594a8be8145d172b056b90  python-2.7.4.amd64.msi
root@zeepaard:~# ls -al
drwxr-xr-x    1 root     root             0 Jul 19 20:54 .
drwxr-xr-x    1 root     root             0 Jan  1  1970 ..
-rw-r--r--    1 root     root      16625664 Apr 27 09:14 python-2.7.4.amd64.msi
root@zeepaard:~# md5sum python-2.7.4.amd64.msi
7c44c508a1594a8be8145d172b056b90  python-2.7.4.amd64.msi
root@zeepaard:~# reboot
root@zeepaard:~#
login as: root
Server refused our key
root@192.168.1.1's password:


BusyBox v1.19.4 (2013-07-17 23:03:38 CEST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 ATTITUDE ADJUSTMENT (Attitude Adjustment, r37378)
 -----------------------------------------------------
  * 1/4 oz Vodka      Pour all ingredients into mixing
  * 1/4 oz Gin        tin with ice, strain into glass.
  * 1/4 oz Amaretto
  * 1/4 oz Triple sec
  * 1/4 oz Peach schnapps
  * 1/4 oz Sour mix
  * 1 splash Cranberry juice
 -----------------------------------------------------
root@zeepaard:~# cd /root
root@zeepaard:~# ls
python-2.7.4.amd64.msi
root@zeepaard:~# ls -al
drwxr-xr-x    1 root     root             0 Jul 19 20:54 .
drwxr-xr-x    1 root     root             0 Jan  1  1970 ..
-rw-r--r--    1 root     root      16625664 Apr 27 09:14 python-2.7.4.amd64.msi
root@zeepaard:~# md5sum python-2.7.4.amd64.msi
7c44c508a1594a8be8145d172b056b90  python-2.7.4.amd64.msi
root@zeepaard:~# df -h
Filesystem                Size      Used Available Use% Mounted on
rootfs                   25.0M     16.5M      8.5M  66% /
/dev/root                 6.0M      6.0M         0 100% /rom
tmpfs                    30.2M    264.0K     29.9M   1% /tmp
tmpfs                   512.0K         0    512.0K   0% /dev
/dev/mtdblock3           25.0M     16.5M      8.5M  66% /overlay
overlayfs:/overlay       25.0M     16.5M      8.5M  66% /
root@zeepaard:~#

The MD5 is the same as the one computed by my Ubuntu VM and the one supplied by python.org, as I would have expected. I even power-cycled the router to make sure the file really is in flash.

For testing the speed I used dd (the one provided by BusyBox) and timed it:

root@zeepaard:~# time dd count=14 bs=1M if=/dev/mtd2 of=/dev/null
14+0 records in
14+0 records out
real    0m 18.58s
user    0m 0.00s
sys     0m 1.00s

14 MB in about 19 seconds is in the same range as the numbers found in the MMC hack, roughly 700-800 kB/s. I would expect this, since it is the same code that does the read access.
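
For reference, the arithmetic behind that figure as a tiny sketch. The function name is made up; it assumes dd's "bs=1M" means 1048576 bytes and that 1 kB is 1000 bytes.

```c
/* Average throughput in kB/s, with 1 kB = 1000 bytes. */
static double throughput_kB_per_s(double bytes, double seconds)
{
    return bytes / seconds / 1000.0;
}
```

Calling throughput_kB_per_s(14.0 * 1024 * 1024, 18.58) gives about 790 kB/s for the run above.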

I must admit I patched one other file, ar9003_eeprom.c. It should be done elsewhere, but hey, it works. Again without quilt, from my notes:

// compat-wireless -> drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
// Add these to the includes at the top of the file

#include <linux/kernel.h>
#include <linux/mtd/mtd.h>
#include <linux/err.h>


// == new function

/* Fill mptr with calibration data read from the MTD partition named "art". */
static int ar9300_eeprom_restore_mtd(struct ath_hw *ah, u8 *mptr,
                                       int mdata_size)
{
        struct ath_common *common = ath9k_hw_common(ah);
        int num;
        struct mtd_info *mtd_info = NULL;
        size_t retlen;
        int ret;

        /* Walk the MTD devices until the partition named "art" turns up. */
        for (num = 0; num < 64; num++) {
                mtd_info = get_mtd_device(NULL, num);
                /* get_mtd_device() returns an ERR_PTR, not NULL, on failure */
                if (IS_ERR(mtd_info)) {
                        mtd_info = NULL;
                        continue;
                }

                if (mtd_info->type != MTD_ABSENT &&
                    !strcmp("art", mtd_info->name)) {
                        printk("Using %s for eeprom data..\n", mtd_info->name);
                        break;
                }

                /* Not the one we want: drop the reference and keep looking. */
                put_mtd_device(mtd_info);
                mtd_info = NULL;
        }

        if (mtd_info == NULL) {
                ath_err(common, "mtd restore failed");
                return -EIO;
        }

        /* All that remains now is copying to mptr; the data is read from
         * offset 0x1000 inside the partition. */
        ret = mtd_read(mtd_info, 0x1000, mdata_size, &retlen, mptr);
        put_mtd_device(mtd_info);
        if (ret || retlen != (size_t)mdata_size)
                return -EIO;
        return 0;
}

/// in eeprom_restore_internal(...), lines 3320-3324 or thereabout:

        if (ath9k_hw_use_flash(ah)) {
                u8 txrx;

                ath_err(common, "try mtd restore to %8x .. ", (u32)mptr);
                /* Try MTD first, fall back to the memory-mapped copy. */
                if (ar9300_eeprom_restore_mtd(ah, mptr, mdata_size) < 0) {
                        ath_err(common, "well, that failed..");
                        ar9300_eeprom_restore_flash(ah, mptr, mdata_size);
                }

<< code continues here >>
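
For comparison, the same lookup by partition name can be done from user space by scanning /proc/mtd. The sketch below only parses a single line of that file; the function name is made up, and the line format (device, size and erase size in hex, quoted name) is the usual /proc/mtd layout.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/mtd, for example:
 *     mtd4: 00010000 00010000 "art"
 * If the partition name equals `wanted`, store the device index and the
 * partition size and return 1. Return 0 for any other line, including
 * the header line.
 */
static int match_mtd_line(const char *line, const char *wanted,
                          int *index, unsigned long *size)
{
    int idx;
    unsigned long sz, erasesize;
    char name[65];

    if (sscanf(line, "mtd%d: %lx %lx \"%64[^\"]\"",
               &idx, &sz, &erasesize, name) != 4)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;

    *index = idx;
    *size = sz;
    return 1;
}
```

Feeding it each line of /proc/mtd with wanted set to "art" yields the index of the matching /dev/mtdN device, which is the kind of lookup a hotplug firmware script needs as well.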

(Last edited by gerritb on 20 Jul 2013, 00:19)

And pictures: the album is here.

Close-up of the flash chip:
[image: close-up of the flash chip]

And the full PCB:
[image: full PCB]
The wires you see running across the PCB are the serial interface, attached to the unused pins on the Ethernet port. That way, everything remains intact.

After some reading I discovered that the wifi pineapple is based on the AP121 board, which coincidentally also is similar to this board...

Gerritb, that's cool, happy hacking!

My guess:
The 16 MB flash limit comes from the "address map" design of the AR9331 SoC. If we don't use the memory map to access the flash and use raw SPI instead, we can work around the limit.

The good old way of accessing flash through the memory-mapped address will not work beyond 16 MB, and that causes bugs like this one:

        u8 *ee = (u8 *) KSEG1ADDR(0x1fff1000);

        ath79_init_mac(ath79_eth0_data.mac_addr, mac, 0);

        ath79_register_mdio(0, 0x0);
        ath79_register_eth(0);

        ath79_register_wmac(ee, mac);

The "(u8 *) KSEG1ADDR(0x1fff1000);" automatically maps to the last 64 KB of the flash, no matter whether the flash is 4, 8 or 16 MB, but now it does not work.
When the flash is larger than 16 MB, you can't use this address that the SoC has already mapped.
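
Where the constant 0x1fff1000 comes from can be sketched as follows. This is an illustration under assumptions: the flash window is taken to start at physical address 0x1f000000 and to be 16 MB long, with the calibration data at offset 0x1000 inside the last 64 KB. The macro and function names are made up, except that kseg1addr() mimics what the kernel's KSEG1ADDR() macro does on 32-bit MIPS.

```c
#include <stdint.h>

#define FLASH_MAP_BASE  0x1f000000u  /* assumed physical base of the flash window */
#define FLASH_MAP_SIZE  (16u << 20)  /* 16 MB window */
#define ART_SIZE        0x10000u     /* last 64 KB of the window */
#define CAL_DATA_OFFSET 0x1000u      /* calibration data offset inside ART */
#define KSEG1_BASE      0xa0000000u  /* MIPS32 uncached, unmapped segment */

/* Physical address of the calibration data when ART ends the window. */
static uint32_t art_cal_phys(void)
{
    return FLASH_MAP_BASE + FLASH_MAP_SIZE - ART_SIZE + CAL_DATA_OFFSET;
}

/* Physical address to KSEG1 virtual address, as KSEG1ADDR() does. */
static uint32_t kseg1addr(uint32_t phys)
{
    return (phys & 0x1fffffffu) | KSEG1_BASE;
}
```

Under these assumptions art_cal_phys() returns 0x1fff1000 and kseg1addr() turns it into 0xbfff1000. The window ends at 0x20000000, so nothing beyond the first 16 MB of a larger chip can appear in it.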

But we need to verify the flash access speed: is it the same as before?

mips wrote:

Gerritb, that's cool, happy hacking!

thanx ;-)

Indeed, the patches I made revert the optimizations that let OpenWrt access the flash through the memory map. And indeed, the ath9k driver could not initialize.

I submitted a question to the ath9k-devel mailing list, and the suggestion was not to modify the driver but to change the code that calls ath79_register_wmac(ee, mac). I would like to follow up on that, but currently I'm just too happy with the results and I need the device for my holidays...

Alternatively, I merged the ART partition into the second sector of the flash so I could change the KSEG address to 0x101f1000.

mips wrote:

But we need to verify the flash access speed: is it the same as before?

So, this depends... U-Boot uses the 'memory-mapped' mode. That's done in hardware, so it is very quick at loading and starting the kernel. Once the kernel takes over, access degrades to 'bit-banging', which is CPU-intensive. If you stick with 16 MB flashes, do not use the patch for spi-ath79.c: it would only slow you down...

(Last edited by gerritb on 20 Jul 2013, 02:37)

When choosing a flash chip, do we need to pay attention to the SPI clock frequency, or is it auto-detected?
S25FL064P
– Normal READ (Serial): 40 MHz clock rate
– FAST_READ (Serial): 104 MHz clock rate (maximum)
S25FL064K
– Normal READ (Serial): 33 MHz clock rate
– FAST_READ (Serial): 80 MHz clock rate (maximum)
S25FL127S
– Normal READ (Serial): 50 MHz clock rate
– FAST_READ (Serial): 104 MHz clock rate (maximum)
EN25Q64
– Normal READ (Serial): 50 MHz clock rate
– FAST_READ (Serial): 104 MHz clock rate (maximum)
Are they interchangeable?

(Last edited by alphasparc on 20 Jul 2013, 20:59)

alphasparc wrote:

When choosing a flash chip, do we need to pay attention to the SPI clock frequency, or is it auto-detected?

In practice, all read/write operations on the flash memory (except for the fast read command) are carried out in 'bit-banging' mode, i.e. their speed depends on the CPU speed.

My router has a 400 MHz CPU, and here is the result:

root@OpenWrt:~# cat /sys/kernel/debug/mmc0/clock
25000000
root@OpenWrt:~# /usr/bin/dd count=10M if=/dev/mmcblk0p1 of=/dev/null
27072+0 records in
27072+0 records out
13860864 bytes (14 MB) copied, 20.1252 s, 689 kB/s

root@OpenWrt:~# echo '15000000' > /sys/kernel/debug/mmc0/clock
root@OpenWrt:~# /usr/bin/dd count=10M if=/dev/mmcblk0p1 of=/dev/null
27072+0 records in
27072+0 records out
13860864 bytes (14 MB) copied, 20.0976 s, 690 kB/s

root@OpenWrt:~# echo '10000000' > /sys/kernel/debug/mmc0/clock
root@OpenWrt:~# /usr/bin/dd count=10M if=/dev/mmcblk0p1 of=/dev/null
27072+0 records in
27072+0 records out
13860864 bytes (14 MB) copied, 19.7986 s, 700 kB/s

root@OpenWrt:~# echo '9000000' > /sys/kernel/debug/mmc0/clock
root@OpenWrt:~# /usr/bin/dd count=10M if=/dev/mmcblk0p1 of=/dev/null
27072+0 records in
27072+0 records out
13860864 bytes (14 MB) copied, 20.0985 s, 690 kB/s

root@OpenWrt:~# echo '8000000' > /sys/kernel/debug/mmc0/clock
root@OpenWrt:~# /usr/bin/dd count=10M if=/dev/mmcblk0p1 of=/dev/null
27072+0 records in
27072+0 records out
13860864 bytes (14 MB) copied, 30.0687 s, 461 kB/s
root@OpenWrt:~#

You can see that the real SPI clock tops out at about 9 MHz on a 400 MHz CPU.
At the same time, during reads and writes, the processor is fully loaded.
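
To put the CPU dependence into numbers, here is a back-of-the-envelope sketch. The function name is made up, and it simply divides the CPU clock by the number of payload bits moved per second, ignoring command and protocol overhead.

```c
/* CPU cycles available per transferred payload bit at a given throughput. */
static double cpu_cycles_per_bit(double cpu_hz, double bytes_per_s)
{
    return cpu_hz / (bytes_per_s * 8.0);
}
```

For a 400 MHz CPU moving about 700 kB/s this comes to roughly 71 cycles per bit, all of which the CPU spends toggling and sampling pins in bit-banging mode.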

But this does not apply to a NOR flash chip that supports the fast read function (~3 MB/s over SPI, i.e. a ~33 MHz clock, while my flash chip, an EN25F32, supports up to 100 MHz).
If you disable this function, the rate will be the same (~700 kB/s over SPI on a 400 MHz CPU).

Note also that the write speed depends on the CPU in any case:
the NOR flash chip and the SoC do not support fast writes over SPI.

(Last edited by Dioptimizer on 21 Jul 2013, 00:32)

Ok thanks!