OpenWrt Forum Archive

Topic: Openwrt build: low bandwidth on 11ac radio module;

The content of this topic has been archived between 14 Apr 2018 and 3 May 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

Hi,

I am experiencing limited bandwidth with a QCA9888 802.11 Wi-Fi module (b/g/n/ac, 2.4GHz or 5GHz).

Running the OEM firmware (which is also based on an earlier version of OpenWrt), iperf reports a throughput of around 320 to 340 Mbits/s.
For example, running one instance of iperf on the radio equipment as server and another on the laptop as client, I get:

root@OpenWrt:/# 
root@OpenWrt:/# iperf -s -i 1 -t 30
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.5.1 port 5001 connected with 192.168.5.3 port 62326
[  4]  0.0- 1.0 sec  36.3 MBytes   305 Mbits/sec
[  4]  1.0- 2.0 sec  36.2 MBytes   303 Mbits/sec
[  4]  2.0- 3.0 sec  37.2 MBytes   312 Mbits/sec
[  4]  3.0- 4.0 sec  37.3 MBytes   313 Mbits/sec
[  4]  4.0- 5.0 sec  37.7 MBytes   316 Mbits/sec
[  4]  5.0- 6.0 sec  38.1 MBytes   320 Mbits/sec
[  4]  6.0- 7.0 sec  37.5 MBytes   315 Mbits/sec
[  4]  7.0- 8.0 sec  38.0 MBytes   319 Mbits/sec
[  4]  8.0- 9.0 sec  37.3 MBytes   313 Mbits/sec
[  4]  9.0-10.0 sec  38.2 MBytes   320 Mbits/sec
[  4]  0.0-10.0 sec   374 MBytes   314 Mbits/sec
^C
root@OpenWrt:/#

If I change the roles (laptop running iperf as server and the radio equipment running iperf as client), I get:

root@OpenWrt:/# 
root@OpenWrt:/# iperf -c 192.168.5.3 -i 1 -t 30
------------------------------------------------------------
Client connecting to 192.168.5.3, TCP port 5001
TCP window size: 21.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.5.1 port 49386 connected with 192.168.5.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  35.5 MBytes   298 Mbits/sec
[  3]  1.0- 2.0 sec  36.1 MBytes   303 Mbits/sec
[  3]  2.0- 3.0 sec  38.9 MBytes   326 Mbits/sec
[  3]  3.0- 4.0 sec  40.9 MBytes   343 Mbits/sec
[  3]  4.0- 5.0 sec  40.3 MBytes   338 Mbits/sec
[  3]  5.0- 6.0 sec  41.0 MBytes   344 Mbits/sec
[  3]  6.0- 7.0 sec  39.5 MBytes   331 Mbits/sec
[  3]  7.0- 8.0 sec  38.4 MBytes   322 Mbits/sec
[  3]  8.0- 9.0 sec  39.6 MBytes   332 Mbits/sec
[  3]  9.0-10.0 sec  40.4 MBytes   339 Mbits/sec
[  3] 10.0-11.0 sec  40.8 MBytes   342 Mbits/sec
[  3] 11.0-12.0 sec  40.6 MBytes   341 Mbits/sec
[  3] 12.0-13.0 sec  39.3 MBytes   329 Mbits/sec
[  3] 13.0-14.0 sec  37.1 MBytes   311 Mbits/sec
[  3] 14.0-15.0 sec  35.9 MBytes   301 Mbits/sec
[  3] 15.0-16.0 sec  37.1 MBytes   311 Mbits/sec
[  3] 16.0-17.0 sec  37.4 MBytes   314 Mbits/sec
[  3] 17.0-18.0 sec  37.9 MBytes   318 Mbits/sec
[  3] 18.0-19.0 sec  38.6 MBytes   324 Mbits/sec
[  3] 19.0-20.0 sec  40.1 MBytes   337 Mbits/sec
[  3] 20.0-21.0 sec  38.4 MBytes   322 Mbits/sec
[  3] 21.0-22.0 sec  40.1 MBytes   337 Mbits/sec
[  3] 22.0-23.0 sec  39.9 MBytes   334 Mbits/sec
[  3] 23.0-24.0 sec  36.5 MBytes   306 Mbits/sec
[  3] 24.0-25.0 sec  40.0 MBytes   336 Mbits/sec
[  3] 25.0-26.0 sec  40.4 MBytes   339 Mbits/sec
[  3] 26.0-27.0 sec  40.3 MBytes   338 Mbits/sec
[  3] 27.0-28.0 sec  40.1 MBytes   337 Mbits/sec
[  3] 28.0-29.0 sec  38.9 MBytes   326 Mbits/sec
[  3] 29.0-30.0 sec  40.3 MBytes   338 Mbits/sec
[  3]  0.0-30.0 sec  1.14 GBytes   327 Mbits/sec
root@OpenWrt:/# 

So, no huge differences here.

When running my custom OpenWrt image, the throughput reported by iperf depends on the client/server roles.
If the radio equipment runs iperf as client and connects to the laptop (which is running iperf as server), the best throughput is around 130 Mbits/s:

root@OpenWrt:system:/# iperf -c 192.168.5.3 -i 1 -t 30
------------------------------------------------------------
Client connecting to 192.168.5.3, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.5.1 port 38153 connected with 192.168.5.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  10.4 MBytes  87.0 Mbits/sec
[  3]  1.0- 2.0 sec  12.1 MBytes   102 Mbits/sec
[  3]  2.0- 3.0 sec  13.1 MBytes   110 Mbits/sec
[  3]  3.0- 4.0 sec  13.4 MBytes   112 Mbits/sec
[  3]  4.0- 5.0 sec  13.4 MBytes   112 Mbits/sec
[  3]  5.0- 6.0 sec  13.2 MBytes   111 Mbits/sec
[  3]  6.0- 7.0 sec  13.4 MBytes   112 Mbits/sec
[  3]  7.0- 8.0 sec  14.2 MBytes   120 Mbits/sec
[  3]  8.0- 9.0 sec  15.1 MBytes   127 Mbits/sec
[  3]  9.0-10.0 sec  15.0 MBytes   126 Mbits/sec
[  3] 10.0-11.0 sec  15.5 MBytes   130 Mbits/sec
[  3] 11.0-12.0 sec  16.5 MBytes   138 Mbits/sec
[  3] 12.0-13.0 sec  16.2 MBytes   136 Mbits/sec
[  3] 13.0-14.0 sec  16.1 MBytes   135 Mbits/sec
[  3] 14.0-15.0 sec  16.5 MBytes   138 Mbits/sec
[  3] 15.0-16.0 sec  16.4 MBytes   137 Mbits/sec
[  3] 16.0-17.0 sec  16.4 MBytes   137 Mbits/sec
[  3] 17.0-18.0 sec  16.1 MBytes   135 Mbits/sec
[  3] 18.0-19.0 sec  16.4 MBytes   137 Mbits/sec
[  3] 19.0-20.0 sec  16.8 MBytes   141 Mbits/sec
[  3] 20.0-21.0 sec  16.1 MBytes   135 Mbits/sec
[  3] 21.0-22.0 sec  16.1 MBytes   135 Mbits/sec
[  3] 22.0-23.0 sec  16.4 MBytes   137 Mbits/sec
[  3] 23.0-24.0 sec  16.1 MBytes   135 Mbits/sec
[  3] 24.0-25.0 sec  16.1 MBytes   135 Mbits/sec
[  3] 25.0-26.0 sec  15.8 MBytes   132 Mbits/sec
[  3] 26.0-27.0 sec  16.0 MBytes   134 Mbits/sec
[  3] 27.0-28.0 sec  16.5 MBytes   138 Mbits/sec
[  3] 28.0-29.0 sec  16.0 MBytes   134 Mbits/sec
[  3] 29.0-30.0 sec  16.6 MBytes   139 Mbits/sec
[  3]  0.0-30.0 sec   458 MBytes   128 Mbits/sec
root@OpenWrt:system:/# 

If we change the iperf roles (the equipment running iperf server and laptop running iperf client), the throughput can reach around 210 Mbits/s.

root@OpenWrt:system:/# iperf -s -i 1 -t 30
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.5.1 port 5001 connected with 192.168.5.3 port 62301
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 1.0 sec  18.5 MBytes   155 Mbits/sec
[  4]  1.0- 2.0 sec  20.1 MBytes   169 Mbits/sec
[  4]  2.0- 3.0 sec  19.0 MBytes   159 Mbits/sec
[  4]  3.0- 4.0 sec  23.1 MBytes   194 Mbits/sec
[  4]  4.0- 5.0 sec  25.1 MBytes   210 Mbits/sec
[  4]  5.0- 6.0 sec  25.6 MBytes   215 Mbits/sec
[  4]  6.0- 7.0 sec  25.1 MBytes   210 Mbits/sec
[  4]  7.0- 8.0 sec  24.8 MBytes   208 Mbits/sec
[  4]  8.0- 9.0 sec  25.6 MBytes   215 Mbits/sec
[  4]  9.0-10.0 sec  25.1 MBytes   211 Mbits/sec
[  4]  0.0-10.0 sec   232 MBytes   195 Mbits/sec
[  4] local 192.168.5.1 port 5001 connected with 192.168.5.3 port 62302
[  4]  0.0- 1.0 sec  24.0 MBytes   201 Mbits/sec
[  4]  1.0- 2.0 sec  25.2 MBytes   211 Mbits/sec
[  4]  2.0- 3.0 sec  25.1 MBytes   210 Mbits/sec
[  4]  3.0- 4.0 sec  25.3 MBytes   212 Mbits/sec
[  4]  4.0- 5.0 sec  25.4 MBytes   213 Mbits/sec
[  4]  5.0- 6.0 sec  23.8 MBytes   200 Mbits/sec
[  4]  6.0- 7.0 sec  24.9 MBytes   209 Mbits/sec
[  4]  7.0- 8.0 sec  24.9 MBytes   209 Mbits/sec
[  4]  8.0- 9.0 sec  24.6 MBytes   206 Mbits/sec
[  4]  9.0-10.0 sec  24.9 MBytes   209 Mbits/sec
[  4]  0.0-10.0 sec   248 MBytes   208 Mbits/sec
[  4] local 192.168.5.1 port 5001 connected with 192.168.5.3 port 62303
[  4]  0.0- 1.0 sec  24.3 MBytes   204 Mbits/sec
[  4]  1.0- 2.0 sec  25.5 MBytes   214 Mbits/sec
[  4]  2.0- 3.0 sec  25.1 MBytes   211 Mbits/sec
[  4]  3.0- 4.0 sec  25.1 MBytes   210 Mbits/sec
[  4]  4.0- 5.0 sec  25.1 MBytes   211 Mbits/sec
[  4]  5.0- 6.0 sec  25.5 MBytes   214 Mbits/sec
[  4]  6.0- 7.0 sec  25.1 MBytes   211 Mbits/sec
[  4]  7.0- 8.0 sec  25.3 MBytes   212 Mbits/sec
[  4]  8.0- 9.0 sec  25.7 MBytes   216 Mbits/sec
[  4]  9.0-10.0 sec  25.2 MBytes   211 Mbits/sec
[  4]  0.0-10.0 sec   252 MBytes   211 Mbits/sec
^C
root@OpenWrt:system:/# 

The radio equipment is always in AP mode (using hostapd) and the laptop connects normally as a client station.
The setup is the same for both scenarios; only the image and the firmware change (OEM uses an old firmware based on three files: bin, opt and ). I've tried to use the same hostapd configuration that I found on the OEM image, but the throughput is even worse (around 60 Mbits/s, if I remember correctly). The hardware is the same and there is no change in the physical position of the radio equipment or the laptop.

I've already tried changing some settings in my hostapd.conf file, but none of them gave better throughput. I know that I can use the iperf window parameter to get slightly better results, but they are just that: slightly better. In any case, when testing the OEM image, the throughput was not affected by this parameter either.

I noticed something strange in the output of the iw dev wlan1 station dump command: the TX bitrate is always 6.0 MBit/s. After some searching, I found several posts, and even the page https://wireless.wiki.kernel.org/en/use … e_is_wrong, saying that this is normal, so I am not paying too much attention to it.
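
To keep an eye on the reported rates while a test runs, the relevant fields can be pulled out of the station dump. This is only a minimal sketch: station_rates is a made-up helper name, and it assumes the dump layout shown later in this post ("Station <MAC> ...", "tx bitrate:", "rx bitrate:").

```shell
# Print "<station MAC> tx=<rate> rx=<rate>" (rates in MBit/s) for each
# station found in `iw dev <iface> station dump` output read from stdin.
# station_rates is a hypothetical helper name, not part of iw.
station_rates() {
    awk '
        function emit() { if (mac != "") print mac, "tx=" tx, "rx=" rx }
        $1 == "Station"                { emit(); mac = $2; tx = rx = "?" }
        $1 == "tx" && $2 == "bitrate:" { tx = $3 }
        $1 == "rx" && $2 == "bitrate:" { rx = $3 }
        END                            { emit() }
    '
}

# Usage on the AP (interface name as used in this post):
#   iw dev wlan1 station dump | station_rates
```

A station whose dump lacks one of the two lines is printed with "?" for that rate, so a missing value is not confused with a low one.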

I also noticed the same issue (low bandwidth) on another radio module, an 802.11ad one, present on the same radio equipment, but it is not clear whether the two are related (that module only reaches ~60Mbits/s when it should reach 1.4Gbits/s, according to the OEM image).

Does anyone have an idea or suggestion about where to look for the source of the problem?

Here is some additional information that I hope will help to understand and find the source of the problem.

root@OpenWrt:system:/# iw dev wlan1 info
Interface wlan1
        ifindex 4
        wdev 0x100000001
        addr 00:a0:c6:00:d9:42
        ssid SysTeam_wlan1_11ac
        type AP
        wiphy 1
        channel 116 (5580 MHz), width: 80 MHz, center1: 5610 MHz
        txpower 23.00 dBm
root@OpenWrt:system:/# 
root@OpenWrt:system:/# 
root@OpenWrt:system:/# iw dev wlan1 station dump
Station 48:45:20:8c:b6:f8 (on wlan1)
        inactive time:  350 ms
        rx bytes:       1595556
        rx packets:     18421
        tx bytes:       508926302
        tx packets:     331945
        tx retries:     0
        tx failed:      1
        signal:         -37 dBm
        signal avg:     -38 dBm
        tx bitrate:     6.0 MBit/s
        rx bitrate:     866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
        authorized:     yes
        authenticated:  yes
        preamble:       long
        WMM/WME:        yes
        MFP:            no
        TDLS peer:      no
        connected time: 83 seconds
root@OpenWrt:system:/# 
root@OpenWrt:system:/# 
root@OpenWrt:system:/# iwinfo dev wlan1 info
No such wireless backend: dev
root@OpenWrt:system:/# iwinfo wlan1 info
wlan1     ESSID: "SystemsTeam_wlan0"
          Access Point: 00:A0:C6:00:D9:42
          Mode: Master  Channel: 116 (5.580 GHz)
          Tx-Power: 23 dBm  Link Quality: 70/70
          Signal: -38 dBm  Noise: -93 dBm
          Bit Rate: 6.0 MBit/s
          Encryption: none
          Type: nl80211  HW Mode(s): 802.11bgnac
          Hardware: 168C:003C 0000:0000 [Qualcomm Atheros QCA9880]
          TX power offset: none
          Frequency offset: none
          Supports VAPs: yes  PHY name: phy1
root@OpenWrt:system:/# iwinfo phy1 info
phy1      ESSID: unknown
          Access Point: 00:00:00:00:00:00
          Mode: Master  Channel: 116 (5.580 GHz)
          Tx-Power: 23 dBm  Link Quality: unknown/70
          Signal: unknown  Noise: unknown
          Bit Rate: unknown
          Encryption: unknown
          Type: nl80211  HW Mode(s): 802.11bgnac
          Hardware: unknown [Generic MAC80211]
          TX power offset: unknown
          Frequency offset: unknown
          Supports VAPs: yes  PHY name: phy1
root@OpenWrt:system:/# iw phy1 info
Wiphy phy1
        max # scan SSIDs: 16
        max scan IEs length: 195 bytes
        max # sched scan SSIDs: 0
        max # match sets: 0
        Retry short limit: 7
        Retry long limit: 4
        Coverage class: 0 (up to 0m)
        Device supports AP-side u-APSD.
        Available Antennas: TX 0x7 RX 0x7
        Configured Antennas: TX 0x7 RX 0x7
        Supported interface modes:
                 * managed
                 * AP
                 * AP/VLAN
                 * monitor
                 * mesh point
        Band 1:
                Capabilities: 0x19ef
                        RX LDPC
                        HT20/HT40
                        SM Power Save disabled
                        RX HT20 SGI
                        RX HT40 SGI
                        TX STBC
                        RX STBC 1-stream
                        Max AMSDU length: 7935 bytes
                        DSSS/CCK HT40
                Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
                Minimum RX AMPDU time spacing: 8 usec (0x06)
                HT TX/RX MCS rate indexes supported: 0-23
                VHT Capabilities (0x338001b2):
                        Max MPDU length: 11454
                        Supported Channel Width: neither 160 nor 80+80
                        RX LDPC
                        short GI (80 MHz)
                        TX STBC
                        RX antenna pattern consistency
                        TX antenna pattern consistency
                VHT RX MCS set:
                        1 streams: MCS 0-9
                        2 streams: MCS 0-9
                        3 streams: MCS 0-9
                        4 streams: not supported
                        5 streams: not supported
                        6 streams: not supported
                        7 streams: not supported
                        8 streams: not supported
                VHT RX highest supported: 0 Mbps
                VHT TX MCS set:
                        1 streams: MCS 0-9
                        2 streams: MCS 0-9
                        3 streams: MCS 0-9
                        4 streams: not supported
                        5 streams: not supported
                        6 streams: not supported
                        7 streams: not supported
                        8 streams: not supported
                VHT TX highest supported: 0 Mbps
                Frequencies:
                        * 2412 MHz [1] (30.0 dBm)
                        * 2417 MHz [2] (30.0 dBm)
                        * 2422 MHz [3] (30.0 dBm)
                        * 2427 MHz [4] (30.0 dBm)
                        * 2432 MHz [5] (30.0 dBm)
                        * 2437 MHz [6] (30.0 dBm)
                        * 2442 MHz [7] (30.0 dBm)
                        * 2447 MHz [8] (30.0 dBm)
                        * 2452 MHz [9] (30.0 dBm)
                        * 2457 MHz [10] (30.0 dBm)
                        * 2462 MHz [11] (30.0 dBm)
                        * 2467 MHz [12] (disabled)
                        * 2472 MHz [13] (disabled)
                        * 2484 MHz [14] (disabled)
        Band 2:
                Capabilities: 0x19ef
                        RX LDPC
                        HT20/HT40
                        SM Power Save disabled
                        RX HT20 SGI
                        RX HT40 SGI
                        TX STBC
                        RX STBC 1-stream
                        Max AMSDU length: 7935 bytes
                        DSSS/CCK HT40
                Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
                Minimum RX AMPDU time spacing: 8 usec (0x06)
                HT TX/RX MCS rate indexes supported: 0-23
                VHT Capabilities (0x338001b2):
                        Max MPDU length: 11454
                        Supported Channel Width: neither 160 nor 80+80
                        RX LDPC
                        short GI (80 MHz)
                        TX STBC
                        RX antenna pattern consistency
                        TX antenna pattern consistency
                VHT RX MCS set:
                        1 streams: MCS 0-9
                        2 streams: MCS 0-9
                        3 streams: MCS 0-9
                        4 streams: not supported
                        5 streams: not supported
                        6 streams: not supported
                        7 streams: not supported
                        8 streams: not supported
                VHT RX highest supported: 0 Mbps
                VHT TX MCS set:
                        1 streams: MCS 0-9
                        2 streams: MCS 0-9
                        3 streams: MCS 0-9
                        4 streams: not supported
                        5 streams: not supported
                        6 streams: not supported
                        7 streams: not supported
                        8 streams: not supported
                VHT TX highest supported: 0 Mbps
                Frequencies:
                        * 5180 MHz [36] (23.0 dBm)
                        * 5200 MHz [40] (23.0 dBm)
                        * 5220 MHz [44] (23.0 dBm)
                        * 5240 MHz [48] (23.0 dBm)
                        * 5260 MHz [52] (23.0 dBm) (radar detection)
                        * 5280 MHz [56] (23.0 dBm) (radar detection)
                        * 5300 MHz [60] (23.0 dBm) (radar detection)
                        * 5320 MHz [64] (23.0 dBm) (radar detection)
                        * 5500 MHz [100] (23.0 dBm) (radar detection)
                        * 5520 MHz [104] (23.0 dBm) (radar detection)
                        * 5540 MHz [108] (23.0 dBm) (radar detection)
                        * 5560 MHz [112] (23.0 dBm) (radar detection)
                        * 5580 MHz [116] (23.0 dBm) (radar detection)
                        * 5600 MHz [120] (23.0 dBm) (radar detection)
                        * 5620 MHz [124] (23.0 dBm) (radar detection)
                        * 5640 MHz [128] (23.0 dBm) (radar detection)
                        * 5660 MHz [132] (23.0 dBm) (radar detection)
                        * 5680 MHz [136] (23.0 dBm) (radar detection)
                        * 5700 MHz [140] (23.0 dBm) (radar detection)
                        * 5720 MHz [144] (23.0 dBm) (radar detection)
                        * 5745 MHz [149] (30.0 dBm)
                        * 5765 MHz [153] (30.0 dBm)
                        * 5785 MHz [157] (30.0 dBm)
                        * 5805 MHz [161] (30.0 dBm)
                        * 5825 MHz [165] (30.0 dBm)
        valid interface combinations:
                 * #{ AP, mesh point } <= 8, #{ managed } <= 1,
                   total <= 8, #channels <= 1, STA/AP BI must match, radar detect widths: { 20 MHz (no HT), 20 MHz, 40}

        HT Capability overrides:
                 * MCS: ff ff ff ff ff ff ff ff ff ff
                 * maximum A-MSDU length
                 * supported channel width
                 * short GI for 40 MHz
                 * max A-MPDU length exponent
                 * min MPDU start spacing
        Device supports VHT-IBSS.
root@OpenWrt:system:/# 

And my hostapd.conf file is as follows:

### hostapd configuration file
ctrl_interface=/var/run/hostapd
interface=wlan1
driver=nl80211
#bridge=br-lan

### IEEE 802.11
ssid=SysTeam_wlan1_11ac
hw_mode=a
channel=0
max_num_sta=128
disassoc_low_ack=1
auth_algs=1
#preamble=1

### DFS
ieee80211h=1
ieee80211d=1
country_code=US

### IEEE 802.11n
ieee80211n=1
ht_capab=[HT40+][LDPC][SHORT-GI-20][SHORT-GI-40][TX-STBC][RX-STBC1][DSSS_CCK-40]

### IEEE 802.11ac
ieee80211ac=1
vht_oper_chwidth=1
vht_capab=[MAX-MPDU-11454][RXLDPC][SHORT-GI-80][TX-STBC-2BY1][RX-STBC-1][MAX-A-MPDU-LEN-EXP7][RX-ANTENNA-PATTERN][TX-ANTENNA-PATTERN]


### WPA/IEEE 802.11i
wpa=0
#wpa_key_mgmt=WPA-PSK
#wpa_passphrase=12345678
#wpa_pairwise=CCMP


### Wi-Fi Protected Setup (WPS)
#wps_state=2
#ap_setup_locked=0
#wps_pin_requests=/var/run/hostapd_wps_pin_requests
#device_name=QCA Access Point
#manufacturer=Qualcomm Atheros
#device_type=6-0050F204-1
#config_methods=virtual_push_button physical_push_button label keypad virtual_display
#pbc_in_m1=1
#ap_pin=12345670
#upnp_iface=br-lan
#eap_server=1


### hostapd event logger configuration
logger_syslog=127
logger_syslog_level=2
logger_stdout=127
logger_stdout_level=2

### WMM
#wmm_enabled=1
#uapsd_advertisement_enabled=1
#wmm_ac_bk_cwmin=4
#wmm_ac_bk_cwmax=10
#wmm_ac_bk_aifs=7
#wmm_ac_bk_txop_limit=0
#wmm_ac_bk_acm=0
#wmm_ac_be_aifs=3
#wmm_ac_be_cwmin=4
#wmm_ac_be_cwmax=10
#wmm_ac_be_txop_limit=0
#wmm_ac_be_acm=0
#wmm_ac_vi_aifs=2
#wmm_ac_vi_cwmin=3
#wmm_ac_vi_cwmax=4
#wmm_ac_vi_txop_limit=94
#wmm_ac_vi_acm=0
#wmm_ac_vo_aifs=2
#wmm_ac_vo_cwmin=2
#wmm_ac_vo_cwmax=3
#wmm_ac_vo_txop_limit=47
#wmm_ac_vo_acm=0


### TX queue parameters
#tx_queue_data3_aifs=7
#tx_queue_data3_cwmin=15
#tx_queue_data3_cwmax=1023
#tx_queue_data3_burst=0
#tx_queue_data2_aifs=3
#tx_queue_data2_cwmin=15
#tx_queue_data2_cwmax=63
#tx_queue_data2_burst=0
#tx_queue_data1_aifs=1
#tx_queue_data1_cwmin=7
#tx_queue_data1_cwmax=15
#tx_queue_data1_burst=3.0
#tx_queue_data0_aifs=1
#tx_queue_data0_cwmin=3
#tx_queue_data0_cwmax=7
#tx_queue_data0_burst=1.5

My custom OpenWrt image targets an IPQ806x platform and is based on Bleeding Edge, r49395 (kernel 4.1.23). The custom image uses compat-wireless-2016-01-10-1-r49395 (ath10k with ACS and N support enabled).
Both images use iperf v2.0.5 (8/Jul/2010).

Kind regards

(Last edited by sjuliao on 23 Sep 2016, 11:09)

Hi,

A few more details regarding this problem.

I've tested with the ath10k example hostapd conf files, as well as other conf files and settings that were reported to work in other posts and Linux driver threads, always getting the same results or even worse values.


I've already tested with different firmware versions, namely:

QCA9888:
- firmware-5.bin_10.4-3.2-00072 and board-2.bin (error: invalid fw magic number)

QCA988x:
- firmware-2.bin_999.999.0 (disabling DFS in hostapd.conf; ~43.5MBits/s with RadioEquipment running iperf as client and laptop with iperf server);
- firmware-3.bin_10.2-00082-4-2 (*)
- firmware-4.bin_10.2.4.48 (*)
- firmware-5.bin_10.2.4.70.54  (error: invalid fw magic number)
- firmware-5.bin_10.2.4.70-2  (error: invalid fw magic number)
- firmware-5.bin_10.2.4.70.13-2 (*) - this is the firmware currently in use (and the one used for the measurements in the first post)
(*) - These versions give basically the same measurements described in the first post.



I've followed the suggestions given on https://wireless.wiki.kernel.org/en/use … s_wrong.3F to ensure the best throughput, namely:
- using 80MHz channels;
- disabling the ATH10K DEBUG options;
- kernel without debug (not even the boot log or output from insmod commands, except errors);
- correct hostapd configuration (at least, to the best of my knowledge...)

The bandwidth reported by iperf has improved by about 20MBits/s; my most recent results are:
* Test1 - Radio equipment running Iperf.Client and Laptop Iperf.Server: ~150MBits/s
* Test2 - Radio equipment running Iperf.Server and Laptop Iperf.Client: ~230MBits/s

Also, I've taken a look at the load averages during the tests.
For test1, the typical values I get on average for 60s are

Mem: 60728K used, 420212K free, 24K shrd, 0K buff, 21156K cached
CPU:   0% usr  53% sys   0% nic  44% idle   0% io   0% irq   1% sirq
Load average: 1.62 1.43 0.83 2/57 780
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
  779   133 root     R     1136   0%  52% iperf -c 192.168.5.3 -i 1 -t 80
   22     2 root     DW       0   0%   2% [kworker/0:1]
  780   133 root     R     1084   0%   1% top -b -d 1 -n 60
    3     2 root     SW       0   0%   1% [ksoftirqd/0]
  769     1 root     S     1656   0%   0% hostapd /etc/hostapd/hostapd_wlan1_11ac.conf

For Test2, the typical values I get on average for 60s are

Mem: 57492K used, 423448K free, 24K shrd, 0K buff, 20708K cached
CPU:   0% usr  39% sys   0% nic   8% idle   0% io   0% irq  51% sirq
Load average: 2.12 1.28 0.58 1/56 773
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
  770   133 root     R     1292   0%  43% iperf -s
    3     2 root     RW       0   0%  42% [ksoftirqd/0]
   22     2 root     DW       0   0%   4% [kworker/0:1]
  773   133 root     R     1084   0%   1% top -b -d 1 -n 60
  568     1 root     R     1084   0%   1% /usr/sbin/ntpd -n -S /usr/sbin/ntpd-hotplug -p 0.openwrt.pool.ntp.org -p 1.openwrt.pool.ntp.org -p 2.openwrt.pool.ntp.org -p 3.openwrt.pool.ntp.org
    7     2 root     SW       0   0%   1% [rcu_preempt]
   44     2 root     SW       0   0%   1% [kworker/0:2]
  769     1 root     S     1656   0%   0% hostapd /etc/hostapd/hostapd_wlan1_11ac.conf

It is evident that when running iperf as server on the radio equipment, the CPU idle % is very low (usually less than 20%), whereas for the other test the CPU idle is always > 40% (despite the throughput being worse).
The load average in both tests starts around 0.9/1 and typically reaches 2.0 after 60s (test2 is a little more demanding and typically reaches 2.3). The IPQ8064 features a dual-core SMP Qualcomm Krait CPU @ 1.4GHz with an ARMv7 ISA.
I've not yet measured typical load averages for the same test scenarios with the OEM image.
But from my analysis of the custom image, at least for test1, it is clear that the problem is not related to CPU overload.
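
One caveat worth checking: if the top summary line is an average over both cores (the ~50% figures on a dual-core would be consistent with that), a single saturated core could hide behind a 40-50% idle value. A per-core breakdown can be computed from two /proc/stat snapshots. This is a rough sketch, not a proven tool: per_core_busy is a made-up helper name, and it assumes the usual /proc/stat column order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# Print the busy percentage of each core between two /proc/stat snapshots.
# Args: two files holding copies of /proc/stat taken some seconds apart.
# per_core_busy is a hypothetical helper name.
per_core_busy() {
    awk '
        # Only per-core lines (cpu0, cpu1, ...), not the aggregate "cpu" line.
        $1 ~ /^cpu[0-9]+$/ {
            total = 0
            # Columns 2..9: user nice system idle iowait irq softirq steal.
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            idle = $5 + $6            # idle + iowait
            if (FNR == NR) { t0[$1] = total; i0[$1] = idle; next }
            dt = total - t0[$1]; di = idle - i0[$1]
            if (dt > 0) printf "%s %d%%\n", $1, (100 * (dt - di)) / dt
        }
    ' "$1" "$2"
}

# Usage while iperf is running (file names are arbitrary):
#   cat /proc/stat > /tmp/stat0; sleep 10; cat /proc/stat > /tmp/stat1
#   per_core_busy /tmp/stat0 /tmp/stat1
```

If one core shows close to 100% while the other is almost idle, the averaged idle figure would be misleading for this test.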


Has anyone experienced the same low bandwidth with a QCA988x?
Does anyone have equipment using this chip and could share the firmware version in use, hostapd settings, OpenWrt build and Linux kernel versions?

Could it be related to the ath10k driver? Is there something that I could optimize in ath10k that could potentially increase the bandwidth?
I am aware of the MPDU and MSDU but I think I am already trying to use them correctly in my hostapd.conf...

(Last edited by sjuliao on 21 Sep 2016, 23:02)

Your router's CPU isn't fast enough. Connect a fast machine to your router and use it as iperf server/client and repeat your tests

You mean using two different computers connected through the router? I will try that.

But that does not explain why the OEM image achieves higher and TX/RX-consistent throughput, while the OpenWrt image has this low bandwidth and a discrepancy between TX and RX.
Could it be due to the way the ath10k driver was implemented in these recent versions?

Hi,

As suggested by the user azapfig, I tested the router as just an AP, connecting two laptops and running iperf only on the clients.
One of the laptops is the one used in the previous tests (described in previous posts) and has an 11ac module (with just two antennas, but it detects the connection as capable of achieving 866.7MBits/s - in optimal conditions!!!). This laptop will be labeled LP_Fixed, as it was the same for all tests.
For the other endpoint I've used two different laptops, each with only 11b/g/n (I had no other 11ac device at the moment). However, the connection was detected as capable of achieving 300MBits/s, which was enough to check for any improvements. These were used one at a time and I will label them LP1 and LP2.

Using LP1 last night, the bandwidth achieved was around 100MBits/s and stayed consistent when swapping the iperf roles between the two endpoints.
Using LP2, the typical bandwidth achieved was around 92MBits/s. I wouldn't call this difference significant, as they were tested in different environments, under different conditions (and the laptop specifications are also different). For simplicity, let's assume both bandwidths were 100MBits/s.

This poor performance happens because the connection between LP1 or LP2 to the router is only able to achieve...exactly...100MBits/s!

Regarding the router resources, while performing the iperf test between endpoints, the CPU idle % was always above 70%; the load average over 60s was ~1.20 processes (for 2 processors).

Looking at these tests, and since neither the idle % nor the load average is at critical values, I would not suspect the router's processor.
But I am not entirely sure that it is running at its maximum frequency, and I would like to confirm this.
On the original image, both cores are initialized @ 800MHz with an "ondemand" governor policy, and I've confirmed that, when required, both CPUs scale their frequency to 1.4GHz (in this situation, idle % is always >50% and the load average observed was around 1.6 processes for 60s).
On my image, the governor policy was "performance", but I have since changed it to "ondemand" as well. According to the boot log, CPU0 is initialized @800MHz and CPU1 @385MHz, but I don't know at which frequency they run later because I have no /sys/bus/cpu/devices/cpu*/cpufreq/cpu_freq_cur/min/max (these are available on the OEM image) and /proc/cpuinfo has almost no information (compared with the cpuinfo available on the OEM image).
There is no frequency in either cpuinfo (and even if it were present, it would not be much help since it does not reflect dynamic frequency scaling), but there is a small BogoMIPS difference between them (12.56 on the OEM image; 12.50 on my image).

Is there a package that can be selected/installed that provides the /sys/bus/cpu/devices/cpu*/cpufreq/cpu_freq_cur|min|max information?
Is there any module/package available on OpenWrt that can report the frequency the processors are running at a given moment?
I could write a small program that performs a specific, deterministic action and measures its running time, but I would prefer to use a package/module that has been proven reliable for this task.
I've already confirmed that my device tree enumerates a list of frequencies for each core and that they are correct, as they respect the min and max freq of the OEM image... I just need to be sure that these lists are correctly applied at runtime.
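
For reference, on kernels built with the generic cpufreq framework the per-core files usually live under /sys/devices/system/cpu/cpuN/cpufreq/ (scaling_governor, scaling_cur_freq), which is a different path from the OEM one quoted above. Whether my 4.1 build exposes them is an assumption I still have to verify. A minimal sketch to check (show_cpufreq is a made-up helper name):

```shell
# Print governor and current frequency (kHz) for every core that exposes
# the generic cpufreq sysfs files. The optional argument overrides the
# sysfs root and exists only to make the function testable.
# show_cpufreq is a hypothetical helper name.
show_cpufreq() {
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*; do
        [ -d "$dir" ] || continue
        if [ -r "$dir/cpufreq/scaling_cur_freq" ]; then
            gov=$(cat "$dir/cpufreq/scaling_governor" 2>/dev/null)
            cur=$(cat "$dir/cpufreq/scaling_cur_freq")
            echo "${dir##*/} ${gov:-unknown} $cur kHz"
        else
            echo "${dir##*/} no cpufreq support exposed"
        fi
    done
}

# Usage on the router, ideally in a loop while iperf is running:
#   show_cpufreq
```

If every core reports "no cpufreq support exposed", the frequency table from the device tree is probably not being picked up by a cpufreq driver at all, which would point at the kernel configuration rather than at a missing package.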

Regarding the available QCA988x firmware, I've tested all the API3, API4 and API5 versions available in my local folder without success. Since it is neither the firmware nor the processor, I think the most probable source of the problem is the ath10k driver...

(Last edited by sjuliao on 23 Sep 2016, 08:27)

Post /etc/config/wireless. Maybe there's something there. I'm not a WiFi guru, but those are present here, maybe there's something in your WiFi config.

Hi stangri,

The frequencies that can be found in /etc/config/wireless (if any are listed, since they may not be specified) are the channel frequencies on which the different radio modules operate.
In my case, that file was created using "wifi detect", right after installing and loading the drivers for the radio modules. Later I made some changes there, but I don't think the problem is there, because whether I use "wifi up" (which uses those settings) or launch hostapd with a specific conf file, the low bandwidth is still present. I've also tried the default, simple 11ac hostapd configuration provided in
https://wireless.wiki.kernel.org/en/use … figuration

interface=wlan0
driver=nl80211

ssid=ath10k-test

hw_mode=a
channel=36
ht_capab=[HT40+]
ieee80211n=1
ieee80211ac=1
vht_oper_chwidth=1
vht_oper_centr_freq_seg0_idx=42

and the problem is still present. Unless I have a stupid error in both conf files... but according to many other posts that I've checked, these configurations have worked well for other users. The ones that experienced the same problem were using the CC build (which does not support IPQ806x, and I don't want to lose time adding support to a previous OpenWrt version), solved the problem by changing the firmware version, or are still without an answer to their problem (as I am).
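
As a sanity check on the channel settings: vht_oper_centr_freq_seg0_idx must be the centre channel of the 80 MHz block containing the primary channel (42 for channel 36 in the sample above). The sketch below computes it for the usual 5 GHz 80 MHz blocks (36-48, 52-64, 100-112, 116-128, 132-144, 149-161). vht80_center is a made-up helper name, and the function does not check that the input is a valid primary channel or that it is allowed in the regulatory domain.

```shell
# Print the 80 MHz centre channel index for a 5 GHz primary channel.
# vht80_center is a hypothetical helper name.
vht80_center() {
    ch=$1
    if   [ "$ch" -ge 36 ]  && [ "$ch" -le 64 ];  then start=36
    elif [ "$ch" -ge 100 ] && [ "$ch" -le 144 ]; then start=100
    elif [ "$ch" -ge 149 ] && [ "$ch" -le 161 ]; then start=149
    else
        echo "no 80 MHz block for channel $ch" >&2
        return 1
    fi
    # Each block spans 16 channel numbers; its centre sits 6 above its start.
    echo $(( start + (ch - start) / 16 * 16 + 6 ))
}

# Usage:
#   vht80_center 36     # value for the sample configuration above
#   vht80_center 116    # channel reported by iw in my first post
```

For channel 116 this gives 122, i.e. 5000 + 5 * 122 = 5610 MHz, which matches the "center1: 5610 MHz" reported by iw dev wlan1 info in the first post, so at least the channel picked with channel=0 ends up with a consistent 80 MHz centre.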

What I was asking, and am trying to check, are the real CPU frequencies at a given moment (which can vary if you have a dynamic scaling governor policy).

(Last edited by sjuliao on 23 Sep 2016, 11:09)

Some interesting news to share...

I've compiled an OpenWrt image based on kernel 3.18.29.
This image uses the same compat-wireless drivers (2016-01-10-1-r49395), the same QCA988x fw5 10.2.4.70.13-2 and the same hostapd configuration.
The iperf version is the same as before.

Differences spotted:
- /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_cur|min|max_freq information is available by default and both CPUs are running (in this image!) @1.4GHz, as defined by the governor policy (set to performance);

- CPUInfo lists different information (than the other images)...weird!

OEM Image

root@OpenWrt:/# cat /proc/cpuinfo 
Processor       : ARMv7 Processor rev 0 (v7l)
processor       : 0
BogoMIPS        : 12.56

processor       : 1
BogoMIPS        : 12.56

Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

Hardware        : Qualcomm Atheros AK01-1XX reference board
Revision        : 0000
Serial          : 0000000000000000

root@OpenWrt:/# dmesg | grep CPU
[    0.000000] Booting Linux on physical CPU 0
[    0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] PERCPU: Embedded 8 pages/cpu @c0c66000 s10624 r8192 d13952 u32768
[    0.000000] SLUB: Genslabs=11, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.152327] CPU: Testing write buffer coherency: ok
[    0.152546] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.155014] CPU1: Booted secondary processor
[    0.155076] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.155170] Brought up 2 CPUs
[    0.409778] acpuclk-ipq806x acpuclk-ipq806x: ACPU PVS: 1
[    0.416963] acpuclk-ipq806x acpuclk-ipq806x: CPU0: 6 frequencies supported
[    0.416963] acpuclk-ipq806x acpuclk-ipq806x: CPU1: 6 frequencies supported
[   12.931708] CPU_INTR_ADDRESS = [0]
[   17.972508] Target CPU Intr Cause 0x5040 
[   18.098781] Target CPU Intr Cause after CE reset 0x40 
[   29.844985] CPU_INTR_ADDRESS = [0]

Custom image k4.1

root@OpenWrt:/# cat /proc/cpuinfo 
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 12.50
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 12.50
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

Hardware        : Qualcomm (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000


root@OpenWrt:/# dmesg | grep CPU
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] PERCPU: Embedded 11 pages/cpu @ddc14000 s12672 r8192 d24192 u45056
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000]  RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000888] CPU: Testing write buffer coherency: ok
[    0.001153] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.084064] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.084179] Brought up 2 CPUs
[    0.084211] CPU: All CPU(s) started in SVC mode.
[    3.021121] CPU0 @ 800000 KHz
[    3.023573] CPU1 @ QSB rate. Forcing new rate.
[    3.026711] CPU1 @ 384000 KHz

Custom Image k3.18:

root@OpenWrt:/# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 21.87
Features        : half thumb fastmult vfp edsp neon tls vfpv4 idiva idivt 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 45.57
Features        : half thumb fastmult vfp edsp neon tls vfpv4 idiva idivt 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

Hardware        : Qualcomm (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000


root@OpenWrt:/# dmesg | grep CPU
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] PERCPU: Embedded 9 pages/cpu @ddc18000 s7360 r8192 d21312 u36864
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000]  RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000980] CPU: Testing write buffer coherency: ok
[    0.001273] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.084061] CPU1: Booted secondary processor
[    0.084186] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.084311] Brought up 2 CPUs
[    0.084344] CPU: All CPU(s) started in SVC mode.
[    2.861217] CPU0 @ 800000 KHz
[    2.863681] CPU1 @ QSB rate. Forcing new rate.
[    2.866812] CPU1 @ 384000 KHz

I was not expecting to see that BogoMIPS difference...the governor policy obviously has something to do with the increased values (expected) but the difference between cores is strange...maybe the cores are not exactly identical, with some small difference in the clock path, I don't know...


And the most interesting of all:

Router running iperf server and LP_fixed running iperf client (only 30s):

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\iperf>iperf -c 192.168.5.1 -i 1 -t 30
------------------------------------------------------------
Client connecting to 192.168.5.1, TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.5.3 port 60447 connected with 192.168.5.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  36.5 MBytes   306 Mbits/sec
[  3]  1.0- 2.0 sec  39.2 MBytes   329 Mbits/sec
[  3]  2.0- 3.0 sec  40.6 MBytes   341 Mbits/sec
[  3]  3.0- 4.0 sec  41.1 MBytes   345 Mbits/sec
[  3]  4.0- 5.0 sec  40.5 MBytes   340 Mbits/sec
[  3]  5.0- 6.0 sec  40.0 MBytes   336 Mbits/sec
[  3]  6.0- 7.0 sec  40.4 MBytes   339 Mbits/sec
[  3]  7.0- 8.0 sec  40.2 MBytes   338 Mbits/sec
[  3]  8.0- 9.0 sec  41.1 MBytes   345 Mbits/sec
[  3]  9.0-10.0 sec  40.5 MBytes   340 Mbits/sec
[  3] 10.0-11.0 sec  41.0 MBytes   344 Mbits/sec
[  3] 11.0-12.0 sec  40.8 MBytes   342 Mbits/sec
[  3] 12.0-13.0 sec  41.0 MBytes   344 Mbits/sec
[  3] 13.0-14.0 sec  39.8 MBytes   333 Mbits/sec
[  3] 14.0-15.0 sec  39.2 MBytes   329 Mbits/sec
[  3] 15.0-16.0 sec  38.0 MBytes   319 Mbits/sec
[  3] 16.0-17.0 sec  39.1 MBytes   328 Mbits/sec
[  3] 17.0-18.0 sec  39.1 MBytes   328 Mbits/sec
[  3] 18.0-19.0 sec  40.5 MBytes   340 Mbits/sec
[  3] 19.0-20.0 sec  41.1 MBytes   345 Mbits/sec
[  3] 20.0-21.0 sec  40.6 MBytes   341 Mbits/sec
[  3] 21.0-22.0 sec  41.8 MBytes   350 Mbits/sec
[  3] 22.0-23.0 sec  41.0 MBytes   344 Mbits/sec
[  3] 23.0-24.0 sec  40.6 MBytes   341 Mbits/sec
[  3] 24.0-25.0 sec  41.4 MBytes   347 Mbits/sec
[  3] 25.0-26.0 sec  40.4 MBytes   339 Mbits/sec
[  3] 26.0-27.0 sec  41.0 MBytes   344 Mbits/sec
[  3] 27.0-28.0 sec  41.8 MBytes   350 Mbits/sec
[  3] 28.0-29.0 sec  41.6 MBytes   349 Mbits/sec
[  3] 29.0-30.0 sec  41.4 MBytes   347 Mbits/sec
[  3]  0.0-30.0 sec  1.18 GBytes   339 Mbits/sec

Router running iperf client and LP_fixed running iperf server (only 30s):

root@OpenWrt:/# iperf -c 192.168.5.3 -i 1 -t 30
------------------------------------------------------------
Client connecting to 192.168.5.3, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.5.1 port 52774 connected with 192.168.5.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  29.8 MBytes   250 Mbits/sec
[  3]  1.0- 2.0 sec  29.9 MBytes   251 Mbits/sec
[  3]  2.0- 3.0 sec  29.9 MBytes   251 Mbits/sec
[  3]  3.0- 4.0 sec  30.0 MBytes   252 Mbits/sec
[  3]  4.0- 5.0 sec  29.9 MBytes   251 Mbits/sec
[  3]  5.0- 6.0 sec  30.2 MBytes   254 Mbits/sec
[  3]  6.0- 7.0 sec  30.2 MBytes   254 Mbits/sec
[  3]  7.0- 8.0 sec  29.9 MBytes   251 Mbits/sec
[  3]  8.0- 9.0 sec  25.9 MBytes   217 Mbits/sec
[  3]  9.0-10.0 sec  27.0 MBytes   226 Mbits/sec
[  3] 10.0-11.0 sec  27.6 MBytes   232 Mbits/sec
[  3] 11.0-12.0 sec  30.1 MBytes   253 Mbits/sec
[  3] 12.0-13.0 sec  29.1 MBytes   244 Mbits/sec
[  3] 13.0-14.0 sec  28.5 MBytes   239 Mbits/sec
[  3] 14.0-15.0 sec  28.2 MBytes   237 Mbits/sec
[  3] 15.0-16.0 sec  27.9 MBytes   234 Mbits/sec
[  3] 16.0-17.0 sec  29.5 MBytes   247 Mbits/sec
[  3] 17.0-18.0 sec  30.1 MBytes   253 Mbits/sec
[  3] 18.0-19.0 sec  29.5 MBytes   247 Mbits/sec
[  3] 19.0-20.0 sec  28.9 MBytes   242 Mbits/sec
[  3] 20.0-21.0 sec  30.2 MBytes   254 Mbits/sec
[  3] 21.0-22.0 sec  30.2 MBytes   254 Mbits/sec
[  3] 22.0-23.0 sec  30.0 MBytes   252 Mbits/sec
[  3] 23.0-24.0 sec  29.9 MBytes   251 Mbits/sec
[  3] 24.0-25.0 sec  28.9 MBytes   242 Mbits/sec
[  3] 25.0-26.0 sec  29.4 MBytes   246 Mbits/sec
[  3] 26.0-27.0 sec  29.9 MBytes   251 Mbits/sec
[  3] 27.0-28.0 sec  30.2 MBytes   254 Mbits/sec
[  3] 28.0-29.0 sec  30.1 MBytes   253 Mbits/sec
[  3] 29.0-30.0 sec  29.4 MBytes   246 Mbits/sec
[  3]  0.0-30.0 sec   880 MBytes   246 Mbits/sec

So, one of the tests matches what I can also achieve on the OEM image and the other test still shows some bandwidth cap...but in any case, these are clearly better results than with the K4.1 image.
Since ath10k is the same version (provided by compat-wireless), the difference can only lie in the kernel version or in the patches applied to it.
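
To compare runs like the two above without reading every line, a small Python sketch can pull the per-interval bandwidth out of the pasted output. This is only an illustration (it was not used for the measurements in this thread) and assumes iperf 2 style text with rates reported in Mbits/sec; the function name is hypothetical.

```python
import re

# Matches iperf 2 interval lines such as:
# [  3]  8.0- 9.0 sec  25.9 MBytes   217 Mbits/sec
LINE_RE = re.compile(
    r"\[\s*\d+\]\s+([\d.]+)-\s*([\d.]+)\s+sec\s+[\d.]+\s+\w+\s+([\d.]+)\s+Mbits/sec"
)


def interval_rates(text):
    """Return Mbits/sec for each 1-second interval, skipping the summary line."""
    rates = []
    for m in LINE_RE.finditer(text):
        start, end, mbits = float(m.group(1)), float(m.group(2)), float(m.group(3))
        if end - start <= 1.5:  # the final summary line covers the whole run
            rates.append(mbits)
    return rates


sample = """\
[  3]  7.0- 8.0 sec  29.9 MBytes   251 Mbits/sec
[  3]  8.0- 9.0 sec  25.9 MBytes   217 Mbits/sec
[  3]  9.0-10.0 sec  27.0 MBytes   226 Mbits/sec
[  3]  0.0-30.0 sec   880 MBytes   246 Mbits/sec
"""
rates = interval_rates(sample)
print(min(rates), max(rates), sum(rates) / len(rates))
```

Looking at the minimum and maximum per interval, and not only the average, makes dips such as the one around 8-11 s in the client run easier to spot.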

(Last edited by sjuliao on 23 Sep 2016, 12:15)

good progress. for your other test with 3 laptops you could just connect one of them with cable to the router instead using bgn wifi.

Hi anarchy99,

Yes, you are right but that is a different test and would add more complexity to my setup (at this stage, I need to validate simple things and only then increase gradually complexity as possible).
Since I've tested this way with the OEM image and my custom images are not reaching the same bandwidth, I want to isolate only the 11ac interface for now; if you add more complexity to the setup, the debugging gets harder and the number of potential sources of problems increases almost exponentially...

When I tested the original image running iperf over Ethernet only (router <-> LP_fixed), it gave me approx. the expected results for a 1 Gbit interface (~845 Mbits/s with TCP), but there was also a small difference when running iperf as client on the router (about 150 to 200 Mbits/s less). On my custom image, running iperf on the router as client gave me poor results (~540 Mbits/s with TCP); the other test gave approx. 840 Mbits/s (not a huge difference).
I've concentrated my efforts on 11ac because the low bandwidth problem there was more severe...I may eventually find that they are related...but until that happens, I cannot assume it.

(Last edited by sjuliao on 23 Sep 2016, 13:35)

well let's assume your LP_fixed connects to the AP at 867Mbps, and instead of running iperf on the router you run it on another laptop that connects to the same AP at 2.4GHz 300Mbps. connecting that other laptop to an eth port instead of 2.4GHz wifi and running iperf (server/client whichever you want) seems less complex to me, and will isolate your tests from the crowded (and limited to 300Mbps) 2.4GHz band, allowing you to push as much traffic as possible through your 802.11ac AP while not stressing the router by running iperf directly on it.

Hi anarchy99,

I did understand the reason behind your suggestion and I agree with you.
But as I said, I did observe something unusual on Ethernet too (I am not sure whether the two are related...at least the problem is not exactly the same as the one experienced on 11ac with version K4.1), so that was one of the reasons why I didn't want to involve the *suspect* Ethernet in that test.
Also, since I saw that the bandwidth problem was also happening on the 11ac channel with the other laptops (100 Mbits/s with LP1/LP2 only a few centimeters from the router, with a maximum of 300 Mbits/s possible and...let's face it: the processor should be able to deal with this easily!), it was obvious to me that the problem was not the b/g/n mode but something else.

I will need to do that test later (it is the purpose of this work in future) but for now, I've decided to keep them separated (at least while they are *suspect* for me, with different behaviors).

Meanwhile, I found the module responsible for the /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_cur|min|max_freq information.

It is called cpufreq and is a driver in the Linux kernel itself (located in <linux-version>/drivers/cpufreq). It can be selected through make kernel_menuconfig and, to my surprise, it is selected by default in my .config4-1 file (including the optional stats module). I've confirmed that both are compiled and included in both kernel versions, but on K4.1 it does not export the info to sysfs as it is supposed to.

And according to the cpufreq-stats documentation, I should also be able to see <sysfs root>/devices/system/cpu/cpuX/cpufreq/stats/ ...but that does not exist either.

I've tried to compile it as a module and load it manually, but with the same result...
I just need to figure out why...

(Last edited by sjuliao on 23 Sep 2016, 17:19)

Hi,

The problem persists...

The complexity of the image has been reduced to the minimum packages/modules necessary to test the QCA988x, for both Linux kernel 3.18 and 4.1, and I can confirm that the problem is present in both images (most noticeable on K4.1, since both RX and TX on 11ac are affected, but also present on K3.18).

I've checked whether it could be happening on K4.1 due to some important missing patch (there is a difference of about 25 patches between K3.18 and K4.1), but after inspecting the patch files one by one and comparing them to the target K4.1 files, this does not seem to be the case.

At this point, I am almost out of ideas for how to debug this problem further and solve it.
It seems that some important change in the Linux kernel affects the network communication, but I don't have a precise idea of where to look...

Once again, I ask:
- has anyone used the QCA988x with K3.18 or K4.1 and experienced the same low bandwidth problem?

Kind regards

(Last edited by sjuliao on 26 Sep 2016, 21:58)

Not sure I can help you...
Does it concern only QCA988x ?
Are 11ad and gigabit Ethernet also slower than OEM ? Any queuing policy ? Firewall ? IP tables filter ?
Any differences in cfg80211 settings ? ( capabilities, PDU aggregation, anything linked to framing, offloading)

Let me know if there is something I can check on my design to test kernel performance. On my side, I'm trying to set up a 60GHz interface. I've got one limitation: the MTU has to be lower than 240 to operate correctly. Any ideas or things to check ? With this limitation, the throughput is 200 Mbit/s.

(Last edited by bzh35 on 26 Sep 2016, 22:36)

Hi bzh35,

No, it does not concern only the QCA988x because the 1Gb Ethernet has some strange behavior too (between different images). But so far, I cannot say with total certainty that there is a relation between them.

Here is a compilation of the throughput by interface:

Ethernet 1Gb:
(iperf -s on router; average over 60s):
OEM Image    --- 903 Mbits/sec
CustomK3.18  --- 930 Mbits/sec
CustomK4.1   --- 767 Mbits/sec

(iperf -c on router; average over 60s):
OEM Image    --- 711 Mbits/sec
CustomK3.18  --- 889 Mbits/sec
CustomK4.1   --- 534 Mbits/sec

Analysis:
- the OEM image has a minor bandwidth limitation when transmitting packets to other clients;
- K3.18 has better performance (and correct for 1Gb);
- K4.1 has some bandwidth limitation and poor performance;


Wireless 11ac:
(iperf -s on router; average over 60s):
OEM Image    --- 337 Mbits/sec
CustomK3.18  --- 344 Mbits/sec
CustomK4.1   --- 213 Mbits/sec

(iperf -c on router; average over 60s):
OEM Image    --- 342 Mbits/sec
CustomK3.18  --- 260 Mbits/sec
CustomK4.1   --- 131 Mbits/sec

Analysis:
- here I am taking the OEM image as a reference (even if its 11ac results are not near what we should expect from an 11ac connection);
- K3.18 has some TX bandwidth limitation but seems okay in RX;
- K4.1 has TX and RX bandwidth limitation;
- from what I could confirm, the resource consumption and load average are similar between the custom images and a little worse than on the OEM image, but it does not seem to be a processor limitation...
- using bmon to monitor the bandwidth, the results are very similar to those given by iperf...so iperf is not measuring wrongly...

The K3.18 and K4.1 images used in these measurements are completely identical apart from the Linux kernel version.
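
To put the figures above on a common scale, here is a minimal Python sketch that expresses each custom image as a percentage of the OEM result. The numbers are copied from the tables above; the dictionary keys and function name are hypothetical labels chosen for this illustration ("rx" meaning iperf -s on the router, "tx" meaning iperf -c on the router).

```python
# Average throughput in Mbits/sec, copied from the tables above.
RESULTS = {
    "eth_rx":  {"OEM": 903, "K3.18": 930, "K4.1": 767},  # iperf -s on router
    "eth_tx":  {"OEM": 711, "K3.18": 889, "K4.1": 534},  # iperf -c on router
    "11ac_rx": {"OEM": 337, "K3.18": 344, "K4.1": 213},  # iperf -s on router
    "11ac_tx": {"OEM": 342, "K3.18": 260, "K4.1": 131},  # iperf -c on router
}


def percent_of_oem(results):
    """Express each custom image's throughput as a percentage of the OEM figure."""
    out = {}
    for test, by_image in results.items():
        oem = by_image["OEM"]
        out[test] = {
            image: round(100.0 * value / oem, 1)
            for image, value in by_image.items()
            if image != "OEM"
        }
    return out


for test, ratios in percent_of_oem(RESULTS).items():
    print(test, ratios)
```

With these figures, K4.1 reaches roughly 38% of the OEM 11ac TX throughput and about 63% in RX, while K3.18 is at about 76% in TX and slightly above the OEM image in RX.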


I don't think it is related to the firewall or iptables policy, as I had already disabled them completely in my earlier tests and the results were the same.
The configuration is shared between the custom images. I have also tried copying the parameters of the OEM configuration, but I have never seen the same 11ac performance on the custom images; using a hostapd.conf file inspired by the parameters suggested on the ath10k page, I can reach better throughput on my custom images, but not near the OEM figures.

Regarding the 11ad, let me guess your problem:
you are experiencing a firmware crash when you try to send packets with a size greater than 255 bytes from your router to other clients but the RX works fine with larger packets; am I correct?
If this is your problem, yes, I am experiencing the same problem and I have already done some debugging on the wil6210 driver, the descriptor vrings, configuration, etc. and have not found anything.
I think the problem here (firmware crash) is related only to the wil6210 or possibly the ath10k driver (but in the case of ath10k, it does not fail for 11ac).

(Last edited by sjuliao on 27 Sep 2016, 12:51)

Another test...

Using the loopback interface on the router itself:
OEM Image    --- 4.896 Gbits/sec
CustomK3.18  --- 4.058 Gbits/sec
CustomK4.1   --- 1.948 Gbits/sec

It seems that the performance decreases as the Linux kernel version increases. This is also what we can see with the other interfaces (with the exception that Ethernet seems to perform better on K3.18 than on the OEM image...weird!).

I know that the CPU scales its frequency correctly on K3.18 on demand, but so far I am unable to prove this on K4.1 because the cpufreq information is not exported to sysfs (even though the module is included and compiled by default). Even with the scaling policy set to "performance", I really want to prove that the CPU is in fact running at the expected maximum frequency.
This is what I am currently trying to prove. If I can prove that the processor is running at the correct frequency (on demand), it will help to narrow the origin of the problem down to the Linux kernel implementation.
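
The check described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: that Python is available on the device (it may well not be on a minimal image, in which case the same files can simply be read with cat), and that the cpufreq files sit under the /sys/bus/cpu/devices path used elsewhere in this thread. The function name is hypothetical.

```python
import glob
import os


def read_cpu_freqs(base="/sys/bus/cpu/devices", name="cpuinfo_cur_freq"):
    """Return {cpu name: frequency in kHz} for every CPU exposing cpufreq.

    An empty dict means no cpufreq entry is exported for any CPU.
    """
    freqs = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq", name)
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpu0/cpufreq/<name> -> "cpu0"
        try:
            with open(path) as f:
                freqs[cpu] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or non-numeric entry; skip it
    return freqs


print(read_cpu_freqs())
```

Calling it repeatedly while iperf is running would show whether the frequency actually rises under load; an empty result corresponds to the K4.1 situation where nothing is exported.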

Last Friday, while trying to prove this, I found some strange info across the different images:

On OEM image

root@OpenWrt:/# cat /proc/cpuinfo 
Processor       : ARMv7 Processor rev 0 (v7l)
processor       : 0
BogoMIPS        : 12.56

processor       : 1
BogoMIPS        : 12.56

Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

Hardware        : Qualcomm Atheros AK01-1XX reference board
Revision        : 0000
Serial          : 0000000000000000

root@OpenWrt:/# dmesg | grep CPU
[    0.000000] Booting Linux on physical CPU 0
[    0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] PERCPU: Embedded 8 pages/cpu @c0c66000 s10624 r8192 d13952 u32768
[    0.000000] SLUB: Genslabs=11, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.152327] CPU: Testing write buffer coherency: ok
[    0.152546] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.155014] CPU1: Booted secondary processor
[    0.155076] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.155170] Brought up 2 CPUs
[    0.409778] acpuclk-ipq806x acpuclk-ipq806x: ACPU PVS: 1
[    0.416963] acpuclk-ipq806x acpuclk-ipq806x: CPU0: 6 frequencies supported
[    0.416963] acpuclk-ipq806x acpuclk-ipq806x: CPU1: 6 frequencies supported
[   12.931708] CPU_INTR_ADDRESS = [0]
[   17.972508] Target CPU Intr Cause 0x5040 
[   18.098781] Target CPU Intr Cause after CE reset 0x40 
[   29.844985] CPU_INTR_ADDRESS = [0]
root@OpenWrt:/# cat /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_cur_freq 
800000
800000
root@OpenWrt:/# cat /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_min_freq 
384000
384000
root@OpenWrt:/# cat /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_max_freq 
1400000
1400000
root@OpenWrt:/# 

On K3.18

root@OpenWrt:/# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 21.87
Features        : half thumb fastmult vfp edsp neon tls vfpv4 idiva idivt 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 45.57
Features        : half thumb fastmult vfp edsp neon tls vfpv4 idiva idivt 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

Hardware        : Qualcomm (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000


root@OpenWrt:/# dmesg | grep CPU
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] PERCPU: Embedded 9 pages/cpu @ddc18000 s7360 r8192 d21312 u36864
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000]  RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000980] CPU: Testing write buffer coherency: ok
[    0.001273] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.084061] CPU1: Booted secondary processor
[    0.084186] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.084311] Brought up 2 CPUs
[    0.084344] CPU: All CPU(s) started in SVC mode.
[    2.861217] CPU0 @ 800000 KHz
[    2.863681] CPU1 @ QSB rate. Forcing new rate.
[    2.866812] CPU1 @ 384000 KHz


root@OpenWrt:/# cat /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_min_freq
384000
384000
root@OpenWrt:/# cat /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_max_freq
1400000
1400000
root@OpenWrt:/# cat /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_cur_freq
1400000
1400000

On K4.1

root@OpenWrt:/# cat /proc/cpuinfo 
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 12.50
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 12.50
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0x04d
CPU revision    : 0

Hardware        : Qualcomm (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000


root@OpenWrt:/# dmesg | grep CPU
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5787d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
[    0.000000] PERCPU: Embedded 11 pages/cpu @ddc14000 s12672 r8192 d24192 u45056
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000]  RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000888] CPU: Testing write buffer coherency: ok
[    0.001153] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.084064] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.084179] Brought up 2 CPUs
[    0.084211] CPU: All CPU(s) started in SVC mode.
[    3.021121] CPU0 @ 800000 KHz
[    3.023573] CPU1 @ QSB rate. Forcing new rate.
[    3.026711] CPU1 @ 384000 KHz

As can be seen, the CPUInfo does not list the frequency of the processor (and it would be static info anyway...not really useful).
What I found very strange was the BogoMIPS for the K3.18 image...it is very different from the others. I know that this value is not reliable and that it is calculated at boot time...but I cannot explain this huge difference when comparing with the OEM image (these were the images where I could prove that the processor is running at the correct frequency). Maybe it has something to do with the number or type of processes running at the time when the calculation was done...I don't know...

Between OEM and K4.1, there is not much difference regarding the BogoMIPS...but the performance is very poor on K4.1...that is one of the reasons why I suspect it has something to do with the Linux kernel implementation and not with the processor itself.

(Last edited by sjuliao on 27 Sep 2016, 20:44)

sjuliao wrote:

Regarding the 11ad, let me guess your problem:
you are experiencing a firmware crash when you try to send packets with a size greater than 255 bytes from your router to other clients but the RX works fine with larger packets; am I correct?
If this is your problem, yes, I am experiencing the same problem and I have already done some debugging on the wil6210 driver, the descriptor vrings, configuration, etc. and have not found anything.
I think the problem here (firmware crash) is related only to the wil6210 or possibly the ath10k driver (but in the case of ath10k, it does not fail for 11ac).

Hi sjuliao,
Do you know if it happens on the different kernel versions you have used ?
It doesn't happen between 2 ipq8064 targets (sta & ap) with stock firmware.

bzh35 wrote:

Hi sjuliao,
Do you know if it happens on the different kernel versions you have used ?
It doesn't happen between 2 ipq8064 targets (sta & ap) with stock firmware.

Yes, it happens on Linux kernel versions K3.18 and K4.1, and also on K4.4.
However, the firmware crash only happens if you try to TX a packet with size > 255 bytes.
RX works well and it is actually possible to measure the bandwidth on it.
Using K3.18, I was able to achieve 824 Mbits/s (in the same conditions, OEM achieves 1.14 Gbits/s)...so the bandwidth problem is also present on that interface...not really surprising.

The OEM image works well.

What is your hardware? What kernel/drivers/firmware are you using? If different, describe your problem.

(Last edited by sjuliao on 27 Sep 2016, 20:50)

Hi sjuliao,
My configuration is based on AD7200 with custom firmware kernel 4.4 and latest kernel driver for wil6210. I've tried different 60g firmware.
Firmware A is from AD7200 image.
Firmware B is from Atheros windows driver.
Firmware A ping latency is bad ~80ms with stock router image.
Firmware B ping latency is good ~1ms with stock router image.
I observe the same issue on TX data path with A and B with custom router image.
RX data path is not affected by the packet size limitation ( monitor mode is working).

(Last edited by bzh35 on 27 Sep 2016, 21:22)

bzh35 wrote:

Hi sjuliao,
My configuration is based on AD7200 with custom firmware kernel 4.4 and latest kernel driver for wil6210. I've tried different 60g firmware.
Firmware A is from AD7200 image.
Firmware B is from Atheros windows driver.
Firmware A ping latency is bad ~80ms with stock router image.
Firmware B ping latency is good ~1ms with stock router image.
I observe the same issue on TX data path with A and B with custom router image.
RX data path is not affected by the packet size limitation ( monitor mode is working).

I've tested with four different firmware versions:
- 6980
- 7233
- 7239
- 7759
All firmware versions have the same behavior. My wil6210 driver is the one included in compat-wireless-2016-01-10 and, according to my search, it is the latest one.
Regarding this problem, and since I suspect it is related to the wil6210 driver (although I haven't found the source of the problem), I've contacted the wil6210 maintainer (after spending one entire week - including the weekend - trying to debug it). Once I get the bandwidth problem solved, if I still have no answers on the 11ad problem, I will have to spend more time on it.

(Last edited by sjuliao on 27 Sep 2016, 21:43)

No one has experienced the low bandwidth problem?
It is hard to believe that I am the only one experiencing this problem...or could it just be that people who face this problem in some builds simply wait for or try another build and install it (without debugging or solving the problem)?

Any suggestions how I can debug it further? Something that I could be missing or have not done yet?

Hi sjuliao,
I have started investigating the '255 maximum packet size' bug affecting the 60GHz chipset.
If the skb length is 255 bytes, the firmware crashes and the TX IRQ is not returned. The driver is not involved in this bug. It's a firmware + platform + kernel version issue (my hypothesis).
I have performed some tracing and found that a PCIe/DMA sequence crashes the firmware.
I have made some changes to the host driver and the TX throughput is now around 1.7 Gbit/s using iperf TCP with MTU=1500 over 60GHz channel 2. I think a workaround is possible but it may impact other PCIe buses and 11ac chipsets. I will keep you informed of the progress.

(Last edited by bzh35 on 28 Sep 2016, 20:59)

Hi sjuliao,
Regarding the performance issue on 11ac,
Did you check the kernel network scheduler and queuing ?
This part of the kernel has changed since version 3.18.

Did you check the low-level hardware settings on the different buses involved in the data path that carries the skb down to the 11ac chip ?
Up to now, I have never checked the 11ac performance on my target, only 11ad. What is your platform ? Can I purchase one ? Is it a product under development or something else ?

Hi bzh35,

bzh35 wrote:

Hi sjuliao,
I have started investigating the '255 maximum packet size' bug affecting the 60GHz chipset.
If the skb length is 255 bytes, the firmware crashes and the TX IRQ is not returned.

Yes, that is what I also discovered while debugging wil6210. For packets <= 255 bytes, an ISR TX 0x00000005 is returned and everything completes successfully:

Wed Sep 14 08:05:21 2016 kern.debug kernel: [  656.329018] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]Pseudo IRQ 0x00000002
Wed Sep 14 08:05:21 2016 kern.debug kernel: [  656.329044] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]wil6210_mask_irq_pseudo()
Wed Sep 14 08:05:21 2016 kern.debug kernel: [  656.329067] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]ISR TX 0x00000005
Wed Sep 14 08:05:21 2016 kern.debug kernel: [  656.329085] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]TX done

but for packets with larger length, you receive a different interrupt:

Wed Sep 14 08:06:22 2016 kern.debug kernel: [  716.940432] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]Pseudo IRQ 0x00000004
Wed Sep 14 08:06:22 2016 kern.debug kernel: [  716.940463] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]wil6210_mask_irq_pseudo()
Wed Sep 14 08:06:22 2016 kern.debug kernel: [  716.940489] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]ISR MISC 0x80000000
Wed Sep 14 08:06:22 2016 kern.err kernel: [  716.940516] wil6210 0002:01:00.0 wlan0: Firmware error detected, assert codes FW 0x000012aa, UCODE 0x00000000
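
When reproducing this over many packets, the two cases above can be told apart automatically. The following Python sketch is only an illustration: the patterns are taken literally from the two log excerpts above, other wil6210 versions may word these messages differently, and the function name is hypothetical.

```python
import re

# Patterns copied from the log excerpts above (assumption: the wording is
# the same for the driver version in use).
TX_DONE_RE = re.compile(r"DBG\[ IRQ\]TX done")
FW_ERROR_RE = re.compile(
    r"Firmware error detected, assert codes FW (0x[0-9a-fA-F]+), UCODE (0x[0-9a-fA-F]+)"
)


def summarize(log_lines):
    """Count TX completions and collect (FW, UCODE) assert codes from firmware errors."""
    tx_done = 0
    asserts = []
    for line in log_lines:
        if TX_DONE_RE.search(line):
            tx_done += 1
        m = FW_ERROR_RE.search(line)
        if m:
            asserts.append((m.group(1), m.group(2)))
    return tx_done, asserts


sample = [
    "kern.debug kernel: [  656.329085] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]TX done",
    "kern.debug kernel: [  716.940489] wil6210 0002:01:00.0 wlan0: DBG[ IRQ]ISR MISC 0x80000000",
    "kern.err kernel: [  716.940516] wil6210 0002:01:00.0 wlan0: Firmware error detected, "
    "assert codes FW 0x000012aa, UCODE 0x00000000",
]
print(summarize(sample))
```

Feeding it the kernel log captured while sending packets of increasing size would show at which size the TX completions stop and the firmware assert appears.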

bzh35 wrote:

The driver is not involved in this bug. It's a firmware + platform + kernel version issue (my hypothesis).

Since this behavior is the same for all the firmware versions when used on my custom images (even the one that comes with the OEM image), the only things that change are the Linux kernel and the wil6210 driver version...so those are the only ones I suspect.

bzh35 wrote:

I have performed some tracing and found that a PCIe/DMA sequence crashes the firmware.
I have made some changes to the host driver and the TX throughput is now around 1.7 Gbit/s using iperf TCP with MTU=1500 over 60GHz channel 2. I think a workaround is possible but it may impact other PCIe buses and 11ac chipsets. I will keep you informed of the progress.

Do you mean that you have successfully got it working on both TX and RX with packets up to 1500 bytes? That is awesome news!!! How did you do that? What have you changed? I've tried different modifications to the wil6210 driver while debugging it, but none of my attempts were successful...