Debugging OpenWRT wifi on TP-Link WDR3600


Preface

The trouble started when I created a new network for radio1 via the LuCI web interface. The new network was meant to offer my guests 5 GHz wifi (instead of the 2.4 GHz network that was already active). After I created and configured the interface, LuCI ran some scripts to reconfigure the wifi interfaces in OpenWRT. Well, the 5 GHz radio did not come back online. All I got from the system log was:

nl80211: Could not configure driver mode
nl80211: deinit ifname=wlan1 disabled_11b_rates=0
nl80211 driver initialization failed.
wlan1: interface state UNINITIALIZED->DISABLED
wlan1: AP-DISABLED

A few reconfiguration attempts later I found a fix that brought interface wlan1 back to life: setting a fixed channel. In my case it was channel 44 with a 40 MHz width. I still don't know why this worked, or why it had not worked before with the 'auto' setting.

Investigating

Now that I had some free time, I got back to the problem. My fear was that some hardware part had become faulty over the years of use, and I hoped to find enough evidence to decide whether to replace the hardware.

I switched the interface back to 'auto'. It did not work, of course. It helped to have a cable attached this time. To keep the rest short, I will cut down on the storytelling from here.

Debugging

For the learning effect, I really recommend switching to the CLI from now on. OpenWRT normally runs a dropbear SSH server that you can use to log in remotely.

The first thing I want is more verbose output from hostapd. hostapd configures the wifi hardware from a configuration file that OpenWRT's wifi tool generates from the UCI settings. The generated files land in /var/run/hostapd-phyX.conf and can also be edited directly if you don't want to go through UCI every time (the UCI route would be, for example, uci set wireless.radio1.log_level=0; uci commit; wifi).

To do this, use ps to identify the running hostapd instance for your wifi interface and kill it. Then run hostapd -dd -t -P /var/run/wifi-phyX.pid /var/run/hostapd-phyX.conf (where phyX is the physical device behind your interface, e.g. phy1) to start hostapd in the foreground.
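
The "identify and kill" step above can be sketched as a small helper. This is only a sketch under assumptions: hostapd_pid_for is a hypothetical name, the ps output is assumed to be BusyBox style with the PID in the first column, and hostapd is assumed to receive its config path as the last argument. Check the output of ps on your own device before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: read ps output on stdin and print the PID of the
# hostapd instance whose last argument is /var/run/hostapd-<phy>.conf.
# Assumes the PID is the first column (BusyBox-style ps output).
hostapd_pid_for() {
    awk -v conf="/var/run/hostapd-$1.conf" '
        $NF == conf {
            # Only accept the line if one field is the hostapd binary itself.
            for (i = 1; i < NF; i++)
                if ($i ~ /(^|\/)hostapd$/) { print $1; break }
        }'
}

# On the router (not executed here; "ps w" for wide output is an assumption):
#   kill "$(ps w | hostapd_pid_for phy1)"
#   hostapd -dd -t -P /var/run/wifi-phy1.pid /var/run/hostapd-phy1.conf
```

Matching on the config path rather than on the word hostapd alone keeps the helper from picking the instance that serves the other radio.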

As mentioned earlier, hostapd reads a config file that you can edit. Mine looks like this:

~# cat /var/run/hostapd-phy1.conf
driver=nl80211
logger_syslog=127
logger_syslog_level=2
logger_stdout=127
logger_stdout_level=2
country_code=DE
ieee80211d=1
ieee80211h=1
hw_mode=a
channel=acs_survey

ieee80211n=1
ht_coex=0
ht_capab=[HT40+][LDPC][SHORT-GI-20][SHORT-GI-40][TX-STBC][RX-STBC1][DSSS_CCK-40]

interface=wlan1
ctrl_interface=/var/run/hostapd
disassoc_low_ack=1
preamble=1
wmm_enabled=1
ignore_broadcast_ssid=0
uapsd_advertisement_enabled=1
wpa_passphrase=WhatIsEncryption
auth_algs=1
wpa=2
wpa_pairwise=CCMP
ssid=GuestWLAN
bridge=google-grid
wpa_key_mgmt=WPA-PSK

The interesting part is the first block. It holds country_code, hw_mode and channel. A short explanation:

  • country_code: This ISO country code states in which country you operate your wireless device. Every country has its own regulations, and you should make sure not to use frequencies that are not free for public use.
  • hw_mode: The band the radio operates in: a for 5 GHz, g (or b) for 2.4 GHz. There is no n value; 802.11n is switched on separately with ieee80211n=1, as in the file above.
  • channel: The channel your device uses. In 5 GHz this ranges up to 196 (see Wikipedia List of WLAN channels). Each channel number maps to a center frequency, so channel 44 is centered on 5220 MHz. The value acs_survey in the file above stands for the 'auto' setting.
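
The channel-to-frequency relation from the list above can be checked with a one-liner. A minimal sketch: chan_to_freq is a hypothetical helper name, and the formula (5000 MHz plus 5 MHz per channel number) holds for the usual 5 GHz channels such as 36 to 165, not for the 4.9 GHz channels numbered 183 and up.

```shell
#!/bin/sh
# Hypothetical helper: center frequency in MHz of a 5 GHz channel.
# Valid for the common 5 GHz channel numbers only (see lead-in).
chan_to_freq() {
    echo $((5000 + 5 * $1))
}

chan_to_freq 44     # prints 5220
chan_to_freq 140    # prints 5700
```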

Now you can edit this file and, for example, try channel=44. Running hostapd from the CLI as above gives you more verbose output and a more detailed error message. It also helps to look up which channels your hardware supports; iw list shows that hardware information.

You can also monitor the device directly with tcpdump. Install tcpdump, then add a monitor interface to your physical device with iw phy phyX interface add mon0 type monitor. Bring it up with ip link set mon0 up and run tcpdump -i mon0. You should see frames such as wifi beacons. Alternatively, use iwcap, which needs less space than tcpdump. Start it with iwcap -i mon0 -f -o /tmp/wlan.dump; you can read the file later with tcpdump -r /tmp/wlan.dump.

In the end I was able to pin the error down to the combination of channel selection and channel width. I still don't know why the automatic setting failed. But when I went back to auto (acs_survey), it suddenly worked again:
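
Changing the channel in the generated file can be scripted instead of done in an editor. The following is a sketch, not the way OpenWRT itself does it: set_channel is a hypothetical helper, and it assumes the file contains exactly one line starting with channel=. Since the wifi tool generates this file, expect your edit to be overwritten the next time wifi runs.

```shell
#!/bin/sh
# Hypothetical helper: replace the channel= line in a hostapd config file.
# Writes to a temporary file first so a failed sed leaves the original intact.
set_channel() {
    conf="$1"
    chan="$2"
    sed "s/^channel=.*/channel=$chan/" "$conf" > "$conf.new" &&
        mv "$conf.new" "$conf"
}

# On the router (not executed here):
#   set_channel /var/run/hostapd-phy1.conf 44
```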

~# hostapd -dd -t -P /var/run/wifi-phy1.pid /var/run/hostapd-phy1.conf
1356044400.555851: Configuration file: /var/run/hostapd-phy1.conf
1356044400.590771: wlan1: interface state UNINITIALIZED->COUNTRY_UPDATE
1356044400.595493: ACS: Automatic channel selection started, this may take a bit
1356044400.600435: wlan1: interface state COUNTRY_UPDATE->ACS
1356044400.600685: wlan1: ACS-STARTED
1356044401.076737: wlan1: ACS-COMPLETED freq=5700 channel=140
1356044401.080373: wlan1: interface state ACS->DFS
1356044401.084424: wlan1: DFS-CAC-START freq=5700 chan=140 sec_chan=0, width=0, seg0=0, seg1=0, cac_time=60s
1356044402.109982: wlan1: DFS-CAC-COMPLETED success=1 freq=5700 ht_enabled=1 chan_offset=0 chan_width=1 cf1=5700 cf2=0
1356044402.110718: Using interface wlan1 with hwaddr fe:fe:fe:fe:fe:fe and ssid "GuestWLAN"
1356044402.521345: wlan1: interface state DFS->ENABLED
1356044402.521465: wlan1: AP-ENABLED

Here, ACS stands for automatic channel selection, CAC for channel availability check, and DFS for dynamic frequency selection. See this paper about DFS for more information.
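
With -dd, the real output is far longer than the excerpt above, so it can help to filter it down to the state changes and the ACS, DFS and AP events. A minimal sketch: hostapd_events is a hypothetical helper name, and the patterns are taken only from the log lines shown in this post, so other hostapd versions may word their messages differently.

```shell
#!/bin/sh
# Hypothetical helper: keep only state transitions and ACS/DFS/AP events
# from hostapd output read on stdin.
hostapd_events() {
    grep -E 'interface state|ACS-(STARTED|COMPLETED)|DFS-CAC-(START|COMPLETED)|AP-(ENABLED|DISABLED)'
}

# On the router (not executed here):
#   hostapd -dd -t -P /var/run/wifi-phy1.pid /var/run/hostapd-phy1.conf 2>&1 | hostapd_events
```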

Frequency regulations

Here is a bonus topic, if you are interested. As mentioned, the frequency regulation depends on the country you are in. In Germany there are three 5 GHz zones, each with its own frequency range and usage restrictions. You can use iw reg get to list them in OpenWRT:

~# iw reg get
country DE: DFS-ETSI
    (2400 - 2483 @ 40), (N/A, 20), (N/A)
    (5150 - 5250 @ 80), (N/A, 20), (N/A), NO-OUTDOOR
    (5250 - 5350 @ 80), (N/A, 20), (0 ms), NO-OUTDOOR, DFS
    (5470 - 5725 @ 160), (N/A, 27), (0 ms), DFS
    (57000 - 66000 @ 2160), (N/A, 40), (N/A)
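
To see which of these rules applies to a given frequency, for example the 5700 MHz that ACS picked, the listing can be filtered with awk. This is a sketch under assumptions: reg_rule_for is a hypothetical helper, and it expects rule lines in the "(start - end @ width), ..." layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read "iw reg get" output on stdin and print every
# rule line whose frequency range (in MHz) contains the given frequency.
reg_rule_for() {
    awk -v f="$1" '
        /^[ \t]*\(/ {
            line = $0
            gsub(/[(),@]/, " ", line)   # "(5470 - 5725 @ 160), ..." -> "5470 - 5725 160 ..."
            split(line, a, " ")         # a[1] = start, a[2] = "-", a[3] = end
            if (f + 0 >= a[1] + 0 && f + 0 <= a[3] + 0) print $0
        }'
}

# On the router (not executed here):
#   iw reg get | reg_rule_for 5700
```

For the listing above, 5700 falls into the 5470 - 5725 range, which carries the DFS flag; that matches the DFS-CAC step in the hostapd log. Channel 44 at 5220 MHz falls into the 5150 - 5250 range, which has no DFS flag.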

This is, by the way, the main topic in recent discussions about firmware lockdowns to comply with a new FCC regulation.