WiFi Direct on Android | IT IS TERRIBLE

Wifi Direct is a wireless communication channel that runs in parallel with your Wifi connection and promises point-to-point communication at speeds much faster than BLE. It sounds perfect for a local multiplayer game or a peer network of cameras. It works up to 20 meters in open space and can penetrate a single wall, provided your antenna is strong.

Sounds perfect? I really wish it were. I spent over a month trying to make it work on as many devices as possible. In principle, it works much like BLE: you make your device visible for a Wifi Direct connection, and another device discovers it and connects to it.

If you're planning to use Wifi Direct, I'd suggest you don't, and that you find another mechanism instead.

Here's a list of problems I faced:
The SDK:
  • The heart of the issue is that each phone behaves differently.
  • The Android Software Development Kit ensures all devices above API 19 provide the same set of APIs, but the lower-level manufacturer implementation differs from device to device. All the Android SDK really does is provide an abstraction layer above individual driver implementations. This is true for most hardware-specific APIs in the SDK, but I suppose the CTS (the Compatibility Test Suite that validates whether a device is fit to be sold as Android-compliant) doesn't do a very good job of testing Wifi Direct.
  • A couple of simple examples of this erratic behavior: the LG Q6 changes its device name if there is a device failure, and the Nexus randomly stops connecting after 2 unsuccessful connect() attempts.
  • Unlike other APIs in Android, each of the Wifi Direct APIs responds through an onSuccess() or an onFailure() callback.
  • If one command has been called but not yet processed, any other call will trigger onFailure().
  • Broadcast receivers for Wifi Direct are messed up! They may deliver the same information anywhere from one to three times, which breaks the listener's logic. For example, onConnect() may fire twice.
  • Discovery time depends on who started discovery first. It can take anywhere from 2 to 10 seconds to discover a peer.
  • Discovery may fail at times even if startDiscovery() returns true. The only way out is to call stopDiscovery() and start again, but this is tricky because you can only call startDiscovery() once the call to stopDiscovery() has returned.
  • Some calls never return. For example, there is no timeout for the connect() call, and if the device you're connecting to doesn't accept, you can't fire any other command until it returns. Thus, you have to implement your own timeout for each call.
  • A device that is not the group owner has no way to learn the details of everyone in the group other than asking the group owner.
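The duplicate broadcasts from the list above can be absorbed before they reach the listener. Here is a minimal sketch in plain Java (no Android classes), assuming the receiver can reduce each broadcast to a comparable value such as a connection state. `DuplicateFilter` is a hypothetical helper name, not part of the Android SDK.

```java
import java.util.Objects;
import java.util.function.Consumer;

/** Forwards a value to the listener only when it differs from the previous one. */
final class DuplicateFilter<T> {
    private final Consumer<T> downstream;
    private T last;          // last value that was forwarded
    private boolean hasLast; // distinguishes "nothing forwarded yet" from a null value

    DuplicateFilter(Consumer<T> downstream) {
        this.downstream = downstream;
    }

    /** Call this from the broadcast receiver; consecutive repeats are dropped. */
    synchronized void accept(T value) {
        if (hasLast && Objects.equals(last, value)) {
            return; // same information delivered again, ignore it
        }
        last = value;
        hasLast = true;
        downstream.accept(value);
    }
}
```

With this in front of the listener, three identical "connected" broadcasts in a row produce a single callback, while a real connected, disconnected, connected sequence still produces three.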
I ended up implementing boolean flags so that we don't fire a command while another is still in flight and get unnecessary onFailure() callbacks, plus a message queue that retries failing commands after 2 seconds. The alternative, and better, solution is to use only a message queue and no boolean flags: let the APIs fail with onFailure() and then add the same message back to the queue. This makes sense because you can then use the message queue consistently, both to implement timeouts and to retry after onFailure().
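The queue-only approach can be sketched roughly as follows. This is an illustration under assumptions, not the library's actual code: on Android the queue would probably be a Handler, but a `ScheduledExecutorService` is swapped in here so the example runs as plain Java. `P2pCommand`, `Result` and `CommandQueue` are hypothetical names, and the retry delay and timeout are parameters the caller picks.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for one Wifi Direct call (connect, start discovery, ...). */
interface P2pCommand {
    void run(Result result);

    /** Mirrors the onSuccess()/onFailure() callback pair. */
    interface Result {
        void onSuccess();
        void onFailure();
    }
}

/** Runs one command at a time; a failure or a timeout puts the command back on the queue. */
final class CommandQueue {
    // A single daemon worker thread plays the role of the message queue's thread.
    private final ScheduledExecutorService worker =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "p2p-commands");
                t.setDaemon(true);
                return t;
            });
    private final Queue<P2pCommand> pending = new ArrayDeque<>();
    private final long retryDelayMs;
    private final long timeoutMs;
    private boolean inFlight; // only read and written on the worker thread

    CommandQueue(long retryDelayMs, long timeoutMs) {
        this.retryDelayMs = retryDelayMs;
        this.timeoutMs = timeoutMs;
    }

    /** Safe to call from any thread. */
    void submit(P2pCommand command) {
        worker.execute(() -> enqueue(command));
    }

    void shutdown() {
        worker.shutdownNow();
    }

    private void enqueue(P2pCommand command) {
        pending.add(command);
        runNext();
    }

    private void runNext() {
        if (inFlight || pending.isEmpty()) return;
        inFlight = true;
        P2pCommand command = pending.poll();
        boolean[] finished = {false}; // per attempt, so a late callback after a timeout is ignored
        ScheduledFuture<?> watchdog = worker.schedule(
                () -> finish(command, finished, false), timeoutMs, TimeUnit.MILLISECONDS);
        command.run(new P2pCommand.Result() {
            @Override public void onSuccess() {
                worker.execute(() -> { watchdog.cancel(false); finish(command, finished, true); });
            }
            @Override public void onFailure() {
                worker.execute(() -> { watchdog.cancel(false); finish(command, finished, false); });
            }
        });
    }

    private void finish(P2pCommand command, boolean[] finished, boolean succeeded) {
        if (finished[0]) return;
        finished[0] = true;
        inFlight = false;
        if (!succeeded) {
            // Failed or timed out: wait a little, then queue the same command again.
            worker.schedule(() -> enqueue(command), retryDelayMs, TimeUnit.MILLISECONDS);
        }
        runNext();
    }
}
```

Restarting discovery then becomes two submissions, a stop command followed by a start command, and the queue guarantees the second only runs after the first has reported back or timed out. Note that this sketch retries forever; a real version would probably cap the number of attempts per command.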

The library works about 95% of the time on all devices I've tested, and I'm content with this. The only problems are that discovery times are fairly high (up to 10 seconds) and that I can't pass any metadata apart from what fits in the device name being broadcast.
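Packing metadata into the device name could look roughly like this. It is only a sketch: the `MYAPP` prefix, the `|` separator and the `NameMetadata` helper are all made up for illustration, and the 32-byte cap is an assumption about how long a device name may be, not a limit taken from the Android documentation.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that packs a label and a small payload into a device name. */
final class NameMetadata {
    static final String PREFIX = "MYAPP"; // made-up marker to recognise our own devices
    static final char SEPARATOR = '|';
    static final int MAX_BYTES = 32;      // assumed cap on the device name length

    static String encode(String label, String payload) {
        if (label.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("label must not contain " + SEPARATOR);
        }
        String name = PREFIX + SEPARATOR + label + SEPARATOR + payload;
        if (name.getBytes(StandardCharsets.UTF_8).length > MAX_BYTES) {
            throw new IllegalArgumentException("device name too long: " + name);
        }
        return name;
    }

    /** Returns {label, payload}, or null when the name was not produced by encode(). */
    static String[] decode(String deviceName) {
        String[] parts = deviceName.split("\\|", 3); // payload may itself contain '|'
        if (parts.length != 3 || !parts[0].equals(PREFIX)) {
            return null;
        }
        return new String[] { parts[1], parts[2] };
    }
}
```

A discovered device whose name does not start with the prefix decodes to null, which doubles as a cheap filter for unrelated Wifi Direct devices nearby, such as TVs and printers.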

So, what next? Android provides another set of APIs for Network Service Discovery (NSD) over Wifi Direct. My hunch was that it only performs service discovery after the initial Wifi Direct discovery, but a colleague was certain it'd be faster, as we'd been using NSD over Wifi for quite some time.
It turns out it performs worse than plain Wifi Direct discovery.

Specific problems with Wifi Direct NSD:
  • The only advantage is that you can send metadata bundled with the service info (port, IP, name, etc.) without first establishing a connection.
  • You do not get a service-lost callback, so you still have to rely on the plain discovery's peer-lost mechanism to invalidate old or lost connections.
  • It fails to discover the peer about half of the time in cases where normal P2P discovery works, and we're clueless as to why.
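Since there is no service-lost callback, one way to invalidate stale services is to cross-check them against the peer list that plain discovery reports. A minimal sketch in plain Java, assuming peers are identified by device address and that service metadata arrives as a string map. `ServiceRegistry` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Remembers discovered services and drops them when their peer disappears. */
final class ServiceRegistry {
    private final Map<String, Map<String, String>> services = new HashMap<>();

    /** Call when service discovery reports a service for a device address. */
    synchronized void onServiceFound(String address, Map<String, String> metadata) {
        services.put(address, new HashMap<>(metadata));
    }

    /**
     * Call with the latest peer list from plain discovery.
     * Returns the addresses whose services were dropped because the peer is gone.
     */
    synchronized List<String> onPeersChanged(Collection<String> visibleAddresses) {
        Set<String> visible = new HashSet<>(visibleAddresses);
        List<String> lost = new ArrayList<>();
        for (String address : services.keySet()) {
            if (!visible.contains(address)) {
                lost.add(address);
            }
        }
        services.keySet().removeAll(lost); // removing keys also removes the map entries
        return lost;
    }

    /** Returns the stored metadata, or null if the service is unknown or was dropped. */
    synchronized Map<String, String> metadataFor(String address) {
        return services.get(address);
    }
}
```

The returned list plays the role of the missing service-lost callback: every address in it is a service whose peer is no longer visible.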

