Re: Calling experienced network programmers

Posted: Tue Oct 18, 2016 4:00 pm
by jms
I agree that something isn't as robust as it might be.

I've been writing network/socket code for decades now and do network TAPs for a living (and it's almost certain your bank, mobile provider and ISP all use kit we supply), so I like to think I've seen just about every kind of weird network problem going. I have also read up on Wi-Fi standards.

Expecting it to be flaky, I ran my own little test involving regular (every minute or two) UDP pings to my own server including serial numbers and that ran for a couple of weeks with no trouble at all.
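
A minimal sketch of the kind of test described above, in standard Python. The wire format (a 4-byte device id plus a 4-byte serial number) and the names `make_ping`, `parse_ping`, `LossTracker` and `send_ping` are assumptions made for illustration, not the code jms actually ran. The idea is that the server can spot lost datagrams purely from gaps in the serial numbers.

```python
import struct

# Hypothetical wire format: 4-byte device id + 4-byte serial, network byte order.
PING_FORMAT = "!II"

def make_ping(device_id, serial):
    """Build one ping datagram carrying a serial number."""
    return struct.pack(PING_FORMAT, device_id, serial)

def parse_ping(datagram):
    """Return (device_id, serial) from a received datagram."""
    return struct.unpack(PING_FORMAT, datagram)

class LossTracker:
    """Server-side bookkeeping: count gaps in the serial numbers."""

    def __init__(self):
        self.expected = None  # next serial we expect to see
        self.received = 0
        self.lost = 0

    def record(self, serial):
        if self.expected is not None and serial > self.expected:
            self.lost += serial - self.expected  # pings that never arrived
        if self.expected is None or serial >= self.expected:
            self.expected = serial + 1
        self.received += 1  # late or duplicate pings still count as received

def send_ping(sock, addr, device_id, serial):
    """Send one ping over an already-created UDP socket (device side)."""
    sock.sendto(make_ping(device_id, serial), addr)

# Simulated run: serials 4 and 5 never reach the server.
tracker = LossTracker()
for serial in (1, 2, 3, 6, 7):
    _, received_serial = parse_ping(make_ping(42, serial))
    tracker.record(received_serial)
print(tracker.received, tracker.lost)
```

On a real device `send_ping` would be called every minute or two from the main loop; a device reboot that restarts the serial numbers is not handled here.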

I therefore suspect that the extra complexity of MQTT brings about problems.

Proper evaluation should start with something like UDP, which is as low as you can go, then TCP, which introduces many of its own problems, then work up to MQTT, SSL and the like.

Also talk to a local server, preferably not via an ISP-controlled router or access point.

In summary KEEP IT SIMPLE.

Re: Calling experienced network programmers

Posted: Tue Oct 18, 2016 9:40 pm
by torwag
jms wrote:I ran my own little test involving regular (every minute or two) UDP pings to my own server including serial numbers and that ran for a couple of weeks with no trouble at all.

Did you perform this test under flaky wifi conditions? As far as I understood, most problems arise when the wifi signal gets weak and packets, or the entire connection, get lost.
That might correlate with observations where problems appear if the CPU is under heavy load from the user application and the underlying RTOS does not have enough time to take care of the network connection.
I guess what is really needed (I have no idea about the specific internals) is a proper scheduler between MicroPython and the RTOS.
Maybe all the problems are gone with the ESP32, since it uses two CPU cores?!
Other than this, afaik we need non-blocking sockets for the MQTT stuff, so that a longer response time (for whatever reason) does not conflict with the rest of the program logic.
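
To illustrate what non-blocking sockets buy you, here is a minimal sketch in standard Python. The helper name `poll_socket` is hypothetical, and the local socket pair merely stands in for a real network connection. The point is that the read returns immediately when nothing has arrived, so the rest of the program logic keeps running however slow the peer is.

```python
import errno
import socket
import time

def poll_socket(sock, max_bytes=128):
    """Return whatever bytes are waiting, or None if there is nothing to read yet."""
    try:
        return sock.recv(max_bytes)  # b"" would mean the peer closed the connection
    except OSError as exc:
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None  # no data yet: hand control straight back to the caller
        raise  # a genuine socket error

# Local demonstration on a connected socket pair (no network involved).
sender, receiver = socket.socketpair()
receiver.setblocking(False)

first = poll_socket(receiver)  # nothing has been sent, so this returns at once
sender.send(b"reading=21.5")

message = None
ticks = 0
deadline = time.time() + 2
while message is None and time.time() < deadline:
    ticks += 1  # stand-in for the rest of the program logic
    message = poll_socket(receiver)

sender.close()
receiver.close()
```

Whether a given MicroPython port reports "no data" with the same error codes is an assumption that would need checking on the target.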

Re: Calling experienced network programmers

Posted: Wed Oct 19, 2016 6:19 am
by pythoncoder
torwag wrote:@ Pythoncoder,

could you be a bit more specific about what exactly you are stopping work on? I identified three main projects:
1. your own scheduler
2. socket based mqtt for devices without network interface
3. work on mqtt primarily on the esp8266 platform

As you know I play (with very slow progress) with mqtt on the esp8266 and we also discussed using your scheduler. It might be important for further development to know whether you will still work on (bugfix) the scheduler.
Or we agree to go with uasyncio and try to make the mqtt libs more robust.

I am emphatically continuing to support the scheduler. I regard this as essentially complete and tested - it dates back over two years. Like all software projects there will be minor bugfixes, support for new platforms and changes to accommodate alterations to the underlying MicroPython API. I expect that uasyncio will be developed to the point where my solution becomes redundant for new projects but mine seems sufficiently widely used that I intend to support it indefinitely.

The scheduler does work on the ESP8266 but, if you use it for socket programming, it will lack the level of performance achieved on other platforms. This is because of the blocking behaviour of sockets. Given this, fast scheduling seems impossible. I would therefore question whether it makes sense to use my solution on the ESP. Its principal advantage over the official uasyncio solution is performance. If this is unachievable then there seems little reason to prefer it.

From my point of view there was only one other project - I have no official involvement with the MQTT library. This project was as I described, to bring MQTT to MicroPython platforms lacking WiFi connectivity. It is this that I have decided to shelve. This is for two reasons. Firstly I'm frankly out of my depth dealing with the issues which arise when WiFi connectivity degrades. And secondly I don't think my aims are achievable until official asynchronous socket/MQTT code emerges. In short I believe that the underlying support needs to improve. I'm sure the team will achieve this, but I lack the experience to be able to assist them in this effort.

I hope this clarifies things.

Re: Calling experienced network programmers

Posted: Wed Oct 19, 2016 12:21 pm
by jms
My test was, oh, maybe six feet from my router, though this goes wobbly from time to time thanks to Virgin Media. I just wanted to know whether the ESP itself causes trouble. A real application, no matter what it uses, must have its own policy for what to do when connectivity fails. This could be as simple as a reboot. A logical progression would be to shield the module, unplug the router, fiddle with the power supply and see whether it crashes, recovers or what.

Your problem is down to a combination of blocking sockets, differences between lwIP on the ESP and BSD sockets on a heavyweight OS, and a lack of really good documentation. Berkeley sockets, which underlie just about all modern OSs, are very well understood, quirks and all, being decades old.
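
One way such a failure policy might look, as a hedged sketch in standard Python. The function `run_with_recovery` and its parameters are hypothetical. On a MicroPython board the `give_up` callback could plausibly be `machine.reset` and `wait` could be `time.sleep`, but that wiring is an assumption and is left out so the logic can be exercised anywhere.

```python
def run_with_recovery(attempt, give_up, max_failures=3, wait=lambda seconds: None):
    """Call attempt() until it succeeds; after max_failures errors, call give_up()."""
    failures = 0
    delay = 1
    while failures < max_failures:
        try:
            return attempt()
        except OSError:
            failures += 1
            wait(delay)  # pause before the next try
            delay *= 2   # back off: 1 s, 2 s, 4 s, ...
    return give_up()     # e.g. reboot the board

# Simulated link that fails twice and then recovers.
calls = []
def flaky_publish():
    calls.append(1)
    if len(calls) < 3:
        raise OSError("link down")
    return "published"

def always_down():
    raise OSError("link down")

delays = []
result = run_with_recovery(flaky_publish, give_up=lambda: "reboot")
dead = run_with_recovery(always_down, give_up=lambda: "reboot",
                         max_failures=2, wait=delays.append)
print(result, dead, delays)
```

The policy itself (retry count, back-off, what "give up" means) is exactly the application-specific decision the post says each real application must make for itself.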

(A relatively new problem is stateful (NATting) routers which might drop packets for forgotten connections where without those routers undead TCP connections would have been cleared down.)

Jon

Re: Calling experienced network programmers

Posted: Wed Oct 19, 2016 2:07 pm
by stijn
pythoncoder wrote:The scheduler does work on the ESP8266 but, if you use it for socket programming, it will lack the level of performance achieved on other platforms. This is because of the blocking behaviour of sockets.
I'm probably missing something here, but doesn't select() solve this? E.g. instead of blocking you check if data's available first, which is non-blocking, then only read if there's enough data.
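
For reference, this is roughly what the select() approach looks like where it is available, sketched with CPython's `select` module. The helper name `read_if_ready` is hypothetical and the socket pair is just a local stand-in for a network connection. Note that select() only reports that some data can be read, not that a whole message has arrived.

```python
import select
import socket

def read_if_ready(sock, max_bytes=128, timeout=0):
    """Read only if select() says data is waiting; otherwise return None."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # nothing to read, and we never blocked in recv()
    return sock.recv(max_bytes)

# Local demonstration on a connected socket pair.
left, right = socket.socketpair()
before = read_if_ready(right)            # no data yet: returns immediately
left.send(b"hello")
after = read_if_ready(right, timeout=2)  # wait up to 2 s for the data
left.close()
right.close()
```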

Re: Calling experienced network programmers

Posted: Wed Oct 19, 2016 6:38 pm
by deshipu
Select would have solved this, if it existed.

Re: Calling experienced network programmers

Posted: Thu Oct 20, 2016 11:59 am
by pythoncoder
I also (rarely) experienced socket writes blocking. Is this expected?

Re: Calling experienced network programmers

Posted: Thu Oct 20, 2016 12:28 pm
by stijn
The typical scenario is that the hardware's write buffers are full, so write() will block until there's room again (because the receiving end has consumed data).
deshipu wrote:Select would have solved this, if it existed.
yeah so that's what I was missing :)
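
The full-write-buffer scenario described in this post can be reproduced on a desktop with a small sketch in standard Python. The helper names are hypothetical, and buffer sizes on a local socket pair are of course nothing like those on the ESP8266. The writer is made non-blocking so that the point at which a blocking write would have stalled shows up as an error instead.

```python
import socket

CHUNK = b"x" * 1024

def try_send(sock, data):
    """Return the number of bytes accepted, or None if the write would block."""
    try:
        return sock.send(data)
    except OSError:  # BlockingIOError in CPython: the send buffer is full
        return None

def send_until_full(sock, limit=64 * 1024 * 1024):
    """Keep writing until the socket refuses more data; return bytes accepted."""
    accepted = 0
    while accepted < limit:
        sent = try_send(sock, CHUNK)
        if sent is None:
            break
        accepted += sent
    return accepted

writer, reader = socket.socketpair()
writer.setblocking(False)

queued = send_until_full(writer)   # nobody is reading, so the buffer fills up
blocked = try_send(writer, CHUNK)  # a blocking socket would stall here

drained = 0
while drained < queued:            # the receiving end consumes the data...
    drained += len(reader.recv(65536))
resumed = try_send(writer, CHUNK)  # ...and writes are accepted again

writer.close()
reader.close()
```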

Re: Calling experienced network programmers

Posted: Thu Oct 20, 2016 2:41 pm
by Lysenko
The ESP8266 has a 1460 byte send buffer and it is fairly aggressive about waiting on ACKs. I'm not familiar with the background here, so I may be way off base, but it sounds to me as though you might be using the wrong tool for the job.

TCP is a streaming protocol. You use it when you need to p-t-p transfer large amounts of data, you depend on precise packet sequencing and you can't afford any data loss.

Sending small packets (like a sensor reading) periodically over an inherently "connectionless" IP transport (anything wireless) is usually a job for UDP, particularly if you don't have pre-emptive threading or select().

In the context of MQTT that means the "S" variant (MQTT-S, since renamed MQTT-SN):

http://mqtt.org/documentation

... though personally I would use CoAP. MQTT (and AMQP, come to that) assume that there is a stable TCP/IP transport layer. That isn't a valid assumption with anything wireless, so you end up implementing ad hoc workarounds for the cases where their assumptions fail. CoAP, on the other hand, assumes an unreliable transport layer as a starting point.

Note I'm not saying that MQTT over WiFi is bad per se (though it often is): just that it isn't suited to a partially implemented (no select()) socket stack on a processor/runtime that can't easily pre-empt and reset a blocked subsystem.

Re: Calling experienced network programmers

Posted: Thu Oct 20, 2016 3:20 pm
by jms
Whether you use select, poll, timeouts/blockability (applied via ioctl or fcntl), process alarms or indeed many threads is largely down to preference.

Heavyweight servers with many sockets use kqueue and similar.

But I believe the _only_ things we have here are socket timeouts and blocking through the usocket module. Good enough for everything you'll realistically do with the ESP8266.
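
A minimal sketch of the socket-timeout approach, written against the standard Python `socket` API, which `usocket` is modelled on. The helper name `recv_with_timeout` is hypothetical. It catches `OSError` because CPython's timeout exception is a subclass of it, and the assumption here is that the ESP8266 port also signals a timeout with an `OSError`.

```python
import socket

def recv_with_timeout(sock, seconds, max_bytes=128):
    """Wait at most `seconds` for data; return None if nothing arrives in time."""
    sock.settimeout(seconds)
    try:
        return sock.recv(max_bytes)
    except OSError:
        # Timed out (or the link failed): the caller decides what to do next.
        return None

# Local demonstration on a connected socket pair.
client, server = socket.socketpair()
silent = recv_with_timeout(client, 0.05)  # peer says nothing: gives up after 50 ms
server.send(b"pong")
reply = recv_with_timeout(client, 2)
client.close()
server.close()
```

The worst-case stall is then bounded by the timeout rather than by the peer, which is usually enough for a device that only sends a reading every so often.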