[etherlab-users] Alternating working count 0/24 (zero and complete)

Thu Jul 17 11:41:05 CEST 2014

Thanks for your response.

On 07/16/2014 12:43 AM, Gavin Lambert wrote:
> On 15 July 2014, quoth J. van der Wulp:
>>  - when a frame exceeds the 128 byte threshold then increasingly often
>> the latency of the response frame increases (seems a 100microsecond
>> offset) but our time budget (time between send() and receive()) is 100
>> microseconds. This is the cause for working count 0 errors.
>>  - as long as the process data is such that frame size stays below ~128
>> bytes there is no problem, the working counts stay stable and response
>> latency is more or less constant
> 
> Sure you're not getting a 10Mbit link instead of 100Mbit?  128 bytes of data
> at 5kHz will just about saturate a 10Mbit link.

This is a very interesting thought. It took me a while to verify that
the link is indeed 100Mbit. I had to patch the driver to be sure. I
placed the following fragment in ec_poll:

        struct ethtool_cmd cmd = { ETHTOOL_GSET };

        /* Query the current link settings and log the speed in Mbit/s. */
        dev->ethtool_ops->get_settings(dev, &cmd);
        netif_info(tp, probe, tp->dev, "SPEED %u.\n",
                   ethtool_cmd_speed(&cmd));

which resulted in 100 printed repeatedly. I also established that during
link negotiation the slave as well as the master side advertise 100Mb
(among others).
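
As a rough cross-check of why the link speed matters for our 100
microsecond budget, here is a back-of-the-envelope sketch of the time it
takes just to clock one frame onto the wire. The function name
wire_time_ns is made up for this mail, and I am assuming that the frame
size includes the Ethernet header and FCS and that only the 8 byte
preamble is added on top; delays in the slaves and in the NIC are
ignored.

```c
#include <stdint.h>

/* Preamble + start-of-frame delimiter, sent before every Ethernet frame. */
#define PREAMBLE_BYTES 8u

/*
 * Time in nanoseconds needed to clock one frame onto the wire.
 * frame_bytes: frame size including Ethernet header and FCS (assumption).
 * link_mbit:   link speed in Mbit/s, e.g. 10 or 100; must be non-zero.
 */
uint64_t wire_time_ns(uint32_t frame_bytes, uint32_t link_mbit)
{
    uint64_t bits = (uint64_t)(frame_bytes + PREAMBLE_BYTES) * 8u;

    /* One bit at 1 Mbit/s takes 1000 ns. */
    return bits * 1000u / link_mbit;
}
```

Under these assumptions wire_time_ns(128, 10) gives 108800 ns, which is
already above the 100 microsecond budget, while wire_time_ns(128, 100)
gives 10880 ns. So a 10Mbit link would have fit the symptoms, but the
driver reports 100.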

So far I have found no other good explanation of why the ~128 byte
boundary is so special. I have done an experiment with a couple of
Beckhoff modules with process data of 67 bytes, which I could scale up
to 15 kHz without problems (just an occasional working count error once
every couple of seconds). Yet when crossing the ~128 byte boundary at
5 kHz it still collapses.

Inspired by older patched versions of the r8169 driver I made a change
to the ec_poll routine which relieves the symptoms. I still have a bit
of an uneasy feeling about this change, as it removes the inspection of
the interrupt status register and directly starts the rtl_rx/rtl_tx
buffer processing. Why would the interrupt status be good with frames
below 128 bytes and not good otherwise? I still have the feeling that I
am missing something.

> 
>>  - use the generic module when operating at 5Khz (only tested with 1.5.2
>> with frame size less than ~128 bytes) gives the same working count 0
>> symptoms, for our application we really seem to need the patched
>> drivers...
> 
> The generic driver is rarely stable over 1kHz; sometimes not even that.
> 
> 

I now capture on the debug interface, but it drops a lot of packets,
and I am not sure whether the capture timestamps I get are realistic.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: r8169-3.4-ec_poll.patch
Type: text/x-diff
Size: 578 bytes
Desc: not available
URL: <http://lists.etherlab.org/pipermail/etherlab-users/attachments/20140717/d66ff8e5/attachment-0005.patch>