[etherlab-dev] EoE in OP mode

Gavin Lambert gavinl at compacsort.com
Thu Jan 19 00:32:10 CET 2017


Note that the patchset has only really been tested with PREEMPT_RT or
otherwise standard user-mode usage.

 

In particular, there are some patches that change locks and callbacks in
ways that I don't think are entirely compatible with RTAI / Xenomai; there
have previously been reported problems using those with this patchset.

 

As I neither wrote those patches nor use Xenomai (or EoE) myself, I don't
really know what needs to be done to resolve the issues (other than just
dropping the patches, and possibly breaking the scenario they were
originally written to fix). I also don't have much time at the moment to
work on EtherCAT.  I welcome assistance in correcting this situation. :)

 

 

As far as I understand, ecrt_master_send/receive are only ever supposed to
be invoked on one thread at a time. When you're using the userspace library
this is enforced by a Linux lock in the corresponding ioctl, but that
doesn't apply, or is insufficient, when using a kernel-mode application or
RTAI/Xenomai.  In those cases you need to register callbacks and use your
own appropriate locking mechanism to ensure that send/receive are not called
concurrently.

 

In particular, note that both the send callback and the receive callback are
permitted to do nothing if they are called in a context where they can't
wait on a lock while something else is busy doing the same thing.  So if
you're calling send/receive from an interrupt thread, you will need to keep
track of this: make the EoE thread callback block until the interrupt is
done, and make the interrupt thread skip send/receive (without blocking) if
the EoE thread is already in the middle of it.  Alternatively, you could
probably make the interrupt handler responsible for both of these things and
have the EoE callbacks always do nothing, which might be better for your
application's performance.  (Though like I said, I haven't looked at the
code much in this area, so take these suggestions with a grain of salt; I
could have something wrong.)
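
A minimal userspace sketch of the "skip instead of block" idea described
above, using a C11 atomic flag as the guard. Everything here is hypothetical
illustration: master_io_stub() merely stands in for the real receive/send
pair, and none of the names come from the EtherCAT master source. In this
sketch the EoE-side callback also skips when the guard is taken; a real
callback running in a context that may sleep could wait instead.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Guard shared by the cyclic (interrupt) path and the EoE callback path. */
static atomic_flag io_busy = ATOMIC_FLAG_INIT;

static int io_done;     /* receive/send pairs performed (guard held)   */
static int io_skipped;  /* cyclic passes that found the guard taken    */

/* Try to take the guard; never blocks. Returns true on success. */
static bool io_try_enter(void)
{
    return !atomic_flag_test_and_set_explicit(&io_busy,
                                              memory_order_acquire);
}

/* Release the guard. */
static void io_leave(void)
{
    atomic_flag_clear_explicit(&io_busy, memory_order_release);
}

/* Hypothetical stand-in for the real receive + send on the master. */
static void master_io_stub(void)
{
    io_done++;
}

/* Cyclic path (interrupt context in the thread above): must not block,
 * so it skips this pass if someone else is in the middle of the I/O. */
static void cyclic_handler(void)
{
    if (!io_try_enter()) {
        io_skipped++;
        return;
    }
    master_io_stub();
    io_leave();
}

/* Callback of the shape void (*)(void *), as would be registered for the
 * EoE thread. Here it also does nothing when the guard is taken. */
static void eoe_io_cb(void *cb_data)
{
    (void)cb_data;
    if (!io_try_enter())
        return;
    master_io_stub();
    io_leave();
}
```

The point of the design is that neither path ever runs the I/O while the
other one is inside it, and the path that cannot wait simply does nothing
for that pass.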

 

From: etherlab-dev [mailto:etherlab-dev-bounces at etherlab.org] On Behalf Of
Geller, Nir
Sent: Wednesday, 18 January 2017 23:38
To: etherlab-dev at etherlab.org; Slutsker, Rasty
<rasty.slutsker at servotronix.com>
Subject: [etherlab-dev] EoE in OP mode

 

Hi,

 

I recently upgraded the EtherCAT master to Gavin's patchset 20160804, plus
patch 0061.

 

EoE seems to be working fine while the master is idle, with heavy SDO
traffic in parallel.

 

When the master is active, our realtime application invokes
ecrt_master_receive(master);  and  ecrt_master_send(master);  from interrupt
context, and NOT from the ec_master_operation_thread() thread context.

 

The problem comes up when the master is active.

 

Just as I issue

 

ifconfig eoe0a1 up

 

I get a bunch of UNMATCHED DATAGRAMS in the kernel log, and the master is
released.

 

[   73.324525] EtherCAT DEBUG 0: UNMATCHED datagram:
[   73.324528] EtherCAT DEBUG: 0D 83 01 00 10 09 08 80 00 00 68 5A 4A 84 9C 9B
[   73.324539] EtherCAT DEBUG: 84 11 01 00
[   73.324544] EtherCAT DEBUG 0: UNMATCHED datagram:
[   73.324547] EtherCAT DEBUG: 04 84 01 00 90 09 08 80 00 00 B0 3D 4C 84 9C 9B
[   73.324557] EtherCAT DEBUG: 84 11 01 00
[   73.324562] EtherCAT DEBUG 0: UNMATCHED datagram:
[   73.324565] EtherCAT DEBUG: 0C 85 00 00 00 00 10 80 00 00 00 00 70 FF FF FF
[   73.324575] EtherCAT DEBUG: 50 52 70 FF FF FF 00 00 31 00 03 00
[   73.324584] EtherCAT DEBUG 0: UNMATCHED datagram:
[   73.324587] EtherCAT DEBUG: 07 86 01 00 30 01 02 00 00 00 08 00 01 00
[   73.324838] EtherCAT 0: fsm->slaves_responding[fsm->dev_idx]=1
[   73.324843] EtherCAT 0: 0 slave(s) responding on main device.
[   73.324846] EtherCAT 0: datagram->working_counter=0   <--- In wireshark capture WC is 1 !!!!
[   73.324850] EtherCAT 0: datagram->state=4
[   73.324853] EtherCAT 0: datagram->device_index=0
[   73.324856] EtherCAT 0: datagram->device_origin=0
[   73.324860] EtherCAT 0: datagram->index=134
[   73.324863] EtherCAT 0: datagram->type=7
[   73.324866] EtherCAT DEBUG 0: Rescanning the bus

 

 

This happens due to a timeout. When the EoE thread invokes
master->receive_cb(master->cb_data); which in turn calls
ecrt_master_receive(master); it somehow disturbs

master->devices[EC_DEVICE_MAIN].cycles_poll

which leads to a negative time delta in the calculation
master->devices[EC_DEVICE_MAIN].cycles_poll - datagram->cycles_sent.
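
To illustrate why a negative delta would show up as a timeout, here is a
minimal sketch. It assumes the cycle counters are unsigned 64-bit values (as
cycles_t is on x86) and that the check has the general form
"poll - sent > timeout"; the function name and the numbers are hypothetical
and not taken from the master source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeout check on unsigned cycle counters. */
static bool timed_out(uint64_t cycles_poll, uint64_t cycles_sent,
                      uint64_t cycles_timeout)
{
    /* Unsigned subtraction wraps modulo 2^64, so if cycles_poll is
     * (wrongly) older than cycles_sent, the "elapsed" value becomes
     * enormous instead of negative. */
    return cycles_poll - cycles_sent > cycles_timeout;
}

/* Example values:
 *   timed_out(1000,  900, 500) -> false (100 cycles elapsed)
 *   timed_out( 900, 1000, 500) -> true  (delta wraps to 2^64 - 100)
 */
```

Under these assumptions, any poll timestamp that ends up earlier than the
send timestamp would make the datagram look as if it had timed out
immediately.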

 

Attempting to bypass that in the EoE thread, I commented out
master->receive_cb(master->cb_data);  and  master->send_cb(master->cb_data);

and once I invoke

ifconfig eoe0a1 up

 

I get an explosion of

 

[  123.529911] EtherCAT WARNING 0-main-0: Failed to receive mbox check datagram for eoe0a1.
[  123.529918] EtherCAT WARNING 0-main-0: Failed to receive mbox check datagram for eoe0a1.
[  123.529925] EtherCAT WARNING 0-main-0: Failed to receive mbox check datagram for eoe0a1.
[  123.529932] EtherCAT WARNING 0-main-0: Failed to receive mbox check datagram for eoe0a1.

 

 

If I comment out only master->receive_cb(master->cb_data);

 

I get no errors in dmesg, but then of course EoE is not functional, and the
EoE thread consumes more and more CPU.

 

As I understand it, invoking master->send_cb(master->cb_data); leads to

ec_master_internal_send_cb     -->     ecrt_master_send_ext(master);

which pulls datagrams from master->ext_datagram_queue, pushes them forward
with ec_master_queue_datagram(), and then invokes ecrt_master_send(master);
which collides with the ecrt_master_send() call in interrupt context.

 

So instead of invoking master->send_cb(master->cb_data); I tried only
passing on the datagrams from master->ext_datagram_queue, but that caused a
kernel panic.

 

 

So, if I want EoE to work when the master is active, how should I pass
datagrams from EoE thread to the master?

Should I change the ethernet.c state machine?

 

Thanks a lot,

 

Nir.

 
