[etherlab-dev] Possible Realtime Issues with Ethercat Master and RT Preempt Kernel

Gavin Lambert gavinl at compacsort.com
Thu Feb 4 00:31:04 CET 2016


Well, I guess that would work too, but I was thinking of a different arrangement.

I have the "real" output values stored in scattered memory locations (in an object graph related to their functions; not structured like the domain memory at all) and then the cyclic task uses EC_WRITE_* to copy the individual values from the objects to the domain memory.
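
Roughly like this (a minimal sketch against the userspace ecrt API; the object fields and offset variables are made up for illustration):

    uint8_t *pd = ecrt_domain_data(domain);  /* start of the domain image */

    /* Cyclic task: copy each "real" value from its own object straight into
     * the domain memory; the macros take care of the endian conversion.  The
     * *_off values would have been obtained earlier, e.g. via
     * ecrt_slave_config_reg_pdo_entry(). */
    EC_WRITE_U16(pd + conveyor->ctrl_word_off, conveyor->ctrl_word);
    EC_WRITE_S32(pd + conveyor->target_vel_off, conveyor->target_velocity);
    EC_WRITE_BIT(pd + gate->cmd_off, gate->cmd_bit, gate->commanded_state);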

That arrangement isn't really any different from having a secondary cache that can be memcpy'd, I guess, but it "feels" like less copying.  (Well, I suppose technically it might be slightly slower when doing the actual copy, but conversely it'd be faster at doing the calculations, so I think that's a wash.)

OTOH I'm not controlling precision motors, so calculation latency probably doesn't bother me as much as it does some others. :)             

> -----Original Message-----
> From: Graeme Foot [mailto:Graeme.Foot at touchcut.com]
> Sent: Thursday, 4 February 2016 12:21
> To: Gavin Lambert <gavin.lambert at compacsort.com>; 'Tillman, Scott'
> <Scott.Tillman at bhemail.com>; Dr.-Ing. Matthias Schöpfer
> <schoepfer at robolab.de>; etherlab-dev at etherlab.org
> Subject: RE: [etherlab-dev] Possible Realtime Issues with Ethercat Master and
> RT Preempt Kernel
> 
> Hi,
> 
> Yes, the EC_WRITE_* macros should still be used when writing to the cached
> write memory, but then a straight memcpy from the cache to the domain
> memory is fine.
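> 
> Something like this, say (a sketch only; the buffer size and offset are
> illustrative, and it assumes the cache mirrors the start of the domain
> image):
> 
>     static uint8_t out_cache[OUT_SIZE];    /* secondary write buffer */
> 
>     /* Application code, at any time: write through the macros, so the
>      * cache already holds little-endian data. */
>     EC_WRITE_U16(out_cache + off_ctrl_word, ctrl_word);
> 
>     /* Cyclic task, just before queuing the domain: one straight copy. */
>     memcpy(ecrt_domain_data(domain), out_cache, OUT_SIZE);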
> 
> Graeme.
> 
> 
> -----Original Message-----
> From: Gavin Lambert [mailto:gavinl at compacsort.com]
> Sent: Thursday, 4 February 2016 11:48 a.m.
> To: 'Tillman, Scott'; Graeme Foot; Dr.-Ing. Matthias Schöpfer; etherlab-
> dev at etherlab.org
> Subject: RE: [etherlab-dev] Possible Realtime Issues with Ethercat Master and
> RT Preempt Kernel
> 
> On 3 February 2016 21:02, quoth Tillman, Scott:
> > Since you brought up the typical process cycle: I have been using a
> > process similar to the second one you describe.  I was very surprised
> > during my initial development that the output frame and the return
> > frame were overlaid, requiring double buffering of the output data.
> > It seems like you should be able to configure the domain to place the
> > return data in a separate (possibly neighboring) memory area.  As it
> > is, the double buffering is the same idea, but it causes an extra
> > memcpy just prior to sending the domain data.
> 
> The expectation is that you'll use the EC_WRITE_* macros to insert values into
> the domain memory; this takes care of byte-swapping to little-endian for you if
> you happen to be running on a big-endian machine.  You can usually only get
> away with a blanket memcpy if you know your master code will only ever run on
> little-endian machines.
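> 
> Conceptually it's the difference between these two (a sketch; the exact
> macro definitions differ between the kernel and user-space builds):
> 
>     /* EC_WRITE_U16 does roughly this -- an explicit little-endian store: */
>     *(uint16_t *) (pd + off) = cpu_to_le16(value);
> 
>     /* ...whereas a blanket memcpy stores host byte order, which is only
>      * correct when the host itself is little-endian: */
>     memcpy(pd + off, &value, sizeof(value));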
> 
> > More problematic is the absence of any way to block (in user space)
> > waiting for the domain's return packet.  As it is, I am setting up my
> > clock at 0.5 ms to handle a 1 ms frame time:
> [...]
> > Are these two things there somewhere and I've just missed them, or is
> > there a good reason they haven't been implemented?  It seems like
> > these two items would minimize the overhead and maximize the
> > processing time available for most applications.
> 
> There isn't really a way to do that; it's a fundamental design choice of the
> master.  The EtherCAT-specific (native) network drivers disable interrupts and
> operate purely in polled mode, in order to avoid the latency of servicing an
> interrupt and the subsequent context switches to a kernel thread and then a
> user thread.  What gets sacrificed along the way is any ability to wake up a
> thread when the packet arrives, since nothing actually knows that the packet
> has arrived until it is polled for.
> 
> To put it another way, when the datagram arrives back from the slaves, it just
> sits in the network card's hardware buffer until a buffer read is triggered by
> an explicit call to ecrt_master_receive().
> 
> The generic drivers have interrupts enabled (so packets will be read out of the
> hardware buffer into a kernel buffer immediately), but the master still treats
> them as polled devices and won't react until explicitly asked to receive.
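> 
> In other words, the classic cycle is entirely pull-based (standard userspace
> ecrt calls, sketched):
> 
>     ecrt_master_receive(master);   /* pull frames from the NIC/kernel buffer */
>     ecrt_domain_process(domain);   /* check working counters, update image */
> 
>     /* ... read inputs with EC_READ_*, compute, write outputs ... */
> 
>     ecrt_domain_queue(domain);     /* re-queue the domain's datagrams */
>     ecrt_master_send(master);      /* push the frames out to the wire */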
> 
> With some patches (such that ecrt_master_receive() will tell you whether it has
> received all the datagrams back, or similar), you could call it repeatedly
> (perhaps with short sleeps) shortly after sending the datagrams, to detect them
> as soon as they're back again; but obviously this will increase the processor
> load and leave the system less time to do non-realtime things.  If you have
> some idle cores then this may not be a problem, and the quicker reaction may be
> worth it.
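> 
> The polling loop might look something like this (hypothetical: the stock
> ecrt_master_receive() doesn't report completion, so the return value below
> assumes one of the patches just mentioned):
> 
>     struct timespec pause = { 0, 10000 };   /* 10 us; tune to taste */
> 
>     ecrt_master_send(master);
>     for (;;) {
>         if (ecrt_master_receive(master))    /* hypothetical patched return */
>             break;                          /* all datagrams are back */
>         nanosleep(&pause, NULL);            /* short sleep; still burns CPU */
>     }
>     ecrt_domain_process(domain);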
> 
> Having said that, as long as your calculation time is fairly constant, it's
> probably better to use the "classic" cycle structure than to do this -- the
> exact same input values will be read either way, since they're captured at the
> slave's "input latch time", which is typically either just after the last
> datagram exchange or in anticipation of the next one.
> 



