[etherlab-dev] Support for multiple mailbox protocols

Gavin Lambert gavinl at compacsort.com
Thu Jun 26 04:19:16 CEST 2014


On 26 June 2014, quoth Knud Baastrup:
>> Additionally it doesn't look like you have any protection against 
>> concurrent CoE access (which TBH I'm not entirely sure whether this 
>> occurs, but Frank's patch 27 suggests it does), and I'm definitely 
>> not a fan of allocating/deallocating memory on each mailbox transfer, 
>> which is what it looks like you're doing.
>
> I believe that the check_mbox flag should work ok for concurrent CoE 
> access as well (however, I also cannot see how this could happen?), as 
> the check_mbox flag will ensure only one ongoing read request per 
> slave no matter which mailbox protocol is used.

The issue, as I understand it, is that fsm_master and fsm_slave each have
their own separate fsm_coe instance.  (Several other state machines hold
references to an fsm_coe, but it's always handed down from one or the other
of these two parents.)  So it comes down to whether fsm_master and fsm_slave
can execute their CoE-related parts concurrently.  I'm not entirely certain
about that from reading the code, but I should add that after applying
Frank's coe-lock patch I have observed cases where it reported concurrent
CoE access.  (I haven't been able to reproduce it in bench testing, but it
has occurred in field tests, so I'm not sure exactly where it's coming
from.)
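
To make the structural point concrete, here's a rough sketch (the struct
and field names are simplified stand-ins of my own, not the actual
definitions in the master):

#include <stdbool.h>

/* Illustrative sketch only: the struct and field names below are simplified
 * stand-ins for the real ec_fsm_master_t / ec_fsm_slave_t structures, not
 * the master's actual definitions.  The point is purely structural: each
 * parent FSM embeds its own CoE FSM, so nothing in the data layout
 * serializes their access to the same slave's mailbox. */

typedef struct {
    bool busy;                   /* SDO transfer in progress */
    /* ... transfer state ... */
} fsm_coe_sketch_t;

typedef struct {
    fsm_coe_sketch_t fsm_coe;    /* CoE FSM owned by the master FSM */
    /* ... */
} fsm_master_sketch_t;

typedef struct {
    fsm_coe_sketch_t fsm_coe;    /* a second, independent CoE FSM owned by
                                    the slave FSM */
    /* ... */
} fsm_slave_sketch_t;

/* If both parents happen to target the same slave in overlapping cycles,
 * both fsm_coe instances queue mailbox datagrams for that slave with
 * nothing coordinating them. */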

If there really is concurrent CoE going on, it's not a good idea to send
two CoE requests in parallel to the same slave -- some slaves can cope with
that (and send both replies), but some may choke, and the order in which
they reply is not guaranteed.  For one thing, your patch doesn't attempt to
control sending, only receiving; this could result in both requests being
sent, but the FSM that "wins" the check-lock might not be the one whose
answer arrives first.  And the non-atomic check I mentioned before could
result in both checks succeeding at once if they come from separate threads
(which is less likely than interleaved access from a single thread, but if
you didn't need to protect against threads you wouldn't have used a lock at
all).
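
For illustration, something along these lines is what I mean by an atomic
claim taken before sending, rather than a check done on receive (the names
and the use of C11 atomics here are mine, not the master's API; the kernel
code would use its own locking primitives):

#include <stdatomic.h>
#include <stdbool.h>

/* Minimal sketch of the kind of guard being discussed.  The names
 * (slave_mbox_sketch_t, coe_claim, coe_release) are hypothetical.  The
 * point is that the claim must happen atomically and *before* the request
 * datagram is queued, and be released only after the matching reply has
 * been fetched. */

typedef struct {
    atomic_flag mbox_busy;       /* set while one CoE transfer is in flight;
                                    initialise with ATOMIC_FLAG_INIT */
} slave_mbox_sketch_t;

/* Try to claim the slave's mailbox; returns true on success. */
static bool coe_claim(slave_mbox_sketch_t *s)
{
    /* A single indivisible test-and-set: two FSMs racing here cannot both
     * see "free" and both proceed, unlike a plain
     * "if (!busy) busy = true;" check. */
    return !atomic_flag_test_and_set(&s->mbox_busy);
}

/* Release once the reply for *our* request has been received (or on error). */
static void coe_release(slave_mbox_sketch_t *s)
{
    atomic_flag_clear(&s->mbox_busy);
}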

> Each fetch data datagram already allocates memory corresponding to 
> the mailbox size, so memory allocation is already heavily used.

I don't believe so.  It does call ec_datagram_prealloc each time, yes, but
that will only allocate memory if the datagram isn't already large enough;
it may take a few calls to fully expand the whole datagram ring buffer, but
after that it should be able to exchange datagrams of any equal or smaller
size without reallocation.  (That's why it's a prealloc, not an alloc.)
Your version, by contrast, frees and reallocates on every transfer, which
is what I'm objecting to.
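
To spell out the distinction, here's a stand-alone sketch of the grow-only
behaviour (this only mirrors the idea behind ec_datagram_prealloc, it's not
the actual code):

#include <stdint.h>
#include <stdlib.h>

/* Stand-alone sketch of the grow-only pattern described above. */

typedef struct {
    uint8_t *data;
    size_t   mem_size;           /* allocated capacity */
    size_t   data_size;          /* bytes used by the current datagram */
} datagram_sketch_t;

/* Grow the buffer only if the requested size exceeds the current capacity.
 * Returns 0 on success, -1 on allocation failure. */
static int datagram_prealloc_sketch(datagram_sketch_t *d, size_t size)
{
    uint8_t *new_data;

    if (size <= d->mem_size)
        return 0;                /* already big enough: no allocation */

    new_data = realloc(d->data, size);
    if (!new_data)
        return -1;

    d->data = new_data;
    d->mem_size = size;
    return 0;
}

/* Once the ring has grown to the largest mailbox size, every later transfer
 * of equal or smaller size hits the early return and never touches the
 * allocator -- unlike a free()/alloc() pair on every transfer. */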

Regards,
Gavin Lambert




