[etherlab-dev] Locking in etherlabmaster

Wed Feb 28 13:42:21 CET 2018

Hi all

I would like to initiate a discussion on how locking is done in
etherlabmaster.

How are applications using the etherlabmaster API supposed to handle
synchronization of etherlabmaster's internal data structures?

This is complicated by the fact that etherlabmaster is used both in normal
user-space context and in RTAI context, where (normal) blocking calls
are not allowed.

As I understand it, the initial approach for etherlabmaster was to leave the
problem of synchronization / locking to the application.

Later on, locking was introduced in the code, but without a clear plan
for it.  The result has been problems keeping locks out of real-time context,
race conditions due to insufficient locking, and high latency caused by
locks being held for extended periods of time.  I believe we have reached a
stage where it has become too difficult to make changes to locking-related
code in a safe way.  Most of us only understand the code either as compiled
with --enable-rtdm or without it.

To improve this situation, we could go in a number of different directions:

1. Drop locking as such (again), leaving the burden to application developers.
2. Improve the implementation by clearly specifying which locks exist, what
   they protect (and thus when to hold them), and in which order they are
   allowed to be taken.  This might be difficult to combine with RTAI, as
   synchronization between code that cannot use locks in RTAI context and
   other code using internal locks is not possible.
3. Rethink the API/ABI, making a clear split between real-time and
   non-real-time.  All real-time API/ABI calls must be lock-free.  And I am
   not just talking about removing the locks and letting the application
   handle it, but about changing the design of the API to be truly lock-free.
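
To make direction 3 a bit more concrete, here is a minimal sketch of one
common lock-free building block: a single-producer/single-consumer ring
buffer that hands requests from one context to another without either side
ever blocking.  Everything in it (spsc_ring_t, request_t, ring_push,
ring_pop, RING_SIZE) is hypothetical and not part of the etherlabmaster API.
It uses C11 atomics for illustration only; kernel or RTAI code would need the
equivalent primitives available in that environment.

```c
/* Sketch only: all names are hypothetical, not etherlabmaster API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* number of slots; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
              "RING_SIZE must be a power of two");

/* Hypothetical request passed between the two contexts. */
typedef struct {
    uint32_t id;
    uint32_t value;
} request_t;

typedef struct {
    request_t slots[RING_SIZE];
    atomic_uint head; /* next slot to write; modified only by the producer */
    atomic_uint tail; /* next slot to read; modified only by the consumer */
} spsc_ring_t;

static void ring_init(spsc_ring_t *r)
{
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/* Producer side.  Returns false instead of waiting when the ring is full. */
static bool ring_push(spsc_ring_t *r, const request_t *req)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false; /* full */

    r->slots[head & (RING_SIZE - 1u)] = *req;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side.  Returns false instead of waiting when the ring is empty. */
static bool ring_pop(spsc_ring_t *r, request_t *out)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false; /* empty */

    *out = r->slots[tail & (RING_SIZE - 1u)];
    /* Release: the slot is free for reuse only after it has been copied. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

The point of the sketch is that neither side takes a lock or sleeps: a full
or empty ring is reported to the caller, who decides what to do next.  It is
only safe with exactly one producer and one consumer, so it shows the general
idea rather than a complete design for multiple applications.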

Unfortunately, I believe all three directions described above bear the risk
of forking the codebase as we know it today.  But as the current situation
already does that (as witnessed by the user-space fork by Frank Heckenbach),
I don't think that it is a (new) problem.

IMHO, the best chance of keeping a common codebase is direction 3, a
lock-less real-time API.  A lock-less real-time API should be good for all
use cases, both normal user-space and RTAI, and both for a single application
and for multiple applications.  The only problem is that it will most likely
be a hard split from the current API/ABI, so legacy applications will keep
the current API/ABI alive.

Please let me know what you think.  Am I the only one seeing a problem here?
Does anyone have other good ideas on how to approach this?

/Esben