[etherlab-dev] Userspace fork of Etherlab

Fri Jan 16 14:00:18 CET 2015

Florian Pose wrote:

> On Wed, Nov 26, 2014 at 06:31:18AM +0100, Frank Heckenbach wrote:
> > this is to announce my plan to port the Etherlab code to userspace.
> > I'll explain my reasons and roadmap below. If someone is interested
> > in this or has some comments, please let me know. Otherwise, I
> > expect to proceed on my own and publish the result on my web site
> > when finished.
> 
> we discussed many of your arguments before, so let me comment on them
> below.
> 
> > Reasons:
> > 
> > - I had to make a number of changes to the Etherlab code to fix some
> >   bugs and make it useable to us. The Etherlab developers are
> >   obviously not interested in those changes, so I have to maintain
> >   them myself. As was discussed on this list some months ago,
> >   keeping up with newer Etherlab versions would cost me additional
> >   maintenance and testing effort (already now since my changes are
> >   based on 1.5.0, whereas 1.5.2 has some conflicting changes), so
> >   I'd use my own fork of 1.5.0 which is known to work for us rather
> >   than 1.5.2 anyway.
> 
> I wrote that I *am* willing to include your patches (I already prepared
> the default branch to include them), but due to my running projects I am
> a little later than expected, but I'm sure I will manage it this year.

Just out of curiosity, did you actually do it? I searched the
repository ("default", "stable-1.5") for things from my patches, but
didn't find anything. (Or did I get confused by the versioning
again?)

Well, it's probably too late now anyway. I'll need to really start
my project next week.

We're also upgrading to a new kernel version and everything, which
means I'd first have to build a new RTAI kernel, which has always
been a bit problematic in my experience. OTOH, Martin Troxler sent
me patches from his started userspace port. So even if I had a
working 1.5.2 with my patches integrated, the userspace port in fact
seems the easier option to me now (especially in the long run).

> So please be patient, this is an open-source project.

This means there are always two options, integrate or fork. Well ...

> > - Keeping up with new kernel versions is also not always easy
> >   (especially for the drivers which are patched files from the
> >   standard kernel, but also other kernel interfaces are known to
> >   change often), whereas userspace code is much easier to maintain
> >   (incompatible library changes are quite rare).
> 
> The problem is that using a socket (generic driver) is not always
> sufficient. My first goal is to keep stable interfaces for RTAI, Xenomai
> (Kernel) and lxrt, posix (Userspace). Using a socket automatically uses
> the lower network stack layers and is *not* realtime-capable up to now
> (even with an RT-preempt-patched kernel). So the generic driver is one
> option. It may work, but there are setups where it definitely does not.
> That's why we keep the native drivers and have to stay in kernel (at
> least with the driver layer).

Latency was my main worry, so I did some timing tests. I used a
simple C program that sends and receives valid trivial EtherCAT
packets. I ran it in soft-realtime with CPU affinity (both of which
proved necessary to get the results I did). During the test I
exerted heavy load on the CPU, disk I/O and another ethernet
interface.
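
The test program itself is not part of this post. As a rough
illustration only, a trivial EtherCAT frame of the kind described
could be assembled as sketched below. The choice of a broadcast read
(BRD) datagram, the function name build_brd_frame and all of its
parameters are assumptions made for this sketch, not details taken
from the actual test. The frame is only built in memory; sending it
(for example through a raw AF_PACKET socket on Linux) needs elevated
privileges and is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN    14   /* destination MAC, source MAC, EtherType */
#define ETH_MIN_FRAME  60   /* minimum frame length, not counting the FCS */
#define ECAT_CMD_BRD   0x07 /* EtherCAT "broadcast read" command */
#define ECAT_MAX_DATA  1486 /* 1500 - 2 (EtherCAT hdr) - 12 (datagram) */

/* Store a 16-bit value in little-endian order, as EtherCAT fields use. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Hypothetical helper: build one Ethernet frame carrying a single BRD
 * datagram that reads data_len bytes from register address reg.
 * Returns the frame length in bytes, or 0 if it does not fit. */
size_t build_brd_frame(uint8_t *buf, size_t cap, const uint8_t src_mac[6],
                       uint8_t index, uint16_t reg, uint16_t data_len)
{
    size_t dgram_len = 10 + (size_t)data_len + 2; /* header, data, WKC */
    size_t len = ETH_HDR_LEN + 2 + dgram_len;
    size_t total = len < ETH_MIN_FRAME ? ETH_MIN_FRAME : len;
    uint8_t *p = buf;

    assert(buf != NULL && src_mac != NULL);
    if (data_len > ECAT_MAX_DATA || cap < total)
        return 0;

    memset(buf, 0, total);               /* zero data, WKC and padding */
    memset(p, 0xff, 6);          p += 6; /* broadcast destination */
    memcpy(p, src_mac, 6);       p += 6;
    *p++ = 0x88; *p++ = 0xA4;            /* EtherType 0x88A4 (EtherCAT) */

    /* EtherCAT header: 11-bit length, 1 reserved bit, 4-bit type (1). */
    put_le16(p, (uint16_t)(dgram_len | (1u << 12))); p += 2;

    *p++ = ECAT_CMD_BRD;                 /* command */
    *p++ = index;                        /* index, echoed in the reply */
    put_le16(p, 0);              p += 2; /* ADP: position, 0 for BRD */
    put_le16(p, reg);            p += 2; /* ADO: register offset */
    put_le16(p, data_len);       p += 2; /* length, "more" bit clear */
    put_le16(p, 0);              p += 2; /* IRQ field */
    p += data_len;                       /* data area, already zeroed */
    put_le16(p, 0);                      /* working counter starts at 0 */
    return total;
}
```

Under these assumptions, a 1-byte read of register 0x0000 produces a
29-byte frame that is padded to the 60-byte Ethernet minimum.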

My result was that with very small packets, I can get cycle times of
1500µs without overruns. As the size of the packet (or packets if
larger than MTU) per cycle increases, so does the cycle time to run
reliably, and with really large packets I can get cycle times such
that a bit more than 50% of theoretical bandwidth is used in either
direction (which seems quite reasonable since it's what I've
experienced and seen recommended for other communication protocols,
and it also proves full-duplex works, otherwise no more than 50%
would be possible).
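
To make "cycle time without overruns" concrete, here is a minimal
sketch of a fixed-period loop that counts overruns. It is not the
test program used for the measurements. The names
enter_soft_realtime, timespec_add_ns and run_cycles are hypothetical,
and the use of SCHED_FIFO, mlockall() and absolute-deadline sleeping
is an assumption about how such a soft-realtime test might be set up.
CPU affinity is not set in the code; it could be applied from outside,
for example with taskset -c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Request SCHED_FIFO and lock all memory. Returns 0 on success and -1
 * if either step failed (both usually need elevated privileges). */
int enter_soft_realtime(int priority)
{
    struct sched_param sp;
    int rc = 0;

    memset(&sp, 0, sizeof sp);
    sp.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        rc = -1;
    }
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        rc = -1;
    }
    return rc;
}

/* Advance an absolute deadline by ns nanoseconds. */
void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Call work() once per period for the given number of cycles.
 * A cycle counts as an overrun when work() returns after its deadline.
 * Deadlines are not resynchronized, so a late loop has to catch up.
 * Returns the number of overruns. */
long run_cycles(long cycles, long period_ns, void (*work)(void))
{
    struct timespec next, now;
    long overruns = 0;

    assert(period_ns > 0);
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (long i = 0; i < cycles; i++) {
        timespec_add_ns(&next, period_ns);
        if (work != NULL)
            work();               /* e.g. send one frame, read the reply */
        clock_gettime(CLOCK_MONOTONIC, &now);
        if (now.tv_sec > next.tv_sec ||
            (now.tv_sec == next.tv_sec && now.tv_nsec > next.tv_nsec))
            overruns++;
        else
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return overruns;
}
```

A test run would call enter_soft_realtime() once and then, for
example, run_cycles(100000, 2000000L, work) for a 500Hz cycle, where
work is whatever function sends and receives the packets.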

Our project runs at 2000µs (500Hz), and at that rate, I could get
packets of 5KB/cycle without overruns, which is way above what we
require. So the userspace port will only be for "slow" cycles (up to
500Hz), but that's what we need. Maybe in a few years, the standard
kernel's RT features will improve and the userspace code will allow
for faster cycle times without many changes.
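
The relation between packet size, cycle time and bandwidth share used
above is plain arithmetic and can be sketched as follows. The post
does not state the link speed, so the 100 Mbit/s used in the example
is an assumption, as is reading 1KB as 1000 bytes. Ethernet framing
overhead is ignored, and the function names are hypothetical.

```c
#include <assert.h>

/* Fraction (0..1) of the one-way link capacity used when sending
 * bytes_per_cycle every cycle_us microseconds. Bits per microsecond
 * are numerically equal to Mbit/s, so no further scaling is needed. */
double link_utilization(double bytes_per_cycle, double cycle_us,
                        double link_mbit_s)
{
    assert(cycle_us > 0.0 && link_mbit_s > 0.0);
    return bytes_per_cycle * 8.0 / cycle_us / link_mbit_s;
}

/* Shortest cycle time in microseconds that keeps the utilization at or
 * below max_util (for example 0.5 for 50%). */
double min_cycle_us(double bytes_per_cycle, double max_util,
                    double link_mbit_s)
{
    assert(max_util > 0.0 && link_mbit_s > 0.0);
    return bytes_per_cycle * 8.0 / (max_util * link_mbit_s);
}
```

With these assumed inputs, link_utilization(5000, 2000, 100) gives
0.2, i.e. 5KB per cycle at 500Hz would use about 20% of a 100 Mbit/s
link, and min_cycle_us(5000, 0.5, 100) gives 800, the cycle time at
which the same payload would reach the 50% mark.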

> > - So far we've been using RTAI for our realtime code. But that's
> >   also always been a bit troublesome (kernel version dependencies,
> >   high crash potential in case of problems, additional code with its
> >   own set of bugs, etc.), so we'd rather try to get rid of it
> >   anyway. Meanwhile the RT capabilities of the standard kernel have
> >   improved in recent years, and due to the wide availability of SMP,
> >   we can, if necessary, increase RT-ability by using CPU affinity
> >   (reserve one CPU for RT code, leave the other CPUs for the rest --
> >   of course, kernel code, esp. network drivers might need special
> >   consideration here).
> 
> We also dropped RTAI a long time ago for our projects, but you have to
> always be careful in what you support, because there are many users
> that still require RTAI, lxrt, etc.

I'm not criticizing your choice to still support RTAI, just saying
that I hope I won't need it anymore.

> > - Our application code uses a lot of floating point which is
> >   supported in RTAI (though with some quirks), not in non-RTAI
> >   kernel mode, but of course easily in userspace.
> 
> No one forces you to have your application in kernel space. The
> userspace-library offers the same API in userspace.
> 
> > - Userspace code is generally much easier to debug.
> > 
> > - If I had known and considered all this back then, I might have
> >   started with other code instead of Etherlab which is already
> >   userspace based, but has different interfaces (and possibly
> >   different bugs). But as things are now, since my code is tightly
> >   bound to the Etherlab interface, and well tested with (the patched
> >   version of) it, it seems easier to port this code to userspace
> >   than change my application code.
> 
> Your argumentation sounds a little bit like you don't know of the
> userspace library at all. Please take a look in the documentation and
> the lib/ subdirectory of the stable-1.5 branch. Our realtime
> applications of the last years are altogether in userspace.

I know about the userspace library. But moving to userspace is not
my main goal. My main goal is better maintainability for my
application, and getting rid of kernel dependencies is a step
towards this goal. Using the kernelspace Etherlab code with a
userspace application wouldn't help in any significant way since I'd
have the same amount of kernel dependencies (my application does not
contain many).

Besides, I'm not sure whether to trust the userspace library and the
cdev it uses. Don't take this personally, but given the number of
bugs I've found and fixed in the code, and the fact that the two
versions sometimes use parallel code to achieve the same things (cf.
my comment about my patch #27), you must understand that I'm at
least a bit skeptical. And even though I didn't use it much, I did
already seem to find some locking bugs in the cdev code (see my
patch #19). So realistically, when using this for my application,
I'd have to expect a long time of testing before I could consider it
stable ...

Regards,
Frank

-- 
Dipl.-Math. Frank Heckenbach <f.heckenbach at fh-soft.de>
Stubenlohstr. 6, 91052 Erlangen, Germany, +49-9131-21359
Systems Programming, Software Development, IT Consulting