[etherlab-dev] [etherlab-users] Using EtherCAT EoE and Build Errors

Frank Heckenbach f.heckenbach at fh-soft.de
Thu Aug 16 09:38:10 CEST 2012


Thomas Bitsky, Jr. wrote:

> I'm not sure if this is a problem in my source code or a bug in the code
> relating to synchronization.
> 
> So, my problem has become this: I can successfully use EoE when the
> EtherCAT network is not operational. I can successfully use the EtherCAT
> network in Operation if the virtual EoE interface is down, but if I put the
> EtherCAT network into Operation and use the callbacks to handle EoE, the
> entire computer locks up.
> 
> For reference:
> EtherCAT version: stable-1.5
> System: Linux laptop14 2.6.32-42-generic-pae #95-Ubuntu SMP Wed Jul 25
> 16:13:09 UTC 2012 i686 GNU/Linux
> I am not using any real-time extensions.
> GCC: 4.4
> 
>
> The problem could very well be in my source code, although I've matched it
> closely to the EtherLAB examples. Once the virtual EoE interface goes up,
> the kernel log is filling with the errors I mentioned before:
> 
> [ 2687.384659]
> [ 2687.384665] Pid: 0, comm: swapper Tainted: P        W
>   (2.6.32-42-generic-pae #95-Ubuntu) Latitude E6510
> [ 2687.384672] EIP: 0060:[<c03ac336>] EFLAGS: 00000202 CPU: 3
> [ 2687.384680] EIP is at acpi_idle_enter_bm+0x275/0x2a4
> [ 2687.384684] EAX: c088eb4c EBX: 00000ee7 ECX: 00000000 EDX: 03036000
> [ 2687.384689] ESI: 00000000 EDI: f6e404cc EBP: f74cbf78 ESP: f74cbf50
> [ 2687.384694]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [ 2687.384698] CR0: 8005003b CR2: b94c0004 CR3: 00799000 CR4: 000006f0
> [ 2687.384703] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [ 2687.384707] DR6: ffff0ff0 DR7: 00000400
> [ 2687.384710] Call Trace:
> [ 2687.384719]  [<c04ca54a>] cpuidle_idle_call+0x7a/0x100
> [ 2687.384727]  [<c01085a4>] cpu_idle+0x94/0xd0
> [ 2687.384735]  [<c05b31b7>] start_secondary+0xc4/0xc6
> [ 2687.393250] BUG: scheduling while atomic: swapper/0/0x10000100
> [ 2687.393256] Modules linked in: durability ec_generic ec_e1000
> ec_8139too ec_master mii michael_mic arc4 binfmt_misc snd_hda_codec_idt
> 
> 
> The notable parts of my code are the callback:
> 
> void
> send_callback(void *cb_data)
> {
>         ec_master_t *m = (ec_master_t *) cb_data;
>         down(&master_sem);
>         ecrt_master_send_ext(m);
>         up(&master_sem);
> }
> 
> void
> receive_callback(void *cb_data)
> {
>         ec_master_t *m = (ec_master_t *) cb_data;
>         down(&master_sem);
>         ecrt_master_receive(m);
>         up(&master_sem);
> }
> 
> If they are not in the program and activated by ecrt_master_callbacks, then
> there is no lock-up. Of course, EoE doesn't work. Once I put them in, the
> system hangs in about 30 seconds or less. I can't see any obvious reason
> for this; it looks like a deadlock. I added traces and watched the kernel
> log viewer; I think it's locking up in ecrt_master_send_ext and not
> returning.
> 
> In any case, I've been working on this for five days. If anyone can shine
> some light on what I'm doing wrong, or how I can fix this, I'd appreciate
> it.
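
A side note on the quoted log: "BUG: scheduling while atomic" is what
the kernel prints when code that may sleep is run in a context that must
not sleep, and down() may sleep. Whether that is what happens in your
setup is only an assumption on my part, and it is separate from the
master bugs I mention below. As a rough illustration of a callback that
never waits for the lock, here is a user-space sketch. The names
master_lock, frames_sent, frames_skipped and send_callback_nonblocking
are made up for this example, and the counter only stands in for the
call to ecrt_master_send_ext().

```c
#include <pthread.h>
#include <stdbool.h>

/* User-space stand-in for master_sem (hypothetical name). */
static pthread_mutex_t master_lock = PTHREAD_MUTEX_INITIALIZER;

/* Bookkeeping for the sketch only. */
static int frames_sent = 0;
static int frames_skipped = 0;

/*
 * Hypothetical non-blocking variant of send_callback():
 * if the lock is busy, skip this cycle instead of waiting.
 */
static bool send_callback_nonblocking(void)
{
    if (pthread_mutex_trylock(&master_lock) != 0) {
        frames_skipped++;   /* lock held elsewhere: do not wait */
        return false;
    }
    frames_sent++;          /* placeholder for ecrt_master_send_ext(m) */
    pthread_mutex_unlock(&master_lock);
    return true;
}
```

In kernel code the corresponding non-blocking call for a semaphore is
down_trylock(), which returns 0 when the semaphore was acquired. A
skipped cycle delays the EoE traffic by one period, so this is a
trade-off and not a drop-in fix.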

This looks very much like a problem I experienced some time ago.
Since then, I've investigated the issue and found several bugs in
the code.

I think I've fixed all the problems now. We've been using the
patched code for a while, and it seems to run stably.

I plan to send my patches to this list, though I'll need a few more
days to prepare them properly. If you need them more urgently,
please contact me directly.

For reference, we're using EtherCAT Master 1.5 with an RTAI Linux
2.6.24-16 kernel and mostly the e1000 driver (though the driver does
not seem pertinent to the problem). Since we saw the problem on that
setup, just using RTAI with the original, unpatched EtherCAT Master
probably won't help you.

Regards,
Frank

-- 
Dipl.-Math. Frank Heckenbach <f.heckenbach at fh-soft.de>
Stubenlohstr. 6, 91052 Erlangen, Germany, +49-9131-21359
Systems Programming, Software Development, IT Consulting

