[etherlab-users] Error events during long-term testing

Sebastien BLANCHET blanchet at iram.fr
Fri May 2 17:46:52 CEST 2014


Dear Johansen,

According to your log, the interval between consecutive error events 
varies widely, from under 3 days to almost 12 days, so it is probably not 
an integer overflow but something else.
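
The spacing can be checked directly from the syslog timestamps quoted 
below. Here is a minimal sketch in Python that takes the first timestamp 
of each of the five events. The year 2014 is an assumption taken from the 
date of this thread, since syslog does not record it, and the helper names 
are only illustrative.

```python
from datetime import datetime

# First syslog timestamp of each of the five events in the quoted log.
# Assumption: the year is 2014 (syslog lines carry no year).
# Note: "%b" relies on English month abbreviations (C/English locale).
YEAR = 2014
EVENTS = [
    "Mar 30 03:33:26",
    "Apr 6 20:30:45",
    "Apr 16 06:51:39",
    "Apr 27 21:47:13",
    "Apr 30 13:35:39",
]


def parse(stamp):
    """Turn a syslog-style 'Mon D HH:MM:SS' stamp into a naive datetime."""
    return datetime.strptime(f"{YEAR} {stamp}", "%Y %b %d %H:%M:%S")


def intervals_in_days(stamps):
    """Return the gaps between consecutive stamps, in days."""
    times = [parse(s) for s in stamps]
    return [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for start, days in zip(EVENTS, intervals_in_days(EVENTS)):
        print(f"next event {days:6.2f} days after {start}")
```

The gaps come out irregular, which is the argument against a counter that 
wraps after a fixed number of cycles.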

To be sure, change the cycle frequency and check whether the interval 
between errors changes with it.
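
To illustrate why this test is informative: if a counter incremented once 
per cycle were overflowing, the time between overflows would scale 
inversely with the cycle frequency. The sketch below is plain arithmetic. 
The per-cycle counter is a hypothetical, not something identified in the 
master code, and 1000 Hz is just an example of an alternative rate.

```python
SECONDS_PER_DAY = 86400


def wrap_period_days(bits, cycle_hz):
    """Days until a hypothetical `bits`-wide counter, incremented once
    per cycle at `cycle_hz`, wraps around."""
    return (2 ** bits) / cycle_hz / SECONDS_PER_DAY


if __name__ == "__main__":
    # 3125 Hz is the rate from the original post; 1000 Hz is an
    # arbitrary alternative chosen for comparison.
    for hz in (3125, 1000):
        for bits in (31, 32):
            days = wrap_period_days(bits, hz)
            print(f"{hz} Hz, {bits}-bit counter: wraps after {days:.2f} days")
```

If the error interval stays the same after changing the rate, a per-cycle 
counter overflow becomes unlikely. If it scales with the rate, that points 
towards one.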

Note: I have the same problem with a very similar configuration (except 
that I have an Intel PC instead of a PowerPC). Unfortunately, I have not 
found a solution yet.

regards,
---
Sebastien BLANCHET

On 05/02/2014 01:04 PM, Johansen Ernst wrote:
> Hello together,
>
> I'm doing a long-term test of:
>
> root@MTEST-PC-IFC12:~# ethercat version
>
> IgH EtherCAT master 1.5.2 2eff7c993a63
>
> Basically the application is working perfectly except for error events
> that take place approximately every 16 days. As my final application is
> running 24/7, this is a problem.
>
> I will continue testing for the months to come, but I have the feeling
> that some kind of "overflow" is happening.
>
> Does anybody have an explanation for what the bus is doing?
>
> My application is running on a PowerPC with GNU/Linux PREEMPT_RT.
>
> The application is in user space and is triggered cyclically at 3125 Hz.
>
> On the bus there is a small number of Beckhoff devices:
>
> root@MTEST-PC-IFC12:~# ethercat config
>
> 0:0  0x00000002/0x044c2c52  0  OP
>
> 0:1  0x00000002/0x10243052  1  OP
>
> 0:2  0x00000002/0x0c1d3052  2  OP
>
> 0:3  0x00000002/0x25213052  3  OP
>
> 0:4  0x00000002/0x084c3052  4  OP
>
> Below is the /var/log/syslog for the last month. There were a total of 5
> error events during this period of time.
>
> cut
>
> Mar 30 03:28:15 MTEST-PC-IFC12 -- MARK --
>
> Mar 30 03:33:26 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: 2 working
> counter changes - now 6/6EtherCAT 0: Domain 0: Working counter changed
> to 0/6
>
> Mar 30 03:33:26 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-2: Failed to
> receive AL state datagram: Datagram timed out.
>
> Mar 30 03:33:26 MTEST-PC-IFC12 kernel: EtherCAT 0: 0 slave(s) responding
> on main device.
>
> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 1562 times.
>
> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 382 datagrams
> TIMED OUT!
>
> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
> is Down
>
> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
> changed to DOWN.
>
> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 1930 times.
>
> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 80 datagrams
> TIMED OUT!
>
> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
> is Up - 100/Full
>
> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
> changed to UP.
>
> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: 5 slave(s) responding
> on main device.
>
> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
> device: SAFEOP, OP + ERROR.
>
> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: Scanning bus.
>
> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0-1: Slave has
> state error bit set (SAFEOP + ERROR)!
>
> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 4/6
>
> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 15 times.
>
> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 2 datagrams
> TIMED OUT!
>
> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0: Bus scanning
> completed in 1302 ms.
>
> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0: Using slave 0 as DC
> reference clock.
>
> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: No app_time
> received up to now, but master already active.
>
> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-1: AL status
> message 0x001B: "Sync manager watchdog".
>
> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0-1: Acknowledged state
> SAFEOP.
>
> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
> device: OP.
>
> Mar 30 03:33:31 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 6/6
>
> Mar 30 03:48:15 MTEST-PC-IFC12 -- MARK --
>
> cut
>
> Apr  6 20:28:17 MTEST-PC-IFC12 -- MARK --
>
> Apr  6 20:30:45 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: 2 working
> counter changes - now 6/6<7>EtherCAT WARNING 0: 1 datagram UNMATCHED!
>
> Apr  6 20:30:46 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 0/6
>
> Apr  6 20:30:46 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 1 time.
>
> Apr  6 20:48:17 MTEST-PC-IFC12 -- MARK --
>
> cut
>
> Apr 16 06:48:19 MTEST-PC-IFC12 -- MARK --
>
> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 6/6
>
> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 0/6EtherCAT ERROR 0-1: Failed to receive AL state
> datagram: Datagram timed out.
>
> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: 0 slave(s) responding
> on main device.
>
> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 699 times.
>
> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 187 datagrams
> TIMED OUT!
>
> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
> is Down
>
> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
> changed to DOWN.
>
> Apr 16 06:51:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 1196 times.
>
> Apr 16 06:51:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 73 datagrams
> TIMED OUT!
>
> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
> is Up - 100/Full
>
> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
> changed to UP.
>
> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: 5 slave(s) responding
> on main device.
>
> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
> device: SAFEOP, OP + ERROR.
>
> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: Scanning bus.
>
> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0-1: Slave has
> state error bit set (SAFEOP + ERROR)!
>
> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 4/6
>
> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 13 times.
>
> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 2 datagrams
> TIMED OUT!
>
> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Bus scanning
> completed in 1543 ms.
>
> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Using slave 0 as DC
> reference clock.
>
> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: No app_time
> received up to now, but master already active.
>
> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-1: AL status
> message 0x001B: "Sync manager watchdog".
>
> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT 0-1: Acknowledged state
> SAFEOP.
>
> Apr 16 06:51:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
> device: OP.
>
> Apr 16 06:51:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 6/6
>
> Apr 16 07:08:19 MTEST-PC-IFC12 -- MARK --
>
> cut
>
> Apr 27 21:08:22 MTEST-PC-IFC12 -- MARK --
>
> Apr 27 21:47:13 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: 2 working
> counter changes - now 6/6EtherCAT 0: Domain 0: Working counter changed
> to 0/6
>
> Apr 27 21:47:13 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 1 time.
>
> Apr 27 21:47:13 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 1 datagram
> UNMATCHED!
>
> Apr 27 22:08:22 MTEST-PC-IFC12 -- MARK --
>
> cut
>
> Apr 30 13:08:22 MTEST-PC-IFC12 -- MARK --
>
> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 6/6
>
> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 0/6EtherCAT ERROR 0-4: Failed to receive AL state
> datagram: Datagram timed out.
>
> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT 0: 0 slave(s) responding
> on main device.
>
> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 9 datagrams
> TIMED OUT!
>
> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 689 times.
>
> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
> is Down
>
> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
> changed to DOWN.
>
> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 99 datagrams
> TIMED OUT!
>
> Apr 30 13:35:41 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 102 times.
>
> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
> is Up - 100/Full
>
> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
> changed to UP.
>
> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: 5 slave(s) responding
> on main device.
>
> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
> device: SAFEOP, OP + ERROR.
>
> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Scanning bus.
>
> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0-1: Slave has
> state error bit set (SAFEOP + ERROR)!
>
> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 2 datagrams
> TIMED OUT!
>
> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 4/6
>
> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
> ef907e4c (domain0-0-main) was SKIPPED 14 times.
>
> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Bus scanning
> completed in 1325 ms.
>
> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Using slave 0 as DC
> reference clock.
>
> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: No app_time
> received up to now, but master already active.
>
> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-1: AL status
> message 0x001B: "Sync manager watchdog".
>
> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0-1: Acknowledged state
> SAFEOP.
>
> Apr 30 13:35:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
> device: OP.
>
> Apr 30 13:35:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
> counter changed to 6/6
>
> Apr 30 13:48:22 MTEST-PC-IFC12 -- MARK --
>
> cut
>
> Best regards,
>
> Ernst Johansen


