[etherlab-users] Error events during long-term testing
Sebastien BLANCHET
blanchet at iram.fr
Sat May 3 11:37:21 CEST 2014
Dear Ernst,
In my application the link never goes down; only the slaves disappear
every 24 hours (working counter changes to 0).
It is a specific problem, because I have other EtherCAT systems that have
run without a hiccup for thousands of hours.
So far I have not invested much time in this issue.
If I find a solution, I will post it on the mailing list.
regards,
---
Sebastien BLANCHET
On 05/02/2014 06:07 PM, Ernst Johansen wrote:
> Hello Sebastien,
>
> thank you for the feedback. There are two kinds of errors… The one where the link goes down has occurred 3 times, approx. 16 days apart. Unfortunately it's not exactly 15.9 days apart, which would correspond to 2**32 packets. I will wait for another error (it should come in 14 days), then change the test application and increase the cycle rate to 5 kHz.
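The 15.9-day figure can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical 32-bit counter that is incremented exactly once per application cycle; the function name `wrap_period_days` is made up for illustration, and whether such a counter exists anywhere in the master or the NIC driver is not established in this thread.

```python
SECONDS_PER_DAY = 86_400


def wrap_period_days(cycle_hz: float, counter_bits: int = 32) -> float:
    """Days until a counter incremented once per cycle wraps around.

    Assumption: exactly one increment per cycle, e.g. one frame per cycle.
    """
    return 2 ** counter_bits / cycle_hz / SECONDS_PER_DAY


if __name__ == "__main__":
    # 3125 Hz is the current cycle rate, 5000 Hz the planned one.
    for hz in (3125, 5000):
        print(f"{hz} Hz: {wrap_period_days(hz):.2f} days")
```

Under that assumption the wrap period is about 15.9 days at 3125 Hz and about 9.9 days at 5 kHz, so a packet-count overflow should show up sooner after the planned change of cycle rate.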
>
> Is the link going down in your application too?
>
> Best regards,
> Ernst
>
>
> On 02 May 2014, at 17:46, Sebastien BLANCHET <blanchet at iram.fr> wrote:
>
>> Dear Johansen,
>>
>> According to your log, the duration between two error events is variable, between 6 and 11 days, therefore it is probably not an integer overflow but something else.
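The spacing of the events can be computed directly from the timestamps. The following is only a sketch: the timestamps are the first log line of each of the five events in the syslog excerpt quoted further down, the year 2014 is assumed (syslog does not record it), and the helper names `parse` and `intervals_days` are hypothetical.

```python
from datetime import datetime

# (timestamp of the first log line of the event, link went down?)
# Copied from the quoted syslog excerpt.
EVENTS = [
    ("Mar 30 03:33:26", True),
    ("Apr 6 20:30:45", False),
    ("Apr 16 06:51:39", True),
    ("Apr 27 21:47:13", False),
    ("Apr 30 13:35:39", True),
]


def parse(stamp: str, year: int = 2014) -> datetime:
    """Parse a syslog timestamp; the year is an assumption.

    %b expects English month abbreviations (C/English locale).
    """
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def intervals_days(stamps):
    """Gaps in days between consecutive timestamps."""
    times = [parse(s) for s in stamps]
    return [(b - a).total_seconds() / 86_400 for a, b in zip(times, times[1:])]


all_gaps = intervals_days([s for s, _ in EVENTS])
link_down_gaps = intervals_days([s for s, down in EVENTS if down])

if __name__ == "__main__":
    print("all events:     ", [round(g, 1) for g in all_gaps])
    print("link-down only: ", [round(g, 1) for g in link_down_gaps])
```

With these timestamps the gaps between consecutive events come out at roughly 7.7, 9.4, 11.6 and 2.7 days, and the gaps between the three link-down events at roughly 17.1 and 14.3 days.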
>>
>> To be sure, change the frequency of the cycle to check if the error frequency depends on it.
>>
>> Note: I have the same problem with a very similar configuration (except that I have an Intel PC instead of a PowerPC). Unfortunately, I have not found a solution yet.
>>
>> regards,
>> ---
>> Sebastien BLANCHET
>>
>> On 05/02/2014 01:04 PM, Johansen Ernst wrote:
>>> Hello all,
>>>
>>> I'm doing a long-term test of:
>>>
>>> root@MTEST-PC-IFC12:~# ethercat version
>>>
>>> IgH EtherCAT master 1.5.2 2eff7c993a63
>>>
>>> Basically the application is working perfectly except for error events
>>> that take place approximately every 16 days... As my final application
>>> runs 24/7, this is a problem.
>>>
>>> I will continue testing for the months to come, but I have the feeling
>>> that some kind of "overflow" is happening.
>>>
>>> Does anybody have an explanation for what the bus is doing?
>>>
>>> My application is running on a PowerPC with GNU/Linux PREEMPT_RT.
>>>
>>> The application is in user space and is triggered cyclically at 3125Hz.
>>>
>>> On the bus there is a small number of Beckhoff devices:
>>>
>>> root@MTEST-PC-IFC12:~# ethercat config
>>>
>>> 0:0 0x00000002/0x044c2c52 0 OP
>>>
>>> 0:1 0x00000002/0x10243052 1 OP
>>>
>>> 0:2 0x00000002/0x0c1d3052 2 OP
>>>
>>> 0:3 0x00000002/0x25213052 3 OP
>>>
>>> 0:4 0x00000002/0x084c3052 4 OP
>>>
>>> Below is the /var/log/syslog for the last month. There were a total of 5
>>> error events during this period of time.
>>>
>>> cut
>>>
>>> Mar 30 03:28:15 MTEST-PC-IFC12 -- MARK --
>>>
>>> Mar 30 03:33:26 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: 2 working
>>> counter changes - now 6/6EtherCAT 0: Domain 0: Working counter changed
>>> to 0/6
>>>
>>> Mar 30 03:33:26 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-2: Failed to
>>> receive AL state datagram: Datagram timed out.
>>>
>>> Mar 30 03:33:26 MTEST-PC-IFC12 kernel: EtherCAT 0: 0 slave(s) responding
>>> on main device.
>>>
>>> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 1562 times.
>>>
>>> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 382 datagrams
>>> TIMED OUT!
>>>
>>> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
>>> is Down
>>>
>>> Mar 30 03:33:27 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
>>> changed to DOWN.
>>>
>>> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 1930 times.
>>>
>>> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 80 datagrams
>>> TIMED OUT!
>>>
>>> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
>>> is Up - 100/Full
>>>
>>> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
>>> changed to UP.
>>>
>>> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: 5 slave(s) responding
>>> on main device.
>>>
>>> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
>>> device: SAFEOP, OP + ERROR.
>>>
>>> Mar 30 03:33:28 MTEST-PC-IFC12 kernel: EtherCAT 0: Scanning bus.
>>>
>>> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0-1: Slave has
>>> state error bit set (SAFEOP + ERROR)!
>>>
>>> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 4/6
>>>
>>> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 15 times.
>>>
>>> Mar 30 03:33:29 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 2 datagrams
>>> TIMED OUT!
>>>
>>> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0: Bus scanning
>>> completed in 1302 ms.
>>>
>>> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0: Using slave 0 as DC
>>> reference clock.
>>>
>>> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: No app_time
>>> received up to now, but master already active.
>>>
>>> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-1: AL status
>>> message 0x001B: "Sync manager watchdog".
>>>
>>> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0-1: Acknowledged state
>>> SAFEOP.
>>>
>>> Mar 30 03:33:30 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
>>> device: OP.
>>>
>>> Mar 30 03:33:31 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 6/6
>>>
>>> Mar 30 03:48:15 MTEST-PC-IFC12 -- MARK --
>>>
>>> cut
>>>
>>> Apr 6 20:28:17 MTEST-PC-IFC12 -- MARK --
>>>
>>> Apr 6 20:30:45 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: 2 working
>>> counter changes - now 6/6<7>EtherCAT WARNING 0: 1 datagram UNMATCHED!
>>>
>>> Apr 6 20:30:46 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 0/6
>>>
>>> Apr 6 20:30:46 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 1 time.
>>>
>>> Apr 6 20:48:17 MTEST-PC-IFC12 -- MARK --
>>>
>>> cut
>>>
>>> Apr 16 06:48:19 MTEST-PC-IFC12 -- MARK --
>>>
>>> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 6/6
>>>
>>> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 0/6EtherCAT ERROR 0-1: Failed to receive AL state
>>> datagram: Datagram timed out.
>>>
>>> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: 0 slave(s) responding
>>> on main device.
>>>
>>> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 699 times.
>>>
>>> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 187 datagrams
>>> TIMED OUT!
>>>
>>> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
>>> is Down
>>>
>>> Apr 16 06:51:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
>>> changed to DOWN.
>>>
>>> Apr 16 06:51:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 1196 times.
>>>
>>> Apr 16 06:51:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 73 datagrams
>>> TIMED OUT!
>>>
>>> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
>>> is Up - 100/Full
>>>
>>> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
>>> changed to UP.
>>>
>>> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: 5 slave(s) responding
>>> on main device.
>>>
>>> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
>>> device: SAFEOP, OP + ERROR.
>>>
>>> Apr 16 06:51:41 MTEST-PC-IFC12 kernel: EtherCAT 0: Scanning bus.
>>>
>>> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0-1: Slave has
>>> state error bit set (SAFEOP + ERROR)!
>>>
>>> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 4/6
>>>
>>> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 13 times.
>>>
>>> Apr 16 06:51:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 2 datagrams
>>> TIMED OUT!
>>>
>>> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Bus scanning
>>> completed in 1543 ms.
>>>
>>> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Using slave 0 as DC
>>> reference clock.
>>>
>>> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: No app_time
>>> received up to now, but master already active.
>>>
>>> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-1: AL status
>>> message 0x001B: "Sync manager watchdog".
>>>
>>> Apr 16 06:51:43 MTEST-PC-IFC12 kernel: EtherCAT 0-1: Acknowledged state
>>> SAFEOP.
>>>
>>> Apr 16 06:51:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
>>> device: OP.
>>>
>>> Apr 16 06:51:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 6/6
>>>
>>> Apr 16 07:08:19 MTEST-PC-IFC12 -- MARK --
>>>
>>> cut
>>>
>>> Apr 27 21:08:22 MTEST-PC-IFC12 -- MARK --
>>>
>>> Apr 27 21:47:13 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: 2 working
>>> counter changes - now 6/6EtherCAT 0: Domain 0: Working counter changed
>>> to 0/6
>>>
>>> Apr 27 21:47:13 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 1 time.
>>>
>>> Apr 27 21:47:13 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 1 datagram
>>> UNMATCHED!
>>>
>>> Apr 27 22:08:22 MTEST-PC-IFC12 -- MARK --
>>>
>>> cut
>>>
>>> Apr 30 13:08:22 MTEST-PC-IFC12 -- MARK --
>>>
>>> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 6/6
>>>
>>> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 0/6EtherCAT ERROR 0-4: Failed to receive AL state
>>> datagram: Datagram timed out.
>>>
>>> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT 0: 0 slave(s) responding
>>> on main device.
>>>
>>> Apr 30 13:35:39 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 9 datagrams
>>> TIMED OUT!
>>>
>>> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 689 times.
>>>
>>> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
>>> is Down
>>>
>>> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
>>> changed to DOWN.
>>>
>>> Apr 30 13:35:40 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 99 datagrams
>>> TIMED OUT!
>>>
>>> Apr 30 13:35:41 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 102 times.
>>>
>>> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: libphy: mdio@ffe24520:03 - Link
>>> is Up - 100/Full
>>>
>>> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Link state of ecm0
>>> changed to UP.
>>>
>>> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: 5 slave(s) responding
>>> on main device.
>>>
>>> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
>>> device: SAFEOP, OP + ERROR.
>>>
>>> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT 0: Scanning bus.
>>>
>>> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0-1: Slave has
>>> state error bit set (SAFEOP + ERROR)!
>>>
>>> Apr 30 13:35:42 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: 2 datagrams
>>> TIMED OUT!
>>>
>>> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 4/6
>>>
>>> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT WARNING: Datagram
>>> ef907e4c (domain0-0-main) was SKIPPED 14 times.
>>>
>>> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Bus scanning
>>> completed in 1325 ms.
>>>
>>> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0: Using slave 0 as DC
>>> reference clock.
>>>
>>> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT WARNING 0: No app_time
>>> received up to now, but master already active.
>>>
>>> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT ERROR 0-1: AL status
>>> message 0x001B: "Sync manager watchdog".
>>>
>>> Apr 30 13:35:43 MTEST-PC-IFC12 kernel: EtherCAT 0-1: Acknowledged state
>>> SAFEOP.
>>>
>>> Apr 30 13:35:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Slave states on main
>>> device: OP.
>>>
>>> Apr 30 13:35:44 MTEST-PC-IFC12 kernel: EtherCAT 0: Domain 0: Working
>>> counter changed to 6/6
>>>
>>> Apr 30 13:48:22 MTEST-PC-IFC12 -- MARK --
>>>
>>> cut
>>>
>>> Best regards,
>>>
>>> Ernst Johansen
>>>
>>>
>>>
>>> _______________________________________________
>>> etherlab-users mailing list
>>> etherlab-users at etherlab.org
>>> http://lists.etherlab.org/mailman/listinfo/etherlab-users
>>>