[etherlab-users] Problem with distributed clocks in RTAI: "Slave did not sync after 5000 ms" & "No Sync Error"

Sat Oct 20 12:38:24 CEST 2018

Also notable: the userspace code reaches the 10-microsecond threshold
after 5.2 seconds, despite running at half the update rate of the RTAI
program (1 kHz vs. 2 kHz).

Best,
Mohsen

On Sat, Oct 20, 2018 at 1:58 PM Mohsen Alizadeh Noghani <
m.alizad3h at gmail.com> wrote:

> Update: I increased EC_DC_SYNC_WAIT_MS to 50000 (50 seconds) and set the
> debug level to 1 ("ethercat debug 1"). The closest slave 0 (the reference
> clock) gets to syncing is after about 48 seconds, where "abs_sync_diff" is
> approximately 1.156 ms.
> After that point the value diverges, reaching 1.77 seconds at the
> 50-second mark.
>
> kernel: [14573.717225] EtherCAT DEBUG 0-0: Sync after 47800 ms: 1156111 ns
> .
> .
> .
> kernel: [14575.919495] EtherCAT DEBUG 0-0: Sync after 49996 ms: 1771539607 ns
> kernel: [14575.923534] EtherCAT WARNING 0-0: Slave did not sync after 50012 ms.
> kernel: [14575.923536] EtherCAT DEBUG 0-0: app_start_time=593344410354840000
> kernel: [14575.923538] EtherCAT DEBUG 0-0: app_time=593344420279840000
> kernel: [14575.923539] EtherCAT DEBUG 0-0: start_time=593344420379840000
> kernel: [14575.923540] EtherCAT DEBUG 0-0: cycle_time=500000
> kernel: [14575.923542] EtherCAT DEBUG 0-0: shift_time=125000
> kernel: [14575.923543] EtherCAT DEBUG 0-0: remainder=0
> kernel: [14575.923544] EtherCAT DEBUG 0-0: start=593344420380465000
> kernel: [14575.923545] EtherCAT DEBUG 0-0: Setting DC cyclic operation start time to 593344420380465000.
> kernel: [14575.928611] EtherCAT DEBUG 0-0: Setting DC AssignActivate to 0x0300.
> kernel: [14575.941292] EtherCAT 0: Domain 0: Working counter changed to 3/6.
> kernel: [14576.000500] EtherCAT DEBUG 0-0: Processing register request...
> kernel: [14576.004685] EtherCAT DEBUG 0-0: Register request successful.
> kernel: [14576.050335] EtherCAT DEBUG 0-0: Now in SAFEOP.
>
> - Is there something wrong with the algorithm that nudges
> "abs_sync_diff" towards 0?
> - If so, why does it work perfectly fine in userspace, but not in RTAI?
>
> Best,
> Mohsen
>
> On Fri, Oct 19, 2018 at 1:05 PM Mohsen Alizadeh Noghani <
> m.alizad3h at gmail.com> wrote:
>
>> Hello everyone.
>> I'm using kernel 3.4.6, RTAI 4.0 and IgH Master 1.5.2.
>> When running a simple RTAI program
>> <https://github.com/mohse-n/L7N_EtherLab/blob/master/rtai/rtai_sample.c>
>> that uses distributed clocks (basically the dc_rtai example), I encounter
>> the following kernel log:
>>
>> kernel: [ 1891.643677] EtherCAT 0: Link state of ecm0 changed to UP.
>> kernel: [ 1891.647798] EtherCAT 0: 2 slave(s) responding on main device.
>> kernel: [ 1891.647800] EtherCAT 0: Slave states on main device: PREOP.
>> kernel: [ 1891.647837] EtherCAT 0: Scanning bus.
>> kernel: [ 1892.083268] EtherCAT 0: Bus scanning completed in 436 ms.
>> kernel: [ 1892.083271] EtherCAT 0: Using slave 0 as DC reference clock.
>> kernel: [ 1906.700138] EtherCAT: Requesting master 0...
>> kernel: [ 1906.700142] EtherCAT: Successfully requested master 0.
>> kernel: [ 1906.700160] EtherCAT 0: Domain0: Logical address 0x00000000, 24 byte, expected working counter 6.
>> kernel: [ 1906.700161] EtherCAT 0:   Datagram domain0-0-main: Logical offset 0x00000000, 24 byte, type LRW.
>> kernel: [ 1906.700185] EtherCAT 0: Master thread exited.
>> kernel: [ 1906.700187] EtherCAT 0: Starting EtherCAT-OP thread.
>> kernel: [ 1906.704215] ec_rtai_sample: RT timer started with 3116/3117 ticks.
>> kernel: [ 1906.704218] ec_rtai_sample: Initialized.
>> kernel: [ 1911.935059] EtherCAT WARNING 0-0: Slave did not sync after 5000 ms.
>> kernel: [ 1911.946039] EtherCAT 0: Domain 0: Working counter changed to 3/6.
>> kernel: [ 1914.070216] EtherCAT ERROR 0-0: Failed to set OP state, slave refused state change (SAFEOP + ERROR).
>> kernel: [ 1914.073870] EtherCAT ERROR 0-0: AL status message 0x002D: "No Sync Error".
>> kernel: [ 1914.081189] EtherCAT 0-0: Acknowledged state SAFEOP.
>> kernel: [ 1919.308375] EtherCAT WARNING 0-1: Slave did not sync after 5000 ms.
>> kernel: [ 1919.321187] EtherCAT 0: Domain 0: Working counter changed to 6/6.
>> kernel: [ 1921.449013] EtherCAT ERROR 0-1: Failed to set OP state, slave refused state change (SAFEOP + ERROR).
>> kernel: [ 1921.452670] EtherCAT ERROR 0-1: AL status message 0x002D: "No Sync Error".
>> kernel: [ 1921.459991] EtherCAT 0-1: Acknowledged state SAFEOP.
>> kernel: [ 1921.469158] EtherCAT 0: Slave states on main device: SAFEOP.
>>
>> The slaves (servo drives) then raise an alarm related to EtherCAT
>> communication.
>> Apparently, the slaves fail to sync within 5 seconds. But why?
>> (Note: I have tested the distributed clocks example in userspace and it
>> works, so I don't think the issue is on the slaves' side.)
>> Best,
>> Mohsen
>>
>
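
The DC start time in the quoted debug output can be reproduced by hand.
The sketch below assumes the master picks the next cycle boundary relative
to app_start_time and then adds the shift time. That formula is inferred
from the logged values rather than quoted from the master source, and the
function name is hypothetical.

```python
def dc_start_time(app_start_time, start_time, cycle_time, shift_time):
    """Return (remainder, start) in ns.

    Assumed formula (inferred from the log, not taken from the master source):
    align start_time to the cycle grid anchored at app_start_time, then add
    shift_time.
    """
    remainder = (start_time - app_start_time) % cycle_time
    start = start_time + cycle_time - remainder + shift_time
    return remainder, start


# Values copied from the kernel log of the 50 s run (all in ns).
app_start_time = 593344410354840000
app_time = 593344420279840000
start_time = 593344420379840000
cycle_time = 500000   # 2 kHz cycle
shift_time = 125000

remainder, start = dc_start_time(app_start_time, start_time, cycle_time, shift_time)
print(remainder, start)
print(start_time - app_time)       # offset between app_time and start_time
print(app_time - app_start_time)   # application time elapsed since start
```

With the logged inputs this gives remainder 0 and start
593344420380465000, matching the log. The same values put start_time
100 ms ahead of app_time, and app_time only 9.925 s past app_start_time
even though the sync wait lasted about 50 s. Whether that gap is
significant depends on how the application time is generated, which the
log alone does not show.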