[etherlab-dev] Hot plugged modules failing to read DC register

Gavin Lambert gavin.lambert at tomra.com
Thu Sep 26 06:21:25 CEST 2019


I'm not talking about the part of the master config where it uses the 0x050x registers to read the SII content; I'm talking about what happens before that.

My point was that even if I randomly power on a slave, I have never seen that bit be set to 1 unless the EEPROM was not programmed correctly, or a reload request was issued manually.

According to the ESC datasheets, it has to complete loading the EEPROM before "the PDI is available", which means that the slave app can't talk to the EtherCAT network in any way until after the EEPROM is loaded.

I assumed (and have been unable to refute in testing) that this also applies on the ECAT side - the slave is invisible until it has either successfully loaded the first 8 words of EEPROM or given up trying.  (The "rest" of EEPROM is irrelevant here - the ESC doesn't care about it, it's only used by the master and slave app, and doesn't affect the status of that bit.)  After all, part of the data in those first words tells the ESC how to configure its physical ports and what kind of link detection to use.

It's possible that this is some sort of hardware difference - I'm mostly using my own FPGA-based slaves.


Regarding the "rest" of the SII: at the time the master does the initial SII read, registers 0x0500 and 0x0501 should both be 0, so the master can talk to the EEPROM and the slave cannot.  The master should only set 0x0500 to 1 after it has finished, at which point the slave is allowed to do whatever it likes.  (And this all happens well after the read of 0x0910.)


Gavin Lambert
Senior Software Developer


COMPAC SORTING EQUIPMENT LTD | 4 Henderson Pl | Onehunga | Auckland 1061 | New Zealand
Switchboard: +64 96 34 00 88 | tomra.com<http://www.tomra.com>

The information contained in this communication and any attachment is confidential and may be legally privileged. It should only be read by the person(s) to whom it is addressed. If you have received this communication in error, please notify the sender and delete the communication.

From: Graeme Foot <Graeme.Foot at touchcut.com>
Sent: Thursday, 26 September 2019 15:39
To: Gavin Lambert <gavin.lambert at tomra.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

The SII information is read via registers, after trying to read the DC System time register.  Register 0x0502, bit 12, gives the EEPROM loading status:
0: EEPROM loaded, device information ok
1: EEPROM not loaded, device information not available (EEPROM loading in progress or finished with a failure)
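
A minimal sketch of that status check, assuming the 16-bit word read from 0x0502 is already available as an integer.  The function name and the sample values are made up for illustration; only the bit position and its meaning come from the description above.

```python
# Sketch only: decode the "EEPROM loading status" bit from the 16-bit
# word read at ESC register 0x0502.  Names are illustrative.
EEPROM_LOADING_STATUS_BIT = 12  # bit position as described above


def eeprom_loaded(status_word: int) -> bool:
    """Return True when bit 12 is 0 (EEPROM loaded, device information ok)."""
    return ((status_word >> EEPROM_LOADING_STATUS_BIT) & 1) == 0


# Hypothetical register values, not taken from a real trace.
print(eeprom_loaded(0x0000))  # bit 12 clear
print(eeprom_loaded(0x1000))  # bit 12 set
```

A master that sees the bit set would hold off reading SII data and poll again rather than trust what it reads.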

Generally the first 8 words will be available while the rest of the EEPROM is being read.  But you may be unlucky and read early and get some invalid data.  In this case you may get some intermittent weird problems that you can't track down.

If you continue to read the rest of the EEPROM before the loading status bit is cleared then you will definitely be reading invalid data.  The rest of the SII will be read the first time the slave is seen by the master, unless you are reading the SII from file.  On subsequent scans it will just read the first 8 words and match them to the cached version (if you are using that patch).


It's generally only a problem if slaves are being powered up after the master is already running (e.g.: a remote IO cabinet).

Graeme.

From: Gavin Lambert <gavin.lambert at tomra.com>
Sent: Thursday, 26 September 2019 2:28 PM
To: Graeme Foot <Graeme.Foot at touchcut.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

Having to wait for the EEPROM is also something unusual.  Usually the ESC just reads the first 8 words of EEPROM (which is relatively fast) and AFAIK it does this before the slave becomes "visible" to EtherCAT.  So normally if the master can see the "EEPROM not loaded" bit, this is a hard error that can only be fixed by reprogramming the EEPROM; it's not something that the master should be waiting for.

The slave app might want to read some configuration data from the EEPROM on boot as well, of course, but normally this either happens before bringing the ESC out of reset or it happens during a time window the master provides (via "configure --enable-sii-assign").  And in neither case should it be affecting reads of 0x0910.


Gavin Lambert
Senior Software Developer



From: Graeme Foot <Graeme.Foot at touchcut.com>
Sent: Thursday, 26 September 2019 14:17
To: Gavin Lambert <gavin.lambert at tomra.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

Yes, it is straight from boot.  It generally happens in a hotplug situation, because the master needs to be already running (so that it starts configuring the slave immediately) while the slave is still powering up.  This can happen if the modules are in a remote cabinet with their own power switch.

It happens in a simple setup (pc -> EK1110 -> EK1100 -> EL7342).
It doesn't happen when directly connected to my CX2020 (pc -> EL7342), because the EL7342 then becomes the DC ref clock.
It does happen if there's another module in between (pc -> EL1008 -> EL7342), because the EL7342 is then no longer the DC ref clock.

The last two cases don't generally happen in the field as the module is powered up at the same time as the PC so has had a chance to fully initialise before the master starts configuring it.

The slave is responding to 0x0010 and 0x0130.  It looks like it is correctly allowing register 0x0000 (12 bytes) to be read (the next step after 0x0130).  It then fails to read 0x0910.  It then continues on successfully with 0x0110 and reading the SII registers (although it has to wait until the EEPROM has finished reading).

It looks like I have this situation with the following modules:
EL5101
EL7332
EL7342

Next step, talking to Beckhoff.

Cheers,
Graeme.


From: Gavin Lambert <gavin.lambert at tomra.com>
Sent: Thursday, 26 September 2019 1:46 PM
To: Graeme Foot <Graeme.Foot at touchcut.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

Is this happening straight from boot, or is it initially working and then later going into a bad state?  I was assuming the latter.

If it's happening straight from boot, then you should probably ask Beckhoff why the slave isn't responding to 0x0910 shortly after boot (ideally with a Wireshark trace).  As far as I am aware, this is supposed to always work, if the slave is responding on EtherCAT at all.

(In order to reach the point where it tries to read 0x0910, the slave must be successfully replying to reads of 0x0130 and writes of 0x0010.)


Another thing you could possibly check is whether it happens when the slave is alone on the network or if it only happens when there is more than one slave.  For example, if you have some slaves that use a dip-switch station address or alias it's possible that this could be conflicting with the address assignment used by Etherlab.


Gavin Lambert
Senior Software Developer



From: Graeme Foot <Graeme.Foot at touchcut.com>
Sent: Thursday, 26 September 2019 13:29
To: Gavin Lambert <gavin.lambert at tomra.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

Hi,

There is no "AL status message" occurring in the syslog.  As such there is also no acknowledgement.

I just mentioned the reg_read request as an aside to say that I can't manually query the state of the slave while it is in SAFEOP+ERROR.

The error flag is set and the slave goes to SAFEOP+ERROR due to DC failing to sync.
DC fails to sync due to the master not setting the DC time offset.
The master does not set the DC time offset due to the has_dc_system_time flag not being set.
The has_dc_system_time flag is not set because the slave does not respond to the 0x0910 reg read request.


I'm not using any patch to disable automatic rescan as far as I'm aware.  I'm only using your patchset and the extras I've submitted.

Cheers,
Graeme.

From: Gavin Lambert <gavin.lambert at tomra.com>
Sent: Thursday, 26 September 2019 1:13 PM
To: Graeme Foot <Graeme.Foot at touchcut.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

I'm not talking about a reg_read request.

When the slave first goes into SAFEOP+ERROR, the master should detect this during its regular slave scan cycle (fsm_master_state_read_al_status).  It then calls ec_fsm_change_ack to acknowledge the ERROR state - as it does this, it should report something like this to the syslog (even at debug 0 level):

kernel: [49369.927687] EtherCAT ERROR 0-main-2: AL status message 0x001B: "Sync manager watchdog".
kernel: [49369.953541] EtherCAT 0-main-2: Acknowledged state SAFEOP.

This is the message that tells you why the slave was unhappy and went to the ERROR state in the first place.  After this, the slave should be in SAFEOP (no ERROR), and the master will usually then bring it back to INIT and do a full reconfigure and back to OP (if the master app is still running).

If you have the quick-op patch applied, and the AL status was 0x001B and not some other error (and the master hasn't restarted since the slave entered the error state), then instead of going back to INIT in the above case it will try to go straight back from SAFEOP to OP to reduce the downtime.  As mentioned in the patch notes, this is fine in most cases but it can upset some DC slaves (as it may not give them enough time to resync), so you may want to disable this feature via configure.  (You can clearly see in the syslogs when it's attempting a quick return to OP.)

The slave should always accept acknowledging the ERROR state and returning to INIT.
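
For reference, a small sketch of how an AL status value splits into a state and the ERROR indication.  It assumes the usual EtherCAT encoding (state in the low four bits, error indication in bit 4 of the value read from 0x0130); the helper name is made up for the example and is not part of the master.

```python
# Sketch only: split an AL status value into state name and ERROR indication.
# Encoding assumed: low nibble = state, bit 4 (0x10) = error indication.
AL_STATES = {0x01: "INIT", 0x02: "PREOP", 0x03: "BOOT", 0x04: "SAFEOP", 0x08: "OP"}
AL_ERROR_FLAG = 0x10


def describe_al_status(value: int) -> str:
    """Return e.g. 'SAFEOP + ERROR' for 0x14, or 'OP' for 0x08."""
    state = AL_STATES.get(value & 0x0F, "UNKNOWN")
    if value & AL_ERROR_FLAG:
        return state + " + ERROR"
    return state


print(describe_al_status(0x14))  # SAFEOP with the error indication set
print(describe_al_status(0x04))  # SAFEOP after the error has been acknowledged
```

Note that this error indication is a bit in the slave's own AL status; it is separate from the master-side slave->error_flag discussed below.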

The "slave has error_flag set" thing is different; that's referring to slave->error_flag, which is only set when the slave fails to respond correctly to requests (such as refusing to perform AL state changes or not responding to internal register requests), despite still responding to the broadcast AL state request (used to determine the number of active slaves).  The only way to clear this flag is to rescan, either via "ethercat rescan" or by the automatic rescan that it performs when the number of slaves changes (as it normally does during a genuine comms interruption).

So have a look further back in the logs to see why the error_flag is being set in the first place.

Though also make sure that you're not using any extra patches which disable the automatic rescan.  I recall someone on this list a while ago suggesting this was a good idea; it isn't.


Gavin Lambert
Senior Software Developer



From: Graeme Foot <Graeme.Foot at touchcut.com>
Sent: Thursday, 26 September 2019 12:32
To: Gavin Lambert <gavin.lambert at tomra.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

Hi Gavin,

When 0x0910 returns a working counter of 0 the master assumes the slave only supports delay measurement and the slave->has_dc_system_time flag remains zero.  If the flag is zero a few DC configuration steps are missed.  Subsequently, when the slave DC Assign Active value is set the slave has not been set up to use DC correctly and fails to sync.

So without the patch (extracts from attached "DC Sync Error.txt"):
[  270.369125] EtherCAT DEBUG 0-main-3: Ready for requests.
[  270.376908] EtherCAT DEBUG 0-main-3: Slave has no System Time register; delay measurement only.
[  270.393234] EtherCAT WARNING 0-main-3: SII Read Error, EEPROM not loaded.  Retrying...
[  270.476850] EtherCAT 0-main-3: SII EEPROM loaded.  Continuing.
[  270.916606] EtherCAT DEBUG 0-main-3: Unknown category type 0x003C.
[  270.916612] EtherCAT DEBUG 0-main-3: Slave is not in the state to do mailbox com (INIT), setting to PREOP.
[  270.916616] EtherCAT DEBUG 0-main-3: Configuring...
[  270.920595] EtherCAT DEBUG 0-main-3: Now in INIT.
[  270.920598] EtherCAT DEBUG 0-main-3: Clearing FMMU configurations...
[  270.920601] EtherCAT DEBUG 0-main-3: Clearing mailbox check flag...
[  270.922603] EtherCAT DEBUG 0-main-3: Clearing sync manager configurations...
[  270.924595] EtherCAT DEBUG 0-main-3: Configuring mailbox sync managers...
[  270.924600] EtherCAT DEBUG 0-main-3: SM0: Addr 0x1000, Size 128, Ctrl 0x26, En 1
[  270.924605] EtherCAT DEBUG 0-main-3: SM1: Addr 0x1080, Size 128, Ctrl 0x22, En 1
[  270.930589] EtherCAT DEBUG 0-main-3: Now in PREOP.
[  270.930592] EtherCAT DEBUG 0-main-3: Finished configuration.

[  273.639104] EtherCAT WARNING 0-main-3: Slave seems not to support distributed clocks!
[  273.647686] EtherCAT DEBUG 0-main-3: Setting DC cycle times to 1000000 / 0.
[  273.649100] EtherCAT DEBUG 0-main-3: Checking for synchrony.
[  273.743070] EtherCAT DEBUG 0-main-3: Sync after  100 ms: 1828300146 ns
[  273.842996] EtherCAT DEBUG 0-main-3: Sync after  200 ms: 1828303744 ns
...
[  278.440471] EtherCAT DEBUG 0-main-3: Sync after 4800 ms: 1830172951 ns
[  278.540418] EtherCAT DEBUG 0-main-3: Sync after 4900 ms: 1830251113 ns
[  278.640369] EtherCAT WARNING 0-main-3: Slave did not sync after 5000 ms.
[  278.648810] EtherCAT DEBUG 0-main-3: app_start_time=0
[  278.648814] EtherCAT DEBUG 0-main-3:       app_time=289356014683
[  278.648817] EtherCAT DEBUG 0-main-3:     start_time=289456014683
[  278.648820] EtherCAT DEBUG 0-main-3:          cycle=1000000
[  278.648824] EtherCAT DEBUG 0-main-3:     shift_time=500000
[  278.648827] EtherCAT DEBUG 0-main-3:      remainder=14683
[  278.648830] EtherCAT DEBUG 0-main-3:          start=289457500000
[  278.648850] EtherCAT DEBUG 0-main-3: Setting DC cyclic operation start time to 289457500000.
[  278.650379] EtherCAT DEBUG 0-main-3: Setting DC AssignActivate to 0x0300.

[  278.695998] EtherCAT DEBUG 0-main-3: Now in SAFEOP.
[  283.688588] EtherCAT ERROR 0-main-3: Timeout while setting state OP.
[  283.726198] EtherCAT 0: Slave states on main device: PREOP, SAFEOP, OP + ERROR.
[  283.766130] EtherCAT DEBUG 0-main-3: SAFEOP -> SAFEOP + ERROR.

Compare that to the behaviour with the patch (extracts from the attached "DC With Patch.txt"):
[   47.873112] EtherCAT DEBUG 0-main-3: Ready for requests.
[   47.873140] EtherCAT DEBUG 0-main-3: Scanning slave 3 on main link.
[   47.881120] EtherCAT WARNING 0-main-3: Slave did not respond to System Time register request, retrying...
[   47.956443] EtherCAT 0-main-3: Slave has the System Time register.
[   48.021036] EtherCAT DEBUG 0-main-3: SII firmware file not found; reading SII data from slave.
[   48.400841] EtherCAT DEBUG 0-main-3: Unknown category type 0x003C.
[   48.400848] EtherCAT DEBUG 0-main-3: Slave is not in the state to do mailbox com (INIT), setting to PREOP.
[   48.400852] EtherCAT DEBUG 0-main-3: Configuring...
[   48.404827] EtherCAT DEBUG 0-main-3: Now in INIT.
[   48.404831] EtherCAT DEBUG 0-main-3: Clearing FMMU configurations...
[   48.404835] EtherCAT DEBUG 0-main-3: Clearing mailbox check flag...
[   48.406831] EtherCAT DEBUG 0-main-3: Clearing sync manager configurations...
[   48.408828] EtherCAT DEBUG 0-main-3: Clearing DC assignment...
[   48.410832] EtherCAT DEBUG 0-main-3: Configuring mailbox sync managers...
[   48.410839] EtherCAT DEBUG 0-main-3: SM0: Addr 0x1000, Size 128, Ctrl 0x26, En 1
[   48.410843] EtherCAT DEBUG 0-main-3: SM1: Addr 0x1080, Size 128, Ctrl 0x22, En 1
[   48.416831] EtherCAT DEBUG 0-main-3: Now in PREOP.
[   48.416836] EtherCAT DEBUG 0-main-3: Finished configuration.

[   49.991966] EtherCAT DEBUG 0-main-3: Checking system time offset.
[   49.993418] EtherCAT DEBUG 0-main-3: DC 64 bit system time offset calculation: system_time=2217603868, app_time=59887013582, diff=57669409714
[   49.993423] EtherCAT DEBUG 0-main-3: Setting time offset to 57669409714 (was 0)

[   51.152316] EtherCAT DEBUG 0-main-3: Setting DC cycle times to 1000000 / 0.
[   51.154320] EtherCAT DEBUG 0-main-3: Checking for synchrony.
[   51.256259] EtherCAT DEBUG 0-main-3: Sync after  100 ms:      14466 ns
[   51.354208] EtherCAT DEBUG 0-main-3: Sync after  200 ms:      14695 ns
...
[   51.955878] EtherCAT DEBUG 0-main-3: Sync after  800 ms:      12253 ns
[   52.053823] EtherCAT DEBUG 0-main-3: Sync after  900 ms:      11113 ns
[   52.133801] EtherCAT DEBUG 0-main-3: 10000 ns difference after 970 ms.
[   52.133806] EtherCAT DEBUG 0-main-3: app_start_time=0
[   52.133810] EtherCAT DEBUG 0-main-3:       app_time=62029021245
[   52.133813] EtherCAT DEBUG 0-main-3:     start_time=62129021245
[   52.133816] EtherCAT DEBUG 0-main-3:          cycle=1000000
[   52.133820] EtherCAT DEBUG 0-main-3:     shift_time=500000
[   52.133823] EtherCAT DEBUG 0-main-3:      remainder=21245
[   52.133826] EtherCAT DEBUG 0-main-3:          start=62130500000
[   52.133829] EtherCAT DEBUG 0-main-3: Setting DC cyclic operation start time to 62130500000.
[   52.135789] EtherCAT DEBUG 0-main-3: Setting DC AssignActivate to 0x0300.

[   52.184538] EtherCAT DEBUG 0-main-3: Now in SAFEOP.
[   52.749441] EtherCAT DEBUG 0-main-3: Now in OP. Finished configuration.
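
The start-time values in both extracts are consistent with the arithmetic sketched below.  This is inferred from the logged numbers rather than copied from the master source, so treat the function name, the parameter names and the 100 ms start offset as assumptions (the offset is simply the logged start_time minus app_time).

```python
# Sketch only: reproduce the logged "start" value from the other logged values.
START_OFFSET_NS = 100_000_000  # assumed: start_time - app_time in both extracts


def dc_cyclic_start(app_time: int, app_start_time: int,
                    cycle: int, shift_time: int) -> int:
    """Pick the next cycle boundary after start_time, plus the shift time."""
    start_time = app_time + START_OFFSET_NS
    remainder = (start_time - app_start_time) % cycle
    return start_time + cycle - remainder + shift_time


# Values taken from the two log extracts (all in ns).
print(dc_cyclic_start(289356014683, 0, 1000000, 500000))  # without the patch
print(dc_cyclic_start(62029021245, 0, 1000000, 500000))   # with the patch
```

The start time is computed the same way in both runs, so it does not account for the difference between them; the difference is that only the patched run sets the time offset first.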


Without the patch there is no AL Error code, just a failure to DC Sync.  Without the slave->has_dc_system_time flag being set the DC time offset is not applied so the slave has no chance to sync.
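
The offset in the "with patch" log can be reproduced with simple arithmetic.  The sketch below assumes the new offset is the difference between the application time and the slave's system time, added to the previous offset; that matches the logged numbers, but the function itself is only illustrative.

```python
# Sketch only: reproduce the logged "Setting time offset to ... (was 0)" value.
def dc_time_offset(system_time: int, app_time: int, old_offset: int = 0) -> int:
    """Offset that brings the slave's system time in line with app_time."""
    return (app_time - system_time) + old_offset


# Values from the "DC With Patch.txt" extract (ns).
print(dc_time_offset(system_time=2217603868, app_time=59887013582))
```

In the unpatched run this step is skipped, which fits the synchrony check there reporting a difference of about 1.83 s that never converges.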

Without the patch it took the slave approx. 108ms for the SII loading to be complete.  With the patch it took approx. 83ms for the 0x0910 register to respond.  They are generally a similar timeframe so I suspect the slave does not reliably respond while it is still initialising.
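
Those durations come straight from the kernel timestamps in the extracts above.  A small sketch of the calculation, with a made-up helper that only understands the "[seconds.micros]" prefix:

```python
# Sketch only: elapsed time between two syslog lines, from their kernel timestamps.
import re


def log_seconds(line: str) -> float:
    """Extract the kernel timestamp (seconds) from the start of a log line."""
    match = re.match(r"\[\s*(\d+\.\d+)\]", line)
    if match is None:
        raise ValueError("no kernel timestamp found")
    return float(match.group(1))


def elapsed_ms(first: str, last: str) -> float:
    return (log_seconds(last) - log_seconds(first)) * 1000.0


without_patch = elapsed_ms(
    "[  270.369125] EtherCAT DEBUG 0-main-3: Ready for requests.",
    "[  270.476850] EtherCAT 0-main-3: SII EEPROM loaded.  Continuing.",
)
with_patch = elapsed_ms(
    "[   47.873112] EtherCAT DEBUG 0-main-3: Ready for requests.",
    "[   47.956443] EtherCAT 0-main-3: Slave has the System Time register.",
)
print(f"{without_patch:.1f} ms, {with_patch:.1f} ms")
```

This gives roughly 108 ms and 83 ms, measured from "Ready for requests" in each run.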

Also, restarting the application or manually changing the module's state to INIT or PREOP and back to OP doesn't sort out the flag.  If there has been enough time for the module's DC clock to eventually sync it will reach OP, but if not it will go back to SAFEOP + ERROR.  It takes a "rescan" command to the master to sort out the flag so that the offset calculation can be performed.

I've also found that when the slave is in SAFEOP + ERROR reg_read requests (and pretty much all communications to the slave) fail due to "Aborting register request, slave has error flag set.".  So it's a little hard to manually check if there's any error status set.

Graeme.


From: Gavin Lambert <gavin.lambert at tomra.com>
Sent: Wednesday, 25 September 2019 5:06 PM
To: Graeme Foot <Graeme.Foot at touchcut.com>; etherlab-dev at etherlab.org
Subject: RE: Hot plugged modules failing to read DC register

When the slave goes to safeop+error it should also output an AL error code which might give a hint as to why.  This should be logged to the syslog when the master acknowledges the error.

AL error 0x001B, for example, indicates that the slave stopped receiving SM frames (typical of a comms interruption) - and features/quick-op in the patchset tries to do a quicker recovery for this case by trying to go straight back to OP instead of going through a full PREOP reconfiguration.  It's possible that some slaves may need the full reconfigure, so you can disable this behaviour at configure time.

Other AL error codes mean other things, such as your DC cycle being poorly synced and frames not occurring in a strict SYNC0-SM-SYNC0-SM ordering.


But I wouldn't normally expect any standard registers to fail WC when this occurs, unless perhaps the slave was performing a full power reset (or otherwise holding the slave's ESC in reset).  Though this would interrupt comms to any downstream slaves as well, so it's not something that slaves are supposed to do of their own accord.  (And it shouldn't stay in safeop+error in that case, it should revert to Init, although that's up to the slave implementation.)

90ms seems a bit slow for just an ESC power-on SII read, although it's possible that it's doing something more complicated.

I'm not really familiar with those modules, however; you're probably best off asking Beckhoff directly.


Gavin Lambert
Senior Software Developer



From: Graeme Foot
Sent: Tuesday, 24 September 2019 17:20
To: etherlab-dev at etherlab.org
Subject: [etherlab-dev] Hot plugged modules failing to read DC register

Hi,

I've had occasional issues with EL7332 and EL7342 modules where they go to SafeOp + Error if you try to use them in DC mode.  I've finally had some time to look into it a little further.

When the modules go to SafeOp + Error, the master outputs the message "Slave has no System Time register; delay measurement only." (with debug level 1).  This happens because the datagram reading register 0x0910 returns a working counter of zero.

I created a quick hack to retry reading the register up to 100 times before failing.  After approx. 90 ms, the EL7342 module I'm testing with returned the datagram successfully and the slave then entered Op state.

In my test setup I also have an EL5101 module that was doing the exact same thing (and taking around the same time), but I've never really had issues with them before.  I suspect the difference is that if you have incorrect settings on the EL7342 module and try to run a motor it can error out and reset itself, causing a situation equivalent to a hot plug.

Without my hack, both modules have to wait a similar length of time for the SII read to complete, so it looks like the slaves do not respond to the 0x0910 register request until the EEPROM read is complete.  Does anyone know if this is expected behaviour, or know of a better solution than retrying the register read (for up to 200 ms, perhaps)?
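
For illustration only, the retry idea described above could be sketched roughly as follows. This is not the actual patch and it does not use the master's real state-machine code: reg_read_fn, read_reg_with_retry, struct fake_slave and fake_read are all hypothetical names invented for this sketch, and the "slave" is simulated so the example can be compiled on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback (not part of the master's API): performs one read
 * of the given register and returns the datagram's working counter.
 * A working counter of 0 means the slave did not respond. */
typedef unsigned int (*reg_read_fn)(uint16_t reg, void *ctx);

/* Retry the register read until the working counter is non-zero or
 * max_attempts reads have been issued.
 * Returns the attempt number that succeeded, or 0 if every attempt failed. */
static unsigned int read_reg_with_retry(reg_read_fn read_fn, void *ctx,
        uint16_t reg, unsigned int max_attempts)
{
    assert(read_fn != NULL);

    for (unsigned int attempt = 1; attempt <= max_attempts; attempt++) {
        if (read_fn(reg, ctx) != 0)
            return attempt; /* slave answered */
    }
    return 0; /* gave up: treat as "no System Time register" */
}

/* Simulated slave (assumption for the demo): ignores the first
 * ready_after reads, then answers every read. */
struct fake_slave {
    unsigned int calls;
    unsigned int ready_after;
};

static unsigned int fake_read(uint16_t reg, void *ctx)
{
    struct fake_slave *s = ctx;

    (void) reg; /* the simulated slave treats all registers alike */
    s->calls++;
    return s->calls > s->ready_after ? 1 : 0;
}
```

With a simulated slave that ignores the first 90 reads, read_reg_with_retry(fake_read, &slave, 0x0910, 100) succeeds on attempt 91; with one that needs 150 reads it gives up after 100.  In a real master each attempt would presumably cost one state-machine cycle rather than being a tight loop, so a time-based limit may be a better fit than a fixed attempt count.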


Regards,
Graeme.