[etherlab-users] Diagnostics, crc and phy errors

Mon Nov 4 22:45:20 CET 2019

Hi,

It looks like it is just a dodgy link between device 5 and device 6.

To resolve, try the following (depending on what you have handy and assuming your drives are linked by patch cables):
- Unplug and replug the patch cable at both ends a few times (to try and get better contact)
- Take out the patch cable and use contact cleaner on the ends and the ports
- Toss the patch cable and replace (and use contact cleaner if you have it)

The incoming port shows phy errors if the link to the previous device is lost for long enough that the phy can detect the drop.  If the fault is milder than that (a bad cable, noise or similar), it may show up only as crc errors.  I don’t know for sure, but if, for example, the transmit wires have a dodgy connection while the receive wires are OK, the fault could show up as a phy / crc error in one direction only.
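
As a rough illustration of that reading of a single port's counters (a Python sketch only; classify_port is a made-up name and not part of the EtherLab tools):

```python
def classify_port(crc: int, phy: int) -> str:
    """Rough reading of one port's crc/phy counters, following the description above."""
    if phy > 0:
        # The link was down long enough for the phy to notice the drop.
        return "link drop"
    if crc > 0:
        # Milder fault: frames arrived but were corrupted (bad cable, noise, ...).
        return "corrupted frames"
    return "clean"


# P0 of device 6 in the posted log: crc 0, phy 98
print(classify_port(crc=0, phy=98))  # prints "link drop"
```

This only interprets the counters for one port; it says nothing about which end of the cable is at fault.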

It is unlikely to be an error on device 5 itself (unless the eth port is damaged).  If the device itself has a problem I often see the two devices either side of it with errors, with no error on the problem device in the middle.  e.g. (modifying your example):

2019-10-31 16:24:08.476

                        P0               P1
                  crc  phy  fwd    crc  phy  fwd
0:D2 CoE Driv       0    0    0      0    0   25
1:D2 CoE Driv       0    0   25      0    0   25
2:D2 CoE Driv       0    0   25      0    0   25
3:D2 CoE Driv       0    0   25      0    0   25
4:D2 CoE Driv       0    0   25      0   93   25
5:D2 CoE Driv       0    0    0      0    0    0
6:D2 CoE Driv       0   98   25      0    0   25
7:D2 CoE Driv       0    0   25      0    0   25
8:D2 CoE Driv       0    0   25      0    0   25
9:D2 CoE Driv       0    0   25      0    0   25

This usually indicates that the device in the middle repowered / reset, so the devices either side report the error, but the device causing the problem loses its count when it repowered / reset.  This is the case I have often found with bad Beckhoff ELxxxx modules.
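
The two patterns (errors on both sides of one link versus a zeroed device with errors on its neighbours) can be sketched as a small check. This is a minimal Python sketch under assumptions: the Counters tuple and diagnose function are hypothetical names, the counter values are typed in by hand from the tables in this thread rather than parsed from the tool output, and the fwd counters are only used to decide whether a device is completely zeroed.

```python
from typing import List, NamedTuple


class Counters(NamedTuple):
    """Per-device counters, in the column order of the tables in this thread."""
    p0_crc: int
    p0_phy: int
    p0_fwd: int
    p1_crc: int
    p1_phy: int
    p1_fwd: int


def port_err(crc: int, phy: int) -> bool:
    return crc > 0 or phy > 0


def diagnose(rows: List[Counters]) -> List[str]:
    # A device whose counters are all zero while both neighbours report errors
    # on the ports facing it has probably repowered / reset and lost its counts.
    reset = {
        i for i in range(1, len(rows) - 1)
        if not any(rows[i])
        and port_err(rows[i - 1].p1_crc, rows[i - 1].p1_phy)
        and port_err(rows[i + 1].p0_crc, rows[i + 1].p0_phy)
    }
    findings = [f"device {i}: possible repower/reset" for i in sorted(reset)]
    # Otherwise, errors on either end of a link point at the cable or connectors.
    for i in range(len(rows) - 1):
        if i in reset or i + 1 in reset:
            continue
        a, b = rows[i], rows[i + 1]
        if port_err(a.p1_crc, a.p1_phy) or port_err(b.p0_crc, b.p0_phy):
            findings.append(f"link {i}-{i + 1}: suspect cable or connector")
    return findings


first = Counters(0, 0, 0, 0, 0, 25)
clean = Counters(0, 0, 25, 0, 0, 25)
# Counters as originally posted: phy errors on P1 of device 5 and P0 of device 6.
reported = [first] + [clean] * 4 + [
    Counters(0, 0, 25, 0, 93, 25), Counters(0, 98, 25, 0, 0, 25)] + [clean] * 3
# Modified example: device 5 zeroed, errors on P1 of device 4 and P0 of device 6.
modified = [first] + [clean] * 3 + [
    Counters(0, 0, 25, 0, 93, 25), Counters(0, 0, 0, 0, 0, 0),
    Counters(0, 98, 25, 0, 0, 25)] + [clean] * 3

print(diagnose(reported))  # ['link 5-6: suspect cable or connector']
print(diagnose(modified))  # ['device 5: possible repower/reset']
```

The check is a heuristic only: it cannot spot a reset of the first or last device in the chain, and a zeroed device is reported as a reset only when both of its neighbours show errors on the ports facing it.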

Regards,
Graeme

From: etherlab-users <etherlab-users-bounces at etherlab.org> On Behalf Of Ignacio Rosales Gonzalez
Sent: Monday, 4 November 2019 11:55 PM
To: etherlab-users at etherlab.org
Subject: [etherlab-users] Diagnostics, crc and phy errors

Hello everybody,

I've been seeing some errors on two machines.
On both of them dmesg shows the same thing:

 [ 2093.602142] EtherCAT 0: Domain 0: Working counter changed to 18/111.
 [ 2093.608661] EtherCAT 0: 6 slave(s) responding on main device.
 [ 2093.640855] EtherCAT 0: Scanning bus.
 [ 2094.939242] EtherCAT 0: Bus scanning completed in 1332 ms.
 [ 2094.939248] EtherCAT 0: Using slave main-0 as DC reference clock.
 [ 2097.856277] EtherCAT 0: Domain 0: Working counter changed to 0/111.
 [ 2097.876311] EtherCAT 0: 37 slave(s) responding on main device.
 [ 2097.876312] EtherCAT 0: Slave states on main device: SAFEOP, OP + ERROR.
 [ 2098.032585] EtherCAT 0: Scanning bus.

However, when I log the output of the new ethercat crc command, the two machines behave differently.

On one of them, the crc counter and the phy counter of one of the servos go up at practically the same time.

On the other, only the phy counter of one servo increases:

2019-10-31 16:24:08.476

                        P0               P1
                  crc  phy  fwd    crc  phy  fwd
0:D2 CoE Driv       0    0    0      0    0   25
1:D2 CoE Driv       0    0   25      0    0   25
2:D2 CoE Driv       0    0   25      0    0   25
3:D2 CoE Driv       0    0   25      0    0   25
4:D2 CoE Driv       0    0   25      0    0   25
5:D2 CoE Driv       0    0   25      0   93   25
6:D2 CoE Driv       0   98   25      0    0   25
7:D2 CoE Driv       0    0   25      0    0   25
8:D2 CoE Driv       0    0   25      0    0   25
9:D2 CoE Driv       0    0   25      0    0   25

The easy solution is obviously to reconnect all the wires and swap the servos, but could anyone explain the difference between the two cases and shed some light on how to solve this kind of problem?
For example, in the second case the phy counter is incremented on P0 of servo drive 6 and on P1 of servo drive 5. Is this a symptom of a bad connection between 5 and 6, or of a fault in servo drive 5?

Kind regards

Ignacio Rosales