[etherlab-users] Intermittent Large number of datagrams UNMATCHED

Henry Bausley hbausley at deltatau.com
Wed Jul 6 22:43:37 CEST 2016


  Thanks for all the great suggestions everyone.  Now that I'm back from
holiday I will monitor registers 0x300 and 0x310 per the suggestions.
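
A minimal sketch of how the raw bytes read from those two register blocks might be split into per-port counters. The byte layout used here is an assumption based on the register ranges quoted below (0x0300:0x0307 and 0x0310:0x0313) and should be checked against the ESC datasheet: invalid frame counter and RX error counter interleaved per port from 0x0300, and one lost link counter byte per port from 0x0310. The function names are hypothetical and not part of the EtherLab tools; how the bytes are obtained (for example with the command line tool's register read) is left out.

```python
from typing import Dict, List


def decode_esc_error_counters(rx_block: bytes, lost_link_block: bytes) -> List[Dict[str, int]]:
    """Split raw ESC counter bytes into one record per port.

    Assumed layout (verify against the ESC datasheet of the slave):
      0x0300 + 2*port   invalid frame counter of that port
      0x0301 + 2*port   RX error counter of that port
      0x0310 + port     lost link counter of that port
    """
    if len(rx_block) != 8 or len(lost_link_block) != 4:
        raise ValueError("expected 8 bytes from 0x0300 and 4 bytes from 0x0310")

    ports = []
    for port in range(4):
        ports.append({
            "port": port,
            "invalid_frames": rx_block[2 * port],
            "rx_errors": rx_block[2 * port + 1],
            "lost_links": lost_link_block[port],
        })
    return ports


def ports_with_errors(ports: List[Dict[str, int]]) -> List[int]:
    """Return the port numbers on which any counter is non-zero."""
    return [
        p["port"]
        for p in ports
        if p["invalid_frames"] or p["rx_errors"] or p["lost_links"]
    ]
```

Comparing the flagged ports of neighbouring slaves should narrow the fault down to a single cable segment, provided the counters were zeroed after startup.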

Ralf,

   The drives we are using are Lenze i700 drives. 

  I will also try changing the network interface.  The interface currently giving the problem is reported as an
Intel 82579LM, which is just a PHY, so I assume the controller is in the chipset.
  
  I will try using the additional port on my PC which has an Intel 82574L and see if I have better luck.

  I will keep you posted.
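
To know when to read the counters, the kernel log can be scanned for the two warnings quoted in this thread. The sketch below assumes the message format is exactly as it appears in the log excerpts further down; the function name and the regular expression are illustrative only.

```python
import re
from typing import List, Tuple

# Matches lines such as:
#   [591785.735172] EtherCAT WARNING 0: 616 datagrams UNMATCHED!
#   [122502.320449] EtherCAT WARNING 0: 5 datagrams TIMED OUT!
WARNING_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+EtherCAT WARNING (?P<master>\d+): "
    r"(?P<count>\d+) datagrams? (?P<kind>UNMATCHED|TIMED OUT)!"
)


def find_datagram_warnings(log_text: str) -> List[Tuple[float, int, int, str]]:
    """Return (timestamp, master index, datagram count, kind) for each warning."""
    events = []
    for line in log_text.splitlines():
        m = WARNING_RE.search(line)
        if m:
            events.append(
                (float(m["ts"]), int(m["master"]), int(m["count"]), m["kind"])
            )
    return events
```

Feeding it the output of dmesg gives a list of events whose timestamps can be lined up with the counter readings.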


On Mon, 2016-07-04 at 08:29 +0200, Ralf Roesch wrote:
> We also are fighting with this type of problem on a customer laser
> cutting machine.
> Occasionally we see errors like this:
> [122501.934306] EtherCAT 0: Domain 0: Working counter changed to 0/9.
> [122501.934346] EtherCAT 0: Domain 1: Working counter changed to 0/9.
> [122502.320449] EtherCAT WARNING 0: 5 datagrams TIMED OUT!
> [122502.935224] EtherCAT 0: Domain 0: Working counter changed to 9/9.
> [122502.935265] EtherCAT 0: Domain 1: Working counter changed to 9/9.
> 
> This was the reason I modified the ethercat command line tool for
> extended diagnostics regarding several ESC error registers.
> 
> Attached you will find a patch which might help you.
> After applying and building the ethercat command line tool it will
> provide a new command "diag".
>       * Shortly after your ethercat master has been started
>         successfully, call:
>         ethercat diag -r
>         This will reset all slaves' ESC error registers, including the
>         Lost Link Counter Register and the RX Error Counter Register.
>       * If you detect an UNMATCHED or TIMED OUT error (sometimes
>         after hours or days), call:
>         ethercat diag
>         If you are lucky you will find one or more ESC errors
>         displayed on your console.
>         To better understand the displayed errors, you should refer
>         to the picture
>         http://www.automation.com/images/article/ethercat/Figure14.jpg
>         (part of
>         http://www.automation.com/automation-news/article/diagnostics-with-ethercat-part-4).
> 
> Would be happy about any kind of feedback.
> 
> 
> @Henry: which type of drives do you use?
> 
> 
> Regards,
> Ralf
> 
> 
> 
> On Mon Jul 04 2016 05:19:58 GMT+0200 (CEST), Graeme Foot
> <Graeme.Foot at touchcut.com> wrote:
> 
> > The only time we've had issues like that has been due to either a dodgy network cable or an RJ45 plug getting a bit grubby.  The first thing I usually do is unplug/replug all the plugs a few times to clean up the connections.  If it persists then I start looking for bad cables.
> > 
> > Another option is that there is an occasional noisy process causing noise on one of the links.
> > 
> > Once or twice (only on non-ethercat machines so far) we've had cables that were in drag chains wearing out, where it showed a problem when at a specific position of the drag chain.
> > 
> > You could track down whether it's a problem with a link between two particular slaves by checking each slave's Lost Link Counter and CRC Bad Counter values.
> > - Lost Link Counter Register (0x0310:0x0313)
> > - RX Error Counter Register (0x0300:0x0307)
> > 
> > This link describes some of the diagnostics:
> > http://www.automation.com/automation-news/article/diagnostics-with-ethercat-part-4
> > 
> > I think you can set the above registers to zero after the fieldbus is up and running, then you can check them if a problem occurs.
> > 
> > 
> > Haven't actually done it yet myself, so would be interested to see if it helps you.
> > 
> > 
> > Regards,
> > Graeme.
> > 
> > 
> > 
> > 
> > -----Original Message-----
> > From: etherlab-users [mailto:etherlab-users-bounces at etherlab.org] On Behalf Of Henry Bausley
> > Sent: Saturday, 2 July 2016 5:56 a.m.
> > To: etherlab-users at etherlab.org
> > Subject: [etherlab-users] Intermittent Large number of datagrams UNMATCHED
> > 
> > 
> > 
> > We have an EtherLab 1.5.2 kernel-mode application running in Xenomai
> > 2.4.6 on Ubuntu 14.04.1 Desktop that on rare occasions gets a large number of datagrams UNMATCHED.  It occurs at random times and relatively rarely, but when it occurs it can result in disaster, as we are running a large number of servos in torque mode.
> > 
> > For example, we can run continuously, 24 hours a day, for 5 days and then get a message like the one below.
> > 
> > [591785.735172] EtherCAT WARNING 0: 616 datagrams UNMATCHED!
> > I am struggling as to where to look.  Is this something in our app or a known bug in the stack?
> > 
> > 
> > 
> > 
> > 
> > 
> > _______________________________________________
> > etherlab-users mailing list
> > etherlab-users at etherlab.org
> > http://lists.etherlab.org/mailman/listinfo/etherlab-users
> 






