[etherlab-dev] CCAT startup failures
    Tillman, Scott 
    Scott.Tillman at bhemail.com
       
    Fri Oct 23 05:52:15 CEST 2015
    
    
  
I am having problems with the startup of the CCAT driver (both v12 to the new v13).  When I boot up the driver is loading (ccat shows via lsmod).  When I start the master it swaps over to ec_ccat, but sometimes the master never begins communicating on the bus.  In these situations I can often remove and reinsert the ec_ccat module and the master will start up.  Occasionally I get a hard lockup of the system (Alt-SysRq-B won't reboot) when trying to unload or reload the module, but that's possibly unrelated.  I'm looking for solutions to both problems (startup stall and hard lockups).
System details:
Beckhoff CX5130
                Ubuntu 14.04 (Server) with ubuntu-desktop  installed
                Kernel 3.16.7 with RTAI 4.1 ipipe patch
                Stable 1.5.2 Etherlab with recent CCAT v0.13 bundle
Some changes that are included in the bundle I've already posted
I am accessing the master via the user space library and using the LXRT library for accessing RTAI.
At this point I'm looking for suggestions on how to try to establish the root cause of these problems.  I don't have much experience debugging kernel crashes, and what experience I do have is 15 years out of date.  I have no prior exposure to the linux network subsystem.
Any help would be appreciated.  Below are snippets of the syslog around various interesting conditions.  Let me know if I can provide any other possibly useful data.
-Scott Tillman
Advisory Software Engineer
Bell + Howell
919-767-6745
What I know so far:
I occasionally see this constantly repeating in my syslog during the stalled state:
Oct 21 10:58:03 ubuntu kernel: [   17.184386] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 21 10:58:03 ubuntu kernel: [   17.237043] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 21 10:58:03 ubuntu kernel: [   17.289702] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 21 10:58:03 ubuntu kernel: [   17.342599] (unregistered net_device): BUG! Tx Ring full when queue awake!
etc.  This continues until I rmmod ec_ccat.
Relevant logs on initial startup try in stalled state:
Oct 22 23:25:58 ubuntu dhclient: DHCPDISCOVER on CX5130_x001 to 255.255.255.255 port 67 interval 3 (xid=0x730db6c3)
Oct 22 23:25:58 ubuntu kernel: [   12.476197] EtherCAT: Master driver 1.5.2 71593adce088+
Oct 22 23:25:58 ubuntu kernel: [   12.481558] EtherCAT 0: Registering RTDM device EtherCAT.
Oct 22 23:25:58 ubuntu kernel: [   12.481586] EtherCAT: 1 master waiting for devices.
Oct 22 23:25:58 ubuntu avahi-daemon[623]: Interface CX5130_ebus.IPv6 no longer relevant for mDNS.
Oct 22 23:25:58 ubuntu avahi-daemon[623]: Leaving mDNS multicast group on interface CX5130_ebus.IPv6 with address fe80::201:5ff:fe1a:d56d.
Oct 22 23:25:58 ubuntu avahi-daemon[623]: Withdrawing address record for fe80::201:5ff:fe1a:d56d on CX5130_ebus.
Oct 22 23:25:58 ubuntu avahi-daemon[623]: Withdrawing workstation service for CX5130_ebus.
Oct 22 23:25:58 ubuntu NetworkManager[964]:    SCPlugin-Ifupdown: devices removed (path: /sys/devices/virtual/net/CX5130_ebus, iface: CX5130_ebus)
Oct 22 23:25:58 ubuntu kernel: [   12.603567] ec_ccat: registered ec_ccat as gpio chip with #12 GPIOs.
Oct 22 23:25:58 ubuntu kernel: [   12.603770] EtherCAT: Accepting 00:01:05:1A:D5:6D as main device for master 0.
Oct 22 23:25:58 ubuntu kernel: [   12.603794] EtherCAT 0: Starting EtherCAT-IDLE thread.
Oct 22 23:25:58 ubuntu kernel: [   12.603846] (unregistered net_device): NIC Link is Up
Oct 22 23:25:58 ubuntu kernel: [   12.603870] EtherCAT 0: Link state of ecm0 changed to UP.
Oct 22 23:25:58 ubuntu kernel: [   12.608008] ec_ccat: registered ccat_update0.
Oct 22 23:25:58 ubuntu kernel: [   12.608031] ec_ccat: ccat_sram_probe: 0x0016 rev: 0x0000
Oct 22 23:25:58 ubuntu kernel: [   12.655994] EtherCAT WARNING 0: 1 datagram TIMED OUT!
Oct 22 23:25:59 ubuntu NetworkManager[964]: <warn> error monitoring device for netlink events: error processing netlink message: Object busy
Oct 22 23:25:59 ubuntu avahi-daemon[623]: Joining mDNS multicast group on interface CX5130_x001.IPv6 with address fe80::201:5ff:fe1d:7dac.
Oct 22 23:25:59 ubuntu avahi-daemon[623]: New relevant interface CX5130_x001.IPv6 for mDNS.
Oct 22 23:25:59 ubuntu avahi-daemon[623]: Registering new address record for fe80::201:5ff:fe1d:7dac on CX5130_x001.*.
Oct 22 23:25:59 ubuntu kernel: [   13.706520] EtherCAT WARNING 0: 20 datagrams TIMED OUT!
Oct 22 23:26:00 ubuntu kernel: [   14.748575] EtherCAT WARNING 0: 20 datagrams TIMED OUT!
Oct 22 23:26:01 ubuntu dbus[443]: [system] Activating service name='org.freedesktop.UPower' (using servicehelper)
Oct 22 23:26:01 ubuntu dbus[443]: [system] Successfully activated service 'org.freedesktop.UPower'
Oct 22 23:26:01 ubuntu dhclient: DHCPDISCOVER on CX5130_x001 to 255.255.255.255 port 67 interval 5 (xid=0x730db6c3)
Oct 22 23:26:02 ubuntu kernel: [   15.790895] EtherCAT WARNING 0: 20 datagrams TIMED OUT!
Oct 22 23:26:02 ubuntu kernel: [   15.893849] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:26:02 ubuntu kernel: [   15.946642] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:26:02 ubuntu kernel: [   16.000730] (unregistered net_device): BUG! Tx Ring full when queue awake!
At this time the 'ethercat master' command returns something like this:
Master0
  Phase: Idle
  Active: no
  Slaves: 0
  Ethernet devices:
    Main: 00:01:05:1a:d5:6d (attached)
      Link: UP
      Tx frames:   63
      Tx bytes:    3780
      Rx frames:   0
      Rx bytes:    0
      Tx errors:   1294
      Tx frame rate [1/s]:      0      0      0
      Tx rate [KByte/s]:      0.0    0.0    0.1
      Rx frame rate [1/s]:      0      0      0
      Rx rate [KByte/s]:      0.0    0.0    0.0
    Common:
      Tx frames:   63
      Tx bytes:    3780
      Rx frames:   0
      Rx bytes:    0
      Lost frames: 63
      Tx frame rate [1/s]:      0      0      0
      Tx rate [KByte/s]:      0.0    0.0    0.1
      Rx frame rate [1/s]:      0      4   2405
      Rx rate [KByte/s]:      0.0    4.7 1455.5
      Loss rate [1/s]:          0      0      0
      Frame loss [%]:         0.0  100.0  100.0
  Distributed clocks:
    Reference clock: None
    Application time: 0
                      2000-01-01 00:00:00.000000000
On an unsuccessful unload/reload of ec_ccat:
Oct 22 23:30:33 ubuntu kernel: [  287.196371] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:30:33 ubuntu kernel: [  287.249216] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:30:33 ubuntu kernel: [  287.301650] EtherCAT 0: Master thread exited.
Oct 22 23:30:33 ubuntu kernel: [  287.301675] EtherCAT 0: Releasing main device 00:01:05:1A:D5:6D.
Oct 22 23:30:40 ubuntu kernel: [  293.382320] ec_ccat: registered ec_ccat as gpio chip with #12 GPIOs.
Oct 22 23:30:40 ubuntu kernel: [  293.382606] EtherCAT: Accepting 00:01:05:1A:D5:6D as main device for master 0.
Oct 22 23:30:40 ubuntu kernel: [  293.382624] EtherCAT 0: Starting EtherCAT-IDLE thread.
Oct 22 23:30:40 ubuntu kernel: [  293.382691] (unregistered net_device): NIC Link is Up
Oct 22 23:30:40 ubuntu kernel: [  293.382715] EtherCAT 0: Link state of ecm0 changed to UP.
Oct 22 23:30:40 ubuntu kernel: [  293.384380] EtherCAT WARNING 0: 8 datagrams TIMED OUT!
Oct 22 23:30:40 ubuntu kernel: [  293.386028] ec_ccat: registered ccat_update0.
Oct 22 23:30:40 ubuntu kernel: [  293.386073] ec_ccat: ccat_sram_probe: 0x0016 rev: 0x0000
Oct 22 23:30:41 ubuntu kernel: [  294.431521] EtherCAT WARNING 0: 20 datagrams TIMED OUT!
Oct 22 23:30:42 ubuntu kernel: [  295.472201] EtherCAT WARNING 0: 20 datagrams TIMED OUT!
Oct 22 23:30:43 ubuntu kernel: [  296.511475] EtherCAT WARNING 0: 20 datagrams TIMED OUT!
Oct 22 23:30:43 ubuntu kernel: [  296.669281] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:30:43 ubuntu kernel: [  296.721893] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:30:43 ubuntu kernel: [  296.774468] (unregistered net_device): BUG! Tx Ring full when queue awake!
On a successful unload and reload of ec_ccat:
Oct 22 23:32:34 ubuntu kernel: [  408.038227] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:32:34 ubuntu kernel: [  408.090745] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:32:34 ubuntu kernel: [  408.143279] (unregistered net_device): BUG! Tx Ring full when queue awake!
Oct 22 23:32:34 ubuntu kernel: [  408.188016] EtherCAT 0: Master thread exited.
Oct 22 23:32:34 ubuntu kernel: [  408.188049] EtherCAT 0: Releasing main device 00:01:05:1A:D5:6D.
Oct 22 23:32:37 ubuntu kernel: [  411.085382] ec_ccat: registered ec_ccat as gpio chip with #12 GPIOs.
Oct 22 23:32:37 ubuntu kernel: [  411.085704] EtherCAT: Accepting 00:01:05:1A:D5:6D as main device for master 0.
Oct 22 23:32:37 ubuntu kernel: [  411.085726] EtherCAT 0: Starting EtherCAT-IDLE thread.
Oct 22 23:32:37 ubuntu kernel: [  411.085788] (unregistered net_device): NIC Link is Up
Oct 22 23:32:37 ubuntu kernel: [  411.085809] EtherCAT 0: Link state of ecm0 changed to UP.
Oct 22 23:32:37 ubuntu kernel: [  411.086907] ec_ccat: registered ccat_update0.
Oct 22 23:32:37 ubuntu kernel: [  411.086995] ec_ccat: ccat_sram_probe: 0x0016 rev: 0x0000
Oct 22 23:32:37 ubuntu kernel: [  411.090003] EtherCAT WARNING 0: 2 datagrams TIMED OUT!
Oct 22 23:32:37 ubuntu kernel: [  411.090014] EtherCAT WARNING 0: 1 datagram UNMATCHED!
Oct 22 23:32:37 ubuntu kernel: [  411.094116] EtherCAT 0: 14 slave(s) responding on main device.
Oct 22 23:32:37 ubuntu kernel: [  411.094128] EtherCAT 0: Slave states on main device: PREOP.
Oct 22 23:32:37 ubuntu kernel: [  411.094226] EtherCAT 0: Scanning bus.
Oct 22 23:32:39 ubuntu kernel: [  412.698595] EtherCAT 0: Bus scanning completed in 1604 ms.
Oct 22 23:32:39 ubuntu kernel: [  412.698606] EtherCAT 0: Using slave 0 as DC reference clock.
Oct 22 23:32:39 ubuntu kernel: [  412.700210] EtherCAT: Requesting master 0...
Oct 22 23:32:39 ubuntu kernel: [  412.700226] EtherCAT: Successfully requested master 0.
Oct 22 23:32:39 ubuntu kernel: [  412.709429] EtherCAT 0: Domain0: Logical address 0x00000000, 114 byte, expected working counter 32.
Oct 22 23:32:39 ubuntu kernel: [  412.709440] EtherCAT 0:   Datagram domain0-0-main: Logical offset 0x00000000, 114 byte, type LRW.
Oct 22 23:32:39 ubuntu kernel: [  412.709833] EtherCAT 0: Master thread exited.
Oct 22 23:32:39 ubuntu kernel: [  412.709858] EtherCAT 0: Starting EtherCAT-OP thread.
Oct 22 23:32:39 ubuntu kernel: [  412.710010] EtherCAT WARNING 0: 1 datagram TIMED OUT!
Oct 22 23:32:39 ubuntu kernel: [  412.742278] EtherCAT WARNING 0: No app_time received up to now, but master already active.
Oct 22 23:32:39 ubuntu kernel: [  412.784239] EtherCAT 0: Domain 0: Working counter changed to 1/32.
Oct 22 23:32:39 ubuntu kernel: [  412.895474] EtherCAT WARNING 0-2: Slave does not support changing the PDO mapping!
Oct 22 23:32:39 ubuntu kernel: [  412.895484] EtherCAT WARNING 0-2: Currently mapped PDO entries: 0x7010:01/1 0x7010:02/1 0x7010:03/1 0x7010:04/1 0x0000:00/4 0x0000:00/8 0x7010:11/32. Entries to map: 0x7010:01/1 0x7010:02/1 0x7010:03/1 0x7010:04/1 0x0000:00/12 0x7010:11/32
Oct 22 23:32:39 ubuntu kernel: [  413.102650] EtherCAT WARNING 0-3: Slave does not support changing the PDO mapping!
Oct 22 23:32:39 ubuntu kernel: [  413.102660] EtherCAT WARNING 0-3: Currently mapped PDO entries: 0x6000:01/1 0x6000:02/1 0x6000:03/2 0x6000:05/2 0x6000:07/1 0x0000:00/1 0x0000:00/6 0x1800:07/1 0x1800:09/1 0x6000:11/16. Entries to map: 0x6000:01/1 0x6000:02/1 0x6000:03/2 0x6000:05/2 0x6000:07/1 0x0000:01/1 0x0000:02/6 0x1800:07/1 0x1800:09/1 0x6000:11/16
Oct 22 23:32:40 ubuntu kernel: [  413.714330] EtherCAT WARNING 0: 6 datagrams UNMATCHED!
Oct 22 23:32:40 ubuntu kernel: [  413.715135] EtherCAT WARNING: Datagram ffff880096230d98 (domain0-0-main) was SKIPPED 6 times.
Oct 22 23:32:40 ubuntu kernel: [  413.787102] EtherCAT 0: Domain 0: 7 working counter changes - now 13/32.
Oct 22 23:32:40 ubuntu kernel: [  414.508889] EtherCAT 0: Slave states on main device: OP.
Oct 22 23:32:41 ubuntu kernel: [  414.792978] EtherCAT 0: Domain 0: 5 working counter changes - now 21/32.
I am aware of the PDO mapping warnings displayed (this system happened to have a new type of slice), but the stalls were occurring well before I added that device's support, plus the stalls are before any comms with the slaves anyway.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.etherlab.org/pipermail/etherlab-dev/attachments/20151023/e34748f6/attachment-0001.html>
    
    
More information about the etherlab-dev
mailing list