[etherlab-users] Unmatched/Skipped Datagrams

Gavin Lambert gavin.lambert at tomra.com
Mon Dec 16 23:26:06 CET 2019


Another possible cause is if your cyclic loop timing is off somehow, and you're calling ecrt_master_receive too soon after ecrt_master_send so that the packet hasn't had time to make it back to the master yet.

The timing code you have below looks correct, but bear in mind that the way wakeup_time is incremented is designed to smooth latency jitter but can result in a zero-time sleep if the task is already running behind for some reason.  So adding some logging of your wakeup_time vs. current time and other performance metrics could be useful as well.

Also, double check the code you're using to change the process priority; you didn't include that part below.  Using SCHED_FIFO requires root permissions but even then there can be restrictions.  Verify it's not failing.


Gavin Lambert
Senior Software Developer

 


COMPAC SORTING EQUIPMENT LTD | 4 Henderson Pl | Onehunga | Auckland 1061 | New Zealand
Switchboard: +49 2630 96520 | https://www.tomra.com

The information contained in this communication and any attachment is confidential and may be legally privileged. It should only be read by the person(s) to whom it is addressed. If you have received this communication in error, please notify the sender and delete the communication.
-----Original Message-----
From: Gavin Lambert
Sent: Tuesday, 17 December 2019 11:09
To: Gary Grobe <ggrobe at houstonmechatronics.com>; etherlab-users at etherlab.org
Subject: Re: [etherlab-users] Unmatched/Skipped Datagrams

There's a 4.19 version of all drivers (except r8169, because reasons), including igb, in the unofficial patchset (https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsourceforge.net%2Fu%2Fuecasm%2Fetherlab-patches%2Fci%2Fdefault%2Ftree%2F&data=02%7C01%7Cgavin.lambert%40tomra.com%7Cd8107ac890f84bfecff508d782748ed6%7C4308d118edd143008a37cfeba8ad5898%7C0%7C0%7C637121309444435878&sdata=zYsu8AzSjFAjmgkqI8MS9xAa0r4Rsowadhwiu0yuel4%3D&reserved=0).

It can be normal to see a small number of unmatched datagrams when you're starting your app up (and it's transitioning from background mode to realtime mode), and when you're replugging the network, but otherwise no, they should never appear.  It means the EtherCAT packets aren't getting all the way around the network.

It's possible to watch the flow of packets, either by using one of the built-in debug interfaces or (usually easier) by connecting a small dumb switch between the master and the first slave, and then connecting a third laptop/PC running Wireshark to the switch as well (preferably in such a way that the laptop can't send packets, but it's not critical).  This can sometimes give you more hints about what's going on.

Using "ethercat debug 1" and observing the syslog can also help.


Gavin Lambert
Senior Software Developer

 -----Original Message-----
From: Gary Grobe
Sent: Tuesday, 17 December 2019 10:30
To: etherlab-users at etherlab.org
Subject: Re: [etherlab-users] Unmatched/Skipped Datagrams

The ec_igb doesn't seem to be available for 4.19. If nothing else works, I may try to roll back the kernel as a last resort, but according to recent mailings, it would appear that others are running 4.19 LTS w/o issue.

I did just give things a try on new hardware:
- Linux 4.19.82-rt30 #1 SMP PREEMPT RT x86_64 GNU/Linux
- IgH EtherCAT master 1.5.2 6258f7d85afb+
- Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz
- ec_e1000e using the following NIC:

00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection I219-LM [8086:156f] (rev 31)
         Subsystem: Intel Corporation Ethernet Connection I219-LM [8086:0000]
         Kernel driver in use: ec_e1000e
         Kernel modules: e1000e

Also replaced cables, but even at 50Hz, I'm still seeing the following with the e1000e.

[11576.944102] EtherCAT WARNING 0: 7 datagrams UNMATCHED!
[11577.940744] EtherCAT WARNING: Datagram 000000004e9cc6e1
(domain0-0-main) was SKIPPED 13 times.
[11577.944111] EtherCAT WARNING 0: 13 datagrams UNMATCHED!
[11578.944115] EtherCAT WARNING 0: 13 datagrams UNMATCHED!
[11578.944153] EtherCAT WARNING: Datagram 000000004e9cc6e1
(domain0-0-main) was SKIPPED 20 times.
[11579.944090] EtherCAT WARNING 0: 17 datagrams UNMATCHED!
[11580.940728] EtherCAT WARNING: Datagram 000000004e9cc6e1
(domain0-0-main) was SKIPPED 25 times.
[11580.944071] EtherCAT WARNING 0: 9 datagrams UNMATCHED!
[11581.944077] EtherCAT WARNING 0: 31 datagrams UNMATCHED!
[11581.944091] EtherCAT WARNING: Datagram 000000004e9cc6e1
(domain0-0-main) was SKIPPED 32 times.
[11582.944104] EtherCAT WARNING 0: 4 datagrams UNMATCHED!
[11583.940692] EtherCAT WARNING: Datagram 000000004e9cc6e1
(domain0-0-main) was SKIPPED 18 times.
[11583.944069] EtherCAT WARNING 0: 19 datagrams UNMATCHED!
[11584.944048] EtherCAT WARNING 0: 3 datagrams UNMATCHED!
[11584.944179] EtherCAT WARNING: Datagram 000000004e9cc6e1
(domain0-0-main) was SKIPPED 8 times.
[11585.944076] EtherCAT WARNING 0: 30 datagrams UNMATCHED!

I've also replaced the controllers w/ a few in-house as well as Elmo motion controllers, but to no avail. Is it common/ok to have a small percentage of these messages to appear?

Any other thoughts?

On 12/15/19 3:55 PM, Gavin Lambert wrote:
> 1kHz cycle is typically too fast for the generic driver.  Try using the ec_igb driver instead, or reduce your cycle to 250 Hz or less.
>
> Otherwise, there may be some other issue with your network wiring, or your hardware.  Have you tried it on other hardware?
>
>
> Gavin Lambert
> Senior Software Developer
>
>   
>
>
> COMPAC SORTING EQUIPMENT LTD | 4 Henderson Pl | Onehunga | Auckland
> 1061 | New Zealand
> Switchboard: +49 2630 96520 | https://www.tomra.com
>
> The information contained in this communication and any attachment is confidential and may be legally privileged. It should only be read by the person(s) to whom it is addressed. If you have received this communication in error, please notify the sender and delete the communication.
> -----Original Message-----
> From: Gary Grobe
> Sent: Saturday, 14 December 2019 11:19
> To: etherlab-users at etherlab.org
> Subject: [etherlab-users] Unmatched/Skipped Datagrams
>
> My configuration is:
> - Linux 4.19.82-rt30 #1 SMP PREEMPT RT x86_64 GNU/Linux
> - IgH EtherCAT master 1.5.2 6258f7d85afb+
> - Intel(R) Celeron(R) CPU  J1900  @ 1.99GHz
> - ec_generic based on the following NIC:
>
> 03:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
>           Subsystem: Intel Corporation I211 Gigabit Network Connection [8086:0000]
>           Kernel driver in use: igb
>
> The app is a user-space client. The following is the cyclic task cut down to a minimum, and yet still shows the issue.
>
> {
>       // ...
>
>       struct sched_param param = { };
>
>       param.sched_priority = sched_get_priority_max(SCHED_FIFO);
>
>       if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
>           std::cout << "Warning: Failed to lock memory: " <<
> strerror(errno) <<
>               std::endl;
>       }
>
>       stack_prefault();
>
>       clock_gettime(CLOCK_MONOTONIC, &wakeup_time);
>       wakeup_time.tv_sec += 1;
>       wakeup_time.tv_nsec = 0;
>
>       while (running) {
>           ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &wakeup_time,
>               NULL);
>
>           if (ret) {
>               std::cerr << "clock_nanosleep(): " << strerror(ret) << 
> std::endl;
>
>               break;
>           }
>
>           task_cycle();
>
>           wakeup_time.tv_nsec += PERIOD_NS;
>
>           while (wakeup_time.tv_nsec >= NSEC_PER_SEC) {
>               wakeup_time.tv_nsec -= NSEC_PER_SEC;
>               wakeup_time.tv_sec++;
>           }
>       }
> }
>
> inline void TestChain::task_cycle()
> {
>       ecrt_master_receive(master);
>       ecrt_domain_process(domain);
>
>       ecrt_master_application_time(master, TIMESPEC2NS(wakeup_time));
>       ecrt_master_sync_reference_clock(master);
>       ecrt_master_sync_slave_clocks(master);
>
>       if (counter) {
>           counter--;
>       }
>       else {
>           counter = FREQUENCY;
>       }
>
>       ecrt_domain_queue(domain);
>       ecrt_master_send(master);
> }
>
> The following messages from dmesg were taken at 1kHz. Doesn't matter if it's doing 10Hz or 1kHz, the UNMATCHED and SKIPPED numbers are quite large. I've only seen 1Hz work w/o issue.
>
> [ 4132.457210] EtherCAT 0: Domain 0: 268 working counter changes - now 1/2.
> [ 4133.373743] EtherCAT WARNING 0: 657 datagrams UNMATCHED!
> [ 4133.461587] EtherCAT WARNING: Datagram 000000002f1447da
> (domain0-0-main) was SKIPPED 736 times.
> [ 4133.461593] EtherCAT 0: Domain 0: 395 working counter changes - now 0/2.
> [ 4134.374131] EtherCAT WARNING 0: 643 datagrams UNMATCHED!
> [ 4134.465961] EtherCAT WARNING: Datagram 000000002f1447da
> (domain0-0-main) was SKIPPED 776 times.
> [ 4134.465968] EtherCAT 0: Domain 0: 232 working counter changes - now 0/2.
> [ 4135.374332] EtherCAT WARNING 0: 850 datagrams UNMATCHED!
> [ 4135.470468] EtherCAT WARNING: Datagram 000000002f1447da
> (domain0-0-main) was SKIPPED 722 times.
> [ 4135.470476] EtherCAT 0: Domain 0: 331 working counter changes - now 1/2.
> [ 4136.374860] EtherCAT WARNING 0: 649 datagrams UNMATCHED!
> [ 4136.475180] EtherCAT WARNING: Datagram 000000002f1447da
> (domain0-0-main) was SKIPPED 858 times.
> [ 4136.475187] EtherCAT 0: Domain 0: 282 working counter changes - now 1/2.
> [ 4137.375178] EtherCAT WARNING 0: 757 datagrams UNMATCHED!
> [ 4138.319869] EtherCAT WARNING: Datagram 000000002f1447da
> (domain0-0-main) was SKIPPED 745 times.
> [ 4138.319877] EtherCAT 0: Domain 0: 208 working counter changes - now 1/2.
> [ 4138.375593] EtherCAT WARNING 0: 972 datagrams UNMATCHED!
> [ 4139.324269] EtherCAT WARNING: Datagram 000000002f1447da
> (domain0-0-main) was SKIPPED 911 times.
>

--
_The contents of this email message and any attachments are intended solely for the addressee(s).  This message (including any attachments) may contain confidential, proprietary, privileged and/or private information. The information is intended to be for the use of the individual or entity designated above. If you are not the intended recipient of this message, please notify the sender immediately, and delete the message and any attachments.  _Any disclosure, reproduction, distribution or other use of this message or any attachments by an individual or entity other than the intended recipient is prohibited. __ _______________________________________________
etherlab-users mailing list
etherlab-users at etherlab.org
https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.etherlab.org%2Fmailman%2Flistinfo%2Fetherlab-users&data=02%7C01%7Cgavin.lambert%40tomra.com%7Cd8107ac890f84bfecff508d782748ed6%7C4308d118edd143008a37cfeba8ad5898%7C0%7C0%7C637121309444445873&sdata=R2bB6zC6lYaR7mcWdONQRGmrG%2FvZICVl3b0ef6aBXHQ%3D&reserved=0
_______________________________________________
etherlab-users mailing list
etherlab-users at etherlab.org
https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.etherlab.org%2Fmailman%2Flistinfo%2Fetherlab-users&data=02%7C01%7Cgavin.lambert%40tomra.com%7Cd8107ac890f84bfecff508d782748ed6%7C4308d118edd143008a37cfeba8ad5898%7C0%7C0%7C637121309444445873&sdata=R2bB6zC6lYaR7mcWdONQRGmrG%2FvZICVl3b0ef6aBXHQ%3D&reserved=0


More information about the etherlab-users mailing list