[etherlab-users] R: System randomly freezes in multi-thread Qt application with a RT process

Dr.-Ing. Wilhelm Hagemeister hm at igh.de
Fri May 31 11:00:29 CEST 2019


Hi Simone,

in addition to Roberto's post...

We strictly separate the GUI from the real-time program. The real-time
process is the data producer and clients such as the GUI are consumers;
they communicate over a network protocol. This protocol separates
parameters from channels, is bidirectional, and supports multiple clients.
It also allows the real-time process and the GUI to run on separate
machines, which is mostly the case. It also makes debugging a lot easier
due to reduced complexity.
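
To illustrate the producer/consumer split in code: the sketch below is
not the pdserv API, only a generic in-process illustration under my own
assumptions. "Sample" and "SpscQueue" are made-up names. The cyclic task
pushes one sample per cycle into a single-producer/single-consumer ring
buffer and never waits for the reader, whether that reader is a GUI
thread or a thread serving network clients.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical sample type: one snapshot of process data per cycle.
struct Sample {
    unsigned long cycle;
    double velocity;
};

// Single-producer/single-consumer ring buffer holding up to N-1 items.
// push() and pop() never block, so the cyclic task is never held up
// by a slow consumer.
template <typename T, std::size_t N>
class SpscQueue {
    static_assert(N >= 2, "ring buffer needs at least two slots");

public:
    // Call only from the producer (real-time) side.
    bool push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;  // full: drop the sample instead of waiting
        }
        buf_[head] = value;
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Call only from the consumer (GUI or network) side.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = buf_[tail];
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }

private:
    std::array<T, N> buf_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

The design choice here is that a full buffer costs a dropped sample on
the consumer side rather than a missed deadline on the real-time side.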

"pdcom" and "pdserv" are the well-tested communication libraries, and
"qtpdwidgets" is the library of process-data-aware widgets for Qt (4
and 5).
An example ships with "qtpdwidgets".

Read the README which comes with the libraries.

By using "pdserv" as the communication interface to the realtime process 
you also are able to use tools like "Testmanager" or "dlsd".

There is no good example for "pdserv" right now. But triggered by your 
post I asked my colleague to add an example to the "pdserv" library to 
make the start a bit easier.

Look here for the latest versions and precompiled RPMs (sorry, no Debian
packages yet, we are working on it...):

https://build.opensuse.org/project/show/science:EtherLab

Don't reinvent the wheel... if you can avoid it.

Regards Wilhelm.

P.S. EoE has some issues. Try without it.

On 31.05.19 at 09:30, Viola Roberto wrote:
> Hi Simone,
> 
>    first of all, the GuC firmware isn't an issue, so go on :)
> 
> To understand the issue, I think you should try to split your 
> application into small pieces; that way we can narrow down where the 
> problem lies.
> 
> You could try to run a simple application (using the ethercat example 
> that you can find in the repository) that only reads or writes some 
> objects from your slaves, and then add pieces of code step by step 
> to identify when and where the issue happens. I don't know how 
> big your app is, but from my point of view it's the only way to 
> achieve some results.
> 
> Another note: in your first log I saw that the crash happened in EoE 
> context. Did you use it? Could you try to disable it and test again?
> 
> Have a nice weekend
> 
> Roberto
> 
> *From:* Simone Comari [mailto:simone.comari2 at unibo.it]
> *Sent:* Wednesday, 29 May 2019 15:50
> *To:* Viola Roberto <roberto.viola at systemceramics.com>; 
> etherlab-users at etherlab.org
> *Cc:* Edoardo Ida <edoardo.ida2 at unibo.it>
> *Subject:* R: System randomly freezes in multi-thread Qt application 
> with a RT process
> 
> Hi Roberto,
> 
> Thanks for following up, much appreciated.
> 
> We tried the same setup on a laptop (Dell Inspiron-5567, Intel® Core™ 
> i7-7500U CPU @ 2.70GHz × 4), dual booted Windows 10/Ubuntu 16.04.6  but 
> the behavior remains the same.
> 
> I noticed, nevertheless, that both workstations use the same driver for 
> the video card (i.e. i915). I also noticed that at the end of the RT 
> kernel build (I think during modules_install) there was a warning 
> about a couple of missing firmware files for this device:
> 
> possible missing firmware /lib/firmware/i915/kbl_guc_ver9_14.bin for module i915
> possible missing firmware /lib/firmware/i915/bxt_guc_ver8_7.bin for module i915
> 
> So I copied the missing files (taken from here 
> <https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915>) 
> into /lib/firmware/i915/ and updated the initramfs with the following command:
> 
> $ sudo update-initramfs -u
> 
> I then recompiled both the RT kernel and the etherlab libs, but apart 
> from the warnings disappearing, nothing changed.
> 
> The video card in this machine is the following:
> 
> $ lshw -C video
>   *-display
>        description: VGA compatible controller
>        product: HD Graphics 620
>        vendor: Intel Corporation
>        physical id: 2
>        bus info: pci@0000:00:02.0
>        version: 02
>        width: 64 bits
>        clock: 33MHz
>        capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
>        configuration: driver=i915 latency=0
>        resources: irq:280 memory:de000000-deffffff memory:b0000000-bfffffff ioport:f000(size=64) memory:c0000-dffff
> 
> I tried stressing both the CPU
> 
> $ stress --cpu `nproc` --vm `nproc` --vm-bytes 1GB --io `nproc` --hdd `nproc` --hdd-bytes 1GB --timeout 60s
> stress: info: [6624] dispatching hogs: 4 cpu, 4 io, 4 vm, 4 hdd
> stress: info: [6624] successful run completed in 60s
> 
> and the video card
> 
> $ glmark2
> =======================================================
>     glmark2 2014.03+git20150611.fa71af2d
> =======================================================
>     OpenGL Information
>     GL_VENDOR:     Intel Open Source Technology Center
>     GL_RENDERER:   Mesa DRI Intel(R) HD Graphics 620 (Kaby Lake GT2)
>     GL_VERSION:    3.0 Mesa 18.0.5
> =======================================================
> [build] use-vbo=false: FPS: 1392 FrameTime: 0.718 ms
> [build] use-vbo=true: FPS: 1494 FrameTime: 0.669 ms
> [texture] texture-filter=nearest: FPS: 1220 FrameTime: 0.820 ms
> [texture] texture-filter=linear: FPS: 1370 FrameTime: 0.730 ms
> [texture] texture-filter=mipmap: FPS: 1379 FrameTime: 0.725 ms
> [shading] shading=gouraud: FPS: 1352 FrameTime: 0.740 ms
> [shading] shading=blinn-phong-inf: FPS: 1356 FrameTime: 0.737 ms
> [shading] shading=phong: FPS: 1334 FrameTime: 0.750 ms
> [shading] shading=cel: FPS: 1365 FrameTime: 0.733 ms
> [bump] bump-render=high-poly: FPS: 1004 FrameTime: 0.996 ms
> [bump] bump-render=normals: FPS: 1474 FrameTime: 0.678 ms
> [bump] bump-render=height: FPS: 1496 FrameTime: 0.668 ms
> [effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 1141 FrameTime: 0.876 ms
> [effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 847 FrameTime: 1.181 ms
> [pulsar] light=false:quads=5:texture=false: FPS: 1543 FrameTime: 0.648 ms
> [desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 670 FrameTime: 1.493 ms
> [desktop] effect=shadow:windows=4: FPS: 891 FrameTime: 1.122 ms
> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 638 FrameTime: 1.567 ms
> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 493 FrameTime: 2.028 ms
> [buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 690 FrameTime: 1.449 ms
> [ideas] speed=duration: FPS: 1265 FrameTime: 0.791 ms
> [jellyfish] <default>: FPS: 1258 FrameTime: 0.795 ms
> [terrain] <default>: FPS: 189 FrameTime: 5.291 ms
> [shadow] <default>: FPS: 982 FrameTime: 1.018 ms
> [refract] <default>: FPS: 360 FrameTime: 2.778 ms
> [conditionals] fragment-steps=0:vertex-steps=0: FPS: 1339 FrameTime: 0.747 ms
> [conditionals] fragment-steps=5:vertex-steps=0: FPS: 1337 FrameTime: 0.748 ms
> [conditionals] fragment-steps=0:vertex-steps=5: FPS: 1329 FrameTime: 0.752 ms
> [function] fragment-complexity=low:fragment-steps=5: FPS: 1343 FrameTime: 0.745 ms
> [function] fragment-complexity=medium:fragment-steps=5: FPS: 1343 FrameTime: 0.745 ms
> [loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 1315 FrameTime: 0.760 ms
> [loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 1221 FrameTime: 0.819 ms
> [loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 1275 FrameTime: 0.784 ms
> =======================================================
>                                   glmark2 Score: 1142
> =======================================================
> 
> and no apparent issues arose.
> 
> I'm attaching two log sessions from my latest laptop trials. Inside you 
> can find a brief description of the operations carried out for each session.
> 
> I noticed that in one of them the freeze appears to be related to 
> rt_mutex, which I'm not very familiar with, but I have the feeling 
> it is the actual source of our problem (probably because of the way we 
> implemented the ethercat communication in our code).
> 
> Please let me know if you have any suggestions or need any other 
> information.
> 
> Thanks again,
> 
> Simone
> 
> Roberto Viola
> Technical Dept
> +39 0536836680
> 
> *SYSTEM CERAMICS S.p.A.*
> Via Ghiarola Vecchia, 73
> 41042 Fiorano (Mo) ITALY
> +39 0536 836111
> info at system-electronics.it
> www.system-electronics.it
> 
> ------------------------------------------------------------------------
> 
> The information contained in this e-mail, including attachments, is 
> confidential and exclusively for the use of the intended recipient. If 
> you received this communication by mistake you are not authorized to 
> copy, send and/or publish this message and its attachments, in whole or 
> in part and therefore please delete this message.
> 
> Notice: this e-mail account and the messages sent from it are for 
> business use only, never personal use.
> 
> Replies to this message: recipients are advised that any reply may be 
> read by the sender's entire company, office, or department.
> 
> ____________________________________________________
> 
> *SIMONE COMARI*
> 
> Research Fellow
> DIN – Dept. of Industrial Engineering
> Alma Mater Studiorum – University of Bologna
> Via Umberto Terracini, 24, 40131 Bologna (BO), Italy
> 
> E-mail: simone.comari2 at unibo.it
> Websites:
> https://www.unibo.it/sitoweb/simone.comari2
> http://grab.diem.unibo.it
> 
> ------------------------------------------------------------------------
> 
> *From:* Viola Roberto <roberto.viola at systemceramics.com>
> *Sent:* Monday, 27 May 2019 10:57
> *To:* Simone Comari; etherlab-users at etherlab.org
> *Subject:* R: System randomly freezes in multi-thread Qt application 
> with a RT process
> 
> Hi Simone,
> 
>    from the logs it seems to be an issue related to your i915 (the 
> video card, I guess integrated in your CPU).
> 
> You should try to understand the cause of the issue: I suggest trying 
> without ethercat and stressing the CPU and the video card in some 
> other way. I guess it's not related to ethercat.
> 
> BTW, what video card do you have?
> 
> Did you try to capture the logs on other systems (a laptop, for example)?
> 
> R.
> 
> *From:* Simone Comari [mailto:simone.comari2 at unibo.it]
> *Sent:* Thursday, 23 May 2019 12:51
> *To:* Viola Roberto <roberto.viola at systemceramics.com>; 
> etherlab-users at etherlab.org
> *Subject:* R: System randomly freezes in multi-thread Qt application 
> with a RT process
> 
> Hi Roberto,
> 
> First of all, thank you for your quick response.
> 
> Attached you can find the /kernel.log/ and /system.log/ of a single 
> session, that is:
> 
>  1. Boot up
>  2. Application launch (successful ethercat network setup) through Qt IDE
>  3. Successful enabling of a single motor (i.e. one of the ethercat
>     slaves) through our GUI
>  4. Simple operation (e.g. manual velocity control) until problem occurs
>     (it took a couple of minutes this time)
>  5. Hard shut-down of the "frozen" system
> 
> I hope these are the logs you were talking about, please let me know 
> otherwise.
> 
> Maybe it's worth mentioning we followed these 
> <https://github.com/UNIBO-GRABLab/cable_robot/wiki/Installation> 
> instructions to install both the RT kernel and ethercat libs, just in 
> case we misused patches or configurations.
> 
> Thanks again.
> 
> Best regards,
> 
> Simone
> 
> ------------------------------------------------------------------------
> 
> *From:* Viola Roberto <roberto.viola at systemceramics.com>
> *Sent:* Thursday, 23 May 2019 08:00
> *To:* Simone Comari; etherlab-users at etherlab.org
> *Subject:* R: System randomly freezes in multi-thread Qt application 
> with a RT process
> 
> Hi Simone, just a quick hint to help understand the freeze: try to 
> run the setup inside a VM (kvm or virtualbox) in order to capture the 
> serial log from the kernel or, if you have a UART available on your 
> system, capture it directly from there.
> 
> This way we can try to understand the issue better.
> 
> R.
> 
> *From:* etherlab-users [mailto:etherlab-users-bounces at etherlab.org] 
> *On behalf of* Simone Comari
> *Sent:* Wednesday, 22 May 2019 18:52
> *To:* etherlab-users at etherlab.org
> *Subject:* [etherlab-users] System randomly freezes in multi-thread Qt 
> application with a RT process
> 
> Hi all,
> 
> I am a young research fellow at the University of Bologna and I have 
> only just started working with EtherCAT technology and RT systems, so 
> please forgive me if I misuse words or I'm not precise enough.
> 
> 
> First, I'll try to describe my setup:
> 
>   * Ubuntu 16.04.6 with patched fully preemptible RT kernel 4.13.13-rt5
>   * Qt 5.12.2
>   * PCI driver e1000e
>   * Ethercat master running on this Linux RT
>   * Elmo GOLD SOLO WHISTLE Drives (ethercat slaves) 
> 
> Secondly, a brief outline of my software architecture:
> 
>   * POSIX threads
>   * Qt-based GUI running on a non-RT thread
>   * Ethercat network setup (ethercat master and slaves init) done in the
>     same non-RT thread
>   * If initialization is successful, start a new RT-thread in charge of
>     handling all ethercat-related functionalities (read/write/status-check).
>   * Shared resources between RT and non-RT ones handled with
>     pthread_mutex (even if I'm not 100% sure I'm using it correctly)
>   * Implementation of our generic ethercat master can be found here
>     <https://github.com/UNIBO-GRABLab/grab_common/blob/e5278b6fe611654bfa84c951d8b77e56ebbc8fa9/libgrabec/src/ethercatmaster.cpp>
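
On the pthread_mutex point above: when an RT thread and a non-RT thread
share a mutex, a common precaution is to create it with the
priority-inheritance protocol (PTHREAD_PRIO_INHERIT), so that a
low-priority owner is boosted while a higher-priority thread waits for
it. The default protocol is PTHREAD_PRIO_NONE. What follows is only a
sketch of that technique, not a diagnosis of the freeze and not taken
from the linked code; "SharedState", "init_pi_mutex", "set_target" and
"try_get_target" are hypothetical names.

```cpp
#include <pthread.h>

// Initialise a mutex with the priority-inheritance protocol.
// Returns 0 on success, otherwise the pthread error code.
int init_pi_mutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0) {
        rc = pthread_mutex_init(m, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Hypothetical data shared between the GUI thread and the RT thread.
struct SharedState {
    pthread_mutex_t lock;
    double target_velocity;
};

// Non-RT side: blocking lock, kept to a very short critical section.
void set_target(SharedState* s, double v) {
    pthread_mutex_lock(&s->lock);
    s->target_velocity = v;
    pthread_mutex_unlock(&s->lock);
}

// RT side: never wait for the GUI. If the lock is busy, report failure
// so the caller can keep using the value from the previous cycle.
bool try_get_target(SharedState* s, double* out) {
    if (pthread_mutex_trylock(&s->lock) != 0) {
        return false;
    }
    *out = s->target_velocity;
    pthread_mutex_unlock(&s->lock);
    return true;
}
```

The try-lock on the RT side is a design choice of this sketch: the
cyclic task skips an update rather than blocking inside its cycle.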
> 
> 
> Problem description:
> 
>   * Once the ethercat network is set up and the RT thread is started,
>     the system freezes at random, without any error message.
>     Sometimes it happens when motors are enabled and operational,
>     sometimes when they are enabled and idle, sometimes even if they are
>     disabled. It is not reproducible and I couldn't link it to any
>     particular step in my application. Sometimes it happens even if I
>     simply start it, but always after successful initialization.
>   * Even when I manage to close the application, the next time I try to
>     run it, it tells me that the master is busy and ec_e1000e is in use.
>     The only solution is a manual hard shutdown of the PC.
>   * Another thing I noticed is that even if the main thread (the GUI one,
>     so non-RT) is closed, the child RT thread stays running with
>     status D (uninterruptible sleep), blocking a great deal of CPU (which
>     is probably why the whole system freezes).
>   * We tried different computers (both laptop and desktop) to
>     rule out a platform dependency, but the issue remains.
> 
> Please let me know if there is any missing important information that 
> can help understanding the problem.
> 
> Thank you a lot for the support.
> 
> Best regards,
> 
> Simone
> 