<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Segoe UI Emoji";
panose-1:2 11 5 2 4 2 4 2 2 3;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle18
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle20
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle21
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle22
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle23
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle24
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle25
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle26
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle30
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-NZ" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">"/features/parallel-slave/0008-fsm_sii-loading-check.patch" is still required.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This patch fixes a problem where the SII data may still be loading from EEPROM while the slave fsm starts to read it, resulting in bad SII data. This situation can occur when rescanning the slaves after a hotplug and one of the newly connected
slaves is a little slow reading from the EEPROM (in my case EL1008 modules in particular).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It's probably worth making it a base patch as well. I don't think it relies on anything in the parallel-slave patches. It's just more of an issue once the parallel-slave patches are applied, because more modules are initialised sooner, so the problem is more
likely to happen.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Cheers,<o:p></o:p></p>
<p class="MsoNormal">Graeme.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> Gavin Lambert <gavin.lambert@tomra.com>
<br>
<b>Sent:</b> Tuesday, 11 June 2019 12:41 PM<br>
<b>To:</b> Graeme Foot <Graeme.Foot@touchcut.com>; etherlab-dev@etherlab.org<br>
<b>Subject:</b> RE: Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Ah, I see. I think the original version relied on simply not sending those datagrams later on. Usually there’s only one or two cycles where a slave FSM is “blocked” like that, so it didn’t eat up too many datagrams and didn’t “wrap around”
to cause issues with other slaves due to the limit on parallel FSMs.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">But yes, you’re correct that it’s better to not consume the datagram from the ring in the first place, especially if there’s going to be prolonged inactivity.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I’ll probably end up putting this in the base patches, perhaps even folding it into one of the existing ones. It’s probably getting about time to make another patchset release soon anyway.
<span style="font-family:"Segoe UI Emoji",sans-serif">😊</span><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Does this supersede your “fsm_sii-loading-check.patch” or do you think that’s still useful?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p><strong><span style="font-family:"Calibri",sans-serif;color:#595959">Gavin Lambert</span></strong><b><span style="color:#595959"><br>
</span></b><span style="color:#595959">Senior Software Developer<o:p></o:p></span></p>
<table class="MsoNormalTable" border="0" cellpadding="0">
<tbody>
<tr>
<td style="padding:.75pt .75pt .75pt .75pt"></td>
</tr>
</tbody>
</table>
<p><span style="font-size:10.0pt;font-family:"Arial",sans-serif"><img width="360" height="102" style="width:3.75in;height:1.0625in" id="_x0000_i1025" src="cid:image001.png@01D52054.31E09080"><br>
<a href="http://www.compacsort.com"><span style="text-decoration:none"><img border="0" width="33" height="37" style="width:.3437in;height:.3854in" id="_x0000_i1026" src="cid:image002.png@01D52054.31E09080" alt="TOMRA"></span></a><a href="https://www.facebook.com/Compacsort"><span style="text-decoration:none"><img border="0" width="35" height="37" style="width:.3645in;height:.3854in" id="_x0000_i1027" src="cid:image003.png@01D52054.31E09080" alt="Facebook"></span></a><a href="https://www.linkedin.com/company/compac-sorting-equipment/"><span style="text-decoration:none"><img border="0" width="35" height="37" style="width:.3645in;height:.3854in" id="_x0000_i1028" src="cid:image004.png@01D52054.31E09080" alt="Linkedin"></span></a><a href="https://vimeo.com/compacsort"><span style="text-decoration:none"><img border="0" width="37" height="37" style="width:.3854in;height:.3854in" id="_x0000_i1029" src="cid:image005.png@01D52054.31E09080" alt="Youtube"></span></a><a href="https://twitter.com/compacsort"><span style="text-decoration:none"><img border="0" width="33" height="37" style="width:.3437in;height:.3854in" id="_x0000_i1030" src="cid:image006.png@01D52054.31E09080" alt="twitter"></span></a><o:p></o:p></span></p>
<p><b><span style="font-size:8.5pt;color:#595959">COMPAC SORTING EQUIPMENT LTD</span></b><span style="font-size:8.5pt;color:#595959"> | 4 Henderson Pl | Onehunga | Auckland 1061 | New Zealand<br>
Switchboard: +64 96 34 00 88 | <a href="http://www.tomra.com">tomra.com</a> <o:p>
</o:p></span></p>
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0" style="border-collapse:collapse">
<tbody>
<tr>
<td valign="top" style="border-top:solid #595959 1.0pt;border-left:none;border-bottom:solid #595959 1.0pt;border-right:none;padding:0cm 0cm 0cm 0cm">
<p><span style="font-size:8.5pt;color:#595959">The information contained in this communication and any attachment is confidential and may be legally privileged. It should only be read by the person(s) to whom it is addressed. If you have received this communication
in error, please notify the sender and delete the communication. <o:p></o:p></span></p>
</td>
</tr>
</tbody>
</table>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> Graeme Foot <<a href="mailto:Graeme.Foot@touchcut.com">Graeme.Foot@touchcut.com</a>>
<br>
<b>Sent:</b> Tuesday, 11 June 2019 12:22<br>
<b>To:</b> Gavin Lambert <<a href="mailto:gavin.lambert@tomra.com">gavin.lambert@tomra.com</a>>;
<a href="mailto:etherlab-dev@etherlab.org">etherlab-dev@etherlab.org</a><br>
<b>Subject:</b> RE: Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Yes, I saw "base/0019..." adding EC_DATAGRAM_INVALID. I didn't explicitly see "base/0026..." adding the code to prevent abandoning the mailbox FSMs, but I was working with its merged code. "base/0026..." added another case where the external
datagram queue has the issue, because the FSM is not abandoned but the datagram is not used either.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">My patch will plug that hole now by not incrementing the queue index if the datagram is not used (if flagged with EC_DATAGRAM_INVALID).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Regards,<o:p></o:p></p>
<p class="MsoNormal">Graeme.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> Gavin Lambert <<a href="mailto:gavin.lambert@tomra.com">gavin.lambert@tomra.com</a>>
<br>
<b>Sent:</b> Tuesday, 11 June 2019 12:02 PM<br>
<b>To:</b> Graeme Foot <<a href="mailto:Graeme.Foot@touchcut.com">Graeme.Foot@touchcut.com</a>>;
<a href="mailto:etherlab-dev@etherlab.org">etherlab-dev@etherlab.org</a><br>
<b>Subject:</b> RE: Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Did you have a look at base/0026-Prevent-abandoning-the-mailbox-state-machines-early-.patch? Because that does something similar.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">(It was base/0019-Support-for-multiple-mailbox-protocols.patch which added the handling of the INVALID datagram state for the mailbox state machines. The one above was a bugfix for this patch, essentially.)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p><strong><span style="font-family:"Calibri",sans-serif;color:#595959">Gavin Lambert</span></strong><b><span style="color:#595959"><br>
</span></b><span style="color:#595959">Senior Software Developer<o:p></o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> Graeme Foot <<a href="mailto:Graeme.Foot@touchcut.com">Graeme.Foot@touchcut.com</a>>
<br>
<b>Sent:</b> Tuesday, 11 June 2019 11:52<br>
<b>To:</b> <a href="mailto:etherlab-dev@etherlab.org">etherlab-dev@etherlab.org</a><br>
<b>Cc:</b> Gavin Lambert <<a href="mailto:gavin.lambert@tomra.com">gavin.lambert@tomra.com</a>><br>
<b>Subject:</b> RE: Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Unfortunately "0008-fsm_sii-loading-check.patch" (below) didn't fix my main problem. It turns out it is an inherent problem with how the master's external datagram ring works. I have attached a patch that plugs the hole causing the problem
I was having, but there may be other cases where issues could occur.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Patch: /features/parallel-slave/0009-ec_master_exec_slave_fsms-external-datagram-fix.patch<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The guts of the problem:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b>ec_master_exec_slave_fsms()</b> calls <b>ec_master_get_external_datagram()</b> to get a datagram from the external datagram ring. The datagram is then passed to
<b>ec_fsm_slave_exec() </b>for each slave that has some work to do. This call then returns either 1 if the FSM is still in progress or 0 if the FSM is complete. The master assumes that if the FSM is still in progress then the datagram has been consumed and is in use,
but there are various cases where this is not true. If any of these cases occur then in the first loop of
<b>ec_master_exec_slave_fsms()</b> these slaves' FSMs may be executed multiple times while another slave's FSM is waiting on its datagram to return.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">If too many slaves are processed, or too many cycles elapse, during this time then the waiting slave's datagram either gets its state set to
<b>EC_DATAGRAM_INVALID</b> or gets reused by another slave. This can lead to "cancelled" datagram replies or to both slaves getting the results from the second slave's datagram (as the first datagram's index will be replaced and its reply is lost).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">In my case this was occurring due to using the "0001-load-sii-from-file.patch" patch. During the SII config stage of a slave this patch will create a kthread to attempt to read the SII file from disk. In the meantime the
<b>ec_fsm_slave_exec() </b>command will continue returning a value of 1 (fsm in progress) but will not be using the presented datagrams (setting the datagram state to
<b>EC_DATAGRAM_INVALID</b>).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">During initial startup and configuration of the master the <b>
ec_master_exec_slave_fsms()</b> call is made from <b>ec_master_idle_thread()</b> in a loop with (in my configuration) a call to schedule() before resuming the loop. This means that multiple loops may occur before the reply to a slave's datagram returns, leaving
plenty of time for the in-use datagrams to be recycled, resulting in their state or data being overwritten.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The patch I have attached now also tests the datagram's state for
<b>EC_DATAGRAM_INVALID</b> before incrementing the external datagram ring index. This solves my problem where the datagram's state is being set to EC_DATAGRAM_INVALID while waiting for the kthread to complete.<o:p></o:p></p>
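<p class="MsoNormal">As a rough illustration of that check, here is a minimal user-space sketch in C. It is a simplified model, not the code from the attached patch: every name in it (ring_t, datagram_t, fsm_exec, offer_datagram, the DG_* states and the ring size) is a hypothetical stand-in for the master's real structures. The only behaviour taken from the description above is that the ring index is left alone when the offered datagram comes back flagged as invalid.<o:p></o:p></p>
<pre>
```c
/* Simplified model of an external datagram ring.
 * All names are hypothetical; this is not the IgH master's API. */

enum { RING_SIZE = 4 };

typedef enum { DG_INIT, DG_QUEUED, DG_INVALID } dg_state_t;

typedef struct {
    dg_state_t state;
    int owner;                  /* slave that queued this datagram */
} datagram_t;

typedef struct {
    datagram_t ring[RING_SIZE];
    int next;                   /* index of the next slot to hand out */
} ring_t;

/* One step of a slave FSM. If the slave is ready it queues the datagram,
 * otherwise it flags the datagram as invalid (for example while it waits
 * on a kthread). Either way it reports 1, meaning "still in progress". */
static int fsm_exec(int slave, int ready, datagram_t *dg)
{
    if (ready) {
        dg->state = DG_QUEUED;
        dg->owner = slave;
    } else {
        dg->state = DG_INVALID;
    }
    return 1;
}

/* Offer the next ring slot to a slave. The ring index only advances when
 * the FSM is still in progress AND it actually used the datagram.
 * Returns 1 if the slot was consumed, 0 if it remains available. */
static int offer_datagram(ring_t *r, int slave, int ready)
{
    datagram_t *dg = r->ring + r->next;
    int in_progress = fsm_exec(slave, ready, dg);

    if (in_progress == 0)
        return 0;
    if (dg->state == DG_INVALID)
        return 0;               /* not used: keep the slot for the next slave */

    r->next = (r->next + 1) % RING_SIZE;
    return 1;
}
```
</pre>
<p class="MsoNormal">In this model a blocked slave can be polled any number of times without moving the index, so it can never wrap the ring around onto a datagram that another slave is still waiting on.<o:p></o:p></p>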
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I suspect there may be other instances where this problem could occur. One case I have thought of, but haven't been able to confirm, is when multiple protocols try to access a slave's mailbox at the same time (e.g. CoE, EoE, FoE etc).
Only one protocol is allowed to communicate at a time. The other protocols will be offered a datagram from the ring, but they aren't able to use it until their turn comes up. In these cases, if
<b>ec_read_mbox_locked()</b> fails, the datagram state is also set to <b>EC_DATAGRAM_INVALID</b>, so the patch should also cover this case.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Regards,<o:p></o:p></p>
<p class="MsoNormal">Graeme.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> etherlab-dev <<a href="mailto:etherlab-dev-bounces@etherlab.org">etherlab-dev-bounces@etherlab.org</a>>
<b>On Behalf Of </b>Graeme Foot<br>
<b>Sent:</b> Monday, 4 March 2019 2:36 PM<br>
<b>To:</b> <a href="mailto:etherlab-dev@etherlab.org">etherlab-dev@etherlab.org</a><br>
<b>Subject:</b> Re: [etherlab-dev] Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I think I've finally solved the problem. The slaves with the issue are returning with the "EEPROM not loaded" bit set when reading the SII information (bit 12 of the EEPROM status word). If this bit is set then the slave has not yet finished
reading the SII information from the EEPROM and the data returned may not be valid. The master code was not checking for this bit. I have attached a patch to do so:<o:p></o:p></p>
<p class="MsoNormal">/features/parallel-slave/0008-fsm_sii-loading-check.patch<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The patch checks if the bit is set and keeps re-reading the EEPROM data until it is not. At this point the data returned is still incorrect, so a complete read is requested (where a write is first sent asking the slave to load the data
that needs to be read). There is a 500 ms timeout waiting for the bit to clear. If the bit does not clear then the EEPROM load may have failed (e.g. incorrect CRC value).<o:p></o:p></p>
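<p class="MsoNormal">The decision described above can be sketched roughly as follows. This is an illustrative model only, not the patch itself: the function, type and constant names are hypothetical, and only the bit position (12) and the 500 ms limit come from the description. The "saw_loading" flag is an assumption, used here to tell a slave that was never busy apart from one that has just finished loading.<o:p></o:p></p>
<pre>
```c
/* Hypothetical names; only bit 12 and the 500 ms limit come from the
 * description of the patch. */
enum { SII_NOT_LOADED_BIT = 12, SII_LOAD_TIMEOUT_MS = 500 };

typedef enum {
    SII_DATA_OK,        /* bit clear and never seen set: use the data   */
    SII_WAIT,           /* bit set: poll the status word again          */
    SII_DO_FULL_READ,   /* bit cleared after being set: re-request data */
    SII_LOAD_FAILED     /* bit still set after the timeout              */
} sii_action_t;

/* Decide what to do after reading the EEPROM status word.
 * status_word: the 16-bit status value as read from the slave
 * elapsed_ms:  time since the "not loaded" bit was first seen
 * saw_loading: nonzero if the bit was set on an earlier poll */
static sii_action_t sii_check_loaded(unsigned status_word,
                                     unsigned elapsed_ms,
                                     int saw_loading)
{
    /* Shift the bit of interest down to position 0 and isolate it. */
    unsigned not_loaded = (status_word >> SII_NOT_LOADED_BIT) % 2u;

    if (not_loaded == 0u)
        return saw_loading ? SII_DO_FULL_READ : SII_DATA_OK;
    if (elapsed_ms >= (unsigned)SII_LOAD_TIMEOUT_MS)
        return SII_LOAD_FAILED;
    return SII_WAIT;
}
```
</pre>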
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">My previous patch (features/sii-read-failure/0001-sii-read-retry.patch) should no longer be required, but it may help to make reading of the SII data more robust. I've attached the latest version of this one also. It is now:<o:p></o:p></p>
<p class="MsoNormal">features/sii-read-failure/0001-slave-scan-retry.patch<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Regards,<o:p></o:p></p>
<p class="MsoNormal">Graeme Foot.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> etherlab-dev <<a href="mailto:etherlab-dev-bounces@etherlab.org">etherlab-dev-bounces@etherlab.org</a>>
<b>On Behalf Of </b>Graeme Foot<br>
<b>Sent:</b> Friday, 12 October 2018 11:38 AM<br>
<b>To:</b> Gavin Lambert <<a href="mailto:gavin.lambert@tomra.com">gavin.lambert@tomra.com</a>>;
<a href="mailto:etherlab-dev@etherlab.org">etherlab-dev@etherlab.org</a><br>
<b>Subject:</b> Re: [etherlab-dev] Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I've had a chance to play with my testrig and have managed to consistently reproduce the problem when hot-plugging a module (I haven't had the problem again on a production machine from a normal startup that I can test on yet).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">My system:<o:p></o:p></p>
<p class="MsoNormal">- CX2020<o:p></o:p></p>
<p class="MsoNormal">- EK1110 (alias 10001)<o:p></o:p></p>
<p class="MsoNormal">- EK1100 (alias 20000)<o:p></o:p></p>
<p class="MsoNormal">- EL1008 (alias 1)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I start the system without the EL1008 plugged in and get it running. I then plug in the EL1008 and the SII information fails to read, resulting in a zero alias, vendorID and product code etc. I have attached a patch which resolves the
issue on my testrig (but I don't know if it will resolve my production issue).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">With --enable-sii-override set, the patch detects a zero vendorID or product code (hopefully no one has a device with a zero product code) and then retries scanning the slave after a 100ms timeout. If --enable-sii-override is not set then
it will do the retry if reading the SII size fails.<o:p></o:p></p>
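<p class="MsoNormal">A minimal sketch of that retry decision, assuming hypothetical names throughout (sii_identity_t, scan_retry_delay_ms and its parameters are not taken from the patch). Only the two conditions and the 100 ms delay come from the description above.<o:p></o:p></p>
<pre>
```c
/* Hypothetical names; only the retry conditions and the 100 ms delay
 * come from the description of the patch. */
enum { SCAN_RETRY_DELAY_MS = 100 };

typedef struct {
    unsigned vendor_id;
    unsigned product_code;
} sii_identity_t;

/* Returns the delay in milliseconds before the slave scan should be
 * retried, or -1 if no retry is needed.
 * sii_override_enabled: nonzero when built with --enable-sii-override
 * sii_size_read_ok:     nonzero when reading the SII size succeeded */
static int scan_retry_delay_ms(int sii_override_enabled,
                               int sii_size_read_ok,
                               const sii_identity_t *id)
{
    if (sii_override_enabled) {
        /* With the override, a failed read shows up as a zero identity. */
        if (id->vendor_id == 0u)
            return SCAN_RETRY_DELAY_MS;
        if (id->product_code == 0u)
            return SCAN_RETRY_DELAY_MS;
        return -1;
    }
    /* Without the override, a failed SII size read is the trigger. */
    return sii_size_read_ok ? -1 : SCAN_RETRY_DELAY_MS;
}
```
</pre>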
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">One strange thing I found during testing was that:<o:p></o:p></p>
<p class="MsoNormal">- When --enable-sii-override is not set ec_fsm_sii_success() would fail during the ec_fsm_slave_scan_state_sii_size() state; but<o:p></o:p></p>
<p class="MsoNormal">- When --enable-sii-override is set ec_fsm_sii_success() does not fail during the ec_fsm_slave_scan_state_sii_device() state, so I instead check for a zero vendorID or product code.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Gavin, the patch is against your previous patchset. I put the patch under:<o:p></o:p></p>
<p class="MsoNormal">features/sii-read-failure/0001-sii-read-retry.patch<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Let me know if you think there's anything dodgy with it.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Graeme.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> Gavin Lambert <</span><a href="mailto:gavin.lambert@tomra.com"><span lang="EN-US" style="mso-fareast-language:EN-NZ">gavin.lambert@tomra.com</span></a><span lang="EN-US" style="mso-fareast-language:EN-NZ">>
<br>
<b>Sent:</b> Wednesday, 8 August 2018 3:13 PM<br>
<b>To:</b> Graeme Foot <</span><a href="mailto:Graeme.Foot@touchcut.com"><span lang="EN-US" style="mso-fareast-language:EN-NZ">Graeme.Foot@touchcut.com</span></a><span lang="EN-US" style="mso-fareast-language:EN-NZ">>;
</span><a href="mailto:etherlab-users@etherlab.org"><span lang="EN-US" style="mso-fareast-language:EN-NZ">etherlab-users@etherlab.org</span></a><span lang="EN-US" style="mso-fareast-language:EN-NZ"><br>
<b>Subject:</b> RE: Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">There’s lots of things that can cause that. Most often, I’ve seen this when packets get lost or corrupted, so the initial discovery datagrams get lost or fail. Usually bad wiring or shielding is the culprit.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I <i>think</i> it might be possible to get something similar due to an unfortunate timing coincidence – if the devices are being connected “live” then a dodgy plug-in could make the device visible in the initial device count scan, but then
disconnected before it finishes the identity discovery, but then reconnected again before it does the next device count scan (so it doesn’t try again). Replugging the devices (with less unfortunate timing) or restarting the etherlab service should both recover
from that case, however.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Or, of course, you might have found a bug. <span style="font-family:"Segoe UI Emoji",sans-serif">
😊</span> <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It's hard to say for sure what actually happened without seeing syslogs and/or reproducing it.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-NZ">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-NZ"> Graeme Foot<br>
<b>Sent:</b> Wednesday, 8 August 2018 14:12<br>
<b>To:</b> </span><a href="mailto:etherlab-users@etherlab.org"><span lang="EN-US" style="mso-fareast-language:EN-NZ">etherlab-users@etherlab.org</span></a><span lang="EN-US" style="mso-fareast-language:EN-NZ"><br>
<b>Subject:</b> [etherlab-users] Missing Vendor ID / Product Code<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I updated my EtherCAT system to use Gavin's patch set (revision 10, 20171108). It has been running fine on a few machines, but I have just had a machine being commissioned where one of the slave modules had a zero Vendor ID and Product Code
(and I suspect it failed to read any information from the slave). Unfortunately it occurred while I was not available, so our engineers reverted to the previous version (which detected the module correctly) and shipped the machine, so I have very minimal information
and no logs.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The module with the problem was the 17<sup>th</sup> module, the first EL2612 of 5. It is directly after an EL9410 power module. It has an explicit alias set. The engineers had tried repowering the whole system and replacing the module.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Until I get a machine to test on with the same behaviour I was wondering if anyone else has had problems with slaves not initialising correctly.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks,<o:p></o:p></p>
<p class="MsoNormal">Graeme Foot.<o:p></o:p></p>
</div>
</div>
</body>
</html>