[etherlab-users] r8169 patch - packet timeout boot failures

Raz raziebe at gmail.com
Tue Dec 3 12:27:03 CET 2013


All I am doing is trial and error; I do not know the Realtek
driver at all.
The spinlocks are needed because the original driver code flow holds the
lock around this call. I had a boot lockup in one of my trials without
them. This patch does not eliminate the problem entirely, but across 10
trials with 6 drives the failure rate dropped from 100% to 1 in 10, so I
believe it is important enough to mail to the community. As for e1000e, I
do not know what the problem is; I need to check it and email you.
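
On the earlier autoneg question: for an interface that is still owned by a
stock driver (not the one claimed by the master), the negotiated speed and
duplex can also be read from sysfs without ethtool. A rough user-space
sketch, assuming the standard /sys/class/net/<iface>/speed and duplex
attributes; the function names below are made up for illustration and none
of this has been run against the drive:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one sysfs attribute of a network interface into buf.
 * Returns 0 on success, -1 if the file cannot be opened or read
 * (for example, the interface does not exist or is down). */
int read_net_attr(const char *iface, const char *attr,
                  char *buf, size_t len)
{
    char path[256];
    FILE *f;

    snprintf(path, sizeof(path), "/sys/class/net/%s/%s", iface, attr);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return 0;
}

/* Return 1 if the attribute strings describe 100 Mbit/s full duplex,
 * 0 otherwise. Trailing newlines from sysfs are tolerated. */
int is_100m_full(const char *speed, const char *duplex)
{
    return atoi(speed) == 100 && strncmp(duplex, "full", 4) == 0;
}

/* Hypothetical helper: 1 = 100M full duplex, 0 = anything else,
 * -1 = attributes could not be read. */
int check_link(const char *iface)
{
    char speed[32], duplex[32];

    if (read_net_attr(iface, "speed", speed, sizeof(speed)) ||
        read_net_attr(iface, "duplex", duplex, sizeof(duplex)))
        return -1;
    return is_100m_full(speed, duplex);
}
```

Calling check_link() with the name of the spare interface plugged into the
drive would show whether the link fell back to half duplex. It cannot be
used on the EtherCAT interface itself once the master has claimed it.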



On Tue, Dec 3, 2013 at 1:16 PM, Jeroen Van den Keybus <
jeroen.vandenkeybus at gmail.com> wrote:

> Why the spinlock ? This driver instance shouldn't ever be reentering.
>
> I'm a bit worried that it would complicate the use of e.g. RTAI and
> Xenomai.
>
> How come the e1000 has the same issue ?
>
> J.
>
>
>
> 2013/12/3 Raz <raziebe at gmail.com>
>
>> The patch below seemed to eliminate the problem. I believe the problem
>> relates to resetting some registers when link up is detected.
>>
>> diff --git a/local_src/r8169-3.2/r8169.c b/local_src/r8169-3.2/r8169.c
>> index 6df1793..a483fb5 100644
>> --- a/local_src/r8169-3.2/r8169.c
>> +++ b/local_src/r8169-3.2/r8169.c
>> @@ -1290,6 +1290,9 @@ static void __rtl8169_check_link_status(struct net_device *dev,
>>
>>         if (tp->ecdev) {
>>                 ecdev_set_link(tp->ecdev, tp->link_ok(ioaddr) ? 1 : 0);
>> +               spin_lock_irqsave(&tp->lock, flags);
>> +               rtl_link_chg_patch(tp);
>> +               spin_unlock_irqrestore(&tp->lock, flags);
>>                 return;
>>         }
>>
>>
>>
>> On Tue, Dec 3, 2013 at 11:56 AM, Jeroen Van den Keybus <
>> jeroen.vandenkeybus at gmail.com> wrote:
>>
>>> Perhaps try hooking up a normal eth interface to the drive and see what
>>> the autoneg comes up with using ethtool. In the past, I have had trouble
>>> interfacing an FPGA IP core to a PC Ethernet card when the core was hard
>>> wired to 100M FD instead of advertising this using autoneg. The PC card
>>> tried to autoneg and then fell back to 100M HD.
>>>
>>> You could try testing with an EK1100 in between the PC and the drive.
>>>
>>> J.
>>>
>>>
>>> 2013/12/3 Raz <raziebe at gmail.com>
>>>
>>>> I do not have ethtool on the EtherCAT device, since it has been removed.
>>>> How can I tell? eth0 is 100 Mbps, but it is my public interface. eth1 is
>>>> my EtherCAT interface.
>>>>
>>>> There is always a link. The first slave is a drive, not an I/O device.
>>>> This drive runs on a Xilinx with a port stack and Beckhoff's IP core.
>>>> I am now trying to debug the Realtek driver, let's see...
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Dec 3, 2013 at 11:36 AM, Jeroen Van den Keybus <
>>>> jeroen.vandenkeybus at gmail.com> wrote:
>>>>
>>>>> It would be very useful to know whether e.g. the interfaces ended up
>>>>> in 100M half duplex or so. Is there a link in those cases ? What's the
>>>>> first EtherCAT station ? Maybe it doesn't handle autoneg properly during
>>>>> its reset phase ?
>>>>>
>>>>> J.
>>>>>
>>>>>
>>>>>
>>>>> 2013/12/3 Raz <raziebe at gmail.com>
>>>>>
>>>>>> Hey,
>>>>>> The problem happens with the Intel e1000e as well as with Realtek. One
>>>>>> way to bypass it is to boot the master while the Ethernet/EtherCAT cable
>>>>>> is disconnected, and to connect the cable once the master has claimed
>>>>>> the interface. This appears to work.
>>>>>> So there is some sort of initialisation error.
>>>>>>
>>>>>>
>>>>>> On Mon, Dec 2, 2013 at 11:32 AM, Raz <raziebe at gmail.com> wrote:
>>>>>>
>>>>>>> I still do not have a reproducible scenario; it happens "sometimes". I
>>>>>>> did not know about -DRTL8169_DEBUG, so I will check and see. Thanks.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Dec 2, 2013 at 11:27 AM, Jeroen Van den Keybus <
>>>>>>> jeroen.vandenkeybus at gmail.com> wrote:
>>>>>>>
>>>>>>>> Is there a difference between cold and warm boot ? Does unloading
>>>>>>>> the ec driver, loading/unloading the stock r8169 driver and then reloading
>>>>>>>> the ec driver work better ? Same scenario but with Realtek drivers (r8168)
>>>>>>>> ? Also perhaps compile with -DRTL8169_DEBUG ?
>>>>>>>>
>>>>>>>> Just some thoughts.
>>>>>>>>
>>>>>>>> J.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2013/12/2 Raz <raziebe at gmail.com>
>>>>>>>>
>>>>>>>>> The timeouts happen after the system boots, not while the slaves
>>>>>>>>> are in OP mode. So my transmit path is irrelevant here, although a
>>>>>>>>> transmit happens only from a single thread or through an ioctl (SDO
>>>>>>>>> reads and so on).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Dec 2, 2013 at 11:01 AM, Jeroen Van den Keybus <
>>>>>>>>> jeroen.vandenkeybus at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 1. Why do you disable the rtl8169_phy_timer timer ?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The rtl8169_phy_timer is regularly polled in ec_poll instead.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 2. In rtl_hw_start_8168: why do you disable RTL_W16(IntrMask,
>>>>>>>>>>> tp->intr_event); ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> The drivers are all non-blocking and interrupt-free. All work
>>>>>>>>>> that interrupt handlers normally do is done in ec_poll instead.
>>>>>>>>>>
>>>>>>>>>> If you cannot send packets anymore, I suspect that you may have
>>>>>>>>>> overrun the tx queue, i.e. sent a packet before the previous one has been
>>>>>>>>>> completed. You're also not calling the ethercat transmission functions from
>>>>>>>>>> different threads, right ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> thank you
>>>>>>>>>>> raz
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> https://sites.google.com/site/ironspeedlinux/
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> etherlab-users mailing list
>>>>>>>>>>> etherlab-users at etherlab.org
>>>>>>>>>>> http://lists.etherlab.org/mailman/listinfo/etherlab-users
>>>>>>>>>>>


-- 
https://sites.google.com/site/ironspeedlinux/