[etherlab-users] r8169 patch - packet timeout boot failures

Tue Dec 3 11:43:30 CET 2013

The patch below seems to eliminate the problem. I believe the problem is
related to resetting some registers when link-up is detected.

diff --git a/local_src/r8169-3.2/r8169.c b/local_src/r8169-3.2/r8169.c
index 6df1793..a483fb5 100644
--- a/local_src/r8169-3.2/r8169.c
+++ b/local_src/r8169-3.2/r8169.c
@@ -1290,6 +1290,9 @@ static void __rtl8169_check_link_status(struct net_device *dev,

        if (tp->ecdev) {
                ecdev_set_link(tp->ecdev, tp->link_ok(ioaddr) ? 1 : 0);
+               spin_lock_irqsave(&tp->lock, flags);
+               rtl_link_chg_patch(tp);
+               spin_unlock_irqrestore(&tp->lock, flags);
                return;
        }



On Tue, Dec 3, 2013 at 11:56 AM, Jeroen Van den Keybus <
jeroen.vandenkeybus at gmail.com> wrote:

> Perhaps try hooking up a normal eth interface to the drive and see what
> the autoneg comes up with using ethtool. In the past, I have had trouble
> interfacing an FPGA IP core to a PC Ethernet card when the core was hard
> wired to 100M FD instead of advertising this using autoneg. The PC card
> tried to autoneg and then fell back to 100M HD.
>
> You could try testing with an EK1100 in between the PC and the drive.
>
> J.
>
>
> 2013/12/3 Raz <raziebe at gmail.com>
>
>> I cannot run ethtool on the EtherCAT device because the interface is
>> removed. How can I tell? eth0 is 100 Mbps, but it is my public interface.
>> eth1 is my EtherCAT interface.
>>
>> There is always a link. The first slave is a drive, not an I/O device.
>> This drive runs a Xilinx FPGA with a port stack and a Beckhoff IP core.
>> I am now trying to debug the Realtek driver, let's see...
>>
>>
>>
>>
>> On Tue, Dec 3, 2013 at 11:36 AM, Jeroen Van den Keybus <
>> jeroen.vandenkeybus at gmail.com> wrote:
>>
>>> It would be very useful to know whether e.g. the interfaces ended up in
>>> 100M half duplex or so. Is there a link in those cases ? What's the first
>>> EtherCAT station ? Maybe it doesn't handle autoneg properly during its
>>> reset phase ?
>>>
>>> J.
>>>
>>>
>>>
>>> 2013/12/3 Raz <raziebe at gmail.com>
>>>
>>>> hey
>>>> The problem happens with the Intel e1000e as well as the Realtek. One way
>>>> to bypass it is to boot the master while the Ethernet-EtherCAT cable is
>>>> disconnected, and to connect the cable once the master has claimed the
>>>> interface. This appears to work.
>>>> So there is some sort of initialisation error.
>>>>
>>>>
>>>> On Mon, Dec 2, 2013 at 11:32 AM, Raz <raziebe at gmail.com> wrote:
>>>>
>>>>> I still do not have a reproducible scenario; it only "sometimes" happens.
>>>>> I did not know about -DRTL8169_DEBUG, so I will check and see. Thanks.
>>>>>
>>>>>
>>>>> On Mon, Dec 2, 2013 at 11:27 AM, Jeroen Van den Keybus <
>>>>> jeroen.vandenkeybus at gmail.com> wrote:
>>>>>
>>>>>> Is there a difference between cold and warm boot ? Does unloading the
>>>>>> ec driver, loading/unloading the stock r8169 driver and then reloading the
>>>>>> ec driver work better ? Same scenario but with Realtek drivers (r8168) ?
>>>>>> Also perhaps compile with -DRTL8169_DEBUG ?
>>>>>>
>>>>>> Just some thoughts.
>>>>>>
>>>>>> J.
>>>>>>
>>>>>>
>>>>>> 2013/12/2 Raz <raziebe at gmail.com>
>>>>>>
>>>>>>> The timeouts happen after the system boots, not while the slaves are
>>>>>>> in OP mode. So my transmit path is irrelevant here; in any case, a
>>>>>>> transmit happens only from a single thread or through an ioctl (SDO
>>>>>>> reads and so on).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Dec 2, 2013 at 11:01 AM, Jeroen Van den Keybus <
>>>>>>> jeroen.vandenkeybus at gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>> 1. Why do you disable the rtl8169_phy_timer timer?
>>>>>>>>>
>>>>>>>>
>>>>>>>> The rtl8169_phy_timer is regularly polled in ec_poll instead.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> 2. In rtl_hw_start_8168: why do you disable RTL_W16(IntrMask,
>>>>>>>>> tp->intr_event); ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>> The drivers are all non-blocking and interrupt-free. All work that
>>>>>>>> interrupt handlers normally do is done in ec_poll instead.
>>>>>>>>
>>>>>>>> If you cannot send packets anymore, I suspect that you may have
>>>>>>>> overrun the tx queue, i.e. sent a packet before the previous one has been
>>>>>>>> completed. You're also not calling the ethercat transmission functions from
>>>>>>>> different threads, right ?
>>>>>>>>
>>>>>>>>
>>>>>>>>> thank you
>>>>>>>>> raz
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> https://sites.google.com/site/ironspeedlinux/
>>>>>>>>>


-- 
https://sites.google.com/site/ironspeedlinux/