Fix hard reset of alix/soekris hardware under heavy NIC load

The following patch corrects a hard reset that occurs on alix and
soekris net5501 hardware. Specifically, this is an issue with the via
rhine NIC driver. Under periods of extreme load, the via rhine driver
can cause a reset of the entire system. When this happens, no output is
seen on the console; the device simply reboots. This was reported in the
following ticket:

Consistent crash on Soekris NET5501
https://dev.openwrt.org/ticket/11882

I was able to reproduce the problem locally with numerous alix 2d13
platforms. Under heavy 64 byte frame size load from a Smartbits traffic
generator, the alix hardware will hard reset in under 5 minutes. The
same hardware, under the same load, on FreeBSD 9.0 does not reset.

The patch below backports most of the via rhine changes from kernel 3.5.
With OpenWrt trunk being on kernel 3.3.8, it seemed prudent to bring
these changes in to avoid this serious issue.

I've tested this heavily in my test environment and was able to keep the
device stable under load for extended periods of time without any resets.

Also, a nice side effect of this change is that it significantly raises
the performance of the platform. Without this patch the alix 2d13 can move
approximately 29,000 packets per second at 64 byte frame sizes. After
this patch the alix can move 52,000 packets per second at 64 byte frame
sizes.
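
To put those packet rates in perspective, here is a minimal sketch in C
of the arithmetic. The helper names are hypothetical and are not part of
the driver or this patch. It assumes only the 64 frame bytes are counted,
ignoring preamble and inter-frame gap, so the on-wire rate would be
somewhat higher.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of via-rhine.c. */

/* Frame data rate in bits per second (preamble and inter-frame gap ignored). */
static uint64_t frame_bits_per_second(uint64_t packets_per_second,
                                      uint64_t frame_bytes)
{
    return packets_per_second * frame_bytes * 8u;
}

/* Whole-percent increase from before_pps to after_pps, rounded down. */
static uint64_t percent_gain(uint64_t before_pps, uint64_t after_pps)
{
    assert(before_pps != 0 && after_pps >= before_pps);
    return (after_pps - before_pps) * 100u / before_pps;
}
```

With the figures above, frame_bits_per_second(29000, 64) gives 14,848,000
(about 14.8 Mbit/s), frame_bits_per_second(52000, 64) gives 26,624,000
(about 26.6 Mbit/s), and percent_gain(29000, 52000) gives 79.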

I put this patch under the x86 patches as those are the only platforms
with via rhine hardware that I'm aware of. If it needs to go somewhere
else, please let me know.

Thanks,
Adam

Signed-off-by: Adam Gensler <openwrt@kristenandadam.net>

SVN-Revision: 33072
Jo-Philipp Wich 2012-08-09 09:41:22 +00:00
parent 47380a4388
commit c8a0166212
1 changed files with 60 additions and 0 deletions

@@ -0,0 +1,60 @@
--- a/drivers/net/ethernet/via/via-rhine.c
+++ b/drivers/net/ethernet/via/via-rhine.c
@@ -689,9 +689,12 @@ static void __devinit rhine_reload_eepro
#ifdef CONFIG_NET_POLL_CONTROLLER
static void rhine_poll(struct net_device *dev)
{
- disable_irq(dev->irq);
- rhine_interrupt(dev->irq, (void *)dev);
- enable_irq(dev->irq);
+ struct rhine_private *rp = netdev_priv(dev);
+ const int irq = rp->pdev->irq;
+
+ disable_irq(irq);
+ rhine_interrupt(irq, dev);
+ enable_irq(irq);
}
#endif
@@ -929,7 +932,6 @@ static int __devinit rhine_init_one(stru
dev = alloc_etherdev(sizeof(struct rhine_private));
if (!dev) {
rc = -ENOMEM;
- dev_err(&pdev->dev, "alloc_etherdev failed\n");
goto err_out;
}
SET_NETDEV_DEV(dev, &pdev->dev);
@@ -973,7 +975,6 @@ static int __devinit rhine_init_one(stru
}
#endif /* USE_MMIO */
- dev->base_addr = (unsigned long)ioaddr;
rp->base = ioaddr;
/* Get chip registers into a sane state */
@@ -996,8 +997,6 @@ static int __devinit rhine_init_one(stru
if (!phy_id)
phy_id = ioread8(ioaddr + 0x6C);
- dev->irq = pdev->irq;
-
spin_lock_init(&rp->lock);
mutex_init(&rp->task_lock);
INIT_WORK(&rp->reset_task, rhine_reset_task);
@@ -1158,7 +1157,6 @@ static void alloc_rbufs(struct net_devic
rp->rx_skbuff[i] = skb;
if (skb == NULL)
break;
- skb->dev = dev; /* Mark as being used by this device. */
rp->rx_skbuff_dma[i] =
pci_map_single(rp->pdev, skb->data, rp->rx_buf_sz,
@@ -1943,7 +1941,6 @@ static int rhine_rx(struct net_device *d
rp->rx_skbuff[entry] = skb;
if (skb == NULL)
break; /* Better luck next round. */
- skb->dev = dev; /* Mark as being used by this device. */
rp->rx_skbuff_dma[entry] =
pci_map_single(rp->pdev, skb->data,
rp->rx_buf_sz,