As most readers of this blog are probably aware, pre-releases of the Linux Kernel 2.6.27 are able to trash the NVRAM/EEPROM of certain Intel Network cards. As usual, lwn.net has a nice writeup of the issue including some background information.
I had a similar problem in the past, and as a favour for some friends of mine, I've written down a short description of how to restore the ethernet firmware in these cards. This guide should serve as a good primer on reflashing your broken NIC, but it will probably need to be adapted for your own use case.
NB: Instead of just giving a command by command description of what I did, I'll try explaining a bit more about the background and the process of fixing the problem at hand. Maybe this gives other people some insight into valuable problem solving skills.
Some years back we bought quite a few Tyan S5112 machines for bawue.net.
The idea was to fit these machines with the Tyan m3289 server management card, an IPMI card that allows remote power cycling of the machines and offers a serial console via the network.
For the whole setup to work, the IPMI management module needs support from the network interface so that it can receive IP packets while the machine is powered off. After we contacted Tyan support, we were offered a firmware file to flash into the network adapter, activating the needed "management mode". This firmware file came in the form of a .bin file and an accompanying eeupdate.exe file for flashing the firmware image.
The mainboard has two ethernet controllers, with the 82547EI one being the controller utilized by the management card. The lspci output on this board looks as follows:
[root@selene ~]# lspci|grep Ethernet
01:01.0 Ethernet controller: Intel Corporation 82547EI Gigabit Ethernet Controller
03:02.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller
[root@selene ~]#
The instructions for flashing the firmware were relatively simple: Boot with a DOS bootdisk and execute eeupdate -nic=1 -d 82547EI.eep
After I pressed enter, nothing much happened. The system wasn't reading anything from the disk, there was no progress message on the screen, just the general start-message of the firmware update tool.
When nothing happened after 5 minutes of waiting, I foolishly reset the system. Big mistake!
On the next boot of the system, there were no PXE messages from the network card, and during bootup the e1000 Linux driver only threw out the ominous message "The EEPROM Checksum Is Not Valid" without loading the network interface.
It turned out that I had just trashed the firmware of my on-board network interface.
Calling the eeupdate program again did not work, as it bailed out because the EEPROM was corrupted.
At such times, I firmly believe in the use of swearwords accompanied by heavy googling for some advice. Alas, googling told me only that I had broken my hardware and needed a new one.
As returning hardware because of a problem is akin to giving up, which is generally unacceptable, I decided to look into the issue a bit more and find a workable solution to unbrick the network interface.
To recap, I had the following:
- One on-board network interface with a corrupted nvram
- One eeprom image for said network interface (For reference: 82547EI.EEP)
- No working flash program
- No working driver
- One working Linux system without network
- One working Linux notebook with network
Luckily, the working Linux system and network access is all I needed:
The first step was getting the sources of the e1000 driver from the project page. As this was a few years ago, I chose version 7.3.15, which was current at the time.
After untarring the sources, a quick grep -R 'The EEPROM Checksum Is Not Valid' e1000-7.3.15 turned up one hit in e1000-7.3.15/src/e1000_main.c:
/* make sure the EEPROM is good */
if (e1000_validate_eeprom_checksum(&adapter->hw) < 0) {
        DPRINTK(PROBE, ERR, "The EEPROM Checksum Is Not Valid\n");
        err = -EIO;
        goto err_eeprom;
}
So there is a function called e1000_validate_eeprom_checksum that is responsible for checking the validity of the EEPROM. This function is called during the main initialization of the card, and if the checksum is not valid, the error handler err_eeprom is executed, which aborts the module load.
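For background, and as far as I know the e1000 sources: the check adds up the first 64 16-bit EEPROM words (word offsets 0x00 to 0x3F) and expects the total, truncated to 16 bits, to be 0xBABA. The following is only a rough sketch of that calculation in shell, working on the output format of ethtool -e. The function name eeprom_checksum is made up for this illustration, and it assumes the dump lists the bytes low byte first, as in the dumps shown in this article.

```bash
# eeprom_checksum: read an "ethtool -e" style dump on stdin and print the
# 16-bit sum of the first 64 words (128 bytes, low byte first).
# Hypothetical helper, not part of ethtool or the e1000 driver.
eeprom_checksum() {
    local sum=0 count=0 lo="" offset rest byte
    while read -r offset rest; do
        # only lines that start with a hex offset carry data
        case $offset in 0x*) ;; *) continue ;; esac
        for byte in $rest; do
            [ "$count" -ge 128 ] && break
            if [ $((count % 2)) -eq 0 ]; then
                lo=$byte                       # remember the low byte
            else
                # combine low and high byte into a word and add it up
                sum=$(( (sum + 0x$lo + (0x$byte << 8)) & 0xFFFF ))
            fi
            count=$((count + 1))
        done
    done
    printf '0x%04X\n' "$sum"
}

# Toy input: a single word 0xBABA, so the sum is 0xBABA as well.
printf '0x0000 ba ba\n' | eeprom_checksum
```

On a real system one would feed it the complete dump, for example ethtool -e eth1 | eeprom_checksum, and compare the result with 0xBABA.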
On a hunch, I removed the whole check logic contained in this function, which is located in e1000-7.3.15/src/e1000_hw.c. After I was done, the whole function body consisted only of a "return 0" statement, meaning that the checksum check would always succeed.
Building the modified module by calling make in the src dir resulted in an e1000.ko file, which could be loaded into the running kernel by executing "insmod ./e1000.ko". (Note: this will probably not work with current kernels, as the build scripts have changed. Use a current version of the e1000 driver instead.)
To my great relief the driver returned the following message:
Intel(R) PRO/1000 Network Driver - version 7.3.15
Copyright (c) 1999-2006 Intel Corporation.
e1000: 0000:01:01.0: e1000_probe: (PCI:33MHz:32-bit) ff:ff:ff:ff:ff:ff
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: 0000:03:02.0: e1000_probe: (PCI:33MHz:32-bit) 00:e0:81:55:f2:01
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
So even though the mac address of the card is broken, at least the card is somewhat detected and I can work on restoring the eeprom.
For modifying low-level settings of network interfaces under Linux one can usually use the fabulous ethtool utility.
Using the -e parameter dumps the eeprom values of the specified network interface onto the screen. This is great for getting a backup of the eeprom:
[root@selene ~]# ethtool -e eth1 | head -n 5
Offset Values
------ ------
0x0000 00 e0 81 55 f2 01 10 02 ff ff 06 20 ff ff ff ff
0x0010 ff ff ff ff 0b 64 76 10 86 80 76 10 86 80 84 b2
0x0020 dd 20 22 22 00 00 90 2f 80 23 12 00 20 1e 12 00
[root@selene ~]#
Even better is the -E parameter as it allows changing a single byte at a specified address in the eeprom:
[root@selene ~]# ethtool -E eth0 magic 0x10198086 offset 0x0 value 0x00
[root@selene ~]#
This command changes the byte at address 0x0 (the first byte) to the value 0x00. The 0x10198086 value is the "magic" value needed to "unlock" this write operation. The expected value depends on the driver and the card. In the case of the Intel e1000 driver, the magic value is composed of the Device ID (upper 16 bits, here 0x1019) and the Vendor ID (lower 16 bits, here 0x8086) of the selected network card. Both IDs can be read from the lspci -n output.
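As a small illustration of how the magic value relates to the lspci -n output, here is a sketch of a helper that builds it from a "vendor:device" pair. The name e1000_magic is invented for this example, and it assumes the IDs are the last field of the line, as in the lspci -n output shown further down in this article.

```bash
# e1000_magic: print the ethtool magic value for the e1000 driver.
# Accepts either "8086:1019" or a whole "lspci -n" line whose last
# field is the vendor:device pair. Hypothetical helper.
e1000_magic() {
    local ids=${1##* }            # keep only the last space-separated field
    local vendor=${ids%%:*}       # part before the colon
    local device=${ids##*:}       # part after the colon
    # device ID in the upper 16 bits, vendor ID in the lower 16 bits
    printf '0x%s%s\n' "$device" "$vendor"
}

e1000_magic '01:01.0 0200: 8086:1019'   # prints 0x10198086
```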
As I was in a hurry back then to get the machine working again, I didn't try to find out what exactly the magic value was but just commented out this check in the e1000_ethtool.c file.
For reference, my modifications to the e1000 driver are downloadable as a unified diff: e1000-repair.patch.
Now that I could change single values in the EEPROM, it was time to take a look at the EEPROM file provided by Tyan:
[root@localhost root]# head -n 5 82547.eep
E000 2A81 0855 0A10 FFFF FFFF FFFF FFFF
FFFF FFFF 640B 1019 8086 1019 8086 B200
1F35 002A 0E00 0012 0E00 20DD 7777 1F95
0001 1F73 0098 1F72 3FB0 0009 1200 3649
00CF 8FA7 290E 0305 0CCA FFFF FFFF FFFF
Comparing this EEPROM file with the dump taken earlier from the second network interface in the machine showed that the .eep file from Intel was in "mixed-endian" format, meaning I had to shuffle the values around a bit before being able to rewrite the image. The file contains the EEPROM values as groups of two bytes each, with the two bytes of every group in reversed order. The first four byte values in the file are 0xe0 0x00 0x2a 0x81, while in the EEPROM they would be 0x00 0xe0 0x81 0x2a.
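The swap can be illustrated with a few lines of shell. This is only a sketch with an invented function name (eep_words_to_bytes), not the code I used back then. It takes the four-digit groups from the .eep file and prints the bytes in the order ethtool -e shows them.

```bash
# eep_words_to_bytes: convert groups like "640B" into "0b 64".
# Hypothetical helper for illustration only.
eep_words_to_bytes() {
    local word out=""
    for word in "$@"; do
        # ${word#??} is the last byte of the group, ${word%??} the first one
        out="$out ${word#??} ${word%??}"
    done
    # strip the leading space and lowercase the hex digits like ethtool does
    echo "${out# }" | tr 'A-F' 'a-f'
}

# Second row of the .eep file shown above:
eep_words_to_bytes FFFF FFFF 640B 1019 8086 1019 8086 B200
# prints: ff ff ff ff 0b 64 19 10 86 80 19 10 86 80 00 b2
```

The printed line is identical to the 0x0010 row of the ethtool -e eth0 dump shown near the end of this article, which was taken from the repaired interface.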
After I found the correct byte ordering, I could either call ethtool -E manually with the correct addresses and write each byte into the EEPROM by hand, or automate this and reduce the possibility of mistakes. Naturally, automation it is. Back then I chose to write this script in PHP as a small exercise in command-line-interface programming.
The PHP script can be downloaded as eepromer.php and executed by calling php eepromer.php on the shell.
The script works as follows:
At the start, the variable $file contains the name of the file to read in. This file is then opened and read into memory completely, as it is only 6K in size. The PHP string tokenizer function (strtok) is used to extract the values from the file, and the two bytes in each extracted group are then swapped so that they match the byte order used in the EEPROM. When the EEPROM file has been completely parsed, the ethtool commands to write the gathered data into the EEPROM are printed to STDOUT:
[root@localhost root]# php eepromer.php | head -n 5
ethtool -E eth0 magic 0x00 offset 0x0 value 0x00
ethtool -E eth0 magic 0x00 offset 0x1 value 0xE0
ethtool -E eth0 magic 0x00 offset 0x2 value 0x81
ethtool -E eth0 magic 0x00 offset 0x3 value 0x2A
ethtool -E eth0 magic 0x00 offset 0x4 value 0x55
[root@localhost root]#
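For readers who prefer shell over PHP, the same idea can be sketched as follows. This is not the original eepromer.php, just an approximation under a few assumptions: the .eep file consists of whitespace-separated four-digit hex groups (anything else is skipped), and the function name eep_to_ethtool as well as the default device eth0 and magic 0x00 are chosen to mirror the output above.

```bash
# eep_to_ethtool FILE [DEVICE] [MAGIC]
# Print one "ethtool -E" command per EEPROM byte. Nothing is written to
# the card; the commands only go to STDOUT. Hypothetical helper.
eep_to_ethtool() {
    local file=$1 dev=${2:-eth0} magic=${3:-0x00}
    local offset=0 word
    # tr removes carriage returns in case the file has DOS line endings
    for word in $(tr -d '\r' < "$file"); do
        case $word in
            [0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]) ;;
            *) continue ;;          # skip anything that is not a 4-digit group
        esac
        # the last two digits of a group go to the lower address
        printf 'ethtool -E %s magic %s offset 0x%x value 0x%s\n' \
            "$dev" "$magic" "$offset" "${word#??}"
        printf 'ethtool -E %s magic %s offset 0x%x value 0x%s\n' \
            "$dev" "$magic" "$((offset + 1))" "${word%??}"
        offset=$((offset + 2))
    done
}

# Usage (prints commands only):
#   eep_to_ethtool 82547EI.EEP eth0 0x00 | head -n 5
```

Reviewing the generated commands before piping them into a shell is cheap insurance, since every line is a write to the EEPROM.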
By piping the output of this quick-and-dirty script into a shell (php eepromer.php | sh), the content of the .eep file is written for real into the eeprom. The last step is changing the first 6 bytes of the eeprom (offset 0x0 to 0x5) to the original mac address of the network interface.
After this has been done, the network card can be considered repaired, or unbricked.
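The MAC address step can be scripted in the same fashion. The following sketch uses an invented helper name (mac_to_ethtool) and the same default device and magic as the output above; it only prints the six commands for offsets 0x0 to 0x5.

```bash
# mac_to_ethtool MAC [DEVICE] [MAGIC]
# Print the ethtool commands that write a MAC address such as
# 00:e0:81:55:f2:00 into the first six EEPROM bytes. Hypothetical helper.
mac_to_ethtool() {
    local mac=$1 dev=${2:-eth0} magic=${3:-0x00}
    local i=0 byte
    local IFS=:                     # split the address at the colons
    for byte in $mac; do
        printf 'ethtool -E %s magic %s offset 0x%x value 0x%s\n' \
            "$dev" "$magic" "$i" "$byte"
        i=$((i + 1))
    done
}

mac_to_ethtool 00:e0:81:55:f2:00
```

The original MAC address is usually printed on a sticker on the board or can be taken from old DHCP or ARP logs. After the module is reloaded, the e1000_probe line should show it instead of ff:ff:ff:ff:ff:ff.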
This explanation should give anyone some hints on fixing his network card's EEPROM, should this be needed because of problems with the kernel releases mentioned in the beginning. Following the procedure above to the letter is unlikely to produce usable results, as every system and situation is different.
The major problem is that not everyone has a working .eep file from Intel available after his own card is trashed. My suggestion would be to look for a friend who has exactly the same card. The card can be identified by using lspci:
[root@selene ~]# lspci | grep Ethernet
01:01.0 Ethernet controller: Intel Corporation 82547EI Gigabit Ethernet Controller
03:02.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller
[root@selene ~]# lspci -n | grep '01:01\.0'
01:01.0 0200: 8086:1019
[root@selene ~]#
If a friend has the same network card, as indicated by the Vendor and Device ID (8086 == Intel, 1019 == 82547EI Gigabit Ethernet Controller in my example), he should be able to take an EEPROM dump by calling ethtool -e [device] > /tmp/eeprom-[device].dump:
[root@selene ~]# ethtool -e eth0 | head -n 8
Offset Values
------ ------
0x0000 00 e0 81 55 f2 00 10 0a ff ff ff ff ff ff ff ff
0x0010 ff ff ff ff 0b 64 19 10 86 80 19 10 86 80 00 b2
0x0020 35 1f 2a 00 00 0e 12 00 00 0e dd 20 77 77 95 1f
0x0030 01 00 73 1f 98 00 72 1f b0 3f 09 00 00 12 49 36
0x0040 cf 00 a7 8f 0e 29 05 03 c8 0c ff ff ff ff ff ff
0x0050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 06
[root@selene ~]#
This dump file can then be written into your own NVRAM by an easier procedure than the one described above. After all, no endianness swapping is necessary, as ethtool already returned the data in the correct order. A bit of reformatting of the input is still necessary, but this can be accomplished with a simple bash script. The script, which can be written as a single line, removes the header and other superfluous data from the dump file, leaving only the values themselves, which are then echoed to STDOUT in the form of ethtool commands:
[root@selene ~]# magic=0x0; j=0; for i in `sed -e '1,2d' /tmp/eeprom-[device].dump | cut -c 9- |
tr -d "\n"`; do echo ethtool -E eth0 magic $magic offset 0x$(printf %x ${j}) value 0x${i}; j=$(($j + 1)); done | head -n 5
ethtool -E eth0 magic 0x0 offset 0x0 value 0x00
ethtool -E eth0 magic 0x0 offset 0x1 value 0xe0
ethtool -E eth0 magic 0x0 offset 0x2 value 0x81
ethtool -E eth0 magic 0x0 offset 0x3 value 0x55
ethtool -E eth0 magic 0x0 offset 0x4 value 0xf2
[root@selene ~]#
Piping this into a shell will restore your EEPROM, meaning that only the MAC address has to be changed back to the old one.
The output of the above bash line should be checked before it is piped into a shell, however, as the output format of ethtool differs depending on the card and the driver.
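One way to check the result after writing is to take a fresh dump and compare it with the donor dump while ignoring the first six bytes (the MAC address). The sketch below assumes both dumps are saved as files in the ethtool -e format shown above; the function names dump_bytes and same_eeprom_except_mac are made up for this example.

```bash
# dump_bytes: print one lowercase byte value per line from an
# "ethtool -e" style dump on stdin. Hypothetical helper.
dump_bytes() {
    local offset rest byte
    while read -r offset rest; do
        # skip the header lines, keep lines starting with a hex offset
        case $offset in 0x*) ;; *) continue ;; esac
        for byte in $rest; do
            echo "$byte"
        done
    done | tr 'A-F' 'a-f'
}

# same_eeprom_except_mac DUMP_A DUMP_B
# Succeeds if both dumps are identical from byte 6 onwards.
same_eeprom_except_mac() {
    [ "$(dump_bytes < "$1" | tail -n +7)" = "$(dump_bytes < "$2" | tail -n +7)" ]
}

# Usage:
#   ethtool -e eth0 > /tmp/after.dump
#   same_eeprom_except_mac /tmp/donor.dump /tmp/after.dump && echo "EEPROM matches"
```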
Should the network interface not even be visible on the PCI bus anymore (possibly due to the use of the ibautil.exe tool mentioned on some webpages), reflashing the main BIOS might work for some systems. The flashrom utility from the coreboot project might come in handy for this.
Otherwise it might be necessary to first rewrite the relevant part of the EEPROM through SPI, the Serial Peripheral Interface, in order to have the card enumerate again on the PCI bus.
Should this be the case, it might be easier to just return your mainboard for repair though.
Should you be willing, however, to try your luck and certainly void any little bit of warranty you had left, Intel offers a nice manual explaining the firmware format of its ICH8 system.