AMD PCNET: Difference between revisions

Rewrote article to include proper initialization and transmit/receive packet examples
{{In Progress}}
The '''AMD PCNET''' family of network interface chips is supported by most popular virtual machines and emulators, including QEMU, VMware and VirtualBox. While not as simple as the [[RTL8139]], it is easier to test with an emulator, as the RTL8139 is only supported in QEMU, and getting QEMU's full network support running is sometimes difficult. This article will focus on the Am79C970A, a.k.a. the AMD PCnet-PCI II, in VirtualBox.
 
== Overview ==
The PCnet-PCI II is a PCI network adapter. It has built-in support for CRC checks and can automatically pad short packets to the minimum Ethernet length.
This article will cover using the card with IO port accesses as opposed to memory mapped IO.
 
It supports PCI bus mastering and can operate in both 32-bit mode and a legacy 16-bit compatibility mode (this mode is from now on referred to as software style, or SWSTYLE). Access to the card's registers is through an index/data register system in either IO port space or memory mapped IO. Given that the MMIO access is sometimes absent on emulators or certain systems, this article will focus on the IO port access. A final distinction is made between the actual accesses to the index/data registers, which can either be 16-bit or 32-bit. The 32-bit mode is referred to as DWIO in the specifications (as it implies the DWIO bit is set in a particular register). Note that any combination of DWIO and SWSTYLE can be selected.
The card uses a system of index registers. This means that you first write to one IO port to select the chip register you want to access, and then read from or write to a second IO port. Both 16-bit and 32-bit accesses are supported.
 
== Initialization and Register Access ==
== In Real Mode ==
To use the card in real mode, your OS needs to be able to read and write PCI registers and to use an interrupt controller (the 8259 PIC; this has not been tried with the APIC). This article assumes the reader has knowledge of both. Since the card sends and receives data on its own, you will want to use the PIC to switch execution between servicing the card and whatever else your CPU is doing. This article installs the card at IRQ10 (0x01C8).
 
=== PCI Configuration ===
In the PCI configuration space, the card has vendor ID 0x1022 and device ID 0x2000. A separate similar device (PCnet-PCI III and clones) has device ID 0x2001 and is programmed similarly.

The first task of the driver should be to enable the IO ports and bus mastering ability of the device in PCI configuration space. This is done by setting bits 0 and 2 of the command register (the low 16 bits of the dword at configuration offset 0x04), e.g. <source lang="c">uint32_t conf = pciConfigReadDWord(bus, slot, func, offset);
conf &= 0xffff0000; // preserve status register, clear config register
conf |= 0x5; // set bits 0 and 2
pciConfigWriteDWord(bus, slot, func, offset, conf);</source>

You will then want to read the IO base address of the first BAR from configuration space. We will assume this has the value io_base.

=== Card register access ===

As stated above, the card supports an index/data system of accessing its internal registers. This means that the index of the register you wish to access is first written to an index port, followed by either writing a new value to or reading the old value from a data register. To make things slightly more complex, however, the card splits its internal registers into two groups - Control and Status Registers (CSR) and Bus Configuration Registers (BCR). Both share a common index port (called Register Address Port - RAP), but use separate data ports: the Register Data Port (RDP) for CSRs and the BCR Data Port (BDP) for BCRs. During normal initialization and use of the cards, the CSRs are used almost exclusively. A further important register exists in the IO space called the reset register.

A further complication exists in that the offsets of the RAP, Reset register and BDP (but not RDP) relative to io_base vary depending on the current value of DWIO:
{| {{wikitable}}
! colspan="2" | DWIO = 0 (16-bit access)
! colspan="2" | DWIO = 1 (32-bit access)
! rowspan="2" | Register
|-
! Offset
! Length
! Offset
! Length
|-
| 0 || 16 || 0 || 16 || First 16 bytes of EPROM (the first 6 bytes are MAC address)
|-
| 0x10 || 2 || 0x10 || 4 || RDP - data register for CSRs
|-
| 0x12 || 2 || 0x14 || 4 || RAP - index register for both CSR and BCR access
|-
| 0x14 || 2 || 0x18 || 4 || Reset register
|-
| 0x16 || 2 || 0x1c || 4 || BDP - data register for BCRs
|}

There are two types of registers used by the card: BCRs and CSRs. Below are some helpful real-mode functions for reading and writing them using 16-bit accesses (DWIO = 0).
<source lang="asm">
iobase dw 0xD020 ; the default I/O address in VirtualBox with no extra hardware enabled; your address will most likely not be the same

RDP equ 0x10
RAP equ 0x12
BDP equ 0x16

; IN: ax,register number on the device
; OUT: cx,word read from register
Rcsr: ; read a value of a CSR
nop ; useful marker in reading through crash dumps
pusha
mov dx,[iobase]
add dx,RAP
out dx,ax
mov dx,[iobase]
add dx,RDP
in ax,dx
mov [csrW],ax
popa
mov cx,[csrW]
ret
csrW dw 0xFFFF

; IN: ax,register number on the device; cx,word to be sent to register;
; OUT: nothing
Wcsr:
nop
pusha
mov dx,[iobase] ; base port address
add dx,RAP
out dx,ax
mov dx,[iobase]
add dx,RDP
mov ax,cx ; mov word to be sent out by the port
out dx,ax
popa
ret

; IN: ax,register number on the device
; OUT: cx,word read from register
Rbcr:
nop
pusha
mov dx,[iobase]
add dx,RAP
out dx,ax
mov dx,[iobase]
add dx,BDP
in ax,dx
mov [bcrW],ax
popa
mov cx,[bcrW]
ret
bcrW dw 0xFFFF

; IN: ax,register number on the device; cx,word to be sent to register;
; OUT: nothing
Wbcr:
nop
pusha
mov dx,[iobase] ; base port address
add dx,RAP
out dx,ax
mov dx,[iobase]
add dx,BDP
mov ax,cx ; mov word to be sent out by the port
out dx,ax
popa
ret
</source>
 
In addition, the card requires different length data accesses to the registers depending on the setting of DWIO: if DWIO=0, then offsets 0x0 through 0xf are read as single bytes, all others read/written as 16-bit words; if DWIO=1, then all registers (including 0x0 through 0xf) are read/written as 32-bit double words (and with accesses aligned on 32-bit boundaries). We can write functions to access the registers:<source lang="c">void writeRAP32(uint32_t val)
{
outd(io_base + 0x14, val);
}

void writeRAP16(uint16_t val)
{
outw(io_base + 0x12, val);
}

uint32_t readCSR32(uint32_t csr_no)
{
writeRAP32(csr_no);
return ind(io_base + 0x10);
}

uint16_t readCSR16(uint16_t csr_no)
{
writeRAP16(csr_no); // 16-bit mode: RAP is at offset 0x12
return inw(io_base + 0x10);
}

void writeCSR32(uint32_t csr_no, uint32_t val)
{
writeRAP32(csr_no);
outd(io_base + 0x10, val);
}

void writeCSR16(uint16_t csr_no, uint16_t val)
{
writeRAP16(csr_no);
outw(io_base + 0x10, val);
}</source>
and similar functions for BCRs.

In real mode, the initialization process of the card consists of giving the card the address of a structure which defines the way the card will act.

<source lang="asm">
; see page 154 table 34 (SSIZE32=0)
MODE dw 1000000010000000b ; set 10Base-T and prom mode (see CSR15)
PADR: ; physical address or
dw 0xFFFF
dw 0xFFFF
dw 0xFFFF
LOGICAL:
dw 0x0000
dw 0x0000
dw 0x0000
dw 0x0000
RDRA dw ADRRDRA ; address for where your receive descriptors are
db 00000000b ;bits 15-13 are for length of descriptors (see page 155)
db 0x0000 ; HI address in real mode not useful
TDTA dw TDTA ; address for where your transmit descriptors are
db 00000000b ; see page 155 table 37
db 0x0000 ; HI address in real mode not useful
</source>

Descriptor Ring format for Receiver:

Receiver Descriptor ring:

Descriptor Ring format for Transmitter:

Transmitter Descriptor ring:
 
Unfortunately it is difficult to determine the current state of DWIO (and therefore know which state the card is in when the driver initializes) as the only way of reporting it is to read BCR18 bit 7, which in turn requires knowledge of the BDP, which requires knowledge of DWIO etc. Fortunately, following a reset (either hard or soft), the card is in a known state with DWIO=0 (16-bit access). Normally, therefore, when your driver takes control of the card, it can assume it is in 16-bit mode. However, it may be the case that firmware or a bootloader has already initialized the card into 32-bit mode, which you didn't know about. You should, therefore, reset the card when your driver takes control. This is accomplished by a read of the reset register:<source lang="c">ind(io_base + 0x18);
inw(io_base + 0x14);</source>Note this snippet reads first from the 32-bit reset register: if the card is in 32-bit mode this will trigger a reset, if in 16-bit mode it will simply read garbage without affecting the card. It then reads from the 16-bit reset register: if the card was initially in 32-bit mode, it has since been reset and will now be reset again, otherwise it will reset for the first time.
 
You should now wait 1 microsecond for the reset to complete (using your OS's timing functions).
 
Then, if desired, you can program the card into 32-bit mode (the rest of this article assumes this, but you can easily substitute read/writeCSR32 with read/writeCSR16 if you like). To do this, we simply need to perform a 32 bit write of 0 to the RDP. After reset, RAP points to CSR0, so we are effectively writing 0 to CSR0. This will not cause any harm as we completely reprogram CSR0 later anyway.<source lang="c">outd(io_base + 0x10, 0);</source>
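Because the RAP, reset and BDP offsets move once DWIO is set, it can help to keep the offset table from earlier in one place. The following is only a minimal sketch: the names <code>pcnet_reg</code> and <code>pcnet_port_offset</code> are hypothetical and are not used elsewhere in this article, and the values are simply copied from the DWIO offset table.

```c
#include <stdint.h>

/* Hypothetical names for the four register ports described in this article. */
enum pcnet_reg { PCNET_RDP, PCNET_RAP, PCNET_RESET, PCNET_BDP };

/* Return the offset of a register port relative to io_base.
 * dwio is nonzero when the card is in 32-bit (DWIO = 1) access mode. */
static uint16_t pcnet_port_offset(enum pcnet_reg reg, int dwio)
{
    switch (reg) {
    case PCNET_RDP:   return 0x10;               /* same in both modes */
    case PCNET_RAP:   return dwio ? 0x14 : 0x12;
    case PCNET_RESET: return dwio ? 0x18 : 0x14;
    case PCNET_BDP:   return dwio ? 0x1c : 0x16;
    }
    return 0; /* not reached for valid enum values */
}
```

A driver that tracks the current DWIO state in a variable could then compute every port as <code>io_base + pcnet_port_offset(reg, dwio)</code> instead of hard-coding two sets of constants.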
 
=== Interrupt handling ===
 
The next section will enable some interrupts on the card. We will flesh out the interrupt handler later, but you should install the interrupt handler here as otherwise you will get crashes due to unhandled interrupts. You need to parse [[ACPI]] tables etc. to determine the proper interrupt routing for your device.
 
=== SWSTYLE ===
 
We now need to set the value of SWSTYLE to 2. After reset, it defaults to 0 representing 16-bit legacy compatibility mode. We want the card to be able to access all of the first 4 GiB of (physical) memory for its buffers, so need to set it to 32-bit mode.<source lang="c">uint32_t csr58 = readCSR32(58);
csr58 &= 0xff00; // SWSTYLE occupies the low 8 bits of CSR58
csr58 |= 2;
writeCSR32(58, csr58);</source>
 
=== ASEL ===
 
The card has both 10/100baseT and coaxial outputs. It has functionality to automatically select whichever is attached which is normally enabled by default. This snippet simply ensures this functionality is enabled by setting the ASEL bit in BCR2 just in case firmware has altered this for some reason.<source lang="c">uint32_t bcr2 = readBCR32(2);
bcr2 |= 0x2;
writeBCR32(2, bcr2);</source>
 
=== Ring buffers ===
 
The card uses two ring buffers to store packets: one for packets received and one for packets to be transmitted. The actual ring buffers themselves are regions of physical memory containing a set number of descriptor entries (DEs) which are fixed 16 bytes in length (for SWSTYLE=2). Each of these then contains a pointer to the actual physical address of the memory used for the packet.
 
For example, if you wish to define 32 receive buffers and 8 transmit buffers (similar to what the Linux driver does), then you would need to allocate 32 * 16 bytes for the receive DEs, 8 * 16 bytes for the transmit DEs, 32 * packet length (1544 is used in Linux, but we will use 1548 as it is a multiple of 4) for the actual receive buffers and 8 * packet length for the actual transmit buffers.
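The arithmetic above can be written out as a small sketch. The helper names here are hypothetical and only illustrate the sizing; how the memory is actually allocated (and kept physically contiguous below 4 GiB) is up to your OS.

```c
#include <stddef.h>

enum { PCNET_DE_SIZE = 16 }; /* descriptor entry size for SWSTYLE = 2 */

/* Bytes needed for one ring of descriptor entries. */
static size_t pcnet_ring_bytes(size_t count)
{
    return count * PCNET_DE_SIZE;
}

/* Bytes needed for the packet buffers behind one ring. */
static size_t pcnet_buffer_bytes(size_t count, size_t buffer_size)
{
    return count * buffer_size;
}

/* Total bytes for both rings plus all of their packet buffers. */
static size_t pcnet_total_bytes(size_t rx_count, size_t tx_count,
                                size_t buffer_size)
{
    return pcnet_ring_bytes(rx_count) + pcnet_ring_bytes(tx_count)
         + pcnet_buffer_bytes(rx_count, buffer_size)
         + pcnet_buffer_bytes(tx_count, buffer_size);
}
```

With 32 receive buffers, 8 transmit buffers and 1548-byte packet buffers this comes to 512 + 128 + 49536 + 12384 = 62560 bytes.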
 
The DEs contain a number of important bits for sending/receiving packets, e.g. destination MAC address, error bits etc. but they also contain an important bit called the ownership bit (bit 7 of byte 7). If this is cleared, it means the driver 'owns' that particular ring buffer entry. If it is set, it means the card owns it (and the driver should not touch any part of the entry). The way this works is that the only party (driver or card) that can read/write the entry is the one that owns it, and in particular only the owning party can flip ownership to the other party. At initialization, you would want the card to 'own' all the receive buffers (so it can write new packets into them that it receives, then flip ownership to the driver), and the driver to 'own' all the transmit buffers (so it can write packets to be transmitted, then flip ownership to the card).
 
You should also have a variable that stores the current 'pointer' into each buffer (i.e. what is the next one the driver expects to read/write). The card maintains separate pointers internally. You also need a simple way of incrementing the pointer (and wrapping back to the start if necessary).
 
Thus to initialize the ring buffers you'd want something like:<source lang="c">int rx_buffer_ptr = 0;
int tx_buffer_ptr = 0; // pointers to transmit/receive buffers
 
int rx_buffer_count = 32; // total number of receive buffers
int tx_buffer_count = 8; // total number of transmit buffers
 
const int buffer_size = 1548; // length of each packet buffer
 
const int de_size = 16; // length of descriptor entry
 
uint8_t *rdes; // pointer to ring buffer of receive DEs
uint8_t *tdes; // pointer to ring buffer of transmit DEs
 
uint32_t rx_buffers; // physical address of actual receive buffers (< 4 GiB)
uint32_t tx_buffers; // physical address of actual transmit buffers (< 4 GiB)
 
// does the driver own the particular buffer?
int driverOwns(uint8_t *des, int idx)
{
return (des[de_size * idx + 7] & 0x80) == 0;
}
 
// get the next transmit buffer index (wraps back to 0 after the last entry)
int nextTxIdx(int cur_tx_idx)
{
int ret = cur_tx_idx + 1;
if(ret == tx_buffer_count)
ret = 0;
return ret;
}

// get the next receive buffer index (wraps back to 0 after the last entry)
int nextRxIdx(int cur_rx_idx)
{
int ret = cur_rx_idx + 1;
if(ret == rx_buffer_count)
ret = 0;
return ret;
}
 
// initialize a DE
void initDE(uint8_t *des, int idx, int is_tx)
{
memset(&des[idx * de_size], 0, de_size); // zero the whole entry
// first 4 bytes are the physical address of the actual buffer
uint32_t buf_addr = rx_buffers;
if(is_tx)
buf_addr = tx_buffers;
*(uint32_t *)&des[idx * de_size] = buf_addr + idx * buffer_size;
 
// next 2 bytes are 0xf000 OR'd with the first 12 bits of the 2s complement of the length
uint16_t bcnt = (uint16_t)(-buffer_size);
bcnt &= 0x0fff;
bcnt |= 0xf000;
*(uint16_t *)&des[idx * de_size + 4] = bcnt;
 
// finally, set ownership bit - transmit buffers are owned by us, receive buffers by the card
if(!is_tx)
des[idx * de_size + 7] = 0x80;
}</source>
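The byte offsets used above can also be expressed as a struct, which some people find easier to read than indexing a byte array. This is only a sketch under stated assumptions: the struct and helper names are hypothetical, the layout assumes SWSTYLE = 2 and a little-endian host, and the fields are limited to those this article actually uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct view of one 16-byte descriptor entry (SWSTYLE = 2). */
struct pcnet_de {
    uint32_t buf_addr; /* bytes 0-3: physical address of the packet buffer */
    uint16_t bcnt;     /* bytes 4-5: 0xf000 | low 12 bits of -length */
    uint16_t status;   /* bytes 6-7: bit 15 (bit 7 of byte 7) is ownership */
    uint32_t msg;      /* bytes 8-11: bytes 8 and 9 hold the received length */
    uint32_t reserved; /* bytes 12-15 */
};

#define PCNET_DE_OWN 0x8000u /* set: card owns the entry; clear: driver owns it */

/* Encode a buffer or packet length the way the BCNT field expects. */
static uint16_t pcnet_encode_bcnt(uint16_t len)
{
    return (uint16_t)(0xf000u | ((0u - len) & 0x0fffu));
}

/* Struct-based equivalent of driverOwns() for a single entry. */
static int pcnet_de_driver_owns(const struct pcnet_de *de)
{
    return (de->status & PCNET_DE_OWN) == 0;
}
```

For the 1548-byte buffers used in this article the encoded BCNT is 0xf9f4, and for a 64-byte packet it is 0xffc0.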
 
=== Card registers setup ===
 
Finally, once all our ring buffers are set up, we need to give their addresses to the card. There are two ways of setting up the card registers: we can either program them all directly, or set up a special initialization structure and then pass that to the card. In this article we will use the latter.
 
You will need to allocate a 28 byte region of physical memory, aligned on a 32-bit boundary. The members are:
 
{| {{wikitable}}
! Offset (bytes)
! Byte 3
! Byte 2
! Byte 1
! Byte 0
|-
| 0 || TLEN << 4 || RLEN << 4 || colspan="2" | MODE
|-
| 4 || MAC[3] || MAC[2] || MAC[1] || MAC[0]
|-
| 8 || colspan="2" | Reserved (0) || MAC[5] || MAC[4]
|-
| 12 || LADR[3] || LADR[2] || LADR[1] || LADR[0]
|-
| 16 || LADR[7] || LADR[6] || LADR[5] || LADR[4]
|-
| 20 || colspan="4" | Physical address of first receive descriptor entry
|-
| 24 || colspan="4" | Physical address of first transmit descriptor entry
|}

Note that TLEN and RLEN are the log2 of the number of transmit and receive descriptor entries respectively. For example, if you have 8 transmit descriptor entries, TLEN would be 3 (as 2^3 = 8), which you then need to shift left by 4 bits, so the actual value to write to byte 3 would be 0x30. The maximum value of TLEN and RLEN is 9 (i.e. 512 buffers).

MODE provides various functions to control how the card works with regards to sending and receiving packets, and running loopback tests. You probably want to set it to zero (enable transmit and receive functionality, receive broadcast packets and those sent to this physical address, disable promiscuous mode). See the spec description of CSR15 for further details.

You also need to specify the physical address (MAC address) you want the card to use. If you want to keep the current one, you will need to first read it from the EPROM of the card (it is exposed as the first 6 bytes of the IO space that the registers are in).

LADR is the logical address filter you want the card to use when deciding to accept Ethernet packets with logical addressing. If you do not wish to use logical addressing (the default), then set these bytes to zero.

To actually set up the card registers, we provide it with the address of our initialization structure by writing the low 16-bits of its address to CSR1 and the high 16-bits to CSR2.

You can also set up other registers at this point, e.g. CSR3 (only interesting bits shown):
{| {{wikitable}}
! Bit number
! Functionality
|-
| 10 || Receive interrupt mask - if set then incoming packets won't trigger an interrupt
|-
| 9 || Transmit interrupt mask - if set then an interrupt won't be triggered when a packet has completed sending. Depending on your design this may be preferable.
|-
| 8 || Initialization done mask - if set then you won't get an interrupt when the card has finished initializing. You probably want this as it is far easier to poll for this situation (which only occurs once anyway).
|-
| 2 || Big endian enable - you will want to ensure this is cleared to zero
|}

And you may want to set bit 11 of CSR4 which automatically pads Ethernet packets which are too short to be at least 64 bytes.

Once all the control registers are set up, you set bit 0 of CSR0, and then wait for initialization to be done. You can do this by either waiting for an interrupt (if you didn't disable the initialization done interrupt in CSR3) or by polling until CSR0 bit 8 is set. Note that if you want to wait for an interrupt you will also need to set bit 6 of CSR0 or interrupts won't be generated (you will need to enable this anyway to get notification of received packets, so it makes sense to set it at the same time as the initialization bit).

Once initialization has completed, you can finally start the card. This is accomplished by clearing both the INIT bit (bit 0) and STOP bit (bit 2) in CSR0 and setting the STRT bit (bit 1) at the same time.

== IO Ports ==
In 32-bit mode the card has only 4+6 IO ports necessary to use the card:
{| {{wikitable}}
! Offset (from IO base)
! Name
|-
| 0x00-0x05 || MAC0-MAC5
|-
| 0x10 || RDP
|-
| 0x14 || RAP
|-
| 0x18 || RST
|-
| 0x1C || BDP
|}

*'''MAC0-MAC5''' are the ports used to access the MAC address stored in the card's ROM. These should be accessed byte-by-byte in order to ensure compatibility with all three emulators.
*'''RAP''' is the Register Address Port, and selects which card register one wants to access.
*'''RDP''' is the Register Data Port and is the port used to access the first set of registers (the CSRs).
*'''BDP''' is the BCR Data Port and is used to access the second set of registers (the BCRs).
*'''RST''' is used to reset the card.

== DMA Descriptors ==
The PCNET cards use a series of descriptors to handle DMA.
<source lang="c">
struct pcnet_descriptor {
uint32_t address; //physical address of buffer
uint16_t length; //length of buffer (rx) or length of packet (tx)
uint16_t status;
uint32_t flags;
uint32_t user;//can be used to store whatever value you want, such as the virtual address of your buffer
};
</source>

== Sending packets ==

Sending packets involves simply writing the packet details to the next available transmit buffer, then flipping the ownership for the particular ring buffer entry to the card. The card regularly scans all the transmit buffers looking for one it hasn't sent, and then will transmit those it finds.

For example:<source lang="c">int sendPacket(void *packet, size_t len, uint8_t *dest)
{
// the next available descriptor entry index is in tx_buffer_ptr
if(!driverOwns(tdes, tx_buffer_ptr))
{
// we don't own the next buffer, this implies all the transmit
// buffers are full and the card hasn't sent them yet.
// A fully functional driver would therefore add the packet to
// a queue somewhere, and wait for the transmit done interrupt
// then try again. We simply fail and return. You can set
// bit 3 of CSR0 here to encourage the card to send all buffers.
return 0;
}
 
// copy the packet data to the transmit buffer. An alternative would
// be to update the appropriate transmit DE to point to 'packet', but
// then you would need to ensure that packet is not invalidated before
// the card has a chance to send the data.
memcpy((void *)(tx_buffers + tx_buffer_ptr * buffer_size), packet, len);
 
// set the STP bit in the descriptor entry (signals this is the first
// frame in a split packet - we only support single frames)
tdes[tx_buffer_ptr * de_size + 7] |= 0x2;
 
// similarly, set the ENP bit to state this is also the end of a packet
tdes[tx_buffer_ptr * de_size + 7] |= 0x1;
 
// set the BCNT member to be 0xf000 OR'd with the first 12 bits of the
// two's complement of the length of the packet
uint16_t bcnt = (uint16_t)(-len);
bcnt &= 0xfff;
bcnt |= 0xf000;
*(uint16_t *)&tdes[tx_buffer_ptr * de_size + 4] = bcnt;
 
// finally, flip the ownership bit back to the card
tdes[tx_buffer_ptr * de_size + 7] |= 0x80;
 
// update the next transmit pointer
tx_buffer_ptr = nextTxIdx(tx_buffer_ptr);

// report success (the failure path above returns 0)
return 1;
}</source>
 
== Handling interrupts and receiving packets ==
 
Receiving packets is normally done in your interrupt handler - the card will signal an interrupt whenever it receives a packet and has written it to the receive buffer.
 
Note that interrupts can come from many sources (other than new packets). If a new packet has been signalled then CSR0 bit 10 will be set. There are other bits in CSR0 that can be set (depending on how you set up interrupt masks in CSR3) and additionally other bits in CSR4 that can signal interrupts (although these are usually masked out on reset). After you have properly handled an interrupt, you will need to write a 1 back to the appropriate bit in CSR0 or CSR4 before sending EOI to your interrupt controller (or the interrupt will continue to be signalled). ORing CSR0 with 0x7f00 and ORing CSR4 with 0x026a will reset all interrupts.
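As a rough sketch of the acknowledge step, the value written back to CSR0 can be built from only the interrupt bits the handler has actually dealt with. The macro and function names below are hypothetical. Bits 10, 8 and 6 are the ones named in this article; bit 9 as the transmit interrupt flag, and the idea that writing 0 to the remaining CSR0 command bits has no effect, are assumptions taken from the datasheet and should be checked against it.

```c
#include <stdint.h>

#define PCNET_CSR0_RINT 0x0400u /* bit 10: a packet has been received */
#define PCNET_CSR0_TINT 0x0200u /* bit 9: transmit completed (assumed) */
#define PCNET_CSR0_IDON 0x0100u /* bit 8: initialization done */
#define PCNET_CSR0_IENA 0x0040u /* bit 6: interrupt enable */

/* Nonzero if the CSR0 value read in the handler reports a received packet. */
static int pcnet_rx_pending(uint32_t csr0)
{
    return (csr0 & PCNET_CSR0_RINT) != 0;
}

/* Build the value to write back to CSR0: a 1 in every pending
 * RINT/TINT/IDON bit clears it, and IENA is kept set so that
 * further interrupts are still delivered. */
static uint32_t pcnet_ack_value(uint32_t csr0)
{
    uint32_t pending = csr0 & (PCNET_CSR0_RINT | PCNET_CSR0_TINT | PCNET_CSR0_IDON);
    return pending | PCNET_CSR0_IENA;
}
```

For example, a CSR0 value of 0x0472 (receive interrupt pending, interrupts enabled, card running) gives an acknowledge value of 0x0440.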
 
Once a receive packet interrupt has been received, you need to loop through the receive descriptor entries (starting at rx_buffer_ptr) handling each packet until you find an entry which the driver doesn't own, then stop. e.g. <source lang="c">void handleReceiveInterrupt()
{
while(driverOwns(rdes, rx_buffer_ptr))
{
// packet length is given by bytes 8 and 9 of the descriptor
// (no need to negate it unlike BCNT above)
uint16_t plen = *(uint16_t *)&rdes[rx_buffer_ptr * de_size + 8];
 
// the packet itself is written somewhere in the receive buffer
void *pbuf = (void *)(rx_buffers + rx_buffer_ptr * buffer_size);
 
// do something with the packet (i.e. hand to the next layer in the
// network stack). You probably don't want to do any extensive
// processing here (as this is within an interrupt handler) - just
// copy the data somewhere to a queue and continue so that the
// system is interrupted for as little time as possible
handlePacket(pbuf, plen);
 
// hand the buffer back to the card
rdes[rx_buffer_ptr * de_size + 7] = 0x80;
 
// increment rx_buffer_ptr;
rx_buffer_ptr = nextRxIdx(rx_buffer_ptr);
}
 
// set interrupt as handled
writeCSR32(0, readCSR32(0) | 0x0400);
 
// don't forget to send EOI
}</source>
 
==See Also==