AMD PCNET

From OSDev.wiki
Revision as of 17:14, 25 October 2015 by osdev>Jnc100 (Rewrote article to include proper initialization and transmit/receive packet examples)

The AMD PCNET family of network interface chips is supported by most popular virtual machines and emulators, including QEMU, VMware and VirtualBox. While not as simple as the RTL8139, it is easier to test with an emulator, as the RTL8139 is only supported in QEMU, and getting QEMU's full network support running is sometimes difficult. This article will focus on the Am79C970A, a.k.a. the AMD PCnet-PCI II, in VirtualBox.

Overview

The PCnet-PCI II is a PCI network adapter. It has built-in support for CRC checks and can automatically pad short packets to the minimum Ethernet length.

It supports PCI bus mastering and can operate in either a 32-bit mode or a legacy 16-bit compatibility mode (the selected mode is from now on referred to as the software style, or SWSTYLE). Access to the card's registers is through an index/data register system in either IO port space or memory mapped IO. Given that MMIO access is sometimes absent on emulators or certain systems, this article will focus on IO port access. A final distinction is made between the actual accesses to the index/data registers, which can be either 16-bit or 32-bit. The 32-bit mode is referred to as DWIO in the specifications (as it corresponds to the DWIO bit being set in a particular register). Note that any combination of DWIO and SWSTYLE can be selected.

Initialization and Register Access

PCI Configuration

In the PCI configuration space, the card has vendor ID 0x1022 and device ID 0x2000. A separate similar device (PCnet-PCI III and clones) has device ID 0x2001 and is programmed similarly.

The first task of the driver should be to enable the IO ports and bus mastering ability of the device in PCI configuration space. This is done by setting bits 0 and 2 of the command register (the low 16 bits of the dword at configuration offset 0x04), e.g.

uint32_t conf = pciConfigReadDWord(bus, slot, func, 0x04);
conf &= 0xffff0000; // preserve status register, clear command register
conf |= 0x5;        // set bit 0 (IO space enable) and bit 2 (bus master enable)
pciConfigWriteDWord(bus, slot, func, 0x04, conf);

You will then want to read the IO base address of the first BAR from configuration space. We will assume this has the value io_base.
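As a minimal sketch of that step: the raw BAR value is not itself the port base, because the low two bits of an IO-space BAR are flag bits. The helper name pcnetIoBaseFromBar is hypothetical, and the sketch assumes the first BAR lives at the standard PCI configuration offset 0x10.

```c
#include <stdint.h>

// Decode an IO-space BAR value into a port base address.
// Returns 0 if the BAR describes memory space rather than IO space.
// (pcnetIoBaseFromBar is a hypothetical helper name, not an existing API.)
uint16_t pcnetIoBaseFromBar(uint32_t bar)
{
    if ((bar & 0x1) == 0)
        return 0;                          // bit 0 clear: memory BAR, not IO

    // bits 1:0 are flags, the rest is the port address (16 bits on x86)
    return (uint16_t)(bar & 0xfffffffcu);
}
```

With the configuration accessor used above, this would be called as io_base = pcnetIoBaseFromBar(pciConfigReadDWord(bus, slot, func, 0x10)).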

Card register access

As stated above, the card supports an index/data system of accessing its internal registers. This means that the index of the register you wish to access is first written to an index port, followed by either writing a new value to or reading the old value from a data register. To make things slightly more complex, however, the card splits its internal registers into two groups - Control and Status Registers (CSR) and Bus Control Registers (BCR). Both share a common index port (called Register Address Port - RAP), but use separate data ports: the Register Data Port (RDP) for CSRs and the BCR Data Port (BDP) for BCRs. During normal initialization and use of the card, the CSRs are used almost exclusively. A further important register exists in the IO space called the reset register.

A further complication exists in that the offsets of the RAP, Reset register and BDP (but not RDP) relative to io_base vary depending on the current value of DWIO:

DWIO = 0 (16-bit access)   DWIO = 1 (32-bit access)   Register
Offset   Length (bytes)    Offset   Length (bytes)
0x00     16                0x00     16                First 16 bytes of EPROM (the first 6 bytes are the MAC address)
0x10     2                 0x10     4                 RDP - data register for CSRs
0x12     2                 0x14     4                 RAP - index register for both CSR and BCR access
0x14     2                 0x18     4                 Reset register
0x16     2                 0x1c     4                 BDP - data register for BCRs

In addition, the card requires different length data accesses to the registers depending on the setting of DWIO: if DWIO=0, then offsets 0x0 through 0xf are read as single bytes, all others read/written as 16-bit words; if DWIO=1, then all registers (including 0x0 through 0xf) are read/written as 32-bit double words (and with accesses aligned on 32-bit boundaries). We can write functions to access the registers:

void writeRAP32(uint32_t val)
{
    outd(io_base + 0x14, val);
}

void writeRAP16(uint16_t val)
{
    outw(io_base + 0x12, val);
}

uint32_t readCSR32(uint32_t csr_no)
{
    writeRAP32(csr_no);
    return ind(io_base + 0x10);
}

uint16_t readCSR16(uint16_t csr_no)
{
    writeRAP16(csr_no);   // 16-bit access must use the 16-bit RAP offset
    return inw(io_base + 0x10);
}

void writeCSR32(uint32_t csr_no, uint32_t val)
{
    writeRAP32(csr_no);
    outd(io_base + 0x10, val);
}

void writeCSR16(uint16_t csr_no, uint16_t val)
{
    writeRAP16(csr_no);
    outw(io_base + 0x10, val);
}

and similar functions for BCRs.
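As a rough sketch of what those BCR functions might look like, the following mirrors the CSR accessors but uses the BDP offsets from the table (0x1c for DWIO=1, 0x16 for DWIO=0). So that the sketch can run in user space, outd/ind/outw/inw are replaced here by a tiny fake device that only models RAP and BDP, and io_base is an arbitrary assumed value; in a driver you would use your kernel's real port IO primitives and the io_base read from the BAR.

```c
#include <stdint.h>

// --- stand-ins for illustration only: a fake device modelling RAP and BDP ---
static const uint16_t io_base = 0xc000;   // assumed value, not a real BAR
static uint32_t fake_rap;                 // last index written to RAP
static uint32_t fake_bcr[256];            // fake BCR contents

static void outd(uint16_t port, uint32_t val)
{
    if (port == io_base + 0x14)           // RAP when DWIO=1
        fake_rap = val & 0xff;
    else if (port == io_base + 0x1c)      // BDP when DWIO=1
        fake_bcr[fake_rap] = val;
}

static uint32_t ind(uint16_t port)
{
    return (port == io_base + 0x1c) ? fake_bcr[fake_rap] : 0;
}

static void outw(uint16_t port, uint16_t val)
{
    if (port == io_base + 0x12)           // RAP when DWIO=0
        fake_rap = val & 0xff;
    else if (port == io_base + 0x16)      // BDP when DWIO=0
        fake_bcr[fake_rap] = val;
}

static uint16_t inw(uint16_t port)
{
    return (port == io_base + 0x16) ? (uint16_t)fake_bcr[fake_rap] : 0;
}

// --- the BCR accessors themselves ---
uint32_t readBCR32(uint32_t bcr_no)
{
    outd(io_base + 0x14, bcr_no);         // select the register via RAP
    return ind(io_base + 0x1c);           // read it through BDP
}

void writeBCR32(uint32_t bcr_no, uint32_t val)
{
    outd(io_base + 0x14, bcr_no);
    outd(io_base + 0x1c, val);
}

uint16_t readBCR16(uint16_t bcr_no)
{
    outw(io_base + 0x12, bcr_no);
    return inw(io_base + 0x16);
}

void writeBCR16(uint16_t bcr_no, uint16_t val)
{
    outw(io_base + 0x12, bcr_no);
    outw(io_base + 0x16, val);
}
```

Only the four accessors at the bottom are meant to carry over to a driver; the fake device above them exists purely so the offsets can be exercised without hardware.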

Unfortunately it is difficult to determine the current state of DWIO (and therefore know which state the card is in when the driver initializes) as the only way of reporting it is to read BCR18 bit 7, which in turn requires knowledge of the BDP, which requires knowledge of DWIO etc. Fortunately, following a reset (either hard or soft), the card is in a known state with DWIO=0 (16-bit access). Normally, therefore, when your driver takes control of the card, it can assume it is in 16-bit mode. However, firmware or a bootloader may already have switched the card into 32-bit mode without your driver knowing. You should, therefore, reset the card when your driver takes control. This is accomplished by a read of the reset register:

ind(io_base + 0x18);
inw(io_base + 0x14);

Note this snippet reads first from the 32-bit reset register: if the card is in 32-bit mode this will trigger a reset, if in 16-bit mode it will simply read garbage without affecting the card. It then reads from the 16-bit reset register: if the card was initially in 32-bit mode, it has since been reset and will now be reset again, otherwise it will reset for the first time.

You should now wait 1 microsecond for the reset to complete (using your OS's timing functions).

Then, if desired, you can program the card into 32-bit mode (the rest of this article assumes this, but you can easily substitute read/writeCSR32 with read/writeCSR16 if you like). To do this, we simply need to perform a 32-bit write of 0 to the RDP. After reset, RAP points to CSR0, so we are effectively writing 0 to CSR0. This will not cause any harm as we completely reprogram CSR0 later anyway.

outd(io_base + 0x10, 0);

Interrupt handling

The next section will enable some interrupts on the card. We will flesh out the interrupt handler later, but you should install the interrupt handler here as otherwise you will get crashes due to unhandled interrupts. You need to parse ACPI tables etc. to determine the proper interrupt routing for your device.

SWSTYLE

We now need to set the value of SWSTYLE to 2. After reset, it defaults to 0, representing 16-bit legacy compatibility mode. We want the card to be able to access all of the first 4 GiB of (physical) memory for its buffers, so we need to set it to 32-bit mode.

uint32_t csr58 = readCSR32(58);
csr58 &= 0xff00;  // SWSTYLE occupies the low 8 bits of CSR58
csr58 |= 2;
writeCSR32(58, csr58);

ASEL

The card has both 10/100baseT and coaxial outputs. It can automatically select whichever is attached, and this is normally enabled by default. This snippet simply ensures the functionality is enabled by setting the ASEL bit in BCR2, just in case firmware has altered it for some reason.

uint32_t bcr2 = readBCR32(2);
bcr2 |= 0x2;
writeBCR32(2, bcr2);

Ring buffers

The card uses two ring buffers to store packets: one for packets received and one for packets to be transmitted. The actual ring buffers themselves are regions of physical memory containing a set number of descriptor entries (DEs) which are fixed 16 bytes in length (for SWSTYLE=2). Each of these then contains a pointer to the actual physical address of the memory used for the packet.

For example, if you wish to define 32 receive buffers and 8 transmit buffers (similar to what the Linux driver does), then you would need to allocate 32 * 16 bytes for the receive DEs, 8 * 16 bytes for the transmit DEs, 32 * packet length (1544 is used in Linux, but we will use 1548 as it is a multiple of 4) for the actual receive buffers and 8 * packet length for the actual transmit buffers.

The DEs contain a number of important bits for sending/receiving packets, e.g. destination MAC address, error bits etc. but they also contain an important bit called the ownership bit (bit 7 of byte 7). If this is cleared, it means the driver 'owns' that particular ring buffer entry. If it is set, it means the card owns it (and the driver should not touch any part of the entry). The way this works is that the only party (driver or card) that can read/write the entry is the one that owns it, and particularly only the owning party can flip ownership back to the other party. At initialization, you would want the card to 'own' all the receive buffers (so it can write new packets into them that it receives, then flip ownership to the driver), and the driver to 'own' all the transmit buffers (so it can write packets to be transmitted, then flip ownership to the card).
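To make the layout concrete, here is one possible struct view of a descriptor entry for SWSTYLE=2. It is only a sketch: the struct, field and function names are invented for illustration, only the fields this article uses are named, and the mapping of the ownership bit onto bit 15 of the 16-bit status field assumes a little-endian CPU such as x86.

```c
#include <stddef.h>
#include <stdint.h>

// One descriptor entry as laid out for SWSTYLE=2 (16 bytes).
// Names are illustrative; only fields used in this article are described.
struct pcnet_de
{
    uint32_t buf_addr;   // bytes 0-3: physical address of the packet buffer
    uint16_t bcnt;       // bytes 4-5: 0xf000 | 12-bit two's complement of length
    uint16_t status;     // bytes 6-7: flag bits; bit 15 is bit 7 of byte 7
    uint32_t msg_len;    // bytes 8-11: bytes 8-9 hold the received length
    uint32_t reserved;   // bytes 12-15
};

_Static_assert(sizeof(struct pcnet_de) == 16, "descriptor entry must be 16 bytes");

#define PCNET_DE_OWN 0x8000u   // set: card owns the entry; clear: driver owns it

// Does the driver currently own this entry?
int deDriverOwns(const volatile struct pcnet_de *de)
{
    return (de->status & PCNET_DE_OWN) == 0;
}

// Hand the entry over to the card by setting the ownership bit.
void deGiveToCard(volatile struct pcnet_de *de)
{
    de->status |= PCNET_DE_OWN;
}
```

The pointers are declared volatile because the card modifies the entries behind the compiler's back. The rest of this article indexes the rings as raw byte arrays instead, which is equivalent.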

You should also have a variable that stores the current 'pointer' into each buffer (i.e. what is the next one the driver expects to read/write). The card maintains separate pointers internally. You also need a simple way of incrementing the pointer (and wrapping back to the start if necessary).

Thus to initialize the ring buffers you'd want something like:

int rx_buffer_ptr = 0;
int tx_buffer_ptr = 0;                 // pointers to transmit/receive buffers

int rx_buffer_count = 32;              // total number of receive buffers
int tx_buffer_count = 8;               // total number of transmit buffers

const int buffer_size = 1548;          // length of each packet buffer

const int de_size = 16;                // length of descriptor entry

uint8_t *rdes;                         // pointer to ring buffer of receive DEs
uint8_t *tdes;                         // pointer to ring buffer of transmit DEs

uint32_t rx_buffers;                   // physical address of actual receive buffers (< 4 GiB)
uint32_t tx_buffers;                   // physical address of actual transmit buffers (< 4 GiB)

// does the driver own the particular buffer?
int driverOwns(uint8_t *des, int idx)
{
    return (des[de_size * idx + 7] & 0x80) == 0;
}

// get the next transmit buffer index
int nextTxIdx(int cur_tx_idx)
{
    int ret = cur_tx_idx + 1;
    if(ret == tx_buffer_count)         // wrap after the last valid index
        ret = 0;
    return ret;
}

// get the next receive buffer index
int nextRxIdx(int cur_rx_idx)
{
    int ret = cur_rx_idx + 1;
    if(ret == rx_buffer_count)         // wrap after the last valid index
        ret = 0;
    return ret;
}

// initialize a DE
void initDE(uint8_t *des, int idx, int is_tx)
{
    memset(&des[idx * de_size], 0, de_size);   // memset takes (ptr, value, length)
    
    // first 4 bytes are the physical address of the actual buffer
    uint32_t buf_addr = rx_buffers;
    if(is_tx)
        buf_addr = tx_buffers;
    *(uint32_t *)&des[idx * de_size] = buf_addr + idx * buffer_size;

    // next 2 bytes are 0xf000 OR'd with the first 12 bits of the 2s complement of the length
    uint16_t bcnt = (uint16_t)(-buffer_size);
    bcnt &= 0x0fff;
    bcnt |= 0xf000;
    *(uint16_t *)&des[idx * de_size + 4] = bcnt;

    // finally, set ownership bit - transmit buffers are owned by us, receive buffers by the card
    if(!is_tx)
        des[idx * de_size + 7] = 0x80;
}
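The BCNT encoding appears both when initializing descriptors and when sending packets, so it can be convenient to factor it out. The helper below is a small sketch of that encoding; encodeBcnt is a hypothetical name.

```c
#include <stdint.h>

// Encode a buffer or packet length (in bytes) as a BCNT field: the low
// 12 bits of the two's complement of the length, with the top four bits
// set to ones. (encodeBcnt is a hypothetical helper name.)
uint16_t encodeBcnt(uint16_t len)
{
    uint16_t bcnt = (uint16_t)(0u - len);   // two's complement, truncated to 16 bits
    bcnt &= 0x0fff;                          // keep the low 12 bits
    bcnt |= 0xf000;                          // top four bits must be ones
    return bcnt;
}
```

For the 1548-byte buffers used here this gives 0xf9f4, and for a 64-byte packet 0xffc0.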

Card registers setup

Finally, once all our ring buffers are set up, we need to give their addresses to the card. There are two ways of setting up the card registers: we can either program them all directly, or set up a special initialization structure and then pass that to the card. In this article we will use the latter.

You will need to allocate a 28 byte region of physical memory, aligned on a 32-bit boundary. The members are:

Offset (bytes)   Byte 3        Byte 2        Byte 1        Byte 0
0                TLEN << 4     RLEN << 4     MODE (bytes 1-0)
4                MAC[3]        MAC[2]        MAC[1]        MAC[0]
8                Reserved (0) (bytes 3-2)    MAC[5]        MAC[4]
12               LADR[3]       LADR[2]       LADR[1]       LADR[0]
16               LADR[7]       LADR[6]       LADR[5]       LADR[4]
20               Physical address of first receive descriptor entry (bytes 3-0)
24               Physical address of first transmit descriptor entry (bytes 3-0)

Note that TLEN and RLEN are the log2 of the number of transmit and receive descriptor entries respectively. For example, if you have 8 transmit descriptor entries, TLEN would be 3 (as 2^3 = 8), which you then need to shift left by 4 bits, so the actual value to write to byte 3 would be 0x30. The maximum value of TLEN and RLEN is 9 (i.e. 512 buffers).

MODE provides various functions to control how the card works with regards to sending and receiving packets, and running loopback tests. You probably want to set it to zero (enable transmit and receive functionality, receive broadcast packets and those sent to this physical address, disable promiscuous mode). See the spec description of CSR15 for further details.

You also need to specify the physical address (MAC address) you want the card to use. If you want to keep the current one, you will need to first read it from the EPROM of the card (it is exposed as the first 6 bytes of the IO space that the registers are in).
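If the card is in 32-bit (DWIO=1) mode, the EPROM bytes have to be read as 32-bit words, so the MAC address arrives packed into two dwords. The following sketch unpacks them; macFromEprom32 is a hypothetical helper name, and it assumes little-endian port reads as on x86.

```c
#include <stdint.h>

// Unpack the MAC address from the first two 32-bit words of the card's IO
// space (offsets 0x0 and 0x4), as read with 32-bit accesses when DWIO=1.
// Port reads on x86 are little-endian, so MAC byte 0 is the low byte of 'lo'.
void macFromEprom32(uint32_t lo, uint32_t hi, uint8_t mac[6])
{
    mac[0] = (uint8_t)(lo);
    mac[1] = (uint8_t)(lo >> 8);
    mac[2] = (uint8_t)(lo >> 16);
    mac[3] = (uint8_t)(lo >> 24);
    mac[4] = (uint8_t)(hi);
    mac[5] = (uint8_t)(hi >> 8);
}
```

In a driver this might be called as macFromEprom32(ind(io_base), ind(io_base + 4), mac). In 16-bit mode you would instead read the six bytes individually.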

LADR is the logical address filter you want the card to use when deciding to accept Ethernet packets with logical addressing. If you do not wish to use logical addressing (the default), then set these bytes to zero.
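Putting the fields above together, a minimal sketch of filling in the initialization structure could look like the following. The struct, its field names and the two functions are invented for illustration; MODE and LADR are simply left at zero as suggested above, and the caller is assumed to place the structure in physical memory aligned on a 32-bit boundary.

```c
#include <stdint.h>
#include <string.h>

// Initialization structure for SWSTYLE=2 (28 bytes), following the table.
// Struct and field names are illustrative.
struct pcnet_init_block
{
    uint16_t mode;        // offset 0:  MODE (see CSR15)
    uint8_t  rlen;        // offset 2:  RLEN << 4
    uint8_t  tlen;        // offset 3:  TLEN << 4
    uint8_t  mac[6];      // offset 4:  physical (MAC) address
    uint16_t reserved;    // offset 10: must be zero
    uint8_t  ladr[8];     // offset 12: logical address filter
    uint32_t rx_de_phys;  // offset 20: physical address of first receive DE
    uint32_t tx_de_phys;  // offset 24: physical address of first transmit DE
};

// Return log2(count) if count is a power of two from 1 to 512, else -1.
int ringLenLog2(unsigned count)
{
    int bits = 0;
    if (count == 0 || (count & (count - 1)) != 0)
        return -1;                        // not a power of two
    while ((1u << bits) < count)
        bits++;
    return (bits <= 9) ? bits : -1;       // at most 512 entries
}

// Fill in the structure; returns 1 on success, 0 on a bad ring size.
int fillInitBlock(struct pcnet_init_block *ib, const uint8_t mac[6],
                  unsigned rx_count, unsigned tx_count,
                  uint32_t rx_de_phys, uint32_t tx_de_phys)
{
    int rlen = ringLenLog2(rx_count);
    int tlen = ringLenLog2(tx_count);
    if (rlen < 0 || tlen < 0)
        return 0;

    memset(ib, 0, sizeof *ib);            // MODE = 0, LADR = 0, reserved = 0
    ib->rlen = (uint8_t)(rlen << 4);
    ib->tlen = (uint8_t)(tlen << 4);
    memcpy(ib->mac, mac, 6);
    ib->rx_de_phys = rx_de_phys;
    ib->tx_de_phys = tx_de_phys;
    return 1;
}
```

With 32 receive and 8 transmit entries this writes 0x50 to byte 2 and 0x30 to byte 3.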

To actually set up the card registers, we provide it with the address of our initialization structure by writing the low 16 bits of its address to CSR1 and the high 16 bits to CSR2.

You can also set up other registers at this point, e.g. CSR3 (only interesting bits shown):

Bit number   Functionality
10           Receive interrupt mask - if set then incoming packets won't trigger an interrupt
9            Transmit interrupt mask - if set then an interrupt won't be triggered when a packet has completed sending. Depending on your design this may be preferable.
8            Initialization done mask - if set then you won't get an interrupt when the card has finished initializing. You probably want this as it is far easier to poll for this situation (which only occurs once anyway).
2            Big endian enable - you will want to ensure this is cleared to zero

You may also want to set bit 11 of CSR4, which automatically pads Ethernet packets that are too short so that they are at least 64 bytes long.

Once all the control registers are set up, you set bit 0 of CSR0, and then wait for initialization to be done. You can do this by either waiting for an interrupt (if you didn't disable the initialization done interrupt in CSR3) or by polling until CSR0 bit 8 is set. Note that if you want to wait for an interrupt you will also need to set bit 6 of CSR0 or interrupts won't be generated (you will need to enable this anyway to get notification of received packets, so it makes sense to set it at the same time as the initialization bit).

Once initialization has completed, you can finally start the card. This is accomplished by clearing both the INIT bit (bit 0) and STOP bit (bit 2) in CSR0 and setting the STRT bit (bit 1) at the same time.
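The whole register sequence described in this section can be sketched roughly as follows, using the polling approach. To keep the sketch runnable in user space, readCSR32/writeCSR32 are replaced here by stand-ins backed by a fake register file that reports initialization done as soon as INIT is written; in a driver you would use the real accessors defined earlier. The function name pcnetInitAndStart, the constant names and the max_polls parameter are assumptions made for this example.

```c
#include <stdint.h>

// --- stand-ins for illustration only: a fake CSR register file ---
static uint32_t fake_csr[128];

static uint32_t readCSR32(uint32_t csr_no)
{
    return fake_csr[csr_no];
}

static void writeCSR32(uint32_t csr_no, uint32_t val)
{
    fake_csr[csr_no] = val;
    if (csr_no == 0 && (val & 0x0001u))
        fake_csr[0] |= 0x0100u;   // pretend initialization finishes at once
}

// Bit values taken from the text above (names chosen for this sketch)
#define CSR0_INIT        0x0001u  // bit 0: begin initialization
#define CSR0_STRT        0x0002u  // bit 1: start the card
#define CSR0_IENA        0x0040u  // bit 6: interrupt enable
#define CSR0_IDON        0x0100u  // bit 8: initialization done
#define CSR3_BIG_ENDIAN  0x0004u  // bit 2: must be clear
#define CSR3_IDONM       0x0100u  // bit 8: mask the init-done interrupt
#define CSR4_AUTO_PAD    0x0800u  // bit 11: pad short transmit packets

// Returns 1 once the card is started, 0 if init-done was never seen.
int pcnetInitAndStart(uint32_t init_block_phys, int max_polls)
{
    // tell the card where the initialization structure lives
    writeCSR32(1, init_block_phys & 0xffffu);           // low 16 bits
    writeCSR32(2, (init_block_phys >> 16) & 0xffffu);   // high 16 bits

    // we poll for completion, so mask the init-done interrupt
    writeCSR32(3, (readCSR32(3) | CSR3_IDONM) & ~CSR3_BIG_ENDIAN);
    writeCSR32(4, readCSR32(4) | CSR4_AUTO_PAD);

    // kick off initialization with interrupts enabled
    writeCSR32(0, CSR0_INIT | CSR0_IENA);

    while (max_polls-- > 0)
    {
        if (readCSR32(0) & CSR0_IDON)
        {
            // INIT and STOP written as 0, STRT set, interrupts stay enabled
            writeCSR32(0, CSR0_STRT | CSR0_IENA);
            return 1;
        }
    }
    return 0;   // timed out waiting for initialization
}
```

A bounded poll count is used here rather than an endless loop so that a misconfigured card cannot hang the driver; a real implementation might prefer a timeout based on the OS's timing functions.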

Sending packets

Sending packets involves simply writing the packet details to the next available transmit buffer, then flipping the ownership for the particular ring buffer entry to the card. The card regularly scans all the transmit buffers looking for one it hasn't sent, and then will transmit those it finds.

For example:

int sendPacket(void *packet, size_t len, uint8_t *dest)
{
    // the next available descriptor entry index is in tx_buffer_ptr
    if(!driverOwns(tdes, tx_buffer_ptr))
    {
        // we don't own the next buffer, this implies all the transmit
        //  buffers are full and the card hasn't sent them yet.
        // A fully functional driver would therefore add the packet to
        //  a queue somewhere, and wait for the transmit done interrupt
        //  then try again.  We simply fail and return.  You can set
        //  bit 3 of CSR0 here to encourage the card to send all buffers.
        return 0;
    }

    // copy the packet data to the transmit buffer.  An alternative would
    //  be to update the appropriate transmit DE to point to 'packet', but
    //  then you would need to ensure that packet is not invalidated before
    //  the card has a chance to send the data.
    memcpy((void *)(tx_buffers + tx_buffer_ptr * buffer_size), packet, len);

    // set the STP bit in the descriptor entry (signals this is the first
    //  frame in a split packet - we only support single frames)
    tdes[tx_buffer_ptr * de_size + 7] |= 0x2;

    // similarly, set the ENP bit to state this is also the end of a packet
    tdes[tx_buffer_ptr * de_size + 7] |= 0x1;

    // set the BCNT member to be 0xf000 OR'd with the first 12 bits of the
    //  two's complement of the length of the packet
    uint16_t bcnt = (uint16_t)(-len);
    bcnt &= 0xfff;
    bcnt |= 0xf000;
    *(uint16_t *)&tdes[tx_buffer_ptr * de_size + 4] = bcnt;

    // finally, flip the ownership bit back to the card
    tdes[tx_buffer_ptr * de_size + 7] |= 0x80;

    // update the next transmit pointer
    tx_buffer_ptr = nextTxIdx(tx_buffer_ptr);

    // report success (the failure path above returns 0)
    return 1;
}

Handling interrupts and receiving packets

Receiving packets is normally done in your interrupt handler - the card will signal an interrupt whenever it receives a packet and has written it to the receive buffer.

Note that interrupts can come from many sources (other than new packets). If a new packet has been signalled then CSR0 bit 10 will be set. There are other bits in CSR0 that can be set (depending on how you set up interrupt masks in CSR3) and additionally other bits in CSR4 that can signal interrupts (although these are usually masked out on reset). After you have properly handled an interrupt, you will need to write a 1 back to the appropriate bit in CSR0 or CSR4 before sending EOI to your interrupt controller (or the interrupt will continue to be signalled). ORing CSR0 with 0x7f00 and ORing CSR4 with 0x026a will reset all interrupts.
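One way to build the value written back to CSR0 is sketched below. It acknowledges whichever interrupt bits were pending in the value just read and keeps the interrupt enable bit (bit 6) set. This assumes, as we understand the datasheet, that writing 0 to the INIT/STRT/STOP control bits leaves them unchanged; the names used are hypothetical.

```c
#include <stdint.h>

#define CSR0_IENA      0x0040u  // bit 6: interrupt enable
#define CSR0_IDON      0x0100u  // bit 8: initialization done
#define CSR0_RINT      0x0400u  // bit 10: packet received
#define CSR0_INT_BITS  0x7f00u  // all write-1-to-clear interrupt bits

// Was a receive interrupt pending in this CSR0 value?
int csr0ReceivePending(uint32_t csr0)
{
    return (csr0 & CSR0_RINT) != 0;
}

// Value to write to CSR0 to acknowledge every interrupt bit that was set
// in 'csr0' while leaving interrupts enabled.
uint32_t csr0AckValue(uint32_t csr0)
{
    return (csr0 & CSR0_INT_BITS) | CSR0_IENA;
}
```

A handler might read CSR0 once, act on csr0ReceivePending(), and then call writeCSR32(0, csr0AckValue(csr0)) before sending EOI.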

Once a receive packet interrupt has been received, you need to loop through the receive descriptor entries (starting at rx_buffer_ptr) handling each packet until you find an entry which the driver doesn't own, then stop. e.g.

void handleReceiveInterrupt()
{
    while(driverOwns(rdes, rx_buffer_ptr))
    {
        // packet length is given by bytes 8 and 9 of the descriptor
        //  (no need to negate it unlike BCNT above)
        uint16_t plen = *(uint16_t *)&rdes[rx_buffer_ptr * de_size + 8];

        // the packet itself is written somewhere in the receive buffer
        void *pbuf = (void *)(rx_buffers + rx_buffer_ptr * buffer_size);

        // do something with the packet (i.e. hand to the next layer in the
        //  network stack).  You probably don't want to do any extensive
        //  processing here (as this is within an interrupt handler) - just
        //  copy the data somewhere to a queue and continue so that the
        //  system is interrupted for as little time as possible
        handlePacket(pbuf, plen);

        // hand the buffer back to the card
        rdes[rx_buffer_ptr * de_size + 7] = 0x80;

        // increment rx_buffer_ptr;
        rx_buffer_ptr = nextRxIdx(rx_buffer_ptr);
    }

    // set interrupt as handled (write 1 to CSR0 bit 10)
    writeCSR32(0, readCSR32(0) | 0x0400);

    // don't forget to send EOI
}
