Title:
Gigabit Ethernet adapter supporting the iSCSI and IPSEC protocols
Document Type and Number:
Kind Code:
A1

Abstract:
The invention is embodied in a gigabit Ethernet adapter. A system according to the invention provides a compact hardware solution to handling high network communication speeds. In addition, the invention adapts to multiple communication protocols via a modular construction and design.
Inventors:
Minami, John Shigeto (Honolulu, HI, US)
Uyeshiro, Robin Yasu (Kailua, HI, US)
Johnson, Michael Ward (Livermore, CA, US)
Su, Steve (Honolulu, HI, US)
Smith, Michael John Sebastian (Palo Alto, CA, US)
Chen, Addison Kwuanming (Honolulu, HI, US)
Doctor, Mihir Shaileshbhai (Honolulu, HI, US)
Greenfield, Daniel Leo (Honolulu, HI, US)
Application Number:
10/456871
Publication Date:
04/01/2004
Filing Date:
06/05/2003
Primary Class:
International Classes:
(IPC1-7): H04L012/66
Attorney, Agent or Firm:
GLENN PATENT GROUP (3475 EDISON WAY, SUITE L, MENLO PARK, CA, 94025, US)
Claims:
1. An integrated network adapter for decoding and encoding network protocols and processing data, comprising: a hardwired data path for processing streaming data; a hardwired data path for receiving and transmitting packets and for encoding and decoding packets; a plurality of parallel, hardwired protocol state machines; wherein each protocol state machine is optimized for a specific network protocol; and wherein said protocol state machines execute in parallel; and means for scheduling shared resources based on traffic.

2. An integrated network adapter embodied in a single integrated circuit, said network adapter comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; a physical layer module (PHY); a media-access layer module (MAC); an IPsec processing engine integrated with said TOE; and an upper-level protocol (ULP) for offload processing, said ULP integrated with said TOE.

3. The network adapter of claim 2, wherein said ULP implements an iSCSI protocol.

4. An integrated network adapter, comprising: a hardwired data path for receiving and transmitting packets and for encoding and decoding packets; at least one hardwired protocol state machine; and at least one communication channel between said network adapter and a host computer.

5. The network adapter of claim 4, wherein said at least one communication channel employs instruction blocks (IBs) and status messages (SMs) to transfer data and control information.

6. The network adapter of claim 4, further comprising: at least one threshold timer for controlling communication via said at least one communication channel; wherein data are transferred at a selected threshold interval.

7. The network adapter of claim 4, further comprising: a module for establishing at least one data threshold for controlling communication via said at least one communication channel; wherein data are transferred when data levels reach a selected threshold.

8. The network adapter of claim 6, wherein said timer threshold comprises an interrupt aggregation mechanism for reducing a number of interrupts between said network adapter and said host computer and for increasing data throughput.
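The timer and data thresholds of claims 6 to 8 can be illustrated with a minimal sketch. It assumes the adapter raises one host interrupt either when a chosen number of status messages is pending or when a chosen interval has elapsed since the last interrupt. The class name, parameter names, and the choice to check the timer only when a message arrives are assumptions made for illustration and are not taken from the specification.

```python
class InterruptAggregator:
    """Hypothetical model of interrupt aggregation (names are illustrative)."""

    def __init__(self, timer_threshold: float, data_threshold: int):
        self.timer_threshold = timer_threshold  # seconds between forced interrupts
        self.data_threshold = data_threshold    # pending messages that force one
        self.pending = 0
        self.last_interrupt = 0.0

    def on_status_message(self, now: float) -> bool:
        """Record one status message; return True if an interrupt is raised."""
        self.pending += 1
        timer_expired = now - self.last_interrupt >= self.timer_threshold
        if self.pending >= self.data_threshold or timer_expired:
            self.pending = 0
            self.last_interrupt = now
            return True
        return False
```

With a data threshold of three, three status messages cost the host a single interrupt instead of three.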

9. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and an interrupt aggregation mechanism for optimizing data throughput of said processor and TOE.

10. The network adapter of claim 4, further comprising: a module that provides optimized hardware support for TCP Selective Acknowledgement (SACK); wherein TCP acknowledges missing data packets and retransmits said missing data packets, but only said missing data packets.

11. The network adapter of claim 4, further comprising: a module that provides optimized hardware support for TCP slow start.

12. The network adapter of claim 11, wherein slow start slowly ramps up a number of data segments in flight at one time by: initially allowing only two data segments, which correspond to a current window, cwnd, of twice a maximum segment size (MSS), to be in flight before expecting an acknowledgement (ACK); and increasing cwnd by one MSS for each successful ACK received, to allow one more segment in flight, until cwnd is equivalent to a receiver's advertised window.
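The ramp recited in claim 12 can be sketched as follows. The function name and arguments are hypothetical, and the sketch models only the window arithmetic (cwnd starts at two MSS and grows by one MSS per ACK until it reaches the advertised window), not the hardware module itself.

```python
def slow_start_cwnd(mss: int, advertised_window: int, acks: int) -> int:
    """Return cwnd in bytes after `acks` successful ACKs during slow start."""
    # Start with two segments in flight, never exceeding the advertised window.
    cwnd = min(2 * mss, advertised_window)
    for _ in range(acks):
        if cwnd >= advertised_window:
            break  # the ramp ends at the receiver's advertised window
        # Each successful ACK allows one more segment in flight.
        cwnd = min(cwnd + mss, advertised_window)
    return cwnd
```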

13. The network adapter of claim 11, wherein slow start is always started on a new data connection; and wherein slow start may be activated in the middle of a connection when a data traffic congestion event occurs.

14. The network adapter of claim 4, further comprising: a module that provides optimized hardware support for TCP fast retransmit.

15. The network adapter of claim 14, wherein fast retransmit immediately generates an ACK when an out of order segment is received to allow a sender to fill a hole quickly, instead of waiting for a standard time out.

16. The network adapter of claim 14: wherein fast retransmit is invoked when a receiver receives three duplicate ACKs; wherein a sender tries to fill a hole when fast retransmit is invoked; and wherein a duplicate ACK is considered duplicate when ACK and window advertisement values in a segment match one another.
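A minimal sketch of the duplicate-ACK test in claim 16 follows. It assumes that "match" means the ACK number and the advertised window of a segment equal those of the previously received segment; that reading, and all names below, are assumptions for illustration.

```python
from dataclasses import dataclass

DUP_ACK_THRESHOLD = 3  # claim 16: three duplicate ACKs invoke fast retransmit


@dataclass
class DupAckCounter:
    last_ack: int = -1
    last_window: int = -1
    dup_count: int = 0

    def on_ack(self, ack: int, window: int) -> bool:
        """Return True when this ACK is the third duplicate in a row."""
        if ack == self.last_ack and window == self.last_window:
            self.dup_count += 1
        else:
            # A new ACK number or window value restarts the count.
            self.last_ack, self.last_window = ack, window
            self.dup_count = 0
        return self.dup_count == DUP_ACK_THRESHOLD
```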

17. The network adapter of claim 4, further comprising: a module that provides optimized hardware support for TCP window scaling.

18. The network adapter of claim 17, wherein a window scaling operation is based on three variables, which comprise: at least one bit for enabling window scale; at least one bit for setting a scaling factor; and a parameter for determining a scaling value.
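As a sketch of how an enable bit and a scaling factor combine, the following function applies the standard TCP window scale option arithmetic (a left shift of the 16-bit window field, with the shift count capped at 14 as in RFC 1323). The mapping of the claim's three variables onto these arguments is an assumption, and the names are hypothetical.

```python
MAX_SHIFT = 14  # largest shift count permitted by the TCP window scale option


def effective_window(raw_window: int, scale_enabled: bool, shift: int) -> int:
    """Return the window in bytes implied by a 16-bit window field."""
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("TCP window field is 16 bits")
    if not scale_enabled:
        return raw_window
    # Shift counts above the maximum are treated as the maximum.
    return raw_window << min(shift, MAX_SHIFT)
```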

19. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for iSCSI header and data CRC generation and checking.

20. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for iSCSI fixed-interval marker (FIM) generation.

21. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP dump mode, wherein TCP dump mode supports diagnostic programs and packet monitoring programs.

22. The network adapter of claim 21, wherein when TCP dump mode is enabled all received packets are sent to said host as exceptions and all outgoing TCP/UDP packets coming from a hardware stack are looped back as exception packets.

23. The network adapter of claim 22, further comprising: a driver copying said exception packets for a network monitor, and for re-injecting RX packets and sending TX packets as raw Ethernet frames.

24. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for host ACK mode; wherein a TCP ACK is only sent when said host has received data from a TCP segment to provide data integrity where data may be corrupted as they are passed between said host computer and said network adapter.

25. The network adapter of claim 24, wherein host ACK mode waits for a DMA of an MTX buffer that contains a data segment to complete before sending an ACK.

26. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP timestamps to allow TCP to calculate a Round Trip Time Measurement (RTTM) better, and to support Protect Against Wrapped Sequences (PAWS).

27. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP PAWS to protect against old duplicate segments corrupting TCP connections.
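The PAWS check of claim 27 can be sketched with the usual 32-bit modular timestamp comparison: a segment is treated as an old duplicate when its timestamp value is older than the most recent timestamp accepted on the connection. The function and argument names are hypothetical, and the sketch omits the other conditions a full TCP implementation applies.

```python
def paws_reject(seg_tsval: int, ts_recent: int) -> bool:
    """Return True if the segment timestamp is older than ts_recent.

    Timestamps are 32-bit counters that wrap, so the comparison is done
    modulo 2**32: a difference in the upper half of the range means "older".
    """
    diff = (seg_tsval - ts_recent) & 0xFFFFFFFF
    return diff >= 0x80000000
```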

28. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP host retransmit mode to allow retransmission of data directly out of a host's memory buffers, instead of out of buffers located in said network adapter.

29. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for random initial sequence numbers.

30. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; a module that provides optimized hardware support for dual stack mode; and a hardware TCP/IP stack integrated into said network adapter that works in cooperation and in conjunction with a software TCP/IP stack in said host; wherein said network adapter supports co-existence of said software TCP/IP stack running in parallel using a same IP address as said network adapter.

31. The network adapter of claim 30, further comprising: a module for supporting SYN status message mode; wherein any received SYN generates a status message back to said host; wherein SYN/ACK is not generated by said network adapter until said host sends a proper instruction block back to said network adapter; and wherein if said SYN status message mode is not enabled on said network adapter, then SYN/ACKs are generated automatically by said network adapter, and SYN received status messages are not generated.

32. The network adapter of claim 30, further comprising: a module for supporting suppression of RST messages from said network adapter when a TCP packet is received that does not match a network adapter control block database; wherein, instead of automatically generating a RST, said network adapter hardware sends a packet to said host as an exception packet to allow said software TCP/IP stack in said host to handle said packet as an exception packet.

33. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for IP ID splitting to allow said host and said network adapter to share an IP address without overlapping IP ID's.
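One way to picture IP ID splitting is to give the host and the adapter disjoint halves of the 16-bit IP identification space. The claim does not say how the space is divided, so the half-and-half split below, like the names, is purely an assumption for illustration.

```python
import itertools


def ip_id_stream(start: int, size: int):
    """Yield IP IDs that cycle through the range [start, start + size)."""
    for offset in itertools.cycle(range(size)):
        yield start + offset


# Hypothetical split: the host takes the lower half, the adapter the upper half.
host_ids = ip_id_stream(0x0000, 0x8000)
adapter_ids = ip_id_stream(0x8000, 0x8000)
```

Because each side wraps within its own range, packets sent from the shared IP address never carry overlapping IP IDs.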

34. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for filtering of data packets to restrict, accept, or take special action on certain types of packets.

35. The network adapter of claim 34, wherein said filtering can take any of the following attributes: accept a programmed uni-cast address; accept broadcast packets; accept multicast packets; accept addresses within a range specified by a netmask; and allow a promiscuous mode that accepts all packets.
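The filtering attributes listed in claim 35 can be sketched as a single accept test. The sketch assumes the filter acts on IPv4 destination addresses and that each attribute is an independent switch; the function name and keyword arguments are hypothetical.

```python
import ipaddress
from typing import Optional

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def accept_packet(dst: str, *, unicast: Optional[str] = None,
                  broadcast: bool = False, multicast: bool = False,
                  network: Optional[str] = None,
                  promiscuous: bool = False) -> bool:
    """Return True if a packet addressed to `dst` passes the filter."""
    if promiscuous:
        return True  # promiscuous mode accepts all packets
    addr = ipaddress.IPv4Address(dst)
    if unicast is not None and addr == ipaddress.IPv4Address(unicast):
        return True  # programmed unicast address
    if broadcast and addr == LIMITED_BROADCAST:
        return True
    if multicast and addr.is_multicast:
        return True
    # Range specified by a netmask, e.g. "10.0.0.0/255.255.254.0".
    if network is not None and addr in ipaddress.IPv4Network(network):
        return True
    return False
```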

36. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for virtual local area network (VLAN).

37. The network adapter of claim 36, wherein said VLAN module comprises any of: an element for stripping incoming packets of their VLAN headers; an element for generating VLAN tagged outbound packets; an element for generating VLAN parameters from incoming SYN frames; and an element for passing VLAN tag information for exception packets and UDP packets.

38. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for jumbo frames.

39. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for Simple Network Management Protocol (SNMP).

40. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for management information base (MIB).

41. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for flexible and programmable memory error checking and correction (ECC).

42. The network adapter of claim 41: wherein said ECC module uses at least one extra bit to store an encrypted ECC code with data in a packet; wherein when said data are written to memory, said ECC code is also stored; wherein when said data are read back, said stored ECC code is compared to an ECC code which would have been generated when said data were written; wherein if said ECC codes do not match, a determination is made as to which bit in said data is in error; wherein said bit in error is flipped and a memory controller releases said corrected data; wherein errors are corrected on-the-fly, and corrected data are not placed back in said memory; and wherein if same corrupt data are read again, operation of said ECC module is repeated.
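The read-path correction described in claim 42 can be illustrated with a small single-error-correcting code. The claim does not name the code used, so a Hamming(7,4) code is substituted here purely for illustration; the function names are hypothetical. As in the claim, the corrected data are returned to the caller and the stored word is left unchanged.

```python
def hamming_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8  # index 0 unused; parity bits sit at positions 1, 2, 4
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    return sum(bits[p] << (p - 1) for p in range(1, 8))


def hamming_read(codeword: int) -> int:
    """Return the 4 data bits, correcting a single-bit error on the fly."""
    bits = [0] + [(codeword >> (p - 1)) & 1 for p in range(1, 8)]
    syndrome = 0
    for p in range(1, 8):
        if bits[p]:
            syndrome ^= p  # a non-zero result is the position of the bad bit
    if syndrome:
        bits[syndrome] ^= 1  # flip the bit in error
    data = (bits[3], bits[5], bits[6], bits[7])
    return sum(b << i for i, b in enumerate(data))
```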

43. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for network adapter operation in legacy modes; wherein all network traffic is sent to said host regardless of traffic type; and wherein said network adapter operates as if a hardware TCP/IP stack were not present therein.

44. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support that allows IP fragmentation to be handled in either of hardware and software; wherein IP fragmented packets that are passed up as exception packets and reassembled in a software driver are re-injected via an IP injection mode back into said network adapter.

45. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for IP injection that allows IP packets to be injected into a TCP/IP stack in said network adapter.

46. The network adapter of claim 45, said IP injection module further comprising: one or more injection control registers for injecting an IP packet into said network adapter TCP/IP stack; wherein said one or more injection control registers allow said host to inject an IP packet into said network adapter TCP/IP stack.

47. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for any of Network Address Translation (NAT), IP masquerading, and port forwarding via port range registers that forward all packets of a specified type UDP or TCP that fall in a programmable range of ports to an exception path; wherein said port registers enable certain ranges of ports to be used for network control operations and port forwarding.

48. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for multiple IP addresses.

49. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for a debug mode; wherein when a test and control bit is enabled in said network adapter, all IP packets are sent as exceptions to said host.

50. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP time wait state.

51. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for a variable number of connections.

52. The network adapter of claim 51, wherein when said network adapter accepts a connection that equals a network adapter maximum capacity, a next SYN is passed up to said host as an exception packet to allow said host to handle said connection.

53. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for User Datagram Protocol (UDP).

54. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TTL (time to live) to limit an IP packet's life to a selected number of hops.

55. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP keepalive to allow an idle TCP connection to stay connected and not time out by periodically sending a keep alive packet across a link.

56. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP type of service (TOS) for use by routers to prioritize an IP packet.

57. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for TCP quality of service (QoS).

58. The network adapter of claim 57, wherein TCP transmit data flow starts with a Socket Query module, which goes through a transmit data available Bit table looking for entries that have Transmit Data Available bits set, and wherein when said Socket Query module finds such an entry, said Socket Query module puts that entry into one of a plurality of queues according to a socket's User Priority level.
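The scan recited in claim 58 can be sketched as a pass over a bit table that sorts ready sockets into per-priority queues. The number of priority levels, the table layout, and all names are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Sequence


def socket_query(tx_data_available: Sequence[bool],
                 priorities: Sequence[int],
                 num_levels: int = 4) -> List[Deque[int]]:
    """Place each socket with its Transmit Data Available bit set into the
    queue that matches the socket's User Priority level."""
    queues: List[Deque[int]] = [deque() for _ in range(num_levels)]
    for sock_id, has_data in enumerate(tx_data_available):
        if has_data:
            queues[priorities[sock_id]].append(sock_id)
    return queues
```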

59. The network adapter of claim 4, further comprising: a hardwired transport offload engine (TOE); a processor integrated with said TOE; and a module that provides optimized hardware support for failover.

60. The network adapter of claim 59, said failover module comprising: a NO_SYN mode that allows a socket to be created without trying to initiate a connection; wherein a socket and all its related data structures in said network adapter are created without creating a connection; and wherein NO_SYN mode supports failover from another card or connection migration from a software TCP/IP stack to said network adapter.

61. The network adapter of claim 3, wherein said ULP offloads the calculation of the iSCSI CRC for transmit and receive.
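For reference, the iSCSI header and data digests are CRC32C values (Castagnoli polynomial), as defined by the iSCSI standard rather than by this document. The bitwise sketch below shows the calculation being offloaded; the `seed` argument is a hypothetical stand-in for continuing a CRC across several buffers and is not claimed to match the adapter's CRC seed field.

```python
def crc32c(data: bytes, seed: int = 0) -> int:
    """Compute CRC32C over `data`, optionally continuing from a prior CRC."""
    crc = seed ^ 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0x82F63B78 is the bit-reflected Castagnoli polynomial.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

Passing the CRC of a first buffer as the seed for the next buffer gives the same result as one pass over the concatenated data.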

62. The network adapter of claim 3, wherein said ULP performs iSCSI framing using Fixed Interval Markers (FIM) for transmit.

63. The network adapter of claim 3, wherein said TOE accepts iSCSI header segments and iSCSI data segments from the host iSCSI driver and prepares iSCSI PDUs for transmission.

64. The network adapter of claim 3, further comprising: an iSCSI driver resident on a host computer; and wherein said host iSCSI driver communicates with said TOE.

65. The network adapter of claim 64, wherein said TOE receives iSCSI Protocol Data Units (PDU), calculates iSCSI CRCs, and passes the iSCSI CRCs to said host iSCSI driver.

66. The network adapter of claim 64, wherein said host iSCSI driver assembles a complete iSCSI Protocol Data Unit (PDU) header in host memory, creates an iSCSI Instruction Block (IB), and sends the iSCSI IB to said TOE.

67. The network adapter of claim 66, wherein an iSCSI IB contains a set of address and length pairs, known as transfer blocks, which correspond to a linked-list of buffers in host computer memory.

68. The network adapter of claim 67, wherein said host iSCSI driver adjusts buffer size of a final transfer block when receiving iSCSI data to account for CRC bytes and still allows correct separation of iSCSI header and data segments.

69. The network adapter of claim 64, wherein an iSCSI Protocol Data Unit (PDU), including its corresponding Basic Header Segment (BHS), any Additional Header Segment (AHS), and any data segment, is transferred between said host iSCSI driver and said TOE using an iSCSI Instruction Block (IB).

70. The network adapter of claim 65, wherein said host iSCSI driver seeds a calculated iSCSI CRC value using an iSCSI CRC seed field in an iSCSI Instruction Block (IB).

71. The network adapter of claim 64, wherein said host iSCSI driver splits iSCSI Protocol Data Unit (PDU) header and data segments on receive by posting receive buffers of the correct size for the iSCSI PDU header and, if there are data segments, posting receive buffers of the correct size for the iSCSI PDU data segment.

72. The network adapter of claim 71, wherein said host iSCSI driver posts correctly sized buffers for any Additional Header Segments (AHS) received by using instruction blocks.

73. The network adapter of claim 64, wherein said TOE and said host iSCSI driver interface at the iSCSI PDU level.

74. The network adapter of claim 64, wherein said TOE separates header and data segments of iSCSI Protocol Data Units (PDU) without requiring additional memory copies in the host computer's memory by DMAing PDU headers to either said integrated processor or the host computer and DMAing PDU data sections to the host computer.

75. The network adapter of claim 3, wherein said TOE performs IPsec anti-replay support on a per-SA basis.
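Per-SA anti-replay protection is commonly implemented as a sliding bitmap window over IPsec sequence numbers. The sketch below follows that common scheme with an assumed 64-entry window and one instance per Security Association; it is not taken from the specification, and the names are hypothetical.

```python
class AntiReplayWindow:
    """Sliding-window replay check; create one instance per SA."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = 0  # highest sequence number accepted so far
        self.bitmap = 0   # bit i set means (highest - i) was already seen

    def check_and_update(self, seq: int) -> bool:
        """Return True if `seq` is new and inside the window."""
        if seq == 0:
            return False  # IPsec sequence numbers start at 1
        if seq > self.highest:
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False  # older than the window
        if self.bitmap & (1 << offset):
            return False  # replayed packet
        self.bitmap |= 1 << offset
        return True
```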

76. The network adapter of claim 3, wherein said TOE implements IPsec null, DES, 3DES algorithms, and AES 128-bit algorithm in cipher-block chaining (CBC) mode.

77. The network adapter of claim 3, wherein said TOE implements IPsec null, SHA-1 and MD-5 authentication algorithms.

78. The network adapter of claim 3, wherein said TOE implements IPsec variable-length encryption keys.

79. The network adapter of claim 3, wherein said TOE implements IPsec variable-length authentication keys.

80. The network adapter of claim 3, wherein said TOE implements IPsec jumbo frame support.

81. The network adapter of claim 3, wherein said TOE implements IPsec automatic handling of Security Association (SA) expiration on the basis of both time and total data transferred.

82. The network adapter of claim 3, wherein said TOE implements IPsec Policy Enforcement.

83. The network adapter of claim 3, wherein said TOE implements IPsec exception handling, including exception-packet generation and status reporting.

Description:

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This Application is a Continuation-in-Part of U.S. application Ser. No. 10/093,340 filed on Mar. 6, 2002 and claims benefit of U.S. application Ser. No. 10/131,118 filed on Apr. 23, 2002, and U.S. Provisional Patent Application Serial No. 60/386,924, filed on Jun. 6, 2002.

BACKGROUND OF THE INVENTION

[0002] 1. Technical Field

[0003] The invention relates to telecommunications. More particularly, the invention relates to a method and apparatus for processing data in connection with communication protocols that are used to send and receive data.

[0004] 2. Description of the Prior Art

[0005] Computer networks necessitate the provision of various communication protocols to transmit and receive data. Typically, a computer network comprises a system of devices such as computers, printers and other computer peripherals, communicatively connected together. Data are transferred between each of these devices through data packets which are communicated through the network using a communication protocol standard. Many different protocol standards are in current use today. Examples of popular protocols are Internet Protocol (IP), Internetwork Packet Exchange (IPX), Sequenced Packet Exchange (SPX), Transmission Control Protocol (TCP), and Point to Point Protocol (PPP). Each network device contains a combination of hardware and software that translates protocols and processes data.

[0006] An example is a computer attached to a Local Area Network (LAN) system, wherein a network device uses hardware to handle the Link Layer protocol, and software to handle the Network, Transport, and Communication Protocols and information data handling. The network device normally implements only one Link Layer protocol in hardware, limiting the attached computer to that particular LAN protocol. The higher protocols, e.g. Network, Transport, and Communication protocols, along with the Data handlers, are implemented as software programs which process the data once they are passed through the network device hardware into system memory. The advantage to this implementation is that it allows a general purpose device such as the computer to be used in many different network setups and support any arbitrary network application that may be needed. The result of this implementation, however, is that the system requires high processor overhead, a large amount of system memory, and a complicated configuration setup on the part of the computer user to coordinate the different software protocol and data handlers that communicate with the computer's Operating System (O.S.) and with the computer and network hardware.

[0007] This high overhead required in processing time is demonstrated in U.S. Pat. No. 5,485,460 issued to Schrier et al. on Jan. 16, 1996, which teaches a method of operating multiple software protocol stacks implementing the same protocol on a device. This type of implementation is used in Disk Operating System (DOS) based machines running Microsoft Windows. During normal operation, once the hardware verifies the transport or link layer protocol, the resulting data packet is sent to a software layer which determines the packet's frame format and strips any specific frame headers. The packet is then sent to different protocol stacks where it is evaluated for the specific protocol. However, the packet may be sent to several protocol stacks before it is accepted or rejected. The time lag created by software protocol stacks prevents audio and video transmissions from being processed in real-time; the data must be buffered before playback. It is evident that the amount of processing overhead required to process a protocol is very high and extremely cumbersome, and that it lends itself only to applications with a powerful Central Processing Unit (CPU) and a large amount of memory.

[0008] Consumer products that do not fit in the traditional models of a network device are entering the market. A few examples of these products are pagers, cellular phones, game machines, smart telephones, and televisions. Most of these products have small footprints, eight-bit controllers, limited memory or require a very limited form factor. Consumer products such as these are simplistic and require low cost and low power consumption. The previously mentioned protocol implementations require too much hardware and processor power to meet these requirements. The complexity of such implementations makes them difficult to incorporate into consumer products in a cost effective way. If network access can be simplified such that it may be easily manufactured on a low-cost, low-power, and small form-factor device, these products can access network services, such as the Internet.

[0009] Communications networks use protocols to transmit and receive data. Typically, a communications network comprises a collection of network devices, also called nodes, such as computers, printers, storage devices, and other computer peripherals, communicatively connected together. Data is transferred between each of these network devices using data packets that are transmitted through the communications network using a protocol. Many different protocols are in current use today. Examples of popular protocols include the Internet Protocol (IP), Internetwork Packet Exchange (IPX) protocol, Sequenced Packet Exchange (SPX) protocol, Transmission Control Protocol (TCP), Point-to-Point Protocol (PPP) and other similar new protocols that are under development. A network device contains a combination of hardware and software that processes protocols and data packets.

[0010] In 1978, the International Organization for Standardization (ISO), a standards setting body, created a network reference model known as the Open Systems Interconnection (OSI) model. The OSI model includes seven conceptual layers: 1) The Physical (PHY) layer that defines the physical components connecting the network device to the network; 2) The Data Link layer that controls the movement of data in discrete forms known as frames that contain data packets; 3) The Network layer that builds data packets following a specific protocol; 4) The Transport layer that ensures reliable delivery of data packets; 5) The Session layer that allows for two way communications between network devices; 6) The Presentation layer that controls the manner of representing the data and ensures that the data is in correct form; and 7) The Application layer that provides file sharing, message handling, printing and so on. Sometimes the Session and Presentation layers are omitted from this model. For an explanation of how modern communications networks and the Internet relate to the ISO seven-layer model see, for example, chapter 11 of the text “Internetworking with TCP/IP” by Douglas E. Comer (volume 1, fourth edition, ISBN 0201633469) and Chapter 1 of the text “TCP/IP Illustrated” by W. Richard Stevens (volume 1, ISBN 0130183806).

[0011] An example of a network device is a computer attached to a Local Area Network (LAN), wherein the network device uses hardware in a host computer to handle the Physical and Data Link layers, and uses software running on the host computer to handle the Network, Transport, Session, Presentation and Application layers. The Network, Transport, Session, and Presentation layers are implemented using protocol-processing software, also called protocol stacks. The Application layer is implemented using application software that processes the data once the data is passed through the network-device hardware and protocol-processing software. The advantage to this software-based protocol processing implementation is that it allows a general-purpose computer to be used in many different types of communications networks and supports any applications that may be needed. The result of this software-based protocol processing implementation, however, is that the overhead of the protocol-processing software, running on the Central Processing Unit (CPU) of the host computer, to process the Network, Transport, Session and Presentation layers is very high. A software-based protocol processing implementation also requires a large amount of memory on the host computer, because data must be copied and moved as the software processes it. The high overhead required by protocol-processing software is demonstrated in U.S. Pat. No. 5,485,460 issued to Schrier et al. on Jan. 16, 1996, which teaches a method of operating multiple software protocol stacks. This type of software-based protocol processing implementation is used, for example, in computers running Microsoft Windows.

[0012] During normal operation of a network device, the network-device hardware extracts the data packets that are then sent to the protocol-processing software in the host computer. The protocol-processing software runs on the host computer, and this host computer is not optimized for the tasks to be performed by the protocol-processing software. The combination of protocol-processing software and a general-purpose host computer is not optimized for protocol processing and this leads to performance limitations. Performance limitations in protocol processing, such as the time lag created by the execution of protocol-processing software, are deleterious and may prevent, for example, audio and video transmissions from being processed in real-time or prevent the full speed and capacity of the communications network from being used. It is evident that the amount of host-computer CPU overhead required to process a protocol is very high and extremely cumbersome and requires the use of the CPU and a large amount of memory in the host computer.

[0013] New consumer and industrial products that do not fit in the traditional models of a network device are entering the market and, at the same time, network speed continues to increase. Examples of these consumer products include Internet-enabled cell phones, Internet-enabled TVs, and Internet appliances. Examples of industrial products include network interface cards (NICs), Internet routers, Internet switches, and Internet storage servers. Software-based protocol processing implementations are too inefficient to meet the requirements of these new consumer and industrial products. Software-based protocol processing implementations are difficult to incorporate into consumer products in a cost effective way because of their complexity. Software-based protocol processing implementations are difficult to implement in high-speed industrial products because of the processing power required. If protocol processing can be simplified and optimized such that it may be easily manufactured on a low-cost, low-power, high-performance, integrated, and small form-factor device, these consumer and industrial products can read and write data on any communications network, such as the Internet.

[0014] A hardware-based, as opposed to software-based, protocol processing implementation, an Internet tuner, is described in J. Minami; R. Koyama; M. Johnson; M. Shinohara; T. Poff; D. Burkes; Multiple network protocol encoder/decoder and data processor, U.S. Pat. No. 6,034,963 (Mar. 7, 2000) (the '963 patent). This Internet tuner provides a core technology for processing protocols.

[0015] It would be advantageous to provide a gigabit Ethernet adapter that provides a hardware solution to high network communication speeds. It would further be advantageous to provide a gigabit Ethernet adapter that adapts to multiple communication protocols.

SUMMARY OF THE INVENTION

[0016] The invention is embodied in a gigabit Ethernet adapter. A system according to the invention provides a compact hardware solution to handling high network communication speeds. In addition, the invention adapts to multiple communication protocols via a modular construction and design. A presently preferred embodiment of the invention provides an integrated network adapter for decoding and encoding network protocols and processing data. The network adapter comprises a hardwired data path for processing streaming data; a hardwired data path for receiving and transmitting packets and for encoding and decoding packets; a plurality of parallel, hardwired protocol state machines; wherein each protocol state machine is optimized for a specific network protocol; and wherein said protocol state machines execute in parallel; and means for scheduling shared resources based on traffic.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1 is a block schematic diagram of a NIC Card Implementation according to the invention;

[0018] FIG. 2 is a block schematic diagram of an Interface for Network Attached Device according to the invention;

[0019] FIG. 3 is a block level diagram of a system according to the invention;

[0020] FIG. 4 is a high level block diagram for a gigabit Ethernet adapter according to the invention;

[0021] FIG. 5 is a block schematic diagram that depicts the I/Os used in a MAC Interface module according to the invention;

[0022] FIG. 6 is a block schematic diagram of an Ethernet Interface according to the invention;

[0023] FIG. 7 is a block schematic diagram of an Address Filter and Packet Type Parser module according to the invention;

[0024] FIG. 8 is a timing diagram that shows Address Filter and Packet Type Parser module operation according to the invention;

[0025] FIG. 9 is a block schematic diagram of a Data Aligner Module according to the invention;

[0026] FIG. 10 is a block schematic diagram of an ARP Module according to the invention;

[0027] FIG. 11 is a block schematic diagram of an ARP Cache according to the invention;

[0028] FIG. 12 shows a Transmission Queue Entry Format according to the invention;

[0029] FIG. 13 shows a Lookup Table Entry Format according to the invention;

[0030] FIG. 14 shows an ARP Cache Entry Format according to the invention;

[0031] FIG. 15 is a flow diagram that shows the ARP Lookup Process according to the invention;

[0032] FIG. 16 is a block schematic diagram of an IP Module according to the invention;

[0033] FIG. 17 is a block diagram of an ID generator according to the invention;

[0034] FIG. 18 is a block diagram depicting the data flow with an Injector according to the invention;

[0035] FIG. 19 is a top-level block diagram for the TCP module according to the invention;

[0036] FIG. 20 depicts a TCP Receive data flow according to the invention;

[0037] FIG. 21 shows a VSOCK/Rcv State Handler Control Block Search Resolution Flow according to the invention;

[0038] FIG. 22 shows a basic data flow according to the invention;

[0039] FIG. 23 shows a Socket Receive Data Flow according to the invention;

[0040] FIG. 24 shows a Socket Transmit Flow according to the invention;

[0041] FIG. 25 shows a data flow according to the invention;

[0042] FIG. 26 shows a block diagram of a module according to the invention;

[0043] FIG. 27 shows an algorithm according to the invention;

[0044] FIG. 28 is a block diagram for the entire algorithm shown in FIG. 27;

[0045] FIG. 29 shows logic according to the invention;

[0046] FIG. 30 shows a format of an option according to the invention;

[0047] FIG. 31 shows a format of another option according to the invention;

[0048] FIG. 32 shows a format of another option according to the invention;

[0049] FIGS. 33 and 34 show formats of further options according to the invention;

[0050] FIG. 35 is a block schematic diagram of an IP Router according to the invention;

[0051] FIG. 36 shows a format of each IP route entry according to the invention;

[0052] FIG. 37 shows signaling used to request and receive a route according to the invention;

[0053] FIG. 38 is a block schematic diagram of an Exception Handler according to the invention;

[0054] FIG. 39 is an M1 memory map according to the invention;

[0055] FIG. 40 depicts a sample memory map according to the invention;

[0056] FIG. 41 is a block diagram flow for data according to the invention;

[0057] FIG. 42 is a block diagram of the mtxarb sub unit according to the invention;

[0058] FIG. 43 is a block diagram flow for data according to the invention;

[0059] FIG. 44 is a block diagram of the mcbarb sub unit according to the invention;

[0060] FIG. 45 depicts a default memory map for the network stack according to the invention;

[0061] FIG. 46 shows default settings according to the invention;

[0062] FIG. 47 shows a matching IB and SB queue which together form a Channel according to the invention;

[0063] FIG. 48 shows processing flow for an Instruction Block queue according to the invention;

[0064] FIG. 49 is a block diagram depicting data flow for a Status block passing between a network stack, an on-chip processor, and a Host according to the invention;

[0065] FIG. 50 is a block diagram of an iSCSI transmit data path according to the invention;

[0066] FIG. 51 shows an iSCSI Transmit Flow Chart according to the invention;

[0067] FIG. 52 shows use of a four-byte buffer according to the invention;

[0068] FIG. 53 is a block diagram of an iSCSI receive data path according to the invention;

[0069] FIG. 54 shows a transfer split into two requests according to the invention;

[0070] FIG. 55 shows a DMA transfer to a host split into separate requests according to the invention;

[0071] FIG. 56 shows SA Block Flow according to the invention;

[0072] FIG. 57 shows TX AH Transport SA Block Format according to the invention;

[0073] FIG. 58 shows TX ESP-1 Transport SA Block Format according to the invention;

[0074] FIG. 59 shows TX ESP-2 Transport SA Block Format according to the invention;

[0075] FIG. 60 shows TX AH Tunnel SA Block Format according to the invention;

[0076] FIG. 61 shows TX ESP-1 Tunnel SA Block Format according to the invention;

[0077] FIG. 62 shows TX ESP-2 Tunnel SA Block Format according to the invention;

[0078] FIG. 63 shows RX AH SA Block Format according to the invention;

[0079] FIG. 64 shows RX ESP-1 SA Block Format according to the invention;

[0080] FIG. 65 shows RX ESP-2 SA Block Format according to the invention;

[0081] FIG. 66 is a block diagram that depicts the overall flow for the IPSEC logic according to the invention;

[0082] FIG. 67 is a block diagram outlining the data flow according to the invention;

[0083] FIG. 68 is a block diagram showing data path flow for received IPSEC packets according to the invention; and

[0084] FIG. 69 is a flow diagram that shows the IPSEC Anti-Replay Algorithm according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

[0085] The invention is embodied in a gigabit Ethernet adapter. A system according to the invention provides a compact hardware solution to handling high network communication speeds. In addition, the invention adapts to multiple communication protocols via a modular construction and design.

Introduction

[0086] General Description

[0087] The invention comprises an architecture to be used in a high-speed hardware network stack (hereafter referred to as the IT10G). The description herein defines the data paths and flows, registers, theory of applications, and timings. Combined with other system blocks, the IT10G provides the core for line speed TCP/IP processing.

[0088] Definitions

[0089] As used herein, the following terms shall have the corresponding meaning:

10 Gbps 10 Gigabit (10,000,000,000 bits per second)

[0090] ACK Acknowledgment

[0091] AH Authentication Header

[0092] AHS Additional Header Segment

[0093] ARP Address Resolution Protocol

[0094] BHS Basic Header Segment

[0095] CB Control Block

[0096] CPU Central Processing Unit

[0097] CRC Cyclic Redundancy Check

[0098] DAV Data Available

[0099] DDR Double Data Rate

[0100] DIX Digital Intel Xerox

[0101] DMA Direct Memory Access

[0102] DOS Denial of Service

[0103] DRAM Dynamic RAM

[0104] EEPROM Electrically Erasable PROM

[0105] ESP Encapsulating Security Payload

[0106] FCIP Fibre Channel over IP

[0107] FIFO First-In First-Out

[0108] FIM Fixed Interval Marker

[0109] FIN Finish

[0110] Gb Gigabit (1,000,000,000 bits per second)

[0111] HDMA Host DMA

[0112] HO Half Open

[0113] HR Host Retransmit

[0114] HSU Header Storage Unit

[0115] IB Instruction Block

[0116] ICMP Internet Control Message Protocol

[0117] ID Identification

[0118] IGMP Internet Group Management Protocol

[0119] IP Internet Protocol

[0120] IPsec IP Security

[0121] IPX Internet Packet Exchange

[0122] IQ Instruction Block Queue

[0123] iSCSI Internet Small Computer System Interface

[0124] ISN Initial Sequence Number

[0125] LAN Local Area Network

[0126] LDMA Local DMA

[0127] LIP Local IP Address

[0128] LL Linked List

[0129] LP Local Port

[0130] LSB Least-Significant Byte

[0131] LUT Look-Up Table

[0132] MAC Media Access Controller

[0133] MCB CB Memory

[0134] MDL Memory Descriptor List

[0135] MIB Management Information Base

[0136] MII Media Independent Interface

[0137] MPLS Multiprotocol Label Switching

[0138] MRX Receive Memory

[0139] MSB Most-Significant Bit

[0140] MSS Maximum Segment Size

[0141] MTU Maximum Transmission Unit

[0142] MTX TX Memory

[0143] NAT Network Address Translation

[0144] NIC Network Interface Card

[0145] NS Network Stack

[0146] OR OR Logic Function

[0147] PDU Protocol Data Unit

[0148] PIP Peer IP Address

[0149] PP Peer Port

[0150] PROM Programmable ROM

[0151] PSH Push

[0152] PV Pointer Valid

[0153] QoS Quality of Service

[0154] RAM Random Access Memory

[0155] RARP Reverse Address Resolution Protocol

[0156] Rcv Receive

[0157] RDMA Remote DMA

[0158] ROM Read-Only Memory

[0159] RST Reset

[0160] RT Round Trip

[0161] RTO Retransmission Timeout

[0162] RTT Round-Trip Time

[0163] RX Receive

[0164] SA Security Association

[0165] SB Status Blocks

[0166] SEQ Sequence

[0167] SM Status Message

[0168] SNMP Simple Network Management Protocol

[0169] SPI Security Parameter Index

[0170] Stagen Status Generator

[0171] SYN Synchronization

[0172] TCP Transmission Control Protocol

[0173] TOE Transport Offload Engine

[0174] TOS Type of Service

[0175] TTL Time to Live

[0176] TW Time Wait

[0177] TX Transmit

[0178] UDP User Datagram Protocol

[0179] URG Urgent

[0180] VLAN Virtual LAN

[0181] VSOCK Virtual Socket

[0182] WS Window Scaling

[0183] XMTCTL Transmit Control

[0184] XOR Exclusive-OR

Application Overview

[0185] Overview

[0186] As bandwidth continues to increase, the ability to process TCP/IP communications becomes more of an overhead for system processors. Many sources state that as Ethernet rates reach the gigabit per second (Gbps) range, TCP/IP protocol processing will consume close to 100% of the host computer's CPU bandwidth, and that when rates increase further to 10 Gbps, the entire TCP/IP protocol processing must be off-loaded to dedicated sub-systems. The herein described IT10G implements TCP and IP, along with related protocols including, for example, ARP, RARP, and IP host routing, as a series of state machines. The IT10G core forms an accelerator or engine, also known as a Transport Offload Engine (TOE). The IT10G core uses no processor or software, although hooks are provided so that a connected on-chip processor can be used to extend the features of the network stack.

[0187] Sample Applications

[0188] An example usage of the IT10G core is an Intelligent Network Interface Card (NIC). In a typical application, the NIC is plugged into a computer server and natively processes TCP/UDP/IP packets.

[0189] FIG. 1 is a block schematic diagram of a NIC Implementation of the invention. In FIG. 1, the IT10G core 10 is combined with a processor 11, system peripherals 12, and a system bus interface 13 into a single-chip NIC controller. The single-chip NIC controller is integrated with an Ethernet PHY 14, combined with a configuration EEPROM 15, and optional external memory for the network stack to form a low chip count NIC. The processor memory 16 (both ROM and RAM) may be internal to the integrated chip or reside externally. Another usage for the IT10G is to function as the interface for network attached devices, such as storage units, printers, cameras, and so forth. In these cases, a custom application socket (or interface) 17 can be designed into the IT10G to process layer 6 and 7 protocols and to facilitate data movement specific for that application. Examples include custom data paths for streaming media, bulk data movements, and protocols such as iSCSI and FCIP.

[0190] FIG. 2 is a block schematic diagram of an Interface for Network Attached Device according to the invention. Although the IT10G is designed to support line speed processing at 10 Gbps rates, the same architecture and logic may also be used at lower speeds. In these cases, the only difference is in the Ethernet MAC 21 and PHY 14. Advantages of using this architecture at slower line speeds include lower power consumption, for example.

[0191] The Challenge

[0192] The challenge for high-speed bandwidths is in processing TCP/IP packets at wire line speeds. This is shown in the following table.

TABLE 1
Processing Power Requirements
Rate       Bytes/sec        Packets/sec (1)   Instr/sec (2)
10 Mbps    1,000,000        2,000             2 MIPs
100 Mbps   10,000,000       20,000            20 MIPs
1 Gbps     100,000,000      200,000           200 MIPs
10 Gbps    1,000,000,000    2,000,000         2 GIPs
Notes:
(1) This assumes an average packet size of 500 bytes
(2) This assumes 500 instruction overhead per packet and 1 instruction per byte
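
By way of illustration only, the arithmetic behind Table 1 can be reproduced with a short sketch. The function and variable names below are hypothetical; the byte rates are taken as given in the table, and the per-packet and per-byte costs are those stated in the table notes.

```python
def processing_requirements(bytes_per_sec, avg_packet_bytes=500,
                            instr_per_packet=500, instr_per_byte=1):
    """Return (packets/sec, instructions/sec) under the Table 1 assumptions."""
    packets_per_sec = bytes_per_sec // avg_packet_bytes
    instr_per_sec = (packets_per_sec * instr_per_packet
                     + bytes_per_sec * instr_per_byte)
    return packets_per_sec, instr_per_sec


# Byte rates copied from the Bytes/sec column of Table 1.
TABLE_1_RATES = {
    "10 Mbps": 1_000_000,
    "100 Mbps": 10_000_000,
    "1 Gbps": 100_000_000,
    "10 Gbps": 1_000_000_000,
}

for rate, bytes_per_sec in TABLE_1_RATES.items():
    pps, ips = processing_requirements(bytes_per_sec)
    print(f"{rate}: {pps:,} packets/sec, {ips / 1e6:,.0f} million instr/sec")
```

Doubling the result gives a rough figure for full-duplex operation, since the table counts one direction only.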

[0193] The figures in the above table are very conservative, and do not take into account, for example, the full duplex nature of networking. If full-duplex operation is factored in, then the processing power requirements can easily double. In any case, it is apparent that starting at the gigabit level, the processing overhead of TCP/IP becomes a major drain on host computer processing power and that another solution is needed.

[0194] Bandwidth Limitation

[0195] The IT10G addresses the limitation of host computer processing power by various architecture implementations. These include the following features:

[0196] On the fly (streaming) processing of incoming and outgoing data

[0197] Ultra wide datapaths (64 bits in the current implementation)

[0198] Parallel execution of protocol state machines

[0199] Intelligent scheduling of shared resources

[0200] Minimized memory copying

System Overview

[0201] Overview

[0202] This section describes the top level of the preferred embodiment. It provides a block level description of the system as well as a theory of operation for different data paths and transfer types.

[0203] This embodiment of the invention incorporates the IT10G network stack and combines it with a processor core, and system components to provide a complete networking sub-system for different applications. A block level diagram for the system is shown in FIG. 3 .

[0204] Clock Requirements

[0205] The presently preferred embodiment of the invention is a chip that is designed to operate with different clock domains. The following table lists all clock domains for both 1 Gbps and 10 Gbps operations.

TABLE 2
Clock Domains
Domain             Symbol     1 Gb (MHz)   10 Gb (MHz)   Notes
MAC                CLK MAC    125          125
System Clock       CLK CORE   20           200           This clock serves the network stack and the on-chip processor core
System Interface   CLK SYS    66/133       133           PCI 64/66 or PCI-X 133 is used for 1 Gbps. PCI-Express is used for 10 Gbps.
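
As a rough sanity check on the core clock figures in Table 2, the following sketch computes the raw throughput of a 64-bit datapath at each listed core clock. It assumes, purely for illustration, that one 64-bit word moves per core clock cycle; the text does not state this, and the function name is hypothetical.

```python
DATAPATH_BITS = 64  # datapath width given for the current implementation


def raw_datapath_gbps(core_clock_mhz, width_bits=DATAPATH_BITS):
    """Raw datapath throughput in Gbps, assuming one word per clock cycle."""
    return core_clock_mhz * 1e6 * width_bits / 1e9


for line_rate, core_mhz in (("1 Gb", 20), ("10 Gb", 200)):
    print(f"{line_rate}: {core_mhz} MHz x {DATAPATH_BITS} bits = "
          f"{raw_datapath_gbps(core_mhz):.2f} Gbps raw")
```

Under that assumption, both core clocks leave headroom above the corresponding line rate.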

Protocol Processor

[0206] Overview

[0207] This section provides an overview of the internal Protocol processor.

[0208] Processor Core

[0209] The herein described chip uses an internal (or on-chip) processor for programmability and flexibility. This processor is also furnished with all the peripherals needed to complete a working system. Under normal operating conditions, the on-chip processor controls the network stack.

[0210] Memory Architecture

[0211] The on-chip processor has the capability to address up to 4 GBytes of memory. Within this address space are located all of its peripherals, its RAM, ROM, and the network stack.

[0212] Network Stack Architecture

[0213] Overview

[0214] This section gives an overview of the IT10G architecture. Subsequent sections herein go into detail on individual modules. The IT10G takes the hardware protocol processing function of a network stack, and adds enhancements that enable it to scale up to 10 Gbps rates. The major additions to previous versions are widening of the data paths, parallel execution of state machines, and intelligent scheduling of shared resources. In addition, support is added for protocols that were previously not supported, such as RARP, ICMP, and IGMP. FIG. 4 is a high level block diagram for the IT10G.

[0215] Theory of Operation

[0216] TCP/UDP Socket Initialization

[0217] Prior to transferring any data using the IT10G, a socket connection must be initialized. This can be done either by using command blocks or by programming the TCP socket registers directly. Properties that must be programmed for every socket include the Destination IP address, Destination Port number, and type of connection (TCP or UDP, Server or Client, for example). Optional parameters include such settings as a QoS level, Source Port, TTL, and TOS setting. Once these parameters have been entered, the socket may be activated. In the case of UDP sockets, data can start to be transmitted or received immediately. For TCP clients, a socket connection must first be established, and for TCP servers a SYN packet must be received from a client, and then a socket connection established. All these operations may be performed completely by the IT10G hardware.

[0218] Transmission of Packets

[0219] When TCP packets need to be transmitted, the application running on the host computer first writes the data to a socket (either a fixed socket or virtual socket—virtual sockets are supported by the IT 10G architecture). If the current send buffer is empty, then a partial running checksum is kept as the data is being written to memory. The partial checksum is used as the starting seed for checksum calculations, and alleviates the need for the TCP layers in the IT 10G network stack to read through the data again prior to sending data out. Data can be written to the socket buffer in either 32-bit or 64-bit chunks. Up to four valid_byte signals are used to indicate which bytes are valid. Data should be packed when writing to the socket buffers, with only the last word having possible invalid bytes. This stage also applies to UDP packets for which there is an option of not calculating the data checksum.
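
The running-checksum idea described above can be sketched in software as follows. This is a minimal illustration, not the hardware implementation: it relies on the fact that the Internet checksum is a 16-bit one's-complement sum, so a partial sum accumulated while payload is written can later seed the sum over the header. The names `RunningChecksum`, `fold`, and `finish` are hypothetical, and the header is assumed to have an even length.

```python
def fold(value):
    """Fold a carry out of bit 16 back into the low 16 bits (end-around carry)."""
    return (value & 0xFFFF) + (value >> 16)


class RunningChecksum:
    """Accumulates a partial one's-complement sum as data is written."""

    def __init__(self):
        self.acc = 0
        self.pending = b""  # odd byte carried over between writes

    def write(self, chunk):
        data = self.pending + chunk
        even_len = len(data) & ~1
        for i in range(0, even_len, 2):
            self.acc = fold(self.acc + ((data[i] << 8) | data[i + 1]))
        self.pending = data[even_len:]

    def partial(self):
        """Partial sum so far; a trailing odd byte is padded with zero."""
        if self.pending:
            return fold(self.acc + (self.pending[0] << 8))
        return self.acc


def finish(header, payload_seed):
    """Sum an even-length header starting from the payload seed, then complement."""
    acc = payload_seed
    for i in range(0, len(header), 2):
        acc = fold(acc + ((header[i] << 8) | header[i + 1]))
    return ~acc & 0xFFFF


def checksum_one_shot(data):
    """Reference: checksum over the whole buffer in a single pass."""
    if len(data) % 2:
        data += b"\x00"
    return finish(data, 0)
```

With this arrangement the payload is read only once, while it is being written, which is the saving the paragraph above describes.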

[0220] Once all the data has been written, the SEND command can be issued by the application running on the host computer. At this point, the TCP/UDP engine calculates the packet length and checksums, and builds the TCP/IP header. This TCP/IP header is pre-pended to the socket data section. The buffer pointer for the packet, along with the socket's QoS level, is then put on the transmission queue.

[0221] The transmission scheduler looks at all sockets that have pending packets and selects the packet with the highest QoS level. This transmission scheduler looks at all types of packets that need transmission. These packets may include TCP, UDP, ICMP, ARP, RARP, and raw packets, for example. A minimum-bandwidth algorithm is used to make sure that no socket is completely starved. When a socket packet is selected for transmission, the socket buffer pointer is passed to the MAC TX Interface. The MAC TX Interface is responsible for reading the data from the socket buffer and sending the data to the MAC. A buffer is used to store the outgoing packet in case it needs to be retransmitted due to Ethernet collisions or for other reasons. Once the packet data is sent from the original socket buffer, then that data buffer is freed. When a valid transmit status is received back from the MAC, the data buffer is flushed, and the next packet can then be sent. If an invalid transmission status is received from the MAC, then the last packet stored in the data buffer is retransmitted.

[0222] Reception of Packets

[0223] When a packet is received from the MAC, the Ethernet header is parsed to determine if the packet is destined for this network stack. The MAC address filter may be programmed to accept a unicast address, unicast addresses that fall within a programmed mask, broadcast addresses, or multicast addresses. In addition, the encapsulating protocol is also determined. If the 16-bit TYPE field in the Ethernet header indicates an ARP (0x0806) or RARP (0x8035) packet, then the ARP/RARP module is enabled to further process the packet. If the TYPE field decodes to IPv4 (0x0800), then the IP module is enabled to process the packet further. A complete list of example supported TYPE fields is shown in the following table. If the TYPE field decodes to any other value, the packet may optionally be routed to a buffer and the host computer notified that an unknown Ethernet packet has been received. In this last case, the application may read the packet, and determine the proper course of action. With this construction of the datapath any protocol not directly supported in hardware, such as IPX for example, may be indirectly supported by the IT10G.

TABLE 3
Supported Ethernet TYPE Field Values
TYPE Field   Description
0x0800       IPv4 Packet
0x0806       ARP Packet
0x8035       RARP Packet
0x8100       VLAN Tagged Packets
0x8847       MPLS Unicast Packets
0x8848       MPLS Multicast Packets
Note: IPv6 packets are handled as exceptions at the Ethernet layer.
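
For illustration, the TYPE-field decode described above can be modeled as a table lookup. The sketch below is a software approximation only; the function name, the return labels, and the `route_unknown_to_host` flag are hypothetical, and dropping the packet when that flag is off is an assumption.

```python
import struct

# TYPE values and descriptions taken from Table 3.
SUPPORTED_TYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x8035: "RARP",
    0x8100: "VLAN",
    0x8847: "MPLS unicast",
    0x8848: "MPLS multicast",
}


def classify_frame(frame, route_unknown_to_host=True):
    """Return the packet class for an Ethernet frame (destination, source, TYPE)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # The 16-bit TYPE field follows the two 6-byte MAC addresses.
    (eth_type,) = struct.unpack_from("!H", frame, 12)
    if eth_type in SUPPORTED_TYPES:
        return SUPPORTED_TYPES[eth_type]
    # Unknown types may optionally be buffered for the host (exception path).
    return "exception" if route_unknown_to_host else "drop"
```

An IPv6 frame (TYPE 0x86DD) falls through to the exception path in this model, matching the note under Table 3.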

[0224] ARP/RARP Packets

[0225] If the received packet is an ARP or RARP packet, then the ARP/RARP module is enabled. It examines the OP field in the packet and determines if it is a request or a reply. If it is a request, then an outside entity is polling for information. If the address that is being polled is for the IT10G, then a reply_req is sent to the ARP/RARP reply module. If the packet received is an ARP or RARP reply, then the results, i.e. the MAC and IP addresses, are sent to the ARP/RARP request module.
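
The request/reply decision described above can be sketched for the ARP case as follows. ARP uses OP value 1 for a request and 2 for a reply (RARP uses 3 and 4 and is omitted here for brevity). The function name, argument names, and returned labels are hypothetical.

```python
ARP_REQUEST = 1
ARP_REPLY = 2


def handle_arp(op, sender_mac, sender_ip, target_ip, local_ip):
    """Return (action, data) for a received ARP packet."""
    if op == ARP_REQUEST:
        # An outside entity is polling; answer only if it asks for our address.
        if target_ip == local_ip:
            return "reply_req", (sender_mac, sender_ip)
        return "ignore", None
    if op == ARP_REPLY:
        # Hand the resolved MAC/IP pair to the request side.
        return "arp_result", (sender_mac, sender_ip)
    return "ignore", None
```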

[0226] In an alternative embodiment the ARP and/or RARP functions are handled in the host computer using dedicated and optimized hardware in the IT10G to route ARP/RARP packets to the host via the exception path.

[0227] IP Packets

[0228] If the received packet is an IP packet, then the IP module is enabled. The IP module first examines the version field in the IP header to determine if the received packet is an IPv4 packet.

[0229] The IP module parses the embedded protocol of the received packet. Depending on what protocol is decoded, the received packet is sent to the appropriate module. Protocols supported directly by hardware in the current embodiment include TCP and UDP, for example. Other protocols, such as RDMA, may be supported by other optimized processing modules. All unknown protocols are processed using the exception handler.

[0230] TCP Packets

[0231] If a TCP packet is received by the IT 10G, then the socket information is parsed, and the corresponding socket enabled. The state information of the socket is retrieved, and based on the type of packet received, the socket state is updated accordingly. The data payload of the packet (if applicable) is stored in the socket data buffer. If an ACK packet needs to be generated, the TCP state module generates the ACK packet and schedules the ACK packet for transmission. If a TCP packet is received that does not correlate to an open socket, then the TCP state module generates a RST packet and the RST packet is scheduled for transmission.

[0232] UDP Packets

[0233] If a UDP packet is received, then the socket information is parsed, and the data stored in the socket receive data buffer. If no open socket exists, then the UDP packet is silently discarded.

[0234] In an alternative embodiment UDP packets may be handled by the host computer using the exception handler.

[0235] Network Stack Registers

[0236] The hardware network stack of the IT10G is configured to appear as a peripheral to the on-chip processor. The base address for the network stack is programmed via the on-chip processor's NS_Base_Add register. This architecture allows the on-chip processor to put the network stack at various locations in its memory or I/O space.

[0237] Ethernet MAC Interface

[0238] Overview

[0239] The following discussion describes the Ethernet MAC interface module. The function of the Ethernet MAC interface module is to abstract the Ethernet MAC from the core of the IT10G. This allows the IT10G network stack core to be coupled to different speed MACs and/or MACs from various sources without changing the IT10G core architecture, for example. This section describes the interface requirements for communication with the IT10G core.

[0240] Module I/Os

[0241] FIG. 5 is a block schematic diagram that depicts the I/Os used in the MAC Interface module.

[0242] Ethernet Interface

[0243] Overview

[0244] This section describes the Ethernet Interface module. The Ethernet interface module communicates with the Ethernet MAC interface at the lower end, and to blocks such as the ARP, and IP modules on the upper end. The Ethernet interface module handles data for both the receive and transmit paths. On the transmit side, the Ethernet interface module is responsible for scheduling packets for transmission, setting up DMA channels for transmission, and communicating with the Ethernet MAC interface transmit signals. On the receive side, the Ethernet interface module is responsible for parsing the Ethernet header, determining if the packet should be received based upon address filter settings, enabling the next encapsulated protocol based upon the TYPE field in the packet header, and aligning the data so that it starts on a 64-bit boundary for the upper layer protocols. FIG. 6 is a block schematic diagram of the Ethernet Interface 40 .

[0245] Sub Module Block Descriptions

[0246] Transmission Scheduler

[0247] The Transmission Scheduler block 60 is responsible for taking transmission requests from the ARP, IP, TCP, and Raw transmission modules, and determining which packet should be sent next. The Transmission Scheduler determines transmission order by comparing QoS levels for each transmission request. Along with the QoS level, each transmission request contains a pointer to the starting memory block for a packet, along with a packet length. The transmission scheduler has the capability to be programmed to weigh the transmission priority of certain packet types more heavily than others. For example, a QoS level of five from the TCP module can be made to count for more than a level five request from the IP module. The Transmission Scheduler allows multiple modules to operate in a parallel and shared fashion that depends on transmit data traffic. The following is the algorithm currently used to determine packet scheduling.

[0248] Check to see that no packet channel has reached the starved state. This is a programmable level, per channel type, i.e. TCP, IP, ARP, and Raw buffers, that states how many times a channel is passed over before the scheduler over-rides the QoS level and the packet is sent out. If two or more packets have reached the starved state at the same time, then the channel with the higher weighting is given priority. The other packet is then scheduled to be sent next. If the packets have the same priority weighting they are sent out one after the other according to the following order: TCP/UDP, then ARP, then IP, then Raw Ethernet.

[0249] If no channel has a packet in the starved state, then the channel with the highest combined QoS level and channel weighting is sent.

[0250] If only one channel has a packet to be sent, it is sent immediately.

[0251] Once a packet channel has been selected for transmission, the channel memory pointer, packet length, and type are passed to the DMA engine. The DMA engine in turn signals back to the transmission scheduler when the transfer has been completed. At this point the scheduler sends the parameters of the next packet to the DMA engine.
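
The selection rules above can be sketched in software as follows. This is an illustrative model only: the `Channel` fields and function name are hypothetical, the "combined" QoS level and weighting is assumed to be a simple sum, and ties in the non-starved case are assumed to fall back to the same fixed order used for starved channels.

```python
from dataclasses import dataclass
from typing import Optional

# Fixed tie-break order given for equally weighted starved channels.
FIXED_ORDER = ["TCP/UDP", "ARP", "IP", "RAW"]


@dataclass
class Channel:
    name: str
    weight: int                        # programmable per-channel weighting
    starve_limit: int                  # pass-overs allowed before QoS is overridden
    pending_qos: Optional[int] = None  # None means nothing is queued
    passed_over: int = 0


def select_channel(channels):
    """Pick the next channel to transmit; returns its name or None."""
    pending = [c for c in channels if c.pending_qos is not None]
    if not pending:
        return None

    def order(c):
        return -FIXED_ORDER.index(c.name)

    starved = [c for c in pending if c.passed_over >= c.starve_limit]
    if starved:
        # Starved channels override QoS; the higher weighting wins.
        chosen = max(starved, key=lambda c: (c.weight, order(c)))
    else:
        chosen = max(pending, key=lambda c: (c.pending_qos + c.weight, order(c)))
    for c in pending:
        c.passed_over = 0 if c is chosen else c.passed_over + 1
    chosen.pending_qos = None
    return chosen.name
```

When only one channel has a packet queued, it is the sole candidate and is selected immediately, so that rule needs no separate branch in this model.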

[0252] DMA Engine

[0253] The DMA Engine 61 receives packet parameters from the transmission scheduler. Packet parameters include packet type, packet length, and starting memory pointer. The DMA engine uses the packet length to determine how many data bytes to transfer from memory. The packet type indicates to the DMA engine from which memory buffer to retrieve the data, and the starting memory pointer indicates from where to start reading data. The DMA engine needs to understand how big each of the memory blocks used in the channel packet is because an outgoing packet may span multiple memory blocks. The DMA engine receives data 64 bits at a time from the memory controllers and passes data 64 bits at a time to the transmitter interface.
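
The following sketch illustrates why the DMA engine needs the memory block size: a packet longer than one block must be gathered from several blocks before being handed on 64 bits at a time. How blocks are chained is not described here, so the `next_block` mapping is a hypothetical stand-in, as are the function and argument names.

```python
WORD_BYTES = 8  # data moves 64 bits at a time


def dma_read(memory, next_block, start_block, length, block_size):
    """Gather `length` bytes starting at `start_block`; return 64-bit words."""
    out = bytearray()
    block = start_block
    remaining = length
    while remaining > 0:
        take = min(block_size, remaining)
        out += memory[block][:take]
        remaining -= take
        if remaining:
            # The packet continues in another memory block.
            block = next_block[block]
    # Pad the final word; in hardware, valid-byte signals would mark the padding.
    out += bytes(-len(out) % WORD_BYTES)
    return [bytes(out[i:i + WORD_BYTES]) for i in range(0, len(out), WORD_BYTES)]
```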

[0254] Transmitter Interface

[0255] The Transmitter Interface 62 takes the output from the DMA engine and generates the macout_lock, macout_rdy, macout_eof, and macout_val_byte signals for the Ethernet MAC interface. The 64 bit macout_data bus connects directly from the DMA Engine to the Ethernet MAC Interface.

[0256] Receiver Interface

[0257] The Receiver Interface 63 is responsible for interfacing with the Ethernet MAC interface. The Receiver Interface takes data in and presents the data along with state count information to the Address Filter and Packet Type Parser block.

[0258] Address Filter and Packet Type Parser

[0259] The Address Filter and Packet Type Parser 64 parses the Ethernet header and performs two major functions:

[0260] Determine if the packet is for the local network stack

[0261] Parse the encapsulated packet type to determine where to send the rest of the packet.

[0262] Address Filtering

[0263] The network stack can be programmed with the following filter options:

[0264] Accept a programmed unicast address

[0265] Accept broadcast packets

[0266] Accept multicast packets

[0267] Accept addresses within a range specified by a netmask

[0268] Promiscuous mode (accepts all packets)

[0269] These parameters are all settable by the host computer via registers.
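The filter options listed above can be modeled in software as a single accept/reject decision per frame. The following Python sketch is illustrative only and is not the hardware implementation. The names FilterConfig, accept_frame, range_base, and range_mask are hypothetical, and the sketch assumes that the netmask range option is applied to the 48-bit destination MAC address.

```python
from dataclasses import dataclass
from typing import Optional

BROADCAST = bytes([0xFF] * 6)

@dataclass
class FilterConfig:
    unicast: bytes                       # programmed unicast address
    accept_broadcast: bool = True
    accept_multicast: bool = False
    range_base: Optional[bytes] = None   # hypothetical: base address of the accepted range
    range_mask: Optional[bytes] = None   # hypothetical: mask selecting the compared bits
    promiscuous: bool = False

def accept_frame(dest: bytes, cfg: FilterConfig) -> bool:
    """Return True if a frame with destination MAC `dest` passes the filter."""
    if cfg.promiscuous:
        return True                      # promiscuous mode accepts all packets
    if dest == cfg.unicast:
        return True
    if dest == BROADCAST:
        return cfg.accept_broadcast
    if cfg.accept_multicast and (dest[0] & 0x01):
        return True                      # group bit set in the first octet
    if cfg.range_base is not None and cfg.range_mask is not None:
        return all((d & m) == (b & m)
                   for d, b, m in zip(dest, cfg.range_base, cfg.range_mask))
    return False
```

In this model, each field of FilterConfig stands in for one host-settable register.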

[0270] Packet Types Supported

[0271] The following packet types are known by the IT10G hardware and are natively supported:

[0272] IPv4 packets with type=0x0800

[0273] ARP packets with type=0x0806

[0274] RARP packets with type=0x8035

[0275] The packet type parser also handles the case where an 802.3 length parameter is included in the TYPE field. This case is detected when the value is equal to or less than 1500 (decimal). When this condition is detected, the type parser sends the encapsulated packet to both the ARP and IP receive modules, and asserts an 802_frame signal so that each subsequent module decodes the packet knowing that the packet may not actually be meant for that module.

[0276] Note: IPv6 packets are treated as exception packets by the Ethernet layer.

[0277] FIG. 7 is a block schematic diagram of an Address Filter and Packet Type Parser module, and FIG. 8 is a timing diagram that shows Address Filter and Packet Type Parser module operation. For I/O timing, the signals that indicate the packet types remain asserted until the macin_lock signal that corresponds to that packet has been de-asserted. All packet-type signals also trigger only if the destination MAC address is acceptable.

[0278] If the Address Filter and Packet Type Parser module parses a packet that it does not understand, and if the unsupported type feature is enabled, then the packet is routed to the Exception Handler for storage and further processing.
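The type-field decision described above can be summarized as a small classification function. The Python sketch below is a software illustration under stated assumptions, not the parser hardware. The function name classify_type_field and the module labels "ip", "arp", and "exception" are hypothetical. The constants are the IEEE-registered EtherType values (0x0800 for IPv4, 0x0806 for ARP, 0x8035 for RARP), and RARP is routed to the ARP module because that module also handles RARP.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
ETHERTYPE_RARP = 0x8035
MAX_8023_LENGTH = 1500   # values up to this are 802.3 lengths, not types

def classify_type_field(type_field, unsupported_type_enabled=True):
    """Return (set of destination modules, 802_frame flag) for a TYPE field."""
    if type_field <= MAX_8023_LENGTH:
        # 802.3 length: offer the packet to both modules and flag it.
        return {"ip", "arp"}, True
    if type_field == ETHERTYPE_IPV4:
        return {"ip"}, False
    if type_field in (ETHERTYPE_ARP, ETHERTYPE_RARP):
        return {"arp"}, False
    if unsupported_type_enabled:
        return {"exception"}, False      # e.g. IPv6 at the Ethernet layer
    return set(), False                  # unknown type, feature disabled
```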

[0279] Data Aligner

[0280] The Data Aligner 65 is responsible for aligning data bytes for the following layers of packet processing. The Data Aligner is needed because the Ethernet header is not an even multiple of 64 bits. Depending on whether VLAN tags are present, the Data Aligner re-orients the 64-bit data so that the data presented to the upper processing layers is MSB justified. This way the payload section of the Ethernet frame is always aligned on an even 64-bit boundary. The Data Aligner is also responsible for generating the ready signal to the next layers. The ready signal goes active two or three clock cycles after macin_rdy is asserted. FIG. 9 is a block schematic diagram of one implementation of a Data Aligner module.
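The realignment can be illustrated with a simple software model. The Python sketch below assumes a 14-byte Ethernet header, or 18 bytes when a single 4-byte VLAN tag is present, and zero-pads the final word. The function name align_payload is hypothetical, and the sketch processes a whole frame at once rather than streaming word by word as hardware would.

```python
ETH_HEADER_LEN = 14   # destination MAC (6) + source MAC (6) + type (2)
VLAN_TAG_LEN = 4      # assumption: at most one VLAN tag

def align_payload(words, vlan_tagged=False):
    """Re-pack 64-bit frame words so the payload starts at the MSB of word 0."""
    header_len = ETH_HEADER_LEN + (VLAN_TAG_LEN if vlan_tagged else 0)
    raw = b"".join(w.to_bytes(8, "big") for w in words)
    payload = raw[header_len:]
    if len(payload) % 8:
        payload += bytes(8 - len(payload) % 8)   # zero-pad the last word
    return [int.from_bytes(payload[i:i + 8], "big")
            for i in range(0, len(payload), 8)]
```

Because 14 and 18 bytes both leave a remainder when divided by 8, the payload never starts on a word boundary in the incoming stream, which is why the shift is always needed.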

[0281] Ethernet Packet Formats

[0282] The IT10G accepts both 802.3 (SNAP) and DIX format packets from the network, but only transmits packets in DIX format. Furthermore, when 802.3 packets are received, they are first translated into DIX format, and then processed by the Ethernet filter. Therefore, all Ethernet exception packets are stored in DIX format.

[0283] ARP Protocol and ARP Cache Modules

[0284] Overview

[0285] The following discussion details the ARP Protocol and ARP Cache modules. In one embodiment of the IT10G architecture, the ARP protocol module also supports the RARP protocol, but does not include the ARP cache itself. Because each module capable of transmitting a packet queries the ARP cache ahead of time, this common resource is separated from the ARP module. The ARP module may send updates to the ARP cache based upon the packet types received.

[0286] ARP Feature List:

[0287] Able to respond to ARP requests by generating ARP replies

[0288] Able to generate ARP requests in response to the ARP cache

[0289] Able to provide ARP replies for multiple IP addresses (multi-homed host/ARP proxy)

[0290] Able to generate targeted (unicast) ARP requests

[0291] Filters out illegal addresses

[0292] Passes aligned ARP data up to the processor

[0293] Capable of performing a gratuitous ARP

[0294] CPU may bypass automatic ARP reply generation, dumping ARP data into the exception handler

[0295] CPU may generate custom ARP replies (when in bypass mode)

[0296] Variable priority of ARP packets, depending on network conditions

[0297] RARP Feature List:

[0298] Request an IP address

[0299] Request a specific IP address

[0300] RARP requests are handed off to the exception handler

[0301] Handles irregular RARP replies

[0302] Passes aligned RARP data up to the processor

[0303] CPU may generate custom RARP requests and replies

[0304] ARP Cache Features:

[0305] Dynamic ARP table size

[0306] Automatically updated ARP entry information

[0307] Interrupt when sender's hardware address changes

[0308] Capable of promiscuous collection of ARP data

[0309] Duplicate IP address detection and interrupt generation

[0310] ARP request capability via the ARP module

[0311] Support for static ARP entries

[0312] Option for enabling static ARP entries to be replaced by dynamic ARP data

[0313] Support for ARP proxying

[0314] Configurable expiration time for ARP entries

[0315] (The CPU may be either the host computer CPU or the on-chip processor in this context.)

[0316] ARP Module Block Diagram

[0317] FIG. 10 is a block schematic diagram of one implementation of an ARP Module Block.

[0318] ARP Cache Module Block Diagram

[0319] FIG. 11 is a block schematic diagram of one implementation of an ARP Cache Block.

[0320] ARP Module Theory of Operations

[0321] Parsing Packets

[0322] The ARP module 100 only processes ARP and RARP packets. The module waits for a ready signal received from Ethernet receive module. When that signal is received, the frametype of the incoming Ethernet frame is checked. If the frametype is not ARP/RARP, the packet is ignored. Otherwise, the module begins parsing.

[0323] Data is read from the Ethernet interface in 64-bit words. An ARP packet takes up 3.5 words. The first word of an ARP-type packet contains mostly static information. The first 48 bits of the first word of an ARP-type packet contain the Hardware Type, Protocol Type, Hardware Address Length, and Protocol Address Length. These received values are compared with the values expected for ARP requests for IPv4 over Ethernet. If the received values do not match, the data is passed to the exception handler for further processing. Otherwise, the ARP module continues with parsing. The last 16 bits of the first word of an ARP-type packet contain the opcode. The ARP module stores the opcode and checks if it is valid, i.e. 1, 2 or 4. If the opcode is invalid, the data is passed to the exception handler for further processing. Otherwise, the ARP module continues with parsing.

[0324] The second word of an ARP-type packet contains the Source Ethernet Address and half of the Source IP Address. The ARP module stores the first 48 bits into the Source Ethernet Address register. Then the ARP module checks if this field is a valid Source Ethernet Address. The address should not be the same as the address of the IT 10G network stack. If the source address is invalid, the packet is discarded. The last 16 bits of the word are then stored in the upper half of the Source IP Address register.

[0325] The third word of an ARP-type packet contains the second half of the Source IP Address and the Target Ethernet Address. The ARP module stores the first 16 bits in the lower half of the Source IP Address register, and checks if the complete stored value is a valid Source IP Address. The address should not be the same as that of the IT10G hardware, or the broadcast address. Also, the source address should be in the same subnet. The ARP module discards the packet if the source address is invalid. If the packet is an ARP/RARP reply, the ARP module compares the Target Hardware Address with its own Ethernet address. If the addresses do not match, the ARP module discards the packet. Otherwise the ARP module continues with parsing.

[0326] Only the first 32 bits of the last word of an ARP-type packet contain data (the Target IP Address). The ARP module stores the Target IP Address in a register. If the packet is an ARP packet (as opposed to a RARP packet), the ARP module compares the Target IP Address with its own IP address. If the addresses do not match, the ARP module discards the packet. Otherwise, if the packet is an ARP request, the ARP module generates an ARP reply. If the packet is a RARP reply, the ARP module passes the assigned IP address to the RARP handler.

[0327] Once all the address data have been validated, the source addresses are passed to the ARP Cache.
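The word-by-word checks described above can be restated compactly in software. The Python sketch below is a minimal illustration that assumes the standard 28-byte ARP layout for IPv4 over Ethernet. The function name parse_arp and the returned action strings are hypothetical. As simplifying assumptions, only the limited broadcast address 255.255.255.255 is treated as the broadcast address, and the subnet check is applied to every packet type.

```python
import struct

EXPECTED_FIXED = (1, 0x0800, 6, 4)   # hardware type, protocol type, address lengths
VALID_OPCODES = {1, 2, 4}            # ARP request, ARP reply, RARP reply
LIMITED_BROADCAST = 0xFFFFFFFF

def parse_arp(payload, my_mac, my_ip, netmask):
    """Return (action, (src_mac, src_ip)) for a 28-byte ARP/RARP payload."""
    (htype, ptype, hlen, plen, opcode,
     src_mac, src_ip, tgt_mac, tgt_ip) = struct.unpack("!HHBBH6sI6sI", payload[:28])

    # Word 1: fixed fields and opcode.
    if (htype, ptype, hlen, plen) != EXPECTED_FIXED:
        return "exception", None
    if opcode not in VALID_OPCODES:
        return "exception", None
    # Word 2: the source Ethernet address must not be our own.
    if src_mac == my_mac:
        return "discard", None
    # Word 3: source IP checks, then target hardware address for replies.
    if src_ip in (my_ip, LIMITED_BROADCAST):
        return "discard", None
    if (src_ip & netmask) != (my_ip & netmask):
        return "discard", None
    if opcode in (2, 4) and tgt_mac != my_mac:
        return "discard", None
    # Word 4: the target IP must be ours for ARP (not RARP) packets.
    if opcode in (1, 2) and tgt_ip != my_ip:
        return "discard", None

    source = (src_mac, src_ip)       # handed to the ARP cache once validated
    if opcode == 1:
        return "send_arp_reply", source
    if opcode == 4:
        return "rarp_assigned", source
    return "update_cache", source
```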

[0328] Transmitting Packets

[0329] The ARP module may receive requests for transmitting packets from three sources: the ARP Cache 110 (ARP requests), internally from the parser/FIFO buffer (for ARP replies), and from the system controller or host computer (for custom ARP/RARP packets). Because of this situation, a type of priority queue is necessary for scheduling the transmission of ARP/RARP packets.

[0330] Transmission requests are placed in the queue in a first-come first-served order, except when two or more entities want to transmit. In that case, the next request placed in the queue depends on its priority. RARP requests normally have the highest priority, followed by ARP requests. ARP replies usually have the lowest priority. Using priority allows resources to be shared depending on data traffic.

[0331] There is one condition where ARP replies have the highest priority. This occurs when the ARP reply FIFO buffer is full. When the FIFO buffer is full, incoming ARP requests begin to be discarded; therefore ARP replies should have the highest priority at that point to avoid forcing retransmissions of ARP requests.
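The priority rule for simultaneous requests can be sketched as follows. This Python fragment is illustrative only. The function name pick_next and the request labels are hypothetical, and the relative order of RARP and ARP requests while the reply FIFO buffer is full is an assumption, since only the promotion of ARP replies is described.

```python
NORMAL_ORDER = ["rarp_request", "arp_request", "arp_reply"]
FIFO_FULL_ORDER = ["arp_reply", "rarp_request", "arp_request"]

def pick_next(pending, reply_fifo_full=False):
    """Choose which of several simultaneous requests enters the queue next."""
    order = FIFO_FULL_ORDER if reply_fifo_full else NORMAL_ORDER
    for kind in order:
        if kind in pending:
            return kind
    return None   # nothing is waiting
```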

[0332] When the transmission queue is full, no more requests may be made until one or more transmission requests have been fulfilled (and removed from the queue). When the ARP module detects a full queue, it requests an increase in priority from the transmission arbiter. Because there should be only two conditions for the queue, full or not full, this request signal may be a single bit.

[0333] When the transmission arbiter allows the ARP module to transmit, ARP/RARP packets are generated dynamically depending on the type of packet to be sent. The type of packet is determined by the opcode, which is stored with each entry in the queue. FIG. 12 shows a Transmission Queue Entry Format.

[0334] Bypass Mode

[0335] The ARP module has the option of bypassing the automatic processing of incoming packet data. When a bypass flag is set, incoming ARP/RARP data are transferred to the exception handler buffer. The CPU then accesses the buffer, and processes the data. When in bypass mode, the CPU may generate ARP replies on its own, passing data to the transmission scheduler. The fields that can be customized in outgoing ARP/RARP packets are: the source IP address, the source Ethernet address, the target IP address, and the opcode. All other fields match the standard values used in ARP/RARP packets for IPv4 over Ethernet, and the source Ethernet address is set to that of the Ethernet interface. (The CPU may be either the host computer or the on-chip processor in this context.)

[0336] Note: If it is necessary to modify these other ARP/RARP fields, the CPU must generate a raw Ethernet frame itself.

[0337] ARP Cache Theory of Operation

[0338] Adding Entries to the ARP Cache

[0339] ARP entries are created when receiving targeted ARP requests and replies (dynamic), or when requested by the CPU (static). (The CPU may be either the host computer or the on-chip processor in this context.) Dynamic entries are ARP entries that are created when an ARP request or reply is received for one of the interface IP addresses. Dynamic entries exist for a limited time as specified by the user or application program running on the host computer; typically five to 15 minutes. Static entries are ARP entries that are created by the user and do not normally expire.

[0340] New ARP data come from two sources: the CPU via the ARP registers and the ARP packet parser. When both sources request to add an ARP entry at the same time the dynamic ARP entries have priority, because it is necessary to process incoming ARP data as quickly as possible.

[0341] Once an ARP data source has been selected, we need to determine where in IT 10G hardware memory the ARP entry is to be stored. To do this we use a lookup table (LUT) to map a given IP address to a location in memory. The lookup table contains 256 entries. Each entry is 16 bits wide and contains a memory pointer and a pointer valid (PV) bit. The PV bit is used to determine if the pointer is pointing to a valid address, i.e. the starting address of a memory block allocated by the ARP cache. FIG. 13 shows a Lookup Table Entry Format.

[0342] To determine from where in the LUT we need to retrieve the pointer, we use an 8-bit index. The index is taken from the last octet of a 32-bit IP address. The reason for using the last octet is that in a local area network (LAN) this is the portion of the IP address that varies the most between hosts.

[0343] Once we determine which slot in the LUT to use, we check to see if there is a valid pointer contained in that slot (PV=“1”). If there is a valid pointer, that means there is a block of memory allocated for this index, and the target IP address may be found in that block. At this point, the block of memory being pointed to is retrieved and the target IP address is searched for. If the LUT does not contain a valid pointer in this slot, then memory must be allocated from an internal memory, malloc1. Once the memory has been allocated the address of the first word of the allocated memory is stored in the pointer field of the LUT entry.

[0344] After allocating memory and storing the pointer in the LUT, we need to store the necessary ARP data. This ARP data includes the IP address, necessary for determining if this is the correct entry during cache lookups. Also used is a set of control fields. The retry counter is used to keep track of the number of ARP request attempts performed for a given IP address. The type field indicates the type of cache entry (000=dynamic entry; 001=static entry; 010=proxy entry; 011=ARP check entry). The resolved flag indicates that this IP address has been successfully resolved to an Ethernet address. The valid flag indicates that this ARP entry contains valid data. Note: an entry may be valid and unresolved while the initial ARP request is being performed. The src field indicates the source of the ARP entry (00=dynamically added, 01=system interface, 10=IP router, and 11=both system interface and IP router). The interface field allows the use of multiple Ethernet interfaces, but defaults to a single interface (0). Following the control fields is the link address that points to the following ARP entry. The most significant bit (MSB) of the link address is actually a flag, link_valid. The link_valid bit indicates that there is another ARP entry following this one. The last two fields are the Ethernet address to which the IP address has been resolved, and the timestamp. The timestamp indicates when the ARP entry was created, and is used to determine if the entry has expired. FIG. 14 shows an example of the ARP Cache Entry Format.

[0345] In LANs with more than 256 hosts or with multiple subnets, collisions between different IP addresses may occur in the LUT. In other words, more than one IP address may map to the same LUT index. This would be due to more than one host having a given value in the last octet of its IP address. To deal with collisions, the ARP cache uses chaining, which we describe next.

[0346] When performing a lookup in the LUT, and an entry is found to already exist in that slot, we retrieve the ARP entry that is being pointed to from memory. We examine the IP address in the ARP entry and compare it to the target IP address. If the IP addresses match then we can simply update the entry. However, if the addresses do not match, then we look at the Link_Valid flag and the last 16 bits of the ARP entry. The last 16 bits contain a link address pointing to another ARP entry that maps to the same LUT index. If the Link_Valid bit is asserted, then we retrieve the ARP entry pointed to in the Link Address field. Again the IP address in the entry is compared with the target IP address. If there is a match then the entry is updated, otherwise the lookup process continues (following the links in the chain) until a match is found or the Link_Valid bit is not asserted.

[0347] When the end of a chain is reached and a match has not been found, a new ARP entry is created. Creating a new ARP entry may require the allocation of memory by the malloc1 memory controller. Each block of memory is 128 bytes in size. Thus, each block can accommodate 8 ARP entries. If the end of a block has been reached, then a new memory block must be requested from malloc1.
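The LUT indexing and chaining scheme can be modeled in a few lines. The Python sketch below is a software analogy under stated assumptions, not the memory layout of FIG. 13 or FIG. 14. The class and method names are hypothetical, an empty slot (None) stands in for PV=0, a missing next reference stands in for Link_Valid=0, and the 128-byte block allocation by malloc1 is not modeled.

```python
def lut_index(ip):
    """8-bit LUT index: the last octet of a 32-bit IP address."""
    return ip & 0xFF

class ArpEntry:
    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac
        self.next = None          # None plays the role of Link_Valid = 0

class ArpCache:
    def __init__(self):
        self.lut = [None] * 256   # None plays the role of PV = 0

    def add(self, ip, mac):
        """Update a matching entry, or append a new one to the chain."""
        entry = self.lut[lut_index(ip)]
        if entry is None:
            self.lut[lut_index(ip)] = ArpEntry(ip, mac)
            return
        while True:
            if entry.ip == ip:
                entry.mac = mac   # match: update in place
                return
            if entry.next is None:
                break             # end of chain reached without a match
            entry = entry.next
        entry.next = ArpEntry(ip, mac)

    def lookup(self, ip):
        """Follow the chain; return the Ethernet address or None on a miss."""
        entry = self.lut[lut_index(ip)]
        while entry is not None:
            if entry.ip == ip:
                return entry.mac
            entry = entry.next
        return None
```

For example, 10.0.0.5 and 10.0.1.5 share index 5, so the second address added is chained behind the first.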

[0348] As previously mentioned, the user (or application running on the host computer) has the option of creating static or permanent ARP entries. The user may have the option of allowing dynamic ARP data to replace static entries. In other words, when ARP data are received for an IP address that already has a static ARP entry created for it, that static entry may be replaced with the received data. The benefit of this arrangement is that static entries may become outdated and allowing dynamic data to overwrite static data may result in a more current ARP table. This update capability may be disabled if the user is confident that IP-to-Ethernet address mappings will remain constant, e.g. storing the IP and Ethernet addresses of a router interface. The user may also choose to preserve static entries to minimize the number of ARP broadcasts on a LAN. Note: ARP proxy entries can never be overwritten by dynamic ARP data.

[0349] Looking Up Entries in the Cache

[0350] Looking up entries in the ARP cache follows a process similar to that for creating ARP entries. Lookups begin by using the LUT to determine if memory has been allocated for a given index. If memory has been allocated, the memory is searched until either the entry is found (a cache hit occurs), or an entry with the link_valid flag set to zero (a cache miss) is encountered.

[0351] If a cache miss occurs, an ARP request is generated. This involves creating a new ARP entry in the cache, and a new LUT entry if necessary. In the new ARP entry, the target IP address is stored, the resolved bit is set to zero and the valid bit is set to one. The request counter is set to zero as well. The entry is then time stamped and an ARP request is passed to the ARP module. If a reply is not received after one second, then the request counter is incremented and another request is sent. After sending three requests and receiving no replies, attempts to resolve the target IP are abandoned. Note: the retry interval and number of request retries are user-configurable.
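The retry sequence after a cache miss can be sketched as a small loop. In the Python fragment below, the function name resolve and both callbacks are hypothetical. The callback reply_received is assumed to be polled once per retry interval (one second by default in the text), so no real timer is modeled.

```python
def resolve(send_request, reply_received, max_requests=3):
    """Send ARP requests until a reply arrives or the retry limit is reached.

    Returns True if the address was resolved, False if attempts are abandoned.
    """
    request_counter = 0
    send_request()                        # first request
    while True:
        if reply_received():              # checked once per retry interval
            return True
        request_counter += 1
        if request_counter >= max_requests:
            return False                  # three requests sent, no reply
        send_request()                    # retry
```

With the default limit, exactly three requests are sent before the lookup is abandoned.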

[0352] When a cache miss occurs, the requesting module is notified of the miss. This allows the CPU or IP router the opportunity to decide to wait for an ARP reply for the current target IP address, or to begin a new lookup for another IP address and place the current IP address at the back of the queue. This helps to minimize the impact of a cache miss on establishing multiple connections. FIG. 15 is a flow diagram that shows the ARP Lookup Process.

[0353] If a matching entry is found (cache hit) then the resolved Ethernet address is returned to the module requesting the ARP lookup. Otherwise, if the target IP address was not found in the cache, and all ARP request attempts have timed out, the requesting module is notified that the target IP address could not be resolved.

[0354] Note: if an ARP lookup request from the IP router fails, the router must wait a minimum of 20 seconds before initiating another lookup for that address.

[0355] Cache Initialization

[0356] When the ARP cache is initialized several components are reset. The lookup table (LUT) is cleared by setting all the PV bits to zero. All memory currently in use is de-allocated and released back to the malloc1 memory controller. The ARP expiration timer is also set to zero.

[0357] During the initialization period, no ARP requests are generated. Also, any attempts to create ARP entries from the CPU (static entries), or from received ARP data (dynamic entries) are ignored or discarded.

[0358] Expiring ARP Entries

[0359] Dynamic ARP entries may only exist in the ARP cache for a limited amount of time. This is to prevent any IP-to-Ethernet address mappings from becoming stale. Outdated address mappings could occur if a LAN uses DHCP to assign IP addresses or if the Ethernet interface on a device is changed during a communications session.

[0360] To keep track of the time, a 16-bit counter is used. Operating with a clock frequency of 1 Hz the counter is used to track the number of seconds that have passed. Each ARP entry contains a 16-bit timestamp taken from this counter. This timestamp is taken when an IP address is successfully resolved.

[0361] ARP entry expiration occurs when the ARP cache is idle, i.e. no requests or lookups are currently being processed. At this time, an 8-bit counter is used to cycle through and search the LUT. Each slot in the LUT is checked to see if it contains a valid pointer. If a pointer is valid, the memory block pointed to is retrieved. Then, each entry within that block is checked to see if the difference between its timestamp and the current time is greater than or equal to the maximum lifetime of an ARP entry. If other memory blocks are chained off the first memory block, the entries contained in those blocks are also checked. Once all the entries associated with a given LUT index have been checked, then the next LUT slot is checked.

[0362] If an entry is found to have expired, the valid bit in the entry is set to zero. If there are no other entries within the same memory block, then the block is de-allocated and returned to malloc1. If the block being de-allocated is the only block associated with a given LUT slot, the PV bit in that slot is also set to zero.
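The expiration test compares a 16-bit timestamp against a 16-bit seconds counter. The Python sketch below shows one way to compute the age of an entry so that the comparison still works after the counter wraps. The modulo-65536 subtraction is an assumption made for this sketch, since the handling of counter wraparound is not described, and the function name is_expired is hypothetical.

```python
COUNTER_MASK = 0xFFFF   # 16-bit counter clocked at 1 Hz

def is_expired(timestamp, now, max_lifetime):
    """True if the entry is at least `max_lifetime` seconds old."""
    age = (now - timestamp) & COUNTER_MASK   # wrap-safe difference
    return age >= max_lifetime
```

A 16-bit counter at 1 Hz wraps after 65,536 seconds (about 18.2 hours), so this comparison is unambiguous only for lifetimes shorter than that.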

[0363] Performing ARP Proxying

[0364] The ARP cache supports proxy ARP entries. ARP proxying is used when this device acts as a router for LAN traffic, or there are devices on the LAN that are unable to respond to ARP queries.

[0365] With ARP proxying enabled, the ARP module passes requests for IP addresses that do not belong to the host up to the ARP cache. The ARP cache then does a lookup to search for the target IP address. If it finds a match, it checks the type field of the ARP entry to determine if it is a proxy entry. If it is a proxy entry, the ARP cache passes the corresponding Ethernet address back to the ARP module. The ARP module then generates an ARP reply using the Ethernet address found in the proxy entry as the source Ethernet address. Note: ARP proxy lookups occur only for incoming ARP requests.

[0366] Detection of Duplicate IP Addresses (ARP Check)

[0367] When the system (host computer plus IT 10G hardware) initially connects to a network, the user or application running on the host computer should perform a gratuitous ARP request to test if any other device on the network is using one of the IP addresses assigned to its interface. If two devices on the same LAN use the same IP address, this could result in problems with routing packets for the two hosts. A gratuitous ARP request is a request for the host's own IP address. If no replies are received for the queries, then it can be assumed that no other host on the LAN is using our IP address.

[0368] An ARP check is initiated in a manner similar to that of performing an ARP lookup. The only difference is that the cache entry is discarded once the gratuitous ARP request has been completed. If no replies are received, the entry is removed. If a reply is received, an interrupt is generated to notify the host computer that the IP address is in use by another device on the LAN, and the entry is removed from the cache.

[0369] Cache Access Priorities

[0370] Different tasks have different priorities in terms of access to the ARP cache memory. Proxy entry lookups have the highest priority due to the need for rapid responses to ARP requests. Second in priority is adding dynamic entries to the cache; incoming ARP packets may be received at a very high rate and must be processed as quickly as possible to avoid retransmissions. ARP lookups from the IP router have the next highest priority, followed by lookups by the host computer. The manual creation of ARP entries has the second lowest priority. Expiring cache entries has the lowest priority and is performed whenever the cache is not processing an ARP lookup or creating a new entry.
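The access ordering above amounts to a fixed-priority arbiter. The Python sketch below lists the six tasks from highest to lowest priority. The task labels and the function name grant are hypothetical.

```python
CACHE_PRIORITY = [
    "proxy_lookup",    # highest: rapid responses to ARP requests
    "add_dynamic",     # incoming ARP data
    "router_lookup",   # lookups from the IP router
    "host_lookup",     # lookups by the host computer
    "add_static",      # manual creation of ARP entries
    "expire",          # lowest: runs only when the cache is otherwise idle
]

def grant(requests):
    """Return the highest-priority task among those currently requesting."""
    for task in CACHE_PRIORITY:
        if task in requests:
            return task
    return None
```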

IP Module

[0371] Overview

[0372] The IT 10G natively supports IPv4 packets with automatic parsing for all types of received packets.

[0373] IP Module Block Diagram

[0374] FIG. 16 is a block schematic diagram of one implementation of an IP Module Block.

[0375] IP Sub Module Descriptions

[0376] IP Parser

[0377] The IP Parser module 161 is responsible for parsing received IP packets and determining where to send the packet. Each received IP packet can be sent to either the TCP/UDP module or the exception handler.

[0378] IP Header Field Parsing

[0379] IP Version

[0380] Only IPv4 packets are accepted and parsed by the IP module, therefore this field must be 0x4 to be processed. If an IPv6 packet is detected, it is handled as an exception and processed by the Exception Handler. Any packet having a version that is less than 0x4 is considered malformed (illegal) and the packet is dropped.

[0381] IP Header Length

[0382] The IP Header Length field is used to determine if any IP options are present. This field must be greater than or equal to five. If it is less, the packet is considered malformed and dropped.

[0383] IP TOS

[0384] This field is not parsed or kept for received packets.

[0385] Packet Len

[0386] This field is used to determine the total number of bytes in the received packet, and is used to indicate to the next level protocol where the end of its data section is. All data bytes received after this count expires and before the ip_packet signal de-asserts are assumed to be padding bytes and are silently discarded.

[0387] Packet ID, Flags, and Fragmentation Offset

[0388] These fields are used for defragmenting packets. Fragmented IP packets may be handled by dedicated hardware or may be treated as exceptions and processed by the Exception Handler.

[0389] TTL

[0390] This field is not parsed or kept for received packets.

[0391] PROT

[0392] This field is used to determine the next encapsulated protocol. The following protocols are fully supported (or partially supported in alternative embodiments) in hardware:

TABLE 4
Supported Protocol Field Decodes
Hex value    Protocol
0x06         TCP
0x11         UDP

[0393] If any other protocol is received, and if the unsupport_prot feature is enabled, then the packet may be sent to the host computer. A protocol filter may be enabled to selectively receive certain protocols. Otherwise, the packet is silently discarded.

[0394] Checksum

[0395] This field is not parsed or kept. It is used only to verify that the checksum is correct. If the checksum is bad, then the bad_checksum signal, which goes to all the next layers, is asserted. It stays asserted until it is acknowledged.

[0396] Source IP Address

[0397] This field is parsed and sent to the TCP/UDP layers.

[0398] Destination IP Address

[0399] This field is parsed and checked against valid IP addresses that the local stack should be responding to. This may take more than one clock cycle, in which case the parsing should continue. If the packet turns out to be misdirected, then the bad_ip_add signal is asserted. It stays asserted until it is acknowledged.
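The header field checks described above can be collected into one software routine. The Python sketch below is a minimal illustration rather than the parser hardware. The function names, the returned dictionary keys, and the action labels are hypothetical. Fragment handling is omitted, and version values above 4 other than IPv6 are also routed to the exception path as an assumption, because their treatment is not specified.

```python
PROTO_TCP = 0x06
PROTO_UDP = 0x11

def ones_complement_sum(data):
    """16-bit one's-complement sum over an even-length byte string."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold the carries back in
    return total

def parse_ip_header(packet, local_ips, unsupport_prot=True):
    version = packet[0] >> 4
    ihl = packet[0] & 0x0F
    if version < 4:
        return {"action": "drop"}          # malformed
    if version != 4:
        return {"action": "exception"}     # e.g. IPv6
    if ihl < 5:
        return {"action": "drop"}          # header length too small
    header = packet[:ihl * 4]
    protocol = header[9]
    if protocol in (PROTO_TCP, PROTO_UDP):
        action = "tcp_udp"
    elif unsupport_prot:
        action = "host"
    else:
        action = "discard"
    return {
        "action": action,
        "total_length": (header[2] << 8) | header[3],
        "has_options": ihl > 5,
        # A valid header, checksum field included, sums to 0xFFFF.
        "bad_checksum": ones_complement_sum(header) != 0xFFFF,
        "src_ip": bytes(header[12:16]),
        "bad_ip_add": bytes(header[16:20]) not in local_ips,
    }
```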

[0400] IP ID Generation Algorithm

[0401] The on-chip processor can set the IP ID seed value by writing any 16-bit value to the IP_ID_Start register. The ID generator takes this value and does a mapping of the 16 bits to generate the IP ID used by different requestors. The on-chip processor, TCP module, and ICMP echo reply generator can all request an IP ID. A block diagram of one implementation of the ID generator is shown in FIG. 17.

[0402] The IP ID Seed register is incremented every time a new IP ID is requested. The Bit Mapper block rearranges the IP_ID_Reg value such that the IP_ID_Out bus is not a simple incrementing value.
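The combination of an incrementing register and a bit mapper can be illustrated as follows. The actual bit mapping is not specified, so the Python sketch below uses bit reversal purely as a hypothetical example of a fixed bit permutation. The class name IpIdGenerator and the function bit_reverse16 are likewise hypothetical.

```python
def bit_reverse16(value):
    """Hypothetical bit mapper: reverse the order of the 16 bits."""
    out = 0
    for _ in range(16):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

class IpIdGenerator:
    def __init__(self, seed):
        self.reg = seed & 0xFFFF               # value written by the processor

    def next_id(self):
        value = self.reg
        self.reg = (self.reg + 1) & 0xFFFF     # increment on every request
        return bit_reverse16(value)            # output is not a simple count
```

Because any fixed bit permutation is a one-to-one mapping, the generated IDs do not repeat until the 16-bit register wraps, even though consecutive outputs are not consecutive numbers.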

[0403] IP Injector Module

[0404] The IP injector module is used to inject packets from the on-chip processor into the IP and TCP modules. The IP injector control registers are located in the IP module register space, and these registers are programmed by the on-chip processor. A block diagram depicting the data flow of the IP Injector is shown in FIG. 18.

[0405] As can be seen, the IP Injector is capable of inserting data below the IP module. To use IP Injection, the on-chip processor programs the IP Injector module with the starting address in its memory of where the packet resides, the length of the packet, and the source MAC address. The injector module generates an interrupt when it has completed transmitting the packet from the on-chip proce