[0001] This application claims the benefit of the filing date of co-pending U.S. Provisional Application, Serial No. 60/190,358, filed Mar. 16, 2000, entitled “An IPv4 Compatibility Aggregatable Global Unicast Address Format,” co-pending U.S. Provisional Application, Serial No. 60/232,047, filed Sep. 12, 2000, entitled “Techniques for Improved Topology Based on Reverse-Path Forwarding,” co-pending U.S. Provisional Application, Serial No. 60/232,046, filed Sep. 12, 2000, entitled “Reduced Overhead Hello Protocol,” and co-pending U.S. Provisional Application, Serial No.______, filed Nov. 14, 2000, entitled “Efficient Routing Protocols for Packet-Radio Networks Based on Tree Sharing”, the entirety of which provisional applications are incorporated by reference herein.
[0003] A network is a collection of communications entities (e.g., hosts, routers, and gateways) that are in communication with each other over communication links. Organizing communications entities into networks increases the capabilities of the communications entities beyond that of which each communications entity alone is capable by enabling such entities to share resources. A network that interconnects communications entities within a common geographical area (for example, the personal computers in an office) is called a local area network (LAN). Some LANs employ one or more network servers that direct the flow of data within the network and control access to certain network functions such as storing data in a central file repository, printing, and accessing other networks. In other LANs, computers communicate with each other without the use of servers.
[0004] A wide area network (WAN), of which the Internet is an example, is a collection of geographically distributed LANs joined by long-range communication links. The Internet is a publicly accessible, worldwide network of networks based upon a transmission protocol known as TCP/IP (Transmission Control Protocol/Internet Protocol). Communication on the Internet is packet-switched; that is, the information that is to pass from one communications entity to another is broken into packets that are individually passed from router to router until the packets arrive at their destination. The TCP divides the data into segments and provides reliable delivery of bytes in the segments to the destination, which reconstructs the data. The IP further subdivides the TCP segments into packets and routes the packets to their final destination. The route taken by packets may pass through one or more networks, depending upon the Internet Protocol (IP) address of the destination.
[0005] A rapidly growing part of the Internet is the World Wide Web (“Web”), which operates according to a client-server model. Client software, commonly referred to as a Web browser, runs on a computer system. After establishing an Internet connection, the client user launches the Web browser to communicate with a Web server on the Internet. Using TCP/IP, the Web browser sends HTTP (Hypertext Transfer Protocol) requests to the Web server. The request traverses the Internet's TCP/IP infrastructure to the Web host server as HTTP packets.
[0006] A private network based on Internet technology and consisting of a collection of LAN and WAN components is called an intranet. Accordingly, communications entities that are part of an intranet can use a Web browser to access Web servers that are within the intranet or on the Internet.
[0007] Today, most of the communication links between the various communications entities in a network are wire-line; that is, client systems are typically connected to a server and to other client systems by wires, such as twisted-pair wires, coaxial cables, fiber optic cables, and the like. Wireless communication links, such as microwave links, radio frequency (RF) links, infrared (IR) links, and satellite links, are becoming more prevalent in networks.
[0008] A characteristic of wireless networks is that the communication entities in the network are mobile. Such mobility creates frequent, dynamic changes to the network topology and state of the communication links between the communication entities. Mobility is less of a concern for those communication entities connected to the Internet by wire-line; however, the topology of the Internet is perpetually changing, with communication entities joining and leaving the Internet often. Also, the state of communication links between communication entities on the Internet may change for various reasons, such as increased packet traffic.
[0009] To effectively route messages through such dynamically changing networks, routers need to remain informed of topology and link-state changes. Existing methods based on flooding are inefficient and consume too much network bandwidth. The inefficiency of flooding is the result, in part, of the following redundancies: (1) link-state and topology updates are sent over multiple paths to each router; and (2) every router forwards every update to all neighboring routers, even if only a small subset of the neighboring routers need to receive it.
[0010] The routing of update information and of data packets is further complicated by the heterogeneous infrastructure of the Internet. Currently, most communications entities on the Internet exchange messages using the Internet Protocol Version 4 (or IPv4), but an increasing number of communications entities that communicate using the Internet Protocol Version 6 (or IPv6) are being deployed. IPv6 is a second generation Internet Protocol designed to supplant IPv4, but is expected to coexist with IPv4 until the transition to IPv6 is complete. In general, the IP versions are incompatible: IPv4 routers cannot route IPv6 messages, nor can IPv6 routers route IPv4 messages. Instead, special routers that implement both the IPv4 and IPv6 protocols in a “dual-stack” configuration are required to support the coexistence and transition phase.
[0011] Another difficulty presented by the mobility of the communications entities is that the movement of one communication entity can interrupt on-going communications with another entity. For example, a portable laptop computer with a wireless link by which it is communicating with a Web server on the Internet may be moved so that the link to the network, and thus to the Web server, is broken. In general, the loss of the link irretrievably causes the loss of any information being transmitted to the computer, although the laptop computer may later regain the link or establish a new link to the network. After reconnecting to the network, the laptop computer must reestablish communications with the Web server. The on-going communications are lost.
[0012] Thus, there remains a need for a mobile wireless network that can perform reliably and efficiently despite the aforementioned difficulties associated with the mobility of the communication entities in the network.
[0013] An objective of the invention is to enable seamless movement by mobile nodes from network to network. Another objective is to facilitate the addition of devices to networks. Yet another objective is to improve robustness of communications and connections in wireless networks comprised of mobile nodes.
[0014] In one aspect, the invention features a wireless mobile network comprising a mobile client in communication with a server over a first wireless route through the network. Routing nodes communicate with each other according to a protocol by which each routing node disseminates link-related information to zero or more neighbor nodes based on a tree developed and maintained by that routing node. The routing nodes determine that a link-state change in the first wireless route has interrupted communications between the mobile client and the server, and that the mobile client has selected an alternative wireless route through the network. A queue stores communications affected by the interruption and transmits such communications to the client and the server so that communications can resume between the client and the server over the alternative wireless route from the point of interruption.
[0015] In one embodiment, the wireless mobile network further comprises a processing system that measures at least one network parameter during network operation for use in adapting communications between the client and the server to current network conditions. The measured network parameter can be, for example, packet loss and round-trip time. The processing system can adjust a length of packets transmitted between the client and the server in response to the measured packet loss or adjust a time-out period for which a sender of a transmitted packet awaits a corresponding acknowledgment.
[0016] The processing system can also make other determinations that serve to adapt communications between the client and the server in the wireless mobile network, such as determining the number of retransmissions of an unacknowledged packet before an attempt is made to reestablish a connection between the client and the server in response to the measured network parameter, and determining the number of attempts to reestablish a connection between the client and the server if the transmitted packet remains unacknowledged after the number of retransmissions.
[0017] In one embodiment, the wireless mobile network also includes a node that generates an IPv6 packet for transmission to a destination node. The IPv6 packet includes an address having a globally aggregatable IPv6 address prefix and an IPv6-compatible interface identifier that contains an embedded IPv4 address associated with the destination node. The format of the address achieves compatibility between IPv6 and IPv4, enabling IPv6 packets to be routed through IPv6 routing infrastructure or to be tunneled through IPv4 routing infrastructure.
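By way of illustration only, the following Python sketch shows one way an IPv6 address could carry an embedded IPv4 address in its interface identifier. The bit layout is an assumption made for this example (a 64-bit prefix followed by a 64-bit interface identifier whose low-order 32 bits hold the IPv4 address, with the remaining identifier bits left at zero); the actual format of the interface identifier is not given in this passage. The function names embed_ipv4 and extract_ipv4 are hypothetical.

```python
import ipaddress


def embed_ipv4(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Join a /64 IPv6 prefix with an interface identifier whose low-order
    32 bits carry the IPv4 address (layout assumed for illustration)."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch assumes a 64-bit prefix")
    v4_bits = int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Address(int(net.network_address) | v4_bits)


def extract_ipv4(addr: ipaddress.IPv6Address) -> ipaddress.IPv4Address:
    """Recover the embedded IPv4 address, e.g. to pick a tunnel endpoint."""
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


addr = embed_ipv4("2001:db8:1:2::/64", "192.0.2.1")
print(addr)                # 2001:db8:1:2::c000:201
print(extract_ipv4(addr))  # 192.0.2.1
```

Under this assumed layout, a node holding only the IPv6 address can recover the IPv4 address needed to tunnel the packet through IPv4 routing infrastructure.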
[0018] In another aspect, the invention features a network comprising a node that generates an IPv6 packet for transmission to a destination node. The IPv6 packet includes an address of the destination node. This address has a globally aggregatable IPv6 address prefix and an IPv6-compatible interface identifier that contains an embedded IPv4 address associated with the destination node. The network also includes routing nodes that communicate with each other according to a protocol by which each routing node disseminates routing information to zero or more neighbor nodes based on a broadcast tree maintained in part by that routing node. The routing nodes determine a route to the destination node based on the routing information, the IPv6 address prefix, and the IPv4 address embedded within the IPv6-compatible interface identifier.
[0019] In one embodiment, the wireless mobile network includes a processing system measuring at least one network parameter during network operation for use in adapting communications of the first node to current network conditions. The measured network parameter can be, for example, packet loss and round-trip time. The processing system can adjust a length of packets transmitted by the first node in response to the measured packet loss or adjust a time-out period for which the first node awaits an acknowledgment for a transmitted packet.
[0020] Other determinations that may be made by the processing system for adapting communications include determining the number of retransmissions of an unacknowledged packet before an attempt is made to reestablish a connection to the first node, and determining the number of attempts to reestablish a connection to the first node if the transmitted packet remains unacknowledged after the number of retransmissions.
[0021] The invention is pointed out with particularity in the appended claims. The objectives and advantages of the invention described above, as well as further objectives and advantages of the invention, may be better understood by reference to the following description taken in conjunction with the accompanying drawings, in which:
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035] FIGS.
[0036]
[0037]
[0038]
[0039]
[0040]
[0041]
[0042]
[0043]
[0044]
[0045] A server
[0046] Each subnet
[0047] The subnet
[0048] Although it is conceivable that all nodes
[0049] In the subnet
[0050] Each broadcast link connecting multiple nodes
[0051] In one embodiment, the subnet
[0052] The following example illustrates the operation of the subnet
[0053] Each router
[0054] In brief, the TBRPF protocol performed by each of the routers
[0055] To transmit update messages, the TBRPF protocol supports unicast transmissions (e.g., point-to-point or receiver directed), in which a packet reaches only a single neighbor, and broadcast transmissions, in which a single packet is transmitted simultaneously to all neighbor nodes. In particular, the TBRPF protocol allows an update to be sent either on a common broadcast channel or on one or more unicast channels, depending on the number of neighbors that need to receive the update.
[0056] Upon recovering the same link to node B
[0057] In one embodiment, such communications resume at their point of interruption. In brief, node A
[0058] The route taken by the object updates may traverse a heterogeneous IPv6/IPv4 infrastructure. Normally, IPv6 nodes are unable to route packets to other IPv6 nodes
[0059] Accordingly, the internetworking system
[0060]
[0061] The data link layer
[0062] The network layer
[0063] Topology Broadcast based on Reverse-Path Forwarding (TBRPF) Protocol
[0064] In brief, the TBRPF protocol uses the concept of reverse-path forwarding to broadcast each link-state update in the reverse direction along a tree formed by the minimum-hop paths from all routing nodes
[0065] Based on the information received along the minimum-hop-path trees, each node
[0066] To communicate according to the TBRPF protocol, each routing node
[0067] 1. A topology table, denoted TT_i, consisting of all link-states stored at node i. The entry for link (u, v) in this table is denoted TT_i(u, v) and includes the most recent update (u, v, c, sn) received for link (u, v). The component c represents the cost associated with the link, and the component sn is a serial number for identifying the most recent update affecting link (u, v) received by the node i. The components c and sn of the entry for the link (u, v) are denoted TT_i(u, v).c and TT_i(u, v).sn, respectively. Optionally, the dissemination of multiple link metrics is attainable by replacing the single cost c with a vector of multiple metrics.
[0068] 2. A list of neighbor nodes, denoted N_i.
[0069] 3. For each node u other than node i, the following is maintained:
[0070] a. The parent, denoted p_i(u), which is the neighbor node (“nbr”) of node i that is the next node on a minimum-hop path from node i to node u, as obtained from the topology table TT_i.
[0071] b. A list of children nodes of node i, denoted children_i(u).
[0072] c. The sequence number of the most recent link-state update originating from node u received by node i, denoted sn_i(u). The sequence number is included in the link-state update message. The use of sequence numbers helps achieve reliability despite topology changes, because node i avoids sending another node information that the other node already has. Each node i maintains a counter (i.e., the sequence number) for each link that the node i monitors. That counter is incremented each time the status of the link changes.
[0073] d. The routing table entry for node u, consisting of the next node on a preferred path to node u. The routing table entry for node u can be equal to the parent p_i(u) if minimum-hop routing is used for data packets. However, in general, the routing table entry for node u is not p_i(u), because the selection of routes for data traffic can be based on any objective.
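By way of illustration only, the per-node state enumerated above could be held in a structure such as the following Python sketch. The class names LinkState and NodeState, the field names, and the apply_update helper are hypothetical; the rule that an update is stored only when its sequence number exceeds the stored one is an assumption drawn from the description of sn as identifying the most recent update.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Optional, Set, Tuple

NodeId = Hashable
INFINITY = float("inf")  # a cost of infinity represents a failed link


@dataclass
class LinkState:
    cost: float  # component c
    sn: int      # component sn


@dataclass
class NodeState:
    node_id: NodeId
    # TT_i: topology table keyed by link (u, v)
    topology: Dict[Tuple[NodeId, NodeId], LinkState] = field(default_factory=dict)
    # N_i: list of neighbor nodes
    neighbors: Set[NodeId] = field(default_factory=set)
    # p_i(u): parent for each node u, or None when no parent exists
    parent: Dict[NodeId, Optional[NodeId]] = field(default_factory=dict)
    # children_i(u)
    children: Dict[NodeId, Set[NodeId]] = field(default_factory=dict)
    # sn_i(u): most recent sequence number received from source u
    last_sn: Dict[NodeId, int] = field(default_factory=dict)
    # routing table: next node on a preferred path to u
    next_hop: Dict[NodeId, NodeId] = field(default_factory=dict)

    def apply_update(self, u: NodeId, v: NodeId, c: float, sn: int) -> bool:
        """Store update (u, v, c, sn) if it is newer than the stored entry.
        Returns True when the topology table changed."""
        entry = self.topology.get((u, v))
        if entry is not None and sn <= entry.sn:
            return False  # stale or duplicate update (assumed rule)
        self.topology[(u, v)] = LinkState(c, sn)
        self.last_sn[u] = max(self.last_sn.get(u, 0), sn)
        return True
```

Keeping the routing table separate from the parent map reflects the point made in item d: the next hop for data packets need not equal p_i(u).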
[0074] One embodiment of the TBRPF protocol uses the following message types:
[0075] LINK-STATE UPDATE: A message containing one or more link-state updates (u, v, c, sn).
[0076] NEW PARENT: A message informing a neighbor node that the node has selected that neighbor node to be a parent with respect to one or more sources of updates.
[0077] CANCEL PARENT: A message informing a neighbor that it is no longer a parent with respect to one or more sources of updates.
[0078] HELLO: A message sent periodically by each node i for neighbor discovery.
[0079] NEIGHBOR: A message sent in response to a HELLO message.
[0080] NEIGHBOR ACK: A message sent in response to a NEIGHBOR message.
[0081] ACK: A link-level acknowledgment to a unicast transmission.
[0082] NACK: A link-level negative acknowledgment reporting that one or more update messages sent on the broadcast channel were not received.
[0083] RETRANSMISSION OF BROADCAST: A retransmission, on a unicast channel, of link-state updates belonging to an update message for which a NACK message was received.
[0084] HEARTBEAT: A message sent periodically on the broadcast channel when there are no updates to be sent on this channel, used to achieve reliable link-level broadcast of update messages based on NACKs.
END OF BROADCAST: A message sent to a neighbor over a unicast channel, to report that updates originating from one or more sources are now being sent on the unicast channel instead of the broadcast channel.
[0085] The formats for the various types of TBRPF protocol messages are described below.
[0086]
[0087] Node i receives (step
[0088] If node i receives a message representing a link-state update, the discovery of a new neighbor node, the loss of a neighbor node, or a change in the cost of a link to a neighbor node, node i enters (step
[0089] If node i receives (step
[0090] Consider, for example, the case in which node i initially has no topology information. Accordingly, node i has no links to neighbor nodes, and its topology table TT_i is empty. Also the parent node is p_i(src)=NULL (i.e., not defined), the children_i(src) is the empty set, and sn_i(src)=0 for each source node src. Upon receiving (step
[0091] In response to the NEW PARENT message, then each neighbor node nbr informs (step
[0092] Node i cancels an existing parent p_i(src) by sending a CANCEL PARENT(src) message containing the identity of the source node src. Consequently, the set of children, children_i(src), at node i with respect to source node src is the set of neighbor nodes from which node i has received a NEW PARENT message containing the identity of source node src without receiving a subsequent CANCEL PARENT message for that source node src. Node i can also simultaneously select a neighbor node as the parent for multiple sources, so that the node i sends a NEW PARENT(src_list, sn_list) message to the new parent, where src_list is the list of source nodes and sn_list is the corresponding list of sequence numbers. Similarly, a CANCEL PARENT message can contain a list of sources.
[0093] In one embodiment, the TBRPF protocol does not use NEW PARENT and CANCEL PARENT messages in the generation of the minimum-hop-path tree. Instead, each node i computes the minimum-hop paths from each neighbor node nbr to all destinations (e.g., by using breadth-first search or Dijkstra's shortest-path algorithm). Consequently, each node i computes the parents p_nbr(src) for each neighbor node nbr and source node src, from which node i determines which neighbor nodes nbr are its children for the given source node src. Although this process eliminates NEW PARENT and CANCEL PARENT messages, the process also requires that each node i (1) sends all updates originating from the source node src to any child node in children_i(src), or (2) periodically sends updates along the minimum-hop-path tree, because node i does not know the sequence number sn_nbr(src) from the neighbor node nbr and thus does not know what updates the neighbor node nbr already has. Either of these actions ensures that each neighbor node nbr receives the most recent information for each link.
[0094]
[0095] Conversely, the children
[0096] In brief, the TBRPF protocol disseminates link-state updates generated by a source node src along the minimum-hop-path tree rooted at node src and dynamically updates the minimum-hop-path tree based on the topology and link-state information received along the minimum-hop-path tree. More specifically, whenever the topology table TT_i of node i changes, the node i computes its parent p_i(src) with respect to every source node src (see the procedure Update_Parents in Appendix A). The node i computes parents by (1) computing minimum-hop paths to all other nodes using, for example, Dijkstra's algorithm, and (2) selecting the next node on the minimum-hop path to each source node src to be the parent for that source node src (see the procedure Compute_New_Parents in Appendix A). The computation of parents occurs when the node i receives a topology update, establishes a link to a new neighbor node, or detects a failure or change in cost of a link to an existing neighbor node.
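By way of illustration only, the parent computation described above can be sketched in Python using breadth-first search, which yields minimum-hop paths when every link counts as one hop. This is not the Compute_New_Parents procedure of Appendix A; the function name compute_parents and the adjacency-map input (a symmetric map from each node to its set of neighbors, derived from the topology table) are assumptions made for this example.

```python
from collections import deque


def compute_parents(i, adjacency):
    """Return {src: p_i(src)}, where p_i(src) is the neighbor of node i
    that is the next node on a minimum-hop path from i to src.
    Unreachable sources are absent from the result (parent is NULL)."""
    parent = {}
    visited = {i}
    queue = deque()
    # Each neighbor of i is its own first hop.
    for nbr in sorted(adjacency.get(i, ())):
        visited.add(nbr)
        parent[nbr] = nbr
        queue.append(nbr)
    # Breadth-first search; a node inherits the first hop of the node
    # through which it was first reached.
    while queue:
        u = queue.popleft()
        for v in sorted(adjacency.get(u, ())):
            if v not in visited:
                visited.add(v)
                parent[v] = parent[u]
                queue.append(v)
    return parent


adjacency = {
    "i": {"a", "b"},
    "a": {"i", "c"},
    "b": {"i", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
print(compute_parents("i", adjacency))
```

Sorting the neighbors only makes tie-breaking between equal-hop paths deterministic in this sketch; any tie-breaking rule would satisfy the description.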
[0097] In one embodiment, node i computes a new parent p_i(src) for a given source node src even though the path to the source node src through the new parent has the same number of hops as the path to the source node src through the old parent. In another embodiment, the node keeps the old parent node in this event, thus reducing the overhead of the TBRPF protocol. This embodiment can be implemented, for example, by using the procedure Compute_New_Parents
[0098] If the parent p_i(src) changes, node i sends the message CANCEL PARENT(src) to the current (i.e., old) parent, if the old parent exists. Upon receiving the CANCEL PARENT(src) message, the old parent (“k”) removes node i from the list children_k(src).
[0099] Node i also sends the message NEW PARENT(src, sn) to the newly computed parent if the new parent exists, where sn=sn_i(src) is the sequence number of the most recent link-state update originating from source node src received by node i. This sequence number indicates the “position” up to which node i has received updates from the old parent, and indicates to the new parent that it should send only those updates that occurred subsequently (i.e., after that sequence number).
[0100] Upon receiving the NEW PARENT(src, sn) message, the new parent “j” for p_i(src) adds node i to the list children_j(src) and sends to node i a link-state update message consisting of all the link states originating from source node src in its topology table that have a sequence number greater than sn (see the procedure Process_New_Parent in Appendix A). Thus, only updates not yet known to node i are sent to node i.
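By way of illustration only, the handling of NEW PARENT and CANCEL PARENT messages at the receiving node can be sketched in Python as follows. This is not the Process_New_Parent procedure of Appendix A; the function names and the plain-dictionary representations of children_j(src) and of the topology table (mapping a link (u, v) to a (cost, sn) pair) are assumptions made for this example.

```python
def process_new_parent(children, topology, child, src, sn):
    """Handle NEW PARENT(src, sn) received from `child`.
    Adds the child to children(src) and returns the link-state updates
    originating from src whose sequence number is greater than sn."""
    children.setdefault(src, set()).add(child)
    return [
        (u, v, cost, link_sn)
        for (u, v), (cost, link_sn) in sorted(topology.items())
        if u == src and link_sn > sn
    ]


def process_cancel_parent(children, child, src):
    """Handle CANCEL PARENT(src) received from `child`."""
    children.get(src, set()).discard(child)


topology = {("s", "a"): (1, 3), ("s", "b"): (2, 7), ("t", "a"): (1, 9)}
children = {}
# Child "i" reports it has seen updates from "s" up to sequence number 5.
print(process_new_parent(children, topology, "i", "s", 5))
```

Only the link ("s", "b") is returned in this example, because its sequence number is the only one from source "s" that exceeds 5.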
[0101] Generally, the range of sequence numbers is large enough so that wraparound does not occur. However, if a small sequence number range is used, wraparound can be handled by employing infrequent periodic updates with a period that is less than half the minimum wraparound period, and by using a cyclic comparison of sequence numbers. That is, sn is considered less than sn′ if either sn is less than sn′ and the difference sn′−sn is less than half the largest possible sequence number, or sn′ is less than sn and the difference sn−sn′ is greater than half the largest possible sequence number.
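By way of illustration only, the cyclic comparison of sequence numbers can be written in Python as follows. The function name sn_less_than and the parameter max_sn (the largest possible sequence number) are names chosen for this example.

```python
def sn_less_than(sn, sn_prime, max_sn):
    """Cyclic comparison: True if sn is considered less than sn_prime."""
    half = max_sn / 2
    if sn < sn_prime:
        # Ordinary case: sn_prime is ahead by less than half the range.
        return (sn_prime - sn) < half
    if sn_prime < sn:
        # Wraparound case: sn_prime has wrapped past the maximum.
        return (sn - sn_prime) > half
    return False  # equal sequence numbers


MAX_SN = 65535  # example range, assuming a 16-bit sequence number
print(sn_less_than(10, 20, MAX_SN))     # True
print(sn_less_than(65530, 5, MAX_SN))   # True, since 5 follows a wraparound
print(sn_less_than(5, 65530, MAX_SN))   # False
```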
[0102] When a node i detects the existence of a new neighbor nbr, it executes Link_Up(i,nbr) to process this newly established link. The link cost and sequence number fields for this link in the topology table at node i are updated. Then, the corresponding link-state message is sent to all neighbors in children_i(i). As noted above, node i also recomputes its parent node p_i(src) for every node src, in response to this topological change. In a similar manner, when node i detects the loss of connectivity to an existing neighbor node nbr, node i executes Link_Down(i, nbr). Link_Change(i, nbr) is likewise executed at node i in response to a change in the cost to an existing neighbor node nbr. However, this procedure does not recompute parents.
[0103] In one embodiment, if a path between the node i and a given source node src ceases to exist, the node i computes a new parent p_i(src) that is set to NULL (i.e., parent does not exist). In another embodiment, although the path between the node i and the given source node src ceases to exist, the node i keeps the current parent, if the current parent is still a neighbor node of the node i. Thus, the overhead of the TBRPF protocol is reduced because it is unnecessary to send a CANCEL PARENT message and a subsequent NEW PARENT message if the old path to the source becomes operational later because of a link recovery. This embodiment can be implemented by replacing the fifth line of the procedure Update_Parents in Appendix A, “If (new_p_i(src)!=p_i(src)){”, with the line “If(new_p_i(src)!=p_i(src) and new_p_i(src)!=NULL){”.
[0104] The TBRPF protocol does not use an age field in link-state update messages. However, failed links (represented by an infinite cost) and links that are unreachable (i.e., links (u, v) such that p_i(u)=NULL) are deleted from the topology table TT_i after MAX_AGE seconds (e.g., 1 hour) in order to conserve memory. Failed links (u, v) are maintained for some time in the topology table TT_i, rather than deleted immediately, to ensure that the node i that changes its parent p_i(u) near the time of failure (or had no parent p_i(u) during the failure) is informed of the failure by the new parent.
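By way of illustration only, the deletion of failed and unreachable links after MAX_AGE seconds can be sketched in Python as follows. The text does not state how the age of such an entry is tracked; this sketch assumes a per-entry time stamp, here called stale_since, that records when the entry was first observed to be failed or unreachable. The function name purge_stale_links and the dictionary layout are likewise hypothetical.

```python
MAX_AGE = 3600.0  # seconds; the text gives 1 hour as an example
INFINITY = float("inf")


def purge_stale_links(node_id, topology, parent, now, max_age=MAX_AGE):
    """Delete links that have been failed or unreachable for max_age seconds.

    topology: {(u, v): {"cost": float, "sn": int, "stale_since": float or None}}
    parent:   {u: p_i(u) or None} for nodes u other than node_id
    Returns the list of deleted links."""
    removed = []
    for (u, v), entry in list(topology.items()):
        failed = entry["cost"] == INFINITY
        # Links whose head u has no parent are unreachable; the node's own
        # links are excluded because no parent is kept for the node itself.
        unreachable = u != node_id and parent.get(u) is None
        if not (failed or unreachable):
            entry["stale_since"] = None
            continue
        if entry["stale_since"] is None:
            entry["stale_since"] = now  # start aging the entry
        elif now - entry["stale_since"] >= max_age:
            del topology[(u, v)]
            removed.append((u, v))
    return removed
```

Because the entry is kept until the age limit is reached, a neighbor that selects this node as a new parent in the meantime can still be told about the failure.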
[0105] Unreachable links, (i.e., links (u, v) such that node i and node u are on different sides of a network partition), are maintained for a period of time to avoid having to rebroadcast the old link state for (u, v) throughout node i's side of the partition, if the network partition soon recovers, which can often happen if the network partition is caused by a marginal link that oscillates between the up and down states. If a link recovers resulting in the reconnection of two network components that were disconnected (i.e., partitioned) prior to the link recovery, the routing nodes
[0106] To correct this situation, in one embodiment, a header field is added to each link-state update message, which indicates whether the update message is sent in response to a NEW PARENT message. The header field also identifies the corresponding NEW PARENT message using a sequence number. For example, if a given node i sends a NEW PARENT message (for multiple sources) to node j following the recovery of the link (i, j), the node i waits for a response from node j to the NEW PARENT message before sending to node i's neighbor nodes an update message corresponding to the link recovery. The response from node j includes the link-state information of the other nodes
[0107] A node i that is turned off (or goes to sleep) operates as if the links to all neighbors have gone down. Thus, the node i remembers the link-state information that it had when turned off. Since all such links are either down or unreachable, these link states are deleted from the topology table TT_i if the node i awakens after being in sleep mode for more than MAX_AGE seconds.
[0108] Infrequent periodic updates occur to correct errors that may appear in table entries or in update messages. (See Send_Periodic_Updates in Appendix A.) As discussed above, periodic updates are also useful if the sequence number range is not large enough to avoid wraparound.
[0109] When a given routing node
[0110] A link-state update message reports the state of the link (src, nbr) as a tuple (src, nbr, c, sn), where c and sn are the cost and the sequence number associated with the update. A cost of infinity represents a failed link. The source node src is the head node of link (src, nbr), and is the only node that can report changes to parameters of link (src, nbr). Therefore, any node
[0111] The source node src maintains a counter sn_src, which is incremented by at least one each time the cost of one or more outgoing links (src, nbr) changes value. For example, the counter sn_src can be a time stamp that represents the number of seconds (or other units of time) elapsed from some fixed time. When the source node src generates a link-state update (src, nbr, c, sn), the sequence number sn is set to the current value of sn_src.
[0112] In brief, each routing node
[0113] In most link-state routing protocols, e.g., OSPF (Open Shortest Path First), each routing node
[0114] The TBRPF protocol may utilize bandwidth more efficiently by using unicast transmissions if those routing nodes
[0115] In general, each routing node
[0116] To avoid this possible drawback, one option is to use broadcast transmission if k>(n+1)/2 and unicast transmission in all other cases. In general, a rule of the form k>g(n) can be used. For update messages, the number of children k may be different for different update sources. Therefore, it is possible to use unicast transmissions for some sources and broadcast transmissions for other sources, and the transmission mode for a given source u, denoted mode_i(u), can change dynamically between unicast and broadcast as the number of children changes.
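By way of illustration only, the rule for choosing the transmission mode can be sketched in Python as follows. Here k is the number of children for a given update source, and n is assumed to be the number of neighbors of node i; the definition of n does not appear in the surrounding text as reproduced here, so that reading is an assumption. The function name select_mode and the threshold parameter g are hypothetical.

```python
BROADCAST = "BROADCAST"
UNICAST = "UNICAST"


def select_mode(k, n, g=lambda n: (n + 1) / 2):
    """Return the transmission mode for a source with k children at a node
    with n neighbors (assumed meaning of n), using the rule k > g(n)."""
    return BROADCAST if k > g(n) else UNICAST


# mode_i(u) is recomputed per source as the number of children changes.
children_count = {"u1": 3, "u2": 1}  # hypothetical counts per source
n = 4
mode = {u: select_mode(k, n) for u, k in children_count.items()}
print(mode)  # {'u1': 'BROADCAST', 'u2': 'UNICAST'}
```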
[0117] While LINK-STATE UPDATE messages can be transmitted in either unicast or broadcast mode, HELLO messages and HEARTBEAT messages (discussed below) are always transmitted on the broadcast channel, and the following messages are always transmitted on the unicast channel (to a single neighbor): NEIGHBOR, NEIGHBOR ACK, ACK, NACK, NEW PARENT, CANCEL PARENT, RETRANSMISSION OF BROADCAST, END OF BROADCAST, and LINK-STATE UPDATE messages sent in response to a NEW PARENT message.
[0118] Exemplary pseudo-code for a procedure for sending a LINK-STATE UPDATE message (that is not a response to a NEW PARENT message) on the broadcast or unicast channel is as follows:
[0119] If (mode_i(src)==BROADCAST)
[0120] Append the message update_msg to the message queue associated with the broadcast channel.
[0121] If (mode_i(src)==UNICAST)
[0122] For (each node k in children_i(src))
[0123] Append the message update_msg to the message queue associated with the unicast channel to node k.
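By way of illustration only, the pseudo-code above can be rendered in Python as follows. The class name UpdateSender and its attribute names are hypothetical; the queues are modeled as in-memory deques rather than actual channel interfaces.

```python
from collections import defaultdict, deque

BROADCAST = "BROADCAST"
UNICAST = "UNICAST"


class UpdateSender:
    """Queues LINK-STATE UPDATE messages that are not responses to a
    NEW PARENT message."""

    def __init__(self):
        self.mode = {}                            # mode_i(src)
        self.children = defaultdict(set)          # children_i(src)
        self.broadcast_queue = deque()            # broadcast channel queue
        self.unicast_queues = defaultdict(deque)  # one queue per neighbor

    def send_update(self, src, update_msg):
        if self.mode.get(src) == BROADCAST:
            self.broadcast_queue.append(update_msg)
        elif self.mode.get(src) == UNICAST:
            for k in self.children[src]:
                self.unicast_queues[k].append(update_msg)


sender = UpdateSender()
sender.mode["a"] = UNICAST
sender.children["a"] = {"x", "y"}
sender.send_update("a", ("a", "b", 1, 1))
print(sorted(sender.unicast_queues))  # ['x', 'y']
```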
[0124] Reliable unicast transmission of control packets can be achieved by a variety of reliable link-layer unicast transmission protocols that use sequence numbers and ACKs, and that retransmit a packet if an ACK is not received for that packet within a specified amount of time.
[0125] For reliable transmission of LINK-STATE UPDATE messages in broadcast mode, each broadcast update message includes one or more link-state updates, denoted lsu(src), originating from sources src for which the transmission mode is BROADCAST. Each broadcast control packet is identified by a sequence number that is incremented each time a new broadcast control packet is transmitted. Reliable transmission of broadcast control packets in TBRPF can be accomplished using either ACKs or NACKs. If ACKs are used, then the packet is retransmitted after a specified amount of time if an ACK has not been received from each neighbor node that must receive the message.
[0126] In one embodiment of TBRPF, NACKs are used instead of ACKs for reliable transmission of broadcast control packets, so that the amount of ACK/NACK traffic is minimized if most transmissions are successful. Suppose node i receives a NACK from a neighbor node nbr for a broadcast update message. In one embodiment, all updates lsu(src) in the original message, for each source node src such that neighbor node nbr belongs to children_i(src), are retransmitted (reliably) on the UNICAST channel to the neighbor node nbr, in a RETRANSMISSION OF BROADCAST message. This message includes the original broadcast sequence number to allow neighbor node nbr to process the updates in the correct order. In another embodiment, such update messages are retransmitted on the broadcast channel. This embodiment may improve the efficiency of the TBRPF protocol in subnets that do not support receiver-directed transmission, because in such subnets unicast transmission provides no efficiency advantage over broadcast transmissions.
[0127] The procedure for the reliable transmission of broadcast update packets uses the following message types (in addition to LINK-STATE UPDATE messages): HEARTBEAT(sn), NACK(sn, bit_map), and RETRANSMISSION OF BROADCAST(sn, update_msg). A NACK(sn, bit_map) message contains the sequence number (sn) of the last received broadcast control packet, and a 16-bit vector (bit_map) specifying which of the 16 broadcast control packets from sn-15 to sn have been successfully received.
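By way of illustration only, the 16-bit vector carried in NACK(sn, bit_map) can be sketched as follows. The function names build_nack_bitmap and missing_from_bitmap are hypothetical, and the bit ordering (bit k stands for packet sn-k) is an assumption, since the text above does not fix one. Sequence-number wraparound is not modeled.

```python
def build_nack_bitmap(sn, received):
    """Return a 16-bit integer for NACK(sn, bit_map).

    Assumed layout (not specified in the text): bit k is 1 when the
    broadcast control packet with sequence number sn - k was received.
    """
    bit_map = 0
    for k in range(16):
        if (sn - k) in received:
            bit_map |= 1 << k
    return bit_map


def missing_from_bitmap(sn, bit_map):
    """List the sequence numbers in [sn-15, sn] whose bit is 0, oldest first."""
    return [sn - k for k in range(15, -1, -1) if not (bit_map >> k) & 1]


# Packets 5..20 were received except 17 and 19.
received = set(range(5, 21)) - {17, 19}
bit_map = build_nack_bitmap(20, received)
print(hex(bit_map), missing_from_bitmap(20, bit_map))
```

A sender receiving this NACK would treat packets 17 and 19 as the ones to retransmit.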
[0128] A description of the procedure for the reliable transmission of broadcast update packets at node i uses the following exemplary notation:
[0129] Pkt(sn) represents a control packet with sequence number sn transmitted on the broadcast channel by node i.
[0130] MsgQ represents a message queue for new control messages to be sent on the broadcast channel from node i.
[0131] brdcst_sn_i represents the sequence number of the last packet transmitted on the broadcast channel by node i.
[0132] Heartbeat_Timer represents a timer used in the transmission of the HEARTBEAT message.
[0133] Following the transmission of the broadcast control packet Pkt(brdcst_sn_i) on the broadcast channel, node i increments brdcst_sn_i and reinitializes Heartbeat_Timer. When Heartbeat_Timer expires at node i, node i appends the control message HEARTBEAT(brdcst_sn_i) to the message queue associated with the broadcast channel, and reinitializes Heartbeat_Timer. When node i receives NACK(sn, bit_map) from neighbor node nbr, node i performs the functions illustrated by the following exemplary pseudo-code:
For each (sn' not received as indicated by bit_map){
  Let update_msg = {(src*, v*, sn*, c*) in Pkt(sn') such that the neighbor node nbr is in children_i(src*)}.
  Append the message RETRANSMISSION OF BROADCAST(sn', update_msg) to the message queue associated with the unicast channel to neighbor node nbr.
  (Message must be sent even if update_msg is empty.)
}
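The NACK handling at node i can be sketched in Python as shown below. This is a minimal sketch and not the implementation of the embodiment: handle_nack, sent_packets and the tuple layout of the queued messages are hypothetical, bit k of bit_map is assumed to stand for packet sn-k, and sequence numbers that node i never sent are simply skipped.

```python
def handle_nack(sn, bit_map, nbr, sent_packets, children):
    """Build the RETRANSMISSION OF BROADCAST messages for neighbor nbr.

    sent_packets: dict sequence number -> list of updates (src, v, sn, c)
    children:     dict src -> set of neighbors in children_i(src)
    Returns the messages to append to the unicast queue for nbr.
    """
    queue = []
    for k in range(15, -1, -1):              # oldest missing packet first
        missed_sn = sn - k
        if (bit_map >> k) & 1:               # packet was received
            continue
        if missed_sn not in sent_packets:    # assumption: never sent, skip
            continue
        update_msg = [u for u in sent_packets[missed_sn]
                      if nbr in children.get(u[0], set())]
        # The message is queued even when update_msg is empty.
        queue.append(("RETRANSMISSION_OF_BROADCAST", missed_sn, update_msg))
    return queue


sent_packets = {
    18: [("A", "B", 1, 5), ("C", "D", 1, 2)],
    19: [("C", "E", 2, 1)],
    20: [("A", "F", 3, 4)],
}
children = {"A": {"n1"}, "C": {"n2"}}
for msg in handle_nack(20, 0xFFF9, "n1", sent_packets, children):
    print(msg)
```

In the example, packets 18 and 19 are reported missing; only the update sourced at A is retransmitted to n1, and an empty message is still sent for packet 19 so that n1 can advance its sequence state.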
[0134] Upon receipt at neighbor node nbr of control packet Pkt(sn) transmitted on the broadcast channel by node i, the neighbor node nbr performs the following operations as illustrated by the following pseudo-code:
If the control packet Pkt(sn) is received in error{
  Append the control message NACK(sn, bit_map) to the message queue associated with the unicast channel to node i.
}
If the control packet Pkt(sn) is received out of order (i.e., at least one previous sequence number is skipped){
  Withhold the processing of the control packet Pkt(sn).
  Append the control message NACK(sn, bit_map′) to the message queue associated with the unicast channel to node i.
}
Else (control packet Pkt(sn) is received correctly and in order){
  For each Link-State Update message update_msg in Pkt(sn), call Process_Update(i, nbr, update_msg).
}
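The receiver-side behavior can be sketched as a small Python class. The class BroadcastReceiver and its fields are hypothetical. Two points are assumptions that go beyond the pseudo-code: duplicates of already processed packets are dropped, and withheld packets are released in order once the gap is filled (for example by a RETRANSMISSION OF BROADCAST carrying the original sequence number).

```python
class BroadcastReceiver:
    """Tracks broadcast control packets received from one neighbor (node i)."""

    def __init__(self, last_sn):
        self.last_sn = last_sn   # last sequence number processed in order
        self.held = {}           # sn -> updates withheld until the gap closes
        self.processed = []      # updates handed to Process_Update, in order
        self.nacks = []          # sequence numbers for which a NACK was queued

    def receive(self, sn, updates, corrupted=False):
        if corrupted:                    # received in error: NACK only
            self.nacks.append(sn)
            return
        if sn <= self.last_sn:           # assumption: drop duplicates
            return
        if sn > self.last_sn + 1:        # out of order: withhold and NACK
            self.held[sn] = updates
            self.nacks.append(sn)
            return
        self._process(sn, updates)
        # Assumption: release withheld packets once they are in order.
        while self.last_sn + 1 in self.held:
            nxt = self.last_sn + 1
            self._process(nxt, self.held.pop(nxt))

    def _process(self, sn, updates):
        self.processed.extend(updates)   # stands in for Process_Update calls
        self.last_sn = sn


r = BroadcastReceiver(last_sn=4)
r.receive(5, ["u5"])
r.receive(7, ["u7"])     # packet 6 was skipped: withheld, NACK queued
r.receive(6, ["u6"])     # gap filled: 6 and then 7 are processed
print(r.processed, r.nacks)
```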
[0135] When a communication link is established from node i to a new neighbor nbr, in one embodiment the node i obtains the current value of brdcst_sn_nbr from the NEIGHBOR message or NEIGHBOR ACK that was received from neighbor node nbr.
[0136] Each node i can dynamically select the transmission mode for link-state updates originating from each source node src. As described above, this decision uses a rule of the form k>g(n), where k is the number of children (for src) and n is the number of neighbors of node i. However, to ensure that updates are received in the correct order, or that the receiver has enough information to reorder the updates, node i sends an END OF BROADCAST(last_seq_no, src) message on the unicast channel to each child when the mode changes to UNICAST, and waits for all update packets sent on unicast channels to be ACKed before changing to BROADCAST mode.
[0137] To facilitate this process, each node i maintains a binary variable unacked_i(nbr, src) for each neighbor node nbr and source node src, indicating whether there are any unACKed control packets sent to neighbor node nbr containing link-state updates originating at source node src. The following exemplary pseudo-code illustrates an embodiment of a procedure that is executed periodically at each node i.
For each (node src){
  If (mode_i(src) = BROADCAST and |children_i(src)| <= g(n)){
    For each (node nbr in children_i(src)){
      Append the message END OF BROADCAST(brdcst_sn_i, src) to the message queue associated with the unicast channel to node nbr.
    }
    Set mode_i(src) = UNICAST.
  }
  If (mode_i(src) = UNICAST and |children_i(src)| > g(n)){
    Set switch_flag = YES.
    For each (node nbr in children_i(src)){
      If (unacked_i(nbr, src) = YES) Set switch_flag = NO.
    }
    If (switch_flag = YES) Set mode_i(src) = BROADCAST.
  }
}
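The periodic mode-selection procedure can be sketched in Python as follows. The function update_modes, its argument layout, and the example threshold g(n) = n // 2 are hypothetical; the text does not specify g. Because the two conditions in the pseudo-code are mutually exclusive, the sketch uses a single if/elif.

```python
def update_modes(mode, children, unacked, num_neighbors, g, brdcst_sn):
    """Run one pass of the mode-selection procedure at node i.

    mode:     dict src -> "UNICAST" or "BROADCAST" (modified in place)
    children: dict src -> set of neighbors in children_i(src)
    unacked:  dict (nbr, src) -> True if unACKed unicast updates remain
    Returns the END OF BROADCAST messages to queue, as (nbr, type, sn, src).
    """
    out = []
    threshold = g(num_neighbors)
    for src in mode:
        kids = children.get(src, set())
        if mode[src] == "BROADCAST" and len(kids) <= threshold:
            for nbr in sorted(kids):
                out.append((nbr, "END_OF_BROADCAST", brdcst_sn, src))
            mode[src] = "UNICAST"
        elif mode[src] == "UNICAST" and len(kids) > threshold:
            # Switch only when every child has ACKed its unicast updates.
            if not any(unacked.get((nbr, src), False) for nbr in kids):
                mode[src] = "BROADCAST"
    return out


mode = {"A": "BROADCAST", "B": "UNICAST", "C": "UNICAST"}
children = {"A": {"x"}, "B": {"x", "y", "z"}, "C": {"x", "y", "z"}}
unacked = {("y", "C"): True}
msgs = update_modes(mode, children, unacked, 4, lambda n: n // 2, 9)
print(msgs, mode)
```

Source C stays in UNICAST mode in this example because neighbor y still has an unACKed update for it.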
[0138] In one embodiment, a result of running the TBRPF protocol is that each router
[0139] For the full-topology link-state protocol embodiment (1) alternate paths and disjoint paths are immediately available, allowing faster recovery from failures and topology changes; and (2) paths can be computed subject to any combination of quality-of-service (QoS) constraints and objectives. Partial-topology link-state protocols provide each node
[0140] In one partial-topology embodiment, each routing node
[0141] Various rules can be used to define the set of special links in the list L_i. For example, one rule defines a link (i, j) to be in L_i only if node j is the parent of node i for some source node other than node j, or if node j belongs to the set children_i(src) for some source node src other than node i. This definition of special links includes enough links to provide minimum-hop paths between any pair of nodes. As a result, this partial-topology embodiment reduces the amount of control traffic without reducing the quality of the routes. In this embodiment, an update (u, v, c, sn, sp) is augmented to include an “sp” field (e.g., a single-bit field), which indicates whether the link (u, v) is a special link. Pseudo-code representing an exemplary implementation of the partial-topology embodiment appears in Appendix A, after the “Partial-Topology 1” header. The procedure Mark_Special_Links(i) is called upon a change to the parent p_i(src) or to the set of children nodes children_i(src).
[0142] In another partial-topology embodiment, each routing node
[0143] Upon receiving an update message, consisting of one or more updates (u, v, c), node i executes the procedure Update( ), which calls the procedure Update_Topology_Table( ), then executes the procedure Lex_Dijkstra( ) to compute the new source tree Ti and the procedure Generate_Updates( ) to generate updates and modify the set of reported links Ri based on changes in link costs and changes to the source tree Ti. Each generated update is then sent to the appropriate children, that is, updates for links with head u are sent to children_i(u). The procedure Update_Parents( ) is called, which determines any changes in the parent assignment and sends NEW PARENT and CANCEL PARENT messages.
[0144] The sending of updates can be accomplished in different ways, depending on whether the subnet
[0145] The procedure Update_Topology_Table( ) does the following for each update (u, v, c) in the input message (in-message) such that the parent p_i(u) is the neighbor node that sent the message. (Updates received from a node other than the parent are ignored.) If either TT_i does not contain an entry for (u, v) or contains an entry with a different cost than c, then TT_i(u, v) is updated with the new value c and link (u, v) is marked as changed. If the input message is a PARENT RESPONSE, then in addition to updates, the message contains the same list of sources as the NEW PARENT message to which it is responding. For each such source node u such that pending_i(u)=1 and for each link (u, v) in TT_i that is outgoing from source node u but for which the input message does not contain an update, the cost of (u, v) is set to infinity, to indicate that the link should be deleted. In other words, any link that was reported by the old parent but is not reported by the new parent is deleted. Only information from the current parent is considered valid.
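The behavior described for Update_Topology_Table( ) can be sketched in Python as shown below. This is an illustrative sketch, not the Appendix A code: the function signature and the dictionary layout are hypothetical, and marking a link as changed when its cost is set to infinity is an assumption.

```python
import math


def update_topology_table(tt, parent, sender, updates,
                          response_sources=(), pending=None):
    """Apply the updates in one input message to the topology table.

    tt:               dict (u, v) -> cost, modified in place
    parent:           dict u -> current parent p_i(u)
    sender:           neighbor that sent the message
    updates:          iterable of (u, v, c)
    response_sources: sources listed in a PARENT RESPONSE (empty otherwise)
    pending:          dict u -> 1 while a response for u is awaited
    Returns the set of links marked as changed.
    """
    pending = pending or {}
    changed, reported = set(), set()
    for u, v, c in updates:
        if parent.get(u) != sender:      # not from the parent: ignored
            continue
        reported.add((u, v))
        if tt.get((u, v)) != c:          # new entry or different cost
            tt[(u, v)] = c
            changed.add((u, v))
    for u in response_sources:
        if pending.get(u) != 1:
            continue
        for link in list(tt):
            # Links out of u that the new parent did not report are deleted.
            if link[0] == u and link not in reported and tt[link] != math.inf:
                tt[link] = math.inf
                changed.add(link)        # assumption: deletion counts as a change
    return changed


tt = {("A", "B"): 1, ("A", "C"): 2}
parent = {"A": "n1", "D": "n2"}
changed = update_topology_table(
    tt, parent, "n1", [("A", "B", 3), ("D", "E", 1)],
    response_sources=["A"], pending={"A": 1})
print(sorted(changed), tt)
```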
[0146] The procedure Lex_Dijkstra( ) (not included in Appendix A) is an implementation of Dijkstra's algorithm that computes the lexicographically smallest shortest path LSP(i, u) from node i to each node u, using as path name the sequence of nodes in the path in the reverse direction. For example, the next-to-last node of LSP(i, u) has the smallest node ID among all possible choices for the next-to-last node. Such paths are computed using a modification of Dijkstra's algorithm in which, if there are multiple choices for the next node to label, the one with the smallest ID is chosen.
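Since Lex_Dijkstra( ) is not included in Appendix A, the following Python sketch shows one possible way to obtain the described tie-breaking; it is not asserted to be the procedure of the embodiment. The function name lex_dijkstra and the adjacency-dictionary input are hypothetical, and the sketch assumes strictly positive link costs and node IDs that can be compared. Among all predecessors that yield the same shortest distance to a node, the one with the smallest ID is kept, so the next-to-last node of each path has the smallest possible ID.

```python
import heapq


def lex_dijkstra(graph, root):
    """Shortest-path tree from root with smallest-ID predecessor on ties.

    graph: dict u -> dict v -> cost (strictly positive costs assumed)
    Returns (dist, pred), where pred[u] is the next-to-last node on the
    chosen path from root to u, and pred[root] is None.
    """
    dist = {root: 0}
    pred = {root: None}
    heap = [(0, root)]       # ties on distance pop the smaller node ID first
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, c in graph.get(u, {}).items():
            nd = d + c
            if (v not in dist or nd < dist[v]
                    or (nd == dist[v] and u < pred[v])):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, pred


# Node 4 is reachable at cost 3 through node 5 or node 3; node 3 is chosen.
graph = {1: {5: 1, 3: 2}, 5: {4: 2}, 3: {4: 1}}
dist, pred = lex_dijkstra(graph, 1)
print(dist, pred)
```

With positive costs, every equal-cost predecessor of a node is labeled before that node is finalized, so the comparison in the relaxation step sees all candidates.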
[0147] The procedure Generate_Updates( ) decides what updates to include in the message to be sent to neighbor nodes. A non-delete update is included for any link (u, v) that is in the new source tree Ti and either is marked as changed or was not in the previous source tree (denoted old source tree Ti). In this case, Ti(u, v).c′ is set to Ti(u, v).c, and (u,v) is added to the reported link set Ri if not already in the reported link set Ri. A delete update is included for any link (u, v) that is in the reported link set Ri but is not in the source tree Ti, such that TT_i(u, v).c>TT_i(u,v).c′. In this case, (u, v) is removed from the reported link set Ri. Any links with infinite cost are erased from the topology table TT_i.
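The selection rules of Generate_Updates( ) can be sketched in Python as follows. The function signature, the separate reported_cost dictionary standing in for the c′ field, and the ("UPDATE", ...) and ("DELETE", ...) tuples are hypothetical; the text does not say how a delete update is encoded.

```python
import math


def generate_updates(tt_cost, reported_cost, new_tree, old_tree,
                     changed, reported):
    """Decide which updates to send after the source tree is recomputed.

    tt_cost:       dict link -> current cost c (modified in place)
    reported_cost: dict link -> last reported cost c' (modified in place)
    new_tree, old_tree: sets of links in the new and previous source tree
    changed:       set of links marked as changed
    reported:      set of reported links R_i (modified in place)
    """
    updates = []
    for link in sorted(new_tree):
        if link in changed or link not in old_tree:
            reported_cost[link] = tt_cost[link]
            reported.add(link)
            updates.append(("UPDATE", link, tt_cost[link]))
    for link in sorted(reported - new_tree):
        if tt_cost[link] > reported_cost[link]:
            reported.discard(link)
            updates.append(("DELETE", link))
    # Links with infinite cost are erased from the topology table.
    for link in [l for l, c in tt_cost.items() if c == math.inf]:
        del tt_cost[link]
    return updates


tt_cost = {("A", "B"): 1, ("A", "C"): math.inf, ("B", "D"): 2}
reported_cost = {("A", "C"): 3, ("B", "D"): 2}
reported = {("A", "C"), ("B", "D")}
out = generate_updates(
    tt_cost, reported_cost,
    new_tree={("A", "B"), ("B", "D")},
    old_tree={("A", "C"), ("B", "D")},
    changed={("A", "C")},
    reported=reported)
print(out, sorted(reported))
```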
[0148] The procedure Update_Parents( ) sets the new parent p_i(u) for each source node u to be the second node on the shortest path to node u. If there is no path to node u, p_i(u) is null. If the new parent is different from the old parent, then a NEW PARENT message is sent to the new parent (if it is not null) and a CANCEL PARENT message is sent to the old parent (if it is not null and the link to the old parent is still up). The NEW PARENT messages for all source nodes u having the same new parent are combined into a single message, and CANCEL PARENT messages are similarly combined.
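The parent selection and message batching described for Update_Parents( ) can be sketched in Python as shown below. The names update_parents and second_node and the predecessor-map input are hypothetical; the predecessor map is assumed to come from a shortest-path computation rooted at node i.

```python
def second_node(pred, root, u):
    """Return the second node on the path root -> u, or None if no path."""
    node = u
    while pred.get(node) is not None:
        if pred[node] == root:
            return node
        node = pred[node]
    return None


def update_parents(root, pred, old_parent, neighbors_up):
    """Compute new parents and the batched NEW PARENT / CANCEL PARENT lists.

    pred:         dict u -> predecessor of u on the shortest path from root
    old_parent:   dict u -> previous parent (or None)
    neighbors_up: set of neighbors whose link is still up
    Returns (new_parent, new_msgs, cancel_msgs); each message dict maps a
    neighbor to the list of sources combined into one message.
    """
    new_parent, new_msgs, cancel_msgs = {}, {}, {}
    for u in sorted(set(pred) | set(old_parent)):
        if u == root:
            continue
        p = second_node(pred, root, u)
        new_parent[u] = p
        old = old_parent.get(u)
        if p != old:
            if p is not None:
                new_msgs.setdefault(p, []).append(u)
            if old is not None and old in neighbors_up:
                cancel_msgs.setdefault(old, []).append(u)
    return new_parent, new_msgs, cancel_msgs


pred = {"i": None, "a": "i", "b": "i", "c": "a", "d": "c"}
old_parent = {"a": "a", "b": "b", "c": "b", "d": "b", "e": "b"}
print(update_parents("i", pred, old_parent, {"a", "b"}))
```

Source e has no path in this example, so its parent becomes null and only a CANCEL PARENT entry is produced for it.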
[0149] The procedure Process_New_Parent( ) is executed when a NEW PARENT message is received from some neighbor node. For each source node u in the NEW PARENT message, the procedure adds the neighbor node to children_i(u) and includes in the PARENT RESPONSE message an update for each link (u, v) in the source tree Ti whose head is source node u, if such a link exists. (Such a link will not exist if node u is a leaf of source tree Ti.) As described above, the PARENT RESPONSE also includes the same list of sources as the NEW PARENT message to which it is responding. (This list is not necessary if the node sending the NEW PARENT message remembers the list and can match the PARENT RESPONSE to the NEW PARENT message.)
[0150] When the cost of a link to a neighbor node j changes, node i sets TT_i(i, j).c to the new cost and calls the procedure Update( ) with k=i and an empty input message. A threshold rule can be used so that TT_i(i, j).c is updated only if the percent difference between the new cost and the old cost is at least some given threshold. If a link to a neighbor node j fails, the same procedure is followed (with the cost changing to infinity), and node j is removed from the set of neighbor nodes Ni.
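The threshold rule can be sketched in Python as follows. The function name should_update_cost is hypothetical, and measuring the percent difference relative to the old cost is an assumption; the text does not state the reference value. A change to or from infinite cost (link failure or recovery) is always treated as significant in this sketch.

```python
import math


def should_update_cost(old_cost, new_cost, threshold_pct):
    """Return True if the cost change is large enough to trigger Update()."""
    if old_cost == new_cost:
        return False
    if math.isinf(old_cost) or math.isinf(new_cost) or old_cost == 0:
        return True                      # failure, recovery, or no baseline
    percent_diff = abs(new_cost - old_cost) / old_cost * 100
    return percent_diff >= threshold_pct


print(should_update_cost(10, 11, 20))        # 10 percent change
print(should_update_cost(10, 13, 20))        # 30 percent change
print(should_update_cost(10, math.inf, 20))  # link failure
```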
[0151] When a link to a neighbor node j comes up, either initially or upon recovering from a failure, node i executes the procedure Link_Up(i, j), which adds neighbor node j to the set of neighbor nodes Ni, sets TT_i(i, j).c to the link cost, and calls the procedure Update( ) with k=i and an empty input message. This may result in a NEW PARENT message being sent to neighbor node j.
[0152] To correct errors that may appear in TT_i due to noisy transmissions or memory errors, each node i can periodically generate updates for its outgoing links. Since a received update is ignored unless it has a cost that differs from the entry in the topology table TT_i, the cost of the periodic update should be chosen to be slightly different from the previous update. Alternatively, each update can contain an additional bit b, which toggles with each periodic update.
[0153]
[0154] If link (B, D) fails, as shown in
[0155] To disseminate link-state updates to the appropriate nodes in the subnet
[0156] In one embodiment, the TBRPF protocol messages are sent via the User Datagram Protocol (UDP), which requires an official UDP-service port-number registration. The use of UDP/IPv4 provides several advantages over a data link level approach, including (1) IPv4 segmentation/reassembly facilities, (2) UDP checksum facilities, (3) simplified application level access for routing daemons, and (4) IPv4 multicast addressing for link state messages.
[0157] TBRPF protocol messages are sent to the IPv4 unicast address of a current neighbor or to the “All_TBRPF_Neighbors” IPv4 multicast address, presuming that an official IPv4 multicast address is assigned to “All_TBRPF_Neighbors.” In general, a message is sent to the IPv4 unicast address of a current neighbor node if all components of the message pertain only to that neighbor. Similarly, a message is sent to the All_TBRPF_Neighbors IPv4 multicast address if the message contains components which pertain to more than one neighbor. Nodes
[0158] Actual addressing strategies depend on the underlying data link layer. For example, for data links such as IEEE 802.11, a single, multiple access channel is available for all unicast and broadcast/multicast messages. In such cases, since channel occupancy for unicast and multicast messages is identical, it is advantageous to send a single message to the All_TBRPF_Neighbors multicast address rather than multiple unicast messages, even if the message contains components that pertain to only a subset of the current neighbor nodes. In other cases, in which point-to-point receiver directed channels are available, sending multiple unicast messages may reduce contention on the multiple access broadcast channel.
[0159]
[0160] The message header
[0161] The type field
[0162] ACK
[0163] NACK
[0164] NEW_PARENT
[0165] CANCEL_PARENT
[0166] HEARTBEAT
[0167] END_OF_BROADCAST
[0168] LINK_STATE_UPDATE_A
[0169] LINK_STATE_UPDATE_B
[0170] RETRANSMISSION_OF_BROADCAST
[0171] The version field
[0172] The mode field
[0173] UNICAST
[0174] BROADCAST
[0175] Messages of type ACK, NACK, NEW_PARENT, CANCEL_PARENT, RETRANSMISSION_OF_BROADCAST, and END_OF_BROADCAST are sent as UNICAST. Messages of type LINK_STATE_UPDATE_A and LINK_STATE_UPDATE_B may be sent as either UNICAST or BROADCAST.
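The relationship between message type and transmission mode can be sketched in Python as shown below. The enum classes and the function allowed_modes are hypothetical, and no numeric field encodings are implied, since none are given in the text. HEARTBEAT is not covered by the sentence above; the sketch treats it as BROADCAST on the inference that paragraph [0133] queues it on the broadcast channel.

```python
from enum import Enum


class Mode(Enum):
    UNICAST = "UNICAST"
    BROADCAST = "BROADCAST"


class MsgType(Enum):
    ACK = "ACK"
    NACK = "NACK"
    NEW_PARENT = "NEW_PARENT"
    CANCEL_PARENT = "CANCEL_PARENT"
    HEARTBEAT = "HEARTBEAT"
    END_OF_BROADCAST = "END_OF_BROADCAST"
    LINK_STATE_UPDATE_A = "LINK_STATE_UPDATE_A"
    LINK_STATE_UPDATE_B = "LINK_STATE_UPDATE_B"
    RETRANSMISSION_OF_BROADCAST = "RETRANSMISSION_OF_BROADCAST"


UNICAST_ONLY = {
    MsgType.ACK, MsgType.NACK, MsgType.NEW_PARENT, MsgType.CANCEL_PARENT,
    MsgType.RETRANSMISSION_OF_BROADCAST, MsgType.END_OF_BROADCAST,
}
EITHER_MODE = {MsgType.LINK_STATE_UPDATE_A, MsgType.LINK_STATE_UPDATE_B}


def allowed_modes(msg_type):
    """Return the set of transmission modes permitted for a message type."""
    if msg_type in UNICAST_ONLY:
        return {Mode.UNICAST}
    if msg_type in EITHER_MODE:
        return {Mode.UNICAST, Mode.BROADCAST}
    return {Mode.BROADCAST}   # HEARTBEAT: inferred, not stated in [0175]


for t in MsgType:
    print(t.name, sorted(m.name for m in allowed_modes(t)))
```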
[0176] The number of sources field
[0177] The offset field
[0178] The sequence number field
[0179] The receiver identification field