Title:
Mobile ad hoc extensions for the internet
Document Type and Number:
Kind Code:
A1

Abstract:
Described is an internetworking system having various mobile ad hoc extensions to the Internet that are particularly suited to the dynamic environment of mobile ad hoc networks. The internetworking system includes any combination of a link-state routing protocol for disseminating topology and link-state information over a multi-hop network comprised of nodes, a neighbor discovery protocol that can detect the appearance and disappearance of new neighbor nodes, an address format that facilitates deployment of IPv6 nodes in a predominantly IPv4 network infrastructure, a queuing mechanism that can update information upon resuming interrupted communications between nodes, and dynamic network measurement techniques for adaptively using wireless bandwidth when establishing and maintaining connections between nodes and a server.

Inventors:
Ogier, Richard G. (Half Moon Bay, CA, US)
Woodworth, Carla Peccolo (San Mateo, CA, US)
Templin, Fred Lambert (Portola Valley, CA, US)
Bellur, Bhargav R. (Fremont, CA, US)
Arnold, James A. (Helena, MT, US)
Seaton, Scott D. (Fremont, CA, US)
Frandsen, Michael (Helena, MT, US)
Williams, Nathan W. (Helena, MT, US)
Gellrich, Christian A. (Redwood City, CA, US)

Application Number:
09/728211
Publication Date:
01/31/2002
Filing Date:
12/01/2000
Primary Class:
Other Classes:
370/466, 714/749, 709/239, 370/252
International Classes:
(IPC1-7): H04J003/22; H04J003/14
Attorney, Agent or Firm:
TESTA, HURWITZ & THIBEAULT, LLP (HIGH STREET TOWER, BOSTON, MA, 02110, US)
Claims:

What is claimed is:



1. A wireless mobile network comprising: a mobile client in communication with a server over a first wireless route through the network; routing nodes communicating with each other according to a protocol by which each routing node disseminates link-related information to zero or more neighbor nodes based on a tree developed and maintained by that routing node, the routing nodes determining that a link-state change in the first wireless route has interrupted communications between the mobile client and the server, and that the mobile client has selected an alternative wireless route through the network; and a queue storing communications affected by the interruption and transmitting such communications to the client and the server to resume communications between the client and the server over the alternative wireless route from the point of interruption.

2. The wireless mobile network of claim 1 further comprising a processing system measuring at least one network parameter during network operation for use in adapting communications between the client and the server to current network conditions.

3. The wireless mobile network of claim 2 wherein the measured network parameter is packet loss, and the processing system adjusts a length of packets transmitted between the client and the server in response to the measured packet loss.

4. The wireless mobile network of claim 2 wherein the measured network parameter is a round-trip time for a transmitted packet, and in response to the measured round-trip time the processing unit adjusts a time-out period for which a sender of a transmitted packet awaits a corresponding acknowledgment.

5. The wireless mobile network of claim 2 wherein the processing unit determines a number of retransmissions of an unacknowledged packet before an attempt is made to reestablish a connection between the client and the server in response to the measured network parameter.

6. The wireless mobile network of claim 5 wherein the processing unit determines a number of attempts to reestablish a connection between the client and the server if the transmitted packet remains unacknowledged after the number of retransmissions.

7. The wireless mobile network of claim 1 further comprising a node generating an IPv6 packet for transmission to a destination node, the IPv6 packet including an address having a globally aggregatable IPv6 address prefix and an IPv6-compatible interface identifier that contains an embedded IPv4 address associated with the destination node.

8. A wireless mobile network comprising: a first node generating an IPv6 packet for transmission to a destination node, the IPv6 packet including an address of the destination node, the address having a globally aggregatable IPv6 address prefix and an IPv6-compatible interface identifier that contains an embedded IPv4 address associated with the destination node; and routing nodes communicating with each other according to a protocol by which each routing node disseminates routing information to zero or more neighbor nodes based on a broadcast tree maintained in part by that routing node, the routing nodes determining a path to the destination node based on the routing information, the IPv6 address prefix, and the IPv4 address embedded within the IPv6-compatible interface identifier.

9. The wireless mobile network of claim 8 further comprising a processing system measuring at least one network parameter during network operation for use in adapting communications of the first node to current network conditions.

10. The wireless mobile network of claim 9 wherein the measured network parameter is packet loss, and the processing system adjusts a length of packets transmitted by the first node in response to the measured packet loss.

11. The wireless mobile network of claim 9 wherein the measured network parameter is a round-trip time for a transmitted packet, and in response to the measured round-trip time the processing unit adjusts a time-out period for which the first node awaits an acknowledgment for a transmitted packet.

12. The wireless mobile network of claim 9 wherein the processing unit in response to the measured network parameter determines a number of retransmissions of an unacknowledged packet before an attempt is made to reestablish a connection to the first node.

13. The wireless mobile network of claim 12 wherein the processing unit determines a number of attempts to establish a connection with the first node if the transmitted packet remains unacknowledged after the number of retransmissions.

Description:

RELATED APPLICATION

[0001] This application claims the benefit of the filing date of co-pending U.S. Provisional Application, Serial No. 60/190,358, filed Mar. 16, 2000, entitled “An IPv4 Compatibility Aggregatable Global Unicast Address Format,” co-pending U.S. Provisional Application, Serial No. 60/232,047, filed Sep. 12, 2000, entitled “Techniques for Improved Topology Based on Reverse-Path Forwarding,” co-pending U.S. Provisional Application, Serial No. 60/232,046, filed Sep. 12, 2000, entitled “Reduced Overhead Hello Protocol,” and co-pending U.S. Provisional Application, Serial No.______, filed Nov. 14, 2000, entitled “Efficient Routing Protocols for Packet-Radio Networks Based on Tree Sharing”, the entirety of which provisional applications are incorporated by reference herein.

GOVERNMENT SUPPORT

[0002] This invention was funded with government support under Contract No. DAAB07-96-D-H002, awarded by the U.S. Army Communications and Electronics Command. The United States government has certain rights to this invention.

BACKGROUND

[0003] A network is a collection of communications entities (e.g., hosts, routers, and gateways) that are in communication with each other over communication links. Organizing communications entities into networks increases the capabilities of the communications entities beyond that of which each communications entity alone is capable by enabling such entities to share resources. A network that interconnects communications entities within a common geographical area (for example, the personal computers in an office) is called a local area network (LAN). Some LANs employ one or more network servers that direct the flow of data within the network and control access to certain network functions such as storing data in a central file repository, printing, and accessing other networks. In other LANs, computers communicate with each other without the use of servers.

[0004] A wide area network (WAN), of which the Internet is an example, is a collection of geographically distributed LANs joined by long-range communication links. The Internet is a publicly accessible, worldwide network of networks based upon a transmission protocol known as TCP/IP (Transmission Control Protocol/Internet Protocol). Communication on the Internet is packet-switched; that is, the information that is to pass from one communications entity to another is broken into packets that are individually passed from router to router until the packets arrive at their destination. The TCP divides the data into segments and provides reliable delivery of bytes in the segments to the destination, which reconstructs the data. The IP further subdivides the TCP segments into packets and routes the packets to their final destination. The route taken by packets may pass through one or more networks, depending upon the Internet Protocol (IP) address of the destination.

[0005] A rapidly growing part of the Internet is the World Wide Web (“Web”), which operates according to a client-server model. Client software, commonly referred to as a Web browser, runs on a computer system. After establishing an Internet connection, the client user launches the Web browser to communicate with a Web server on the Internet. Using TCP/IP, the Web browser sends HTTP (Hypertext Transfer Protocol) requests to the Web server. The request traverses the Internet's TCP/IP infrastructure to the Web host server as HTTP packets.

[0006] A private network based on Internet technology and consisting of a collection of LAN and WAN components is called an intranet. Accordingly, communications entities that are part of an intranet can use a Web browser to access Web servers that are within the intranet or on the Internet.

[0007] Today, most of the communication links between the various communications entities in a network are wire-line; that is, client systems are typically connected to a server and to other client systems by wires, such as twisted-pair wires, coaxial cables, fiber optic cables, and the like. Wireless communication links, such as microwave links, radio frequency (RF) links, infrared (IR) links, and satellite links, are becoming more prevalent in networks.

[0008] A characteristic of wireless networks is that the communication entities in the network are mobile. Such mobility creates frequent, dynamic changes to the network topology and state of the communication links between the communication entities. Mobility is less of a concern for those communication entities connected to the Internet by wire-line; however, the topology of the Internet is perpetually changing, with communication entities joining and leaving the Internet often. Also, the state of communication links between communication entities on the Internet may change for various reasons, such as increased packet traffic.

[0009] To effectively route messages through such dynamically changing networks, routers need to remain informed of topology and link-state changes. Existing methods based on flooding are inefficient and consume too much network bandwidth. The inefficiency of flooding is the result, in part, of the following redundancies: (1) link-state and topology updates are sent over multiple paths to each router; and (2) every router forwards every update to all neighboring routers, even if only a small subset of the neighboring routers need to receive it.

[0010] The routing of update information and of data packets is further complicated by the heterogeneous infrastructure of the Internet. Currently, most communications entities on the Internet exchange messages using the Internet Protocol Version 4 (or IPv4), but an increasing number of communications entities that communicate using the Internet Protocol Version 6 (or IPv6) are being deployed. IPv6 is a second generation Internet Protocol designed to supplant IPv4, but is expected to coexist with IPv4 until the transition to IPv6 is complete. In general, the IP versions are incompatible: IPv4 routers cannot route IPv6 messages, nor can IPv6 routers route IPv4 messages. Instead, special routers that implement both the IPv4 and IPv6 protocols in a “dual-stack” configuration are required to support the coexistence and transition phase.

[0011] Another difficulty presented by the mobility of the communications entities is that the movement of one communication entity can interrupt on-going communications with another entity. For example, a portable laptop computer with a wireless link by which it is communicating with a Web server on the Internet may be moved so that the link to the network, and thus to the Web server, is broken. In general, the loss of the link irretrievably causes the loss of any information being transmitted to the computer, although the laptop computer may later regain the link or establish a new link to the network. After reconnecting to the network, the laptop computer must reestablish communications with the Web server. The on-going communications are lost.

[0012] Thus, there remains a need for a mobile wireless network that can perform reliably and efficiently despite the aforementioned difficulties associated with the mobility of the communication entities in the network.

SUMMARY OF THE INVENTION

[0013] An objective of the invention is to enable seamless movement by mobile nodes from network to network. Another objective is to facilitate the addition of devices to networks. Yet another objective is to improve robustness of communications and connections in wireless networks comprised of mobile nodes.

[0014] In one aspect, the invention features a wireless mobile network comprising a mobile client in communication with a server over a first wireless route through the network. Routing nodes communicate with each other according to a protocol by which each routing node disseminates link-related information to zero or more neighbor nodes based on a tree developed and maintained by that routing node. The routing nodes determine that a link-state change in the first wireless route has interrupted communications between the mobile client and the server, and that the mobile client has selected an alternative wireless route through the network. A queue stores communications affected by the interruption and transmits such communications to the client and the server so that communications can resume between the client and the server over the alternative wireless route from the point of interruption.

[0015] In one embodiment, the wireless mobile network further comprises a processing system that measures at least one network parameter during network operation for use in adapting communications between the client and the server to current network conditions. The measured network parameter can be, for example, packet loss and round-trip time. The processing system can adjust a length of packets transmitted between the client and the server in response to the measured packet loss or adjust a time-out period for which a sender of a transmitted packet awaits a corresponding acknowledgment.
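
The adaptation described in this embodiment can be illustrated with a minimal Python sketch. It is not the method of the invention: the class name LinkAdapter, the loss thresholds, the length bounds, and the use of an exponentially weighted moving average for the round-trip time are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class LinkAdapter:
    """Hypothetical helper that adapts packet length and ack time-out."""
    packet_len: int = 1024   # bytes; assumed starting length
    min_len: int = 128       # assumed lower bound
    max_len: int = 1500      # assumed upper bound
    srtt: float = 0.5        # smoothed round-trip time in seconds
    timeout: float = 1.0     # current acknowledgment time-out in seconds

    def on_loss_sample(self, loss_rate: float) -> int:
        """Halve the packet length under heavy loss, grow it under light loss."""
        if loss_rate > 0.10:      # assumed "high loss" threshold
            self.packet_len = max(self.min_len, self.packet_len // 2)
        elif loss_rate < 0.01:    # assumed "low loss" threshold
            self.packet_len = min(self.max_len, self.packet_len + 128)
        return self.packet_len

    def on_rtt_sample(self, rtt: float, alpha: float = 0.125) -> float:
        """Smooth the measured RTT and set the time-out to twice the result."""
        self.srtt = (1 - alpha) * self.srtt + alpha * rtt
        self.timeout = 2.0 * self.srtt   # assumed safety multiplier
        return self.timeout


adapter = LinkAdapter()
adapter.on_loss_sample(0.20)   # heavy loss: shorter packets
adapter.on_rtt_sample(1.3)     # slower path: longer time-out
```

Shorter packets are less likely to be corrupted on a lossy wireless link, and a time-out that tracks the measured round-trip time avoids both premature retransmission and long idle waits.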

[0016] The processing unit can also make other determinations that serve to adapt communications between the client and the server in the wireless mobile network, such as determining the number of retransmissions of an unacknowledged packet before an attempt is made to reestablish a connection between the client and the server in response to the measured network parameter, and determining the number of attempts to reestablish a connection between the client and the server if the transmitted packet remains unacknowledged after the number of retransmissions.

[0017] In one embodiment, the wireless mobile network also includes a node that generates an IPv6 packet for transmission to a destination node. The IPv6 packet includes an address having a globally aggregatable IPv6 address prefix and an IPv6-compatible interface identifier that contains an embedded IPv4 address associated with the destination node. The format of the address achieves compatibility between IPv6 and IPv4, enabling IPv6 packets to be routed through IPv6 routing infrastructure or to be tunneled through IPv4 routing infrastructure.
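
As a rough illustration of such an address, the following Python sketch combines a 64-bit IPv6 prefix with a 64-bit interface identifier whose low-order 32 bits carry an IPv4 address. The function names, the placement of the IPv4 address in the low-order bits, and the marker value in the upper half of the interface identifier are assumptions made for this sketch; the actual bit layout is the one shown in the figures.

```python
import ipaddress


def make_compat_address(prefix: str, ipv4: str, marker: int = 0) -> ipaddress.IPv6Address:
    """Build an IPv6 address whose interface identifier embeds an IPv4 address.

    `marker` is a hypothetical 32-bit value for the upper half of the
    interface identifier; it is not taken from the source text.
    """
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch assumes a 64-bit prefix")
    v4 = int(ipaddress.IPv4Address(ipv4))
    interface_id = ((marker & 0xFFFFFFFF) << 32) | v4
    return ipaddress.IPv6Address(int(net.network_address) | interface_id)


def embedded_ipv4(addr: ipaddress.IPv6Address) -> ipaddress.IPv4Address:
    """Recover the IPv4 address from the low-order 32 bits."""
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


# Documentation-only prefixes and addresses (RFC 3849 and RFC 5737 ranges).
addr = make_compat_address("2001:db8:1:2::/64", "192.0.2.1")
v4 = embedded_ipv4(addr)
```

A router that understands IPv6 can forward on the prefix alone, while a node at the edge of IPv4-only infrastructure can extract the embedded IPv4 address and use it as a tunnel endpoint.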

[0018] In another aspect, the invention features a network comprising a node that generates an IPv6 packet for transmission to a destination node. The IPv6 packet includes an address of the destination node. This address has a globally aggregatable IPv6 address prefix and an IPv6-compatible interface identifier that contains an embedded IPv4 address associated with the destination node. The network also includes routing nodes that communicate with each other according to a protocol by which each routing node disseminates routing information to zero or more neighbor nodes based on a broadcast tree maintained in part by that routing node. The routing nodes determine a route to the destination node based on the routing information, the IPv6 address prefix, and the IPv4 address embedded within the IPv6-compatible interface identifier.

[0019] In one embodiment, the wireless mobile network includes a processing system measuring at least one network parameter during network operation for use in adapting communications of the first node to current network conditions. The measured network parameter can be, for example, packet loss and round-trip time. The processing system can adjust a length of packets transmitted by the first node in response to the measured packet loss or adjust a time-out period for which the first node awaits an acknowledgment for a transmitted packet.

[0020] Other determinations that may be made by the processing unit for adapting communications include determining the number of retransmissions of an unacknowledged packet before an attempt is made to reestablish a connection to the first node, and determining the number of attempts to reestablish a connection to the first node if the transmitted packet remains unacknowledged after the number of retransmissions.

BRIEF DESCRIPTION OF THE DRAWINGS

[0021] The invention is pointed out with particularity in the appended claims. The objectives and advantages of the invention described above, as well as further objectives and advantages of the invention, may be better understood by reference to the following description taken in conjunction with the accompanying drawings, in which:

[0022] FIG. 1 is a block diagram of an embodiment of a mobile internetworking system including a plurality of subnets in communication with the Internet;

[0023] FIG. 2 is a block diagram of a portion of an embodiment of a protocol stack that can be implemented by each of the routing nodes in each subnet to communicate in accordance with the principles of the invention;

[0024] FIG. 3 is a flow diagram illustrating an embodiment of a process by which each routing node selects a parent neighbor node and children neighbor node(s) for each potential source node in the subnet to define a minimum-hop-path tree for each potential source node along which routing nodes receive and forward link-state updates originating from that source node;

[0025] FIG. 4 is a diagram illustrating an embodiment of an exemplary minimum-hop-path tree for the nodes in the subnet of FIG. 1 ;

[0026] FIG. 5 is a block diagram illustrating the operation of a partial topology version of the TBRPF protocol;

[0027] FIG. 6 is a diagram of an embodiment of a format of a message header for an atomic TBRPF protocol message;

[0028] FIG. 7 is a diagram of an embodiment of a format of a compound TBRPF message;

[0029] FIGS. 8A and 8B are diagrams of embodiments of a format of a NEW PARENT protocol message;

[0030] FIG. 9 is a diagram of an embodiment of a format for a CANCEL PARENT message;

[0031] FIGS. 10A and 10B are diagrams of embodiments of exemplary formats for link-state update messages;

[0032] FIG. 11 is a diagram of an embodiment of an exemplary format of a RETRANSMISSION_OF_BROADCAST message;

[0033] FIG. 12 is a flow diagram of an embodiment of a process performed by the nodes of the subnet to achieve neighbor discovery;

[0034] FIG. 13 is a diagram of a packet format for the protocol messages used for neighbor discovery;

[0035] FIGS. 14 are a flow diagram of another embodiment of a process for performing neighbor discovery;

[0036] FIG. 15A is a diagram of a format for an IPv6 address including a prefix and an interface identifier;

[0037] FIG. 15B is a diagram of an embodiment of the interface identifier including a 24-bit company identifier concatenated with a 40-bit extension identifier;

[0038] FIG. 15C is a diagram of an embodiment of the interface identifier including a 24-bit company identifier concatenated with the 40-bit extension identifier;

[0039] FIG. 15D is a diagram of an IPv6-IPv4 compatibility address;

[0040] FIG. 15E is a diagram of an embodiment of a message format for tunneling an IPv6-IPv4 compatibility address through IPv4 infrastructure;

[0041] FIG. 16 is a flow diagram of an embodiment of a process by which a router tests an IPv6-IPv4 compatibility address;

[0042] FIG. 17 is a flow diagram of an embodiment of a process by which a mobile node and a server exchange messages;

[0043] FIGS. 18A and 18B are diagrams illustrating an example of the operation of a message queue.

DESCRIPTION OF THE INVENTION

[0044] FIG. 1 shows an embodiment of an internetworking system 2 including communication sub-networks (“subnets”) 10 , 20 that are components of a worldwide network of networks 30 (i.e., the “Internet”). The Internet 30 includes communications entities, (e.g., hosts and routers), that exchange messages according to an Internet Protocol (IP) such as IPv4 (version 4) and IPv6 (version 6). On the Internet 30 , entities implementing IPv6 may coexist with IPv4 entities. In general, the IPv4 and IPv6 versions are incompatible. The incompatibility is due, in part, to the difference in addressing format: the IPv4 specifies a 32-bit address format, whereas the IPv6 specifies a 128-bit address format.

[0045] A server 40 is connected to the Internet 30 by a wire-line or wireless connection. The server 40 can be internal or external to the subnet 10 . For purposes such as hosting application programs, delivering information or Web pages, hosting databases, handling electronic mail (“e-mail”), or controlling access to other portions of the Internet 30 , the server 40 is a computer system that typically handles multiple connections to other entities (e.g., client systems) simultaneously. Although represented as a single server 40 , other embodiments can have a group of interconnected servers. The data on the server 40 are replicated on one or more of these interconnected servers to provide redundancy in the event that a connection to the server 40 cannot be established.

[0046] Each subnet 10 , 20 includes one or more networks that can include both local area network (LAN) and wide area network (WAN) components. Each subnet 10 , 20 may be a freely accessible component of the public Internet 30 , or a private intranet. The subnet 10 includes IP hosts 12 , routers 14 , and a gateway 16 (collectively referred to as nodes 18 ). As used hereafter, a router 14 is any node 18 that forwards IP packets not explicitly addressed to itself, and an IP host 12 is any node 18 that is not a router 14 . Examples of devices that can participate as a node 18 in the subnet 10 include laptop computers, desktop computers, wireless telephones, personal digital assistants (PDAs), network computers, television sets with a service such as Web TV, client computer systems, and server computer systems. The gateway 16 is a particular type of routing node 14 that connects the subnet 10 to the Internet 30 . The subnet 20 is similarly configured with nodes 18 ′ (i.e., hosts 12 ′, routers 14 ′, and gateways 16 ′).

[0047] The subnet 10 can be associated with one organization or administrative domain, such as an Internet service provider (ISP), which associates each node 18 with an assigned IPv6 or IPv4 network address. Each IPv6 address is globally unique, whereas each IPv4 address is locally unique at least within the subnet 10 , and may be globally unique. Presumably, the assigned IP address has some topological relevance to the “home” subnet 10 of the node 18 so that the nodes 18 of the subnet 10 can be collectively identified by a common address prefix for routing purposes (called address aggregation). In one embodiment, the gateway 16 is a dual-stack node; that is, the gateway 16 has two IP addresses, an IPv6 address and an IPv4 address, and can route packets to IPv4 and IPv6 nodes.

[0048] Although it is conceivable that all nodes 18 in subnet 10 are initially assigned network addresses that follow a common address convention and have a common network prefix, dynamic topology changes may result in nodes 18 leaving their home subnet 10 to join a “foreign” subnet (e.g., subnet 20 ) and new nodes joining the home subnet 10 . Because the nodes 18 maintain the same IP address irrespective of whether the node 18 changes its location within the subnet 10 or moves to the foreign subnet 20 , mobility may result in a heterogeneous conglomerate of IPv6 and IPv4 addresses, having various network prefixes, within the single subnet 10 unless some form of dynamic address assignment or other address-renumbering scheme is used. Further, the gradual transition from the use of IPv4 network addresses to IPv6 network addresses within the subnet 10 increases the likelihood of such a heterogeneous conglomeration. Thus, like the Internet 30 , the infrastructure of the subnet 10 can become heterogeneous; some nodes 18 can be IPv4 nodes, while others are IPv6 nodes.

[0049] In the subnet 10 , each node 18 can establish connectivity with one or more other nodes 18 through broadcast or point-to-point links. In general, each link is a communication facility or medium over which nodes 18 can communicate at the link layer (i.e., the protocol layer immediately below the Internet Protocol layer). Such communication links can be wire-line (e.g., telephone lines) or wireless; thus, nodes 18 are referred to as wireless or wire-line depending upon the type of communication link that the node 18 has to the subnet 10 . Examples of wireless communication links are microwave links, radio frequency (RF) links, infrared (IR) links, and satellite links. Protocols for establishing link layer links include Ethernet, PPP (Point-to-Point Protocol) links, X.25, Frame Relay, or ATM (asynchronous transfer mode). Each wireless node 18 , e.g., IP host A 12 , has a range 22 of communication within which that node 18 can establish a connection to the subnet 10 . When beyond the range 22 of communication, the IP host A 12 cannot communicate with the server 40 on the Internet 30 or with other nodes 18 in the subnet 10 .

[0050] Each broadcast link connecting multiple nodes 18 is mapped into multiple point-to-point bi-directional links. A pair of nodes 18 is considered to have established a bi-directional link if each node 18 can reliably receive messages from the other. For example, IP host A 12 and node B 14 have established a bi-directional link 24 if and only if IP host A 12 can receive messages sent from node B 14 and node B 14 can receive messages sent from IP host A 12 at a given instant in time. Nodes 18 that have established a bi-directional link are considered to be adjacent (i.e., neighboring nodes). Such a bi-directional link 24 between the two nodes A and B is represented by a pair of unidirectional links (A, B) and (B, A). Each link has at least one positive cost (or metric) that can vary in time, and for any given cost, such cost for the link (A, B) may be different from that for the link (B, A). Any technique for assigning costs to links can be used to practice the invention. For example, the cost of a link can be one, for minimum-hop routing, or the link delay plus a constant bias.
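
The link model of this paragraph can be sketched in Python as a table of unidirectional links, each with its own positive cost. The function names and the example topology are hypothetical, and Dijkstra's algorithm is used here only as a familiar way to turn link costs into path costs; the text does not prescribe it.

```python
import heapq


def delay_cost(delay: float, bias: float) -> float:
    """Link cost as the link delay plus a constant bias."""
    return delay + bias


def is_bidirectional(links: dict, a: str, b: str) -> bool:
    """A bi-directional link exists only if both (a, b) and (b, a) are present."""
    return (a, b) in links and (b, a) in links


def shortest_costs(links: dict, source: str) -> dict:
    """Dijkstra over unidirectional links given as {(u, v): cost}."""
    adjacency = {}
    for (u, v), cost in links.items():
        if cost <= 0:
            raise ValueError("link costs must be positive")
        adjacency.setdefault(u, []).append((v, cost))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adjacency.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Minimum-hop routing: every unidirectional link costs one.
hop_links = {("A", "B"): 1, ("B", "A"): 1, ("B", "C"): 1, ("C", "B"): 1}
hops_from_a = shortest_costs(hop_links, "A")

# Asymmetric costs: (A, B) and (B, A) need not be equal.
delay_links = {("A", "B"): delay_cost(0.2, 0.1), ("B", "A"): delay_cost(0.4, 0.1)}
```

Keeping (A, B) and (B, A) as separate entries lets each direction carry its own cost, so that an asymmetric radio link is represented without special handling.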

[0051] In one embodiment, the subnet 10 is a mobile “ad hoc” network (“MANET”) in that the topology of the subnet 10 and the state of the links (i.e., link state) between the nodes 18 in the subnet 10 can change frequently because several of the nodes 18 are mobile. That is, each mobile node 18 may move from one location to another location within the same subnet 10 or to another subnet 20 , dynamically breaking existing links and establishing new links with other nodes 18 , 18 ′ as a result. Such movement by one node 18 does not necessarily result in breaking a link, but may diminish the quality of the communications with another node 18 over that link. In this case, a cost of that link has increased. Movement that breaks a link may interrupt any on-going communications with other nodes 18 in the subnet 10 or in the foreign subnet 20 , or with servers (e.g., server 40 ) connected to the Internet 30 . In another embodiment, the position of every node 18 in the subnet 10 is fixed (i.e., a static network configuration in which no link state changes occur due to node mobility). As the principles of the invention apply to both static and dynamic network configurations, a reference to the subnet 10 contemplates both types of network environments.

[0052] The following example illustrates the operation of the subnet 10 . Consider, for example, that node A is communicating with the server 40 over a route through subnet 10 that includes the link (A, B) to node B 14 , when node A 12 moves from its present location. This movement breaks the communication link with node B 14 and, as a result, interrupts communications with the server 40 . The relocation of node A 12 may break a link with one or more other nodes 18 as well. As one example, the movement by node A 12 may temporarily take node A 12 out of communication range with node B 14 , and upon returning within range, node A 12 can reestablish the broken link 24 with node B 14 . In this example, the link 24 is intermittent. As another example, node A 12 may move to a different location within the subnet 10 altogether and reestablish a bi-directional link 26 with a different node, (e.g., here node H). In yet another example, node A 12 may move to the foreign subnet 20 and establish a bi-directional link 28 with a node 14 ′ in the subnet 20 (e.g., node M 14 ′).

[0053] Each router 14 in the subnet 10 is responsible for detecting, updating, and reporting changes in cost and up-or-down status of each outgoing communication link to neighbor nodes. Thus, each router 14 in the subnet 10 runs a link-state-routing protocol for disseminating subnet topology and link-state information to the other routers 14 in the subnet 10 . Each router 14 also executes a neighbor discovery protocol for detecting the arrival of new neighbor nodes and the loss of existing neighbor nodes. To achieve discovery, IP hosts 12 connected to the subnet 10 also run the neighbor discovery protocol. IP hosts 12 can also operate as routers by running the link-state-routing protocol (in the description, such routing IP hosts are categorically referred to as routers 14 ). The link-state-routing protocol, referred to as a topology broadcast based on reverse-path forwarding (TBRPF) protocol, seeks to substantially minimize the amount of update and control traffic required to maintain shortest (or nearly shortest) paths to all destinations in the subnet 10 .

[0054] In brief, the TBRPF protocol performed by each of the routers 14 in the subnet 10 operates to inform a subset of the neighboring routers 14 in the subnet 10 of the current network topology and corresponding link-state information. Thus, for the examples above, each router 14 in the subnet 10 that detects a change in a link to node A 12 (e.g., node B 14 detecting a change in the cost of the link (B, A)) operates as the source (i.e., source node) of an update. Each source node sends a message to a neighbor of that source node, informing the neighbor of the update to that link. Each router 14 receiving the update may subsequently forward the update to zero or more neighbor nodes, until the change in the topology of the subnet 10 disseminates to the appropriate routers 14 in the subnet 10 .
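
The idea of forwarding an update along a tree rooted at the source node, rather than flooding it, can be sketched in Python as follows. This is a simplified, centralized illustration and not the TBRPF protocol itself: in the protocol each router selects its own parent and children, whereas here a single breadth-first search computes the whole minimum-hop tree. The function names and the five-link example topology are hypothetical.

```python
from collections import deque


def parents_toward(source: str, neighbors: dict) -> dict:
    """Breadth-first search from the source.

    parent[n] is the neighbor of n that lies on a minimum-hop path
    from n back to the source (None for the source itself).
    """
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(neighbors[u]):   # sorted for a deterministic tie-break
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def disseminate(source: str, neighbors: dict) -> list:
    """Return the (sender, receiver) transmissions that carry one update."""
    parent = parents_toward(source, neighbors)
    children = {node: [] for node in parent}
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)
    transmissions = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for child in children[u]:        # leaves forward to zero neighbors
            transmissions.append((u, child))
            queue.append(child)
    return transmissions


neighbors = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
tree_sends = disseminate("A", neighbors)
flood_sends = sum(len(adjacent) for adjacent in neighbors.values())
```

In this small topology the tree delivers the update to every other node with one transmission per receiver, whereas flooding, in which every node forwards to every neighbor, would use one transmission per directed link.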

[0055] To transmit update messages, the TBRPF protocol supports unicast transmissions (e.g., point-to-point or receiver directed), in which a packet reaches only a single neighbor, and broadcast transmissions, in which a single packet is transmitted simultaneously to all neighbor nodes. In particular, the TBRPF protocol allows an update to be sent either on a common broadcast channel or on one or more unicast channels, depending on the number of neighbors that need to receive the update.

[0056] Upon recovering the same link to node B 14 , or upon reestablishing a new link to another node 18 in the same subnet 10 or in the foreign subnet 20 , the node A 12 can resume the interrupted communications with server 40 . In effect, one of the nodes 18 , 18 ′ in the subnet 10 , 20 , respectively, using the neighbor discovery protocol, discovers node A 12 and, using the TBRPF protocol, initiates dissemination of topology and link-state information associated with the link to node A 12 . The routers 14 also use the TBRPF protocol to disseminate this information to the other routers in the respective subnet 10 so that one or more routes to the node A 12 become available for communication with the server 40 .

[0057] In one embodiment, such communications resume at their point of interruption. In brief, node A 12 maintains, in local cache, copies of objects that are located on the server 40 . When node A 12 and the server 40 are in communication, node A 12 updates the objects as necessary, thereby maintaining substantially up-to-date copies of the objects. Thus, when node A 12 moves out of the communication range 22 with the subnet 10 , the node A 12 initially has up-to-date information. Then when node A 12 reconnects to the subnet 10 , the server 40 forwards previously undelivered updates to the objects locally stored at node A 12 , along a route determined by information stored at each of the routing nodes 14 . In the event node A 12 reconnects to the foreign subnet 20 , a hand-off protocol, such as MobileIP, is used to achieve the redirection of the messages between the server 40 and the node A 12 .

[0058] The route taken by the object updates may traverse a heterogeneous IPv6/IPv4 infrastructure. Normally, IPv6 nodes are unable to route packets to other IPv6 nodes 18 over routes that pass through IPv4 infrastructure. In one embodiment, described in more detail below, the nodes 18 use an IPv6-IPv4 compatible aggregatable global unicast address format to achieve such routing. This IPv6-IPv4 compatibility address format also enables incremental deployment of IPv6 nodes 18 that do not share a common multiple access data-link with another IPv6 node 18 .

[0059] Accordingly, the internetworking system 2 provides various mobile ad hoc extensions to the Internet 30 that are particularly suited to the dynamic environment of mobile ad hoc networks. Such extensions, which are described further below, include techniques for (1) disseminating update information to nodes 18 in the subnet 10 using the TBRPF protocol; (2) detecting the appearance and disappearance of new neighbor nodes using a neighbor discovery protocol; (3) establishing an address format that facilitates deployment of IPv6 nodes in a predominantly IPv4 network infrastructure; (4) updating information upon resuming communications between nodes; and (5) adaptively using network bandwidth to establish and maintain connections between nodes 18 and the server 40 .

[0060] FIG. 2 shows a portion of an embodiment of protocol stack 50 that can be used by each of the routing nodes 14 , 14 ′ to communicate with other routing nodes 14 in the subnet 10 , 20 and on the Internet 30 , and thereby implement the various extensions to the Internet 30 described herein. The protocol stack 50 includes a data-link layer 54 , a network layer 62 , and a transport layer 70 .

[0061] The data link layer 54 can be implemented by any conventional data link protocol (e.g., IEEE 802.11) with an addressing scheme that supports broadcast, multicast and unicast addressing with best-effort (not guaranteed) message delivery services between nodes 18 having instantaneous bi-directional links. For such implementations, each node 18 in the subnet 10 has a unique data link layer unicast address assignment.

[0062] The network layer 62 is the protocol layer responsible for assuring that packets arrive at their proper destination. Some of the mobile ad hoc extensions for the Internet 30 described herein operate at the network layer 62 , such as the TBRPF protocol 58 and the IPv6-IPv4 compatibility address format, described in more detail below. Embodiments that redirect communications from foreign subnets to home subnets also use hand-off mechanisms such as Mobile IP, which operate at the network layer 62 . At the transport layer 70 , other mobile ad hoc extensions to the Internet 30 are implemented, such as techniques for updating communications upon restoring connections between nodes and for adaptively using the network bandwidth.

[0063] Topology Broadcast based on Reverse-Path Forwarding (TBRPF) Protocol

[0064] In brief, the TBRPF protocol uses the concept of reverse-path forwarding to broadcast each link-state update in the reverse direction along a tree formed by the minimum-hop paths from all routing nodes 14 to the source of the update. That is, each link-state update is broadcast along the minimum-hop-path tree rooted at the source (i.e., source node “src”) of the update. The minimum-hop-path trees (one tree per source) are updated dynamically using the topology and link-state information that are received along the minimum-hop-path trees themselves. In one embodiment, minimum-hop-path trees are used because they change less frequently than shortest-path trees that are determined based on a metric, such as delay. Other embodiments of the TBRPF protocol can use other types of trees, such as shortest path trees, to practice the principles of the invention.

[0065] Based on the information received along the minimum-hop-path trees, each node 18 in the subnet 10 computes a parent node and children nodes, if any, for the minimum-hop-path tree rooted at each source node src. Each routing node 14 may receive and forward updates originating from a source node src along the minimum-hop-path tree rooted at that source node src. Each routing node 14 in the subnet 10 also engages in neighbor discovery to detect new neighbor nodes and the loss of existing neighbor nodes. Consequently, the routing node 14 may become the source of an update and thus may generate an update message. When forwarding data packets to a destination node, each routing node 14 selects the next node on a route to the destination.

[0066] To communicate according to the TBRPF protocol, each routing node 14 (or node i, when referred to generally) in the subnet 10 stores the following information:

[0067] 1. A topology table, denoted TT_i, consisting of all link-states stored at node i. The entry for link (u, v) in this table is denoted TT_i(u, v) and includes the most recent update (u, v, c, sn) received for link (u, v). The component c represents the cost associated with the link, and the component sn is a serial number for identifying the most recent update affecting link (u, v) received by the node i. The components c and sn of the entry for the link (u, v) are denoted TT_i(u, v).c and TT_i(u, v).sn, respectively. Optionally, the dissemination of multiple link metrics is attainable by replacing the single cost c with a vector of multiple metrics.

[0068] 2. A list of neighbor nodes, denoted N_i.

[0069] 3. For each node u other than node i, the following is maintained:

[0070] a. The parent, denoted p_i(u), which is the neighbor node (“nbr”) of node i that is the next node on a minimum-hop path from node i to node u, as obtained from the topology table TT_i.

[0071] b. A list of children nodes of node i, denoted children_i(u).

[0072] c. The sequence number of the most recent link-state update originating from node u received by node i, denoted sn_i(u). The sequence number is included in the link-state update message. The use of sequence numbers helps achieve reliability despite topology changes, because node i avoids sending another node information that the other node already has. Each node i maintains a counter (i.e., the sequence number) for each link that the node i monitors. That counter is incremented each time the status of the link changes.

[0073] d. The routing table entry for node u, consisting of the next node on a preferred path to node u. The routing table entry for node u can be equal to the parent p_i(u) if minimum-hop routing is used for data packets. However, in general, the routing table entry for node u is not p_i(u), because the selection of routes for data traffic can be based on any objective.
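The per-node state listed above can be sketched as a small data structure. The following Python sketch is illustrative only: the names NodeState, LinkState, and record are hypothetical and do not appear in Appendix A, and the sequence-number comparison is a plain integer comparison that assumes no wraparound.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

INFINITY = float("inf")  # a cost of infinity represents a failed link


@dataclass
class LinkState:
    cost: float  # component c of the update (u, v, c, sn)
    sn: int      # component sn of the update (u, v, c, sn)


@dataclass
class NodeState:
    """Hypothetical container for the information stored at node i."""
    node_id: str
    # TT_i: topology table keyed by link (u, v)
    topology: Dict[Tuple[str, str], LinkState] = field(default_factory=dict)
    # N_i: list of neighbor nodes
    neighbors: Set[str] = field(default_factory=set)
    # p_i(u): parent for each source u (None when undefined)
    parent: Dict[str, Optional[str]] = field(default_factory=dict)
    # children_i(u): children for each source u
    children: Dict[str, Set[str]] = field(default_factory=dict)
    # sn_i(u): most recent sequence number received from source u
    last_sn: Dict[str, int] = field(default_factory=dict)
    # routing table: next node on a preferred path to u
    next_hop: Dict[str, Optional[str]] = field(default_factory=dict)

    def record(self, u: str, v: str, cost: float, sn: int) -> bool:
        """Store update (u, v, c, sn) if it is newer than the stored entry."""
        old = self.topology.get((u, v))
        if old is not None and sn <= old.sn:
            return False  # stale or duplicate update, table unchanged
        self.topology[(u, v)] = LinkState(cost, sn)
        self.last_sn[u] = max(self.last_sn.get(u, 0), sn)
        return True
```

Keeping the routing table (next_hop) separate from the parent map reflects the point made in item d: the parent is used for disseminating updates, while the route for data traffic can be chosen by any objective.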

[0074] One embodiment of the TBRPF protocol uses the following message types:

[0075] LINK-STATE UPDATE: A message containing one or more link-state updates (u, v, c, sn).

[0076] NEW PARENT: A message informing a neighbor node that the node has selected that neighbor node to be a parent with respect to one or more sources of updates.

[0077] CANCEL PARENT: A message informing a neighbor that it is no longer a parent with respect to one or more sources of updates.

[0078] HELLO: A message sent periodically by each node i for neighbor discovery.

[0079] NEIGHBOR: A message sent in response to a HELLO message.

[0080] NEIGHBOR ACK: A message sent in response to a NEIGHBOR message.

[0081] ACK: A link-level acknowledgment to a unicast transmission.

[0082] NACK: A link-level negative acknowledgment reporting that one or more update messages sent on the broadcast channel were not received.

[0083] RETRANSMISSION OF BROADCAST: A retransmission, on a unicast channel, of link-state updates belonging to an update message for which a NACK message was received.

[0084] HEARTBEAT: A message sent periodically on the broadcast channel when there are no updates to be sent on this channel, used to achieve reliable link-level broadcast of update messages based on NACKs.

END OF BROADCAST: A message sent to a neighbor over a unicast channel, to report that updates originating from one or more sources are now being sent on the unicast channel instead of the broadcast channel.

[0085] The formats for the various types of TBRPF protocol messages are described below.

Building the Minimum-Hop-Path Tree For a Source

[0086] FIG. 3 shows an embodiment of a process by which each routing node 14 selects a parent neighbor node and children neighbor node(s) for each potential source node src in the subnet 10 . The selection of the parent and children neighbor nodes for each potential source node src define a minimum-hop-path tree for that potential source node along which the routing nodes 14 receive and forward link-state updates originating from that source node src. Pseudo-code describing the network-level procedures performed by each routing node 14 is in Appendix A.

[0087] Node i receives (step 90 ) a message over a communication link. The received message can represent a link-state update, the discovery of a new neighbor node, the loss of a neighbor node, a change in the cost of a link to a neighbor node, a selection of a new parent neighbor node, or a cancellation of a parent neighbor node. Pseudo-code for processing these types of received messages is provided in Appendix A; the corresponding procedures are called Process_Update, Link_Up, Link_Down, Link_Change, Process_New_Parent, and Process_Cancel_Parent, respectively. The general procedure followed in response to all of these events, and the specific procedure followed by a node that has just started and has no topology information are described below.

[0088] If node i receives a message representing a link-state update, the discovery of a new neighbor node, the loss of a neighbor node, or a change in the cost of a link to a neighbor node, node i enters (step 100 ) the new link-state information, if any, into the topology table, TT_i, and forwards (step 102 ) the link-state information in a link-state update to the neighbor nodes in children_i(src), where src is the source node at which the update originated. Node i then computes (step 104 ) the parent nodes p_i(u) for all potential source nodes u by running a shortest-path algorithm such as Dijkstra's algorithm. If this computation results in a change to the parent node p_i(u) for any source u, node i then sends a NEW PARENT(u, sn) message, where sn=sn_i(u), to the new parent node p_i(u) and a CANCEL PARENT message to the old parent node (step 106 ).

[0089] If node i receives (step 90 ) a NEW PARENT(u, sn) message from a sending node with source u and sequence number sn, node i adds (step 108 ) the sending node to node i's list of children nodes children_i(u) for that source u, and then sends (step 110 ) the sending node a LINK-STATE UPDATE message containing all updates in node i's topology table, TT_i, originating from source u and having a sequence number greater than sn. If node i receives (step 90 ) a CANCEL PARENT(u) message from a sending node with source u, node i removes (step 112 ) the sending node from node i's list of children nodes children_i(u) for that source u.
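The handling of NEW PARENT and CANCEL PARENT messages described above can be sketched as follows. This is a minimal Python illustration, not the Process_New_Parent or Process_Cancel_Parent pseudo-code of Appendix A; the function names and the dictionary layout (topology keyed by link, children keyed by source) are assumptions made for the example.

```python
from collections import defaultdict


def process_new_parent(topology, children, sender, u, sn):
    """Handle NEW PARENT(u, sn) received from `sender`.

    topology: dict mapping link (src, v) -> (cost, sn), standing in for TT_i
    children: dict mapping source -> set of child nodes (children_i)
    Returns the updates (src, v, cost, sn) to send back to the sender.
    """
    children[u].add(sender)  # step 108: sender becomes a child for source u
    # step 110: reply with every update from source u newer than sn
    return sorted(
        (src, v, cost, s)
        for (src, v), (cost, s) in topology.items()
        if src == u and s > sn
    )


def process_cancel_parent(children, sender, u):
    """Handle CANCEL PARENT(u): sender is no longer a child for source u."""
    children[u].discard(sender)  # step 112


# Example: node i knows two links reported by source "u".
tt_i = {("u", "a"): (1, 3), ("u", "b"): (2, 7), ("w", "a"): (1, 9)}
children_i = defaultdict(set)
reply = process_new_parent(tt_i, children_i, "j", "u", 3)
```

In this example the reply contains only the update for link (u, b), because the update for link (u, a) carries sequence number 3, which the sender has already reported receiving.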

[0090] Consider, for example, the case in which node i initially has no topology information. Accordingly, node i has no links to neighbor nodes, and its topology table TT_i is empty. Also the parent node is p_i(src)=NULL (i.e., not defined), the children_i(src) is the empty set, and sn_i(src)=0 for each source node src. Upon receiving (step 90 ) messages representing the discovery of neighbor nodes, node i executes the Link_Up procedure to process each link established with each neighbor node nbr. Because each neighbor node nbr of node i is (trivially) the next node on the minimum-hop path from node i to neighbor node nbr, node i selects (step 104 ) each of its neighbor nodes nbr as the new parent node p_i(nbr) for source node nbr. Execution of the Link_Up procedure results in node i sending (step 106 ) a NEW PARENT message to each neighbor node nbr. Therefore, the NEW PARENT message sent to a new neighbor node nbr contains the neighbor node nbr (and possibly other sources) in its source list.

[0091] In response to the NEW PARENT message, each neighbor node nbr informs (step 110 ) node i of the outgoing links of neighbor node nbr. Information about the outgoing links of neighbor node nbr allows node i to compute minimum-hop paths to the nodes at the other end of the outgoing links, and thus to compute (step 104 ) new parents p_i(src), for all source nodes src that are two hops away. Node i sends (step 106 ) a NEW PARENT message to each of these computed new parents. Then each parent p_i(src) for each such source node src informs (step 110 ) node i of the outgoing links for source node src, which allows node i to compute (step 104 ) new parents for all source nodes that are three hops away. This process continues until node i has computed parent nodes for all source nodes src in the subnet 10 . As a result, for a given source node src, the parents p_i(src) for all nodes i other than source node src define a minimum-hop-path tree rooted at source node src (after the protocol has converged).

[0092] Node i cancels an existing parent p_i(src) by sending a CANCEL PARENT(src) message containing the identity of the source node src. Consequently, the set of children, children_i(src), at node i with respect to source node src is the set of neighbor nodes from which node i has received a NEW PARENT message containing the identity of source node src without receiving a subsequent CANCEL PARENT message for that source node src. Node i can also simultaneously select a neighbor node as the parent for multiple sources, so that the node i sends a NEW PARENT(src_list, sn_list) message to the new parent, where src_list is the list of source nodes and sn_list is the corresponding list of sequence numbers. Similarly, a CANCEL PARENT message can contain a list of sources.

[0093] In one embodiment, the TBRPF protocol does not use NEW PARENT and CANCEL PARENT messages in the generation of the minimum-hop-path tree. Instead, each node i computes the minimum-hop paths from each neighbor node nbr to all destinations (e.g., by using breadth-first search or Dijkstra's shortest-path algorithm). Consequently, each node i computes the parents p_nbr(src) for each neighbor node nbr and source node src, from which node i determines which neighbor nodes nbr are its children for the given source node src. Although this process eliminates NEW PARENT and CANCEL PARENT messages, the process also requires that each node i (1) sends all updates originating from the source node src to any child node in children_i(src), or (2) periodically sends updates along the minimum-hop-path tree, because node i does not know the sequence number sn_nbr(src) from the neighbor node nbr and thus does not know what updates the neighbor node nbr already has. Either of these actions ensures that each neighbor node nbr receives the most recent information for each link.

[0094] FIG. 4 shows an embodiment of an exemplary minimum-hop-path tree 120 for the nodes 18 in the subnet 10 of FIG. 1 . For the sake of illustration, assume that node D is the source of an update. The parent 122 for nodes C, G, and H with respect to the source node D is node D; the parent 124 for node F with respect to source node D is node H; the parent 126 for nodes A and B with respect to source node D is node F; the parent 128 for node E is node B; and the parent 130 for node L is node A. (In this example, node A is a routing node 14 , and thus runs the TBRPF protocol.)

[0095] Conversely, the children 132 of node D are nodes C, G, and H; the child 134 of node H is node F; the children 136 of node F are nodes A and B; the child 138 of node B is node E, and the child 140 of node A is node L. As shown, nodes C, E, G, and L are leaf nodes, which, in accordance with the TBRPF protocol, do not have to forward updates originating from the source node D.
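The parent and child relationships of the exemplary tree 120 can be checked with a short Python sketch. The parent assignments below are taken from the description of FIG. 4 for source node D; the helper names children_from_parents and leaves are hypothetical and are used only for illustration.

```python
from collections import defaultdict

# Parent of each node with respect to source node D, as described for FIG. 4.
PARENT_FOR_SOURCE_D = {
    "C": "D", "G": "D", "H": "D",  # parent 122
    "F": "H",                      # parent 124
    "A": "F", "B": "F",            # parent 126
    "E": "B",                      # parent 128
    "L": "A",                      # parent 130
}


def children_from_parents(parents):
    """Invert a node -> parent map into a parent -> set-of-children map."""
    children = defaultdict(set)
    for node, parent in parents.items():
        children[parent].add(node)
    return dict(children)


def leaves(parents):
    """Nodes that are nobody's parent; they do not forward updates."""
    all_nodes = set(parents) | set(parents.values())
    return all_nodes - set(parents.values())


children_for_source_d = children_from_parents(PARENT_FOR_SOURCE_D)
leaf_nodes = leaves(PARENT_FOR_SOURCE_D)
```

Inverting the parent map reproduces the children listed above (for example, nodes C, G, and H for node D), and the leaf set is nodes C, E, G, and L.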

Updating the Minimum-Hop-Path Tree

[0096] In brief, the TBRPF protocol disseminates link-state updates generated by a source node src along the minimum-hop-path tree rooted at node src and dynamically updates the minimum-hop-path tree based on the topology and link-state information received along the minimum-hop-path tree. More specifically, whenever the topology table TT_i of node i changes, the node i computes its parent p_i(src) with respect to every source node src (see the procedure Update_Parents in Appendix A). The node i computes parents by (1) computing minimum-hop paths to all other nodes using, for example, Dijkstra's algorithm, and (2) selecting the next node on the minimum-hop path to each source node src to be the parent for that source node src (see the procedure Compute_New_Parents in Appendix A). The computation of parents occurs when the node i receives a topology update, establishes a link to a new neighbor node, or detects a failure or change in cost of a link to an existing neighbor node.
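The two-step parent computation described above can be sketched with a breadth-first search, which yields minimum-hop paths when every usable link counts as one hop. This Python sketch is an illustration under stated assumptions and is not the Compute_New_Parents procedure of Appendix A: the function name compute_parents is hypothetical, links with infinite cost are treated as failed, and ties between equal-hop paths are broken by visiting neighbors in sorted order.

```python
from collections import deque

INFINITY = float("inf")


def compute_parents(i, links):
    """Return {src: parent} for node i, given links {(u, v): cost}.

    The parent for source src is the first node after i on a
    minimum-hop path from i to src. Unreachable sources are omitted,
    which corresponds to a parent of NULL.
    """
    adjacency = {}
    for (u, v), cost in links.items():
        if cost != INFINITY:  # skip failed links
            adjacency.setdefault(u, []).append(v)

    parents = {}
    seen = {i}
    queue = deque()
    # Each neighbor is trivially its own first hop.
    for nbr in sorted(adjacency.get(i, [])):
        if nbr not in seen:
            seen.add(nbr)
            parents[nbr] = nbr
            queue.append(nbr)
    # Nodes further away inherit the first hop of the node that reached them.
    while queue:
        u = queue.popleft()
        for v in sorted(adjacency.get(u, [])):
            if v not in seen:
                seen.add(v)
                parents[v] = parents[u]
                queue.append(v)
    return parents


# Example topology: chain i - a - b - d, plus a failed direct link i - d.
example_links = {}
for u, v, cost in [("i", "a", 1), ("a", "b", 1), ("b", "d", 1), ("i", "d", INFINITY)]:
    example_links[(u, v)] = cost
    example_links[(v, u)] = cost
example_parents = compute_parents("i", example_links)
```

In the example, node a is the parent for all three sources, because the direct link between node i and node d has failed and the only remaining path to node d runs through nodes a and b.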

[0097] In one embodiment, node i computes a new parent p_i(src) for a given source node src even though the path to the source node src through the new parent has the same number of hops as the path to the source node src through the old parent. In another embodiment, the node keeps the old parent node in this event, thus reducing the overhead of the TBRPF protocol. This embodiment can be implemented, for example, by using the procedure Compute_New_Parents2 (given in Appendix A) instead of the procedure Compute_New_Parents.

[0098] If the parent p_i(src) changes, node i sends the message CANCEL PARENT(src) to the current (i.e., old) parent, if the old parent exists. Upon receiving the CANCEL PARENT(src) message, the old parent (“k”) removes node i from the list children_k(src).

[0099] Node i also sends the message NEW PARENT(src, sn) to the newly computed parent if the new parent exists, where sn=sn_i(src) is the sequence number of the most recent link-state update originating from source node src received by node i. This sequence number indicates the “position” up to which node i has received updates from the old parent, and indicates to the new parent that it should send only those updates that occurred subsequently (i.e., after that sequence number).

[0100] Upon receiving the NEW PARENT(src, sn) message, the new parent “j” for p_i(src) adds node i to the list children_j(src) and sends to node i a link-state update message consisting of all the link states originating from source node src in its topology table that have a sequence number greater than sn (see the procedure Process_New_Parent in Appendix A). Thus, only updates not yet known to node i are sent to node i.

[0101] Generally, the range of sequence numbers is large enough so that wraparound does not occur. However, if a small sequence number range is used, wraparound can be handled by employing infrequent periodic updates with a period that is less than half the minimum wraparound period, and by using a cyclic comparison of sequence numbers. That is, sn is considered less than sn′ if either sn is less than sn′ and the difference between sn and sn′ (sn′−sn) is less than half the largest possible sequence number, or sn′ is less than sn and the difference, sn−sn′, is greater than half the largest possible sequence number.
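The cyclic comparison of sequence numbers can be written directly from the two conditions stated above. In this Python sketch, the function name sn_less and the parameter max_sn (the largest possible sequence number) are hypothetical names chosen for illustration.

```python
def sn_less(sn, sn_prime, max_sn):
    """Return True if sn is considered less than sn_prime, cyclically."""
    half = max_sn / 2
    if sn < sn_prime:
        # Ordinary case: sn_prime is ahead by less than half the range.
        return (sn_prime - sn) < half
    if sn_prime < sn:
        # Wraparound case: sn_prime has wrapped past the largest value.
        return (sn - sn_prime) > half
    return False  # equal sequence numbers are not "less than"


# With an 8-bit range (largest sequence number 255):
ordinary = sn_less(10, 20, 255)    # 20 is ahead of 10
wrapped = sn_less(250, 5, 255)     # 5 is ahead of 250 after wraparound
reverse = sn_less(5, 250, 255)     # 250 is treated as older than 5
```

The comparison is only meaningful while the two sequence numbers being compared are less than half the range apart, which is why the text pairs it with periodic updates whose period is less than half the minimum wraparound period.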

[0102] When a node i detects the existence of a new neighbor nbr, it executes Link_Up(i,nbr) to process this newly established link. The link cost and sequence number fields for this link in the topology table at node i are updated. Then, the corresponding link-state message is sent to all neighbors in children_i(i). As noted above, node i also recomputes its parent node p_i(src) for every node src, in response to this topological change. In a similar manner, when node i detects the loss of connectivity to an existing neighbor node nbr, node i executes Link_Down(i, nbr). Link_Change(i, nbr) is likewise executed at node i in response to a change in the cost to an existing neighbor node nbr. However, this procedure does not recompute parents.

[0103] In one embodiment, if a path between the node i and a given source node src ceases to exist, the node i computes a new parent p_i(src) that is set to NULL (i.e., parent does not exist). In another embodiment, although the path between the node i and the given source node src ceases to exist, the node i keeps the current parent, if the current parent is still a neighbor node of the node i. Thus, the overhead of the TBRPF protocol is reduced because it is unnecessary to send a CANCEL PARENT message and a subsequent NEW PARENT message if the old path to the source becomes operational later because of a link recovery. This embodiment can be implemented by replacing the fifth line of the procedure Update_Parents in Appendix A, “If (new_p_i(src)!=p_i(src)){”, with the line “If(new_p_i(src)!=p_i(src) and new_p_i(src)!=NULL){”.

[0104] The TBRPF protocol does not use an age field in link-state update messages. However, failed links (represented by an infinite cost) and links that are unreachable (i.e., links (u, v) such that p_i(u)=NULL) are deleted from the topology table TT_i after MAX_AGE seconds (e.g., 1 hour) in order to conserve memory. Failed links (u, v) are maintained for some time in the topology table TT_i, rather than deleted immediately, to ensure that the node i that changes its parent p_i(u) near the time of failure (or had no parent p_i(u) during the failure) is informed of the failure by the new parent.

[0105] Unreachable links (i.e., links (u, v) such that node i and node u are on different sides of a network partition) are maintained for a period of time to avoid having to rebroadcast the old link state for (u, v) throughout node i's side of the partition if the network partition soon recovers. Such quick recovery can often happen if the network partition is caused by a marginal link that oscillates between the up and down states. If a link recovers, resulting in the reconnection of two network components that were disconnected (i.e., partitioned) prior to the link recovery, the routing nodes 14 in one partition may temporarily have invalid routes to nodes 18 in the other partition. This occurs because the routing nodes 14 may receive an update message for the link recovery before receiving update messages for links in the other partition. Consequently, the link-state information for those links in the other partition may be outdated temporarily.

[0106] To correct this situation, in one embodiment, a header field is added to each link-state update message, which indicates whether the update message is sent in response to a NEW PARENT message. The header field also identifies the corresponding NEW PARENT message using a sequence number. For example, if a given node i sends a NEW PARENT message (for multiple sources) to node j following the recovery of the link (i, j), the node i waits for a response from node j to the NEW PARENT message before sending to node i's neighbor nodes an update message corresponding to the link recovery. The response from node j includes the link-state information of the other nodes 18 in the previously disconnected partition. Then node i forwards this link-state information to node i's neighbor nodes. Consequently, the nodes 18 in the same partition as node i receive updates for the links in the other partition at the same time that the nodes 18 receive the update for the link recovery. Thus, the link-state information for those links in the other partition is not outdated temporarily.

[0107] A node i that is turned off (or goes to sleep) operates as if the links to all neighbors have gone down. Thus, the node i remembers the link-state information that it had when turned off. Since all such links are either down or unreachable, these link states are deleted from the topology table TT_i if the node i awakens after being in sleep mode for more than MAX_AGE seconds.

[0108] Infrequent periodic updates occur to correct errors that may appear in table entries or in update messages. (See Send_Periodic_Updates in Appendix A.) As discussed above, periodic updates are also useful if the sequence number range is not large enough to avoid wraparound.

Initiating an Update Message

[0109] When a given routing node 14 detects a change in the state of a neighbor node, that routing node 14 becomes the source (i.e., source node src) of a link-state update message with respect to the corresponding link to that neighbor node. As described above, the source node src then broadcasts each link-state update along the minimum-hop-path tree rooted at the source of the update.

[0110] A link-state update message reports the state of the link (src, nbr) as a tuple (src, nbr, c, sn), where c and sn are the cost and the sequence number associated with the update. A cost of infinity represents a failed link. The source node src is the head node of link (src, nbr), and is the only node that can report changes to parameters of link (src, nbr). Therefore, any node 18 receiving the link-state update (src, nbr, c, sn) can determine that the update originated from the source node src.

[0111] The source node src maintains a counter sn_src, which is incremented by at least one each time the cost of one or more outgoing links (src, nbr) changes value. For example, the counter sn_src can be a time stamp that represents the number of seconds (or other units of time) elapsed from some fixed time. When the source node src generates a link-state update (src, nbr, c, sn), the sequence number sn is set to the current value of sn_src.
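The generation of link-state updates at a source node can be sketched as follows. This Python illustration assumes a simple integer counter that is incremented by one per change (the text also permits a time stamp); the class name UpdateSource and the method set_cost are hypothetical.

```python
INFINITY = float("inf")  # a cost of infinity represents a failed link


class UpdateSource:
    """Hypothetical model of a source node src and its counter sn_src."""

    def __init__(self, src):
        self.src = src
        self.sn_src = 0   # counter maintained by the source node
        self.costs = {}   # current cost of each outgoing link (src, nbr)

    def set_cost(self, nbr, cost):
        """Record a new cost for link (src, nbr).

        Returns the link-state update (src, nbr, c, sn), or None when
        the cost did not change and no update is generated.
        """
        if self.costs.get(nbr) == cost:
            return None
        self.costs[nbr] = cost
        self.sn_src += 1  # incremented each time a link cost changes
        return (self.src, nbr, cost, self.sn_src)


source_a = UpdateSource("A")
link_up = source_a.set_cost("B", 1.0)         # link (A, B) comes up
unchanged = source_a.set_cost("B", 1.0)       # no change, no update
link_down = source_a.set_cost("B", INFINITY)  # link (A, B) fails
```

Because only the head node of a link generates updates for it, a receiver can identify the origin of an update from the first element of the tuple.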

Receiving an Update Message

[0112] In brief, each routing node 14 that receives a link-state update message receives that update message along a single path. That is, any link-state update originating from source node src is accepted by node i if (1) the link-state update is received from the parent node p_i(src), and (2) the link-state update has a larger sequence number than the corresponding link-state entry in the topology table TT_i at node i. If the link-state update is accepted, node i enters the link-state update into the topology table TT_i. Node i may then forward the link-state update to zero or more children nodes in children_i(src). In one embodiment, the link-state update passes to every child node in children_i(src). (See the procedures Update_Topology_Table and Process_Update in Appendix A.)
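The two acceptance conditions can be sketched in Python as shown below. The sketch is not the Process_Update procedure of Appendix A; the function name process_update and the dictionary layout are assumptions, sequence numbers are compared as plain integers (no wraparound), and the embodiment in which every child receives the update is used.

```python
def process_update(update, sender, parent, topology, children):
    """Apply the acceptance rule to one link-state update.

    update:   tuple (src, nbr, cost, sn)
    sender:   neighbor from which the update was received
    parent:   dict source -> parent node, standing in for p_i
    topology: dict (u, v) -> (cost, sn), standing in for TT_i
    children: dict source -> set of children (children_i)
    Returns the list of children to which the update is forwarded.
    """
    src, nbr, cost, sn = update
    # Condition (1): the update must arrive from the parent p_i(src).
    if sender != parent.get(src):
        return []
    # Condition (2): the update must be newer than the stored entry.
    old = topology.get((src, nbr))
    if old is not None and sn <= old[1]:
        return []
    topology[(src, nbr)] = (cost, sn)
    return sorted(children.get(src, ()))


parent_i = {"D": "F"}
children_i = {"D": {"L"}}
tt_i = {("D", "C"): (1, 4)}
accepted = process_update(("D", "C", 2, 5), "F", parent_i, tt_i, children_i)
wrong_sender = process_update(("D", "C", 3, 6), "B", parent_i, tt_i, children_i)
stale = process_update(("D", "C", 9, 5), "F", parent_i, tt_i, children_i)
```

Only the first update is accepted: the second arrives from a neighbor that is not the parent for source D, and the third does not carry a larger sequence number than the stored entry.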

Forwarding Update Messages

[0113] In most link-state routing protocols, e.g., OSPF (Open Shortest Path First), each routing node 18 forwards the same link-state information to all neighbor nodes. In contrast, in one embodiment of the TBRPF protocol, each routing node 14 sends each link-state update only to neighbor nodes that are children on the minimum-hop-path tree rooted at the source of the update. Each routing node 14 having no children for the source node src of the link-state update is a leaf in the minimum-hop-path tree and therefore does not forward updates originating from the source node src. In typical networks, most nodes 18 are leaves; thus the TBRPF protocol makes efficient use of the bandwidth of the subnet 10 . In addition, those nodes having only one child node for the source node src can send updates generated by the source node src to that child node only, instead of broadcasting the updates to all neighbor nodes.

[0114] The TBRPF protocol may utilize bandwidth more efficiently by using unicast transmissions if those routing nodes 14 have only one child, or a few children, for the source of the update, and broadcast transmissions when several children exist for the update. Therefore, in one embodiment, the TBRPF protocol determines whether to use unicast or broadcast transmissions, depending on the number of children nodes and the total number of neighbor nodes.

[0115] In general, each routing node 14 uses unicast transmissions for updates with only one intended receiver (e.g., only one child), and broadcast transmissions for updates with several intended receivers, to avoid transmitting the update message several times. Therefore, each routing node 14 uses unicast transmission if k=1 and uses broadcast transmission if k>1, where k is the number of intended receivers. A possible drawback can occur if the number of children nodes exceeds one and there are many more neighbors. For example, if there are two children nodes and twenty neighbor nodes (i.e., k=2 and n=20, where k is the number of children nodes and n is the number of neighbors), then 18 neighbor nodes are listening to a message not intended for them. Such neighbor nodes could instead be sending or receiving other messages.

[0116] To avoid this possible drawback, one option is to use broadcast transmission if k>(n+1)/2 and unicast transmission in all other cases. In general, a rule of the form k>g(n) can be used. For update messages, the number of children k may be different for different update sources. Therefore, it is possible to use unicast transmissions for some sources and broadcast transmissions for other sources, and the transmission mode for a given source u, denoted mode_i(u), can change dynamically between unicast and broadcast as the number of children changes.
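The threshold rule k>(n+1)/2 can be expressed as a one-line decision. In this Python sketch the function name choose_mode and the string constants are hypothetical; the rule itself is the option stated above, and other rules of the form k>g(n) could be substituted.

```python
BROADCAST = "BROADCAST"
UNICAST = "UNICAST"


def choose_mode(k, n):
    """Select the transmission mode for a source with k children.

    k: number of children for the source of the update
    n: total number of neighbor nodes
    Broadcast is used only if k > (n + 1) / 2; unicast otherwise.
    """
    return BROADCAST if k > (n + 1) / 2 else UNICAST


few_children = choose_mode(2, 20)    # the k=2, n=20 example in the text
many_children = choose_mode(11, 20)  # more than half the neighbors
```

For the example of two children and twenty neighbors, the rule selects unicast, so the 18 neighbor nodes that are not children are not required to listen to the update.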

[0117] While LINK-STATE UPDATE messages can be transmitted in either unicast or broadcast mode, HELLO messages and HEARTBEAT messages (discussed below) are always transmitted on the broadcast channel, and the following messages are always transmitted on the unicast channel (to a single neighbor): NEIGHBOR, NEIGHBOR ACK, ACK, NACK, NEW PARENT, CANCEL PARENT, RETRANSMISSION OF BROADCAST, END OF BROADCAST, and LINK-STATE UPDATE messages sent in response to a NEW PARENT message.

[0118] Exemplary pseudo-code for a procedure for sending a LINK-STATE UPDATE message (that is not a response to a NEW PARENT message) on the broadcast or unicast channel is as follows:

[0119] If (mode_i(src)==BROADCAST)

[0120] Append the message update_msg to the message queue associated with the broadcast channel.

[0121] If (mode_i(src)==UNICAST)

[0122] For (each node k in children_i(src))

[0123] Append the message update_msg to the message queue associated with the unicast channel to node k.
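The queuing procedure above may be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class Node and its attribute names are hypothetical, one in-memory queue stands in for each channel, and children are visited in sorted order only to make the result deterministic.

```python
from collections import defaultdict, deque

BROADCAST, UNICAST = "BROADCAST", "UNICAST"


class Node:
    """Hypothetical container for the per-node state used by the procedure."""

    def __init__(self):
        self.mode = {}                            # mode_i(src)
        self.children = defaultdict(set)          # children_i(src)
        self.broadcast_queue = deque()            # queue for the broadcast channel
        self.unicast_queues = defaultdict(deque)  # one queue per neighbor

    def send_update(self, src, update_msg):
        """Queue a LINK-STATE UPDATE that is not a response to NEW PARENT."""
        if self.mode.get(src) == BROADCAST:
            self.broadcast_queue.append(update_msg)
        elif self.mode.get(src) == UNICAST:
            for k in sorted(self.children[src]):
                self.unicast_queues[k].append(update_msg)
```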

[0124] Reliable unicast transmission of control packets can be achieved by a variety of reliable link-layer unicast transmission protocols that use sequence numbers and ACKs, and that retransmit a packet if an ACK is not received for that packet within a specified amount of time.

Reliable Transmission in Broadcast Mode

[0125] For reliable transmission of Link-State Update messages in broadcast mode, each broadcast update message includes one or more link-state updates, denoted lsu(src), originating from sources src for which the transmission mode is BROADCAST. Each broadcast control packet is identified by a sequence number that is incremented each time a new broadcast control packet is transmitted. Reliable transmission of broadcast control packets in TBRPF can be accomplished using either ACKs or NACKs. If ACKs are used, then the packet is retransmitted after a specified amount of time if an ACK has not been received from each neighbor node that must receive the message.

[0126] In one embodiment of TBRPF, NACKs are used instead of ACKs for reliable transmission of broadcast control packets, so that the amount of ACK/NACK traffic is minimized if most transmissions are successful. Suppose node i receives a NACK from a neighbor node nbr for a broadcast update message. In one embodiment, all updates lsu(src) in the original message, for each source node src such that neighbor node nbr belongs to children_i(src), are retransmitted (reliably) on the UNICAST channel to the neighbor node nbr, in a RETRANSMISSION OF BROADCAST message. This message includes the original broadcast sequence number to allow neighbor node nbr to process the updates in the correct order. In another embodiment, such update messages are retransmitted on the broadcast channel. This embodiment may improve the efficiency of the TBRPF protocol in subnets that do not support receiver-directed transmission, because in such subnets unicast transmission provides no efficiency advantage over broadcast transmissions.

[0127] The procedure for the reliable transmission of broadcast update packets uses the following message types (in addition to LINK-STATE UPDATE messages): HEARTBEAT(sn), NACK(sn, bit_map), and RETRANSMISSION OF BROADCAST(sn, update_msg). A NACK(sn, bit_map) message contains the sequence number (sn) of the last received broadcast control packet, and a 16-bit vector (bit-map) specifying which of the 16 broadcast control packets from sn-15 to sn have been successfully received.
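One possible encoding of the 16-bit vector is sketched below in Python. The bit ordering is an assumption made for illustration (bit j, counting from the least significant bit, stands for packet sn-j), as are the function names; the text does not fix either. Sequence-number wraparound is ignored in this sketch.

```python
from typing import List, Set


def build_bit_map(sn: int, received: Set[int]) -> int:
    """Encode which of the packets sn-15 .. sn were received.

    Assumed convention: bit j (0 = least significant) is set when
    packet sn - j was received successfully.
    """
    bit_map = 0
    for j in range(16):
        if sn - j in received:
            bit_map |= 1 << j
    return bit_map


def missing_from_bit_map(sn: int, bit_map: int) -> List[int]:
    """Return, in increasing order, the sequence numbers marked as not received."""
    return [sn - j for j in range(15, -1, -1) if not (bit_map >> j) & 1]
```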

[0128] A description of the procedure for the reliable transmission of broadcast update packets at node i uses the following exemplary notation:

[0129] Pkt(sn) represents a control packet with sequence number sn transmitted on the broadcast channel by node i.

[0130] MsgQ represents a message queue for new control messages to be sent on the broadcast channel from node i.

[0131] brdcst_sn_i represents the sequence number of the last packet transmitted on the broadcast channel by node i.

[0132] Heartbeat_Timer represents a timer used in the transmission of the HEARTBEAT message.

[0133] Following the transmission of the broadcast control packet Pkt(brdcst_sn_i) on the broadcast channel, node i increments brdcst_sn_i and reinitializes Heartbeat_Timer. When Heartbeat_Timer expires at node i, the node i appends the control message HEARTBEAT(brdcst_sn_i) to the message queue associated with the broadcast channel, and reinitializes Heartbeat_Timer. When the node i receives NACK(sn, bit_map) from neighbor node nbr, node i performs the functions as illustrated by the following exemplary pseudo-code:

For each (sn' not received as indicated by bit_map){
    Let update_msg = {(src*, v*, sn*, c*) in Pkt(sn') such that the
        neighbor node nbr is in children_i(src*)}.
    Append the message RETRANSMISSION OF BROADCAST(sn', update_msg)
        to the message queue associated with the unicast channel to
        neighbor node nbr.
    (Message must be sent even if update_msg is empty.)}
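The NACK handling at node i may be sketched in Python as follows. The function name, the dictionary sent_packets (a history of broadcast packets keyed by sequence number), and the bit ordering (bit j set means packet sn-j was received) are assumptions made for illustration. Skipping sequence numbers that are absent from the history is likewise an assumption, covering numbers that precede the first broadcast.

```python
def handle_nack(sent_packets, children, unicast_queue, nbr, sn, bit_map):
    """Queue one RETRANSMISSION OF BROADCAST per packet that nbr missed.

    sent_packets: {sn: [(src, v, sn_update, c), ...]} packets broadcast by node i
    children:     {src: set of neighbor nodes}, i.e. children_i(src)
    """
    for j in range(15, -1, -1):          # oldest missing packet first
        if (bit_map >> j) & 1:
            continue                     # packet sn - j was received
        missing_sn = sn - j
        if missing_sn not in sent_packets:
            continue                     # assumption: nothing known to resend
        update_msg = [u for u in sent_packets[missing_sn]
                      if nbr in children.get(u[0], ())]
        # The message is queued even when update_msg is empty.
        unicast_queue.append(
            ("RETRANSMISSION_OF_BROADCAST", missing_sn, update_msg))
```

Sending the message even when it carries no updates lets the neighbor node account for the missing sequence number and resume in-order processing.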

[0134] Upon receipt at neighbor node nbr of control packet Pkt(sn) transmitted on the broadcast channel by node i, the neighbor node nbr performs the following operations as illustrated by the following pseudo-code:

If the control packet Pkt(sn) is received in error{
    Append the control message NACK(sn, bit_map) to the message
        queue associated with the unicast channel to node i.}
Else if the control packet Pkt(sn) is received out of order (i.e., at
        least one previous sequence number is skipped){
    Withhold the processing of the control packet Pkt(sn).
    Append the control message NACK(sn, bit_map′) to the message
        queue associated with the unicast channel to node i.}
Else (control packet Pkt(sn) is received correctly and in order){
    For each Link-State Update message update_msg in Pkt(sn),
        call Process_Update(i, nbr, update_msg).}
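The receiver-side behavior may be sketched in Python as follows. The class and attribute names are hypothetical, the NACK is reduced to its sequence number (the bit map is omitted for brevity), and two behaviors are assumptions not stated in the pseudo-code: withheld packets are released once the gap is filled, and packets older than the next expected sequence number are ignored as duplicates.

```python
class BroadcastReceiver:
    """State kept by neighbor node nbr for broadcasts received from node i."""

    def __init__(self, last_sn):
        self.next_sn = last_sn + 1   # next in-order sequence number
        self.withheld = {}           # out-of-order packets awaiting the gap
        self.processed = []          # sequence numbers handed to Process_Update
        self.nacks = []              # sequence numbers carried in queued NACKs

    def on_packet(self, sn, updates, in_error=False):
        if in_error:
            self.nacks.append(sn)
        elif sn > self.next_sn:
            # Out of order: withhold processing and NACK.
            self.withheld[sn] = updates
            self.nacks.append(sn)
        elif sn == self.next_sn:
            self._process(sn, updates)
            # Assumption: release withheld packets that are now in order.
            while self.next_sn in self.withheld:
                self._process(self.next_sn, self.withheld.pop(self.next_sn))
        # sn < next_sn: treated as a duplicate and ignored (assumption).

    def _process(self, sn, updates):
        self.processed.append(sn)    # stand-in for Process_Update(i, nbr, ...)
        self.next_sn = sn + 1
```

In this sketch a RETRANSMISSION OF BROADCAST message is delivered through the same on_packet call, using the original broadcast sequence number that the message carries.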

[0135] When a communication link is established from node i to a new neighbor nbr, in one embodiment the node i obtains the current value of brdcst_sn_nbr from the NEIGHBOR message or NEIGHBOR ACK that was received from neighbor node nbr.

[0136] Each node i can dynamically select the transmission mode for link-state updates originating from each source node src. As described above, this decision uses a rule of the form k>g(n), where k is the number of children (for src) and n is the number of neighbors of node i. However, to ensure that updates are received in the correct order, or that the receiver has enough information to reorder the updates, node i sends an END OF BROADCAST(last_seq_no, src) message on the unicast channel to each child when the mode changes to UNICAST, and waits for all update packets sent on unicast channels to be ACKed before changing to BROADCAST mode.

[0137] To facilitate this process, each node i maintains a binary variable unacked_i(nbr, src) for each neighbor node nbr and source node src, indicating whether there are any unACKed control packets sent to neighbor node nbr containing link-state updates originating at source node src. The following exemplary pseudo-code illustrates an embodiment of a procedure that is executed periodically at each node i.

For each (node src){
    If (mode_i(src) == BROADCAST and |children_i(src)| <= g(n)){
        For each (node nbr in children_i(src)){
            Append the message END OF BROADCAST(brdcst_sn_i, src) to
                the message queue associated with the unicast
                channel to node nbr.}
        Set mode_i(src) = UNICAST.}
    If (mode_i(src) == UNICAST and |children_i(src)| > g(n)){
        Set switch_flag = YES.
        For each (node nbr in children_i(src)){
            If (unacked_i(nbr, src) == YES) Set switch_flag = NO.}
        If (switch_flag == YES) Set mode_i(src) = BROADCAST.}}
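The periodic mode-switching procedure may be sketched in Python as follows. The function signature and the dictionary-based state are assumptions made for illustration, and the default threshold g(n)=(n+1)/2 is only the example option mentioned earlier.

```python
def update_modes(mode, children, unacked, n, brdcst_sn, unicast_queues,
                 g=lambda n: (n + 1) / 2):
    """Run one pass of the periodic procedure at node i.

    mode:     {src: "BROADCAST" or "UNICAST"}, i.e. mode_i(src)
    children: {src: set of neighbors}, i.e. children_i(src)
    unacked:  {(nbr, src): bool}, i.e. unacked_i(nbr, src)
    n:        number of neighbors of node i
    """
    for src in sorted(mode):
        kids = children.get(src, set())
        if mode[src] == "BROADCAST" and len(kids) <= g(n):
            for nbr in sorted(kids):
                unicast_queues.setdefault(nbr, []).append(
                    ("END_OF_BROADCAST", brdcst_sn, src))
            mode[src] = "UNICAST"
        elif mode[src] == "UNICAST" and len(kids) > g(n):
            # Switch only when every child has ACKed its unicast updates.
            if not any(unacked.get((nbr, src), False) for nbr in kids):
                mode[src] = "BROADCAST"
```

Because a source that has just been switched to UNICAST cannot also satisfy the second condition in the same pass, the two tests are written here as a single if/elif chain.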

Full and Partial Topology TBRPF

[0138] In one embodiment, a result of running the TBRPF protocol is that each router 14 in the subnet 10 obtains the state of each link in the subnet 10 (or within a cluster if hierarchical routing is used). Accordingly, this embodiment of the TBRPF protocol is referred to as a full-topology link-state protocol. In other embodiments, described below, the TBRPF protocol is a partial-topology link-state protocol in that each router 14 is provided with, and maintains, only a subset of the communication links in the subnet 10.

[0139] For the full-topology link-state protocol embodiment (1) alternate paths and disjoint paths are immediately available, allowing faster recovery from failures and topology changes; and (2) paths can be computed subject to any combination of quality-of-service (QoS) constraints and objectives. Partial-topology link-state protocols provide each node 18 with sufficient topology information to compute at least one path to each destination. Whether implemented as a full-topology or as a partial-topology protocol, the TBRPF protocol is a proactive link-state protocol in that each node 18 dynamically reacts to link-state and topology changes and maintains a path to each possible destination in the subnet 10 at all times.

A Partial-Topology Embodiment

[0140] In one partial-topology embodiment, each routing node 14 decides which of its outgoing links (i, j), called “special links,” should be disseminated to all nodes in the subnet 10. This subset of links is maintained in a list L_i. All other outgoing links are sent only one hop (i.e., to all neighbor nodes of node i). Node i sends an update to its neighbor nodes if that update is the addition or removal of a link from the list L_i, or reflects a change in the state of a link in the list L_i.

[0141] Various rules can be used to define the set of special links in the list L_i. For example, one rule defines a link (i, j) to be in L_i only if node j is the parent of node i for some source node other than node j, or if node j belongs to the set children_i(src) for some source node src other than node i. This definition of special links includes enough links to provide minimum-hop paths between any pair of nodes. As a result, this partial-topology embodiment reduces the amount of control traffic without reducing the quality of the routes. In this embodiment, an update (u, v, c, sn, sp) is augmented to include an “sp” field (e.g., a single-bit field), which indicates whether the link (u, v) is a special link. Pseudo-code representing an exemplary implementation of the partial-topology embodiment appears in Appendix A, after the “Partial-Topology 1” header. The procedure Mark_Special_Links(i) is called upon a change to the parent p_i(src) or to the set of children nodes children_i(src).

A Second Partial-Topology Embodiment

[0142] In another partial-topology embodiment, each routing node 14, hereafter node i, maintains a topology table TT_i, a source tree Ti (i.e., computed paths to all destinations), a set of reported links Ri, and a set of neighbor nodes Ni. The entry of TT_i for a link (u, v) is denoted TT_i(u, v) and consists of the tuple (u, v, c, c′), where c is the cost associated with the link and c′ is the last cost reported to neighbor nodes for the link. The component c of the entry for link (u, v) is denoted TT_i(u, v).c. In addition, a parent p_i(u) and set of children nodes children_i(u) are maintained for each node u≠node i. The parent p_i(u) is the next node on a shortest path to node u, based on the information in TT_i. The source tree Ti, computed by a lexicographic version of Dijkstra's algorithm, is the set of links that belong to at least one of the computed paths. The set of reported links Ri includes the source tree Ti and any link in TT_i for which an update has been sent but a delete update has not since been sent. In addition, a binary variable pending_i(u) is maintained for each node u≠node i, which indicates that the parent p_i(u) is pending, i.e., that a NEW PARENT(u) message has been sent to p_i(u) but no response has yet been received. In general, each node i reports to neighbor nodes the current states of only those links in its source tree Ti, but sends only part of its source tree Ti to each neighbor node such that no node receives the same information from more than one neighbor node. Pseudo-code representing an exemplary implementation of this partial-topology embodiment of the TBRPF protocol appears in Appendix A, after the “Partial-Topology 2” header.

[0143] Upon receiving an update message, consisting of one or more updates (u, v, c), node i executes the procedure Update( ), which calls the procedure Update_Topology_Table( ), then executes the procedure Lex_Dijkstra( ) to compute the new source tree Ti and the procedure Generate_Updates( ) to generate updates and modify the set of reported links Ri based on changes in link costs and changes to the source tree Ti. Each generated update is then sent to the appropriate children, that is, updates for links with head u are sent to children_i(u). The procedure Update_Parents( ) is called, which determines any changes in the parent assignment and sends NEW PARENT and CANCEL PARENT messages.

[0144] The sending of updates can be accomplished in different ways, depending on whether the subnet 10 consists of point-to-point links, broadcast links, or a combination of both link types. In a network of point-to-point links, each neighbor node k would be sent a message that contains the updates for links (u, v) such that k belongs to children_i(u). If a broadcast capability also exists, updates for links (u, v) for which children_i(u) has more than one member can be broadcast to all neighbor nodes.

[0145] The procedure Update_Topology_Table( ) does the following for each update (u, v, c) in the input message (in_message) such that the parent p_i(u) is the neighbor node that sent the message. (Updates received from a node other than the parent are ignored.) If either TT_i does not contain an entry for (u, v) or contains an entry with a different cost than c, then TT_i(u, v) is updated with the new value c and link (u, v) is marked as changed. If the input message is a PARENT RESPONSE, then in addition to updates, the message contains the same list of sources as the NEW PARENT message to which it is responding. For each such source node u such that pending_i(u)=1 and for each link (u, v) in TT_i that is outgoing from source node u but for which the input message does not contain an update, the cost of (u, v) is set to infinity, to indicate that the link should be deleted. In other words, any link that was reported by the old parent but is not reported by the new parent is deleted. Only information from the current parent is considered valid.

[0146] The procedure Lex_Dijkstra( ) (not included in Appendix A) is an implementation of Dijkstra's algorithm that computes the lexicographically smallest shortest path LSP(i, u) from node i to each node u, using as path name the sequence of nodes in the path in the reverse direction. For example, the next-to-last node of LSP(i, u) has the smallest node ID among all possible choices for the next-to-last node. Such paths are computed using a modification of Dijkstra's algorithm in which, if there are multiple choices for the next node to label, the one with the smallest ID is chosen.
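Because Lex_Dijkstra( ) is not included in Appendix A, the following Python sketch shows only one way such a computation might be arranged, and is not the procedure of the embodiment. It assumes strictly positive link costs, mutually comparable node IDs, and a graph given as a dictionary of dictionaries; the function name and data layout are hypothetical. Each heap entry is keyed by the path cost and then by the path name in the reverse direction, so ties in cost are resolved in favor of the smaller node ID.

```python
import heapq


def lex_dijkstra(graph, source):
    """Compute the lexicographically smallest shortest path to every node.

    graph: {u: {v: cost}} with strictly positive costs (assumption).
    Returns {u: (cost, path)} where path is a tuple from source to u.
    """
    done = {}
    # Heap entries are (cost, reversed path), so equal-cost candidates are
    # ordered by the destination ID, then the next-to-last ID, and so on.
    heap = [(0, (source,))]
    while heap:
        cost, rev_path = heapq.heappop(heap)
        u = rev_path[0]
        if u in done:
            continue                  # a smaller label was already fixed
        done[u] = (cost, rev_path[::-1])
        for v, link_cost in graph.get(u, {}).items():
            if v not in done:
                heapq.heappush(heap, (cost + link_cost, (v,) + rev_path))
    return done
```

In this sketch the parent p_i(u) corresponds to the second element of the returned path for node u.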

[0147] The procedure Generate_Updates( ) decides what updates to include in the message to be sent to neighbor nodes. A non-delete update is included for any link (u, v) that is in the new source tree Ti and either is marked as changed or was not in the previous source tree (denoted old source tree Ti). In this case, TT_i(u, v).c′ is set to TT_i(u, v).c, and (u, v) is added to the reported link set Ri if not already in the reported link set Ri. A delete update is included for any link (u, v) that is in the reported link set Ri but is not in the source tree Ti, such that TT_i(u, v).c>TT_i(u, v).c′. In this case, (u, v) is removed from the reported link set Ri. Any links with infinite cost are erased from the topology table TT_i.
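The update-generation rules above may be sketched in Python as follows. This is an illustration under stated assumptions and not the Appendix A pseudo-code: the data layout (a dictionary of cost records read from the topology table, and sets of links) is hypothetical, and a delete update is represented here by an update carrying infinite cost.

```python
INF = float("inf")


def generate_updates(tt, new_tree, old_tree, reported, changed):
    """Return the updates to send and adjust the reported link set in place.

    tt:       {(u, v): {"c": cost, "c_reported": last reported cost}}
    new_tree, old_tree, reported, changed: sets of links (u, v)
    """
    updates = []
    for link in sorted(new_tree):
        if link in changed or link not in old_tree:
            entry = tt[link]
            entry["c_reported"] = entry["c"]
            reported.add(link)
            updates.append((link[0], link[1], entry["c"]))
    for link in sorted(reported):
        if link not in new_tree and tt[link]["c"] > tt[link]["c_reported"]:
            reported.discard(link)
            # Assumption: a delete update is signalled by an infinite cost.
            updates.append((link[0], link[1], INF))
    for link in [l for l, e in tt.items() if e["c"] == INF]:
        del tt[link]                  # erase links with infinite cost
    return updates
```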

[0148] The procedure Update_Parents( ) sets the new parent p_i(u) for each source node u to be the second node on the shortest path to node u. If there is no path to node u, p_i(u) is null. If the new parent is different from the old parent, then a NEW PARENT message is sent to the new parent (if it is not null) and a CANCEL PARENT message is sent to the old parent (if it is not null and the link to the old parent is still up). The NEW PARENT messages for all source nodes u having the same new parent are combined into a single message, and CANCEL PARENT messages are similarly combined.
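The parent-update step may be sketched in Python as follows. The function name, arguments, and return values are hypothetical; the sketch assumes that the shortest paths have already been computed and that links_up holds the neighbors whose links are currently up.

```python
from collections import defaultdict


def update_parents(paths, old_parent, links_up):
    """Recompute parents and build combined NEW PARENT / CANCEL PARENT lists.

    paths:      {u: [i, ..., u]} shortest path to each reachable source u
    old_parent: {u: previous parent or None}
    links_up:   set of neighbors whose link to node i is still up
    Returns (new_parent, new_parent_msgs, cancel_parent_msgs), where each
    message dictionary maps a neighbor to the list of sources it covers.
    """
    new_parent = {}
    new_msgs = defaultdict(list)
    cancel_msgs = defaultdict(list)
    for u in sorted(set(paths) | set(old_parent)):
        path = paths.get(u)
        # The parent is the second node on the path; null if no path exists.
        parent = path[1] if path and len(path) > 1 else None
        new_parent[u] = parent
        old = old_parent.get(u)
        if parent != old:
            if parent is not None:
                new_msgs[parent].append(u)
            if old is not None and old in links_up:
                cancel_msgs[old].append(u)
    return new_parent, dict(new_msgs), dict(cancel_msgs)
```

Grouping the sources by neighbor yields one combined NEW PARENT message and one combined CANCEL PARENT message per neighbor.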

[0149] The procedure Process_New_Parent( ) is executed when a NEW PARENT message is received from some neighbor node. For each source node u in the NEW PARENT message, the procedure adds the neighbor node to children_i(u) and includes in the PARENT RESPONSE message an update for each link (u, v) in the source tree Ti whose head is source node u, if such a link exists. (Such a link will not exist if node u is a leaf of source tree Ti.) As described above, the PARENT RESPONSE also includes the same list of sources as the NEW PARENT message to which it is responding. (This list is not necessary if the node sending the NEW PARENT message remembers the list and can match the PARENT RESPONSE to the NEW PARENT message.)

[0150] When the cost of a link to a neighbor node j changes, node i sets TT_i(i, j).c to the new cost and calls the procedure Update( ) with k=i and an empty input message. A threshold rule can be used so that TT_i(i, j).c is updated only if the percent difference between the new cost and the old cost is at least some given threshold. If a link to a neighbor node j fails, the same procedure is followed (with the cost changing to infinity), and node j is removed from the set of neighbor nodes Ni.
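A threshold rule of this kind may be sketched in Python as follows. The function name is hypothetical, and two points are assumptions: the percent difference is measured relative to the old cost, and a change to or from infinite cost (or from zero cost) is always treated as significant.

```python
import math


def should_update_cost(old_cost, new_cost, threshold_pct):
    """Decide whether TT_i(i, j).c should be replaced by new_cost."""
    if new_cost == old_cost:
        return False
    if math.isinf(old_cost) or math.isinf(new_cost) or old_cost == 0:
        return True                   # assumption: always significant
    percent_diff = abs(new_cost - old_cost) / old_cost * 100.0
    return percent_diff >= threshold_pct
```

A larger threshold suppresses updates for small cost fluctuations, at the price of routes being computed from slightly stale costs.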

[0151] When a link to a neighbor node j comes up, either initially or upon recovering from a failure, node i executes the procedure Link_Up(i, j), which adds neighbor node j to the set of neighbor nodes Ni, sets TT_i(i, j).c to the link cost, and calls the procedure Update( ) with k=i and an empty input message. This may result in a NEW PARENT message being sent to neighbor node j.

[0152] To correct errors that may appear in TT_i due to noisy transmissions or memory errors, each node i can periodically generate updates for its outgoing links. Since a received update is ignored unless it has a cost that differs from the entry in the topology table TT_i, the cost of the periodic update should be chosen to be slightly different from the previous update. Alternatively, each update can contain an additional bit b, which toggles with each periodic update.

[0153] FIG. 5 illustrates the operation of the second partial-topology embodiment of the TBRPF protocol when a communication link 142 between nodes B and D in the subnet 10 fails. The minimum-hop-path tree for source node B before the link failure is shown with solid arrows; the minimum-hop-path tree for source node C is shown with dashed arrows. As shown, node A selects node B as parent for source nodes B, D, and F, and selects node C as parent for source nodes C, E, and G. Therefore, node B reports link-state changes to node A only for links (B, A), (B, C), (B, D), and (D, F), and node C reports link-state changes to node A only for links (C, A), (C, B), (C, E), and (E, G). Neither node B nor node C would report a link-state change affecting link (F, G) to node A. Thus, unlike the full-topology embodiment of the TBRPF, in which each node 14 has link information for every link in the subnet 10, the nodes 18 of this partial-topology embodiment have link-state information for less than every link in the subnet 10.

[0154] If link (B, D) fails, as shown in FIG. 5, node B reports to nodes A and C that link (B, D) has failed (cost=infinity). Node C reports to node A that link (E, D) 144 has been added to node C's minimum-hop-path source tree. After receiving these updates, node A selects node C as its new parent for source nodes D and F, and sends a NEW PARENT message to node C and a CANCEL PARENT message to node B. Node C responds by sending node A an update only for link (D, F), because link (D, F) is the only link in node C's minimum-hop-path source tree with node D or node F as the head of a link. For example, node F is the head of the link (F, G), but the link (F, G) is not in node C's minimum-hop-path source tree and is therefore not reported to node A. Although the minimum-hop-path source tree of node A is modified during the update process, node A does not generate any updates because it has no children for any source other than itself (i.e., node A).

TBRPF Protocol Messages

[0155] To disseminate link-state updates to the appropriate nodes in the subnet 10, neighboring router nodes 14 that have established bi-directional links and performed data-link-to-IPv4 address resolution using TBRPF neighbor discovery (as described below) exchange TBRPF protocol messages. The IPv4 addresses are therefore available for use as node IDs in TBRPF protocol messages.

[0156] In one embodiment, the TBRPF protocol messages are sent via the User Datagram Protocol (UDP), which requires an official UDP-service port-number registration. The use of UDP/IPv4 provides several advantages over a data link level approach, including (1) IPv4 segmentation/reassembly facilities, (2) UDP checksum facilities, (3) simplified application level access for routing daemons, and (4) IPv4 multicast addressing for link state messages.

[0157] TBRPF protocol messages are sent to the IPv4 unicast address of a current neighbor or to the “All_TBRPF_Neighbors” IPv4 multicast address, presuming that an official IPv4 multicast address is assigned to “All_TBRPF_Neighbors.” In general, a message is sent to the IPv4 unicast address of a current neighbor node if all components of the message pertain only to that neighbor. Similarly, a message is sent to the All_TBRPF_Neighbors IPv4 multicast address if the message contains components that pertain to more than one neighbor. Nodes 14 are prepared to receive TBRPF protocol messages sent to their own IPv4 unicast address or the All_TBRPF_Neighbors multicast address.

[0158] Actual addressing strategies depend on the underlying data link layer. For example, for data links such as IEEE 802.11, a single, multiple access channel is available for all unicast and broadcast/multicast messages. In such cases, since channel occupancy for unicast and multicast messages is identical, it is advantageous to send a single message to the All_TBRPF_Neighbors multicast address rather than multiple unicast messages, even if the message contains components that pertain to only a subset of the current neighbor nodes. In other cases, in which point-to-point receiver-directed channels are available, sending multiple unicast messages may reduce contention on the multiple access broadcast channel.

Atomic TBRPF Message Format

[0159] FIG. 6 shows an exemplary embodiment of an individual (atomic) TBRPF protocol message 160 including a message header 162 followed by a message body 164. Atomic messages may be transmitted either individually or as components of a compound TBRPF protocol message having multiple atomic messages within a single UDP/IPv4 datagram. TBRPF message headers 162 are either 32 bits or 64 bits in length depending on whether the atomic message is BROADCAST or UNICAST.

[0160] The message header 162 includes a type field 166, a version field 168, a mode field 170, a number of sources field 172, an offset field 174, a link sequence number field 176, and a receiver identification field 178, which is used when the mode is defined as UNICAST.

[0161] The type field 166 (e.g., 4 bits) represents the atomic message type. The following are examples of atomic message types:

[0162] ACK 1

[0163] NACK 2

[0164] NEW_PARENT 3

[0165] CANCEL_PARENT 4

[0166] HEARTBEAT 5

[0167] END_OF_BROADCAST 6

[0168] LINK_STATE_UPDATE_A 7

[0169] LINK_STATE_UPDATE_B 8

[0170] RETRANSMISSION_OF_BROADCAST 9

[0171] The version field 168 (e.g., 3 bits) represents the TBRPF protocol version and provides a transition mechanism for future versions of the TBRPF protocol. Also, the version field 168 can assist the node 18 in identifying false messages purporting to be TBRPF protocol messages.

[0172] The mode field 170 (e.g., 1 bit) represents the transmission mode for the atomic TBRPF protocol message 160; the mode is either UNICAST or BROADCAST. UNICAST refers to an atomic message that must be processed by only a single neighbor node. BROADCAST refers to an atomic message that is to be processed by all neighbor nodes. (For IPv4 subnets, UNICAST implies a specific IPv4 unicast address, whereas BROADCAST implies the All_TBRPF_Neighbors IPv4 multicast address.) The following exemplary mode bits are defined:

[0173] UNICAST 0

[0174] BROADCAST 1

[0175] Messages of type ACK, NACK, NEW_PARENT, CANCEL_PARENT, RETRANSMISSION_OF_BROADCAST, and END_OF_BROADCAST are sent as UNICAST. Messages of type LINK_STATE_UPDATE_A and LINK_STATE_UPDATE_B may be sent as either UNICAST or BROADCAST.

[0176] The number of sources field 172 (e.g., 8 bits) represents the number of sources “Num_Sources” included in the atomic message 160. The field 172 takes a value from 1 to 255 for messages of type NEW_PARENT, CANCEL_PARENT, LINK_STATE_UPDATE_A, and LINK_STATE_UPDATE_B. All other message types set Num_Sources=0.

[0177] The offset field 174 (e.g., 12 bits) represents the offset (in bytes) from the 0'th byte of the current atomic message header 162 to the 0'th byte of the next atomic message header 162 in the “compound message” (described below). An offset of 0 indicates that no further atomic messages follow. The 12-bit offset field 174, for example, imposes a 4-kilobyte length restriction on individual atomic messages.

[0178] The sequence number field 176 (e.g., 4 bits) represents the link sequence number (“LSEQ”) for this TBRPF protocol message 160 .
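The exemplary field widels may be illustrated by packing the first 32 bits of the header 162 in Python, as sketched below. Several points are assumptions made for illustration: the offset field is taken to be 12 bits wide (the width at which the six fields total 32 bits and which matches the stated 4-kilobyte limit), the fields are laid out most significant first in the order listed, the header is sent in network byte order, and a UNICAST header appends a 32-bit receiver identification to reach 64 bits. The function names are hypothetical.

```python
import struct

UNICAST, BROADCAST = 0, 1

# (name, width in bits), most significant field first (assumed layout).
FIELDS = [("type", 4), ("version", 3), ("mode", 1),
          ("num_sources", 8), ("offset", 12), ("lseq", 4)]


def pack_header(receiver_id=None, **values):
    """Pack an atomic message header into 4 (BROADCAST) or 8 (UNICAST) bytes."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    header = struct.pack("!I", word)
    if values["mode"] == UNICAST:
        if receiver_id is None:
            raise ValueError("UNICAST header needs a receiver identification")
        header += struct.pack("!I", receiver_id)
    return header


def unpack_header(data):
    """Inverse of pack_header; returns a dictionary of field values."""
    (word,) = struct.unpack_from("!I", data, 0)
    fields, shift = {}, 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    if fields["mode"] == UNICAST:
        (fields["receiver_id"],) = struct.unpack_from("!I", data, 4)
    return fields
```

For example, a BROADCAST LINK_STATE_UPDATE_A header (type 7) with version 1, two sources, an offset of 40 bytes, and LSEQ 5 packs to the four bytes 73 02 02 85 under these assumptions.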

[0179] The receiver identification field 178 (e.g., 32 bits) re