The Second Internet by Lawrence Hughes - HTML preview


Chapter 4 – The Depletion of the IPv4 Address Space

Many people today are aware that the folks in charge of the Internet are starting to run low on addresses. Most of them are not aware that this is not the first time we’ve faced this, or just how low that pool of addresses is today. The majority of Internet users are completely oblivious to what is going on and think that the Internet will go on as it has, forever. If they have heard any rumors about an address shortage, they have a blind faith that the people in charge can simply work some magic and the problem will go away. Well, they did once, in the mid 1990’s (with NAT), but they are all out of tricks this time around. IPv4 is simply out of gas, and it is time to start using its successor, IPv6.

4.1 – OECD IPv6 Report, March 2008

The best study on this done to date (in my opinion) is in the OECD report presented at the OECD Ministerial Meeting on the Future of the Internet Economy, in Seoul, Korea, 17-18 June, 2008. I was a speaker at the concurrent Korean IPv6 Summit. The full name of OECD is Organisation for Economic Co-operation and Development. It was established in 1961, and currently has 30 member nations, including most members of the EU, plus Australia, Canada, Japan, Korea, Mexico, New Zealand, Turkey, the UK and the US. It had a 2009 budget of EUR 320 million. Their goals are to:

  • Support sustainable economic growth
  • Boost employment
  • Raise living standards
  • Maintain financial stability
  • Assist other countries’ economic development
  • Contribute to growth in world trade

Unlike the IETF or ISO, the OECD is not specifically concerned with technology. However, they have determined that the imminent exhaustion of the IPv4 address space will have a major impact on most of their goal areas, hence they did a major study, the results of which are presented in Ministerial Background Report DSTI/ICCP(2007)20/FINAL, “Internet Address Space: Economic Considerations in the Management of IPv4 and in the Deployment of IPv6”. The report is available free by download over the Internet (search for “OECD IPv6 Report”). You should actually read the entire report, but I will summarize the most important aspects of it in this chapter. Let me quote one paragraph from the Main Points section:

“There is now an expectation among some experts that the currently used version of the Internet Protocol, IPv4, will run out of previously unallocated address space in 2010 or 2011, as only 16% of the total IPv4 address space remains unallocated in early 2008. The situation is critical for the future of the Internet economy because all new users connecting to the Internet, and all businesses that require IP addresses for their growth, will be affected by the change from the current status of ready availability of unallocated IPv4 addresses.”

As of this writing, in early 2010, only 8% of the addresses remain unallocated. The current best estimates are that the IANA address pool will be exhausted by July 2011, and all RIRs will exhaust their supply within six months after that (some potentially even earlier).

Another key passage from this section follows:

“As the pool of unallocated IPv4 addresses dwindles and transition to IPv6 gathers momentum, all stakeholders should anticipate the impacts of the transition period and plan accordingly. With regard to the depletion of the unallocated IPv4 address space, the most important message may be that there is no complete solution and that no option will meet all expectations. While the Internet technical community discusses optional mechanisms to manage IPv4 address space exhaustion and IPv6 deployment and to manage routing table growth pre- and post- exhaustion, governments should encourage all stakeholders to support a smooth transition to IPv6.”

“IPv6 adoption is a multi-year, complex integration process that impacts all sectors of the economy. In addition, a long period of co-existence between IPv4 and IPv6 is projected during which maintaining operations and interoperability at the application level will be critical. The fact that each player is capable of addressing only part of the issue associated with the Internet-wide transition to IPv6 underscores the need for awareness raising and co-operation”.

Basically, there is no solution for those wanting to remain with IPv4. It is going to take multiple years to make the transition. There are only two years left, so March 2010 is really the last possible date to begin a smooth and affordable introduction of IPv6. Any later start will involve unnecessary expense and crisis management, towards the end of the IPv4 lifetime. Such transitions are usually not done well when rushed. And once the addresses are gone, that’s it.

The report acknowledges that in the early phases of a major technology transition such as this, there may be little or no incentive to shift to the new technology. However, once a critical mass of users adopts the new technology, there is often a tipping point after which adoption grows rapidly until it is widespread. In theory this tipping point is reached when the marginal cost, for an ISP or an organization, of implementing the next device with IPv4 becomes higher than the cost of deploying the next device with IPv6. For an ISP, there are costs associated with deploying IPv4 nodes, such as the cost of obtaining the addresses themselves and the costs of designing and deploying network infrastructure that uses fewer and fewer public (globally routable) addresses (by using NAT). When these become higher than the cost of deploying IPv6, ISPs will begin migration in earnest. Reaching this tipping point depends on a number of factors, including customer demand, opportunity costs, emerging markets, the introduction of new services, government incentives, and regulation.

One of the key requirements for migrating to IPv6 is technical expertise in the subject. This is necessary to provide economies and companies with competitive advantage in the area of technology products and services, and the benefit from ICT-enabled innovation. Countries who are early adopters, and provide training and incentives for their companies to embrace it, or even help fund the necessary infrastructure (as in China) will have significant competitive advantages in years to come over countries that are laggards in this transition.

Increasing scarcity of IPv4 addresses can raise competitive concerns in terms of barriers to new entry and strengthening incumbent positions. There has been much discussion over how to manage previously allocated IPv4 addresses once the free pool has been exhausted. Will a black (or even a legitimate) market evolve for IPv4 addresses? Will companies that have more than they need be selling them on eBay? It’s possible that some companies might even be acquired in order to obtain a large number of addresses (as happened when Compaq bought Digital Equipment Corporation, and then again when HP bought Compaq). Today, you only borrow (lease?) addresses from an ISP for so long as you have service with that ISP. If you terminate that service, the addresses are reclaimed by the ISP for allocation to other customers. You don’t really own those addresses, so you can’t sell them. Even the ISP doesn’t own them. If an ISP goes out of business, its address pool probably returns to the RIR it got them from. Some of these situations are not currently well defined, but they will be as the IPv4 address space nears exhaustion. Notably, the situation on the early Class A block allocations is not quite so well defined. Those blocks may be owned by those early adopter companies.

There is also discussion of how the existing and increasing use of NAT requires developers of network-aware products and applications to build increasingly complex central gateways or NAT traversal mechanisms to allow clients, who in most cases are both behind NAT gateways, to communicate. This is creating barriers to innovation and the development of new services, and is harming the overall performance and stability of the Internet.

There is a risk of some parts of the world deploying IPv6, while others continue running IPv4 with multiple layers of NAT. Such decisions would impact the economic opportunities offered by the Internet with severe repercussions in terms of stifled creativity and deployment of generally accessible new services. Also, there could be serious issues of interoperation between people in the IPv6 world and those left behind in the IPv4 world. This could lead to a fragmentation of the Internet.

The five sections of the report cover the following topics:

  • Overview of the major initiatives that have taken place in Internet addressing to-date, and the parallel development of institutions that manage Internet addressing
  • Summary of proposals under consideration for management of remaining IPv4 addresses
  • Overview of the drivers and challenges for transitioning to IPv6 through a dual stack (IPv4 + IPv6) environment. It reviews factors that influence IPv6 adoption, drawing on available information.
  • Economic and public policy considerations and recommendations to governments
  • Lessons learned from several IPv6 deployments

4.2 – OECD Follow-up Report, April 2010

In April 2010, the OECD released a follow-up report to the IPv6 report mentioned above. It is called “Internet Addressing: Measuring Deployment of IPv6”. They still expect IPv4 addresses to run out in 2012. As of March 2010, only 8% of the full IPv4 address space is available for allocation. Currently, IPv6 use is growing faster than IPv4 use, albeit from a still small base. Several large-scale deployments are taking place or are in planning. Some of the key findings, all as of March 2010, are:

  • 5.5% of the networks on the Internet (1,800 networks) can handle IPv6 traffic
  • IPv6 networks have grown faster than IPv4-only networks since mid-2007
  • Demand for IPv6 address blocks has grown faster than demand for IPv4 address blocks.
  • One out of five transit networks (i.e. networks that provide connections through themselves to other networks) handle IPv6. This means that Internet infrastructure players are actively readying for IPv6.
  • As of January 2010, over 90% of installed operating systems are IPv6 capable, and 25% of end users ran an operating system that enabled IPv6 by default (e.g. Windows Vista or Mac OS X). This percentage has probably increased since the release of Windows 7, but no measurement is available.
  • As of January 2010, 1.45% of the top 1000 websites were available over IPv6, but as of March 2010 (when Google IPv6-enabled their websites) this jumped to 8%.
  • Over 4,000 IPv6 prefixes (address blocks) had been allocated. Of these 2,500 (60%) showed up as routed on the Internet backbone (were actually in use).
  • At least 23% of Internet eXchange Points explicitly supported IPv6
  • 7 out of 13 DNS Root Servers are accessible over IPv6
  • 65% of Top Level Domains (TLDs) had IPv6 records in the root zone file
  • 80% of TLDs have name servers with an IPv6 address
  • 1.5 million domain names (about 1% of the total) had IPv6 DNS records

Operators in the RIPE and APNIC service areas were given a survey in 2009.

  • 7% of APNIC respondents claimed to have equal or more IPv6 traffic than IPv4 traffic
  • 2% of RIPE respondents claimed to have equal or more IPv6 traffic than IPv4 traffic
  • Of those respondents not deploying IPv6, 60% saw cost as a major barrier
  • Of those respondents deploying IPv6, 40% considered lack of vendor support the main obstacle

img20.png

Routed IPv6 Prefixes, 2004 to 2009

img21.png

IPv6 unique Autonomous Systems, 2003 to 2009

Source: ITAC/NRO Contribution to the OECD, Geoff Huston and George Michaelson, data from end of year 2009.

Since 2008, the ratio of routed IPv6 prefixes to IPv4 prefixes has climbed from 0.45% to 0.8%, which indicates that the number of routed IPv6 prefixes is increasing more rapidly than that of routed IPv4 prefixes. The ratio of IPv6 to IPv4 AS entities actively routing went from about 3.2% in 2008 to 5.5% in 2010.

The compound annual growth rate from 24 February 2009 to 5 November 2009 for dual stack ASes was 52%, for IPv6-only ASes was 13%, and for IPv4-only ASes was 8%. At year end 2009, there were 31,582 ASes using IPv4-only, there were 1806 ASes using dual stack, and there were 59 ASes using IPv6-only.
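As a rough cross-check, the year-end 2009 AS counts quoted above can be turned into the share of networks that route IPv6. This is only a sketch: it assumes that "IPv6 capable" means dual stack plus IPv6-only, and the function name is invented for illustration. Under that assumption the result lands close to the 5.5% figure cited earlier.

```python
# Year-end 2009 AS counts quoted in the text above.
IPV4_ONLY = 31_582
DUAL_STACK = 1_806
IPV6_ONLY = 59

def ipv6_share(ipv4_only: int, dual_stack: int, ipv6_only: int) -> float:
    """Percentage of all ASes that route IPv6 (dual stack or IPv6-only).

    Assumption: an AS counts as IPv6 capable if it is dual stack or IPv6-only.
    """
    total = ipv4_only + dual_stack + ipv6_only
    return 100.0 * (dual_stack + ipv6_only) / total

share = ipv6_share(IPV4_ONLY, DUAL_STACK, IPV6_ONLY)
print(f"{share:.2f}% of ASes route IPv6")
```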

One trend is that service providers, corporations, public agencies and end-users are using IPv6 for advanced and innovative activities on private networks. IPv6 is also being used in 6LoWPAN (IPv6 over Low power Personal Area Networks), as specified in RFC 4944, “Transmission of IPv6 Packets over IEEE 802.15.4 Networks”, September 2007.

4.3 – How IPv4 Addresses Were Allocated in the Early Days

In the early days, before IANA and the RIRs were created, IPv4 addresses were actually allocated manually by a single individual, Jon Postel. He never dreamed how large the Internet would grow, or that it would be a worldwide phenomenon that had a major impact on most world economies. He is the one responsible for allocating large chunks (“Class A” blocks) to a few early adopters (e.g. HP, Apple, and M.I.T.). Unfortunately, those allocations are very difficult to undo today, so about 1/3 of all the addresses allocated in the U.S. belong to fewer than 50 organizations. The IANA now just considers those legacy allocations, and has tried to do the best it could with the address space remaining at the time it took over allocation.

4.3.1 – Original “Classful” Allocation Blocks

The first 50% of the full IPv4 address space (0.0.0.0 to 127.255.255.255) was divided up into 128 “Class A” blocks (now known as “/8” or “slash-8” blocks). Each of these contained 2^24 - 2, or some 16.8 million usable addresses. Here is a list of some of the lucky organizations that own these blocks today, either from the original allocation or by buying other companies that owned them.

img22.png

Another 25% of the full address space (128.0.0.0 to 191.255.255.255) was divided up into 16,384 “Class B” blocks (now known as “/16” blocks). Each of these contained 2^16 - 2, or 65,534 usable addresses.

Another 12.5% of the full address space (192.0.0.0 to 223.255.255.255) was divided up into about 2.1 million “Class C” blocks (now known as “/24” blocks). Each of these contained 2^8 - 2, or 254 usable addresses.

Another 6.25% of the full address space (224.0.0.0 to 239.255.255.255) was reserved for multicast (these are known as Class D addresses). There is no way to “recover” any of this address space.

The final 6.25% of the full address space (240.0.0.0 to 255.255.255.255) was reserved for future use, experimentation and limited broadcast. These are known as Class E addresses. These addresses cannot be “recovered” without modifications to essentially every router in the world (most routers block them by default – in many routers this is not even configurable).
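The classful arithmetic above (share of the address space, number of blocks, usable addresses per block) can be reproduced with a short sketch. The table layout and the function name are illustrative only. Class C is taken to end at 223.255.255.255, since 224.0.0.0 onward is the multicast range.

```python
import ipaddress

TOTAL_IPV4 = 2 ** 32

# (class name, first address, last address, prefix length of one block)
CLASSES = [
    ("A", "0.0.0.0", "127.255.255.255", 8),
    ("B", "128.0.0.0", "191.255.255.255", 16),
    ("C", "192.0.0.0", "223.255.255.255", 24),
]

def describe(first: str, last: str, prefix_len: int) -> dict:
    """Return share of the IPv4 space, block count and usable hosts per block."""
    span = int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first)) + 1
    block_size = 2 ** (32 - prefix_len)
    return {
        "share_percent": 100.0 * span / TOTAL_IPV4,
        "blocks": span // block_size,
        # Two addresses per block (network and broadcast) are not usable.
        "usable_per_block": block_size - 2,
    }

summary = {name: describe(first, last, plen) for name, first, last, plen in CLASSES}
for name, info in summary.items():
    print(f"Class {name}: {info}")
```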

The subblock of Class E from 255.0.0.0 to 255.255.255.255 is actually used for “limited broadcast” (limited because it will not cross routers). A packet sent to any of these addresses will be received by all nodes on your LAN. Of these, normally only the address 255.255.255.255 is used. There is no broadcast in IPv6 (although there is a multicast address that has much the same effect).

The U.S. Department of Defense has 10 “/8” blocks, for about 168 million addresses. This is almost 4% of the total IPv4 address space. One entire “/8” block (127.x.x.x) has only one address used, which is 127.0.0.1 (the IPv4 “loopback” address, used to address your own node). A small block at 169.254.0.0/16 is reserved for IPv4 Link Local usage (similar to IPv6 link-local addresses). For details, see RFC 5735, “Special Use IPv4 Addresses”, January 2010.
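The figures for the ten “/8” blocks can be checked with a few lines of arithmetic. The helper below is a hypothetical name used only for this illustration.

```python
SLASH8_SIZE = 2 ** 24   # addresses in one "/8" block
TOTAL_IPV4 = 2 ** 32    # size of the full IPv4 address space

def slash8_share(block_count: int):
    """Return (address count, percentage of the IPv4 space) for N "/8" blocks."""
    addresses = block_count * SLASH8_SIZE
    return addresses, 100.0 * addresses / TOTAL_IPV4

addresses, percent = slash8_share(10)
print(f"{addresses:,} addresses, {percent:.2f}% of the IPv4 space")
```

The same function applies to any count of “/8” blocks, for example the number still unallocated at a given date.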

One “/8” block (10.0.0.0/8), one “/12” block (172.16.0.0/12) and one “/16” block (192.168.0.0/16) were reserved for use as “private” addresses by RFC 1918, “Address Allocation for Private Internets”, February 1996. These addresses can be used by any organization for any internal network, but should never be routed onto the Internet (although in practice you can sometimes find these addresses on the backbone due to misconfigured routers). These would correspond to internal phone “extensions” such as 101, 102, etc. Every company with a PBX might use that same set of extensions.
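A minimal sketch of how software might test whether an address falls in one of the three RFC 1918 private blocks, using Python's standard ipaddress module. The function name is an assumption made for this example. The blocks are listed explicitly rather than relying on the module's broader is_private property, which also covers other reserved ranges.

```python
import ipaddress

# The three private blocks reserved by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(addr: str) -> bool:
    """True if addr lies inside one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_BLOCKS)

# Number of addresses in each private block.
sizes = {str(net): net.num_addresses for net in RFC1918_BLOCKS}

for candidate in ("10.1.2.3", "172.31.255.255", "172.32.0.1", "192.168.0.10"):
    print(candidate, is_rfc1918(candidate))
```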

As of 4 June 2010, only 16 of the possible 256 “/8” blocks (about 6.25% of the full address space) are still unallocated. Here is a map of the status of all 256 “/8” blocks. By September 2011, (or earlier) there won’t be any dots left. All the blocks with dots (unallocated “/8”s) in the chart today will be allocated to one of the RIRs (ARIN, RIPE, APNIC, LACNIC or AfriNIC).

img23.png

img24.png

Almost all of the “Legacy, early allocation” blocks are in the U.S., so ARIN’s real share of the total IPv4 address space is over 40% (for less than 5% of the world’s population).

You can check the official status of the remaining allocations at any time, at:

http://www.iana.org/assignments/ipv4-address-space/ipv4-address-space.xml

4.3.2 – Classless Inter-Domain Routing (CIDR)

The original allocation block sizes (Classes A, B & C) did not fit all organizations. For many organizations, even the smallest block (Class C) was too big. If we had stuck with the original allocation block sizes, we would have run out of addresses around 1997. When this was realized, the IETF introduced Classless Inter-Domain Routing as defined in RFC 1518, “An Architecture for IP Address Allocation with CIDR”, September 1993; and RFC 1519 “Classless Inter-Domain Routing (CIDR): An Address Assignment and Aggregation Strategy”, September 1993. CIDR allowed allocation blocks to be split along any bit of the 32 possible bit locations, not just at 24, 16 and 8 bits. Some useful CIDR allocation block sizes are:

img25.png

img26.png

CIDR allows a closer fit to actual organization size than the old classful “3 sizes fit all” scheme. However, each allocated block requires an entry in the core routing tables. As we allocate smaller and smaller blocks, the number of entries in the core routing tables is growing very rapidly. Many things are beginning to go wrong as we get closer and closer to an empty barrel.
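To make the CIDR idea concrete, the sketch below computes the block size for an arbitrary prefix length and splits one block into smaller ones. The network 192.0.2.0/24 is a documentation range chosen for illustration, and block_size is a name made up for this example.

```python
import ipaddress

def block_size(prefix_len: int) -> int:
    """Number of addresses in an IPv4 block with the given prefix length."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    return 2 ** (32 - prefix_len)

for prefix_len in (8, 16, 20, 24, 28, 30):
    print(f"/{prefix_len}: {block_size(prefix_len):,} addresses")

# With CIDR, a former "Class C" sized block can be split on any bit boundary,
# here into four /26 blocks of 64 addresses each.
parent = ipaddress.ip_network("192.0.2.0/24")
children = [str(net) for net in parent.subnets(new_prefix=26)]
print(children)
```

Each child block would need its own routing table entry if announced separately, which is the routing table growth the text describes.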

In the mid 1990’s, there were steps taken (NAT and Private Addresses) to further limit the number of addresses being allocated to each organization. NAT was only ever envisioned by its creators as a “quick fix” that would buy us a few years to really solve the problem. They understood all the problems NAT would cause, and were willing to live with them for a short time, when the alternative was to run out of IPv4 addresses somewhere around 1997. For the real long-term fix, the IETF also began working on the next generation Internet Protocol with a much larger address space. That next generation Internet Protocol is complete, mature and available to deploy today. It is called IPv6.

4.4 – Problems Introduced by Customer Premise Equipment NAT (CPE NAT)

Since the mid 1990’s we have been living with problems created by the introduction of Network Address Translation doing conventional “hide mode” (Cone) NAT at the Customer Premise (CPE NAT). These include:

  • Difficulty for internal nodes to accept incoming connections, for VoIP (SIP), Peer-to-Peer (P2P), running your own mail (SMTP), web (HTTP/HTTPS), File Transfer (FTP/SSH) or other servers.
  • Problems with protocols that embed IPv4 addresses in packet transmissions (SIP, many games)
  • Problems with protocols that detect tampering to IP and/or TCP/UDP header fields (e.g. IP addresses, port numbers), such as IPsec Authentication Header (AH).
  • Problems due to advances in web technology (primarily Web 2.0 / AJAX) that use large numbers of connections, each over a different port, such as iTunes and Google Maps. This can be as high as 200 ports per application. Since NAPT systems share the 65,536 possible ports associated with a single “real” IPv4 address among the nodes hidden behind each address, each internal user on average can use at most 65,536 divided by the number of users behind that address. In enterprise networks, this might (until recently) have been thousands or tens of thousands of nodes behind one real address. For 1,000 nodes, on average each user could use no more than 65 ports. For 10,000 nodes, on average each user could use no more than 6 ports. To allow each user up to 200 ports, no more than about 300 users (65,536 divided by 200 is roughly 327) should be hidden behind each IPv4 address. Currently, the average number of ports used per user is actually quite low (less than 10), but this is expected to grow rapidly as more users begin using Web 2.0 / AJAX type applications. If possible, NAT schemes should use ports on a first come, first served basis, rather than allocating 1/n of the possible ports to each node.

  • Difficulty of tracking abuse to specific users behind a NAT. This requires keeping large amounts of information including source IP address, destination IP address, port number(s) and accurate time stamps for every connection. This may have to be kept for up to one year. A year’s worth of such data for a single user can be tens of gigabytes to terabytes in size. Multiplied by the number of users, this is a staggering amount of storage that ISPs are required to keep. Hackers love to “hide behind” NAT gateways.

Essentially, private IP addresses behind “hide mode” NAT are good only for outgoing connections using the simplest connectivity paradigms (e.g. client to server, using a small total number of ports per user).
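The port-sharing arithmetic from the list above can be sketched in a few lines. It assumes the simplest model, in which all 65,536 port numbers of one real address are divided evenly among the users behind it. Both function names are invented for this example.

```python
PORTS_PER_ADDRESS = 65_536  # port numbers available on one real IPv4 address

def avg_ports_per_user(users: int) -> int:
    """Average ports each user can hold if the ports are shared evenly."""
    return PORTS_PER_ADDRESS // users

def max_users(ports_needed: int) -> int:
    """Largest number of users that can each be given ports_needed ports."""
    return PORTS_PER_ADDRESS // ports_needed

print(avg_ports_per_user(1_000))   # users behind one address in a large enterprise
print(avg_ports_per_user(10_000))
print(max_users(200))              # users per address if each needs 200 ports
```

A first come, first served allocator would let busy users exceed the average while idle users hold none, which is why the text prefers it to fixed 1/n ranges.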

It is possible to allow at most one internal node to accept incoming connections on a given port (e.g. 25 for SMTP, 80 for HTTP, etc) for a given real IPv4 address, using port forwarding. For example, your NAT gateway can be configured to forward any incoming connection to its real IPv4 address on port 25, to the private address of a single internal node where an e-mail server is running. The gateway could also forward incoming connections to its real IPv4 address on port 80 to the same or a different internal node’s private address where a web server is running. This limits the entire LAN (or that part of it behind a given real IPv4 address) to a single server for any given port number. This still translates the destination IPv4 address on the way in, and the source IPv4 address on the way out (but not port numbers), still causing many of the problems listed above.
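The one-server-per-port limitation of port forwarding follows directly from how the forwarding table is keyed. The class below is a toy model, not the configuration interface of any real gateway. All names and addresses are made up for illustration.

```python
class PortForwarder:
    """Toy model of a NAT gateway's port forwarding table."""

    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self.rules = {}  # public port -> private address of the internal server

    def add_rule(self, port: int, private_ip: str) -> None:
        # Only one internal node can own a given port on the real address.
        if port in self.rules:
            raise ValueError(f"port {port} is already forwarded to {self.rules[port]}")
        self.rules[port] = private_ip

    def translate_inbound(self, dst_port: int):
        """Return the (private address, port) for an incoming connection.

        The destination address is rewritten and the port is left unchanged.
        None means no rule matches and the connection is dropped.
        """
        private_ip = self.rules.get(dst_port)
        return None if private_ip is None else (private_ip, dst_port)

gateway = PortForwarder("203.0.113.5")
gateway.add_rule(25, "192.168.1.10")   # mail server
gateway.add_rule(80, "192.168.1.11")   # web server

try:
    gateway.add_rule(80, "192.168.1.12")  # a second web server cannot share port 80
    second_web_server_allowed = True
except ValueError:
    second_web_server_allowed = False
```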

Some firewalls (or other NAT gateways) in addition to “hide mode” (Cone) NAT for outgoing connections, and port forwarding, also support bidirectional NAT (called BINAT, symmetric NAT, and “1 to 1” NAT among other names). This type of NAT makes a two way address translation between a single real IP address and a single private internal address (hence “1 to 1”). The full 65,536 possible ports may be used on the internal node, but a distinct real IPv4 address is required for each such NAT mapping. This would allow deployment of multiple web servers within a LAN, or an easy way to provide access to many services on a single node (e.g. a Windows Server based computer). This still translates the destination IP address of packets on the way in, and the source IP address of packets on the way out (but not port numbers) still causing many of the problems listed above. In addition it uses up one real address per internal server, and requires addressing the “missing ARP” problem (caused by the fact that there is no physical node at the external address to respond to ARP queries). This can be solved by configuring a static ARP for the external IPv4 address on the NAT gateway, or various other solutions. This is one of the most difficult and least widely understood aspects of managing a NAT gateway.

There are also various NAT Traversal protocols (STUN, TURN, SOCKS, NAT-T, etc) that allow incoming connections to internal nodes that have only private addresses (without any port forwarding or BINAT support in the NAT gateway). These typically require an outside server to assist. STUN uses the outside server only to establish the connection, while TURN also routes all traffic through the outside gateway. All involve encapsulating traffic over UDP, which complicates error detection and recovery, as well as supporting the “connection oriented” nature of TCP traffic. All require extensive modifications to the source code of clients, which is quite complex and very specific to the NAT traversal algorithm used. Often the external servers used are not under control of the network, leading to security issues. One of the most popular network applications (Skype) uses standard UDP encapsulated “hole punching” traversal, which causes many security issues.

RFCs related to NAT Traversal

  • RFC 1928, “SOCKS Protocol Version 5”, March 1996 (Standards Track)
  • RFC 3489, “Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)”, March 2003 (Standards Track, Obsoleted by RFC 5389)
  • RFC 3947, “Negotiation of NAT-Traversal in the IKE”, January 2005 (Standards Track)
  • RFC 3948, “UDP Encapsulation of IPsec ESP Packets”, January 2005 (Standards Track)
  • RFC 5389, “Session Traversal Utilities for NAT (STUN)”, October 2008 (Standards Track)
  • RFC 5766, “Traversal Using Relays Around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN)”, March 2010 (Standards Track, awaiting final approval)

4.5 – Implementing NAT at the Carrier: Carrier Grade NAT (CGN) or Large Scale NAT (LSN)

As we progress from the “end times” for IPv4, to “life after IPv4” (beyond the depletion date for IPv4), those who have not already migrated to IPv6 will face even greater problems, as ISPs deploy Carrier Grade NAT or Large Scale NAT solutions in their networks, as opposed to at the Customer Premise (CPE NAT). The reason for this is to try to make optimal use of an even smaller number of real IPv4 addresses than is possible with CPE NAT. Essentially the ISP will have a very small pool of real IPv4 addresses (fewer than the number of customers). They will share single real IPv4 addresses across customers. This will make the problems associated with CPE NAT dramatically worse. There is excellent coverage of the issues associated with deploying NAT in the carrier in draft-ford-shared-addressing-issues-02 (2010-03-08). The schemes discussed in this Internet Draft include:

  • Dual-Stack Lite, draft-ietf-softwire-dual-stack-lite
  • Carrier Grade NAT (CGN), draft-nishitani-cgn
  • NAT64, draft-ietf-behave-v6v4-xlate-stateful
  • IVI, draft-ietf-behave-v6v4-xlate
  • Address+Port (A+P) proposals, draft-ymbk-aplusp, draft-boucadair-port-range
  • Scalable Multihoming across IPv6 – Stateless Address Mapping, draft-despres-sam

Of these, only Dual-Stack Lite makes dual stack service available to users. It provides direct IPv6 service (no NAT, no tunneling). It provides IPv4 service tunneled over IPv6 with only one level of NAT44 (which takes place at the carrier). Customers will get only private IPv4 addresses. It is possible that some ISPs may provide a few precious “real” (globally routable) IPv4 addresses to business customers at a significant price premium (all the market will bear). All of the NAT schemes extend the address space by adding port information. They differ in the way they manage the port value.

With CPE NAT, a given real IPv4 address covered only one legal entity (a home, a company, etc). With carrier based NAT, multiple legal entities will be behind most real IPv4 addresses, which will vastly complicate the legal issues (such as tracking down a source of network abuse or being able to prove who really did something).

You will see the terms NAT444 and NAT464 in discussions of carrier based NAT. The existing NAT that is widely deployed is now called NAT44 (NAT from IPv4 to IPv4). There is also NAT46 (NAT from IPv4 to IPv6) and NAT64 (NAT from IPv6 to IPv4).

NAT444 essentially leaves the CPE NAT44 (the existing one layer NAT that is widely deployed today) intact at the Customer Premise, while the carrier deploys a second layer of NAT44 before it ever reaches the customer. It is really just two NAT44 mechanisms in series. The CPE NAT44 will map the private addresses supplied from the carrier NAT44 onto yet another set of internal private addresses. The transport from carrier to customer is also over IPv4. The difference from existing systems is that today the CPE NAT usually has one real IPv4 address which it shares among multiple internal nodes. In NAT444 systems, there won’t be even one real IPv4 address at the customer premise. It will be quite difficult (and probably very expensive) to host servers with public IPv4 addresses (e.g. web, mail, VoIP) at customer sites – most will have to be hosted at a collocation facility.
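The "two NAT44 mechanisms in series" idea can be sketched with a toy address and port translator chained twice. This is a simplification under stated assumptions: ports are handed out sequentially, mappings never expire, and port exhaustion is not handled. The class name and all addresses are hypothetical.

```python
import itertools

class Napt:
    """Toy hide-mode NAT44: maps (inside ip, port) to (outside ip, port)."""

    def __init__(self, outside_ip: str, first_port: int = 1024):
        self.outside_ip = outside_ip
        self._ports = itertools.count(first_port)  # next free outside port
        self.table = {}                            # (in ip, in port) -> (out ip, out port)

    def outbound(self, src_ip: str, src_port: int):
        key = (src_ip, src_port)
        if key not in self.table:
            self.table[key] = (self.outside_ip, next(self._ports))
        return self.table[key]

# Two customers, each with a CPE NAT whose "outside" address is itself private,
# both sitting behind one carrier NAT that owns the only real address.
cpe_a = Napt("10.20.30.40")
cpe_b = Napt("10.20.30.41")
carrier = Napt("203.0.113.5")

# The same private source address and port is used inside both homes.
seen_a = carrier.outbound(*cpe_a.outbound("192.168.1.20", 50000))
seen_b = carrier.outbound(*cpe_b.outbound("192.168.1.20", 50000))
print(seen_a, seen_b)
```

From the Internet's point of view, both customers appear as the same real IPv4 address and differ only in port number, which is the root of the tracking and blacklisting problems discussed in this section.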

NAT464 is similar, but involves doing one layer of NAT46 (from IPv4 to IPv6) at the carrier, followed by a second layer of NAT64 (from IPv6 to IPv4) at the customer premise. This allows the transport from carrier to customer to be over IPv6, which is a good thing, but involves upgrading or replacing all Customer Premise equipment to ones that are NAT64 compliant (few are today). Also, address translation between IP families (IPv4 to IPv6 and IPv6 to IPv4) has even more problems than address translation within a single IP family (only IPv4 to IPv4 – there is no IPv6 to IPv6 NAT!).

For an analogy, imagine deploying nested telephone PBXes. There would be an outer PBX, with a real telephone number, and behind that other PBXes with internal extensions from the outer PBX. Behind each internal PBX, you would have sets of internal phones. To call an internal phone, you would dial the real phone number of the outer PBX, have to do something to select an internal PBX (dial the internal PBX’s extension number?) then once connected to the internal PBX, you would need to interact with it to select an internal phone (e.g. dial the first three characters of the phone owner’s name). This is the kind of complexity that IPv4 applications will now have to cope with. It will be much simpler to just convert them directly to IPv6.

In either case (NAT444 or NAT464), there are some protocols that will work across one layer of NAT, but fail when there is more than one layer of NAT. Both NAT444 and NAT464 will introduce these kinds of issues, since both involve at least two layers of NAT. Some home or small business users may unintentionally introduce even more layers of NAT due to lack of understanding, for example by deploying a firewall/NAT box behind a modem/NAT gateway.

The following problems are made worse by Carrier Grade NAT compared even to CPE NAT. Some affect only the end user, some affect third parties (e.g. law enforcement), and many affect both.

  • The number of ports available per node will be even less, so Web 2.0 / AJAX applications such as iTunes and Google Maps will fail in unpredictable ways, especially with schemes that divide the available ports into equally sized port ranges per customer
  • Incoming port negotiations may fail – e.g. Universal Plug and Play (UPnP)
  • Incoming connections to Well-Known Ports will not work (e.g. SMTP, HTTP, SIP, etc)
  • Reverse DNS pretty much breaks down completely
  • Inbound ICMP will fail in most cases
  • Security issues are even worse than with CPE NAT
  • Packet fragmentation requires special handling
  • There are more single points of failure and decreased network stability
  • Port randomization is affected (especially in schemes that restrict ports to ranges)
  • Penalty Boxes no longer work
  • Spam blacklisting will affect many other nodes that use the same address
  • Geo-location services may not be reliable or particularly specific
  • Load balancing algorithms are impacted
  • Authentication mechanisms are impacted
  • IPv6 transition mechanisms will be affected (Dual-Stack Lite is the exception here)
  • Frequent keep-alives will reduce battery life in mobile nodes
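
The port shortage in the first item is simple arithmetic. The sketch below assumes a scheme that splits one public address into equally sized port ranges per customer, and assumes the carrier does not hand out ports 0 to 1023. Both the reservation and the customer counts are my assumptions for illustration, not figures from any particular deployment.

```python
TOTAL_PORTS = 65536          # 16-bit port field
RESERVED_PORTS = 1024        # assumption: ports 0-1023 are never handed out
USABLE_PORTS = TOTAL_PORTS - RESERVED_PORTS


def ports_per_customer(customers):
    """Ports each customer gets when one address is split into equal ranges."""
    return USABLE_PORTS // customers


def port_range(customers, index):
    """Inclusive (first, last) port range for customer number `index` (0-based)."""
    size = ports_per_customer(customers)
    first = RESERVED_PORTS + index * size
    return first, first + size - 1


for customers in (16, 64, 256, 1024):
    print(f"{customers:5d} customers per address -> "
          f"{ports_per_customer(customers):5d} ports each")
```

A single web page that opens dozens of parallel connections can use up a noticeable share of a range that small, and every node behind the customer's own NAT shares the same range.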

Applications that had to be modified to support NAT Traversal to work through NAT44 will have to be modified once again, with even more complicated schemes, to traverse multiple layers of NAT. Application Layer Gateway workarounds now have to be implemented at the Carrier, not just at the Customer Premises. ALGs that have to deal with port-range restrictions will have an even harder job.

Blocking incoming access to services based on IPv4 address will likely affect many “innocent bystanders” that happen to share the same real IPv4 address. One obvious example is spam blacklists. A less obvious example is that some secure devices restrict access by source IP address (only this node can connect to my firewall). Now, many other nodes, even in different organizations, will be sharing that same IP address legitimately, so may be able to access such nodes.

With reverse DNS, you publish the node name associated with a given IP address. With CPE NAT this affected many nodes, but this will be completely meaningless for nodes behind Carrier based NAT. There is no way to publish thousands of node names for a single IP address, nor is there any way for someone asking for the reverse lookup to interpret the response correctly.
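
As a small illustration, the reverse lookup name is derived mechanically from the address alone, so every node behind one shared address maps to the same name. This sketch uses Python's standard `ipaddress` module. The addresses come from the documentation ranges, and the node names are hypothetical.

```python
import ipaddress

# Hypothetical nodes that all sit behind one shared public IPv4 address.
shared_public = ipaddress.ip_address("203.0.113.5")
nodes_behind_nat = ["alice-laptop", "bob-desktop", "carol-phone"]

# The reverse lookup name depends only on the address, not on the node.
ptr_name = shared_public.reverse_pointer
for node in nodes_behind_nat:
    print(f"{node:13s} -> {ptr_name}")

# For contrast, nodes with their own addresses each get a distinct name.
own_addresses = [ipaddress.ip_address(f"2001:db8::{n}") for n in (1, 2, 3)]
own_names = [addr.reverse_pointer for addr in own_addresses]
print(len(set(own_names)), "distinct reverse names for", len(own_addresses), "addresses")
```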

IPv6 transition mechanisms such as 6to4 will not work at all behind Carrier based NAT, but Teredo might. Likewise IPv4 Multicast and Mobile IPv4 will have to be modified extensively for Carrier based NAT.

Chapter 5 – TCP/IPv6 Core Protocols

This chapter introduces the new concepts and technical specifics of TCP/IPv6, the foundation of the Second Internet. Since IPv6 is based heavily on TCP/IPv4, the approach will be to describe the differences between the two. The subchapter headings are intentionally similar to those in Chapter 3, to allow you to compare the old and the new, topic by topic. Again, there is no intent to be comprehensive. There is a lot of content available on all aspects of IPv6, listed in the bibliography, and/or available online. The ultimate references are the RFCs, so this chapter includes pointers to the relevant ones, for those who want to drill deeper on specific topics.

In other chapters we will discuss topics such as advanced aspects of IPv6 (IPsec, Mobile IPv6), the new things that TCP/IPv6 makes possible, who is involved in making it happen, and how we get from the First Internet to the Second Internet (migration). This chapter covers the core protocols of IPv6.

5.1 – Network Hardware

Essentially the same network hardware that was used to deploy the First Internet is being used to deploy the Second Internet, with some notable exceptions, primarily hardware that implements things at the Internet Layer or above, such as smart (“Layer 3”) switches, routers and firewalls. Also, DNS and DHCP servers must be updated or replaced with ones that support TCP/IPv6 (more typically both TCP/IPv4 and TCP/IPv6, or “dual stack”). As TCP/IPv6 is deployed, Virtual Private Networks (VPNs) will likely move away from “SSL/VPN” to VPNs based on IPsec, which is the only IETF approved technology for VPNs. Unfortunately IPsec is incompatible with NAT, which is now endemic in the First Internet. VoIP and IPTV appliances will probably be upgraded to (or replaced with) TCP/IPv6 based systems. Any device with TCP/IP hardware acceleration (such as in high end routers) will probably need to be redesigned or replaced. Simply upgrading the firmware will not be sufficient on such products. There are some routers that only have hardware acceleration for the IPv4 stack, which has led some people to think there are performance issues with IPv6. Already a number of hardware acceleration chips that support both IPv4 and IPv6 are available and are being used in new product designs.

The hardware of most nodes does not need to change, especially client and server computers. Replacement or upgrade of the operating system and applications is all that is needed. The good news is that almost all operating systems and many network applications that run on client computers are already fully compliant with TCP/IPv6, and those are widely deployed. Those that aren’t yet compliant can be upgraded or configured to support it with very reasonable effort and cost. Many server applications (especially open source ones) are already compliant as well. Virtually everything Microsoft makes fully supports TCP/IPv6 today. For client computers, Windows Vista and Windows 7 have very complete support. Windows XP has some support, but is missing some key features (like GUI configuration of IPv6 addresses, and DNS queries over IPv6). My company supplies free tools for Windows XP GUI IPv6 configuration, and a Windows XP DHCPv6 client. Go to www.infoweapons.com (downloads) for details. For server computers, Windows Server 2008 and Exchange Server 2007 (and most other server software since 2007) have full support for TCP/IPv6. Most Open Source operating systems (Linux, FreeBSD, OpenBSD and NetBSD) have had full support for TCP/IPv6 for many years. Most open source network applications (Apache, Nagios, Postfix, Dovecot, etc) also have full support (although in some cases, documentation may be hard to find).

NICs do not need to change unless they have IPv4 specific hardware acceleration, and even those will typically run TCP/IPv6 with no problem, but the IPv6 part won’t be accelerated (it will run at “software” performance levels, in terms of packets or bytes processed per second). There are already many chips available to build hardware accelerated NICs that fully support both TCP/IPv4 and TCP/IPv6, so soon, even NICs with hardware acceleration will be no problem. They will accelerate IPv4 and/or IPv6 traffic. For the most part, NICs work at the Link Layer, hence are IP version agnostic (except for hardware acceleration).

Existing Wi-Fi NICs are also IP version agnostic (they work at the Link Layer), and every one I’ve tried has worked with IPv6 with no upgrades or workarounds required. Wi-Fi Access Points are another matter, because usually they include higher layer functionality such as IPv4 routing, often including IPv4 NAT and a DHCPv4 server. Even here, there is a simple workaround. Most Wi-Fi access points have a “WAN” connector which is the input to the NAT gateway, and one or more “LAN” connectors that are on the client side of the NAT gateway. The LAN connectors are intended for plugging in wired client nodes, which are peers to the wireless client nodes (both wired and wireless client nodes obtain configuration information and translated IP addresses from the DHCPv4 server and NAT gateway built into the Wi-Fi access point). Of course the existing IPv4 routing, IPv4 NAT and DHCPv4 in such devices are not compatible with TCP/IPv6. There will be dual stack Wi-Fi Access Points available soon from companies like D-Link, but the majority of products available today do not have routing, firewall or DHCP support for IPv6.

However, if you plug the cable from your ISP DSL modem (or from a larger home wired network) into one of the LAN connectors on your Wi-Fi Access Point, instead of into the WAN connector as you are supposed to, you can simply ignore the IPv4 specific parts of the Wi-Fi access point. The actual Wi-Fi transmitter part is IP agnostic, and if there is both TCP/IPv4 and TCP/IPv6 on the feed you connect, they will both be broadcast on wireless, and all existing nodes with Wi-Fi NICs will receive it (assuming each OS supports IPv6 and you have configured it). Of course, if you want your Wi-Fi nodes to obtain IPv4 addresses automatically, you must have a DHCPv4 server somewhere in your network (properly configured). Your Wi-Fi Access Point is no longer performing this function. Likewise if you want Wi-Fi clients to obtain IPv6 addresses through stateless auto configuration, there must be a Router Advertisement Daemon in your network (just as for wired IPv6). If your wireless node has a DHCPv6 client, and you have a DHCPv6 server in your network, stateful auto configuration will work over Wi-Fi as well. Of course you can manually configure IPv6 addresses for Wi-Fi nodes just as you can with wired nodes. No NAT is required for IPv6. For IPv4, no NAT will be performed in the Wi-Fi Access Point, so if you need it, it must be performed at the outside gateway (for example, a wired DSL modem from your ISP). Your wireless nodes will be peers to your wired nodes. All of them (wired and wireless) will get addresses from the same DHCPv4 pool (if you use DHCP) and all will be in the same subnet. Normally if you connected a Wi-Fi gateway with NAT inside an existing NATTED network, your wireless nodes would be behind two levels of NAT, which can cause some problems.

You will also find that some consumer devices that support Wi-Fi already have support for IPv6, such as certain Nokia phones and any phone based on Windows Mobile (Samsung Omnia, HTC, etc). It’s kinda cool to deploy dual stack Wi-Fi and show people the dancing turtle at www.kame.net on your phone. With most of today’s phones, however, the only thing that works over IPv6 (if anything) is Wi-Fi Internet access, not the voice traffic or “Internet over wireless” service. In theory you could add a dual stack softphone (VoIP client) and do voice communications over IPv6, but only via the Wi-Fi connection through a Wi-Fi Access Point connected to the main Internet, not over your wireless telephone carrier’s Internet service via WAP, GPRS, EDGE, HSDPA, or whatever else they provide. Someday even these services will be dual stack (probably primarily HSDPA).

Soon there will be dual stack Wi-Fi Access Points that fully support routing for IPv4 and IPv6, NAT for IPv4, and a Router Advertisement Daemon to enable IPv6 stateless auto configuration. D-Link in Taiwan is working on those now, and they should be on the market shortly.

Network cables are totally IP version agnostic. You will not need to rewire your network just for IPv6.

All conventional (“layer 2”) hubs and switches are IP version agnostic, although “layer 3” features of some switches (such as Web management, SNMP, and VLANs) must be upgraded to support IPv6. In most cases, this will be possible simply with new downloaded firmware. No hardware changes are needed (assuming there is sufficient RAM and ROM to handle the more complex firmware). Contact your switch vendor and demand that they add support for IPv6. There are already a few layer 3 switches on the market that support IPv6. I have an SMC 8848M 48-port Gigabit managed switch in my home network that has quite a bit of IPv6 support, including web management over IPv6, IPv6 based VLANs, SNMP over IPv6, etc. Unfortunately, traffic statistics do not break out IPv4 and IPv6 traffic; only the total is reported. D-Link also recently announced a dual stack smart switch series. They are already IPv6 Ready Gold certified. For details, search for their DGS-3627 XSTACK Managed 24 port Gigabit Stackable L3 Switch.

Many enterprise grade routers and firewalls already support TCP/IPv6, although in some cases you must pay extra for the IPv6 functionality. Cisco routers require “Advanced IP Services” for IOS, usually at additional cost, before IPv6 works. For example, the Cisco 2851 router ($6495) includes only the base IOS (no IPv6 support). The Advanced IP Services Feature Pack for it is an additional $1700 (all prices list). When buying or considering using Cisco Routers for use in IPv6 networks, make sure they already include Advanced IP Services, or include the additional cost of the Feature Pack.

Home network gateways that support TCP/IPv6 are further behind, but coming soon, especially from Asian vendors, such as D-Link. A typical one will have all the features of existing IPv4 based gateways, plus 6in4 tunneling (to tunnel in IPv6 from a virtual ISP), a Router Advertisement Daemon (to enable stateless auto configuration), and firewall rules for IPv6 traffic. They should also be able to accept direct (as opposed to tunneled) IPv6 service, for when dual stack ISP service becomes more widely available. Their DNS relay should support DNS over both IPv4 and IPv6. More advanced gateways might include a DHCPv6 server.

Note that some DSL or cable modems also include IPv4 firewall functionality. Of course this will not allow you to control IPv6 traffic. Therefore, if you are connecting your LAN to the IPv6 Internet, there must be IPv6 firewalling somewhere, possibly in a 6in4 tunnel endpoint that is routing IPv6 traffic into your LAN. A Dual Stack gateway firewall may include routing to accept incoming “direct” IPv6 service and/or a 6in4 endpoint to accept incoming “tunneled” IPv6 service, together with both IPv4 and IPv6 filtering rules, and a Router Advertisement Daemon to support stateless auto configuration for the internal nodes that support IPv6. My company makes an easy to use dual stack firewall with all of these features, and an intuitive GUI administrative interface (SolidWall). There are several such products on the market today, in addition to open source projects you can install on Linux or FreeBSD.

You can find out more about D-Link’s IPv6 compliant products at:

http://www.dlink.com/business/ipv6/

Most IP Phones in use today do not support TCP/IPv6, but some new models (including ones from Snom in Germany and Moimstone in Korea) do support it. Cisco supports IPv6 on a number of their recent phones, including the 7906G, 7911G, 7931G, 7941G/GE, 7942G, 7945G, 7961G/GE, 7962G, 7965G, 7970G, 7971G/GE, and 7975G. Most of the older Cisco IP phones currently in use do not support IPv6, and their firmware cannot be upgraded for various reasons.

When looking for hardware products that already support TCP/IPv6, an excellent source of information is the IPv6 Ready Approved Products List. If possible, choose products that have passed the Phase 2 (Gold level) testing. This ensures full compliance with all relevant RFCs and interoperability with many other products. There is also a list of products that have passed the Phase 1 (Silver level) testing. Phase 1 testing ensures compliance with all items denoted MUST in the relevant RFCs. Phase 2 testing also ensures compliance with all items denoted SHOULD in the relevant RFCs (a much more comprehensive set of functionality). These lists are updated and maintained by the IPv6 Ready Logo Committee of the IPv6 Forum. They can be found here:

http://www.ipv6ready.org/phase-1_approved_list

http://www.ipv6ready.org/phase-2_approved_list

5.2 – RFCs: A Whole Raft of New Standards for TCP/IPv6

There are many new RFCs that define the protocols, addressing and routing schemes, as well as migration issues for TCP/IPv6. I will cover the most important of those in this chapter.

You can trace the beginnings and evolution of TCP/IPv6 in some early RFCs. In 1990, when the IETF first realized that a successor to TCP/IPv4 was going to be needed (and soon), the fun began. One key RFC related to this is RFC 1752, “The Recommendation for the IP Next Generation Protocol”, January 1995. Prior to this, people referred to the successor protocol as IPng (IP next generation), but in this RFC the term IPv6 was used. RFC 1752 says that the IETF started its effort to select a successor in late 1990, and that several parallel efforts were started. Among these proposals were “CNAT”, “IP Encaps”, “Nimrod”, “Simple CLNP”, the “P Internet Protocol”, the “Simple Internet Protocol” and “TP/IX”. None of these ever made it past the Internet Draft stage.

By late 1993, an IPng Working Group was formed, and the various proposals still around were reviewed. These included CATNIP, TUBA, and SIPP. Relevant RFCs (now of only historical interest) are:

  • RFC 1347, “TCP and UDP with Bigger Addresses (TUBA)”, June 1992 (Informational)
  • RFC 1526, “Assignment of System Identifiers for TUBA/CLNP Hosts”, September 1993 (Informational)
  • RFC 1561, “Use of ISO CLNP in TUBA Environments”, December 1993 (Experimental)
  • RFC 1707, “CATNIP: Common Architecture for the Internet”, October 1994 (Informational)
  • RFC 1710, “Simple Internet Protocol Plus White Paper”, October 1994 (Informational)

The CLNP referred to in several of these was the “Connectionless-mode Network Layer Protocol”, defined in ISO/IEC 8473, which did not make it into the final IPv6 specification. By 1995 a consensus had emerged, with the best features of all the contenders. The consensus was summarized in RFC 1752. Before the end of the year (barely), the first real TCP/IPv6 specifications were published:

  • RFC 1883, “Internet Protocol, Version 6 (IPv6) Specification”, December 1995 (Standards Track, obsoleted by RFC 2460)
  • RFC 1884, “IP Version 6 Addressing Architecture”, December 1995 (Standards Track, obsoleted by RFC 2373)
  • RFC 1885, “Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification”, December 1995 (Standards Track, obsoleted by RFC 2463)
  • RFC 1886, “DNS extensions to support IP version 6”, December 1995 (Standards Track – obsoleted by RFC 3596)
  • RFC 1887, “An Architecture for IPv6 Unicast Address Allocation”, December 1995 (Informational)

Most of these have been updated since then and there are quite a few new ones since 1995, but this is where it really started. Yes, TCP/IPv6 is turning 15 years old in 2010, and has finally grown up.

5.3 – TCP/IPv6

The software that is making the Second Internet (and virtually all Local Area Networks) possible will be around for quite some time. Like its predecessor, TCP/IPv4, it is a suite (family) of protocols. Once again, the core protocols are TCPv6 (Transmission Control Protocol version 6) and IPv6 (Internet Protocol version 6). TCPv6 has very few changes from TCPv4, but there are a few, due to the larger addresses that require more storage, and the odd method of calculating the checksum defined in TCPv4 (this involves a “pseudo header” that includes the source and destination addresses from the IP header, which of course are different in IPv4 and IPv6).

There is no new RFC specifically about TCPv6, but there are several RFCs that include details about the new features.

UDP has only very minor changes to work over IPv6, primarily to provide more storage for IPv6 addresses. The UDP packet header checksum also includes the IP addresses, once again using the new pseudo header.
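
To show where the address size enters the checksum, here is a minimal sketch of the pseudo header calculation for both IP versions. The function names are my own, and the pseudo header layouts are the ones I understand to be defined for IPv4 (in the original TCP and UDP specifications) and for IPv6 (in RFC 2460), so treat this as an illustration rather than a reference implementation. The addresses are from the documentation ranges.

```python
import ipaddress
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, protocol: int, length: int) -> bytes:
    """Build the pseudo header that is prepended (for checksum purposes only)."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    if s.version != d.version:
        raise ValueError("source and destination must be the same IP version")
    if s.version == 4:
        # 4 + 4 address bytes, one zero byte, protocol, 16-bit length = 12 bytes
        return s.packed + d.packed + struct.pack("!BBH", 0, protocol, length)
    # 16 + 16 address bytes, 32-bit length, three zero bytes, next header = 40 bytes
    return s.packed + d.packed + struct.pack("!I3xB", length, protocol)


def transport_checksum(src: str, dst: str, protocol: int, segment: bytes) -> int:
    """Checksum over pseudo header plus segment (checksum field zeroed by the caller)."""
    header = pseudo_header(src, dst, protocol, len(segment))
    return ~ones_complement_sum16(header + segment) & 0xFFFF


UDP = 17
# A UDP datagram with its checksum field (bytes 6-7) set to zero.
datagram = struct.pack("!HHHH", 12345, 53, 8 + 5, 0) + b"hello"

print("IPv4 pseudo header bytes:", len(pseudo_header("192.0.2.1", "192.0.2.2", UDP, 13)))
print("IPv6 pseudo header bytes:", len(pseudo_header("2001:db8::1", "2001:db8::2", UDP, 13)))
print("checksum over IPv6: 0x%04x" % transport_checksum("2001:db8::1", "2001:db8::2", UDP, datagram))
```

The transport header itself is unchanged. Only the bytes borrowed from the IP header differ, which is why the same TCP or UDP code needs small adjustments rather than a redesign.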

The following standards currently define IPv6:

  • RFC 1809, “Using the Flow Label Field in IPv6”, June 1995 (Informational)
  • RFC 1881, “IPv6 Address Allocation Management”, December 1995 (Informational)
  • RFC 1887, “An Architecture for IPv6 Unicast Address Allocation”, December 1995 (Informational)
  • RFC 1981, “Path MTU Discovery for IP version 6”, August 1996 (Standards Track)
  • RFC 2428, “FTP Extensions for IPv6 and NATs”, September 1998 (Standards Track)
  • RFC 2460, “Internet Protocol, Version 6 (IPv6) Specification”, December 1998 (Standards Track)
  • RFC 2473, “Generic Packet Tunneling in IPv6 Specification”, December 1998 (Standards Track)
  • RFC 2474, “Definition of the Differentiated Service Field (DS Field) in the IPv4 and IPv6 Headers”, December 1998 (Standards Track)
  • RFC 2526, “Reserved IPv6 Subnet Anycast Addresses”, March 1999 (Standards Track)
  • RFC 2529, “Transmission of IPv6 over IPv4 Domains without Explicit Tunnels”, March 1999 (Standards Track)
  • RFC 2675, “IPv6 Jumbograms”, August 1999 (Standards Track)
  • RFC 2711, “IPv6 Router Alert Option”, October 1999 (Standards Track)
  • RFC 2765, “Stateless IP/ICMP Translation Algorithm (SIIT)”, February 2000 (Standards Track)
  • RFC 2767, “Dual Stack Hosts using the Bump-In-the-Stack Technique (BIS)”, February 2000 (Informational)
  • RFC 2894, “Router Renumbering for IPv6”, August 2000 (Standards Track)
  • RFC 3053, “IPv6 Tunnel Broker”, January 2001 (Informational)
  • RFC 3056, “Connection of IPv6 Domains via IPv4 Clouds”, February 2001 (Standards Track)
  • RFC 3089, “A SOCKS-based IPv6/IPv4 Gateway Mechanism”, April 2001 (Informational)
  • RFC 3111, “Service Location Protocol Modifications for IPv6”, May 2001 (Standards Track)
  • RFC 3122, “Extensions to IPv6 Neighbor Discovery for Inverse Discovery Specification”, June 2001 (Standards Track)
  • RFC 3142, “An IPv6-to-IPv4 Transport Relay Translator”, June 2001 (Informational)
  • RFC 3175, “Aggregation of RSVP for IPv4 and IPv6 Reservations”, September 2001 (Standards Track)
  • RFC 3177, “IAB/IESG Recommendations on IPv6 Address Allocations to Sites”, September 2001 (Informational)
  • RFC 3178, “IPv6 Multihoming Support at Site Exit Routers”, October 2001 (Informational)
  • RFC 3306, “Unicast-Prefix-based IPv6 Multicast Addresses”, August 2002 (Standards Track)
  • RFC 3314, “Recommendations for IPv6 in Third Generation Partnership Project (3GPP) Standards”, September 2002 (Informational)
  • RFC 3316, “Internet Protocol Version 6 (IPv6) for Some Second and Third Generation Cellular Hosts”, April 2003 (Informational)
  • RFC 3363, “Representing Internet Protocol version 6 (IPv6) Addresses in the Domain Name System”, August 2002 (Informational)
  • RFC 3364, “Tradeoffs in Domain Name System (DNS) Support for Internet Protocol version 6 (IPv6)”, August 2002 (Informational)
  • RFC 3484, “Default Address Selection for Internet Protocol version 6 (IPv6)”, February 2003 (Standards Track)
  • RFC 3531, “A Flexible Method for Managing the Assignment of Bits of an IPv6 Address Block”, April 2003 (Informational)
  • RFC 3574, “Transition Scenarios for 3GPP Networks”, August 2003 (Informational)
  • RFC 3582, “Goals for IPv6 Site-Multihoming Architectures”, August 2003 (Informational)
  • RFC 3587, “IPv6 Global Unicast Address Format”, August 2003 (Informational)
  • RFC 3595, “Textual Conventions for the IPv6 Flow Label”, September 2003 (Standards Track)
  • RFC 3697, “IPv6 Flow Label Specification”, March 2004 (Standards Track)
  • RFC 3701, “6bone (IPv6 Testing Address Allocation) Phaseout”, March 2004 (Informational)
  • RFC 3750, “Unmanaged Networks IPv6 Transition Scenarios”, April 2004 (Informational)
  • RFC 3756, “IPv6 Neighbor Discovery (ND) Trust Models and Threats”, May 2004 (Informational)
  • RFC 3769, “Requirements for IPv6 Prefix Delegation”, June 2004 (Informational)
  • RFC 3849, “IPv6 Address Prefix Reserved for Documentation”, July 2004 (Informational)
  • RFC 3879, “Deprecating Site Local Addresses”, September 2004 (Standards Track)
  • RFC 3904, “Evaluation of IPv6 Transition Mechanisms for Unmanaged Networks”, September 2004 (Informational)
  • RFC 3974, “SMTP Operational Experience in Mixed IPv4/v6 Environments”, January 2005 (Informational)
  • RFC 4007, “IPv6 Scoped Address Architecture”, March 2005 (Standards Track)
  • RFC 4029, “Scenarios and Analysis for Introducing IPv6 into ISP Networks”, March 2005 (Informational)
  • RFC 4038, “Application Aspects of IPv6 Transition”, March 2005 (Informational)
  • RFC 4057, “IPv6 Enterprise Network Scenarios”, June 2005 (Informational)
  • RFC 4074, “Common Misbehavior Against DNS Queries for IPv6 Addresses”, May 2005 (Informational)
  • RFC 4135, “Goals of Detecting Network Attachment in IPv6”, May 2005 (Informational)
  • RFC 4147, “Proposed Changes to the Format of the IANA IPv6 Registry”, August 2005 (Informational)
  • RFC 4159, “Deprecation of "ip6.int"”, August 2005 (Best Current Practice)
  • RFC 4177, “Architectural Approaches to Multihoming for IPv6”, September 2005 (Informational)
  • RFC 4192, “Procedures for Renumbering an IPv6 Network without a Flag Day”, September 2005 (Informational)
  • RFC 4193, “Unique Local IPv6 Unicast Addresses”, October 2005 (Standards Track)
  • RFC 4213, “Basic Transition Mechanisms for IPv6 Hosts and Routers”, October 2005 (Standards Track)
  • RFC 4215, “Analysis of IPv6 Transition in Third Generation Partnership Project (3GPP) Networks”, October 2005 (Informational)
  • RFC 4218, “Threats Relating to IPv6 Multihoming Solutions”, October 2005 (Informational)
  • RFC 4241, “A Model of IPv6/IPv4 Dual Stack Internet Access Service”, December 2005 (Informational)
  • RFC 4291, “IP Version 6 Addressing Architecture”, February 2006 (Standards Track)
  • RFC 4294, “IPv6 Node Requirements”, April 2006 (Informational)
  • RFC 4311, “IPv6 Host-to-Router Load Sharing”, November 2005 (Standards Track)
  • RFC 4330, “Simple Network Time Protocol (SNTP) Version 4 for IPv4, IPv6 and OSI”, January 2006 (Informational)
  • RFC 4339, “IPv6 Host Configuration of DNS Server Information Approaches”, February 2006 (Informational)
  • RFC 4380, “Teredo: Tunneling IPv6 over UDP through Network Address Translations (NATs)”, February 2006 (Standards Track)
  • RFC 4429, “Optimistic Duplicate Address Detection (DAD) for IPv6”, April 2006 (Standards Track)
  • RFC 4443, “Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification”, April 2006 (Standards Track)
  • RFC 4472, “Operational Considerations and Issues with IPv6 DNS”, April 2006 (Informational)
  • RFC 4554, “Use of VLANs for IPv4-IPv6 Coexistence in Enterprise Networks”, June 2006 (Informational)
  • RFC 4659, “BGP-MPLS IP Virtual Private Network (VPN) Extensions for IPv6 VPN”, September 2006 (Standards Track)
  • RFC 4692, “Considerations on the IPv6 Host Density Metric”, October 2006 (Informational)
  • RFC 4727, “Experimental Values in IPv4, IPv6, ICMPv4, ICMPv6, UDP and TCP Headers”, November 2006 (Standards Track)
  • RFC 4773, “Administration of the IANA Special Purpose IPv6 Address Block”, December 2006 (Informational)
  • RFC 4779, “ISP IPv6 Deployment Scenarios in Broadband Access Networks”, January 2007 (Informational)
  • RFC 4798, “Connecting IPv6 Islands over IPv4 MPLS Using IPv6 Provider Edge Routers (6PE)”, February 2007 (Standards Track)
  • RFC 4818, “RADIUS Delegated-IPv6-Prefix Attribute”, April 2007 (Standards Track)
  • RFC 4852, “IPv6 Enterprise Network Analysis – IP Layer 3 Focus”, April 2007 (Informational)
  • RFC 4861, “Neighbor Discovery for IP version 6 (IPv6)”, September 2007 (Standards Track)
  • RFC 4862, “IPv6 Stateless Address Autoconfiguration”, September 2007 (Standards Track)
  • RFC 4864, “Local Network Protection for IPv6”, May 2007 (Informational)
  • RFC 4890, “Recommendations for Filtering ICMPv6 Messages in Firewalls”, May 2007 (Informational)
  • RFC 4919, “IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs): Overview, Assumptions, Problem Statement and Goals”, August 2007 (Informational)
  • RFC 4941, “Privacy Extensions for Stateless Address Autoconfiguration in IPv6”, September 2007 (Standards Track)
  • RFC 4942, “IPv6 Transition/Co-existence Security Considerations”, September 2007 (Informational)
  • RFC 4943, “IPv6 Neighbor Discovery On-link Assumption Considered Harmful”, September 2007 (Informational)
  • RFC 4968, “Analysis of IPv6 Link Models for 802.16 Based Networks”, August 2007 (Informational)
  • RFC 5006, “IPv6 Router Advertisement Option for DNS Configuration”, September 2007 (Experimental)
  • RFC 5095, “Deprecation of Type 0 Routing Headers in IPv6”, December 2007 (Standards Track)
  • RFC 5156, “Special-Use IPv6 Addresses”, April 2008 (Informational)
  • RFC 5157, “IPv6 Implications for Network Scanning”, March 2008 (Informational)
  • RFC 5172, “Negotiation for IPv6 Datagram Compression Using IPv6 Control Protocol”, March 2008 (Standards Track)
  • RFC 5175, “IPv6 Router Advertisement Flags Option”, March 2008 (Standards Track)
  • RFC 5181, “IPv6 Deployment Scenarios in 802.16 Networks”, May 2008 (Informational)
  • RFC 5214, “Intra-Site Automatic Tunnel Addressing Protocol (ISATAP)”, March 2008 (Informational)
  • RFC 5350, “IANA Considerations for the IPv4 and IPv6 Router Alert Options”, September 2008 (Standards Track)
  • RFC 5375, “IPv6 Unicast Address Assignment Considerations”, December 2008 (Informational)
  • RFC 5453, “Reserved IPv6 Interface Identifiers”, February 2009 (Standards Track)
  • RFC 5533, “Shim6: Level 3 Multihoming Shim Protocol for IPv6”, June 2009 (Standards Track)
  • RFC 5534, “Failure Detection and Locator Pair Exploration Protocol for IPv6 Multihoming”, June 2009 (Standards Track)
  • RFC 5549, “Advertising IPv4 Network Layer Reachability Information with an IPv6 Next Hop”, May 2009 (Standards Track)
  • RFC 5569, “IPv6 Rapid Deployment on IPv4 Infrastructures (6rd)”, January 2010 (Informational)
  • RFC 5570, “Common Architecture Label IPv6 Security Option (CALIPSO)”, July 2009 (Informational)
  • RFC 5572, “IPv6 Tunnel Broker with the Tunnel Setup Protocol (TSP)”, February 2010 (Experimental)
  • RFC 5579, “Transmission of IPv4 Packets over Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) Interfaces”, February 2010 (Informational)
  • RFC 5619, “Softwire Security Analysis and Requirements”, August 2009 (Standards Track)
  • RFC 5701, “IP Address Specific BGP Extended Community Attribute”, November 2009 (Standards Track)
  • RFC 5722, “Handling of Overlapping IPv6 Fragments”, December 2009 (Standards Track)
  • RFC 5798, “Virtual Router Redundancy Protocol (VRRP) Version 3 for IPv4 and IPv6”, March 2010 (Standards Track)

5.3.1 – Four Layer TCP/IPv6 Architectural Model

Figure 5.3-a: Four Layer TCP/IPv6 Model

The major changes from the TCP/IPv4 model are:

  • Application Layer: DHCPv4 replaced with DHCPv6
  • Transport Layer: TCPv4 replaced with TCPv6, UDPv4 replaced with UDPv6
  • Internet Layer: IPv4 replaced with IPv6, ICMPv4 replaced with ICMPv6
  • Link Layer: Removed ARP, Added ND, OSPFv2 replaced with OSPFv3

The Application Layer implements the protocols most people are familiar with (e.g. HTTP). The software routines for these are typically contained in application programs such as browsers or web servers that make system calls to subroutines (or “functions” in C terminology) in the “socket API” (an API is an Application Program Interface, or a collection of related subroutines, typically supplied with the operating system or programming language). The application code creates outgoing data streams, and then calls routines in the API to actually send the data via TCP (Transmission Control Protocol) or UDP (User Datagram Protocol). Output to Transport Layer: [DATA] using IP addresses.

The Transport Layer implements TCP (the Transmission Control Protocol) and UDP (the User Datagram Protocol). These routines are internal to the Socket API. They add a TCP or UDP packet header to the data passed down from the Application Layer, and then pass the data down to the Internet Layer for further processing. Output to Internet Layer: [TCP HDR [DATA]] using IP addresses.

The Internet Layer implements IPv6 (the Internet Protocol) and various other related protocols such as ICMPv6 (which includes the “ping” function among other things). The IP routine takes the data passed down from the Transport Layer routines, adds an IPv6 packet header onto it, then passes the now complete IPv6 packet down to routines in the Link Layer. Output to Link layer: [IPv6 HDR [TCP HDR [DATA]]] using IP addresses.

The Link Layer implements ND (the Neighbor Discovery protocol), which helps locate the link layer addresses of other nodes on the link, in addition to other functionality. It also contains routines that actually read and write packets (as fed down to it by routines in the Internet Layer) onto the network wire, in compliance with Ethernet or other standards. Output to wire: Ethernet packet using MAC addresses (or the equivalent if other network hardware is used, such as Wi-Fi).
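
The layer-by-layer wrapping described above can be sketched in a few lines. This is only an illustration of the nesting, not a working protocol stack: the function names, ports, addresses and MAC addresses are hypothetical, the TCP checksum is left at zero for brevity, and the header layouts are the basic fixed-size ones (a 20 byte TCP header with no options, and the 40 byte IPv6 header).

```python
import ipaddress
import struct

ETHERTYPE_IPV6 = 0x86DD   # EtherType value that marks an IPv6 payload
NEXT_HEADER_TCP = 6       # protocol number for TCP


def tcp_segment(data: bytes, src_port: int, dst_port: int) -> bytes:
    """Transport Layer: [TCP HDR [DATA]] (sequence numbers and checksum left at 0)."""
    offset_and_flags = (5 << 12) | 0x018  # header is 5 x 32-bit words; PSH+ACK flags
    header = struct.pack("!HHIIHHHH", src_port, dst_port, 0, 0,
                         offset_and_flags, 65535, 0, 0)
    return header + data


def ipv6_packet(segment: bytes, src: str, dst: str, hop_limit: int = 64) -> bytes:
    """Internet Layer: [IPv6 HDR [TCP HDR [DATA]]]."""
    first_word = 6 << 28  # version 6, traffic class 0, flow label 0
    header = struct.pack("!IHBB", first_word, len(segment), NEXT_HEADER_TCP, hop_limit)
    header += ipaddress.IPv6Address(src).packed + ipaddress.IPv6Address(dst).packed
    return header + segment


def ethernet_frame(packet: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
    """Link Layer: Ethernet header using MAC addresses, then the IPv6 packet."""
    return dst_mac + src_mac + struct.pack("!H", ETHERTYPE_IPV6) + packet


data = b"GET / HTTP/1.1\r\n\r\n"                      # Application Layer output
segment = tcp_segment(data, src_port=51515, dst_port=80)
packet = ipv6_packet(segment, "2001:db8::1", "2001:db8::2")
frame = ethernet_frame(packet,
                       src_mac=bytes.fromhex("020000000001"),
                       dst_mac=bytes.fromhex("020000000002"))

for name, blob in (("data", data), ("segment", segment),
                   ("packet", packet), ("frame", frame)):
    print(f"{name:8s}{len(blob):4d} bytes")
```

Each layer treats everything handed down from above as opaque payload and only prepends its own header, which is why the Link Layer can remain IP version agnostic.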

The following standards are relevant to the Link Layer in IPv6:

  • RFC 2464, “Transmission of IPv6 Packets over Ethernet Networks”, December 1998 (Standards Track)
  • RFC 2467, “Transmission of IPv6 Packets over FDDI Networks”, December 1998 (Standards Track)
  • RFC 2470, “Transmission of IPv6 Packets over Token Ring Networks”, December 1998 (Standards Track)
  • RFC 2491, “IPv6 over Non-Broadcast Multiple Access (NBMA) networks”, January 1999 (Standards Track)
  • RFC 2492, “IPv6 over ATM Networks”, January 1999 (Standards Track)
  • RFC 2497, “Transmission of IPv6 Packets over ARCnet Networks”, January 1999 (Standards Track)
  • RFC 2590, “Transmission of IPv6 Packets over Frame Relay Networks Specification”, May 1999 (Standards Track)
  • RFC 3146, “Transmission of IPv6 Packets over IEEE 1394 Networks”, October 2001 (Standards Track)
  • RFC 4338, “Transmission of IPv6, IPv4 and Address Resolution Protocol (ARP) Packets over Fibre Channel”, January 2006 (Standards Track)
  • RFC 4392, “IP over InfiniBand (IPoIB) Architecture”, April 2006 (Informational)
  • RFC 4944, “Transmission of IPv6 Packets over IEEE 802.15.4 Networks”, September 2007 (Standards Track)
  • RFC 5072, “IP Version 6 over PPP”, September 2007 (Standards Track)
  • RFC 5121, “Transmission of IPv6 via the IPv6 Convergence Sublayer over IEEE 802.16 Networks”, February 2008 (Standards Track)

5.3.2 – IPv6: The Internet Protocol, Version 6

IPv6 is the foundation of TCP/IPv6 and accounts for many of its distinguishing characteristics, such as its 128-bit address size, its addressing model, its packet header structure and routing. IPv6 is currently defined in RFC 2460, “Internet Protocol, Version 6 (IPv6) Specification”, December 1998, but there are several RFCs that extend the definition.

5.3.2.1 – IPv6 Packet Header Structure

So what are these packet headers mentioned above? On the wire, a TCP/IPv6 packet starts with the IPv6 packet header, followed by zero or more extension headers, then the TCP (or UDP) header, then the packet data. Each header and extension header is a structured collection of data, including things such as the IPv6 address of the sending node, and the IPv6 address of the destination node. Why are we getting down to this level of detail? Because some of the big changes from IPv4 to IPv6 have to do with the new and improved IP packet header architecture in IPv6. In this chapter, we’ll cover the IPv6 packet header. Here it is:

img28.png

Figure 5.3-b: IPv6 Packet Header

The IP Version field (4 bits) contains the value 6 (imagine that!) which in binary is “0110”. This field allows IPv4 and IPv6 traffic to be mixed in a single network.

The Traffic Class field (8 bits) is available for use by originating nodes and/or forwarding routers to identify and distinguish between different classes or priorities of IPv6 packets, in a manner virtually identical to that of IPv4 “Type of Service”.

The Flow Label field (20 bits) is something new in IPv6. It can be used to tag up to 2^20 (1,048,576) distinct traffic flows, for purposes such as fine grained bandwidth management (QoS). Its use is still experimental. Hosts or routers that do not support this function should set it to zero when originating a packet, or ignore it when receiving a packet. The semantics and usage of this field are covered in Appendix A of RFC 2460.

The Payload Length field (16 bits) is the length of the IPv6 packet payload in bytes. Unlike the IPv4 Total Length field, it does not count the standard packet header, but it does count the size of any extension headers, which don’t exist in IPv4. You can think of packet extension headers as being the first part of the data field (payload) of the IPv6 packet.

The Next Header field (8 bits) indicates the type of header immediately following the standard IPv6 packet header. It uses the same values as the IPv4 Protocol field, as defined in RFC 1700, “Assigned Numbers”, October 1994. If this value contains the code for TCP, then the TCP header and packet payload (data) begins immediately after the IPv6 packet header.  Otherwise one or more IPv6 extension headers will be found before the TCP header and data begins. Since each extension header has another Next Header field (and a Header Length field), this constitutes a linked list of headers before the final extension header, which is followed by the data. UDP packets can also have extension headers.

The Hop Limit field (8 bits) is there to prevent packets from being shuttled around indefinitely on a network. Every time a packet is forwarded by a router, the hop limit is decremented by one. If it reaches zero, the packet is dropped. Typically if this happens, an ICMPv6 message (“time exceeded”) is returned to the packet sender. This mechanism is how the traceroute command works.

The Source Address field (128 bits) contains the IPv6 address of the packet sender.

The Destination Address field (128 bits) contains the IPv6 address of the packet recipient.

Data – (variable number of bytes) The data part (payload) of the packet starts immediately after the packet header (it is not really part of the packet header). The Payload Length field contains the number of bytes in the payload. If there are any extension packet headers, they constitute the first part of the packet payload, and their length is included in the Payload Length field.
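To make the field layout concrete, here is a minimal Python sketch that packs and unpacks the fixed 40-byte header using only the field widths listed above. The function names and the sample addresses are illustrative assumptions, not part of any standard API.

```python
import ipaddress
import struct

# Layout: 32 bits (version, traffic class, flow label), 16-bit payload length,
# 8-bit next header, 8-bit hop limit, then two 128-bit addresses = 40 bytes.
HEADER_FORMAT = "!IHBB16s16s"

def build_ipv6_header(src, dst, payload_len, next_header,
                      hop_limit=64, traffic_class=0, flow_label=0):
    """Pack the fixed IPv6 header (hypothetical helper for illustration)."""
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack(HEADER_FORMAT, first_word, payload_len, next_header,
                       hop_limit,
                       ipaddress.IPv6Address(src).packed,
                       ipaddress.IPv6Address(dst).packed)

def parse_ipv6_header(raw):
    """Unpack the first 40 bytes of a packet into named fields."""
    first_word, payload_len, next_header, hop_limit, src, dst = struct.unpack(
        HEADER_FORMAT, raw[:40])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(src)),
        "destination": str(ipaddress.IPv6Address(dst)),
    }

# Next Header 6 is the code for TCP; 20 bytes is a bare TCP header.
header = build_ipv6_header("2001:df8:5403:3000::c", "2001:df8:5403:3000::d",
                           payload_len=20, next_header=6)
fields = parse_ipv6_header(header)
print(len(header), fields["version"], fields["source"])
```

Note that the version, traffic class and flow label share the first 32-bit word, which is why they are combined with shifts rather than packed separately.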

Note: the following fields from the IPv4 packet header have been eliminated in the IPv6 packet header: Header Length, Identification (Fragment ID), Fragmentation Flags, Fragment Offset, Header Checksum, and Options. The value in the Payload Length field no longer includes the length of the standard packet header. The Flow Label field had no corresponding field in the IPv4 packet header.  Some of the missing fields (e.g. fragmentation information) have been pushed into an extension packet header. These exist

IPv6 Packet Fragmentation and Path MTU Discovery

The fields related to fragmentation are now found in the fragmentation extension header, which exists only in fragmented packets (no need to clutter up unfragmented packets, as in IPv4). In IPv6, only the originating node can fragment packets (no intervening node is supposed to do this). The originating node uses Path MTU Discovery to determine the “width” of the proposed path (the maximum packet size that it can handle). MTU stands for Maximum Transmission Unit (maximum packet length). Any packets larger than that size must be fragmented before transmission by the originating node, and reassembled upon receipt by the destination node. There is a minimum packet size that any IPv6 link must be able to handle (1280 bytes). Path MTU Discovery allows the sender to determine if larger (more efficient) packets can be used. The originating node assumes the Path MTU is the MTU of the first hop in the path. A trial packet of this size is sent out. If any link is unable to handle it, an ICMPv6 Packet Too Big message is returned, reporting the MTU of that link. The originating node then retries with smaller packet sizes until it gets no complaints from any node, and then uses the largest MTU that was acceptable along the entire path. This process takes place automatically in the Internet Layer. IPv4 has a comparable optional mechanism (RFC 1191, “Path MTU Discovery”), but IPv4 routers may also fragment packets in transit, which IPv6 routers never do.
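The discovery loop can be sketched as a small simulation. This is only a model under stated assumptions: the path is represented as a hypothetical list of per-link MTUs, and each “Packet Too Big” reply is assumed to report the MTU of the first link that could not carry the trial packet.

```python
IPV6_MIN_MTU = 1280  # every IPv6 link must carry packets of at least this size

def discover_path_mtu(link_mtus):
    """Simulate Path MTU Discovery over a hypothetical list of link MTUs."""
    if not link_mtus or min(link_mtus) < IPV6_MIN_MTU:
        raise ValueError("every IPv6 link must support at least 1280 bytes")
    path_mtu = link_mtus[0]          # initial guess: MTU of the first hop
    attempts = []
    while True:
        attempts.append(path_mtu)    # send a trial packet of this size
        # The first link too small for the packet answers "Packet Too Big".
        too_small = next((m for m in link_mtus if m < path_mtu), None)
        if too_small is None:
            return path_mtu, attempts
        path_mtu = too_small         # retry with the MTU that link reported

mtu, tried = discover_path_mtu([1500, 1480, 1400, 1500])
print(mtu, tried)  # 1400 [1500, 1480, 1400]
```

The result is always the smallest MTU on the path, but the sender only learns it one constraining link at a time.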

Extension Headers (new in IPv6)

After the main header, there can be zero or more extension headers, before the payload (actual packet data). This approach makes IPv6 highly extensible, for new functionality in years to come. Several extension headers are already defined, and doubtless more will be defined over time.

The first byte of each extension header contains a Next Header field, identical to the same named field in the main IPv6 packet header (using codes from RFC 1700). The second byte of each extension header contains a Header Extension Length field, which specifies the length of this header, in 8 byte units, not including the first 8 bytes. Thus every extension header is at least 8 bytes long, and is a multiple of 8 bytes in length. The following header (or data, if no more extension headers) will begin immediately after the end of this extension header. This effectively defines a linked list (a data structure familiar to all programmers).
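Walking that linked list takes only a few lines. The sketch below is a simplified illustration: it follows only the headers that carry a length byte as described above (the Fragment header is a fixed 8 bytes with a reserved second byte, so it is deliberately left out), and the sample bytes are made up for the example.

```python
# Next Header codes for the extension headers this sketch understands.
EXTENSION_HEADERS = {0: "Hop-by-Hop Options", 43: "Routing",
                     60: "Destination Options"}

def walk_extension_headers(first_next_header, payload):
    """Follow the Next Header chain; return (chain, upper_protocol, offset)."""
    chain = []
    next_header, offset = first_next_header, 0
    while next_header in EXTENSION_HEADERS:
        following = payload[offset]        # byte 0: Next Header
        ext_len = payload[offset + 1]      # byte 1: Header Extension Length
        size = (ext_len + 1) * 8           # 8-byte units, first 8 not counted
        chain.append((EXTENSION_HEADERS[next_header], size))
        next_header, offset = following, offset + size
    return chain, next_header, offset

# Hypothetical payload: an 8-byte Hop-by-Hop header pointing at a 16-byte
# Routing header, which in turn points at TCP (code 6).
hop_by_hop = bytes([43, 0]) + bytes(6)
routing = bytes([6, 1]) + bytes(14)
payload = hop_by_hop + routing + b"TCP segment"

chain, upper, offset = walk_extension_headers(0, payload)
print(chain, upper, payload[offset:])
```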

Here are some typical packet header sequences to illustrate how each chains to the next:

img29.png

img30.png

The basic Extension Headers are defined in RFC 2460, “Internet Protocol, Version 6 (IPv6) Specification”, December 1998. These include the following:

  • Options Extension Header
  • Hop-by-hop Options Extension Header
  • Routing Extension Header
  • Fragment Extension Header
  • Destination Options Extension Header

Two extension headers are used for IPsec (IP layer security). The IPsec Authentication extension header (IPsec AH) is defined in RFC 2402, “IP Authentication Header”, November 1998. The Encapsulating Security Payload header (IPsec ESP) is defined in RFC 2406, “IP Encapsulating Security Payload (ESP)”, November 1998.

When multiple extension headers are used in a single packet, the following order should be followed:

  • IPv6 basic header
  • Hop-by-Hop Options header
  • Destination Options header (for options to be processed by more than just final recipient)
  • Routing header
  • Fragment header
  • Authentication header
  • Encapsulating Security Payload header
  • Destination Options header (for options to be processed only by final recipient)
  • Upper Layer header (TCP, UDP or SCTP)

Hop-by-hop Options Header – used to carry optional information that must be examined by every node along a packet’s delivery path. This option is indicated by a Next Header value of 0.

Routing Header – used by an IPv6 source node to list one or more intermediate nodes to be “visited on the way” to a packet’s destination. This is similar to IPv4’s Loose Source and Record Route option. The Routing Header is identified by a Next Header value of 43.

Fragment Header – used by an IPv6 source to send a packet larger than would fit in the path MTU to its destination. In IPv6, packet fragmentation is performed only by the source node, which must use MTU discovery to determine the maximum packet size along the proposed path.  The Fragment Header is identified by a Next Header value of 44.

Destination Options Header – used to carry optional information that needs to be examined only by a packet’s destination node(s). The Destination Options Header is identified by a Next Header value of 60.

For the specific details on each of the above header extension packets, see RFC 2460. The Authentication Header and ESP packet headers will be described later, under IPsec.

5.3.2.2 – IPv6 Addressing Model

In IPv6, addresses are 128 bits in length. They are simply numbers from 0 to about 340 undecillion (340 Trillion, Trillion, Trillion). In exponential notation, that would be 3.40 e+38. However you write it, that’s a really big number. For the convenience of humans, these numbers are typically represented in what I call coloned hex notation (as opposed to the dotted decimal notation used with IPv4). This splits the 128 bit addresses into eight 16-bit fields, and then represents each field with a hexadecimal (base 16) number  from 0 to ffff (you can use upper or lower case for the hexadecimal digits A-F, but it is common practice in IPv6 to use lower case). These hexadecimal numbers cover all possible 16 bit binary patterns from 0000 0000 0000 0000 to 1111 1111 1111 1111. The hexadecimal numbers are separated by colons (“:”). Leading zeros can be eliminated in each field. At most one run of zeros can be replaced by the double colon, “::”. The following are all valid IPv6 addresses written in coloned hex notation:

img31.png

Some people are aware that you can use IPv4 addresses instead of nodenames in web URIs, for example: http://123.45.67.89/main.html. You can also use IPv6 addresses, but because colons demark other things in URIs (such as non-standard port number), you cannot use IPv6 addresses “as is”, you must enclose them in square brackets ([]). For example, http://[2001:df8:5403:3000::d]/nagios is a valid URI that includes an IPv6 numeric address.
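Python’s standard library can be used to experiment with both the notation rules and bracketed URIs. In this short sketch the port number 8080 is an assumption added only to show why the brackets are needed.

```python
import ipaddress
from urllib.parse import urlsplit

# The same 128-bit number written with and without leading zeros.
addr = ipaddress.IPv6Address("2001:0df8:5403:3000:0000:0000:0000:000d")
print(addr.compressed)  # 2001:df8:5403:3000::d
print(addr.exploded)    # 2001:0df8:5403:3000:0000:0000:0000:000d

# Upper and lower case hex digits denote the same address.
same = addr == ipaddress.IPv6Address("2001:DF8:5403:3000::D")

# Square brackets keep the address colons apart from the port colon.
url = urlsplit("http://[2001:df8:5403:3000::d]:8080/nagios")
print(url.hostname, url.port, url.path)
```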

In certain cases, the size of the subnet is specified after the address, similar to CIDR. This is especially common when representing prefixes, for example:

img32.png

When an RIR (e.g. APNIC) allocates a “/32” block of addresses to an ISP they assign the first 32 bits of those addresses, based on the next available “/32” block from the unallocated pool at that time.  A “/32” block contains 65,536 “/48” blocks to allocate to customers. If the ISP allocates all of those, then the RIR will give them a new “/32” block, each address of which will have a completely different first 32 bits from the addresses in the previous “/32” block given to the ISP. The most significant 32 bits (bits 1 to 32) of every address in a given “/32” block will all be the same. All smaller blocks (like “/48” or “/64”) carved out of that “/32” by the ISP will have the same first 32 bits as the other addresses in the parent block.

When an ISP allocates a “/48” block for a customer, from their “/32” block, the next 16 bits (bits 33 to 48) are chosen by the ISP, so that the first 48 bits will be unique to that customer. The first 48 bits of every address in a “/48” block given to an organization will all be the same, but will be different from the  first 48 bits of the addresses in any other “/48” block in the world.  You can think of this 48-bit sequence as the organization prefix. When a customer deploys subnets, they choose a 16 bit value (unique within their organization) for each subnet, which together with the organization’s 48 bit prefix, creates a globally unique 64 bit prefix for a working subnet. This can be used to manually configure 128-bit addresses for nodes on that subnet, or can be configured on the Router Advertisement Daemon that supplies prefixes to nodes in that subnet for Stateless Address Autoconfiguration. If using stateful DHCPv6, the administrator can also create pools of addresses for assignment, where each 128-bit address in a pool has that same 64 bit subnet prefix.
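The prefix arithmetic above can be checked with Python’s ipaddress module. In this sketch the “/32” block 2001:df8::/32 is a hypothetical ISP allocation chosen to match the example addresses used in this chapter.

```python
import ipaddress

isp_block = ipaddress.ip_network("2001:df8::/32")       # assumed ISP block
customer = ipaddress.ip_network("2001:df8:5403::/48")   # organization prefix

# 16 bits separate a /32 from a /48, so the ISP has 2^16 customer blocks.
customers_per_isp = 2 ** (customer.prefixlen - isp_block.prefixlen)
print(customers_per_isp)  # 65536

# The customer block shares its first 32 bits with the parent block.
print(customer.subnet_of(isp_block))  # True

# The customer picks a 16-bit subnet number (here 0x3000), which sits just
# above the 64-bit interface identifier.
subnet_number = 0x3000
subnet = ipaddress.IPv6Network(
    (int(customer.network_address) | (subnet_number << 64), 64))
print(subnet)  # 2001:df8:5403:3000::/64
```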

IPv6 Packet Transmission Types

In TCP/IPv4, there were several packet transmission types (unicast, broadcast and multicast). IPv4 Multicast uses Class D addresses, while all other addresses are unicast (or reserved). There is no real concept of scope in IPv4 (the part of the network in which a given address is valid and unique). IPv4 “Private Addresses” are a step in this direction, but IPv6 defines real scope rules for certain kinds of addresses. These concepts are defined in RFC 4291, “IP Version 6 Addressing Architecture”, February 2006. Note: in Windows, “ping” is used for both IPv4 and IPv6. In Linux and BSD, the “ping” command is used just for IPv4 – in IPv6, the command is “ping6”. In the following, I use just the generic “ping”, but be aware that for IPv6 on some platforms, “ping6” would actually be used.

IPv6 Address Scopes

The scope of an address specifies in what part of the network it is valid and unique. The defined scopes in IPv6 are:

Node-local – valid only within the local node (e.g. loopback address)

Link-local – valid only within a single network link. All such addresses start with the ten bits “1111 1110 10” followed by 54 bits of 0 (fe80::/64). When specified in commands, you usually must follow a link-local address with “%” and the interface ID of the link it is connected to. In FreeBSD, this might be something like “fxp0”, so to ping a link local address, you might use the command:

ping fe80::3c79:b2ca:90ce:5d59%fxp0

In Windows, interface IDs are numbers, so a ping command there might look like:

ping fe80::3c79:b2ca:90ce:5d59%11

Site-local – valid only within a “site”. They start with the 10 bits “1111 1110 11” (fec0::/10). These were intended to be like IPv4 RFC 1918 “private addresses”, but are no longer used as of RFC 3879, “Deprecating Site Local Addresses”, September 2004.

Global – valid anywhere on the IPv6 Internet. Global unicast addresses are in the 2000::/3 block. When you specify global addresses, there is no need to append the interface ID, so a ping command for such an address might look like:

ping 2001:df8:5403:3000::c
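The scope of an address can be read off its leading bits. The following sketch classifies the example addresses above; the function name and the returned labels are illustrative, and the “%interface” suffix is split off by hand rather than relying on library support for it.

```python
import ipaddress

GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")

def classify(text):
    """Return a rough scope label for an IPv6 address string."""
    address, _, zone = text.partition("%")   # strip any interface ID
    ip = ipaddress.IPv6Address(address)
    if ip.is_loopback:
        return "node-local (loopback)"
    if ip.is_link_local:                      # fe80::/10
        return "link-local" + (f" on interface {zone}" if zone else "")
    if ip.is_site_local:                      # fec0::/10, deprecated
        return "site-local (deprecated)"
    if ip in GLOBAL_UNICAST:
        return "global unicast"
    return "other"

for example in ("::1", "fe80::3c79:b2ca:90ce:5d59%fxp0",
                "fec0::1", "2001:df8:5403:3000::c"):
    print(example, "->", classify(example))
```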

IPv6 Address Types

A unicast address specifies a single network interface (destination address). Currently, all global unicast addresses are in the 2000::/3 block. There are also link local unicast addresses, in the fe80::/10 block. The global unicast address type is defined in RFC 3587 “IPv6 Global Unicast Address Format”, August 2003. This RFC deprecates (makes historic) the “Top Level Aggregator” and “Next Level Aggregator” (TLA/NLA) scheme previously defined for global unicast addresses, and formalizes the 48-bit organization prefix, 16-bit subnet number and 64-bit interface identifier concept used today.

img33.png

There are two special unicast addresses:

::    (all bits zero) – the unspecified address, must never be assigned to any node

::1   (127 zeros followed by a 1) – the loopback address for IPv6 (corresponds to 127.0.0.1 in IPv4)

When Site-local scope was deprecated, a new address type called Unique Local Unicast was defined in RFC 4193, “Unique Local IPv6 Unicast Addresses”, October 2005. These addresses are in the fc00::/7 block. The first 7 bits are “1111 110”. The 8th bit is called “L”. If L = 1 the address is locally assigned (L = 0 is reserved for future use). The next 40 bits are a Global ID that ensures the global uniqueness of the overall address. It is generated pseudo-randomly, and must not be sequential. The next 16 bits are a subnet ID and the final 64 bits are an interface ID (just like in global unicast addresses). Perhaps someday there will be a way to reserve specific Global IDs from a central authority (to prevent anyone else from using one you have chosen), but no such mechanism exists today. These addresses have much the same semantics as the IPv4 private addresses.

img34.png
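A locally assigned prefix can be generated in a few lines. This is a simplified sketch: it draws the 40-bit Global ID from the operating system’s random source, which is an assumption made for brevity and not the generation algorithm suggested in RFC 4193.

```python
import ipaddress
import os

def random_ula_prefix():
    """Return a random /48 Unique Local prefix with L = 1 (first byte fd)."""
    global_id = int.from_bytes(os.urandom(5), "big")   # 40 random bits
    # 8 bits of 1111 1101, then the 40-bit Global ID, then 80 zero bits.
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))

prefix = random_ula_prefix()
print(prefix)   # differs on every run
```

The remaining 16 bits of subnet ID and 64 bits of interface ID are then assigned exactly as with a global “/48”.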

An anycast address can specify any of a group of interfaces (usually on different nodes). A packet sent to an anycast address will be delivered to exactly one of those interfaces, typically the “nearest” one (in the network sense, not geographic sense). Anycast addresses look just like unicast addresses, and differ only in being injected into the routing protocol at multiple locations in the network.

A multicast address specifies multiple network destinations (multiple nodes can be configured with the same multicast address). A packet sent to a multicast address will be delivered to all nodes that have been assigned that address. Multicast addresses all have the special prefix ff00::/8 (the first 8 bits of multicast addresses are all ones). After the first 8 bits, there are 4 bits of flags (0,0,0,T). If T=0, the address is a “Well Known” address assigned by IANA. If T=1, then the address is a non-permanently assigned (“transient”) address. The scope is specified in the next 4 bits, followed by 112 bits of group ID:

img35.png

img36.png

There are several multicast scopes defined by the four scope bits. All other combinations are unassigned.

0 reserved

1 interface-local scope

2 link-local scope

3 reserved

4 admin-local scope

5 site-local scope

8 organization-local scope

E global scope

F reserved

The following multicast groups are “well known” (T=0):

1 node

2 router

5 OSPF IGP router

6 OSPF IGP Designated router

9 RIP router

a EIGRP router

b mobile agent

d PIM router

16 MLDv2 capable router

fb DNS server

101 NTP server

108 NIS+ server

1:3 DHCP server

As there are 112 bits for group ID, there are 2^112 (about 5.19 e+33) possible multicast groups. That is enough for the entire world, for quite some time to come. You can think of a multicast group as similar to a TV channel number. As examples, the following multicast addresses are all valid (and are all “well known”):

img37.png
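The flag, scope and group ID fields described above can be pulled apart with a few shifts and masks. The sketch below uses the scope values listed earlier; the function name and the sample addresses are illustrative choices built from the well-known group IDs.

```python
import ipaddress

SCOPES = {0x1: "interface-local", 0x2: "link-local", 0x4: "admin-local",
          0x5: "site-local", 0x8: "organization-local", 0xE: "global"}

def parse_multicast(text):
    """Split a multicast address into its T flag, scope and group ID."""
    value = int(ipaddress.IPv6Address(text))
    if value >> 120 != 0xFF:                 # first 8 bits must all be ones
        raise ValueError("not a multicast address")
    flags = (value >> 116) & 0xF             # next 4 bits: 0,0,0,T
    scope = (value >> 112) & 0xF             # next 4 bits: scope
    return {
        "transient": bool(flags & 0x1),
        "scope": SCOPES.get(scope, "reserved or unassigned"),
        "group_id": value & ((1 << 112) - 1),  # low 112 bits
    }

print(parse_multicast("ff02::1"))     # all nodes on the local link
print(parse_multicast("ff05::1:3"))   # DHCP servers within the site
print(parse_multicast("ff0e::101"))   # NTP servers, global scope
```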

Note that with the scopes larger than the local link, multicast addresses must be specifically configured on nodes (you have to “subscribe to that channel”). If you ping the multicast address ff0e::1, you are not  going to get a response from every node on earth, unless you can first talk everyone into adding that address to their nodes. Even then, various routers along the way would probably block that packet. An organization’s routers enforce the scope rules so that link-local multicast addresses will not cross any routers, organization-local multicast addresses will not cross the organization’s border router, but global multicast addresses will cross any router (in the real world, this is actually managed by the MLD – the Multicast Listener Discovery protocol and PIM, the Protocol-Independent Multicast protocol).

A solicited node multicast address is a special multicast address (valid only on the local link) created from a unicast or anycast address by appending the least significant (rightmost) 24 bits of that address to the special prefix ff02:0:0:0:0:1:ff00::/104. For the global unicast address

2001:df8:5403:3000:3c79:b2ca:90ce:5d59

the solicited node multicast address is:

ff02::1:ffce:5d59

These addresses are used by ND (the Neighbor Discovery protocol) in the process of mapping IPv6 addresses to Link Layer (MAC) addresses.
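The derivation is a simple bit operation, sketched here in Python (the function name is an illustrative assumption):

```python
import ipaddress

# The fixed 104-bit prefix with the low 24 bits left as zero.
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))

def solicited_node(unicast):
    """Return the solicited node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)

print(solicited_node("2001:df8:5403:3000:3c79:b2ca:90ce:5d59"))
# ff02::1:ffce:5d59
```

Two addresses that share the same rightmost 24 bits, such as a link-local and a global address built from the same interface identifier, map to the same solicited node group.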

There is no broadcast address in IPv6, but a multicast to the all nodes on the local link multicast group ff02::1 will have pretty much the same result.

Perhaps someday there will be a central authority to coordinate use (and allow reservation) of multicast group IDs. No such authority currently exists. Once IPv6 multicast broadcasters start making their programming available over large regions (or even worldwide), such coordination will be necessary, and corresponds to the FCC’s management of broadcast frequencies that prevents stations from interfering with each other. Because the number of potential group IDs is so large (2^112 or about 5.19 e+33), for now, choosing them randomly is sufficient. The probability of any two randomly generated group IDs being the same is quite low, even with millions of people using this scheme. You might think of these group IDs as being in some sense channel numbers as found today on TVs. I can envision a search engine that would allow you to find multicast channels associated with programming that caters to specific tastes, such as Bollywood Music Videos over IPTV.

Special Case: IPv4 Compatible IPv6 Addresses (Now Deprecated)

The entire 4.3 billion addresses of IPv4 are mapped into the IPv6 address space, not just once, but twice. Once as IPv4 compatible IPv6 addresses (::w.x.y.z), and a second time as IPv4-mapped IPv6 addresses (::ffff:w.x.y.z).

The addresses in the first special block all start with 96 bits of 0, followed by a 32 bit IPv4 address (which can be specified in dotted decimal). When you send traffic to an IPv4-compatible IPv6 address, it is sent as an IPv6 packet, but encapsulated with an IPv4 header, with the protocol field of the IPv4 packet header set to 41 to indicate that the payload is an IPv6 packet. The IPv4 header allows the traffic to travel across an IPv4-only infrastructure. Upon receipt, the packet payload (the IPv6 packet) is passed to the IPv6 protocol. This is called automatic IPv6 tunneling over IPv4 networks (defined in RFC 2893, “Transition Mechanisms for IPv6 Hosts and Routers”, August 2000).

IPv4 compatible IPv6 addresses were deprecated in RFC 4291, “IP Version 6 Addressing Architecture”, February 2006. No current transition mechanism uses them. New implementations are not required to support these addresses. Note however that two special addresses that are widely used actually fall into this range, the “unspecified” address (all zeros, or “::”), and the loopback address, (“::1”).

Special Case: IPv4-Mapped IPv6 Addresses (Still Valid but Not Recommended)

The addresses in the second special block of addresses all start with 80 bits of 0 (0:0:0:0:0), followed by 16 bits of 1 (ffff), then a 32 bit IPv4 address (which can be, but does not have to be, specified in dotted decimal). When such an address is used on a dual stack node that supports IPv4-mapped IPv6 addresses, it causes an IPv4 packet to be sent using the last 32 bits of the IPv4-mapped IPv6 address, as the IPv4 address. As an example, on a Windows 7 node configured with dual stack, you can ping an IPv4 node as usual with the command:

C:\Users\lhughes>ping 10.1.0.14

Pinging 10.1.0.14 with 32 bytes of data:

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64

You could ping the same IPv4, by using an IPv4-mapped IPv6 address, as follows. The ping command would first view the address as a valid IPv6 address, and create an IPv6 socket as usual. The IPv6 socket would look at the IPv6 address, realize it is an IPv4-mapped IPv6 address, then hand the operation over to the IPv4 stack to handle, using the low 32 bits of the IPv4-mapped IPv6 address. Normal IPv4 packets would be sent from the IPv6 socket, indistinguishable from the IPv4 packets sent in the example above.

C:\Users\lhughes>ping ::ffff:10.1.0.14

Pinging 10.1.0.14 with 32 bytes of data:

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64

Reply from 10.1.0.14: bytes=32 time<1ms TTL=64
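The bit layout of an IPv4-mapped address can be inspected without sending any packets. This short sketch uses Python’s ipaddress module to recover the embedded IPv4 address from the one used in the ping example.

```python
import ipaddress

mapped = ipaddress.IPv6Address("::ffff:10.1.0.14")

# 80 bits of 0, then 16 bits of 1, leave 0xffff above the low 32 bits.
print(hex(int(mapped) >> 32))      # 0xffff

# The low 32 bits are the IPv4 address itself.
print(mapped.ipv4_mapped)          # 10.1.0.14

# Ordinary IPv6 addresses carry no embedded IPv4 address.
print(ipaddress.IPv6Address("2001:df8:5403:3000::c").ipv4_mapped)  # None
```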

In general you can do any I/O operation to an IPv4 node using IPv4 packets, from an IPv6 socket, by using these IPv4-mapped addresses (on nodes where this is supported). Some operating systems (e.g. OpenBSD) don’t support this kind of “cross-stack” operation at all. On some operating systems (Linux, NetBSD, FreeBSD) this mode is disabled by default, but can be enabled. On FreeBSD, for example, this is done by including the following line in /etc/rc.conf:

ipv6_ipv4mapping="YES"

In general, it is best to avoid use of these addresses since support varies from operating system to operating system, behavior is implementation dependent, and there are potential vulnerabilities if it is enabled. It was originally intended as a transition mechanism, but it caused more problems than it solved, so it is better left unused, and ideally, disabled.

Simple IPv6 Address Assignment Scheme (for Manually Assigned Addresses)

The following is not part of any standard, IETF or otherwise. It is a best-practices recommendation, which may help you in migration to IPv6.

Many administrators have adopted a simple scheme for assigning IPv6 addresses manually to nodes, based on existing IPv4 address conventions or actual addresses. It could be argued that it can lead to confusion (by humans) between decimal and hexadecimal. It uses the same numeric digits that are currently used in your IPv4 scheme, to create what are really hexadecimal fields. It is possible to use the numeric digits (0 to 9) to create up to three hex digits in each of the four 16-bit groups in the IPv6 interface identifier. The resulting address may look strange in binary, but this scheme will make it easier for you to keep track of your IPv6 nodes, and is especially useful in dual stack networks, where you can use what appears to be the “same” address (not counting the prefix) on a given node, in both IPv4 and IPv6.

As an example, say our 48-bit organization prefix is 2001:df8:5403::/48. Let’s also say we have four subnets (independent links) for IPv4, so we would also have four subnets for IPv6. Let’s arbitrarily assign the IPv6 subnet numbers as 3000, 3100, 3200 and 3300 (all hex) for these subnets. Choose any values you want for subnet numbers (when setting up your network architecture) – you have 65,536 (from 0000 to ffff) to play with. The following IPv4 addresses from these subnets could be assigned the corresponding IPv6 addresses:

img38.png
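The digit-reuse idea can be expressed as a small helper. This sketch assumes the scheme works exactly as described in the text, with each decimal field of the IPv4 address reused as the hex digits of one 16-bit group; the function name and the pairing of particular IPv4 addresses with particular subnets are hypothetical.

```python
import ipaddress

def simple_ipv6(prefix64, ipv4):
    """Build an IPv6 address whose last four groups mirror an IPv4 address."""
    network = ipaddress.ip_network(prefix64)
    fields = str(ipaddress.IPv4Address(ipv4)).split(".")
    interface_id = 0
    for field in fields:
        # Reinterpret the decimal digits as hex digits: "192" becomes 0x192.
        interface_id = (interface_id << 16) | int(field, 16)
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)

print(simple_ipv6("2001:df8:5403:3000::/64", "192.168.0.1"))
# 2001:df8:5403:3000:192:168:0:1
print(simple_ipv6("fe80::/64", "123.45.67.1"))
# fe80::123:45:67:1
```

Because the largest IPv4 field is 255, each group never needs more than three hex digits.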

Alternatively, it is also possible to use just the interface identifier part of the IPv4 address (“node number within subnet”) as the IPv6 interface identifier, in which case, the above addresses would be:

img39.png

The mapping for the 172.31.25.32 address may confuse you – this is because a /12 subnet mask length divides the second 8 bit field (31) right in the middle (4 bits of it are network address, and 4 bits are interface identifier). This is why using dotted decimal for IPv4 was a bad idea, and hexadecimal is used in IPv6. This can get even more confusing with very odd subnet lengths, like /19. The following should clear things up:

img40.png

img41.png

Using only the IPv4 interface identifier is less likely to produce addresses that collide with automatically generated addresses, but requires good understanding of IPv4 subnetting (see above). Use whichever scheme makes the most sense to you, but try to be consistent.

The Simple IPv6 Address Assignment Scheme can also be used to manually assign link-local addresses. In this case, there is no IPv6 subnet number, because each address is valid only within a subnet. The following link-local addresses could be assigned to the above nodes:

img42.png

As with global unicast addresses, you could use just the interface identifier part of each IPv4 address, which would result in the following manually assigned IPv6 link-local addresses:

img43.png

Note that the addresses 123.45.67.1/24 and 192.168.0.1/16 would both produce fe80::1 as the equivalent IPv6 address, but this would not produce a conflict since they are in different subnets, and link-local addresses are valid only within a single subnet.

Obviously no addresses generated with SAA will use this convention, although you should be careful to make sure there are no conflicts between addresses you create and automatically generated addresses. Duplicate Address Detection during automated address creation should detect such conflicts. On the other hand, you can easily create DHCPv6 address pools that will be consistent with these schemes.

Warning: There is a perfectly valid (but not often used) textual representation of IPv6 addresses that would allow you to use the exact same bits as a 32 bit IPv4 interface identifier, and even specify those 32 bits in dotted decimal. However, it mixes hexadecimal and decimal numbers, plus colons and dots in a single address representation, which to me is extremely confusing and inelegant. It represents the first 96 bits of an address in coloned hex notation, and the last 32 bits of that address in dotted decimal notation. When you use this mixed notation, you must always specify all four dotted decimal fields, and they must be the least significant 32 bits. It is possible that some software applications will not accept this representation. Also, many things that report addresses (e.g. ipconfig) have no way to know to display some addresses in mixed notation and others in regular coloned hex notation, so they just display all addresses in coloned hex notation. This can lead to confusion. As examples of addresses with this mixed notation, the above IPv4 addresses would have corresponding IPv6 addresses that look like this:

img44.png

img45.png

I recommend that you avoid use of this mixed notation altogether. If you use the Simple IPv6 Address Assignment scheme, be very careful to use colons (not dots) between all fields, as software that understands the mixed address syntax will interpret addresses with dots in the last 4 groups as perfectly valid “mixed” notation. This will result in some odd problems. The mixed notation was really intended for use with IPv4-mapped IPv6 addresses, but it works anywhere. You should never create addresses using it, but you need to know about it in case you see addresses written in it by someone else.
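The danger of mixing up colons and dots is easy to demonstrate, since Python’s ipaddress module accepts the mixed notation. The two strings below differ only in their separators, yet they denote different addresses (the addresses themselves are illustrative).

```python
import ipaddress

# Colons throughout: the digits 192, 168, 0, 1 are read as hex groups.
with_colons = ipaddress.IPv6Address("2001:df8:5403:3000:192:168:0:1")

# Dots at the end: the last 32 bits are read as a decimal IPv4 address.
with_dots = ipaddress.IPv6Address("2001:df8:5403:3000::192.168.0.1")

print(with_colons == with_dots)   # False
print(with_dots.compressed)       # 2001:df8:5403:3000::c0a8:1
```

Decimal 192.168 becomes hex c0a8, which is why the two addresses share nothing in their low 64 bits.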

Multiple IPv6 Subnet Numbers on a Single Network Link

A single network link can actually have addresses with more than one 16 bit subnet number at any given time. For example, the prefix 2001:df8:5403:1600::/64 may be used with stateless auto configuration, while the prefix 2001:df8:5403:1601::/64 could be used with stateful auto configuration using DHCPv6 on the same network link. You could also have manually assigned addresses using a third prefix (e.g. 2001:df8:5403:1602::/64) on the same network link. Addresses with different subnet numbers, but the same interface identifier, are not in conflict. Normally, you only broadcast one 64-bit prefix with Router Advertisement messages onto a given network link, so all addresses created with stateless auto configuration in a given subnet will have only that one 64-bit prefix. It is possible in some implementations to advertise up to 100 prefixes on each network link. If multiple prefixes are advertised, there will still be only one default gateway, which is the link-local address of the gateway that is sending Router Advertisement messages. Another alternative is to define a subnet larger than a /64 on a single network link that includes all of the desired subnet numbers. With a “/60” subnet, you can actually have 16 sequential /64 subnet numbers in a single network link (the first subnet number has to be an integral multiple of 16). This is called supernetting.
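The “/60” arithmetic can be verified directly. The sketch below lists the sixteen /64 subnets inside the example block and shows that a starting subnet number that is not a multiple of 16 is rejected as a /60 network.

```python
import ipaddress

link = ipaddress.ip_network("2001:df8:5403:1600::/60")
subnets = list(link.subnets(new_prefix=64))

print(len(subnets))   # 16
print(subnets[0])     # 2001:df8:5403:1600::/64
print(subnets[-1])    # 2001:df8:5403:160f::/64

# 0x1601 is not a multiple of 16, so it cannot start a /60 block.
try:
    ipaddress.ip_network("2001:df8:5403:1601::/60")
    aligned = True
except ValueError:
    aligned = False
print(aligned)        # False
```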

Multiple IPv6 Addresses on a Single Node

Unlike with IPv4, it is normal for IPv6 nodes to have multiple valid addresses. They don’t even all have to have the same subnet number (if you are running multiple subnet numbers on a single link). A single node could have addresses with each of the above 64-bit prefixes (or even multiple manually assigned addresses) at any given time. It could also have various multicast addresses. One of the unicast addresses (chosen by the node’s source address selection rules, see RFC 3484, “Default Address Selection for Internet Protocol version 6 (IPv6)”) will be used as the source address of packets sent by that node, but incoming packets addressed to any of the addresses owned by the node will be accepted.

A host is required to recognize any of the following addresses as referring to itself. Any node has most of these by default without anyone having to assign them. The default link-local address is created with Stateless Address Autoconfiguration even if there are no Router Advertisement messages. Solicited-Node Multicast addresses are created and assigned automatically when unicast or anycast addresses are assigned.

  • The Loopback Address (::1) – always present
  • The All-Nodes Multicast Addresses (ff01::1, ff02::1, etc) – only the “on node” and “on link” scoped multicast addresses are created automatically – ones with larger scope must be specifically assigned to each node that you wish to accept such addresses.
  • The automatically generated link-local unicast address
  • Any additional Unicast and Anycast Addresses that have been assigned to any of the node’s interfaces, manually or automatically
  • The Solicited-Node Multicast Address for each of its unicast and anycast addresses (created automatically for you when the corresponding unicast or anycast address is assigned)
  • Multicast Addresses for all other groups to which the node has subscribed

A router (gateway) is required to recognize all addresses that a host is required to recognize, plus the following special addresses for routers, as identifying itself:

  • The Subnet-Router Anycast Address for all interfaces for which it is configured to act as a router
  • All other Anycast Addresses with which the router has been configured
  • The All-Routers Multicast Addresses (ff01::2, ff02::2, ff05::2)

Automatically Generated Interface Identifiers based on EUI-64

By default, Stateless Address Autoconfiguration will create a link-local address (fe80::w:x:y:z). If there is a Router Advertisement daemon configured and running on the link, the node will also automatically create a global unicast address by using the 64-bit subnet prefix from the Router Advertisement message. It can generate the interface identifier (low 64 bits) either from the node’s MAC address (using EUI-64), or can use a random 64 bit value. This is described in RFC 4291, “IP Version 6 Addressing Architecture” and RFC 2464 “Transmission of IPv6 Packets over Ethernet Networks”.

An EUI-64 address is created by taking the first 24 bits of the MAC address (the Organizationally Unique Identifier), setting the 7th bit of this to 1 (counting rightward from the Most Significant Bit), appending the 16 bit value FFFE, then appending the last 24 bits of the MAC address (the device identifier). Hence, the 48 bit MAC address

00-18-8B-78-DA-1A

produces an EUI-64 identifier of

0218:8BFF:FE78:DA1A

This is a reversible mapping, so given an EUI-64 identifier, it is trivial to determine the MAC address of the node (discard the FFFE in the middle 16 bits and clear the 7th bit of the remaining 48 bit value). Note: the 7th bit in the first byte of all valid Organizationally Unique Identifiers, hence of all MAC addresses, will always be 0.
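The mapping in both directions can be sketched in a few lines of Python. The function names mac_to_eui64 and eui64_to_mac are my own, chosen for this illustration. The sketch follows the description above literally (set the 7th bit going one way, clear it going back), so it assumes a globally assigned MAC address in which that bit starts out as 0.

```python
def mac_to_eui64(mac: str) -> str:
    """Build the EUI-64 based interface identifier for a 48 bit MAC address."""
    octets = bytes.fromhex(mac.replace("-", "").replace(":", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48 bit MAC address")
    first = octets[0] | 0x02  # 7th bit of the first byte, counting from the MSB
    eui = bytes([first]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    # Format the 8 bytes as four 16 bit groups
    return ":".join(eui[i:i + 2].hex().upper() for i in range(0, 8, 2))


def eui64_to_mac(identifier: str) -> str:
    """Recover the MAC address from an EUI-64 based interface identifier."""
    raw = bytes.fromhex("".join(g.zfill(4) for g in identifier.split(":")))
    if len(raw) != 8 or raw[3:5] != b"\xff\xfe":
        raise ValueError("not an EUI-64 based interface identifier")
    mac = bytes([raw[0] & 0xFD]) + raw[1:3] + raw[5:]  # drop FFFE, clear the 7th bit
    return "-".join(f"{b:02X}" for b in mac)


print(mac_to_eui64("00-18-8B-78-DA-1A"))    # 0218:8BFF:FE78:DA1A
print(eui64_to_mac("0218:8BFF:FE78:DA1A"))  # 00-18-8B-78-DA-1A
```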

One of the security advantages of IPv6 is supposed to be that the number of possible addresses in a subnet (2^64) is so large that it is impractical to scan all of them to discover all of the nodes on a subnet (this is called mapping a subnet). If EUI-64 interface identifiers are used, there are so few of these (in comparison to the total possible number of interface identifiers) that it is possible to scan for them (especially with the knowledge of which Organizationally Unique Identifiers are actually in use, which is not difficult to determine).

Randomized Interface Identifiers

There are several privacy concerns related to using addresses with EUI-64 interface identifiers. One is the ability for a hacker to create a map of all nodes on the subnets via scanning. It would also be possible to identify any person’s traffic at any point through which the traffic flows, if you know the MAC address of their network interface. You could certainly associate various traffic flows that all have the same MAC address as coming from a single node. Normally MAC addresses never leave your LAN. With EUI-64 based IPv6 unicast addresses, MAC addresses can go anywhere in the world.

Fortunately, there is a way to generate a random interface identifier instead of using the EUI-64 identifier. This is defined in RFC 4941, “Privacy Extensions for Stateless Address Autoconfiguration in IPv6”, September 2007. The randomized identifier even changes automatically over time. I may have had that address yesterday, but today I’ve got a completely different one! Interface identifier randomization is enabled by default in Windows 7, but it can be enabled or disabled with the following commands:

netsh interface ipv6 set global randomizeidentifiers=enabled

netsh interface ipv6 set global randomizeidentifiers=disabled

The reason you might want to disable randomization is that some servers will only accept a connection from nodes for which they can perform a reverse DNS lookup. This often will fail with randomized identifiers. Note that use of randomized interface identifiers can make it very difficult to determine to whom specific traffic in a log belongs, unless a record is kept of randomized interface identifiers used by each node.

When a randomized address changes, the old address is kept around for some time, but marked as deprecated, which means your node will not use it for further outgoing connections. Your node will still accept incoming replies addressed to a deprecated address until it becomes invalid, which it eventually will. Since you aren’t making new outgoing connections with it, replies to it will cease fairly quickly.

Addresses with randomized interface identifiers are used primarily for outgoing connections (and replies thereto). A node that can accept incoming connections from anyone should have (possibly in addition to other addresses) a static (unchanging) unicast address which is published in DNS. This would be used by other nodes that want to connect to it. A node that only ever makes outgoing connections need not have such a static address assigned to it, and there is no need to publish its name and IPv6 address in DNS (at least not in your external DNS). Remember in IPv6, it is much more likely that other nodes will be connecting to your node (for VoIP, VPNs, P2P, etc). The age of NAT (and one-way connectivity) is over.
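As a rough illustration of the idea, the sketch below builds an address from a 64-bit prefix and a random interface identifier. This is a simplification of my own: RFC 4941 derives the identifier from a hash of stored history rather than drawing it directly from a random source, and to my understanding it clears the same 7th bit that EUI-64 sets, to mark the value as locally generated. The function names are hypothetical.

```python
import ipaddress
import secrets

UL_BIT = 1 << 57  # the 7th bit from the left of the 64 bit interface identifier


def random_interface_id() -> int:
    """Return a random 64 bit interface identifier with the 7th bit cleared."""
    return secrets.randbits(64) & ~UL_BIT


def temporary_address(prefix: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a freshly generated random identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("a 64 bit prefix is required")
    return ipaddress.IPv6Address(int(net.network_address) | random_interface_id())


# Each call yields a different address inside the same subnet.
print(temporary_address("2001:df8:5403:1600::/64"))
print(temporary_address("2001:df8:5403:1600::/64"))
```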

IPv6 Address Allocation

The standard allocation block to be given to organizations is a “/48”, which is 65,536 subnets, each of which is a “/64” block consisting of 2^64, or about 18 billion, billion addresses (about 4 billion times the total number of addresses in the First Internet). Some ISPs may choose to allocate only a single “/64” block to individuals or home users, who have no need for multiple subnets. It is not practical to allocate only a single IPv6 address (a “/128” block) to a user, due to the fact that nodes often create new addresses. One “/48” block will supply 65,536 individuals or homes with “/64” blocks. Perhaps I’m a bit unusual, but I already have two subnets in my home today (one dual stack, one IPv6-only). Who knows, I might have a bunch someday! My company has a “/48” (2001:df8:5403::/48) which we divided into 16 “/52” sub-blocks, each of which has 4096 subnets. I have one of these “/52” sub-blocks (subnets 3000 to 3fff) routed to my house. That should just about take care of me for some time to come. A single “/64” block should work for most home users.
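The division described above can be reproduced with Python's standard ipaddress module. This is only a sketch to check the arithmetic, not a description of how the blocks were actually provisioned.

```python
import ipaddress

org = ipaddress.IPv6Network("2001:df8:5403::/48")

# Split the /48 into /52 sub-blocks and pick the one covering subnets 3000 to 3fff.
blocks = list(org.subnets(new_prefix=52))
home = blocks[3]
print(len(blocks))   # 16
print(home)          # 2001:df8:5403:3000::/52

# Enumerate the /64 subnets inside that /52.
home_subnets = list(home.subnets(new_prefix=64))
print(len(home_subnets))   # 4096
print(home_subnets[0])     # 2001:df8:5403:3000::/64
print(home_subnets[-1])    # 2001:df8:5403:3fff::/64

# Number of /64 subnets in the whole /48.
print(2 ** (64 - org.prefixlen))   # 65536
```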

ISPs are allocated really big “/32” blocks of addresses, which are enough to allocate “/48” blocks for up to 65,536 customers. Should they use up an entire “/32” block, there are plenty more “/32” blocks where that one came from (about 536 million of them just in the 2000::/3 block marked for allocation). The RIRs (ARIN, RIPE, APNIC, LACNIC and AfriNIC) will be happy to give an ISP all they can use. If you assume there are 7 billion people alive, there are over 5000 “/48” blocks for every human alive, just out of the 2000::/3 range currently marked for allocation. It is extremely unlikely that any single human will ever be able to use any appreciable percentage of their “fair share” of addresses, let alone have the IANA run out. The folks in Taiwan say they want to connect 3 billion devices to the Internet in the next couple of years. This would take ¾ of the entire First Internet’s address space, but could be handled with a tiny fraction (less than 1 billionth) of a single “/64” block with IPv6, should they want to have them all in one block for some bizarre reason. It will be quite a while before anyone worries about IPv6 address space exhaustion (famous last words?).

The People’s Republic of China believes that they were cheated out of sufficient IPv4 addresses to participate fully in the First Internet. By the time China started deploying TCP/IPv4, if they had taken all of the remaining addresses, over 90% of the people there would not have gotten one. The First Internet recently passed an interesting threshold. There are now more Chinese speaking users on it than English speaking users. If you recall the chart of allocated addresses (in section 3.3.2.2 of this book), the U.S. has over 43% (28% ARIN + 15% Legacy, both of which are mostly U.S. users) of the total IPv4 address space for less than 5% of the world’s population. In comparison, APNIC, which includes China, India and several other populous countries (all together about 50% of the world’s population), has only 16% of the IPv4 address space. When the IPv4 addresses are all gone in September 2011, APNIC will probably still have less than 20% of the IPv4 address space (about 0.28 addresses per person), while the U.S. will probably have about 45% (about 6.4 addresses per person). However, note that about 1/3 of that 45% is held by fewer than 50 organizations (like M.I.T, Apple, HP, etc). The distribution of addresses in the First Internet was (and remains) anything but equitable. It’s really pretty much impossible to do anything about that now. We’re doing it right in the Second Internet.

Should We Reserve Some IPv6 Addresses for Developing Nations?

There has been talk from the ITU (International Telecommunication Union) about reserving some IPv6 address space for developing nations to make absolutely certain that nobody ever gets left out again, as has happened in the First Internet. There are so many IPv6 addresses that there is essentially no chance of this ever happening. The ITU might as well try to reserve a few trillion grains of sand (maybe a dump truck’s worth) to make sure that every country can be assured of getting their fair share of them. The total number of IPv6 addresses is on the same general scale as the number of grains of sand on Earth.

Note that block 2000::/3 (which you can also think of as blocks 2000::/16 through 3fff::/16) is currently the only part of the overall space marked for unicast address allocation. This is only 1/8 of the total IPv6 address space. Even so, this is still 2^125, or about 4.25 e+37 addresses. You can also view this as 2^45 (about 35.2 trillion) “/48” blocks, or just over 5000 “/48” blocks per human alive in 2010 (using worldwide population as 7 Billion). Should we ever use this up, there is still at least 5.5 times that much space not used for anything (from 4000::/16 to efff::/16) that we could allow for additional allocation.
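These figures are easy to verify. The sketch below is a quick check in Python (using the standard ipaddress module and the 7 billion population figure assumed in the text).

```python
import ipaddress

unicast = ipaddress.IPv6Network("2000::/3")

addresses = unicast.num_addresses            # 2**125 addresses in 2000::/3
blocks_48 = 2 ** (48 - unicast.prefixlen)    # 2**45 "/48" blocks
blocks_32 = 2 ** (32 - unicast.prefixlen)    # 2**29 "/32" blocks
per_person = blocks_48 // 7_000_000_000      # "/48" blocks per human

# Ratio of the unused range 4000::/16..efff::/16 to 2000::/16..3fff::/16
reserve_ratio = (0xF000 - 0x4000) / (0x4000 - 0x2000)

print(f"{addresses:.3e}")   # 4.254e+37
print(blocks_48)            # 35184372088832
print(blocks_32)            # 536870912
print(per_person)           # 5026
print(reserve_ratio)        # 5.5
```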

I personally don’t think there is any reason to reserve a special block of addresses for anyone, including developing nations. Unlike with IPv4, there are plenty of addresses for everyone this time around.

The People’s Republic of China (and every other country) will have plenty of addresses in the Second Internet, and this is one reason they are investing so heavily in it. India is now determined to deploy IPv6 nationwide, and should have quite a bit deployed by the end of 2010. The inequitable distribution of addresses in the First Internet may also account for some of the lack of urgency to migrate to the Second Internet in the United States. Unfortunately, it is not simply a matter of still having enough IPv4 addresses. Imagine if the U.S. stayed with Standard Definition NTSC TV, while the entire rest of the world went with globally standard High Definition TV. The U.S. would not be able to export their programming to anyone else, nor import programming from the rest of the world. If they choose to stay with IPv4, they will be isolating themselves in some very serious ways. It’s not completely ridiculous to think that the U.S. might decide not to deploy the Second Internet. Look what happened with the metric system. If IPv4 is “riding horses” and IPv6 is “driving cars”, you don’t need to wait until the last horse dies before you get a car. The “cars” (IPv6) are ready and widely available today. Those who adopt cars first will leave those still riding horses way behind. I’d suggest you migrate to IPv6 as soon as possible. Countries that master it and start creating products and applications based on it will have a giant head start in the 21st century over those who wait until the last possible minute.

How Is the Entire IPv6 Address Space Divided Up?

Here are the official allocations of the IPv6 address space as of 13 May 2008 (from IANA), along with the RFCs that allocated the blocks listed:

img46.png

img47.png

The referenced RFCs are:

  • RFC 3879 – “Deprecating Site Local Addresses”, September 2004 (affects FEC0::/10)
  • RFC 4048 – “RFC 1888 is Obsolete”, April 2005 (dropping mapping of OSI addresses)
  • RFC 4193 – “Unique Local IPv6 Unicast Addresses”, October 2005
  • RFC 4291 – “IP Version 6 Addressing Architecture”, February 2006

The 6bone was an early worldwide IPv6 testbed. It used addresses from 3ffe::/16 (as per RFC 2471, “IPv6 Testing Address Allocation”, December 1998). These have since been returned to the overall allocation pool as per RFC 3701, “6bone (IPv6 Testing Address Allocation) Phase-out”, March 2004, once the 6bone had served its purpose and was shut down. Interestingly, some addresses from this block still show up on the IPv6 backbone.

Currently, the RIRs have the following number of “Default Free Prefixes” that actually have traffic on the backbone:

img48.png

Here are the top ten countries plus a few from Asia (from SixXS, 24 Jan, 2010) ranked by number of IPv6 prefixes allocated. “V” means visible (actual traffic detected), “A” means allocated (obtained from an ISP or RIR), and “VP” is the percentage of all allocated blocks that are visible (total for the world would be 100%).

img49.png

img50.png

Note that this data does not reflect the actual number of addresses or volume of traffic, just the number of distinct 48-bit prefixes, which is a rough indication of the number of organizations investigating IPv6. Much of this in the U.S. is probably research or academic. As percentages of the gigantic total number of “/48” blocks available for allocation, all of these are essentially zero (pretty much all of the 2000::/3 IPv6 address space is still available for allocation). This is more an indication of the colossal size of the IPv6 address space than of any lack of interest or activity.

Classless Inter-Domain Routing (CIDR)

There is no reason to define CIDR for IPv6, because it was done in IPv4 only to extend the lifetime of the IPv4 address space long enough for IPv6 to be fully developed, which has now happened. There is no need to extend the lifetime of the IPv6 address space. If IPv6 had been ready, and we had migrated to it in the mid 1990s, we would never have had to suffer through the complexities brought about by CIDR and NAT. The reason we are having to deal with these issues today is that we have already stayed with IPv4 far too long. Imagine trying to do serious work today with an 8 bit processor and 64K bytes of RAM.

Network Ports

Network ports work exactly the same way under IPv6 as they do in IPv4. There are still 65,536 of them associated with every IPv6 address. They could have gone to 32 bit port numbers (yielding 4.3 billion ports for each address), but this would have required even more changes in packet headers and other places, so this was not done. 65,536 is plenty for almost any need, especially since you can assign any number of global unicast addresses to a single interface (each of which has 65,536 ports). The same Well Known Port numbers are used in IPv6. The only difference is that you will never see port numbers on IPv6 addresses being shifted by a NAPT gateway, since there is no NAT for IPv6 to IPv6. Note that a given port being used over IPv4 does not prevent it from being used by the same or even a different application, over IPv6 (and vice versa).

5.3.2.3 – Subnetting in IPv6

There is no CIDR in IPv6 (although the CIDR “slash notation” is still used). As a result, subnetting is much simpler in IPv6. All subnets are “/64”. The only exception is if you do supernetting (e.g. a “/60” subnet) to allow multiple “/64” blocks to be used on a single network link. This will likely only be done in large, advanced corporate networks, so most network engineers will never see anything but “/64” subnets. The only reason for doing this might be to use different “/64” subnets for specific purposes, such as 1000 for SAA, 1001 for DHCPv6 assigned addresses and 1002 for manually assigned addresses. If you use EUI-64 interface identifiers for SAA, it is not difficult to partition a single “/64” so there will be no overlap between SAA, DHCPv6 and manual assignments. If you use random interface identifiers, they may fall anywhere in a “/64” address space. However, the probability of one colliding with an address assigned manually or via DHCPv6 stateful mode is incredibly low, and Duplicate Address Detection should prevent the odd collision. Having at least two “/64” subnets in a single network (one for SAA, one for manual and DHCPv6 assigned addresses) removes all possibility of an address collision.

Each subnet needs to be at least a “/64”, since EUI-64 can generate “node within subnet” values that are 64 bits long. Randomized interface identifiers are also 64 bits in length. But a “/64” subnet is already larger than any organization could conceivably use (18 billion, billion addresses). There are so many “/64” blocks in a single “/48” (65,536) that we can use them even for subnets between a border router and a firewall, which have only two addresses. There is never an excuse to use any subnet smaller than a “/64”, although I have seen some old-school IPv4 trained administrators allocate “/124” IPv6 subnets for the link between a border gateway and a firewall (in IPv4, tiny subnets like /30 would be used in such a case). Old habits die hard. After living with increasing scarcity with IPv4 addresses, it is hard for some of us to realize that there are PLENTY of addresses this time around.

5.3.2.4 – Link Layer Addresses

The software in the Application Layer, the Transport Layer, and the Internet Layer of the TCP/IPv6 stack think in terms of IP addresses. But the Link Layer (and the hardware) thinks in terms of MAC addresses. In IPv6 the mapping from IPv6 address to Link Layer (MAC) address is done with the Neighbor Discovery protocol. Note that in this book, I often use the terms Link Layer Address and MAC address interchangeably.

NOTE: A Link-Layer address is a “MAC address” only for Ethernet based network hardware (and a few others), so when I use the term MAC address, think “physical layer address for the actual network hardware in use”. The term Link-Layer address is more accurate (a MAC address is just a special case of Link Layer Address), but it is easy to confuse it with the similar sounding term link-local address. Just realize that if the actual network in use is not Ethernet there may be some other name for the physical layer addresses that IP addresses have to be mapped onto, and it may not look anything like the 48 bit MAC address.

IPv6 addresses are not actually used at the lowest layer of the TCP/IPv6 network stack (the Link Layer). The 48 bit MAC addresses covered in the TCP/IPv4 chapter still exist and are used the same way at the Link Layer (at least for Ethernet networks).

Neighbor Discovery Protocol (ND)

There is no ARP (Address Resolution Protocol) in IPv6. The new ND (Neighbor Discovery) protocol which is defined in RFC 4861 “Neighbor Discovery for IP version 6 (IPv6)”, September 2007, accomplishes the same thing, and many other functions as well, including:

  • Router Discovery: a host can locate the router(s) residing on any link it is attached to
  • Prefix Discovery: a host can discover the correct 64 bit prefix for any link it is attached to
  • Parameter Discovery: a host can determine the correct IPv6 parameters, such as MTU, for any link it is attached to
  • Stateless Address Auto configuration (SAA): hosts can automatically obtain a link local address; and, if a Router Advertisement Daemon exists, also a global unicast address
  • Address Resolution: mapping IPv6 addresses to MAC addresses (as the replacement for ARP)
  • Next-hop Determination: hosts can determine next-hop router for a given destination address
  • Neighbor Unreachability Detection (NUD): determine that a given neighbor is no longer reachable on any attached link (there is no corresponding IPv4 functionality)
  • Duplicate Address Detection (DAD): hosts can determine if a proposed address is already in use
  • Redirect: router can inform a host about a better (or working) first-hop

There are five ICMPv6 messages that ND uses to accomplish these things:

  • Router Solicitation – Request a Router Advertisement message
  • Router Advertisement – Router advertises the 64-bit prefix and parameters for a link, usually sent by a Router Advertisement Daemon living in a gateway router or firewall. The Router Advertisement Daemon can send different information into each attached link, if there are multiple links. This also tells nodes whether or not there is a DHCPv6 server available.
  • Neighbor Solicitation – any node can say “howdy neighbor” to another node to see if it responds
  • Neighbor Advertisement – response to a “howdy neighbor” message from someone else
  • Redirect – A router can inform any node that there is a better first-hop available than one it has just tried (“there’s a bridge out along that road, try going down this road”), based on its discovered knowledge of the surrounding network

By the way, some people use “NDP” as the acronym for the Neighbor Discovery protocol (see Wikipedia). If you read the RFCs, the creators of the protocol use just “ND”, so we will use that convention in this book. The acronyms of some protocols include the “P” (for Protocol) in the acronym (e.g. TCP), while others don’t (like MLD). I follow the conventions used in the RFCs.

IPv6 Router Advertisement messages carry link-layer (MAC) addresses, so no additional packet exchange is required to resolve the router’s link-layer address. They also carry prefixes, so no separate mechanism is needed to configure a netmask.

By using link-local addresses to uniquely identify routers, hosts can maintain router associations. This capability is necessary for Router Advertisements, and for redirects. Hosts need to maintain router associations if the site switches to a new global prefix.

ND is immune to spoofing attacks that originate from off-link nodes. In IPv4, off-link nodes can send ICMPv4 Redirect messages, and IPv4 Router Advertisement messages.

In the following, DAD refers to Duplicate Address Detection, which is one of the functions performed by ND. Addresses may be in any one of the following states at any given time:

  • TENTATIVE – generated, but not yet determined by DAD to be unique – can be used only for sending and receiving ND messages for DAD
  • DUPLICATED – generated, and determined by DAD to be duplicated (hence unusable)
  • PREFERRED – generated, and determined by DAD to be unique (hence valid)
  • DEPRECATED – a preferred address that has passed its preferred lifetime (still valid, and incoming packets addressed to it will be accepted, but no further outgoing packets will be sent using it)
  • INVALID – a deprecated address that has passed its valid lifetime (may no longer be used for sending or receiving packets)
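The states above can be summarized as a small decision function. This is a sketch of my own for illustration, not code from any IPv6 stack: the names AddrState and address_state are hypothetical, and the lifetimes in the usage lines are arbitrary sample values.

```python
from enum import Enum
from typing import Optional


class AddrState(Enum):
    TENTATIVE = "tentative"
    DUPLICATED = "duplicated"
    PREFERRED = "preferred"
    DEPRECATED = "deprecated"
    INVALID = "invalid"


def address_state(dad_unique: Optional[bool], age: float,
                  preferred_lifetime: float, valid_lifetime: float) -> AddrState:
    """Classify an address.

    dad_unique is None while DAD is still running, True if DAD found the
    address to be unique, and False if DAD found a duplicate. age and both
    lifetimes are in seconds, measured from the moment DAD succeeded.
    """
    if dad_unique is None:
        return AddrState.TENTATIVE
    if not dad_unique:
        return AddrState.DUPLICATED
    if age >= valid_lifetime:
        return AddrState.INVALID
    if age >= preferred_lifetime:
        return AddrState.DEPRECATED
    return AddrState.PREFERRED


# Sample lifetimes: preferred for one day, valid for one week.
DAY = 86400
print(address_state(None, 0, DAY, 7 * DAY).name)        # TENTATIVE
print(address_state(True, 3600, DAY, 7 * DAY).name)     # PREFERRED
print(address_state(True, 2 * DAY, DAY, 7 * DAY).name)  # DEPRECATED
print(address_state(True, 8 * DAY, DAY, 7 * DAY).name)  # INVALID
```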

Here are the details of the various functions that ND can perform:

Router Discovery – at any time (but typically at power on), any node can determine the link-local address of the router(s) on the local link.

Step 1 – the node sends a Router Solicitation message to the “all routers on link” multicast group (ff02::2). If the node’s link local address has already been created, then that will be used as the source address, else the unspecified address (“::”) will be used as the source address.

Step 2 – All routers on the link will respond with Router Advertisement messages, usually to the “all nodes on link” multicast group (ff02::1), but if the source address of the Router Solicitation message was a link-local address the router can choose to send the Router Advertisement message directly to that address. The source address of each received Router Advertisement message is added to a default gateway table (from which the preferred link-local default gateway will be chosen). The Prefix Information option in all of the responses should be the same, so the subnet prefix from the last received Router Advertisement message will be used.

IPv6 Router Discovery corresponds roughly to IPv4 Router Discovery (which was defined in RFC 1256, “ICMP Router Discovery Messages”, September 1991), but in IPv6 it is a part of the base protocol. There is no need for hosts to snoop the routing protocols to discover a router. IPv4 router discovery contains a preference field, which is not needed in IPv6 Router Discovery because of Neighbor Unreachability Detection. IPv4 Router Advertisements and Solicitations (ICMP types 9 and 10) work only with multicast capable IPv4 routers, and are not commonly used. All IPv6 nodes support multicast, and Router Advertisements are a fundamental part of almost every non-trivial network.

Address Resolution (Mapping IPv6 addresses to MAC addresses)

Say Alice (one IPv6 node) is trying to send a packet to Bob (another IPv6 node). Address resolution is done as follows:

Step 1 – Alice checks her Neighbor Cache (similar to the ARP cache in IPv4) to see if it already has an entry for Bob. If it does, then Alice sends the packet immediately to Bob using Bob’s MAC address from the table, and she is finished. If Alice’s Neighbor Cache table doesn’t have an entry for Bob, the process continues.

Step 2 – Alice adds a new Neighbor Cache entry for Bob, in the INCOMPLETE state. Alice then sends a Neighbor Solicitation message to Bob, using Bob’s solicited-node multicast address as the destination address. Any of the addresses assigned to Alice’s interface can be used as the source address of this packet, but if possible it should match the source address of the original packet Alice wanted to send. Alice includes her MAC address as the Source Link-layer Address option in this packet. This ensures Bob will have her MAC address when it’s time for him to reply.

Step 3 – Bob receives the Neighbor Solicitation message, and responds with a Neighbor Advertisement message, sent to Alice’s MAC address.

Step 4 – Alice receives the Neighbor Advertisement message from Bob, and then updates Bob’s entry in her Neighbor Cache.

Step 5 – Alice can now send the original packet she wanted to send to Bob using his MAC address.
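The steps above can be sketched as a toy model in Python. Everything here is an illustration of my own: the link is just a dictionary standing in for the nodes that would answer a Neighbor Solicitation, the names solicited_node and resolve are hypothetical, and no real packets are sent. The solicited-node multicast address format (ff02::1:ff followed by the low 24 bits of the unicast address) comes from RFC 4291.

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast address for a unicast or anycast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)


def resolve(target: str, neighbor_cache: dict, link: dict):
    """Return the MAC address for target, or None if nobody answers."""
    mac = neighbor_cache.get(target)
    if mac not in (None, "INCOMPLETE"):
        return mac                              # Step 1: already cached
    neighbor_cache[target] = "INCOMPLETE"       # Step 2: new cache entry
    destination = solicited_node(target)        # where the NS would be sent
    mac = link.get(target)                      # Step 3: stands in for the NS/NA exchange
    if mac is not None:
        neighbor_cache[target] = mac            # Step 4: update the entry
    return mac                                  # Step 5: caller sends using this MAC


bob = "2001:df8:5403:1600:218:8bff:fe78:da1a"
link = {bob: "00-18-8B-78-DA-1A"}
cache = {}
print(solicited_node(bob))         # ff02::1:ff78:da1a
print(resolve(bob, cache, link))   # 00-18-8B-78-DA-1A
```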

Prefix Discovery

At any time, a node can discover the default subnet prefix for the link it is attached to. A Router Advertisement message can contain up to three “options”:

  • the Source Link-Layer Address (the sending router’s MAC address)
  • the MTU (the maximum packet size supported on this link)
  • the Prefix Information (the preferred address prefix for this subnet).

When a router sends an unsolicited Router Advertisement message, it includes all three options. In a solicited Router Advertisement message, at least the Prefix and MTU options will be included, so in either case, the node will obtain the preferred prefix for the link.

Step 1 – The node wanting to discover the subnet prefix sends a Router Solicitation message, using its own link-local address as the source, and the “all routers on local link” multicast group (ff02::2) as the destination address.

Step 2 – All routers on the local link respond with Router Advertisement messages, with their own link-local address as source and the “all nodes on local link” multicast group (ff02::1) as the destination. The Router Advertisement message includes at least the subnet prefix option. This prefix is extracted from the prefix option, and stored as the subnet prefix. All routers will respond with the same prefix, but the last Router Advertisement message received will have the subnet prefix that is used.

Duplicate Address Detection (DAD)

DAD is used to determine if a proposed (tentative) address is a duplicate of any address on the local link. Both hosts and routers perform DAD on all unicast and anycast addresses regardless of how they are obtained (Stateless Address Autoconfiguration, DHCPv6 or even manual assignment). DAD is accomplished using Neighbor Solicitation and Neighbor Advertisement messages.

Step 1 – The node owning the tentative address sends a number of Neighbor Solicitation messages using the unspecified address (::) as the source address, the Solicited-node multicast address as the destination address and the TENTATIVE address as the target address.

Step 2 – If any node on the link is already using the TENTATIVE address, it will respond by sending a Neighbor Advertisement to the “all nodes on local link” multicast group (ff02::1). If no such response is seen during a short interval (configurable), then the TENTATIVE address is considered to be unique.
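Here is a toy model of those two steps in Python. It is only a sketch under my own assumptions: the Message class and the dad function are hypothetical names, the set of addresses already in use stands in for the other nodes on the link, and the timing (repeated probes, waiting interval) is left out.

```python
import ipaddress
from dataclasses import dataclass

UNSPECIFIED = ipaddress.IPv6Address("::")
ALL_NODES = ipaddress.IPv6Address("ff02::1")
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


@dataclass(frozen=True)
class Message:
    kind: str                          # "NS" or "NA"
    source: ipaddress.IPv6Address
    destination: ipaddress.IPv6Address
    target: ipaddress.IPv6Address


def dad_probe(tentative: ipaddress.IPv6Address) -> Message:
    """Step 1: the Neighbor Solicitation sent for a tentative address."""
    dest = ipaddress.IPv6Address(SOLICITED_NODE_BASE | (int(tentative) & 0xFFFFFF))
    return Message("NS", UNSPECIFIED, dest, tentative)


def dad(tentative: ipaddress.IPv6Address, in_use: set):
    """Step 2: return the resulting state and the defending reply, if any."""
    probe = dad_probe(tentative)
    if probe.target in in_use:
        reply = Message("NA", probe.target, ALL_NODES, probe.target)
        return "DUPLICATED", reply
    return "PREFERRED", None


taken = ipaddress.IPv6Address("fe80::218:8bff:fe78:da1a")
free = ipaddress.IPv6Address("fe80::218:8bff:fe78:da1b")
print(dad(taken, {taken})[0])   # DUPLICATED
print(dad(free, {taken})[0])    # PREFERRED
```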

Stateless Address Autoconfiguration (SAA)

This is one of the most important aspects of IPv6. It is primarily to allow IPv6 capable hosts (as opposed to routers) to automatically obtain address information (link local and global unicast node addresses and link-local default gateway), but routers still use it to generate and validate their link local addresses. The process makes strong use of link-local and multicast addresses, and all network communication is done with ICMPv6 messages that are part of ND. If a source of Router Advertisement messages is available, then at least one global unicast IPv6 address will also be generated. The acronym for Stateless Address Autoconfiguration is “SAA”.

There are four steps involved in Stateless Address Autoconfiguration:

Step 1 – the node creates a 64-bit interface identifier. This can be created using the MAC address and the EUI-64 algorithm, or can be a randomly generated value (“randomized interface identifier”).

Step 2 – the host creates a TENTATIVE link-local address. This is done by appending the chosen interface identifier to the prefix fe80::/10. DAD is performed to determine if the link-local address is unique. If so, that address goes to the PREFERRED state, its lifetime starts counting, and the process continues. If the address is duplicated the address goes to the DUPLICATED state, the interface is disabled, and the SAA process fails without having generated any addresses.

Step 3 – the host sends a Router Solicitation message to the “all routers on link” multicast group (ff02::2). If the node’s link local address has already been created, then that will be used as the source address, else the unspecified address (“::”) will be used as the source address. All routers on the link will respond with Router Advertisement messages, usually to the “all nodes on link” multicast group (ff02::1), but if the source address of the Router Solicitation message was a link-local address the router can choose to send the Router Advertisement message via unicast to just that address. The source address of each received Router Advertisement message is added to a default gateway table (from which the preferred link-local default gateway will be chosen). The Prefix Information option in all of the responses should be the same, so the subnet prefix from the last received Router Advertisement message will be used.

If no router responds to the Router Solicitation message within a certain time, then the SAA process terminates, having created a valid link-local node address, but no link-local default gateway and no global unicast address.

Step 4 – if we reach this step, a valid Router Advertisement was received with a subnet prefix, so the host combines the discovered subnet prefix with the created interface identifier, to create a TENTATIVE global unicast address for the node. DAD is performed on the tentative global unicast address, and if the address is unique, it goes to the PREFERRED state and its lifetime starts counting. If not, the address goes to the DUPLICATED state, the interface is disabled, and the SAA process terminates, again having created a valid link-local address and a link-local default gateway address (but no global unicast address).

Anytime a link-local or global address lifetime expires (enters the INVALID state), address regeneration is done. If using randomized interface identifiers, a different random interface identifier is created for each address regeneration. If using EUI-64 interface identifiers, the regeneration process basically just confirms that the addresses are still valid – they don’t actually change. If something has changed since the last validation (e.g. gateway down, link broken, etc), the SAA process may fail and the address marked INVALID.
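The address-building parts of Steps 1, 2 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular IPv6 stack does it: the function names, the sample MAC address and the sample subnet prefix are all made up for the example, and DAD and the Router Solicitation exchange are left out entirely.

```python
import ipaddress

def eui64_interface_id(mac: str) -> int:
    """Build a 64-bit interface identifier from a 48-bit MAC (modified EUI-64)."""
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet and insert ff:fe in the middle.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(eui, "big")

def combine(prefix: str, interface_id: int) -> ipaddress.IPv6Address:
    """Combine a prefix of at most 64 bits with a 64-bit interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen > 64:
        raise ValueError("prefix must leave 64 bits for the interface identifier")
    return ipaddress.IPv6Address(int(net.network_address) | interface_id)

# Hypothetical MAC address and advertised prefix, for illustration only.
iid = eui64_interface_id("00:11:22:33:44:55")
link_local = combine("fe80::/64", iid)                     # Step 2 (before DAD)
global_unicast = combine("2001:db8:beef:feed::/64", iid)   # Step 4 (before DAD)
print(link_local)
print(global_unicast)
```

The link-local prefix is written here as fe80::/64 because the bits between the 10-bit fe80 prefix and the interface identifier are zero. A node using randomized interface identifiers would simply replace eui64_interface_id with a random 64-bit value.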

Next-hop Determination

When one node needs to send a packet to another node, the sending node must determine whether the destination address is on-link or off-link. To be considered on-link, the address must match at least one of the following criteria:

  • The prefix of the address must match one of the prefixes assigned to the link
  • The address is the target of a Redirect message sent by a router
  • The address is the target address of a Neighbor Advertisement message
  • The address is the source address of any Neighbor Discovery message received by the node

If the address is on-link then the next-hop address is the same as the destination address. If the address is off-link then the next-hop address is selected from the default router list.
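As a rough sketch of this decision (the function name and its arguments are hypothetical, and a real node would consult its Prefix List, Neighbor Cache and Default Router List rather than plain Python lists):

```python
import ipaddress

def next_hop(destination, link_prefixes, learned_on_link, default_routers):
    """Return the next-hop address for a destination.

    link_prefixes   -- prefixes assigned to the link (first criterion)
    learned_on_link -- addresses learned as on-link from Redirect or
                       Neighbor Discovery messages (remaining criteria)
    default_routers -- default router list, in order of preference
    """
    dest = ipaddress.IPv6Address(destination)
    matches_prefix = any(dest in ipaddress.IPv6Network(p) for p in link_prefixes)
    learned = dest in {ipaddress.IPv6Address(a) for a in learned_on_link}
    if matches_prefix or learned:
        return dest  # on-link: deliver directly to the destination
    if not default_routers:
        raise LookupError("off-link destination and no default router known")
    return ipaddress.IPv6Address(default_routers[0])
```

For example, with the single on-link prefix 2001:db8:1::/64 and a default router at fe80::1, a packet to 2001:db8:1::5 is sent directly, while a packet to 2001:db8:2::5 is handed to fe80::1.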

Neighbor Unreachability Detection (NUD)

Each entry in the Neighbor Cache contains the IP address, the link-layer (MAC) address, and the reachability status for that node. There are five possible values for that status, and the state transition rules are as follows:

  • INCOMPLETE – cache entry is newly created, and address resolution is in progress. Any transmitted packets are queued. When the address resolution completes, the link-layer address is added into the Neighbor Table, and the state changes to REACHABLE.
  • REACHABLE – any queued packets are immediately sent. Any newly transmitted packets are sent normally. If more than a certain time passes without any traffic to or from the address, the state changes to STALE.
  • STALE – the reachability of the node is UNKNOWN. The address remains in this state until traffic to that node is generated. At that point, the traffic is queued and the state changes to DELAY.
  • DELAY – the address remains in the DELAY state for a short period. The status is still UNKNOWN. Once the delay expires, the probe packet is sent, and the state changes to PROBE.
  • PROBE – a probe packet has been sent to determine reachability (after the delay), but the result has not yet been obtained. The status is still UNKNOWN. If reachability is confirmed, the state changes to REACHABLE. If a certain amount of time elapses without any response, then the node is considered unreachable, any queued traffic is discarded, and an error is generated to the sender.

Note that there is nothing comparable to NUD in IPv4. IPv6 NUD improves packet delivery in the presence of failing routers, and over partially failing or partitioned links. It improves delivery to nodes that change their link-layer (MAC) addresses. For example, mobile nodes can move off-link without losing any connectivity due to stale ARP caches. NUD detects dead routers and dead switches that block access to working routers.
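The state transitions described above can be written down as a small lookup table. This is a simplified sketch: the event names are my own labels, timers and retransmissions are not modeled, and "UNREACHABLE" is just a marker for a failed probe in this example, not a sixth cache state.

```python
# (current state, event) -> next state, following the transitions listed above.
TRANSITIONS = {
    ("INCOMPLETE", "address_resolved"): "REACHABLE",
    ("REACHABLE", "idle_timeout"): "STALE",
    ("STALE", "packet_to_send"): "DELAY",
    ("DELAY", "delay_expired"): "PROBE",
    ("PROBE", "reachability_confirmed"): "REACHABLE",
    ("PROBE", "probe_timeout"): "UNREACHABLE",  # label for the failure outcome
}

def step(state: str, event: str) -> str:
    """Apply one event; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

def run(events, state: str = "INCOMPLETE") -> str:
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```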

Redirect

A router can send a Redirect message to a packet sender, if there is a better first-hop router, or if the destination is an on-link neighbor. In the first case, the Target Address field contains the link-local address of the better first-hop router. In the second case, the Target Address field contains a copy of the Destination Address. The Destination Address field contains the address of the ultimate packet destination. The router uses its knowledge of the larger environment to generate this information. You might think of a Redirect message as saying something like “There is a bridge out down that road – try going down this road, instead”.

IPv6 Redirects contain the link-layer (MAC) address of the new first hop, which eliminates the need for an additional packet exchange to resolve the IP address. Unlike with IPv4 Redirects, the recipient of an IPv6 Redirect assumes that the new next-hop is on-link. The IPv6 Redirect is useful on non-broadcast and shared media links, where it is undesirable or impossible for nodes to know all of the prefixes for on-link destinations.

Viewing the Neighbor Cache

To view the Neighbor Cache in Windows 7:

1. Start a Command Prompt (cmd), and enter the following commands in it

2. Enter the command netsh -c "interface ipv6"

3. At the netsh prompt, enter the command show interface

4. In the resulting list, find the interface index for “Local Area Connection” (say it is 11)

5. At the netsh prompt, enter the command show neighbors 11 (or whatever interface index)

6. You should see global unicast addresses, link-local addresses, and a lot of multicast addresses.

img51.png

img52.png

SEcure Neighbor Discovery (SEND)

Note that there are some potentially exploitable vulnerabilities in ND. ARP in TCP/IPv4 has several well known and easily exploited vulnerabilities, used in many hacking attacks. For details of these vulnerabilities, search for “ARP Vulnerabilities Black Hat”. You should find an excellent PowerPoint presentation that was presented by Mike Beekey at a Black Hat Briefing security conference. It shows exactly how ARP is vulnerable, and how this is exploited by hackers.

A secure version of ND is defined in RFC 3971, “SEcure Neighbor Discovery (SEND)”, March 2005. This is still a Proposed Standard. SEND uses cryptographically generated addresses, which are defined in RFC 3972, “Cryptographically Generated Addresses (CGA)”, March 2005 (this is also a Proposed Standard and has already been updated by RFCs 4581 and 4982). SEND does not depend on IPsec. It is still very much in an experimental status as of the writing of this book.

5.3.3 – Types of IPv6 Packet Transmission

Unicast, anycast, multicast and broadcast have already been covered in section 5.3.2.2, because in IPv6, this is considered to be part of the addressing model.

5.3.3.1 – IPv6 Broadcast

Most things that you would use broadcast for in IPv4, you would use some form of multicast, with a more restricted scope, in IPv6. A multicast transmission to the address ff02::1 would go to the same nodes (all nodes on local link) as an IPv4 broadcast. There are also wider multicast scopes, such as site, organization and global, that (unlike IPv4 broadcast) will cross routers. However, other than “all nodes on local link”, multicast to the wider scopes requires that all recipients intentionally add the necessary multicast address to their node.
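The scope of a multicast address is carried in the low four bits of its second byte, so it is easy to check how far a given address is meant to travel. Here is a minimal sketch (the function name is hypothetical; the scope values are the ones assigned in the IPv6 addressing architecture, RFC 4291):

```python
import ipaddress

# Scope values from the IPv6 addressing architecture (RFC 4291).
SCOPES = {
    0x1: "interface-local",
    0x2: "link-local",
    0x4: "admin-local",
    0x5: "site-local",
    0x8: "organization-local",
    0xE: "global",
}

def multicast_scope(address: str) -> str:
    """Return the scope name encoded in an IPv6 multicast address."""
    addr = ipaddress.IPv6Address(address)
    if not addr.is_multicast:
        raise ValueError(f"{address} is not a multicast address")
    scope = addr.packed[1] & 0x0F  # low nibble of the second byte
    return SCOPES.get(scope, f"unassigned or reserved (0x{scope:x})")
```

With this, ff02::1 (all nodes on link) reports a link-local scope, which is why it never crosses a router.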

5.3.3.2 – IPv6 Multicast

The basic multicast address type has been covered, but there is a lot more to a full multicast system, as you saw in section 3.3.3.2 (IPv4 Multicast). For an in-depth discussion of all aspects of IPv6 Multicast, I recommend Chapter 6, “Providing IPv6 Multicast Services” from the book “Deploying IPv6 Networks”, by Ciprian Popoviciu, Eric Levy-Abegnoli and Patrick Grossetete, Cisco Press, 2006.

Multicast exists in IPv4, but there are some serious problems with it, which are resolved in IPv6:

  • Not all IPv4 routers support multicast. In general it is difficult to deploy except in a “walled garden”, such as the customers of a single ISP like Comcast. In IPv6, support for multicast is mandatory – all compliant routers support it, and it works across ISPs, even worldwide.
  • The Internet Group Management Protocol (IGMP) is not part of TCP/IPv4, and not all IPv4 routers include it. In IPv6, the Multicast Listener Discovery protocol (MLD) is standardized, and is actually just a subset of the ICMPv6 messages, hence all IPv6 compliant routers include it.
  • Multicast in IPv4 was an afterthought, grafted on long after the original protocol was designed. In IPv6, multicast was designed in from the beginning, and is present in all address scopes. Multicast link-local addresses are used extensively in SAA and other places.

For IPTV applications, IPv6 networks will be the first time that really global Internet TV services can be deployed and work reliably. This is as exciting as when Ted Turner first relayed the signal from his small UHF TV station via a satellite. That breakthrough resulted in WTBS, CNN, CNN Headline News, TNT, Cartoon Network, and indirectly, the entire multi-billion dollar satellite / cable television network industry.

There are many other areas in which working, scalable multicast can be used to improve applications. You could build chat, VoIP or even video conferencing clients that could build fully meshed networks, with each new participant subscribing to all existing clients’ multicast “channels”, and all existing clients subscribing to the new participant’s multicast “channel”. Even if the initial participant left, all remaining participants would still have a fully functional mesh network. This also eliminates the need for any central exchange point (other than perhaps a search or directory facility to help in setting up the conference and allowing participants to locate each other).

The following standards are relevant to multicast in IPv6:

*     RFC 2375, “IPv6 Multicast Address Assignments”, July 1998 (Informational)

*     RFC 2710, “Multicast Listener Discovery (MLD) for IPv6”, October 1999 (Standards Track)

*     RFC 3306, “Unicast-Prefix-based IPv6 Multicast Addresses”, August 2002 (Standards Track)

*     RFC 3307, “Allocation Guidelines for IPv6 Multicast Addresses”, August 2002 (Standards Track)

*     RFC 3590, “Source Address Selection for the Multicast Listener Discovery (MLD) Protocol”, September 2003 (Standards Track)

*     RFC 3810, “Multicast Listener Discovery Version 2 (MLDv2) for IPv6”, June 2004 (Standards Track)

*     RFC 3956, “Embedding the Rendezvous Point (RP) Address in an IPv6 Multicast Address”, November 2004 (Standards Track)

*     RFC 4489, “A Method for Generating Link-Scoped IPv6 Multicast Addresses”, April 2006 (Standards Track)

*     RFC 4607, “Source-Specific Multicast for IP”, August 2006 (Standards Track)

Multicast Listener Discovery Protocol (MLD)

MLD is used by IPv6 routers to discover the presence of multicast listeners (nodes that wish to receive multicast packets), and the specific multicast addresses to which they want to subscribe. MLD (defined in RFC 2710) is commonly referred to as MLDv1. It is the IPv6 equivalent to IPv4’s IGMPv2 (defined in RFC 2236). MLDv1 and IGMPv2 multicast protocols are used to set up Any-Source Multicast (ASM), which allows multiple sources in a group (*,G) or “channel”. This is also known as traditional multicast.

MLDv2 extends the definition of MLDv1 by adding support for “source filtering”. This allows a node to indicate interest only in packets from specific source addresses (INCLUDE mode), or in packets from all sources except specific source addresses (EXCLUDE mode). MLDv2 includes all of the functionality of MLDv1, so there is no need to deploy both on a given node. MLDv2 is the IPv6 equivalent of IPv4’s IGMPv3. MLDv2 and IGMPv3 multicast protocols are used to set up Source-Specific Multicast (SSM), which allows a specific source (S) in a group (G) to deliver packets to all members that join (S,G), known as a “channel”. This is described in RFC 4604, “Using Internet Group Management Protocol Version 3 (IGMPv3) and Multicast Listener Discovery Protocol Version 2 (MLDv2) for Source-Specific Multicast”, and in RFC 4607, “Source-Specific Multicast for IP”.
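The effect of the two filter modes can be shown with a tiny model. This is only a sketch of the filtering rule itself (the function name is hypothetical), not of the MLDv2 messages that carry the filter to the router:

```python
import ipaddress

def accepts(source: str, filter_mode: str, source_list) -> bool:
    """Decide whether a listener wants multicast packets from a given source.

    INCLUDE: only sources in the list are wanted.
    EXCLUDE: every source except those in the list is wanted.
    """
    src = ipaddress.IPv6Address(source)
    listed = src in {ipaddress.IPv6Address(s) for s in source_list}
    if filter_mode == "INCLUDE":
        return listed
    if filter_mode == "EXCLUDE":
        return not listed
    raise ValueError("filter_mode must be 'INCLUDE' or 'EXCLUDE'")
```

In these terms, joining an SSM channel (S,G) corresponds to INCLUDE mode with a one-entry source list, while a traditional any-source join corresponds to EXCLUDE mode with an empty list.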

There is another RFC that defines MLD proxying: RFC 4605, “Internet Group Management Protocol (IGMP) / Multicast Listener Discovery (MLD)-Based Multicast Forwarding (“IGMP/MLD Proxying”)”. A proxy would exist on a forwarding gateway that links together multiple subnets, and relay messages across that gateway between an MLD querier on one subnet and MLD listeners on a different subnet.

MLDv1 and MLDv2 are sub-protocols of ICMPv6. All MLDv2 messages are just additional ICMPv6 messages. All IPv6 compliant devices should include support for MLD. MLD messages must be sent with a link-local IPv6 Source Address, a Hop Limit of 1, and an IPv6 Router Alert Option in the Hop-by-hop option extension packet header. When used in Neighbor Discovery protocol’s Stateless Address Autoconfiguration, the source address can be the unspecified address (::). IGMP is not a sub-protocol of ICMPv4. It does not use ICMPv4 messages; it is an entirely separate protocol. IGMP is not mandatory on all IPv4 routers.
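Those three header requirements are easy to express as a check. The following is a hedged sketch with a hypothetical function name and arguments; a real implementation would pull these values out of a received packet rather than take them as parameters:

```python
import ipaddress

def mld_header_ok(source: str, hop_limit: int, has_router_alert: bool,
                  allow_unspecified: bool = False) -> bool:
    """Check the IPv6 header requirements for an MLD message.

    allow_unspecified models the autoconfiguration case, where a node that
    does not yet have a valid link-local address sends from '::'.
    """
    src = ipaddress.IPv6Address(source)
    source_ok = src.is_link_local or (allow_unspecified and src.is_unspecified)
    return bool(source_ok and hop_limit == 1 and has_router_alert)
```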

MLD can co-exist with IGMPv3 in a dual-stack network, as MLD (v1 or v2) will only involve IPv6 messages, and IGMP (v1, v2 or v3) will only involve IPv4 messages. However, in general, multicast will work far better on IPv6 than on IPv4.

With MLD, there is a “router role” (performed by at most one router in a subnet) and a “listener role” (performed by any number of listener nodes in that subnet) in the protocol.

For the router role, only one router on a subnet can be the Querier at any given time. If there is more than one router on a subnet, there is an election mechanism that selects one of them to be the Querier. Should that router fail at some point, all other routers on that subnet have been listening in and maintaining state, so another election will select one of the surviving routers on that subnet to become the Querier. Only the Querier sends periodic or triggered query messages on its subnet.
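The election rule itself is simple. In the MLD specifications (RFC 2710 and RFC 3810) the router with the numerically lowest IPv6 address on the subnet becomes the Querier; that detail comes from the RFCs rather than from the description above. A minimal sketch, with a hypothetical function name and made-up router addresses:

```python
import ipaddress

def elect_querier(router_addresses):
    """Return the address of the router that should act as Querier.

    Assumes the rule from the MLD RFCs: the lowest IPv6 address wins.
    """
    if not router_addresses:
        raise ValueError("no multicast routers on this subnet")
    return min(ipaddress.IPv6Address(a) for a in router_addresses)

routers = ["fe80::3", "fe80::1", "fe80::2"]      # made-up link-local addresses
querier = elect_querier(routers)                  # fe80::1 wins the election
routers.remove("fe80::1")                         # the Querier fails...
replacement = elect_querier(routers)              # ...and fe80::2 takes over
```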

There are three types of MLDv2 query messages sent by the Querier to the “all nodes on local link” multicast address (ff02::1). They should be sent with a valid IPv6 link-local source address. Any Query message received with the Source Address being the unspecified address (::), or any other address that is not a valid IPv6 link-local address, should be silently discarded.

*     General Queries

*     Multicast Address Specific Queries

*     Multicast Address and Source Specific Queries

There are two types of reports sent by listeners to the Querier, to a special multicast address (ff02::16) to which all MLDv2 compliant multicast routers listen. If a single Report message is not large enough to hold all of the state information, multiple Report messages can be sent.

*     Current State Report (sent in response to a Query)

*     State Change Report (sent unsolicited in response to some change on the listener)

General Queries are sent from the Querier to all listeners on the subnet periodically to learn multicast address listener information, to build and refresh state inside all multicast routers on the subnet. Even though only the Querier sends out periodic queries, all routers listen to the responses, and update their state.

When a listener node gets a General Query message, it responds by sending a Current State Report, with its per-interface state information. It is also possible for a listener node to immediately report a state change (such as someone “unsubscribing” to a multicast channel) through an unsolicited State Change Report. Current State Reports are sent only once (if one is lost, it will probably be received in response to the next periodic Query). State Change Reports are sent multiple times for robustness (to increase the probability of all routers getting the message).

When the Querier gets a State Change Report from a listener, it sends a Multicast Address Specific Query to see if there are still any other listeners to that multicast address. If not, the Querier will delete that multicast address from its Multicast Address Listener state table, which stops relaying the corresponding traffic. If there are source specific listeners, the Querier will send a Multicast Address and Source Specific Query instead.

There must be a service interface (API routines) available, which allows an application to cause a State Change Report to be sent to the Querier.  A sample API is documented in RFC 3678, “Socket Interface Extensions for Multicast Source Filters”, January 2004. The full API includes the ability to JOIN or LEAVE a multicast group (“subscribe to a multicast channel”), and to BLOCK and UNBLOCK specific source addresses, as well as to set and retrieve source filter sets.
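To give a feel for the basic JOIN and LEAVE operations, here is a sketch using Python's standard socket module. It covers only the plain any-source join and leave through the IPV6_JOIN_GROUP and IPV6_LEAVE_GROUP socket options, not the source filter calls of RFC 3678. The helper names and the group address are placeholders of mine, and I am assuming a platform where Python exposes these options and uses the usual ipv6_mreq layout (a 16-byte address followed by an interface index).

```python
import socket
import struct

def build_ipv6_mreq(group: str, interface_index: int = 0) -> bytes:
    """Pack the ipv6_mreq structure: group address plus interface index.

    An interface index of 0 asks the operating system to pick the interface.
    """
    packed_group = socket.inet_pton(socket.AF_INET6, group)
    return struct.pack("16sI", packed_group, interface_index)

def join_group(sock: socket.socket, group: str, interface_index: int = 0) -> None:
    """Subscribe the socket to a multicast group (the host then reports it via MLD)."""
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP,
                    build_ipv6_mreq(group, interface_index))

def leave_group(sock: socket.socket, group: str, interface_index: int = 0) -> None:
    """Unsubscribe the socket from a multicast group."""
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_LEAVE_GROUP,
                    build_ipv6_mreq(group, interface_index))
```

An application would typically create a UDP socket with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM), bind it to the port it wants to receive on, and then call join_group(sock, "ff15::1234") with whatever group address it actually uses.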

For details on the syntax of the various MLDv2 messages, see RFC 3810.

5.3.3.4 – Protocol Independent Multicast (PIM) for IPv6

PIM is a multicast protocol which deals with router-to-router communications. IPv6 PIM is similar to IPv4 PIM, has the same variants (Dense Mode, Sparse Mode, and Bidirectional Mode), and is defined in the same RFCs (in the sections relevant to IPv6). The IPv6 implementation uses Neighbor Discovery protocol, Multicast Listener Discovery protocol, Path MTU Discovery and IPv6 Multicast, rather than the corresponding IPv4 mechanisms. As with TCP, the PIM message checksum factors in the source and destination IP addresses, so the pseudo-header used in the calculation of the checksum (which includes IPv6 addresses) is different from the one used in IPv4. The following items are IP version specific in all variants:

img53.png

PIM for IPv6 does not include routing, but provides multicast forwarding by using static IPv6 routes, or routing tables created by IPv6 unicast routing protocols, such as RIPng, OSPFv3, IS-ISv6 or BGP4+.

PIM Dense Mode is defined in RFC 3973, “Protocol Independent Multicast – Dense Mode (PIM-DM)”, January 2005 (for both IPv4 and IPv6). This uses dense multicast routing, which builds shortest-path trees by flooding multicast traffic domain wide, then pruning branches where no receivers are present. It does not scale well.

PIM Sparse Mode is defined in RFC 4601, “Protocol Independent Multicast – Sparse Mode (PIM-SM): Protocol Specification (Revised)”, August 2006 (for both IPv4 and IPv6). As in IPv4, PIM-SM builds unidirectional shared trees rooted at a rendezvous point per group, and can create shortest-path trees per source. It scales fairly well for wide-area use.

Bidirectional PIM is defined in RFC 5015, “Bidirectional Protocol Independent Multicast (BIDIR-PIM)”, October 2007 (for both IPv4 and IPv6). It builds shared bi-directional trees. It never builds a shortest-path tree, so there may be longer end-to-end delays, but it scales very well.

There is one new standard specific to IPv6 PIM, RFC 3956 “Embedding the Rendezvous Point (RP) Address in an IPv6 Multicast Address”, November 2004. This defines an address allocation policy in which the address of the Rendezvous Point (RP) is encoded in an IPv6 multicast group address. For PIM-SM, this can be seen as a specification of a group-to-RP mapping mechanism. This supports easy deployment of scalable inter-domain multicast and simplifies configuration as well.

Example 1: An ISP manages 2001:db8::/32, and wants an RP for the network and all its customers, on an existing subnet, for example 2001:db8:beef:feed::/64. The group address would be something like ff7x:y40:2001:db8:beef:feed::/96, and the RP address would be 2001:db8:beef:feed::y (y can be any value from 1 to F, but not 0).

Example 2: An organization wants to have its own PIM-SM domain. It should pick multicast addresses such as ff7x:y30:2001:db8:beef::/80. The RP address would be 2001:db8:beef::y (y can be any value from 1 to F, but not 0).
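Both examples can be checked by decoding a group address back into its RP address. The sketch below follows the field layout in RFC 3956 (8 bits of ff, 4 flag bits that must be 0111, 4 scope bits, 4 reserved bits, the 4-bit RP interface ID “y”, an 8-bit prefix length, a 64-bit network prefix field and a 32-bit group ID). The function name and the concrete group addresses in the usage note are my own choices for illustration.

```python
import ipaddress

def embedded_rp(group: str) -> ipaddress.IPv6Address:
    """Extract the Rendezvous Point address embedded in a multicast group address."""
    g = int(ipaddress.IPv6Address(group))
    flags = (g >> 116) & 0xF          # must be 0111 (R, P and T bits set)
    if (g >> 120) != 0xFF or flags != 0x7:
        raise ValueError("not an embedded-RP multicast address (expected ff7x::/12)")
    riid = (g >> 104) & 0xF           # RP interface ID, the 'y' in the examples
    plen = (g >> 96) & 0xFF           # length of the RP's network prefix
    if riid == 0 or not 1 <= plen <= 64:
        raise ValueError("invalid RP interface ID or prefix length")
    prefix_field = (g >> 32) & ((1 << 64) - 1)
    # Keep only the first plen bits of the prefix field, zero the rest.
    prefix = (prefix_field >> (64 - plen)) << (64 - plen)
    return ipaddress.IPv6Address((prefix << 64) | riid)
```

Taking x = e (global scope) and y = 1 in Example 1, the group ff7e:140:2001:db8:beef:feed::1 decodes to the RP 2001:db8:beef:feed::1. Taking y = 2 in Example 2, the group ff7e:230:2001:db8:beef::1 decodes to the RP 2001:db8:beef::2.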

5.3.4 – ICMPv6: Internet Control Message Protocol for IPv6

ICMPv6 is a key protocol in the Internet Layer that complements version 6 of the Internet Protocol (IPv6). It was originally defined in RFC 1885 (December 1995) and then enhanced in RFC 2463 (December 1998). It is currently defined in RFC 4443, “Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification”, March 2006.

There are many more ICMPv6 messages defined than there are ICMPv4 messages (in fact, Neighbor Discovery and Multicast Listener Discovery protocols are just subsets of the ICMPv6 messages). ICMPv6 messages have a much greater range of functionality than ICMPv4 messages. Even if you block all ICMPv4 messages (common practice by some IPv4 network administrators) normal network operation will usually occur. This is not true with ICMPv6. ICMPv6 messages are used in normal operation of IPv6.

There are two classes of ICMPv6 messages:

*     error messages, with message type ranging from 0 to 127

*     informational messages, with message type ranging from 128 to 255
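Because the split falls exactly at 128, the class of a message is given by the high-order bit of its Type field. A one-function sketch (the function name is hypothetical):

```python
def icmpv6_class(message_type: int) -> str:
    """Classify an ICMPv6 message by its Type field (0-127 error, 128-255 informational)."""
    if not 0 <= message_type <= 255:
        raise ValueError("ICMPv6 Type is a single byte")
    return "informational" if message_type & 0x80 else "error"
```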

ICMPv6 Error Messages

1 Destination Unreachable (ICMPv6, RFC 4443)

2 Packet Too Big (ICMPv6, RFC 4443)

3 Time Exceeded (ICMPv6, RFC 4443)

4 Parameter Problem (ICMPv6, RFC 4443)

ICMPv6 Informational Messages

128 Echo Request (ICMPv6, RFC 4443)

129 Echo Reply (ICMPv6, RFC 4443)

130 Multicast Listener Query message (MLDv2, RFC 3810)

131 Multicast Listener Report (MLDv1, RFC 2710)

132  Multicast Listener Done (MLDv1, RFC 2710)

133 Router Solicitation message (ND, RFC 2461)

134  Router Advertisement message (ND, RFC 2461)

135 Neighbor Solicitation message (ND, RFC 2461)

136  Neighbor Advertisement message (ND, RFC 2461)

137  Redirect message (ND, RFC 2461)

138  Router Renumbering (RR, RFC 2894)

139 ICMP Node Information Query (NIQ, RFC 4620)

140 ICMP Node Information Response (NIQ, RFC 4620)

141 Inverse Neighbor Discovery Solicitation Message (IND, RFC 3122)

142 Inverse Neighbor Discovery Advertisement message (IND, RFC 3122)

143 Multicast Listener Report message (MLDv2, RFC 3810)

144 Home Agent Address Discovery Request Message (MIPv6, RFC 3775)

145 Home Agent Address Discovery Reply Message (MIPv6, RFC 3775)

146 Mobile Prefix Solicitation (MIPv6, RFC 3775)

147 Mobile Prefix Advertisement (MIPv6, RFC 3775)

148 Certification Path Solicitation (SEND, RFC 3971)

149 Certification Path Advertisement (SEND, RFC 3971)

151 Multicast Router Advertisement (MRD, RFC 4286)

152 Multicast Router Solicitation (MRD, RFC 4286)

153 Multicast Router Termination (MRD, RFC 4286)

154 FMIPv6 messages (MIPv6, RFC 5568)

img54.png

Note that there is no equivalent ICMPv6 message corresponding to the following ICMPv4 messages (or else its function is now contained in another message)

img55.png

Destination Unreachable Error

img56.png

img57.png

IPv6 Fields

Destination Address

Copied from the Source Address field of the invoking packet.

ICMPv6 Fields:

img58.png

Description

A Destination Unreachable message SHOULD be generated by a router, or by the IPv6 layer in the originating node, in response to a packet that cannot be delivered to its destination address for reasons other than congestion. (An ICMPv6 message MUST NOT be generated if a packet is dropped due to congestion.)

Packet Too Big Message

img59.png

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking packet.

ICMPv6 Fields:

Type 2

Code  Set to 0 (zero) by the originator and ignored by the receiver.

MTU  The Maximum Transmission Unit of the next-hop link.

Description

A Packet Too Big MUST be sent by a router in response to a packet that it cannot forward because the packet is larger than the MTU of the outgoing link.  The information in this message is used as part of the Path MTU Discovery process.
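On the sending host, the reaction to a Packet Too Big message boils down to lowering its estimate of the path MTU. The sketch below uses two rules that come from the IPv6 and Path MTU Discovery specifications rather than from the text above: the estimate never goes below the IPv6 minimum link MTU of 1280 bytes, and a Packet Too Big message never raises the estimate. The function name is hypothetical.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU every IPv6 link must support

def update_path_mtu(current_estimate: int, reported_mtu: int) -> int:
    """Return the new path MTU estimate after receiving a Packet Too Big message."""
    candidate = max(reported_mtu, IPV6_MIN_MTU)   # never go below 1280
    return min(current_estimate, candidate)       # never increase the estimate
```

So a host that was sending 1500-byte packets and is told that the next-hop MTU is 1400 drops its estimate for that destination to 1400.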

Time Exceeded Message

img60.png

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking packet.

ICMPv6 Fields:

Type 3

Code 0 - Hop limit exceeded in transit

     1 - Fragment reassembly time exceeded

Unused This field is unused for all code values. It must be initialized to zero by the originator and ignored by the receiver.

Description

If a router receives a packet with a Hop Limit of zero, or if a router decrements a packet's Hop Limit to zero, it MUST discard the packet and originate an ICMPv6 Time Exceeded message with Code 0 to the source of the packet.  This indicates either a routing loop or too small an initial Hop Limit value.

Parameter Problem Message

img61.png

img62.png

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking packet.

ICMPv6 Fields:

Description

If an IPv6 node processing a packet finds a problem with a field in the IPv6 header or extension headers such that it cannot complete processing the packet, it MUST discard the packet and SHOULD originate an ICMPv6 Parameter Problem message to the packet's source, indicating the type and location of the problem.

Echo Request Message

img63.png

IPv6 Fields:

Destination Address

Any legal IPv6 address.

ICMPv6 Fields:

Type 128

Code 0

Identifier An identifier to aid in matching Echo Replies to this Echo Request.  May be zero.

Sequence Number

A sequence number to aid in matching Echo Replies to this Echo Request.  May be zero.

Data  Zero or more octets of arbitrary data.

Description

Every node MUST implement an ICMPv6 Echo responder function that receives Echo Requests and originates corresponding Echo Replies.  A node SHOULD also implement an application-layer interface for originating Echo Requests and receiving Echo Replies, for diagnostic purposes.
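To make the field layout concrete, here is a sketch that builds the bytes of an Echo Request, including the checksum. As with other ICMPv6 messages, the checksum is the standard Internet checksum computed over an IPv6 pseudo-header (source address, destination address, upper-layer length, and next header value 58) followed by the ICMPv6 message itself. The function names are hypothetical, the addresses in the usage note are documentation addresses, and actually transmitting the message (which needs a raw socket and suitable privileges) is not shown.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58
ECHO_REQUEST = 128

def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pseudo_header(src: str, dst: str, upper_layer_length: int) -> bytes:
    """IPv6 pseudo-header: addresses, 32-bit length, 3 zero bytes, next header."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", upper_layer_length, ICMPV6_NEXT_HEADER))

def build_echo_request(src: str, dst: str, identifier: int,
                       sequence: int, data: bytes = b"") -> bytes:
    """Return the ICMPv6 Echo Request message (Type 128, Code 0) as bytes."""
    # Checksum field is zero while the checksum is being computed.
    message = struct.pack("!BBHHH", ECHO_REQUEST, 0, 0, identifier, sequence) + data
    checksum = internet_checksum(pseudo_header(src, dst, len(message)) + message)
    return message[:2] + struct.pack("!H", checksum) + message[4:]
```

For example, build_echo_request("2001:db8::1", "2001:db8::2", 0x1234, 1, b"ping") returns a 12-byte message. A receiver can verify it by running internet_checksum over the pseudo-header plus the received message, which yields zero when the checksum is correct.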

Echo Reply Message

img64.png

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking Echo Request packet.

ICMPv6 Fields:

Type 129

Code 0

Identifier     The identifier from the invoking Echo Request message.

Sequence Number

The sequence number from the invoking Echo Request message.

Data The data from the invoking Echo Request message.

Description

Every node MUST implement an ICMPv6 Echo responder function that receives Echo Requests and originates corresponding Echo Replies.  A node SHOULD also implement an application-layer interface for originating Echo Requests and receiving Echo Replies, for diagnostic purposes.

The source address of an Echo Reply sent in response to a unicast Echo Request message MUST be the same as the destination address of that Echo Request message.

An Echo Reply SHOULD be sent in response to an Echo Request message sent to an IPv6 multicast or anycast address.  In this case, the source address of the reply MUST be a unicast address belonging to the interface on which the Echo Request message was received.

The data received in the ICMPv6 Echo Request message MUST be returned entirely and unmodified in the ICMPv6 Echo Reply message.

5.3.5 – IPv6 Routing

TCP/IPv6 has to solve the same problems as TCP/IPv4 in terms of how to get packets from one point to another through a packet switched network. However, the differences in IP address length and addressing model mean that the existing routing protocols for TCP/IPv4 do not work. All of the popular routing protocols have been extended to support IPv6. These include RIPng, EIGRP, IS-ISv6, OSPF for IPv6, and BGP4 with Multiprotocol Extensions (BGP4+).

The following standards are relevant to routing in IPv6:

*     RFC 2080, “RIPng for IPv6”, January 1997 (Standards Track)

*     RFC 2185, “Routing Aspects of IPv6 Transition”, September 1997 (Informational)

*     RFC 2545, “Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing”, March 1999 (Standards Track)

*     RFC 5308, “Routing IPv6 with IS-IS”, October 2008 (Standards Track)

*     RFC 5340, “OSPF for IPv6”, July 2008 (Standards Track)

RIPng – RIP Next Generation. Defined in RFC 2080, “RIPng for IPv6”, January 1997. This IETF standard specifies extensions to the RIP protocol (as defined in RFCs 1058 and 1723), to support IPv6. Like RIP for IPv4, RIPng also uses the Distance Vector algorithm. Unlike RIP for IPv4, RIPng is implemented only in routers. IPv6 itself provides mechanisms for router discovery (part of ND). RIPng is a UDP based protocol, using port 521 (compare with port 520 for RIP). It supports 128 bit IPv6 addresses instead of 32 bit IPv4 addresses. It has the same limitations as RIP, such as being useful only in small networks, where no path is longer than 15 hops (a metric of 16 means “unreachable”). It does have some of the extensions of RIPv2. When a response is sent to all neighbors, the multicast group ff02::9 (all-rip-routers) is used. RIPng only routes IPv6. On a dual stack network, you would need both RIP (for IPv4) and RIPng. Since RIPng runs over IPv6, it can use the IPsec Authentication Header (AH) and Encapsulating Security Payload (ESP) mechanisms to ensure integrity and authentication / confidentiality of routing exchanges.
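The heart of the Distance Vector algorithm is the way a router folds a neighbor's advertisement into its own table. Here is a simplified sketch (a hypothetical function and a plain dictionary as the routing table; timers, split horizon and triggered updates are all left out). It caps the metric at 16, which RIP and RIPng treat as “infinity”, meaning unreachable.

```python
INFINITY = 16  # RIP/RIPng metric meaning "unreachable"

def process_advertisement(table: dict, neighbor: str, advertised: dict,
                          link_cost: int = 1) -> bool:
    """Merge routes advertised by a neighbor into the local routing table.

    table      -- maps prefix -> (metric, next_hop)
    advertised -- maps prefix -> metric as seen by the neighbor
    Returns True if any entry changed.
    """
    changed = False
    for prefix, metric in advertised.items():
        new_entry = (min(metric + link_cost, INFINITY), neighbor)
        current = table.get(prefix)
        if current is None:
            # Only install new routes that are actually reachable.
            if new_entry[0] < INFINITY:
                table[prefix] = new_entry
                changed = True
        elif current[1] == neighbor or new_entry[0] < current[0]:
            # Always believe the current next hop; otherwise switch only if better.
            if current != new_entry:
                table[prefix] = new_entry
                changed = True
    return changed
```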

EIGRP – Enhanced Interior Gateway Routing Protocol (proprietary Cisco routing protocol). This already includes extensions to allow it to route IPv4 and/or IPv6 packets. For details, see Cisco documentation.

IS-ISv6 – Extension of IS-IS to support IPv6. Based on two levels, L2 = Backbone, L1 = Stub, L2L1 = interconnected L2 and L1. It runs over CLNS. Each IS node still sends out Link State Packets, and sends information via Tag/Length/Values (TLVs). There are two new TLVs: IPv6 Reachability and IPv6 Interface Address, and a new Network Layer Protocol Identifier: IPv6 NLPID. Other than that, IS-ISv6 is pretty much the same as the original IS-IS. It is still suitable mainly for large ISPs.

OSPF for IPv6 – Open Shortest Path First for IPv6 (also known as OSPFv3). Defined in RFC 5340, “OSPF for IPv6”, July 2008. This is still an interior gateway routing protocol, and is suitable for use within organizations, but not between autonomous systems (BGP4+ is needed for this).

The basic OSPF for IPv4 mechanisms (flooding, Designated Router election, area support, Shortest Path First calculations) are unchanged. Some changes are required because of new protocol semantics or larger address size. Most fields and packet-size limitations in OSPF for IPv4 have been relaxed, and option handling is more flexible. The protocol processing is now per-link, instead of per-subnet. There is now a flooding scope to reflect the scopes of IPv6 addresses. It uses IPv6 link-local addresses. The Addressing Semantics have been removed (with a few exceptions) leaving a mostly network-protocol-independent core. OSPF Router IDs, Area IDs and LSA Link State IDs are still 32 bits, so those can no longer be IP addresses (which in IPv6 are 128 bits).

The new flooding scope allows control over how widely to flood information: link-local, area wide or AS wide (the entire routing domain). It is now possible to run multiple instances of the OSPF protocol on a single link (every message now includes an Instance ID value). Link-local addresses are used where they are meaningful (for transactions completely within a link), but global scope IPv6 addresses must still be used in some places (e.g. as the source address for OSPF protocol packets on virtual links). The AuType and Authentication fields that were present in OSPF for IPv4 have been removed, as IPsec AH and ESP are available and superior. As with TCP, the header checksum covers the entire OSPF packet and a prepended IPv6 pseudo header. All support for MOSPF (Multicast OSPF) has been removed.

OSPF for IPv6 runs only over IPv6, and only routes IPv6. On a dual stack network you would need both OSPF for IPv4 (OSPFv2) and OSPF for IPv6 (OSPFv3) deployed, similar to RIP and RIPng. It is possible that a future version of OSPF will support both IPv4 and IPv6 routing.

BGP4 with Multiprotocol Extensions (also known informally as BGP4+). Defined in RFC 4760, “Multiprotocol Extensions for BGP-4”, January 2007. BGP-4 is currently defined in RFC 4271, “A Border Gateway Protocol 4 (BGP-4)”, January 2006. BGP-4 supports only IPv4. The multiprotocol extensions have been around since RFC 2283, February 1998, but have been updated with each new version of BGP-4.

These extensions allow BGP4+ to carry routing information for multiple Network Layer protocols (e.g. IPv6, IPX, L3VPN, etc). It is designed to be backward compatible, such that a BGP4+ compliant router can exchange IPv4 routing information with a router that does not support the multiprotocol extensions (basic BGP4).

Currently BGP4+ is the primary protocol used for routing IPv6 packets between Autonomous Systems (very large networks under the control of a single entity, such as ISPs or major corporations). Most IPv6 engineers will never work with it, unless they work for an ISP or a really large company.

One of the issues that ISPs face when supporting IPv6 is migrating their BGP-4 gateways to BGP4+ gateways. They typically must also upgrade many routers to dual stack. At the ISP level, many routers have hardware acceleration, so this can be expensive. These may involve “forklift” upgrades, where entirely new high end routers must be purchased, and there may be relatively little resale value for legacy IPv4-only equipment (hint to ISPs: migrate to IPv6 now and sell your old gear while it still has SOME value!)

Looking at Local Routing Information

In Windows, you can view all currently known routes with the “route print” command. If you have enabled the IPv6 protocol, and are connected to an IPv6 network, you might see something like the following (the “-6” tells it to print only IPv6 route information):

img66.png

5.3.5.1 – Network Address Translation

NAT (Network Address Translation) was introduced to extend the lifetime of the IPv4 address space long enough for its replacement, IPv6, to be defined, refined, and compliant infrastructure products and applications to be developed. TCP/IPv6 is now fully developed and ready for prime time. NAT has served its purpose.  It is time to put it out to pasture.

There is a common belief that the practice of hiding nodes behind a single routable IPv4 address (“hide mode NAT”) adds security. It really doesn’t.

First, anytime you make an outgoing connection, either directly, or via NAT, the connection you make is a two-way path, and the node you connect to can easily attack you right through your packet filtering firewall and network address translation. You should have “defense in depth” and protect your node with a host-based firewall whether or not you are behind a firewall and NAT gateway.

Second, if a hacker manages to breach your firewall by installing a Trojan horse onto any node in your network, they can attack you from that compromised node. Hackers have a term for networks that have a strong perimeter defense, but limited internal defenses. It is “hard crunchy outside, soft chewy inside”.  Again, host based firewalls on all nodes are a good idea.

Third, if you are using almost any peer-to-peer software, VoIP (e.g. Skype) or IPsec VPN, it probably includes a mechanism called NAT Traversal (e.g. STUN, TURN, SOCKS, etc). NAT traversal basically bores a hole right through your NAT protection (required for any of the above applications). Anything that includes NAT traversal can easily be used to attack you. Many people think Skype is a productivity tool. Network security people think it is a network vulnerability.

Fourth, any time you open a document from outside (Word Document, Excel spreadsheet, JPEG image, etc) it may contain malware that infects your node right through firewalls and NAT.

It is better to allow direct connections to your node over IPv6, through various layers of firewalls, including a host based firewall, together with good active anti-malware software, than to have NAT giving you a false sense of security.

On the other hand, NAT causes problems with any connectivity model other than simple client server outgoing connections, such as web browser to web server. This was covered in some detail under NAT in Chapter 3, section 3.3.5.1.

The real kicker is that NAT is the hacker’s friend! It is easy for a hacker to hide behind a NAT gateway and do all kinds of mischief, such as sending malware. It is quite difficult for the authorities to figure out which of the nodes hidden behind the common address is doing the bad stuff. To do this, the ISP must log EVERY connection, including source address, destination address, timestamp and port. This amounts to several TERABYTES for each ISP customer over a year, which is not a trivial amount of storage. With a flat address space (as in IPv6), it is far easier to figure out where the attack is coming from.

Because of these issues, there is no IPv6 to IPv6 NAT defined in any IETF standard. There is no need for it to extend the IPv6 address space lifetime, it has no other real benefit, it causes many problems, and it is greatly impeding innovation. Other than those minor things, I guess it’s OK (sarcasm flag!)

On the other hand, there is a real need for IPv4 to IPv6 (and IPv6 to IPv4) Network Address Translation, and there are about 8 proposed methods in the IETF now. All of them have various problems and tradeoffs (that is the nature of NAT). One of the more promising schemes is NAT64 in combination with DNS64. These will be discussed in more detail in the chapter on migration to TCP/IPv6.

5.4 – TCP: The Transmission Control Protocol

There is very little difference in TCP over IPv4 and TCP over IPv6. The main difference is that more storage must be provided in the implementations to hold the 4X larger addresses (16 bytes versus 4 bytes, for each address). The other aspect involves the TCP header checksum, which uses a pseudo header to allow inclusion of the IP addresses in the calculation of the checksum (in addition to the contents of the payload). Of course there is a different pseudo header format for IPv4 and IPv6, given the difference in address size. There are no new RFCs for TCP over IPv6.
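The pseudo header idea can be made concrete with a short sketch. It assumes the standard IPv6 pseudo header layout (source address, destination address, 32-bit upper-layer length, three zero bytes, next header value) and the usual 16-bit one's complement sum. The function names, port numbers and sample header are illustrative only, and the addresses are borrowed from examples elsewhere in this chapter.

```python
import ipaddress
import struct

TCP = 6    # IPv6 Next Header value for TCP
UDP = 17   # IPv6 Next Header value for UDP


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum; odd-length data is padded with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv6_pseudo_header(src: str, dst: str, upper_len: int, next_header: int) -> bytes:
    """Build the 40-byte IPv6 pseudo header (two 16-byte addresses plus 8 bytes)."""
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", upper_len, next_header)
    )


def transport_checksum(src: str, dst: str, next_header: int, segment: bytes) -> int:
    """Checksum over pseudo header + segment (checksum field must be zero on input)."""
    pseudo = ipv6_pseudo_header(src, dst, len(segment), next_header)
    return (~ones_complement_sum(pseudo + segment)) & 0xFFFF


# Hypothetical 20-byte TCP header (SYN, no payload) with a zeroed checksum field.
SRC = "2001:df8:5403:3000::13"
DST = "2001:df8:5403:3000::11"
header = struct.pack("!HHIIHHHH", 49152, 80, 1, 0, 0x5002, 65535, 0, 0)

csum = transport_checksum(SRC, DST, TCP, header)
filled = header[:16] + struct.pack("!H", csum) + header[18:]

# A receiver repeats the calculation over the filled-in segment and expects zero.
print(hex(csum), transport_checksum(SRC, DST, TCP, filled) == 0)
```

The same routine serves UDP by passing the UDP next header value, which is why no separate TCP or UDP specification was needed for IPv6.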

There is one new feature for both TCP and UDP over IPv6 called “Jumbograms”. This is defined in RFC 2675, “IPv6 Jumbograms”, August 1999. Jumbograms are very large packets, with a payload containing more than 65,535 bytes. The standard Payload Length field is only 16 bits, so the maximum payload size is 65,535 bytes. RFC 2675 defines a new Hop-by-hop option that includes a 32 bit payload length field, allowing packet lengths of up to 4.3 billion bytes. Of course, such packets require paths with very large MTUs. The simple 16 bit checksum becomes a less reliable error detection scheme as the payload length increases significantly. Of course, one bit error would require retransmission of an entire packet, so this should be used only on extremely reliable links.
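The two size limits follow directly from the field widths. The few lines below just work out that arithmetic; the helper name is made up for illustration.

```python
# Largest payload expressible in the standard 16-bit Payload Length field.
MAX_STANDARD_PAYLOAD = 2**16 - 1

# Largest payload expressible in the 32-bit Jumbo Payload Length field.
MAX_JUMBO_PAYLOAD = 2**32 - 1


def needs_jumbo_option(payload_len: int) -> bool:
    """True when a payload is too large for the standard 16-bit length field."""
    if not 0 <= payload_len <= MAX_JUMBO_PAYLOAD:
        raise ValueError("payload length out of range for IPv6")
    return payload_len > MAX_STANDARD_PAYLOAD


print(MAX_STANDARD_PAYLOAD, MAX_JUMBO_PAYLOAD)
print(needs_jumbo_option(9000), needs_jumbo_option(100_000))
```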

5.4.1 – TCP Packet Header

No changes are required to the TCP packet header, as port numbers are still 16 bits in length. The only differences are in how the header checksum is calculated (using the IPv6 pseudo header), and the availability of Jumbograms. For details on the TCP packet header fields, see section 3.4.1.

5.5 – UDP: The User Datagram Protocol

UDP over IPv6 differs from UDP over IPv4 in basically the same ways as described for TCP. For details on the UDP header fields, see section 3.5.

5.6 – DHCPv6 – Dynamic Host Configuration Protocol for TCP/IPv6

Unlike with DNS, it was not possible to add new functionality into DHCPv4 to make a new version for IPv6 (let alone a single server that could handle both IPv4 and IPv6). DHCPv6 is pretty much a new design from the ground up. DHCPv4 was built from an earlier protocol called BOOTP, and contains many now unnecessary features from that. DHCPv6 was cleaned up considerably, and contains none of the things leftover from BOOTP.

DHCPv4 runs over IPv4, and supplies only 32-bit IPv4 information (assigned IPv4 addresses, IPv4 addresses of DNS servers, etc). DHCPv6 runs only over IPv6, and supplies only 128-bit IPv6 information (assigned IPv6 addresses, IPv6 addresses of DNS servers, etc). There is no conflict between DHCPv4 and DHCPv6 in terms of functionality or ports used, so it is possible to run both on a single, dual-stack node.

Hosts communicate only with DHCPv6 servers or relay agents on their local link, using their link-local address as the source and a link-scoped multicast address (ff02::1:2) as the destination. DHCPv6 uses UDP ports 546 and 547 (compare with DHCPv4 which uses UDP ports 67 and 68). As with DHCPv4, relay agents are used to allow hosts to communicate with remote DHCPv6 servers (ones not on the local link). This is still done via UDP, but using a site-scope address (ff05::1:3) which is used only by relay agents.

In some simple networks, there is no need for DHCPv6 because of Stateless Address Autoconfiguration. Currently, however, DHCPv6 is the only way for IPv6 capable nodes to automatically learn the IPv6 addresses of DNS servers. This is particularly important for IPv6-only (“pure IPv6”) networks, of which there are not many yet. For dual stack networks, there is no conflict between DHCPv4 and DHCPv6, and both can exist even on a single node. In this case, the IPv4 side of a node would get its IPv4 configuration from the DHCPv4 server, and the IPv6 side of a node would get its IPv6 configuration from the DHCPv6 server.

DHCPv6 allows the administrator far better control over distribution of interface identifiers (low 64 bits of each address) than with Stateless Address Autoconfiguration. With SAA, interface identifiers can either make use of only a tiny percentage of the possible 2^64 address space (when using EUI-64 generated interface identifiers), or have interface identifiers scattered randomly all over the possible 2^64 address space (when using cryptographically generated addresses). Either of these can lead to problems with Network Access Control or Firewall rules. In general, administrators like to cluster IP addresses by department (or other groupings), so that a single firewall or NAC rule can be used for an entire group, by specifying an address range (e.g. all addresses that fall between 2001:df8:5403:3000::1000 and 2001:df8:5403:3000::1fff, inclusive).
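To see why clustered addresses make rules simpler, here is a small sketch using Python's standard ipaddress module. It tests whether an address falls inside the example range above and shows that the whole range collapses into a single prefix. The function name is hypothetical, and the second test address is the randomly generated one from the ipconfig example later in this chapter.

```python
import ipaddress

FIRST = ipaddress.IPv6Address("2001:df8:5403:3000::1000")
LAST = ipaddress.IPv6Address("2001:df8:5403:3000::1fff")


def in_group_range(addr: str) -> bool:
    """True if addr lies between FIRST and LAST, inclusive."""
    return FIRST <= ipaddress.IPv6Address(addr) <= LAST


# The range is 4096 aligned addresses, so one prefix covers all of it.
blocks = list(ipaddress.summarize_address_range(FIRST, LAST))

print(in_group_range("2001:df8:5403:3000::1abc"))                 # DHCPv6-assigned style
print(in_group_range("2001:df8:5403:3000:b5ea:976d:679f:30f5"))   # random interface ID
print(blocks)
```

An address with a random interface identifier lands outside any such range, which is exactly the firewall and NAC problem described above.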

IPv6 capable nodes can be informed that there is a DHCPv6 server available via two bits in the Router Advertisement message. The Router Advertisement message and the relevant bits are described in RFC 4861, “Neighbor Discovery for IP version 6 (IPv6)”, September 2007. In the Router Advertisement message there are two bits, M and O (first and second bits of the sixth byte of the Router Advertisement message), with the following semantics:

M – “Managed address configuration” flag. When set it indicates that addresses are available via DHCPv6. If set, then the O flag can be ignored. This enables stateful DHCPv6, where both the stateless information (IPv6 addresses of DNS and other servers) and global unicast addresses can be obtained from DHCPv6.

O – “Other configuration” flag. When set, it indicates that other configuration information is available via DHCPv6. This includes things such as IPv6 addresses of DNS or other servers. This is called stateless DHCPv6, and is used in conjunction with Stateless Address Autoconfiguration (for obtaining global unicast addresses).

If both M and O bits are clear, then SAA is the only way to get addresses, and there is no source of IPv6 addresses for any servers, including DNS.
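The M and O logic is simple enough to sketch in a few lines. The example assumes the raw ICMPv6 Router Advertisement message is already in hand (starting at the ICMPv6 Type byte, which is 134 for a Router Advertisement). The function name and the sample messages are illustrative, not taken from any real capture.

```python
import struct

ROUTER_ADVERTISEMENT = 134   # ICMPv6 type for Router Advertisement
M_FLAG = 0x80                # first (most significant) bit of the sixth byte
O_FLAG = 0x40                # second bit of the sixth byte


def dhcpv6_mode(ra: bytes) -> str:
    """Classify DHCPv6 availability from the M and O bits of a Router Advertisement."""
    if len(ra) < 6 or ra[0] != ROUTER_ADVERTISEMENT:
        raise ValueError("not a Router Advertisement message")
    flags = ra[5]
    if flags & M_FLAG:
        return "stateful"    # addresses and other information come from DHCPv6
    if flags & O_FLAG:
        return "stateless"   # addresses from SAA, other information from DHCPv6
    return "none"            # SAA only, no DHCPv6 information


def sample_ra(flags: int) -> bytes:
    """Hypothetical RA fixed header: type, code, checksum, hop limit, flags, timers."""
    return struct.pack("!BBHBBHII", ROUTER_ADVERTISEMENT, 0, 0, 64, flags, 1800, 0, 0)


for flags in (0xC0, 0x80, 0x40, 0x00):
    print(f"flags={flags:#04x} -> {dhcpv6_mode(sample_ra(flags))}")
```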

RFCs for DHCPv6

There are a number of RFCs that define DHCPv6. The most important ones are:

*     RFC 3315, “Dynamic Host Configuration Protocol for IPv6 (DHCPv6)”, July 2003 (Standards Track)

*     RFC 3319, “Dynamic Host Configuration Protocol (DHCPv6) Options for Session Initiation Protocol (SIP) Servers”, July 2003

*     RFC 3633, “IPv6 Prefix Options for Dynamic Host Configuration Protocol (DHCP) version 6”, December 2003 (Standards Track)

* RFC 3646, “DNS Configuration options for Dynamic Host Configuration Protocol for IPv6 (DHCPv6)”, December 2003 (Standards Track)

*     RFC 3736, “Stateless Dynamic Host Configuration Protocol (DHCP) Service for IPv6”, April 2004 (Standards Track)

*     RFC 3898, “Network Information Service (NIS) Configuration Options for Dynamic Host Configuration Protocol for IPv6 (DHCPv6)”, October 2004 (Standards Track)

*     RFC 4076, “Renumbering Requirements for Stateless Dynamic Host Configuration Protocol for IPv6 (DHCPv6)”, May 2005 (Informational)

* RFC 4242, “Information Refresh Time Option for Dynamic Host Configuration Protocol for IPv6 (DHCPv6)”, November 2005 (Standards Track)

*     RFC 4477, “Dynamic Host Configuration Protocol (DHCP): IPv4 and IPv6 Dual-Stack Issues”, May 2006 (Informational)

*     RFC 4580, “Dynamic Host Configuration Protocol for IPv6 (DHCPv6) Relay Agent Subscriber-ID Option”, June 2006 (Standards Track)

*     RFC 4649, “Dynamic Host Configuration Protocol for IPv6 (DHCPv6) Relay Agent Remote-ID  Option”, August 2006 (Standards Track)

*     RFC 4704, “The Dynamic Host Configuration Protocol for IPv6 (DHCPv6) Client Fully Qualified Domain Name (FQDN) Option”, October 2006 (Standards Track)

*     RFC 5007, “DHCPv6 Leasequery”, September 2007 (Standards Track)

*     RFC 5460, “DHCPv6 Bulk Leasequery”, February 2009 (Standards Track)

DHCPv6 has a failover mechanism. Two servers can manage a single pool of addresses for redundancy (in case of failure of one of the servers). This also can be used for load balancing.

All IPv6 hosts have automatically generated link-local addresses that can be used to exchange packets with any other node on the local link. DHCPv4 requires some complex hacks to allow hosts to communicate before they get an address. All IPv6 hosts support link-local multicast. All DHCPv6 servers listen to DHCPv6 multicast groups. With DHCPv4, clients have to do a general broadcast to all nodes on the link, which generates significant broadcast traffic on the link and unnecessary traffic handling on all nodes.

With DHCPv6, a single request can configure all interfaces on a node. The server can offer multiple addresses, one for each interface, and each interface can even have different options. With DHCPv4, each interface would require a separate DHCP operation.

Some of the stateless information (that is, other than assigned IPv6 addresses for each node) includes:

*     IPv6 prefix

*     Vendor specific options

*     Addresses of SIP servers

*     Addresses of DNS servers and search options

*     NIS configuration

*     SNTP servers

*     BCMCS servers

There are several implementations of DHCPv6 already on the market. Windows Server 2008 contains a very complete implementation, in addition to its DHCPv4 server. You can view the IPv6 Ready Phase 2 product list for other options, including my own company’s SolidDNS appliance. Some implementations only support stateless mode, which means they can supply stateless information (like DNS addresses) but not actually allocate addresses. Be sure the DHCPv6 server you select includes full support for stateful mode as well (where it can supply addresses to each node, in addition to stateless information). You should also be sure that the gateway router or firewall you select has the ability to inform nodes that DHCPv6 servers are available on the subnet.

Address Reservations with DHCPv6

In the case of DHCPv4, it is possible to make an address reservation, linked to the MAC address of a node. Anytime the node with a MAC address for which an address reservation has been made asks DHCPv4 for an address, it will get the specific address that was reserved for that MAC address. In the case of DHCPv6, the same concept applies, except that you use two identifiers called the IAID (Identity Association IDentifier) and DUID (DHCP Unique IDentifier).

A DUID consists of a two-byte type code represented in network byte order, followed by a variable number of bytes that make up the actual identifier. A DUID can be no more than 128 bytes long (not including the type code). The following types are currently defined:

Link-layer address plus time (DUID-LLT) – recommended for all general purpose computing devices, such as desktop computers, printers, routers, etc. They must contain some form of writable non-volatile storage. Note that the device should configure the time on the node before this DUID is generated, if possible. The only purpose of the timestamp is to lower the chance of an identifier conflict. The link-layer address is typically the MAC address for Ethernet media. The DUID is defined as follows:

img67.png

Vendor-Assigned Based on Enterprise Number (DUID-EN) – This form is assigned to the device by the vendor. This type of DUID is for devices that have some form of non-volatile storage (e.g. EEPROM). The enterprise number is the IANA 32-bit assigned number for the vendor. The identifier can be anything the vendor chooses, but must be unique within that vendor for each device.

img68.png

Link-Layer Address (DUID-LL) – This form is just like the DUID-LLT, without the timestamp. It is recommended for permanently connected devices that have a link-layer address, but no nonvolatile, writeable stable storage.

img69.png

Viewing your Node’s DUID

In Windows 7, using a command prompt, type the command ipconfig /all. In the section related to the interface you are interested in, look for the field DHCPv6 Client DUID. Note that this is a DUID-LLT (type code 00-01). The next two hex digit pairs (00-01) are the hardware type (1 = Ethernet), and the four pairs after that (12-D6-97-E5) are the timestamp. The last six hex digit pairs (00-18-8B-78-DA-1A) are the same as the Physical Address (MAC address).

img70.png

img71.png

You can also see a DHCPv6 IAID value (in this case 234887307). This identifies a particular Identity Association, which allows a server and a client to identify, group, and manage a set of related IPv6 addresses. Each IA consists of an IAID, one or more IPv6 addresses, and the times T1 and T2 for that IA. Each IA is associated with exactly one interface. For further details, see RFC 3315, section 11.
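A DUID shown by ipconfig can be pulled apart programmatically. The sketch below assumes the DUID-LLT layout from RFC 3315, section 9.2: a 2-byte type, a 2-byte hardware type, a 4-byte timestamp counted in seconds since midnight UTC on January 1, 2000, and then the link-layer address. The function name and the returned dictionary keys are illustrative. The sample value is assembled from the hex digit pairs quoted in the text above.

```python
import struct
from datetime import datetime, timedelta, timezone

# DUID-LLT timestamps count seconds from this moment (RFC 3315, section 9.2).
DUID_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def parse_duid_llt(text: str) -> dict:
    """Split a dash-separated DUID-LLT string into its fields."""
    raw = bytes.fromhex(text.replace("-", ""))
    if len(raw) <= 8:
        raise ValueError("too short for a DUID-LLT")
    duid_type, hardware_type, seconds = struct.unpack("!HHI", raw[:8])
    if duid_type != 1:
        raise ValueError(f"not a DUID-LLT (type {duid_type})")
    return {
        "duid_type": duid_type,
        "hardware_type": hardware_type,                   # 1 means Ethernet
        "timestamp": seconds,
        "generated": DUID_EPOCH + timedelta(seconds=seconds),
        "link_layer": raw[8:].hex("-").upper(),           # MAC address for Ethernet
    }


duid = parse_duid_llt("00-01-00-01-12-D6-97-E5-00-18-8B-78-DA-1A")
for name, value in duid.items():
    print(f"{name:14} {value}")
```

Note that the timestamp records when the DUID was first generated, not when any lease was obtained.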

DHCPv6 Ports and Messages

Clients and servers exchange DHCPv6 messages using UDP over IPv6. The client uses a link-local address, or addresses obtained via other mechanisms, as the source address for transmitting and receiving DHCPv6 messages. Servers receive messages from clients using a reserved link-scoped multicast address, so that clients don’t need to be configured with the addresses of DHCPv6 servers. To allow hosts to communicate with servers on other links, DHCPv6 relay agents are used. Clients listen for DHCPv6 messages on UDP port 546. Servers and relay agents listen for DHCPv6 messages on UDP port 547.

The link-scoped multicast address used by a client to communicate with an on-link relay agent or server is ff02::1:2. All DHCPv6 servers and relay agents are members of this multicast group.

The site-scoped multicast address ff05::1:3 is used by a relay agent to communicate with servers, either when it wants to send a message to all DHCPv6 servers or when it does not know the unicast addresses of the servers. All DHCPv6 servers in a given site are members of this multicast group.

There are a number of DHCPv6 messages:

The SOLICIT message (1), is sent (multicast) by a client to locate servers.

The ADVERTISE message (2) is sent (unicast) by a server to indicate that it is available to provide DHCPv6 service, in response to a SOLICIT message from a client.

The REQUEST message (3) is sent (unicast) by a client to request configuration parameters, including IP  addresses, from a specific server.

The CONFIRM message (4) is sent (multicast) by a client to any available server to determine whether the addresses it was assigned are still appropriate on the link to which the client is connected.

The RENEW message (5) is sent (unicast) by a client to the server that originally provided the client’s address and configuration parameters, to extend the lifetime on the addresses assigned to the client and update other configuration parameters.

The REBIND message (6) is sent (multicast) by a client to any available server to extend the lifetimes on the addresses assigned to the client and to update other configuration parameters. This message is sent after a client receives no response to a RENEW message.

The REPLY message (7) is sent (unicast) by a server to a client in response to a SOLICIT, REQUEST, RENEW or REBIND message received from a client. A server sends a REPLY message containing configuration parameters in response to an INFORMATION-REQUEST message. It sends a REPLY message in response to a CONFIRM message, confirming or denying that the addresses assigned to the client are appropriate on the link to which the client is connected. A server sends a REPLY message to acknowledge receipt of a RELEASE or DECLINE message.

The RELEASE message (8) is sent (unicast) by a client to the server that assigned addresses to the client to indicate that the client will no longer use one or more of the assigned addresses.

The DECLINE message (9) is sent (unicast) to a server to indicate that the client has determined that one or more addresses assigned by the server are already in use on the link to which the client is connected.

The RECONFIGURE message (10) is sent (unicast) by a server to a client to inform the client that the server has new or updated configuration parameters, and that the client should initiate a RENEW/REPLY or INFORMATION-REQUEST/REPLY transaction with the server in order to obtain the updated information.

The INFORMATION-REQUEST message (11) is sent (unicast) by a client to a server to request configuration parameters, without the assignment of any IP addresses to the client.

The RELAY-FORW message (12) is sent (multicast) by a relay agent to forward messages to servers, either directly or through another relay agent. The received message, either a client message or a RELAY-FORW message from another relay agent, is encapsulated in an option in the RELAY-FORW message.

The RELAY-REPL message (13) is sent (unicast) by the server to a relay agent containing a message that the relay agent should then deliver to a client. The RELAY-REPL message may be relayed by other relay agents for delivery to the destination relay agent. The server encapsulates the client message as an option in the RELAY-REPL message, which the relay agent extracts and then relays to the next relay agent or directly to the client.
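For anyone writing a packet decoder or reading a capture, the thirteen message types above map naturally onto an enumeration. This is only a convenience sketch; the class name, the grouping by sender, and the helper function are illustrative choices, with the numeric codes taken from the list above.

```python
from enum import IntEnum


class DHCPv6Message(IntEnum):
    """DHCPv6 message type codes, as listed in the text."""
    SOLICIT = 1
    ADVERTISE = 2
    REQUEST = 3
    CONFIRM = 4
    RENEW = 5
    REBIND = 6
    REPLY = 7
    RELEASE = 8
    DECLINE = 9
    RECONFIGURE = 10
    INFORMATION_REQUEST = 11
    RELAY_FORW = 12
    RELAY_REPL = 13


# Grouping by who originates each message, following the descriptions above.
SENT_BY_SERVER = {
    DHCPv6Message.ADVERTISE,
    DHCPv6Message.REPLY,
    DHCPv6Message.RECONFIGURE,
    DHCPv6Message.RELAY_REPL,
}
SENT_BY_RELAY = {DHCPv6Message.RELAY_FORW}
SENT_BY_CLIENT = set(DHCPv6Message) - SENT_BY_SERVER - SENT_BY_RELAY


def describe(code: int) -> str:
    """Return a readable name for a message type code."""
    message = DHCPv6Message(code)
    if message in SENT_BY_SERVER:
        sender = "server"
    elif message in SENT_BY_RELAY:
        sender = "relay agent"
    else:
        sender = "client"
    return f"{message.name} ({code}), sent by {sender}"


print(describe(1))
print(describe(7))
print(describe(12))
```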

DHCPv6 Status Codes

The following codes are used to communicate the success or failure of operations requested in messages from clients and servers, and additional information about the specific cause in the event of a failure to perform the operation.

img72.png

img73.png

DHCPv6 Message Syntax

All messages sent between clients and servers share the following syntax:

img74.png

msg-type Identifies the DHCP message type.

transaction-id The transaction ID for this message exchange.

options Options carried in this message.
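The message syntax can be illustrated with a small encoder and decoder. It assumes a 1-byte msg-type, a 3-byte transaction-id, and options encoded as a 2-byte option code, a 2-byte length and then the data (the option format of RFC 3315, section 22.1). The function names and the fixed transaction ID are hypothetical; the option codes shown (1 for Client ID, 6 for Option Request, 23 and 24 for DNS servers and domain list) are the registered values. The DUID is the one discussed for the sample node later in this chapter.

```python
import struct

SOLICIT = 1
OPTION_CLIENTID = 1
OPTION_ORO = 6            # Option Request Option
OPTION_DNS_SERVERS = 23
OPTION_DOMAIN_LIST = 24


def build_message(msg_type: int, transaction_id: int, options: list) -> bytes:
    """Encode msg-type (1 byte), transaction-id (3 bytes) and a list of options."""
    if not 0 <= transaction_id < 2**24:
        raise ValueError("transaction-id must fit in 24 bits")
    header = bytes([msg_type]) + transaction_id.to_bytes(3, "big")
    body = b"".join(
        struct.pack("!HH", code, len(data)) + data for code, data in options
    )
    return header + body


def parse_message(packet: bytes) -> tuple:
    """Decode a client/server message into (msg_type, transaction_id, options)."""
    if len(packet) < 4:
        raise ValueError("message shorter than the fixed header")
    msg_type = packet[0]
    transaction_id = int.from_bytes(packet[1:4], "big")
    options, offset = [], 4
    while offset < len(packet):
        if offset + 4 > len(packet):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", packet, offset)
        offset += 4
        if offset + length > len(packet):
            raise ValueError("truncated option data")
        options.append((code, packet[offset:offset + length]))
        offset += length
    return msg_type, transaction_id, options


# A SOLICIT carrying a Client ID (the node's DUID) and an Option Request Option.
client_duid = bytes.fromhex("000100011199BD2800221523329C")
oro = struct.pack("!HH", OPTION_DNS_SERVERS, OPTION_DOMAIN_LIST)
packet = build_message(SOLICIT, 0x1A2B3C, [(OPTION_CLIENTID, client_duid), (OPTION_ORO, oro)])

print(packet.hex())
print(parse_message(packet))
```

A real client would pick a random transaction ID for each exchange so that replies can be matched to requests.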

5.6.1 – The DHCPv6 Protocol

DHCPv6 works in somewhat the same way as DHCPv4, except that different messages are used, and communication between client and server takes place using link-local scoped multicast and unicast addresses.

When it first comes up, before any DHCPv6 operation, an IPv6 capable client node obtains a link-local unicast address through ND (and possibly a global unicast address as well, using information from a Router Advertisement message). If a Router Advertisement message is seen, then the client can check the M & O bits in it to determine if there is Stateful DHCPv6, Stateless DHCPv6, or no DHCPv6 available. If no Router Advertisement is available, a client can still attempt DHCPv6 server discovery, as follows.

The client sends a SOLICIT message to multicast group ff02::1:2. This address specifies all DHCPv6 servers and relay agents on the local link. The included options are:

ClientID

Option Request Option (IA-NA, DNS-Servers, Domain-List)

One or more DHCPv6 servers on the link (or servers on remote links, via DHCPv6 relay agents) will reply with an ADVERTISE message to the client that sent the SOLICIT message (via unicast). The included options are:

ServerID, ClientID

DNS-Servers, IA-NA (IAID, IAPREFIX).

The client will select one responding DHCPv6 server and send a REQUEST message to it (via unicast). This will actually ask for an address lease. The included options are:

ServerID, ClientID

Option Request Option (IA-NA, DNS-Servers, Domain-List)

The selected server will send a REPLY message to the client that sent the REQUEST message (via unicast). This will confirm the address lease. The included options are:

ServerID, ClientID

DNS-Servers: 2001:xxx:yyy:zzz::a, 2001:xxx:yyy:zzz::b

IA-NA: IAID: 1,

IAPREFIX: Preferred lifetime: nnnnnn,

Valid lifetime: nnnnnn,

Prefix: 2001:xxx:yyy:zzz::c/64

At some later point in time, when the lease is about to expire, the client will send a RENEW message to the selected DHCPv6 server. The server will extend the lease and respond with a REPLY message.

For Further Information on DHCPv6

For details on how clients send and respond to DHCPv6 messages see RFC 3315, section 17. For details on DHCP Client-initiated Configuration Exchanges see RFC 3315, section 18.

For details on DHCP Server-initiated Configuration Exchanges see RFC 3315, section 19. For details on Relay Agent behavior see RFC 3315, section 20.

For details on the optional authentication mechanism, for use of DHCPv6 in unsecured environments such as wireless networks, see RFC 3315, section 21.

For available DHCPv6 message options and their syntax see RFC 3315, section 22.

Stateless DHCPv6 assumes that assigned IPv6 addresses are obtained some other way, such as Stateless Address Autoconfiguration, and that only stateless information (IPv6 addresses of DNS servers, SIP servers, etc) will be obtained from DHCPv6. RFC 3736, “Stateless Dynamic Host Configuration Protocol (DHCP) Service for IPv6”, April 2004, defines the subset of messages and options from the full (stateful) DHCPv6 functionality that are required to provide stateless DHCPv6 service.

For details on publishing the address of SIP servers with DHCPv6, see RFC 3319, “Dynamic Host Configuration Protocol (DHCPv6) Options for Session Initiation Protocol (SIP) Servers”, July 2003.

For details on publishing the address of DNS servers with DHCPv6, see RFC 3646, “DNS Configuration options for Dynamic Host Configuration Protocol for IPv6 (DHCPv6)”, December 2003.

For details on publishing the address of NIS (Network Information Service) servers with DHCPv6, see RFC 3898, “Network Information Service (NIS) Configuration Options for Dynamic Host Configuration Protocol for IPv6 (DHCPv6)”, October 2004.

For details on publishing the address of SNTP (Simple Network Time Protocol) servers with DHCPv6, see RFC 4075, “Simple Network Time Protocol (SNTP) Configuration Option for DHCPv6”, May 2005.

5.6.2 – Useful Commands Related to DHCPv6

In Windows 7, there are some commands available in a command prompt box related to DHCPv6:

img75.png

This is an example of the output from “ipconfig /all”:

img76.png

img77.png

In the above, notice the following:

*     The MAC address (“Physical Address”) of the interface is 00-22-15-23-32-9C

* A 64-bit interface identifier (b5ea:976d:679f:30f5) was created, which is a cryptographically generated value (not from EUI-64). A link-local unicast address was generated from this (by prepending fe80::/64). The link-local address of the default gateway  (fe80::21b:21ff:fe1d:c159) was then obtained using ND Router Discovery.

* A Router Advertisement message supplied the subnet prefix (2001:df8:5403:3000::/64), so the node used it to create two global unicast addresses, one of which (2001:df8:5403:3000:b5ea:976d:679f:30f5) used the  64-bit random interface identifier from ND, the other (2001:df8:5403:3000:218a:4956:7d8c:7c2c) used yet another random interface identifier.

* Obtain an IP address automatically, and Obtain DNS server address automatically were selected in the IPv4 GUI configuration (“DHCP Enabled”), and a working DHCPv4 server was found (“Autoconfiguration Enabled”). So, an IPv4 address (172.20.2.1), the subnet mask (255.255.0.0), the default gateway (172.20.0.1) and the IPv4 addresses of two DNS servers were obtained from the DHCPv4 server. The lease for this was obtained on 3/12/2010, 9:42pm, and will expire on 3/18/2010, at 9:43pm. The MAC address (00-22-15-23-32-9C) was used to make a DHCPv4 reservation for this node, so this node will always get that IPv4 address.

* Obtain an IPv6 address automatically, and Obtain DNS server address automatically were selected in the IPv6 GUI configuration, and both the M and O bits were set in the Router Advertisement message (Stateful and Stateless DHCPv6 available), so another global unicast IPv6 address (2001:df8:5403:3000::2:1) was obtained from DHCPv6, plus the IPv6 addresses of two DNS servers. The lease for this was obtained on 3/12/2010, at 9:43pm and will expire on 3/24/2010 at 9:43pm.

* The DUID of the node is 00-01-00-01-11-99-BD-28-00-22-15-23-32-9C. The first two hex digit pairs contain 00-01. That means this is a type 1 DUID (DUID-LLT), “Link-Layer plus Timestamp”. The next two hex digit pairs (00-01) are the hardware type (1 = Ethernet), the following four pairs (11-99-BD-28) are the timestamp, and the last six pairs (00-22-15-23-32-9C) contain the interface MAC address. This DUID, along with the IAID (218112533), was used to make a DHCPv6 address reservation for this node. So, this node will always get that IPv6 address.
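The way the link-local and global addresses in this output were formed (a 64-bit prefix joined with a 64-bit interface identifier) can be reproduced with the standard ipaddress module. The function name is illustrative; the prefixes and interface identifier are the ones noted in the list above.

```python
import ipaddress


def combine(prefix: str, interface_id: str) -> ipaddress.IPv6Address:
    """Join a /64 prefix with a 64-bit interface identifier (written as four groups)."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    low_bits = int(ipaddress.IPv6Address("::" + interface_id))
    if low_bits >> 64:
        raise ValueError("interface identifier is longer than 64 bits")
    return ipaddress.IPv6Address(int(network.network_address) | low_bits)


INTERFACE_ID = "b5ea:976d:679f:30f5"

link_local = combine("fe80::/64", INTERFACE_ID)
global_unicast = combine("2001:df8:5403:3000::/64", INTERFACE_ID)

print(link_local)
print(global_unicast)
print(link_local.is_link_local, global_unicast.is_link_local)
```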

5.7 – TCP/IPv6 Network Configuration

Let’s assume our LAN has the following configuration:

img78.png

Furthermore, assume the DHCPv6 server is correctly configured with this information, and is managing the address range 2001:df8:5403:3100::1000 to 2001:df8:5403:3100::1fff (and that some leases have already been granted).

Any node connected to a network with TCP/IPv6 (that will access IPv6 nodes on the Internet) must have certain items configured, including:

*     IPv6 link-local node address (obtained automatically)

*     All nodes on local link multicast address (ff02::1), there by default

*     IPv6 global unicast address

*     IPv6 address of default gateway (link-local address of gateway obtained automatically)

*     IPv6 addresses of DNS servers (manually configured or from DHCPv6)

*     Nodename

*     DNS domain name

5.7.1 – Manual Network Configuration for IPv6-Only

It is possible to perform TCP/IPv6 configuration manually, either by editing ASCII configuration files, as in FreeBSD or Linux; or via GUI configuration tools, as in Windows. If you have understood the material in this chapter, it should be fairly easy to configure your node(s). In most cases, if you have ISP service, the ISP will give you all the information necessary to configure your node(s). In the coverage of Dual Stack networks we will show configuration of both IPv4 and IPv6 on a single node.

Auto Network Configuration Using Stateless Address Autoconfiguration

It is easy for a FreeBSD node to be automatically configured using Stateless Address Autoconfiguration. Note that the global unicast address will be created with the EUI-64 algorithm from your MAC address.
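The EUI-64 step mentioned here is mechanical: flip the universal/local bit in the first byte of the MAC address and insert ff:fe in the middle. The sketch below shows the idea; the function names are illustrative. The MAC address 00-1B-21-1D-C1-59 is an assumption, chosen because it reproduces the gateway link-local address (fe80::21b:21ff:fe1d:c159) shown in the earlier ipconfig discussion.

```python
import ipaddress


def eui64_interface_id(mac: str) -> bytes:
    """Derive a 64-bit modified EUI-64 interface identifier from a 48-bit MAC."""
    octets = bytearray(bytes.fromhex(mac.replace("-", "").replace(":", "")))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02                                   # flip the universal/local bit
    return bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with the EUI-64 interface identifier for this MAC."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    interface_id = int.from_bytes(eui64_interface_id(mac), "big")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


ASSUMED_MAC = "00-1B-21-1D-C1-59"   # assumption, see the note above

print(slaac_address("fe80::/64", ASSUMED_MAC))
print(slaac_address("2001:df8:5403:3000::/64", ASSUMED_MAC))
```

Because the identifier is derived from the MAC address, the node gets the same low 64 bits on every network it joins.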

Let’s configure a FreeBSD 7.2 node automatically with SAA. Assign it the following configuration:

img79.png

You need to edit the following files (you will need root privilege to do this):

/etc/rc.conf

...

hostname="us1.redwar.org"

ipv6_enable="YES"

...

/etc/resolv.conf

domain      redwar.org

nameserver  2001:df8:5403:3000::11

nameserver  2001:df8:5403:3000::12

If you make these changes, then reboot, you can check the configuration as shown:

$ ifconfig vr0

img80.png

$ uname -n

us1.redwar.org

$ nslookup

> server

> exit

$ netstat -f inet6 -rn

Network Configuration Using a Manually Specified (Static) IPv6 Address

Let’s configure a FreeBSD 7.2 node manually with a static node address. Assign it the following configuration:

img81.png

You need to edit the following files (you will need root privilege to do this):

/etc/rc.conf

...

hostname="us1.redwar.org"

ipv6_enable="YES"

ipv6_ifconfig_vr0="2001:df8:5403:3000::13 prefixlen 64"

ipv6_defaultrouter="2001:df8:5403:3000::1"

...

/etc/resolv.conf

domain      redwar.org

nameserver  2001:df8:5403:3000::11

nameserver  2001:df8:5403:3000::12

If you make these changes, then reboot, you can check the configuration as shown:

$ ifconfig vr0

img82.png

$ uname -n

us1.redwar.org

$ nslookup

> server

> exit

$ netstat -f inet6 -rn

Note: if you specify a static IPv6 address in FreeBSD 7.x (ipv6_ifconfig_vr0="…"), the node will not obtain a link-local default gateway address automatically. Therefore in this case it is essential that you also manually specify a default gateway address (which can be global unicast or link-local), using the ipv6_defaultrouter="…" option in /etc/rc.conf. If no default gateway is defined, communication with other on-link nodes will work OK, but communication with off-link nodes will fail.

This is different from the behavior of Windows 7 and Linux, where the addition of a manually configured global unicast address does not stop the node from obtaining the link-local default gateway automatically.