Computer Deployment over VPN
Remote Installation Services (RIS) was a viable choice for computer deployment up to and including Windows Server 2003. Even though there was a lot of room for improvement, it worked quite well… also over VPN connections.
A couple of years ago I decided to start using Windows Deployment Services (WDS) and Microsoft Deployment Toolkit (MDT). They worked great, were easier to handle and offered more functions. If you are familiar with RIS/WDS or other deployment technologies, you might already know that you cannot deploy across subnets unless you add special options to your DHCP scope (see http://support.microsoft.com/kb/259670) or set up DHCP Relay Agents/IP helpers/etc. on each subnet. My choice was to configure IP helper addresses on my Cisco networking equipment.
At the company where I’m currently employed, we have more than 120 minor off-site locations connected to headquarters through VPN tunnels. It was decided that we would like to offer the possibility of deploying a workstation through the VPN tunnel. Obviously it would take hours to do so, but it would save us a lot of logistical problems and time, since some of the off-site locations are in other parts of the world. I didn’t expect this to be a problem since I knew this could be handled by good old RIS, but I was very much wrong. When PXE booting across the VPN I kept getting the following error:
PXE-E32: TFTP open timeout
I googled a lot and tried changing DHCP settings, but nothing really worked… I kept ending up with the TFTP timeout. When all else fails, my best advice is to go back to basics and capture some network traffic to find out what’s really going on. That’s what I did.
To make the network captures easier to follow, here is a simple layout of the network. The client with the IP 10.1.3.83 is an internal client that can be deployed through WDS. The client with the IP 10.129.200.150 is the client that is experiencing problems with deployment over the VPN tunnel. The server has the IP 10.1.0.17.
To know what a working scenario looks like on the network, I made a capture of the initial deployment process from the 10.1.3.83 client, which could be deployed successfully. Since this is a working scenario the packets look the same on both the client and the server side.
No. Time Source Destination Protocol Info
1 0.000000 10.1.3.83 10.1.0.17 DHCP DHCP Request - Transaction ID 0xb8b20989
2 0.007570 10.1.0.17 10.1.3.83 DHCP DHCP ACK - Transaction ID 0xb8b20989
3 0.013010 10.1.3.83 10.1.0.17 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000
4 0.024179 10.1.0.17 10.1.3.83 TFTP Option Acknowledgement, tsize\000=31124\000
5 0.024185 10.1.3.83 10.1.0.17 TFTP Error Code, Code: Not defined, Message: TFTP Aborted\000
6 0.025794 10.1.3.83 10.1.0.17 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000
7 0.027849 10.1.0.17 10.1.3.83 TFTP Option Acknowledgement, blksize\000=1456\000
8 0.027897 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 0
9 0.028376 10.1.0.17 10.1.3.83 TFTP Data Packet, Block: 1
10 0.028381 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 1
11 0.029067 10.1.0.17 10.1.3.83 TFTP Data Packet, Block: 2
12 0.029071 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 2
...
51 0.049679 10.1.0.17 10.1.3.83 TFTP Data Packet, Block: 22
52 0.049681 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 22
53 0.086404 10.1.3.83 10.1.0.17 DHCP DHCP Request - Transaction ID 0xb8b20989
54 0.086408 10.1.0.17 10.1.3.83 DHCP DHCP ACK - Transaction ID 0xb8b20989
55 0.133574 10.1.3.83 10.1.0.17 TFTP Read Request, File: Boot\x64\pxeboot.com\000, Transfer type: octet\000
56 0.027849 10.1.0.17 10.1.3.83 TFTP Option Acknowledgement, blksize\000=1456\000
57 0.140103 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 0
58 0.140624 10.1.0.17 10.1.3.83 TFTP Data Packet, Block: 1
59 0.140629 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 1
60 0.141278 10.1.0.17 10.1.3.83 TFTP Data Packet, Block: 2
61 0.141282 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 2
...
92 0.153642 10.1.0.17 10.1.3.83 TFTP Data Packet, Block: 18
93 0.154076 10.1.3.83 10.1.0.17 TFTP Acknowledgement, Block: 18
Please notice packets 7 and 56, the Option Acknowledgements with blksize=1456. The TFTP server tells the client that it should expect a block size (blksize) of 1456 bytes. In RIS this was 512 bytes, but in Windows Server 2008 and later this has been changed to 1456 bytes to speed up the deployment process. But Microsoft forgot to take VPN into consideration.
Here’s what the packet capture looks like from the server when the remote client on the 10.129.200.0/24 network is PXE booting. Packet 7 reaches the client, since the server receives an acknowledgement in packet 8. After that the server just keeps sending block 1 again and again, as it doesn’t receive an acknowledgement from the client.
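To make the blksize negotiation a bit more concrete, here is a minimal Python sketch of how a read request with a blksize option and the matching Option Acknowledgement are laid out on the wire, following RFC 2348 (where the client proposes a block size and the server confirms it). The function names build_rrq and parse_oack are my own, made up for illustration; this is not the code used by WDS or by any PXE ROM. It also shows where the \000 separators in the capture come from: every field is terminated by a NUL byte.

```python
import struct

OP_RRQ = 1   # TFTP opcode: read request
OP_OACK = 6  # TFTP opcode: option acknowledgement


def build_rrq(filename: str, blksize: int, mode: str = "octet") -> bytes:
    """Build a TFTP read request that carries a blksize option."""
    fields = [filename, mode, "blksize", str(blksize)]
    # Each field is an ASCII string followed by a single NUL byte.
    body = b"".join(field.encode("ascii") + b"\x00" for field in fields)
    return struct.pack("!H", OP_RRQ) + body


def parse_oack(packet: bytes) -> dict:
    """Return the options of an Option Acknowledgement as a dict."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode != OP_OACK:
        raise ValueError("not an OACK packet")
    # Splitting on NUL leaves one empty element after the last terminator.
    parts = packet[2:].split(b"\x00")[:-1]
    names, values = parts[0::2], parts[1::2]
    return {n.decode("ascii").lower(): v.decode("ascii")
            for n, v in zip(names, values)}


if __name__ == "__main__":
    print(build_rrq("boot\\x86\\wdsnbp.com", 1456))
    print(parse_oack(b"\x00\x06blksize\x001456\x00"))
```

Once the client has acknowledged the OACK (the "Acknowledgement, Block: 0" packet in the capture), every data packet except the last carries exactly blksize bytes of payload.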
No. Time Source Destination Protocol Info
1 0.000000 10.129.200.150 10.1.0.17 DHCP DHCP Request - Transaction ID 0xb8b20989
2 0.000267 10.1.0.17 10.129.200.150 DHCP DHCP ACK - Transaction ID 0xb8b20989
3 0.041467 10.129.200.150 10.1.0.17 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000
4 0.043171 10.1.0.17 10.129.200.150 TFTP Option Acknowledgement, tsize\000=31124\000
5 0.058853 10.129.200.150 10.1.0.17 TFTP Error Code, Code: Not defined, Message: TFTP Aborted\000
6 0.066817 10.129.200.150 10.1.0.17 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000
7 0.071815 10.1.0.17 10.129.200.150 TFTP Option Acknowledgement, blksize\000=1456\000
8 0.088466 10.129.200.150 10.1.0.17 TFTP Acknowledgement, Block: 0
9 0.088706 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
10 2.074910 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
11 4.081189 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
12 6.074582 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
13 8.076182 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
14 10.076389 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
15 12.086364 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
16 14.085904 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
17 16.074910 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
Let us take a look at the client side of things to see what’s going on there.
No. Time Source Destination Protocol Info
1 0.000000 10.129.200.150 10.1.0.17 DHCP DHCP Request - Transaction ID 0xb8b20989
2 0.044616 10.1.0.17 10.129.200.150 DHCP DHCP ACK - Transaction ID 0xb8b20989
3 0.050171 10.129.200.150 10.1.0.17 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000
4 0.068788 10.1.0.17 10.129.200.150 TFTP Option Acknowledgement, tsize\000=31124\000
5 0.068934 10.129.200.150 10.1.0.17 TFTP Error Code, Code: Not defined, Message: TFTP Aborted\000
6 0.070140 10.129.200.150 10.1.0.17 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000
7 0.097809 10.1.0.17 10.129.200.150 TFTP Option Acknowledgement, blksize\000=1456\000
8 0.097934 10.129.200.150 10.1.0.17 TFTP Acknowledgement, Block: 0
9 0.142206 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3ab9) [Reassembled in #10]
10 0.142305 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
11 2.129867 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3abe) [Reassembled in #12]
12 2.129870 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
13 4.133477 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3ac7) [Reassembled in #14]
14 4.133931 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
15 6.127184 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3acc) [Reassembled in #16]
16 6.127536 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
17 8.127606 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3ad1) [Reassembled in #18]
18 8.128362 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
19 10.128063 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3ad7) [Reassembled in #20]
20 10.128068 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
21 12.137783 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3aec) [Reassembled in #22]
22 12.137788 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
23 14.135978 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3b19) [Reassembled in #24]
24 14.136504 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
25 16.124145 10.1.0.17 10.129.200.150 IP Fragmented IP protocol (proto=UDP 0x11, off=0, ID=3b39) [Reassembled in #26]
26 16.124460 10.1.0.17 10.129.200.150 TFTP Data Packet, Block: 1
As you can see, the client actually receives block 1 repeatedly, so why doesn’t it send back an acknowledgement? The reason is that the PXE client on the network card is very simple. It doesn’t support fragmented packets (notice the "Fragmented IP protocol" lines in the capture above).
You might ask why this isn’t a problem on the internal network. Well, Microsoft set the block size to 1456 bytes to leave some room for the IP headers, so the packets wouldn’t get fragmented on the network. However, when you add VPN into the equation you get some extra VPN headers and the packet grows too big. To take care of this, the networking equipment divides each TFTP packet into 2 VPN packets to be able to send it across the Internet, and fragmentation occurs. Since the PXE client is unable to handle this, it doesn’t send an acknowledgement to the server, which then keeps sending the same block again and again until we get a TFTP timeout.
Since we cannot change the configuration of the client, the configuration of the server needs to be changed. But Microsoft hasn’t made this an easy task. The TFTP server is a built-in part of WDS, and Microsoft doesn’t provide an interface to modify settings related to the TFTP server.
Luckily, Microsoft has realized that this is a significant problem, and after a while I stumbled upon this KB article: http://support.microsoft.com/kb/975710. Here you can request a hotfix from Microsoft that makes it possible to control the TFTP block size through the registry. Which block size is best for your environment depends on the type of VPN you use, as the size of the VPN headers varies. If you don’t know where to begin, 1400 bytes could be a good place to start. If it doesn’t work, lower the value. If it works, you can leave it there or add a few bytes to speed up deployment (probably not noticeable). By applying this hotfix and setting the block size in the registry, you avoid packet fragmentation through the VPN; the PXE client is then capable of handling the TFTP packets and the deployment process should work.
Hope this made the problem clearer for everyone. I think this is also a problem with System Center Configuration Manager (SCCM). If someone can confirm this, I’ll add it to this post. I’ve attached the original packet captures below.
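The size arithmetic behind this can be sketched in a few lines of Python. The header sizes are the standard ones (4 bytes TFTP data header, 8 bytes UDP, 20 bytes IPv4 without options) and 1500 bytes is the usual Ethernet MTU. The 60 bytes of VPN overhead is purely an assumed example value, not something I measured on my tunnels; the real number depends on the VPN type.

```python
TFTP_DATA_HEADER = 4  # 2 bytes opcode + 2 bytes block number
UDP_HEADER = 8
IPV4_HEADER = 20      # assuming no IP options


def datagram_size(blksize: int) -> int:
    """Size of the IP packet that carries one full TFTP data block."""
    return blksize + TFTP_DATA_HEADER + UDP_HEADER + IPV4_HEADER


def is_fragmented(blksize: int, path_mtu: int) -> bool:
    """True if a full data block does not fit into one packet on the path."""
    return datagram_size(blksize) > path_mtu


if __name__ == "__main__":
    ethernet_mtu = 1500
    assumed_vpn_overhead = 60  # hypothetical value, depends on the VPN
    tunnel_mtu = ethernet_mtu - assumed_vpn_overhead

    print(datagram_size(1456))              # 1488
    print(is_fragmented(1456, ethernet_mtu))  # False
    print(is_fragmented(1456, tunnel_mtu))    # True
    print(is_fragmented(512, tunnel_mtu))     # False
```

With a block size of 1456 the packet is 1488 bytes, which fits into a 1500 byte Ethernet frame but not into a tunnel that only has 1440 bytes left. The old RIS block size of 512 bytes fits in both cases, which matches the fact that RIS worked over VPN.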
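The "start at 1400 and lower it until it works" approach can be written down as a small Python sketch. The pxe_boot_succeeds callback is hypothetical: in real life it stands for applying the value on the server and PXE booting a client at the remote site by hand. The step of 16 bytes and the lower limit of 512 bytes (the classic TFTP block size) are also just assumptions for the example.

```python
from typing import Callable, Optional


def find_working_blksize(pxe_boot_succeeds: Callable[[int], bool],
                         start: int = 1400,
                         step: int = 16,
                         floor: int = 512) -> Optional[int]:
    """Step the block size down from start until a PXE boot succeeds.

    Returns the first working block size, or None if nothing down to
    floor works (in which case the problem is probably something else).
    """
    blksize = start
    while blksize >= floor:
        if pxe_boot_succeeds(blksize):
            return blksize
        blksize -= step
    return None


if __name__ == "__main__":
    # Stand-in for a real test: pretend the tunnel passes packets of up to
    # 1390 bytes unfragmented (32 bytes of TFTP/UDP/IP headers per block).
    def simulated_boot(blksize: int) -> bool:
        return blksize + 32 <= 1390

    print(find_working_blksize(simulated_boot))
```

Since the largest working value only saves a few packets per file, there is little to gain from fine-tuning it further once a boot succeeds.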




