It’s been a while since I posted some content on my blog. But one of my goals for 2010 is to continue sharing my thoughts and knowledge about the products and technologies I love.
I’ve been using Citrix Provisioning Services (PVS) for XenApp environments for a while now, and I must say I love the product! Using PVS you’re able to create an extremely flexible and efficient XenApp environment. I don’t think I need to point out all the advantages of using Provisioning Services in a XenApp environment here…
But after implementing the product for a while I also discovered that making your environment 100% highly available can be a real bitch. There’s a lot that needs to be done and it’s easy to forget a component. In this article I will try to explain how to implement a 100% highly available Provisioning Server environment. I know there are already documentation and blogs out there that describe Provisioning Server High Availability, but none of them seem to tell you the complete story.
Streaming Service
The streaming service is the core of your environment: it is the service that delivers the actual vDisk to the target devices, so it is obviously an essential part of the environment. The streaming service is provided by the Provisioning Servers. Since version 5, Provisioning Server uses a farm model, so you can easily add extra servers to your Provisioning Server farm to handle the load of your target devices. When one of your Provisioning Servers fails, the target devices connected to that server will fail over to another Provisioning Server in your farm.
vDisk Store
Having multiple Provisioning Servers in your farm means that all servers must have access to the vDisk you want to provide to your target devices. This can be achieved by pointing both Provisioning Servers to a file share containing the vDisk files. Be careful with this option in large environments, as it might not scale the way you expect it to. Another option is to use a shared storage solution combined with a cluster file system that supports active/active disks. This means that you’re able to attach the same LUN to both your Provisioning Servers and they can both read and write at the same time. While this is probably the best solution, it’s definitely not the most affordable one.
A very good alternative is to use the local storage of your Provisioning Servers. Just create the same folder structure on both servers and create a vDisk store that points to the local folder. Of course you have to make sure the vDisk in both locations is always up-to-date.
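Keeping the two local stores identical is the part that is easy to get wrong. The following is a minimal Python sketch of one way to check it, assuming you can reach both store folders from the machine running the script (for example through an administrative share). The function names and the idea of comparing SHA-256 hashes are my own illustration, not something Provisioning Services provides.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_snapshot(store_dir):
    """Map every file in a store folder (relative path) to its digest."""
    root = Path(store_dir)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def out_of_sync(store_a, store_b):
    """List files that are missing from one store or differ between them."""
    snap_a = store_snapshot(store_a)
    snap_b = store_snapshot(store_b)
    return sorted(
        name
        for name in snap_a.keys() | snap_b.keys()
        if snap_a.get(name) != snap_b.get(name)
    )
```

You would call it as `out_of_sync(r"\\pvs01\d$\vDisks", r"\\pvs02\d$\vDisks")`, where both paths are hypothetical. An empty list means the stores match. Hashing large vDisk files takes a while, so this is something to run after an update rather than continuously.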
Database
The Provisioning Server database can easily be made highly available using Microsoft clustering. And if that is not an option, you can make use of the Offline database feature. When using this feature the database is cached locally on the Provisioning Server itself. So if the database becomes unavailable you have enough time to troubleshoot the problem and, if needed, restore the database from backup.
The Boot process
The real challenge in making Provisioning Services highly available lies in the boot process. To fully understand this, let’s take a look at the Provisioning Services boot process in detail. In our scenario we use two Provisioning Servers with the TFTP service installed. DHCP options 66 and 67 are used to deliver the TFTP location and bootfile name.
- The target device is powered on.
- The device is configured to boot from the network, so it broadcasts a DHCP request.
- The DHCP server receives the request and provides the target device with its IP configuration.
- Together with this, the target device receives the IP address or DNS name of the TFTP server and the name of the TFTP bootstrap file.
- The target device contacts the TFTP server and downloads the TFTP bootstrap.
- The bootstrap loads on the target device.
- The target device contacts the first available Provisioning Server configured in the bootstrap.
- The Provisioning Server that is contacted looks up the MAC address of the target device in the Provisioning Services database.
- When a match is found, the Provisioning Server checks whether a vDisk is configured for the target device.
- If so, the target device is assigned to the least loaded Provisioning Server configured for the vDisk, using a load balancing algorithm.
- The target device connects to the chosen Provisioning Server and mounts the vDisk.
- The vDisk is booted and Windows starts.
Looking at the above boot process we can easily point out the parts where we need redundancy.
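The server-selection steps of the boot process above can be sketched in a few lines of Python. This is a simplified model for illustration only: the `Server` class, the dictionaries and the rule "least loaded means fewest connected targets" are my assumptions, not the actual Provisioning Services load balancing algorithm.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    online: bool
    connected_targets: int


def pick_login_server(bootstrap, servers):
    """Return the first reachable server listed in the bootstrap, or None."""
    for name in bootstrap:
        server = servers.get(name)
        if server is not None and server.online:
            return server
    return None


def assign_streaming_server(mac, bootstrap, servers, device_vdisk, vdisk_servers):
    """Model the login, the MAC lookup and the load-balanced assignment."""
    if pick_login_server(bootstrap, servers) is None:
        raise RuntimeError("no Provisioning Server from the bootstrap is reachable")
    vdisk = device_vdisk.get(mac)  # stands in for the database lookup
    if vdisk is None:
        raise LookupError(f"no vDisk configured for target device {mac}")
    candidates = [servers[n] for n in vdisk_servers[vdisk] if servers[n].online]
    if not candidates:
        raise RuntimeError(f"no online server can stream vDisk {vdisk}")
    # Assumption: the least loaded server is the one with the fewest targets.
    return min(candidates, key=lambda s: s.connected_targets).name
```

Playing with this model makes the weak spot visible: if every server listed in `bootstrap` is offline, the device never gets as far as the load balancing step, no matter how many healthy servers are in the farm.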
DHCP
To be fully highly available we first need a redundant DHCP solution. This is actually not a big challenge. Because we use DHCP reservations for every target device, we can easily install a second DHCP server with the same scope and create the same set of reservations on this server. Because the IP addresses are tied to the MAC addresses of the target devices, each device gets the same address no matter which DHCP server answers, so no address is handed out twice.
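This only works as long as both DHCP servers really hold the same reservations. Below is a small Python sketch for comparing two reservation lists, assuming you have exported them from each server into a MAC-to-IP dictionary. How you export them is up to you; the function names here are hypothetical.

```python
def normalize_mac(mac):
    """Reduce a MAC address to 12 lowercase hex digits without separators."""
    digits = "".join(ch for ch in mac if ch.isalnum()).lower()
    if len(digits) != 12 or any(ch not in "0123456789abcdef" for ch in digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return digits


def compare_reservations(primary, secondary):
    """Report reservations that are missing or differ between two servers."""
    a = {normalize_mac(mac): ip for mac, ip in primary.items()}
    b = {normalize_mac(mac): ip for mac, ip in secondary.items()}
    return {
        "missing_on_secondary": sorted(a.keys() - b.keys()),
        "missing_on_primary": sorted(b.keys() - a.keys()),
        "ip_mismatch": sorted(m for m in a.keys() & b.keys() if a[m] != b[m]),
    }
```

If all three lists in the result are empty, both servers will hand out the same address to every target device. An entry under `ip_mismatch` is the dangerous one, because that is where two devices could end up with the same IP address.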
TFTP
The tricky one is the TFTP process. Provisioning Services provides its own TFTP service, which can be installed as an option while installing the Provisioning Server. Using the Provisioning Services console we can configure the TFTP bootstrap to contain both Provisioning Servers. Now we can use DHCP option 66 to provide the TFTP server IP address or DNS name to target devices. The problem we have here is that it’s only possible to add one IP address or DNS name in option 66 in DHCP. This means only one of the TFTP servers can be provided to the target devices.
We have a few options here to provide redundancy in some form:
DNS round robin
We could create a DNS record which points to the IP addresses of both/all TFTP servers. TFTP requests will then be spread evenly across the TFTP servers. The problem here is that when one of the TFTP servers fails, DNS keeps handing out its address, so requests will still be sent to that server. In a two-server scenario, half of your target devices will fail to boot.
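To see why round robin alone doesn’t help, here is a tiny Python simulation. It assumes an idealised DNS server that alternates strictly between the records and target devices that do not retry against another address. Real resolvers are less tidy, and the addresses are made up.

```python
from itertools import cycle


def boot_results(targets, tftp_records, healthy):
    """Return {target: True/False} for whether the TFTP download succeeds."""
    rotation = cycle(tftp_records)  # idealised DNS round robin
    results = {}
    for target in targets:
        address = next(rotation)  # the address this target resolves to
        results[target] = address in healthy
    return results


targets = [f"xa{n:02d}" for n in range(1, 11)]
records = ["10.0.0.1", "10.0.0.2"]
results = boot_results(targets, records, healthy={"10.0.0.1"})
failed = [name for name, ok in results.items() if not ok]
```

With ten targets, two records and only `10.0.0.1` healthy, `failed` contains five of the ten devices: DNS has no idea the second TFTP server is down and keeps handing out its address.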
Co-host TFTP with DHCP
This is actually a pretty nice solution. It’s not perfect, but it gets close. In this setup DHCP is hosted on the same servers as TFTP, and in the DHCP configuration each server points to itself for TFTP. So when one of the servers fails, the other server will answer all DHCP requests and thus automatically provide TFTP for all target devices. But as I said, this configuration is not perfect. When only the TFTP service crashes on one of your servers, that server will still answer DHCP requests and hand out itself as the TFTP server. This means all target devices getting their DHCP information from this server will fail to boot.
Load balance TFTP using a hardware load balancer
This is the only option for providing 100% high availability for the Provisioning Services TFTP service. Using a hardware load balancer like Citrix NetScaler you can create a virtual TFTP server that’s intelligent enough to know when a node fails.
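What makes the load balancer "intelligent" is a health check against each TFTP node. The Python sketch below shows the idea behind such a check: send a TFTP read request (RRQ) for the bootstrap file and see whether the first data block comes back. It is not NetScaler configuration, just an illustration of the probe. The packet layout follows the TFTP protocol (RFC 1350); the function names are my own, and the file name you pass in should be whatever you hand out in DHCP option 67.

```python
import socket
import struct

OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\0"
        + mode.encode("ascii") + b"\0"
    )


def classify_reply(packet):
    """True if the reply is DATA block 1, i.e. the file is being served."""
    if len(packet) < 4:
        return False
    opcode, block = struct.unpack("!HH", packet[:4])
    return opcode == OP_DATA and block == 1


def tftp_is_healthy(host, filename, port=69, timeout=2.0):
    """Probe one TFTP server; any timeout or socket error counts as down."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_rrq(filename), (host, port))
            reply, _ = sock.recvfrom(1024)
        except OSError:
            return False
    return classify_reply(reply)
```

Note that the probe never acknowledges the data block, so the TFTP server will retransmit a few times and then give up on the transfer. That is fine for a quick check, but a real monitor would be more polite about it.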
Boot using ISO
The last option is using an ISO to boot the target devices instead of PXE. You can use the Provisioning Services Boot Device Manager to create a bootable ISO. In this ISO you configure the Provisioning Servers as you normally do in the Provisioning Services console at “Configure Bootstrap”. You then let your target devices boot from this ISO. In this case you can completely forget the DHCP / TFTP process.
The downside is that this option is only practical in environments where the XenApp servers are virtualized. You don’t want to put bootable CD-ROMs in all your physical XenApp servers ;)
A very important thing to know is that ONLY the servers configured in the bootstrap will be used for load balancing and failover (this doesn’t only apply to the ISO method but also to DHCP / TFTP). So let’s say only one Provisioning Server is configured in the bootstrap: target devices booted from that Provisioning Server will never be able to fail over to another Provisioning Server in case of an outage. Taking this into account, whenever you add another Provisioning Server to your farm, you’ll have to update your ISO and reattach this ISO to ALL your target devices. Believe me… you don’t want to do this in an environment with more than 100 target devices.
Conclusion
While using Provisioning Services in your XenApp environment has lots of advantages, the complexity of the product shouldn’t be underestimated. When not implemented correctly, Provisioning Server can cause you a big headache. Especially when implementing a highly available environment, you have to make sure all of the above components are well configured and fully redundant.






) and is still busy designing and implementing SBC and VDI environments at customers, based on Citrix products. Besides consultancy Eelco is frequently asked for troubleshooting jobs and infrastructural challenges.