This is not about the technical details of building a data center network, but about the non-technical parts. A very interesting read from Ethereal Mind. Some points worth taking away:
- SFPs and cables form a significant chunk of network infrastructure costs. And most network vendors seem to cash in on that.
- Most network vendors don’t read the requirements document carefully; they are mostly busy with their marketing gimmicks.
- Most vendors cannot quantify the power usage of their boxes over a period of time (in terms of dollars that will need to be spent)
- Cheap networking gear has two positive side-effects: it allows one to buy generously and keep spares as backup, just in case; and the investment to recover is smaller, a direct consequence of which is that networking gear can be upgraded frequently
- The switch you choose determines the oversubscription ratio that can be achieved. The number of 10G and 40G ports it has decides that – assuming one is going to use all the ports, with no spares left
- When the infrastructure cost is low, it leaves us room for some more experimentation
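The port-count arithmetic behind the oversubscription point above can be made concrete. A minimal sketch, with illustrative port counts that are not tied to any specific switch:

```python
# Oversubscription ratio of a top-of-rack switch: total server-facing
# (downlink) bandwidth divided by total uplink bandwidth toward the spine.
# Port counts below are made up for illustration only.

def oversubscription_ratio(downlink_ports, downlink_gbps, uplink_ports, uplink_gbps):
    """Ratio of server-facing bandwidth to uplink bandwidth."""
    return (downlink_ports * downlink_gbps) / (uplink_ports * uplink_gbps)

# 48 x 10G server ports and 4 x 40G uplinks -> 480G : 160G = 3:1
ratio = oversubscription_ratio(48, 10, 4, 40)
print(f"{ratio:.1f}:1")  # 3.0:1
```

With all ports in use and no spares, the ratio is fixed by the hardware; reserving some downlink ports as spares only lowers it further.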
In Google’s words:
Today at the 2015 Open Network Summit, we are revealing for the first time the details of five generations of our in-house network technology. From Firehose, our first in-house datacenter network, ten years ago to our latest-generation Jupiter network, we’ve increased the capacity of a single datacenter network more than 100x. Our current generation — Jupiter fabrics — can deliver more than 1 Petabit/sec of total bisection bandwidth. To put this in perspective, such capacity would be enough for 100,000 servers to exchange information at 10Gb/s each, enough to read the entire scanned contents of the Library of Congress in less than 1/10th of a second.
That is a lot of bandwidth. One can get more details in their Jupiter Rising paper, which is being presented at SIGCOMM 15.
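The quoted capacity claim is easy to sanity-check with back-of-the-envelope arithmetic:

```python
# Sanity check on the Jupiter numbers quoted above:
# 100,000 servers exchanging data at 10 Gb/s each.
servers = 100_000
per_server_gbps = 10

# 1 Pb/s = 1,000,000 Gb/s
total_pbps = servers * per_server_gbps / 1_000_000
print(total_pbps)  # 1.0 -> more than 1 Petabit/sec of bisection bandwidth
```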
This is going to be a summary of the posts I read on CIO, which, BTW, are very nice reads:
- Just because some technology is hot and cool ( 🙂 ), does not mean I am going to use it, says one CIO
- How to connect physical NICs to virtual switches? Any best practices/recommendations? One guy ponders
- The end game is management of virtual infrastructure, not hypervisors, says another.
VMware should continue to integrate its products deeply with its partners’ offerings and leverage their sales channels—a kind of “unite and overcome” strategy. “VMware’s best chance is not to go it alone,”
- How the I/O bottleneck is posing limits on what can be virtualized, and what people are doing to solve that problem
- One guy says “virtualization is netting him 50 to 60 percent utilization rates, compared to 10 to 15 percent utilization rates on standard servers”. He also says “the company’s IT provisioning has improved by a factor of 65 percent, changing from weeks to days”.
- A survey on virtualization adoption says:
- The reasons for adopting virtualization, in order of preference, are: cost cutting with server consolidation, improved disaster recovery, faster provisioning, and competitive advantage
- Political challenges are as important as technical challenges
- Balancing workloads without sacrificing performance is the biggest challenge
- Companies are focusing on delivery but are not paying enough attention to integrating teams across IT
- Transplace braved it and virtualized almost all of their servers when virtualization was relatively new. However, they still left the I/O-intensive SQL and Exchange servers on physical boxes. One advantage they had, though, is that the software is developed in-house, so they know how to tune it and what it needs
- A related note on software licensing says SAP’s policy is good – it licenses based on the number of users, not on the number of physical/virtual resources
- On Citrix buying Xen, one guy says, “a rivalry between VMware and Citrix can only be good news, since rivalry equals faster product innovation and more pressure on pricing and support quality”. On having multi-vendor solutions in the data center, he feels, “Managing less complexity costs less. And the more you buy from one vendor, the better your potential discount. Today, she says, many companies are buying ‘good enough’ functionality from a single virtualization vendor as opposed to best-of-breed functionality from many vendors to get cost savings”
- David Siles says one of the advantages of having thin clients (virtualized desktops) is security: “We had a couple laptops stolen out of police cars,” Siles says. “Now [with the virtualized thin clients], you essentially just lose a dumb terminal.” It is also a nice read on how people are benefiting from virtualizing their data centers – better server utilization, lower server licensing fees, lower server renewal costs, power savings, IT staffing reductions, etc. One of the quotes goes: “IDC says that there’s $140 billion in excess server capacity sitting around worldwide right now.” That is too much!
- There is also a nice article on why we need tools to manage the virtual environment and how different players are gearing up to it.
- Why storage virtualization matters, for the services and flexibility it offers.
- Some gotchas of virtualizing IT environments
One advantage of VMFS, as purported by this VMWare article on VMFS, can be stated as follows: give the management of storage to VMFS rather than provisioning storage ourselves for each VM that we create. It is also about ease of management. Given that we have already bought in (or sold out, depending on how you see it) to VMWare, we can take one step further and use VMFS.
Reading this article about VMWare’s virtualization architecture, it looks to me like no other vendor in the market provides such a comprehensive end-to-end virtualization solution today. Here is a summary of what they offer:
- VMWare ESX – the thin hypervisor which:
- has direct driver access and provides the base layer on which VMs run
- does better memory management with overcommit and memory ballooning
- Virtual Infrastructure – a suite of tools comprising:
- VMFS, to provide virtualized storage
- Virtual Center, which provides:
- centralized management
- operational automation
- rapid provisioning
It sure has an edge over other vendors/products/solutions!!
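The memory-ballooning technique mentioned under ESX can be sketched as a toy model. The class, names, and numbers below are mine, purely for illustration, not VMWare’s implementation:

```python
# Toy model of memory ballooning: the hypervisor reclaims memory from a guest
# by asking a balloon driver *inside* the guest to allocate (pin) pages. The
# guest OS itself decides which pages to evict or swap to make room, so the
# hypervisor never has to guess which guest pages are cold.

class Guest:
    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.balloon_mb = 0  # memory currently pinned by the balloon driver

    def free_mb(self):
        return self.total_mb - self.balloon_mb

    def inflate_balloon(self, mb):
        # Pinned pages are handed back to the hypervisor for other VMs.
        self.balloon_mb += mb
        return mb

guest = Guest(total_mb=4096)
reclaimed = guest.inflate_balloon(1024)
print(reclaimed, guest.free_mb())  # 1024 3072
```

Deflating the balloon reverses the process, returning memory to the guest when its workload peaks again.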
This article on philosophy of HA is a nice read, in that it gives some perspective about how to do HA.
Traditionally, HA has been done by deploying an application on a set of physical servers along with a layer of HA software like Microsoft Cluster Server or Veritas Cluster Server. This HA layer monitors the application and fails it over appropriately. With the recent trend of virtualization, the application itself is a VM deployed on a hypervisor. Since it is now loosely coupled with the physical server, the hypervisor layer itself can provide HA monitoring, thereby taking the HA part away from the application; it is transparently provided by the underlying infrastructure. That is what Virtual Infrastructure, from VMWare, is about.
If HA can be decoupled from the application, then why not? It will make life much simpler. I am for it. For the VMWare solution to work, we ought to have VMWare HA clusters with the VMs deployed on one of the nodes in the cluster. The heartbeat (HTBT) business will be taken care of by the virtual infrastructure.
As the author notes, having both VMs AND HA software (like MSCS/VCS) is definitely not worth the effort.
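The heartbeat-and-failover loop that such an HA layer runs can be sketched as follows. This is a minimal toy of the general idea, not MSCS/VCS/VMWare code; the node names and the 3-second timeout are assumptions:

```python
import time

# Toy HA monitor: every node is expected to send periodic heartbeats.
# A node whose last heartbeat is older than the timeout is declared failed,
# and its VMs would be restarted on a surviving node in the cluster.

HEARTBEAT_TIMEOUT = 3.0  # seconds (illustrative value)

class HAMonitor:
    def __init__(self):
        self.last_seen = {}  # node name -> timestamp of last heartbeat

    def heartbeat(self, node, now=None):
        self.last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now=None):
        now = time.monotonic() if now is None else now
        return [n for n, t in self.last_seen.items()
                if now - t > HEARTBEAT_TIMEOUT]

monitor = HAMonitor()
monitor.heartbeat("node-a", now=0.0)
monitor.heartbeat("node-b", now=0.0)
monitor.heartbeat("node-a", now=2.5)   # node-b has gone silent
print(monitor.failed_nodes(now=4.0))   # ['node-b'] -> restart its VMs elsewhere
```

The point of the article stands out here: nothing in this loop is application-specific, which is why the infrastructure layer can own it.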
Continuing from the last post and the last blog I visited, the possibilities with virtualization are many. This article lists some of the operational/business transformations that are possible with virtualization:
- Template based provisioning model (not just OS, but entire application stacks). An indirect benefit of this is standardization (of procedures, operations, benchmarks etc.). As the author quotes: “higher standardization is the number one driver of a high server-to-admin ratio”
- Before making configuration changes, one can take a snapshot of a VM. Later, if the configuration change causes trouble, we can quickly roll back to the snapshot state and continue working with minimal downtime
- Physical server maintenance: this is the most touted benefit of virtualization. One can even do it during business hours – so no late-night or overtime work for the admin folks!
- One can even patch a VM when it is powered off, which is great news!
The possibilities are many. And as the author notes, this is all realized in the later phases of a virtualization deployment – when it is not seen as experimental anymore, when executives see the value-add in it.
Apart from flexibility and other engineering/managerial advantages, does virtualization offer any cost advantages? If this question is at the back of your mind, then this article on memory oversubscription answers it.
The table showing how many VMs can be accommodated on a hypervisor (with and without overcommitment) shows the cost incurred per VM. Even if we do not consider the overcommitment argument, the cost per VM vs. the cost per hardware host says it all!!
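The arithmetic behind such a cost-per-VM table is simple. A sketch, with made-up prices and VM counts (the article’s actual numbers are not reproduced here):

```python
# Cost per VM on a physical host, with and without memory overcommitment.
# All figures below are invented for illustration only.

host_cost = 8000          # $ for one physical server
vms_no_overcommit = 10    # VMs that fit when memory is strictly partitioned
overcommit_factor = 1.5   # e.g. 50% more VMs via memory overcommitment

cost_per_vm_plain = host_cost / vms_no_overcommit
cost_per_vm_oc = host_cost / (vms_no_overcommit * overcommit_factor)
print(cost_per_vm_plain, round(cost_per_vm_oc, 2))  # 800.0 533.33
```

Even without overcommitment, dividing one host’s cost across ten VMs already beats buying ten physical hosts; overcommitment just widens the gap.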
And talking about overcommitment: memory overcommitment is achieved by finding duplicate read-only pages in memory. Storage overcommitment is achieved by using thin disks (an allocate-as-you-grow store), volume snapshots, deduplication, etc.
Oversubscribing different resources involves different mechanisms. CPU and network resources are time-shared, while memory and storage are space-shared (in a reasonable sense). How can CPU and network resources be overcommitted? What are the ideas for it?
Will there be, or can there be, any advantage in sharing memory across VMs? For memory to be shared, some conditions have to be met: the pages have to be read-only. The most obvious candidates for read-only pages are code pages. And code pages from different operating systems are unlikely to match. So, obviously, code pages from the same operating system can be shared, and that is where there is an advantage in consolidating VMs running the same operating system on a physical host server: there could be good memory savings.
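The page-sharing idea described above can be sketched by hashing page contents and keeping one copy per unique page. This is a toy model of content-based page sharing, not VMWare’s implementation; the page contents are invented:

```python
import hashlib

# Toy content-based page sharing: hash every page across all VMs; identical
# pages need to be stored only once. Read-only code pages from the same OS
# hash identically across VMs, which is where the savings come from.

def shared_memory_pages(vms):
    """vms: list of per-VM lists of page contents (bytes).
    Returns (total pages, unique pages actually stored)."""
    seen = set()
    total = 0
    for pages in vms:
        for page in pages:
            total += 1
            seen.add(hashlib.sha256(page).hexdigest())
    return total, len(seen)

# Two VMs running the same OS share its code pages; each has one private page.
os_pages = [b"kernel-code-page-1", b"kernel-code-page-2"]
vm1 = os_pages + [b"vm1-private-data"]
vm2 = os_pages + [b"vm2-private-data"]

total, unique = shared_memory_pages([vm1, vm2])
print(total, unique)  # 6 4 -> two pages' worth of memory saved
```

Run the same sketch with VMs built from different “OS” pages and the unique count equals the total: no sharing, no savings.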
Apparently, though VMWare supports it, Xen and Microsoft do not support memory sharing across VMs. This article talks about the same:
Ultimately, the memory savings can lead to higher consolidation ratios or the flexibility to over commit memory. Memory overcommitment is an essential piece of virtualization management, as it allows administrators to allocate enough memory to each VM in order for it to handle workload peaks. With staggered performance peaks amongst VMs on the same physical host, consolidation can be optimized without sacrificing performance.