VMware Storage Part 4: VSAN

Going through the VMware storage options, I would be remiss if I did not talk about VSAN. VSAN is simply VMware's next step in building out the Software Defined Datacenter. To begin with, though, it is important to understand a little about what VMware is trying to accomplish. Two years ago an interesting article appeared on GigaOM about VMware's slow and steady attack on storage, following the release of their less than stellar Virtual Storage Appliance (VSA). At the time I took exception to this; surely VMware would never want to get into the storage business, since they are a software company. Then came VSAN.

I still do not believe VMware intends to completely own the storage market, but they are certainly changing the game. I work for HP, and I remember when server virtualization started to take off: we thought the server was going to become irrelevant, that we would just use some cheap whitebox server. Fortunately, we at HP realized we had to step up our game. As usual, the server engineering team worked with our alliance partners and built even better products designed around virtualization, delivering higher virtualization density and higher performance. I see this latest storage product the same way. It will certainly capture certain market segments, but it is not a threat to the core storage business of the larger storage vendors.

With that said, just what is VSAN? The concept behind VSAN is actually an old one come again; we have been doing scale-out object storage in this industry for some time. VSAN simply moves it into the hypervisor stack. The requirements are pretty simple: you need a minimum of three host servers running vSphere 5.5, each with an SSD and at least one SAS or SATA hard drive (HDD). The requirements are well documented, so I won't get into those details, but this is enough to get started.

Conceptually, the SSD becomes a cache to accelerate reads and writes to the spinning drives. The HDDs provide the capacity, and replicas are kept according to policy rules, generally at least two copies of the data on separate hosts.
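
As a rough illustration of that placement rule (this is only a toy model, not VSAN's actual algorithm, and the host names are invented), a failures-to-tolerate value of 1 translates into two full copies on two different hosts:

    import random

    # Hypothetical hosts in a three-node VSAN cluster (names are made up).
    HOSTS = ["esx01", "esx02", "esx03"]

    def place_replicas(object_name, failures_to_tolerate=1, hosts=HOSTS):
        """Pick hosts for an object's replicas.

        A failures-to-tolerate value of N implies N + 1 full copies of the data,
        each on a separate host, so the object survives N host failures.
        """
        copies = failures_to_tolerate + 1
        if copies > len(hosts):
            raise ValueError("not enough hosts to satisfy the policy")
        chosen = random.sample(hosts, copies)  # distinct hosts, no host holds two copies
        return {object_name: chosen}

    if __name__ == "__main__":
        # One VM disk object mirrored across two of the three hosts.
        print(place_replicas("vm1-disk1.vmdk", failures_to_tolerate=1))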

The setup is pretty simple, and there are hands-on labs available online from VMware. It is also quite simple to set up for labs by running vSphere 5.5 under VMware Workstation.

This currently scales, in Beta v1, to 8 hosts, so it is not going to be a massive system; it is more of an SMB environment or a lab system. It also introduces some interesting challenges on the server and network side. On the server, official support is pretty limited since the RAID controller has to support pass-through. There is no RAID; because this is an object store, data protection is accomplished through multiple copies of the data. On the network side, this is challenging because we are copying data between hosts to maintain consistency. To my mind, this means the era of running VMware on 1GbE networks is probably nearing an end.

At the end of the day, VMware lists a number of use cases. This is a Beta v1 product, so I am nervous about running it in production just yet. Many of the use cases are around high performance workloads, VDI for example, where user experience can make or break a project. I do think that this is an exceptional way of creating shared storage in a lab, and it gives us many new ways to work in a lab environment.

As to the future of VMware storage, traditional storage, and our industry, I think this is the beginning of the next big thing. I will be discussing HP's own software defined storage soon, as well as our traditional storage platforms in a VMware context. I don't see VSAN as a threat, but rather as a call to action on our part to make our products better and continue to innovate. I will personally use VSAN in my lab alongside HP StoreVirtual; they serve different use cases, and both are fun to test.

VMware Storage Part 3: NAS

NAS is an interesting topic when it comes to VMware. It is often a religious debate: users of products that are better at file storage than block love to talk about running VMware on NAS. There is really nothing wrong with that; NAS is a great medium for VMware storage.

To start with, NAS (Network Attached Storage) is nothing more than allowing multiple client machines to share storage. Your PC likely has a shared drive, probably several; your public or departmental share sits on a NAS using the Windows SMB (Server Message Block) protocol.

In a VMware environment we use NFS (Network File System), a protocol with Unix/Linux roots, to connect the servers to the storage. Remember that with VMware we want shared storage for High Availability and load distribution. The advantage of NFS really comes down to simplicity. When we use NFS storage in VMware, we are just writing to a file rather than writing blocks. This was an early attraction when block based storage was not able to keep up with the writes coming from VMware: since everything was written to an open file, there was no concept of writing and committing data, it was all just writing. This hasn't changed, but block storage has gotten significantly better; that is for another post.

The simplicity also comes from the ability to expand easily. If you have the space on the NFS server, you can just grow the NFS share; there are no extents (joining multiple logical volumes together), you just grow the file system. There is really not much to it, it is very simple to manage, and you don't worry about the file system. In a block based system you create a VMFS (VMware File System) file system on the device; with NFS, the share has its own file system and your virtual machines are just files living there.
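
If you prefer to script it, mounting an NFS export as a datastore is a single call against the host's datastore system. The sketch below uses the pyVmomi library; the vCenter address, credentials, NFS server, export path, and datastore name are all placeholders, and a real script would add error handling and certificate checks.

    # Minimal pyVmomi sketch of mounting an NFS export as a datastore.
    # All hostnames, credentials, and paths below are placeholders.
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    si = SmartConnect(host="vcenter.example.local", user="administrator", pwd="password")
    content = si.RetrieveContent()

    # Grab the first ESXi host in the inventory (a real script would pick deliberately).
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
    esx_host = view.view[0]

    spec = vim.host.NasVolume.Specification(
        remoteHost="nfs01.example.local",   # NFS server
        remotePath="/exports/vmware",       # exported path on the server
        localPath="nfs-datastore-01",       # datastore name as it appears in vSphere
        accessMode="readWrite",
    )
    esx_host.configManager.datastoreSystem.CreateNasDatastore(spec)

    Disconnect(si)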

So the plus side is that this is pretty simple, and performance is now about the same between block and file. What is the downside?

The biggest issue for me is multipathing. There are ways to get around this, mostly proprietary, and with proper networking you can use NIC bonding to give a sense of multiple paths, but this requires some planning outside of the VMware environment and can be challenging. On that point, you are bound by your IP network; remember, this doesn't run over Fibre Channel, so if you go this direction, you had better be on a rock solid network.

The other major downside, from my perspective, is that VMware typically releases features first for block systems and then for NAS. Again, this gap is closing, but it is there. If you are like me, you update your iPhone to the latest code the minute it is released and the servers stop crashing. In my lab I run bleeding edge code, and I am all about the coolest and flashiest features, so this is important to me.

At the end of the day, NAS is great in a VMware environment. I personally like to have both options available, and would never tell someone they are wrong for going either direction, just make sure you are able to justify your decision.

VMware Storage Part 2: SAN

Continuing on with the theme from last week. Again, this is just to get things started; I do intend to dig into some of these areas much deeper and to discuss specific products, but assuming that everyone knows the basics would run counter to my desire to make technology, specifically virtualization, something we can all understand and work with.

I was first introduced to the concept of a SAN in 2004, with the Apple Xserve RAID, by a software developer I worked with. We were at a small software startup in Sacramento, CA, and the concept seemed outrageous to me. Shortly after that I ended up moving to a casino where I was handed a SAN and the VMware ISO images. I quickly learned the value of shared storage.

The concept behind a SAN is that rather than the islands of storage we talked about in part 1, we can logically divide a large pool of storage between multiple servers. This is important in a VMware environment to enable things such as High Availability and Load Balancing between hosts. Since all hosts have access to all shared storage (SAN), a Virtual Machine may reside on any host.

A critical design point when using a SAN in any environment, but especially with VMware, is multipathing. This simply means having more than one connection from the host server to the shared storage. Remember we are dealing with consolidation, so I may be moving 5, 10, or more workloads to a single VMware host. Not only does this increase risk, it also increases the load carried by the storage connections. This is where your SAN vendor's controller design can help or hurt you, but that is a topic for another day.
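
As a rough mental model of what a round-robin path policy buys you (this is a toy, not any vendor's actual path selection plugin, and the path names are just examples), I/O is spread across every healthy path and simply carries on when one fails:

    import itertools

    class RoundRobinPaths:
        """Toy model of a round-robin multipathing policy.

        Real path selection plugins are far more involved; this only shows the
        idea that I/O rotates across every healthy path to the array.
        """

        def __init__(self, paths):
            self.paths = list(paths)          # e.g. ["vmhba1:C0:T0:L1", "vmhba2:C0:T0:L1"]
            self.failed = set()
            self._cycle = itertools.cycle(self.paths)

        def fail_path(self, path):
            self.failed.add(path)

        def next_path(self):
            # Walk the cycle until we find a path that has not failed.
            for _ in range(len(self.paths)):
                candidate = next(self._cycle)
                if candidate not in self.failed:
                    return candidate
            raise RuntimeError("all paths down")

    if __name__ == "__main__":
        mp = RoundRobinPaths(["vmhba1:C0:T0:L1", "vmhba2:C0:T0:L1"])
        print([mp.next_path() for _ in range(4)])  # alternates between the two paths
        mp.fail_path("vmhba1:C0:T0:L1")
        print(mp.next_path())                      # traffic continues on the surviving path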

SANs come in many flavors, but the connectivity methods are generally iSCSI, Fibre Channel, and Fibre Channel over Ethernet (FCoE). Each of these has its own advantages and disadvantages. What I have found is that the choice is largely dependent on the environment. For smaller customers, iSCSI is often perfectly acceptable and provides a familiar medium for the networking team. In larger environments, Fibre Channel is often preferable since it offers a simple, low latency network designed to do one thing and one thing only.

One closing thought on storage networking: it is important to consider line speed. With the release of 16Gb Fibre Channel, and with 10GbE becoming more affordable, it is often wise to step up and pay for the fastest storage network you can afford. As solid state drives continue to gain market share, we are seeing more and more storage networks become saturated. Many storage array vendors are dropping support for slower speeds, 1GbE iSCSI in particular. It is always wise to prevent bottlenecks wherever possible, even if it does cost a little more up front.
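
A quick back-of-the-envelope comparison shows why. The figures below are raw line rates converted to approximate megabytes per second; real throughput is lower once encoding and protocol overhead are counted, and a single modern SSD can already push several hundred MB/s, enough to swamp a 1GbE link on its own.

    # Rough throughput comparison of common storage links.
    # Line rates only; real-world numbers are lower once encoding and protocol
    # overhead are taken into account.

    links_gbps = {
        "1GbE iSCSI": 1,
        "8Gb Fibre Channel": 8,
        "10GbE iSCSI/FCoE": 10,
        "16Gb Fibre Channel": 16,
    }

    for name, gbps in links_gbps.items():
        mb_per_s = gbps * 1000 / 8  # gigabits per second to megabytes per second
        print(f"{name:>20}: ~{mb_per_s:,.0f} MB/s")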

VMware Storage Part 1: Introduction & Local Storage

One of my favorite radio talk show hosts talks about the importance of having the “heart of a teacher”. When I started out in IT, I thought I wanted to be in tech support, teaching others how to use their computers and how to make the technology work for them. I have come to realize that while I enjoy helping others, I prefer to talk about concepts and help people understand storage and virtualization. I am going to spend the next several posts going through some VMware storage concepts in what may seem to many like simple terms, but many of the people I talk to do not have a solid understanding, so I think it is always wise to level set, to start from a common point as it were. While there are many blogs out there with incredibly technical content on this, many well written and helpful, I thought I would give it my own slant in an attempt to help some of the people I interact with and to meet new ones. Feedback is appreciated, and I am always open to suggestions for new topics.

VMware in general is all about abstraction. With compute, we put a software layer between the physical hardware and the operating system. This enables us to have portable servers and to consolidate many workloads onto a smaller physical footprint. Storage is not so different. If we think of VMware as creating a container to hold many servers, then a datastore (storage presented to VMware to hold Virtual Machines) can be considered a container for the virtual hard drives and configuration files that make up a Virtual Machine. This storage is presented as one or more logical drives, datastores in VMware terms, up to 64TB in size. The reasoning behind datastore sizing will be covered later, and is certainly open for discussion, but for now it is enough to know that we create a datastore from logical disk space.
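
To give a feel for the kind of arithmetic that goes into datastore sizing (the VM profile below is entirely made up for illustration), here is a trivial calculation of how many average virtual machines fit in a modestly sized datastore:

    # Toy datastore-sizing arithmetic; the VM profile below is invented for illustration.

    DATASTORE_TB = 2            # a deliberately modest datastore, well under the 64TB max
    avg_vm_disk_gb = 60         # assumed average provisioned disk per VM
    overhead_gb_per_vm = 8      # swap, snapshots, logs, config files (rough allowance)
    headroom = 0.20             # keep 20% free space as a buffer

    usable_gb = DATASTORE_TB * 1024 * (1 - headroom)
    per_vm_gb = avg_vm_disk_gb + overhead_gb_per_vm
    print(f"Roughly {int(usable_gb // per_vm_gb)} VMs fit in a {DATASTORE_TB}TB datastore")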

When creating a Virtual Machine, VMware will ask how much space you want and which datastore you want to place it on. This will be covered again in a future post about design, but it is important to note that a datastore can contain multiple Virtual Machines, much like a VMware host (a physical machine running VMware) can run multiple Virtual Machines.

Each VMware host, provided it contains local hard drives, will have a datastore called “Local Datastore” or something similar. This is not a bad thing; it can be useful for Virtual Machines you do not want to be able to move, but it is limited, because shared storage is required for high availability and resource distribution. With the release of VSAN in vSphere 5.5, as well as the many Virtual Storage Appliance (VSA) products, local storage can now serve as shared storage too, but more on that later.

To wrap up, storage is one of the more critical aspects of virtualization success. There are many more features to cover; I will be explaining many of the VMware features, where each HP storage product may make sense, and some reasons why I personally would choose HP over competitors’ products. Stay tuned, and please let me know if there are other topics I should be covering or if more detail is needed.

Multi-Tenancy

Another question from someone I work with.  “Can you explain the meaning of ‘Multi-Tenant’?  Aren’t all virtualized servers and storage ‘Multi-Tenant’?”

The term Multi-Tenant is a little misleading.  Technically, yes, virtualized servers are multi-tenant in the sense that multiple virtual servers run on a single physical server.  In the technology world, however, the term Multi-Tenant typically refers to a system where more than one company is using the same infrastructure.

By way of example, we can look at a cloud hosting offering such as HP CloudSystem Matrix (CSM).  As a disclaimer, this is not an official HP blog, and I am not an expert on CSM, but I do have a solid understanding of it.  On CSM you have multiple options.  The most common and cost effective method is to simply purchase a virtual instance of a server: essentially you get a virtual machine.  Your virtual machine is on the same physical machine as several other people’s, and in fact it might get moved to other servers using vMotion without your knowledge.  The concept is that you are simply renting or leasing a virtual machine and you don’t care what the physical hardware is.

In the HP StoreServ (formerly 3PAR) storage arena, we have a similar concept of multi-tenancy.  Consider the design of the 3PAR array.  Everything within the system is built first for redundancy, then for performance.  It is the only truly tier 1 array that extends from the SMB through the global enterprise using the same architecture.  The system was originally designed for hosting providers, so everything had to remain functional no matter what.  With this in mind, the system was created to be multi-tenant.  It is possible to present the system as multiple virtual SANs without the users realizing it.  This is perfect if you want granular control of your SAN but want to rent or lease it.  A service provider might purchase a large system and partition it off to many different companies.  Since they all share the array, the system is multi-tenant, yet the HP StoreServ array is able to give a secure virtual SAN to each user.
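
As a toy illustration of the concept (this has nothing to do with how 3PAR actually implements its virtual domains; the tenant and volume names are invented), one shared array hands each tenant an isolated view of its own volumes:

    # Toy model of multi-tenant storage partitioning. Only meant to show the idea
    # of one physical array presenting isolated virtual SANs to different companies.

    class SharedArray:
        def __init__(self, total_tb):
            self.total_tb = total_tb
            self.domains = {}                  # tenant -> {volume_name: size_tb}

        def create_domain(self, tenant):
            self.domains.setdefault(tenant, {})

        def provision(self, tenant, volume, size_tb):
            allocated = sum(size for vols in self.domains.values() for size in vols.values())
            if allocated + size_tb > self.total_tb:
                raise RuntimeError("array is out of capacity")
            self.domains[tenant][volume] = size_tb

        def visible_volumes(self, tenant):
            # Each tenant sees only its own volumes, never its neighbours'.
            return self.domains.get(tenant, {})

    array = SharedArray(total_tb=100)
    for tenant in ("acme", "globex"):
        array.create_domain(tenant)
    array.provision("acme", "acme-vol01", 10)
    array.provision("globex", "gbx-vol01", 25)
    print(array.visible_volumes("acme"))   # {'acme-vol01': 10}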

So to wrap up, something which is virtualized but not multi-tenant would be most traditional virtualized systems.  A company virtualizes its infrastructure within its own datacenter, or even at a colo.  Since it is the only company on the system, that is a single-tenant system, not a multi-tenant one.

SAN v. NAS

In my job at HP I come into contact with many people who have many questions.  Some are quick answers, but many are an opportunity for me to spend a few minutes educating them on storage or virtualization.  Anyone who knows me knows I am passionate about storage and virtualization, and I love to be able to help explain what I have learned.  I have a similar passion for learning from others and asking questions, but that is a topic for another post.  Going forward, rather than responding to questions via e-mail, I am going to try to use this blog as a forum to reach a broader audience.

This question came to me from a friend, whom I work with from time to time, who is in a similar line of work, and loves to learn and ask questions much as I do.  He writes, “Why would anyone use a (Insert Unified Storage Here)  as a “NAS”?”.

Unified Storage essentially refers to any array that can present both file and block.  Unfortunately this is a bit misleading.  The reality is that file and block are two different things.  Block storage is the underlying logical storage, typically presented as a raw device to the operating system, whereas file storage requires a file system and is presented to clients as a path using SMB (CIFS) or NFS.  The misleading part is that you can’t have both in a single controller, or controller pair.

Most companies start with one, file for example, and then present block storage out through the file controllers.  This creates some significant performance issues, since it requires a multi-layer system which is inherently inefficient.  A company starting with a block system and adding file, on the other hand, ends up with a block system and a fibre-attached “gateway”, which is essentially a dedicated server.  As you can see, neither solution does both well, but it works well enough.  Some systems do a better job of masking this, but at the end of the day they only do one thing well.

So back to the question: why would anyone use a unified system as a NAS?  Honestly, it usually comes down to simplicity and licensing.  Coming from a mixed Windows and *nix background, I am torn here.  In a Windows environment, SMB 3 on Windows Server 2012 will provide a much higher level of performance, but there is also a cost for the Windows licensing.  For many smaller shops, either a Linux server running Samba or a small unified appliance will provide a simulated Windows file server.  While it might lack some of the advanced features of the newer versions of Windows, the costs are usually lower, and there are sometimes management or other features which are attractive.

The short answer is that many small shops like a hybrid appliance and find that the cost makes it very attractive.  For larger companies this approach does not make sense, but when you are running a one-man IT shop, it is a potentially compelling fit.

The future of storage

In a previous post, VMware Fundamentals Part 3, I talked about VMware and storage and made a case for a mixed protocol storage environment. While I stand behind what I said, I always think it is interesting to take a deeper look at the industry. Calvin Zito, the HP storage guy, made some very good points in his post, Block or File Based Storage for VMware, which got me thinking a bit more. That, coupled with the recent product releases from HP, inspired me to talk a little more about this topic.

Calvin makes some compelling points about how block gets the most attention in VMware’s development cycle, and at this point, with software initiators and their ease of use, it often makes more sense to simply use a block based protocol.

That being said, in many virtual environments, we often find that the traditional storage array doesn’t fit the bill. We are running a number of host servers, with internal storage that is going to waste. So how do we take advantage of this lost capacity, and how can we lower our costs while adding additional value?

A tough concept for many of us who came up through the ranks of storage administration to swallow is that storage is not king any more. It is certainly important, but gone are the days when I can walk into a company as a storage admin and name my price. Now certainly as a Storage Architect I can demand more, but even so, I am required to know more than just storage. The really tough part though is storage is no longer defined as a monolithic array. We have to start embracing the concept that storage, like everything else must be defined as software. This becomes more and more important when we look at the move toward ITaaS. Nicholas Carr drives this point home in his book, The Big Switch.

So, the short answer to the question of what the future of storage is: much like compute, networking, and the rest of what we do, the future of storage is software. Whether it is open source or supported by a company such as HP, this is where we are headed.

New Direction

So it has been quite some time since I last posted, far too long. Since my last post, a number of things have changed. The company I was working for decided to exit the datacenter business, causing our small team of four to seek employment elsewhere. One went back to EMC, one moved on to Cisco, one is currently contracting, and I have moved over to Hewlett-Packard as a Pre-Sales Storage Solutions Architect, a position more in line with my personal preferences and a direction I have been wanting to go for some time.

This position means the focus of my blog will change. It wouldn’t be in my best interests to write about EMC products since they are now a competitor, and I plan to focus more on storage, although I do want to continue to talk about VMware in the event any of my former customers stumble on this blog.

A bit about why I chose to join HP. Anyone who follows any tech news knows that our stock prices are lower than we would like, something I knew coming in to the position. HP has also not, in my opinion, been a strong player in the storage market historically.

The EVA was an interesting product. It was a very different way of thinking about storage, and its performance was not at the level of its competitors. While it did play well in some markets, it was not a market leader for the most part.

Recognizing this, HP purchased LeftHand Networks. This company was mostly a software play, and honestly a bit ahead of its time. The idea that storage software could be completely abstracted from the hardware seemed a little strange to most of us. HP saw an opportunity to broaden its portfolio and gain a play in the SMB market.

Realizing the EVA was not keeping up, HP acquired 3PAR following a bidding war with Dell. Again, this was not a traditional storage platform; 3PAR was built as a storage server, not another modular array. The truly unique value here is that they took the concept of virtualizing storage to the next level.

When a friend whom I had worked with before told me he was joining HP as a Sales Specialist and asked me to look into joining him in the Pre-Sales role, I was pretty surprised. I chose to join HP because after looking at the product portfolio, the leadership team, and the company history, I am convinced that while it may be a rough ride for a bit, HP is in a great position to deliver exceptional products which will return the company to its former glory. We have the people, the technology, and the leadership to execute; that is why I joined HP.

I will try to update more often on HP storage and virtualization topics; the future is bright, and only getting better.

VMware Fundamentals Part 3

As I am on my flight to EMC World in Las Vegas, I can think of no more appropriate time to write about storage in a virtual environment.

Starting with the basics: since we are dealing with VMware, we have several options available to us. To set the proper foundation, I would like to give a little thought to each one.

    Fibre Channel
    Fibre Channel over Ethernet (FCoE)
    iSCSI
    NFS

First of all, we need to group our protocols. The first three are block protocols. This means nothing more than that the storage is presented at a block level, which is to say it appears to be local to the host. Think of it very much like the hard drive in your laptop or desktop, only it lives somewhere else and is presented through a network protocol.

The NFS protocol is a *nix protocol used for file sharing. It is similar to a Windows file share; when you browse to a \\\ path, that is very much like NFS.

Each of the block protocols has its own advantages and disadvantages. First of all, Fibre Channel. This is an old protocol, used in many enterprise environments, and it is currently spec’d at 2, 4, and 8 Gbps. It is essentially sending SCSI commands over a fiber optic network. It is very fast, very efficient, and relatively expensive: it requires special adapters in the host server, dedicated Fibre Channel switches, and costly fiber optic cabling. The performance is good since it is a transport protocol dedicated to only one task.

FCoE is a newer protocol, designed to allow us to move to converged networks. It typically runs over one or more 10 Gbps connections. Modern implementations run over a converged network using QoS along with Storage and Network I/O Control to provide maximum performance. The concept is similar to Fibre Channel, but it does not require dedicated infrastructure, and it allows for backwards compatibility with legacy Fibre Channel networks.

iSCSI is not new, but it is very enticing as a block protocol in VMware environments. Operating at Layer 3 of the OSI model, as opposed to Layer 2 for FCoE, there is a bit more overhead associated with iSCSI, but not enough to rule it out in a majority of environments. With the advent of 10GbE networks, we are now able to use iSCSI on a converged network similar to FCoE.

NFS stands out as very different. With the previous protocols, we use a dedicated network, or a subset of a converged network, to present the storage at the block level and then let VMware take over. With NFS, we present a file share, which by definition already sits on a file system managed by the NFS server. VMware can take this file share, place the Virtual Machine files on it, and run as though it were a native, local file system.

Logically, a block protocol should be faster and more efficient, since the OS gets to manage the storage and handle everything internally. This is not always the case, though. FCoE is the most efficient block protocol, since we run it over a 10GbE network and the protocol gives us more freedom than a native Fibre Channel network. NFS, on the other hand, is simply a large open file: rather than performing block writes, the NFS datastore is written as though it were one big open file. There is no write-a-block, write-a-block, write-a-block pattern as we find in the block based protocols.

So which protocol is best? The answer, as always, is that it depends. For Microsoft Exchange and a few other applications, we have to use a block protocol for support reasons, though there is no performance impact and we expect this to change soon. For the majority of deployments, I recommend using NFS, with iSCSI for block requirements. If there is a specific need, we can fall back to Fibre Channel or FCoE, but those are very specific use cases in very large environments. The additional advantages of NFS are its simplicity of expansion and its ease of deployment.

At the end of the day there is no right or wrong answer, but generally speaking multiple protocols are always nice to have.

VMware Fundamentals Part 2

A little behind, but better late than never. In my last post I covered the basic layout of VMware in modern terms.

Clustering in a VMware environment provides us with a number of benefits. The first, of course, is high availability. High availability, simply put, means that if one host server dies, the virtual machines on that host are restarted on another host. This is excellent, as it allows us to minimize downtime. Of course, this feature, in addition to a majority of the other excellent features in VMware, requires shared storage, but that is a post for another day.

If we cannot tolerate any downtime, we can use an additional feature, Fault Tolerance. This lets us run two virtual machines in a master/slave configuration, so if one dies the other picks up where it left off with no downtime. This is nice, but it comes with a number of limitations and at a significant cost, since it doubles the number of virtual machines in use and thus the amount of resources required.

One of the coolest tricks in VMware, in my opinion, is vMotion. We can take a live virtual machine and move it between physical hosts without interruption, provided we have properly configured our network. This is excellent because the process can be automated to keep the physical hosts balanced, moving live virtual machines around to ensure no one host is overloaded.

Taking it yet a step further, we often have multiple datastores in VMware. This is done to reduce contention on the file systems, since we are seeing larger and larger deployments with more and more virtual machines. Storage vMotion allows us to move the disks of a virtual machine between datastores. With the release of vSphere 5 we can even automate this through datastore clusters, but that is again a topic for another day.

Migrating a virtual machine, whether the live state or the virtual disks, is as simple as either dragging the virtual machine to a new host or right-clicking the virtual machine and selecting Migrate.
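
For anyone who prefers scripting it, the same move can be driven through the API. The sketch below uses pyVmomi to perform a Storage vMotion with RelocateVM_Task; the vCenter address, credentials, VM name, and target datastore are placeholders, and error handling is left out.

    # Minimal pyVmomi sketch of a scripted Storage vMotion. All names below
    # (vCenter, VM, and target datastore) are placeholders for illustration.
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    si = SmartConnect(host="vcenter.example.local", user="administrator", pwd="password")
    content = si.RetrieveContent()

    def find_by_name(vim_type, name):
        """Walk the inventory and return the first managed object with a matching name."""
        view = content.viewManager.CreateContainerView(content.rootFolder, [vim_type], True)
        return next(obj for obj in view.view if obj.name == name)

    vm = find_by_name(vim.VirtualMachine, "test-vm-01")
    target_ds = find_by_name(vim.Datastore, "datastore-02")

    # Specifying only a datastore performs a Storage vMotion; adding a host to
    # the spec would relocate the running state between hosts as well.
    spec = vim.vm.RelocateSpec(datastore=target_ds)
    task = vm.RelocateVM_Task(spec=spec)
    print("Migration task started:", task.info.key)

    Disconnect(si)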

Next post I plan to cover a little on the storage side, about how we configure datastores, and hopefully demystify why we choose different protocols in a VMware environment.
