
Monitoring the SAN shine with Virtual Instruments

VI's VirtualWisdom provides the insight needed to make major savings on SAN and virtual infrastructures

It was about three months ago that one of my friends informed me he was leaving HDS to join a company named Virtual Instruments. 'Virtual Instruments?' I asked myself, trying to recall whether I'd heard of them before, only to realize I had once seen a write-up on their SAN monitoring solution, then called NetWisdom. I was then casually asked to mention Virtual Instruments in one of my blogs - nice try, pal, but I had made it clear several times before to vendors requesting the same that I didn't want my blog to become an advertising platform. Despite this, I was still intrigued by what could persuade someone to leave a genuinely stable position at HDS for a company I hadn't had much exposure to myself. Fast forward a few months, several whitepapers and numerous discussions, and I find myself writing a blog about that very company.

The simple fact is that it's rare to find a solution or product in the storage and virtualization market that can truly be regarded as unique. More often than not, new developments fall victim to what I term 'six-month catch-up' syndrome, in which a vendor brings out a new feature only for its main competitor to initially bash it and then release a rebranded and supposedly better version six months later. The original proponents of thin provisioning, automated tiered storage, deduplication, SSD flash drives and the like can all bear testament to this. That is why I have taken great interest in a company that currently occupies a niche in the SAN monitoring market and as yet doesn't seem to have a worthy competitor, namely Virtual Instruments.

My own experience of storage monitoring has always been a pain, in the sense that nine times out of ten it was a defensive exercise in proving to the applications, database or server guys that the problem didn't lie with the storage. Storage, most of the time, is fairly straightforward: if there are performance problems with the storage system, they've usually stemmed from a recent change. For example, provision a write-intensive LUN to an already busy RAID group and you only have to count the seconds before your IT director rings your phone, on the verge of a heart attack at how significantly his reporting times have increased. But then there was always the other situation, when a problem would occur with no apparent changes having been made. Such situations required the old-hat method of troubleshooting supposed storage problems by pinpointing whether the issue lay between the storage and the SAN fabric or between the server and the SAN, but therein dwelled the Bermuda Triangle at the centre of it all, i.e. the SAN. Try to get a deeper look into the central meeting point of your storage infrastructure, to see what real-time changes have occurred on your SAN fabric, and you'd enter a labyrinth of guesses and predictions.

Such a situation happened to me when I was asked to analyze and fix an ever-slowing backup of an Oracle database. Having bought more LTO4 tapes, incorporated a destaging device, spent exorbitant amounts of money on man-days for the vendor's technical consultants, played around with the switches' buffer credits and even considered buying more FC disks, the client still hadn't resolved the situation. Enter yours truly into the labyrinth of guesses and predictions. Thankfully I was able to solve the issue by staying up all night running Solaris iostat while simultaneously having the storage system up on another screen. Eventually I was able to pinpoint (albeit with trial-and-error tactics) the problem to rather large block sizes and particular LUNs that were sharing the same BEDs and causing havoc on their respective RAID groups. Several more sleepless nights to verify the conclusion and the problem was finally resolved. Looking back, surely there was a better, more cost-effective and more productive way to have solved this issue. There was - I just wasn't aware of it.
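For the curious, my all-nighter essentially boiled down to something like the sketch below - a rough reconstruction assuming Solaris' 'iostat -xn' output format, with the thresholds and polling interval picked purely for illustration rather than being anything VI or the vendor's consultants would prescribe:

```python
#!/usr/bin/env python3
# Minimal sketch: poll 'iostat -xn' on Solaris and flag devices showing high
# service times or unusually large average transfer (block) sizes.
# Thresholds and interval are illustrative assumptions only.
import subprocess

ASVC_T_LIMIT_MS = 30.0     # flag devices whose active service time exceeds this
AVG_XFER_LIMIT_KB = 256.0  # flag unusually large average transfer sizes

def sample(interval=5):
    # 'iostat -xn interval 2': the second report reflects activity during the interval
    out = subprocess.check_output(["iostat", "-xn", str(interval), "2"], text=True)
    report = out.split("extended device statistics")[-1]
    for line in report.splitlines():
        fields = line.split()
        if len(fields) != 11 or fields[0] == "r/s":
            continue                         # skip headers and malformed lines
        r_s, w_s, kr_s, kw_s = map(float, fields[0:4])
        asvc_t = float(fields[7])            # active service time in ms
        device = fields[10]
        iops = r_s + w_s
        avg_kb = (kr_s + kw_s) / iops if iops else 0.0
        if asvc_t > ASVC_T_LIMIT_MS or avg_kb > AVG_XFER_LIMIT_KB:
            print(f"{device}: {iops:.0f} IOPS, avg xfer {avg_kb:.0f} KB, asvc_t {asvc_t:.1f} ms")

if __name__ == "__main__":
    sample()
```

Crude as it is, that is several sleepless nights compressed into a loop - and it still only sees the host side of the picture, which is precisely the gap described below.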

Furthermore, ask any storage guy who's familiar with SAN management/monitoring software such as HDS' Tuning Manager, EMC's ControlCenter, HP's Storage Essentials and their like, and they'll know full well that despite all the SNIA SMI-S compliance, these tools still fail to provide metrics beyond the customary RAID group utilization, historic IOPS, cache hit rate, disk response times and so on. In other words, from the perspective of the end user there really is little to monitor and hence troubleshoot. Frustratingly, such solutions still fail to provide performance metrics from an application-to-storage-system view and thus also fail to allow the end user to verify whether they are indeed meeting the SLAs for that application. Put this scenario in the ever-growing virtual server environment and you are further blinded by not knowing the relation between the I/Os and the virtual machines from which they originated.

Moreover, storage vendors don't seem to be in a rush to solve this problem either, and the pessimist in me says this is understandable when such a solution would inevitably lead to the non-procurement of unnecessary hardware. Precise analysis and pinpointing of performance problems and degradation would mean the annulment of the haphazard 'let's throw some more storage at it', 'let's buy SSDs' or 'let's upgrade our storage system' solutions that are currently music to the ears of storage vendor sales guys. So amidst these partial-view, vendor-provided monitoring tools, which lack that essential I/O transaction-level visibility, Virtual Instruments (VI) pushes forth its solution, which boldly claims to encompass the most comprehensive monitoring and management of end-to-end SAN traffic. From the intricacies of a virtual machine's application to the Fibre Channel cable that's plugged into your USP-V, VMAX etc., VI say they have an insight. So looking back, had I had VI's ability to instantly access trending data on metrics such as MB/sec, CRC errors, logins and logouts etc., I could have instantly pinpointed and resolved many of the labyrinth quests I had ventured through so many times in the past.

Looking even closer at VI, there are situations beyond the SAN troubleshooting syndrome in which it can benefit an organization. Like most datacenters, if you have one of those Empire State Building-esque monolithic storage systems, it is more than likely being under-utilized, with the majority of its resident applications not requiring the cost and performance of such a system. So while most organizations are aware of this and look to save costs by tiering their infrastructure onto cheaper storage, aligning the value of their data to the underlying storage platform, it's seldom seen in reality due to the headaches and lack of insight related to such operations. Tiering off an application onto a cheaper storage platform requires justification from the storage manager that there will be no performance impact to the end users, but due to the lack of precise monitoring information, many are not prepared to take that risk. In an indirect acknowledgement of this problem, several storage vendors have introduced automated tiering software for their arrays, which in essence merely looks at LUN utilization before migrating LUNs to either higher-performance drives or cheaper SATA drives. In reality this is still a rather crude way of tiering an infrastructure when you consider that it ignores SAN fabric congestion or improper HBA queue depths. In such a situation, a monitoring tool that tracks I/Os across the SAN infrastructure without being pigeonholed to a specific device is fundamental to enabling performance optimization and consequently delivering Tier 1 SLAs with cheaper storage – cue VI and their VirtualWisdom 2.0 solution.
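To make that point concrete, here is a deliberately simplified sketch - not VI's algorithm, and with invented metrics, thresholds and sample LUNs - contrasting an array-only, utilization-based tiering decision with one that also respects the end-to-end path:

```python
# Illustrative only: why LUN-utilization-based tiering alone can mislead.
# Metric names, thresholds and sample data are assumptions for the example.

NAIVE_UTIL_LIMIT = 0.20     # "cold" LUN if average utilization is below 20%
LATENCY_LIMIT_MS = 20.0     # don't demote if end-to-end I/O latency is already high
QUEUE_DEPTH_LIMIT = 0.80    # ...or if the host HBA queue is nearly saturated

luns = [
    # name, avg utilization, end-to-end latency (ms), HBA queue-depth usage (0-1)
    ("LUN_001", 0.12, 8.0, 0.30),   # genuinely cold: safe tiering candidate
    ("LUN_002", 0.15, 34.0, 0.85),  # looks cold, but its path is already struggling
]

for name, util, latency_ms, queue_usage in luns:
    array_view = util < NAIVE_UTIL_LIMIT
    end_to_end = array_view and latency_ms < LATENCY_LIMIT_MS and queue_usage < QUEUE_DEPTH_LIMIT
    print(f"{name}: array-only view says demote={array_view}, "
          f"end-to-end view says demote={end_to_end}")
```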

In the same way that server virtualisation exposed the under-utilization of physical server CPU and memory, the VirtualWisdom solution is doing the same for the SAN. While vendors are more than pleased to sell yet more upgraded modules packed with ports for their enterprise directors, it is becoming increasingly apparent that most SAN fabrics are significantly over-provisioned, with utilization rates often below 10%. While many SAN fabric architects seem to overlook fan-in ratios and oversubscription rates in a rush to finish deployments within specified project deadlines, under-utilized SAN ports are now an ever-increasing reality that in turn brings with it the additional costs of switch and storage ports, SFPs and cables.
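A quick back-of-the-envelope example shows why those overlooked ratios matter; the port counts and link speeds below are made up for illustration and not taken from any particular fabric or from VI:

```python
# Back-of-the-envelope fan-in and oversubscription calculation (illustrative figures).

host_ports = 48          # server-facing edge ports
host_port_gbps = 4       # 4 Gb/s HBAs
storage_ports = 8        # array-facing ports
storage_port_gbps = 8    # 8 Gb/s array ports

fan_in = host_ports / storage_ports
oversubscription = (host_ports * host_port_gbps) / (storage_ports * storage_port_gbps)

print(f"fan-in ratio: {fan_in:.0f}:1")
print(f"bandwidth oversubscription: {oversubscription:.1f}:1")

# At ~10% average port utilization the effective load sits far below even this
# oversubscribed capacity - which is exactly the over-provisioning argument above.
print(f"effective load at 10% utilization: {0.10 * host_ports * host_port_gbps:.0f} Gb/s "
      f"vs {storage_ports * storage_port_gbps} Gb/s of storage-port bandwidth")
```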

Within the context of server virtualisation itself, which has undoubtedly brought many advantages with it, one irritating side effect has been the rapid expansion of FC traffic, as an increased number of servers funnel through a single SAN switch port, and the complexity now required to monitor it. Then there's the virtual maze, which starts with applications within virtual machines that are in turn running on multi-socket, multi-core servers, which are then connected to a VSAN infrastructure, only to finally end up on storage systems that also incorporate virtualization layers, whether that be externally attached storage systems or thinly provisioned disks. Finding an end-to-end monitoring solution in such a cascade of complexities seems almost an impossibility. Not so, it seems, for the team at Virtual Instruments.

Advancing upon the original NetWisdom premise, VI's updated VirtualWisdom 2.0 has a virtual software probe named ProbeV. The ProbeV collects the necessary information from the SAN switches via SNMP: on a port-by-port basis, metrics such as the number of frames and bytes are collated alongside potential faults such as CRC errors, synchronization loss, packet discards and link resets or failures. Then, via the installation of splitters (which VI call TAPs - Traffic Access Points) between the storage array ports and the rest of the SAN, a percentage of the light from the fibre cable is copied to a data recorder for playback and analysis. VI's Fibre Channel probes (ProbeFCXs) then analyze every frame header, measuring every SCSI I/O transaction from beginning to end. This enables a view of traffic performance at the LUN, HBA, read/write or application level, allowing the user to instantly detect application performance slowdowns or transmission errors. The concept seems straightforward enough, but it's a concept no one else has yet been able to put into practice, despite growing competition from products such as Akorri's BalancePoint, Aptare's StorageConsole or Emulex's OneCommand Vision.
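To give a feel for the switch-side half of this (the ProbeV/SNMP part rather than the TAP-based frame capture), here is a minimal sketch of the delta-based counter maths involved in polling a port; the counter names and sample values are my own assumptions, and in a real deployment the two snapshots would come from the switch's SNMP agent rather than being hard-coded:

```python
# Illustrative sketch only - not VI's implementation. It shows the kind of
# per-port, delta-based calculation a switch-level probe performs on cumulative
# counters polled at a fixed interval.

INTERVAL_SECS = 60

# Two successive counter snapshots for one switch port (cumulative values).
snapshot_t0 = {"rx_bytes": 9_000_000_000, "rx_frames": 7_500_000, "crc_errors": 2, "link_resets": 0}
snapshot_t1 = {"rx_bytes": 9_480_000_000, "rx_frames": 7_860_000, "crc_errors": 5, "link_resets": 1}

delta = {key: snapshot_t1[key] - snapshot_t0[key] for key in snapshot_t0}

print(f"throughput : {delta['rx_bytes'] / INTERVAL_SECS / 1e6:.1f} MB/sec")
print(f"frame rate : {delta['rx_frames'] / INTERVAL_SECS:.0f} frames/sec")

# Error counters should normally stay flat; any movement between polls is an event.
for fault in ("crc_errors", "link_resets"):
    if delta[fault] > 0:
        print(f"ALERT: {fault} increased by {delta[fault]} in the last {INTERVAL_SECS}s")
```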

Added to this, VI's capabilities can also provide a clear advantage in preparing for a potential virtualization deployment or, dare I fall for the marketing terminology, a move to the private cloud. Lack of insight into performance metrics has evidently led most organizations to stall on virtualising their tier 1 applications. Server virtualization has reaped many benefits for many organizations, but ask those same organizations how many of them have migrated their I/O-intensive tier 1 applications from their SPARC-based physical platforms to an Intel-based virtual one and you're probably looking at a paltry figure. The simple reason is risk and fear of performance degradation, despite logic showing that a virtual platform with resources set up as a pool could potentially bring numerous advantages. Put this now in the context of a world where cloud computing is the new buzz, as more and more organizations look to outsource many of their services and applications, and you have even fewer willing to launch their mission-critical applications from the supposed safety and assured performance of the in-house datacenter into the unknown territory of the clouds. It is here that VirtualWisdom 2.0 has the potential to be absolutely huge in the market and at the forefront of the inevitable shift of tier 1 applications to the cloud. While I admittedly find it hard to currently envision a future where a bank launches its OLTP into the cloud, on security issues alone, I'd be blinkered not to realize that there is a future where some mission-critical applications will indeed take that route. With VirtualWisdom's ability to pinpoint virtualized application performance bottlenecks in the SAN, the consequence should be significantly higher virtual infrastructure utilization and a subsequent ROI.

The VI strategy is simple: by recognizing I/O as the largest cause of application latency, VirtualWisdom's baseline comparisons of I/O performance, bandwidth utilization and average I/O completions comfortably provide the insight fundamental to any major virtualization or cloud considerations an organization may be planning. With its ProbeVM, a virtual software probe that collects status from VMware servers via vCenter, the data flow from virtual machine through to the storage system can be comprehensively analyzed with historical and real-time performance dashboards, leading to an enhanced and accurate understanding of resource utilization and performance requirements. With a predictive analysis feature based on real production data, the tool also gives the user the ability to accurately understand the effects of any potential SAN configuration or deployment changes. With every transaction from virtual machine to LUN being monitored, latency sources can quickly be identified, whether in the SAN or the application itself, enabling a virtual environment to be easily diagnosed and remedied should any performance issues occur. With such metrics at their disposal and the resultant confidence given to the administrator, the worry of meeting SLAs could quickly become a thing of the past, while also rapidly hastening the shift towards tier 1 applications running on virtualized platforms. So despite the growing attention being given to other VM monitoring tools such as Xangati or Hyperic, their solutions still lack the comprehensive nature of VI's.
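Conceptually, the payoff of that VM-to-LUN view boils down to a comparison like the one sketched below - a toy illustration of the idea rather than VirtualWisdom's actual analytics, with invented figures and an invented threshold:

```python
# Toy sketch: compare what the guest/application observes against what the LUN
# actually delivers, to decide where the latency lives. Sample data is invented.

SAN_SHARE_THRESHOLD = 0.5   # if most of the latency accrues below the host, suspect the SAN path

transactions = [
    # (virtual machine, LUN, app-observed latency ms, LUN/array service time ms)
    ("vm-oltp-01", "LUN_010", 42.0, 35.0),   # latency largely accumulated below the host
    ("vm-web-03",  "LUN_021", 38.0, 4.0),    # LUN is fast; the wait is in the guest/app layer
]

for vm, lun, app_ms, lun_ms in transactions:
    san_share = lun_ms / app_ms if app_ms else 0.0
    suspect = "SAN/storage path" if san_share > SAN_SHARE_THRESHOLD else "application/host layer"
    print(f"{vm} -> {lun}: app sees {app_ms:.0f} ms, LUN delivers {lun_ms:.0f} ms "
          f"({san_share:.0%} below the host) -> investigate the {suspect}")
```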

The advantages to blue-chip corporate customers are obvious, and as their SAN and virtual environments continue to grow, an investment in a VirtualWisdom solution should soon become compulsory for any end-of-year budget approval. In saying that though, the future of VI also quite clearly lies beyond the big corporates, with benefits that include real-time proactive monitoring and alerting, consolidation, pre-emptive analysis of any changes within the SAN or virtual environment, and comprehensive trend analysis of application, host HBA, switch, virtualization appliance, storage port and LUN performance. Any company therefore looking to consolidate a costly over-provisioned SAN, accelerate troubleshooting, improve VMware server utilization and capacity planning, implement a tiered infrastructure or migrate to a cloud would find the CAPEX improvements that come with VirtualWisdom a figure too hard to ignore. So while storage vendors don't seem to be in any rush to fill this gap themselves, they too have an opportunity to undercut their competitors by working alongside VI and promoting its benefits as a complement to their latest hardware, something which EMC, HDS, IBM and most recently Dell have cottoned on to, having signed agreements to sell the VI range as part of their portfolios. Despite certain pretenders claiming to take its throne, FC is certainly here to stay for the foreseeable future. If the market and customer base are allowed to fully understand and recognize the need, then there's no preventing a future where just about every SAN fabric comes part and parcel with a VI solution ensuring its optimal use. Whether VI eventually gets bought out by one of the large whales or continues to swim the shores independently, there is no denying that companies will need to seriously consider the VI option if they're to avoid drowning in the apprehension of virtual infrastructure growth or the ever-increasing costs of under-utilized SAN fabrics.

More Stories By Archie Hendryx

SAN, NAS, Back Up / Recovery & Virtualisation Specialist.
