Testing of VxRail as a Scalable Starter platform for Splunk Enterprise – Part 2 of …..

By Keith Quebodeaux – Principal vArchitect for Dell EMC

It’s official – VxRail All-Flash meets or exceeds Splunk performance guidelines!  As previously discussed in Part 1 of this blog, @KylePrins and @Queboduck of the newly minted Dell|EMC Converged Platforms Solutions Division of Dell Technologies (love that new Org smell – like Daffodil Daydream) set out to deploy Splunk on VxRail.  Working with our awesome Splunk team, we had two goals: Splunk’s validation that the platform “meets or exceeds” its performance guidelines, and initial guidance for our customers.

While there is more to come, we can officially state that VxRail All-Flash meets or exceeds Splunk performance guidelines as set out and validated by Splunk.  Splunk validation of VxRail Hybrid is in progress, and while it is not yet officially validated by Splunk, we are highly confident in its performance based on our experience so far.

So that is awesome, but that was only half of our initial objectives.  What about guidance and lessons learned?

  1. Splunk is CPU intensive, and as of this writing has a minimum reference configuration of 12 (>2GHz) CPUs for a Single Instance or Indexer and 16 (>2GHz) CPUs for Search Heads in a distributed configuration. We saw this first hand, so when you look at VxRail, consider the configurations with greater than 16 CPUs, which today carry a model number of 160 or greater.
  2. In any Hyper Converged Environment, storage is tied to your compute resources and is relatively fixed compared with CPU and Memory. This is not unique to VxRail; in fact, Splunk is traditionally DAS based, which shares the same limitation without the features of software defined storage (SDS).  Just be aware.  If you need very large quantities of Hot/Warm data, disproportionate to your compute resources, you may want to leverage a highly scalable all flash array like the Dell|EMC XtremIO in a VxBlock 540.  We will talk briefly about scaling strategies below and in more detail in an upcoming blog here on BigDataBeard.
  3. vSAN in VxRail is a storage platform that scales by adding nodes to the cluster, but it is a single storage tier in a cluster. We would recommend using vSAN for your Hot/Warm Tier only and aging off old data to external Cold/Frozen storage or deletion.  For Cold Storage we would recommend Isilon scale-out NAS for its scale and data reduction features.  One of the advantages of Isilon is that its cost per GB can make a large and accessible Cold bucket highly compelling in lieu of a Frozen bucket, which must be thawed before you can consume and find value in aged data.  If Frozen is required, it is possible to write to Dell|EMC Data Domain as an external NFS share.
  4. Do not overcommit Splunk resources. As we have discussed, Splunk is CPU intensive; treat it accordingly.  While we frequently talk about virtualization ratios of 2:1, 4:1, 8:1 vCPUs per physical CPU and so on for many traditional workloads like web servers and infrastructure application servers, with Splunk – just don’t do it.  Capacity planning and resource balancing should be part of what we do with any application.  To be fair, Splunk is not unique here – SAP and other applications have similar recommendations.
  5. Examine your Distributed Resource Scheduler (DRS) strategy to avoid excessive or unnecessary movement of your VMs.
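
To make points 1 and 4 concrete, here is a minimal Python sketch of a pre-deployment sanity check. It uses the minimum CPU counts quoted above and flags any host where the Splunk VMs would ask for more vCPUs than the host has physical cores. The names `SplunkVM` and `check_host` are made up for illustration, and the check assumes you count physical cores only and ignore hypervisor overhead; treat it as a starting point rather than an official sizing tool.

```python
from __future__ import annotations

from dataclasses import dataclass

# Minimum vCPU counts quoted in the list above (Splunk reference configuration
# at the time of writing). Verify against current Splunk documentation.
MIN_VCPUS = {"single_instance": 12, "indexer": 12, "search_head": 16}


@dataclass
class SplunkVM:
    """Hypothetical description of one planned Splunk virtual machine."""
    name: str
    role: str   # must be a key of MIN_VCPUS
    vcpus: int


def check_host(physical_cores: int, vms: list[SplunkVM]) -> list[str]:
    """Return a list of problems; an empty list means the plan passes."""
    problems = []
    for vm in vms:
        minimum = MIN_VCPUS[vm.role]
        if vm.vcpus < minimum:
            problems.append(
                f"{vm.name}: {vm.vcpus} vCPUs is below the {minimum} minimum for {vm.role}"
            )
    # Assumption: a 1:1 vCPU-to-core ratio, i.e. no CPU overcommit at all.
    total = sum(vm.vcpus for vm in vms)
    if total > physical_cores:
        problems.append(
            f"host overcommitted: {total} vCPUs requested on {physical_cores} cores"
        )
    return problems


if __name__ == "__main__":
    plan = [SplunkVM("idx1", "indexer", 12), SplunkVM("sh1", "search_head", 16)]
    for line in check_host(physical_cores=20, vms=plan):
        print(line)
```

In this example the two VMs each meet their role minimum, but together they request 28 vCPUs on a 20-core host, so the plan is flagged as overcommitted.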

Again, we will talk more about scaling in a near future blog, but here are some quick things to consider:

  • Scaling additional Indexers and associated Hot/Warm data is very easy: just add appliances to the VxRail vSAN cluster. A single cluster can expand to 64 nodes. Remember, however, that Splunk is not bound to a single VMware cluster, and very few customers choose to approach that maximum in any single cluster.
  • At scale, Hot/Warm vSAN capacity will be the most important resource to manage. For long term retention in Cold storage tiers, you can use Dell|EMC Isilon or other appropriate external storage.
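
As a rough illustration of the scaling points above, the Python sketch below estimates how many nodes a given Hot/Warm capacity target implies and how many clusters that would span at the 64-node limit. The function names and the `usable_tb_per_node` input are hypothetical; that figure should be your own usable capacity per node after vSAN protection overhead, which this sketch does not try to calculate.

```python
import math

MAX_NODES_PER_CLUSTER = 64  # single-cluster limit quoted above


def nodes_for_hot_warm(required_tb: float, usable_tb_per_node: float) -> int:
    """Nodes needed to hold the Hot/Warm tier, rounded up to whole nodes."""
    if required_tb < 0 or usable_tb_per_node <= 0:
        raise ValueError("capacity inputs must be positive")
    return math.ceil(required_tb / usable_tb_per_node)


def clusters_needed(nodes: int) -> int:
    """Clusters required if each cluster is filled to the 64-node maximum."""
    return math.ceil(nodes / MAX_NODES_PER_CLUSTER)


if __name__ == "__main__":
    nodes = nodes_for_hot_warm(required_tb=100, usable_tb_per_node=8)
    print(nodes, clusters_needed(nodes))
```

Since few customers run a cluster at its maximum, you may prefer to plan against a lower per-cluster node count than the hard limit used here.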

Revisit BigDataBeard.com for more on the awesome work being done on Splunk and other big data platforms, including upcoming material on sizing and scaling VxRail for Splunk.

See you at Splunk .conf 2016 – Find us in the Dell|EMC booth or on the floor sporting blue Dell|EMC Splunk Ninja t-shirts!