Modern Data Virtualization: More Data Capacity, Consolidated VMs

Database clusters can often grow very large – larger than was ever planned for or anticipated. Limitations in the ability to handle heavy volumes of READ requests lead to a need to spread data (and thus the READ requests that access that data) across many nodes. If the database nodes are implemented as VMs, this results in many VMs. To maximize performance per node, each VM is usually backed by a single, dedicated physical machine. Unfortunately, this effectively negates the cost, flexibility, and management advantages (the original promise) of virtual machines.

The need to distribute data stems from the performance limitations of standard CPU-based servers when executing the data-centric functions that databases perform more than any others (a minimal sketch of this request path follows the list):

  • terminate TCP/IP connections
  • parse READ requests
  • access stored data
  • deliver that data back through TCP/IP connections  
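To make those four steps concrete, here is a minimal sketch, in Python, of the READ path a CPU-based database node walks for every single request. The wire format ("READ <key>") and the port number are purely illustrative assumptions, not any real database protocol; the point is that every step below burns CPU cycles on the node.

```python
# Minimal sketch of the READ path on a CPU-based database node.
# The plain-text "READ <key>" protocol and port are hypothetical.
import socketserver

DATA_STORE = {"user:42": b'{"name": "Ada"}'}  # stand-in for stored data


class ReadHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # 1. The TCP/IP connection has been terminated and handed to this handler.
        for raw_line in self.rfile:
            # 2. Parse the READ request.
            parts = raw_line.decode().strip().split()
            if len(parts) != 2 or parts[0] != "READ":
                self.wfile.write(b"ERROR malformed request\n")
                continue
            key = parts[1]
            # 3. Access the stored data.
            value = DATA_STORE.get(key, b"NOT_FOUND")
            # 4. Deliver that data back through the TCP/IP connection.
            self.wfile.write(value + b"\n")


if __name__ == "__main__":
    # Every one of these steps consumes CPU cycles on the database node.
    with socketserver.ThreadingTCPServer(("0.0.0.0", 9000), ReadHandler) as srv:
        srv.serve_forever()
```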

These inherent limitations greatly restrict the number of READ requests a single CPU-based database node can handle. This generally means each node can efficiently store and serve less data, which can lead to:

  • sprawling and ever-growing data deployments
  • large and ever-growing clusters
  • continuous management headaches to make sure nodes are not overloaded
  • failed attempts to prevent poor performance and user dissatisfaction

But what if database nodes did not have to service any READ requests at all? This can be achieved by running FPGA-based Data Engines alongside the database nodes to cache all data and service all the READ requests. Being FPGA-based, such Data Engines can deliver data at much greater scale than CPU-based database servers.
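The sketch below illustrates that deployment pattern in software: a Data Engine sits in front of each database node, answers READs from its own cache, and passes WRITEs through so the node stays authoritative. The class and method names are illustrative assumptions, not rENIAC's actual API, and on the real product this logic runs in FPGA hardware rather than Python.

```python
# Functional sketch (software stand-in) of the Data Engine deployment pattern.
# Names are illustrative; the real Data Engine implements this in FPGA hardware.


class DatabaseNode:
    """Stand-in for a CPU-based database node."""

    def __init__(self):
        self._rows = {}

    def write(self, key, value):
        self._rows[key] = value

    def read(self, key):
        return self._rows.get(key)


class DataEngine:
    """Caches all data and answers READs so the node never sees them."""

    def __init__(self, node: DatabaseNode):
        self._node = node
        self._cache = {}

    def handle_write(self, key, value):
        # WRITEs still go to the database node; the cache is kept in sync.
        self._node.write(key, value)
        self._cache[key] = value

    def handle_read(self, key):
        # READs are served entirely from the Data Engine's cache, so the
        # database node's CPU spends no cycles on them.
        return self._cache.get(key)


if __name__ == "__main__":
    engine = DataEngine(DatabaseNode())
    engine.handle_write("user:42", {"name": "Ada"})
    print(engine.handle_read("user:42"))  # served without touching the DB node
```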

When you deploy Data Engines to accelerate existing database servers, two immediate benefits are realized. First, your users receive lower and more predictable latencies, even at times of peak usage. Second, your database servers go on vacation. They just kind of sit back, lightly loaded, drinking margaritas, saving that much-needed processing headroom for a big spike in traffic or allowing you to cut down on expansion plans.

This allows you to consolidate VMs onto a common hardware platform. Tests have shown that in a database environment with a mix of 80% READs and 20% WRITEs, deployments that previously required a physical server per VM can now run 4 VMs per physical server. In a 90:10 READ/WRITE environment, VM density can be increased further, to an 8:1 consolidation ratio.
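A back-of-envelope estimate, not the benchmark methodology, shows why the READ/WRITE mix drives the consolidation ratio: if the Data Engine absorbs all READs, each database VM is left handling only the WRITE fraction of its former load, so roughly 1/write_fraction VMs fit where one did before. The measured ratios above (4:1 and 8:1) come in below this idealized bound because WRITE handling, replication, and other per-VM overheads still cost CPU.

```python
# Idealized upper bound on consolidation once all READs are offloaded.
# This is an illustrative estimate, not the test methodology used above.

def ideal_consolidation_ratio(read_fraction: float) -> float:
    """VMs per server if the only remaining load is the WRITE fraction."""
    write_fraction = 1.0 - read_fraction
    return 1.0 / write_fraction


print(ideal_consolidation_ratio(0.80))  # 5.0  -> measured above: 4 VMs per server
print(ideal_consolidation_ratio(0.90))  # 10.0 -> measured above: 8:1 ratio
```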

An 80:20 READ/WRITE workload with rENIAC allows for 2.3x higher throughput with nearly 4x the data capacity and VM density on the same server:

[Figure: Benchmark 1 – throughput and VM density across multiple VMs]

An 80:20 READ/WRITE workload with rENIAC delivers the same throughput and performance with nearly 10x the data capacity, or data density, on the same server – from 1.2TB to 10.6TB:

[Figure: Benchmark 2 – data density per server]

Increasing VM density results in huge and immediate cost savings. There are also significant operational advantages. As your data grows, you don’t need to add more VMs. The data each existing VM handles can be increased with no impact on performance. And you no longer need to worry about hot spots and data replication across VMs.

It is time to bring sanity back to database clusters with purpose-built database optimization tools that leverage FPGA-based hardware. Let the FPGA execute the repetitive functions that bog down general-purpose, CPU-based database nodes. Such architectures deliver greater scale, better performance, and the flexibility to handle future growth and changing usage patterns.

Book a live demo