
Hybrid Cloud Infrastructure Options — Video & Transcript

Shiv Sikand, Founder & EVP, IC Manage (edited transcript – DAC 6.25.18)

How Cloud and On-Premise Infrastructures Differ

So, once I’ve managed to do this magic of the Holodeck, and I’ve got you there with everything that you need in the cloud, we hit another snag. The snag is that our EDA workflows are typically filer-based. We have a shared-file workflow model, where multiple compute nodes can see each other’s results.

Engineers can run successive jobs and exploit a great deal of parallelism (corners, and multiple data-gathering approaches) so that we can build a top-level view. But in the cloud, we have a very, very high density of compute nodes, and we only have block stores. There isn’t really shared storage or a shared-file workflow available in the cloud, because that’s not how cloud platforms have evolved.
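To make the shared-file model concrete, here is a minimal Python sketch of a corner sweep. The mount path, corner names, and result values are all hypothetical, and a real flow would submit simulator jobs through a batch scheduler such as LSF or Slurm; but the shape is the same: every job writes into one shared namespace that any node can read back.

```python
# Minimal sketch of a filer-based corner sweep. The shared mount path,
# corner names, and result values are hypothetical stand-ins.
import json
import pathlib
from concurrent.futures import ProcessPoolExecutor

SHARED = pathlib.Path("/proj/chip/results")   # shared filer mount (assumed)
CORNERS = ["ss_0p72v_125c", "tt_0p80v_25c", "ff_0p88v_m40c"]

def run_corner(corner: str) -> str:
    """Stand-in for an EDA job; a real flow would invoke a simulator."""
    result = {"corner": corner, "slack_ps": 42.0}   # placeholder result
    out = SHARED / f"{corner}.json"
    out.write_text(json.dumps(result))              # every node sees this file
    return str(out)

if __name__ == "__main__":
    SHARED.mkdir(parents=True, exist_ok=True)
    # Fan out one job per corner; in production these would land on
    # separate compute nodes via the scheduler.
    with ProcessPoolExecutor() as pool:
        list(pool.map(run_corner, CORNERS))
    # Top-level view: any node can aggregate everyone's results because
    # they all share the same file namespace.
    runs = [json.loads(p.read_text()) for p in SHARED.glob("*.json")]
    print("worst slack:", min(r["slack_ps"] for r in runs), "ps")
```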

So, how is everybody else doing this?

Option 1 – Rewrite Tools and Flows for the Cloud

Well, let’s look at how software did it; that’s option one. They rewrote their tools and their flows, and they adopted what’s called a microservices architecture. This was originally pioneered by Netflix, whose chief cloud architect was Adrian Cockcroft. They came up with an architecture where you deploy software applications as modular services. You slice it all up, everything is modular, and you share data only at the very end. Very little sharing occurs between these services until then.
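As a toy illustration of that pattern (the service names and payloads here are invented, and bear no relation to Netflix’s actual services), here is a minimal Python sketch: each service is self-contained, and data is composed only at the very end.

```python
# Toy sketch of the microservices pattern: each service owns its own
# slice of the work, and results are shared only at the very end.

def recommendation_service(user_id: int) -> dict:
    return {"titles": ["A", "B"]}        # self-contained, no shared state

def billing_service(user_id: int) -> dict:
    return {"plan": "standard"}          # self-contained, no shared state

def profile_page(user_id: int) -> dict:
    # Data sharing happens only here, when the results are composed.
    return {"user": user_id,
            **recommendation_service(user_id),
            **billing_service(user_id)}

print(profile_page(7))
```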

That’s very, very impractical for EDA (electronic design automation) and HPC (high-performance computing) hardware design. We have way too many interdependencies to modularize. It is almost impossible to say “I’m going to slice this big chip into smaller pieces,” because we have to deal with physical issues: physical proximity and physical interaction at the device layers. We can’t slice it up.

On top of that, there are a whole bunch of legacy tools that we’ve been using for don’t ask me how long that are essentially impossible to convert. You’re not going to be able to rewrite those; or at least, if you do, the risk is too big, because unlike software, we can’t ship patches to our hardware.

Option 2 – Roll Your Own Filer

You could roll your own filer. Unfortunately, filers are scale-up, while the cloud offers all this high-density compute, which is scale-out. So even if you roll your own filer, you’ve got a mismatch: you’re trying to pair a scale-up storage or file solution with scale-out compute.

This mismatch means your I/O can’t keep up, and your applications become I/O bound even though you have all this compute. We already see this today on-prem: our filers are really struggling to keep up, and buying more filers isn’t helping, because they give us capacity but not the I/O that we need.
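A back-of-the-envelope calculation, with purely illustrative numbers, shows how a fixed scale-up ceiling turns scale-out compute into I/O-bound compute:

```python
# Back-of-the-envelope sketch of the scale-up vs. scale-out mismatch.
# Both numbers below are illustrative assumptions, not measurements.

filer_bw_gbps = 40.0         # fixed ceiling of one scale-up filer (assumed)
per_node_gbps = 0.5          # I/O each compute node wants (assumed)

for nodes in (16, 64, 256, 1024):
    demand = nodes * per_node_gbps
    # Past the ceiling, every extra node just dilutes the bandwidth
    # each job actually receives: compute scales out, I/O does not.
    per_job = min(per_node_gbps, filer_bw_gbps / nodes)
    print(f"{nodes:5d} nodes: demand {demand:7.1f} Gb/s, "
          f"each job gets {per_job:.3f} Gb/s")
```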

Option 3 – Use an External Filer

You could try an external filer, but that’s going to be even worse.

Option 4 – Test a New Filesystem

You could test a new file system; I believe there are somewhere between 20 and 200 storage startups all trying to solve this problem, because high-performance storage is a multibillion-dollar market and a lot of people are going after it.

However, if you want to build an HPC model in the cloud using all flash for semiconductor, you’re going to go broke really quickly.

You can certainly do it: you can write those big checks, get massive amounts of custom hardware and flash, and build it all. But flash memory, from a semiconductor perspective, is fundamentally not scaling the way magnetic media is scaling. So what you end up doing is burning a lot of cash.
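A rough cost sketch, with loudly hypothetical per-terabyte prices and dataset size, shows where those big checks go:

```python
# Rough cost sketch of an all-flash tier vs. magnetic media.
# All three numbers are hypothetical, chosen only for illustration.

dataset_tb = 500          # working set for a large SoC project (assumed)
flash_per_tb = 400.0      # $/TB for an all-flash tier (assumed)
disk_per_tb = 25.0        # $/TB for magnetic media (assumed)

print(f"all-flash: ${dataset_tb * flash_per_tb:,.0f}")
print(f"magnetic:  ${dataset_tb * disk_per_tb:,.0f}")
# If flash $/TB keeps scaling more slowly than magnetic media, this gap
# widens over time, which is the point about burning cash.
```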

Option 5 – IC Manage PeerCache HPC Scale-Out I/O

Alternatively, we introduce PeerCache, our scale-out I/O solution that delivers extreme file performance. We have about 70 person-years of development behind the tool.

What we do is take advantage of local NVMe, and we provide a shared storage model that caches peer-to-peer.
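Conceptually, the lookup order is: the local NVMe cache first, then a nearby peer’s cache, and only on a full miss the backing store. The sketch below is a generic illustration of that peer-to-peer caching idea, not IC Manage’s actual implementation; all names are invented.

```python
# Generic sketch of peer-to-peer caching over local NVMe (illustrative
# only; not PeerCache's real design). Reads are served from the local
# cache, then from a peer's cache, and only then from origin storage.
from typing import Optional

class CacheNode:
    def __init__(self, peers: list):
        self.local: dict[str, bytes] = {}   # stands in for files on local NVMe
        self.peers = peers                  # other nodes in the cluster

    def read(self, path: str) -> Optional[bytes]:
        if path in self.local:              # 1. local NVMe hit: fastest path
            return self.local[path]
        for peer in self.peers:             # 2. peer hit: fetch over the fabric
            if path in peer.local:
                data = peer.local[path]
                self.local[path] = data     #    populate the local cache
                return data
        data = self.read_origin(path)       # 3. full miss: slow backing store
        self.local[path] = data
        return data

    def read_origin(self, path: str) -> bytes:
        return b"<origin data>"             # placeholder for the backing store

# Two nodes sharing one namespace: node b's first read of the file is
# served peer-to-peer from node a's local flash, not from central storage.
a = CacheNode(peers=[])
b = CacheNode(peers=[a])
a.local["/proj/top.def"] = b"layout data"
print(b.read("/proj/top.def"))
```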