PeerCache Whitepaper 2017-11-15T18:40:03+00:00

IC Manage PeerCache EDA Tool Accelerator — Whitepaper

By Shiv Sikand, EVP, Engineering, IC Manage


Overview: Speeding up All EDA Tools by Removing NFS Bottlenecks

IC Manage PeerCache™ EDA Tool Accelerator delivers a 10x tool I/O speedup and reduces your disk storage use by 90 percent. PeerCache does not require IC Manage design & IP management tools; it is a new product category for all files in your flows, both managed and unmanaged.

PeerCache speeds up all your tools, including RTL simulation, static timing analysis, power/IR/EM tools, physical synthesis, P&R, and layout. It does so by removing the NFS bottlenecks that cause slow read, write, and stat speeds, and that leave jobs held in wait states or stuck in LSF queues waiting for disk resources.

Because PeerCache is 100% plug-and-play software, design, verification, and CAD teams can implement it seamlessly, so that all their workflows can immediately benefit from the performance speedup. It eliminates the risk of having to redesign NFS during production flows, while leveraging existing filer investment for backup, high-availability and disaster recovery.

It achieves this by creating a P2P caching network with the flash in your compute nodes. PeerCache also utilizes virtual workspaces to save copy time and filer disk storage.

Problem Statement

NFS bottlenecks in data I/O access are dramatically reducing EDA tool speeds for non-CPU-bound jobs and interactive applications.

Companies also want to reduce their use of expensive filer disk storage. However, any speedup solution must fit within their existing Network File System, as redesigning their systems is too risky for production flows.

More companies are investing in flash-based devices for their compute farms to accelerate critical jobs; however, this storage is not networked, making it hard to deploy across the grid.

90 percent of the data generated by design work is unmanaged. Examples include regression data, place-and-route and layout data, and timing/power/EMI data. This unmanaged data causes additional data transfer bottlenecks, as well as issues with information accessibility and security.


10x EDA Tool I/O Speedup through P2P Caching

PeerCache speeds up EDA tool I/O by 10x, by harnessing the inexpensive flash in existing compute farms. It turns the local/scratch flash into peer-to-peer networked flash cache, so that all nodes in the grid can now share all changes dynamically.

When using a compute farm with NVMe devices, PeerCache currently delivers 10x speedup for sequential reads, 7x for sequential writes, and 80x for random I/O, as compared with NFS v3. We expect even stronger results over time with new technologies such as Intel 3D XPoint/Optane.

As shown in Figure 1 below, by creating a P2P network out of compute node flash devices, PeerCache gives teams close to the 1 GB+/sec speed of direct-attached storage, versus the sub-100 MB/sec speed of the NAS filer. PeerCache does not fully achieve local speeds, due to the overhead from the userspace filesystem and metadata translation.
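To put these rates in perspective, the short calculation below estimates how long a sequential read would take at each rate. It is only an illustration: the 50 GB file size is an assumed figure, and the two rates are the round numbers quoted above (1 GB/sec and 100 MB/sec), not measurements.

```python
def transfer_seconds(size_bytes: float, rate_bytes_per_sec: float) -> float:
    """Time to move size_bytes at a sustained sequential rate."""
    return size_bytes / rate_bytes_per_sec


GB = 10**9
MB = 10**6

file_size = 50 * GB   # assumed example size, not taken from the whitepaper
nas_rate = 100 * MB   # "sub-100 MB/sec" NAS figure, used here as an upper bound
local_rate = 1 * GB   # "1 GB+/sec" direct-attached figure, used as a lower bound

nas_time = transfer_seconds(file_size, nas_rate)
local_time = transfer_seconds(file_size, local_rate)

print(f"NAS filer:   {nas_time:.0f} s")
print(f"Local flash: {local_time:.0f} s")
print(f"Ratio:       {nas_time / local_time:.0f}x")
```

Under these assumptions the read takes 500 seconds from the filer and 50 seconds from direct-attached flash. As noted above, PeerCache approaches but does not fully reach the local figure.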

Figure 1. Sequential I/O Speeds for NFS v3 vs. PeerCache with NVMe

PeerCache also allows all file system ‘stats’ to run locally, removing the central filer CPU bottleneck. You can start seeing PeerCache’s speed up benefits with as few as four nodes; it works for both bare metal and virtual machines. Because PeerCache uses the flash as cache rather than direct storage, it only requires small flash volumes, typically less than 1 TB per node.

Virtual Workspaces Reduce Filer Disk Storage by 90 Percent

PeerCache uses virtual workspaces for both your managed and unmanaged data. Creating a virtual workspace separates the file data into two layers: the descriptive file metadata (file name, size, owner, group, mask, create/access/modify time, and so on), and the file content, the actual bytes your EDA tools need.
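The two-layer split can be pictured with a small data-structure sketch. This is not PeerCache's implementation: the class names, the `content_id` field, and the choice to key content by a SHA-256 hash are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FileMetadata:
    """Descriptive layer: roughly what a stat call reports, plus a content pointer."""
    name: str
    size: int
    owner: str
    group: str
    mode: int
    mtime: float
    content_id: str  # hypothetical key into the content layer


class ContentStore:
    """Content layer: the actual bytes, stored once per unique content."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Hash-based keys are an assumption of this sketch, not a PeerCache detail.
        content_id = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(content_id, data)
        return content_id

    def get(self, content_id: str) -> bytes:
        return self._blobs[content_id]


store = ContentStore()
data = b"module top; endmodule\n"
meta = FileMetadata("top.v", len(data), "alice", "design", 0o644, 0.0, store.put(data))

# A stat-like query touches only the metadata layer.
print(meta.name, meta.size, oct(meta.mode))

# Reading the file follows the pointer into the content layer.
print(store.get(meta.content_id) == data)
```

Keeping the layers separate means a stat-style query never has to touch the file content, and several metadata entries can point at the same stored bytes.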

These virtual workspaces require much less disk storage than multiple physical copies do.

First, virtual workspaces eliminate the need to make physical copies of your design’s source and derived data files, so creating multiple new workspaces does not duplicate data. Rather than copying the entire file tree, PeerCache copies only the metadata and points to the same set of common files, so that all the workspaces on a given host share those files.

  • The core design is an invariant copy that doesn’t change.
  • Only the metadata changes and per-workspace changes require additional storage. If you are using a DM system, once those changes are checked in, the space is automatically freed up.

With Virtual Workspaces, any duplication is in the caches, and not in the NFS filer’s authoritative data.   This combination reduces NFS filer storage by 90 percent.
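A back-of-the-envelope estimate shows where a reduction of this order can come from. The tree size, workspace count, and per-workspace overhead below are assumed numbers chosen for illustration, not PeerCache measurements; the actual saving depends on how many workspaces you keep and how much each one changes.

```python
def physical_copies_tb(tree_tb: float, workspaces: int) -> float:
    """Filer storage when every workspace is a full physical copy."""
    return tree_tb * workspaces


def virtual_workspaces_tb(tree_tb: float, workspaces: int, overhead_fraction: float) -> float:
    """One shared copy plus per-workspace metadata and local changes."""
    return tree_tb + workspaces * tree_tb * overhead_fraction


tree_tb = 2.0      # assumed size of the design tree
workspaces = 20    # assumed number of workspaces
overhead = 0.05    # assumed metadata + changes per workspace, as a fraction of the tree

physical = physical_copies_tb(tree_tb, workspaces)
virtual = virtual_workspaces_tb(tree_tb, workspaces, overhead)
reduction = 1 - virtual / physical

print(f"Physical copies:    {physical:.1f} TB")
print(f"Virtual workspaces: {virtual:.1f} TB")
print(f"Reduction:          {reduction:.0%}")
```

With these inputs, 40 TB of physical copies shrinks to 4 TB, a 90 percent reduction. Fewer workspaces or heavier per-workspace changes would lower the figure.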

In Figure 2 below, you can see a visual representation of the filer reduction from PeerCache virtual workspaces as compared with traditional workspaces that require physical copies.

Figure 2. PeerCache clones vs. Traditional physical copies

Additionally, PeerCache only transfers the files that the engineer directly accesses. The flash holding these files is managed as a least-recently-used (LRU) cache, so only small flash volumes are needed.
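The LRU policy mentioned above can be sketched in a few lines. This is a generic, byte-budgeted LRU cache written for illustration; the class name, the `fetch` callback, and the tiny capacity are assumptions, and PeerCache's actual cache management is not described in this paper.

```python
from collections import OrderedDict


class LRUFileCache:
    """Byte-budgeted cache that evicts the least recently used file first."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # path -> bytes, oldest first

    def get(self, path: str, fetch) -> bytes:
        if path in self._entries:
            # Cache hit: mark the file as most recently used.
            self._entries.move_to_end(path)
            return self._entries[path]
        # Cache miss: fetch on demand (for example from a peer or the filer).
        data = fetch(path)
        self._entries[path] = data
        self.used_bytes += len(data)
        # Evict oldest entries until the cache fits its budget again.
        while self.used_bytes > self.capacity_bytes and len(self._entries) > 1:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        return data

    def paths(self) -> list:
        return list(self._entries)


def fake_fetch(path: str) -> bytes:
    """Stand-in for a remote read; every file is 4 bytes in this demo."""
    return b"data"


cache = LRUFileCache(capacity_bytes=10)
cache.get("a", fake_fetch)
cache.get("b", fake_fetch)
cache.get("a", fake_fetch)  # touch "a" so that "b" becomes the oldest entry
cache.get("c", fake_fetch)  # 12 bytes exceeds the budget, so "b" is evicted
print(cache.paths())
```

Only files that are actually read ever enter the cache, which is why a modest flash volume per node can serve a much larger design tree.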

PeerCache not only does all this for your managed data, it also does it for the other 90% of data that is not managed.

Fast Copies with Cloning

Making physical copies the traditional way can easily take 2-3 hours for two terabytes. As a result, chip design teams create very few master design copies, resulting in a serial design methodology where each engineer’s progress stalls as they wait several hours for their big jobs to run. They must also wait for other engineers’ data in the shared master directories to finish and update.

PeerCache’s cloning offers on-demand access. Because making clones takes only a minute or two, all the work files are immediately accessible, with only a small amount of metadata manipulated “on demand” by the engineer. Later, an independent, highly parallel peer-to-peer transport layer delivers the full content as needed.

The result is massively parallel workflows for designers with parallel data transfers, the best of both worlds.

PeerCache: ALL Data gets Audit Logging, Search & Analytics

The bulk of project data today – such as physical design, regression and chip integration –  is unmanaged data which must still be broadly shared.  In addition to addressing the NFS I/O and storage bottlenecks associated with this project data, PeerCache also solves the problem of limited data/information accessibility.

PeerCache addresses this with three primary mechanisms: Logging, Search, and Analytics.

1. Audit Logs – with File Data and File Metadata

Design and verification teams today do not know what is happening with all their files, because NFS has no built-in capability for per-file audit logging.

PeerCache automatically logs detailed information as part of the metadata translation described in the virtual workspace section. Audit logs are generated for all operations on every file.

Built-in data anonymization separates the analytics data from the individual users, requiring that certain protocols occur before the data-user link can be accessed. This allows companies to protect employee privacy, and comply with various local laws requiring separation of the analytics data from the individual users.
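One common way to separate analytics data from individual users is to replace the user name with a keyed pseudonym before the record is written. The sketch below illustrates that general technique only; the record fields, function names, and the use of HMAC-SHA256 are assumptions, not a description of PeerCache's log format or anonymization scheme.

```python
import hashlib
import hmac
import json
import time


def pseudonymize(user: str, secret_key: bytes) -> str:
    """Keyed hash of the user name; only a holder of the key can re-link it to a user."""
    return hmac.new(secret_key, user.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def audit_record(op: str, path: str, user: str, secret_key: bytes, timestamp=None) -> str:
    """Build one JSON audit line with the user replaced by a pseudonym."""
    record = {
        "timestamp": time.time() if timestamp is None else timestamp,
        "op": op,          # e.g. read, write, stat, rename, delete
        "path": path,
        "user_token": pseudonymize(user, secret_key),
    }
    return json.dumps(record, sort_keys=True)


key = b"example-key-held-by-a-restricted-role"  # hypothetical key for this demo
line = audit_record("write", "/proj/chipA/blk1/top.v", "alice", key, timestamp=0)
print(line)
```

Analytics can still group activity by `user_token`, while mapping a token back to a person requires access to the key, which is where an access protocol of the kind described above would apply.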

2. Search queries File History

Because all log data is now automatically collected, even for the 90% of the data that is unmanaged, design and verification teams can finally access information on all of their files to do meaningful analytics.

PeerCache interfaces to the Elasticsearch open source engine, to allow for fast search queries on the large volume of log data.  You can see exactly what changes were made to every file – again, even for your derived/unmanaged data.

Once you discover an interesting detail, you can zoom in to explore it further, such as how many times DRC was run or how often a specified tool was run on a specific block. You can see how and when files were accessed, modified, stat’d, renamed or deleted.
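As an illustration of the kind of question described above, the sketch below builds an Elasticsearch query body that counts how often a given tool ran on a given block, bucketed by day. The query constructs (`bool`, `term`, `range`, `date_histogram`) are standard Elasticsearch query DSL, but the field names `block`, `tool`, and `@timestamp` are hypothetical; the actual PeerCache log schema is not documented here.

```python
import json


def tool_runs_query(block: str, tool: str, since: str) -> dict:
    """Query body: count log events for one tool on one block, per day."""
    return {
        "size": 0,  # only counts and aggregations are wanted, not the documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"block": block}},   # hypothetical field name
                    {"term": {"tool": tool}},     # hypothetical field name
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {
            "runs_per_day": {
                # calendar_interval is the Elasticsearch 7+ spelling;
                # older releases used "interval" instead.
                "date_histogram": {"field": "@timestamp", "calendar_interval": "day"}
            }
        },
    }


query = tool_runs_query(block="blk1", tool="drc", since="now-30d")
print(json.dumps(query, indent=2))
```

The resulting body can be sent to the search endpoint of whichever index holds the log data, using any Elasticsearch client or plain HTTP.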

3. Big Data Analytics and Machine Learning

Further, PeerCache has an analytics plugin that allows you to connect the log data to any backend analysis or big data analytics tool you may have.

You can link the data to IC Manage Envision Design Progress Analytics to track design progress based on the activity a project generates over time. Your search data is also available to any other big data analytics and visualization tools you may have.

You can use the information obtained from your Elasticsearch queries to create graphs that analyze items of interest. For example, folding activity data over a single day and clustering by location can show how work flows around the globe daily.
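Folding activity over a single day amounts to mapping every event timestamp onto its hour of day, regardless of date, and counting per location. The sketch below shows that step on made-up data; the site names and the `(epoch_seconds, site)` event shape are assumptions, and it groups by a known site label rather than running a clustering algorithm.

```python
from collections import Counter
from datetime import datetime, timezone


def fold_by_hour(events) -> Counter:
    """Count events per (site, hour of day in UTC), ignoring the calendar date."""
    folded = Counter()
    for epoch_seconds, site in events:
        hour = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).hour
        folded[(site, hour)] += 1
    return folded


HOUR = 3600
events = [
    (0 * HOUR, "bangalore"),    # day 1, 00:00 UTC
    (24 * HOUR, "bangalore"),   # day 2, 00:00 UTC, folds onto hour 0
    (25 * HOUR, "bangalore"),   # day 2, 01:00 UTC
    (17 * HOUR, "san_jose"),    # day 1, 17:00 UTC
    (41 * HOUR, "san_jose"),    # day 2, 17:00 UTC, folds onto hour 17
]

folded = fold_by_hour(events)
for (site, hour), count in sorted(folded.items()):
    print(f"{site:10s} {hour:02d}:00 UTC  {count}")
```

Plotting these counts per site against the hour of day gives the follow-the-sun picture described above.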

Data Loss Prevention. PeerCache provides machine learning analytics to prevent IP theft. You can filter activity by cluster, and then set rules for flags, real-time alerts and revoking of permission. If there is an attempted IP theft or cyber-attack, the file system access can be immediately revoked, meaning you can prevent the theft, rather than finding out after the fact.
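The rule-and-alert step can be illustrated with a very simple fixed-threshold rule: flag any user whose file reads within a sliding time window exceed a limit. This is a deliberately minimal stand-in, not machine learning and not PeerCache's detection logic; the class name, thresholds, and the idea of keying on a user token are assumptions for the sketch.

```python
from collections import defaultdict, deque


class ReadRateRule:
    """Flags a user when reads inside a sliding window exceed max_reads."""

    def __init__(self, max_reads: int, window_seconds: float) -> None:
        self.max_reads = max_reads
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # user token -> timestamps of recent reads

    def observe(self, user_token: str, timestamp: float) -> bool:
        """Record one read; return True if the user should be flagged."""
        window = self._recent[user_token]
        window.append(timestamp)
        # Drop reads that have aged out of the window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) > self.max_reads


rule = ReadRateRule(max_reads=3, window_seconds=60)
flags = [rule.observe("user-1", t) for t in (0, 1, 2, 3, 100)]
print(flags)
```

In this run the fourth read inside 60 seconds trips the rule, and the later read at t=100 does not, because the earlier reads have aged out. A flag like this is the point where a real-time alert or a permission revocation could be triggered.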

Intelligent Storage. The data analytics streams can be incorporated for continuous tuning of the peer-to-peer network. Machine learning techniques such as clustering and regression analysis will allow ongoing improvements in peering and PM cache management to deliver finely tuned application performance.

Conclusion: Scaling Out for Maximum Speedup

Design and verification engineers are always pushing for more speedup.  You can achieve this speedup — with security — by sharing the flash storage from existing compute farms using a P2P caching network.

At the same time, you reduce filer storage costs, with the authoritative data still saved to your filer to preserve your existing reliability, uptime, backup and disaster recovery.

The horizontal compute farm “scale out” that PeerCache offers is already in place in other industry sectors. This scale out is quickly replacing the traditional practice of scaling up with more filer storage. Further, because PeerCache is hardware agnostic, as your hardware technology improves (e.g. Optane) the software automatically takes advantage of it.

Beyond this, we are moving to an era of “intelligent storage”, where deep learning can be applied to evolve the security, reliability, and efficiency of the peer-to-peer caching network.

About the Author

Shiv Sikand,
Executive Vice President
IC Manage, Inc.

Shiv Sikand is founder and Executive Vice President of Engineering at IC Manage, and has been instrumental in achieving its technology leadership in design and IP management for the past 15 years. Shiv has collaborated with semiconductor leaders such as Cypress, Broadcom, Maxim, NVIDIA, AMD, Altera, and Xilinx in deploying IC Manage’s infrastructure to enable enterprise-wide design methodologies for current and next generation process nodes.

Shiv has deep expertise in design and IP management, with a long history of developing innovative solutions in this field. He started in the mid-1990s during the MIPS processor era at SGI, where he authored the MIPS Circuit Checker. He then specialized in design management tools at a number of advanced startups, including Velio Communications and Matrix Semiconductor, before founding IC Manage in 2003. Shiv received his BSc and MSc degrees in Physics and Electrical Engineering from the University of Manchester. He also did his postgraduate research in Computer Science at the University of Manchester.