Xilinx: IP-based Design Techniques across 5 Sites with IC Manage GDP

Simon Burke, Xilinx (edited transcript)

I’m going to talk to you a little about our experience with IC Manage, and the way we use it, and some of the things that we see and observe from it.

In terms of trends that drive our data management, and the problems that we see in general, at least for our business, we’ve seen the number of projects grow each generation. The time to tapeout doesn’t seem to change, the time to market doesn’t seem to change. The products just get harder to do, and we do them more often. We do try to actively manage our silicon risks down. We try to reduce the number of tape outs that we do. We just don’t have the resources to excessively spin after tapeout. Both tracking and getting known good silicon the first time are very high priorities for us.

What we see as we go down the process nodes is that design complexity just keeps going up. We are seeing between 1.3x and 1.6x growth per process node, and yet we must get chips out in the same amount of time as we’ve done before. What we are doing is using lots of multi-geographies to address that problem. So we have sites around the planet and we are using them 24/7 to try to get our chips out the door in the same amount of time. Each stage has got to be available in India in pretty much the same amount of time as it’s checked in here. We can’t afford to be waiting a couple of days or hours even to get that over there.

The real challenge on the data base side, in my opinion, is just keeping the lid on all that stuff. And not having it explode. As Yaron said, you want data management to be boring. You just want it to work; you have other problems to deal with and that is where you want to spend your time.

So some trends that we see:

  • We actually did our transition to IC Manage just between the 40 and the 28 nanometer nodes. And we migrated our 40 nm database right into IC Manage, so we didn’t have to maintain two systems. We’ve seen our products grow about 25% per process node.
  • Our growth business is in end markets that we don’t address today. That needs new products to go after them; that’s what we do.
  • Our tape outs have actually been going down over each process node. We tape out fewer chips at 28 than we did at 40, and we hope to do the same thing at 20 when we eventually tape out 20. That is a testament to us trying to get the silicon risk down in each initial tapeout that we do.
  • The tapeout per product is also going down correspondingly. Even then we see a 1.4x growth in man effort just to see the chips going out the door. And that’s been happening over the last few process nodes. It just takes more effort to get the chip done.
  • Correspondingly, we are seeing between about a 1.3x and a 1.6x uptick in data management activity in general. People check in more stuff more often. That means we have to grow our data management system to cope with that, and have it be scalable. Something that is on the brink of what it can do today is just going to fail next time. So we need a system that has a lot of headroom beyond where we are today.
  • Man effort is really growing about 30% a node, as far as we can tell. IC Manage activity is going up about 60% per node. What that really means is that not only are we checking in more stuff, more often, we are also doing more iterations to fix it, to make it work. So you are seeing more work in between those check-ins, which goes to our increased resources.
  • Our depot size has been growing about 20% per node. So our 28nm database was about 20% bigger than our 40 nm database, and our 20 nm database is about 20% bigger than our 28. So the scalability on the data management side is also growing with that complexity growth. So don’t buy too small a disk, buy a big one – make sure you’ve got lots of room.
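
To put the growth figures above in perspective, here is a minimal Python sketch that compounds a per-node growth rate over several process nodes. The function name and the three-node horizon are illustrative assumptions; only the rates (about 20% for depot size, 30% for effort, 60% for check-in activity) come from the talk.

```python
def compound_growth(rate_per_node: float, nodes: int) -> float:
    """Return the cumulative growth factor after `nodes` process nodes."""
    return (1.0 + rate_per_node) ** nodes


# Per-node rates quoted in the talk; the three-node horizon is an assumption.
RATES = {"depot size": 0.20, "man effort": 0.30, "check-in activity": 0.60}

for name, rate in RATES.items():
    factor = compound_growth(rate, nodes=3)
    print(f"{name}: {factor:.2f}x after 3 nodes")
```

If those rates held for three nodes, check-in activity would roughly quadruple while the depot itself would grow by about 1.7x, which is one way to see why headroom matters more for transaction load than for raw storage.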

So in terms of the critical dependencies: What does that mean to us for a data management system?

  • What we really want is a central repository of data. Having data in different systems just doesn’t work. There is too much overhead and it takes too much time to track down. So knowing where your data is for each project and tapeout really helps you get the tape outs done.
  • What we want to do is to take out that unfortunate risk of taping out your chip with the wrong piece of IP version or a bug fix that got missed. That’s expensive; it takes a lot of dollars and a lot of time from our schedule. We just can’t afford to do it. So knowing what we are taping out and where it came from is very important.
  • We currently have about 1800 IC Manage libraries across our product portfolio. Those are individual functional blocks that we have to manage and track to get into our product for tapeout.
  • Branching, again, is fundamental to what we need to do to tape out chips. Each project we tape out, we essentially create a branch of it, so we have a known database that we can reference back to. And we have a rolling branch model for our product tape outs. We have a significant number of branches; within those branches, we can track back to where the IP came from, what version it came from, and “is this a bug fix?” We can update those branches for those bug fixes and move forward to a new tape out. It lets us do the tracking of what bug fixes we did, whilst keeping the database intact.
  • We have about 450 IC Manage libraries to track in a typical product. It sounds like a lot, but when you add up Cadence data, RTL data, IP, et cetera, it adds up very quickly. Having them in a spreadsheet is just too much work. So what we want is to leverage the power of IC Manage to track those 450 IC Manage IPs and libraries into our product so that we know this is the thing we are taping out and this is where it came from. Then if we add an object to one of those, we can go track those successfully.
  • Security…as we try to grow resources to get our chips out on time, we are starting to pick up resources from all over the globe. Some in our existing sites and some not. There is a security aspect to that, especially with important IP, or data from fabs, who are very strict about who we show their data to. So having good security is important to us, so we can restrict who is allowed to see which data. We can’t just make everything available to everybody. It’s not the world we live in.
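
The idea of knowing exactly which library versions went into each tapeout branch can be sketched as a comparison of two manifests. This is a minimal illustration in Python, not IC Manage's API: the manifest format (library name mapped to a version number), the function name, and the library names are all hypothetical.

```python
def diff_manifests(previous, current):
    """List (library, old_version, new_version) for every library that differs.

    A version of None means the library is absent from that manifest.
    """
    changes = []
    for lib in sorted(set(previous) | set(current)):
        old, new = previous.get(lib), current.get(lib)
        if old != new:
            changes.append((lib, old, new))
    return changes


# Hypothetical manifests for two successive tapeout branches.
tapeout_a = {"serdes_phy": 12, "ddr_ctrl": 7, "clk_gen": 3}
tapeout_b = {"serdes_phy": 13, "ddr_ctrl": 7, "clk_gen": 3, "pcie_core": 1}

for lib, old, new in diff_manifests(tapeout_a, tapeout_b):
    print(f"{lib}: {old} -> {new}")
```

With around 450 libraries per product, a report like this is the kind of bookkeeping that is impractical in a spreadsheet and that the speaker relies on the data management system to provide.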

So, a couple of data points for you, in terms of our experience across the 28 and 20 nm nodes. As I said, we did the transition between 40 and 28.

  • To date, we haven’t had any corruption of data or loss of data that’s actually been checked in – not one byte that we can tell.
  • We’ve had zero respins due to the wrong IP being taped out. That was not true before we switched to IC Manage. I’m not going to say IC Manage is the reason that happened, but it certainly helped us get to that goal. So to date we’ve had no re-tapeouts due to wrong IP being inadvertently taped out.
  • In terms of our proxy availability – and we have proxies across all our remote sites – we are at about three 9’s uptime, 99.9% uptime, as best as I can tell. That goes back to wanting this stuff to be boring. You don’t want it to break. Every time it breaks, people go home, and you don’t get work done.
  • On the server side, we are more like 99.5% uptime. Now that’s excluding scheduled downtimes where you have to patch the OS, update versions, etc. We can schedule those for when it doesn’t matter to our design team. What I’m talking about is where it matters to design teams – when we are trying to get chips out the door and people are sitting at their desks waiting for you to fix something.
  • We have about 5 global sites where we coordinate our design IPs, so when someone is designing an IP it has got to come from that global site to the site that is doing the tapeout. And we have two sites that do tape-outs across the organization. Typically we see data come from one remote site to one of the two tape-out sites. That has to happen quickly. You don’t want to have one of your integration teams sitting around and waiting for data to show up. So there is a data management challenge making sure everything syncs around.
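
The two availability figures quoted above can be converted into unscheduled downtime per year with simple arithmetic. The sketch below assumes a 365-day year of continuous operation; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into hours of downtime per year."""
    return HOURS_PER_YEAR * (1.0 - uptime_percent / 100.0)


# The two uptime figures quoted in the talk.
for uptime in (99.9, 99.5):
    hours = downtime_hours_per_year(uptime)
    print(f"{uptime}% uptime: {hours:.1f} hours of downtime per year")
```

On that assumption, 99.9% corresponds to just under 9 hours a year and 99.5% to just under 44 hours a year of unscheduled outage.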

So that’s our experience at 28nm and as we walk through 20nm, that we are seeing with our current data management system. And I will concur with what Yaron said. For the most part, you want this to be boring, you don’t want it to be exciting.

That’s all I had. Thank you.


Audience Question: How do you manage the CAD environment – scripting, tool versions, contours – around the versions of the design such that you don’t have to have a copy in each workspace, i.e. have a data explosion?

The way we manage that at Xilinx – we have exactly that problem of course – we tend to break up our CAD environment into project-based CAD, and then we use the IC Manage auto-sync to push out a version of the CAD environment that goes along with that project. It’s managed inside IC Manage, along with all the verification testbenches and the IP itself, so when we snapshot it we have a copy of it.

In terms of the usage model, the verification data and the RTL data are in the workspace, and you just point to an auto-sync mirror of the appropriate CAD versions that go along with that environment, and then track them on a project basis. It’s a compromise.

Audience Question: Simon, you said you had 450 libraries for your project. Do you have one workspace per project, or does each designer have one workspace?

That’s really across the whole dataset. That includes RTL testbenches, Cadence data, place and route data, CAD libraries. It’s the complete set of libraries that describe a project. Typically the CAD guys will check out the CAD libraries and the RTL guys will check out the RTL plus testbenches. So we very rarely see a single workspace that has all that stuff in it; that would be big, it would take a while to sync in, and nobody cares about 90% of it.

So we have multiple workspaces, multiple configurations set up for the individual design team to pull out the subset of the data that they care about.
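
As a rough illustration of per-team configurations, the Python sketch below selects the subset of a project's libraries that a given team would sync. The library names, categories, and team definitions are hypothetical, and this is not how IC Manage configurations are actually expressed.

```python
# Hypothetical library catalogue for one project: library name -> category.
LIBRARIES = {
    "core_rtl": "rtl",
    "core_tb": "testbench",
    "core_layout": "cadence",
    "core_pnr": "place_and_route",
    "flow_scripts": "cad",
}

# Hypothetical per-team configurations: the categories each team syncs.
TEAM_CONFIGS = {
    "rtl": {"rtl", "testbench"},
    "cad": {"cad"},
}


def workspace_libraries(team):
    """Return the sorted library names a team's workspace would contain."""
    wanted = TEAM_CONFIGS[team]
    return sorted(name for name, category in LIBRARIES.items() if category in wanted)


print(workspace_libraries("rtl"))
print(workspace_libraries("cad"))
```

Each team's workspace stays small because it only pulls the categories it works on, rather than the complete project dataset.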

Audience Question: What about metal fill libraries, that tend to take up a lot of space – do you have a workspace for that?

We do tend to treat those as outside the regular design database. We don’t put them in the main design library; we keep a separate library, because they are big and they change pretty much every time we tape out and every time we do an ECO. And there is no value in maintaining previous versions; you would just want the latest one. So we do manage those as a separate library attached to the main design IP that we can discard if we need to.

Audience Question: Are the metal fill (C-Fill) libraries stored in IC Manage?

It depends on the size of the IP; some are, some aren’t. The top-level fill, for instance, would not be stored in IC Manage because it’s too big and we just generate it once and merge it on the GDS side. For low-level IPs, as an example, we do.

The way we manage them is, when we tape out, we create an integration branch of what we are about to tape out and that’s where the C-Fills go, not in the main database.


Simon Burke is a distinguished engineer at Xilinx in San Jose, California.