As I was putting this together, I was given a few questions as prompts for some of my thoughts. The first one was: what are the top 2 challenges with design management, how do we address them, and why?
NVIDIA’s process is basically one built on evolution rather than revolution. So we stand on the shoulders of giants as we do our designs. There is even code in our source code repository that goes all the way back to our first chip, which was called NV1. Some of it may not have changed since then, and some of it has changed over that time. So having a design management platform that allows us to do that, and to handle that complexity, is critical for us.
The second challenge is how we keep our global teams all executing together. We have a very large team, of course, at our corporate headquarters in Santa Clara. But we also have design centers around the United States, as well as several in China and India. So it’s critical that our design management platform allows us to keep the data sets required for a chip, which is being designed across the world, all in sync.
We also have to make sure that when we tape out, the correct bit of IP is in every chip, and we rely on our design data management platform to allow us to do that.
The next question was the best 3 practices for design and IP management that we recommend, and why. I am on the rotation schedule for giving the new employee introduction at NVIDIA. We talk about where they’re supposed to store their files, whether it’s group shares or whatever. Basically, I tell them, “if it isn’t checked in, as far as we’re concerned, it doesn’t exist.” So if you don’t check in critical bits of NVIDIA’s IP, it could go away. Now, we don’t have a history of disks failing, but that’s how much we believe in design management: it has to be checked in.
Because of the evolutionary nature of our designs – we do again build on our previous generations – we follow very strict branching methodologies. So new designs are branched off the old designs.
IP that’s in a library is branched out of the place where we design those IP elements and into chip-specific library areas. We have scripts and processes that monitor to make sure that the latest bits of IP get pushed into the chip so that, again, we can tape out with the correct bit of the IP.
And all IP components are checked into our data management solution: standard cells, analog blocks, RAMs, pads, everything gets checked in. Whether it’s designed by NVIDIA (which is now the trend: we design almost everything that we put in our chips, all the way down to even many of the standard cells) or whether it comes from some 3rd party, it goes into our data management platform, it stays there, and it’s then branched to any of the projects based on that technology.
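As a rough illustration of what such a monitoring script might check, here is a minimal Python sketch. It is not our actual tooling: the `IpBlock` record, the revision numbers, and the block names are all hypothetical, and it assumes that each IP block has a single revision number in the IP design area and another in the chip-specific library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IpBlock:
    """Hypothetical record for one piece of IP used by a chip."""
    name: str
    library_rev: int  # latest revision in the area where the IP is designed
    chip_rev: int     # revision currently branched into the chip library


def find_stale_ip(blocks):
    """Return the names of blocks whose chip copy lags the library, sorted."""
    return sorted(b.name for b in blocks if b.chip_rev < b.library_rev)


# Made-up example data: only the pad ring is behind its library revision.
blocks = [
    IpBlock("pll_analog", library_rev=42, chip_rev=42),
    IpBlock("io_pad_ring", library_rev=17, chip_rev=15),
    IpBlock("sram_512x32", library_rev=8, chip_rev=8),
]

stale = find_stale_ip(blocks)
print(stale)
```

A real check would read the revisions from the data management system rather than from a hard-coded list, and anything it reports would be a candidate for re-branching before tape-out.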
Top must-haves: flexibility and a comprehensive security model. There are lots of projects happening at NVIDIA, lots of chips being designed, and it’s not that we don’t trust our employees, but if you don’t need to know about it, we don’t let you know about it.
So the ability to control access to projects and data, and having a very, very comprehensive security model, is key. We allow read-only access, read and write access, or no access, depending on who that individual might be. Secondary issues, which are important to us but probably not as important as that security model, include atomic transactions, which we must have.
Check-in speed is critical, so we spend a lot of time on our networks, our storage, and our data management solution to make sure the engineers can check in their code as quickly as possible. We have to have efficient handling of large data sets. As you know, chip designs are very large beasts, and lots of that is binary data. We have to have a solution that handles those large data sets very well, especially when they are binary.
It needs to be an enterprise-class solution. The data management solution that we use has well over 5000 users from all over the world, and it needs to be able to scale to that level. And then it’s important that engineers know what their co-workers are working on. So the visibility into what people have checked out and are working on is pretty important to us as well.
Measurement and Concrete Results from Implementation
On our main hardware server, we have almost 400 million files and revisions of files. So that speaks to just how large that footprint is. The branching commands, the ones that relate to moving IP either from one project to the next or from libraries into projects, are some of the most popular commands that our engineers execute against our data management solution.
Library reuse is very high. One of our typical nodes, which is very active and which we’re doing designs in right now, has 3 different flavors. In one of those flavors alone we have 26 designs being executed. So the analog blocks, the pads, and the standard cells that are designed are getting used in multiple designs, all at the same time.
Addressing question number 3 about security, we have almost 11,000 protection lines in our source code management system. Again, it allows us the flexibility of granting either ‘protection’ or ‘unprotection’ at the user level, at the group level, and at the IP address level. There are certain technologies that, for whatever reason (many of them related to government regulations), we can’t allow in certain countries. So based on the IP address, we restrict access for those things.
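To make the idea of protection lines concrete, here is a simplified Python sketch of how a table like that could be evaluated. Everything in it is an assumption for illustration: the `ProtectionLine` fields, the rule that the last matching line wins, the depot paths, the group name, and the address ranges (which are reserved documentation ranges). It is not the actual format or behavior of our source code management system.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class ProtectionLine:
    """Hypothetical, simplified protection line."""
    access: str       # "read", "write", or "none"
    kind: str         # "user" or "group"
    name: str         # user or group name; "*" matches anyone
    network: str      # CIDR range the client address must fall in
    path_prefix: str  # repository path prefix this line covers


def effective_access(lines, user, groups, client_ip, path):
    """Return the access of the last matching line, or "none" if none match."""
    result = "none"
    addr = ip_address(client_ip)
    for line in lines:
        if line.kind == "user":
            who_matches = line.name in ("*", user)
        else:
            who_matches = line.name == "*" or line.name in groups
        if (who_matches
                and addr in ip_network(line.network)
                and path.startswith(line.path_prefix)):
            result = line.access  # later lines override earlier ones
    return result


lines = [
    # Everyone may read chip data from anywhere.
    ProtectionLine("read", "user", "*", "0.0.0.0/0", "//chips/"),
    # The chip team may also write to its own project.
    ProtectionLine("write", "group", "chip_a_team", "0.0.0.0/0", "//chips/chip_a/"),
    # Nobody connecting from this range may see the restricted IP.
    ProtectionLine("none", "user", "*", "203.0.113.0/24", "//chips/chip_a/restricted_ip/"),
]

print(effective_access(lines, "alice", {"chip_a_team"}, "198.51.100.7",
                       "//chips/chip_a/rtl/top.v"))
print(effective_access(lines, "alice", {"chip_a_team"}, "203.0.113.9",
                       "//chips/chip_a/restricted_ip/phy.v"))
```

In this sketch the same user gets write access to her project from one location and no access at all to the restricted technology from another, which is the kind of address-based restriction described above.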
The amount of data stored in our repository is growing at a phenomenal rate, about 50% per year. We have approximately 200 terabytes under management right now by our data management solution for the chips.
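To put those two numbers together, here is a small back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the roughly 50% annual growth holds steady and compounds from the roughly 200 terabytes under management today; it is a projection under that assumption, not a forecast.

```python
def project_storage(start_tb, annual_growth, years):
    """Compound growth: size after n years is start_tb * (1 + annual_growth) ** n."""
    return [start_tb * (1 + annual_growth) ** n for n in range(years + 1)]


# Year 0 is today; the growth rate is assumed constant for the sketch.
projection = project_storage(200, 0.50, 3)
for year, size_tb in enumerate(projection):
    print(f"year {year}: {size_tb:.0f} TB")
```

Under that assumption the repository passes 300 terabytes after one year and 675 terabytes after three, which is why scaling and efficient handling of large data sets matter so much.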