Big data isn't enough: How decision making is the key to making big data matter
KPMG has built an approach to decision making that finds value in data that typically never emerges by other approaches.
Good morning, I apologize for my voice. I'm coming out of the end of a cold, but my real voice doesn't sound any better. Today, I'm going to talk about big data. That's my job. That's what I was trained to do. That's what all my projects are. But my wife thinks I have something called “opposition disorder,” so today I'm going to conflict with myself. I'm going to make big data feel a little bit small, I'm going to articulate why, and I hope that it will come across. What we are faced with is this: big data is big, but there is something about the decision space that we have to start looking at. Actually, the more data we have, the more choices we have, I must submit. And then the more decisions we need to make. And the difference between a good decision and a bad decision can be very large.
But before I even do that, I want to go back and respect the big data. I'm going to go back all the way to 1982, and I'm going to talk about the technology, what we had there. And I have a gadget today that I brought. So back in 1980, this is me actually and I look pretty good. Look at the hair. I don't have less hair…I have no hair now. You could argue this is linked to the decisions we all make and that's the actual heart of this talk. I want to start with this.
This computer is called the Sinclair ZX Spectrum. I see some members of the audience who might actually have had this one, and I'm not going to point them out, because that would reveal your age a little too much. But there's something unique about the machine I'm holding in my hand. This is my first computer. This is not a replica I bought from eBay. This is the one that I actually had when I was this guy back then. 1985, my first computer, the Sinclair ZX Spectrum. Tim, may I ask you to send it around? Maybe some of the audience will feel some nostalgia for it.
Sinclair ZX Spectrum, 1982, a beautiful machine. Forty rubber keys. Very, very touchy-feely rubber keys. It had 48 kilobytes of memory. Okay, let's think about that. 48 kilobytes of memory. For those of you who are digital photographers, if I took a picture of you, it would maybe store Tim's left eye. That's it, it's not going to get any of us. But I'm not belittling this. There is so much I did with this machine. I programmed in a language called BASIC. I played a ton of computer games. My children think that their games are a lot better. Well, actually, they're not that much better now.
There's more resolution to them, but they do less with imagination. I'll leave it at that, but think about 48 kilobytes. It doesn't have a hard drive. Okay, you guys are thinking, how many more…I have nothing else in my pockets. This is a cassette tape. You could only load your computer program from a cassette tape. And when you finished coding your BASIC program to do whatever you wanted, you couldn't just turn off your computer. You needed to save the program to the cassette tape and rewind it, and only then was it stored. But I'm not belittling this, actually. Now I'm going to come to today, this age. Okay. And this is my iPhone. Many of you have an iPhone with you. Just to compare, it's one of the newer ones, an iPhone 11, I don't know if you guys are up to that yet. It has 512 gigabytes of memory.
Let me actually make a comparison. 48 kilobytes. Kilo is a thousand. Mega is a million. Giga is a billion. So this machine, my phone, has 10 million times more memory than my beautiful Sinclair ZX Spectrum. 10 million is awesome. Think about it. Naturally, I did a little bit of Googling, and you guys can fact check me because I actually prepared for this one. This is approximate, but if I wanted the same memory using Sinclair ZX Spectrums, they would make up something like a medium-size cruise ship, if you stacked all of them on top of each other. So, 10 million times more memory. This is my appreciation of big data. I'm not actually making any fun of it. It's good. But here's where I start to pivot to something mysterious. What could be bigger?
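As a quick fact check of the "10 million times" figure, here is a minimal Python sketch. It assumes the decimal prefixes given in the talk (kilo is a thousand, giga is a billion); the variable names are ours, not the speaker's.

```python
# Decimal prefixes, as defined in the talk.
KILO = 10**3
GIGA = 10**9

spectrum_bytes = 48 * KILO   # Sinclair ZX Spectrum: 48 kilobytes
iphone_bytes = 512 * GIGA    # the phone in the talk: 512 gigabytes

ratio = iphone_bytes / spectrum_bytes
print(f"{ratio:,.0f}")  # 10,666,667
```

So "10 million times" is a fair rounding of roughly 10.7 million.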
Let's shift to the decision space. So, if the storage is 10 million times better, and the CPU power is maybe about 10,000 times better, and we're not going to get into why CPU didn't keep up with memory, that is a lot of improvement. Are our decisions 10 million times better today? Are we raising our children a hundred times better? Ten times? Worse? So, you could start to ask, “Why is it that the knowledge doesn't translate immediately into a decision?” I'll give you one example. It's kind of a good example. The global infant mortality rate dropped. Great, 60%. Not 10 million times. It didn't go to zero. So, what this says to me is that there are many problems. The Dow Jones went up 28 times since the picture that I showed. Great, but not 10 million times. So, it isn't just information and data, it's also the choice space and the decisions we make.
Melissa closed by talking about humans. Humans make very, very good choices in figuring out that decision space. But now I'm going to pivot to this. What is the other space, other than the data? What else is there? Let's start with this, a little kind of a game. Again, you can see that I'm a mathematician, so I'm fascinated with numbers, and this will go on and on about numbers for seven more minutes. These are billiard balls, 16 of them here. And every time you start playing pool, you put these balls in some order. Of course, you don't put stripes next to each other all the time. Everybody has their own method. Think about how many ways you could have actually sorted these balls. Sorted as in how many different ways you could arrange them. Pick a number, don't say it.
Pick a number in your mind. I hope that it is large, like large, I’m not giving it away. But I'm going to surprise you, actually. As I reveal these numbers to you, I want you to keep in mind that this is not just a game. This is a game, but ordering 16 things is a very real-life problem. If you're a package delivery person, you need to drop 16 packages. You don't know anything about the city, and the order in which you drop the packages matters. Maybe you want to save gas, you want to go home early. There are 16 factorial ways of ordering these balls. 16 factorial. Now, those of you who picked a number, I'm going to show you how big that number is. That number is larger than the number of trees on earth. NASA estimates 3 trillion trees on earth. Sixteen factorial, the number of ways I can sort those balls, is about 20 trillion.
So, the number of ways you could sort them is more than the number of trees on earth. Now, if you want another visual…the Milky Way. There are only 300 billion stars in the Milky Way. So again, the numbers keep on going, but I want to show that there are some problems that don't look like big data. Sixteen doesn't look big. I could have put all those 16 balls in my pockets. It's not big data. But the choice space that the problem creates is huge. If you look at the chart, the purple one on the right side shows us the growth of data over time. I would submit, and this is my conflict part, that something else is growing bigger than data. That is decisions, big decisions, the choice space. And actually, with more data we have more choices.
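For anyone who wants to check the billiard ball numbers, a short Python sketch follows. The tree and star counts are simply the estimates quoted in the talk, and the variable names are ours.

```python
import math

# Number of distinct orderings of 16 balls.
orderings = math.factorial(16)

# Estimates quoted in the talk.
trees_on_earth = 3 * 10**12        # about 3 trillion trees
stars_in_milky_way = 300 * 10**9   # about 300 billion stars

print(f"{orderings:,}")  # 20,922,789,888,000
print(f"{orderings / trees_on_earth:.1f} orderings per tree")
print(f"{orderings / stars_in_milky_way:.1f} orderings per star")
```

Sixteen factorial comes out at roughly 20.9 trillion, about seven orderings for every tree and about seventy for every star.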
So, what is the science of making decisions? And there is actually a science to making decisions. One part of it is that, before you make a decision, you want to understand the difference between the best and the worst possible decision. If you can quickly decide that the difference between best and worst doesn't matter, don't spend time. Pick one. You can recover from that. But if there's a real difference between the best and the worst, then you need to spend time on something I call algorithms. So, while we appreciate data, there is this powerful thing we have called algorithms. Can I represent that blue, growing, huge data space through algorithms? We did. I'll give you one or two examples. This is the work that we are honored to do with the NBA, the National Basketball Association. We wanted to solve their scheduling problem.
They always did the schedule by hand, and they wanted to know: is there a better way to do the schedule? Whenever they did it by hand, all the teams complained. They said, “Why am I not playing on the opening night? Why am I not on the particular TV channel I wanted to be on?” We took all of these constraints into a mathematical model, an algorithm. We took all the things that these 30 teams wanted to achieve. They have their desires, what I call objectives, and they have things that they want to avoid, what I call constraints. We put all of them into a mathematical model. This wasn't very easy. We needed the help of a cloud computing infrastructure. Think 30 machines working for a week. Now, 30 machines working for a week in the cloud is a lot, because these days there are a lot of things I can do on my laptop. But at the end, the algorithm was able to meet all the constraints, and it met all the desired things that the teams wanted.
The NBA is very, very happy with that. For the last three years, KPMG has released the NBA's game schedule. The reason I gave this as an example is that it is not big data. Thirty teams, tops, and 82 games they are going to play. The solution fits on a piece of paper that I can print and hand out. How is it possible that the choice space is huge? It was astronomical. Around the order of those trees that I showed. And some technology improvements made this possible. Cloud computing, parallel processing and, some of you might have heard of this, there's this new thing called quantum computing. Now, we're not there yet for the machines, but in this project we used something called quantum-inspired algorithms. These algorithms borrow the quantum computing mindset. Remember here that the point is that the data could be large, but the choice space can be traversed very efficiently through the use of algorithms.
I want to give one or two examples and close with this one. Another example is that we have a fantastic team here doing portfolio modeling. We have a securitization team. Think about this: you have 500 securities, or stocks, you want to pick from, and you want to pick 10% of them, which is 50. To pick 50 out of 500, there are 10 to the power of 69 ways, don't laugh. You guys are thinking I'm deluded with the numbers. Ten to the 69 ways of picking 50 out of 500. There are about 10 to the power of 50 atoms on earth. So, the team used the power of algorithms to find a way to traverse the space far faster. No machine can solve this problem by checking every choice, but algorithms can. So, I’d like to actually close by asking, what can we do with this?
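The "10 to the 69" figure can also be checked directly. This is a small Python sketch using the standard library's `math.comb` (Python 3.8 or later); the atom count is the round estimate quoted in the talk, and the variable names are ours.

```python
import math

# Ways to choose 50 securities out of 500, ignoring order.
ways = math.comb(500, 50)

# Round estimate quoted in the talk for the number of atoms on earth.
atoms_on_earth = 10**50

print(f"about 10^{math.log10(ways):.1f} ways")
print(f"digits in the exact count: {len(str(ways))}")
print(f"ways per atom: about 10^{math.log10(ways // atoms_on_earth):.1f}")
```

The exact count has 70 digits, so it sits between 10 to the 69 and 10 to the 70, which is on the order of 10 to the 19 choices for every atom on earth.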
Again, I articulated that the data space is large, and I love it. Big data is my job, but something bigger is that decision space, the choice space. There are a ton of hidden problems there. I mentioned portfolio optimization. I touched on the NBA. There's a lot of routing and scheduling. You have a budget to cut: which of the 1,000 programs do you need to cut? That's an optimization problem. It's not just a big data problem, it's an optimization problem.
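To make the budget example concrete, here is a toy Python sketch. Everything in it is hypothetical: the five programs, their costs and benefit scores, and the budget are invented for illustration. It tries every subset, which is only feasible for tiny inputs and is not the method described in the talk.

```python
from itertools import combinations

# Hypothetical programs: (name, cost, benefit score). Invented for illustration.
programs = [
    ("A", 40, 30),
    ("B", 30, 50),
    ("C", 50, 60),
    ("D", 20, 20),
    ("E", 10, 15),
]
budget = 100  # hypothetical budget; the five programs together cost 150

def best_portfolio(programs, budget):
    """Try every subset and keep the highest-benefit one that fits the budget."""
    best_value, best_set = 0, ()
    for size in range(len(programs) + 1):
        for subset in combinations(programs, size):
            cost = sum(p[1] for p in subset)
            value = sum(p[2] for p in subset)
            if cost <= budget and value > best_value:
                best_value, best_set = value, subset
    return best_value, [p[0] for p in best_set]

value, kept = best_portfolio(programs, budget)
print(value, kept)  # 130 ['B', 'C', 'D']

# With 1,000 programs, the number of subsets to try has 302 digits.
print(len(str(2**1000)))  # 302
```

Five programs give only 32 subsets to check. A thousand programs give 2 to the power of 1,000, which is why exhaustive search has to give way to optimization algorithms.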
So, in closing, I think that what you're seeing here is that the UN, a couple of years ago, came up with 17 sustainable development goals. These are awesome. We need to reduce hunger, we need gender equality. So, this is like that 16 billiard ball problem slide. I really submit that there are many problems sitting here that are very difficult. You need data, but not just data. We need to reduce inequalities. I had the pleasure of working with my team here, with a bank, and we wanted to find a credit-hidden population and figure out a way to give them credit. To me, this is an inch. I'm not reducing inequalities in the world as the UN wants to, but I feel I contributed to number 10, and I see my colleagues here who have contributed to other parts. So my ask, like Duncan's: let's be obsessed with one of these. I think that big data, big algorithms and the solution space will allow us to tackle these large problems.