Has anybody noticed that processor speed has stopped doubling every 18 months? This occurred to me the other day, so I took some time to figure out why and draw some conclusions about Moore's law and the impacts of continued advances in chip technology. Here's what I've come up with: 1) Moore's law is still valid, but the way processor power is measured has changed, 2) disk-based storage is going the way of the cassette tape, and 3) applications will move into the cloud.
We have pushed semiconductor technology up against its physical limits, including how much heat we can remove from a chip and the speed of light. As a result, chip manufacturers have turned to multicore processing technology rather than pure chip and bus speed. Now the power of a microprocessor is judged by the number of cores it contains, and the number of cores on a single chip will continue to increase for the near future.
So what? Extra cores per chip mean more parallel processing to speed through operations, so parallel is the future.
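To make "more parallel processing" concrete, here is a minimal sketch of splitting one CPU-bound job across cores using Python's standard `concurrent.futures` module. The function names (`split_range`, `sum_squares`, `parallel_sum_squares`) and the toy workload are my own illustration, not taken from any particular product.

```python
from concurrent.futures import ProcessPoolExecutor
import os


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_squares(chunk):
    """CPU-bound work for one core: sum of i*i over [start, stop)."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def parallel_sum_squares(n, workers=None):
    """Fan the chunks out to worker processes and combine the partial sums."""
    workers = workers or os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_squares, split_range(n, workers)))
```

Call `parallel_sum_squares(10_000_000)` from under an `if __name__ == "__main__":` guard so worker processes can start cleanly on every platform. How much faster it runs than a single loop depends on how evenly the work divides and on the overhead of starting the processes, so more cores do not automatically mean proportionally more speed.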
Two other trends are also important for understanding my conclusions:
- RAM keeps getting denser and cheaper.
- As the number of cores in a chip goes up, its ability to process data begins to exceed bus technology’s ability to deliver it. Bus speed is not governed by Moore’s law, so it has not kept pace.
Considering these trends, here’s what’s going to happen (and is really happening now, but the signals are still weak):
- Large companies will soon be able to store the entire contents of their corporate transactional data in memory. SAP’s High Performance Analytic Appliance (HANA) makes this claim today, and I think other vendors will follow up with their own versions. This offers so many advantages that it’s hard to imagine a future with disk-based storage in it. Bottom line: Our entire enterprise data warehouse (EDW) architecture is going to change when we no longer need to physically move data around to aggregate it for analytics.
- The key to handling more and more data is parallel processing, and cost-effective high-performance computing with near-limitless scalability is one thing the cloud can provide via shared-nothing architectures (SNAs). This means that technologies such as HANA and Hive are going to provide SQL-like interactions with structured data; transactions will either be adapted to eventually consistent models or will execute in small clusters where ACID guarantees are absolutely required. Behind the scenes, SQL/MDX query technology will leverage cloud-based massively parallel processing and SNA to solve bigger problems.
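As a rough illustration of the shared-nothing idea, the sketch below simulates what a `SELECT region, SUM(amount) ... GROUP BY region` query might do behind the scenes: rows are partitioned across nodes, each node aggregates only its own rows, and a coordinator merges the partial results. This is not how HANA or Hive is actually implemented. The "nodes" are just in-memory lists in one process, and the function names and sample rows are hypothetical.

```python
from collections import Counter
import zlib


def partition(rows, num_nodes):
    """Assign each (region, amount) row to exactly one node by hashing its key."""
    nodes = [[] for _ in range(num_nodes)]
    for region, amount in rows:
        # crc32 is stable across runs, unlike Python's built-in str hash.
        node_id = zlib.crc32(region.encode("utf-8")) % num_nodes
        nodes[node_id].append((region, amount))
    return nodes


def local_aggregate(node_rows):
    """Runs on one node against only its own rows: SUM(amount) per region."""
    totals = Counter()
    for region, amount in node_rows:
        totals[region] += amount
    return totals


def merge(partials):
    """Coordinator step: combine the per-node partial results."""
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds values key by key
    return dict(result)


# Made-up sample data, for illustration only.
rows = [("east", 100), ("west", 50), ("east", 25), ("north", 10)]
totals = merge(local_aggregate(n) for n in partition(rows, num_nodes=3))
```

No node ever reads another node's rows, which is why this pattern can scale by adding nodes: only the small partial results travel to the coordinator, not the raw data.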
Bottom line, it will eventually be more common for applications to live in a cloud environment than not, and in that cloud, in-memory databases will be the order of the day. How long will this take, and what should you do? I’m not sure yet, but I’m considering doing some more research into the area, depending on interest.
Please let me know what you think.