Oracle’s latest iteration of its SPARC processor technology, the new M7 CPU, is at first blush an excellent implementation of SPARC: 32 cores with 8 threads each, built in an aggressive 20 nm process, and promising a well-deserved performance bump for legacy SPARC/Solaris users. But the impact of the M7 goes beyond simple comparisons to previous generations of SPARC and to competing products such as Intel’s Xeon E7 and IBM’s POWER8. The M7 is Oracle’s first tangible delivery of its “Software on Silicon” promise, with significant acceleration of key software operations enabled in the M7 hardware.[i]
Oracle took aim at selected performance bottlenecks and security exposures, some specific to Oracle software, and some generic in nature but of great importance. Among the major enhancements in the M7 are:[ii]
- Cryptography – While many CPUs now include some form of acceleration for cryptography, Oracle claims the M7 includes a wider variety of algorithms and deeper support, resulting in nearly indistinguishable performance with and without SSL and other cryptographic protocols enabled across a range of benchmarks. Oracle claims that the M7 is the first CPU architecture that does not force users to choose between secure and fast, but allows both simultaneously.
- Oracle database performance acceleration – The M7 contains hardware for acceleration of columnar decompression and table scan, filter and join operations in Oracle’s eponymous database product, yielding benchmark results that are in some cases difficult to believe, with some showing in excess of an order of magnitude improvement versus unaccelerated previous SPARC processors and contemporary competitive systems.[iii]
- Memory reference protection – Oracle has built in “Silicon Secured Memory” (SSM), a powerful hardware extension that protects against incorrect or malicious memory references in real time. Conceptually, every time memory is allocated it gets a “color”, and any time a pointer references that memory, the “color” (actually implemented with extra hidden bits of memory) of the pointer and the destination must match. This zero-overhead check, enforced for every memory reference, can stop a number of malicious exploits that depend on buffer overruns and pointer manipulation. Oracle claims that well-known exploits like Heartbleed and Venom would have been stopped with SSM. SSM can also guard against data corruption caused by memory reference errors in legitimate code.
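The “color” check described in the last bullet can be illustrated with a toy software model. This is only a conceptual sketch and not Oracle’s implementation: the class and method names (`TaggedMemory`, `allocate`, `load`, `store`, `free`), the per-cell tagging, and the default of 16 colors are all assumptions made for illustration, and real hardware performs the comparison on every memory reference rather than in a software loop.

```python
class ColorMismatch(Exception):
    """Raised when a pointer's color does not match the memory it touches."""


class TaggedMemory:
    """Toy model: every cell holds a value plus a hidden color tag."""

    def __init__(self, size, num_colors=16):  # num_colors is an assumption
        self.data = [0] * size
        self.colors = [0] * size  # color 0 means "not allocated"
        self.num_colors = num_colors
        self.next_free = 0
        self.next_color = 1

    def allocate(self, length):
        """Hand out a (base, color) pointer and tag the cells with that color."""
        base = self.next_free
        if base + length > len(self.data):
            raise MemoryError("toy memory exhausted")
        color = self.next_color
        # Cycle through colors 1..num_colors-1 so neighbours get different tags.
        self.next_color = self.next_color % (self.num_colors - 1) + 1
        for addr in range(base, base + length):
            self.colors[addr] = color
        self.next_free = base + length
        return (base, color)

    def free(self, pointer, length):
        """Retag the cells as unallocated so stale pointers no longer match.

        The toy bump allocator does not reuse the freed space.
        """
        base, _ = pointer
        self._check(pointer, 0)
        for addr in range(base, base + length):
            self.colors[addr] = 0

    def _check(self, pointer, offset):
        """Compare the pointer's color with the destination cell's color."""
        base, color = pointer
        addr = base + offset
        if not 0 <= addr < len(self.data) or self.colors[addr] != color:
            raise ColorMismatch(f"color check failed at address {addr}")
        return addr

    def load(self, pointer, offset):
        return self.data[self._check(pointer, offset)]

    def store(self, pointer, offset, value):
        self.data[self._check(pointer, offset)] = value
```

In this model, a read that runs past the end of one buffer into a neighbouring allocation fails the color comparison, as does any access through a pointer whose memory has been freed. Those are the two classes of error the bullet describes: overruns and stale or manipulated pointers.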
As a concept, hardware acceleration of software functions is not in any sense new, and the application of hardware to ameliorate successive layers of software bottlenecks dates to at least the 1970s to my knowledge. In 1977, in the days of 25 MHz clock speeds, I worked at Prime Computer, where we were justifiably proud of the fact that we were the first to implement a lot of the complexities of subroutine call and process management in hardware. In the decade that followed there were multiple attempts to implement complex high-level language constructs like loops and conditional statements in hardware.[iv] These experiments continued through the 1990s as the industry eventually settled on the handful of architectures we have today with their quasi-RISC architectures, and today other high-end architectures implement some level of network protocol processing offload, crypto acceleration, assists for virtual memory and VM acceleration, memory compression and a host of other features to make low-level software execute well.
What is significant about Oracle’s approach is that it appears to mark the first application of hardware optimization for a specific set of application-level features, in this case the processing of columnar compressed data and selected table operations in a manner specifically tailored to the Oracle database. Oracle will allow other software to use these hardware accelerators, but the dominant use case will definitely be Oracle software.
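To make “processing of columnar compressed data” concrete, the following sketch shows one well-known software technique: dictionary-encoding a column and then evaluating a filter directly on the compact codes instead of on the decompressed values. It is a generic illustration only. The function names are hypothetical, and nothing here is claimed to match Oracle’s actual compression format or the way the M7 coprocessors work.

```python
import bisect


def dict_encode(values):
    """Compress a column into a sorted dictionary plus one small code per row."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def scan_equals(dictionary, codes, target):
    """Return row numbers where the column equals target, comparing codes only."""
    pos = bisect.bisect_left(dictionary, target)
    if pos == len(dictionary) or dictionary[pos] != target:
        return []  # target is not in the column at all
    return [row for row, code in enumerate(codes) if code == pos]


def scan_range(dictionary, codes, low, high):
    """Return row numbers where low <= value <= high, again using codes only.

    Because the dictionary is sorted, a value range maps to a code range.
    """
    low_code = bisect.bisect_left(dictionary, low)
    high_code = bisect.bisect_right(dictionary, high)
    return [row for row, code in enumerate(codes) if low_code <= code < high_code]
```

For example, encoding the column `["DE", "US", "FR", "US", "DE", "JP"]` gives the dictionary `["DE", "FR", "JP", "US"]` and the codes `[0, 3, 1, 3, 0, 2]`; a scan for `"US"` then compares small integers and never touches the original strings. The inner loop is simple, repetitive and data-parallel, which is the general kind of work that lends itself to dedicated hardware.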
Additionally, the implementation of SSM removes a major impediment to more secure operations by allowing continuous runtime protection against a dangerous class of exploit. While Oracle was not able to offer any statistics on the frequency of such attacks, the ability to enable SSM with almost no performance penalty ensures rapid uptake and makes the M7 the new gold standard for runtime security.[v]
What’s in it for Oracle and its customers?
In choosing to optimize the SPARC M7 and presumably future versions, Oracle gets a stronger position with many of its more than 400,000 customers, for many of whom the additional performance per licensed core will more than outweigh any increased hardware cost compared to other alternatives. In addition, the M7 should act as a drastic brake on future migrations from SPARC/Solaris to Linux, since it appears that the new M7 CPU in the low-cost T-Series servers can equal or exceed x86 price-performance across many Oracle workloads.
If the performance numbers that Oracle discussed at Oracle Open World are anything close to representative of real customer experiences, the M7 delivers both extreme performance on some Oracle workloads and also price-performance in line with x86 solutions.
For existing Oracle SPARC customers, I think the M7 should make them think twice about any plans to migrate off of SPARC unless there is solid economic justification based on rigorous benchmarking of actual workloads. In a nutshell, it looks like the M7 has delivered on Oracle’s promise of maintaining a competitive roadmap for its processors, almost completely vitiating any price-performance argument for a migration.
What does this mean for competitors?
It’s a pretty simple calculus – the M7 is a high bar for competition, but it comes with a twist. Oracle is the only vendor in a position to reasonably invest in hardware that is optimized for its own software stack, but this advantage is double-edged. For Oracle’s own customer base it helps build some very high walls around it. At the same time, the extra chip cost and R&D investment doesn’t do any similar magic for the rest of the software world, including customer-developed software. IBM, owner of the only remaining non-x86 architecture worth thinking about in the context of enterprise servers, faces an uphill battle against Oracle for customers running the Oracle database. Since its POWER8 is barely a year old, it will be a long time before IBM could field similar assists for DB2, and the economics of a DB2-specific hardware investment may be questionable considering DB2’s much smaller installed base.[vi] Intel faces the dilemma of the truly general-purpose player, and any optimizations it makes in future server platforms must play well with Oracle, SQL Server, DB2 and a long list of other enterprise software vendors.
[i] For the technically inclined, the M7 has eight query acceleration coprocessors with four execution engines each in addition to the standard SPARC cores, and uses the coprocessors to enable some of the specific functions.
[ii] Oracle also included more arcane and less impactful hardware assists for live migration, light-weight IPC, and remote memory access. While these no doubt help in the overall performance picture, they are less significant than the “big three” discussed above.
[iii] Anticipating the chorus of “this has got to be too good to be true”, Oracle has provided a list of benchmarks for the new M7 systems at https://blogs.oracle.com/
[iv] These approaches turned out to be a sort of architectural dead-end, since the extra logic needed to execute these complex instructions tended to bloat the resulting microprocessors and slow them down. Similarly, attempts to implement other complex software in hardware, including the kernel for FEA solvers, Java, and other interpreted environments, all failed.
[v] Note that SSM (built on the hardware feature Oracle also calls Application Data Integrity, or ADI) also protects against subtle runtime errors as well as malicious attacks. Without it, data corruption can be difficult to detect until the misdirected pointer operation overwrites some critical runtime structure and generates a fault.
[vi] IBM has made extensive optimizations of its DB2 product for in-memory analytics with its BLU Acceleration technology, which yields exceptional improvements in certain kinds of in-memory analytics. However, this is a generalized software technology that runs on mainframe, Power and x86 architectures, and does not currently involve dedicated hardware assist. Currently there are no publicly available benchmarks comparing BLU Acceleration and the M7 on comparable workloads.