Maximizing Thermal Cooling Efficiencies in High-Performance Processors

Despite the continuous development of new, higher-performing processors, the thirst for increased embedded computing capability remains unquenched. In fact, Moore’s Law seems to have slowed in terms of clock frequency while continuing to drive up processor core counts and field programmable gate array (FPGA) look-up table (LUT) counts. The earlier push for higher frequencies has given way to a push for more cores, faster front side bus speeds, and greater support chip integration, all of which drive continually rising power requirements. Meeting these ever-increasing "compute density escalations" while simultaneously maximizing cooling efficiency requires innovative packaging solutions.

Figure 1. Intel® Microprocessor Pin Count Over Time (Credit: Lee Pavelich, Progression of CPU Pin Counts, Scrub Physics blog, September 19, 2011)
Core counts in processor chips and LUT counts in FPGAs continue to grow at an unprecedented pace. Processor manufacturers like Intel and AMD continue to integrate functionality and add processor cores to achieve greater volumetric efficiency. FPGA suppliers like Xilinx and Altera, which together hold 90% of the FPGA market, are offering FPGAs with larger LUT counts that appeal to embedded computing engineers but come at a higher thermal management cost. This thermal management challenge is usually left to the end of the design process, when engineers start to ask, “How will we cool these new chip densities?”

Size Matters

Silicon chips continue to evolve. Figure 1 illustrates how the ball grid array (BGA) ball counts, and with them the package sizes, of server- and mobile-class chips have continued to grow as functionality is integrated into the processor or scaled into FPGAs.

Figure 2 (left to right). Intel
The portable device market requires a smaller volumetric approach, with cooling demands driven by size, weight, and power (SWaP). Handheld devices, tablets, and laptops require maximum cooling in a very small environment. Server-class chips, however, use a different approach. Figure 2 shows a mobile-class processor and a server-class processor; the latter has a built-in heat spreader to aid the bulk transfer of thermal energy.

Each of these chips requires a different approach to the cooling challenge. The smaller device demands an approach that controls the distribution of energy without adding weight. Server-class chips are driving larger BGA ball counts and rely on copper surface area and volume to spread heat and transfer it to heat sinks designed for servers. The resulting size and weight are significantly different.

Thermal Densities — The Hidden Variables

Figure 3 (left to right). Thermal scans of mobile-class, server-class and FPGA chips.
When looking at these very different silicon-based chip technologies and their ever-shrinking lithography, one attribute is extremely consistent. As functionality and capabilities are scaled into the chips, they grow in size and their heat flux is distributed non-uniformly across the die.

Figure 3 shows some of the enormous challenges present in all three silicon technologies. The FLIR camera analysis shows significant differences in heat generation across the silicon. This means that average watts per square inch is no longer a sensible measure for analyzing these challenges. Even when sophisticated computational fluid dynamics (CFD) software tools like Flotherm, Icepak, or others are used, a uniform energy distribution is not observed. Thermal energy density and the ability to transfer concentrated heat are becoming a thermal analyst’s "Disneyland", where copper or diamond is preferred for its thermal conductivity. The weight and cost of these technology implementations are outside the scope of this article. The images in Figure 3 illustrate why some of these chips require a new approach to cooling to absorb these highly concentrated energy loads.
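To make the point about average watts per square inch concrete, here is a minimal Python sketch that compares the average heat flux of a die with the flux at its hottest region. The 4 x 4 power map, the cell area, and the function name heat_flux_summary are all hypothetical values chosen for illustration; they are not taken from the thermal scans in Figure 3.

```python
def heat_flux_summary(power_map_w, cell_area_in2):
    """Return (average, peak) heat flux in W/in^2 for a grid of per-cell power values."""
    cells = [p for row in power_map_w for p in row]
    die_area_in2 = cell_area_in2 * len(cells)
    average_flux = sum(cells) / die_area_in2   # what a single W/in^2 figure reports
    peak_flux = max(cells) / cell_area_in2     # what the hottest cell actually sees
    return average_flux, peak_flux


# Hypothetical power map in watts: a hot 2 x 2 core region surrounded by cooler logic.
power_map = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 12.0, 12.0, 1.0],
    [1.0, 12.0, 12.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]

# Assumed cell area of 1/16 in^2, so the whole die is 1 in^2.
avg_flux, peak_flux = heat_flux_summary(power_map, cell_area_in2=0.0625)
print(f"average: {avg_flux:.0f} W/in^2, peak: {peak_flux:.0f} W/in^2")
```

With these assumed numbers the die averages 60 W per square inch while the hot region runs at 192 W per square inch, more than three times higher, which is why a cooling solution sized from the average alone can fall short at the hotspot.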

Agnostic Cooling

Figure 4. Mercury's 6U OpenVPX payload cards may be packaged in a variety of standards-compatible cooling options without modification.
One approach to this non-uniform thermal energy challenge was developed by engineers at Mercury Systems. As shown in Figure 4, Mercury developed a 6U OpenVPX design that applies standardized, scalable VPX open-standard cooling and uses a common printed circuit board assembly across each type of cooling technology.

This approach lets engineers solve these complex thermal density challenges in various environments with the same computational architecture. A VPX solution in a lab environment needs one cooling solution, while a VPX solution in ground radar, a mobile vehicle, a manned aircraft, or an unmanned aerial vehicle (UAV) needs a significantly different one. An agnostic approach allows affordable rugged VPX cooling solutions to be used in each of these very different environments, while also saving precious design, development, and deployment time.

Open Standards, VITA and Standardized Module Cooling

This is where the VMEbus International Trade Association (VITA) has really embraced cooling-agnostic design. VITA continues to drive standardized cooling technologies into VPX to support these requirements. Below are some examples:

  1. VITA 48.1 supports air cooling.
  2. VITA 48.2 supports conduction cooling.
  3. VITA 48.3 is an open unfinished standard for liquid cooling.
  4. VITA 48.4 is a developing standard for liquid cooling.
  5. VITA 48.5 supports air flow through cooling.
  6. VITA 48.6 is an open standard for liquid cooling.
  7. VITA 48.7 supports Air Flow-By™ cooling.
  8. VITA 48.8 …What will it be?

Figure 5. Thermal Resistance Comparison, Air Flow-By™ (AFB) vs. Conduction Cooled (CC).
As a final example, Figure 5 compares the thermal resistance of Mercury's VITA 48.7 Air Flow-By™ cooling with that of VITA 48.2 conduction cooling.

Each of these cooling technologies has a direct impact on reliability, both through operating temperature and through the associated coefficient of thermal expansion (CTE) effects. As power levels, thermal densities, and concentrated heat fluxes drive embedded systems forward, companies like Mercury Systems are developing analytically grounded, high-reliability cooling solutions to meet these ever-increasing demands.
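A thermal resistance comparison such as the one in Figure 5 is commonly read through the steady-state relation: component temperature equals a reference temperature plus dissipated power times thermal resistance. The Python sketch below applies that relation to two cooling options. The resistance values, power level, reference temperature, and option names are hypothetical placeholders, not data from Figure 5 or from any VITA 48 specification.

```python
def component_temperature_c(reference_temp_c, power_w, r_th_c_per_w):
    """Steady-state temperature: reference temperature plus power times thermal resistance."""
    return reference_temp_c + power_w * r_th_c_per_w


# Hypothetical operating point (assumed for illustration only).
REFERENCE_TEMP_C = 55.0   # inlet air or card-edge temperature, in degC
POWER_W = 50.0            # power dissipated by the component, in watts

# Hypothetical thermal resistances in degC per watt for two cooling options.
options = {"cooling option A": 0.5, "cooling option B": 0.8}

temps = {
    name: component_temperature_c(REFERENCE_TEMP_C, POWER_W, r_th)
    for name, r_th in options.items()
}
for name, temp_c in temps.items():
    print(f"{name}: {temp_c:.1f} degC")
```

Under these assumed values, a difference of 0.3 degC per watt in thermal resistance becomes a 15 degC difference in component temperature at 50 W, which shows how a modest change in thermal resistance carries through to the temperatures that drive reliability.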

This article was written by Darryl McKenney, Vice President, Engineering Services, Mercury Systems, Inc. (Chelmsford, MA).