Computer App Speeds Boosted

John Simpson | October 11, 2016

Researchers at Samsung Electronics and North Carolina State University (NCSU) have found a way to boost the speed of computer applications by more than 9% using techniques that allow processors to retrieve data more efficiently.

Computer processors have to retrieve data from memory to perform operations. All data is stored in off-chip “main” memory. But data that the processor uses frequently is also stored temporarily in a die-stacked dynamic random access memory (DRAM) cache that is located closer to the processor, where it can be retrieved more quickly.

The data in the cache is organized into large blocks, or macroblocks, so that the processor knows where to find whatever data it needs. However, for any given operation, the processor doesn’t need all of the data in a macroblock, and retrieving the unnecessary data takes time and energy.

To make the process more efficient, researchers led by Yan Solihin, professor of electrical and computer engineering at NCSU, developed a technique in which the cache learns over time which data the processor needs from each macroblock. This allows the cache to do two things. First, the cache can compress the macroblock, retrieving only the relevant data. This enables the cache to send data to the processor more efficiently. Second, because the macroblock is compressed, space is freed up in the cache that can be used to store other data that the processor is more likely to need.
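The idea can be illustrated with a toy model. The sketch below is not the researchers' design: the class name, the eight-block macroblock size, the least-recently-used eviction policy and the per-macroblock history table are all assumptions made for illustration. It only shows the two effects described above: once a macroblock's footprint has been learned, less data is fetched on the next fill, and the compressed macroblock leaves room for others.

```python
from collections import OrderedDict

BLOCKS_PER_MACROBLOCK = 8  # hypothetical size, chosen only for illustration


class ToyFootprintCache:
    """Toy cache whose capacity is counted in blocks, not macroblocks."""

    def __init__(self, capacity_blocks):
        if capacity_blocks < BLOCKS_PER_MACROBLOCK:
            raise ValueError("capacity must hold at least one full macroblock")
        self.capacity_blocks = capacity_blocks
        self.resident = OrderedDict()  # macroblock id -> set of blocks held (LRU order)
        self.touched = {}              # macroblock id -> blocks used while resident
        self.history = {}              # macroblock id -> footprint learned at eviction
        self.blocks_fetched = 0        # traffic from main memory, in blocks
        self.misses = 0

    def _used_blocks(self):
        return sum(len(blocks) for blocks in self.resident.values())

    def _evict_until_fits(self, needed):
        # Evict least recently used macroblocks until `needed` blocks fit.
        while self.resident and self._used_blocks() + needed > self.capacity_blocks:
            victim, _ = self.resident.popitem(last=False)
            # Learning step: remember which blocks were actually used.
            self.history[victim] = self.touched.pop(victim)

    def access(self, macroblock, block):
        """Return True on a hit, False on a miss."""
        held = self.resident.get(macroblock)
        if held is not None:
            self.resident.move_to_end(macroblock)  # mark as most recently used
            self.touched[macroblock].add(block)
            if block in held:
                return True
            # The learned footprint missed this block: fetch just that block.
            self.misses += 1
            self._evict_until_fits(1)
            held.add(block)
            self.blocks_fetched += 1
            return False
        # Macroblock not resident: fetch the learned footprint if there is
        # one, otherwise fall back to fetching the whole macroblock.
        self.misses += 1
        predicted = self.history.get(macroblock, set(range(BLOCKS_PER_MACROBLOCK)))
        to_fetch = set(predicted) | {block}
        self._evict_until_fits(len(to_fetch))
        self.resident[macroblock] = to_fetch
        self.touched[macroblock] = {block}
        self.blocks_fetched += len(to_fetch)
        return False
```

In this toy model, a cache sized for one full macroblock can hold several compressed macroblocks at once after their footprints have been learned, which is the space saving the article describes.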

The researchers tested this approach, called Dense Footprint Cache, in a processor and memory simulator. After running 3 billion instructions through the simulator for each application tested, they found that the approach sped up applications by 9.5% compared to state-of-the-art competing methods for managing die-stacked DRAM. Dense Footprint Cache also used 4.3% less energy.

The Dense Footprint Cache was also reported to have achieved a significant improvement in the ratio of “last-level cache misses,” which occur when the processor tries to retrieve data that is not in the cache, forcing it to retrieve the data from off-chip main memory instead. Such misses make operations much less efficient, and the new approach reduced the last-level cache miss ratio by 43%.
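As a rough illustration of what the miss-ratio figure means, the sketch below computes a miss ratio and applies a 43% reduction to it. The access and miss counts are invented for the example and do not come from the study, and the code assumes the 43% is a relative reduction, which the article does not spell out.

```python
def miss_ratio(misses, accesses):
    """Fraction of cache accesses that had to go to main memory."""
    return misses / accesses


def apply_relative_reduction(ratio, reduction):
    """Scale a ratio down by a relative reduction (0.43 means 43% lower)."""
    return ratio * (1.0 - reduction)


# Hypothetical baseline: 200 misses out of 1,000 accesses (not from the study).
baseline = miss_ratio(200, 1000)
# Assumes the reported 43% is relative to the baseline miss ratio.
reduced = apply_relative_reduction(baseline, 0.43)
print(f"baseline miss ratio: {baseline:.3f}, reduced miss ratio: {reduced:.3f}")
```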

To contact the author of this article, email GlobalSpeceditors@globalspec.com