ARM cache basics

This ARM tutorial covers ARM cache basics and ARM cache architecture. Refer to the pages listed at the end for other ARM tutorial content.

What is Cache?

A cache is a small memory placed between the processor and main memory. It stores recently accessed instructions and data so that later accesses to them are faster. Most modern desktop and server CPUs have at least three independent caches: an instruction cache to speed up executable instruction fetch, a data cache to speed up data fetch and store, and a TLB (translation lookaside buffer) used to speed up virtual-to-physical address translation for both executable instructions and data. The data cache is generally organized as a hierarchy with multiple levels (L1, L2 and so on). This arrangement is called a multilevel cache.

What is Write Buffer?

A write buffer is a very small FIFO memory placed between the processor core and main memory. Its purpose is to free the processor core and cache memory from the slow write time associated with writing to main memory.
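The FIFO behavior described above can be sketched in a few lines of Python. This is only a toy model: the class name WriteBuffer, the depth of 2 and the dictionary standing in for main memory are assumptions made for illustration, not details of any particular ARM core.

```python
from collections import deque


class WriteBuffer:
    """Hypothetical FIFO write buffer between a core and slow main memory."""

    def __init__(self, depth=4):
        self.depth = depth        # assumed capacity; real depths vary by core
        self.pending = deque()    # (address, value) pairs in arrival order
        self.memory = {}          # stands in for main memory

    def write(self, address, value):
        """Queue a write. Returns True if the core had to stall first."""
        stalled = False
        if len(self.pending) == self.depth:
            self.drain_one()      # buffer full: the core waits for one slow write
            stalled = True
        self.pending.append((address, value))
        return stalled

    def drain_one(self):
        """Retire the oldest queued write to main memory (FIFO order)."""
        if self.pending:
            address, value = self.pending.popleft()
            self.memory[address] = value

    def drain_all(self):
        """Retire every queued write, oldest first."""
        while self.pending:
            self.drain_one()


buf = WriteBuffer(depth=2)
stalls = [buf.write(addr, addr * 10) for addr in (0x100, 0x104, 0x108)]
print(stalls)    # [False, False, True]
buf.drain_all()
```

The first two writes return immediately because the buffer has room. Only the third write, which finds the buffer full, makes the core wait.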

How cache improves performance

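Caches work because of the principle of locality of reference: computer programs frequently run small loops of code that repeatedly operate on local sections of data memory. Instructions and data that were used recently are therefore likely to be used again soon, and a cache that keeps them close to the processor serves most accesses without going to main memory.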
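To make the effect of locality concrete, here is a minimal sketch that counts hits in a simulated direct-mapped cache. The function name hit_rate, the 8 lines of 16 bytes and the two access patterns are all assumptions chosen to keep the arithmetic easy to follow. They do not model a specific ARM cache.

```python
def hit_rate(addresses, num_lines=8, line_size=16):
    """Hit rate of a hypothetical direct-mapped cache for a list of byte addresses."""
    tags = [None] * num_lines          # one stored tag per cache line; None means empty
    hits = 0
    for address in addresses:
        block = address // line_size   # which memory block the byte belongs to
        index = block % num_lines      # the only line this block may occupy
        tag = block // num_lines       # identifies the block held in that line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag          # miss: fetch the block, replacing the old one
    return hits / len(addresses)


# A small loop that re-reads the same 64 bytes ten times (good locality).
loop = [a for _ in range(10) for a in range(0, 64, 4)]
# The same number of accesses, each 128 bytes apart (poor locality).
scattered = [i * 128 for i in range(len(loop))]

print(hit_rate(loop))        # 0.975
print(hit_rate(scattered))   # 0.0
```

The loop misses only on the first touch of each of its four blocks. The scattered pattern maps every access to the same line with a different tag, so it never hits.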

How much cache is enough?

A bigger cache misses less often but has higher access latency, so balance is important. As a rough illustration, a processor with a clock speed of 2.4 GHz and 4 megabytes of L2 cache could theoretically perform about the same as a 2.6 GHz processor with 2 megabytes of cache, depending on the workload.
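One common way to reason about this trade-off is the average memory access time: hit time plus miss rate times miss penalty. The sketch below applies that formula to three cache sizes. Every figure in it is invented for illustration and is not a measurement of any real processor.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical caches: each step up in size lowers the miss rate
# but raises the hit time. The miss penalty is assumed to be 100 cycles.
small = average_access_time(hit_time=2, miss_rate=0.10, miss_penalty=100)
large = average_access_time(hit_time=4, miss_rate=0.05, miss_penalty=100)
huge = average_access_time(hit_time=8, miss_rate=0.04, miss_penalty=100)

print(small, large, huge)   # roughly 12, 9 and 12 cycles
```

With these made-up numbers the middle cache is the fastest on average. The largest one gives back its lower miss rate through its slower hits.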

ARM architecture

ARM cache architecture

An ARM cache has two main parts: the cache controller and the cache memory.

ARM Cache Types:

• Virtual cache (ARM9): cache lines are tagged with virtual addresses.
• Physical cache (ARM11): cache lines are tagged with physical addresses.
• Multilevel caches: several cache levels (L1, L2 and so on) arranged as a hierarchy.

ARM Cache Write Policy:

• Write-through: the cache and main memory are updated at the same time.
• Write-back: the cache is updated first and the line's dirty bit is set, marking that memory and cache are no longer in sync. Memory is updated only when the cache line has to be evicted.

ARM Cache Line Replacement Policy:

The strategy implemented in a cache controller to select the next victim line is called its replacement policy.

• Round-robin: simply selects the next cache line in a set to replace. This makes its behavior more predictable.
• Pseudorandom: randomly selects the next cache line in a set to replace.

CPU Stalls:

A stall is the state in which the CPU waits for a cache read to complete. The CPU is fast, and a read that misses the cache and has to go to main memory is slow by comparison. Two techniques keep the core busy during a stall:

• Out-of-order CPUs: attempt to execute independent instructions that come after the instruction waiting for the cache miss data.
• Hyper-Threading (Intel's name for simultaneous multithreading): allows an alternate thread to use the CPU core while a first thread waits for data to come from main memory.
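The write policies and replacement policies listed above can be tried out on a toy model of a single cache set. The following is a minimal sketch and not how any ARM cache controller is implemented. The class name CacheSet, the two-way set, the tag strings and the choice to allocate a line on every write miss are all assumptions made for illustration.

```python
import random


class CacheSet:
    """One set of a hypothetical set-associative cache (illustration only)."""

    def __init__(self, ways, policy="round-robin", write_back=True, seed=0):
        self.lines = [None] * ways       # each way holds [tag, value, dirty] or None
        self.policy = policy             # "round-robin" or "pseudorandom"
        self.write_back = write_back     # False selects write-through
        self.next_victim = 0             # round-robin pointer
        self.rng = random.Random(seed)   # seeded so runs are repeatable
        self.memory = {}                 # stands in for main memory
        self.memory_writes = 0

    def _store_to_memory(self, tag, value):
        self.memory[tag] = value
        self.memory_writes += 1

    def _find(self, tag):
        for line in self.lines:
            if line is not None and line[0] == tag:
                return line
        return None

    def _choose_way(self):
        if None in self.lines:
            return self.lines.index(None)            # fill empty ways before evicting
        if self.policy == "round-robin":
            way = self.next_victim                   # take the next way in sequence
            self.next_victim = (way + 1) % len(self.lines)
            return way
        return self.rng.randrange(len(self.lines))   # pseudorandom victim

    def write(self, tag, value):
        line = self._find(tag)
        if line is None:                             # miss: allocate a line
            way = self._choose_way()
            victim = self.lines[way]
            if victim is not None and victim[2]:
                self._store_to_memory(victim[0], victim[1])   # flush dirty victim
            line = self.lines[way] = [tag, value, False]
        line[1] = value
        if self.write_back:
            line[2] = True                           # dirty: memory is now out of sync
        else:
            self._store_to_memory(tag, value)        # write-through: update memory now


wb = CacheSet(ways=2, write_back=True)
wt = CacheSet(ways=2, write_back=False)
for cache in (wb, wt):
    for tag in ("A", "A", "A", "B", "C"):
        cache.write(tag, 1)

print(wb.memory_writes)   # 1
print(wt.memory_writes)   # 5
```

In this run the write-back set touches memory once, when the dirty line A is evicted to make room for C. The write-through set touches memory on every one of the five writes.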

Similar posts on ARM

ARM tutorial page1
ARM tutorial page2
ARM tutorial page3
ARM tutorial page4
ARM tutorial page5
ARM tutorial page6