This page takes a closer look at the Raspberry Pi memory hierarchy. Each level of the memory hierarchy has a capacity and a speed. Capacities are relatively easy to discover by querying the operating system or reading the ARM1176 technical reference manual. Speed, however, is not as easy to discover and must usually be measured. I use a simple pointer chasing technique to characterize the behavior of each level in the hierarchy. The technique also reveals the behavior of memory-related performance counter events at each level.

The Raspberry Pi implements five levels in its memory hierarchy. The levels are summarized in the table below. The highest level consists of virtual memory pages that are maintained in secondary storage. Raspbian Wheezy keeps its swap space in the file /var/swap on the SDHC card. This is enough space for 25,600 4KB pages (100MB). You are allowed as many pages as will fit into the preallocated swap space.
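The pointer chasing idea mentioned above can be sketched in a few lines. This is only an illustration of the general technique, not the measurement program used for this article: the names `build_chain`, `chase`, and `seconds_per_step` are hypothetical, and Python's interpreter overhead swamps cache effects, so a real measurement would use the same structure in C with performance counters. The chain is built with Sattolo's algorithm so that it forms a single cycle and every slot is visited before any slot repeats.

```python
import random
import time


def build_chain(n, seed=0):
    """Return a list `nxt` of length n that forms one cycle through every slot.

    Sattolo's algorithm produces a permutation with a single cycle, so a
    chase starting anywhere touches all n slots before it repeats.
    """
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)              # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Follow the chain for `steps` dependent loads; return the final slot."""
    i = start
    for _ in range(steps):
        i = nxt[i]                        # each load depends on the previous one
    return i


def seconds_per_step(n, steps=100_000):
    """Average wall-clock time per step (interpreter overhead included)."""
    nxt = build_chain(n)
    begin = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - begin) / steps


if __name__ == "__main__":
    # Grow the working set and watch how the time per step changes.
    for n in (1 << 10, 1 << 14, 1 << 18):
        print(n, seconds_per_step(n))
```

Because each load address comes from the previous load, the hardware cannot overlap the accesses, which is what makes the time per step a reasonable stand-in for access latency once the working set exceeds a given level of the hierarchy.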
The Raspberry Pi has either 256MB (Model A) or 512MB (Model B) of primary memory. This is enough space for 65,536 or 131,072 physical 4KB pages, respectively, if all of primary memory were available for paging. It isn't all available to user-space programs because the Linux kernel needs space for its own code and data. Linux also supports large pages, but that's a separate topic for now. The vmstat command displays information about virtual memory usage. Please refer to the man page for usage. vmstat is a good tool for troubleshooting paging-related performance issues since it shows page in and page out statistics.

The processor in the Raspberry Pi is the Broadcom BCM2835. The BCM2835 does have a unified level 2 (L2) cache. However, the L2 cache is dedicated to the VideoCore GPU. Memory references from the CPU side are routed around the L2 cache. The BCM2835 has two level 1 (L1) caches: a 16KB instruction cache and a 16KB data cache.
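The page counts quoted for swap space and primary memory follow directly from the 4KB page size. A small sketch of the arithmetic, with `pages_in` as a hypothetical helper name:

```python
PAGE_SIZE = 4 * 1024          # 4KB pages, as stated in the text
MB = 1024 * 1024


def pages_in(num_bytes, page_size=PAGE_SIZE):
    """Number of whole pages that fit in num_bytes."""
    return num_bytes // page_size


model_a_pages = pages_in(256 * MB)     # 65,536 pages
model_b_pages = pages_in(512 * MB)     # 131,072 pages
swap_bytes = 25_600 * PAGE_SIZE        # 25,600 pages is 100MB of swap
```

These are upper bounds for primary memory, since the kernel's own code and data occupy some of the physical pages.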
Our analysis below concentrates on the data cache. The data cache is 4-way set associative. Each way in an associative set stores a 32-byte cache line. The cache can handle up to four active references to the same set without conflict. If all four ways in a set are valid and a fifth reference is made to the set, then a conflict occurs and one of the four ways is victimized to make room for the new reference. The data cache is virtually indexed and physically tagged. Cache lines and tags are stored separately in DATARAM and TAGRAM, respectively. Virtual address bits 11:5 index the TAGRAM and DATARAM. Given a 16KB capacity, 32-byte lines and 4 ways, there must be 128 sets. Virtual address bits 4:0 are the offset into the cache line. The data MicroTLB translates a virtual address to a physical address and sends the physical address to the L1 data cache.
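The cache geometry described above can be checked with a short sketch. The constants come from the text; the function names `split_address` and `same_set_addresses` are hypothetical and exist only for illustration.

```python
CACHE_BYTES = 16 * 1024   # L1 data cache capacity
LINE_BYTES = 32           # bytes per cache line
WAYS = 4                  # 4-way set associative

NUM_SETS = CACHE_BYTES // (LINE_BYTES * WAYS)     # 128 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1         # 5 bits -> address bits 4:0
INDEX_BITS = NUM_SETS.bit_length() - 1            # 7 bits -> address bits 11:5


def split_address(vaddr):
    """Return (set_index, line_offset) for a virtual address."""
    offset = vaddr & (LINE_BYTES - 1)
    set_index = (vaddr >> OFFSET_BITS) & (NUM_SETS - 1)
    return set_index, offset


def same_set_addresses(base, count):
    """Addresses that all map to the same set as `base`.

    Addresses NUM_SETS * LINE_BYTES (4096) bytes apart share index bits 11:5.
    """
    stride = NUM_SETS * LINE_BYTES
    return [base + k * stride for k in range(count)]
```

For example, `same_set_addresses(0x1000, 5)` yields five lines that compete for the four ways of one set, so the fifth reference forces a victim to be chosen. Note also that the index and offset together occupy bits 11:0, which is exactly the offset within a 4KB page.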
The L1 data cache compares the physical address with the tag and determines hit/miss status and the correct way. The load-to-use latency is three (3) cycles for an L1 data cache hit.

The BCM2835 implements a two-level translation lookaside buffer (TLB) structure for virtual to physical address translation. There are two MicroTLBs: a ten-entry data MicroTLB and a ten-entry instruction MicroTLB. The MicroTLBs are backed by the main TLB (i.e., the second-level TLB). The MicroTLBs are fully associative. Each MicroTLB translates a virtual address to a physical address in one cycle when the page mapping information is resident in the MicroTLB (that is, a hit in the MicroTLB). The main TLB is a unified TLB that handles misses from the instruction and data MicroTLBs. It is a 64-entry, 2-way set associative structure. Main TLB misses are handled by a hardware page table walker. A page table walk requires at least one additional memory access to find the page mapping information in primary memory.
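One way to relate the TLB sizes above to a pointer chasing test is to compute how much memory each TLB can map at once, often called its reach. The sketch below assumes every entry maps a 4KB page (large pages would change the result) and uses the hypothetical helper name `tlb_reach`.

```python
PAGE_SIZE = 4 * 1024            # assumption: all mappings use 4KB pages

DATA_MICROTLB_ENTRIES = 10      # from the text
MAIN_TLB_ENTRIES = 64           # from the text


def tlb_reach(entries, page_size=PAGE_SIZE):
    """Bytes of address space covered when every entry maps one page."""
    return entries * page_size


micro_reach = tlb_reach(DATA_MICROTLB_ENTRIES)   # 40KB
main_reach = tlb_reach(MAIN_TLB_ENTRIES)         # 256KB
```

Under that assumption, a working set spread over more than ten pages can no longer be fully mapped by the data MicroTLB, and one spread over more than 64 pages can no longer be fully mapped by the main TLB, at which point page table walks begin to add to the access time.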