









Brief Introduction to Cache Memory in Computer Organisation and Architecture
๏ The maximum size of the memory that can be used in any computer is determined by the addressing scheme.
๏ For example, a 16-bit computer that generates 16-bit addresses is capable of addressing up to 2^16 = 2^6 × 2^10 = 64K memory locations.
๏ Similarly, machines whose instructions generate 32-bit addresses can utilize a memory that contains up to 2^32 = 2^2 × 2^30 = 4G memory locations.
๏ Suppose the processor takes 5 ns to process an instruction, but fetching it from the memory takes 10 ns (the time taken to access a memory location (data + instruction) is known as the Memory Access Time).
๏ There is a mismatch between the speed of the processor and the speed of the memory. So, after processing an instruction, the processor sits in an idle state (stall) for 5 ns, which decreases the throughput of the processor.
๏ The processor of a computer can usually process instructions and data faster than they can be fetched from the memory, so the processor is the fast unit and the memory is the slow one.
๏ The memory cycle time is therefore the bottleneck in the system.
๏ One way to reduce the memory access time is to use a cache memory.
๏ Cache memory is the fastest system memory, required to keep up with the CPU as it fetches and executes instructions. The data most frequently used by the CPU is stored in cache memory. The fastest portion of the CPU cache is the register file, which contains multiple registers. Registers are small storage locations used by the CPU to store instructions and data.
๏ Virtual memory is another important concept related to memory organization. It is used to increase the apparent size of the physical memory. Data are addressed in a virtual address space that can be as large as the addressing capability of the processor, but at any given time only the active portion of this space is mapped onto locations in the physical memory. The remaining virtual addresses are mapped onto the bulk storage devices used, such as magnetic disks.
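To make the numbers above concrete, here is a minimal Python sketch (illustrative only, not part of the original notes) that computes the addressable locations for 16- and 32-bit addresses and the stall in the 5 ns / 10 ns example:

```python
# Minimal sketch: addressable memory from address width, and the
# processor stall from the 5 ns / 10 ns example in the notes.

def addressable_locations(address_bits: int) -> int:
    """Number of memory locations an n-bit address can reach."""
    return 2 ** address_bits

print(addressable_locations(16))        # 65536 -> 64K locations
print(addressable_locations(32))        # 4294967296 -> 4G locations

# Stall time = memory access time - processor processing time
memory_access_ns = 10
processor_ns = 5
print(memory_access_ns - processor_ns)  # 5 ns idle (stall) per fetch
```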
๏ When the required data is not present in the Main Memory, it is a miss, and the reference is forwarded to Secondary storage (HDD).
๏ Secondary storage is the final level of memory in the computer system, so the required data will always be hit in Secondary storage. The data is then transferred from Secondary storage to Main Memory in the form of pages, from Main Memory to Cache in the form of blocks, and from Cache to CPU in the form of words.
๏ To keep up with the processor speed, cache memories are kept very small so that it takes less time to find and fetch data. They are usually divided into multiple levels based on the architecture. The cache is organized into blocks of the same size as the blocks of the Main Memory.
๏ Example: Main Memory (MM) size = 16 bytes; Block size / Offset size / Word size = 2 bytes.
No. of Main Memory blocks = Main Memory size / Block size = 16 / 2 = 8 (3 bits are required for the block number, i.e., Tag + Line Offset).
No. of Cache blocks (cache lines / line numbers) = Cache size / Block size = 8 / 2 = 4 (2 bits are required for the Line Offset; the cache size of 8 bytes follows from 4 lines of 2 bytes each).
๏ The process of transferring the data from Main Memory to Cache Memory is known as cache mapping.
๏ Direct mapping example: Main Memory size = 128 bytes (7-bit addresses), Block size = 4 bytes, Cache size = 16 bytes.
๏ So, 2 bits (LSB) are required to represent the word within a block, and the remaining 5 bits (MSB) represent the Block No.
๏ The MM block size is equal to the Cache Memory block size.
๏ No. of cache lines = CM size / Block size = 16 / 4 = 4, so 2 bits form the Line Offset, and the remaining bits (7 − 5 = ... here 3) form the Tag.
๏ Example: word W10 sits at address 0001010. As a main memory address this splits into Block No. = 00010 (B2, 5 bits MSB) and word offset = 10 (2 bits LSB). For the cache it splits into Tag = 000 (3 bits), Line Offset = 10 (2 bits) and word offset = 10 (2 bits).
๏ If block B2 (containing W10) is present in the cache memory and its tag bits match the tag of the processor-generated request, it is a cache Hit; otherwise it is a cache Miss, and B2 must be transferred from MM to the Cache Memory.
๏ For example, if the tag of the CPU-generated memory request (request 00010 10, Tag = 000) does not match the tag stored in that cache line (say 001), it is a Cache Miss.

Types of cache Miss
๏ Compulsory Miss: the first access to a block always causes a miss (the cache is initially empty).
๏ Capacity Miss: occurs when the cache is too small to hold all concurrently used data.
๏ Conflict Miss (drawback): caused when several addresses map to the same set/line and evict blocks that are still needed.
Changing cache parameters can affect one or more types of cache miss.
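The bit split in the W10 example above can be checked with a short Python sketch; the constants mirror the 128-byte MM / 16-byte cache / 4-byte block figures, and the function name is purely illustrative:

```python
# Sketch of the direct-mapped address split used above:
# 7-bit address = Tag (3) | Line (2) | Word offset (2),
# for MM = 128 B, cache = 16 B, block = 4 B.
from math import log2

MM_SIZE, CACHE_SIZE, BLOCK_SIZE = 128, 16, 4
offset_bits = int(log2(BLOCK_SIZE))                 # 2
line_bits   = int(log2(CACHE_SIZE // BLOCK_SIZE))   # 2
tag_bits    = int(log2(MM_SIZE)) - line_bits - offset_bits  # 3

def split(addr: int):
    offset = addr & (BLOCK_SIZE - 1)
    line   = (addr >> offset_bits) & ((1 << line_bits) - 1)
    tag    = addr >> (offset_bits + line_bits)
    return tag, line, offset

# Word W10 lives at byte address 10 = 0001010:
print(split(10))   # (0, 2, 2) -> Tag 000, Line 10, word offset 10
```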
We know that a MM block can map only to one particular line of the Cache Memory: block K maps to line K mod n, where n is the number of cache lines (here n = 4). So B1 maps to L1. In the next iteration B5 also maps to L1, evicting B1. In the next iteration B9 maps to L1, evicting B5. In the next iteration B1 maps to L1 again, evicting B9. B1 was initially present in the cache memory, but we removed it because of B5, which creates a cache miss on its next access. Meanwhile L0, L2 and L3 are empty, but we are unable to utilize them; this is a conflict miss.
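A small simulation (a sketch, not part of the original notes) reproduces this behaviour: every access in the sequence B1, B5, B9, B1 misses, even though three lines stay empty.

```python
# Conflict-miss demo: blocks B1, B5, B9, B1 all map to line
# (K mod 4) = 1 of a 4-line direct-mapped cache, so every
# access misses while lines L0, L2 and L3 stay unused.
NUM_LINES = 4
lines = [None] * NUM_LINES    # one block per cache line

for block in [1, 5, 9, 1]:
    line = block % NUM_LINES
    hit = lines[line] == block
    print(f"B{block} -> L{line}: {'hit' if hit else 'miss'}")
    lines[line] = block       # direct mapping: evict whatever was there

print("unused lines:", [i for i, b in enumerate(lines) if b is None])
```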
๏ In direct mapping, there is no need for any replacement algorithm. ๏ This is because a main memory block can map only to one particular line of the cache, and the position of each block is predetermined. ๏ Thus, the new incoming block will always replace the existing block (if any) in that particular line.
Fully Associative Mapping
๏ In this type of mapping, a block of main memory can map to any line of the cache that is freely available at that moment.
๏ The word-offset bits are still used to identify which word in the block is needed, but the tag becomes all of the remaining bits (Tag = Tag + Line Offset). This enables the placement of any block at any place in the cache memory, at the cost of a larger tag.
๏ It is considered to be the fastest and the most flexible mapping form.
๏ Example: Cache size = 16 bytes, Block size = 4 bytes = 2^2, Main memory size = 128 bytes (7-bit addresses). No. of MM blocks = MM size / Block size = 128 / 4 = 32 = 2^5, so the 7-bit address splits into a 5-bit tag (the whole block number) and a 2-bit word offset.
๏ In direct mapping each cache line L0(00)..L3(11) stored only the 3-bit tag (e.g., 000); in fully associative mapping each line stores the full 5-bit block number (00000, 00001, 00010, 00011, ...) as its tag.
๏ Main memory layout (block number → words):
00000 (B0): W0, W1, W2, W3
00001 (B1): W4, W5, W6, W7
00010 (B2): W8, W9, W10, W11
00011 (B3): W12, W13, W14, W15
00100 (B4): W16, W17, W18, W19
00101 (B5): W20, W21, W22, W23
00110 (B6): W24, W25, W26, W27
00111 (B7): W28, W29, W30, W31
...
11111 (B31): W124, W125, W126, W127
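For comparison with the direct-mapped split shown earlier, here is a minimal sketch of the fully associative split, where the whole block number serves as the tag (the function name is illustrative):

```python
# Fully associative split: with no line field, the 7-bit address
# is just Tag (5 bits, the block number) | word offset (2 bits).
BLOCK_SIZE = 4

def fa_split(addr: int):
    return addr >> 2, addr & (BLOCK_SIZE - 1)   # (tag, word offset)

print(fa_split(10))   # (2, 2) -> W10 is word 10 of block B2 (tag 00010)
```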
Set-Associative Mapping
๏ Here the 7-bit address splits into Tag (4 bits), Set No. (1 bit) and word offset (2 bits): the 4 cache lines are grouped into 2 sets of 2 lines each (2-way set associative), so 1 bit selects the set.
๏ Set associative mapping is a combination of direct mapping and fully associative mapping: direct mapping selects the set, and fully associative mapping is used within each set.
๏ If all the cache lines of a set are occupied, then one of the existing blocks will have to be replaced — hence the need for a replacement algorithm.
๏ Thus, set associative mapping requires a replacement algorithm.

Q1. Consider a direct mapped cache of size 16 KB with block size 256 bytes. The size of main memory is 128 KB. Find the number of bits in the tag and the tag directory size.
Soln:
Cache memory size = 16 KB = 2^14 bytes
Block size = Frame size = Line size = 256 bytes = 2^8 bytes
Main memory size = 128 KB = 2^17 bytes
Number of bits in the physical address = 17 bits
Number of lines = CM size / Block size = 2^14 / 2^8 = 2^6 lines, so 6 bits are required for the Line Offset.
Cache address split: Tag (3 bits) | Line Offset (6 bits) | Block/word offset (8 bits)
Main memory address split: Block No. (9 bits) | Block/word offset (8 bits)
Number of bits in Tag = 17 − 14 = 3 bits
Tag directory size = Number of tags × Tag size
= Number of lines in cache × Number of bits in tag
= 2^6 × 3 bits = 64 × 3 bits = 192 bits = 192/8 = 24 bytes
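A few lines of Python (illustrative only, using the figures from Q1) confirm the arithmetic:

```python
# Checking Q1: direct-mapped, 16 KB cache, 256 B blocks, 128 KB MM.
from math import log2

CACHE, BLOCK, MM = 16 * 1024, 256, 128 * 1024
address_bits = int(log2(MM))                   # 17
offset_bits  = int(log2(BLOCK))                # 8
lines        = CACHE // BLOCK                  # 64
line_bits    = int(log2(lines))                # 6
tag_bits     = address_bits - line_bits - offset_bits  # 3

directory_bits = lines * tag_bits              # 64 x 3 = 192 bits
print(tag_bits, directory_bits, directory_bits // 8)   # 3 192 24
```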
๏ Since the size of the cache memory is small compared to the main memory, which part of main memory should be given priority and loaded into the cache is decided based on locality of reference.
Types of Locality of Reference
Spatial Locality of Reference (Space)
๏ This says that there is a good chance that words in close proximity to the referenced word will be accessed next.
Temporal Locality of Reference (Time)
๏ This says that if a word is referenced now, the same word is likely to be referenced again in the near future.
๏ This is the property exploited by the least recently used (LRU) algorithm.
๏ In case of a cache miss, a replacement policy needs to be followed to decide which block in the corresponding set will be replaced. This requires additional decision hardware. There are several possible replacement policies:
LRU (Least Recently Used):
๏ Replace the block which was least recently referenced (more complex and expensive hardware, but a lower miss rate, assuming that the most recently referenced words are the most likely to be referenced again in the near future).
FIFO (First In First Out):
๏ Replace the block that was loaded into the cache earliest, so blocks are replaced based on the order in which they were copied in, rather than the order in which they were accessed.
๏ Initially all slots are empty, so 7, 0, 1, 2 are allocated to the empty slots: 4 cache Misses.
๏ 0 is already there, so it's a cache Hit.
๏ When 3 arrives, it takes the place of 7, because 7 is the least recently used: a cache Miss.
๏ 0 is already in the cache memory, so it's a cache Hit.
๏ 4 takes the place of 1: a cache Miss.
๏ For the rest of the page reference string, every access is a cache Hit, because the referenced pages are already available in the cache memory.
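The same trace can be reproduced with a minimal LRU simulation in Python (a sketch using OrderedDict for recency ordering, not hardware-accurate):

```python
# LRU cache simulation reproducing the trace above
# (4 slots, reference string 7, 0, 1, 2, 0, 3, 0, 4).
from collections import OrderedDict

def simulate_lru(refs, capacity=4):
    cache = OrderedDict()            # keys ordered oldest -> newest use
    for ref in refs:
        if ref in cache:
            cache.move_to_end(ref)   # refresh recency on a hit
            print(f"{ref}: hit")
        else:
            if len(cache) == capacity:
                victim, _ = cache.popitem(last=False)  # evict LRU entry
                print(f"{ref}: miss, evicts {victim}")
            else:
                print(f"{ref}: miss (empty slot)")
            cache[ref] = True

simulate_lru([7, 0, 1, 2, 0, 3, 0, 4])
# 7,0,1,2 miss into empty slots; 0 hits; 3 evicts 7; 0 hits; 4 evicts 1
```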
๏ Whenever a processor wants to read or write a word, it checks whether the address it wants to read/write is present in the cache or not.
๏ If the address is present in the cache, it is a Read/Write Hit.
๏ On a Read hit, the Main Memory is not involved.
๏ On a Write hit, the Main Memory is involved: we can update the value in the cache and avoid an expensive main memory access, but then the cache and main memory hold different data at the same memory location (e.g., data written at location 10101100 in the cache is not updated in the Main Memory simultaneously). This causes problems when two or more devices share the main memory (as in a multiprocessor system); keeping the copies consistent is known as cache coherence.
๏ Otherwise, this results in the Inconsistent Data Problem.
๏ For write operations, the system can proceed in two ways: the Write Through protocol and the Write Back protocol.

Write Through Protocol
๏ In write through, data is simultaneously updated in the cache and in memory. This process is simpler and more reliable. It is used when there are no frequent writes to the cache (the number of write operations is small).
๏ It helps in data recovery (in case of power outage or system failure).
๏ A data write experiences latency (delay), as we have to write to two locations (both Memory and Cache).
๏ It solves the inconsistency problem.

Write Back Protocol
๏ The data is updated only in the cache and written to the memory at a later time. Data is updated in the memory only when the cache line is about to be replaced (cache line replacement is done using the Least Recently Used algorithm, FIFO, LIFO and others, depending on the application).
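A toy sketch contrasting the two write policies described above (class and method names are hypothetical, purely for illustration):

```python
# Write-through updates memory on every write; write-back marks the
# line dirty and defers the memory update until replacement time.

class WriteThroughCache:
    def __init__(self):
        self.cache, self.memory = {}, {}
    def write(self, addr, value):
        self.cache[addr] = value
        self.memory[addr] = value          # memory updated simultaneously

class WriteBackCache:
    def __init__(self):
        self.cache, self.memory, self.dirty = {}, {}, set()
    def write(self, addr, value):
        self.cache[addr] = value
        self.dirty.add(addr)               # memory NOT updated yet
    def evict(self, addr):
        if addr in self.dirty:             # write back only on replacement
            self.memory[addr] = self.cache[addr]
            self.dirty.discard(addr)
        self.cache.pop(addr, None)

wb = WriteBackCache()
wb.write(0b10101100, 3)
print(wb.memory)        # {} -> main memory still stale (inconsistent)
wb.evict(0b10101100)
print(wb.memory)        # {172: 3} -> updated only at replacement time
```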
๏ Here the CPU first checks whether the desired data is present in the Cache Memory or not, i.e., whether there is a "hit" or a "miss" in the cache. Suppose there are 3 misses in the Cache Memory; then the Main Memory will be accessed 3 times.
๏ Cache performance is optimized further by introducing multilevel caches.
๏ Consider a 2-level cache design. Suppose there are 3 misses in the L1 Cache Memory, and of these 3 misses only 1 also misses in the L2 Cache Memory; then the Main Memory will be accessed only once. Clearly the miss penalty is reduced considerably compared with the previous case, thereby improving the performance of the cache memory.
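As a rough illustration of why this helps, consider the average memory access time (AMAT). The cycle counts below are assumptions for the sketch, not from the notes; only the 1-in-3 L2 miss ratio comes from the example above.

```python
# AMAT = hit time + miss rate x miss penalty. With an L2 cache,
# only the fraction of L1 misses that also miss in L2 pays the
# full main-memory penalty, so the effective penalty shrinks.
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

# Single level: L1 backed directly by main memory (assumed numbers).
single = amat(hit_time=1, miss_rate=0.10, miss_penalty=100)

# Two levels: an L1 miss goes to L2; 1 of 3 L1 misses also misses in L2.
l2_penalty = amat(hit_time=10, miss_rate=1/3, miss_penalty=100)
two_level  = amat(hit_time=1, miss_rate=0.10, miss_penalty=l2_penalty)

print(single, two_level)   # 11.0 vs ~5.33 cycles on these assumptions
```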