Welcome to Linux Knowledge Base and Tutorial
"The place where you learn linux"


Cache Memory

Based on the principle of locality, a program is likely to spend most of its time executing code within the same small set of instructions. Tests have shown that most programs spend 80 percent of their time executing 20 percent of their code. Cache memory takes advantage of that.

Cache memory, or sometimes just cache, is a small amount of very high-speed memory. Typically, it uses SRAM, which can be up to ten times more expensive than DRAM, making it prohibitively expensive for anything other than cache.

When the IBM PC first came out, DRAM was fast enough to keep up with even the fastest processor. However, as CPU technology advanced, so did CPU speed. Soon, the CPU began to outrun its memory. The advances in CPU technology could not be exploited unless the system was filled with the more expensive, faster SRAM.

The solution to this was a compromise. Using the locality principle, manufacturers of fast 386 and 486 machines began to include a set of cache memory consisting of SRAM but still populated main memory with the slower, less expensive DRAM.

To better understand the advantages of this scheme, let's cover the principle of locality in a little more detail. For a computer program, we deal with two types of locality: temporal (time) and spatial (space). Because programs tend to run in loops (repeating the same instructions), the same set of instructions must be read over and over. The longer a set of instructions is in memory without being used, the less likely it is to be used again. This is the principle of temporal locality. What cache memory does is enable us to keep those regularly used instructions "closer" to the CPU, making access to them much faster. This is shown graphically in Figure 0-10.

Figure 0-10: Level 1 and Level 2 Caches

Spatial locality is the relationship between consecutively executed instructions. As I just said, a program spends most of its time executing the same set of instructions. Therefore, in all likelihood, the next instruction the program will execute lies in the next memory location. By filling cache with more than one instruction at a time, the cache takes advantage of spatial locality.

Is there really such a major advantage to cache memory? Cache performance is evaluated in terms of cache hits. A hit occurs when the CPU requests a memory location that is already in cache (that is, it does not have to go to main memory to get it). Because most programs run in loops (including the OS), the principle of locality results in a hit ratio of 85 to 95 percent. Not bad!
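To make the hit ratio concrete, here is a minimal sketch in Python. It is only a model under stated assumptions: the function name hit_ratio, the 20-instruction loop, the 4-byte instruction size, and the line sizes are all made up for illustration, and the cache is assumed to be large enough that nothing is ever evicted.

```python
def hit_ratio(addresses, line_size):
    """Fraction of accesses found in a cache that is filled one line at a time.

    Assumption: the cache is large enough that no line is ever evicted.
    """
    cached_lines = set()
    hits = 0
    for addr in addresses:
        line = addr // line_size      # which cache line this address falls in
        if line in cached_lines:
            hits += 1                 # hit: no trip to main memory
        else:
            cached_lines.add(line)    # miss: fetch the whole line from memory
    return hits / len(addresses)


# Hypothetical program: a loop of 20 four-byte instructions, run 10 times.
loop_trace = [addr for _ in range(10) for addr in range(0, 80, 4)]

print(hit_ratio(loop_trace, line_size=4))    # one instruction per line
print(hit_ratio(loop_trace, line_size=16))   # four instructions per line
```

With one instruction per line, only the first pass through the loop misses, which gives a hit ratio of 90 percent (temporal locality). Fetching four instructions per line also turns most of the first pass into hits and raises the ratio to 97.5 percent (spatial locality).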

On most 486 machines, two levels of cache are used: level 1 cache and level 2 cache. Level 1 cache is internal to the CPU. Although nothing (other than cost) prevents it from being any larger, Intel has limited the level 1 cache in the 486 to 8K.

The level 2 cache is the kind that you buy separately from your machine. It is often part of the advertisement you see in the paper and is usually what people are talking about when they say how much cache is in their systems. Level 2 cache is external to the CPU and can be increased at any time, whereas level 1 cache is an integral part of the CPU and the only way to get more is to buy a different CPU. Typical sizes of level 2 cache range from 64K to 256K, usually in increments of 64K.

There is one major problem in dealing with cache memory: the issue of consistency. What happens when main memory is updated and cache is not? What happens when cache is updated and main memory is not? This is where the cache's write policy comes in.

The write policy determines if and when the contents of the cache are written back to memory. A write-through cache simply writes the data through the cache directly into memory. This slows writes, but the data is always consistent. Buffered write-through is a slight modification of this, in which the data is collected and everything is written at once. Write-back improves cache performance by writing to main memory only when necessary. Write-dirty writes a block of cache to main memory only when that block has been modified.

Cache (or main memory, for that matter) is referred to as "dirty" when it is written to. Unfortunately, the system has no way of telling whether the contents have actually changed, only that the block has been written to. Therefore, it is possible, but not likely, that a block of cache is written back to memory even though its contents have not actually changed.
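The difference between writing through and writing back can be sketched by counting how often main memory is touched. This is a toy model, not how any particular chipset works: the class OneLineCache, the helper count_memory_writes, and the choice of 100 writes are assumptions made purely for illustration.

```python
class OneLineCache:
    """A single cache line, just enough to compare two write policies."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.value = None
        self.dirty = False
        self.memory_writes = 0

    def write(self, value):
        self.value = value
        if self.write_back:
            self.dirty = True          # remember the line was written to
        else:
            self.memory_writes += 1    # write-through: update memory right away

    def evict(self):
        # A write-back cache copies the line to memory only if it is dirty.
        if self.write_back and self.dirty:
            self.memory_writes += 1
            self.dirty = False


def count_memory_writes(write_back, n_writes=100):
    """Write to the same line n_writes times, then evict it."""
    cache = OneLineCache(write_back)
    for i in range(n_writes):
        cache.write(i)
    cache.evict()
    return cache.memory_writes


print(count_memory_writes(write_back=False))   # write-through
print(count_memory_writes(write_back=True))    # write-back
```

In this model the write-through cache goes to main memory 100 times and the write-back cache only once. Note that the dirty flag is set on every write, even if the new value equals the old one, which mirrors the point above that the system knows only that a block was written to.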

Another aspect of cache is its organization. Without going into detail (that would take most of a chapter itself), I can generalize by saying there are four different types of cache organization.

The first kind is fully associative, which means that every entry in the cache has a slot in the "cache directory" to indicate where it came from in memory. Usually these are not individual bytes, but chunks of four bytes or more. Because each slot in the cache has a separate directory slot, any location in RAM can be placed anywhere in the cache. This is the simplest scheme but also the slowest because each cache directory entry must be searched until a match (if any) is found. Therefore, this kind of cache is often limited to just 4K.
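The directory search described above can be modeled in a few lines of Python. This is a software sketch only, with made-up names and addresses (fully_associative_lookup, the four directory entries); it simply counts how many directory entries have to be examined.

```python
def fully_associative_lookup(directory, address):
    """Search the cache directory for address.

    Returns (slot, comparisons) on a hit or (None, comparisons) on a miss.
    """
    comparisons = 0
    for slot, cached_address in enumerate(directory):
        comparisons += 1
        if cached_address == address:
            return slot, comparisons
    return None, comparisons


# Hypothetical directory: slot i holds data that came from directory[i].
directory = [0x1000, 0x2040, 0x0008, 0x30F0]

print(fully_associative_lookup(directory, 0x0008))   # hit in slot 2
print(fully_associative_lookup(directory, 0x9999))   # miss
```

A miss is the worst case: every directory entry is examined before the cache can report that the data is not there, so the cost of a lookup grows with the size of the cache.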

The second type of cache organization is direct-mapped or one-way set associative cache, which requires that only a single directory entry be searched. This speeds up access time considerably. The location in the cache is related to the location in memory and is usually based on blocks of memory equal to the size of the cache. For example, if the cache could hold 4K 32-bit (4-byte) entries, then the block with which each entry is associated is also 4K x 32 bits. The first 32 bits in each block are read into the first slot of the cache, the second 32 bits in each block are read into the second slot, and so on. The size of each entry, or line, usually ranges from 4 to 16 bytes.

There is a mechanism called a tag, which tells us which block this came from. Also, because of the very nature of this method, the cache cannot hold data from multiple blocks for the same offset. If, for example, slot 1 was already filled with the data from block 1 and a program wanted to read the data at the same location from block 2, the data in the cache would be overwritten. Therefore, the shortcoming in this scheme is that when data is read at intervals that are the size of these blocks, the cache is constantly overwritten. Keep in mind that this does not occur too often due to the principle of spatial locality.
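The mapping and the overwrite problem can be sketched as follows, using the 4K-entry, 4-byte example from the text. The names (split_address, DirectMappedCache) are hypothetical, and blocks and slots are numbered from 0 here, whereas the text counts from 1.

```python
ENTRY_SIZE = 4                        # bytes per entry (32 bits)
NUM_SLOTS = 4 * 1024                  # the cache holds 4K entries
BLOCK_BYTES = ENTRY_SIZE * NUM_SLOTS  # one block of memory = size of the cache


def split_address(addr):
    """Split a byte address into (tag, slot).

    The tag is the block number; the slot is the entry's offset in its block.
    """
    entry = addr // ENTRY_SIZE
    return entry // NUM_SLOTS, entry % NUM_SLOTS


class DirectMappedCache:
    def __init__(self):
        self.tags = {}    # slot -> tag of the block whose data is held there
        self.misses = 0

    def access(self, addr):
        tag, slot = split_address(addr)
        if self.tags.get(slot) != tag:
            self.misses += 1
            self.tags[slot] = tag   # overwrite whatever was in this slot


cache = DirectMappedCache()
for _ in range(5):
    cache.access(0)              # block 0, slot 0
    cache.access(BLOCK_BYTES)    # block 1, also slot 0

print(cache.misses)
```

Because both addresses map to slot 0, each access evicts the other's data and all 10 accesses miss. This is the "reading at intervals the size of a block" case described above.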

The third type of cache organization is an extension of the one-way set associative cache, called the two-way set associative. Here, there are two entries per slot. Again, data can end up in only a particular slot, but there are two places to go within that slot. Granted, the system is slowed a little because it has to look at the tags for both entries, but this scheme allows data at the same offset from multiple blocks to be in the cache at the same time. This is also extended to four-way set associative cache. In fact, the internal cache of the 486 (and of some Pentium models) is four-way set associative.
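A two-way version can be sketched the same way. The class name TwoWayCache is hypothetical, and the replacement rule used here (evict the least recently used of the two entries) is an assumption; the text does not say how the entry to replace is chosen.

```python
ENTRY_SIZE = 4         # bytes per entry (32 bits)
NUM_SLOTS = 4 * 1024   # number of slots
WAYS = 2               # entries per slot


class TwoWayCache:
    def __init__(self):
        self.slots = {}   # slot -> list of tags, least recently used first
        self.misses = 0

    def access(self, addr):
        entry = addr // ENTRY_SIZE
        tag, slot = entry // NUM_SLOTS, entry % NUM_SLOTS
        tags = self.slots.setdefault(slot, [])
        if tag in tags:
            tags.remove(tag)      # hit: will be re-added as most recent
        else:
            self.misses += 1
            if len(tags) == WAYS:
                tags.pop(0)       # assumed policy: evict least recently used
        tags.append(tag)


block_bytes = ENTRY_SIZE * NUM_SLOTS
cache = TwoWayCache()
for _ in range(5):
    cache.access(0)              # block 0, slot 0
    cache.access(block_bytes)    # block 1, also slot 0

print(cache.misses)
```

Alternating between the same offset in two blocks now costs only the 2 initial misses, because both blocks can occupy the slot at once. Cycling through three blocks at the same offset would overwhelm the two entries again, which is where a four-way cache helps.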

Although this is interesting (at least to me), you may be asking yourself, "Why is this memory stuff important to me as a system administrator?" First, knowing about the differences in RAM (main memory) can aid you in making decisions about your upgrade. Also, as I mentioned earlier, it may be necessary to set switches on the motherboard if you change memory configuration.

Knowledge about cache memory is important for the same reason: you may be the one who adjusts it. On many machines, the write policy can be adjusted through the CMOS. For example, on my machine, I have a choice of write-back, write-through, and write-dirty. Depending on the applications you are running, you may want to change the write policy to improve performance.


Copyright 2002-2009 by James Mohr. Licensed under modified GNU Free Documentation License (Portions of this material originally published by Prentice Hall, Pearson Education, Inc). See here for details. All rights reserved.
  