This article is about understanding the different hardware available on the market today for the x86 architecture. I have wanted to do this for a long time, to understand which processors, motherboards, and architectures are available.
I will start with Intel, as it is one of the largest manufacturers of x86 processors.
As of this writing, Intel has released its 7th-generation processors, code-named Kaby Lake. Before studying these processors, we need to understand what you find in a typical x86 Intel microprocessor. So this part of the article deals with the basics of x86 processors and the other hardware found in an x86 system.
On a typical older motherboard you will see this:
The main components in the above image are:
CPU: Executes instructions.
Northbridge: Transfers data between memory and the CPU over an interconnect called the front-side bus.
Southbridge: An interconnect (I/O hub) to which USB, PCI, and hard disks are connected.
The Northbridge is a hub that moves data between the Southbridge and RAM, and between RAM and the CPU. It is connected to the CPU through a bus (a set of wires) called the front-side bus, and it regulates traffic between the CPU and RAM. In some systems the video card is also connected to the Northbridge.
The Southbridge is an I/O hub to which USB, PCI, and hard disks are connected. The Northbridge has to carry data and instructions between the Southbridge on one side and RAM and the CPU on the other.
Data residing on a hard disk must at some point be loaded into RAM, moved from RAM to the CPU to be processed, written back to RAM, and then written back to the hard disk. The same applies to other devices attached to PCI slots, USB, and the graphics card. All the data flowing between these devices, RAM, and the CPU is controlled by the Northbridge, which makes it very important.
Also, if we look at the above diagram, the only way for data to reach the CPU is through the front-side bus (FSB), and there is only one. So the link between the Northbridge and the Southbridge, the FSB, and the connection between the Northbridge and RAM all carry massive amounts of data and become heavily congested.
So if you have a high-end graphics card in the graphics slot, which is connected to the Northbridge, and a USB pen drive, which is connected to the Southbridge, do not expect your USB data transfer to run at full speed. Understanding this architecture and its bottlenecks is very important.
Next comes the CPU. The CPU is typically one of the most expensive parts of a PC. It is where all the instructions are executed. What I want to concentrate on is its basic functionality. At a very fundamental level it has the following components:
1. Program counter: Contains the address of the next instruction to be executed.
2. Instruction decoder: Decodes the instruction: what it means, where its operands lie (in CPU registers or in memory, and if in memory, at which address), and which operation has to be performed to execute it (multiplication, addition, subtraction, division).
3. Arithmetic logic unit (ALU): Executes the decoded instruction. The result is written to a register or put on the bus to be stored in RAM.
4. General-purpose registers: The CPU also has general-purpose registers where it can store data temporarily. The number and size of these registers vary between architectures.
So let us say we have an instruction at some location in RAM (say address X) and the program counter currently points to address X. The following sequence occurs:
- The CPU fetches the instruction from address X.
- Once the instruction is fetched, it is decoded.
- Then it is executed.
- The result is put back on the bus to be stored in RAM.
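The fetch, decode, execute, and store sequence can be sketched as a small loop. This is only a toy model: the instruction format `(opcode, dest, src1, src2)`, the `run` function, and the dictionary standing in for RAM are all hypothetical and have nothing to do with real x86 encoding.

```python
# Toy machine: the instruction format and all names here are hypothetical.
def run(program, memory):
    """Run a list of (opcode, dest, src1, src2) instructions against memory."""
    alu = {
        "ADD": lambda a, b: a + b,
        "SUB": lambda a, b: a - b,
        "MUL": lambda a, b: a * b,
        "DIV": lambda a, b: a // b,
    }
    pc = 0  # program counter: address of the next instruction
    while pc < len(program):
        instruction = program[pc]                 # 1. fetch from address pc
        opcode, dest, src1, src2 = instruction    # 2. decode operation and operands
        result = alu[opcode](memory[src1], memory[src2])  # 3. execute in the ALU
        memory[dest] = result                     # 4. store the result back in RAM
        pc += 1                                   # point at the next instruction
    return memory


ram = {"a": 6, "b": 7, "c": 0, "d": 0}
run([("MUL", "c", "a", "b"), ("SUB", "d", "c", "a")], ram)
print(ram["c"], ram["d"])  # 42 36
```

Note that each step waits for the previous one to finish, which is exactly the idle time that pipelining tries to remove.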
While the CPU is fetching an instruction, its other parts, i.e. the instruction decoder and the ALU, are idle. To make effective use of the CPU there is a concept called pipelining: while the CPU is fetching instructions, it makes sure there are enough instructions for the instruction decoder to decode and enough for the ALU to execute.
Note: The above is a very simplified explanation of CPU pipelining. There is a lot more to it, and you can find more on the internet.
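To get a feel for why pipelining helps, here is a rough sketch that counts cycles. It assumes an idealized three-stage pipeline (fetch, decode, execute) where every stage takes exactly one cycle and nothing ever stalls; real pipelines have many more stages and do stall. The function names are made up for this illustration.

```python
STAGES = ["fetch", "decode", "execute"]


def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    # Each instruction passes through every stage before the next one starts.
    return n_instructions * n_stages


def pipelined_schedule(n_instructions, stages=STAGES):
    """Return one dict per cycle mapping stage name -> instruction index."""
    schedule = []
    total_cycles = n_instructions + len(stages) - 1
    for cycle in range(total_cycles):
        busy = {}
        for depth, stage in enumerate(stages):
            instr = cycle - depth  # instruction occupying this stage in this cycle
            if 0 <= instr < n_instructions:
                busy[stage] = instr
        schedule.append(busy)
    return schedule


schedule = pipelined_schedule(4)
for cycle, busy in enumerate(schedule):
    print(cycle, busy)
print(unpipelined_cycles(4), "cycles without pipelining,", len(schedule), "with")
```

Under these assumptions four instructions need 12 cycles one at a time but only 6 when the stages overlap, and from cycle 2 onward all three units are busy at once.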
Building on the basic explanation above, we will expand on the CPU:
- Since instructions are in memory (RAM), they need to be fetched from memory before they can be executed. This fetching happens via the FSB and the Northbridge, which do not run at the same speed as RAM. Fetching takes a lot of time, so most modern processors have caches where instructions fetched from memory are stored locally; when the CPU needs to execute them, it fetches them from the cache instead of from memory.
- The other reason to have caches is that RAM is very slow compared to the speed at which the CPU executes instructions. The time spent waiting for an instruction to arrive from RAM is long, which wastes power and slows the processor down.
- The fast storage inside the CPU is layered. First come the CPU registers (eax, rax, etc.), next the L1 cache, then the L2 cache, which is somewhat bigger than L1, and finally the L3 cache, which is bigger than L2 (typically a few megabytes).
- Instructions fetched from the cache are queued.
- From the queue, instructions are decoded.
- The decoded instructions are executed.
The basic need for a cache arises because the CPU must be fed data and instructions quickly, so it needs memory that is very fast to access, hence the cache. One more reason is that the time taken to execute an instruction is much less than the time taken to fetch it from RAM. So it is important to keep the fetch time as low as possible.
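A quick back-of-the-envelope calculation shows how much the cache matters. The sketch below weights the cache and RAM latencies by how often each is used. The latency numbers are hypothetical, picked only to make the arithmetic easy, and the model simply assumes a miss costs the RAM latency.

```python
def average_fetch_time(hit_rate, cache_ns, ram_ns):
    """Average time per fetch when a fraction hit_rate is served by the cache."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns


# Hypothetical latencies, chosen only for illustration.
CACHE_NS = 1.0
RAM_NS = 100.0

for hit_rate in (0.0, 0.5, 0.9, 0.99):
    print(hit_rate, average_fetch_time(hit_rate, CACHE_NS, RAM_NS))
```

With these made-up numbers, even a cache that serves 90% of the fetches brings the average from 100 ns down to roughly 11 ns.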
Some terms with regard to the CPU cache:
- Cache hit: The information the CPU requested is available in the cache.
- Cache miss: The information the CPU requested is not available in the cache and has to be fetched from RAM.
- Snoop: When the cache watches the bus (address lines) for transactions, it is said to snoop.
- Snarf: When the cache takes data from the data lines of the bus, it is said to have snarfed the data.
- Dirty data: Data that has been modified in the cache but not yet in main memory.
- Stale data: Data that has been modified in main memory but not in the cache.
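The terms hit, miss, dirty, and stale can be made concrete with a tiny model. `TinyCache` below is a hypothetical class that keeps cached copies in a dictionary with no size limit; it only counts hits and misses and lets us compare a cached copy with main memory.

```python
class TinyCache:
    """Unbounded toy cache in front of a dict that stands in for RAM."""

    def __init__(self, ram):
        self.ram = ram
        self.lines = {}   # address -> cached value
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            self.hits += 1                           # cache hit
        else:
            self.misses += 1                         # cache miss: go to RAM
            self.lines[address] = self.ram[address]
        return self.lines[address]

    def differs_from_ram(self, address):
        return self.lines[address] != self.ram[address]


main_memory = {0x10: "old"}
cache = TinyCache(main_memory)
cache.read(0x10)                     # first access is a miss
cache.read(0x10)                     # second access is a hit
print(cache.hits, cache.misses)      # 1 1

cache.lines[0x10] = "new"            # changed only in the cache: dirty data
print(cache.differs_from_ram(0x10))  # True
main_memory[0x10] = "new"            # main memory catches up
main_memory[0x10] = "newer"          # changed only in RAM: the cached copy is stale
print(cache.read(0x10))              # new
```

The last read returns the stale value, which is why a real cache has to snoop the bus for writes made by other devices.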
When a cache reads data, there are two ways it can do this:
- Look Aside
- Look Through
When a cache writes data, there are two ways it can do this:
- Write Back
- Write Through
Look aside: In a look-aside architecture, the cache sits alongside main memory (RAM) on the bus, so when the CPU wants to fetch some data, both the cache and RAM see the request at the same time. If the information is available in the cache it is a hit; otherwise it is a miss. The disadvantage is that while the bus is being used by another device to access memory, the CPU has no access to the cache and has to wait until the bus is freed.
Look through: The cache sits between the CPU and RAM and sees the request before RAM does. If the information is available in the cache it is a hit; otherwise it is a miss and the request is passed on to RAM. The disadvantage is that a miss takes longer, because RAM is accessed only after the cache has been checked.
Write back: When the processor needs to write something, it first writes to the cache. At this point the CPU can continue with other tasks; the cache updates main memory later.
Write through: In this method the processor writes to both the cache and main memory.
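The difference between the two write policies shows up in when main memory gets updated. The sketch below models both with hypothetical classes; the `flush` method is an assumption standing in for whatever moment the cache controller chooses to update main memory.

```python
class WriteThroughCache:
    def __init__(self, ram):
        self.ram = ram
        self.lines = {}

    def write(self, address, value):
        self.lines[address] = value
        self.ram[address] = value       # main memory is updated on every write


class WriteBackCache:
    def __init__(self, ram):
        self.ram = ram
        self.lines = {}
        self.dirty = set()              # addresses changed in cache but not in RAM

    def write(self, address, value):
        self.lines[address] = value
        self.dirty.add(address)         # the RAM update is deferred

    def flush(self):
        """Copy every dirty line to main memory; return how many were written."""
        written = len(self.dirty)
        for address in self.dirty:
            self.ram[address] = self.lines[address]
        self.dirty.clear()
        return written


ram_a, ram_b = {0: 0}, {0: 0}
wt, wb = WriteThroughCache(ram_a), WriteBackCache(ram_b)
for value in (1, 2, 3):
    wt.write(0, value)
    wb.write(0, value)
print(ram_a[0], ram_b[0])   # 3 0: write back has not touched RAM yet
wb.flush()
print(ram_b[0])             # 3
```

Three writes to the same address cost three RAM updates with write through but only one with write back, at the price of RAM holding old data until the flush.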
Components of cache:
There are three components to a cache, namely:
- SRAM: Static RAM is the memory block that holds the cached data.
- Tag RAM (TRAM): A small piece of SRAM that holds the main-memory addresses of the data currently stored in the cache SRAM.
- Cache controller: The brains behind the cache. It is responsible for:
  - Performing snoops and snarfs
  - Updating the SRAM and TRAM
  - Implementing the write policy
  - Determining whether a memory request is cacheable
  - Checking whether a request to the cache is a hit or a miss
Organization of cache:
To fully understand cache organization, two terms need to be understood first:
- cache page
- cache line
Main memory (RAM) is divided into equal pieces called cache pages. The size of a page depends on the size of the cache.
A cache page is broken into smaller pieces called cache lines. Each line can store 4 to 64 bytes. During a data transfer the whole line is read or written.
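As a small worked example, a memory address can be split into a page number, a line number within that page, and a byte offset within that line. The sizes below are assumptions chosen for illustration: a 1024-byte cache whose page size equals the cache size, and 16-byte lines.

```python
CACHE_SIZE = 1024   # bytes; assumed to be the cache page size as well
LINE_SIZE = 16      # bytes per cache line (anywhere from 4 to 64 is typical)


def split_address(address):
    """Return (page, line, offset) for a byte address in main memory."""
    page, within_page = divmod(address, CACHE_SIZE)
    line, offset = divmod(within_page, LINE_SIZE)
    return page, line, offset


print(split_address(0))      # (0, 0, 0)
print(split_address(1040))   # (1, 1, 0)
print(split_address(2087))   # (2, 2, 7)
```

Because a whole line is transferred at once, reading address 2087 would bring in all 16 bytes of line 2 in page 2, that is addresses 2080 through 2095.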
Methods of Cache organization:
- Fully associative: Any line in main memory can be stored at any line in the cache. In this method cache pages are not used, only lines.
  - Advantages: A memory location can be stored at any line in the cache.
  - Disadvantages: Searching through all the cache lines for a match is complex.
- Direct mapped: In this method main memory is divided into cache pages, and the size of each page is equal to the size of the cache. Line n of any page in main memory can only be stored in line n of the cache; for example, line 0 of page 1 goes to line 0 of the cache.
- Set associative: This is a combination of the fully associative and direct-mapped schemes. The SRAM part of the cache is divided into equal sections (2 or 4) called cache ways. The size of a cache page is equal to the size of a cache way, and each cache way is treated like a small direct-mapped cache.
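The practical difference between these schemes shows up when two memory lines compete for the same cache line. The sketch below is a simplified model, not how any particular processor does it: `SetAssociativeCache` is a hypothetical class, it tracks only which page each cached line came from, and it evicts the oldest entry when a set is full (a FIFO policy assumed here for simplicity). With `ways=1` it behaves as a direct-mapped cache, and with `ways` equal to the number of lines it behaves as a fully associative one.

```python
class SetAssociativeCache:
    """Toy cache model; ways=1 behaves as a direct-mapped cache."""

    def __init__(self, n_lines, ways):
        self.ways = ways
        self.lines_per_way = n_lines // ways
        # One list per line index, holding up to `ways` page tags, oldest first.
        self.sets = [[] for _ in range(self.lines_per_way)]
        self.hits = 0
        self.misses = 0

    def access(self, line_number):
        """line_number is the index of a line in main memory."""
        page, index = divmod(line_number, self.lines_per_way)
        tags = self.sets[index]
        if page in tags:
            self.hits += 1
            return "hit"
        self.misses += 1
        if len(tags) == self.ways:
            tags.pop(0)       # evict the oldest entry (FIFO, an assumption)
        tags.append(page)
        return "miss"


trace = [0, 8, 0, 8, 0, 8]    # two memory lines that map to the same cache line
direct = SetAssociativeCache(n_lines=8, ways=1)
two_way = SetAssociativeCache(n_lines=8, ways=2)
for line in trace:
    direct.access(line)
    two_way.access(line)
print(direct.hits, direct.misses)     # 0 6
print(two_way.hits, two_way.misses)   # 4 2
```

In the direct-mapped case memory lines 0 and 8 keep evicting each other, while the two-way cache can hold both at once, so only the first access to each one misses.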
In further articles I will delve more into cache implementation on processors, its operating modes, etc.