# Hardware Design I, Chap. 10: Design of a microprocessor

Computing Architecture Lab.

Hajime Shimada

E-mail: shimada@is.naist.jp


# Outline





- What is a microprocessor?
- Microprocessor from the sequential machine viewpoint
  - Microprocessor and von Neumann computer
  - Memory hierarchy
  - Instruction set architecture
- Microarchitecture of the microprocessor
  - Microarchitecture with sequential processing
  - Microarchitecture with pipelined processing




## What is a microprocessor?

- An LSI that processes data at the center of a computer
  - Also called a "processor"
- There are several types of microprocessors
  - Central Processing Unit (CPU)
  - Microcontroller
  - Graphics accelerator
  - Various other accelerators




# Central Processing Unit (CPU)

- The core of a von Neumann computer
  - Details are covered in later slides
- Sometimes the word "microprocessor" refers specifically to the CPU
- By combining a CPU with memory, disks, and I/O devices, we can build a PC or a server
- Examples: Intel Core i7, Fujitsu SPARC64 VII, AMD Opteron, ...










#### Microcontroller



- A processor used to control electrical devices
  - Optimized for that use
    - e.g. output pins with high current drive capability so that they can drive an LED directly
- Many of them form a complete computer on a single chip
  - The memory hierarchy is implemented on the chip
- Examples: Renesas H8, Atmel AT91, Zilog Z80, ...
  - Many companies provide them








#### Graphics accelerator

- Also called a Graphics Processing Unit (GPU)
- Implements a large number of ALUs to exploit parallelism
  - In graphics processing, each pixel can usually be processed independently
  - It is also used for highly parallel arithmetic
- Examples: NVIDIA GeForce, AMD (ATI) Radeon, ...








#### Other accelerators



- There are several processors that accelerate data processing which is not well suited to a CPU or GPU
  - Recently, however, GPUs have been moving into this area
- Usually, they implement many ALUs to provide high arithmetic performance
- Examples: ClearSpeed CSX, Ageia PhysX, ...








## Outline



- What is a microprocessor?
- Microprocessor from the sequential machine viewpoint
  - Microprocessor and von Neumann computer
  - Memory hierarchy
  - Instruction set architecture
- Microarchitecture of the microprocessor
  - Microarchitecture with sequential processing
  - Microarchitecture with pipelined processing




#### Microprocessor from sequential circuit viewpoint

- We can abstract a microprocessor as the following sequential machine
  - Inputs
    - Programs (= instructions)
    - Data for processing
  - Outputs: processed data
  - State: register (and register file)

[Figure: combinational logic circuit with inputs "Programs (= instructions)" and "Data for processing", output "Processed data", and a "Register (and register file)" driven by "Clock"]
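The sequential-machine abstraction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `run_sequential_machine` and `accumulator_logic` are hypothetical, and the toy logic (an 8-bit accumulator) stands in for the much larger combinational logic of a real processor.

```python
from typing import Callable, Iterable, List, Tuple

State = int
Input = int
Output = int


def run_sequential_machine(
    logic: Callable[[State, Input], Tuple[State, Output]],
    initial_state: State,
    inputs: Iterable[Input],
) -> Tuple[State, List[Output]]:
    """Clock the machine once per input and collect the outputs."""
    state = initial_state                  # contents of the register
    outputs = []
    for value in inputs:                   # one iteration = one clock edge
        state, out = logic(state, value)   # combinational logic circuit
        outputs.append(out)
    return state, outputs


def accumulator_logic(state: State, value: Input) -> Tuple[State, Output]:
    """Toy combinational logic: add the input to the register (8-bit wrap)."""
    next_state = (state + value) & 0xFF
    return next_state, next_state


final_state, outputs = run_sequential_machine(accumulator_logic, 0, [1, 2, 3])
print(final_state, outputs)
```

The register is the only thing that survives from one clock to the next; everything else is recomputed from the current state and input.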



# Advantages and disadvantages of the von Neumann computer

- Advantages
  - We don't have to modify the hardware to switch between different kinds of data processing
    - EDSAC (1949) is one of the earliest von Neumann computers
    - ENIAC had to be rewired whenever the processing changed
  - We can execute complicated processing by combining multiple instructions
- Disadvantages (von Neumann bottleneck)
  - Communication between processor and memory increases
  - Slow memory drags down processor performance
- How about non-von Neumann computers?
  - They remain in some specific uses (e.g. video codecs)
  - They are starting to regain attention with reconfigurable hardware (-> Chap. 9)






#### States in the microprocessor

- How do we define the states of the sequential machine in the processor?
  - Usually, we call them registers
- There are many types of registers
  - Special purpose registers (SPR)
    - Program counter (PC): holds the position of the instruction being executed
    - Flag register: records carry generation, overflow, ...
  - General purpose registers (GPR)
    - Used to hold data before/after processing (work as the fastest part of the memory hierarchy)
    - Also used for intermediate data during arithmetic
- The organization of registers differs between instruction set architectures

The relationship between GPRs and the memory hierarchy is shown in later slides.
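As a small illustration of what a flag register records, the sketch below adds two values and derives the carry and overflow flags. The 32-bit register width and the function name `add_with_flags` are assumptions made for this example; the slides do not fix either.

```python
WIDTH = 32                      # assumed register width
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def add_with_flags(a: int, b: int):
    """Add two WIDTH-bit values and return (result, carry, overflow)."""
    a &= MASK
    b &= MASK
    total = a + b
    result = total & MASK
    carry = total > MASK        # a bit was carried out of the top position
    # Signed overflow: both operands have a sign that the result does not.
    overflow = ((a ^ result) & (b ^ result) & SIGN) != 0
    return result, carry, overflow


print(add_with_flags(0xFFFFFFFF, 1))   # unsigned wrap-around sets carry
print(add_with_flags(0x7FFFFFFF, 1))   # largest positive + 1 sets overflow
```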

#### Inputs for the microprocessor

- The sequential machine in the processor has two kinds of inputs
  - Instructions: their format must be defined when we design the sequential machine
  - Data: their format does not have to be defined
- What is an instruction?
  - A series of bits: e.g. 0000000100001010100000000010000
  - Usually, we use assembly language to represent it
    - A programming language which has a one-to-one relationship to instructions
    - Basically, it describes operations between registers and main memory
    - e.g. add R8, R4, R5 (GPR #8 = GPR #4 + GPR #5)
    - <-> 0000000100001010100000000010000

How to define instructions efficiently is introduced in later slides




# GPR and memory hierarchy (1/2)

- In recent processors, the GPRs work as a part of the memory hierarchy, in front of main memory
  - First, the processor moves data from main memory to a register
  - The processor applies operations to the data in the register
  - After the operations, it writes the data back to main memory
- This organization effectively reduces the workload on main memory
  - Assuming that we apply multiple operations to the data
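The reduction in main-memory workload can be made concrete by counting accesses. The sketch below is a toy model, not a real processor: `CountingMemory` and both helper functions are hypothetical names, and the "operation" is simply incrementing a value ten times.

```python
class CountingMemory:
    """Main memory model that counts every read and write."""

    def __init__(self, size: int):
        self.cells = [0] * size
        self.accesses = 0

    def load(self, addr: int) -> int:
        self.accesses += 1
        return self.cells[addr]

    def store(self, addr: int, value: int) -> None:
        self.accesses += 1
        self.cells[addr] = value


def increment_in_memory(mem: CountingMemory, addr: int, times: int) -> None:
    """Every operation reads its operand from memory and writes the result back."""
    for _ in range(times):
        mem.store(addr, mem.load(addr) + 1)


def increment_in_register(mem: CountingMemory, addr: int, times: int) -> None:
    """Load once into a register, operate there, then write back once."""
    reg = mem.load(addr)
    for _ in range(times):
        reg += 1
    mem.store(addr, reg)


direct = CountingMemory(16)
increment_in_memory(direct, 0, 10)
via_register = CountingMemory(16)
increment_in_register(via_register, 0, 10)
print(direct.accesses, via_register.accesses)
```

Both versions leave the same value in memory, but the register version touches main memory only twice regardless of how many operations are applied.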



# GPR and memory hierarchy (2/2)

- We call this layered organization the "memory hierarchy"
  - It includes the disk
  - It also reduces the performance degradation caused by slow devices
- Recently, the number of levels in the hierarchy has been increasing because the speed difference between devices is increasing




## Instruction set architecture (ISA)

- To create a sequential machine, we have to define the format of its inputs and internal state
  - Internal state: represented by registers
  - Inputs: instructions
- We usually call this definition the Instruction Set Architecture (ISA)
  - It includes a systematic method for constructing instructions
- By defining the ISA carefully, you can reduce
  - The number of states (registers)
  - The amount of combinational logic




#### Instruction encoding

- An instruction is encoded into a chunk of binary according to the ISA definition
  - e.g. add R8, R4, R5 (GPR #8 = GPR #4 + GPR #5)
  - <-> 0000000100001010100000000010000
- In a typical encoding, each chunk (field) of bits is given a meaning








# Example of instruction encoding (3/3)

lw R8, 8(R4)

| Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-0        |
|------------|------------|------------|------------------|
| 100101     | 00100      | 01000      | 0000000000001000 |

- Operation: load the value at position (R4 + 8) in main memory into R8 (lw: load word)

bne R4, R5, -5

| Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-0        |
|------------|------------|------------|------------------|
| 000101     | 00100      | 00101      | 1111111111111011 |

- Operation: if R4 != R5, branch back 5 instructions (bne: branch not equal)
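The field layout in the two tables above can be reproduced by a small encoder. This is a sketch under assumptions: the function name `encode_i_type` and the parameter names `reg_a` and `reg_b` are made up for illustration, and the opcode value is simply copied from the bne example rather than taken from an ISA manual.

```python
def encode_i_type(opcode: int, reg_a: int, reg_b: int, immediate: int) -> str:
    """Pack four fields into a 32-bit word and return it as a bit string.

    Field layout (as in the tables above):
      bits 31-26 opcode, bits 25-21 reg_a, bits 20-16 reg_b, bits 15-0 immediate.
    """
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    if not (0 <= reg_a < 32 and 0 <= reg_b < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate must fit in 16 bits (two's complement)")
    # A negative immediate is stored in two's complement by masking to 16 bits.
    word = (opcode << 26) | (reg_a << 21) | (reg_b << 16) | (immediate & 0xFFFF)
    return format(word, "032b")


BNE_OPCODE = 0b000101  # opcode bits copied from the bne example

bits = encode_i_type(BNE_OPCODE, 4, 5, -5)
print(bits[:6], bits[6:11], bits[11:16], bits[16:])
```

Masking the immediate with `0xFFFF` is what turns -5 into the sixteen bits `1111111111111011` shown in the bne table.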




#### **Short Exercise**

- Let's translate the following assembly instruction into its binary notation
  - Refer to the R-type instruction notation in the slides

add R10, R13, R14




## Outline







- What is a microprocessor?
- Microprocessor from the sequential machine viewpoint
  - Microprocessor and von Neumann computer
  - Memory hierarchy
  - Instruction set architecture
- Microarchitecture of the microprocessor
  - Microarchitecture with sequential processing
  - Microarchitecture with pipelined processing




## What is microarchitecture?

- An implementation of a processor in hardware
- Several different microarchitectures can implement the same ISA
  - e.g. Intel Core i7, Intel Atom, AMD Phenom
  - They can execute the same programs (e.g. Windows) because the ISA is the same
- Usually, we choose a microarchitecture to suit the purpose of the computer
  - e.g. a low-power microarchitecture for a notebook PC




#### One organization of microprocessor (1/3)

- Combinational logic
  - ALU: executes add, sub, logical operations, shift, ...
  - Multiplexers: construct the data path according to the instruction and the values in registers
  - Adder after PC: increments the PC to point to the next instruction
  - Adder beside ALU: calculates the branch target for branch instructions
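The combinational blocks listed above can be modelled as pure functions, since their outputs depend only on their current inputs. The sketch below makes several assumptions that the slides do not state: a 32-bit data path, a PC counted in instructions rather than bytes, and a branch offset relative to the next instruction. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit data path


def alu(op: str, a: int, b: int) -> int:
    """Combinational ALU: the result depends only on the current inputs."""
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "and":
        return a & b & MASK32
    if op == "or":
        return (a | b) & MASK32
    if op == "sll":                       # shift left logical
        return (a << (b & 31)) & MASK32
    if op == "srl":                       # shift right logical
        return (a & MASK32) >> (b & 31)
    raise ValueError(f"unknown ALU operation: {op}")


def mux2(select: int, in0: int, in1: int) -> int:
    """2-to-1 multiplexer: route in0 or in1 to the output."""
    return in1 if select else in0


def next_pc(pc: int, branch_taken: bool, offset: int) -> int:
    """Adder after PC, adder for the branch target, and a mux to choose."""
    sequential = pc + 1                   # adder after PC
    target = sequential + offset          # adder beside ALU
    return mux2(int(branch_taken), sequential, target)


print(alu("add", 0xFFFFFFFF, 1), next_pc(10, True, -5))
```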







# How can we understand the operation of the preceding hardware?

- It seems hard to understand the operation of such a large piece of hardware -> True
- How can we understand it easily?
  - -> Decompose the hardware into 5 parts and understand the operation of each
- This 5-part decomposition is also important for operation
  - Operating the 5 parts sequentially: 5-phase processor
    - Finishes one instruction in 5 clock cycles
  - Operating the 5 parts simultaneously: 5-stage pipelined processor
    - Finishes one instruction every clock cycle (in the general case)
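The difference between the two modes of operation shows up in a simple cycle count. The sketch below assumes an ideal pipeline with no stalls, which corresponds to the "general case" mentioned above; the function names are hypothetical.

```python
PHASES = 5  # the hardware is decomposed into 5 parts


def sequential_cycles(num_instructions: int) -> int:
    """5-phase processor: each instruction takes all 5 cycles by itself."""
    return PHASES * num_instructions


def pipelined_cycles(num_instructions: int) -> int:
    """Ideal 5-stage pipeline: fill it once, then finish one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return PHASES + (num_instructions - 1)


for n in (1, 5, 100):
    print(n, sequential_cycles(n), pipelined_cycles(n))
```

For a single instruction both take 5 cycles; the benefit of pipelining only appears once many instructions overlap.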




#### Decomposition into 5 parts

1. Instruction fetch (IF)
2. Instruction decode (ID), register read
3. Execution (EX)
4. Memory access (MA)
5. Write back to register (WB), and commit
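To visualize how these five parts overlap when they operate simultaneously, the sketch below prints which stage each instruction occupies in each clock cycle. It assumes an ideal pipeline in which a new instruction enters every cycle and nothing stalls; `pipeline_chart` is a hypothetical helper name.

```python
STAGES = ["IF", "ID", "EX", "MA", "WB"]


def pipeline_chart(num_instructions: int) -> list:
    """Return one row per instruction; row[c] is the stage occupied in cycle c."""
    total_cycles = len(STAGES) + num_instructions - 1 if num_instructions else 0
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name       # instruction i is in stage s during cycle i + s
        rows.append(row)
    return rows


for i, row in enumerate(pipeline_chart(3)):
    print(f"inst {i}: " + " ".join(row))
```

Reading a column of the chart from top to bottom shows that, once the pipeline is full, every part of the hardware is busy with a different instruction.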



































