3.6 Computer Organization

Introduction to Computer Organization

Computer Organization bridges the gap between the abstract functionality of a computer system (as defined by its architecture) and its concrete physical implementation. It deals with how the various components of a computer—the Central Processing Unit (CPU), memory, and input/output systems—are structured and interconnected to execute instructions efficiently. This unit delves into the core of the CPU, exploring the design of the Control Unit, the structure and function of the Arithmetic Logic Unit (ALU) and registers, the ways instructions are formatted and accessed, and the architectural philosophies that define modern processors. Furthermore, it examines advanced techniques like pipelining that are employed to dramatically boost computational performance.


1. Control Unit: Hardwired vs. Microprogrammed Control

The Control Unit (CU) is the component of the CPU that generates the sequence of control signals necessary to fetch, decode, and execute each instruction in the instruction set.

1.1 Hardwired Control

  1. Principle: The control logic is implemented as a complex, dedicated finite state machine (FSM) using combinational circuits (gates, decoders, multiplexers) and sequential circuits (counters, flip-flops).

  2. Operation:

    • The instruction opcode and the current state of the system (from a state register/counter) are the inputs to a large combinational logic block.

    • This block's outputs are the control signals that govern all CPU operations (e.g., MemRead, ALUOp, RegWrite).

  3. Characteristics:

    • Speed: Very fast, as control signals are generated directly by hardware with minimal delay.

    • Flexibility: Inflexible. The control logic is fixed at manufacturing time. Modifying the instruction set requires redesigning and rewiring the hardware.

    • Complexity: Design complexity increases exponentially with the complexity of the instruction set. Difficult to design and debug.

    • Applications: Used in Reduced Instruction Set Computers (RISC) and high-performance processors where speed is paramount.
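
The toy Python sketch below illustrates the idea: a lookup table stands in for the combinational logic block, mapping (opcode, state) inputs directly to asserted control signals. The opcodes, states, and signal names are invented for illustration, not drawn from any real processor.

```python
# Hardwired control in miniature: a table standing in for the combinational
# logic that maps (opcode, current state) to a set of asserted control signals.
CONTROL_LOGIC = {
    ("LOAD",  "EXECUTE"): {"MemRead", "RegWrite", "ALUSrc"},
    ("STORE", "EXECUTE"): {"MemWrite", "ALUSrc"},
    ("ADD",   "EXECUTE"): {"RegWrite", "ALUOp_ADD"},
}

def control_signals(opcode: str, state: str) -> set:
    """Return the control signals asserted for this (opcode, state) input."""
    return CONTROL_LOGIC.get((opcode, state), set())

print(control_signals("LOAD", "EXECUTE"))   # e.g. {'MemRead', 'RegWrite', 'ALUSrc'}
```

In real hardware this table is realized directly in gates, which is why hardwired control is fast but cannot be changed after manufacture.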

1.2 Microprogrammed Control

  1. Principle: The control logic is implemented via software-like microprograms stored in a special, fast memory called Control Memory or Microprogram Memory. Each machine instruction is executed by running a sequence of microinstructions.

  2. Operation:

    • The instruction opcode serves as an initial address into the control memory, pointing to the start of a micro-routine for that instruction.

    • Each microinstruction contains bits that directly or indirectly represent the control signals for one CPU cycle.

    • A microprogram counter (µPC) sequences through the microinstructions.

  3. Characteristics:

    • Speed: Slower than hardwired due to the memory access overhead for fetching each microinstruction.

    • Flexibility: Highly flexible. The instruction set can be modified or extended by changing the microprogram in control memory, making it easier to debug and update.

    • Complexity: Design is simpler and more systematic. The control logic design shifts from hardware design to microprogramming.

    • Applications: Traditionally used in Complex Instruction Set Computers (CISC) like the Intel x86 family, where the instruction set is large and complex.
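
A minimal sketch of this execution model, assuming an invented control-memory layout and signal names: the opcode is mapped to a starting address, and each microinstruction supplies one cycle's control signals plus a link to the next.

```python
# Microprogrammed control in miniature: each machine instruction runs as a
# short sequence of microinstructions fetched from control memory.
CONTROL_MEMORY = [
    {"signals": {"MemRead"},               "next": 1},     # 0: LOAD, step 1
    {"signals": {"RegWrite"},              "next": None},  # 1: LOAD, step 2 (end)
    {"signals": {"ALUOp_ADD", "RegWrite"}, "next": None},  # 2: ADD (single step)
]

DISPATCH = {"LOAD": 0, "ADD": 2}   # opcode -> start of its micro-routine

def run_microroutine(opcode: str):
    upc = DISPATCH[opcode]                  # map opcode to a control-memory address
    while upc is not None:
        microinstruction = CONTROL_MEMORY[upc]
        print(f"uPC={upc}: assert {microinstruction['signals']}")
        upc = microinstruction["next"]      # sequence to the next microinstruction

run_microroutine("LOAD")
```

Changing the instruction set here means editing CONTROL_MEMORY rather than rewiring gates, which is the essence of the flexibility advantage.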

2. Control Memory and Addressing Sequencing

  1. Control Memory: A high-speed read-only memory (ROM) or writable control store (WCS) that holds the microprogram. Each word in this memory is a microinstruction.

  2. Microinstruction Format: Contains two main parts:

    • Control Word: A set of bits that activate specific control lines (e.g., ALU source select, Register file write enable).

    • Address Field(s): Information used to determine the address of the next microinstruction to execute (next µ-address).

  3. Addressing Sequencing: The process of determining the next microinstruction address. Common methods include:

    • Incrementing: The µPC is simply incremented (for sequential microinstructions).

    • Branching: Based on conditions (e.g., an ALU flag like Zero or Carry), the address field in the current microinstruction is loaded into the µPC.

    • Mapping: The machine instruction opcode is transformed (mapped) into the starting address of its corresponding micro-routine in control memory.

    • Subroutines: A microinstruction can call a micro-subroutine (e.g., for operand fetch), saving the return address in a stack.
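
The sketch below gathers these four methods into a single next-address function, assuming each microinstruction carries a "seq" field naming the method plus whatever target, condition, or opcode it needs; all field names are illustrative.

```python
# Microsequencer next-address selection: one branch per sequencing method.
def next_address(upc, microinstruction, opcode_map, flags, return_stack):
    seq = microinstruction["seq"]
    if seq == "INC":          # incrementing: sequential microinstructions
        return upc + 1
    if seq == "BRANCH":       # conditional branch on an ALU flag (e.g. Zero)
        taken = flags.get(microinstruction["condition"], False)
        return microinstruction["target"] if taken else upc + 1
    if seq == "MAP":          # mapping: opcode -> start of its micro-routine
        return opcode_map[microinstruction["opcode"]]
    if seq == "CALL":         # micro-subroutine call: save the return address
        return_stack.append(upc + 1)
        return microinstruction["target"]
    if seq == "RETURN":       # return from a micro-subroutine
        return return_stack.pop()
    raise ValueError(f"unknown sequencing method: {seq}")
```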

3. Microinstruction Format

Common formats organize the bits within a microinstruction for efficiency and clarity:

  1. Horizontal Microcode:

    • Concept: Each control signal in the CPU has a dedicated bit (or field) in the microinstruction. A 1 in a bit position turns the corresponding control signal ON.

    • Characteristics:

      • Long Word Length: Microinstructions can be very wide (many bits).

      • High Degree of Parallelism: Many control signals can be activated simultaneously in a single cycle.

      • Fast Execution: Direct control, minimal decoding.

  2. Vertical Microcode:

    • Concept: The microinstruction contains coded fields that must be decoded (by a small decoder) to generate the actual control signals. It resembles a machine language for the control unit.

    • Characteristics:

      • Short Word Length: Microinstructions are more compact.

      • Limited Parallelism: Fewer actions can be specified per microinstruction.

      • Slower Execution: Requires an extra decoding step.

  3. Nanocode: A two-level approach in which compact vertical microinstructions select wide horizontal control words (nanoinstructions) stored in a separate nano-memory, combining the compactness of vertical microcode with the parallelism of horizontal microcode.
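
The contrast can be made concrete with a small sketch, assuming an invented six-signal CPU: horizontal microcode dedicates one bit per signal, while vertical microcode packs an encoded field that must pass through a decoder.

```python
# One bit per control signal (horizontal) vs. an encoded field (vertical).
SIGNALS = ["MemRead", "MemWrite", "RegWrite", "ALUOp_ADD", "ALUOp_SUB", "PCInc"]

# Horizontal: wide word, bit i drives SIGNALS[i] directly, no decoding.
horizontal = 0b100101     # asserts MemRead, RegWrite, PCInc

def asserted(word: int) -> set:
    return {name for i, name in enumerate(SIGNALS) if (word >> i) & 1}

print(asserted(horizontal))   # {'MemRead', 'RegWrite', 'PCInc'}

# Vertical: a compact 2-bit field selects one of a few mutually exclusive
# memory actions; a decoder expands it into signals (the extra, slower step).
MEM_FIELD = {0b00: set(), 0b01: {"MemRead"}, 0b10: {"MemWrite"}}

def decode_vertical(mem_field_bits: int) -> set:
    return MEM_FIELD[mem_field_bits]

print(decode_vertical(0b01))  # {'MemRead'}
```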


4. CPU Structure and Function

4.1 Arithmetic Logic Unit (ALU)

  1. Function: The digital circuit that performs arithmetic (addition, subtraction) and logical (AND, OR, XOR, NOT) operations on data.

  2. Inputs: Two operands (typically from registers or immediate values) and a set of control lines (ALUOp) specifying the operation to perform.

  3. Outputs: The result of the operation and a set of status flags (Condition Codes):

    • Zero (Z): Set if result is zero.

    • Carry (C): Set if an arithmetic operation generates a carry out of the most significant bit.

    • Overflow (V): Set if the signed result is too large to be represented.

    • Sign/Negative (N): Set if result is negative (MSB = 1).
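
A minimal sketch of an ALU that produces both a result and these four flags, assuming an 8-bit data width and illustrative operation names:

```python
# Toy ALU: computes a masked result plus the Z, C, N, V condition flags.
def alu(op: str, a: int, b: int, width: int = 8):
    mask = (1 << width) - 1
    if op == "ADD":
        raw = a + b
    elif op == "SUB":
        raw = a - b
    elif op == "AND":
        raw = a & b
    else:
        raise ValueError(op)
    result = raw & mask
    flags = {
        "Z": result == 0,                        # Zero
        "C": raw > mask or raw < 0,              # Carry/borrow out of the MSB
        "N": bool((result >> (width - 1)) & 1),  # Sign (MSB of result)
    }
    # Signed overflow: for ADD, same-sign operands with a different-sign result;
    # for SUB, different-sign operands with a result sign unlike the first.
    sa, sb, sr = (a >> (width - 1)) & 1, (b >> (width - 1)) & 1, (result >> (width - 1)) & 1
    if op == "ADD":
        flags["V"] = sa == sb and sr != sa
    elif op == "SUB":
        flags["V"] = sa != sb and sr != sa
    else:
        flags["V"] = False
    return result, flags

print(alu("ADD", 0x7F, 0x01))   # result 128; N and V set: signed overflow
```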

4.2 Register File

  1. Function: A small, very fast memory internal to the CPU used to store temporary data, addresses, and status information during program execution.

  2. Structure: An array of registers (e.g., 32 registers of 32 bits each). Accessed via register numbers (addresses).

  3. Key Registers:

    • General-Purpose Registers (GPRs): Hold operands and results for instructions.

    • Program Counter (PC): Holds the memory address of the next instruction to be fetched.

    • Instruction Register (IR): Holds the currently executing instruction.

    • Memory Address Register (MAR): Holds the address of a memory location to be read from or written to.

    • Memory Buffer/Data Register (MBR/MDR): Holds the data to be written to memory or the data just read from memory.

    • Status/Flags Register (PSW): Contains the ALU condition code flags.
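
The sketch below models a register file with two read ports and one write port, as found in typical RISC datapaths; the sizes and the hardwired-zero convention (as in MIPS and RISC-V) are illustrative choices, not requirements.

```python
# Toy register file: two read ports, one write port.
class RegisterFile:
    def __init__(self, count: int = 32, width: int = 32):
        self.regs = [0] * count
        self.mask = (1 << width) - 1

    def read(self, rs1: int, rs2: int):
        """Two read ports: fetch both source operands in one cycle."""
        return self.regs[rs1], self.regs[rs2]

    def write(self, rd: int, value: int):
        """One write port; register 0 stays hardwired to zero here."""
        if rd != 0:
            self.regs[rd] = value & self.mask

rf = RegisterFile()
rf.write(5, 42)
print(rf.read(5, 0))   # (42, 0)
```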

5. Instruction Formats and Addressing Modes

5.1 Instruction Format

Defines the layout of bits within a machine instruction. Common elements include:

  • Opcode: Specifies the operation to be performed.

  • Operand References: Specify the location of source operands and the destination for the result. Can be register numbers, memory addresses, or immediate data. Common formats include Three-address, Two-address, One-address (accumulator-based), and Zero-address (stack-based) instructions. For example, C = A + B becomes ADD C, A, B (three-address); MOVE C, A then ADD C, B (two-address); LOAD A, ADD B, STORE C (one-address); or PUSH A, PUSH B, ADD, POP C (zero-address).

5.2 Addressing Modes

Specify how the CPU calculates the effective address (EA) of an operand.

  1. Immediate Addressing:

    • Operand Value: The operand itself is contained within the instruction.

    • EA = The operand field of the instruction itself; no operand fetch from memory is needed.

    • Advantage: Fast, no memory reference beyond instruction fetch.

    • Use: For constants.

  2. Direct (Absolute) Addressing:

    • Operand Value: In memory.

    • EA = The address field of the instruction.

    • Disadvantage: Limited address space (size of address field).

  3. Indirect Addressing:

    • Operand Value: In memory.

    • EA = The contents of the memory location (or register) whose address is given in the instruction. (The instruction points to a pointer).

    • Advantage: Flexibility (easy to change pointer value).

    • Disadvantage: Requires two memory accesses (one for pointer, one for data).

  4. Register Addressing:

    • Operand Value: In a CPU register.

    • EA = Register number.

    • Advantage: Very fast, short instruction format.

  5. Register Indirect Addressing:

    • Operand Value: In memory.

    • EA = Contents of a specified register.

    • Advantage: Effective address can be changed easily.

  6. Indexed Addressing:

    • Operand Value: In memory.

    • EA = Contents of a base register (or index register) + a constant displacement (given in instruction).

    • Use: Ideal for accessing arrays (EA = base address + index).

  7. Relative Addressing:

    • A form of indexed addressing where the PC is used as the base register.

    • EA = PC + Displacement.

    • Use: For PC-relative branching, making code position-independent.
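
The sketch below computes the operand for each mode against a toy memory and register state; every address and value is invented for illustration.

```python
# Toy state: memory maps addresses to values, regs holds R1 and the PC.
memory = {100: 77, 77: 5, 200: 9}
regs = {"R1": 77, "PC": 196}

def operand(mode: str, field: int = 0, reg: str = "R1"):
    if mode == "immediate":          # operand is in the instruction itself
        return field
    if mode == "direct":             # EA = address field
        return memory[field]
    if mode == "indirect":           # EA = contents of the pointed-to location
        return memory[memory[field]]
    if mode == "register":           # operand is in a CPU register
        return regs[reg]
    if mode == "register_indirect":  # EA = contents of the register
        return memory[regs[reg]]
    if mode == "indexed":            # EA = base register + displacement
        return memory[regs[reg] + field]
    if mode == "relative":           # EA = PC + displacement
        return memory[regs["PC"] + field]

print(operand("direct", 100))     # 77
print(operand("indirect", 100))   # memory[77] -> 5
print(operand("indexed", 123))    # memory[77 + 123] -> 9
print(operand("relative", 4))     # memory[196 + 4] -> 9
```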

6. Data Transfer and Manipulation Instructions

  1. Data Transfer Instructions:

    • Move data between memory, registers, and I/O ports.

    • Examples: LOAD (memory to register), STORE (register to memory), MOVE (register to register), PUSH/POP (stack operations), IN/OUT (I/O).

  2. Data Manipulation Instructions:

    • Arithmetic Instructions: ADD, SUBTRACT, MULTIPLY, DIVIDE, INCREMENT, DECREMENT.

    • Logical and Bit Manipulation Instructions: AND, OR, XOR, NOT, SHIFT (left/right), ROTATE.

    • Comparison Instructions: COMPARE (typically subtracts operands, sets flags, but discards result).
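
As an illustration of the last point, COMPARE can be modeled with the alu() sketch from section 4.1: subtract, keep the flags, discard the numeric result.

```python
# COMPARE = subtract, set flags, discard the result (uses alu() from above).
def compare(a: int, b: int) -> dict:
    _result, flags = alu("SUB", a, b)   # numeric result is thrown away
    return flags                        # Z set when a == b; signed a < b when N != V

print(compare(10, 10)["Z"])   # True
```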


7. Processor Architecture: RISC vs. CISC

This is a fundamental design philosophy dictating the complexity of the instruction set.

| Feature | CISC (Complex Instruction Set Computer) | RISC (Reduced Instruction Set Computer) |
| --- | --- | --- |
| Instruction Set | Large, complex, variable-length instructions. Many perform multiple operations. | Small, simple, fixed-length instructions. Each performs a single, basic operation. |
| Addressing Modes | Many and complex. | Few and simple (primarily register and immediate). |
| Control Unit | Typically microprogrammed (for flexibility). | Typically hardwired (for speed). |
| Registers | Fewer general-purpose registers. | Larger set of general-purpose registers (minimizes memory accesses). |
| Clock Cycles/Instr. | Variable (1 to many). Many instructions access memory directly. | Typically 1 cycle per instruction (due to pipelining). Uses LOAD/STORE architecture. |
| Code Density | High (complex instructions do more). | Lower (more simple instructions needed). |
| Compiler Role | Simpler compilers; hardware handles complexity. | More sophisticated compilers optimize instruction sequences. |
| Examples | Intel x86, Motorola 68000. | ARM, MIPS, SPARC, RISC-V. |
| Philosophy | Hardware-centric: move complexity to hardware to simplify software. | Software/Compiler-centric: simplify hardware to run fast; let software/compiler handle complexity. |


8. Performance Enhancement Techniques

8.1 Pipelining

  1. Concept: An implementation technique where multiple instructions are overlapped in execution. The instruction execution process is divided into discrete stages (e.g., Fetch, Decode, Execute, Memory, Write-back). Each stage works on a different instruction in each clock cycle, like an assembly line.

  2. Ideal Speedup: For a k-stage pipeline, n instructions complete in k + (n - 1) cycles instead of n × k, so the ideal speedup approaches a factor of k as n grows large (see the worked sketch after this list).

  3. Pipeline Hazards (Limitations):

    • Structural Hazards: Occur when two instructions need the same hardware resource simultaneously (e.g., memory for both data and instruction fetch). Solved by resource duplication.

    • Data Hazards: Occur when an instruction depends on the result of a previous instruction that is still in the pipeline. Solved by forwarding/bypassing (sending results early) or inserting pipeline stalls/bubbles.

    • Control Hazards: Occur due to branches. The pipeline may fetch the wrong instructions before the branch direction is known. Solved by branch prediction, delayed branching, or speculative execution.
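
The worked sketch below puts numbers to the speedup claim and to the cost of hazard-induced stalls; the stage count and instruction counts are illustrative.

```python
# A k-stage pipeline finishes n instructions in k + (n - 1) cycles,
# versus n * k cycles without pipelining.
def pipeline_speedup(k: int, n: int) -> float:
    return (n * k) / (k + n - 1)

print(pipeline_speedup(5, 5))          # ~2.78: short bursts see far less than k
print(pipeline_speedup(5, 1_000_000))  # ~5.0: approaches k for long streams

# Each hazard-induced bubble adds one cycle to the total.
def cycles_with_stalls(k: int, n: int, stalls: int) -> int:
    return k + (n - 1) + stalls

print(cycles_with_stalls(5, 100, 20))  # 124 cycles instead of the ideal 104
```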

8.2 Parallel Processing Concepts

  1. Definition: The use of multiple processing elements (cores, processors) working concurrently to solve a computational problem.

  2. Flynn's Taxonomy classifies computer architectures based on instruction and data streams:

    • SISD (Single Instruction, Single Data): Traditional sequential processor.

    • SIMD (Single Instruction, Multiple Data): A single instruction operates on multiple data points simultaneously (e.g., vector processors, GPU cores). Good for data-parallel tasks (see the NumPy sketch after this list).

    • MISD (Multiple Instruction, Single Data): Rarely used.

    • MIMD (Multiple Instruction, Multiple Data): Multiple processors execute different instructions on different data (e.g., multi-core CPUs, computer clusters). Most common form of parallel processing.

  3. Levels of Parallelism:

    • Instruction-Level Parallelism (ILP): Exploited by pipelining and superscalar architectures (multiple execution units).

    • Thread-Level Parallelism (TLP): Multiple threads of execution (e.g., in a multi-core CPU).

    • Task-Level/Process-Level Parallelism: Independent processes running on different processors.
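
The SIMD idea from Flynn's taxonomy can be sketched with NumPy as a stand-in (true SIMD happens in hardware): one logical operation is applied across a whole array at once, versus an SISD-style element-by-element loop.

```python
import numpy as np

a = np.arange(1_000, dtype=np.float64)
b = np.ones(1_000, dtype=np.float64)

# SISD style: one instruction handles one data element per step.
sisd = [x + y for x, y in zip(a, b)]

# SIMD style: one (logical) instruction over all elements; NumPy dispatches
# to vectorized machine code internally.
simd = a + b

assert np.allclose(sisd, simd)
```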

Conclusion: Computer Organization reveals the intricate machinery that executes software. The trade-off between hardwired and microprogrammed control defines adaptability versus speed. The choice of instruction format and addressing modes balances programming convenience with hardware efficiency. The RISC/CISC dichotomy represents two enduring philosophies for achieving performance. Finally, techniques like pipelining and parallel processing are the direct hardware responses to the relentless demand for faster computation, transforming a simple sequential model into a highly concurrent engine of modern computing.
