Step 1: Instruction fetch

We have the following operations and control signals:

Step 2: Instruction Decode and Register Fetch

Since we don’t yet know which instruction will be executed, we can only do work that doesn’t depend on it:

These are optimistic optimizations: they cause no harm if they turn out to be useless, but may save time once we know the instruction.
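As an illustration of this kind of optimistic work, here is a minimal sketch. It assumes a MIPS-style 32-bit encoding (register fields at bits 25-21 and 20-16, a 16-bit immediate) and that the PC was already incremented by 4 in step 1. None of this is stated in these notes, and the names `decode_step` and `sign_extend16` are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_step(instruction, pc, registers):
    """Do the work that is safe for every instruction type (assumed MIPS-style layout).

    Returns the two register values and a speculative branch target.
    If the instruction turns out not to be a branch, the target is simply ignored.
    """
    rs = (instruction >> 21) & 0x1F   # first source register index (assumed field position)
    rt = (instruction >> 16) & 0x1F   # second source register index (assumed field position)
    imm = sign_extend16(instruction)  # 16-bit immediate, sign-extended

    a = registers[rs]                 # read both registers before knowing if they are needed
    b = registers[rt]
    branch_target = (pc + (imm << 2)) & 0xFFFFFFFF  # word offset relative to the incremented PC
    return a, b, branch_target


# Usage: a branch-like word with rs=1, rt=2, offset=3, fetched from address 0x100.
regs = [0] * 32
regs[1], regs[2] = 10, 20
word = (4 << 26) | (1 << 21) | (2 << 16) | 3
print(decode_step(word, 0x104, regs))
```

Reading two registers and computing a branch target costs nothing extra here, because the register file and the ALU would otherwise sit idle during this cycle.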

The ALU operates on the operands that were prepared in step 2, which can either be:

A load/store instruction accesses memory, or an arithmetic instruction writes its result to the register file

This step takes the data read from memory in the previous step and writes it to the register file.

The FSM has two steps that are common to all instruction types:

After that, we switch to another FSM, depending on the instruction type identified by the decoding step.

We either read data from memory or write data to it.

If we are writing, we just need to write the register value read at step 2 into memory.
If we are reading, we need
- To load the word from memory
- And then write it to a register

Therefore, to complete the FSM, a store needs 4 cycles and a load needs 5 cycles.
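The load and store paths can be sketched as follows. This is a simplified behavioural model, not a hardware description: memory is a plain dictionary, and the function name `run_memory_instruction` and its parameters are hypothetical.

```python
def run_memory_instruction(kind, base, offset, rt, registers, memory):
    """Walk a load or store through the FSM and return the number of cycles used."""
    cycles = 2                           # steps 1 and 2: fetch and decode, shared by all types

    address = registers[base] + offset   # step 3: the ALU computes the memory address
    cycles += 1

    if kind == "store":
        memory[address] = registers[rt]  # step 4: write the register value into memory
        cycles += 1
    elif kind == "load":
        data = memory[address]           # step 4: read the word from memory
        cycles += 1
        registers[rt] = data             # step 5: write the word to the register file
        cycles += 1
    else:
        raise ValueError(f"not a memory instruction: {kind}")
    return cycles


# Usage: store register 2 at address regs[1] + 8, then load it back into register 3.
regs = [0] * 32
regs[1], regs[2] = 100, 42
mem = {}
print(run_memory_instruction("store", 1, 8, 2, regs, mem))  # 4
print(run_memory_instruction("load", 1, 8, 3, regs, mem))   # 5
```

The extra cycle for a load comes from the register write, which can only happen after the word has arrived from memory.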

Here, we just have to write the ALU’s result to the register file.

Therefore this FSM requires only 4 cycles.

If the instruction is a branch, we can jump to the target address computed in step 2. A branch therefore completes in only 3 cycles.
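The whole FSM can be summarised as a next-state function. The sketch below uses hypothetical state names and instruction classes (`"load"`, `"store"`, `"arith"`, `"branch"`), which are assumptions for illustration. It counts the states visited before returning to fetch.

```python
def next_state(state, kind):
    """Return the state that follows `state` for an instruction of class `kind`."""
    if state == "fetch":
        return "decode"                  # common step 1 -> common step 2
    if state == "decode":
        # After decoding, the FSM branches on the instruction class.
        return {
            "load": "mem_addr",
            "store": "mem_addr",
            "arith": "execute",
            "branch": "branch",
        }[kind]
    if state == "mem_addr":
        return "mem_read" if kind == "load" else "mem_write"
    if state == "mem_read":
        return "mem_write_back"          # loaded word still has to reach the register file
    if state == "execute":
        return "alu_write_back"          # ALU result is written to the register file
    return "fetch"                       # every final state returns to fetch


def cycles_for(kind):
    """Count the cycles one instruction spends in the FSM (one state per cycle)."""
    state, cycles = "fetch", 0
    while True:
        cycles += 1
        state = next_state(state, kind)
        if state == "fetch":
            return cycles


for kind in ("load", "store", "arith", "branch"):
    print(kind, cycles_for(kind))
```

Under these assumptions the counts match the ones derived above: 5 for a load, 4 for a store, 4 for an arithmetic instruction, and 3 for a branch.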