Monday, June 6, 2022

Introduction to Processor and x86-64 Assembly

(This blog post is based on my study notes.)

Processor

A computer broadly consists of a CPU, main memory, and I/O devices, all of which communicate through a system bus.

A CPU broadly consists of an ALU (Arithmetic Logic Unit), registers, and a control unit, all of which communicate through an internal bus.

computer top-level structure 

(Image credit: William Stallings's Computer Organization and Architecture book)

von Neumann architecture

Most computers today follow the von Neumann architecture, which was devised in the 1940s. It describes the design of a digital computer with the following components:

  • A processing unit that contains an arithmetic logic unit and processor registers
  • A control unit that contains an instruction register and program counter
  • Memory that stores data and instructions
  • External mass storage
  • Input and output mechanisms

The von Neumann architecture is a stored-program model in which data and instructions are stored in the same memory. Other stored-program designs, such as the Harvard architecture, keep instructions and data in separate memories.

Processor

The basic function of a processor is to fetch instructions stored in memory, one at a time, and execute them.
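The fetch-and-execute cycle can be sketched as a small Python loop. This is only a toy model: the three-instruction machine (LOAD, ADD, HALT), the single accumulator, and the function name run are all made up for illustration and do not correspond to any real instruction set.

```python
def run(memory):
    """Toy fetch-execute loop. The instruction names are hypothetical."""
    pc = 0    # program counter: index of the next instruction in memory
    acc = 0   # a single accumulator register
    while True:
        opcode, operand = memory[pc]   # fetch one instruction
        pc += 1                        # point at the following instruction
        if opcode == "LOAD":           # execute the fetched instruction
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


program = [("LOAD", 2), ("ADD", 3), ("HALT", None)]
result = run(program)
print(result)
```

A real processor does the same thing in hardware: the program counter selects the next instruction, the control unit decodes it, and the ALU and registers carry it out.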

The processor architecture that the manufacturers adopt can be categorized into two types: CISC and RISC.

CISC (Complex Instruction Set Computer)

CISC processors can perform several low-level operations with a single instruction. They rely on more complex hardware to ease the work of the compiler. For example, the ENTER instruction in x86 assembly (with a nesting level of 0) is equivalent to the following three x86 instructions:

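PUSH EBP                            ; save the caller's frame pointer
MOV EBP, ESP                        ; start the new frame at the current top of the stack
SUB ESP, space_for_local_variables  ; reserve stack space for the local variables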

(Here EBP is the frame pointer and ESP is the stack pointer; both will be covered later.)

A function in x86 assembly typically starts with these three lines, and they can be replaced by a single ENTER instruction. This is how CISC processors use more complex hardware to offer an instruction set intended to make software implementation easier.
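To make the equivalence concrete, here is a rough Python model of the register and stack changes. It assumes a 32-bit stack that grows toward lower addresses and a nesting level of 0 for ENTER. The dictionaries regs and stack and the function names are illustrative only.

```python
def three_step_prologue(regs, stack, local_bytes):
    """Models PUSH EBP / MOV EBP, ESP / SUB ESP, local_bytes."""
    regs["esp"] -= 4                   # PUSH EBP: make room on the stack...
    stack[regs["esp"]] = regs["ebp"]   # ...and save the caller's frame pointer
    regs["ebp"] = regs["esp"]          # MOV EBP, ESP
    regs["esp"] -= local_bytes         # SUB ESP, local_bytes


def enter(regs, stack, local_bytes):
    """Models ENTER local_bytes, 0 as one step with the same net effect."""
    frame = regs["esp"] - 4
    stack[frame] = regs["ebp"]
    regs["ebp"] = frame
    regs["esp"] = frame - local_bytes


regs_a, stack_a = {"esp": 0x1000, "ebp": 0x2000}, {}
regs_b, stack_b = {"esp": 0x1000, "ebp": 0x2000}, {}
three_step_prologue(regs_a, stack_a, 16)
enter(regs_b, stack_b, 16)
print(regs_a == regs_b, stack_a == stack_b)
```

Both versions leave the saved frame pointer at the same stack slot and the two registers with the same values.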

As mentioned, Intel x86 is one example of a CISC processor family. Depending on how they are counted, x86 has more than 1,000 instructions in its instruction set.

x86 Instruction Listing: https://en.wikipedia.org/wiki/X86_instruction_listings

RISC (Reduced Instruction Set Computer)

In contrast to CISC, RISC was designed so that each instruction performs one simple function, and several instructions are combined to accomplish a task. Because the instructions are simpler, a programmer (or compiler) may have to supply more of them to get the same work done. In CISC, a single instruction may perform several smaller underlying operations, so it can take more CPU cycles and consume more power. The RISC architecture tries to address these issues.

Examples of RISC architectures include ARM and MIPS.

Although RISC and CISC are separate in theory, in practice RISC processors are growing more complex to meet new technology requirements, while CISC processors are becoming simpler to compete with RISC processors on performance and energy efficiency. So the line separating RISC and CISC processors is thin, but the underlying concepts they are based on remain distinct.

Instruction Set Architecture (ISA)

The ISA specifies the syntax and semantics of the assembly language. That is, the ISA for a processor defines the registers, the instructions (with their syntax and semantics), the data types, and the addressing modes.

Register-memory and Load-store architectures

Most CISC processors have a register-memory architecture, which is one type of Instruction Set Architecture (ISA). An example makes this clearer: the ADD instruction in x86 can take operands that reside either in registers or in memory (although at most one of the two operands may be a memory operand).

In contrast, the load-store architecture separates memory instructions from ALU instructions. That means an ADD instruction takes two operands that must be present in registers, not in memory. Data is moved to and from memory using separate load and store instructions. This architecture is common among RISC processors.
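The difference can be sketched in Python with a dictionary of registers and a dictionary standing in for memory. The function names, register names, and the returned instruction counts are illustrative assumptions, not real instruction encodings.

```python
def add_register_memory(regs, memory, dst_reg, src_addr):
    """Register-memory style: one ADD reads its source straight from memory."""
    regs[dst_reg] += memory[src_addr]   # add dst_reg, [src_addr]
    return 1                            # number of instructions issued


def add_load_store(regs, memory, dst_reg, src_addr, tmp_reg):
    """Load-store style: memory is touched only by an explicit load."""
    regs[tmp_reg] = memory[src_addr]    # load tmp_reg, [src_addr]
    regs[dst_reg] += regs[tmp_reg]      # add dst_reg, tmp_reg
    return 2                            # number of instructions issued


memory = {0x100: 40}
cisc_regs = {"r0": 2}
risc_regs = {"r0": 2, "r1": 0}
cisc_count = add_register_memory(cisc_regs, memory, "r0", 0x100)
risc_count = add_load_store(risc_regs, memory, "r0", 0x100, "r1")
print(cisc_regs["r0"], cisc_count, risc_regs["r0"], risc_count)
```

Both styles compute the same sum. The load-store version needs an extra instruction and a scratch register, but each of its instructions does less work.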

Here we'll be learning about the ISA of the x86-64 processor.

The Wikibook page gives a very comprehensive explanation of the x86 Instruction Set Architecture: https://en.wikibooks.org/wiki/X86_Assembly/X86_Architecture

These notes are based on the MIT OpenCourseWare lecture on the x86-64 ISA: https://youtu.be/L1ung0wil9Y?t=808

As explained on the Wikibook page linked above, AX was originally a 16-bit register called the accumulator, used in arithmetic operations. It was later extended to the 32-bit EAX ('E' stands for 'extended') and then to the 64-bit RAX ('R' stands for 'register').

On the RAX register, we can still address the lower 32 bits using the alias EAX, or the lower 16 bits using AX. Within that 16-bit part, we can also address the upper 8 bits using AH and the lower 8 bits using AL.
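The aliases are simply different windows onto the same 64 bits. A minimal Python sketch, using bit masks and shifts to pick out each part (the function name sub_registers and the sample value are made up for illustration):

```python
def sub_registers(rax):
    """Return the views of a 64-bit RAX value that each alias would see."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep only 64 bits
    return {
        "rax": rax,                    # all 64 bits
        "eax": rax & 0xFFFFFFFF,       # lower 32 bits
        "ax": rax & 0xFFFF,            # lower 16 bits
        "ah": (rax >> 8) & 0xFF,       # upper 8 bits of AX
        "al": rax & 0xFF,              # lower 8 bits of AX
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name} = {value:#x}")
```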

So when we write movb %al, %dl, we're asking the CPU to copy the lower 8 bits of the accumulator register into the lower 8 bits of the data register. (The b suffix matches the 8-bit operands; a form such as movl %al, %edx mixes a 32-bit suffix with an 8-bit register and is rejected by the assembler.) Note that mov stands for "move", but it doesn't actually move the value; it copies it. We say "copy" because the value in %al is still present in the accumulator register after the instruction completes. (Source: https://youtu.be/L1ung0wil9Y?t=1393)

x86-64 Assembly

An x86-64 instruction has the following format: <opcode> <operand_list>.

The opcode is a short mnemonic identifying the type of instruction. The operand_list contains zero, one, two, or (rarely) three operands separated by commas. Typically, one of the operands is the destination and the rest are sources.

We have two syntaxes to choose from when writing x86-64 assembly code:

  1. AT&T syntax
  2. Intel syntax

The difference between the Intel and AT&T syntaxes is best understood from the screenshot below:

intel and AT&T difference

The screenshot shows that when we write an instruction, say, <operation> A, B in x86-64 assembly:

  • In AT&T syntax, the operation is done in the order: B <operation> A and the result of that operation will be stored in B.
  • In Intel syntax, the operation is done in the order: A <operation> B and the result will be stored in A.
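The operand order matters most for non-commutative operations such as subtraction. The following Python sketch applies sub A, B under each convention. The function execute_sub and the syntax labels "att" and "intel" are hypothetical names chosen for this example, and results are wrapped to 64 bits to mimic a register.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def execute_sub(syntax, first, second, regs):
    """Apply `sub first, second` using the operand order of the given syntax."""
    if syntax == "att":          # AT&T: source first, destination second
        src, dst = first, second
    elif syntax == "intel":      # Intel: destination first, source second
        dst, src = first, second
    else:
        raise ValueError(f"unknown syntax: {syntax}")
    result = dict(regs)          # leave the caller's registers untouched
    result[dst] = (regs[dst] - regs[src]) & MASK64
    return result


start = {"rax": 3, "rbx": 10}
att = execute_sub("att", "rax", "rbx", start)      # rbx = rbx - rax
intel = execute_sub("intel", "rax", "rbx", start)  # rax = rax - rbx
print(att, intel)
```

The same two operand names produce different results: under AT&T the second operand is overwritten, while under Intel the first one is.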

There are also other differences showcased in the above screenshot. In this blogpost series, we'll be using AT&T syntax but we'll also try to understand the Intel syntax along the way.

Common Opcodes

common opcodes

Data Types

To a processor, the data types declared in high-level programming languages do not exist as such: any register can hold any bytes. However, two kinds of data do need to be handled differently by the hardware: integers and floating-point numbers.

The processor's ALU handles only integers. To deal with floating-point numbers, an x86 processor has a separate hardware unit called the "x87" or "FPU" (Floating-Point Unit). It has its own registers, organized as the "x87 stack", and additional instructions known as the "x87 instruction set". Although the floating-point hardware is now integrated into the same processor, it is still considered distinct for historical reasons.
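A quick way to see that bytes carry no type of their own is to read the same four bytes once as an integer and once as a floating-point number. This sketch uses Python's standard struct module and assumes little-endian byte order, as on x86.

```python
import struct

raw = struct.pack("<f", 1.0)             # the 4 bytes that encode 1.0 as a 32-bit float
as_int = struct.unpack("<I", raw)[0]     # the same bytes read as an unsigned integer
as_float = struct.unpack("<f", raw)[0]   # the same bytes read as a float again
print(raw.hex(), hex(as_int), as_float)
```

The bit pattern is identical in both cases; only the instruction that consumes it decides whether it is treated as an integer or as a floating-point value.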

Opcode suffixes

Many instructions in AT&T syntax have a suffix (b, w, l, or q) which indicates the width of the operation (1, 2, 4, or 8 bytes, respectively). The suffix is often omitted when the width can be determined from the operands (e.g., %rax is 64-bit, %eax is 32-bit). For example, if the destination register is %eax, the operation must be 4 bytes wide; if %ax, 2 bytes; and if %al, 1 byte. A few instructions, such as movs and movz, have two suffixes: the first is for the source operand, the second for the destination. For example, movzbl moves a 1-byte source value to a 4-byte destination.
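The paragraph above does not spell out how movz and movs fill the extra destination bytes: movz zero-extends the source, while movs sign-extends it. A small Python sketch of the byte-to-long forms, with function names that simply mirror the mnemonics:

```python
def movzbl(src_byte):
    """Zero-extend an 8-bit value to 32 bits, as movzbl does."""
    return src_byte & 0xFF


def movsbl(src_byte):
    """Sign-extend an 8-bit value to 32 bits, as movsbl does."""
    value = src_byte & 0xFF
    if value & 0x80:              # the sign bit of the byte is set
        value |= 0xFFFFFF00       # copy it into the upper 24 bits
    return value


print(hex(movzbl(0x80)), hex(movsbl(0x80)), hex(movsbl(0x7F)))
```

The two only differ when the top bit of the source byte is set, that is, when the byte would be negative if read as a signed value.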

When the destination is a sub-register (e.g., %eax within the %rax register), only the bytes of that sub-register are written, with one broad exception: a 32-bit instruction zeroes the high-order 32 bits of the destination register.
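The write rule, including the 32-bit exception, can be modeled in a few lines of Python. The function write_sub_register is a made-up helper; its 8-bit case models a write to the lowest byte (%al), not to %ah.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_sub_register(old_value, new_value, width_bits):
    """Return the 64-bit register contents after writing its low width_bits."""
    if width_bits == 64:
        return new_value & MASK64
    if width_bits == 32:
        return new_value & 0xFFFFFFFF          # the upper 32 bits are zeroed
    if width_bits in (8, 16):
        mask = (1 << width_bits) - 1
        # the upper bits of the old value are preserved
        return (old_value & ~mask & MASK64) | (new_value & mask)
    raise ValueError(f"unsupported width: {width_bits}")


old = 0x1122334455667788
print(hex(write_sub_register(old, 0xAA, 8)))
print(hex(write_sub_register(old, 0xBBBB, 16)))
print(hex(write_sub_register(old, 0xCCCCCCCC, 32)))
```

Only the 32-bit write disturbs bits outside the part being written.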

(Source for above explanation: https://web.stanford.edu/class/archive/cs/cs107/cs107.1222/guide/x86-64.html)

Prefixes

In AT&T syntax, register names are prefixed with '%', constants (immediate values) with '$', and hexadecimal values with '0x'.

Addressing Modes

There are two main ways to reach a memory address or a register in order to fetch or store a value. One is to directly specify the memory address or register name in the instruction; the other is to reference it indirectly.

The two slides below explain this in detail:

direct addressing mode

indirect addressing mode
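Since the slides are not reproduced in text, here is a hedged sketch of the most general indirect form on x86-64, written displacement(base, index, scale) in AT&T syntax. The address it refers to is base + index * scale + displacement, where scale is 1, 2, 4, or 8. The function name, the register values, and the memory contents are invented for the example.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Compute the address named by displacement(base, index, scale)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & MASK64


regs = {"rax": 0x1000, "rbx": 3}
memory = {0x1014: 99}

# 8(%rax,%rbx,4): the address comes from registers, so the access is indirect
address = effective_address(regs["rax"], regs["rbx"], 4, 8)
value = memory[address]
print(hex(address), value)
```

Simpler indirect forms fall out of the same formula by leaving parts at their defaults: (%rax) uses only the base, and 8(%rax) adds a displacement to it.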

References

MIT OpenCourseWare video lecture: https://youtu.be/L1ung0wil9Y

x86 Assembly Wikibook

I highly recommend buying a copy of the x86 Assembly Wikibook to explore further concepts in x86-64 assembly. The online version is available here: https://en.wikibooks.org/wiki/X86_Assembly
