Friday, June 17, 2022

My notes on Cookies and CORS

Cookies

(This section assumes that you have a basic knowledge of what Cookies are. For a proper introduction and more details about cookies, please read the Mozilla Developer Docs: https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies. The flowcharts in this section are created by Nandan Desai)

Cookies are created on the client side either when the server responds with a Set-Cookie header or when they are set using Document.cookie in JavaScript.

The browser then sends the Cookies back to the server on subsequent requests using the Cookie header.
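To make the Cookie header concrete, here is a minimal sketch of how its value could be split into name/value pairs. The function name parseCookieHeader and the theme cookie are made up for illustration, and real servers usually rely on a library for this.

```javascript
// Split a Cookie request header value such as "a=1; b=2" into an object.
// parseCookieHeader is a hypothetical helper, not a browser or Node API.
function parseCookieHeader(headerValue) {
  const cookies = {};
  for (const pair of headerValue.split(';')) {
    const trimmed = pair.trim();
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue; // this sketch skips pairs without '='
    const name = trimmed.slice(0, eq).trim();
    const value = trimmed.slice(eq + 1).trim();
    cookies[name] = value;
  }
  return cookies;
}

// "theme=dark" is an invented second cookie for the example.
const parsedCookies = parseCookieHeader('pageAccess=2; theme=dark');
console.log(parsedCookies.pageAccess); // 2
console.log(parsedCookies.theme);      // dark
```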

Understanding the intricacies of cookies boils down to understanding various cookie attributes.

Cookie attributes can be classified into two main categories:

  1. The attributes that restrict access to cookies (like Secure and HttpOnly)
  2. The attributes that define where the cookies are sent (like SameSite, Domain and Path)

'Secure' and 'HttpOnly'

Cookies that have the Secure attribute set are only sent to the server over an encrypted (HTTPS) connection. Also, sites with http:// in their URL can't set the Secure attribute on cookies.

Cookies that have the HttpOnly attribute set are not accessible via JavaScript, but the browser still sends them to the server automatically when all the proper conditions are met.
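The two restrictions can be modelled in a few lines. This is only a sketch of the rules as described above, not how a browser is implemented; the cookie records and their field names are assumptions made for the example.

```javascript
// Hypothetical cookie records; field names are invented for this sketch.
const cookieJar = [
  { name: 'sessionId', value: 'abc', secure: true, httpOnly: true },
  { name: 'theme', value: 'dark', secure: false, httpOnly: false },
];

// Names of the cookies the browser would attach for the given scheme:
// Secure cookies are only attached when the scheme is https.
function cookiesSentOver(scheme, jar) {
  return jar.filter((c) => !c.secure || scheme === 'https').map((c) => c.name);
}

// Names of the cookies a script could read through document.cookie:
// HttpOnly cookies are hidden from JavaScript.
function cookiesVisibleToScript(jar) {
  return jar.filter((c) => !c.httpOnly).map((c) => c.name);
}

console.log(cookiesSentOver('http', cookieJar));   // [ 'theme' ]
console.log(cookiesSentOver('https', cookieJar));  // [ 'sessionId', 'theme' ]
console.log(cookiesVisibleToScript(cookieJar));    // [ 'theme' ]
```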

'Domain', 'Path' and 'SameSite'

  • 'Domain' attribute

domain cookie attribute

  • 'Path' attribute

path cookie attribute

  • 'SameSite' attribute

Here, the "site" refers to the registrable domain (e.g., example.com, which also covers its subdomains such as www.example.com) combined with the scheme (http or https). For example, http://example.com and https://example.com are different sites according to this definition.

samesite attribute
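As a rough companion to the Domain and Path attributes, here is a sketch of the matching rules the browser applies before attaching a cookie to a request. It assumes the cookie was set with explicit Domain and Path attributes, ignores IP-address hosts, and uses function names that are made up for illustration.

```javascript
// Domain check: the request host must equal the cookie's Domain
// or be a subdomain of it.
function domainMatches(requestHost, cookieDomain) {
  const host = requestHost.toLowerCase();
  const domain = cookieDomain.toLowerCase().replace(/^\./, ''); // ignore a leading dot
  return host === domain || host.endsWith('.' + domain);
}

// Path check: the cookie's Path must be a prefix of the request path
// that ends on a '/' boundary, so "/docs" matches "/docs/web" but not "/docsets".
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  return cookiePath.endsWith('/') || requestPath[cookiePath.length] === '/';
}

console.log(domainMatches('api.example.com', 'example.com')); // true
console.log(domainMatches('badexample.com', 'example.com'));  // false
console.log(pathMatches('/docs/web', '/docs'));               // true
console.log(pathMatches('/docsets', '/docs'));                // false
```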

CORS (Cross-Origin Resource Sharing)

(Most of the CORS-related content presented here is either a direct copy-paste of Mozilla Developer Docs or I've made some minor modifications to make certain things simpler to understand. You can get more details on this topic here: https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS)

In simple words, CORS is a communication between the server and the browser about what the browser is allowed to do with a response it receives from the server when the request came from a different origin. It's like the server is doing Access Control on the browser.

Suppose web content at https://foo.example wishes to invoke content on domain https://bar.other. Code of this sort might be used in JavaScript deployed on foo.example:

const xhr = new XMLHttpRequest();
const url = 'https://bar.other/resources/public-data/';

xhr.open('GET', url);
xhr.onreadystatechange = someHandler;
xhr.send();

This operation performs a simple exchange between the client and the server, using CORS headers to handle the privileges:



Let's look at what the browser will send to the server in this case:

GET /resources/public-data/ HTTP/1.1
Host: bar.other
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:71.0) Gecko/20100101 Firefox/71.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Connection: keep-alive
Origin: https://foo.example

The request header of note is Origin, which shows that the invocation is coming from https://foo.example.

Now let's see how the server responds:

HTTP/1.1 200 OK
Date: Mon, 01 Dec 2008 00:23:53 GMT
Server: Apache/2
Access-Control-Allow-Origin: *
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: application/xml

[…XML Data…]

In response, the server returns an Access-Control-Allow-Origin header with Access-Control-Allow-Origin: *, which means that the resource can be accessed by any origin.

This pattern of the Origin and Access-Control-Allow-Origin headers is the simplest use of the access control protocol. If the resource owners at https://bar.other wished to restrict access to the resource to requests only from https://foo.example (i.e., no origin other than https://foo.example can access the resource in a cross-origin manner), they would send:

Access-Control-Allow-Origin: https://foo.example
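The Origin / Access-Control-Allow-Origin handshake for requests without credentials can be sketched as two small functions, one for each side. The function names and the allow-list parameter are assumptions for illustration, and a real browser performs more checks than this.

```javascript
// Server side: pick the Access-Control-Allow-Origin value to send back.
// `allowedOrigins` is a hypothetical allow-list kept by the server;
// null means "send no CORS header at all".
function allowOriginHeader(requestOrigin, allowedOrigins) {
  if (allowedOrigins.includes('*')) return '*';
  return allowedOrigins.includes(requestOrigin) ? requestOrigin : null;
}

// Browser side: may the page's script read a response to a request
// that was sent without credentials?
function scriptMayRead(requestOrigin, allowOriginValue) {
  return allowOriginValue === '*' || allowOriginValue === requestOrigin;
}

const headerForFoo = allowOriginHeader('https://foo.example', ['https://foo.example']);
console.log(headerForFoo);                                        // https://foo.example
console.log(scriptMayRead('https://foo.example', headerForFoo));  // true
console.log(scriptMayRead('https://evil.example', headerForFoo)); // false
```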

CORS effect on Cookies

The most interesting capability exposed by both XMLHttpRequest and CORS is the ability to make "credentialed" requests that are aware of Cookies and HTTP Authentication information. By default, in cross-origin XMLHttpRequest invocations, browsers will not send credentials (i.e., cookies). A specific flag has to be set on the XMLHttpRequest object when it is invoked.

Consider the following example:

const invocation = new XMLHttpRequest();
const url = 'https://bar.other/resources/credentialed-content/';

function callOtherDomain() {
  if (invocation) {
    invocation.open('GET', url, true);
    invocation.withCredentials = true;
    invocation.onreadystatechange = handler;
    invocation.send();
  }
}

If we want to send Cookies to the server, then withCredentials = true needs to be set on the XMLHttpRequest instance. For the browser to expose the response to the JavaScript code (and to accept any cookies set by that response), the server also needs to include the Access-Control-Allow-Credentials: true header, and Access-Control-Allow-Origin must not be a wildcard (i.e., it shouldn't be "*"). Otherwise, the browser blocks the response and prints a CORS error on the devtools console.

The following example explains it:



Here is a sample exchange between client and server:

GET /resources/credentialed-content/ HTTP/1.1
Host: bar.other
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:71.0) Gecko/20100101 Firefox/71.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Connection: keep-alive
Referer: https://foo.example/examples/credential.html
Origin: https://foo.example
Cookie: pageAccess=2

HTTP/1.1 200 OK
Date: Mon, 01 Dec 2008 01:34:52 GMT
Server: Apache/2
Access-Control-Allow-Origin: https://foo.example
Access-Control-Allow-Credentials: true
Cache-Control: no-cache
Pragma: no-cache
Set-Cookie: pageAccess=3; expires=Wed, 31-Dec-2008 01:34:53 GMT
Vary: Accept-Encoding, Origin
Content-Encoding: gzip
Content-Length: 106
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Content-Type: text/plain

[text/plain payload]

Although the request carries the Cookie header (Cookie: pageAccess=2) destined for the content on https://bar.other, if bar.other did not respond with Access-Control-Allow-Credentials: true, the response would be ignored and not be made available to the web content. Also notice the following: Access-Control-Allow-Origin: https://foo.example. If it were Access-Control-Allow-Origin: *, then the browser would not have allowed the JavaScript to access the response, as explained earlier.
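The rule for credentialed requests can be summarised in a small check. This is a sketch of the decision only, assuming the response headers are available as a plain object with lowercase names; mayExposeResponse is a made-up name, not a browser API.

```javascript
// Decide whether the browser would hand the response to the page's script.
// `responseHeaders` is assumed to be a plain object with lowercase header names.
function mayExposeResponse(requestOrigin, withCredentials, responseHeaders) {
  const allowOrigin = responseHeaders['access-control-allow-origin'];
  const allowCredentials = responseHeaders['access-control-allow-credentials'];
  if (!withCredentials) {
    // Without credentials, a wildcard is acceptable.
    return allowOrigin === '*' || allowOrigin === requestOrigin;
  }
  // With credentials, the origin must be named explicitly
  // and Access-Control-Allow-Credentials must be "true".
  return allowOrigin === requestOrigin && allowCredentials === 'true';
}

const goodResponse = {
  'access-control-allow-origin': 'https://foo.example',
  'access-control-allow-credentials': 'true',
};
const wildcardResponse = {
  'access-control-allow-origin': '*',
  'access-control-allow-credentials': 'true',
};
console.log(mayExposeResponse('https://foo.example', true, goodResponse));      // true
console.log(mayExposeResponse('https://foo.example', true, wildcardResponse));  // false
console.log(mayExposeResponse('https://foo.example', false, wildcardResponse)); // true
```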

References

https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies

https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS

Monday, June 6, 2022

Introduction to Stack (of a Process/Thread in the OS)

(All the diagrams in this blogpost are created by Nandan Desai)

Stack

The operating system sets up an area in the virtual memory for the stack and loads the starting address of this empty stack into the SP, the Stack Pointer register (the ESP and RSP registers on 32-bit and 64-bit Intel processors, respectively). If a process has multiple threads, then each thread is given its own stack by the OS.

The elements are pushed onto the stack using the PUSH instruction and removed from the stack using the POP instruction. Apart from PUSH and POP, there are two more instructions that can perform actions on the stack: the CALL and RET instructions. (ENTER and LEAVE instructions also perform actions on the stack but they're out of scope for this blogpost).

The main purpose of the stack is to keep track of the control flow of the program when the programmer is using multiple procedures (i.e., functions). Control is transferred to a procedure using the CALL instruction and returned to the calling procedure using the RET instruction. The stack is where the following data is stored: the parameters passed to a function, the local variables of a function and the return information (like which address to return to when the RET instruction is executed).

Also, there are certain exploits that abuse this return information on the stack to let the attacker change the flow of the program. To mitigate this, Intel has a hardware-enforced protection called Control-flow Enforcement Technology (CET), which uses an extra stack called the "Shadow Stack" along with our regular stack. Covering the Shadow Stack is out of scope for us right now.

The currently executing procedure has a memory block of its own on the stack to store its variables. This block is called a stack frame. The starting address of this block is stored in the BP register (called the Stack-Base Pointer). This is the EBP or RBP register on 32-bit and 64-bit processors, respectively. The ending address of this block is stored in the ESP register (the Stack Pointer register that we talked about earlier).

There are two very important things to know here:

  1. The responsibility of creating this memory block (the stack frame) on the stack is entirely up to the assembly programmer. The starting address of the stack will initially be set by the OS in the ESP and EBP registers. But after that, the programmer needs to decide how they will use these registers to allocate space on the stack for local variables and function parameters.
  2. The stack grows downwards in the virtual address space of the Process, i.e., it starts at a higher address. When we push something onto the stack, the ESP register value is decremented, and when we pop something off the stack, the ESP register value is incremented.
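The "grows downwards" behaviour is easy to see in a toy model. The sketch below is only a simulation in JavaScript, assuming 4-byte (32-bit) stack slots and an arbitrary starting address of 0x1000; it is not how the hardware is implemented.

```javascript
// A toy 32-bit stack: memory is a Map from address to value.
function createStack(startAddress) {
  return { esp: startAddress, memory: new Map() };
}

function push(stack, value) {
  stack.esp -= 4;                      // the stack grows towards lower addresses
  stack.memory.set(stack.esp, value);
}

function pop(stack) {
  const value = stack.memory.get(stack.esp);
  stack.memory.delete(stack.esp);
  stack.esp += 4;                      // popping moves ESP back up
  return value;
}

const toyStack = createStack(0x1000);  // 0x1000 is an arbitrary assumption
push(toyStack, 10);
push(toyStack, 20);
console.log(toyStack.esp.toString(16));          // ff8
const topValue = pop(toyStack);
console.log(topValue, toyStack.esp.toString(16)); // 20 ffc
```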

With this background knowledge in mind, let's try to understand how the stack is used in our assembly code!

The below diagram shows a sample C program and its assembly translation:

C to Assembly

The below diagram walks you through each of the stages of how Stack changes for every instruction of the main() function shown above:

assembly stack walkthrough

(The instructions marked in red in the above diagram are either self-explanatory or will be discussed later.)

When a function is returning, it puts its final return value into the EAX register for its caller to access. That's why xorl %eax, %eax, which sets EAX to zero, appears in the above code: our C program says return 0;.

(I'm still not sure why we're putting 0 onto the stack. It's being pushed even if I don't use 0 anywhere in my C code. If you know why the Assembler puts 0 onto the stack here, then please let me know.)

The CALL and RET Instructions

The address of the instruction that is to be executed after the currently executing instruction is stored in the Instruction Pointer register.

A CALL instruction pushes the current value of the EIP register (the Instruction Pointer, which at that moment holds the address of the instruction following the CALL) onto the stack, loads the target offset or address (this depends on the type of CALL instruction and there are many different types) into the EIP register and begins executing the procedure (function).

The RET instruction pops the saved instruction pointer value from the stack into the EIP register and optionally releases the stack space used by the procedure that contains the RET instruction.
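Here is a toy simulation of what CALL and RET do to EIP and ESP. It is a sketch in JavaScript under simplifying assumptions: 4-byte return addresses, made-up address values, and a bytesToRelease parameter standing in for RET's optional operand.

```javascript
// Toy CPU state: EIP holds the address of the next instruction to execute.
// All addresses here are invented values for illustration.
const cpu = { eip: 0x400010, esp: 0x2000, memory: new Map() };

// CALL: push the return address (the current EIP), then jump to the target.
function call(state, target) {
  state.esp -= 4;
  state.memory.set(state.esp, state.eip);
  state.eip = target;
}

// RET: pop the saved address back into EIP. `bytesToRelease` models the
// optional operand that also frees stack space.
function ret(state, bytesToRelease = 0) {
  state.eip = state.memory.get(state.esp);
  state.esp += 4 + bytesToRelease;
}

call(cpu, 0x400100);
console.log(cpu.eip.toString(16), cpu.esp.toString(16)); // 400100 1ffc
ret(cpu);
console.log(cpu.eip.toString(16), cpu.esp.toString(16)); // 400010 2000
```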

Introduction to Processor and x86-64 Assembly

(This blogpost is created from my study notes)

Processor

A computer broadly consists of a CPU, I/O devices, Main memory and all these communicate through a System Bus.

A CPU broadly consists of an ALU (Arithmetic Logic Unit), Registers and a Control Unit. All these communicate through an internal bus.

computer top-level structure 

(Image credit: William Stallings's Computer Organization and Architecture book)

von Neumann architecture

Most computers today follow the von Neumann architecture, which was devised in the 1940s. The von Neumann architecture describes the design of a digital computer as follows:

  • A processing unit that contains an arithmetic logic unit and processor registers
  • A control unit that contains an instruction register and program counter
  • Memory that stores data and instructions
  • External mass storage
  • Input and output mechanisms

The von Neumann architecture is a stored-program model where the data and instructions are stored in the same memory. There are other stored-program architectures, such as the Harvard architecture, where instructions and data are kept in separate memories.

Processor

The basic function of a processor is to fetch (one at a time) a set of instructions stored in the memory and execute them.

The processor architecture that the manufacturers adopt can be categorized into two types: CISC and RISC.

CISC (Complex Instruction Set Computer)

CISC architecture is a category of processors that can do multiple tasks with a single instruction. They have complex hardware to ease the work of a compiler. For example, the ENTER instruction in x86 assembly is equivalent to the following three x86 instructions:

PUSH EBP 
MOV EBP, ESP 
SUB ESP, space_for_local_variables

(where EBP is the frame pointer and ESP is the stack pointer, which will be covered later).

A function in x86 assembly starts with the above 3 lines, and those 3 lines can be replaced by a single ENTER instruction. That's how CISC processors use complex underlying hardware to provide an Instruction Set intended to make the software implementation easier.

As mentioned, Intel x86 is one example of a CISC processor family, and x86 has more than 1000 instructions in its Instruction Set.

x86 Instruction Listing: https://en.wikipedia.org/wiki/X86_instruction_listings

RISC (Reduced Instruction Set Computer)

Contrary to CISC, RISC was designed so that each instruction performs one simple function. The instructions are simpler, so a programmer might have to give the processor several instructions to carry out a task. In CISC, by contrast, a single instruction may perform several smaller underlying operations, so each instruction requires more CPU cycles and thus more power. RISC architecture tries to solve these issues.

Examples of RISC architectures include ARM and MIPS.

Although RISC and CISC are theoretically separate, in practice RISC processors are growing a bit more complex to meet new technology requirements, and CISC processors are becoming a bit simpler to challenge the RISC processors in the performance and energy-efficiency race. So, there is only a thin line separating RISC and CISC processors, but the underlying concepts that they are based upon are distinct.

Instruction Set Architecture (ISA)

The ISA specifies the syntax and semantics of the assembly. That means an ISA for a processor defines the Registers, Instructions (and their syntax and semantics), Data types and Addressing Modes.

Register-memory and Load-store architectures

Most CISC processors have a Register-memory architecture, which is a type of Instruction Set Architecture (ISA). We can understand Register-memory architecture using an example: the ADD instruction in x86 takes two operands, and one of them may be in memory while the other is in a register (or is an immediate value).

In contrast, the Load-store architecture separates memory operation instructions and ALU instructions. That means the ADD instruction takes two operands that need to be present in Registers and not in memory. Data is moved to and from memory using separate load and store instructions. This architecture is common amongst RISC processors.

Here we'll be learning about the ISA of x86-64 processor.

The Wikibook page gives a very comprehensive explanation on x86 Instruction Set Architecture: https://en.wikibooks.org/wiki/X86_Assembly/X86_Architecture

These notes are made by watching MIT OpenCourseWare lecture on x86-64 ISA: https://youtu.be/L1ung0wil9Y?t=808

As described in the Wikibook page linked above, the AX register was a 16-bit register called the Accumulator and it was used during arithmetic operations. It was later extended to the 32-bit EAX ('E' stands for extended) and the 64-bit RAX ('R' stands for 'Register').

On the RAX register, we can still address the lower 32 bits using the alias EAX, or the lower 16 bits using AX. We can also address the higher 8 bits of that 16-bit part using AH and the lower 8 bits using AL.
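The aliasing is just bit masking. Here is a small sketch that computes each view of a 64-bit value using JavaScript BigInt; the function name subRegisters and the sample value are made up for illustration.

```javascript
// Views of a 64-bit RAX value; BigInt keeps all 64 bits exact.
function subRegisters(rax) {
  return {
    eax: rax & 0xFFFFFFFFn,   // lower 32 bits of RAX
    ax: rax & 0xFFFFn,        // lower 16 bits of RAX
    ah: (rax >> 8n) & 0xFFn,  // higher 8 bits of AX
    al: rax & 0xFFn,          // lower 8 bits of AX
  };
}

const views = subRegisters(0x1122334455667788n);
console.log(views.eax.toString(16)); // 55667788
console.log(views.ax.toString(16));  // 7788
console.log(views.ah.toString(16));  // 77
console.log(views.al.toString(16));  // 88
```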

So when we write movb %al, %dl, we're asking the CPU to copy the lower 8 bits of the Accumulator register into the lower 8 bits of the Data register. (The b suffix matches the 1-byte operands; an l suffix would require 32-bit registers such as %eax and %edx.) Note that mov stands for move, but it doesn't actually "move" the value; it "copies" it. We say "copy" because the value of the al register is left behind in the Accumulator register after the instruction completes. (Source: https://youtu.be/L1ung0wil9Y?t=1393)

x86-64 Assembly

The x86-64 Instruction is in the following format: <opcode> <operand_list>.

opcode is a short mnemonic identifying the type of instruction. operand_list has 0, 1, 2 or (rarely) 3 operands separated by commas. Typically, one operand amongst them is the destination and the rest are the sources.

We have two different syntaxes to choose from when we're writing x86-64 Assembly code:

  1. AT&T syntax
  2. Intel syntax

The difference between Intel and AT&T syntax can be best understood by the below screenshot:

intel and AT&T difference

The above screenshot says that when we write an instruction, say, <operation> A, B in x86-64 Assembly:

  • In AT&T syntax, the operation is done in the order: B <operation> A and the result of that operation will be stored in B.
  • In Intel syntax, the operation is done in the order: A <operation> B and the result will be stored in A.

There are also other differences showcased in the above screenshot. In this blogpost series, we'll be using AT&T syntax but we'll also try to understand the Intel syntax along the way.
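To make the operand order concrete, here is a small sketch that evaluates a two-operand sub A, B under each syntax. The register file is a plain JavaScript object, and the function names and values are made up for the example.

```javascript
// `regs` is a plain object standing in for the register file.
function subAtt(regs, a, b) {
  regs[b] = regs[b] - regs[a];   // AT&T: the last operand is the destination
}

function subIntel(regs, a, b) {
  regs[a] = regs[a] - regs[b];   // Intel: the first operand is the destination
}

const attRegs = { eax: 10, ebx: 3 };
subAtt(attRegs, 'eax', 'ebx');      // AT&T:  subl %eax, %ebx
console.log(attRegs.ebx);           // -7

const intelRegs = { eax: 10, ebx: 3 };
subIntel(intelRegs, 'eax', 'ebx');  // Intel: sub eax, ebx
console.log(intelRegs.eax);         // 7
```

The same textual operand order therefore gives different results, which is why it matters to know which syntax a listing uses.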

Common Opcodes

common opcodes

Data Types

For a processor, there is no distinction in the data types that are declared in high-level programming languages. Any register can hold any bytes. However, there are only two data types that need to be handled differently by the hardware: Integers and Floating Points.

The ALU of the processor only handles integers. To deal with floating-point values, the x86 processor has a separate hardware unit called the "x87" or the "FPU" (Floating-Point Unit). It has its own registers, organized as the "x87 stack", and extra x86 instructions called the "x87 Instruction set". Although the Floating-Point hardware is integrated into the same processor, it is considered distinct due to historical reasons.

Opcode suffixes

Many instructions in AT&T syntax have a suffix (b, w, l, or q) which indicates the bitwidth of the operation (1, 2, 4, or 8 bytes, respectively). The suffix is often omitted when the bitwidth can be determined from the operands (i.e., %rax is 64-bit, %eax is 32-bit etc.). For example, if the destination register is %eax, it must be 4 bytes, if %ax it must be 2 bytes, and %al would be 1 byte. A few instructions such as movs and movz have two suffixes: the first is for the source operand, the second for the destination. For example, movzbl moves a 1-byte source value to a 4-byte destination.

When the destination is a sub-register (i.e., using %eax on %rax register), only those specific bytes in the sub-register are written with one broad exception: a 32-bit instruction zeroes the high order 32 bits of the destination register.
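The sub-register write rule, including the 32-bit exception, can be sketched with BigInt arithmetic. This model only covers writes to the low 1, 2, 4 or 8 bytes (so not AH), and writeRegister is a made-up helper name.

```javascript
// Masks for the low 1, 2, 4 and 8 bytes of a 64-bit register.
const MASKS = { 1: 0xFFn, 2: 0xFFFFn, 4: 0xFFFFFFFFn, 8: 0xFFFFFFFFFFFFFFFFn };

// Return the new 64-bit register value after writing `value`
// into its low `bytes` bytes.
function writeRegister(oldValue, value, bytes) {
  const mask = MASKS[bytes];
  const low = value & mask;
  if (bytes === 4 || bytes === 8) {
    return low;                                // a 32-bit write zeroes the upper 32 bits
  }
  return (oldValue & ~mask & MASKS[8]) | low;  // 8/16-bit writes keep the other bits
}

const oldRax = 0x1122334455667788n;
console.log(writeRegister(oldRax, 0xAAn, 1).toString(16));       // 11223344556677aa
console.log(writeRegister(oldRax, 0xBBBBn, 2).toString(16));     // 112233445566bbbb
console.log(writeRegister(oldRax, 0xCCCCCCCCn, 4).toString(16)); // cccccccc
```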

(Source for above explanation: https://web.stanford.edu/class/archive/cs/cs107/cs107.1222/guide/x86-64.html)

Prefixes

In AT&T syntax, registers are prefixed with '%', constants (immediate values) with '$', and hexadecimal values with '0x'.

Addressing Modes

There are two main ways in which we can access a memory address or a register to fetch or place a value. One way is to directly specify the memory address or the register name in our instruction, and the other way is to reference it indirectly.

The below two slides explain this in detail:

direct addressing mode

indirect addressing mode
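As a rough illustration of the difference, here is a toy model of a direct and an indirect memory load. The register and memory contents are invented for the example, and the AT&T instructions in the comments are my own assumption of what the equivalent assembly would look like.

```javascript
// Toy machine state; all addresses and values are made up.
const registers = { rax: 0, rbx: 0x2000 };
const memory = new Map([[0x1000, 111], [0x2000, 222]]);

// Direct: the instruction names the address itself,
// e.g. movq 0x1000, %rax
function loadDirect(address) {
  return memory.get(address);
}

// Indirect: the instruction names a register that holds the address,
// e.g. movq (%rbx), %rax
function loadIndirect(registerName) {
  return memory.get(registers[registerName]);
}

registers.rax = loadDirect(0x1000);
console.log(registers.rax); // 111
registers.rax = loadIndirect('rbx');
console.log(registers.rax); // 222
```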

References

MIT OpenCourseware video lecture: https://youtu.be/L1ung0wil9Y

x86 Assembly Wikibook

I highly recommend buying one of the copies of the x86 Assembly Wikibook to understand further concepts in x86-64 Assembly. You can find the online version here: https://en.wikibooks.org/wiki/X86_Assembly