If you're learning assembly, you probably know about instructions like add, sub and mul for doing arithmetic. However, if you write example programs to see them in action, you might be surprised by the assembly your compiler produces instead:
    int exampleAdd(int x) {
        return x + 3; // we expect add
    }

    int exampleMul(int x) {
        return x * 5; // we expect mul
    }

    exampleAdd:
        lea eax, [rdi+3]
        ret
    exampleMul:
        lea eax, [rdi+rdi*4]
        ret
Both examples ended up using an instruction called lea instead. This is the Load Effective Address instruction, which the documentation says is used to "compute an effective memory address". But your code doesn't involve memory addresses at all, so what gives?
What you are seeing is a clever optimization by your compiler to do that math a little bit faster. To understand how it works, let's back up a little bit and talk about arrays and memory.
In C, an array passed to a function is just a pointer to its first element. If you want to access a particular element at index i, you start at the base memory address and step forward by i elements. In other words, the math to look up an array element looks like:
address of element = base address + (index * size of element)
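The formula above can be sketched directly in C. This is only a minimal illustration, not code from the original examples: the helper name element_address is hypothetical, and it does by hand the pointer arithmetic that array[index] performs implicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes base address + (index * size of element)
 * by hand, stepping in bytes through a char pointer. */
const void *element_address(const void *base, size_t index, size_t element_size)
{
    assert(base != NULL);
    return (const char *)base + index * element_size;
}

/* Returns 1 if the manual calculation agrees with the compiler's own
 * indexing for every element of a small int array, 0 otherwise. */
int manual_math_matches_indexing(void)
{
    int values[4] = {10, 20, 30, 40};
    for (size_t i = 0; i < 4; i++) {
        if (element_address(values, i, sizeof values[0]) != (const void *)&values[i])
            return 0;
    }
    return 1;
}
```

Calling manual_math_matches_indexing() should return 1, since both routes land on the same byte in memory.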
This is exactly the math that goes on behind the scenes when you index an array in C. Not only is this a common thing to do in C, it's a common thing to do in plain old assembly. Let's look at an example function and the assembly it produces:
    int getElementValue(int* array, int index) {
        return array[index];
    }

    getElementValue:
        movsx rsi, esi
        mov eax, DWORD PTR [rdi+rsi*4]
        ret
In the assembly, you can see the expression [rdi+rsi*4], which looks suspiciously like the equation mentioned above.
Basically, x86 assembly has this math built into its syntax. Whenever you're getting a value from memory, you can use the syntax below, where the constant size must be 1, 2, 4, or 8:
[base register + constant offset + offset register * constant size]
Let's break down the expression [rdi+rsi*4] from our example:

- rdi: This is the register used for array, the first parameter of our function. We are using it as the base register, the register containing the starting address of our array.
- rsi: This is the register used for index, the second parameter of our function. We are using it as the offset register, which tells us how far to go into the array.
- 4: An int is four bytes long, so we need to step by four bytes for each element in the array.

Let's look at another example:
    int getElementValue_Offset(int* array, int index) {
        return array[index + 3];
    }

    getElementValue_Offset:
        movsx rsi, esi
        mov eax, DWORD PTR [rdi+12+rsi*4]
        ret
This time we are skipping forward by three elements. Our assembly now has a constant offset of 12, because we are skipping forward by three ints, and 3 * 4 = 12.
As a final example, let's get an element from an array of longs, which have a size of eight bytes on this platform:
    long getElementValue_Long(long* array, int index) {
        return array[index + 3];
    }

    getElementValue_Long:
        movsx rsi, esi
        mov rax, QWORD PTR [rdi+24+rsi*8]
        ret
Now you can see that our constant offset is 24 (8 * 3) and our constant size is 8.
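The constant offsets 12 and 24 come from splitting (index + 3) * size into a fixed part and a variable part. The sketch below checks that arithmetic. The function names are made up for illustration, and it uses int32_t and int64_t so the element sizes are 4 and 8 bytes regardless of platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of array[index + skip] from the start of the array,
 * written the way the addressing expression splits it:
 * constant offset (skip * size) + index * size. */
size_t split_byte_offset(size_t index, size_t skip, size_t element_size)
{
    assert(element_size > 0);
    size_t constant_offset = skip * element_size; /* 12 for 4-byte, 24 for 8-byte elements when skip is 3 */
    return constant_offset + index * element_size;
}

/* Returns 1 if the split form lands on the same element as array[index + 3]
 * for both 4-byte and 8-byte element types, 0 otherwise. */
int split_offset_matches_indexing(void)
{
    int32_t small[8] = {0};
    int64_t large[8] = {0};
    for (size_t i = 0; i < 5; i++) {
        const char *p32 = (const char *)small + split_byte_offset(i, 3, sizeof small[0]);
        const char *p64 = (const char *)large + split_byte_offset(i, 3, sizeof large[0]);
        if (p32 != (const char *)&small[i + 3] || p64 != (const char *)&large[i + 3])
            return 0;
    }
    return 1;
}
```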
Let's take our first example and tweak it to give us the address of an array element instead of the actual value:
    int* getElementAddress(int* array, int index) {
        return &array[index];
    }

    getElementAddress:
        movsx rsi, esi
        lea rax, [rdi+rsi*4]
        ret
The & operator in our C code gives us the memory address of that array element. You can see a similar change in the assembly: now instead of a mov instruction, we have an lea.
lea stands for "load effective address", and it is the assembly equivalent of &. If mov destination, [source] means "look up the element at address [source] and copy it to destination", then lea destination, [source] means "just get the address [source]."
Take a look at our previous examples compared to the versions with &, and you'll see that the only change is that mov becomes lea. (Well, and eax changes to rax sometimes, but eax is just the lower 32 bits of rax, so they are the same register at two different widths. A pointer needs the full 64 bits.)
    int getElementValue(int* array, int index) {
        return array[index];
    }

    int* getElementAddress(int* array, int index) {
        return &array[index];
    }

    long getElementValue_Long(long* array, int index) {
        return array[index + 3];
    }

    long* getElementAddress_Long(long* array, int index) {
        return &array[index + 3];
    }

    getElementValue:
        movsx rsi, esi
        mov eax, DWORD PTR [rdi+rsi*4]
        ret
    getElementAddress:
        movsx rsi, esi
        lea rax, [rdi+rsi*4]
        ret
    getElementValue_Long:
        movsx rsi, esi
        mov rax, QWORD PTR [rdi+24+rsi*8]
        ret
    getElementAddress_Long:
        movsx rsi, esi
        lea rax, [rdi+24+rsi*8]
        ret
This is what lea was built to do, but it has another trick up its sleeve.
lea is designed for use with arrays. But the math it's doing is just math, and we can use it for other things.
Let's look at an example much like our original one:
    int justSomeExample(int x) {
        return ++x;
    }

    justSomeExample:
        lea eax, [rdi+1]
        ret
This loads a "memory address" starting at rdi (which corresponds to x) and with a constant offset of 1. But this is really just a fancy way of saying "give me the value x + 1".
There is really nothing special about a memory address. It is just a number that happens to correspond to a place in memory. If we don't care about actually looking at memory, we can use lea to do any math that fits the format.
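To make "any math that fits the format" concrete, here is a rough C model of what lea computes. It is only a sketch under some assumptions: lea_model and add_one are hypothetical names, unsigned 32-bit arithmetic stands in for the register math (so overflow wraps around instead of being undefined), and the scale is limited to 1, 2, 4, or 8 as in x86 addressing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of: lea dest, [base + displacement + index * scale]
 * No memory is read; the result is just the computed number. */
uint32_t lea_model(uint32_t base, uint32_t index, uint32_t scale, uint32_t displacement)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + displacement + index * scale;
}

/* x + 1, modeled on: lea eax, [rdi+1] */
uint32_t add_one(uint32_t x)
{
    return lea_model(x, 0, 1, 1);
}
```

Nothing in lea_model touches a pointer, which is the point: the "address" is only ever treated as a number.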
Here's another example, simply adding two numbers together:
    int justAddition(int x, int y) {
        return x + y;
    }

    justAddition:
        lea eax, [rdi+rsi]
        ret
And here's a fancier, more confusing example:
    int simpleArithmetic(int x) {
        return 5 * x + 7;
    }

    simpleArithmetic:
        lea eax, [rdi+7+rdi*4]
        ret
This is our compiler being clever. It saw 5x + 7, realized that was the same as x + 4x + 7, and put that into lea as [rdi+7+rdi*4].
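The same rewrite works for a few other multipliers. Because the scale can only be 1, 2, 4, or 8, using the same register as both base and index gives multiplication by 3, 5, or 9 (or by 2 with a scale of 1). The functions below are illustrative names, written in the x + x * scale shape that could map onto a single lea. Whether a given compiler actually emits lea for them depends on the compiler and its optimization settings.

```c
#include <assert.h>
#include <stdint.h>

/* Each body has the shape base + index * scale (+ constant),
 * with the same value used as both base and index. */
uint32_t times3(uint32_t x) { return x + x * 2; }           /* [rdi+rdi*2] */
uint32_t times5(uint32_t x) { return x + x * 4; }           /* [rdi+rdi*4] */
uint32_t times9(uint32_t x) { return x + x * 8; }           /* [rdi+rdi*8] */
uint32_t times5_plus7(uint32_t x) { return x + x * 4 + 7; } /* [rdi+7+rdi*4] */

/* Asserts that every decomposed form equals the plain multiplication
 * for inputs 0..999. */
void check_decompositions(void)
{
    for (uint32_t x = 0; x < 1000; x++) {
        assert(times3(x) == 3 * x);
        assert(times5(x) == 5 * x);
        assert(times9(x) == 9 * x);
        assert(times5_plus7(x) == 5 * x + 7);
    }
}
```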
In all of these cases, the compiler is using lea to save a couple of instructions here and there. Our simpleArithmetic function could look something like this instead...
    simpleArithmetic:
        mov eax, edi
        imul eax, eax, 5
        add eax, 7
        ret
But lea allows us to do all of that math in a single instruction.