The rsp register holds the address of the top of the stack, the location that the call/ret/push/pop instructions read from or write to.
call address is equivalent to this piece of code, where next is the address of the instruction that follows the call:
sub rsp, 8
mov [rsp], next
jmp address
The ret instruction is equivalent to this piece of code:
mov TEMP_REGISTER, [rsp]
add rsp, 8
jmp TEMP_REGISTER
Also, if that is the case, why in the copy3 function does the caller only move 32 bytes?
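In the copy3 function, the first operation is push rbx. On entry rsp is 8 bytes off a 16-byte boundary, because the call pushed the 8-byte return address; the push moves it another 8 bytes and restores the 16-byte alignment of the stack pointer. So your shadow space can be just the required 32 bytes, with no need for extra padding.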