Python's Pass-by-Object-Reference Paradigm

When variables are passed as arguments to functions in Python, they are passed as object references. What does that actually mean? Well, let's first take a look at how data is stored in variables with an example.

>>> 
>>> var1 = 1000
>>> var2 = var1
>>> 
>>> hex(id(var1))
'0x7f4a880374f0'
>>>
>>> hex(id(var2))
'0x7f4a880374f0'
>>> 
>>> var1 is var2
True
>>> 
>>> del var1
>>> 
>>> hex(id(var2))
'0x7f4a880374f0'
>>> 

Just to walk through it briefly, I first defined a variable named var1 with a value of 1000. Next, I defined var2 and assigned it var1. I then compared their id() values, which in CPython correspond to memory addresses, and you can see that they are the same.

Next, I deleted var1 and checked the memory location of var2 again. var2 still existed even after deleting var1. The reason for this is that variables in Python are merely names that reference an object in memory. The objects themselves exist independently of the variable name. In fact, when var1 was defined, it became a reference to a new instance of the int class with the value of 1000.

This is an important distinction to make, and it goes hand in hand with the saying that everything in Python is an object. Lower-level programming languages don't inherently work this way. When you define an integer variable in C++, for example, the variable itself is the storage in memory that holds the int value. You might think of a C++ variable as a box containing an object, whereas a Python variable would be more like a label indicating which box the object is in.

If this doesn't quite make sense just yet, that's okay. Just understand that in Python, everything is an object and variables are just names for those objects. With that in mind, let's look at the pass-by-value and pass-by-reference paradigms next, before comparing them to pass-by-object-reference.
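To make the "label" idea a bit more concrete, here is a minimal sketch. The names first and second are hypothetical and chosen only for illustration. It binds two names to one object, then rebinds one of them, which leaves the other name and its object untouched.

```python
# Hypothetical names, used only to illustrate names vs. objects.
first = 1000
second = first                  # second is now another name for the same int object
same_before = first is second

first = 2000                    # rebinding: first now names a different object
same_after = first is second

print(same_before)              # True
print(same_after)               # False
print(second)                   # 1000
```

Rebinding first never touches the object that second still refers to; only the label moved.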

Pass-by-value

Passing by value means that when a variable is passed as an argument to a function, only a copy of the variable's value is passed in as the argument. Let's look at an example in C++.

#include <iostream>
using namespace std;

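void pass_by_val(int input_var)
{
    cout << "'input_var' inside pass_by_val() = " << input_var << " @ " << &input_var << endl;
    input_var = 2000;  // changes only the local copy
    cout << "'input_var' inside pass_by_val() = " << input_var << " @ " << &input_var << endl;
}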


int main()
{
    int var = 1000;
    cout << "'var' outside pass_by_val() = " << var << " @ " << &var << endl;
    pass_by_val(var);
    cout << "'var' outside pass_by_val() = " << var << " @ " << &var << endl;
}


// OUTPUT:
// 'var' outside pass_by_val() = 1000 @ 0x7ffc950125ec
// 'input_var' inside pass_by_val() = 1000 @ 0x7ffc950125cc
// 'input_var' inside pass_by_val() = 2000 @ 0x7ffc950125cc
// 'var' outside pass_by_val() = 1000 @ 0x7ffc950125ec

You can see in the output that while the values of input_var inside the function and var outside the function are initially the same, the memory addresses are different. That shows that only a copy of var's value was passed into the function. When the value of input_var was changed within the function, it did not affect the value of var outside the function because var and input_var are two different variables containing two different integers.

Another way to think about it is that var and input_var are two separate "boxes", and the contents of var were copied into input_var.

Pass-by-reference

Once again we can also demonstrate pass-by-reference using C++.

#include <iostream>
using namespace std;

void pass_by_ref(int& input_var)
{
    cout << "'input_var' inside pass_by_ref() = " << input_var << " @ " << &input_var << endl;
    input_var = 2000;
    cout << "'input_var' inside pass_by_ref() = " << input_var << " @ " << &input_var << endl;
}


int main()
{
    int var = 1000;
    cout << "'var' outside pass_by_ref() = " << var << " @ " << &var << endl;
    pass_by_ref(var);
    cout << "'var' outside pass_by_ref() = " << var << " @ " << &var << endl;
}


// OUTPUT:
// 'var' outside pass_by_ref() = 1000 @ 0x7ffea960334c
// 'input_var' inside pass_by_ref() = 1000 @ 0x7ffea960334c
// 'input_var' inside pass_by_ref() = 2000 @ 0x7ffea960334c
// 'var' outside pass_by_ref() = 2000 @ 0x7ffea960334c

In the example above, I used essentially the same code as in the previous pass-by-value example with one significant change. In the function signature, I specified that the parameter is a reference by adding an ampersand. This tells the C++ compiler to bind the parameter to the caller's variable as a reference, or alias, rather than copying its value. Under the hood, compilers typically implement this by passing the variable's address, much like a pointer.

As shown in the output, the memory location didn't change because var and input_var reference the same location in memory. When the value of input_var is changed within the function, the change is also seen outside the function because it was made to the same object in the same memory location.

Another way to think about it is that 'input_var' is a label which references the box var and they both have the same contents because 'input_var' references 'var'. In this case they're the same box.

Pass-by-object-reference

Finally, let's take a look at a similar example in Python to demonstrate pass-by-object-reference.

def pass_by_obj_ref(input_var):
    print("'input_var' inside pass_by_obj_ref() = ", input_var, '@', hex(id(input_var)))
    input_var = 2000
    print("'input_var' inside pass_by_obj_ref() = ", input_var, '@', hex(id(input_var)))


if __name__ == '__main__':
    var = 1000
    print("'var' outside pass_by_obj_ref() = ", var, '@', hex(id(var)))
    pass_by_obj_ref(var)
    print("'var' outside pass_by_obj_ref() = ", var, '@', hex(id(var)))


# OUTPUT: 
# 'var' outside pass_by_obj_ref() =  1000 @ 0x7ff92671f650
# 'input_var' inside pass_by_obj_ref() =  1000 @ 0x7ff92671f650
# 'input_var' inside pass_by_obj_ref() =  2000 @ 0x7ff92671f6d0
# 'var' outside pass_by_obj_ref() =  1000 @ 0x7ff92671f650

This output looks a little different than what we've seen thus far. You can see that initially when var is passed into the function as input_var both var and input_var have the same memory location just like the pass-by-reference example. However, once the value of input_var was changed within the function, it had a different memory location and different value than var outside the function. What's going on?

Well, recall that in Python everything is an object and variables are just names referring to an object. The integer object itself is only referenced by var; it's not "inside" var. When var was passed to the function, a reference to the actual integer object referenced by var was passed to the function. input_var became another reference to the integer object. However, once input_var was reassigned to a new integer object, it no longer referenced the object passed to the function.

That's a lot of words, I know. Another way to think about it is to think of variables in Python as names. They don't actually contain objects, they just reference them. When you assign one variable to another, or when you pass a variable as an argument to a function, you're simply creating a new name referencing the same object. If you assign a different object to an existing variable, that variable simply no longer references the old object, and begins to reference the new object.
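As a rough sketch of that idea, the snippet below uses the is operator to check whether two names refer to the same object, and == to check whether two objects merely have equal contents. The function describe and the names settings, alias, and lookalike are hypothetical, made up for this illustration.

```python
def describe(param, original):
    # param is a new local name for whatever object the caller passed in.
    return param is original


settings = {"retries": 3}
alias = settings                       # a second name for the same dict
lookalike = {"retries": 3}             # an equal but separate dict object

print(alias is settings)               # True
print(describe(settings, settings))    # True
print(lookalike == settings)           # True
print(lookalike is settings)           # False
```

Both assignment and argument passing produce a new name for the same object, while building a second dict with the same contents produces a different object.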

Why is all this important? Well, when dealing with only immutable object types like integers and strings within functions, it probably isn't all that important. Since immutable objects can't be changed in place, "changing" one really means creating a new object and rebinding the name to it. The end result looks just like pass-by-value: nothing a function does to its parameter can affect the caller's immutable object. With mutable objects like lists, however, that's not the case. You really do need to understand that as long as the variable inside the function still references the object passed to the function, the object can still be changed in place.
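For the immutable case, a minimal sketch might look like the following. The function shout and the variable greeting are hypothetical names. Since a string can't be modified in place, the usual way to get a "changed" value out of a function is to return the new object and rebind the caller's name to it.

```python
def shout(text):
    # str.upper() returns a new string; the caller's string is untouched.
    text = text.upper()
    return text


greeting = "hello"
shout(greeting)                 # return value discarded
print(greeting)                 # hello

greeting = shout(greeting)      # rebind the caller's name to the new string
print(greeting)                 # HELLO
```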

Let's take a look at one last example to show what I mean.

def pass_by_obj_ref(input_var):
    print("'input_var' inside pass_by_obj_ref() = ", input_var, '@', hex(id(input_var)))
    
    input_var.append(4)
    print("'input_var' inside pass_by_obj_ref() = ", input_var, '@', hex(id(input_var)))
    
    input_var = [10, 20, 30]
    print("'input_var' inside pass_by_obj_ref() = ", input_var, '@', hex(id(input_var)))

if __name__ == '__main__':
    var = [1, 2, 3]
    print("'var' outside pass_by_obj_ref() = ", var, '@', hex(id(var)))
    pass_by_obj_ref(var)
    print("'var' outside pass_by_obj_ref() = ", var, '@', hex(id(var)))


# OUTPUT:
# 'var' outside pass_by_obj_ref() =  [1, 2, 3] @ 0x7f0ffc45d0c8
# 'input_var' inside pass_by_obj_ref() =  [1, 2, 3] @ 0x7f0ffc45d0c8
# 'input_var' inside pass_by_obj_ref() =  [1, 2, 3, 4] @ 0x7f0ffc45d0c8
# 'input_var' inside pass_by_obj_ref() =  [10, 20, 30] @ 0x7f0ffc452f08
# 'var' outside pass_by_obj_ref() =  [1, 2, 3, 4] @ 0x7f0ffc45d0c8

In this example, I made a few changes. This time I assigned a list to var instead of an integer. Within the function, I also appended a value to the list, and then replaced the list with a new list.

In the output, you can see that the new value 4 was added to the list in place since the memory location didn't change. However, when I replaced the list with a new list, you can see that input_var was now referencing a new list object at a different memory location and thus the original list, which was still referenced by 'var', did not change.
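The same distinction shows up in a subtler form with augmented assignment. The sketch below uses two hypothetical functions, extend_in_place and extend_by_rebinding. For a list, items += [4] modifies the existing list object, whereas items = items + [4] builds a new list and rebinds only the local name.

```python
def extend_in_place(items):
    items += [4]            # for lists, += mutates the existing object


def extend_by_rebinding(items):
    items = items + [4]     # builds a new list and rebinds the local name
    return items


first = [1, 2, 3]
extend_in_place(first)
print(first)                # [1, 2, 3, 4]

second = [1, 2, 3]
result = extend_by_rebinding(second)
print(second)               # [1, 2, 3]
print(result)               # [1, 2, 3, 4]
```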

Without understanding that input_var was a reference to the same list object referenced by var, you might not have expected that to happen. This was a very simple example, but on larger projects with the same objects being passed to multiple functions, or with mutable objects inside tuples, the cause of an unexpected change might not be so obvious when debugging.
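One common way to avoid this kind of surprise is for the function to work on its own copy. This is only a sketch of that approach, not the only option, and append_safely is a hypothetical name. It copies the incoming list before modifying it, and uses copy.deepcopy from the standard library when the inner objects need to be independent as well.

```python
import copy


def append_safely(items):
    local_items = list(items)   # shallow copy: a new list object
    local_items.append(4)
    return local_items


original = [1, 2, 3]
updated = append_safely(original)
print(original)                 # [1, 2, 3]
print(updated)                  # [1, 2, 3, 4]

nested = [[1, 2], [3]]
independent = copy.deepcopy(nested)   # copies the inner lists too
independent[0].append(99)
print(nested)                   # [[1, 2], [3]]
```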

Try this yourself and see if you can guess what will happen. Create a function and pass in a tuple containing a list. Inside the function, append an item to the list. Will the changes to the list persist outside the function? Why or why not?