mscharhag, Programming and Stuff;

A blog about programming and software development topics, mostly focused on Java technologies including Java EE, Spring and Grails.

  • Monday, 22 November, 2021

    HTTP - Content negotiation

With HTTP, resources are identified using URIs, and a uniquely identified resource might support multiple representations. A representation is a specific form of a particular resource.

    For example:

• an HTML page /index.html might be available in different languages
• product data located at /products/123 can be served in JSON, XML or CSV
• an avatar image /user/avatar might be available in JPEG, PNG and GIF formats

    In all these cases one underlying resource has multiple different representations.

    Content negotiation is the mechanism used by clients and servers to decide which representation should be used.

    Server-driven and agent-driven content negotiation

    We can differentiate between server-driven and agent-driven content negotiation.

With server-driven content negotiation the client tells the server which representations are preferable. The server then picks the representation that best fits the client's needs.

    When using agent-driven content negotiation the server tells the client which representations are available. The client then picks the best matching option.

In practice, server-driven negotiation is used almost exclusively. Unfortunately, there is no standardized format for agent-driven negotiation. Agent-driven negotiation is usually also worse for performance, as it requires an additional request / response round trip. In the rest of this article we will therefore focus on server-driven negotiation.

    Accept headers

    With server-driven negotiation the client uses headers to indicate supported content formats. A server-side algorithm then uses these headers to decide which resource representation should be returned.

Most commonly used is the Accept header, which communicates the media type preferred by the client. For example, consider the following simple HTTP request containing an Accept header:

    GET /monthly-report
    Accept: text/html; q=1.0, text/*; q=0.8

The header tells the server that the client understands HTML (media type text/html) and other text based formats (media type text/*).

    text/* indicates that all subtypes of the text type are supported. To indicate that all media types are supported we can use */*.

    In this example HTML is preferred over other text based formats because it has a higher quality factor (q).
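The matching a server might perform for this header can be sketched in a few lines of Java. This is a simplified illustration (all class and method names are made up for this example); real implementations also handle specificity rules, media type parameters and malformed input:

```java
import java.util.*;

public class AcceptHeaderDemo {

    // Picks the available media type with the highest quality factor (q)
    static String pickBest(String acceptHeader, List<String> available) {
        String best = null;
        double bestQ = -1.0;
        for (String part : acceptHeader.split(",")) {
            String[] tokens = part.trim().split(";");
            String mediaType = tokens[0].trim();
            double q = 1.0; // the default quality factor is 1.0
            for (int i = 1; i < tokens.length; i++) {
                String param = tokens[i].trim();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2));
                }
            }
            for (String candidate : available) {
                if (matches(mediaType, candidate) && q > bestQ) {
                    best = candidate;
                    bestQ = q;
                }
            }
        }
        return best;
    }

    // text/* matches all text subtypes, */* matches everything
    static boolean matches(String pattern, String candidate) {
        if (pattern.equals("*/*") || pattern.equals(candidate)) return true;
        if (pattern.endsWith("/*")) {
            String type = pattern.substring(0, pattern.indexOf('/'));
            return candidate.startsWith(type + "/");
        }
        return false;
    }

    public static void main(String[] args) {
        String accept = "text/html; q=1.0, text/*; q=0.8";
        // text/html wins over text/plain because of its higher q value
        System.out.println(pickBest(accept, List.of("text/plain", "text/html")));
    }
}
```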

Ideally a server would respond with an HTML document to this request. For example:

    HTTP/1.1 200 OK
    Content-Type: text/html
    
    <html>
        <body>
            <h1>Monthly report</h1>
            ...
        </body>
    </html>

    If returning HTML is not feasible, the server can also respond with another text based format, like text/plain:

HTTP/1.1 200 OK
    Content-Type: text/plain
    
    Monthly report
    Bla bli blu
    ...

Besides the Accept header, we can also use the Accept-Language and Accept-Encoding headers. Accept-Language indicates the language preference of the client while Accept-Encoding defines the acceptable content encodings.

    Of course all these headers can be used together. For example:

    GET /monthly-report
    Accept: text/html
    Accept-Language: en-US; q=1.0, en; q=0.9, fr; q=0.4
    Accept-Encoding: gzip, br

Here the client indicates that it prefers

    • an HTML document
    • US English (preferred, q=1.0) but other English variations are also fine (q=0.9). If English is not available, French can do the job too (q=0.4)
• gzip and brotli (br) encodings are supported

    An acceptable response might look like this:

HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Language: en
    Content-Encoding: gzip
    
    <gzipped html document>

    What if the server cannot return an acceptable response?

If the server is unable to fulfill the client's preferences, it can return the HTTP status code 406 (Not Acceptable). This status code indicates that the server cannot produce a response matching the client's preferences.

Depending on the situation it might also be viable to return a response that does not exactly match the client's preferences. For example, assume no language provided in the Accept-Language header is supported by the server. In this case, it can still be a valid option to return a response using a predefined default language. This might be more useful for the client than nothing at all. The client can then look at the Content-Language header of the response and decide whether to use the response or ignore it.
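Such a fallback can be sketched in Java. This is only a sketch under simplifying assumptions (all names are illustrative): it treats the order of the Accept-Language entries as the preference order and ignores quality factors and region subtags:

```java
import java.util.*;

public class LanguageNegotiation {

    // Returns the first client-preferred language the server supports,
    // falling back to a predefined default instead of failing with 406.
    static String pickLanguage(String acceptLanguage, Set<String> supported, String fallback) {
        for (String part : acceptLanguage.split(",")) {
            // strip parameters like "; q=0.9"
            String lang = part.trim().split(";")[0].trim();
            if (supported.contains(lang)) {
                return lang;
            }
        }
        // the client can inspect Content-Language and decide to ignore the response
        return fallback;
    }

    public static void main(String[] args) {
        // the server only supports German and English; the client asks for French
        System.out.println(pickLanguage("fr; q=0.9, fr-FR", Set.of("de", "en"), "en"));
    }
}
```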

    Content negotiation in REST APIs

For REST APIs it can be a viable option to support more than one standard representation for resources. For example, with content negotiation we can support JSON and XML and let the client decide which format it wants to use.

    CSV can also be an interesting option to consider in certain situations as the response can directly be viewed with tools like Excel. For example, consider the following request:

    GET /users
    Accept: text/csv
    

Instead of returning a JSON (or XML) collection, the server can now respond with a list of users in CSV format.

HTTP/1.1 200 OK
    Content-Type: text/csv
    
    Id;Username;Email
    1;john;john.doe@example.com
    2;anna91;anna91@example.com

     

    Interested in more REST related articles? Have a look at my REST API design page.

  • Monday, 1 November, 2021

    Avoid leaking domain logic

Many software architectures try to separate domain logic from other parts of the application. To follow this practice we always need to know what actually is domain logic and what is not. Unfortunately, this distinction is not always easy to make. If we get it wrong, domain logic can easily leak into other components and layers.

    We will go through this problem by looking at examples using a hexagonal application architecture. If you are not familiar with hexagonal architecture (also called ports and adapters architecture) you might be interested in the previous post about the transition from a traditional layered architecture to a hexagonal architecture.

    Assume a shop system that publishes new orders to a messaging system (like Kafka). Our product owner now tells us that we have to listen for these order events and persist the corresponding order in the database.

    Using hexagonal architecture the integration with a messaging system is implemented within an adapter. So, we start with a simple adapter implementation that listens for Kafka events:

    @AllArgsConstructor
    public class KafkaAdapter {
        private final SaveOrderUseCase saveOrderUseCase;
    
    @KafkaListener(topics = ...)
        public void onNewOrderEvent(NewOrderKafkaEvent event) {
            Order order = event.getOrder();
            saveOrderUseCase.saveOrder(order);
        }
    }

In case you are not familiar with the @AllArgsConstructor annotation from Project Lombok: it generates a constructor that accepts each field (here saveOrderUseCase) as a parameter.

    The adapter delegates the saving of the order to a UseCase implementation.

UseCases are part of our domain core and implement domain logic, together with the domain model. Our simple example UseCase looks like this:

    @AllArgsConstructor
    public class SaveOrderUseCase {
        private final SaveOrderPort saveOrderPort;
    
        public void saveOrder(Order order) {
            saveOrderPort.saveOrder(order);
        }
    }

    Nothing special here. We simply use an outgoing Port interface to persist the passed order.

    While the shown approach might work fine, we have a significant problem here: Our business logic has leaked into the Adapter implementation. Maybe you are wondering: what business logic?

We have a simple business rule to implement: every time a new order is retrieved it should be persisted. In our current implementation this rule is implemented by the adapter, while our business layer (the UseCase) only provides a generic save operation.

    Now assume, after some time, a new requirement arrives: Every time a new order is retrieved, a message should be written to an audit log.

With our current implementation we cannot write the audit log message within SaveOrderUseCase. As the name suggests, the UseCase is for saving an order, not for handling a newly retrieved order, and it might be used by other components. So, adding the audit log message here might have undesired side-effects.

    The solution is simple: We write the audit log message in our adapter:

    @AllArgsConstructor
    public class KafkaAdapter {
    
        private final SaveOrderUseCase saveOrderUseCase;
        private final AuditLog auditLog;
    
    @KafkaListener(topics = ...)
        public void onNewOrderEvent(NewOrderKafkaEvent event) {
            Order order = event.getOrder();
            saveOrderUseCase.saveOrder(order);
            auditLog.write("New order retrieved, id: " + order.getId());
        }
    }

    And now we have made it worse. Even more business logic has leaked into the adapter.

    If the auditLog object writes messages into a database, we might also have screwed up transaction handling, which is usually not handled in an incoming adapter.

    Using more specific domain operations

    The core problem here is the generic SaveOrderUseCase. Instead of providing a generic save operation to adapters we should provide a more specific UseCase implementation.

    For example, we can create a NewOrderRetrievedUseCase that accepts newly retrieved orders:

    @AllArgsConstructor
    public class NewOrderRetrievedUseCase {
        private final SaveOrderPort saveOrderPort;
        private final AuditLog auditLog;
    
        @Transactional
    public void onNewOrderRetrieved(NewOrder newOrder) {
        saveOrderPort.saveOrder(newOrder);
        auditLog.write("New order retrieved, id: " + newOrder.getId());
        }
    }

    Now both business rules are implemented within the UseCase. Our adapter implementation is now simply responsible for mapping incoming data and passing it to the use case:

    @AllArgsConstructor
    public class KafkaAdapter {
        private final NewOrderRetrievedUseCase newOrderRetrievedUseCase;
    
    @KafkaListener(topics = ...)
        public void onNewOrderEvent(NewOrderKafkaEvent event) {
            NewOrder newOrder = event.toNewOrder();
            newOrderRetrievedUseCase.onNewOrderRetrieved(newOrder);
        }
    }

This change might seem like only a small difference. However, for future requirements we now have a specific location in our business layer to handle incoming orders. Otherwise, chances are high that new requirements cause more business logic to leak into places where it should not be located.

    Leaks like this happen especially often with too generic create, save/update and delete operations in the domain layer. So, try to be very specific when implementing business operations.

  • Wednesday, 6 October, 2021

    Media types and the Content-Type header

A media type (formerly known as MIME type) is an identifier for file formats and format contents. Media types are used by different internet technologies like e-mail or HTTP.

Media types consist of a type and a subtype, and can optionally contain a suffix and one or more parameters. They follow this syntax:

    type "/" [tree "."] subtype ["+" suffix]* [";" parameter]
    

    For example the media type for JSON documents is:

    application/json

    It consists of the type application with the subtype json.

    A HTML document with UTF-8 encoding can be expressed as:

    text/html; charset=UTF-8

    Here we have the type text, the subtype html and a parameter charset=UTF-8 indicating UTF-8 character encoding.
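Pulling such a string apart into its components is mostly string splitting. A simplified Java sketch (the class name is made up; it ignores suffixes and quoted parameter values):

```java
import java.util.*;

public class MediaTypeParser {

    // Splits a media type string into its type, subtype and parameters
    static Map<String, String> parse(String mediaType) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] parts = mediaType.split(";");
        String[] typeSubtype = parts[0].trim().split("/");
        result.put("type", typeSubtype[0]);
        result.put("subtype", typeSubtype[1]);
        for (int i = 1; i < parts.length; i++) {
            String[] param = parts[i].trim().split("=", 2);
            result.put(param[0], param[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("text/html; charset=UTF-8"));
    }
}
```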

    A suffix can be used to specify the underlying format of a media type. For example, SVG images use the media type:

    image/svg+xml

    The type is image, svg is the subtype and xml the suffix. The suffix tells us that the SVG file format is based on XML.

    Note that subtypes can be organized in a hierarchical tree structure. For example, the binary format used by Apache Thrift uses the following media type:

    application/vnd.apache.thrift.binary

    vnd is a standardized prefix that tells us this is a vendor specific media type.

    The Content-Type header

    With HTTP any message that contains an entity-body should include a Content-Type header to define the media type of the body.

    The RFC says:

    Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream".

The RFC allows clients to guess the media type if the Content-Type header is not present. However, this should be avoided whenever possible.

Guessing the media type of a piece of data is called content sniffing (or MIME sniffing). This practice was (and sometimes still is) used by web browsers and has been the cause of multiple security vulnerabilities. To explicitly tell browsers not to guess certain media types, the following header can be added:

    X-Content-Type-Options: nosniff

    Note that the Content-Type header always contains the media type of the original resource, before any content encoding is applied. Content encoding (like gzip compression) is indicated by the Content-Encoding header.

  • Monday, 20 September, 2021

    From layers to onions and hexagons

    In this post we will explore the transition from a classic layered software architecture to a hexagonal architecture. The hexagonal architecture (also called ports and adapters architecture) is a design pattern to create loosely coupled application components.

    This post was inspired by a German article from Silas Graffy called Von Schichten zu Ringen - Hexagonale Architekturen erklärt.

    Classic layers

    Layering is one of the most widely known techniques to break apart complicated software systems. It has been promoted in many popular books, like Patterns of Enterprise Application Architecture by Martin Fowler.

Layering allows us to build software on top of a lower-level layer without knowing the details of any of the layers below. In an ideal world we can even replace lower-level layers with different implementations. While the number of layers can vary, we mostly see three or four layers in practice.

    Here, we have an example diagram of a three layer architecture:

The presentation layer contains components related to user (or API) interfaces. In the domain layer we find the logic related to the problem the application solves. The database access layer is responsible for database interaction.

The dependency direction is from top to bottom. The code in the presentation layer depends on code in the domain layer, which in turn depends on code located in the database layer.

    As an example we will examine a simple use-case: Creation of a new user. Let's add related classes to the layer diagram:

    In the database layer we have a UserDao class with a saveUser(..) method that accepts a UserEntity class. UserEntity might contain methods required by UserDao for interacting with the database. With ORM-Frameworks (like JPA) UserEntity might contain information related to object-relational mapping.

    The domain layer provides a UserService and a User class. Both might contain domain logic. UserService interacts with UserDao to save a User in the database. UserDao does not know about the User object, so UserService needs to convert User to UserEntity before calling UserDao.saveUser(..).

In the presentation layer we have a UserController class which interacts with the domain layer using the UserService and User classes. The presentation layer also has its own class to represent a user: UserDto might contain utility methods to format field values for presentation in a user interface.

    What is the problem with this?

    We have some potential problems to discuss here.

    First we can easily get the impression that the database is the most important part of the system as all other layers depend on it. However, in modern software development we no longer start with creating huge ER-diagrams for the database layer. Instead, we usually (should) focus on the business domain.

As the domain layer depends on the database layer, it needs to convert its own objects (User) to objects the database layer knows how to use (UserEntity). So we have code that deals with database-layer-specific classes located in the domain layer. Ideally we want the domain layer to focus on domain logic and nothing else.

The domain layer directly uses implementation classes from the database layer. This makes it hard to replace the database layer with a different implementation. This matters even if we do not plan to replace the database with a different storage technology: think of replacing the database layer with mocks for unit testing, or using in-memory databases for local development.

    Abstraction with interfaces

The last of these problems can be solved by introducing interfaces. The obvious and quite common solution is to add an interface in the database layer. Higher-level layers use the interface and do not depend on implementation classes.

    Here we split the UserDao class into an interface (UserDao) and an implementation class (UserDaoImpl). UserService only uses the UserDao interface. This abstraction gives us more flexibility as we can now change UserDao implementations in the database layer.

    However, from the layer perspective nothing changed. We still have code related to the database layer in our domain layer.

    Now, we can do a little bit of magic by moving the interface into the domain layer:

    Note we did not just move the UserDao interface. As UserDao is now part of the domain layer, it uses domain classes (User) instead of database related classes (UserEntity).

This little change reverses the dependency direction between the domain and database layers. The domain layer no longer depends on the database layer. Instead, the database layer depends on the domain layer, as it requires access to the UserDao interface and the User class. The database layer is now responsible for the conversion between User and UserEntity.
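Condensed into code, the reversed dependency looks roughly like this. The class names follow the article's diagrams, but the bodies are a minimal sketch; the in-memory implementation stands in for a real database adapter:

```java
import java.util.*;

// Domain layer: the model class
class User {
    final String name;
    User(String name) { this.name = name; }
}

// Domain layer: the port interface, speaking in domain terms (User)
interface UserDao {
    void saveUser(User user);
}

// Domain layer: depends only on the UserDao interface, never on an implementation
class UserService {
    private final UserDao userDao;
    UserService(UserDao userDao) { this.userDao = userDao; }
    void createUser(String name) { userDao.saveUser(new User(name)); }
}

// Database layer: one possible implementation of the port. A JPA-based
// implementation would convert User to UserEntity here; for unit tests
// we can swap in this in-memory variant without touching the domain code.
class InMemoryUserDao implements UserDao {
    final List<User> stored = new ArrayList<>();
    public void saveUser(User user) { stored.add(user); }
}

public class DependencyInversionDemo {
    public static void main(String[] args) {
        InMemoryUserDao dao = new InMemoryUserDao();
        new UserService(dao).createUser("john");
        System.out.println(dao.stored.size()); // prints 1
    }
}
```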

    In and out

    While the dependency direction has been changed the call direction stays the same:

The domain layer is the center of the application. We can say that the presentation layer calls into the domain layer while the domain layer calls out to the database layer.

    As a next step, we can split layers into more specific components. For example:

    This is what hexagonal architecture (also called ports and adapters) is about.

We no longer have layers here. Instead, we have the application domain in the center and so-called adapters. Adapters provide additional functionality like user interfaces or database access. Some adapters call into the domain center (here: UI and REST API) while others are outgoing adapters called by the domain center via interfaces (here: database, message queue and e-mail).

This allows us to separate pieces of functionality into different modules/packages while the domain logic does not have any outside dependencies.

    The onion architecture

    From the previous step it is easy to move to the onion architecture (sometimes also called clean architecture).

The domain center is split into the domain model and domain services (sometimes called use cases). Application services contain incoming and outgoing adapters. On the outermost layer we locate infrastructure elements like databases or message queues.

    What to remember?

    We looked at the transition from a classic layered architecture to more modern architecture approaches. While the details of hexagonal architecture and onion architecture might vary, both share important parts:

    • The application domain is the core part of the application without any external dependencies. This allows easy testing and modification of domain logic.
    • Adapters located around the domain logic talk with external systems. These adapters can easily be replaced by different implementations without any changes to the domain logic.
    • The dependency direction always goes from the outside (adapters, external dependencies) to the inside (domain logic).
    • The call direction can be in and out of the domain center. At least for calling out of the domain center, we need interfaces to assure the correct dependency direction.


  • Wednesday, 28 July, 2021

    File down- and uploads in RESTful web services

    Usually we use standard data exchange formats like JSON or XML with REST web services. However, many REST services have at least some operations that can be hard to fulfill with just JSON or XML. Examples are uploads of product images, data imports using uploaded CSV files or generation of downloadable PDF reports.

In this post we focus on those operations, which are often categorized as file down- and uploads. This categorization is a bit fuzzy, as sending a simple JSON document can also be seen as a (JSON) file upload operation.

    Think about the operation you want to express

A common mistake is to focus on the specific file format that is required for the operation. Instead, we should think about the operation we want to express. The file format just determines the media type used for the operation.

    For example, assume we want to design an API that let users upload an avatar image to their user account.

    Here, it is usually a good idea to separate the avatar image from the user account resource for various reasons:

• The avatar image is unlikely to change, so it might be a good candidate for caching. On the other hand, the user account resource might contain things like the last login date, which changes frequently.
    • Not all clients accessing the user account might be interested in the avatar image. So, bandwidth can be saved.
    • For clients it is often preferable to load images separately (think of web applications using <img> tags)

    The user account resource might be accessible via:

    /users/<user-id>

    We can come up with a simple sub-resource representing the avatar image:

    /users/<user-id>/avatar

    Uploading an avatar is a simple replace operation which can be expressed via PUT:

    PUT /users/<user-id>/avatar
    Content-Type: image/jpeg
    
    <image data>
    

In case a user wants to delete their avatar image, we can use a simple DELETE operation:

    DELETE /users/<user-id>/avatar
    

And of course clients need a way to show the avatar image. So, we can provide a download operation with GET:

    GET /users/<user-id>/avatar
    

    which returns

    HTTP/1.1 200 Ok
    Content-Type: image/jpeg
    
    <image data>
    

    In this simple example we use a new sub-resource with common update, delete, get operations. The only difference is we use an image media type instead of JSON or XML.

    Let's look at a different example.

    Assume we provide an API to manage product data. We want to extend this API with an option to import products from an uploaded CSV file. Instead of thinking about file uploads we should think about a way to express a product import operation.

    Probably the simplest approach is to send a POST request to a separate resource:

    POST /product-import
    Content-Type: text/csv
    
    <csv data>
    

Alternatively, we can also see this as a bulk operation for products. As we learned in another post about bulk operations with REST, the PATCH method is a possible way to express a bulk operation on a collection. In this case, the CSV document describes the desired changes to the product collection.

    For example:

    PATCH /products
    Content-Type: text/csv
    
    action,id,name,price
    create,,Cool Gadget,3.99
    create,,Nice cap,9.50
    delete,42,,
    

    This example creates two new products and deletes the product with id 42.

    Processing file uploads can take a considerable amount of time. So think about designing it as an asynchronous REST operation.
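One common option for such an asynchronous design is to return status 202 (Accepted) together with a status resource the client can poll. The job URI below is purely illustrative:

HTTP/1.1 202 Accepted
Location: /product-import/jobs/123

The client can then GET the job resource to check whether the import has finished.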

    Mixing files and metadata

    In some situations we might need to attach additional metadata to a file. For example, assume we have an API where users can upload holiday photos. Besides the actual image data a photo might also contain a description, a location where it was taken and more.

    Here, I would (again) recommend using two separate operations for similar reasons as stated in the previous section with the avatar image. Even if the situation is a bit different here (the data is directly linked to the image) it is usually the simpler approach.

    In this case, we can first create a photo resource by sending the actual image:

    POST /photos
    Content-Type: image/jpeg
    
    <image data>

As a response we get:

    HTTP/1.1 201 Created
    Location: /photos/123

    After that, we can attach additional metadata to the photo:

    PUT /photos/123/metadata
    Content-Type: application/json
    
    {
        "description": "Nice shot of a beach in hawaii",
        "location": "hawaii",
        "filename": "hawaii-beach.jpg"
    }
    

    Of course we can also design it the other way around and send the metadata before the image.

    Embedding Base64 encoded files in JSON or XML

In case splitting file content and metadata into separate requests is not possible, we can embed files into JSON / XML documents using Base64 encoding. Base64 encoding converts binary data into a text representation, which can be integrated into other text-based formats like JSON or XML.

    An example request might look like this:

    POST /photos
    Content-Type: application/json
    
    {
        "width": "1280",
        "height": "920",
        "filename": "funny-cat.jpg",
        "image": "TmljZSBleGFt...cGxlIHRleHQ="
    }
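The encoding itself is straightforward with the JDK's java.util.Base64. In this sketch the three bytes are just the JPEG magic-number prefix, standing in for real image data:

```java
import java.util.Base64;

public class Base64Demo {
    public static void main(String[] args) {
        // In a real API these bytes would come from an image file
        byte[] imageData = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

        // encode: binary -> text, safe to embed in a JSON string field
        String encoded = Base64.getEncoder().encodeToString(imageData);
        System.out.println(encoded); // prints /9j/

        // decode: text -> binary, what the server does on receipt
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(decoded.length); // prints 3
    }
}
```

Note that Base64 inflates the payload by roughly a third, which is one reason to prefer separate requests for large files.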

Mixing media types with multipart requests

Another possible approach to transferring image data and metadata in a single request / response is using multipart media types.

Multipart media types require a boundary parameter that is used as a delimiter between different body parts. The following request consists of two body parts: the first one contains the image while the second part contains the metadata.

    For example

    POST /photos
    Content-Type: multipart/mixed; boundary=foobar
    
    --foobar
    Content-Type: image/jpeg
    
    <image data>
    --foobar
    Content-Type: application/json
    
    {
        "width": "1280",
        "height": "920",
        "filename": "funny-cat.jpg"
    }
    --foobar--

    Unfortunately multipart requests / responses are often hard to work with. For example, not every REST client might be able to construct these requests and it can be hard to verify responses in unit tests.
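Building such a body by hand illustrates why: every part needs its own headers, and the boundary and CRLF line endings have to be placed exactly right. A Java sketch using plain string concatenation (in practice a library should do this; the class name and text placeholder are made up):

```java
public class MultipartDemo {

    // Builds a multipart/mixed body by hand to show the wire format.
    // CRLF ("\r\n") line endings are required between parts and headers.
    static String buildBody(String boundary, String imagePart, String jsonPart) {
        String crlf = "\r\n";
        return "--" + boundary + crlf
                + "Content-Type: image/jpeg" + crlf + crlf
                + imagePart + crlf
                + "--" + boundary + crlf
                + "Content-Type: application/json" + crlf + crlf
                + jsonPart + crlf
                + "--" + boundary + "--" + crlf; // closing boundary has trailing "--"
    }

    public static void main(String[] args) {
        String body = buildBody("foobar", "<image data>",
                "{\"filename\": \"funny-cat.jpg\"}");
        System.out.print(body);
    }
}
```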

    Interested in more REST related articles? Have a look at my REST API design page.