mscharhag, Programming and Stuff;

A blog about programming and software development topics, mostly focused on Java technologies including Java EE, Spring and Grails.

Wednesday, 2 September, 2020

REST: Dealing with Pagination

In a previous post we learned how to retrieve resource collections. When those collections become larger, it is often useful to provide a way for clients to retrieve partial collections.

Assume we provide a REST API for painting data. Our database might contain thousands of paintings. However, a web interface showing these paintings to users might only be able to show ten paintings at the same time. To view more paintings, the user needs to navigate to the next page, which shows the following ten paintings. This process of dividing the content into smaller consumable sections (pages) is called Pagination.

Pagination can be an essential part of your API if you are dealing with large collections.

In the following sections we will look at different types of pagination.

Using page and size parameters

The page parameter tells which page should be returned while size indicates how many elements a page should contain.

For example, this might return the first page, containing 10 painting resources.

GET /paintings?page=1&size=10

To get the next page we simply increase the page parameter by one.

Unfortunately it is not always clear if pages start counting with 0 or 1, so make sure to document this properly.

(In my opinion 1 should be preferred because this represents the natural page counting)

A minor issue with this approach might be that the client cannot change the size parameter for a specific page.

For example, after getting the first 10 items of a collection by issuing

GET /paintings?page=1&size=10

we cannot get the second page with a size of 15 by requesting:

GET /paintings?page=2&size=15

This will return items 15-29 of the collection (counting from zero). So, we missed 5 items (10-14).
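To see where the gap comes from, here is a minimal Python sketch of how a server might translate page and size into item positions. The helper name page_bounds is hypothetical, and the sketch assumes 1-based page numbers and zero-based item positions.

```python
def page_bounds(page: int, size: int) -> range:
    """Zero-based item positions covered by a 1-based page number (assumed convention)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be >= 1")
    start = (page - 1) * size
    return range(start, start + size)


first = page_bounds(1, 10)   # starts at position 0
second = page_bounds(2, 15)  # starts at position 15, not 10

# Positions that fall between the end of the first request
# and the start of the second one are never returned.
skipped = list(range(first.stop, second.start))
print(skipped)
```

Because the start position is derived from the size, changing size between requests shifts every page boundary.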

Using offset and limit parameters

Another, very similar approach is the use of offset and limit parameters. offset tells the server the number of items that should be skipped, while limit indicates the number of items to be returned.

For example, this might return the first 10 painting resources:

GET /paintings?offset=0&limit=10

An offset parameter of 0 means that no elements should be skipped.

We can get the following 10 resources by skipping the first 10 resources (= setting the offset to 10):

GET /paintings?offset=10&limit=10

This approach is a bit more flexible because offset and limit do not affect each other. So we can increase the limit for a specific page. We just need to make sure to adjust the offset parameter for the next page request accordingly.

For example, this can be useful if a client displays data using an infinitely scrollable list. If the user scrolls faster, the client might request a larger chunk of resources with the next request.
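The following Python sketch illustrates this from the client's point of view. The fetch function is a hypothetical stand-in for GET /paintings?offset=...&limit=... and simply slices an in-memory list; the chunk sizes are made up for illustration.

```python
def fetch(items, offset, limit):
    """Stand-in for GET /paintings?offset={offset}&limit={limit} (hypothetical)."""
    return items[offset:offset + limit]


paintings = list(range(100))  # pretend these are 100 painting resources

offset = 0
seen = []
# The client grows the chunk size as the user scrolls faster.
for limit in (10, 15, 25):
    chunk = fetch(paintings, offset, limit)
    seen.extend(chunk)
    # The next offset depends on how many items we actually received,
    # not on a fixed page size.
    offset += len(chunk)
```

As long as the client advances the offset by the number of items it received, no item is skipped or returned twice, regardless of how the limit changes.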

The downsides?

Both previous solutions can work fine. They are often very easy to implement. However, both share two downsides.

Depending on the underlying database and data structure, you might run into performance problems for large offsets / page numbers. This is often an issue for relational databases (see this Stack Overflow question for MySQL or this one for Postgres).

Another problem is resource skipping caused by delete operations. Assume we request the first page by issuing:

GET /paintings?page=1&size=10

After we retrieved the response, someone deletes a resource that is located on the first page. Now we request the second page with:

GET /paintings?page=2&size=10

We now skipped one resource. Due to the deletion of a resource on the first page, all other resources in the collection move one position forward. The first resource of page two has moved to page one. 
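The effect is easy to reproduce with a small Python sketch. The get_page helper is hypothetical and works on a plain list of ids instead of a real database.

```python
def get_page(items, page, size):
    """Return a 1-based page from an in-memory list (stand-in for the API)."""
    start = (page - 1) * size
    return items[start:start + size]


ids = list(range(1, 31))      # 30 resources with ids 1..30

page1 = get_page(ids, 1, 10)  # ids 1..10
ids.remove(3)                 # someone deletes a resource from the first page
page2 = get_page(ids, 2, 10)  # now starts at id 12 instead of 11

seen = page1 + page2
print(11 in seen)             # the resource with id 11 was never delivered
```

The resource with id 11 moved to the first page after the deletion, so a client that already fetched page one never sees it.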

Seek Pagination

One approach that addresses these downsides is called Seek Pagination. Here, we use resource identifiers to indicate the collection offset.

For example, this might return the first five resources:

GET /paintings?limit=5

Response:

[
    { "id" : 2, ... },
    { "id" : 3, ... },
    { "id" : 5, ... },
    { "id" : 8, ... },
    { "id" : 9, ... }
]

To get the next five resources, we pass the id of the last resource we received:

GET /paintings?last_id=9&limit=5

Response:

[
    { "id" : 10, ... },
    { "id" : 11, ... },
    { "id" : 13, ... },
    { "id" : 14, ... },
    { "id" : 17, ... }
]

This way we can make sure we do not accidentally skip a resource.

This is also easy to implement with a relational database. It is very likely that we just have to compare the primary key to the last_id parameter. The resulting query probably looks similar to this:

select * from painting where id > last_id order by id limit 5;
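As a rough sketch, the same query can be tried out with Python's built-in sqlite3 module. The table layout and the next_page helper are assumptions for illustration, and passing last_id=0 for the first page assumes that all ids are positive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table painting (id integer primary key, name text)")
conn.executemany(
    "insert into painting (id, name) values (?, ?)",
    [(i, f"painting-{i}") for i in (2, 3, 5, 8, 9, 10, 11, 13, 14, 17)],
)


def next_page(conn, last_id, limit):
    """Return the ids of up to `limit` rows that come after `last_id`."""
    rows = conn.execute(
        "select id from painting where id > ? order by id limit ?",
        (last_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


first = next_page(conn, 0, 5)            # assumes ids are positive
conn.execute("delete from painting where id = 5")  # deletion on the first page
second = next_page(conn, first[-1], 5)   # continues after the last id we saw
```

Even though a row from the first page was deleted in between, the second request still starts right after id 9, so nothing is skipped.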

Response format

When using JSON, partial results should be returned as a JSON object (instead of a JSON array). Besides the collection items, the total number of items should be included.

Example response:

{
    "total": 4321,
    "items": [
        {
            "id": 1,
            "name": "Mona Lisa",
            "artist": "Leonardo da Vinci"
        }, {
            "id": 2,
            "name": "The Starry Night",
            "artist": "Vincent van Gogh"
        }
    ]
}

When using page and size parameters it is also a good idea to return the total number of available pages.
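A minimal Python sketch of such a response could look like this. The field names totalPages, page and size are assumptions and not part of any standard; only total and items are taken from the example above.

```python
def total_pages(total: int, size: int) -> int:
    """Number of pages needed for `total` items (rounded up)."""
    return (total + size - 1) // size


def paged_response(items, total, page, size):
    """Wrap one page of items in a JSON-style object (field names are assumptions)."""
    return {
        "total": total,
        "totalPages": total_pages(total, size),
        "page": page,
        "size": size,
        "items": items,
    }


response = paged_response(
    items=[{"id": 1, "name": "Mona Lisa", "artist": "Leonardo da Vinci"}],
    total=4321,
    page=1,
    size=10,
)
```

The rounding up matters: 4321 items with a page size of 10 need one extra page for the single remaining item.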

Hypermedia controls

If you are using Hypermedia controls in your API, you should also add links for the first, last, next and previous pages. This helps decouple the client from your pagination logic.

For example:

GET /paintings?offset=0&limit=10
{
    "total": 4317,
    "items": [
        {
            "id": 1,
            "name": "Mona Lisa",
            "artist": "Leonardo da Vinci"
        }, {
            "id": 2,
            "name": "The Starry Night",
            "artist": "Vincent van Gogh"
        },
        ...
    ],
    "links": [
        { "rel": "self", "href": "/paintings?offset=0&limit=10" },
        { "rel": "next", "href": "/paintings?offset=10&limit=10" },
        { "rel": "last", "href": "/paintings?offset=4310&limit=10" },
        { "rel": "by-offset", "href": "/paintings?offset={offset}&limit=10" }
    ]
}

Note that we requested the first page. Therefore, the first and previous links are missing. The by-offset link uses a URI Template, so the client can choose an arbitrary offset.
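How a server might build these links is sketched below in Python. The pagination_links function and the rel name "prev" are assumptions; the other rel names and the URL format follow the example response.

```python
def pagination_links(total, offset, limit, base="/paintings"):
    """Build link objects for an offset/limit page (hypothetical helper)."""

    def href(o):
        return f"{base}?offset={o}&limit={limit}"

    links = [{"rel": "self", "href": href(offset)}]
    if offset > 0:
        # Not on the first page: offer a way back.
        links.append({"rel": "first", "href": href(0)})
        links.append({"rel": "prev", "href": href(max(offset - limit, 0))})
    if offset + limit < total:
        # More items follow: offer next and last.
        last_offset = ((total - 1) // limit) * limit
        links.append({"rel": "next", "href": href(offset + limit)})
        links.append({"rel": "last", "href": href(last_offset)})
    # URI Template that lets the client pick an arbitrary offset.
    links.append({"rel": "by-offset", "href": f"{base}?offset={{offset}}&limit={limit}"})
    return links


first_page = pagination_links(total=4317, offset=0, limit=10)
last_page = pagination_links(total=4317, offset=4310, limit=10)
```

For the first page this yields the self, next, last and by-offset links; for the last page the next and last links are dropped and first and prev appear instead.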

 
