开发者

What’s the best RESTful method to return total number of items in an object?

I'm developing a REST API service for a large social networking website I'm involved in. So far, it's working great. I can issue GET, POST, PUT and DELETE requests to object URLs and affect my data. However, this data is paged (limited to 30 results at a time).

What would be the best RESTful way to get the total number of say, members, via my API?

Currently, I issue requests to a URL structure lik开发者_如何学JAVAe the following:

  • /api/members - Returns a list of members (30 at a time as mentioned above)
  • /api/members/1 - Affects a single member, depending on request method used

My question is: how would I then use a similar URL structure to get the total number of members in my application? Obviously requesting just the id field (similar to Facebook's Graph API) and counting the results would be ineffective given only a slice of 30 results would only be returned.


I have been doing some extensive research into this and other REST paging related questions lately and thought it constructive to add some of my findings here. I'm expanding the question a bit to include thoughts on paging as well as the count as they are intimitely related.

Headers

The paging metadata is included in the response in the form of response headers. The big benefit of this approach is that the response payload itself is just the actual data requestor was asking for. Making processing the response easier for clients that are not interested in the paging information.

There are a bunch of (standard and custom) headers used in the wild to return paging related information, including the total count.

X-Total-Count

X-Total-Count: 234

This is used in some APIs I found in the wild. There are also NPM packages for adding support for this header to e.g. Loopback. Some articles recommend setting this header as well.

It is often used in combination with the Link header, which is a pretty good solution for paging, but lacks the total count information.

Link

Link: </TheBook/chapter2>;
      rel="previous"; title*=UTF-8'de'letztes%20Kapitel,
      </TheBook/chapter4>;
      rel="next"; title*=UTF-8'de'n%c3%a4chstes%20Kapitel

I feel, from reading a lot on this subject, that the general consensus is to use the Link header to provide paging links to clients using rel=next, rel=previous etc. The problem with this is that it lacks the information of how many total records there are, which is why many APIs combine this with the X-Total-Count header.

Alternatively, some APIs and e.g. the JsonApi standard, use the Link format, but add the information in a response envelope instead of to a header. This simplifies access to the metadata (and creates a place to add the total count information) at the expense of increasing complexity of accessing the actual data itself (by adding an envelope).

Content-Range

Content-Range: items 0-49/234

Promoted by a blog article named Range header, I choose you (for pagination)!. The author makes a strong case for using the Range and Content-Range headers for pagination. When we carefully read the RFC on these headers, we find that extending their meaning beyond ranges of bytes was actually anticipated by the RFC and is explicitly permitted. When used in the context of items instead of bytes, the Range header actually gives us a way to both request a certain range of items and indicate what range of the total result the response items relate to. This header also gives a great way to show the total count. And it is a true standard that mostly maps one-to-one to paging. It is also used in the wild.

Envelope

Many APIs, including the one from our favorite Q&A website use an envelope, a wrapper around the data that is used to add meta information about the data. Also, OData and JsonApi standards both use a response envelope.

The big downside to this (imho) is that processing the response data becomes more complex as the actual data has to be found somewhere in the envelope. Also there are many different formats for that envelope and you have to use the right one. It is telling that the response envelopes from OData and JsonApi are wildly different, with OData mixing in metadata at multiple points in the response.

Separate endpoint

I think this has been covered enough in the other answers. I did not investigate this much because I agree with the comments that this is confusing as you now have multiple types of endpoints. I think it's nicest if every endpoint represents a (collection of) resource(s).

Further thoughts

We don't only have to communicate the paging meta information related to the response, but also allow the client to request specific pages/ranges. It is interesting to also look at this aspect to end up with a coherent solution. Here too we can use headers (the Range header seems very suitable), or other mechanisms such as query parameters. Some people advocate treating pages of results as separate resources, which may make sense in some use cases (e.g. /books/231/pages/52. I ended up selecting a wild range of frequently used request parameters such as pagesize, page[size] and limit etc in addition to supporting the Range header (and as request parameter as well).


While the response to /API/users is paged and returns only 30, records, there's nothing preventing you from including in the response also the total number of records, and other relevant info, like the page size, the page number/offset, etc.

The StackOverflow API is a good example of that same design. Here's the documentation for the Users method - https://api.stackexchange.com/docs/users


I prefer using HTTP Headers for this kind of contextual information.

For the total number of elements, I use the X-total-count header.
For links to next, previous page, etc. I use HTTP Link header:
http://www.w3.org/wiki/LinkHeader

Github does it the same way: https://docs.github.com/en/rest/overview/resources-in-the-rest-api#pagination

In my opinion, it's cleaner since it can be used also when you return content that doesn't support hyperlinks (i.e binaries, pictures).


Alternative when you don't need actual items

Franci Penov's answer is certainly the best way to go so you always return items along with all additional metadata about your entities being requested. That's the way it should be done.

but sometimes returning all data doesn't make sense, because you may not need them at all. Maybe all you need is that metadata about your requested resource. Like total count or number of pages or something else. In such case you can always have URL query tell your service not to return items but rather just metadata like:

/api/members?metaonly=true
/api/members?includeitems=0

or something similar...


You could return the count as a custom HTTP header in response to a HEAD request. This way, if a client only wants the count, you don't need to return the actual list, and there's no need for an additional URL.

(Or, if you're in a controlled environment from endpoint to endpoint, you could use a custom HTTP verb such as COUNT.)


I would recommend adding headers for the same, like:

HTTP/1.1 200

Pagination-Count: 100
Pagination-Page: 5
Pagination-Limit: 20
Content-Type: application/json

[
  {
    "id": 10,
    "name": "shirt",
    "color": "red",
    "price": "$23"
  },
  {
    "id": 11,
    "name": "shirt",
    "color": "blue",
    "price": "$25"
  }
]

For details refer to:

https://github.com/adnan-kamili/rest-api-response-format

For swagger file:

https://github.com/adnan-kamili/swagger-response-template


As of "X-"-Prefix was deprecated. (see: https://www.rfc-editor.org/rfc/rfc6648)

We found the "Accept-Ranges" as being the best bet to map the pagination ranging: https://www.rfc-editor.org/rfc/rfc7233#section-2.3 As the "Range Units" may either be "bytes" or "token". Both do not represent a custom data type. (see: https://www.rfc-editor.org/rfc/rfc7233#section-4.2) Still, it is stated that

HTTP/1.1 implementations MAY ignore ranges specified using other units.

Which indicates: using custom Range Units is not against the protocol, but it MAY be ignored.

This way, we would have to set the Accept-Ranges to "members" or whatever ranged unit type, we'd expect. And in addition, also set the Content-Range to the current range. (see: https://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.12)

Either way, I would stick to the recommendation of RFC7233 (https://www.rfc-editor.org/rfc/rfc7233#page-8) to send a 206 instead of 200:

If all of the preconditions are true, the server supports the Range
header field for the target resource, and the specified range(s) are
valid and satisfiable (as defined in Section 2.1), the server SHOULD
send a 206 (Partial Content) response with a payload containing one
or more partial representations that correspond to the satisfiable
ranges requested, as defined in Section 4.

So, as a result, we would have the following HTTP header fields:

For Partial Content:

206 Partial Content
Accept-Ranges: members
Content-Range: members 0-20/100

For full Content:

200 OK
Accept-Ranges: members
Content-Range: members 0-20/20


What about a new end point > /api/members/count which just calls Members.Count() and returns the result


Seems easiest to just add a

GET
/api/members/count

and return the total count of members


Sometimes frameworks (like $resource/AngularJS) require an array as a query result, and you can't really have a response like {count:10,items:[...]} in this case I store "count" in responseHeaders.

P. S. Actually you can do that with $resource/AngularJS, but it needs some tweaks.


You could consider counts as a resource. The URL would then be:

/api/counts/member


Interesting discussion regarding Designing REST API for returning count of multiple objects: https://groups.google.com/g/api-craft/c/qbI2QRrpFew/m/h30DYnrqEwAJ?pli=1

As an API consumer, I would expect each count value to be represented either as a subresource to the countable resource (i.e. GET /tasks/count for a count of tasks), or as a field in a bigger aggregation of metadata related to the concerned resource (i.e. GET /tasks/metadata). By scoping related endpoints under the same parent resource (i.e. /tasks), the API becomes intuitive, and the purpose of an endpoint can (usually) be inferred from its path and HTTP method.

Additional thoughts:

  1. If each individual count is only useful in combination with other counts (for a statistics dashboard, for example), you could possibly expose a single endpoint which aggregates and returns all counts at once.
  2. If you have an existing endpoint for listing all resources (i.e. GET /tasks for listing all tasks), the count could be included in the response as metadata, either as HTTP headers or in the response body. Doing this will incur unnecessary load on the API, which might be negligible depending on your use case.


Seeing that "X-" prefix was deprecated. Here's what I came up with:

  1. Added another item count:23 to the response
  2. Removed the item from response before using data in the app.


If resource count is resource metadata => new endpoint

If the resource count (and possibly other metadata) is useful to the client app and end user, then it doesn't need to be in a header; it can legitimately be in the response body. There can be other metadata (other totals or averages or stats in general) that are relevant for a given resource.

In this case, I'd use a specific endpoint for resource metadata. Back to the social network members example in the original question, there could be:

GET /api/members         // collection of members, which could be paginated
GET /api/members/{id}    // a single member
GET /api/members/stats   // total member count; member count per region; average posts per member; etc.

Instead of stats you may prefer metadata, totals or something else in the endpoint (YMMV). But we treat this endpoint as returning something specific for the members resource. That's why it's a separate endpoint and not a query string parameter. Analogously, we should have a separate endpoint--not a query param--for the posts of a member:

GET /api/members/{id}/posts   

If resource count is for pagination => use headers

If the client app is handling the number of resources returned in the GET request for pagination purposes, then this resource count is messaging metadata.

In this case, I agree using headers is a better approach. You should look at the answers by Stijn de Witt and adnan kamili.


When requesting paginated data, you know (by explicit page size parameter value or default page size value) the page size, so you know if you got all data in response or not. When there is less data in response than is a page size, then you got whole data. When a full page is returned, you have to ask again for another page.

I prefer have separate endpoint for count (or same endpoint with parameter countOnly). Because you could prepare end user for long/time consuming process by showing properly initiated progressbar.

If you want to return datasize in each response, there should be pageSize, offset mentionded as well. To be honest the best way is to repeat a request filters too. But the response became very complex. So, I prefer dedicated endpoint to return count.

<data>
  <originalRequest>
    <filter/>
    <filter/>
  </originalReqeust>
  <totalRecordCount/>
  <pageSize/>
  <offset/>
  <list>
     <item/>
     <item/>
  </list>
</data>

Couleage of mine, prefer a countOnly parameter to existing endpoint. So, when specified the response contains metadata only.

endpoint?filter=value

<data>
  <count/>
  <list>
    <item/>
    ...
  </list>
</data>

endpoint?filter=value&countOnly=true

<data>
  <count/>
  <!-- empty list -->
  <list/>
</data>
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜