I think it's great that we're talking about standardising how we build RESTful JSON APIs. However, I don't think this has got it quite right yet.
What's the reason for the top level rel? It seems like it's just there to stop the urls from being repeated and to save space, but isn't that what gzip is for? Why complicate the data format and require all that extra logic when gzip would remove most of the redundancy before transmission anyway?
Also the name is a bit of an annoying land grab. It'll make it hard to talk about JSON APIs without getting them confused with JSON APIs that specifically use "JSON API".
Lastly, it really seems based on a rails active record style data store, it's assuming ids are the most important thing and that links are all relations that point to other objects within the system. Proper hyperlinks can point anywhere and can link together disparate systems which don't necessarily all use the exact same formats.
> What's the reason for the top level rel? It seems like it's just there to stop the urls from being repeated and to save space, but isn't that what gzip is for?
It also makes it possible to cache things locally indexed on their IDs, and to form URLs that make requests for just the precise documents that aren't available locally. In order to achieve this, it's necessary to have both (1) IDs, and (2) a way to convert a list of IDs into a single request for all of the documents at once.
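As a rough illustration of that idea, here is a minimal Python sketch. The cache shape, the `people` collection, the `{ids}` placeholder, and the comma-separated URL format are all assumptions made for the example, not something the text above pins down.

```python
def missing_ids(cache, wanted_ids):
    """Return the wanted IDs that are not in the local cache, keeping order."""
    return [doc_id for doc_id in wanted_ids if doc_id not in cache]


def batched_url(template, ids):
    """Fill a hypothetical '{ids}' placeholder with a comma-separated ID list."""
    return template.replace("{ids}", ",".join(str(i) for i in ids))


# Hypothetical local cache, indexed by document ID.
cache = {"1": {"name": "Ann"}, "3": {"name": "Cy"}}

# IDs referenced by a freshly received document.
wanted = ["1", "2", "3", "4"]

# Only the documents we do not already hold end up in the request.
to_fetch = missing_ids(cache, wanted)
url = batched_url("http://example.com/people/{ids}", to_fetch)
print(url)
```

The point of the sketch is only that both ingredients are needed: without IDs there is nothing to diff against the cache, and without a template there is no way to turn the remaining IDs into one request.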
> Also the name is a bit of an annoying land grab
HAL is "The Hypertext Application Language". In general, people tend to be using generic names for these things, so I chose an available, generic name.
> Lastly, it really seems based on a rails active record style data store, it's assuming ids are the most important thing and that links are all relations that point to other objects within the system
I reviewed a large number of server-side solutions (Firebase, Parse, CouchDB, Django, Rails) and they all had the concept of an ID for the document. As I said above, this ID is useful to keep track of which documents have already been cached locally, and how to formulate a URL that makes a request for just the missing documents. I don't consider this solution to be particularly tied to ActiveRecord.
Less importantly, it is also more convenient to cache documents on the server using their IDs (or slugs, or whatever the storage wants to use), and allow a top-level configuration to define how to generate URLs. This allows server-side solutions to serialize and cache documents without having to be plugged into the router architecture, but enforces a URL-centric view once the HTTP response is built.
The thing is that "JSON API" was not available; it's already in common use to describe APIs that use JSON. "HAL" is totally not the same; it's clearly a name chosen to not conflict with existing terminology.
The proper layer for caching is HTTP, do we really want to end up with duplicated overlapping functionality between the layers? I see you have an application specific need for a certain thing but I don't think it generalises enough.
A few other things I noticed:
- There's no namespacing of your special properties; they're just mixed up with the domain-specific properties. So I can't have a property of my object named "rel" and, worse still, you can't ever add features to the spec without breaking backwards compatibility.
- No type information in items, which kind of breaks automatic caching.
- Multiple entities returned for a URL with no indication as to which is the main one represented by that URL and which are just extra associated ones.
> The proper layer for caching is HTTP, do we really want to end up with duplicated overlapping functionality between the layers?
HTTP caching doesn't work well with compound documents that represent a graph of objects. The kind of caching described in JSON API allows an application to group together requests for documents while avoiding making requests for documents it already has. The only way HTTP caching works is if every request for a document is 1:1 with an HTTP request, which doesn't map onto my experience (and our experience with Ember Data) at all.
> I see you have an application specific need for a certain thing but I don't think it generalises enough.
This functionality was extracted out of a general purpose framework used by a number of applications that made heavy use of it. The ability to send a normalized graph of objects in a single payload, and then make a small number of additional requests, only for the documents that the client doesn't already have, is a huge win for clients working with a non-trivial number of related objects.
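To make the "normalized graph plus a few follow-up requests" idea concrete, here is a small Python sketch. The payload layout (top-level collections, a per-document `links` object) and the `link_types` map from relation name to collection are hypothetical choices for illustration and may not match the spec's actual format.

```python
def merge_payload(store, payload, link_types):
    """Index a compound payload by (type, id) and report unresolved links."""
    # First pass: store every document that arrived in the payload.
    for doc_type, docs in payload.items():
        for doc in docs:
            store[(doc_type, doc["id"])] = doc

    # Second pass: collect linked documents the client still does not have.
    missing = set()
    for docs in payload.values():
        for doc in docs:
            for rel, target in doc.get("links", {}).items():
                ids = target if isinstance(target, list) else [target]
                for target_id in ids:
                    key = (link_types[rel], target_id)
                    if key not in store:
                        missing.add(key)
    return missing


# Hypothetical compound document: one post plus some of its related objects.
payload = {
    "posts": [{"id": "1", "title": "Hello",
               "links": {"author": "9", "comments": ["5", "12"]}}],
    "people": [{"id": "9", "name": "Ann"}],
    "comments": [{"id": "5", "body": "Nice"}],
}
link_types = {"author": "people", "comments": "comments"}  # assumed mapping

store = {}
missing = merge_payload(store, payload, link_types)
print(sorted(missing))
```

In this example only comment `12` would need a follow-up request, since the author and the other comment were included in the same payload.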
The rest of your concerns are very valid. FWIW, I envision future extensions being added to the `meta` section, which is already reserved. ID, URL, and rel are the building blocks of the graph, while other kinds of extensions (like pagination, etc.) are optional metadata. But you raise a good point and I will give it some thought :)
> This functionality was extracted out of a general purpose framework used by a number of applications that made heavy use of it
That's exactly it! This looks like a sensible design for an API served from Rails and consumed by Ember.js. It's just overselling it a bit to imply that this is how all JSON APIs should be.
I don't think "served from Rails" and "consumed by Ember.js" is quite right. It was designed to work with existing servers that have facilities for easily generating JSON (Rails, Django, various Node frameworks), and smart clients that want to index local data by ID (Angular, Backbone, Ember, etc).
In short, it's an attempt to extract the learnings about efficiently transporting a non-trivial number of objects over a REST-like transport in a sometimes-incremental way. Many different server/client combinations have been attempting to do this in an ad-hoc way for years, and Ember Data was simply an attempt to try something general-purpose out in the real world.
"HAL" describes something specific and has uniqueness to it- seriously dude, pick a name that's something short of "this is the internet," even if it is doing something core such as defining interlinked data in 21c.
How's that, Interlinked Json, IJ? Relational Json? Seriously, pick something that's not going to muddy the highest level of namespace we swim in wycats.
> It also makes it possible to cache things locally indexed on their IDs, and to form URLs that make requests for just the precise documents that aren't available locally. In order to achieve this, it's necessary to have both (1) IDs, and (2) a way to convert a list of IDs into a single request for all of the documents at once.
Couldn't the id column contain a canonical url then? E.g.:
I think the biggest benefit of this kind of specification would be for data that is (partly) distributed and (partly) shared between different hosts, as there would be at least some common ground for both clients and servers on how to communicate. IDs are very abstract and do not necessarily tie data to a particular host. Of course, for some data domains the ID could be a URL, but that is a decision for the data provider. Making that decision in the spec would cripple it for general use.
I'd say it's the other way around. URLs are opaque for the client, while IDs imply more knowledge of the implementation. E.g. the client would have to know which host to communicate with and how to build URLs from IDs. With hyperlinked documents, all the client needs to know is HTTP.
URLs are opaque, and can often serve as very useful IDs, but, alone, they imply a one-at-a-time model of fetching documents, and this spec is trying to provide a way to easily request only the documents a client needs in a compound document.
Keep in mind that this spec actually requires that every ID be able to be readily converted into a URL based on information found in the same payload, so URLs are still front-and-center in the design. It just separates out the notion of a unique identifier, so that it can be used in other kinds of requests.
How would you combine multiple authors into a single request? With IDs + a template, you can form a single request for all of the "people" you don't have yet. If you use URLs as IDs, you lose that ability, which is one of the primary goals here.
If I understand your question correctly, I would say that http pipelining solves that issue. It can only be used for GET requests, so there are limitations.
SPDY begins to offer some very appealing alternatives where when sending a document the transport can push all of the individual dependent documents. It really does fix things, begins pushing all the data at once, in a glorious resource-oriented fashion. That said, I would also enjoy a spec that does resource-description of subresources so we can send linked data around without having to have every piece of data be an endpoint.
That said, the immediate follow on question arises- now that we're sending sub-resources, can we get the most important agent to understand and grok our sub-resources- can the browser follow our subresourcing and those subresources to their canonical URLs, and serve those subresources if it's seen them inside another document? There are two questions- first, is your spec good enough to enable that facility where addressing can be well known- here, in this Json Resource Description spec presented yes, via URI templates, very good- and second, does the browser bother to inspect the JSON it sees? No? Well, I'm not super bothered by this academic interest not being materialized, knowing at least in principle the specs make it possible.
Thanks for the link - I wasn't aware there were so many issues with pipelining. I have mainly used it server to server, where it seems to present fewer problems (not surprisingly really - I'm in much better control of the chain of components).
It can only be used for GET requests, has problems in web browsers, and still requires the overhead of individual requests on the server side to construct and return many responses.
In theory, things like pipelining allow you to never have to worry about compound documents. In practice, I don't know anyone who has gotten this to work well for browser clients and general-purpose frameworks when dealing with non-trivial numbers of documents.
The server side support seems like the wrong thing to base the protocol design on, but I admit that is probably just me showing my limited experience with very high traffic APIs. I would think, though, that much of this could be alleviated by proper caching. As requests would naturally be broken down into individual resources, presumably that could be done efficiently.
The point of browser support is probably more pressing. I'm curious as to how big an issue that still is? Which browsers support it properly these days and which don't?
I wonder if it would be worth building an API around the assumption of support for pipelining and then providing a fallback hack for clients that lack support, e.g. something similar to the good old _method hidden-field hack for lack of HTTP method support. I'm thinking of something like an optional "batch request endpoint" that would tunnel through multiple requests, similar to what a pipeline would do. I believe Facebook is offering something similar in their APIs.
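A very rough Python sketch of what such a batch endpoint could look like on the server side follows. The request format (a JSON array of `method`/`path` objects), the `routes` table, and the response shape are all invented for illustration; this is not a description of Facebook's or anyone else's actual API.

```python
import json


def handle_batch(body, routes):
    """Run each tunnelled sub-request in order and return the responses as JSON."""
    responses = []
    for sub in json.loads(body):
        handler = routes.get((sub["method"], sub["path"]))
        if handler is None:
            # Unknown sub-requests fail individually, not the whole batch.
            responses.append({"status": 404, "body": None})
        else:
            responses.append({"status": 200, "body": handler()})
    return json.dumps(responses)


# Hypothetical route table standing in for a real router.
routes = {
    ("GET", "/people/2"): lambda: {"id": "2", "name": "Bo"},
    ("GET", "/people/4"): lambda: {"id": "4", "name": "Di"},
}

# One HTTP request body carrying three logical requests.
request_body = json.dumps([
    {"method": "GET", "path": "/people/2"},
    {"method": "GET", "path": "/people/4"},
    {"method": "GET", "path": "/people/99"},
])

result = json.loads(handle_batch(request_body, routes))
print([r["status"] for r in result])
```

Responses come back in the same order as the sub-requests, which mirrors how a pipelined connection would return them.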
N.B. Pipelining can be used for any idempotent request (so PUTs and DELETEs work too). That said, lack of broad implementation support for pipelining is still an issue. Since HTTP/2.0 is being based on SPDY, hopefully we'll see a day where this is less of an issue.