How should I represent tabular data in JSON?
I'm writing an API for retrieving data from a JDBC-connected Java Servlet via JSON. I've chosen to use JSON because we'll want to do sorts and other operations on the data in the browser, and we'll be accessing the data from across domains.
Since I'm essentially doing SQL queries in JavaScript, the data that comes back is tabular in nature. I started to write this so that you get back a list of column labels, then arrays of values, for example:
{
  "columns": [
    "given_name",
    "surname"
  ],
  "results": [
    [
      "Joe",
      "Schmoe"
    ],
    [
      "Jane",
      "Doe"
    ]
  ]
}
But as I start to write the JavaScript to deal with the returned data, I wonder if it might be better to just output the results with key/value pairs, such as:
{
  "results": [
    {
      "given_name": "Joe",
      "surname": "Schmoe"
    },
    {
      "given_name": "Jane",
      "surname": "Doe"
    }
  ]
}
If you're returning a lot of results, that's a lot of repeated text. But we're going to be transporting gzipped, so I'm not too concerned about bandwidth.
Basically, should I engineer this so that I'm accessing my data with
$.getJSON(query, function(data) {
  var columns = data.columns;
  var results = data.results;
  $.each(results, function(key, row) {
    console.log(row[columns.indexOf('surname')]);
  });
});
or the much prettier
$.getJSON(query, function(data) {
  var results = data.results;
  $.each(results, function(key, row) {
    console.log(row.surname);
  });
});
?
Essentially, I want to know if the potential hit to performance justifies the much cleaner syntax of the latter option.
Follow up
I did implement it both ways and profiled them. Profiling was a great idea! The differences in performance were marginal. The differences in data transfer size were substantial, but with Gzip compression, the variance was down to 5-6% between both formats and between very large and very small data sets. So I'm going with the prettier implementation. For this particular application, I can expect all clients to support Gzip/Deflate, so the size doesn't matter, and the computational complexity on both the client and server is similar enough that it doesn't matter.
For anyone interested, here is my data with graphs!
Profile both. Optimize afterwards.
Synthesizing other answers:
- Your wire format doesn't have to be the same as your in-memory format.
- Profile which is better - see if it makes a difference.
- Simpler is usually better to start with.
Further:
- If you just have a page of results, and few users, then the 2nd format may be no worse than the 1st format.
- If your data is quite sparse, the 2nd format may well be better.
- If you're sending thousands of rows of data, and you have millions of users, then it's possible that the size of data you send can start to matter, and perhaps the 1st format may help.
- You can't guarantee that all user agents support gzip / deflate, so bear this in mind.
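One rough way to act on "profile both" for payload size is sketched below. It assumes a Node.js environment (for the built-in zlib module) and uses made-up sample rows; makeRows and measure are hypothetical helper names, and real data will compress differently, so treat the numbers as a starting point only.

```javascript
// Sketch: compare raw and gzipped payload sizes for the two formats.
// The sample data and the row count are invented for illustration.
const zlib = require('zlib');

// Build n fake rows shaped like the question's [given_name, surname] arrays.
function makeRows(n) {
  const rows = [];
  for (let i = 0; i < n; i++) {
    rows.push(['Given' + i, 'Surname' + i]);
  }
  return rows;
}

// Serialize a payload and report its size before and after gzip.
function measure(payload) {
  const json = JSON.stringify(payload);
  return {
    raw: Buffer.byteLength(json, 'utf8'),
    gzipped: zlib.gzipSync(json).length
  };
}

const columns = ['given_name', 'surname'];
const rows = makeRows(1000);

// Format 1: column labels once, then arrays of values.
const compact = { columns: columns, results: rows };

// Format 2: one object per row, with the keys repeated in every row.
const verbose = {
  results: rows.map(function (row) {
    const record = {};
    columns.forEach(function (col, i) { record[col] = row[i]; });
    return record;
  })
};

const compactSize = measure(compact);
const verboseSize = measure(verbose);
console.log('format 1:', compactSize);
console.log('format 2:', verboseSize);
```

Running this against a representative sample of your own data, rather than synthetic rows, gives a better idea of whether the difference matters for you.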
Just another JSON structure from which I got very nice results:
{
  "recordCount": 2,
  "data": {
    "Id": [1, 2],
    "Title": ["First record", "Second record"],
    "Value": [18192, 18176]
  }
}
Traversing all data:
for (var i = 0; i < recordSet.recordCount; ++i) {
  console.log("Record " + i.toString() + ":");
  for (var field in recordSet.data)
    console.log("\t" + field + ": " + recordSet.data[field][i].toString());
}
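If you want a single record out of this column-oriented structure as a plain object, a small helper along these lines might do. This is only a sketch: recordAt is a hypothetical name, and it assumes every array under data has recordCount entries.

```javascript
// Sketch: build one row object from a column-oriented record set.
// Assumes each array in recordSet.data has an entry at index i.
function recordAt(recordSet, i) {
  const record = {};
  Object.keys(recordSet.data).forEach(function (field) {
    record[field] = recordSet.data[field][i];
  });
  return record;
}

const recordSet = {
  recordCount: 2,
  data: {
    Id: [1, 2],
    Title: ['First record', 'Second record'],
    Value: [18192, 18176]
  }
};

// Collect every record as an object.
const records = [];
for (let i = 0; i < recordSet.recordCount; ++i) {
  records.push(recordAt(recordSet, i));
}
console.log(records);
```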
You don't have to tie your code to the more compact, but also more cumbersome format. Just write a simple JS adapter to check the returned structure for the presence of columns. If that's missing, you're dealing with a plain array of objects. If it's present, you can easily map the cumbersome format to the more convenient format.
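A minimal sketch of such an adapter, assuming the response looks like one of the two shapes from the question (normalizeResults is a hypothetical name, not part of any library):

```javascript
// Sketch: accept either response shape and always return an array of objects.
// Assumes data.results exists and, when present, data.columns is an array of
// column labels in the same order as the values in each row.
function normalizeResults(data) {
  if (!Array.isArray(data.columns)) {
    // No column list: results are already plain objects.
    return data.results;
  }
  return data.results.map(function (row) {
    const record = {};
    data.columns.forEach(function (col, i) {
      record[col] = row[i];
    });
    return record;
  });
}

const compactResponse = {
  columns: ['given_name', 'surname'],
  results: [['Joe', 'Schmoe'], ['Jane', 'Doe']]
};
const verboseResponse = {
  results: [
    { given_name: 'Joe', surname: 'Schmoe' },
    { given_name: 'Jane', surname: 'Doe' }
  ]
};

const fromCompact = normalizeResults(compactResponse);
const fromVerbose = normalizeResults(verboseResponse);
console.log(fromCompact[1].surname, fromVerbose[1].surname);
```

With this in place, the rest of the client code can use row.surname regardless of which format the server sends.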
FWIW, I'd go for the second option: it lends itself to cleaner JavaScript, as you've observed, and will also be easier for a human to read and understand. It seems to me that readability trumps whatever little performance gain you get from option 1.
I also imagine that if you were to add more columns someday, or the order of the columns changed, then with the first option you'd likely have to rewrite a lot of the JavaScript, since you'd be working with the position of the data in the response.
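If you do end up with the first format anyway, one way to reduce that risk is to look columns up by name instead of hard-coding positions. A minimal sketch, where indexColumns is a hypothetical helper:

```javascript
// Sketch: map column labels to their positions once, then read values by name.
function indexColumns(columns) {
  const index = {};
  columns.forEach(function (name, i) {
    index[name] = i;
  });
  return index;
}

const data = {
  columns: ['given_name', 'surname'],
  results: [['Joe', 'Schmoe'], ['Jane', 'Doe']]
};

const col = indexColumns(data.columns);
// Reordering or adding columns on the server does not change this line.
const surnames = data.results.map(function (row) {
  return row[col.surname];
});
console.log(surnames);
```

This still costs more code than row.surname, so it only softens the concern rather than removing it.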
You can always convert your first option to the array-of-objects representation:
const tabularDataToJSON = (rows: string[][], columns: string[]) => {
  // Pair each value in a row with the column label at the same position.
  return rows.map((row) => {
    const record: Record<string, string> = {}
    columns.forEach((col, index) => {
      record[col] = row[index]
    })
    return record
  })
}
// results and columns come from the response in the first format.
const json = tabularDataToJSON(results, columns)