jpt 1.0 can deal with multiple JSON texts

jpt 1.0 can now deal with multiple JSON texts, whether they are proper JSON Text Sequences, mutants like JSON Lines and NDJSON, or just plain old concatenated JSON texts in one file. Even better, it gives as good as it gets: it can output in those formats as well. Let’s take a look!

Multiple JSON Text Options:
  -M "<value>" - options for working with multiple JSON texts
     S - Output JSON Text Sequences strictly conforming to RFC 7464 (default)
     N - Output newline delimited JSON texts
     C - Output concatenated JSON texts
     A - Gather JSON texts into an array, post-query and post-patching
     a - Gather JSON texts into an array, pre-query and pre-patching 

What is a JSON Text Sequence? It’s the standardized way (RFC 7464) to combine multiple JSON texts using the Record Separator (RS) control character (0x1E) as a delimiter.

The genius of JSON Text Sequences is that newlines are not made significant the way they are in JSON Lines/NDJSON, which introduces a fragility to what should be ignorable whitespace. In that sense those formats depart from the JSON specification, where whitespace between tokens carries no meaning. JSON sequences, however, are easily parsed and, if needed, more easily read by humans, since the JSON texts do not need to be on a single line.

Let’s look at a simple JSON sequence. We can create the record separator character with the ANSI-C shell quoting syntax $'\x1e'. Additionally, JSON-Seq requires that numbers be terminated with a newline (\n); for consistency I did that for each JSON text.

% jpt <<< $'\x1e1\n\x1e"a"\n\x1etrue\n'         
1
"a"
true

Now you might say that looks just like JSON Lines/NDJSON, but what you don’t see are the RS characters: they are stripped out in Terminal. However, if we send the output to a file or pipe it into bbedit, you will see them appear as ¿.

RS (0x1e) characters in bbedit
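
To make the invisible RS characters concrete outside of an editor, here is a minimal sketch in Python. It is not how jpt works internally; the helper name parse_json_seq is made up for illustration, and it assumes well-formed input (a full RFC 7464 parser would also recover from damaged or truncated texts).

```python
import json

RS = "\x1e"  # ASCII Record Separator, the delimiter used by RFC 7464

def parse_json_seq(data: str) -> list:
    """Split on RS and parse each non-empty chunk as one JSON text."""
    values = []
    for chunk in data.split(RS):
        if chunk.strip():  # skip the empty piece before the first RS
            values.append(json.loads(chunk))
    return values

raw = "\x1e1\n\x1e\"a\"\n\x1etrue\n"
print(repr(raw))            # repr() shows each RS as \x1e
print(parse_json_seq(raw))  # [1, 'a', True]
```

Because the split happens on RS rather than on newlines, a JSON text in the sequence could span several lines and still parse.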

With jpt 1.0 we can now choose to convert multiple JSON texts (whether sequences, lines, or concatenated) into an array, either before or after querying and patching.

#-Ma and -MA without a query or patch are functionally the same
% jpt -Ma <<< $'\x1e1\n\x1e"a"\n\x1etrue\n'
[
  1,
  "a",
  true
]

Now let’s look at 3 separate JSON texts, each an array, and see how -Ma and -MA differ:

#an array of the 3 arrays is created pre-query
#thus /2 corresponds to the last JSON text
#-Ma is most useful to get a specific JSON text
% jpt -Ma /2 <<< $'[1,2,3]\n[4,5,6]\n[7,8,9]\n'
[
  7,
  8,
  9
]

#the query /2 is run first on all 3 texts
#then those results 3, 6, and 9 are gathered into an array
% jpt -MA /2 <<< $'[1,2,3]\n[4,5,6]\n[7,8,9]\n'
[
  3,
  6,
  9
]
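
The difference is just the order of two steps, gathering and querying. Here is a small Python sketch of that ordering; the query function is a hypothetical stand-in for a query like /2 and is not jpt's actual query engine.

```python
import json

texts = ["[1,2,3]", "[4,5,6]", "[7,8,9]"]
docs = [json.loads(t) for t in texts]

def query(doc, index):
    """Hypothetical stand-in for a query like /2: pick one array element."""
    return doc[index]

# -Ma style: gather the texts into one array first, then query that array
gather_then_query = query(docs, 2)

# -MA style: query each text on its own, then gather the results
query_then_gather = [query(d, 2) for d in docs]

print(gather_then_query)  # [7, 8, 9]
print(query_then_gather)  # [3, 6, 9]
```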

One thing about JSON Lines and NDJSON is that they take otherwise ignorable whitespace like newlines and make it significant, which introduces a brittleness to the otherwise robust JSON spec. Let’s see how jpt handles it.

#-MN will output NDJSON and correct the missing newline
% jpt -MN <<< $'{"a":1}\n{"b":2}{"c":3}'
{"a":1}
{"b":2}
{"c":3}

As you can see, a simple line-based NDJSON parser would have failed on the last line, where the final object is missing its newline; jpt treats anything that doesn’t parse as possibly concatenated JSON.
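
For comparison, here is one way to read concatenated JSON in Python without relying on newlines at all. This is only a sketch of the general idea and says nothing about how jpt does it; iter_concatenated is a name invented for this example. It uses the standard library's json.JSONDecoder.raw_decode, which parses one value and reports where it stopped.

```python
import json

JSON_WS = " \t\r\n"  # the four whitespace characters JSON allows

def iter_concatenated(data: str):
    """Yield each JSON value from a string of back-to-back JSON texts."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(data):
        # raw_decode does not accept leading whitespace, so skip it here
        while pos < len(data) and data[pos] in JSON_WS:
            pos += 1
        if pos >= len(data):
            break
        value, pos = decoder.raw_decode(data, pos)
        yield value

mixed = '{"a":1}\n{"b":2}{"c":3}'
print(list(iter_concatenated(mixed)))  # [{'a': 1}, {'b': 2}, {'c': 3}]
```

A parser that calls json.loads on each line would instead raise an error on the second line, because {"b":2}{"c":3} is two texts, not one.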

You can also output concatenated JSON… but why?! Maybe you have your reasons; if so, use -MC.

#no RS characters with concatenated JSON
% jpt -MC <<< $'{"a":1}\n{"b":2}{"c":3}'
{
  "a": 1
}
{
  "b": 2
}
{
  "c": 3
}
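
To see how the three output styles differ byte for byte, here is a rough Python sketch of writers for each. The function names are invented for this example and the exact whitespace jpt emits may differ; the only firm rule used is RFC 7464's framing of RS before and a newline after each text.

```python
import json

RS = "\x1e"  # ASCII Record Separator

def to_json_seq(values):
    # RFC 7464 framing: RS before each text, newline after it
    return "".join(RS + json.dumps(v) + "\n" for v in values)

def to_ndjson(values):
    # One compact JSON text per line; json.dumps escapes newlines in strings
    return "".join(json.dumps(v, separators=(",", ":")) + "\n" for v in values)

def to_concatenated(values):
    # Pretty-printed texts one after another, with no RS characters
    return "".join(json.dumps(v, indent=2) + "\n" for v in values)

values = [{"a": 1}, {"b": 2}, {"c": 3}]
print(repr(to_json_seq([1, "a", True])))
print(to_ndjson(values), end="")
print(to_concatenated(values), end="")
```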

If you work with multiple JSON texts, such as JSON logs, then jpt might be a useful addition to your arsenal of tools. It can be downloaded from my jpt Releases page on GitHub.