Extract a subtree from a document tree
This page shows how to use the subtree class to extract a subtree from an
existing document tree using JSONPath. The subtree constructor takes two
arguments: an existing document tree instance of type document_tree, and a
JSONPath expression that references a subtree within that document tree. Once
the subtree is extracted, you can call its dump() function to serialize its
content as a JSON string.
First, let’s include the headers we need in this example code:
#include <orcus/json_document_tree.hpp>
#include <orcus/config.hpp>
#include <iostream>
#include <string_view>
Both the document_tree and subtree classes are provided by the
json_document_tree.hpp header, while the config.hpp header provides the
orcus::json_config struct type.
The following is the input JSON string we will be using in this example:
constexpr std::string_view input_json = R"(
{
    "id": "12345",
    "name": "John Doe",
    "email": "johndoe@example.com",
    "roles": ["admin", "editor"],
    "isActive": true,
    "profile": {
        "age": 34,
        "gender": "male",
        "address": {
            "street": "123 Elm Street",
            "city": "Springfield",
            "state": "IL",
            "zipCode": "62704"
        },
        "phoneNumbers": [
            {
                "type": "home",
                "number": "555-1234"
            },
            {
                "type": "work",
                "number": "555-5678"
            }
        ]
    },
    "preferences": {
        "notifications": {
            "email": true,
            "sms": false,
            "push": true
        },
        "theme": "dark",
        "language": "en-US"
    },
    "lastLogin": "2024-11-25T13:45:30Z",
    "purchaseHistory": [
        {
            "orderId": "A1001",
            "date": "2024-01-15T10:00:00Z",
            "total": 249.99,
            "items": [
                {
                    "productId": "P123",
                    "name": "Wireless Mouse",
                    "quantity": 1,
                    "price": 49.99
                },
                {
                    "productId": "P124",
                    "name": "Mechanical Keyboard",
                    "quantity": 1,
                    "price": 200.00
                }
            ]
        },
        {
            "orderId": "A1002",
            "date": "2024-06-10T14:20:00Z",
            "total": 119.99,
            "items": [
                {
                    "productId": "P125",
                    "name": "Noise Cancelling Headphones",
                    "quantity": 1,
                    "price": 119.99
                }
            ]
        }
    ]
}
)";
It is defined as a raw string literal to make the value more human-readable.
Next, let’s load this JSON string into an in-memory tree:
orcus::json::document_tree doc;
doc.load(input_json, orcus::json_config{});
We pass the input string defined above as the first argument. The load()
function also requires a json_config instance as its second argument to specify
configuration parameters; since we are not doing anything out of the ordinary,
a default-constructed one will suffice.
With the source JSON document loaded into memory, let’s use the
orcus::json::subtree class to extract the subtree rooted at the path
$.profile.address of the original document:
orcus::json::subtree sub(doc, "$.profile.address");
std::cout << sub.dump(2) << std::endl;
Executing this code will generate the following output:
{
  "street": "123 Elm Street",
  "city": "Springfield",
  "state": "IL",
  "zipCode": "62704"
}
One thing to note is that a subtree instance only references the original
document stored in document_tree, so the original must stay alive for as long
as the subtree is in use.
Note
You must ensure that the subtree instance does not outlive the original document tree instance. Accessing the subtree instance after the original document tree instance has been destroyed causes undefined behavior.
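The lifetime rule can be illustrated with a small standalone model. The sketch below does not use orcus at all: owner_doc and doc_view are hypothetical stand-ins for document_tree and subtree, and extract_text is an invented helper. It shows one way to stay safe, assuming the dumped text is all you need: keep both objects in the same scope and let only an owned std::string leave it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for an owning document (plays the role of document_tree).
struct owner_doc
{
    std::string content;
};

// Hypothetical stand-in for a non-owning view (plays the role of subtree).
class doc_view
{
    const owner_doc* m_src; // non-owning: valid only while *m_src is alive

public:
    explicit doc_view(const owner_doc& src) : m_src(&src) {}

    // Returns an owned copy of the text, independent of the document's lifetime.
    std::string dump() const
    {
        assert(m_src != nullptr); // a view always points at a document
        return m_src->content;
    }
};

// Safe pattern: the owner is declared before the view, so the view is
// destroyed first, and only an owned std::string leaves the function.
std::string extract_text(std::string source)
{
    owner_doc doc{std::move(source)};
    doc_view view(doc);
    return view.dump();
}

// Unsafe pattern (left commented out on purpose): returning the view itself
// would hand the caller a pointer to a document that no longer exists.
// doc_view make_dangling() { owner_doc doc{"x"}; return doc_view(doc); }
```

The same idea carries over to the real classes: if a subtree has to be created inside a helper function, return the string produced by dump() rather than the subtree itself, unless the caller also owns the document.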
Let’s look at another example. This time, we will extract the subtree rooted at
$.purchaseHistory[1].items[0]:
orcus::json::subtree sub(doc, "$.purchaseHistory[1].items[0]");
std::cout << sub.dump(2) << std::endl;
This path includes object keys as well as array positions. Executing this code will generate the following output:
{
  "productId": "P125",
  "name": "Noise Cancelling Headphones",
  "quantity": 1,
  "price": 119.99
}
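To make the structure of such a path concrete, here is a minimal sketch that splits a path into its individual steps. It is an illustration only and says nothing about how orcus parses paths internally; path_step and split_path are hypothetical names, and the sketch handles just the root symbol, dotted keys, numeric subscripts and the [*] wildcard.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// One step of a simple JSONPath expression.
// Hypothetical type for illustration; not part of orcus.
struct path_step
{
    enum class kind { key, index, wildcard };

    kind type;
    std::string key;   // meaningful only when type == kind::key
    std::size_t index; // meaningful only when type == kind::index
};

// Split a path such as "$.purchaseHistory[1].items[0]" into its steps.
// Only "$", ".name", "[N]" and "[*]" are handled; anything else throws.
std::vector<path_step> split_path(std::string_view path)
{
    if (path.empty() || path[0] != '$')
        throw std::invalid_argument("path must start with '$'");

    std::vector<path_step> steps;
    std::size_t pos = 1; // skip the root symbol

    while (pos < path.size())
    {
        if (path[pos] == '.')
        {
            // Object key: runs until the next '.' or '[' (or the end).
            std::size_t end = path.find_first_of(".[", pos + 1);
            if (end == std::string_view::npos)
                end = path.size();
            if (end == pos + 1)
                throw std::invalid_argument("empty object key");

            std::string name(path.substr(pos + 1, end - pos - 1));
            steps.push_back({path_step::kind::key, name, 0});
            pos = end;
        }
        else if (path[pos] == '[')
        {
            // Array subscript: either "*" or a non-negative integer.
            const std::size_t end = path.find(']', pos + 1);
            if (end == std::string_view::npos)
                throw std::invalid_argument("missing ']'");

            const std::string_view inner = path.substr(pos + 1, end - pos - 1);
            if (inner == "*")
            {
                steps.push_back({path_step::kind::wildcard, std::string{}, 0});
            }
            else
            {
                if (inner.empty())
                    throw std::invalid_argument("empty subscript");

                std::size_t value = 0;
                for (char c : inner)
                {
                    if (!std::isdigit(static_cast<unsigned char>(c)))
                        throw std::invalid_argument("unsupported subscript");
                    value = value * 10 + static_cast<std::size_t>(c - '0');
                }
                steps.push_back({path_step::kind::index, std::string{}, value});
            }
            pos = end + 1;
        }
        else
        {
            throw std::invalid_argument("unexpected character in path");
        }

        assert(pos <= path.size()); // each branch consumes at least one character
    }

    return steps;
}
```

For $.purchaseHistory[1].items[0], split_path returns four steps: the key purchaseHistory, the index 1, the key items, and the index 0. Read in that order, they describe the walk from the document root down to the extracted object.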
It’s important to note that subtree currently supports only a small subset of
the JSONPath specification, and does not fully support expressions involving
slicing or filtering. It does, however, support wildcards, as the following
example demonstrates:
orcus::json::subtree sub(doc, "$.purchaseHistory[*].items");
std::cout << sub.dump(2) << std::endl;
Executing this code will generate the following output:
[
  [
    {
      "productId": "P123",
      "name": "Wireless Mouse",
      "quantity": 1,
      "price": 49.99
    },
    {
      "productId": "P124",
      "name": "Mechanical Keyboard",
      "quantity": 1,
      "price": 200
    }
  ],
  [
    {
      "productId": "P125",
      "name": "Noise Cancelling Headphones",
      "quantity": 1,
      "price": 119.99
    }
  ]
]
It extracted the items subtrees from both elements of the purchaseHistory array
and put them into a newly created array in order of occurrence.
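The gathering behavior of the wildcard can be modeled in a few lines. The following sketch is a simplified, standalone model and not the orcus implementation: node, find_member, select_from_each, make_leaf and make_order are all hypothetical names, and the model copies the matched values for simplicity. It visits the array elements in order and collects the member with the requested key from each one into a new array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature JSON node: a leaf, an array, or an object.
struct node
{
    std::string scalar;            // used when the node is a leaf
    std::vector<node> elements;    // used when the node is an array
    std::vector<std::string> keys; // used when the node is an object:
    std::vector<node> values;      //   keys[i] maps to values[i]
};

// Look up a member of an object node; returns nullptr when the key is absent.
const node* find_member(const node& obj, const std::string& key)
{
    assert(obj.keys.size() == obj.values.size());
    for (std::size_t i = 0; i < obj.keys.size(); ++i)
        if (obj.keys[i] == key)
            return &obj.values[i];
    return nullptr;
}

// Model of "<array>[*].<key>": visit the array elements in order and copy
// each element's <key> member into a newly created array node.
node select_from_each(const node& array_node, const std::string& key)
{
    node result;
    for (const node& element : array_node.elements)
        if (const node* found = find_member(element, key))
            result.elements.push_back(*found);
    return result;
}

// Build a leaf node holding the given text.
node make_leaf(std::string text)
{
    node n;
    n.scalar = std::move(text);
    return n;
}

// Build an object with a single "items" member listing the given product IDs.
node make_order(std::vector<std::string> product_ids)
{
    node items;
    for (std::string& id : product_ids)
        items.elements.push_back(make_leaf(std::move(id)));

    node order;
    order.keys.push_back("items");
    order.values.push_back(std::move(items));
    return order;
}
```

Building an array from make_order({"P123", "P124"}) and make_order({"P125"}) and passing it to select_from_each with the key items yields an array of two arrays, the first with two entries and the second with one, which mirrors the shape of the output shown above.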