JSON to JSON transformation library written in Java.

Jolt

JSON to JSON transformation library written in Java where the "specification" for the transform is itself a JSON document.

Useful For

  1. Transforming JSON data from ElasticSearch, MongoDB, Cassandra, etc. before sending it off to the world
  2. Extracting data from large JSON documents for your own consumption

Table of Contents

  1. Overview
  2. Documentation
  3. Shiftr Transform DSL
  4. Demo
  5. Getting Started
  6. Getting Transform Help
  7. Why Jolt Exists
  8. Alternatives
  9. Performance
  10. CLI
  11. Code Coverage
  12. Release Notes

Overview

Jolt :

  • provides a set of transforms that can be "chained" together to form the overall JSON to JSON transform.
  • focuses on transforming the structure of your JSON data, not manipulating specific values
    • The idea being: use Jolt to get most of the structure right, then write code to fix values
  • consumes and produces "hydrated" JSON : in-memory tree of Maps, Lists, Strings, etc.
    • use Jackson (or whatever) to serialize and deserialize the JSON text
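
As a rough illustration of what "hydrated" JSON means here, the sketch below builds the in-memory form of a small document by hand, using only java.util collections. The class name, method name, and sample keys are made up for this example; in practice a parser such as Jackson would produce an equivalent tree from JSON text.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the hydrated form of
// {"rating": {"value": 3, "tags": ["a", "b"]}} built by hand.
class HydratedJsonSketch {

    // JSON objects become Maps, arrays become Lists, scalars become
    // Strings, Numbers, or Booleans.
    static Map<String, Object> sample() {
        Map<String, Object> rating = new LinkedHashMap<>();
        rating.put("value", 3);
        rating.put("tags", List.of("a", "b"));

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("rating", rating);
        return root;
    }

    public static void main(String[] args) {
        Map<String, Object> root = sample();
        @SuppressWarnings("unchecked")
        Map<String, Object> rating = (Map<String, Object>) root.get("rating");
        System.out.println(rating.get("tags")); // prints [a, b]
    }
}
```

A tree like this is the kind of value a transform chain takes as input and returns as output.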

Stock Transforms

The Stock transforms are:

shift       : copy data from the input tree and put it in the output tree
default     : apply default values to the tree
remove      : remove data from the tree
sort        : sort the Map key values alphabetically ( for debugging and human readability )
cardinality : "fix" the cardinality of input data.  Eg, the "urls" element is usually a List, but if there is only one, then it is a String
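
To make the cardinality problem concrete, here is a minimal plain-Java sketch of the idea: a value that is sometimes a single item and sometimes a List is normalized to always be a List. This is not Jolt's implementation, and the class and method names are hypothetical; treating null as an empty list is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the "sometimes a String, sometimes a
// List" problem that the cardinality transform addresses.
class CardinalitySketch {

    // Always returns a List: an existing List is copied, a single value is
    // wrapped, and null becomes an empty List (an assumption of this sketch).
    static List<Object> toMany(Object value) {
        if (value instanceof List) {
            return new ArrayList<>((List<?>) value);
        }
        List<Object> wrapped = new ArrayList<>();
        if (value != null) {
            wrapped.add(value);
        }
        return wrapped;
    }

    public static void main(String[] args) {
        System.out.println(toMany("http://a.example"));
        System.out.println(toMany(List.of("http://a.example", "http://b.example")));
    }
}
```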

Each transform has its own DSL (Domain Specific Language) in order to facilitate its narrow job.

Currently, all the Stock transforms only affect the "structure" of the data. To do data manipulation, you will need to write Java code. If you write your Java "data manipulation" code to implement the Transform interface, then you can insert your code in the transform chain.

The out-of-the-box Jolt transforms should be able to do most of your structural transformation, with custom Java Transforms implementing your data manipulation.
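
The division of labor described above can be sketched without Jolt on the classpath. In the example below, `Step` is a stand-in for a single-method transform contract (an assumption made for this sketch; consult the Transform javadoc for the real interface), and the two steps and `runChain` are hypothetical names. One step changes structure, the next manipulates a value, and the chain applies them in order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class ChainSketch {

    // Stand-in for a transform contract: hydrated JSON in, hydrated JSON out.
    interface Step {
        Object transform(Object input);
    }

    // Copies a hydrated JSON object so each step leaves its input untouched.
    @SuppressWarnings("unchecked")
    static Map<String, Object> copyOf(Object hydrated) {
        return new LinkedHashMap<>((Map<String, Object>) hydrated);
    }

    // Structural step: renames the top-level key "name" to "title".
    static final Step RENAME_NAME_TO_TITLE = input -> {
        Map<String, Object> out = copyOf(input);
        if (out.containsKey("name")) {
            out.put("title", out.remove("name"));
        }
        return out;
    };

    // Data-manipulation step: upper-cases "title" when it is a String.
    static final Step UPPERCASE_TITLE = input -> {
        Map<String, Object> out = copyOf(input);
        Object title = out.get("title");
        if (title instanceof String) {
            out.put("title", ((String) title).toUpperCase(Locale.ROOT));
        }
        return out;
    };

    // Feeds the output of each step into the next one.
    static Object runChain(List<Step> steps, Object input) {
        Object current = input;
        for (Step step : steps) {
            current = step.transform(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Object result = runChain(
                List.of(RENAME_NAME_TO_TITLE, UPPERCASE_TITLE),
                Map.of("name", "jolt"));
        System.out.println(result); // prints {title=JOLT}
    }
}
```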

Documentation

Jolt Slide Deck : covers motivation, development, and transforms.

Javadoc explaining each transform DSL :

  • shift
  • default
  • remove
  • cardinality
  • sort
  • fully qualified Java ClassName : Class implements the Transform or ContextualTransform interfaces, and can optionally be SpecDriven (marker interface)
    • Transform interface
    • SpecDriven
      • where the "input" is "hydrated" Java version of your JSON Data

Running a Jolt transform means creating an instance of Chainr with a list of transforms.

The JSON spec for Chainr looks like : unit test.

The Java side looks like :

// Load the JSON spec from the classpath and build the Chainr from it
Chainr chainr = Chainr.fromSpec( JsonUtils.classpathToList( "/path/to/chainr/spec.json" ) );

Object input = elasticSearchHit.getSource(); // ElasticSearch already returns hydrated JSON

Object output = chainr.transform( input );

return output;

Shiftr Transform DSL

The Shiftr transform generally does most of the "heavy lifting" in the transform chain. To see the Shiftr DSL in action, please look at our unit tests (shiftr tests) for nice bite sized transform examples, and read the extensive Shiftr javadoc.

Our unit tests follow the pattern :

{
    "input": {
        // sample input
    },

    "spec": {
        // transform spec
    },

    "expected": {
        // what the output of the transform looks like
    }
}

We read in "input", apply the "spec", and Diffy it against the "expected".
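
The shape of such a test can be sketched in plain Java. Everything here is hypothetical: `passes` and `renameKeys` are illustrative names, the toy "spec" (a map of old key to new key) is not a Jolt spec, and `Objects.equals` on the hydrated trees stands in for Diffy.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BinaryOperator;

class TestCaseSketch {

    // Reads "input", applies "spec" through the supplied function, and
    // compares the result with "expected".
    static boolean passes(Map<String, Object> testCase, BinaryOperator<Object> applySpec) {
        Object actual = applySpec.apply(testCase.get("spec"), testCase.get("input"));
        return Objects.equals(testCase.get("expected"), actual);
    }

    // Toy transform for the demo: the spec maps old top-level keys to new ones.
    static Object renameKeys(Object spec, Object input) {
        Map<?, ?> renames = (Map<?, ?>) spec;
        Map<Object, Object> out = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) input).entrySet()) {
            Object key = entry.getKey();
            Object newKey = renames.containsKey(key) ? renames.get(key) : key;
            out.put(newKey, entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> testCase = Map.of(
                "input", Map.of("a", 1),
                "spec", Map.of("a", "b"),
                "expected", Map.of("b", 1));
        System.out.println(passes(testCase, TestCaseSketch::renameKeys)); // prints true
    }
}
```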

To learn the Shiftr DSL, examine the "input" and "expected" JSON, get an understanding of how data is moving, and then look at the transform spec to see how it facilitates the transform.

For reference, this was the very first test we wrote.

Demo

There is a demo available at jolt-demo.appspot.com. You can paste in JSON input data and a Spec, and it will post the data to the server and run the transform.

Note

  • it is hosted on a free Google App Engine instance, so it may take a minute to spin up.
  • it validates the input JSON and spec client side.

Getting Started

Getting started code wise has its own doc.

Getting Transform Help

If you can't get a transform working and you need help, create an Issue in Jolt (for now).

Make sure you include what your "input" is, and what you want your "output" to be.

Why Jolt Exists

Aside from writing your own custom code to do a transform, there are two general approaches to doing JSON to JSON transforms in Java.

  1. JSON -> XML -> XSLT or STX -> XML -> JSON

Aside from being a Rube Goldberg approach, XSLT is more complicated than Jolt because it is trying to do the whole transform with a single DSL.

  2. Write a Template (Velocity, FreeMarker, etc.) that takes hydrated JSON input and writes textual JSON output

With this approach you are working from the output format backwards to the input, which is complex for any non-trivial transform. Eg, the structure of your template will be dictated by the output JSON format, and you will end up coding a parallel tree walk of the input data and the output format in your template. Jolt works forward from the input data to the output format which is simpler, and it does the parallel tree walk for you.

Alternatives

Being in the Java JSON processing "space", here are some other interesting JSON manipulation tools to look at / consider :

  • jq - Awesome command line tool to extract data from JSON files (use it all the time, available via brew)
  • JsonPath - Java : Extract data from JSON using XPATH like syntax.
  • JsonSurfer - Java : Streaming JsonPath processor dedicated to processing big and complicated JSON data.

Performance

The primary goal of Jolt was to improve "developer speed" by making transforms declarative rather than imperative. That said, Jolt should have a better runtime than the alternatives listed above.

Work has been done to make the stock Jolt transforms fast:

  1. Transforms can be initialized once with their spec, and re-used many times in a multi-threaded environment.
    • We reuse initialized Jolt transforms to service multiple web requests from a DropWizard service.
  2. "*" wildcard logic was redone to reduce the use of Regex in the common case, which was a dramatic speed improvement.
  3. The parallel tree walk performed by Shiftr was optimized.

Two things to be aware of :

  1. Jolt is not "stream" based, so if you have a very large JSON document to transform, you need enough memory to hold it.
  2. The transform process will create and discard a lot of objects, so the garbage collector will have work to do.

Jolt CLI

Jolt Transforms and tools can be run from the command line. Command line interface doc here.

Code Coverage

For the moment we have Cobertura configured in our poms.

mvn cobertura:cobertura
open jolt-core/target/site/cobertura/index.html

Currently, for the jolt-core artifact, code coverage is at 89% line, and 83% branch.

Release Notes

Versions and Release Notes available here.

Comments
  • Not able to use multiple spec's on the basis of conditional statements

    Hi,

    I am new to Jolt. I need to use multiple specs (like shift, remove) on the basis of conditional statements.

    For example:

    Input:

    {
        "data": {
            "1234": {
                "clientId": "12",
                "hidden": true
            },
            "1235": {
                "clientId": "35",
                "hidden": false
            }
        }
    }
    

    Now, on the basis of some conditions, I need to perform remove or shift specs. eg.

    spec:

    [
        {
            "operation": "shift",
            "spec": {
                "data": {
                    "*": {
                        "hidden": {
                            "true": {
                                 // Use remove spec here
                            },
                            "false": {
                                 // Use shift spec here
                            }
                        }
                    }
                }
            }
        }
    ]
    

    I am not able to implement the above. Could you please help me with how to do it? If this is not the correct approach, what should I do to make it work based on the specified conditions?

    opened by dhruvsinha8 22
  • URGENT: Convert simple input  JSON to complex/dynamic output JSON

    I need help in converting below INPUT JSON to OUTPUT JSON, which I am unable to completely achieve with 'shift' spec and hence have some queries to make the transformation. Please help.

    This is the INPUT JSON (there will be multiple Category and QuestionIDs respectively) -

    {  
      "SurveyID":"783_FG-123_4567",
      "SurveyName":"Shopping 2014 to Date",
      "SourceSystemID":1234,
      "COID":123,
      "InternalLocNum":"123456",
      "RespPeriod":"2016-10",
      "ShopDate":"2016-10-04T00:00:00",
      "ShopPointsPoss":100,
      "ShopPointsRec":100,
      "ShopID":12345678,
      "Category":[  
        {  
          "ID":1234,
          "MetricOrder":123,
          "Category":"GENERAL INFORMATION",
          "CatPointsRec":0,
          "CatPointsPoss":0,
          "QuestionID":[  
            {  
              "ID":12345,
              "Type":"Multiple Choice",
              "SortOrder":"0.6.",
              "Text":{  
                "lang":1,
                "Number":"0.6.",
                "Question":"Shake Required?"
              },
              "QTR":{  
                "ID":123456,
                "Answer":"No"
              }
            }
          ]
        }
      ]
    }
    

    This is the expected OUTPUT JSON (there will be single items[] and details[] per record, however there will be multiple detail[] under details[]) -

    {  
      "organizationId":"FG+FG",
      "missionId":"FGSRV123",
      "items":[  
        {  
          "missionId-storeId":"FGSRV123-12345678",
          "date-id":"2016-10-05T00:00:00",
          "date":"2016-10-05T00:00:00Z",
          "title":"Overall score (test)",
          "missionId":"FGSRV123",
          "storeId":"12345678",
          "meta":{  
            "coid":"123",
            "internalLocNum":"123456",
            "sourceSystemId":"1234",
            "respPeriod":"2016-10"
          },
          "importDate":"2016-12-21T12:45:47.387Z",
          "isSuccess":true,
          "pointsMax":100,
          "points":85,
          "details":[  
            {  
              "detail":[  
                {  
                  "meta":{  
                    "answers":[  
                      {  
                        "text":"No",
                        "id":"123456"
                      }
                    ]
                  },
                  "id":"12345",
                  "title":"Milkshake Required?",
                  "number":"0.6.",
                  "sortOrder":"0.6.",
                  "points":0,
                  "pointsMax":0,
                  "description":"No"
                }
              ]
            }
          ]
        }
      ]
    }
    
    opened by avish-3pg 16
  • Difference between JsonUtils.classpathToObject and ClassLoader.getResourceAsStream?

    I've got a simple test controller:

        public Result transform(String applicationName, String entityType) throws Exception {
            String specPath = "jolt/" + applicationName + "/" + entityType + "-spec.json";
            InputStream is = this.getClass().getClassLoader().getResourceAsStream(specPath);
            // Object spec = JsonUtils.classpathToObject(specPath);
            Object spec = JsonUtils.jsonToObject(is);
            Chainr chainr = Chainr.fromSpec(spec);
            String input = request().body().asJson().toString();
            Object transformed = chainr.transform(JsonUtils.jsonToObject(input));
            String json = JsonUtils.toJsonString(transformed);
            return ok(json);
        }
    

    If I use get the InputStream from the ClassLoader read it with JsonUtils.jsonToObject it works correctly.

    If I try to use JsonUtils.classpathToObject then the following exception is thrown:

    Caused by: java.lang.RuntimeException: Unable to load JSON object from InputStream.
            at com.bazaarvoice.jolt.JsonUtilImpl.jsonToObject(JsonUtilImpl.java:105)
            at com.bazaarvoice.jolt.JsonUtilImpl.classpathToObject(JsonUtilImpl.java:198)
            at com.bazaarvoice.jolt.JsonUtils.classpathToObject(JsonUtils.java:157)
            at com.company.TestTransformingController.transform(TestTransformingController.java:19)
            at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$1$$anonfun$apply$1.apply(Routes.scala:73)
            at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$1$$anonfun$apply$1.apply(Routes.scala:73)
            at play.core.routing.HandlerInvokerFactory$$anon$4.resultCall(HandlerInvoker.scala:157)
            at play.core.routing.HandlerInvokerFactory$$anon$4.resultCall(HandlerInvoker.scala:156)
            at play.core.routing.HandlerInvokerFactory$JavaActionInvokerFactory$$anon$14$$anon$3$$anon$1.invocation(HandlerInvoker.scala:136)
            at play.core.j.JavaAction$$anon$1.call(JavaAction.scala:73)
    Caused by: com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
     at [Source: UNKNOWN; line: 1, column: 0]
            at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:255)
            at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3851)
            at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3792)
            at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2867)
            at com.bazaarvoice.jolt.JsonUtilImpl.jsonToObject(JsonUtilImpl.java:102)
            at com.bazaarvoice.jolt.JsonUtilImpl.classpathToObject(JsonUtilImpl.java:198)
            at com.bazaarvoice.jolt.JsonUtils.classpathToObject(JsonUtils.java:157)
    

    Am I doing something wrong, or is this behavior expected?

    Thanks

    opened by efenderbosch 16
  • JSON transform to String type

    Input is:

    {
      "key": 342,
      "data": {
        "t1": 1234,
        "t2": "test"
      }
    }
    

    Expected output is:

    {
      "key": 342,
      "data": "{
        \"t1\": 1234,
        \"t2\": \"test\",
      }"
    }
    

    I used the following spec to test:

    [
      {
        "operation": "modify-overwrite-beta",
        "spec": {
          "data": "=toString(@(1,data))"
        }
      }
    ]
    

    but it outputs:

    {
    	"key": 342,
    	"data": "{t1=1234, t2=test}"
    }
    

    The output of data is not JSON.

    Can someone help me? Thanks.

    opened by youtNa 13
  • Issue #556 - string padding

    pad function accomplishes left and right padding. Determination of left or right padding is facilitated through the order of parameters.

    For left padding, the padding width is before the source string.

    =pad('X', 10, 'fox')
    

    Output XXXXXXXfox

    For right padding, the padding width is after the source string.

    =pad('X', 'fox', 10)
    

    Output foxXXXXXXX
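
    The parameter-order rule can be mirrored in plain Java with two overloads. This is only an illustration of the behavior described above, not the code from the pull request; the class name is hypothetical, and leaving the source unchanged when it is already at least as long as the width is an assumption of this sketch.

```java
// Hypothetical sketch of the pad semantics: the position of the width
// argument decides on which side the fill characters are added.
class PadSketch {

    // Width before the source string: pad on the left.
    static String pad(char fill, int width, String source) {
        return filler(fill, width - source.length()) + source;
    }

    // Width after the source string: pad on the right.
    static String pad(char fill, String source, int width) {
        return source + filler(fill, width - source.length());
    }

    // Builds a run of fill characters; a negative count yields an empty string.
    private static String filler(char fill, int count) {
        return String.valueOf(fill).repeat(Math.max(0, count));
    }

    public static void main(String[] args) {
        System.out.println(pad('X', 10, "fox")); // prints XXXXXXXfox
        System.out.println(pad('X', "fox", 10)); // prints foxXXXXXXX
    }
}
```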

    opened by abrathovde 13
  • JOLT : Grouped as array while transforming the json

    Hello @milosimpson ,

    I was trying to convert JSON from one format to another using JOLT, but I'm not able to get the expected output. I have included my attempt below. Any help will be appreciated. Thanks.

    My input.json

    {
        "Result": {
        },
        "Content": [
            {
                "MovieDetails": [
                    {
                        "Key": "TicketNumber",
                        "Value": "DF-0001"
                    },
                    {
                        "Key": "MovieName",
                        "Value": "Test"
                    }
                ]
            },
            {
                "MovieDetails": [
                    {
                        "Key": "TicketNumber",
                        "Value": "DF-0002"
                    },
                    {
                        "Key": "MovieName",
                        "Value": "Test2"
                    }
                ]
            },
            {
                "MovieDetails": [
                    {
                        "Key": "TicketNumber",
                        "Value": "DF-0003"
                    },
                    {
                        "Key": "MovieName",
                        "Value": "Test3"
                    }
                ]
            }
        ]
    }
    
    

    My Expected Output.json

    {
    "Result": {
    
    },
    "Content": [
        {
            "MovieDetails": {
                "TicketNumber": "DF-0001",
                "MovieName": "Test1"
            },
            "MovieDetails": {
                "TicketNumber": "DF-0002",
                "MovieName": "Test2"
            },
            "MovieDetails": {
                "TicketNumber": "DF-0003",
                "MovieName": "Test3"
            }
        }
    ]
    } 
    

    My Actual Output.json

    {
    "Result": {
    
    },
    "Content": [
        {
            "MovieDetails": {
                "TicketNumber": ["DF-0001","DF-0002","DF-0003"],
                "MovieName": ["Test1","Test2","Test3"]
            }
        }
    ]
    } 
    

    My Spec.json

    [
        {
            "operation": "shift",
            "spec": {
                "Result": {
                    "*": "Result.&"
                },
                "Content": {
                    "*": {
                        "MovieDetails": {
                            "*": {
                                "Value": "Content.MovieDetails.@(1,Key)"
                            }
                        }
                    }
                }
            }
        }
    ]
    
    opened by ramuece09 12
  • Can we make the spec template changes dynamic in nature?

    Let's say I change a spec and want my code to pick up the mapping change dynamically, without rebuilding the code and restarting the application.

    Is this possible?

    opened by sharmmoh1983 12
  • Need helps on transform nested JSON

    Currently, I have input as below.

    {
      "id": "100",
      "fulfilments": {
        "edges": [
          {
            "node": {
              "id": "117",
              "items": {
                "edges": [
                  {
                    "node": {
                      "id": "129",
                      "req_qty": 3,
                      "fil_qty": 0,
                      "rej_qty": 3,
                      "orderItem": {
                        "id": "95",
                        "ref": "22020006"
                      }
                    }
                  },
                  {
                    "node": {
                      "id": "128",
                      "req_qty": 2,
                      "fil_qty": 0,
                      "rej_qty": 2,
                      "orderItem": {
                        "id": "94",
                        "ref": "22020005"
                      }
                    }
                  }
                ]
              }
            }
          },
          {
            "node": {
              "id": "116",
              "ref": "-1",
              "status": "EXPIRED",
              "items": {
                "edges": [
                  {
                    "node": {
                      "id": "127",
                      "req_qty": 3,
                      "fil_qty": 0,
                      "rej_qty": 3,
                      "orderItem": {
                        "id": "94",
                        "ref": "22020005"
                      }
                    }
                  },
                  {
                    "node": {
                      "id": "126",
                      "req_qty": 3,
                      "fil_qty": 0,
                      "rej_qty": 3,
                      "orderItem": {
                        "id": "93",
                        "ref": "22020004"
                      }
                    }
                  }
                ]
              }
            }
          }
        ]
      }
    }
    

    Expected output:

    {
        "id": "100",
        "fulfilments":[
            {
                "fulfilment_id": 117,
                "items":[
                    {
                        "sku": "22020006", //from orderItem.ref
                        "req_qty": 3,
                        "fil_qty": 0,
                        "rej_qty": 3
                    },{
                        "sku": "22020005", //from orderItem.ref
                        "req_qty": 2,
                        "fil_qty": 0,
                        "rej_qty": 2
                    }
                ]
            },{
                "fulfilment_id": 116,
                "items":[
                    {
                        "sku": "22020005", //from orderItem.ref
                        "req_qty": 3,
                        "fil_qty": 0,
                        "rej_qty": 3
                    },{
                        "sku": "22020004", //from orderItem.ref
                        "req_qty": 3,
                        "fil_qty": 0,
                        "rej_qty": 3
                    }
                ]
            }
        ]
    }
    
    

    Could you please advise?

    opened by nessoft 11
  • NEED URGENT HELP : How to convert complicated JSON nested array with JOLT?

    I'm trying to convert nested arrays into objects depending on the number of values in the second nested array. I can't seem to get the number of the value fields and use that as a key in my spec. Now this is my input JSON file:

    {
      "meta": {
        "regId": "us",
        "cId": "SomeProduct",
        "weId": 15
      },
      "data": {
        "name": "R",
        "details": {
          "headers": [ "id", "cityId", "cityName" ],
          "values": [
            [ 1539, 17, "Moskow" ],
            [ 1540, 18, "Berlin" ],
            [ 1541, 19, "Vienna" ]
          ]
        }
      }
    }
    

    My desired output:

    [
      { "regId": "us", "cId": "SomeProduct", "weId": 15, "name": "R", "id": 1539, "cityId": 17, "cityName": "Moskow" },
      { "regId": "us", "cId": "SomeProduct", "weId": 15, "name": "R", "id": 1540, "cityId": 18, "cityName": "Berlin" },
      { "regId": "us", "cId": "SomeProduct", "weId": 15, "name": "R", "id": 1541, "cityId": 19, "cityName": "Vienna" }
    ]

    My current spec:

    [
      {
        "operation": "shift",
        "spec": {
          "meta": { "*": "&" },
          "data": {
            "name": "&",
            "details": {
              "values": {
                "*": { "*": "&1.@(3,headers[&0])" }
              }
            }
          }
        }
      }
    ]

    Hey @MagdaToczek do you mind taking a look?

    opened by nenyri 10
  • Extract/Remove substring value from input string

    Hello @milosimpson ,

    I was wondering if it is possible to extract or remove a part of string value from the input.

    For example, can we extract the last part of the string, turning "Values": "first_second_third" into "Values": "third"?

    Also, can we remove the escaped double quotes (\") from an input string value, for example turning "Values": "[{\"first\": \"second\"}]" into "Values": [{"first": "second"}]? Note that this not only removes the '\' but also converts the string value into a JSON list.

    Below is an actual example of my use case -

    Input JSON -

    {
    "ID": "first_second_third",
    "Values": 
    "[{\"QuestionId\":\"R01\",\"Response\":null,\"NumberValue\":1},{\"QuestionId\":\"R02\",\"Response\":null,\"NumberValue\":2}]"
    }
    

    Output JSON -

    "ID": "third",
    "Values": 
    [
    {"QuestionId":"R01","Response":null,"NumberValue":1},
    {"QuestionId":"R02","Response":null,"NumberValue":2}
    ]
    

    Please advise. Thank you.

    opened by avish-3pg 10
  • Modify-overwrite-beta

    Hi,

    Could you please explain the output below?

    Input:
    {
    	"customer_orders": {
    		"id": 1,
    		"ship_locations": [
    			{
    				"ship_location_id": 25
    			},
    			{
    				"ship_location_id": 26
    			}
    		]
    	},
    	"carrier_orders": {
    		"id": 2,
    		"ship_locations": [
    			{
    				"ship_location_id": 27
    			},
    			{
    				"ship_location_id": 28
    			}
    		]
    	}
    }
    
    Specs:
    
    [
      {
        "operation": "remove",
        "spec": {
          "customer_orders": {
            "ship_locations": ""
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "context_id": "shipmentId",
          "customer_orders": {
            "@(2,carrier_orders.ship_locations)": "customer_orders.ship_locations",
            "*": "&1.&0"
          },
          "*": "&0"
        }
    	},
    
      {
        "operation": "modify-overwrite-beta",
        "spec": {
          "customer_orders": {
            "ship_locations": {
              "*": {
                "ship_location_id": "=concat(customer-,@(1,ship_location_id))"
              }
            }
          }
        }
      }
    ]
    
    
    output:
    
    {
      "customer_orders" : {
        "ship_locations" : [ {
          "ship_location_id" : "customer-27"
        }, {
          "ship_location_id" : "customer-28"
        } ],
        "id" : 1
      },
      "carrier_orders" : {
        "id" : 2,
        "ship_locations" : [ {
          "ship_location_id" : "customer-27"
        }, {
          "ship_location_id" : "customer-28"
        } ]
      }
    }
    
    I expected the output to be:
    
    {
      "customer_orders" : {
        "ship_locations" : [ {
          "ship_location_id" : "customer-27"
        }, {
          "ship_location_id" : "customer-28"
        } ],
        "id" : 1
      },
      "carrier_orders" : {
        "id" : 2,
        "ship_locations" : [ {
          "ship_location_id" : "27" // Because overwrite applied for customer_orders only
        }, {
          "ship_location_id" : "28" // same reason as above
        } ]
      }
    }
    
    
    opened by SUNYTYAGI 10
  • JSON Transformation : want help to fetch array list from 3 level down with proper mapping

    Input :

    {
      "Report": {
        "level 10": {
          "class 1": {
            "Message": "demo message",
            "Details": {
              "Detail 1": { "IDENTIFICATION": { "PAN": [ "1" ] } },
              "Detail 2": { "IDENTIFICATION": { "PAN": [ "2", "3" ] } },
              "Detail 3": { "IDENTIFICATION": { "PAN": [ "4" ] } }
            }
          },
          "class 2": {
            "Message": "Demo message",
            "Details": {
              "Detail 1": { "IDENTIFICATION": { "PAN": [ "5" ] } },
              "Detail 2": { "IDENTIFICATION": { "PAN": [ "6", "7" ] } },
              "Detail 3": { "IDENTIFICATION": { "PAN": [ "8" ] } }
            }
          }
        }
      }
    }

    Expected Output :

    {
      "Report": {
        "level 10 PAN": [
          { "class index": "class 1", "detail index": "detail 1", "PAN": "1" },
          { "class index": "class 1", "detail index": "detail 2", "PAN": "2" },
          { "class index": "class 1", "detail index": "detail 2", "PAN": "3" },
          { "class index": "class 1", "detail index": "detail 3", "PAN": "4" },
          { "class index": "class 2", "detail index": "detail 1", "PAN": "5" },
          { "class index": "class 2", "detail index": "detail 2", "PAN": "6" },
          { "class index": "class 2", "detail index": "detail 2", "PAN": "7" },
          { "class index": "class 2", "detail index": "detail 3", "PAN": "8" }
        ]
      }
    }

    opened by ketulpatel03 1
  • Performance issues with Defaultr

    Hey, I'm trying to transform a lot of small JSON objects with Jolt and noticed that Shiftr and Defaultr have a huge gap in runtime performance. Shiftr is doing really great, while Defaultr is orders of magnitude slower. I've profiled my use case a bit and found the bottleneck to be DeepCopy.simpleDeepCopy( literalValue ). The deep copy uses Java Serialization with InputStream and OutputStream and accounts for over 80% of the total Defaultr runtime. I'm not quite sure why a spec like this needs any deep copying:

    {
      "operation": "default",
      "spec": {
        "deviceIdType": "name"
      }
    }
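
    To illustrate the contrast being described, here is a plain-Java sketch of a serialization-based deep copy next to a shortcut that returns immutable scalars as-is. It is not Jolt's code and not a proposed patch; the class and method names are hypothetical, and the shortcut assumes hydrated JSON scalars are Strings, boxed Numbers, or Booleans.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DeepCopySketch {

    // Deep copy by writing the value to a byte array and reading it back.
    // Correct for any Serializable tree, but costly for small values.
    static Object copyViaSerialization(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deep copy failed", e);
        }
    }

    // Immutable scalars can be shared safely, so only containers are copied.
    static Object copyIfMutable(Object value) {
        if (value == null || value instanceof String
                || value instanceof Number || value instanceof Boolean) {
            return value;
        }
        return copyViaSerialization((Serializable) value);
    }
}
```

    With a literal such as "name" from the spec above, `copyIfMutable` would return the same String instance without touching the serialization path.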
    opened by dwieland 1
  • Help with key lookup

    Hi everyone,

    I've spent quite a lot of time trying to figure this out, but I'm stuck. I have a nested JSON and I want to enrich the values of "attr" with those matching the keys of "codes". Please help :pray:

    Here is my input :

    {
      "items": {
        "a1b2xxxx": {
          "name": "item 1",
          "attr": [
            "A",
            "B",
            "C"
          ]
        },
        "c2b2cxxxx": {
          "name": "item 2",
          "attr": [
            "D",
            "E",
            "F"
          ]
        }
      },
      "codes": {
        "A": {
          "color": "green"
        },
        "B": {
          "size": "M"
        },
        "C": {
          "sku": "NS"
        },
        "D": {
          "stock": 2
        },
        "E": {
          "some_key": "some_value"
        },
        "F": {
          "foo": "bar"
        }
      }
    }
    

    This is the desired output :

    {
      "items": {
        "a1b2xxxx": {
          "name": "item 1",
          "attr":  {
            "A" : {"color":"green"},
            "B" : {"size" : "M"},
            "C" : {"sku" : "NS"}
          }
          
        },
        "c2b2xxxx": {
          "name": "item 2",
          "attr": {
            "D" : {"stock": 2},
            "E" : {"some_key": "some_value"},
            "F" : {"foo" : "bar"}
          }
        }
      },
      "codes": {
        "A": {
          "color": "green"
        },
        "B": {
          "size": "M"
        },
        "C": {
          "sku": "NS"
        },
        "D": {
          "stock": 2
        },
        "E": {
          "some_key": "some_value"
        },
        "F": {
          "foo": "bar"
        }
      }
    }
    
    opened by arshad10244 0
  • Help with modifying a pattern of strings

    Can someone help me with the spec for this transformation? Thanks a lot. The length of the string may change, and the number of "." is also not fixed. After the jolt transform, I only want to keep the part of the string which comes after the last "." So I don't think substring is gonna work in this case.

    JSON input: { "myString": "keep.only.the.last.word" }

    Expected output: { "myString": "word" }
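
    If the value manipulation ends up in custom Java code in the chain, the string part is small. The sketch below is plain Java with a hypothetical class and method name; it says nothing about which built-in Jolt functions might also do this.

```java
// Hypothetical helper: keeps only the text after the last '.'.
class LastSegmentSketch {

    // lastIndexOf returns -1 when there is no '.', so the whole string is
    // returned in that case.
    static String afterLastDot(String value) {
        return value.substring(value.lastIndexOf('.') + 1);
    }

    public static void main(String[] args) {
        System.out.println(afterLastDot("keep.only.the.last.word")); // prints word
    }
}
```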

    opened by yq4103 1
  • Transform Flat JSON to an array

    Hi all, need help with what's probably a basic transform, but can't find an example.

    Here's my input and desired output:

    INPUT

    { "1": "Value1", "2": "Value2", "3": "Value3" }

    Desired Output

    { "to": "table_name", "data": [ { "1": { "value": "Value1" }, "2": { "value": "Value2" }, "3": { "value": "Value3" } } ] }

    Thanks!

    opened by marcoconnor-virtustream 0
Releases(jolt-0.1.6)
  • jolt-0.1.6(Mar 28, 2022)

  • jolt-0.1.5(Aug 19, 2021)

    • BELL-5125: Fix security vulnerability in Jolt and release a new version
    • Add support for Object[] to Function
    • Added support for Object[] in Cardinality spec
    • Update README.md
    • Update OWNERS

  • jolt-0.1.1(May 1, 2018)

    String manipulation

    • leftPad and rightPad
    • trim
    • substring
    • split

    Math

    • intSubtract, doubleSubtract, and longSubtract

    Object Manipulation

    • squashNulls - works on Lists and Maps
    • recursivelySquashNulls - recursively works on Lists and Maps

    Fixes for

    • size modify function
    • JoltUtils navigate methods
    • parens in string literals
  • jolt-0.1.0(Dec 9, 2016)

  • jolt-0.0.24(Oct 26, 2016)

    More "modify-beta" functions : "divide" and "divideAndRound".

    It turns out the "elementAt" function was not broken; the params in its "method signature" were simply flipped.

    This release is being used by Bazaarvoice in production.

  • jolt-0.0.23(Oct 1, 2016)

    • Fixed bug in Jolt Sort.
    • Usability
      • Better Exceptions thrown when parsing a Json
      • Ability to specify your own ClassLoader to lookup Transform classes

    Do not use this release if you use the "modify-beta" transform.

    • More Functions for the "modify-beta" transforms, but now some are broken
      • array "avg"
      • toBoolean
      • size
      • toString()
  • jolt-0.0.22(Jul 14, 2016)

    Context

    Historically, Jolt has been focused on fixing / operating on the "format" of the input JSON. It did not have a good solution / Transform for modifying the actual data, namely the "right hand side" of the input.

    "modify" is the product of a bunch of refactoring of Shiftr.

    • The "left hand side" of the spec is basically Shiftr; it does a similar parallel tree walk of the input and the spec, and reuses a lot of the Shiftr wildcard logic.
    • The "right hand side" of modify determines what will be put in the data, while the "right hand side" of shift determines where the existing data should be moved to in the new output Map.

    Note: Modify operates on the same in-memory copy of the data, whereas Shift creates a new top-level output Map to populate.

    Usage and Special Characters

    Modify comes in 3 "flavors" that control how it operates on the input data, both at the leaf level and as it walks the tree.

    • "modify-overwrite-beta" -- (always writes)
    • "modify-default-beta" -- (writes when key/index is missing or the value at key/index is null)
    • "modify-define-beta" -- (writes when key/index is missing)
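The difference between the three flavors comes down to when a write is allowed. The following is a minimal sketch of those write rules on a plain Java Map; the `ModifyFlavors` class, the `Flavor` enum, and the `write` method are hypothetical names for illustration and are not Jolt's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the three write policies; not Jolt's actual code.
class ModifyFlavors {
    enum Flavor { OVERWRITE, DEFAULT, DEFINE }

    static void write(Map<String, Object> data, String key, Object value, Flavor flavor) {
        boolean missing = !data.containsKey(key);
        boolean isNull = !missing && data.get(key) == null;
        switch (flavor) {
            case OVERWRITE:                      // always writes
                data.put(key, value);
                break;
            case DEFAULT:                        // writes when key is missing or value is null
                if (missing || isNull) data.put(key, value);
                break;
            case DEFINE:                         // writes only when key is missing
                if (missing) data.put(key, value);
                break;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("state", null);
        write(doc, "state", "Texas", Flavor.DEFINE);
        System.out.println(doc.get("state")); // prints null: the key exists, so define skips it
        write(doc, "state", "Texas", Flavor.DEFAULT);
        System.out.println(doc.get("state")); // prints Texas: the value was null, so default writes
    }
}
```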

    The idea is to pick the base flavor you want, and then tweak modify's behavior on a case by case basis by applying node specific overrides.

    • "+key": "..." + means overwrite
    • "~key": "..." ~ means default
    • "_key": "..." _ means define

    Additionally, the "?" character can suppress the modify operation, so that it only operates if the key/index exists.

    • "key?": "..." ? means only act if input contains that key/index

    Example: say we want to process a document that may or may not have an "address" subsection.

    Spec :
    {
       "address?" : {  // If address exists in the input JSON, then match otherwise skip
           "~state" : "Texas"  // means if "state" does not exist or is null, then make it be "Texas"
       }
    }
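
For readers who think in Java, the effect of this spec can be approximated by hand. This is a sketch of the intended behavior only, with a hypothetical class name; it relies on the fact that `Map.putIfAbsent` writes when the key is missing or mapped to null, which lines up with the "~" (default) rule.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hand-written equivalent of the "address?" / "~state" spec; not Jolt code.
class AddressDefault {
    @SuppressWarnings("unchecked")
    static void defaultState(Map<String, Object> doc) {
        Object address = doc.get("address");
        if (address instanceof Map) {            // "address?": act only if the key is present
            // putIfAbsent writes when "state" is missing or mapped to null ("~state")
            ((Map<String, Object>) address).putIfAbsent("state", "Texas");
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new HashMap<>();
        address.put("state", null);
        Map<String, Object> doc = new HashMap<>();
        doc.put("address", address);
        defaultState(doc);
        System.out.println(address.get("state")); // prints Texas
    }
}
```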
    

    Functions

    Everything on the "right hand side" of modify is actually a "function". In the example above the "right hand side" of "Texas" is actually the function "insert this literal value".

    Beyond that, you can invoke "named" functions by using the "=" special character.

    Example : Say the input has a list of scores, and we want to pull individual elements out of the list.

    input :
    {
      "scores" : [ 4, 2, 8, 7, 5 ]
    }
    
    spec :
    {
          // Pull individual data out of the scores array
          "firstScore" : "=firstElement(@(1,scores))",
          "lastScore" : "=lastElement(@(1,scores))",
    
          // Assuming that the scores array is always size of 5
          "scoreAtMidPoint" : "=elementAt(@(1,scores),2)"
    }
    
    output :
    {
      "scores" : [ 4, 2, 8, 7, 5 ],
      "firstScore" : 4,
      "lastScore" : 5,
      "scoreAtMidPoint" : 8
    }
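
As a rough guide to what the three list functions return for this input, here is a plain Java sketch. The method names mirror the Jolt function names, but the class is hypothetical and, as an assumption of this sketch, the list is non-empty and the index is in range.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the list functions; assumes a non-empty list and a valid index.
class ListFunctions {
    static <T> T firstElement(List<T> list) { return list.get(0); }
    static <T> T lastElement(List<T> list) { return list.get(list.size() - 1); }
    static <T> T elementAt(List<T> list, int index) { return list.get(index); }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(4, 2, 8, 7, 5);
        System.out.println(firstElement(scores));  // prints 4
        System.out.println(lastElement(scores));   // prints 5
        System.out.println(elementAt(scores, 2));  // prints 8
    }
}
```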
    

    Available functions:

    Existence

    • isPresent - returns if arg[0] is present
    • notNull - returns if arg[0] is not null
    • isNull - returns if arg[0] is null

    Strings

    • toUpper - returns the Uppercased version of the input
    • toLower - returns the Lowercased version of the input
    • concat - String concatenate all the supplied arguments

    Lists

    • toList - returns args as list
    • firstElement - returns first element
    • lastElement - returns last element
    • elementAt - returns element at # index

    Math

    • max(args) - returns max element from the list of args
      • supports int, long, double, and their toString() values
    • min(args) - returns min element from list of args
      • supports int, long, double, and their toString() values
    • abs - returns abs of value
      • supports int, long, double, and their toString() values
      • supports list of the same inputs, returns list

    Type Conversion

    • toInteger - returns toInteger()
    • toLong - returns toLong()
    • toDouble - returns toDouble()

    Note: all of the Type Conversion functions support

    • int, long, double, and their toString() values
    • list of the same inputs, returns list
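
To make the accepted input shapes concrete, the sketch below converts a number, a numeric string, or a list of those to integers. It is an illustration under assumptions: the class is hypothetical, returning null for unconvertible input is a choice of this sketch, and fractional values are truncated, which may differ from what Jolt actually does.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a toInteger-style conversion; not Jolt's implementation.
class TypeConversion {
    // Accepts a Number, a numeric String, or a List of those. Returns null if not convertible.
    static Object toInteger(Object arg) {
        if (arg instanceof Number) {
            return ((Number) arg).intValue();          // truncates fractional values
        }
        if (arg instanceof String) {
            try {
                // BigDecimal also parses strings such as "3.0" produced by Double.toString
                return new BigDecimal(((String) arg).trim()).intValue();
            } catch (NumberFormatException e) {
                return null;
            }
        }
        if (arg instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) arg) {
                out.add(toInteger(item));              // convert each element in turn
            }
            return out;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(toInteger("42"));                          // prints 42
        System.out.println(toInteger(Arrays.asList(1L, 2.9, "3.0"))); // prints [1, 2, 3]
    }
}
```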

    Elvis Operator

    Use an Array on the "right hand side" to specify a series of functions to run / try until one returns a non-Optional.absent() result.

    • "key": [ "=func1", "=func2" ]

    Purpose: allows looking up a value and, if it is not found, applying a default.

    Example: returning to the "scores" input from above,

    spec :
    {
       // if the input document did not have a "scores" entry,
       //  or it was empty,
       //  or it did not contain any 'numbers'  
       // then fall back to null and zero
       "maxScore" : [ "=max(@(1, scores))", null],
       "minScore" : [ "=min(@(1, scores))", 0 ]
    }
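
The "try each function until one produces a value" idea can be sketched in plain Java with `java.util.Optional`. This is an analogy rather than Jolt's code: the class and method names are hypothetical, and since `java.util.Optional` cannot hold null, only the zero fallback from the spec above is modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of an Elvis-style fallback chain; not Jolt's implementation.
class ElvisChain {
    // Runs the candidates in order and returns the first result that is present.
    static <T> Optional<T> firstPresent(List<Supplier<Optional<T>>> candidates) {
        for (Supplier<Optional<T>> candidate : candidates) {
            Optional<T> result = candidate.get();
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }

    // Empty result when the list is missing or has no elements.
    static Optional<Integer> min(List<Integer> scores) {
        if (scores == null) {
            return Optional.empty();
        }
        return scores.stream().min(Integer::compare);
    }

    public static void main(String[] args) {
        List<Supplier<Optional<Integer>>> chain = new ArrayList<>();
        chain.add(() -> min(null));          // the input has no "scores" entry
        chain.add(() -> Optional.of(0));     // literal fallback
        System.out.println(firstPresent(chain).get()); // prints 0
    }
}
```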
    

    Details about the Java Implementation

    Introduced a Function interface, marked as @Deprecated as a warning

    • As it is work in progress and implementation outside Jolt is discouraged.

    Changed baseSpec#apply(...) signature to supply availability of input

    • via Optional, which is known at lower level
    • Matched signature change into Shiftr and Cardinality, now it is possible to introduce "?" into them if needed

    Beta

    The new Modify Transform in this release is sufficient for the Bazaarvoice internal project that needs it. Use cases beyond that are not fully thought out / tested.

    In general, it still feels like it needs some work / polish before we consider it done, but the ability to do Type conversions and String concatenation are compelling and often requested features.

    Plans

    Things to do to finish the "beta"

    • Make sure all the things can be "escaped".
    • If possible, fully implement the existing behavior of the current "cardinality" and "default" transforms as functions of Modify, so they can be deprecated / removed.
      • The "default" transform is dated and clunky, as it has not been refactored and curated like Shiftr has.
    • Expand the set of built in functions.
      • min and max work on lists
      • average
      • toBoolean
      • toString
      • length of list, string
      • String join : concat is good, but it can have a fencepost problem
      • sort

    After beta

    • Allow for users to specify their own functions.
      • Tricky, as one wants to provide some guard rails, but at the same time not limit what ppl can do.
    • Allow for a "registry" of Transforms and Functions.
      • But not as a simple Static Map, as that can cause problems if two libraries internally used Jolt.
    • Maybe "backport" the functions to Shiftr to allow for some interesting use-cases.

    Even further out

    • Allow functions to be specified inline with the spec as some kind of scripting language, aka JavaScript or Groovy
      • Doubly tricky
      • Build and arrangement of Jars, so that "jolt-core" does not suddenly have a Groovy dependency
      • The interface between Jolt code and the script language.
  • jolt-0.0.21(May 10, 2016)

    Primary driver is users wanting to be able to use Jolt Shiftr to process Json-LD flavored documents.

    See this test for examples. https://github.com/bazaarvoice/jolt/blob/c8e6ce62a2ea7551333d666d16ddba9f11feb7c4/jolt-core/src/test/resources/json/shiftr/escapeAllTheThings.json

    Note that, because Java, the "backslash" has to be doubled.

  • jolt-0.0.20(May 10, 2016)

  • jolt-0.0.19(Jan 4, 2016)

    Interesting and powerful feature; for an example see : https://raw.githubusercontent.com/bazaarvoice/jolt/f8f18e8544f7f3ffef8c5a8f9e059a844857387f/jolt-core/src/test/resources/json/shiftr/transposeNestedLookup.json

    Additionally, "fixed" an issue where Shiftr could not differentiate between something not existing and a legitimate null in the input data.

    However, this necessitated two small backward incompatibilities:

    1. Low-level Java interfaces used by Shiftr to output data generally got updated to have an Optional return type.
    2. If you had specs that relied upon the behavior of not passing legitimate null, then you will need to do some work.
  • jolt-0.0.18(Dec 31, 2015)

    • ArrayOrderObliviousDiffy works better now
    • "@" Transpose / Lookup logic now works from inside an Array reference
    • Shiftr Array logic handles negative numbers better
    • Shiftr can now pass input "null"s thru to the output
  • jolt-0.0.17(Dec 28, 2015)

    The "remove" transform can now fully handle arrays.

    • As top level inputs
    • Recursing thru Arrays
    • Removing specific array indices

    Shift specs can now have "." in the output path if escaped

    This lets you produce output keys like "ref.local" without them being expanded into nested maps.

    Example: RHS Shiftr values and how they would manifest in the output tree

    • "data.ref.local" --> { "data" : { "ref" : { "local" : X } } }
    • "data.ref\\.local" --> { "data" : { "ref.local" : X } }
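
One way to picture the escaping rule is as a path splitter that breaks on "." unless the dot is preceded by a backslash. The sketch below is a hypothetical illustration of that idea in plain Java and is not how Shiftr parses its specs internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical path splitter; treats "\." as a literal dot inside a key.
class EscapedPath {
    static List<String> splitPath(String path) {
        List<String> keys = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '\\' && i + 1 < path.length() && path.charAt(i + 1) == '.') {
                current.append('.');   // escaped dot stays inside the current key
                i++;                   // skip the dot we just consumed
            } else if (c == '.') {
                keys.add(current.toString());   // unescaped dot ends the current key
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        keys.add(current.toString());
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(splitPath("data.ref.local"));    // prints [data, ref, local]
        System.out.println(splitPath("data.ref\\.local"));  // prints [data, ref.local]
    }
}
```

Note that the Java string literal "data.ref\\.local" contains a single backslash at runtime, which is why the backslash appears doubled in source.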

    Additional Fixes

    • Performance fix to ArrayOrderObliviousDiffy
    • Updated dependency versions
    • Fixed issue with Travis CI

    Backwards incompatible Change

    The class ChainrFactory was in the wrong package; "bazarvoice" instead of "bazaarvoice".
    If you were referencing that class, you will need to fix your imports.

  • jolt-0.0.16(Apr 27, 2015)

    You should be able to reuse Chainr instances to perform multiple transforms. There was a bug in the "shift" operation where "per run" information was leaking into the reusable spec.

  • jolt-0.0.15(Mar 9, 2015)

    Compiles with Java 1.7

    Updated Dependencies

    • Jackson 2.2.3 -> 2.5.0
    • commons-lang 2.6 -> commons-lang3 3.3.2
    • Guava 16.0.1 -> 18.0
    • TestNG 6.8.7 -> 6.8.21
    • ArgParse 0.4.2 -> 0.4.4

    Code Cleanup based on Intellij inspections

  • jolt-0.0.14(Oct 13, 2014)

    Ability to transpose data / do rudimentary filtering -> new '@' logic

    • long standing issue with Shiftr.

    Ability to "hardcode" a value in the spec file to get placed in the output -> new '#' logic

    • created to handle a use-case of processing a Boolean value.
  • jolt-0.0.13(Aug 21, 2014)

    • Fixed Bug in Removr/StarDoublePathElement
      • StarDoublePathElement was throwing StringIndexOutOfBoundsException when the spec was of the format abc-$ and the key to match was abc-1.
      • Added more boundary condition test cases.
    • Added more utility functions in StringUtils and JsonUtils
    • Bumped the Guava version to 16.0.1
    • Fixed minor typos.
  • jolt-0.0.12(Feb 14, 2014)

    New JsonUtil interface "extracted" from existing JsonUtils class static methods

    • Interface and Impl will allow clients to specify their own pre-configured ObjectMapper
    • JsonUtils static methods refactored to use a "stock" JsonUtil instance.
    • Two JsonUtils methods deprecated

    Fixed bug in JsonUtils

    • The problem was that JsonUtils static methods were using Object.class.getResourceAsStream() which can behave oddly.
    • Now that the static methods in JsonUtils are backed by a real instance of JsonUtil, that JsonUtil can use itself to load resources.

    Fancy Jackson Annotations

    • Unit test of new JsonUtilImpl ability to take in a preconfigured Jackson Module
    • Unit test demonstrates recursive polymorphic JSON deserialization in Jackson 2.2

    Bumped Dependency Versions

    • Jackson, Guava, TestNg, ArgParse versions.
  • jolt-0.0.11(Nov 20, 2013)

    The * wildcard can now be used in Removr specs. We needed this to remove extraneous keys before a Shiftr operation; the extraneous keys were being matched and passed thru to the output by Shiftr wildcards.

    Additionally, there was a bug fix in Sortr, when sorting input arrays that are "unmodifiable".

  • jolt-0.0.10(Sep 5, 2013)

    Command Line

    • Diff Json
    • Sort Json
    • Transform Json

    New Maven "convenience" artifact

    • New "jolt-complete" artifact combines "jolt-core" and "json-utils" artifacts and provides a ChainrFactory that can create Chainr instances from the classpath, file paths, and Java File objects.
  • jolt-0.0.9(Sep 5, 2013)

  • jolt-0.0.8(Aug 20, 2013)

  • jolt-0.0.7(Aug 20, 2013)

    • Apache RAT integrated into the maven build to verify that each Java file has a copyright header.
    • @snkinard added a command line tool for Diffy. The first step to a cli for Jolt transforms.
    • Travis CI build added, automatically checks Pull requests.
    • Chainr heavily refactored
      • Custom Java Transforms can now be loaded via Guice
        • New class of transform added, the ContextualTransform.
        • Can take input and context into account
        • Canonical example is generating urls, they may need to vary, "http" vs "https", on each run of the transform.
      • Guice specific code lives in a "jolt-guice" maven artifact
  • jolt-0.0.6(Aug 20, 2013)

    • Jolt Artifacts now available from Maven Central
    • CardinalityTransform added by Sam
      • Useful if your source of JSON documents does not have consistent cardinality of its elements
      • Eg, the "photos" element varies between an array and map, depending on how many photos there actually are
    • Lots of doc updates and prep for open sourcing
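
To illustrate the cardinality problem mentioned above, the sketch below normalizes a value that is sometimes a single item and sometimes a list into a list every time. The `Cardinality` class and `toMany` method are hypothetical names for this illustration and do not reflect the CardinalityTransform's spec format or internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical normalizer: always returns a List, whatever shape the input had.
class Cardinality {
    static List<Object> toMany(Object value) {
        if (value instanceof List) {
            return new ArrayList<>((List<?>) value);   // already a list: copy it
        }
        List<Object> single = new ArrayList<>();
        single.add(value);                             // single item: wrap it
        return single;
    }

    public static void main(String[] args) {
        System.out.println(toMany("a.jpg"));                         // prints [a.jpg]
        System.out.println(toMany(Arrays.asList("a.jpg", "b.jpg"))); // prints [a.jpg, b.jpg]
    }
}
```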
  • jolt-0.0.5(Aug 20, 2013)

    • Added Shiftr "#" wildcard
      • Enables Shiftr to transform maps into arrays (with a non-deterministic order)
    • Enabled Jackson 2.x ability to have comments in JSON
      • It is nice to be able to document transforms
      • Commented unit test transforms to serve as documentation
  • jolt-0.0.4(Aug 20, 2013)

  • jolt-0.0.3(Aug 20, 2013)

Owner
Bazaarvoice
Sawmill is a JSON transformation Java library

Update: June 25, 2020 The 2.0 release of Sawmill introduces a breaking change to the GeoIpProcessor to comply with the updated license of the MaxMind

Logz.io 100 Jan 1, 2023
Convert Java to JSON. Convert JSON to Java. Pretty print JSON. Java JSON serializer.

json-io Perfect Java serialization to and from JSON format (available on Maven Central). To include in your project: <dependency> <groupId>com.cedar

John DeRegnaucourt 303 Dec 30, 2022
JSON query and transformation language

JSLT JSLT is a complete query and transformation language for JSON. The language design is inspired by jq, XPath, and XQuery. JSLT can be used as: a q

Schibsted Media Group 510 Dec 30, 2022
A simple java JSON deserializer that can convert a JSON into a java object in an easy way

JSavON A simple java JSON deserializer that can convert a JSON into a java object in an easy way. This library also provide a strong object convertion

null 0 Mar 18, 2022
Generate Java types from JSON or JSON Schema and annotates those types for data-binding with Jackson, Gson, etc

jsonschema2pojo jsonschema2pojo generates Java types from JSON Schema (or example JSON) and can annotate those types for data-binding with Jackson 2.x

Joe Littlejohn 5.9k Jan 5, 2023
Essential-json - JSON without fuss

Essential JSON Essential JSON Rationale Description Usage Inclusion in your project Parsing JSON Rendering JSON Building JSON Converting to JSON Refer

Claude Brisson 1 Nov 9, 2021
A 250 lines single-source-file hackable JSON deserializer for the JVM. Reinventing the JSON wheel.

JSON Wheel Have you ever written scripts in Java 11+ and needed to operate on some JSON string? Have you ever needed to extract just that one deeply-n

Roman Böhm 14 Jan 4, 2023
A Java serialization/deserialization library to convert Java Objects into JSON and back

Gson Gson is a Java library that can be used to convert Java Objects into their JSON representation. It can also be used to convert a JSON string to a

Google 21.7k Jan 8, 2023
A universal types-preserving Java serialization library that can convert arbitrary Java Objects into JSON and back

A universal types-preserving Java serialization library that can convert arbitrary Java Objects into JSON and back, with a transparent support of any kind of self-references and with a full Java 9 compatibility.

Andrey Mogilev 9 Dec 30, 2021
A modern JSON library for Kotlin and Java.

Moshi Moshi is a modern JSON library for Android and Java. It makes it easy to parse JSON into Java objects: String json = ...; Moshi moshi = new Mos

Square 8.7k Dec 31, 2022
Genson a fast & modular Java <> Json library

Genson Genson is a complete json <-> java conversion library, providing full databinding, streaming and much more. Gensons main strengths? Easy to use

null 212 Jan 3, 2023
Lean JSON Library for Java, with a compact, elegant API.

mJson is an extremely lightweight Java JSON library with a very concise API. The source code is a single Java file. The license is Apache 2.0. Because

Borislav Iordanov 77 Dec 25, 2022
Elide is a Java library that lets you stand up a GraphQL/JSON-API web service with minimal effort.

Elide Opinionated APIs for web & mobile applications. Read this in other languages: 中文. Table of Contents Background Documentation Install Usage Secur

Yahoo 921 Jan 3, 2023
JSON Library for Java with a focus on providing a clean DSL

JSON Library for Java with a focus on providing a clean DSL

Vaishnav Anil 0 Jul 11, 2022
High performance JVM JSON library

DSL-JSON library Fastest JVM (Java/Android/Scala/Kotlin) JSON library with advanced compile-time databinding support. Compatible with DSL Platform. Ja

New Generation Software Ltd 835 Jan 2, 2023
Screaming fast JSON parsing and serialization library for Android.

#LoganSquare The fastest JSON parsing and serializing library available for Android. Based on Jackson's streaming API, LoganSquare is able to consiste

BlueLine Labs 3.2k Dec 18, 2022
A JSON Transmission Protocol and an ORM Library for automatically providing APIs and Docs.

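A zero-code, hot-updating, fully automatic ORM library: no code is needed for backend APIs and docs, and the frontend (client) customizes the data and structure of the returned JSON. A JSON Transmission Protocol and an ORM Library for automatically providing APIs and Docs.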

Tencent 14.4k Dec 31, 2022
A fast JSON parser/generator for Java.

fastjson Fastjson is a Java library that can be used to convert Java Objects into their JSON representation. It can also be used to convert a JSON str

Alibaba 25.1k Dec 31, 2022
A reference implementation of a JSON package in Java.

JSON in Java [package org.json] Click here if you just want the latest release jar file. Overview JSON is a light-weight language-independent data int

Sean Leary 4.2k Jan 6, 2023