That’s it, you’re all set to start playing with the packaged data. You can also set options such as loading a schema or enforcing strict validation, so be sure to go through the project’s README for more detail.
Now that you have a Data Package instance, let’s see what the data looks like. A data package can contain more than one resource, so you have to use the Package.getResource() method to specify which resource you’d like to access.
Let’s iterate over the data:
Notice that every value is fetched as a String. This may not be what you want, particularly for the atomic number and mass. Alternatively, you can trigger data type inference and casting like this:
And that’s it, your data is now associated with the appropriate data types!
Inferring the Schema
We wouldn’t have had to infer the data types if we had included a Table Schema when creating an instance of our Data Package. If a Table Schema is not available, one can be inferred and created with tableschema-java:
The type inference algorithm tries to cast each value to the available types, and every successful cast increments a popularity score for that type. At the end, the type with the best score is returned.
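To make the scoring idea concrete, here is a minimal, library-independent sketch for a single column. It is not tableschema-java’s actual implementation: the class name TypeVoteSketch, the four candidate types, the regular expressions, and the tie-breaking rule (prefer the most specific type) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of popularity-based type inference; not the library's code.
final class TypeVoteSketch {

    // Candidate types, ordered from most to least specific (an assumption).
    // The insertion order is reused below as the tie-breaker.
    private static final Map<String, Predicate<String>> CASTERS = new LinkedHashMap<>();

    static {
        CASTERS.put("integer", v -> v.matches("[+-]?\\d+"));
        CASTERS.put("number", v -> v.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?"));
        CASTERS.put("boolean", v -> v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false"));
        CASTERS.put("string", v -> true); // every value can be read as a string
    }

    // Returns the name of the type with the highest score for one column.
    static String inferType(List<String> values) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (String value : values) {
            for (Map.Entry<String, Predicate<String>> caster : CASTERS.entrySet()) {
                // Each successful "cast" adds one point to that type's score.
                if (caster.getValue().test(value)) {
                    scores.merge(caster.getKey(), 1, Integer::sum);
                }
            }
        }
        // Walk the types in specificity order; a strict ">" keeps the more
        // specific type when scores tie. An empty column falls back to "string".
        String best = "string";
        int bestScore = 0;
        for (String type : CASTERS.keySet()) {
            int score = scores.getOrDefault(type, 0);
            if (score > bestScore) {
                best = type;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(inferType(Arrays.asList("1", "2", "3")));
        System.out.println(inferType(Arrays.asList("1.008", "4", "6.94")));
        System.out.println(inferType(Arrays.asList("1", "Hydrogen")));
    }
}
```

Because "string" accepts every value, it always reaches the maximum score; the tie-break is what lets a narrower type win when all values in the column fit it.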
The inference algorithm traverses all of the table’s rows and attempts to cast every single value of the table. When dealing with large tables, you might want to limit the number of rows that the inference algorithm processes:
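The effect of a row limit can be illustrated independently of the library with a small sketch. Everything here is hypothetical: the class RowLimitSketch, the method inferColumnTypes, the convention that a negative limit means "no limit", and the simplified rule that a column gets a type only if every scanned value fits it. The third sample row contains a contrived "n/a" value to show what a limit can miss.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of row-limited inference; not tableschema-java's API.
final class RowLimitSketch {

    private static final String INTEGER = "[+-]?\\d+";
    private static final String NUMBER = "[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?";

    // Infers one type name per column while reading at most rowLimit rows.
    // A negative rowLimit means "scan everything" (a convention of this sketch).
    static List<String> inferColumnTypes(Iterator<String[]> rows, int columnCount, int rowLimit) {
        int[] integerHits = new int[columnCount];
        int[] numberHits = new int[columnCount];
        int rowsSeen = 0;

        // Stop as soon as the limit is reached, so later rows are never read.
        while ((rowLimit < 0 || rowsSeen < rowLimit) && rows.hasNext()) {
            String[] row = rows.next();
            for (int c = 0; c < columnCount && c < row.length; c++) {
                if (row[c].matches(INTEGER)) integerHits[c]++;
                if (row[c].matches(NUMBER)) numberHits[c]++;
            }
            rowsSeen++;
        }

        List<String> types = new ArrayList<>();
        for (int c = 0; c < columnCount; c++) {
            if (rowsSeen > 0 && integerHits[c] == rowsSeen) {
                types.add("integer");
            } else if (rowsSeen > 0 && numberHits[c] == rowsSeen) {
                types.add("number");
            } else {
                types.add("string");
            }
        }
        return types;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1", "Hydrogen", "1.008"},
                new String[] {"2", "Helium", "4.0026"},
                new String[] {"n/a", "Lithium", "6.94"}); // contrived irregular value

        System.out.println(inferColumnTypes(rows.iterator(), 3, 2));  // first two rows only
        System.out.println(inferColumnTypes(rows.iterator(), 3, -1)); // all rows
    }
}
```

With a limit of two rows, the first column looks like an integer column because the irregular third row is never read; scanning all rows demotes it to string. A limit therefore trades accuracy for speed, so it works best when the first rows are representative of the whole table.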
Be sure to go through tableschema-java’s README as well to learn more about how to operate with Table Schema.
If you discover an issue that you’d like to contribute a fix for, or if you would like to extend the functionality:
Make sure that all tests pass, and submit a PR with your contributions once you’re ready.