GRASS GSoC 2024 Add JSON output
;Abstract
::At the moment, most of the tools in GRASS have custom human-readable outputs in plain text. Some of these modules could benefit from storing their output in a portable and commonly used data format. The aim of my project is to use the parson library in various tools so that they can produce JSON outputs. The addition of JSON as an output format will be accompanied by the addition of Python test cases to verify that the output works as intended and to avoid regressions in the future. An option to specify the desired output format (plain or JSON) will also be added to each of the updated tools. The layout of the JSON format will be discussed with mentors prior to implementation and will be optimized for easy ingestion with pandas.
__TOC__
== Pull Requests ==
{| class="wikitable"
|+
|-
! Module !! PR Title !! PR Link !! Status at end of GSoC Period
|-
| lib || add standard parser option for JSON formatting || https://github.com/OSGeo/grass/pull/3704/ || merged
|-
| r.info || add JSON output || https://github.com/OSGeo/grass/pull/3744/ || merged
|-
| v.info || add JSON output || https://github.com/OSGeo/grass/pull/3755/ || merged
|-
| r.univar || add JSON output || https://github.com/OSGeo/grass/pull/3783/ || merged
|-
| v.univar || add JSON output || https://github.com/OSGeo/grass/pull/3784/ || merged
|-
| r.profile || add JSON output || https://github.com/OSGeo/grass/pull/3872/ || merged
|-
| r.stats || add JSON output || https://github.com/OSGeo/grass/pull/3884/ || open
|-
| r.report || add JSON output || https://github.com/OSGeo/grass/pull/3935/ || merged
|-
| g.region || add JSON output || https://github.com/OSGeo/grass/pull/3941/ || merged
|-
| v.distance || add JSON output || https://github.com/OSGeo/grass/pull/3942/ || open
|-
| r.category || add JSON output || https://github.com/OSGeo/grass/pull/4018/ || merged
|-
| v.category || add JSON output || https://github.com/OSGeo/grass/pull/4020/ || open
|-
| db.describe || add JSON output || https://github.com/OSGeo/grass/pull/4021/ || merged
|-
| v.to.db || add JSON output || https://github.com/OSGeo/grass/pull/4036/ || open
|-
| g.proj || add JSON output || https://github.com/OSGeo/grass/pull/4104/ || open
|-
| r.object.geometry || add JSON output || https://github.com/OSGeo/grass/pull/4105/ || merged
|-
| g.region || fix ruff lint error in tests || https://github.com/OSGeo/grass/pull/4167/ || merged
|-
| db.describe || fix illegal memory access report || https://github.com/OSGeo/grass/pull/4202/ || merged
|}
== Reports ==
Introduction: https://discourse.osgeo.org/t/gsoc-2024-introduction-juno/28253

Community Bonding Period: https://discourse.osgeo.org/t/gsoc-2024-week-0-report-add-json-support-to-grass-modules/28299
# Week 1: https://discourse.osgeo.org/t/gsoc-2024-week-1-report-add-json-support-to-grass-modules/30673
# Week 2: https://discourse.osgeo.org/t/gsoc-2024-week-2-report-add-json-support-to-grass-modules/30764
# Week 3: https://discourse.osgeo.org/t/gsoc-2024-week-3-report-add-json-support-to-grass-modules/30791
# Week 4: https://discourse.osgeo.org/t/gsoc-2024-week-4-report-add-json-support-to-grass-modules/30834
# Week 5: https://discourse.osgeo.org/t/gsoc-2024-week-5-report-add-json-support-to-grass-modules/30882
# Week 6: https://discourse.osgeo.org/t/gsoc-2024-week-6-report-add-json-support-to-grass-modules/30906
# Week 7: https://discourse.osgeo.org/t/gsoc-2024-week-7-report-add-json-support-to-grass-modules/30946
# Week 8: https://discourse.osgeo.org/t/gsoc-2024-week-8-report-add-json-support-to-grass-modules/30993
# Week 9: https://discourse.osgeo.org/t/gsoc-2024-week-9-report-add-json-support-to-grass-modules/31007
# Week 10: https://discourse.osgeo.org/t/gsoc-2024-week-10-report-add-json-support-to-grass-modules/49643
# Week 12: https://discourse.osgeo.org/t/gsoc-2024-week-12-report-add-json-support-to-grass-modules/49735
Final: https://discourse.osgeo.org/t/gsoc-2024-final-report-add-json-output-to-different-tools-in-c/49784

{{GSoC}}
== Final Report ==
=== The State of the Art Before GSoC ===
Before this project, most GRASS GIS tools produced plain-text output, which had to be parsed or converted manually before other software could use it. Some modules already supported JSON, but inconsistently, with different flags or options from tool to tool. This made it hard to automate tasks or to feed GRASS GIS output directly into modern data processing pipelines.
=== The Addition (Added Value) That My Project Brought to the Software ===
The project added JSON output support to 16 GRASS GIS tools. Users can now specify the desired output format (plain text or JSON), which makes it easier to integrate the tools with data analysis software and workflows. The project also standardized the options of tools that already had JSON support, improving consistency across the platform. Python test cases added for these outputs help keep them reliable and guard against regressions.
=== Potential Future Work ===
JSON support for 4 modules is still a work in progress and will hopefully be completed soon. Further work is needed to extend JSON output to the remaining tools in GRASS GIS. Future developers can build on this foundation by covering additional modules or by enhancing the JSON schema to support more complex use cases.
== Examples ==
=== Using r.category JSON output with Python ===
<source lang="python">
import json

import grass.script as gs

# Ask r.category for JSON instead of plain text
output = gs.read_command(
    "r.category",
    map="towns",
    output_format="json",
)
# Parse the JSON text into a list of dictionaries and pretty-print it
categories = json.loads(output)
print(json.dumps(categories, indent=4))
</source>
<pre>
[
    {
        "category": 1,
        "description": "CARY"
    },
    {
        "category": 2,
        "description": "GARNER"
    },
    {
        "category": 3,
        "description": "APEX"
    },
    {
        "category": 4,
        "description": "RALEIGH-CITY"
    },
    {
        "category": 5,
        "description": "RALEIGH-SOUTH"
    },
    {
        "category": 6,
        "description": "RALEIGH-WEST"
    }
]
</pre>
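The list-of-records layout in the output above converts easily into a lookup table. The following is a minimal sketch using only the Python standard library; it assumes the JSON text has already been obtained from r.category, and uses a hard-coded sample (the first three records shown above) in place of a real GRASS session. The names <code>sample_output</code> and <code>labels</code> are illustrative only.

```python
import json

# Sample copied from the r.category output shown above (first three records only).
# In a real session this text would come from the tool, not from a literal.
sample_output = """
[
    {"category": 1, "description": "CARY"},
    {"category": 2, "description": "GARNER"},
    {"category": 3, "description": "APEX"}
]
"""

records = json.loads(sample_output)

# Build a category-number -> description lookup from the list of records.
labels = {record["category"]: record["description"] for record in records}

print(labels[2])
```

A dictionary keyed by category number is convenient when raster cell values need to be translated to their labels one at a time.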
=== Using r.profile JSON output with pandas and Matplotlib ===
<source lang="python">
from io import StringIO

import grass.script as gs
import pandas as pd
import matplotlib.pyplot as plt

# Run r.profile with coordinates (-g) and RGB color values (-c) in the output
elevation = gs.read_command(
    "r.profile",
    input="elevation",
    coordinates="641712,226095,641546,224138,641546,222048,641049,221186",
    format="json",
    flags="gc",
)

# Wrap the JSON text in StringIO; recent pandas versions deprecate passing a literal string
df = pd.read_json(StringIO(elevation))

# Convert the RGB color values to hex format for Matplotlib
df["color"] = df.apply(
    lambda x: "#{:02x}{:02x}{:02x}".format(int(x["red"]), int(x["green"]), int(x["blue"])),
    axis=1,
)
print(df)

# Create the scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(df["distance"], df["elevation"], c=df["color"], marker="o")
plt.title("Profile of Distance vs. Elevation with Color Coding")
plt.xlabel("Distance (meters)")
plt.ylabel("Elevation")
plt.grid(True)
plt.show()
</source>
<pre>
          easting       northing     distance   elevation  red  green  blue    color
0   641712.000000  226095.000000     0.000000   84.530815  111    255     0  #6fff00
1   641669.739905  225596.789117   500.000000   97.633720  255    244     0  #fff400
2   641627.479809  225098.578233  1000.000000  104.868874  255    198     0  #ffc600
3   641585.219714  224600.367350  1500.000000   97.171303  255    247     0  #fff700
4   641546.000000  224138.000000  1964.027749   81.972504   79    255     0  #4fff00
5   641546.000000  223638.000000  2464.027749   72.764458    0    245    29  #00f51d
6   641546.000000  223138.000000  2964.027749   80.820168   64    255     0  #40ff00
7   641546.000000  222638.000000  3464.027749   71.326347    0    241    42  #00f12a
8   641546.000000  222138.000000  3964.027749   71.669518    0    242    39  #00f227
9   641546.000000  222048.000000  4054.027749   71.669518    0    242    39  #00f227
10  641296.254788  221614.840296  4554.027749   78.522743   35    255     0  #23ff00
</pre>
[[File:Distance Elevation Color Coded.png]]
=== Using r.info JSON output with pandas and session setup ===
<source lang="python">
import subprocess
import sys

import pandas as pd

# Make the GRASS Python packages importable
sys.path.append(subprocess.check_output(["grass", "--config", "python_path"], text=True).strip())

import grass.script as gs

# Start a GRASS session in the given project
gs.setup.init("~/grassdata/nc_spm_08_grass7/")
</source>
<source lang="python">
data = gs.parse_command("r.info", map="elevation", format="json")
print(data["cells"])
</source>
<pre>
2025000
</pre>
<source lang="python">
print(pd.DataFrame([data]).T)
</source>
<pre>
...
north 228500
south 215000
nsres 10
east 645000
west 630000
ewres 10
...
</pre>
=== Using r.info JSON output with jq in interactive shell ===
<source lang="bash">
r.info lakes format=json | jq '.title'
</source>
<pre>
"South-West Wake county: Wake county lakes"
</pre>
<source lang="bash">
r.info elevation format=json | jq .min,.max
</source>
<pre>
55.578792572021484
156.32986450195312
</pre>
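The same values can be read in Python once the JSON text is available. The sketch below is only illustrative: it uses the standard library and a hard-coded sample containing just the two keys shown in the jq output above (the real r.info JSON has more keys), and the variable names are assumptions.

```python
import json

# Sample containing only the "min" and "max" values shown in the jq output above.
# A real r.info JSON document contains many more keys.
sample_output = '{"min": 55.578792572021484, "max": 156.32986450195312}'

info = json.loads(sample_output)

# Derive the value range of the raster from the two parsed numbers.
value_range = info["max"] - info["min"]
print(f"range: {value_range:.2f}")
```

Because the numbers arrive as JSON numbers rather than formatted text, they can be used in arithmetic directly, with no string parsing.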
=== Using r.info JSON output with jq from the command line ===
<source lang="bash">
$ grass-dev --tmp-mapset "~/grassdata/nc_spm_08_grass7/" --exec r.info map=elevation format=json | jq '.title'
</source>
<pre>
"South-West Wake county: Elevation NED 10m"
</pre>
Latest revision as of 20:31, 22 August 2024
Accepted Google Summer of Code 2024 project.
{| class="wikitable"
|-
! Student Name
| Kriti Birda
|-
! Organization
| OSGeo - Open Source Geospatial Foundation
|-
! Mentor Name
| Corey White and Vaclav Petras
|-
! Title
| Add JSON output to different GRASS tools in C
|}