ggsql: A Grammar of Graphics for SQL
128 points - today at 12:51 PM
SourceComments
I was kind of guessing that it doesn't run in a database, that it's a SQL-like syntax for a visualisation DSL handled by front end chart library.
That appears to be what is described in https://ggsql.org/get_started/anatomy.html
But then https://ggsql.org/faq.html has a section, "Can I use SQL queries inside the VISUALISE clause," which says, "Some parts of the syntax are passed on directly to the database".
The homepage says "ggsql interfaces directly with your database"
But it's not shown how that happens AFAICT
confused
Or is the idea that SQL is such a great language to write in that a lot of people will be thrilled to do their ggplots in this SQL-like language?
EDIT: OK, after looking at almost all of the documentation, I think I've finally figured it out. It's a standalone visualization app with a SQL-like API that currently has backends for DuckDB and SQLite and renders plots with Vegalite. They plan to support more backends and renderers in the future. As a commenter below said, it's supposed to help SQL specialists who don't know Python or R make visualizations.
It would be nice if it included a rendering engine.
This can replace a lot of Excel in the end.
It makes so much sense now that it exists!
my questions are less about the syntax, which i'm largely familiar with knowing both SQL and ggplot.
i'm more interested in the backend architecture. Looking at the Cargo.toml [1], I was surprised to not see a visualization dependency like D3 or Vega. Is this intentional?
I'm certainly going to take this for a spin and I think this could be incredible for agentic analytics. I'm mostly curious right now what "deployment" looks like both currently in a utopian future.
utopia is easier -- what if databases supported it directly?!? but even then I think I'd rather have databases spit out an intermediate representation (IR) that could be handed to a viz engine, similar to how vega works. or perhaps the SQL is the IR?!
another question that arises from the question of composability: how distinct would a ggplot IR be from a metrics layer spec? could i use ggsql to create an IR that I then use R's ggplot to render (or vise versa maybe?)
as for the deployment story today, I'll likely learn most by doing (with agents). My experiment will be to kick off an agent to do something like: extract this dataset to S3 using dlt [2], model it using dbt [3], then use ggsql to visualize.
p.s. @thomasp85, I was a big fan of tidygraph back in the day [4]. love how small our data world is.
[1]: https://github.com/posit-dev/ggsql/blob/main/Cargo.toml
[2]: https://github.com/dlt-hub/dlt
[3]: https://github.com/dbt-labs/dbt-fusion
[4]: https://stackoverflow.com/questions/46466351/how-to-hide-unc...
2) how to make manual adjustments?