TL;DR This is an article about stumbling into an interesting syntax for describing graphs. If you want to see it in action you can head straight for the demos!
About 18 months ago I created Flowchart Fun: an app for generating flowcharts from text.
To create your flowchart, the app determines the nodes and edges from the input text according to a set of rules:
- text creates a node
- an indent creates an edge
- text before a colon (:) labels an edge
Node A
  edge to: Node B
Simple! Elegant! Chic!
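The three rules above can be sketched in a few lines of Python. This is only an illustration, not how Flowchart Fun is actually implemented: the function name `parse`, the two-space indent, and the tuple shapes are all assumptions made for the example.

```python
def parse(text, indent_width=2):
    """Return (nodes, edges) using the three rules: text, indent, colon."""
    nodes = []   # node labels, in document order
    edges = []   # (source_index, target_index, edge_label)
    stack = []   # (depth, node_index) for the current chain of ancestors

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        depth = (len(raw) - len(raw.lstrip(" "))) // indent_width

        # Rule 3: text before a colon labels the incoming edge.
        head, sep, rest = line.partition(":")
        if sep and rest.strip():
            edge_label, label = head.strip(), rest.strip()
        else:
            edge_label, label = "", line

        # Rule 1: text creates a node.
        nodes.append(label)
        index = len(nodes) - 1

        # Rule 2: an indent creates an edge from the nearest shallower line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if stack:
            edges.append((stack[-1][1], index, edge_label))
        stack.append((depth, index))

    return nodes, edges


nodes, edges = parse("Node A\n  edge to: Node B")
print(nodes)  # ['Node A', 'Node B']
print(edges)  # [(0, 1, 'edge to')]
```

The stack of ancestors is what makes this tree-shaped: every line gets at most one parent, which is exactly the limitation the next section runs into.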
But as with all things, over time some complexity crept in.
#1 - Too Many Parents
As long as nodes have zero or one parent, creating edges with indentation works great, e.g. a tree 🌳 or document outline; however, pretty often flowcharts contain at least one node with multiple parents.
In our syntax a node only "lives" on one line, so we added the pointer syntax () to reference nodes elsewhere in the document.
Are you hungry?
  yes: Do you want to eat healthy?
    yes: No you don't
      oh right: CAKE!
    hell no: (CAKE!)
  no: yes you are
    (CAKE!)
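One way to picture pointer resolution is a two-pass parse: first collect the real nodes, then create edges, letting a line wrapped in parentheses reuse an existing node instead of creating a new one. The sketch below is hypothetical (the name `parse_with_pointers`, the two-space indent, and the rule that pointers cannot have children are assumptions), not the app's actual code.

```python
def parse_with_pointers(text, indent_width=2):
    """Parse indented text where a body like "(Label)" points at an existing node."""
    rows = []  # (depth, edge_label, body, is_pointer)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        depth = (len(raw) - len(raw.lstrip(" "))) // indent_width
        head, sep, rest = line.partition(":")
        if sep and rest.strip():
            edge_label, body = head.strip(), rest.strip()
        else:
            edge_label, body = "", line
        is_pointer = body.startswith("(") and body.endswith(")")
        if is_pointer:
            body = body[1:-1]
        rows.append((depth, edge_label, body, is_pointer))

    # Pass 1: only non-pointer lines create nodes.
    nodes = [body for _, _, body, is_pointer in rows if not is_pointer]
    # If two nodes share a label, the later one wins in this sketch.
    index_of = {label: i for i, label in enumerate(nodes)}

    # Pass 2: create edges; pointers resolve to an existing node index.
    edges, stack, next_node = [], [], 0
    for depth, edge_label, body, is_pointer in rows:
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if is_pointer:
            target = index_of[body]  # KeyError if the pointer names no node
        else:
            target = next_node
            next_node += 1
        if stack:
            edges.append((stack[-1][1], target, edge_label))
        if not is_pointer:
            stack.append((depth, target))
    return nodes, edges


CAKE = "\n".join([
    "Are you hungry?",
    "  yes: Do you want to eat healthy?",
    "    yes: No you don't",
    "      oh right: CAKE!",
    "    hell no: (CAKE!)",
    "  no: yes you are",
    "    (CAKE!)",
])

nodes, edges = parse_with_pointers(CAKE)
cake = nodes.index("CAKE!")
print([nodes[s] for s, t, _ in edges if t == cake])
```

Running the two passes separately means a pointer can refer to a node that is defined further down the document.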
#2 - Long Labels
When the label is short, like CAKE!, creating pointers is as easy as... pie, but what about when the label is a college essay?
Typing long labels twice (once for the node and once for the pointer) would be time-consuming and clutter the document, not to mention that the label could change and break the pointer. 😕
To solve this problem we use the concept of IDs. We can give our node an ID [a] and reference the ID in the pointer (a). Crisis averted!
[a] This is an extraordinarily long label that will flow onto many lines and ultimately cause a lot of heartache for this otherwise simple graph.
b
  (a)
#3 - Adding Styles
As a frontend developer, I love this solution. After all, the DOM is a tree of nodes too.
To add colors and shapes, I brought Cytoscape's concept of classes into the syntax.
[a.blue] This is an extraordinarily long label that will flow onto many lines and ultimately cause a lot of heartache for this otherwise simple graph.
[.star.yellow] b
  (a)
Not bad… could it be better?
This is how Flowchart Fun works today, but I’ve always felt it missed the opportunity of piggy-backing on a more well-known syntax: CSS Selectors.
Many people are familiar with # representing IDs, . representing classes, and being able to string them together to refer to specific elements in the DOM. Consider this alternative:
#a.blue This is an extraordinarily long label that will flow onto many lines and ultimately cause a lot of heartache for this otherwise simple graph.
.star.yellow b
  (#a)
With that, we have something that looks a little more familiar! For my fellow frontend developers, it may even remind you of Pug or Jade. The familiarity is already a win, but after staring at this revised syntax for a bit, something else clicked.
Once upon a time, someone wrote to me asking how they could use the syntax to create Bayesian Decision Trees. Naturally, being a very intelligent smart person, I immediately knew what they were talking about, but in case you do not, some internet searching reveals that it's something like this:
The idea is that each edge stores a probability. Then, the probability of arriving at a node can be determined by walking up the tree and multiplying the edges on the way.
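As a rough sketch of that walk up the tree, assuming every node has at most one parent: the function name, the dictionary shapes, and the weather example are all made up for illustration.

```python
def arrival_probability(node, parent_of, edge_probability):
    """Multiply the edge probabilities on the path from `node` up to the root."""
    probability = 1.0
    while node in parent_of:  # the root has no entry in parent_of
        parent = parent_of[node]
        probability *= edge_probability[(parent, node)]
        node = parent
    return probability


# Hypothetical tree: each child maps to its parent, each edge to a probability.
parent_of = {"rain": "start", "dry": "start", "late": "rain", "on time": "rain"}
edge_probability = {
    ("start", "rain"): 0.3,
    ("start", "dry"): 0.7,
    ("rain", "late"): 0.5,
    ("rain", "on time"): 0.5,
}

# 0.5 * 0.3, so roughly 0.15
print(arrival_probability("late", parent_of, edge_probability))
```

The interesting part is less the arithmetic than the requirement behind it: each edge has to carry a number, which the original syntax had no obvious place for.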
I built a crude version of this idea and eventually moved on, but one thing always stuck with me: being able to associate arbitrary data with nodes and edges would be really powerful. 🪄
I eventually had the galaxy brain moment. CSS Selectors. Specifically, the attribute syntax for CSS selectors (i.e., [key=value]) is a known way to target (or in our case, express) auxiliary data about a node.
#id.class1.class2[key=value][key2=value2] Node Label
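To make the shape of that line concrete, here is a minimal sketch that splits it into an ID, classes, key/value data, and a label using Python's `re` module. The pattern and the `parse_line` helper are assumptions for illustration; they handle only this simple form (no quoted values, no escaping) and are not the grammar of any real library.

```python
import re

LINE = re.compile(
    r"""
    ^\s*
    (?:\#(?P<id>[\w-]+))?                 # optional #id
    (?P<classes>(?:\.[\w-]+)*)            # zero or more .class
    (?P<attrs>(?:\[[^\]=]+=[^\]]*\])*)    # zero or more [key=value]
    \s*(?P<label>.*)$                     # everything else is the label
    """,
    re.VERBOSE,
)
ATTR = re.compile(r"\[([^\]=]+)=([^\]]*)\]")


def parse_line(line):
    """Split one line into id, classes, data, and label."""
    match = LINE.match(line)
    return {
        "id": match.group("id"),
        "classes": [c for c in match.group("classes").split(".") if c],
        "data": dict(ATTR.findall(match.group("attrs"))),  # values stay strings
        "label": match.group("label").strip(),
    }


print(parse_line("#id.class1.class2[key=value][key2=value2] Node Label"))
```

Note that every value comes back as a string; deciding whether `12396372` is a number is left to whatever consumes the data.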
Now, in addition to capturing the relationship between data points (nodes, edges) we can also store information about our data points; effectively storing tabular data. Look ma, I'm a CSV!
[population=12396372] São Paulo
[population=6775561] Rio de Janeiro
[population=3094325] Distrito Federal
[population=2900319] Bahia
[population=2703391] Ceará
Taking it a step further, rather than just being descriptors, we could turn around and use our selectors the way Håkon Wium Lie (the myth, the legend, the inventor of CSS) originally intended: to target a set of nodes in our graph. For example:
.color red
.color blue
.color yellow
// create edges from node 'paintbrush' to 'red', 'blue', and 'yellow'
// with the label 'paints'
paintbrush
  paints: (.color)
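Here is one hypothetical way a pointer like (.color) could fan out into several edges. The `select` and `expand_pointer` helpers and the node dictionaries are assumptions for this sketch, and only bare `#id` and `.class` selectors are supported.

```python
def select(nodes, selector):
    """Return the nodes matched by a '#id' or '.class' selector (nothing fancier)."""
    if selector.startswith("#"):
        return [n for n in nodes if n.get("id") == selector[1:]]
    if selector.startswith("."):
        return [n for n in nodes if selector[1:] in n.get("classes", [])]
    # Otherwise fall back to matching the label, like the original pointers.
    return [n for n in nodes if n["label"] == selector]


def expand_pointer(source, selector, edge_label, nodes):
    """Create one (source, target, label) edge per node the selector matches."""
    return [
        (source["label"], target["label"], edge_label)
        for target in select(nodes, selector)
    ]


nodes = [
    {"label": "red", "classes": ["color"]},
    {"label": "blue", "classes": ["color"]},
    {"label": "yellow", "classes": ["color"]},
    {"label": "paintbrush", "classes": []},
]

for edge in expand_pointer(nodes[3], ".color", "paints", nodes):
    print(edge)
```

Because a selector can match any number of nodes, a single pointer line now describes a set of edges rather than exactly one.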
If you've read this far, then you're either great at procrastinating or you have the same burning passion for quirky domain-specific languages as I do. Either way, I promise I'm getting to the point.
This is what I think is cool about this: instead of just describing a flowchart, this syntax can be used to describe a ton of different types of data visualizations. It can be used to produce bar charts, area charts, or even Sankey diagrams.
We can even use the same document to produce multiple types of graphs to illustrate different properties of the data. Wowowow!
A CSS-Inspired Syntax for Flowcharts?
Sure, why not! This is what I've done so far:
What do you think? 👀
I'm really interested in getting some feedback, especially for these questions:
- What could people see this being used for?
- What other examples should I add to the site?
- I named the repo Graph Selector Syntax, but that's not very fun. Can anyone suggest a name?
- Should I migrate Flowchart Fun to this syntax?
Thank you for reading! Find me at @tone_row_ on Twitter