A C++ API for Vega-Lite
In this post, we present the first public release of XVega, a C++ library for producing Vega-Lite charts.
Data science workflows differ from traditional software development in that engineers make use of available tools to explore and reason about a problem. In such exploratory work, engineers load data, crunch numbers, produce simple visualizations and iterate… Progress happens in quick incremental iterations, which is possible when tooling does not get in the way.
This kind of interactive computing is generally associated with the Python or R programming languages. However, with the advent of the Cling C++ interpreter from CERN, and the subsequent development of the xeus-cling Jupyter kernel, new possibilities have opened up in this space.
The Jupyter stack, which started in the scientific Python community, has evolved into a language-agnostic framework that can now be leveraged by C++ developers. It bridges the gap between the countless scientific computing libraries and tools available in C++ and the Jupyter ecosystem.
The scientific C++ stack now has numerous projects under its belt, such as xtensor and xframe. However, there is little support for visualization, especially for interactive plots. Libraries such as matplotlib-cpp and matplotplusplus do exist, with plotting APIs that resemble the original matplotlib library, but they suffer from the same drawbacks as the original: an imperative API and the confusion between the dual object-oriented and state-based interfaces.
Given these shortcomings, and since JupyterLab already comes with support for Vega and Vega-Lite charts (through the application/vnd.vegalite.v3+json MIME type), one can leverage this support to bridge the gap rather than reinvent the wheel. Apart from standalone use, such a system could also be integrated into other projects such as xeus-SQLite.
The main idea is to programmatically fill in a JSON that conforms to the Vega-Lite specification and respects the notion of grammar of graphics. It is analogous to what Altair did for Python. We will expose different APIs responsible for filling in certain parts of the JSON.
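To make the idea concrete, here is a minimal, standalone sketch of "filling in" a Vega-Lite specification from C++. It does not use XVega: the names ChannelSketch and make_spec are hypothetical, the data URL in the usage note is a placeholder, and string escaping is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of XVega: one encoding channel.
struct ChannelSketch {
    std::string channel;  // e.g. "x", "y", "color"
    std::string field;    // column name in the data
    std::string type;     // "quantitative", "nominal", "ordinal" or "temporal"
};

// Assemble a minimal Vega-Lite unit specification as a JSON string.
// Inputs are assumed to be plain identifiers, so no escaping is performed.
std::string make_spec(const std::string& data_url,
                      const std::string& mark,
                      const std::vector<ChannelSketch>& channels)
{
    assert(!mark.empty());  // a unit specification needs a mark
    std::ostringstream out;
    out << "{\"data\": {\"url\": \"" << data_url << "\"}, "
        << "\"mark\": \"" << mark << "\", "
        << "\"encoding\": {";
    for (std::size_t i = 0; i < channels.size(); ++i) {
        const ChannelSketch& c = channels[i];
        if (i > 0) out << ", ";
        out << "\"" << c.channel << "\": {\"field\": \"" << c.field
            << "\", \"type\": \"" << c.type << "\"}";
    }
    out << "}}";
    return out.str();
}
```

Calling make_spec("data/cars.json", "point", {{"x", "Horsepower", "quantitative"}, {"y", "Miles_per_Gallon", "quantitative"}}) yields a JSON string with the data, mark and encoding sections filled in. A real library would build on a JSON library and typed objects rather than string concatenation.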
The fundamentals of XVega are the same as in Vega-Lite: the three essential elements of a Chart are Data, Marks and Encodings. Importing the library is as simple as writing two statements:
#include "xvega/xvega.hpp"
using namespace xv;
The experience is similar to what Altair offers, and the central piece of the library is the Chart() object, which knows how to emit the JSON dictionary representing the data and visualization encodings.
For those unfamiliar with the Vega ecosystem, here is a quick recap of the above terms:
- Marks — What graphic should represent the data?
- Encodings — Mapping between Data and Visual Elements of the Chart (such as x-axis, etc.).
- Encoding Types: Quantitative (real-valued), Nominal (unordered categorical), Ordinal (ordered categorical), Temporal (time-series).

The core strength of using such a system is the separation of specification and execution. The declarative API makes it easy to specify “what” should be done rather than focus on incidental details of the “how”. It means that rather than having a special “hist()” function for plotting a histogram, passing “bin=True” does the job.

We can of course customize the binning parameters with a “Bin()” object instead. And while we are at it, let’s add a colour encoding as well to get a sense of a third dimension.
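As a rough illustration of what this means at the level of the emitted specification, the standalone sketch below builds the encoding block of a histogram. In Vega-Lite, "bin" may be either true or an object such as {"maxbins": 20}. The names BinParamsSketch, bin_fragment and histogram_encoding are hypothetical and are not XVega's API; the field names used in the usage note are placeholders.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for custom binning parameters (not XVega's Bin class).
struct BinParamsSketch {
    int maxbins;  // upper bound on the number of bins
};

// "bin": true lets Vega-Lite choose the bins itself.
std::string bin_fragment(bool enabled)
{
    return std::string("\"bin\": ") + (enabled ? "true" : "false");
}

// "bin": {"maxbins": N} customizes the binning.
std::string bin_fragment(const BinParamsSketch& params)
{
    assert(params.maxbins > 0);
    return "\"bin\": {\"maxbins\": " + std::to_string(params.maxbins) + "}";
}

// Encoding block of a histogram: binned x, count on y, and a nominal colour.
std::string histogram_encoding(const std::string& field,
                               const std::string& bin,
                               const std::string& color_field)
{
    return "{\"x\": {\"field\": \"" + field + "\", \"type\": \"quantitative\", " + bin + "}, "
           "\"y\": {\"aggregate\": \"count\", \"type\": \"quantitative\"}, "
           "\"color\": {\"field\": \"" + color_field + "\", \"type\": \"nominal\"}}";
}
```

For instance, histogram_encoding("Horsepower", bin_fragment(BinParamsSketch{20}), "Origin") describes a histogram with at most 20 bins, coloured by a categorical column. Nothing in it says how to compute the bins; that is left to the Vega-Lite runtime.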

Another plus of using Vega-Lite is that transformations can be expressed within the specification rather than applied to the data beforehand (e.g., one can do linear regression as part of this declarative API).

Lastly, support for Interactions and Selections is a no-brainer. It’s as simple as defining what to use and adding it to the Chart() object.

Developing such a system for C++ comes with its own challenges. To provide a seamless experience like Altair's, several things need to be taken care of:
- Multiple types for a single entity: the Vega-Lite specification allows values of different kinds for the same property (for example, both a boolean and an integer may be valid for a particular property). Variants and Visitors in C++ allow us to achieve this.
- Out-of-order keyword arguments: method chaining is the classical approach to emulating out-of-order keyword arguments in C++, and it is the approach used in XVega.
- Optional fields: many values in the Vega-Lite specification are optional, and this is handled with optionally contained values in C++ (i.e. using std::optional).
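The three techniques above can be sketched together in a few lines of C++17. The class below is a hypothetical illustration and not XVega's actual implementation: AxisSketch and its setters are made-up names, and the Vega-Lite axis property labelFlush is used as an example of a value that may be either a boolean or a number.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical axis description, not XVega's actual class.
class AxisSketch {
public:
    using flush_type = std::variant<bool, int>;  // valid as a boolean or an integer

    // Chained setters return *this, so properties can be given in any order.
    AxisSketch& title(std::string value) { m_title = std::move(value); return *this; }
    AxisSketch& label_flush(flush_type value) { m_label_flush = value; return *this; }
    AxisSketch& tick_count(int value) { assert(value > 0); m_tick_count = value; return *this; }

    // Emit only the properties that were actually set.
    std::string to_json() const
    {
        std::string out = "{";
        auto append = [&out](const std::string& key, const std::string& value) {
            if (out.size() > 1) out += ", ";
            out += "\"" + key + "\": " + value;
        };
        if (m_title) append("title", "\"" + *m_title + "\"");
        if (m_label_flush) {
            // The visitor picks the serialization matching the active alternative.
            std::string text = std::visit([](auto v) -> std::string {
                if constexpr (std::is_same_v<decltype(v), bool>) {
                    return v ? "true" : "false";
                } else {
                    return std::to_string(v);
                }
            }, *m_label_flush);
            append("labelFlush", text);
        }
        if (m_tick_count) append("tickCount", std::to_string(*m_tick_count));
        return out + "}";
    }

private:
    std::optional<std::string> m_title;
    std::optional<flush_type> m_label_flush;
    std::optional<int> m_tick_count;
};
```

With this sketch, AxisSketch().tick_count(5).title("Horsepower") and AxisSketch().title("Horsepower").tick_count(5) produce the same JSON, and any property that was never set is simply absent from the output.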
Installation
You can install XVega with conda or mamba:
mamba install -c conda-forge xvega
or
conda install -c conda-forge xvega
What is coming?
XVega is still at an early stage and under active development. We are currently working on integrating it with xeus-sqlite and other SQL Jupyter kernels to enable visualizations directly from SQL queries. We are also working on improving the compilation time of XVega with Cling.
Acknowledgements
This work on XVega was funded by QuantStack. Thanks to Sylvain Corlay and Johan Mabille for their continuous support.
About the Author

My name is Madhur Tandon, and I currently work with QuantStack as a Scientific Software Engineer. Before joining QuantStack, I worked with Mozilla, Deepnote, INCF (International Neuroinformatics Coordinating Facility), TCS Research and Elucidata. I have also been a speaker at JupyterCon 2020 and PyData Delhi 2017 and 2018. I graduated from IIIT-Delhi this year with a Bachelor’s degree in Computer Science with Honors. Besides core Data Science and Machine Learning, I am interested in tools that enable and enhance data scientists’ workflow and experience.