Desktop GIS software in the cloud with JupyterHub: A QGreenland workshop success story
We are Trey Stafford and Matt Fisher, co-authors of the QGreenland data package's source code. This year, we had the pleasure of running a hands-on geospatial data and open science QGreenland Researcher Workshop. It was important for attendees to participate in a hands-on way while minimizing the burdens of installing software, requiring expensive personal computers, and troubleshooting unique computer configurations. For these reasons, we felt a JupyterHub would be a good fit for our workshop, provided it could accommodate our need to run QGIS, a desktop application.
In this blog post, we will introduce QGreenland, describe our experience using JupyterHub in the cloud for our workshop's computing environment, and discuss challenges we overcame to enable our attendees to use QGIS in a cloud graphical desktop environment. Finally, we will highlight some workshop outcomes and discuss opportunities for enhancement based on new developments in the Jupyter ecosystem.
In our workshop, 25-30 international learners (including from Germany, India, France, Canada, Poland, and the United States) used QGIS in a JupyterHub's browser-based Linux desktop environment to collaboratively test, explore, visualize, and process Earth science data simultaneously, with the same user experience they expect from using QGIS on their personal computers! Better yet, getting started was as simple as logging in. Our workshop was a success story not just in education, but also in open source and collaborative development, and we want to share what we learned.

The JupyterHub used by the QGreenland 2023 Researcher Workshop was generously provided by the NASA CryoCloud team, whose mission is to help researchers transition to cloud-based collaboration.
About QGreenland
QGreenland is an open-source Greenland-focused geospatial data package for QGIS, a community-owned graphical Geographic Information System (GIS) platform. Researchers and members of the public leverage QGreenland's ready-to-use interdisciplinary datasets to do field planning, teach about glaciers, and much more.
QGreenland's MIT-licensed source code uses community-maintained open software like GDAL and PyQGIS to automate data normalization and populate the QGIS project with important information like data provenance and the order of layers in the QGIS Layers Panel. Check out our documentation to learn more! QGreenland also has a YouTube channel with tutorials produced by CIRES Education and Outreach.
Based on user research, QGreenland has enabled:
- the public to more easily access data gathered by researchers visiting Greenland: "In Greenland, people are often asking, 'how can we find the data the foreign scientists bring back from Greenland?' Now we can directly utilize much of it."
- researchers to plan field work: "Being able to use QGreenland at our field station was critical to our research process!"
- educators to develop interactive lessons about Greenland and climate change: "...using QGreenland for presentations because it is presentation quality already."
QGreenland's 2023 researcher workshop
One of the QGreenland team's most important forms of direct user interaction and support is facilitating workshops. Most recently, we hosted a 3-day (total of 9 hours) virtual workshop for researchers focused on working with geospatial data in an open science framework. All of the materials covered in the workshop were built using open-source tools and are MIT-licensed and published on GitHub.
A "personal computer" in the cloud
We decided early on that we wanted to use JupyterHub to solve the diverse problems that come with "bring your own device" workshops. We experimented with administering our own JupyterHub on Kubernetes, but the setup overhead was too high for our short workshop. CryoCloud's JupyterHub enabled us to avoid this overhead and focus on serving our participants. Because the software that comprises CryoCloud is open-source and developed in collaboration with the communities CryoCloud serves, we could directly contribute to curating a computing environment ideal for our participants.
JupyterHub is known for providing access to Jupyter Notebooks via JupyterLab, but it turns out it can also be used to host pretty much any interactive web-based application! The jupyter-server-proxy project enables this, and there are additional packages that make running specific applications easier. For example, jupyter-rsession-proxy makes it easy to run RStudio inside JupyterHub, and jupyter-vscode-proxy allows running code-server (a fully open-source, self-hosted version of Visual Studio Code) inside a JupyterHub. Pertinent to our use case is jupyter-remote-desktop-proxy, which lets you run a complete Linux desktop environment inside your JupyterHub! This was critical for our workshop, as it allowed us to use QGIS (purely desktop software, not adapted for the web) from inside a web browser. Workshop participants did not need to install anything, so they could focus on the content of our workshop rather than the logistics of setting up and debugging tools on their varied machines.
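To give a flavor of how an application gets registered with jupyter-server-proxy, here is a minimal sketch in Python. It is not the CryoCloud configuration: the entry name demo-app, the launcher title, and the choice of Python's built-in http.server as the proxied application are placeholders chosen purely for illustration. The {port} placeholder is filled in by jupyter-server-proxy when it launches the application.

```python
# Sketch of a jupyter-server-proxy entry.
# All names and values below are illustrative assumptions, not CryoCloud's setup.
demo_servers = {
    "demo-app": {
        # Command that starts the web application; jupyter-server-proxy
        # substitutes {port} with the port it wants the app to listen on.
        "command": ["python", "-m", "http.server", "{port}"],
        # Seconds to wait for the application to start responding.
        "timeout": 30,
        # Adds an entry for the application to the JupyterLab launcher.
        "launcher_entry": {"title": "Demo web app"},
    }
}

# In a real jupyter_server_config.py, where `c` is provided by Jupyter:
# c.ServerProxy.servers = demo_servers
```

Packages like jupyter-remote-desktop-proxy take care of this kind of setup for you, which is why installing them is usually the only step needed.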
The CryoCloud JupyterHub enabled each of our workshop participants to provision their own compute environment (JupyterLab + Linux Desktop) with all of our workshop's dependencies pre-installed. It also set everyone on equitable footing: someone accessing the workshop on a 10-year-old laptop would get the same computing resources as someone on a brand new MacBook Pro.
Challenges scaling QGreenland
The CryoCloud JupyterHub already had jupyter-remote-desktop-proxy and QGIS installed, so we could validate this approach to our workshop quickly. However, to use QGreenland at this scale, we needed to solve a couple of usability problems. The first issue was a user experience problem: the operating system did not have appropriate file type associations for QGIS, so files like the QGreenland project file would not open in QGIS when double-clicked in the desktop file browser. We quickly discovered a solution and integrated it with a simple pull request to the Docker image we were using.
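As a rough sketch of what a fix like this can look like (not the exact change from the pull request), one common approach on freedesktop.org-style Linux desktops is to add a default-application entry to a mimeapps.list file. The MIME type application/x-qgis-project and the desktop file name org.qgis.qgis.desktop are assumptions to check against your own QGIS install, and the helper name set_default_app is hypothetical.

```python
import configparser
from pathlib import Path


def set_default_app(mimeapps_path: Path, mime_type: str, desktop_file: str) -> None:
    """Register desktop_file as the default handler for mime_type.

    Note: configparser drops any comments already present in the file.
    """
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    parser.optionxform = str  # keep keys exactly as written

    # Preserve existing associations if the file is already there.
    if mimeapps_path.exists():
        parser.read(mimeapps_path)

    section = "Default Applications"
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, mime_type, desktop_file)

    mimeapps_path.parent.mkdir(parents=True, exist_ok=True)
    with mimeapps_path.open("w") as f:
        # mimeapps.list uses "key=value" without surrounding spaces.
        parser.write(f, space_around_delimiters=False)
```

In a Docker image this could be run once at build time against a system-wide location. The association only helps if the desktop also recognizes the project file's MIME type in the first place, so treat this as one piece of the puzzle rather than a complete recipe.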
The second problem was a performance problem: QGIS would take several minutes to open QGreenland from the hub's shared storage drive. After some investigation, it turned out this was due to us loading multiple GB of data from an NFS share! While a long-term solution might involve getting QGIS to load data directly from cloud object storage (like S3), we instead decided to go a different route: provisioning each user a small, fast, and temporary Elastic Block Store disk. At the start of the workshop, we provided all users a small script that would copy the dataset from NFS to this faster disk once, and this drastically reduced load times from about 5 minutes to under 3 seconds! You can follow our debugging process on this issue, and find the JupyterHub config used to provision these disks here.
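The "copy once" step can be sketched in a few lines of Python. This is an illustrative sketch rather than the workshop's actual script: the function name stage_dataset, the marker file name, and the paths in the usage note are all hypothetical.

```python
import shutil
from pathlib import Path


def stage_dataset(source: Path, dest: Path) -> bool:
    """Copy source to dest unless a previous copy finished.

    Returns True if a copy was performed, False if it was skipped.
    """
    # Hypothetical marker file written only after a successful copy.
    marker = dest / ".copy-complete"
    if marker.exists():
        return False

    # dirs_exist_ok lets us resume over a partial copy from an interrupted run.
    shutil.copytree(source, dest, dirs_exist_ok=True)
    marker.touch()
    return True
```

For example, stage_dataset(Path("/shared/QGreenland"), Path("/scratch/QGreenland")) would copy the data on the first call and return right away on later calls (both paths are made up). Writing the marker last means an interrupted copy is retried instead of being mistaken for a complete one.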
By overcoming these challenges, we created a smooth, intuitive, and performant computing experience for all of our participants, most of whom had never been exposed to this sort of collaborative computing environment.
Outcomes
The workshop participants engaged in small group work to complete various exercises, group discussions, and data scenarios. Each group produced Jupyter Notebooks and GitHub Discussions posts as deliverables. We created an outcomes webpage to summarize our participants' accomplishments. One highlight was participants' insightful commentary on FAIR & CARE principles.
Based on these outcomes, we consider our workshop a success. While we put in a significant amount of time creating our materials, CryoCloud's cloud costs and our time investment in preparing computing resources were relatively small. For approximately 25 people, our cloud costs break down to roughly $1/person/day!
Conclusion
The CryoCloud JupyterHub met our workshop needs and provided a delightful experience for administrators and participants alike, and we are excited for what's next. JupyterLab 4 and jupyter_collaboration v1.0.0, a real-time collaboration extension, were just announced, and the CryoCloud team is currently working to integrate these new releases into their hub. Real-time collaboration will enable exciting cloud use cases, like small groups working together on the same notebook without a screen share, or organizers providing technical support in a live notebook. We anticipate running this workshop again on JupyterHub and look forward to experimenting with these new features!
Acknowledgements
Reviewers
In alphabetical order, thanks to Twila Moon, Yuvi Panda, Tasha Snow, and Alyse Thurber for their time contributing to this post!
CryoCloud
Snow, Tasha, Millstein, Joanna, Scheick, Jessica, Sauthoff, Wilson, Leong, Wei Ji, Colliander, James, Pérez, Fernando, Munroe, James, Felikson, Denis, Sutterley, Tyler, & Siegfried, Matthew. (2023). CryoCloud JupyterBook (2023.01.26). Zenodo. https://doi.org/10.5281/zenodo.7576602
2i2c
2i2c is a non-profit organization that runs open-source infrastructure for collaborative computing, and maintains the CryoCloud JupyterHub used in this workshop. You can see the complete configuration of this JupyterHub in this public repository.