Jupyter receives the ACM Software System Award

Project Jupyter, published in Jupyter Blog, May 2, 2018

It is our pleasure to announce that Project Jupyter has been awarded the 2017 ACM Software System Award, a significant honor for the project. We are humbled to join an illustrious list of projects that contains major highlights of computing history, including Unix, TeX, S (R’s predecessor), the Web, Mosaic, Java, INGRES (modern databases) and more.

Officially, the recipients of the award are the fifteen members of the Jupyter steering council as of November 2016, the date of nomination (listed in chronological order of joining the project): Fernando Pérez, Brian Granger, Min Ragan-Kelley, Paul Ivanov, Thomas Kluyver, Jason Grout, Matthias Bussonnier, Damián Avila, Steven Silvester, Jonathan Frederic, Kyle Kelley, Jessica Hamrick, Carol Willing, Sylvain Corlay and Peter Parente.

A tiny subset of the Jupyter contributors and users that made Jupyter possible — Biannual development meeting, 2016, LBNL.

This is the largest team ever to receive this award, and we are delighted that the ACM was willing to recognize that modern collaborative projects are created by large teams, and should be rewarded as such. Still, we emphasize that Jupyter is made possible by many more people than these fifteen recipients. This award honors the large group of contributors and users that has made IPython and Jupyter what they are today. The recipients are stewards of this common good, and it is our responsibility to help this broader community continue to thrive.

Below, we’ll summarize the story of our journey, including the technical and human sides of this effort. You can learn more about Jupyter from our website, and you can meet the vibrant Jupyter community by attending JupyterCon, August 21–25, 2018, in New York City.

In the beginning

Project Jupyter was officially unveiled with its current name in 2014 at the SciPy scientific Python conference. However, Jupyter’s roots date back nearly 17 years to when Fernando Pérez announced his open source IPython project as a graduate student in 2001. IPython provided tools for interactive computing in the Python language (the ‘I’ is for ‘Interactive’), with an emphasis on the exploratory workflow of scientists: run some code, plot and examine some results, think about the next step based on these outcomes, and iterate. IPython itself was born out of merging an initial prototype with Nathan Gray’s LazyPython and Janko Hauser’s IPP, inspired by a 2001 O’Reilly Radar post — collaboration has been part of our DNA since day one.

From those humble beginnings, a community of like-minded scientists grew around IPython. Some contributors have moved on to other endeavors, while others are still at the heart of the project. For example, Brian Granger and Min Ragan-Kelley joined the effort around 2004 and today lead multiple areas of the project. Our team gradually grew to include both members who were able to dedicate significant amounts of effort to the project and a larger, but equally significant, “long tail” community of users and contributors.

In 2011, after development of our first interactive client-server tool (our Qt Console), multiple notebook prototypes, and a summer-long coding sprint by Brian Granger, we were able to release the first version of the IPython Notebook. This effort paved the path to our modern architecture and vision of Jupyter.

What is Jupyter?

Project Jupyter develops open source software, standardizes protocols for interactive computing across dozens of programming languages, and defines open formats for communicating results with others.

Interactive computation

On the technical front, Jupyter occupies an interesting area of today’s computing landscape. Our world is flooded with data that requires computers to process, analyze, and manipulate, yet the questions and insights are still the purview of humans. Our tools are explicitly designed for the task of computing interactively, that is, where a human executes code, looks at the results of this execution, and decides the next steps based on these outcomes. Jupyter has become an important part of the daily workflow in research, education, journalism, and industry.

Whether running a quick script at the IPython terminal, or doing a deep dive into a dataset in a Jupyter notebook, our tools aim to make this workflow as fluid, pleasant, and effective as possible. For example, we built powerful completion tools to help you discover the structure of your code and data, a flexible display protocol to show results enriched by the multimedia capabilities of your web browser, and an interactive widget system to let you easily create GUI controls like sliders to explore parameters of your computation. All these tools have evolved from their IPython origins into open, documented protocols that can be implemented in any programming language as a “Jupyter kernel”. There are over 100 Jupyter kernels today, created by many members of the community.

Exploring a large dataset interactively using the widget protocol and tools.
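As a small illustration of the display protocol mentioned above, the sketch below defines an object that offers both a plain-text and an HTML representation. The `_repr_html_` method name is the hook that IPython-based frontends look for; the `Temperature` class and its color rule are hypothetical and exist only for this example. The code runs in plain Python, where the HTML is simply a string; in a notebook, the frontend would render it instead.

```python
class Temperature:
    """Hypothetical example type with a plain-text and an HTML view."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # Used by terminals and any frontend that only shows text.
        return f"Temperature({self.celsius} C)"

    def _repr_html_(self):
        # Rich frontends such as the notebook call this hook, if present,
        # and display the returned HTML in place of the plain repr.
        color = "red" if self.celsius > 30 else "blue"
        return f'<b style="color:{color}">{self.celsius} &deg;C</b>'


reading = Temperature(35)
print(repr(reading))           # plain-text view
print(reading._repr_html_())   # HTML view a notebook would render
```

Because the object only returns data in a named format, the same value can be shown sensibly in a terminal, a notebook, or any other client that understands one of the formats.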

Our experience building and using the Jupyter Notebook application over the last few years has led to its next-generation successor, JupyterLab, which is now ready for users. JupyterLab is a web application that exposes all the elements above not only as an end-user application, but also as interoperable building blocks designed to enable entirely new workflows. JupyterLab has already been adopted by large scientific projects such as the Large Synoptic Survey Telescope project.

Communicating results

In today’s data-rich world, working with the computer is only half of the picture. Its complement is working with other humans, be it your partners, colleagues, students, clients, or even your future self months down the road. The open Jupyter notebook file format is designed to capture, display and share natural language, code, and results in a single computational narrative. These narratives exist in the tradition of literate programming that dates back to Knuth’s work, but here the focus is weaving computation and data specific to a given problem, in what we sometimes refer to as literate computing. While existing computational systems like Maple, Mathematica and SageMath all informed our experience, our focus in Jupyter has been on the creation of open standardized formats that can benefit the entire scientific community and support the long-term sharing and archiving of computational knowledge, regardless of programming language.
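To make the idea of an open notebook format concrete, here is a minimal sketch of a notebook document built by hand with only the Python standard library. It assumes version 4 of the notebook format, in which a notebook is a JSON object holding a list of cells; the cell contents are invented for illustration, and real notebooks usually carry more metadata (kernel name, language information) than is shown here.

```python
import json

# A notebook is a JSON document: format version, metadata, and a list of cells.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            # Natural-language narrative, written in Markdown.
            "cell_type": "markdown",
            "metadata": {},
            "source": "# Adding two numbers",
        },
        {
            # Code together with the result it produced when it was run.
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": "2 + 2",
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "metadata": {},
                    "data": {"text/plain": "4"},
                }
            ],
        },
    ],
}

# Serializing gives the text that would be stored in an .ipynb file.
serialized = json.dumps(notebook, indent=1)

# Any tool that reads JSON can recover the narrative, code, and results.
restored = json.loads(serialized)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
print(cell_types)
```

Since the file is ordinary JSON, tools written in any language can read, render, or archive a notebook without running the software that created it.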

We have also built tools to support Jupyter deployment in multi-user environments, whether a single server in your group or a large cloud deployment supporting thousands of students. JupyterHub and projects that build upon it, like Binder and BinderHub, now support industry deployments, large-scale education, reproducible research, and the seamless sharing of live computational environments.

Data Science class at UC Berkeley, taught using Jupyter.

We are delighted to see, for example, how the LIGO Collaboration, awarded the 2017 Nobel Prize in Physics for the observation of gravitational waves, offers its data and analysis code to the public in the form of Jupyter Notebooks hosted on Binder at their Open Science Center.

Measurement and prediction of gravitational waves formed by two black holes merging. Adapted from https://github.com/minrk/ligo-binder.

Open standards nourish an innovative ecosystem

In Project Jupyter, we have concentrated on standardizing protocols and formats evolved from community needs, independent of any specific implementation. The stability and interoperability of open standards provide a foundation for others to experiment, collaborate, and build tools inspired by their unique goals and perspectives.
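As a rough sketch of what an implementation-independent protocol looks like in practice, the example below builds the kind of JSON message a client sends to ask a kernel to run code. The field names follow the `execute_request` message of the Jupyter messaging protocol; the helper `make_execute_request`, the default username, and the version string "5.3" are assumptions made for this illustration. The sketch deliberately omits the transport layer and message signing that a real client needs.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session_id, username="reader"):
    """Build an execute_request message as a plain dict (hypothetical helper)."""
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),          # unique per message
            "session": session_id,                # shared by one client session
            "username": username,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",                     # assumed protocol version
        },
        "parent_header": {},                      # empty: not a reply to anything
        "metadata": {},
        "content": {
            "code": code,                         # source text for the kernel to run
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
    }


message = make_execute_request("print('hello')", session_id=str(uuid.uuid4()))

# The message is plain JSON, so a kernel written in any language can parse it.
wire_text = json.dumps(message)
print(json.loads(wire_text)["header"]["msg_type"])
```

Nothing in the message is specific to Python: the code to run is carried as a string, which is what lets clients and kernels written in different languages work together.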

For example, while we provide the nbviewer service that renders notebooks from any online source for convenient sharing, many people would rather see their notebooks directly on GitHub. This was not possible originally, but the existence of a well-documented notebook format enabled GitHub to develop their own rendering pipeline, which now shows HTML versions of notebooks rendered in a way that conforms to their security requirements.

Similarly, there exist multiple client applications in addition to the Jupyter Notebook and JupyterLab to create and execute notebooks, each with its own use case and focus: the open source nteract project develops a lightweight desktop application to run notebooks; CoCalc, a startup founded by William Stein, the creator of SageMath, offers a web-based client with real-time collaboration that includes Jupyter alongside SageMath, LaTeX, and tools focused on education; and Google now provides Colaboratory, another web notebook frontend that runs alongside the rest of the Google Documents suite, with execution in the Google Cloud.

These are only a few examples, but they illustrate the value of open protocols and standards: they serve open-source communities, startups, and large corporations equally well. We hope that as the project grows, interested parties will continue to engage with us so we can keep refining these ideas and developing new ones in support of a more interoperable and open ecosystem.

Growing a community

IPython and Jupyter have grown to be the product of thousands of contributors, and the ACM Software System Award should be seen as a recognition of this combined work. Over the years, we evolved from the typical pattern of an ad-hoc assembly of interested people loosely coordinating on a mailing list to a much more structured project. We formalized our governance model and instituted a Steering Council. We continue to evolve these ideas as the project grows, always seeking to ensure the project is welcoming, supports an increasingly diverse community, and helps solidify a foundation for it to be sustainable. This process isn’t unique to Jupyter, and we’ve learned from other larger projects such as Python itself.

Jupyter exists at the intersection of distributed open source development, university-centered research and education, and industry engagement. While the original team came mostly from the academic world, from the start we’ve recognized the value of engaging industry and other partners. This led, for example, to our BSD licensing choice, best articulated by the late John Hunter in 2004. Beyond licensing, we’ve actively sought to maintain a dialog with all these stakeholders:

  • We are part of the NumFOCUS Foundation, working as part of a rich tapestry of other scientifically-focused open source projects. Jupyter is a portal to many of these tools, and we need the entire ecosystem to remain healthy.
  • We have obtained significant funding from the Alfred P. Sloan Foundation, the Gordon and Betty Moore Foundation, and the Helmsley Trust.
  • We engage directly with industry partners. Many of our developers hail from industry: we have ongoing active collaborations with companies such as Bloomberg and Quansight on the development of JupyterLab, and with O’Reilly Media on JupyterCon. We have received funding and direct support in the past from Bloomberg, Microsoft, Google, Anaconda, and others.

Sustainably developing open source software systems of lasting intellectual and technical value, systems that serve users as diverse as high-school educators, large universities, Nobel prize-winning science teams, startups, and the largest technology companies in the world, is an ongoing challenge. We need to build healthy communities, find significant resources, provide organizational infrastructure, and value professional and personal time invested in open source. There is a rising awareness among volunteers, business leaders, academic promotion and tenure boards, professional organizations, government agencies, and others of the need to support and sustain critical open source projects. We invite you to engage with us as we continue to explore solutions to these needs and build these foundations for the future.

Acknowledgments

The award was given to the above fifteen members of the Steering Council. But this award truly belongs to the community, and we’d like to thank all who have made Jupyter possible, from newcomers to long-term contributors. The project exists to serve the community and wouldn’t be possible without you.

We are grateful for the generous support of our funders. Jupyter’s scale and complexity require dedicated effort, and this would be impossible without the financial resources provided (now and in the past) by the Alfred P. Sloan Foundation, the Gordon and Betty Moore Foundation, the Helmsley Trust, the Simons Foundation, Lawrence Berkeley National Laboratory, the European Union Horizon 2020 program, Anaconda Inc, Bloomberg, Enthought, Google, Microsoft, Rackspace, and O’Reilly Media. Finally, the recipients of the award have been supported by our employers, who often have put faith in the long-term value of this type of work well before the outcomes were evident: Anaconda, Berkeley Lab, Bloomberg, CalPoly, DeepMind, European XFEL, Google, JP Morgan, Netflix, QuantStack, Simula Research Lab, UC Berkeley and Valassis Digital.


Project Jupyter exists to develop open-source software, open standards, and services for interactive and reproducible computing.