Lessons learned from Jupyter’s Contributor in Residence pilot

 3 years ago
source link: https://blog.jupyter.org/lessons-learned-from-jupyters-contributor-in-residence-pilot-427e2b361a7b
By Georgiana Dolocan and Chris Holdgraf

Project Jupyter, like most open source communities, often lacks the time and people power to maintain existing projects. Existing maintainers are often overloaded with work, while new volunteer contributions are often driven by the excitement of new feature development. Although on-boarding new maintainers is a sustained long-term team effort, it is something necessary and beneficial not only for the project, but also for existing maintainers and new contributors alike.

For this reason, we decided to pilot a Contributor in Residence program that was generously funded for a year through CZI’s Essential Open Source Science program. Last year we ran the first iteration of the JupyterHub Contributor in Residence role. This post is a short report about what we learned, in the hopes that it guides other projects in pursuing similar programs.

tl;dr

The Contributor in Residence program is a great way to:

  • Introduce new members into your community
  • Create pathways for others to get more experience, and grow in their role and leadership
  • Get a lot of work done that is hard to motivate volunteers to do
  • Spend time thinking at a “meta” level about how to improve dynamics in the community
  • Have a paid opportunity to support individuals who can’t contribute via volunteerism alone

However, doing this well requires planning ahead of time and attention throughout the program. It is crucial that you develop a plan and pathways for integrating the CIR into your community, and ensure that they have the right perspective to support the community. If you do this, the CIR program can be a fantastic win-win situation for everyone involved.

What is the Contributor in Residence?

The Contributor in Residence (CIR) carries out a variety of technical and community-focused actions, looking for opportunities to keep momentum moving forward in conversations, improving infrastructure to minimize human toil and streamline information, and making improvements to the codebase that are focused more around sustainability, stability, and understandability.

For more information about our CIR program, check out these blog posts:

The JupyterHub welcome bot image, hand-drawn by our first Contributor in Residence, Georgiana

How management and planning worked

We didn’t define explicit “metrics-based” outcomes for this Contributor in Residence pilot, as we were not sure which behaviors would be most beneficial in the long run. Instead, Georgiana kept track of the activities that she worked on throughout the year, and reflected with us often to identify challenges to overcome and opportunities to improve the work she was doing.

In practice, Georgiana did not have a sole formal ongoing mentor providing her regular guidance throughout the project. She received mentorship from a variety of JupyterHub Team members across the repositories, and via participation in issues in GitHub (more on that later).

Georgiana kept track of her actions at this GitHub project. This contains a collection of issues, to-do items, and pull-requests throughout the JupyterHub community, and is a public record of much of the work that she focused on. While we scoped the CIR to cover all of the JupyterHub repositories, these efforts had a general focus on The Littlest JupyterHub, where she is now the primary maintainer and community liaison for the project. In addition to these development-focused efforts, Georgiana also spent a lot of time improving infrastructure and team processes at the JupyterHub-wide level, generally via the JupyterHub Team Compass.

Tips for running a CIR program

The following are a few best-practices that we picked up during our pilot.

Define an explicit work plan and system for managing tasks

The biggest challenge Georgiana faced was in identifying specific projects to work on, and working through the ambiguity of a position that explicitly had no single clear workstream. To get around this, Georgiana sought out guidance and advice from others in the community, and over time gained an understanding of ways that she could contribute. This will be a challenge for any Contributor in Residence moving forward, and we will need to think of new ways to support this role with concrete suggestions. Georgiana wrote a bit about this experience in her CIR update blog post.

Dedicate management and mentorship

A related challenge was in ensuring that the CIR felt like they had a clear idea of their expectations, to-do items, and progress. In future iterations, we would like to dedicate more resources towards providing mentorship and management of the person. This role shouldn’t be a “task giver,” but rather someone to help guide, triage, and prioritize for the benefit of the community. They should have regular touch-points with the CIR to make sure they are supported in their role.

Have an onboarding session to help them navigate your project

Regardless of ongoing mentorship, you should definitely have an onboarding session. Many open source communities have multiple projects, varying stages of contributor documentation, and different spaces for conversation and interaction. Moreover, the technical infrastructure may have an underlying architecture and organization that are hard to learn on your own. Have a core maintainer (or the mentor if you’re using one) go over your technology’s structure and layout, double-check the things the CIR is unsure of, and introduce them to various discussions and action points to learn from. This may require multiple meetings in the first few weeks/months.

Assess your current strategy every 3 months and adjust based on current needs

Choose a regular interval at which to revisit your plan. Project needs, community dynamics, and time availability will change over time, so take the time to think about what parts of the plan need to be modified to fit the situation at hand. One option is to bring this up during team meetings to brainstorm ideas and get feedback, then write the outcome down so that it is organized and concrete. Consider writing a blog post every 3 months about your progress and status. These posts become public records of the experience gained from the program, and are good opportunities to reflect on how things are going.

Actively invite them into the community

Remember that most open source communities are just that: communities. Take time to make sure others in the community know who the CIR is, invite them to participate in community conversations, and welcome them in gitter/slack/issues/etc. Anything you can do to make them feel welcome is great. Make sure that you keep doing this over time, not just once at the beginning of their fellowship. When someone feels that they are welcome and a part of a community, they will be much more comfortable navigating how to support it themselves.

Tailor the role to fit the CIR as a person too

The CIR responsibilities are usually already sketched out before there is an actual person to fill the role. Having an overall plan is an awesome idea and a very useful program compass. However, once the CIR is identified, it is important to revisit the overall focus and goals of the role to ensure that they align with the interests and skillsets of the incoming CIR. This includes things like familiarity with the project, future career directions for the CIR, and working styles for people involved.

Start slower

Jumping directly into a new project with a lot of complexity and many sub-projects is not sustainable. In our initial plan, we decided to track about 10 main repositories in the first three months of the CIR pilot. This was not sustainable, and led to an overwhelming feeling because of the constant influx of new issues, questions, etc. Especially for someone new to the community and their new role, this led to a feeling of “I’m not learning fast enough,” and difficulty concentrating on the plan. So please, start slow, take the time to acclimate to the project and go repository-by-repository.

Remember that knowing everything was never part of the role

Being a CIR for a complex project can be very overwhelming. There may be different sub-projects, technology stacks, and team processes across a community. It is crucial to consider the CIR program as a two-way street — it is primarily a learning experience and a chance to grow for the CIR, as well as an opportunity for the project to get some much-needed assistance. The CIR should not be expected to know or do everything either at the beginning or the end of their tenure — instead, celebrate the process and the learning that has come from it.

Ideas for things the CIR can work on

Here are a few things that we found particularly useful for the CIR’s workstream, that will likely be portable across other communities as well.

Identify automation and standardization action points

Observe whether a repetitive task could be performed by an automated system like a bot or a CI system, as this has the potential to ease the workload for maintainers.

For example, the JupyterHub GitHub org has about 60 repositories. Some of them had issue and pull request templates, but most of them didn’t. Some had instructions on how to make a new release of the project, others didn’t. Some had their instructions written in a README and others had documentation, etc.

When the CIR project took off, the JupyterHub and Binder team had already started the quest of automating and standardizing its sub-projects. The CIR’s job was to help the other team members propagate the changes (bots setup, templates, badges, docs) to as many repositories as possible.
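
To make the idea of spotting standardization gaps a bit more concrete, here is a minimal sketch of how one might audit a folder of local repository checkouts for missing templates and release instructions. This is not the tooling the JupyterHub team used. The checklist, the function names, and the assumption that all repositories are cloned side by side under one directory are hypothetical, and the template paths are common GitHub conventions rather than an exhaustive list.

```python
from pathlib import Path

# Hypothetical checklist: label -> candidate paths relative to the repo root.
# A repository passes a check if any one of the candidate paths exists.
STANDARD_FILES = {
    "issue template": [".github/ISSUE_TEMPLATE", ".github/ISSUE_TEMPLATE.md"],
    "pull request template": [
        ".github/PULL_REQUEST_TEMPLATE.md",
        ".github/pull_request_template.md",
    ],
    "release instructions": ["RELEASE.md"],
}


def audit_repo(repo_dir):
    """Return the labels of the checklist entries that repo_dir is missing."""
    repo_dir = Path(repo_dir)
    return [
        label
        for label, candidates in STANDARD_FILES.items()
        if not any((repo_dir / candidate).exists() for candidate in candidates)
    ]


def audit_checkouts(root):
    """Audit every immediate subdirectory of root: repo name -> missing items."""
    root = Path(root)
    return {
        repo.name: audit_repo(repo)
        for repo in sorted(root.iterdir())
        if repo.is_dir()
    }
```

Calling `audit_checkouts("path/to/checkouts")` returns a dictionary such as `{"some-repo": ["release instructions"]}`, which could feed a tracking issue listing the repositories that still need attention.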

Research and discover best-practices from other communities

CIRs can also act as helpful eyes and ears to watch other communities and understand the practices that they adopt. Many practices are useful across projects, and a person serving in this “meta” role is in a good position to think at a high level about opportunities.

For example — a simple “thank you” to a contributor can boost their experience a lot, especially when it is their first time contributing to a project. GitHub has a mechanism of letting you know when someone opens an issue or a PR to a project for the first time, but that little notification can be missed when under a huge load of work. For this, there is “the welcome bot” that can be set up to express the community’s gratitude towards their first time contributors.

Our CIR noticed that “The Turing Way” (an open source, community-driven guide to reproducible, ethical, inclusive and collaborative data science) made excellent use of GitHub’s welcome bot, including illustrations to make the message even more pleasant and celebratory. Georgiana was inspired by this, which led to adding welcome bots across the JupyterHub repositories (including our own hand-drawn images!). Every ❤ reaction that a contributor leaves on the bot’s message reinforces the feeling that the bot has its desired, positive effect.

Welcome images that Georgiana drew for the “welcome bots” that she deployed across the JupyterHub repositories.

Spend time on fixing bugs and improving testing infrastructure

Having a working testing infrastructure helps maintainers and contributors alike. Various updates to the underlying infrastructure of the CI system, or even something as simple as a dependency version bump, can break it. So keeping an eye on the health of the testing infrastructure is something that should concern a CIR. This is particularly useful as the world of “maintenance automation” changes very rapidly, with new frameworks, plugins, and standards. Having a pair of eyes constantly thinking about how to improve this process is very helpful.

Share maintainer knowledge by improving the docs

Having just a handful of people who know how to release a new version of a project is not scalable and puts enormous pressure on that group. Instead, each project should have public instructions on how to do this, so that other contributors who have the permissions can cut a new project release.

The process of creating RELEASE.md files with these instructions for JupyterHub sub-projects, just like the standardization process, had already been started inside the JupyterHub org by the team members before the CIR started. The CIR’s job was to extend this idea to as many sub-projects as possible. This standardization and improvement quest continues to move forward thanks to the CIR and to Erik’s and Simon’s coordination efforts!

More generally, the CIR is often someone relatively new to a project, which gives them a valuable perspective on where team documentation is unclear to newcomers. Use the fresh eyes of the CIR to understand how to make the project more welcoming and easy to navigate, and it will clear a pathway for more people to join the project in the future.

What we still haven’t figured out

While we learned a lot during this pilot phase, there are a few things we still haven’t figured out.

How to define a pipeline for new CIRs

The tenure of a CIR seems like it will be relatively short — maybe one to two years on average. This means that we need a pipeline of new individuals that are ready to learn and take on this role once a previous CIR moves on. Being effective as a CIR was only possible for someone with some context for the JupyterHub repositories and team processes, and we likely would not have had success hiring someone on a one-year contract who had never worked in the JupyterHub community before. In future iterations we will need a reliable pipeline of newcomers to the project to potentially serve as CIRs. Programs like Outreachy and Google Summer of Code are excellent sources of future CIRs, but we need to think about the “pipeline” for gracefully using them to power the CIR project.

Find mentorship that isn’t “nights and weekends”

While we may be able to find funding for mentorship and management, many people in the JupyterHub community already had full-time jobs. Having funding to pay for a part of someone’s time is helpful, but this is only possible if you can actually buy out part of somebody’s time (often not possible in certain careers and roles). It may be possible to spend extra time on mentorship in the evenings and weekends, but a sustainable model for this program will require somebody dedicating part of their “FTE time” on mentorship. We need to identify certain kinds of roles / individuals that have this flexibility in their work-load, and bake it into the strategy of the program.

Define a funding mechanism for CIRs

Finally, this CIR pilot was funded by a one-year grant from CZI. In future iterations we will need to have a strategy for funding this position in a reliable manner (or at least, in a repeatable manner). To make this sustainable, we must define a pattern to follow and the resources required to do a good job. This will both reduce the amount of effort needed to plan a new round for the CIR, and will make it easier for other organizations to understand how their financial contributions could help fund another round of the program. Perhaps this can be an ongoing fundraising effort, similar to how other programs have “target funding levels” in their annual giving strategy, or perhaps it can be a coordinated effort across projects to join resources together for these positions. Either way, it will be important to have reliable funding to ensure that the CIR program isn’t in a constant state of sputtering and jumpstarting.

We hope that this post was an informative and useful set of lessons-learned from the JupyterHub team. If you’ve got any suggestions, comments, or questions for us, we’d love to chat. Feel free to open up an issue in the Jupyter Community Forum! We look forward to continuing this experiment, and hope to see other communities experimenting with the same model in the future.

