How people organise themselves in Galaxy. WIP. Help wanted.

Author(s): ggsc, Ross Lazarus
Overview
Creative Commons License: CC-BY
Questions:
  • What are the main communities?

  • What are the main management bodies?

  • How do they all fit together?

  • Who pays for the ‘free’ analysis computational resources?

  • How are decisions really made?

  • Who’s in charge?

  • Why is contribution from the community so important?

  • How can I join in?

Objectives:
  • Understand how Galaxy communities develop and get things done

  • Understand how different communities collaborate to make things happen

  • Understand opportunities for engagement and contributing your skills

Time estimation: 30 minutes
Last modification: May 22, 2023
License: Tutorial content is licensed under the Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT.
Comment: Note to contributors
  • Work in progress!! First draft to try to get a structure to make sense.
  • Needs many contributors to make it useful. What would you like to have known when you first tried getting things done in Galaxy? Please add what’s missing and fix what’s broken. Headings are mostly stubs waiting to be edited and extended.
  • Trying to describe the big picture will necessarily be big. This already very wordy module will probably need to be broken into separate parts.
  • Add your story or stories to the stories tutorial too please!
  • Ross has strong opinions.
  • Many of them are probably wrong but he doesn’t know which ones yet.
  • Please feel free to contribute your own, to make this more useful to future readers.
Comment: Note to readers

This module describes how people organise themselves in the Galaxy project, as communities and smaller groups, to get things done. Galaxy produces outputs and consumes resources described elsewhere - this lesson describes the most important ways that individual participants choose to work together.

Open source values are crucial to this part of the project, since they encourage taking personal responsibility for contributing to the project, and ensure that participants are valued and encouraged in workspaces that are safe and enjoyable places for everyone.

Agenda: Field Guide part 1. People in Galaxy.
  1. Introduction
  2. Field guide to people and organisations in the Galaxy project
    1. Communities of Practice
    2. Regions
    3. Developers and system administrators
    4. Collaborating investigators
    5. Contributors
    6. Users
    7. Institutional adopters
    8. Educators
    9. Governance and organisation
    10. Ensuring continuity and sustainability for the project
    11. Your ideas
    12. Further reading

Introduction

People and their interactions in communities and structures are the key to understanding Galaxy. Many organised activities are the result of people organising themselves around particular interests, helping the project to improve open science analysis practice. Many individuals are part of more than one of the organisations and communities described below.

Galaxy participants must govern themselves, and community participants work on things they find interesting, on their own terms. Grant deliverables are the source of sustainability for resources, so priorities must take their importance into account. Taking responsibility for improving the project where possible is encouraged, as in all open projects. Participants choose their own ways to engage, and the small groups where most activity takes place choose their own work arrangements to meet milestones and reporting requirements for tasks.

First experiences of working in an open source community may be a little surprising for those familiar with more usual, hierarchical management styles. Community contributors in open projects effectively “choose their own adventure”, by managing their own engagement, as they pursue their own interests, at their own pace.

Day to day decision making is necessarily devolved to where the work is being implemented. Each group often uses a chat channel for public communication, and that can be a good place to learn more about its work. Consistently active and helpful contributors gain community respect. In most working situations, when leadership is needed, it derives from local consensus rather than from job title.

Participant leadership of new initiatives is highly encouraged, and the companion lesson on community development success stories illustrates how this works. Open projects depend on community engagement for sustainability, and they grow when participants lead new activities. Galaxy thrives when contributors take personal responsibility for improving the project, rather than waiting for someone else to do it for them.

Collaborators are based in three main groups (Australia, Europe, US), so real-time meetings are sometimes very early or very late in the local day for participants from at least two of them. This can be challenging for those with family and other obligations. As a result, asynchronous communication mechanisms, such as task related chat channels, are used for most routine communication.
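The scheduling difficulty can be made concrete with a little arithmetic. The sketch below is illustrative only. The three fixed UTC offsets (standard time, with daylight saving ignored), the 08:00 to 18:00 "comfortable" window, and the function names are all assumptions made for this example, not project policy or tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumed, illustrative offsets in hours from UTC (standard time only).
SITE_OFFSETS = {
    "Australia (UTC+10)": 10,
    "Europe (UTC+1)": 1,
    "US East (UTC-5)": -5,
}


def local_hours(utc_hour):
    """Return the local starting hour at each site for a meeting at utc_hour UTC."""
    # The date is arbitrary; only the hour of day matters with fixed offsets.
    start = datetime(2023, 1, 16, utc_hour, tzinfo=timezone.utc)
    return {
        site: start.astimezone(timezone(timedelta(hours=offset))).hour
        for site, offset in SITE_OFFSETS.items()
    }


def comfortable_sites(utc_hour, earliest=8, latest=18):
    """List the sites where the meeting starts within [earliest, latest) local time."""
    return [
        site
        for site, hour in local_hours(utc_hour).items()
        if earliest <= hour < latest
    ]


def hours_for_all_sites():
    """Return every UTC hour at which all sites fall inside the comfortable window."""
    return [
        hour
        for hour in range(24)
        if len(comfortable_sites(hour)) == len(SITE_OFFSETS)
    ]


if __name__ == "__main__":
    for hour in (7, 14, 22):
        print(hour, local_hours(hour), comfortable_sites(hour))
    print("UTC hours that suit all sites:", hours_for_all_sites())
```

Under these assumptions, no UTC hour falls inside the working day for all three regions at once, which is consistent with the reliance on asynchronous channels described above. Different offsets or a wider window would change the result.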

Annual face to face meetings have always been important events. The pandemic interfered with this, but Galaxy community conferences are now back. They have been, and continue to be, important for establishing and refreshing personal relationships, and for extending collaborations. They are often followed or preceded by training sessions and hackathons, to take advantage of the concentration of expertise. Conference sessions are well organised, offering varied and informative content each day, and there are well established social traditions that explain a lot about how the project really succeeds.

People could also be considered as resources, since they provide project inputs. Investigators bring resources and skills. Contributors volunteer their skills to make things even better. The sum of this combined effort produces the project outputs. The focus here is on how people organise themselves to get things done, rather than what they do. Those details are on the Hub. Resources used in project activities are described in another lesson.

Field guide to people and organisations in the Galaxy project

Communities of Practice

Participants sharing a particular interest are encouraged to organise project activities that attract and engage others interested in that topic. Many of these communities support specialised framework “flavours”, through training, toolkits and workflows for particular kinds of data and analyses.

Galaxy began with tools mostly for genomic sequence data. From the outset, it was designed to accommodate any open source command line analysis package, and any new specialised data format. Over time, this design has proven readily adaptable for many new fields. Galaxy servers can be “flavoured” for specialist researchers, by configuring new data types and installing relevant command line software analysis packages as tools. The development of some of these important communities is described in the community development success stories lesson associated with this topic.

Best practice tool kits are maintained by contributors working in many fields:

Regions

Participants create active, geographically local communities:

  • North America
  • Europe
  • Australia
  • India
  • add your own here…

Developers and system administrators

System administrators and developers working on core source code form recognisable communities around the various source code repositories. Like other open source project developers, they communicate through GitHub issues, pull requests and chat channels. Examples include:

  • Source code: the Team and many independent contributors
  • Tool wrappers: the IUC and Communities of Practice
  • Ansible and other deployment automation infrastructure: system administrators

These communities make a substantial investment in building developer and system administration capacity for the community. In collaboration with the GTN, they provide specialised developer training material, training intensives and online chat support.

  • Gallantries/Smorgasbords

Galaxy depends on thousands of independent, external developers for the analysis packages that are wrapped as tools. Those developers are essential for Galaxy, but they do not organise project activities at present.

Collaborating investigators

Independent principal investigators bring multiple independent grants that support interacting project deliverables. These grants provide dedicated professional and other resources needed to secure and manage core project resources such as the source code, Hub, usegalaxy.* analysis services, Galaxy Training Network infrastructure and services, user and community support, and project governance.

Free analysis consumes substantial research infrastructure and computational resources, and project activities are supported by highly skilled professionals.

Collaborating grants operate efficiently to achieve their own goals while helping expand and support the project. These pooled resources and efficient collaboration magnify the impact of each individual grant in terms of value to open scientists.

No single grant could encompass the complexity of institutions and resources involved. Some are code focussed, others deliver resources such as the GTN and usegalaxy.* services.

The sum of community engagement and contribution, and highly productive and integrated grants, returns substantial additional value on grant investment.

Contributors

Galaxy is an open project, so contributors are welcomed and supported. Project success and sustainability depend on contributions from community members to multiply the impact of the underlying grant resources.

Project complexity offers many opportunities for contribution. Anyone is free to join in any activities that interest them. Suggestions for ways to make the project easier to navigate are always welcomed.

In the past, community members have turned into highly valued contributors by finding their own particular way to join in, according to their situation. Effort is being devoted to documenting and clarifying the process, to make it far easier to navigate for all kinds of contributors. This training module is part of that effort. The contributor stories material helps illustrate how useful contributions can be made.

  • Issues can be raised and pull requests submitted for review at the GitHub repository.
  • The GTN provides integrated tutorials on using Galaxy, and technical capacity building training material, such as on server administration and on making GTN tutorials. It supports slides narrated in multiple languages, and hands-on tutorials that apply real Galaxy tools to real data.
  • The IUC provides best practice guidance and GTN training modules to help developers build new tool wrappers for Galaxy, and automated infrastructure to help maintain them efficiently.
  • GGSC offers opportunities for motivated community members to initiate and lead community development collaborations that help the project move into new activities and that support and grow subcommunities of users.
  • Working groups provide opportunities for the contribution of technical and other skills in focussed project areas ranging from GUI development through to training and outreach.

Users

Galaxy users vary widely in interests and technical skills. The majority use Galaxy through the usual form- and workflow-based web GUI and do not want to write their own analysis code.

Galaxy supports Jupyter-based interactive tools that allow technically capable users to write their own code and run it on Galaxy history datasets. The resulting notebook is a reproducible and shareable way to provide tailored code to Galaxy users.

Institutional adopters

  • Institutions: organisations deploying and using private Galaxy services
    • NHGRI AnVIL

Educators

  • GTN!

Governance and organisation

  • The participant Code of Conduct regulates all project related interactions so they are safe and enjoyable for everyone.
  • Participants, groups and communities are self governing, based on open source community shared values.
  • Unusual challenge: Respect independent grant holder responsibilities
  • Executive Board
  • GGSC
  • Working groups
  • Road maps
  • IUC/IWC/IDC and other project initiatives - open but initiated by the Team?
  • ….
  • Your ideas here please?

Ensuring continuity and sustainability for the project

This training material is designed to make the project and its impacts more widely and better understood, as a way of improving long term sustainability. In the shorter term, continuity and resilience are also of concern. Galaxy is old enough to have experienced the unexpected and tragic loss of key contributors.

Sustainability

The project is currently sustained with expanding resources and community contribution, and with self-governing structures coordinating the delivery of increasing value for scientists.

  • Expanding the project with new collaborations increases the grant resources, communities and potential leadership base, and is encouraged where possible.
  • Expanding the pool of regular, long term community contributors is important for sustainability because they are very productive and important in supporting and mentoring participants. Training for contributors is part of the GTN mission, so there is a strong base to work with. Recruitment strategies and other ways to bring in new people are needed.
  • Giving back and other ways of recognising contribution remain in need of dedicated resources and work.

Continuity

  • Reliable 24x7 services, such as those usegalaxy.* provides, are surprisingly hard to sustain. These are very large scale, complicated computing, networking, storage and software systems with a great deal of local complexity, because each operates on a distinct mixture of research computing infrastructure. In addition to a deep understanding of Galaxy code, they require an unusual mix of skills and experience to design, commission, test and run. Once working, hardware failures, bugs in core software infrastructure components, and other unpredictable problems must be diagnosed efficiently and fixed quickly. As a result, their stable operation depends on highly specialised and dedicated personnel. A spare system administrator sitting idly on call for each of the three sites is not a practicable proposition. Capacity building is part of the training and mentorship that already happens, but might need even more investment.
  • Retention, mentoring and workforce planning for professional staff
  • Governance and small group leadership

Your ideas

  • Your ideas for people-related things to describe here, please

Further reading

For more on the main components, and stories of how people get things done in the project, choose from the other lessons in this Topic: