Research

VERSO conducts research on the development, governance, and sustainability of open-source projects, investigating how communities collaborate, make decisions, and manage resources.

Publications from UVM on Open Source

Multidisciplinary learning through collective performance favors decentralization.
Meluso, J. & Hébert-Dufresne, L.
Proceedings of the National Academy of Sciences, August 2023, 120 (34): e2303568120.

https://doi.org/10.1073/pnas.2303568120
Many models of learning in teams assume that team members can share solutions or learn concurrently. However, these assumptions break down in multidisciplinary teams where team members often complete distinct, interrelated pieces of larger tasks. Such contexts make it difficult for individuals to separate the performance effects of their own actions from the actions of interacting neighbors. In this work, we show that individuals can overcome this challenge by learning from network neighbors through mediating artifacts (like collective performance assessments). When neighbors’ actions influence collective outcomes, teams with different networks perform relatively similarly to one another. However, varying a team’s network can affect performance on tasks that weight individuals’ contributions by network properties. Consequently, when individuals innovate (through “exploring” searches), dense networks hurt performance slightly by increasing uncertainty. In contrast, dense networks moderately help performance when individuals refine their work (through “exploiting” searches) by efficiently finding local optima. We also find that decentralization improves team performance across a battery of 34 tasks. Our results offer design principles for multidisciplinary teams within which other forms of learning prove more difficult.

 

 

Beyond the Repository: Best practices for open source ecosystems researchers
Amanda Casari, Julia Ferraioli, and Juniper Lovato.
Queue 21, 2 (March/April 2023), starting at page 30, 21 pages.
 
Much of the existing research about open source elects to study software repositories instead of ecosystems. An open source repository most often refers to the artifacts recorded in a version control system and occasionally includes interactions around the repository itself. An open source ecosystem refers to a collection of repositories, the community, their interactions, incentives, behavioral norms, and culture. The decentralized nature of open source makes holistic analysis of the ecosystem an arduous task, with communities and identities intersecting in organic and evolving ways. Despite these complexities, the increased scrutiny on software security and supply chains makes it of the utmost importance to take an ecosystem-based approach when performing research about open source. This article provides guidelines and best practices for research using data collected from open source ecosystems, encouraging research teams to work with communities in respectful ways.
 
Indirect social learning through collective performance favors decentralization
John Meluso, Laurent Hébert-Dufresne
Many models of learning in teams assume that team members can share solutions or learn concurrently. These assumptions break down in multi-disciplinary teams where team members tend to complete distinct, interrelated pieces of larger tasks. Such contexts make it difficult for individuals to separate the performance effects of their own actions from the actions of interacting neighbors. In this work, we show that individuals can also learn indirectly from network neighbors. In this context, individuals in dense networks tend to perform better when refining their work (through “exploiting” searches) by efficiently finding local optima. However, when individuals innovate (through “exploring” searches), dense networks hurt performance by increasing uncertainty. We also find that decentralization improves team performance across a wide variety of tasks. Our results offer new design principles for multi-disciplinary teams where direct learning is not realistic or possible.
 

Limits of Individual Consent and Models of Distributed Consent in Online Social Networks

Juniper L. Lovato, Antoine Allard, Randall Harp, Jeremiah Onaolapo, and Laurent Hébert-Dufresne.
2022
FAccT ’22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, June 2022, Pages 2251–2262
 
Personal data are not discrete in socially-networked digital environments. A user who consents to allow access to their profile can expose the personal data of their network connections to non-consented access. Therefore, the traditional consent model (informed and individual) is not appropriate in social networks where informed consent may not be possible for all users affected by data processing and where information is distributed across users. Here, we outline the adequacy of consent for data transactions. Informed by the shortcomings of individual consent, we introduce both a platform-specific model of “distributed consent” and a cross-platform model of a “consent passport.” In both models, individuals and groups can coordinate by giving consent conditional on that of their network connections. We simulate the impact of these distributed consent models on the observability of social networks and find that low adoption would allow macroscopic subsets of networks to preserve their connectivity and privacy.
 

A Review & Framework for Modeling Complex Engineered System Development Processes

John Meluso, Jesse Austin-Breneman, James P. Bagrow, Laurent Hébert-Dufresne
 

Developing complex engineered systems (CES) poses significant challenges for engineers, managers, designers, and businesspeople alike due to the inherent complexity of the systems and contexts involved. Furthermore, experts have expressed great interest in filling the gap in theory about how CES develop. This article begins to address that gap in two ways. First, it reviews the numerous definitions of CES along with existing theory and methods on CES development processes. Then, it proposes the ComplEx System Integrated Utilities Model (CESIUM), a novel framework for exploring how numerous system and development process characteristics may affect the performance of CES. CESIUM creates simulated representations of a system architecture, the corresponding engineering organization, and the new product development process through which the organization designs the system. It does so by representing the system as a network of interdependent artifacts designed by agents. Agents iteratively design their artifacts through optimization and share information with other agents, thereby advancing the CES toward a solution. This paper describes the model, conducts a sensitivity analysis, provides validation, and suggests directions for future study.

 
Open source ecosystems need equitable credit across contributions
Amanda Casari, Katie McLaughlin, Milo Z. Trujillo, Jean-Gabriel Young, James P. Bagrow & Laurent Hébert-Dufresne
January 2021
Nat Comput Sci 1, 2 (2021)
 
Collaborative and creative communities are more equitable when all contributions to a project are acknowledged. Equitable communities are, in turn, more transparent, more accessible to newcomers, and more encouraging of innovation; hence we should foster these communities, starting with proper attribution of credit. However, to date, no standard and comprehensive contribution acknowledgement system exists in open source, not just for software development but for the broader ecosystems of conferences, organization and outreach efforts, and technical knowledge. Furthermore, both closed and open source projects are built on a complex web of open source dependencies, and we lack a nuanced understanding of who creates and maintains these projects. As a result, large sums and efforts go to open source software projects without knowing whom the investments support and where they have impact.
 

Which contributions count? Analysis of attribution in open source

J.-G. Young, A. Casari, K. McLaughlin, M. Z. Trujillo, L. Hébert-Dufresne and J. P. Bagrow

2021

2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)

https://doi.ieeecomputersociety.org/10.1109/MSR52588.2021.00036

Open source software projects usually acknowledge contributions with text files, websites, and other idiosyncratic methods. These data sources are hard to mine, which is why contributorship is most frequently measured through changes to repositories, such as commits, pushes, or patches. Recently, some open source projects have taken to recording contributor actions with standardized systems; this opens up a unique opportunity to understand how community-generated notions of contributorship map onto codebases as the measure of contribution. Here, we characterize contributor acknowledgment models in open source by analyzing thousands of projects that use a model called All Contributors to acknowledge diverse contributions like outreach, finance, infrastructure, and community management. We analyze the life cycle of projects through this model’s lens and contrast its representation of contributorship with the picture given by other methods of acknowledgment, including GitHub’s top committers indicator and contributions derived from actions taken on the platform. We find that community-generated systems of contribution acknowledgment make work like idea generation or bug finding more visible, which generates a more extensive picture of collaboration. Further, we find that models requiring explicit attribution lead to more clearly defined boundaries around what is and is not a contribution.

The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories

M. Warrick, S. F. Rosenblatt, J. Young, A. Casari, L. Hébert-Dufresne and J. Bagrow

2022

2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)

https://doi.ieeecomputersociety.org/10.1145/3524842.3528479  

Communication surrounding the development of an open source project largely occurs outside the software repository itself. Historically, large communities often used a collection of mailing lists to discuss the different aspects of their projects. Multimodal tool use, with software development and communication happening on different channels, complicates the study of open source projects as a sociotechnical system. Here, we combine and standardize mailing lists of the Python community, resulting in 954,287 messages from 1995 to the present. We share all scraping and cleaning code to facilitate reproduction of this work, as well as smaller datasets for the Golang (122,721 messages), Angular (20,041 messages) and Node.js (12,514 messages) communities. To showcase the usefulness of these data, we focus on the CPython repository and merge the technical layer (which GitHub account works on what file and with whom) with the social layer (messages from unique email addresses) by identifying 33% of GitHub contributors in the mailing list data. We then explore correlations between the valence of social messaging and the structure of the collaboration network. We discuss how these data provide a laboratory to test theories from standard organizational science in large open source projects.

Reading List about Open Work

This is a list of books for a general audience on open source development, related skills, and history. For UVM students, a link to the library record follows each entry.

 

Core Concepts

Raymond, Eric S. The Cathedral and the Bazaar. Sebastopol: O’Reilly Media, Incorporated, 2001. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_proquest_miscellaneous_37837812

 

 

Torvalds, Linus, and Diamond, David. Just for Fun : The Story of an Accidental Revolutionary / Linus Torvalds and David Diamond. 1st ed. New York, NY: HarperBusiness, 2001. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER1266212

 

DiBona, Chris, Sam Ockman, and Mark Stone. Open Sources. Sebastopol: O’Reilly Media, Incorporated, 1999. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_proquest_ebookcentral_EBC443191

 

Lindberg, Van. Intellectual Property and Open Source. Sebastopol: O’Reilly Media, Incorporated, 2008. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_proquest_ebookcentral_EBC540358

 

 

Rathee, Sachin, and Amol Chobe. Getting Started with Open Source Technologies. Berkeley, CA: Apress L. P, 2022. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_safari_books_v2_9781484281277

 

Brasseur, VM. Forge Your Future with Open Source. Pragmatic Bookshelf, 2018. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_safari_books_v2_9781680506389

 

Haff, Gordon. How Open Source Ate Software. 2nd ed. Berkeley, CA: Apress L. P, 2021. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_springer_books_10_1007_978_1_4842_6800_1

 

Jhangiani, Rajiv, and Biswas-Diener, Robert, Editor. Open [electronic Resource] : The Philosophy and Practices That Are Revolutionizing Education and Science / Edited by Rajiv S. Jhangiani and Robert Biswas-Diener. 2017. Web.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER4842686  

 

Hall, G. Brent. Open Source Approaches in Spatial Data Handling [electronic Resource] / by G. Brent Hall. Berlin ; London: Springer, 2008. Advances in Geographic Information Science ; 2. Web.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER3680474

 

 

Laurent, Andrew M. St. Understanding Open Source and Free Software Licensing. Sebastopol: O’Reilly Media, Incorporated, 2004. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_askewsholts_vlebooks_9780596553951  

 

 

OW2. Open Source Good Governance Handbook : Version 1.0 / OW2 & the Good Governance Initiative Participants. Version 1.0 ed. 2022. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552099

 

Brown, Amy, and Wilson, Greg, Editor. The Architecture of Open Source Applications : Elegance, Evolution, and a Few Fearless Hacks / Edited by Amy Brown & Greg Wilson. 2011. Web.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552101

 

Weber, Steve. The Success of Open Source / Steven Weber. Cambridge, Mass. ; London: Harvard UP, 2005. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552110

 

Beebe, Barton Carl. Trademark Law : An Open-source Casebook / Barton Beebe. Version 9 (2022) ed. 2022. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552117  

 

Tozzi, Christopher J., and Zittrain, Jonathan, Writer of Foreword. For Fun and Profit : A History of the Free and Open Source Software Revolution / Christopher Tozzi ; Foreword by Jonathan Zittrain. 2017. Print. History of Computing.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552118

 

Design

Norman, Donald A. The Design of Everyday Things / Don Norman. Revised and Expanded ed. 2013. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER2853424  

 

Buley, Leah. The User Experience Team of One : A Research and Design Survival Guide / Leah Buley. 2013. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER3321692

 

Eyal, Nir, and Hoover, Ryan, Author. Hooked : How to Build Habit-forming Products / Nir Eyal with Ryan Hoover. 2014. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER4955980

 

 

Knapp, Jake, and Zeratsky, John, Author. Sprint : How to Solve Big Problems and Test New Ideas in Just Five Days / Jake Knapp ; with John Zeratsky and Braden Kowitz. First Simon & Schuster Hardcover ed. 2016. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552100

 

Innovation/Entrepreneurship

Olsen, Dan. The Lean Product Playbook. New York: Wiley, 2015. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_skillsoft_books24x7_bkb00082554

 

Ries, Eric. The Lean Startup : How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses / Eric Ries. First ed. 2011. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER3657809

 

Moore, Geoffrey A. Crossing the Chasm : Marketing and Selling Technology Products to Mainstream Customers / Geoffrey A. Moore ; with a Foreword by Regis McKenna. [New York, N.Y.]: HarperBusiness, 1991. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER600862

 

Christensen, Clayton M. The Innovator’s Dilemma : When New Technologies Cause Great Firms to Fail / Clayton M. Christensen. [Third Edition?] ed. 2016. Print. Management of Innovation and Change Ser.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER4938007

 

 

Impact

Cooper, Alan. The Inmates Are Running the Asylum: Why High-Tech Products Drive Us Crazy and How to Restore the Sanity. Sams, 2004. Web.

http://primo.uvm.edu/permalink/f/12s06dh/TN_cdi_safari_books_v2_0672326140 

 

Williams, Sarah. Data Action : Using Data for Public Good / Sarah Williams. 2020. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5247276

 

Benjamin, Ruha. Race after Technology : Abolitionist Tools for the New Jim Code / Ruha Benjamin. 2019. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5413100

 

O’Neil, Cathy. Weapons of Math Destruction : How Big Data Increases Inequality and Threatens Democracy / Cathy O’Neil. First ed. 2016. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER3847355

 

Eubanks, Virginia. Automating Inequality : How High-tech Tools Profile, Police, and Punish the Poor / Virginia Eubanks. First ed. 2018. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER4174984

 

Noble, Safiya Umoja. Algorithms of Oppression : How Search Engines Reinforce Racism / Safiya Umoja Noble. 2018. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER4322328 

 

 

D’Ignazio, Catherine, and Klein, Lauren F., Author. Data Feminism / Catherine D’Ignazio and Lauren F. Klein. 2020. Print. Ideas Ser. 

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552442

 

 

Clearfield, Chris, and Tilcsik, András, Author. Meltdown : What Plane Crashes, Oil Spills, and Dumb Business Decisions Can Teach Us about How to Succeed at Work and at Home / Chris Clearfield and András Tilcsik. 2019. Print.

http://primo.uvm.edu/permalink/f/1mpllsg/UVM_VOYAGER5552098