A first look at the WordPress developer community

I’m researching the organization of WordPress development as part of my master’s degree. The full article is available in Portuguese at this link. In this post I briefly share the first results from analysing the WP code repository, with two goals in mind:

  1. Compare core developers’ contributions made directly on the repo with the community’s contributions made via a patch on Trac.
  2. Question the notion that free software communities are composed of individuals from all over the world by checking where WordPress developers live.

The data was extracted from the WP Git repository (git://develop.git.wordpress.org/) in May 2014 using a modified version of gitinspector. The software was adapted to parse each commit message for a “props” tag, which identifies contributions made by non-core developers. Only commits changing PHP, JS or CSS files were taken into account. In total, 25,692 commits were analysed: 32 core developers made 14,882 of them (58%) and 1,346 non-core developers made 10,810 (42%). Among non-core developers, 767 individuals made a single contribution and 368 made two.
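To make the “props” parsing step concrete, here is a minimal Python sketch. It is not the modified gitinspector code. The regular expression, the function names and the usernames in the usage note are all assumptions made for illustration. In particular, it assumes the tag looks like `props name1, name2.` and that a commit without the tag is credited to the committer.

```python
import re
from collections import Counter

# Assumed pattern: the word "props", then a list of usernames that ends
# at the next period or line break. Usernames containing a period would
# be cut short by this simplification.
PROPS_RE = re.compile(r"\bprops\s+([^.\n]+)", re.IGNORECASE)


def extract_props(message):
    """Return the usernames credited through a 'props' tag in a commit message."""
    names = []
    for match in PROPS_RE.finditer(message):
        # Names are assumed to be separated by commas or the word "and".
        for part in re.split(r",|\band\b", match.group(1)):
            name = part.strip().lstrip("@")
            if name:
                names.append(name)
    return names


def count_contributions(commits):
    """Split commits into core and non-core tallies.

    `commits` is an iterable of (committer, message) pairs. A commit with
    a props tag is credited to every name in the tag; otherwise it is
    credited to the committer.
    """
    core = Counter()
    non_core = Counter()
    for committer, message in commits:
        credited = extract_props(message)
        if credited:
            for name in credited:
                non_core[name] += 1
        else:
            core[committer] += 1
    return core, non_core
```

For example, `extract_props("Fix notice. props alice, bob. fixes #123.")` returns `["alice", "bob"]`, while a message without the tag returns an empty list.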

After compiling the list of developers who contributed to WordPress, a script retrieved each developer’s location from their profile page on wordpress.org. The location field on the profile page is an optional free-text field, so the OpenStreetMap API was used to normalize the data. With this method it was possible to determine the country of residence of 603 developers (43% of the total).
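The normalization step could look roughly like the sketch below. The post only says “OpenStreetMap API”, so the use of the public Nominatim search endpoint is an assumption, as are the function names and the `User-Agent` string. The original script may have worked differently.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Nominatim geocoding service, which is built on
# OpenStreetMap data. The original script may have used another service.
NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def build_query_url(location):
    """Build a search URL that asks for one result with address details."""
    params = {"q": location, "format": "json", "addressdetails": 1, "limit": 1}
    return NOMINATIM_URL + "?" + urllib.parse.urlencode(params)


def country_from_results(results):
    """Return the upper-case country code of the first result, or None."""
    if not results:
        return None
    code = results[0].get("address", {}).get("country_code")
    return code.upper() if code else None


def lookup_country(location):
    """Resolve a free-text profile location to a country code (network call)."""
    if not location or not location.strip():
        return None  # the profile field is optional, so it is often empty
    request = urllib.request.Request(
        build_query_url(location.strip()),
        headers={"User-Agent": "wp-dev-location-study-example"},  # hypothetical identifier
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return country_from_results(json.load(response))
```

Keeping the URL building and the response parsing in separate functions lets both be checked without touching the network. Free-text locations that the geocoder cannot resolve simply yield `None`, which is consistent with only part of the developers being located.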

Figure 1 shows the cumulative distribution of core developers’ contributions to the code base (including the commits they made with contributions sent by others via Trac). The different lines show the cumulative fraction of commits, lines added and lines removed. As can be seen, 9 developers were responsible for more than 80% of the contributions.

fig1
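A cumulative distribution like this one can be computed from per-developer commit counts. The following sketch is illustrative only: the function names are hypothetical and the numbers in the usage note are made up, not taken from the study.

```python
def cumulative_fractions(commit_counts):
    """Cumulative share of commits, starting with the most active developer."""
    total = sum(commit_counts)
    if total == 0:
        return []
    fractions = []
    running = 0
    for count in sorted(commit_counts, reverse=True):
        running += count
        fractions.append(running / total)
    return fractions


def developers_for_share(commit_counts, share=0.8):
    """Smallest number of top developers whose commits reach `share` of the total."""
    total = sum(commit_counts)
    if total == 0:
        return 0
    running = 0
    for rank, count in enumerate(sorted(commit_counts, reverse=True), start=1):
        running += count
        if running >= share * total:
            return rank
    return len(commit_counts)
```

With the made-up counts `[60, 25, 10, 5]`, `developers_for_share` returns `2`, because the two most active developers account for 85 of the 100 commits. The same functions apply to lines added or lines removed by passing those counts instead.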

Figure 2 plots the cumulative distribution of contributions to the code base made by core and non-core developers. In this case the work is more widely distributed: 50 developers made 80% of the commits.

fig2

The next four figures show the developers’ countries of residence and the main language spoken in those countries. Figure 3 represents core developers’ country of residence. The majority of the 32 core developers are based in the USA (56%). There are no core developers in Latin America or Africa, and only one in Oceania and one in Asia.

fig3

As shown in figure 4, the vast majority of core developers (27 out of 32, or 84%) live in countries where the main language is English.

fig4

Figure 5 shows the same analysis as figure 3 but considering both core and non-core developers. It was possible to identify developers from 58 different countries. Almost 50% (298 individuals) live in the USA, 10% (55 individuals) in the UK and 5% (32 individuals) in Canada. The remaining 35% are distributed across 55 countries. By continent, 55% of all developers are in North America, 32% in Europe, 9% in Asia and the remaining 4% in Oceania, South America and Africa.

fig5

Finally, figure 6 represents the main language spoken in core and non-core developers’ country of residence. Again, the majority (67%) are based in English-speaking countries.

fig6

The results shown in this post are part of ongoing research. Any feedback or suggestions will be greatly appreciated. Next I intend to expand this study by applying the same analysis to the plugins repository and Trac.