Open source components that connect applications to cloud resources, along with those written in Python, have jumped up the list of critical packages, according to the latest rankings of the open source software ecosystem -- a reordering that highlights which projects need to be well-funded to improve the security of the software ecosystem.
The data-collection effort -- known as the "Census of Free and Open Source Software" -- classifies open source projects into eight top-500 lists, depending on their ecosystem, whether version information is included, and whether direct and indirect dependencies are taken into account. The latest survey, known as Census III, found that packages for Python software and those meant to connect developers with specific cloud services -- such as a toolkit for Amazon's Elastic Compute Cloud (EC2) or the API for connecting Go programs to Google Cloud -- have become much more popular and, thus, more critical to software development.
While cloud-native and hybrid development are by no means new, cloud providers have created an increasing number of software development kits (SDKs) for developers. Their widespread use has boosted those tools in the rankings of critical software, says David Wheeler, director of open source supply chain security for the Linux Foundation, which collaborates with Harvard Business School to produce the census.
"Cloud providers offer a lot of specialized services, but the early uses of cloud were a lot of lift-and-shift moves," he says. "Increasingly, we're seeing people write software specifically intended to be run on a cloud, [and there is a] rising level of these kinds of packages -- it's something that is dramatically increasing."
The third "Census of Free and Open Source Software" report comes more than two years after the official publication of Census II in March 2022 -- an initial version of that report was released in 2020 -- and nine years after the original census report. The data-collection exercises aim to identify the most critical open source software so that the public and private sectors can effectively invest in the projects as a path to improve software security. Each software package is scored using data from software supply chain firms FOSSA, Snyk, Sonatype, and the Synopsys Cybersecurity Research Center (CyRC).
The resilience of the software supply chain has become a major concern of the software industry and national governments. The Biden administration, for example, released a National Cybersecurity Strategy that firmly emphasized finding ways to improve the security of software and the open source ecosystem on which most applications rely.
The Amazon Web Services (AWS) software development kit for Python, known as Boto3, rose to fifth place on the "Non-npm, Direct, Version Agnostic Packages" list of critical software. The library was not ranked in Census II. A similar package -- aws-sdk -- rose to the seventh spot on the JavaScript-ecosystem "npm, Direct, Version Agnostic Packages" list, from 307th in the previous census.
Other cloud-focused packages saw similar jumps: The software development kit to connect Go programs to Google Cloud ranked eighth, while the AWS kit for .NET rose to number 30. Neither was ranked in the previous census.
Because the Node Package Manager (npm) ecosystem sees such a large volume of JavaScript downloads -- 4.5 trillion in 2024, compared to 530 billion for Python, according to Sonatype -- its numbers would swamp any popularity measurement that combined it with other ecosystems. As a result, the census breaks out npm downloads from those for other software ecosystems.
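The eight lists follow from three two-way splits: ecosystem, dependency scope, and versioning. The short Python sketch below enumerates those combinations. The labels "Indirect" and "Versioned" are assumptions modeled on the two list names quoted in this article, and the function name is hypothetical; the census report may word its list titles differently.

```python
from itertools import product

# Three two-way splits. Only "npm", "Non-npm", "Direct", and
# "Version Agnostic" appear in list names quoted in the article;
# "Indirect" and "Versioned" are assumed labels for the other halves.
ECOSYSTEMS = ("npm", "Non-npm")
DEPENDENCY_SCOPES = ("Direct", "Indirect")
VERSIONING = ("Version Agnostic", "Versioned")


def census_list_names():
    """Return one illustrative list title per combination of the three splits."""
    return [
        "{}, {}, {} Packages".format(ecosystem, scope, versioning)
        for ecosystem, scope, versioning in product(
            ECOSYSTEMS, DEPENDENCY_SCOPES, VERSIONING
        )
    ]


names = census_list_names()
for name in names:
    print(name)
```

Two choices on each of three axes give 2 x 2 x 2 = 8 lists, matching the count the census reports.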
The data underscores the criticality of open source software to the infrastructure underpinning cloud services, says Brian Fox, CTO and co-founder of Sonatype, a software supply chain management firm.
"Open source across the board just continues to see 'hockey stick' growth year after year, which is shocking -- we're starting to see really, really big numbers," he says. "That's the reason why they're doing the census, because it is so important to be shining a light on these things."
Replacing or patching outdated software has become a central focus of efforts to eliminate vulnerabilities from software. Over the past decade, for example, Python developers have only slowly moved to Python 3, which was first released in 2008. Last year, 1% of Python developers used Python 2 as their primary programming language, down from 13% in 2019, according to data from JetBrains' annual "Developer Ecosystem" report.
As a result, a project designed to allow compatibility between software written in Python 2 and code in Python 3 -- the "Six" project -- has become a critical software component, according to Census III. Typically, Python versions are supported for five years. Python 3.11 -- currently used by 27% of developers as their primary programming language, making it the most popular version at present -- will reach its end of life in October 2027. The final version of Python 2 -- version 2.7 -- passed its end of life in January 2020.
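To illustrate the kind of gap such a compatibility layer bridges, here is a minimal sketch written with only the Python standard library. The names resemble helpers that Six exposes, but the bodies are this sketch's own and should not be read as Six's implementation. The underlying problem is that Python 2 and Python 3 name their text and byte types differently, so code meant to run on both needs a single spelling for each.

```python
import sys

# True only when running under a Python 2 interpreter.
PY2 = sys.version_info[0] == 2

if PY2:
    # Python 2: "unicode" holds text, "str" holds raw bytes.
    text_type = unicode  # noqa: F821 (name exists only on Python 2)
    binary_type = str
else:
    # Python 3: "str" holds text, "bytes" holds raw bytes.
    text_type = str
    binary_type = bytes


def ensure_text(value, encoding="utf-8"):
    """Return value as text on either major version, decoding bytes if needed."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    raise TypeError("expected text or bytes, got {!r}".format(type(value)))


# The same call works for bytes read from an old data file and for text.
label = ensure_text(b"caf\xc3\xa9")
```

Code that calls a helper like this, rather than checking interpreter versions at every call site, can keep older Python 2-era modules in use while the rest of the program runs on Python 3.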
The data does not address how often developers encounter -- and interact with -- components written in Python 2. The overwhelming shift to Python 3 is driving the use of Six, as developers need to use older code with programs written in the latest version of Python. In addition, certain groups of developers -- such as 29% of data scientists and 19% of Web developers -- continue to use some Python 2 code, according to data from JetBrains, a maker of development tools.
"If you look at the raw numbers, Python 3 is far more common, but in various specific domains Python 2 is still widely, widely used, which is why Six is showing up more," the Linux Foundation's Wheeler says. "I would argue it's why we're finally able to get so many more Python 3 users is because the bridge to move from 2 to 3 is easier."
While Census III is available to download from the Linux Foundation, companies should be automating their package management and regularly testing and updating their software, says Sonatype's Fox. The real lesson from the census is not which packages should be given the most attention, but which projects need additional funds and paid maintainers.
"The sustainability of the [open source ecosystem] is something that should be top of mind," he says. "We're dependent more and more on largely an aging and unpaid workforce for maintaining critical software -- those two things together don't end well."