As a step towards delivering optimal solutions, our current sprint at Beavr Labs revolved around improving build times and overall app performance.
While there are ongoing discussions about optimizing DB calls with a Redis cache, improving the time complexity of certain hefty procedures, and removing redundant code blocks, the key focus was migrating from our existing package manager and environment to something more performant!
This blog will take you on an adventurous journey from the old comfortable streets of npm to the express highways of its more modern competitors. Along the way, you will see how each package manager tries to solve the same problem in its own way and why, interestingly, none of them is the clear winner. It all boils down to what a specific project requires.
Introduction to npm: The Node Package Manager
Developed in 2009 by Isaac Schlueter, npm has grown into one of the largest software registries in the world. It hosts a vast collection of open-source packages and tools, making it an essential resource for developers across various domains. The npm registry is a place for developers to share their creations and collaborate with others.
Key Components of npm
npm local cache
When you install a package using npm, the downloaded package is also stored in a local cache. This means that npm does not have to download the same package version again the next time you install it.
npm dependency management system
The npm dependency management system allows you to list all of the dependencies that your project needs in a package.json file. Running npm install then installs them, and npm update moves them to newer versions that the declared version ranges allow.
How does npm work:
You run the npm install <package> command.
npm checks the local cache to see if the package has already been downloaded.
If it has not, npm downloads it from the npm registry and stores a copy in the local cache.
npm unpacks the package into your project's node_modules directory.
npm adds the package to your project's package.json file and records the exact installed version in package-lock.json.
The next time you (or a teammate) run npm install in the project, npm reads package.json and the lock file and installs the listed dependencies.
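The flow above can be sketched in a few lines of TypeScript. This is only a simplified in-memory model, not npm's actual implementation: the InstallModel class, its fields, and the Tarball type are hypothetical names, and plain maps stand in for the registry, the local cache, and node_modules.

```typescript
// Hypothetical in-memory model of the install flow; this is not npm's real code.
type Tarball = { name: string; version: string };

class InstallModel {
  registry: Map<string, Tarball>;            // stands in for the remote npm registry
  cache = new Map<string, Tarball>();        // stands in for npm's local cache
  nodeModules = new Map<string, Tarball>();  // stands in for ./node_modules
  manifest: Record<string, string> = {};     // "dependencies" in package.json
  downloads = 0;                             // number of registry fetches so far

  constructor(registry: Map<string, Tarball>) {
    this.registry = registry;
  }

  install(name: string, version: string): void {
    const key = `${name}@${version}`;
    let tarball = this.cache.get(key);
    if (tarball === undefined) {
      // Cache miss: fetch from the registry and keep a copy in the cache.
      tarball = this.registry.get(key);
      if (tarball === undefined) throw new Error(`${key} not found in registry`);
      this.downloads += 1;
      this.cache.set(key, tarball);
    }
    // Unpack into node_modules and record the dependency in the manifest.
    this.nodeModules.set(name, tarball);
    this.manifest[name] = `^${version}`;
  }
}

const npmRegistry = new Map<string, Tarball>([
  ["left-pad@1.3.0", { name: "left-pad", version: "1.3.0" }],
]);
const npmModel = new InstallModel(npmRegistry);
npmModel.install("left-pad", "1.3.0");
npmModel.nodeModules.clear();          // simulate deleting node_modules
npmModel.install("left-pad", "1.3.0"); // served from the cache this time
console.log(npmModel.downloads);       // the registry was contacted only once
```

The point of the model is the second install: node_modules is rebuilt, but nothing is downloaded again because the cache already holds the package.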
npm also allows you to manage your dependencies in a number of ways. For example, you can specify which version of a package you want to install, or you can install specific packages from a private registry.
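To make version ranges a little more concrete, here is a minimal sketch of how a caret range such as ^1.2.3 can be checked. It assumes plain x.y.z versions without pre-release tags, and the helper names (parseVersion, caretUpperBound, satisfiesCaret) are made up for this example; real projects would use the semver library instead.

```typescript
type Version = [number, number, number];

// Parses "1.2.3" into [1, 2, 3]; anything else is rejected in this sketch.
function parseVersion(v: string): Version {
  const parts = v.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`unsupported version: ${v}`);
  }
  return [parts[0], parts[1], parts[2]];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Exclusive upper bound of a caret range: bump the left-most non-zero part.
function caretUpperBound([major, minor, patch]: Version): Version {
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

function satisfiesCaret(version: string, range: string): boolean {
  if (!range.startsWith("^")) throw new Error("only caret ranges are handled here");
  const min = parseVersion(range.slice(1));
  const candidate = parseVersion(version);
  return (
    compareVersions(candidate, min) >= 0 &&
    compareVersions(candidate, caretUpperBound(min)) < 0
  );
}

console.log(satisfiesCaret("1.4.0", "^1.2.3")); // true: same major, newer minor
console.log(satisfiesCaret("2.0.0", "^1.2.3")); // false: next major is excluded
```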
npm is a powerful tool that can help you save time and effort, improve the quality of your code, and increase collaboration with other developers.
Why are devs leaving npm?
npm is a popular package manager, but it also has some disadvantages:
Security risks: The open npm registry has been abused to distribute malicious code. For example, in 2017, typosquatted packages such as crossenv were found stealing environment variables from projects that installed them by mistake.
Performance: npm can be slow to install and update packages, especially for large projects with many dependencies.
Complexity: npm can be complex to use, especially for new developers.
Dependency hell: It is a situation where your project has dependencies that are incompatible with each other. This can be difficult to fix, and it can lead to build failures and other problems.
That was enough context to start with this blog's core intention!
Yarn is another popular package manager that is known for its speed and reliability. Yarn popularized features such as lock files and workspaces, which npm later adopted as well.
Yarn Berry is the codename for Yarn 2 and later, a rewrite of the original Yarn (now called Yarn Classic). Yarn Berry aims to be faster and more reliable than both its predecessor and npm.
pnpm is a fast and disk-efficient package manager that uses a distinctive installation strategy built on hard links and symlinks (covered a bit later in this blog). pnpm can be significantly faster than npm for installing and updating packages, especially for large projects.
Bun is a newer JavaScript runtime and toolkit that ships with its own package manager, designed to be fast and simple. It installs into a regular node_modules directory and relies on a global cache and a lockfile to keep installs quick.
Each of these package managers has its own strengths and weaknesses. In this blog post, we will compare and contrast the four package managers and discuss which one is the best choice for different types of projects.
Introduction to Yarn
Yarn Berry: The Latest Evolution
Yarn Berry represents the latest evolution of the Yarn package manager. It introduces features and performance enhancements that further streamline the development workflow. One of the standout features of Yarn Berry is its Plug'n'Play (PnP) system, an efficient approach to managing dependencies.
Exploring Yarn Berry’s most significant achievement - PnP
Huge, slow-to-install node_modules folders used to be a fact of life. Well, not anymore - thanks to Yarn’s Plug’n’Play (PnP)!
“The way Yarn PnP works, it tells Yarn to generate a single Node.js loader file in place of the typical node_modules folder. This loader file, named .pnp.cjs, contains all information about your project's dependency tree, informing your tools as to the location of the packages on the disk and letting them know how to resolve require and import calls.”
PnP has many advantages over the traditional node_modules installation strategy:
Faster install times: PnP installs dependencies much faster than the node_modules installation strategy. This is because Yarn does not need to unpack and copy thousands of files into a node_modules directory; packages stay in the cache as zip archives.
Less disk space: PnP uses less disk space than the node_modules installation strategy. This is because each package is kept as a compressed archive in the cache instead of being copied, file by file, into node_modules.
Efficient dependency resolution: With node_modules, Node.js has to walk up the directory tree and probe the file system to find each package. With PnP, the loader looks up the package location directly in .pnp.cjs.
Better support for workspaces: PnP can manage dependencies for all the projects in a workspace in a single place.
“Yarn PnP allows to reuse the same package artifacts across all projects on the disk. Unlike pnpm, which uses a content-addressable store where each file from each package needs to be hardlinked into its final destination, the PnP loader directly references packages via their cache path, removing a lot of complexity.”
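The difference between the two lookup strategies can be illustrated with a small sketch. It is a simplified model under stated assumptions: nodeModulesCandidates only lists the directories a node_modules-style lookup would probe, and PnpTable is a hypothetical stand-in for the data kept in .pnp.cjs, whose real format is different. The cache path used below is also made up.

```typescript
// Directories a node_modules-style lookup probes for a bare import,
// starting at the importing file's directory and walking up to the root.
function nodeModulesCandidates(fromDir: string, pkg: string): string[] {
  const parts = fromDir.split("/").filter(Boolean);
  const candidates: string[] = [];
  for (let i = parts.length; i >= 0; i--) {
    const base = "/" + parts.slice(0, i).join("/");
    candidates.push(`${base === "/" ? "" : base}/node_modules/${pkg}`);
  }
  return candidates;
}

// Hypothetical stand-in for the PnP data: for each package (the "issuer"),
// the on-disk location of every dependency it declares.
type PnpTable = Map<string, Map<string, string>>;

// A single table lookup replaces the directory walk.
function pnpResolve(table: PnpTable, issuer: string, pkg: string): string {
  const location = table.get(issuer)?.get(pkg);
  if (location === undefined) {
    throw new Error(`${issuer} does not declare a dependency on ${pkg}`);
  }
  return location;
}

const pnpTable: PnpTable = new Map([
  ["my-app", new Map([["lodash", "/cache/lodash-4.17.21.zip/node_modules/lodash"]])],
]);

console.log(nodeModulesCandidates("/app/src/utils", "lodash").length); // 4 places to probe
console.log(pnpResolve(pnpTable, "my-app", "lodash"));                 // one direct lookup
```

Note that the table lookup also fails loudly when a package imports something it never declared, which is how PnP catches undeclared dependencies.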
Yarn simplifies parallel task execution through the background job support in its built-in shell. By using the background job syntax (&), commands in the scripts field can run simultaneously, and each task's output is labeled for easy identification. For instance, running linting and tests in parallel is as straightforward as:
yarn lint & yarn test
Advantages of Yarn
Faster and more efficient dependency installation: This is made possible by the "caching" mechanism. Additionally, it uses a parallel dependency resolution algorithm, which can further improve performance.
Better support for workspaces and monorepos: Yarn workspaces let you manage several packages in one repository. Dependencies for all of them are installed and locked in one place, which makes it easier to share code and keep versions in sync across a large monorepo.
Robust error handling: Yarn provides detailed and helpful error messages. This can make it easier to diagnose and fix dependency installation problems.
Large and active community: This implies that there are many resources available to help users with their problems.
Disadvantages of Yarn
Larger lockfile: In big projects, Yarn's lockfile can grow very large. This can make it slower to read and write, and it can also make changes to it harder to review.
Less mature than npm: Yarn is a newer package manager than npm and has a smaller user base. This means that some plugins and tools target npm first and support Yarn later, if at all.
More complex to configure: Yarn can be more complex to configure than other package managers. It has lots of extra features that come with their own settings, like Plug'n'Play and handling many projects at once.
Project structures for which Yarn is most suitable
Large and complex projects: Yarn can handle large and complex projects with many dependencies very well thanks to its caching and parallel dependency resolution algorithms.
Workspaces and monorepos: Yarn is a good choice for projects that use workspaces or monorepos. This is because Yarn can manage dependencies for all of the projects in a workspace or monorepo in a single place.
Projects that need robust error handling: Yarn provides detailed and helpful error messages. This can make it easier to diagnose and fix dependency installation problems.
Overall, Yarn is a fast, efficient, and scalable package manager that is well-suited for large and complex projects, workspaces, and monorepos, and projects that need robust error handling.
pnpm is special because it keeps a single copy of each package version on disk and shares it across all the projects that use it. This smart way of storing packages saves space and ensures consistency. It's really helpful for large-scale applications with numerous dependencies.
Use of Content Addressing in pnpm
Content addressing, a technique employed by pnpm, is a smart way of managing files based on their content, not their location on a disk. Instead of identifying files by where they're stored, content addressing identifies them by a hash of what's inside them. For pnpm, this means two files with identical content are stored only once, even if they belong to different packages or different versions of the same package. This approach allows pnpm to efficiently handle numerous dependencies in large projects.
When you install a package using pnpm, pnpm will first check the global store to see if the package is already installed. The global store is a centralized location where all of the packages that you have installed are stored. If the package is not already installed, pnpm will download it from the npm registry and install it in the global store.
Next, pnpm will link the package into the node_modules directory of your project: the package's files are hard-linked from the store, and symlinks wire up each package's dependencies. Because nothing is copied, this step is fast and takes almost no extra disk space.
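A content-addressable store with linking can be modeled in a few lines. This is an illustrative sketch, not pnpm's implementation: ContentStore, installInto, and LinkedProject are hypothetical names, the "links" are just map entries, and a tiny FNV-1a hash is used only to keep the example dependency-free (a real store would use a cryptographic digest).

```typescript
// Tiny non-cryptographic hash (FNV-1a), used here only to avoid dependencies.
function fnv1a(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

class ContentStore {
  files = new Map<string, string>(); // content address -> file content

  // Stores the content if it is new and returns its address either way.
  add(content: string): string {
    const address = fnv1a(content);
    if (!this.files.has(address)) this.files.set(address, content);
    return address;
  }
}

// A project's node_modules is modeled as file path -> address "links".
type LinkedProject = Map<string, string>;

function installInto(
  project: LinkedProject,
  store: ContentStore,
  pkgFiles: Record<string, string>,
): void {
  for (const [path, content] of Object.entries(pkgFiles)) {
    project.set(path, store.add(content));
  }
}

const store = new ContentStore();
const projectA: LinkedProject = new Map();
const projectB: LinkedProject = new Map();
const projectC: LinkedProject = new Map();
const pkgV1 = { "index.js": "module.exports = 1;", "README.md": "docs" };
const pkgV2 = { "index.js": "module.exports = 2;", "README.md": "docs" };

installInto(projectA, store, pkgV1);
installInto(projectB, store, pkgV1); // same version: nothing new is stored
installInto(projectC, store, pkgV2); // new version: only the changed file is added

const linkedPaths = projectA.size + projectB.size + projectC.size;
console.log(store.files.size < linkedPaths); // true: fewer stored files than links
```

Projects A and C end up pointing at the same stored README even though they use different versions of the package, which is where the disk savings come from.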
pnpm also uses a lock file to track the versions of the packages that are installed in your project. The lock file is created when you run the pnpm install command. The lock file ensures that your project always uses the same versions of the packages that were installed when you created the lock file.
Thus, the process followed by pnpm for installing dependencies can be summarized and visualized as follows:
Dependency resolution -> Directory structure calculation -> Linking dependencies.
This approach is significantly faster than the traditional three-stage installation process of resolving, fetching, and writing all dependencies to node_modules.
Creating a hierarchical node_modules directory
When installing dependencies with npm or Yarn Classic, all packages are hoisted to the root of the node_modules directory. As a result, the source code has access to dependencies that are not declared in the project's package.json.
By default, pnpm uses symlinks to add only the direct dependencies of the project into the root of the node_modules directory. Transitive dependencies are kept under node_modules/.pnpm, so the project's code cannot import them by accident.
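The effect of hoisting can be shown with a small model. The sketch below is an assumption-laden simplification: Manifest, flatRoot, and strictRoot are hypothetical names, and each function only computes which package names would be visible at the root of node_modules under each layout.

```typescript
type Manifest = { name: string; dependencies: string[] };

// Flat (hoisted) layout: direct and transitive dependencies all land at the root.
function flatRoot(app: Manifest, registry: Map<string, Manifest>): Set<string> {
  const root = new Set<string>();
  const visit = (names: string[]): void => {
    for (const name of names) {
      if (root.has(name)) continue; // already hoisted
      root.add(name);
      visit(registry.get(name)?.dependencies ?? []);
    }
  };
  visit(app.dependencies);
  return root;
}

// Strict layout: only what the app itself declares is visible at the root.
function strictRoot(app: Manifest): Set<string> {
  return new Set(app.dependencies);
}

const layoutRegistry = new Map<string, Manifest>([
  ["express", { name: "express", dependencies: ["debug"] }],
  ["debug", { name: "debug", dependencies: [] }],
]);
const myApp: Manifest = { name: "my-app", dependencies: ["express"] };

const hoisted = flatRoot(myApp, layoutRegistry);
const strict = strictRoot(myApp);
console.log(hoisted.has("debug")); // true: my-app can import debug without declaring it
console.log(strict.has("debug"));  // false: the undeclared import fails
```

In the flat layout the app can start relying on debug by accident and then break when express drops or upgrades it; the strict layout surfaces that mistake immediately.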
Advantages of pnpm
Faster dependency resolution: pnpm is significantly faster than npm for installing and updating packages, especially for large projects.
Less disk space usage: pnpm uses a content-addressable file system to store packages, which can help to reduce disk space usage.
Improved security: pnpm uses a lock file to pin the exact versions of the packages that are installed in your project, which helps prevent unexpected, and possibly compromised, versions from being pulled in.
Strict dependency management: By default, pnpm creates a non-flat node_modules structure. Unlike flat structures, this non-flat approach ensures that the project's code cannot access arbitrary packages. This strictness enhances the project's stability and security by preventing unintended access to dependencies, fostering a controlled development environment.
Disadvantages of pnpm
Not as widely used: pnpm is not as widely used as npm, which means that there is less documentation and support available.
More complex to use: pnpm can be more complex to use than npm, especially for new developers.
Ideal Project Structure that benefits from pnpm
Large and complex projects: pnpm is particularly well-suited for large and complex projects. This is because pnpm stores every file once in its content-addressable store and links it into each project instead of copying it. This means that pnpm can install dependencies faster and with less disk usage than package managers that copy files, especially for large projects with many dependencies.
Projects with multiple dependencies: pnpm is also a good choice for projects with multiple dependencies. This is because pnpm can resolve dependencies in parallel, which can significantly reduce the time it takes to install all of the dependencies for a project.
Projects with monorepos: pnpm is also a good choice for projects with monorepos. Monorepos are repositories that contain multiple independent projects. pnpm can manage dependencies for all of the projects in a monorepo in a single workspace. This can simplify dependency management and make it easier to share dependencies between projects.
Overall, pnpm is a fast and lightweight package manager that is a good choice for projects with a large number of dependencies.
Comparison of pnpm, Yarn and npm
A New Challenger Approaches
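Node.js has a large community and a wide range of libraries and tools available. However, it has also been criticized for its performance, its limited built-in features, and its aging API. Bun, a newer JavaScript runtime with a built-in package manager, tries to address these points: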
Better performance: Bun is generally faster than Node.js, especially when it comes to startup times and memory usage.
More features: Bun supports a wider range of features out of the box than Node.js, including built-in support for TypeScript and JSX and support for both ESM and CommonJS module systems.
A more modern API: Bun's API is designed to be more modern and user-friendly than Node.js's API.
Comparing Bun with npm, Yarn, and pnpm
A few important points to consider while comparing Bun’s package manager with its contemporaries.
Bun's package manager takes a different approach to installation than npm, Yarn, and pnpm. It is designed to be fast and efficient, relying on a global cache and a lockfile to avoid repeating work.
At a high level, Bun's install process works as follows:
Bun starts by creating a dependency graph for the project. The dependency graph shows how the different packages in the project depend on each other.
Bun then resolves the dependencies in the dependency graph. Bun starts with the root package in the dependency graph and then recursively resolves the dependencies of that package.
While resolving, Bun prefers to reuse work that has already been done, such as:
Versions that are already pinned in the project's lockfile.
Packages that are already present in its global cache, so that they do not need to be downloaded again.
Bun installs the resolved dependency tree into node_modules and updates the lockfile.
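The general idea of walking a dependency graph while preferring what is already known can be sketched as follows. To be clear, this is a generic illustration and not Bun's actual algorithm: resolveTree, the Registry and Pins types, and the "take the last listed version" fallback are all assumptions made for the example, and version ranges are ignored entirely.

```typescript
// Hypothetical shapes for illustration only.
type Registry = Map<string, Map<string, string[]>>; // name -> version -> dependency names
type Pins = Map<string, string>;                    // name -> chosen version

function resolveTree(roots: string[], registry: Registry, lockfile: Pins): Pins {
  const resolved: Pins = new Map();
  const visit = (name: string): void => {
    if (resolved.has(name)) return; // already resolved; also guards against cycles
    const versions = registry.get(name);
    if (versions === undefined || versions.size === 0) {
      throw new Error(`unknown package: ${name}`);
    }
    // Prefer the version pinned in the lockfile; otherwise take the last listed one.
    const pinned = lockfile.get(name);
    const listed = Array.from(versions.keys());
    const version =
      pinned !== undefined && versions.has(pinned) ? pinned : listed[listed.length - 1];
    resolved.set(name, version);
    // Recurse into the dependencies of the chosen version.
    for (const dep of versions.get(version) ?? []) visit(dep);
  };
  for (const root of roots) visit(root);
  return resolved;
}

const demoRegistry: Registry = new Map([
  ["web-framework", new Map<string, string[]>([["1.0.0", ["logger"]], ["2.0.0", ["logger"]]])],
  ["logger", new Map<string, string[]>([["1.0.0", []], ["1.1.0", []]])],
]);
const demoLockfile: Pins = new Map([["logger", "1.0.0"]]);

const demoResolved = resolveTree(["web-framework"], demoRegistry, demoLockfile);
console.log(demoResolved.get("web-framework")); // "2.0.0": not pinned, last listed wins
console.log(demoResolved.get("logger"));        // "1.0.0": the lockfile pin is kept
```

Keeping pinned versions is what makes repeat installs both fast and reproducible: nothing new has to be looked up for packages the lockfile already covers.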
Bun's package manager aims for a number of advantages over other package managers, such as:
It is faster, especially on repeat installs where packages are already in the global cache.
It is more efficient, being built to use less memory and CPU time.
It aims to be more robust, handling errors and conflicts more gracefully.
Because Bun is also a runtime, it offers a number of features that a package manager alone does not, such as:
Built-in support for TypeScript and JSX modules
Support for both ESM and CommonJS module systems
A more modern and user-friendly API