Caching and installing node dependencies in pipeline

Hello :wave:

I’m using Bitbucket Pipelines for some projects and I’m not quite sure what the best approach is for caching and installing node dependencies.

npm ci is the official command for installing dependencies in automated environments. It offers several advantages over a regular npm install in CI environments:

  • throws an error if package.json and package-lock.json don’t match
  • installs faster, as it skips several user-oriented features
  • removes node_modules if present, to make sure the installation is fresh
  • never writes to package.json

I’ve been using npm ci in my pipeline steps for a while (working fine so far), but I wanted to speed up my builds, so I went through the installation steps again. I realized I hadn’t enabled dependency caching, and that could definitely make the setup faster. The caching mechanism uploads the dependencies (the node_modules folder for node) after the first successful build and then downloads them in the following builds to save time reinstalling.

Here is where I see a conflict between npm ci and pipeline caching: the pipeline downloads the saved dependencies first thing, but then npm ci removes them, since it wants to guarantee a fresh installation.
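To make the conflict concrete, here is a sketch of that combination (assuming Bitbucket’s predefined node cache, which saves and restores node_modules):

```yaml
# Sketch of the conflicting setup, not my exact file.
pipelines:
  default:
    - step:
        caches:
          - node        # predefined cache: saves/restores node_modules
        script:
          - npm ci      # deletes the restored node_modules before installing
```

So the time spent restoring the cache is wasted work in this combination.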

As caching dependencies + npm ci doesn’t seem like a good combination to me, I replaced npm ci with npm install, as it:

  • won’t delete an already existing node_modules
  • installs exactly the same versions as package-lock.json (this was drastically improved in recent minor releases of the npm CLI)
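For reference, the resulting step is roughly this (a sketch, not my exact file; it uses Bitbucket’s predefined node cache for node_modules):

```yaml
pipelines:
  default:
    - step:
        caches:
          - node         # predefined cache: saves/restores node_modules
        script:
          - npm install  # reuses the restored node_modules instead of wiping it
```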

The setup is a little bit faster now and makes sense to me. Summing up, the speeds I got for the different combinations were:

  • npm ci only → ~30sec
  • npm ci + pipeline caching dependencies → ~60sec
  • npm install + pipeline caching dependencies → ~20sec

My only concern is that npm install won’t throw an error if package.json and package-lock.json differ. I don’t think that can happen in my current setup, but it’s still something to keep in mind, I guess.

Any thoughts about this? Something I’m missing? Is npm install + pipeline caching the proper approach?

Thanks!


Hi Belco

Thanks for sharing your observations.
I find this whole topic of “npm + efficient caching” pretty confusing as well.

After stumbling upon this article about running Cypress in a Bitbucket pipeline, a few of my assumptions were more or less confirmed: Setting up Bitbucket Pipelines with proper caching of NPM and Cypress

Let me highlight the relevant parts:

```yaml
pipelines:
  default:
    - step:
        name: Install, build and tests
        caches:
          - npm
        script:
          - npm ci
definitions:
  caches:
    npm: $HOME/.npm
```

The trick here is not to rely on the default node cache, which only deals with node_modules, but to create a custom one, in this case called npm, which keeps the contents of $HOME/.npm.

Therefore, when npm ci is executed, it does remove the local node_modules folder as stated in the docs, but I believe it still relies on a more global, immutable cache of packages, some kind of direct mini-mirror of the remote registry, which lives in the aforementioned $HOME/.npm directory.
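Assuming that is right, you could even nudge npm towards that local mirror with the --prefer-offline flag (a real npm option; whether it noticeably speeds up this particular setup is my assumption):

```yaml
pipelines:
  default:
    - step:
        caches:
          - npm                       # custom cache of $HOME/.npm
        script:
          - npm ci --prefer-offline   # use cached tarballs when available
definitions:
  caches:
    npm: $HOME/.npm
```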

How does the cache invalidation work? If package-lock.json changes should it invalidate the cache?