Batch-Download Image Data From Pinterest

Pinterest is a popular image-based social platform. Its ‘Collection’ feature groups multiple images into an album, which makes it much easier to find images on the same topic.

In some scenarios, we may want to download every image in a collection. For example, we may need the images in a ‘Flower’ collection to train a neural network. However, Pinterest has no such feature, and to make things worse, most functions on the website require a user login, which makes writing a crawler from scratch more difficult.

Here I demonstrate how to use gallery-dl, a command-line tool, to solve this task. Generally speaking, it has the following advantages:

  • Open source, so the code can be inspected and there is less need to worry about malware
  • Written in Python and easily installed with a package manager such as pip (pip install gallery-dl)
  • Very lightweight: it takes less than 20 MB of RAM while downloading, so it can be deployed on devices such as a NAS or a Raspberry Pi
  • Robust and highly configurable: it handles large image batches well and supports many websites besides Pinterest

The usage is pretty simple. The most basic syntax is:

gallery-dl URL

For Pinterest, we may want to add the following options before the URL:

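--proxy http://host:port              # route traffic through a proxy to bypass a possible network block
-d DESTINATION_PATH                   # directory where the images are saved
--cookies PATH_TO_COOKIE_TEXT_FILE    # cookies exported from a logged-in browser session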
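
To show how these options fit together in one command, here is a minimal Python sketch that assembles the argument list and can hand it to gallery-dl. The function names, the placeholder board URL, the proxy address and the file paths are all illustrative assumptions, not part of gallery-dl itself; only the gallery-dl, --proxy, -d and --cookies tokens come from the text above.

```python
import shlex
import subprocess


def build_gallery_dl_command(url, dest=None, cookies=None, proxy=None):
    """Return the gallery-dl argument list, with the options placed before the URL."""
    cmd = ["gallery-dl"]
    if proxy:
        cmd += ["--proxy", proxy]      # e.g. http://host:port
    if dest:
        cmd += ["-d", dest]            # directory where images are saved
    if cookies:
        cmd += ["--cookies", cookies]  # exported browser cookies
    cmd.append(url)
    return cmd


def download(url, **options):
    """Run gallery-dl; this assumes the gallery-dl executable is on PATH."""
    subprocess.run(build_gallery_dl_command(url, **options), check=True)


# Placeholder values: replace them with your own URL, paths and proxy.
example = build_gallery_dl_command(
    "https://www.pinterest.com/USER/BOARD/",
    dest="./flowers",
    cookies="cookies.txt",
    proxy="http://127.0.0.1:8080",
)
print(shlex.join(example))  # the equivalent shell command line
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces. Calling download(...) with the same arguments would start the actual download.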

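Since Pinterest has a relatively strict auth system, gallery-dl needs the cookies from a logged-in Pinterest browser session to make downloading possible. Just use an extension such as EditThisCookie to export the cookies, save them in a plain text file, and you’re all set. Note that gallery-dl expects the file in the Netscape/Mozilla cookies.txt format, so choose that export format if the extension offers several.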
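
A wrongly formatted cookie file is an easy mistake to make, so it can help to check the export before starting a large download. The sketch below uses Python's standard http.cookiejar module to confirm that a file parses as a Netscape/Mozilla cookies.txt file and to count the cookies for a given domain. The function name and the default domain are assumptions made for illustration, and passing this check does not guarantee that gallery-dl or Pinterest will accept the session.

```python
from http.cookiejar import LoadError, MozillaCookieJar


def count_site_cookies(cookie_file, domain="pinterest.com"):
    """Parse a Netscape-format cookie file and count cookies for `domain`.

    Raises http.cookiejar.LoadError if the file is not in Netscape format
    (for example, if it was exported as JSON).
    """
    jar = MozillaCookieJar(cookie_file)
    # Keep session and expired cookies too, so we only test the file format.
    jar.load(ignore_discard=True, ignore_expires=True)
    return sum(1 for cookie in jar if cookie.domain.endswith(domain))


def describe_cookie_file(cookie_file, domain="pinterest.com"):
    """Return a short human-readable verdict about the cookie file."""
    try:
        found = count_site_cookies(cookie_file, domain)
    except LoadError:
        return "not a Netscape-format cookie file"
    if found == 0:
        return f"valid format, but no cookies for {domain}"
    return f"valid format, {found} cookie(s) for {domain}"
```

For example, print(describe_cookie_file("cookies.txt")) reports whether the exported file is usable before it is passed to --cookies.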

That’s it. For more advanced configurations of gallery-dl, please refer to the project’s GitHub docs.

Written on August 14, 2022