Web scraping for dummies

16 Sep 2023 | dev, javascript

So, you want to scrape a website for some data. Let me tell you how I do it.


Premise

Today I’ll show you how I scrape websites and turn the data into useful databases. I know Python and JavaScript and pick between them depending on the project. In this post, though, we’ll set up a Python environment, partly because Cheerio, the library I use for DOM traversal in JavaScript, does not have a decent documentation site for beginners.

I’m a Bleach fan. Hence, we’ll scrape data from the fandom website about the protagonist of Bleach, Ichigo Kurosaki, and store it as a JSON file.

Step 1: Set up an environment

Start by creating a folder wherever you like on your local system:

mkdir kurosaki

Initialize a virtual environment in the folder. We need one for two reasons: first, it gives us an isolated environment, much like the one a server would run; second, it keeps this project’s dependencies from clashing with other packages on your local system.

python -m venv kurosaki
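If you are curious what that command leaves on disk, here is a minimal sketch that builds an environment with Python’s standard-library venv module and lists the scripts folder. The helper name create_env and the temporary directory are my own choices for illustration, and with_pip=False is only there to keep the sketch quick (the command-line version installs pip by default).

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment in env_dir and return its scripts folder.

    create_env is a hypothetical helper for this post, not part of venv.
    """
    # with_pip=False skips the pip bootstrap so the sketch stays fast;
    # `python -m venv` installs pip unless told otherwise.
    venv.create(env_dir, with_pip=False)
    # The activation scripts live in "Scripts" on Windows and "bin" elsewhere.
    scripts = "Scripts" if sys.platform == "win32" else "bin"
    return env_dir / scripts


if __name__ == "__main__":
    # Build the environment in a throwaway directory and show what was created.
    with tempfile.TemporaryDirectory() as tmp:
        scripts_dir = create_env(Path(tmp) / "kurosaki")
        print(sorted(p.name for p in scripts_dir.iterdir()))
```

The listing should include the activate script that the next step relies on.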

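Activate the virtual environment every time you start coding. (venv fills the folder you pass it, kurosaki in the command above, with a bin folder that holds the activation scripts; on Windows that folder is named Scripts. Replace <venv> below with the path to your environment folder. Once the environment is active, running deactivate switches it off again.)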

source <venv>/bin/activate
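To confirm the activation worked, you can ask Python itself. This is a small sketch, assuming Python 3.3 or later; the function name in_virtualenv is mine. It compares sys.prefix with sys.base_prefix, which differ whenever the interpreter belongs to a virtual environment (that includes calling the environment’s python directly without activating it).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment in use:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Save it as a file of your choice and run it with python after activating; the prefix it prints should be the path of your environment folder.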