All posts by Tuna Peyo

Spider concepts

Mar 20, 2022

Let’s now briefly discuss how our spider will actually work. The spider will grab a link that needs to be …

Parse HTML

Mar 20, 2022

In this section we will start writing code for the actual link collection. We will create a new file called …

Create sets

Mar 20, 2022

A set in Python is an unordered collection of unique, hashable (in practice, immutable) objects. Unlike lists …
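
As a minimal sketch of why a set suits link collection, the snippet below shows duplicates collapsing into one entry and a mutable object being rejected. The URLs and variable names are made up for illustration and are not taken from the post.

```python
# Illustrative only: these URLs are placeholders, not from the original post.
links = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/",  # duplicate of the first entry
]

unique_links = set(links)  # duplicates collapse into a single entry

print(len(links))         # 3
print(len(unique_links))  # 2
print("https://example.com/about" in unique_links)  # True

# Set members must be hashable, so a mutable list cannot be added.
try:
    unique_links.add(["https://example.com/contact"])
    rejected = False
except TypeError:
    rejected = True
    print("lists cannot be set members")
```

Membership tests on a set do not require scanning every element, which is one reason a crawler might prefer it over a list for tracking visited links.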

Add and delete URLs

Mar 20, 2022

Once we create the queue and crawled files, we need to add data to them. We can do this by using …

Create queue and crawled files

Mar 20, 2022

We will create two files inside the project folder for each website we crawl: queue file – this file will …
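
One possible way to set up the two files is sketched below. The function name `create_data_files` and the file names `queue.txt` and `crawled.txt` are assumptions made for this example; the post's actual names and file contents may differ. The sketch only creates empty files and leaves existing ones untouched.

```python
import os
import tempfile


def create_data_files(project_dir):
    """Create empty queue and crawled files in project_dir if they are missing."""
    paths = []
    for name in ("queue.txt", "crawled.txt"):  # hypothetical file names
        path = os.path.join(project_dir, name)
        if not os.path.isfile(path):
            # Opening in write mode creates an empty file.
            with open(path, "w", encoding="utf-8"):
                pass
        paths.append(path)
    return paths


# Usage: a temporary directory stands in for the project folder.
project_dir = tempfile.mkdtemp()
queue_path, crawled_path = create_data_files(project_dir)
print(os.path.basename(queue_path), os.path.basename(crawled_path))
```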

Create a new project

Mar 20, 2022

We will start by creating a function that creates a folder for each new website. The function will be …
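
A folder-creating function along these lines could look roughly like the sketch below. The name `create_project_dir` and the folder name `example_site` are hypothetical, chosen only for this example, and the return value is an addition for demonstration rather than something the post describes.

```python
import os
import tempfile


def create_project_dir(directory):
    """Create the folder for a website if it does not exist yet.

    Returns True if the folder was created, False if it was already there.
    """
    if not os.path.exists(directory):
        os.makedirs(directory)
        return True
    return False


# Usage: build the project folder inside a temporary directory.
base = tempfile.mkdtemp()
project = os.path.join(base, "example_site")  # hypothetical project name
created_first = create_project_dir(project)   # folder is created
created_again = create_project_dir(project)   # folder already exists
print(created_first, created_again)
```

Checking for the folder first means the function can be called on every run without raising an error when the project already exists.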

Geek University 2022