Create workers 1

In this section we will write the code that enables multiple spiders to run at the same time and crawl pages simultaneously. Add the following code at the end of the main.py file:

def create_jobs():  # this function is called as long as there are links that need to be crawled
    for link in file_to_set(QUEUE_FILE):
        queue.put(link)  # stores the link in the thread queue
    queue.join()  # blocks until every queued link has been processed
    crawl()  # calls the crawl() function to check the updated queue


def crawl():  # checks whether there are items in the queue and, if so, crawls them
    queued_links = file_to_set(QUEUE_FILE)  # converts the queue file to a set of links
    if len(queued_links) > 0:  # checks if there are links that still need to be crawled
        print(str(len(queued_links)) + ' links in the queue')  # prints an info message
        create_jobs()
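
To see how these two functions fit into the bigger picture, here is a minimal sketch of the worker side that consumes the thread queue. The names NUMBER_OF_THREADS, create_workers(), work() and crawl_page() are assumptions introduced here for illustration, not the book's exact code; queue is the same thread queue that create_jobs() fills:

import threading
from queue import Queue

NUMBER_OF_THREADS = 8  # hypothetical setting; pick a value that suits your machine
queue = Queue()  # the shared thread queue that create_jobs() above fills


def crawl_page(url):  # hypothetical stand-in for the real spider logic
    print('Crawling ' + url)


def create_workers():  # spawns daemon threads that wait for links in the queue
    for _ in range(NUMBER_OF_THREADS):
        t = threading.Thread(target=work)
        t.daemon = True  # daemon threads exit when the main program exits
        t.start()


def work():  # each worker takes a link, crawls it, and marks the job done
    while True:
        url = queue.get()  # blocks until a link is available
        crawl_page(url)
        queue.task_done()  # lets queue.join() in create_jobs() return once all links are processed

Note that crawl() and create_jobs() call each other: crawl() re-queues whatever is left in the queue file, and the cycle stops once that file is empty.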