Create workers 1

In this section we will write the code that lets multiple spiders run at the same time and crawl pages concurrently. Add the following code to the end of the main.py file:

 def create_jobs(): # called as long as there are links that need to be crawled
     for link in file_to_set(QUEUE_FILE):
         queue.put(link) # stores the link in the thread queue
     queue.join() # blocks until every link put on the queue has been marked as done
     crawl() # calls crawl() again to pick up the updated queue file


 def crawl(): # checks whether there are items in the queue file and, if so, crawls them
     queued_links = file_to_set(QUEUE_FILE) # reads the queue file into a set
     if len(queued_links) > 0: # checks if there are links that still need to be crawled
         print(str(len(queued_links)) + ' links in the queue') # prints an info message
         create_jobs()
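
To see how create_jobs() and crawl() hand work back and forth, here is a minimal self-contained sketch that can be run on its own. It is not the crawler's actual code: the in-memory FAKE_SITE dictionary, the pending and crawled sets, the jobs queue, the lock, and the work() and create_workers() helpers are all assumptions made for illustration. The pending set stands in for the contents of QUEUE_FILE, and work() stands in for a spider crawling one page.

```python
import queue
import threading

NUMBER_OF_THREADS = 4  # assumed number of worker threads

# Hypothetical site map: each "page" lists the links found on it.
FAKE_SITE = {
    'a': {'b', 'c'},
    'b': {'c', 'd'},
    'c': set(),
    'd': {'a'},
}

pending = {'a'}          # stands in for the links stored in QUEUE_FILE
crawled = set()          # links that have already been processed
lock = threading.Lock()  # protects pending and crawled
jobs = queue.Queue()     # thread-safe job queue shared by the workers


def work():
    # Each worker takes one link at a time and "crawls" it.
    while True:
        link = jobs.get()
        with lock:
            pending.discard(link)
            crawled.add(link)
            # Queue up newly found links that have not been crawled yet.
            pending.update(FAKE_SITE.get(link, set()) - crawled)
        jobs.task_done()  # lets jobs.join() know this link is finished


def create_workers():
    # Daemon threads exit automatically when the main program ends.
    for _ in range(NUMBER_OF_THREADS):
        threading.Thread(target=work, daemon=True).start()


def create_jobs():
    with lock:
        batch = list(pending)  # snapshot of the links waiting to be crawled
    for link in batch:
        jobs.put(link)
    jobs.join()  # wait until the workers have finished the whole batch
    crawl()      # check whether the batch produced new links


def crawl():
    with lock:
        remaining = len(pending)
    if remaining > 0:
        print(str(remaining) + ' links in the queue')
        create_jobs()


create_workers()
crawl()
print(sorted(crawled))
```

Each round, create_jobs() queues the current batch, waits for the workers to finish it, and then calls crawl() to check whether that batch uncovered new links. Because the two functions call each other, every round adds two frames to the call stack, so a crawl that needs a very large number of rounds could run into Python's recursion limit.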
Geek University 2021