Give the crawler information

We will now write the code that gives the spider some basic information, such as the project name and the base URL. The user provides only the project name and the base URL; the constructor also takes the domain name as a third argument. Here is the code that needs to be added to the spider.py file, inside the Spider class:

    def __init__(self, project_name, base_url, domain_name):
        # Store the settings as class variables so that all spiders share the same information
        Spider.project_name = project_name
        Spider.base_url = base_url
        Spider.domain_name = domain_name
        # Paths of the queue and crawled files inside the project directory
        Spider.queue_file = Spider.project_name + '/queue.txt'
        Spider.crawled_file = Spider.project_name + '/crawled.txt'
        # Create the project directory and the data files
        self.boot()
        # Start crawling at the base URL and print a message to the user
        self.crawl_page('First spider', Spider.base_url)
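
To see why the constructor assigns to Spider.project_name rather than self.project_name, here is a minimal, self-contained sketch. It is not the finished spider: the bodies of boot and crawl_page are placeholders that only print a message, and the project name, URL, and domain passed in at the bottom are made-up example values.

```python
class Spider:
    # Class variables: one copy, shared by every Spider instance
    project_name = ''
    base_url = ''
    domain_name = ''
    queue_file = ''
    crawled_file = ''

    def __init__(self, project_name, base_url, domain_name):
        Spider.project_name = project_name
        Spider.base_url = base_url
        Spider.domain_name = domain_name
        Spider.queue_file = Spider.project_name + '/queue.txt'
        Spider.crawled_file = Spider.project_name + '/crawled.txt'
        self.boot()
        self.crawl_page('First spider', Spider.base_url)

    def boot(self):
        # Placeholder (assumption): the real method creates the
        # project directory and the data files
        print('Setting up project ' + Spider.project_name)

    def crawl_page(self, thread_name, page_url):
        # Placeholder (assumption): the real method crawls the page
        print(thread_name + ' now crawling ' + page_url)


# Hypothetical example values
spider = Spider('example_project', 'https://example.com/', 'example.com')

# The paths are readable from the class and from any instance
print(Spider.queue_file)    # example_project/queue.txt
print(spider.crawled_file)  # example_project/crawled.txt
```

Because the values live on the class, any spider created later reads the same project name and file paths without having to be told them again.
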
Geek University 2022