Downloading PDFs from a table

Hi everyone,

I’m trying to extract and download PDFs from a table. Here is the table, which has 11 pages.

Specifically, I’d like to find the rows with Type: WI3, then click the icon next to the trash icon.

When I click the icon, it takes me to this window:

There I click download pdf, and a window opens with the PDF. I’d then like to download it and search for the next row with Type: WI3 in the table, until all 11 pages are done.

What would be the best way to approach this?

Hi @Mpumi_Setshedi

In that case, I suggest you use the DOM to access these page elements. BotCity supports this approach; see the documentation.

I reproduced the walkthrough you described locally, using the same website that appears in the screenshot above.

And I solved it this way:

# self.find_element("page-item", By.CLASS_NAME) is the next-page button
while self.find_element("page-item", By.CLASS_NAME) is not None:
    # Find the table element on the page
    table = self.find_element("table", By.CLASS_NAME)
    # Find all "tr" elements inside the table, skipping the header row
    lines = table.find_elements_by_tag_name('tr')[1:]
    # tr is the XPath index of the current table row
    # Starts at 2 because the first row is the header
    tr = 2
    for line in lines:
        # Find the cell that contains the Type of the current row
        type = self.find_element(f"/html/body/div/div[2]/div/table/tbody/tr[{tr}]/td[4]", By.XPATH)
        # Find the cell that contains the WIID of the current row
        WIID = self.find_element(f"/html/body/div/div[2]/div/table/tbody/tr[{tr}]/td[2]", By.XPATH)
        tr = tr + 1
        if type.text == "WI3":
            # Go to the download page, passing the WIID in the URL
            # Click on download
            self.find_element("/html/body/div/div[2]/div/div[2]/div/div/div[1]/p/a/button", By.XPATH).click()
            # Go back to the table page with self.back()
    # Click the next-page button
    self.find_element("page-item", By.CLASS_NAME).click()


self.find_element("page-item", By.CLASS_NAME) is the HTML element for the button that moves to the next page, as shown in the image below. While this button exists on the screen, the loop keeps running.


tr starts at 2 because each tr is a row of the table and the first row is the header.
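
To make the row indexing concrete, here is a minimal sketch of how the cell XPaths can be built for a given row. The helper name cell_xpath is hypothetical; the base path is the one used in the code above, and it only works as long as the page layout stays the same.

```python
# Base XPath of the table body, copied from the snippet above
TABLE_BODY = "/html/body/div/div[2]/div/table/tbody"


def cell_xpath(row: int, column: int) -> str:
    """Build the XPath of one table cell (hypothetical helper).

    XPath positions are 1-based. Row 1 is the header on this page,
    so the first data row is row 2.
    """
    return f"{TABLE_BODY}/tr[{row}]/td[{column}]"


# Column 4 holds the Type and column 2 holds the WIID
first_type_cell = cell_xpath(2, 4)
first_wiid_cell = cell_xpath(2, 2)
print(first_type_cell)
print(first_wiid_cell)
```

Each string can be passed to self.find_element(..., By.XPATH) in the same way as in the loop.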

type is the element found by the XPath ending in td[4]; it contains the Type of the current row.

WIID is the element found by the XPath ending in td[2]; it contains the WIID of the current row.

Finally, I check whether the text of the type element is the Type I want to download. If it is, I navigate to the download page, passing the WIID in the URL, and click download.
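
The check can be sketched on plain data as follows. The URL pattern and the names DOWNLOAD_URL, Row, and download_urls are assumptions for illustration only; the real address of the download page is not shown in this thread and has to be taken from the site.

```python
from typing import List, NamedTuple

# Hypothetical URL pattern; replace it with the real download page address
DOWNLOAD_URL = "https://example.com/download/{wiid}"


class Row(NamedTuple):
    wiid: str      # text of the td[2] cell
    row_type: str  # text of the td[4] cell


def download_urls(rows: List[Row], wanted_type: str = "WI3") -> List[str]:
    """Return one download-page URL per row whose Type matches."""
    return [
        DOWNLOAD_URL.format(wiid=row.wiid)
        for row in rows
        # strip() guards against stray whitespace around the cell text
        if row.row_type.strip() == wanted_type
    ]


rows = [Row("1001", "WI1"), Row("1002", "WI3"), Row("1003", "WI3 ")]
for url in download_urls(rows):
    print(url)
```

In the bot, each URL would then be opened (for example with self.browse(url)) before clicking the download button.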

I use self.back() to return to the page with the table. When the inner loop over the rows finishes, I click the next-page button and repeat the process until the last active page.
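
The overall control flow, separated from the browser calls, might look like the sketch below. FakePager and visit_all_pages are hypothetical names standing in for the real page. The sketch assumes the next-page button stops being available on the last page, and it reads the current page before checking for a next one, so the last page is processed too.

```python
from typing import Callable, List


def visit_all_pages(
    read_rows: Callable[[], List[str]],
    has_next_page: Callable[[], bool],
    go_to_next_page: Callable[[], None],
) -> List[str]:
    """Collect rows from every page, advancing until no next page is left."""
    collected: List[str] = []
    while True:
        collected.extend(read_rows())
        if not has_next_page():
            break
        go_to_next_page()
    return collected


class FakePager:
    """In-memory stand-in for the paginated table (illustration only)."""

    def __init__(self, pages: List[List[str]]) -> None:
        self.pages = pages
        self.index = 0

    def read_rows(self) -> List[str]:
        return self.pages[self.index]

    def has_next_page(self) -> bool:
        return self.index < len(self.pages) - 1

    def go_to_next_page(self) -> None:
        self.index += 1


pager = FakePager([["a", "b"], ["c"], ["d", "e"]])
all_rows = visit_all_pages(pager.read_rows, pager.has_next_page, pager.go_to_next_page)
print(all_rows)
```

With a real bot, the three callables would wrap the table lookup, the check for the page-item element, and the click on it.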

Hi @livia-macon

Thank you for the response. I haven’t had access to a computer since I posted this, but I do now. I will explore the information you have shared and come back with any feedback.