Chrome Options and Driver
A ChromeOptions object is instantiated as follows:
from selenium import webdriver

options = webdriver.ChromeOptions()
The available methods of a ChromeOptions object can be found in the Selenium documentation. Common usages:
- For command-line arguments, we use .add_argument; for a list of arguments we refer to here. For example, if we want to maximize the Chrome window on launch, we write options.add_argument("--start-maximized").
- For preferences, we use .add_experimental_option; for a list of preferences we refer to here. I am not sure whether .set_preference would override the default preference.
In our case we want to specify the default download directory (which cannot be changed after the Driver object is instantiated):
import os

options = webdriver.ChromeOptions()
# download into a "download" folder under the current working directory
prefs = {"download.default_directory": os.path.join(os.getcwd(), "download")}
options.add_experimental_option("prefs", prefs)
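The snippet above relies on os.getcwd() to produce an absolute path. A small helper can make that explicit and also create the directory before the browser starts. This is only a sketch: build_download_prefs is a hypothetical name, not part of Selenium, and the only preference key used is the one shown above.

```python
import os
import tempfile


def build_download_prefs(download_dir):
    """Return a Chrome prefs dict pointing at an absolute, existing directory.

    build_download_prefs is a hypothetical helper, not a Selenium API.
    """
    absolute_dir = os.path.abspath(download_dir)
    # Create the directory up front so later code can safely list its contents.
    os.makedirs(absolute_dir, exist_ok=True)
    return {"download.default_directory": absolute_dir}
```

It would be used as options.add_experimental_option("prefs", build_download_prefs("download")).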
Now we build our driver using our options.
browser = webdriver.Chrome(executable_path=DRIVER_PATH, options=options)
Implicit Wait
The Selenium documentation describes an implicit wait as follows:
An implicit wait tells WebDriver to poll the DOM for a certain amount of time when trying to find any element (or elements) not immediately available. The default setting is 0 (zero). Once set, the implicit wait is set for the life of the WebDriver object.
We will discuss explicit waits later, in the section on downloading files.
In my case I choose to wait for up to 10 seconds, as DOM elements may take time to render:
browser.implicitly_wait(10)
Select Element and Click
When every DOM element we need can be selected by its id attribute, we define the following utility function:
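from selenium.webdriver.common.by import By


def get_el_by_id(id):
    # find_element raises NoSuchElementException when nothing matches,
    # so the assert below is only an extra safeguard
    el = browser.find_element(By.ID, id)
    assert el is not None, f"Element of id: {id} cannot be found"
    return el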
A list of methods to find DOM element(s) can be found here.
Buttons
Selecting a button and clicking it is as simple as:
submit_btn = get_el_by_id("butSubmit")
submit_btn.click()
Dropdown List and Selection
We can select an option in a dropdown list by its value attribute:
from selenium.webdriver.support.select import Select

# dropdown_id: the id of the dropdown DOM element
# target_value: the value attribute of our target option
selection_list = Select(get_el_by_id(dropdown_id))
selection_list.select_by_value(target_value)
Download Files and Explicit Wait
As usual, we select the element that triggers the download and click it:
confirm_btn = get_el_by_id("downloadBtn")
confirm_btn.click()
Next we check the state of the downloads on Chrome's chrome://downloads page:
import time

from selenium.webdriver.support.ui import WebDriverWait


def every_downloads_chrome(browser):
    # open Chrome's downloads page if we are not already on it
    if not browser.current_url.startswith("chrome://downloads"):
        browser.get("chrome://downloads/")
    # the script returns the file URLs once every download is complete,
    # ["no_download"] if the list is empty, and nothing while still in progress
    return browser.execute_script("""
        var items = document.querySelector('downloads-manager')
            .shadowRoot.getElementById('downloadsList').items;
        if (items.length == 0) {
            return ["no_download"]
        }
        if (items.every(e => e.state === "COMPLETE")) {
            return items.map(e => e.fileUrl || e.file_url);
        }
        """)


# give the website time to start the download
time.sleep(4)
# wait for all the files to be completed (timeout: 2 minutes) and get the paths
paths = WebDriverWait(browser, 120, 1).until(every_downloads_chrome)
assert len(paths) >= 1 and paths[-1] != "no_download", "No file is being downloaded"
# get the name of the last downloaded file:
latest_download_filename = paths[-1].split("/")[-1]
# print the downloaded files
for filepath in paths:
    print(f"File downloaded: {filepath}")
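A different technique, not the one used above, is to poll the download directory itself instead of the chrome://downloads page. The sketch below assumes Chrome's convention of giving in-progress downloads a .crdownload suffix; wait_for_downloads is a hypothetical helper, and other temporary file names that Chrome may use on some platforms are not handled.

```python
import os
import time


def wait_for_downloads(directory, timeout=120.0, poll=1.0):
    """Return the sorted file names in `directory` once no partial file remains.

    Assumes in-progress Chrome downloads are named `*.crdownload`.
    Raises TimeoutError if the directory stays empty or a partial file
    is still present after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = sorted(os.listdir(directory))
        partial = [n for n in names if n.endswith(".crdownload")]
        # Done only when at least one file exists and none is still partial.
        if names and not partial:
            return names
        if time.monotonic() >= deadline:
            raise TimeoutError(f"downloads in {directory!r} did not finish")
        time.sleep(poll)
```

This avoids navigating away from the current page, at the cost of depending on a file-naming convention rather than on the browser's own download state.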
Points to note:
- Since a website may respond with a delay after the download button is clicked, we wait for 4 seconds to make sure there is a file to be downloaded.
- We also handle the case where the server has an internal error that ruins our download process. In this case, we return ["no_download"] when items.length == 0. Note that, by experiment, the return value [] wouldn't stop the waiting process: until keeps polling until the condition returns a truthy value, and an empty list is falsy.
Documentation for WebDriverWait: https://selenium-python.readthedocs.io/api.html?highlight=WebDriverWait.
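The reason an empty list does not end the wait can be reproduced with a simplified model of the polling loop. This is a sketch under the assumption that until returns the first truthy value produced by the condition; it is not Selenium's source code, and the names until and WaitTimeout here are stand-ins defined only for illustration.

```python
import time


class WaitTimeout(Exception):
    """Stand-in for Selenium's TimeoutException in this sketch."""


def until(condition, timeout=2.0, poll=0.01):
    """Poll `condition()` until it returns a truthy value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        # [], None, 0 and "" are all falsy, so the loop keeps polling on them.
        if value:
            return value
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition never returned a truthy value")
        time.sleep(poll)
```

With this model, until(lambda: ["no_download"]) returns immediately because a non-empty list is truthy, whereas until(lambda: []) polls until the timeout and then raises. That is why a sentinel value is returned for the empty case.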