Automatic web form filling in Python: using python-webdriver to fill in forms automatically

In day-to-day work, some forms have to be filled out over and over again. Doing this by hand is time-consuming and tedious, and network latency makes it even more painful. A program that fills out the forms automatically can more than double efficiency, and it can be copied to multiple computers to improve efficiency further. webdriver is the browser automation tool in Python's selenium library. It fully simulates browser operations, so there is no need to deal with raw HTTP requests and POST data, which makes it very friendly to crawler beginners.

1. Environment configuration

Python 3.6 + selenium library + xlrd library + xlwt library

The xlrd and xlwt libraries are used to read and write data in Excel spreadsheets.

You also need to download a browser driver file, which is what lets the program open the browser. Be careful to choose the version that matches your operating system (Mac / 64-bit Windows / 32-bit Windows).

Put the downloaded driver file into both the browser's root directory and Python's root directory.

2. Open the web page

Taking the IE browser as an example, after importing webdriver (from selenium import webdriver), the following two lines of code open an IE browser and visit the website whose form we need to fill in:

driver = webdriver.Ie()

driver.get("/")

If the website requires a login (sites with forms to fill in are usually internal company sites), write a login function and pass the driver to it as a parameter:

driver = login(driver)

Be sure to return the driver from the function so that it can continue to receive the program's instructions.
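
A minimal sketch of what such a login function might look like. The element ids "username", "password" and "loginBtn" are hypothetical placeholders, not taken from any real site; use the developer tools to find the real ones. The sketch calls driver.find_element(by, value), the general form of the locating methods, where the string "id" is the value of selenium's By.ID constant.

```python
def login(driver, username, password):
    """Fill in a login form and hand the same driver back to the caller.

    The element ids used here ("username", "password", "loginBtn") are
    hypothetical placeholders for illustration only.
    """
    # "id" is the locator strategy string behind selenium's By.ID constant.
    driver.find_element("id", "username").send_keys(username)
    driver.find_element("id", "password").send_keys(password)
    driver.find_element("id", "loginBtn").click()
    # Return the driver so later steps keep using the logged-in session.
    return driver
```

Because the function returns the driver it received, the call site stays as shown above, with the credentials added as extra arguments.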

3. Element positioning

webdriver works by finding an element in the web page; once the element is found, you can fill in data or click it.

The main locating methods I use are:

driver.find_element_by_id("someid")  # locate by element id

driver.find_element_by_css_selector("input[value='ok']")  # find an input element whose value is "ok"

driver.find_element_by_xpath("//span[contains(@style,'COLOR: red')]/span[1]")  # find the first span child of a span whose style attribute contains "COLOR: red"

(1) Positioning by id

If we want to fill in a value or click a button somewhere in a web form, we first use the browser's developer tools to inspect the source of that element and check whether it has an id. If it does, locate the element directly by its id. Then the following calls do what we need:

driver.find_element_by_id("someid").click()  # click the element

driver.find_element_by_id("someid").send_keys("somekeys")  # fill in "somekeys"

driver.find_element_by_id("someid").clear()  # clear the existing value in the input box
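
When a form has many inputs with ids, the clear-then-fill steps can be wrapped in a small helper. This is a sketch, not the author's code: fill_fields is a hypothetical name and the ids in the usage line are made up. It uses driver.find_element("id", ...), which is equivalent to find_element_by_id and also works on Selenium 4.3 and later, where the find_element_by_* shortcuts were removed.

```python
def fill_fields(driver, values):
    """Clear and fill several input boxes from a {element_id: text} mapping."""
    for element_id, text in values.items():
        # "id" is the value of selenium's By.ID constant.
        box = driver.find_element("id", element_id)
        box.clear()          # remove any existing value first
        box.send_keys(text)  # then type the new value
```

A call might look like fill_fields(driver, {"phone": "13800000000", "name": "Zhang San"}), with both ids being placeholders.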

(2) Positioning through a CSS selector

If the element we want to operate on has no id, we need to find a feature that distinguishes it from the other elements on the page. A CSS selector is a very flexible way to do this, and selecting by value is often a good choice. Take

driver.find_element_by_css_selector("input[value='ok']")

as an example: the input at the start of the selector can be replaced with any element type (div, span, input, a, etc.), the square brackets hold an attribute of the element (style, id, value, class, etc.), and the equals sign is followed by the value of that attribute.

Note that several elements on the page may satisfy the CSS selector at the same time, for example several inputs with value="OK". In that case find_element_by_css_selector locates only the one that appears first in the HTML source, while find_elements_by_css_selector finds all matching elements and returns them as a list. For example, if many prompt boxes pop up on the page and we need to confirm them one by one, we can do this:

buttons = driver.find_elements_by_css_selector("input[value=' OK ']")
for button in buttons:
    button.click()

However, if these prompt boxes overlap and the topmost box actually comes last in the source code, the first "OK" element in the list is covered by the boxes stacked above it and cannot be clicked. In that case, reverse the list and start clicking from the last "OK" element:

query = driver.find_elements_by_css_selector("input[value=' OK ']")
for q in query[::-1]:
    q.click()
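
Both click orders can be folded into one helper. The sketch below is an illustration under assumptions: click_all and its topmost_last flag are hypothetical names, and "css selector" is the value of selenium's By.CSS_SELECTOR constant passed to the general driver.find_elements(by, value) method.

```python
def click_all(driver, css_selector, topmost_last=True):
    """Click every element matching css_selector; return how many were clicked.

    With topmost_last=True the matches are clicked in reverse source order,
    which suits stacked prompt boxes whose topmost box comes last in the HTML.
    """
    # "css selector" is the value of selenium's By.CSS_SELECTOR constant.
    buttons = driver.find_elements("css selector", css_selector)
    if topmost_last:
        buttons = buttons[::-1]
    for button in buttons:
        button.click()
    return len(buttons)
```

Returning the count makes it easy to log how many prompt boxes were confirmed on each form.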

(3) Positioning through xpath

XPath positioning is more complex but very comprehensive. When an element's class and style attributes are the same as those of other elements and it really has no feature that locates it in one step, we can use XPath to first find a parent or child of the element we want, and then reach the element from there. For example:

driver.find_element_by_xpath("//*[@class='submit clear']/input[1]").click()

text = driver.find_element_by_xpath("//input[@value=' OK ']/../preceding-sibling::div[1]").text

driver.find_elements_by_xpath("//span[contains(@style,'COLOR: red')]/span[1]")

The // at the start of the quoted path means relative positioning: the search starts from anywhere in the source code.

// can be followed by any element name; * stands for any element, so elements of any type can be filtered by attribute alone.

The square brackets hold the filter condition on attributes, and any attribute can follow @. The condition contains(@style,'COLOR: red') means that the style attribute contains "COLOR: red". Why not use @style='COLOR: red' directly here?

The reason is that when we inspect the source code the style attribute of the element may be exactly "COLOR: red", but on a dynamic page the style attribute often changes, so an exact match may fail to locate the element when the program runs.

/.. locates the parent of the element

/ locates children of the element

/preceding-sibling:: locates the siblings before the element (its elder siblings)

/following-sibling:: locates the siblings after the element (its younger siblings)

For example, /input[1] is the first input among the child elements, and /../preceding-sibling::div[1] is the first div among the elder siblings of the parent element. On this axis positions are counted backwards from the parent, so it is the nearest preceding div.
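
The parent and child steps can be tried out without a browser. The sketch below uses Python's built-in xml.etree.ElementTree on a made-up snippet of markup. ElementTree supports only a limited subset of XPath (no contains() and no sibling axes), so it illustrates only /.., child steps and [1]; paths searched from an element must also start with ".//" rather than "//".

```python
import xml.etree.ElementTree as ET

# Made-up markup: a message div followed by a span that wraps two buttons.
PAGE = """
<form class="submit clear">
  <div>Confirm order?</div>
  <span><input value="OK"/><input value="Cancel"/></span>
</form>
"""

root = ET.fromstring(PAGE.strip())

# Child step with an index: the first input child of the span.
first_input = root.find(".//span/input[1]")

# /.. step: go from the OK button up to its parent element.
parent = root.find(".//input[@value='OK']/..")

# Two /.. steps, then a child step: the first div under the grandparent.
message = root.find(".//input[@value='OK']/../../div[1]")

print(first_input.get("value"))  # OK
print(parent.tag)                # span
print(message.text)              # Confirm order?
```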

(4) Positioning through the current node

Sometimes we will encounter situations where we need to judge the current state of the element (whether it is selected) and then decide the next operation. At this time, we need to use a variable to save the current node.

LTE = driver.find_element_by_xpath("//input[@]/../span[1]")

Then use get_attribute to read an attribute of the saved element. In this example, if the element is blue there is no need to click it. The code is:

if LTE.get_attribute("style") != "COLOR: blue":
    LTE.click()
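
An exact comparison against "COLOR: blue" stops working as soon as the browser reports the style with different case, different spacing, or extra declarations. One way around this, sketched here as an assumption rather than as part of the original code, is to parse the style string; style_has is a hypothetical helper name.

```python
def style_has(style, prop, value):
    """Return True if an inline style string sets prop to value.

    The comparison ignores case, surrounding spaces and other declarations.
    A style of None (attribute missing) is treated as an empty string.
    """
    for declaration in (style or "").split(";"):
        name, sep, val = declaration.partition(":")
        if (sep and name.strip().lower() == prop.lower()
                and val.strip().lower() == value.lower()):
            return True
    return False


print(style_has("COLOR: blue", "color", "blue"))                   # True
print(style_has("FONT-SIZE: 12px; color:blue;", "color", "blue"))  # True
print(style_has("background-color: blue", "color", "blue"))        # False
```

The check on the saved element then becomes: click only if style_has(LTE.get_attribute("style"), "color", "blue") is False.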

Another case is picking out elements by their text:

red = driver.find_elements_by_xpath("//span[contains(@style,'COLOR: red')]/span[1]")  # find all red text
for r in red:
    if "low elimination" in r.text:  # if the text contains "low elimination"
        r.find_element_by_xpath("./../preceding-sibling::input[1]").click()  # note: when locating from the current node, the path must start with "./"
        break

If the element you are looking for is only visible after scrolling, you can use JavaScript to bring it into view; the page scrolls to the element's position:

target = driver.find_element_by_css_selector("input[value=' OK ']")

driver.execute_script("arguments[0].scrollIntoView();", target)

target.click()

4. Handling uncertain situations

(1) Ordinary pop-up windows

While filling out a form, a pop-up may or may not appear at certain points. Whatever kind of pop-up it is, wrapping the handling code in a try..except statement solves the problem.

Pop-up window triggered by JavaScript:

try:
    driver.find_element_by_css_selector("input[value=' OK ']").click()
except Exception as e:
    pass

Browser alert pop-up window:

try:
    driver.switch_to.alert.dismiss()
except Exception:
    pass

dismiss() corresponds to the "Cancel" button of the alert, accept() corresponds to the "OK" button, and driver.switch_to.alert.text returns the text of the alert.
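
The alert handling can be wrapped so that the caller also learns what the alert said. This is a sketch under assumptions: close_alert is a hypothetical name, and it catches Exception broadly in the same way as the try..except pattern in this section (selenium itself signals a missing alert with NoAlertPresentException).

```python
def close_alert(driver, accept=True):
    """Close a browser alert if one is open and return its text, else None."""
    try:
        alert = driver.switch_to.alert
        text = alert.text  # fails here if no alert is currently open
    except Exception:
        # No alert (or it vanished): nothing to do.
        return None
    if accept:
        alert.accept()   # the "OK" button
    else:
        alert.dismiss()  # the "Cancel" button
    return text
```

A return value of None means no alert appeared, so the same call is safe at points where the pop-up is optional.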

(2) Pop-up windows with varying numbers

For the multiple prompt boxes mentioned above, besides using query = driver.find_elements_by_css_selector("input[value=' OK ']") to find all the elements at once and click them in forward or reverse order, you can also use a while loop:

while True:
    try:
        driver.find_element_by_css_selector("input[value=' OK ']").click()
    except Exception as e:
        break
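
If a prompt box never goes away (for example because the click has no effect), this loop runs forever. A hedged variant with a safety cap is sketched below; click_until_gone and max_clicks are hypothetical names, and "css selector" is the value of selenium's By.CSS_SELECTOR constant.

```python
def click_until_gone(driver, css_selector, max_clicks=50):
    """Keep clicking the first match of css_selector until none is left.

    Returns the number of clicks. max_clicks is a safety cap (an assumption
    of this sketch) so a pop-up that never closes cannot loop forever.
    """
    clicks = 0
    while clicks < max_clicks:
        try:
            driver.find_element("css selector", css_selector).click()
        except Exception:
            # selenium raises NoSuchElementException when nothing matches.
            break
        clicks += 1
    return clicks
```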

(3) Network delay

Some pages take a while to load after you click to query information. While the page is loading, the element we want next cannot be found, so the program reports an error. There are two solutions.

One is to wait a fixed period of time for the page to load. The drawback is that the best waiting time is hard to find: if it is too short the page will not have finished loading, and if it is too long efficiency suffers.

time.sleep(2)  # wait 2 seconds; needs "import time"

The other is to use a while loop that keeps looking for the element we need next:

while True:
    try:
        driver.find_element_by_id("continueTrade").click()
        break
    except Exception:
        pass

This method assumes that the element we are waiting for is certain to appear eventually.
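
When that assumption cannot be guaranteed, the loop can be given a deadline and a short pause between attempts. The helper below is a plain-Python sketch with hypothetical names (retry_until, timeout, interval). Selenium also ships its own explicit-wait facility, WebDriverWait in selenium.webdriver.support.ui, which serves the same purpose.

```python
import time


def retry_until(action, timeout=10.0, interval=0.5):
    """Call action() repeatedly until it succeeds or timeout seconds pass.

    Returns whatever action() returns. If the deadline passes, the last
    error raised by action() is re-raised to the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return action()
        except Exception:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)  # brief pause instead of a busy loop
```

With this helper the loop above could be written as retry_until(lambda: driver.find_element_by_id("continueTrade").click()), and a page that never loads produces an error after ten seconds instead of hanging.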

5. Frame processing

In short, there is no need to switch into the frameset itself; just switch into the frames one layer at a time. After a series of form-filling operations it is best to call driver.switch_to.default_content() to return to the top-level document, which makes it harder to lose track of where you are.

Here is a little more on how to switch to a frame that has no id:

frame = driver.find_element_by_xpath("/html/body/div[12]/iframe")  # first locate the frame and store the node in a variable

driver.switch_to.frame(frame)  # then switch to that node
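
To make sure the return to the top-level document is never forgotten, the switching can be wrapped in a context manager. This is a sketch, with inside_frame as a hypothetical name; it relies on driver.switch_to.frame(), which accepts a frame element, a name or id, or an index, and on driver.switch_to.default_content().

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, *frames):
    """Switch into nested frames layer by layer, then return to the top document."""
    try:
        for frame in frames:
            driver.switch_to.frame(frame)  # one layer at a time, outermost first
        yield driver
    finally:
        # Always go back to the main document, even if an error occurred.
        driver.switch_to.default_content()
```

Usage would look like "with inside_frame(driver, outer_frame, inner_frame):" followed by the form-filling steps in the indented block.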

6. Excel data reading and writing

Reading and writing Excel data is very simple; just look at the code:

import xlrd
import xlwt

def read(file):
    data = xlrd.open_workbook(file)  # open the Excel file
    table = data.sheets()[0]  # read the first sheet
    phones = table.col_values(0)  # store the first column in a list
    peoples = table.col_values(1)  # store the second column in a list
    return phones, peoples

def write(result):
    file = xlwt.Workbook()  # create an Excel file
    table = file.add_sheet("sheet1")  # add a sheet
    for i in range(len(result)):  # write the data, one row per result
        table.write(i, 0, result[i][0])
        table.write(i, 1, result[i][1])
        table.write(i, 2, result[i][2])
    file.save("")
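
The article does not show how reading and writing are connected to the form filling. One possible wiring is sketched below, assuming each result row holds the phone, the person and a status string (matching the three columns that are written), and that the form-filling steps are wrapped in a hypothetical submit(phone, person) function. collect_results is likewise a made-up name.

```python
def collect_results(phones, peoples, submit):
    """Run submit(phone, person) for each row; return (phone, person, status) rows."""
    results = []
    for phone, person in zip(phones, peoples):
        try:
            status = submit(phone, person)
        except Exception as exc:
            # Record the error and move on to the next row.
            status = "failed: %s" % exc
        results.append((phone, person, status))
    return results
```

Catching the error per row means one bad record does not stop the whole batch, and the failure reason ends up in the third column of the output file.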

Conclusion: I hope technology can free people from meaningless, repetitive labor :D