This module is built for Super Web Miner and lets you download large quantities of content from a given website.
Parameters (Id, Name, Class, Link, Partial_link, Tag, Xpath and CSS are all parameters of the function Objects())
| Parameters | Value | Note |
| --- | --- | --- |
| browser | 'chrome' | Only Chrome is currently supported. |
| headless | False/True | Hide or display the browser GUI. False by default. |
| url | Any URL you'd like to input | https://www.bing.com by default. |
| Id, Name, Class, Link, Partial_link, Tag, Xpath, CSS | 'None'/Others | Only one of these needs to be given; if more than one is specified, only the first is used for the search. All are 'None' by default. |
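The "only the first one is used" rule for the locator parameters can be sketched as below. This is a minimal illustration, not the module's code: the helper name `first_locator` is hypothetical, and it assumes that precedence follows the order in which the parameters are listed above.

```python
# Hypothetical helper, not part of the module: illustrates the
# "first given locator wins" rule. The precedence order is an assumption.
LOCATOR_ORDER = ("Id", "Name", "Class", "Link",
                 "Partial_link", "Tag", "Xpath", "CSS")


def first_locator(**locators):
    """Return (name, value) for the first locator that is not 'None'."""
    for name in LOCATOR_ORDER:
        # The documented default for every locator is the string 'None'.
        value = locators.get(name, "None")
        if value != "None":
            return name, value
    return None  # no locator was given


print(first_locator(Tag="img"))
print(first_locator(Class="thumb", Xpath="//img"))
```

In the second call both Class and Xpath are given, so under the assumed ordering only Class would be used.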
Internal members
| Member | Description | Operations available | Operations note |
| --- | --- | --- | --- |
| engine | The engine of the web miner | engine.quit() | Closes the browser and ends all threads |
Internal functions
| Functions | Return Value | Description |
| --- | --- | --- |
| MineEngine() | -1: an error occurred; 0: finished running | Initializes the web miner engine and prepares the class member 'engine' |
| Objects() | The list of the objects found | Gets the objects to be found |
Parameters
| Parameters | Value |
| --- | --- |
| Attribute | The attribute you want to locate, such as 'src' |
| Obj_list | The list of the objects you found |

Return Value: The list of the attributes you want to find
Parameters
| Parameters | Value |
| --- | --- |
| Obj_list | The two-dimensional list of the properties of the objects you found |

Return Value: [[text, id, location, tag_name, size], ...]; [] means that an error occurred
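Since each row of the return value follows the fixed order text, id, location, tag_name, size, it can be convenient to turn rows into dictionaries. The sketch below is only an illustration: `rows_to_dicts` is a hypothetical helper, and the sample row is made-up data whose value types (for example, dictionaries for location and size) are assumptions.

```python
# Field order taken from the documented return value.
FIELDS = ("text", "id", "location", "tag_name", "size")


def rows_to_dicts(rows):
    """Map each [text, id, location, tag_name, size] row to a dict."""
    return [dict(zip(FIELDS, row)) for row in rows]


# Made-up sample row, for illustration only.
sample = [["Sign in", "btn-1", {"x": 10, "y": 20}, "a",
           {"height": 30, "width": 80}]]

for item in rows_to_dicts(sample):
    print(item["tag_name"], item["text"])
```

An empty list, which the function returns on error, simply yields an empty list of dictionaries.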
Parameters
| Parameters | Value | Note |
| --- | --- | --- |
| engine | The engine object in the SuperMiner | |
| Obj_list | The list of the objects you found | [] by default |
| Obj_index | The index of the object in Obj_list to act on | If -1 is given, all objects in Obj_list are acted on |
| send_keys | True/False | Whether to send the message/keyboard/mouse action. False by default |
| message | The text/keyboard/mouse action | 'Hello world' by default |
| click | True/False | False by default |
| clear | True/False | Clear the text that was entered. False by default |
| submit | True/False | Submit the form. False by default |
| right_click | True/False | False by default |
| double_click | True/False | False by default |
| rollpage | True/False | Scroll the page to get more information. False by default |
| roll_times | The number of times to scroll the page | Only works when rollpage is True. 20 by default |
| time_sleep | The sleep time per page scroll | Makes sure the content is fully loaded. 1 s by default |

Return Value: -1: an error occurred; Others: finished running
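The meaning of Obj_index, roll_times and time_sleep can be sketched in plain Python. This is not the module's implementation: `select_targets` and `roll_page` are hypothetical helpers, and `scroll_once` stands in for whatever call actually scrolls the browser.

```python
import time


def select_targets(obj_list, obj_index):
    """Return the objects to act on; -1 selects every object in obj_list."""
    if obj_index == -1:
        return list(obj_list)
    return [obj_list[obj_index]]


def roll_page(scroll_once, roll_times=20, time_sleep=1):
    """Call scroll_once roll_times times, pausing so content can load."""
    for _ in range(roll_times):
        scroll_once()
        time.sleep(time_sleep)


targets = select_targets(["a", "b", "c"], -1)  # all three objects
calls = []
roll_page(lambda: calls.append("scrolled"), roll_times=3, time_sleep=0)
print(targets, len(calls))
```

The defaults of 20 scrolls and a 1 s pause mirror the documented defaults; the pause is what gives lazily loaded content time to appear before the next scroll.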
Parameters
| Parameters | Value | Note |
| --- | --- | --- |
| engine | The miner engine | |
| url_list | The list of URLs for the objects | You can get it from Attributes() |
| data_type | The type of data you want to download | 'img'/'text'/'page' are available; 'page' is now recommended instead of 'text' |
| file_type | The type of the file used to save the data | '.html' by default |
| folder_name | The name of the folder where downloaded data is saved | 'downloads' by default |
| encode | The encoding format of non-img files | 'utf-8' by default |
| web_name | Whether to name the file after the web link | True by default |
| web_index | The index of a / in the site link; the name is taken from this / to the end of the web link | |

Return Value: -1: an error occurred; Others: finished running
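One possible reading of web_index is sketched below: pick a / in the link by its index and use everything after it as the file name. This interpretation is an assumption, as are the helper name `name_from_link`, the default of -1 (the last slash), and the replacement of any remaining / with underscores.

```python
def name_from_link(link, web_index=-1):
    """Build a file name from the part of link after the chosen '/'.

    web_index indexes into the positions of '/' in the link
    (assumed reading; -1 means the last slash).
    """
    slash_positions = [i for i, ch in enumerate(link) if ch == "/"]
    if not slash_positions:
        return link  # no slash to cut at
    start = slash_positions[web_index] + 1
    # Slashes are not valid in file names, so replace the remaining ones.
    return link[start:].replace("/", "_")


print(name_from_link("https://example.com/a/b/c.png"))
print(name_from_link("https://example.com/a/b/c.png", web_index=2))
```

With the last slash, only the final path segment is kept; an earlier slash keeps more of the path, which may help avoid name collisions between files from different folders.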