Muat naik fail CSV yang mengandungi URL daripada halaman HTML dan gunakan Flask untuk membaca URL yang anda mahu rangkak

Question

Pada masa ini saya perlu membuat sistem berasaskan web yang boleh memuat naik fail CSV yang mengandungi senarai URL. Selepas memuat naik, sistem akan membaca URL baris demi baris dan akan digunakan untuk langkah merangkak seterusnya. Di sini, merangkak memerlukan log masuk ke laman web sebelum merangkak. Saya sudah mempunyai kod sumber untuk laman web log masuk. Walau bagaimanapun, masalahnya ialah saya ingin menyambungkan halaman html bernama "upload_page.html" dengan fail flask bernama "upload_csv.py". Di manakah kod sumber untuk log masuk dan mengikis diletakkan dalam fail kelalang? upload_page.html<d

P粉207969787 · Answer

csv_file = request.files['file']
# Load the CSV data into a DataFrame
df = pd.read_csv(csv_file)
final_data = []
# Initialize the web driver
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
driver = webdriver.Chrome(options=chrome_options)
# Loop over the rows in the DataFrame and scrape each link
for index, row in df.iterrows():
    link = row['Link']
    # Login to the website
    # Replace this with your own login code
    driver.get("https://example.com/login")
    username_field = driver.find_element_by_name("username")
    password_field = driver.find_element_by_name("password")
    username_field.send_keys("myusername")
    password_field.send_keys("mypassword")
    password_field.send_keys(Keys.RETURN)
    # Wait for the login to complete
    WebDriverWait(driver, 10).until(EC.url_changes("https://example.com/login"))
    # Scrape the website
    driver.get(link)
    start = time.time()
    # will be used in the while loop
    initialScroll = 0
    finalScroll = 1000

    while True:
        driver.execute_script(f"window.scrollTo({initialScroll},{finalScroll})")
        # this command scrolls the window starting from the pixel value stored in the initialScroll
        # variable to the pixel value stored at the finalScroll variable
        initialScroll = finalScroll
        finalScroll += 1000

        # we will stop the script for 3 seconds so that the data can load
        time.sleep(2)
        end = time.time()
        # We will scroll for 20 seconds.
        if round(end - start) > 20:
            break

Muat naik fail CSV yang mengandungi URL daripada halaman HTML dan gunakan Flask untuk membaca URL yang anda mahu rangkak

membalas semua(1)saya akan balas