Access Chrome's network tab (e.g. XHR requests) with Selenium

 1 year ago
source link: https://gist.github.com/lorey/079c5e178c9c9d3c30ad87df7f70491d


#
# This small example shows you how to access JS-based requests via Selenium
# Like this, one can access raw data for scraping,
# for example on many JS-intensive/React-based websites
#

import json  # needed below to decode the performance log messages
from time import sleep

from selenium import webdriver
from selenium.webdriver import DesiredCapabilities

# make chrome log requests
capabilities = DesiredCapabilities.CHROME
capabilities["loggingPrefs"] = {"performance": "ALL"}  # newer: goog:loggingPrefs
driver = webdriver.Chrome(
    desired_capabilities=capabilities, executable_path="./chromedriver"
)

# fetch a site that does xhr requests
driver.get("https://sitewithajaxorsomething.com")
sleep(5)  # wait for the requests to take place

# extract requests from logs
logs_raw = driver.get_log("performance")
logs = [json.loads(lr["message"])["message"] for lr in logs_raw]

def log_filter(log_):
    return (
        # is an actual response
        log_["method"] == "Network.responseReceived"
        # and json
        and "json" in log_["params"]["response"]["mimeType"]
    )

for log in filter(log_filter, logs):
    request_id = log["params"]["requestId"]
    resp_url = log["params"]["response"]["url"]
    print(f"Caught {resp_url}")
    print(driver.execute_cdp_cmd("Network.getResponseBody", {"requestId": request_id}))
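
The final print shows whatever Network.getResponseBody returns. In the Chrome DevTools Protocol that result is a dict with a `body` string and a `base64Encoded` flag, so a small helper can turn it into parsed JSON. This is a minimal sketch, not part of the gist: `decode_body` is a hypothetical helper name and the sample dicts are made up.

```python
import base64
import json


def decode_body(cdp_result):
    """Parse the dict returned by Network.getResponseBody as JSON.

    decode_body is a hypothetical helper, not part of the original gist.
    """
    body = cdp_result["body"]
    if cdp_result.get("base64Encoded"):
        # some payloads arrive base64-encoded; decode them to text first
        body = base64.b64decode(body).decode("utf-8")
    return json.loads(body)


# made-up sample results, shaped like the CDP return value
plain = {"body": '{"items": [1, 2]}', "base64Encoded": False}
encoded = {
    "body": base64.b64encode(b'{"ok": true}').decode("ascii"),
    "base64Encoded": True,
}

print(decode_body(plain))
print(decode_body(encoded))
```

In the loop above, the result of `driver.execute_cdp_cmd(...)` could be passed to such a helper instead of being printed directly.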

I've been working on this for at least a week, and thinking about it for a few months. You are great.

This is really great, however at the final step of getting the response body using the requestId I get

self.driver.execute_cdp_cmd("Network.getResponseBody", {"requestId": request_id})
2021-05-06 14:04:12 jim-ThinkPad-S5-S540 selenium.webdriver.remote.remote_connection[36958] DEBUG POST http://127.0.0.1:42437/session/b29c0918324a3defb5d6d11100dd3bec/goog/cdp/execute {"cmd": "Network.getResponseBody", "params": {"requestId": "37056.284"}}
2021-05-06 14:04:12 jim-ThinkPad-S5-S540 urllib3.connectionpool[36958] DEBUG http://127.0.0.1:42437 "POST /session/b29c0918324a3defb5d6d11100dd3bec/goog/cdp/execute HTTP/1.1" 500 253
2021-05-06 14:04:12 jim-ThinkPad-S5-S540 selenium.webdriver.remote.remote_connection[36958] DEBUG Finished Request
*** selenium.common.exceptions.WebDriverException: Message: unknown error: unhandled inspector error: {"code":-32000,"message":"No resource with given identifier found"}
  (Session info: chrome=89.0.4389.114)

Can you please help me out with how to do this in Firefox? I tried a few steps, but it didn't work out.

lorey (author) commented on Jun 2, 2021

Sorry, this is not intended for Firefox, @shans0535. Have you tried selenium-wire or just a mitm-proxy instead?
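
For anyone following the selenium-wire suggestion, a rough sketch of what that route might look like is below. It assumes selenium-wire and geckodriver are installed; `json_responses` and `main` are hypothetical names, the URL is the placeholder from the gist, and the stand-in objects only exist so the filter can be tried without a browser.

```python
from types import SimpleNamespace


def json_responses(requests):
    """Yield (url, status_code) for captured requests whose response is JSON."""
    for request in requests:
        response = request.response
        if response is None:  # the request never received a response
            continue
        content_type = response.headers.get("Content-Type", "") or ""
        if "json" in content_type:
            yield request.url, response.status_code


def main():
    # Assumption: `pip install selenium-wire` and geckodriver on PATH.
    # Not called automatically; run main() yourself to try a real browser.
    from seleniumwire import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get("https://sitewithajaxorsomething.com")  # placeholder URL
        for url, status in json_responses(driver.requests):
            print(status, url)
    finally:
        driver.quit()


# stand-in objects mimicking the attributes read from selenium-wire requests
fake = [
    SimpleNamespace(
        url="https://example.com/api",
        response=SimpleNamespace(
            status_code=200, headers={"Content-Type": "application/json"}
        ),
    ),
    SimpleNamespace(url="https://example.com/pending", response=None),
]
print(list(json_responses(fake)))
```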

To: lee-hodg

I think the error comes from requesting the body of a resource that is no longer available.
Wrapping the call in try/except works well.
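
A minimal sketch of that try/except idea, assuming the failure surfaces as Selenium's `WebDriverException` (as in the traceback quoted in this thread). `get_body_or_none` and `FakeDriver` are hypothetical names; the fake driver only stands in for a real Chrome session so the snippet runs on its own.

```python
try:
    from selenium.common.exceptions import WebDriverException
except ImportError:
    # fallback so the sketch can be tried without Selenium installed
    WebDriverException = Exception


def get_body_or_none(driver, request_id):
    """Return the CDP response body dict, or None if Chrome cannot provide it."""
    try:
        return driver.execute_cdp_cmd(
            "Network.getResponseBody", {"requestId": request_id}
        )
    except WebDriverException as exc:
        # e.g. "No resource with given identifier found"
        print(f"Skipping {request_id}: {exc}")
        return None


class FakeDriver:
    """Stand-in for a Chrome driver that only knows request id "1.1"."""

    def execute_cdp_cmd(self, cmd, params):
        if params["requestId"] != "1.1":
            raise WebDriverException("No resource with given identifier found")
        return {"body": "{}", "base64Encoded": False}


driver = FakeDriver()
print(get_body_or_none(driver, "1.1"))
print(get_body_or_none(driver, "37056.284"))
```

Returning `None` keeps the loop over the filtered logs going when a single body is unavailable.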


I was working on a way to do this for a week or two before I found your post. Works beautifully for what I needed, thanks a bunch.

It works! Thanks :) I was looking for a solution for a long time, and you helped! +1

lorey (author) commented on Nov 8, 2021

Thanks for the kindness everyone. Glad I could help you out. Please feel free to check out my profile with similar tools and libraries at https://github.com/lorey <3

Awesome!!

How can I get XHR requests from a real browser online?

lorey (author) commented on Jan 10

Selenium is using a real browser. If you want to do it manually yourself, check out developer tools (e.g. F12 in Chrome, tab "Network").


nikolaysm commented on Feb 18

@lorey, thanks for sharing.

For Chrome >= 75, a small change is needed.

As specified in the release notes for ChromeDriver 75.0.3770.8, the capability loggingPrefs has been renamed to goog:loggingPrefs.
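
A sketch of how the rename might be handled. `logging_prefs_key`, `build_capabilities`, and `make_driver` are hypothetical names; the plain dict stands in for `DesiredCapabilities.CHROME`, and `make_driver` assumes a Selenium 4 style setup, which is an assumption beyond what the gist uses.

```python
def logging_prefs_key(chromedriver_major):
    """Capability name for logging prefs, following the ChromeDriver 75 rename."""
    return "goog:loggingPrefs" if chromedriver_major >= 75 else "loggingPrefs"


def build_capabilities(chromedriver_major):
    # plain dict standing in for DesiredCapabilities.CHROME from the gist
    capabilities = {"browserName": "chrome"}
    capabilities[logging_prefs_key(chromedriver_major)] = {"performance": "ALL"}
    return capabilities


def make_driver():
    # Assumption: Selenium 4, where capabilities are set on the options object.
    # Not called here, since it needs Chrome and chromedriver installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
    return webdriver.Chrome(options=options)


print(build_capabilities(74))
print(build_capabilities(89))
```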

I'm looking to print the response after clicking a button, to know whether the click succeeded or failed. Right now the only way to check the status is to open the dev tools, go to the Network tab, and inspect the response manually.

So I need a method to print this status in the log.
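
One possible approach, sketched under assumptions: the performance log entries used in the gist carry the HTTP status in `params.response.status` of each `Network.responseReceived` message, so the status can be read without opening the dev tools. `response_statuses` is a hypothetical helper and the sample entries are made up to mimic `driver.get_log("performance")` output.

```python
import json


def response_statuses(logs_raw, url_part=""):
    """Return (status, url) for each received response whose URL contains url_part."""
    results = []
    for entry in logs_raw:
        message = json.loads(entry["message"])["message"]
        if message["method"] != "Network.responseReceived":
            continue
        response = message["params"]["response"]
        if url_part in response["url"]:
            results.append((response["status"], response["url"]))
    return results


# made-up entries shaped like driver.get_log("performance") output
sample = [
    {"message": json.dumps({"message": {
        "method": "Network.responseReceived",
        "params": {"requestId": "1.1", "response": {
            "url": "https://example.com/api/submit",
            "status": 201,
            "mimeType": "application/json",
        }},
    }})},
    {"message": json.dumps({"message": {
        "method": "Network.requestWillBeSent",
        "params": {"requestId": "1.2"},
    }})},
]

for status, url in response_statuses(sample, "/api/"):
    print(status, url)
```

In a live session, the idea would be to click the button, wait briefly, and then pass `driver.get_log("performance")` to the helper.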
