-
mitmproxy
An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
-
Scout Monitoring
Free Django app performance insights with Scout Monitoring. Get Scout setup in minutes, and let us sweat the small stuff. A couple lines in settings.py is all you need to start monitoring your apps. Sign up for our free tier today.
I thought that mitmproxy did this, but cursory searches didn't show anything; that said, their actual format[1] has even more fidelity (I'd guess it's comparable to wireshark)
One should be aware that WARC is great for preservation, but getting content back out of it would require specialized tooling ala: https://github.com/alard/warc-proxy
1: https://github.com/mitmproxy/mitmproxy/blob/9.0.1/mitmproxy/...
I thought that mitmproxy did this, but cursory searches didn't show anything; that said, their actual format[1] has even more fidelity (I'd guess it's comparable to wireshark)
One should be aware that WARC is great for preservation, but getting content back out of it would require specialized tooling ala: https://github.com/alard/warc-proxy
1: https://github.com/mitmproxy/mitmproxy/blob/9.0.1/mitmproxy/...
If you weren't already aware, Scrapy has strong support for this via their HTTPCache middleware; you can choose whether to have it actually behave like a cache, choosing to returned already scraped content if matched or merely to act as a pass-through cache: https://docs.scrapy.org/en/2.7/topics/downloader-middleware....
Their OOtB storage does what the sibling comment says about sha1-ing the request and then sharding the output filename by the first two characters: https://github.com/scrapy/scrapy/blob/2.7.1/scrapy/extension...