How to circumvent Sci-Hub ISP block

source link: https://fragile-credences.github.io/scihub-proxy/

In the UK, many internet service providers (ISPs) block Sci-Hub. However, a simple proxy is enough to circumvent this (you don’t even need a VPN). Routing requests through a suitable[1] proxy lets you open Sci-Hub in your regular browser as if it weren’t blocked[2].

Routing all your traffic through a proxy may come with privacy and security concerns, and will slow your connection a bit, so we want to use our proxy only for accessing Sci-Hub.

You can use extensions like ProxySwitchy to tell your browser to automatically use certain proxies, or no proxy at all, for sets of websites that you define.

Unfortunately, this extension, and others like it, require permissions to insert arbitrary JavaScript into any page you visit (the web store accurately explains that the extension can “read and change all your data on the websites you visit”). That’s likely due to insufficiently granular permission definitions by Chrome, and is not the fault of the presumably well-intentioned extension authors. But it freaks me out a little bit (bad things have happened).

Luckily, we can achieve the same effect by writing our own proxy auto-configuration file. A proxy auto-configuration or PAC file contains just a single JavaScript function like this:

function FindProxyForURL(url, host) {

  // Sci-Hub requests
  if (shExpMatch(host, 'sci-hub.se') || shExpMatch(host, '*.sci-hub.se')) {

    // Your proxy address and port number (placeholder: replace with your own)
    return 'PROXY 203.0.113.10:9279';
  }

  // All other requests
  return 'DIRECT';
}
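
Before pointing your operating system at the file, it can help to check the routing logic locally. The following is a rough sketch for Node.js, not part of the original setup. Browsers provide `shExpMatch` to PAC files but Node.js does not, so the version here is a simplified stand-in I wrote that handles only the `*` and `?` wildcards. The proxy address is a placeholder from a range reserved for documentation.

```javascript
// Simplified stand-in for the PAC helper shExpMatch (an assumption for local
// testing only; the browser's built-in version is what runs in practice).
function shExpMatch(str, pattern) {
  // Escape regex metacharacters, leaving the shell wildcards * and ? alone.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const regex = new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
  return regex.test(str);
}

// Same logic as the PAC file above; the proxy address is a placeholder.
function FindProxyForURL(url, host) {
  if (shExpMatch(host, 'sci-hub.se') || shExpMatch(host, '*.sci-hub.se')) {
    return 'PROXY 203.0.113.10:9279';
  }
  return 'DIRECT';
}

// Print the routing decision for a few sample URLs.
const sampleUrls = [
  'https://sci-hub.se/',
  'https://www.sci-hub.se/some-paper',
  'https://example.org/',
];
for (const url of sampleUrls) {
  const host = new URL(url).hostname;
  console.log(host, '->', FindProxyForURL(url, host));
}
```

Only the two Sci-Hub hosts should be sent to the proxy; everything else should come back as `DIRECT`.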

We can instruct the operating system to read this file. Search for instructions on Google (example).

The first time you use the proxy to access Sci-Hub, the browser will ask you for the username and password to your proxy server.

There are many free proxies on the Internet, but I find that using the services of an actual for-profit proxy company is well worth it, for the greater speed and reliability. Currently webshare.io (referral link) offers 1 GB per month free, which is quite a lot of Sci-Hub PDFs. After that you can get 250 GB for $2.99 per month.[3]

Step by step instructions

  1. Create an account on webshare.io (referral link)
  2. Choose a proxy from your list, and copy its address and port number into your PAC file, following the pattern above.
  3. Set your operating system to read its proxy settings from this PAC file[4]. Instructions for this are easy to Google (example).
  4. Open Sci-Hub in your browser. The first time you do this, it should ask for your proxy username and password. You can find these in your webshare.io account.
  5. Don’t forget to only use Sci-Hub to look at really old papers that have lapsed into the public domain :)

Footnotes

  [1] Obviously, the proxy must not itself be on a network that blocks Sci-Hub. I have not come across any proxy that blocks Sci-Hub in this way.

  [2] Changing your DNS resolver to a public one like Google’s instead of your ISP’s is not sufficient as of 2021, for two ISPs I’ve tested, and I suspect for all UK ISPs that implement blocking. (Many people believe changing the DNS resolver is sufficient. Probably ISPs used to implement simple DNS level blocking and have recently upped their game.) My guess is that instead of merely blocking the request to resolve sci-hub.se at the DNS resolver level, the ISPs are also doing a reverse lookup on every requested IP address to check whether it corresponds to a blacklisted domain.

  [3] Their home page exemplifies a dark pattern by not showing the pricing by the GB; it just says you get ‘up to unlimited’ bandwidth. You’ll be able to see the actual pricing after you create an account.

  [4] One gotcha is that Windows 10 forces you to call your PAC file from a web server; it cannot be a local file (??!). To work around this, you can upload your file as a Gist and link to the /raw
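
If you would rather not upload the file anywhere, one possible alternative to a Gist is to serve it from your own machine. This is a minimal Node.js sketch under that assumption; I have not confirmed how Windows 10 treats a loopback PAC URL, so treat it as a starting point. The helper names `makePacListener` and `servePacFile`, the file name `proxy.pac` and port 8080 are all made up for illustration. `application/x-ns-proxy-autoconfig` is the MIME type conventionally used for PAC files.

```javascript
// Hypothetical helper: builds a request listener that answers every request
// with the PAC source, using the MIME type conventionally used for PAC files.
function makePacListener(pacSource) {
  return function (req, res) {
    res.writeHead(200, {
      'Content-Type': 'application/x-ns-proxy-autoconfig',
      'Content-Length': Buffer.byteLength(pacSource),
    });
    res.end(pacSource);
  };
}

// Hypothetical helper: reads the PAC file once and serves it on the
// loopback interface only, so other machines cannot reach it.
function servePacFile(pacPath, port) {
  const fs = require('node:fs');
  const http = require('node:http');
  const pacSource = fs.readFileSync(pacPath, 'utf8');
  return http.createServer(makePacListener(pacSource)).listen(port, '127.0.0.1');
}

// Example (file name and port are placeholders):
// servePacFile('proxy.pac', 8080);
// The PAC URL to give the operating system would then be http://127.0.0.1:8080/
```

The server has to be running whenever the operating system re-reads its proxy settings, which is the main trade-off compared with hosting the file elsewhere.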

May 15, 2021
