
The Tor and CloudFlare Problem

Mar 06, 2016

A Bit of Background

For a while, Tor users have been exposed to a great deal of frustration while browsing the web: when visiting websites that use CloudFlare, they are presented with a Captcha before being able to load the site, for every single new website they navigate to in a session.

This makes navigating the web a particularly tiresome experience, especially as so much of the web is protected by CloudFlare's services.

Right now, there is an ongoing discussion on the Tor Project's ticket system regarding this problem, inviting anyone to propose potential solutions. A few have been discussed, but ultimately nothing that solves all the problems for all concerned parties.

Since this ticket was opened, CloudFlare's CTO has been very active in the thread, and CloudFlare now lets its customers decide how to treat Tor traffic; specifically, customers can whitelist Tor exit nodes. This is great news and shows CloudFlare's willingness to discuss and come to a solution with the community.

In this post I take a critical look at the existing suggestions and propose a new potential solution.

Existing suggestions

Whitelist Tor exit nodes

...and not just as an option for CloudFlare's customers (which individual customers can now already choose), but by default for all of them.

While this would instantly make the Tor community very happy, this is an unreasonable and unrealistic suggestion.

Let's look at a few facts:

  • Malicious people exist on the internet and do malicious things.
  • People know about Tor.
  • Tor is sometimes used by malicious people to do malicious things.

If we whitelist all Tor exit nodes by default for all 2 million websites using CloudFlare, then malicious people who were not previously using Tor will start using Tor to do these malicious things, the percentage of traffic through Tor that can be considered malicious will increase, Tor's reputation will suffer, there will be bad press, CloudFlare customers will choose to outright block Tor users (which is worse than the current Captchas), and everything goes to shit.

This is obviously not what we want.

Provide a "Read Only" mode for Tor users.

Now this can mean one of two things:

  1. Let GET requests through to the servers, and block any other requests.
  2. Only allow access to already cached pages (so no requests arrive at the server).

Unfortunately, neither of these two "solutions" is viable.

The first one, although well-intentioned, does not prevent all the ways in which a user may "interact" with a server. There are plenty of situations where servers perform non-trivial processing on user data given to them via a GET request (query strings, cookies, headers, and even the URL itself); a user search is one example. Limiting to GET requests != Read Only: attacks can still be performed using GET requests alone.
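To make this concrete, here is a minimal sketch (my own illustration, not from the ticket) of a hypothetical search endpoint. The route, handler, and helper names are all made up, but it shows how a plain GET request can still drive non-trivial, attacker-influenced work on the server:

```typescript
import express from "express";

// Hypothetical stand-ins for real application logic.
async function searchDatabase(q: string): Promise<string[]> {
  // Imagine an expensive full-text query against a real database here.
  return [`result for "${q}"`];
}

async function recordSearchAnalytics(q: string, ip?: string): Promise<void> {
  // Imagine a write to an analytics store here: server-side state changes
  // triggered purely by a GET request.
}

const app = express();

app.get("/search", async (req, res) => {
  // The query string is entirely attacker-controlled, yet it drives a
  // database query and a write, even though this is "only" a GET.
  const q = String(req.query.q ?? "");

  const results = await searchDatabase(q);
  await recordSearchAnalytics(q, req.ip);

  res.json({ query: q, results });
});

app.listen(3000);
```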

The second solution is also insufficient: it would restrict the content that Tor users can see to whatever has been cached. Anything not cached, or whose cache entry has expired, would not be visitable by Tor users, so user searches, for example, would be completely disabled.

In any case, not all Tor usage fits into a "Read Only" model; plenty of Tor users will want to interact with websites (log in with a pseudonym, post on forums, etc.), so any solution that involves a Read Only mode would require a way to switch to a "normal" mode at certain points.

Switching out of Read Only mode

So how could we do this? Again, there has been some discussion around a potential solution:

Use one of the above Read Only solutions, but when a POST request (or any other non-GET request) is made, or, in the case of solution (2) above, when a GET request misses the cache, throw a 4XX error with a Captcha, and either:

  1. (JavaScript-free version) Require the user to re-submit the request, displaying a message asking them to try again. (This could cause a lot of frustration after having filled out, say, a really long form and being required to do so all over again.)
  2. Try to re-submit the request using JavaScript, falling back to (1) if the user has JavaScript disabled.

Unfortunately, although this gets us closer, it completely breaks any website interactivity that uses AJAX calls or WebSockets. We go from an inconvenient Captcha to rendering many websites completely unusable. Not to mention the significant frustration when people are told they need to fill out a form again.

A New Solution?

This idea requires work from both the Tor developers (specifically those who work on TBB, the Tor Browser Bundle) and the CloudFlare developers.

The user experience

For non-Tor users, or Tor users on an older TBB, the experience is unchanged. Older Tor users will still have to complete a Captcha, which will grant them full access to a website as is currently the case. For users on the latest TBB, upon landing on a website protected by CloudFlare, they will see something like this:

[Screenshot: the proposed "Prove You're Human" notification bar]

Note: the wording in this screenshot is by no means final.

Now the user can choose to either ignore the warning, dismiss it, or click "Prove You're Human". Ignoring the warning will allow the user to continue using the site in a Read Only mode; here I think the most appropriate implementation would be to use cached-only pages (not sending any requests on to the server). For any cache misses it can display the Captcha.

Now when a user submits a form, the page will remain in a "loading" state while a new tab is opened and focused for the user to complete a Captcha. (We could optionally display the same warning on this page, but without the button or dismiss icon.) Once the user has completed the Captcha, the tab will close and the existing (paused) tab will continue (actually make the request).

A similar thing would happen for any AJAX or WebSocket requests: the request would be paused until a Captcha is completed in a separate tab or window.

This would allow, I think, the minimum amount of friction for performing any particular task on a website, requiring a Captcha only when necessary, and indicating to the user that they are viewing a reduced-functionality version of the site.

A Technical Implementation

On the TBB side, the browser would need to indicate that it supports this "prove human" functionality, either via the User-Agent or by sending a particular header. For example, along with the request, it could send X-Human-Proof: Available.

On the CloudFlare side, when the server receives a request, if:

  • The threat level has been determined as "CAPTCHA"
  • The user agent supports the "Human Proof" feature (i.e. has the appropriate X-Human-Proof header).
  • There is no cookie set for the Captcha (no existing proof-of-human).
  • The request is a GET.
  • The requested URL is cached.

then it returns the cached contents, along with a header like X-Human-Proof-Required: <some URL to visit for the Captcha>. In any other situation, it behaves as normal. (Note: the URL will need to be on the same domain as the request, so a site-relative path, i.e. starting with /, probably makes the most sense.)
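Roughly, and purely as an illustration, the decision could look like the sketch below. It uses an Express-style middleware with a toy in-memory cache, a made-up threatLevelFor() function, and hypothetical cookie and challenge-URL names; CloudFlare's real edge code obviously looks nothing like this, but the checks mirror the list above:

```typescript
import express from "express";

// Toy stand-ins: a url -> cached-HTML map and a fake threat scorer.
const cache = new Map<string, string>();

function threatLevelFor(ip: string | undefined): "OK" | "CAPTCHA" | "BLOCK" {
  // Imagine CloudFlare's IP reputation scoring here.
  return "CAPTCHA";
}

const app = express();

app.use((req, res, next) => {
  const cached = cache.get(req.originalUrl);

  const serveReadOnly =
    threatLevelFor(req.ip) === "CAPTCHA" &&                  // threat level is "CAPTCHA"
    req.get("X-Human-Proof") === "Available" &&              // client supports the feature
    !(req.headers.cookie ?? "").includes("human_proof=") &&  // no existing proof-of-human cookie (hypothetical name)
    req.method === "GET" &&                                  // it is a GET request
    cached !== undefined;                                    // the URL is in the cache

  if (serveReadOnly) {
    // Serve the cached page, and tell the client where to go to prove
    // it is operated by a human ("/human-proof-challenge" is made up).
    res.set("X-Human-Proof-Required", "/human-proof-challenge");
    res.send(cached);
    return;
  }

  next(); // in any other situation, behave as normal
});

app.listen(3000);
```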

The TBB, upon seeing a response with the X-Human-Proof-Required header, will mark the domain that returned it as "requiring human proof" (for the given session), and for any page on a domain in this list, display the bar shown in the screenshot (unless it has already been dismissed).
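As a rough sketch of this bookkeeping (again my own illustration; TBB would implement this inside the browser rather than as page script, and showHumanProofBar() is a hypothetical UI hook):

```typescript
// domain -> absolute challenge URL, remembered for the current session.
const humanProofDomains = new Map<string, string>();

function noteResponse(url: string, headers: Headers): void {
  const challengePath = headers.get("X-Human-Proof-Required");
  if (challengePath !== null) {
    const { host, origin } = new URL(url);
    // Remember that this domain wants proof-of-human, and where to get it.
    humanProofDomains.set(host, new URL(challengePath, origin).toString());
    showHumanProofBar(host);
  }
}

function showHumanProofBar(host: string): void {
  // Placeholder: the real browser would render the notification bar here,
  // unless the user has already dismissed it for this session.
  console.log(`${host} is asking us to prove we are human`);
}
```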

Now when any non-GET request is made to a domain marked as "requiring human proof" (whether AJAX, WebSocket or otherwise), pause the request, and open a new tab to the URL given in the X-Human-Proof-Required header. Wait for a response from that domain which does not contain the X-Human-Proof-Required header, then continue the paused request (actually send it to the server).
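Continuing the sketch above (and reusing its humanProofDomains map), the "pause, challenge, then send" behaviour might look something like this. The window.open() call and the polling loop are just one way to express "open a tab and wait until the challenge is done"; the browser would do this natively:

```typescript
async function sendWithHumanProof(input: string, init: RequestInit = {}): Promise<Response> {
  const url = new URL(input, location.href);
  const method = (init.method ?? "GET").toUpperCase();
  const challengeUrl = humanProofDomains.get(url.host);

  if (method !== "GET" && challengeUrl !== undefined) {
    // Open and focus a new tab where the user can complete the Captcha.
    window.open(challengeUrl, "_blank");

    // Wait until a response from this domain no longer carries the
    // X-Human-Proof-Required header, i.e. the proof-of-human is in place.
    for (;;) {
      const probe = await fetch(challengeUrl);
      if (probe.headers.get("X-Human-Proof-Required") === null) {
        humanProofDomains.delete(url.host);
        break;
      }
      await new Promise((resolve) => setTimeout(resolve, 2000)); // poll every 2s
    }
  }

  // Only now actually send the original (paused) request to the server.
  return fetch(url.toString(), init);
}
```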

Future Improvements

This would give us a good foundation on which to build iterative UX improvements, and to improve the mechanisms by which user agents prove to servers that they are being operated by humans. From here we could:

  • Submit an RFC for these headers, and try to make an official spec for the behaviour.
  • Upstream these client changes (handling of headers, pausing requests, opening the challenge in a new tab, etc.) and bring them to other browsers.
  • Iteratively improve the UI, such as displaying a blocking dialog on any page that is waiting on a Captcha (or other challenge) to be completed.
  • Encourage websites that don't use CloudFlare but block Tor exit nodes to behave in this manner instead.

Potential Issues

The biggest issue I see with this solution is that it would require some non-trivial engineering effort from the Tor developers. For CloudFlare, I feel the engineering effort would be comparatively less difficult. But I honestly feel it would pay off.

Another thing that occurred to me is that this mechanism may encourage website operators to block Tor traffic more eagerly and require "proof-of-humanness" before a website can be used to its full capacity, but I'm unsure about that.

After having given this idea some thought for a couple of days, other than the above points, I have yet to come up with any significant issues. Please let me know if you can think of any and I'll update this post.

I look forward to seeing whether this idea can get us any closer to a complete solution.
