If your pages contain anchor tags whose href attributes point to external websites, you have no log of when those links are clicked. Example:
<a href="http://external.site.example.com/">
Click to go to "external.site.example.com"
</a>
It has become very common for people to use the JavaScript onclick event and Ajax to report back to the server when a click has happened, but this is no good if the user doesn’t have JavaScript enabled, or if the HTML is displayed on an external site such as an RSS aggregator. For those cases, it has become common to change the href to link back to a script on your own site which performs an HTTP redirect. Example:
<a href="http://my.site.example.com/redirect.cgi?url=http://external.site.example.com/">
Click to go to "external.site.example.com"
</a>
If your redirect script accepts any old “url” parameter, you have created what is called an “open redirect.” Open redirects have been abused heavily by scammers and spammers and should not exist.
One option is to maintain a whitelist of URLs that may be redirected to, but this is cumbersome to maintain. Another option is to add an additional parameter which authenticates the URL. For example, you could take a hash of the URL combined with a private salt, and then provide that hash as a parameter too. For my examples I will use a simple short salt, “ABCDEFGHIJKLM”. Here is an example of how I would generate and use such a URL:
Generate the authentication token:
mike@haven:~$ echo "ABCDEFGHIJKLM_http://external.site.example.com/"|md5sum
a54de5ffcb50cead775cac0b254af460 -
mike@haven:~$
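One detail worth watching when generating tokens by hand: plain echo appends a newline, and that newline becomes part of the hashed input, whereas Digest::MD5::md5_hex in the Perl code later in this post hashes exactly the string it is given. The following is a minimal sketch of a helper that hashes only the salt, underscore and URL. The function name make_token and the SALT variable are my own illustration, not part of the original setup, and md5sum from GNU coreutils is assumed to be available.

```shell
#!/bin/sh
# Example salt from this post; a real deployment would use a long private value.
SALT="ABCDEFGHIJKLM"

# Hypothetical helper: print the auth token for the URL given as $1.
make_token() {
    # printf '%s' writes the string with no trailing newline,
    # so only "<salt>_<url>" is hashed.
    printf '%s' "${SALT}_$1" | md5sum | cut -d' ' -f1
}

make_token "http://external.site.example.com/"
```

The output is a 32 character hexadecimal string that can be used as the auth parameter.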
Build the URL:
<a href="http://my.site.example.com/redirect.cgi?url=http://external.site.example.com/&auth=a54de5ffcb50cead775cac0b254af460">
Click to go to "external.site.example.com"
</a>
The redirect.cgi is only doing two things: it checks that the MD5 of “ABCDEFGHIJKLM_http://external.site.example.com/” matches a54de5ffcb50cead775cac0b254af460, and if so it redirects. We don’t even need to write a script to do that; it can all be done inside the Apache configuration or an .htaccess file. Here follows the mod_rewrite configuration:
RewriteEngine On
RewriteMap unescape int:unescape
RewriteCond %{QUERY_STRING} ^(?:.*\&)?url=([^\&]*%3[fF][^\&]*)
RewriteRule ^/bounce$ ${unescape:%1} [R=301,L,NE]
RewriteCond %{QUERY_STRING} ^(?:.*\&)?url=([^\&]+)
RewriteRule ^/bounce$ ${unescape:%1}? [R=301,L,NE]
On its own, the above would create an open redirector. The next step is to add some mod_security configuration to check the auth parameter:
## Enable ModSecurity and allow HTTP request parsing
SecRuleEngine On
SecRequestBodyAccess On
## Allow redirects if there is a valid auth parameter matching the url parameter
SecRule REQUEST_URI ^/+bounce(\?.*)?$ "chain,phase:1,nolog,allow"
SecRule ARGS:auth ^[a-f0-9]{32}$ "chain,setvar:tx.auth=%{MATCHED_VAR}"
SecRule ARGS:url ^(?i)https?://.+$ "chain,setvar:tx.urlnsalt=ABCDEFGHIJKLM_%{MATCHED_VAR}"
SecRule TX:urlnsalt "@streq %{TX.auth}" "t:md5,t:hexEncode"
## Block all other requests for /bounce with a 403 FORBIDDEN error
SecRule REQUEST_URI ^/+bounce(\?.*)?$ "phase:1,log,deny,status:403"
It’s pretty amazing what you can do with mod_security.
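To make the logic of the chained rules easier to follow, here is a rough sketch of the same three checks as a shell function: auth must be 32 lowercase hex digits, url must start with http:// or https://, and the MD5 of the salt, an underscore and the URL must equal auth. The name check_bounce is hypothetical and this is only an illustration of the checks, not a replacement for the mod_security rules. It assumes md5sum is available.

```shell
#!/bin/sh
# Example salt from this post.
SALT="ABCDEFGHIJKLM"

# Hypothetical helper: succeed (exit status 0) only when the url ($1)
# and auth ($2) pair would be allowed through by the rules above.
check_bounce() {
    url=$1
    auth=$2
    # auth must be exactly 32 characters, all of them lowercase hex digits
    [ "${#auth}" -eq 32 ] || return 1
    case $auth in *[!0123456789abcdef]*) return 1 ;; esac
    # url must start with http:// or https:// (any case) and continue after it
    case $url in
        [Hh][Tt][Tt][Pp]://?*|[Hh][Tt][Tt][Pp][Ss]://?*) ;;
        *) return 1 ;;
    esac
    # Recompute the token without a trailing newline and compare
    expected=$(printf '%s' "${SALT}_${url}" | md5sum | cut -d' ' -f1)
    [ "$expected" = "$auth" ]
}

good=$(printf '%s' "${SALT}_http://example.com/" | md5sum | cut -d' ' -f1)
check_bounce "http://example.com/" "$good" && echo "redirect"
check_bounce "http://evil.example/" "$good" || echo "403"
```

The first call prints "redirect" because the token matches the URL. The second prints "403" because a token generated for one URL does not authenticate a different one.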
You might still consider it a pain to convert:
http://external.site.example.com/
To:
http://my.site.example.com/redirect.cgi?url=http://external.site.example.com/&auth=a54de5ffcb50cead775cac0b254af460
But for dynamically generated websites especially, it can be done completely transparently. This website itself is generated using server-side XSLT, with a self-built framework utilising Perl’s XML::LibXML and XML::LibXSLT modules.
So with a little Perl, XPath and XML trickery, I’ve been able to update the framework to convert all external anchor tags to use my redirector dynamically. Here’s the code:
use Digest::MD5 ();
use URI::Escape ();

## $xml is the XML::LibXML::Document
## $stylesheet is the XML::LibXSLT::Stylesheet
## Apply the stylesheet to the XML to generate the final output
my $doc = $stylesheet->transform( $xml );
## If the final output is text/html, fix up the anchor tags...
if( $stylesheet->media_type eq 'text/html' ){
    ## Iterate over each anchor tag
    foreach my $node ( $doc->findnodes('html/body//a') ){
        ## If the href is pointing to an external url
        my $href = $node->getAttribute('href') || '';
        if( $href =~ /^https?:\/\/([-a-z0-9\.]+)/i && lc($1) ne lc($ENV{HTTP_HOST}) ){
            ## Overwrite the href attribute with the authenticated redirect
            $node->setAttribute( 'href',
                sprintf( '/bounce?auth=%s&url=%s',
                    Digest::MD5::md5_hex( "ABCDEFGHIJKLM_$href" ),
                    URI::Escape::uri_escape( $href ),
                )
            );
        }
    }
}
Now if I use:
<a href="http://example.com/">Click me</a>
in a blog post, my framework automatically converts it to the redirect version. If someone then clicks the link while reading my RSS feed, I still know the link was clicked, even though it was an external link and they weren’t viewing the blog post on my website.
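For the odd link that does not pass through a framework, the same conversion can be sketched in plain shell. The function names urlencode and bounce_link are hypothetical, the encoder only aims to handle ASCII input, and md5sum is assumed to be available. It escapes everything except letters, digits and "-", ".", "_", "~", which is the default behaviour of URI::Escape::uri_escape.

```shell
#!/bin/sh
export LC_ALL=C
# Example salt from this post.
SALT="ABCDEFGHIJKLM"

# Hypothetical helper: percent-encode $1, leaving only unreserved
# characters (letters, digits, "-", ".", "_", "~") untouched.
urlencode() {
    s=$1
    enc=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~-]) enc="$enc$c" ;;
            *) enc="$enc$(printf '%%%02X' "'$c")" ;;
        esac
        s=$rest
    done
    printf '%s' "$enc"
}

# Hypothetical helper: print the /bounce link for the URL given as $1.
bounce_link() {
    token=$(printf '%s' "${SALT}_$1" | md5sum | cut -d' ' -f1)
    printf '/bounce?auth=%s&url=%s\n' "$token" "$(urlencode "$1")"
}

bounce_link "http://example.com/"
# prints /bounce?auth=<32 hex digits>&url=http%3A%2F%2Fexample.com%2F
```

Note that the token is computed over the original URL, before encoding, because the server compares against the decoded url parameter.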