[geeklog-devel] Blocking those inclusion attempts

Dirk Haun dirk at haun-online.de
Sat Dec 15 08:36:14 EST 2007


Ramnath R Iyer wrote:

>Wouldn't this also block something like -
>
>GET http://www.geeklog.net/forum/index.php HTTP/1.1

Yes, it would.


>HTTP v1.1 allows complete URIs to be specified in the request line.

It does? [checks RFC 2616] Oops ...

I've seen those on occasion but always assumed they were from some
broken UA and/or shady part of the net (and they usually were). Hence:

--- snip ---
# anything that can't even talk HTTP properly should be blocked right
# away ...
RewriteCond %{THE_REQUEST} "^GET http" [OR]
RewriteCond %{REQUEST_URI} ^http [OR]
--- snip ---

(from the geeklog.net .htaccess)

First hit for such a request in today's access.log, for example:

--- snip ---
87.118.118.209 - - [15/Dec/2007:00:04:32 -0500] "GET http://www.geeklog.net/ HTTP/1.0" 403 14 "http://www.geeklog.net/forum/createtopic.php?method=newtopic&forum=4%2B%255B0,35044,2570%255D%2B-%253E%2B%255BN%255D%2BPOST%2Bhttp://www.geeklog.net/forum/createtopic.php%2B%255B0,0,45880%255D" "Mozilla/2.0 compatible; Check&Get 1.14 (Windows NT)"
--- snip ---

I could write a lengthy blog post about that single request alone,
starting with the IP address it's coming from[1] ...


Anyway, back on topic: So even if they are rare, it looks like my
suggested rewrite rule could potentially block legit requests.

Can someone suggest an improvement? What's the regexp for "contains, but
does not start with http:"?
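
One possible answer, as an untested sketch rather than something that is
running on geeklog.net: a normal request target starts with "/", so
instead of excluding a leading "http:" the condition can require a
leading slash followed by "http:" somewhere later in the target. The
[NC] flag and the RewriteRule line are assumptions for illustration.

--- snip ---
# sketch: block request targets that start with "/" but contain "http:"
# further on (e.g. in the query string); absolute-URI requests such as
# "GET http://www.geeklog.net/forum/index.php HTTP/1.1" do not match
RewriteCond %{THE_REQUEST} "^[A-Z]+ /[^ ]*http:" [NC]
RewriteRule .* - [F]
--- snip ---

On Apache 2.x, where mod_rewrite uses PCRE, the literal "contains, but
does not start with http:" could be written with a negative lookahead
as ^(?!http:).*http: instead. Neither variant catches an absolute-URI
request that also carries "http:" in its query string, nor a
URL-encoded "http%3A".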

bye, Dirk

[1] <http://spam.tinyweb.net/article.php/morbid-spam>


-- 
http://www.haun-online.de/
http://geeklog.info/



