[geeklog-spam] Wikispam blacklist

Dirk Haun dirk at haun-online.de
Mon May 28 07:30:17 EDT 2007


Dirk Haun wrote:

>Turns out there's a manually updated spam blacklist that is shared
>between three wiki engines (MoinMoin, TWiki, MediaWiki). This is more or
>less a continuation of the MT-Blacklist (RIP) - minus the RDF feed for
>the updates.

Okay, so the problem with the faulty entries in that list has been resolved.

The maintainer, Thomas Waldmann of the MoinMoin wiki, also kindly gave
me the Python script they use for the syndication. Since there is no RDF
feed for the updates, I've hacked one together myself.
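
For illustration only, merging several lists and working out what
changed between two runs could be sketched in Python roughly as below.
This is neither the MoinMoin syndication script nor the code behind the
feed mentioned in this post. The function names are hypothetical, and
the input format (one entry per line, '#' starting a comment) is an
assumption.

```python
def parse_entries(text):
    """Return the blacklist entries found in text, one per line."""
    entries = []
    for line in text.splitlines():
        # Assumption: everything after '#' is a comment. A real list
        # format may need smarter handling if entries can contain '#'.
        entry = line.split('#', 1)[0].strip()
        if entry:
            entries.append(entry)
    return entries


def merge_lists(*lists):
    """Merge entry lists, dropping duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for entries in lists:
        for entry in entries:
            if entry not in seen:
                seen.add(entry)
                merged.append(entry)
    return merged


def diff_lists(old, new):
    """Return (added, removed) between two versions of the merged list."""
    old_set, new_set = set(old), set(new)
    added = [e for e in new if e not in old_set]
    removed = [e for e in old if e not in new_set]
    return added, removed
```

The added and removed entries are what an update feed would have to
carry; how they are serialized into RDF is left out here.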

So, if you want to try this out, you can re-install the old MT-Blacklist
modules and use these settings in your Spam-X config.php (URLs mangled
as I don't want them to be picked up by the search engines just yet):

$_SPX_CONF['mtblacklist_url'] = 'http://www#geeklog#net/backend/spam-merge.txt';
$_SPX_CONF['rss_url'] = 'http://www#geeklog#net/backend/spam-merge-changes.rdf';

Please note that this is experimental and I may have to take it down
without prior notice in case there's a problem (but I would announce
that here, of course).

Also, the list currently only includes entries provided by the MoinMoin
and TWiki communities since, according to Thomas, the MediaWiki entries
often contained faulty regexps or phrases that were too generic. Still,
it's a list of over 3000 entries, including some generic rules.
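
As an aside on faulty regexps: one way to keep a broken entry from
taking down the whole check is to compile each entry on its own and set
aside the ones that fail. The Python sketch below shows the idea. It is
not how Spam-X or any of the wiki engines do it; the function names and
sample entries are made up, and Python's regexp syntax is not identical
to PHP's, so treat it as an illustration of the approach only.

```python
import re


def compile_entries(entries):
    """Compile each entry separately; return (patterns, faulty_entries)."""
    patterns, faulty = [], []
    for entry in entries:
        try:
            patterns.append(re.compile(entry, re.IGNORECASE))
        except re.error:
            # A syntactically invalid regexp is set aside, not fatal.
            faulty.append(entry)
    return patterns, faulty


def is_spam(text, patterns):
    """Return True if any compiled pattern matches somewhere in text."""
    return any(p.search(text) for p in patterns)


# Made-up sample entries; the second one has an unbalanced parenthesis.
sample = [r"cheap-pills\.example", r"broken(", r"casino[0-9]+\.example"]
patterns, faulty = compile_entries(sample)
```

This only catches syntax errors. An entry that is valid but too generic
would still compile and has to be weeded out by review.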

A note on the MT-Blacklist modules: You'll need both
MTBlackList.Examine.class.php and Import.Admin.class.php as well as the
magpierss directory. You can take those from a Geeklog 1.4.0 tarball. If
you get them from CVS, make sure to take them from the geeklog_1_4_1_1
branch, as the modules from the trunk may depend on other changes in CVS.

bye, Dirk


-- 
http://spam.tinyweb.net/



