saurabh gupta wrote:

>The implementation part mainly consists of parsing XML files. I have
>already worked on parsing XML files using libraries like msxml and
>nsxml (not open source, however) and some of my own wrappers.

Geeklog also comes with classes to read and write feeds in various
formats, so unless they are missing something for this project, you
wouldn't have to step down to that level.

>For example, if a blacklist item has the title "casino", then a new
>entry under that item will tell how many posts or spams have been
>recognized by the keyword "casino". This will have the benefit of
>determining the validity of any blacklist item.

Not sure what exactly this would add? The concept is called "web of
trust" for a reason - if you find someone adding useless rules, don't
use their feed.

>The overhead in this idea is that for each post which is recognized
>as spam, the RSS feed file has to be updated.

The impact, e.g. during a massive spam wave, could be significant, and
IMO all of that for very little added value.
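
To make the cost concrete, here is a rough sketch in Python (not Geeklog
code). The <hits> element, the sample feed, and both function names are
made up for illustration; nothing like them exists in the current feed
format. It only shows that a per-keyword counter stored in the feed means
one full parse and rewrite of the feed for every post recognized as spam.

```python
import xml.etree.ElementTree as ET

# Hypothetical blacklist feed. The <hits> element is an assumption made
# for this sketch and is not part of any existing feed format.
FEED = """<rss version="2.0"><channel>
<title>Example blacklist</title>
<item><title>casino</title><hits>0</hits></item>
<item><title>poker</title><hits>0</hits></item>
</channel></rss>"""


def record_hit(feed_xml, keyword):
    """Increment the counter for one keyword and re-serialize the whole feed."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        if item.findtext("title") == keyword:
            hits = item.find("hits")
            hits.text = str(int(hits.text) + 1)
    return ET.tostring(root, encoding="unicode")


def process_spam_wave(feed_xml, matched_keywords):
    """Rewrite the feed once per spam post; return the feed and rewrite count."""
    rewrites = 0
    for keyword in matched_keywords:
        feed_xml = record_hit(feed_xml, keyword)
        rewrites += 1
    return feed_xml, rewrites


# A wave of 520 spam posts causes 520 complete rewrites of the feed.
feed, rewrites = process_spam_wave(FEED, ["casino"] * 500 + ["poker"] * 20)
```

In a real setup each of those rewrites would also be a write to disk, and
anyone subscribed to the feed would see it change with every spam post.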

bye, Dirk


