Admin Forum:Deleting Nonsense Pages Bot


I have a spare account called Head.Boy.Hog.Bot and I thought it could come in handy on the Hogwarts RPG Wiki. I was wondering whether I could get it to automatically delete newly created pages that don't have much on them (like a few words). Is that possible? If it is, how would I do that? Thanks!


 * I'm not sure how a bot could identify a page as containing 'nonsense' content. And having a bot delete pages is a bit risky, IMO: if nobody supervises it, it may delete pages that have valid content. You could make it add the pages to a specific category, then review the pages and delete them manually. Hunter789 01:23, August 24, 2011 (UTC)


 * Okay.

Well, here's the thing. I was intrigued by this challenge. I don't know why, because honestly, the software already gives you a report of small pages at Special:ShortPages. It's sort of redundant to make a bot which does this same task. And, like Hunter789 already pointed out, you do want to have human oversight over deletions based upon reasons of content.

But for some reason I'm still intrigued by this. Maybe because if we solved this, we could have bots that automatically labelled pages of 0 length, and therefore flagged pages that had recently been blanked. I think there is an application for a bot that will count the number of words on a page, despite Special:ShortPages, and so I've spent some time on it.
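As a rough sketch of that word-counting idea, the check could be done in plain Python before any bot framework gets involved. Everything here is an assumption for illustration: the function names `count_words` and `review_label`, the 50-word threshold, and the choice to treat a "word" as any run of letters, digits, or underscores (so wiki markup such as link targets gets counted too). It only labels text for a human to review; it deletes nothing.

```python
import re


def count_words(wikitext):
    """Rough word count: every run of letters, digits, or underscores."""
    return len(re.findall(r'\w+', wikitext))


def review_label(wikitext, max_words=50):
    """Return a label for human review; nothing is deleted here."""
    words = count_words(wikitext)
    if words == 0:
        # No words at all: an empty or freshly blanked page.
        return 'blank'
    if words <= max_words:
        return 'short'
    return 'ok'
```

A page of whitespace comes back as `'blank'` and a one-sentence stub as `'short'`, so both can be put in front of an admin rather than removed automatically.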

I'm kind of okay with regex and user-fixes.py solutions, but I'm not brilliant, so this doesn't work. But it might inspire someone else to figure out what I've done wrong. Here's my best shot at this so far. It runs, so it hasn't got syntax errors, but it doesn't catch anything. The basic theory here is that the regular expression is meant to find pages that have a "word boundary" count between 0 and 50, indicating an extremely short page. Then it's supposed to dump the contents of the page, plus a category, back on to the page.

 fixes['length'] = {
     'regex': True,
     'msg': {
         'en': u'cat tagging exceptionally small articles'
     },
     'replacements': [
         (r'(^[\b(\w+?)\b]{0,50}$)', r'\1'),
     ],
 }
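One likely reason the pattern above catches nothing can be checked with Python's `re` module alone. Inside square brackets, `\b(\w+?)\b` is not a group between two word boundaries but a character class (backspace, parentheses, word characters, `+`, `?`), so `{0,50}` counts characters rather than words, and a single space is enough to stop the match. The replacement `r'\1'` also writes back exactly what was matched, so no category is ever added. The sketch below shows both points and one possible alternative. `SHORT_PAGE`, `tag_short_page`, and the category name are hypothetical, and it assumes the default regex flags, where `^` and `$` anchor to the whole text.

```python
import re

# The pattern from the post, unchanged.
ORIGINAL = r'(^[\b(\w+?)\b]{0,50}$)'

# Hypothetical alternative: the whole page holds at most 50 words.
# Requiring \W+ after each counted word makes every repetition
# consume exactly one full word.
SHORT_PAGE = re.compile(r'\A(\W*(?:\w+\W+){0,49}(?:\w+\W*)?)\Z')

# Hypothetical category name, for illustration only.
CATEGORY = '[[Category:Short pages]]'


def tag_short_page(text):
    """Append CATEGORY when the page has 50 words or fewer."""
    return SHORT_PAGE.sub(
        lambda m: m.group(1).rstrip() + '\n' + CATEGORY, text, count=1)


# A space is not in the character class, so ordinary prose never matches.
print(re.search(ORIGINAL, 'a few words'))
# Even when the original does match, replacing with \1 changes nothing.
print(re.sub(ORIGINAL, r'\1', 'word'))
print(tag_short_page('Just a few words.'))
```

If this approach suits the wiki, the pattern and a replacement ending in the category could be carried over into the `'replacements'` list, but that step is untested here, and the word count is only approximate because wiki markup is counted along with the visible text.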

Once that's in user-fixes.py, it's then triggered by

 python replace.py -fix:length -cat:whatever

So there ya go. It doesn't do what it's supposed to do, but I think it's really close. Again, maybe someone else will drop by and figure out why it's not quite working. 22:49:43 Thu 15 Sep 2011