User:Whobot

This bot is run by Who for categorization related tasks. Whobot runs Pearle Wisebot code. This bot is approved and running with a bot flag at 10 second variable intervals.

Feel free to report any errors or problems on the talk page. I watch the bot as it runs and fix errors almost immediately.

Not to be confused with K1, sometimes referred to as "Whobot", from the Doctor Who TV series. K1 is listed on List of robots in Doctor Who, but I could not find a PD image.


 * Edit count: 27,489 as of 02:23, 28 November 2005 (UTC)

Tasks

 * Current tasks

Cleanup

 * Category:Articles to check for link ordering

Authorized behavior

 * The following is almost verbatim from the text found on User:Pearle

Whobot has obtained authorization from Wikipedia talk:Bots to execute the following tasks. All tasks are performed by User:Who running Whobot and using data files on his home computer in the following formats. The original code and data were obtained from User:Beland, and Whobot is a clone of Pearle Wisebot with slight modifications.

Automatically move categories

 * Parse a file and match commands of the form:
 * MOVE_CONTENTS Category:Name_of_A Category:Name_of_B cfd_log_date or speedy
 * MOVE_CONTENTS_INCL_CATS Category:Name_of_A Category:Name_of_B cfd_log_date or speedy


 * Download Category:Name_of_A
 * Parse the page to extract all of its member articles and subcategories.
 * For each member, replace all instances of the Category:Name_of_A link with a Category:Name_of_B link, preserving sort fields. Members that contain any nowiki or pre tags in the wikisource will be skipped.
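
The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the Pearle/Whobot code itself: the function names `parse_move_command` and `recategorize` are hypothetical, the command is assumed to be four whitespace-separated fields, and category links are assumed to have the usual double-bracket form with an optional sort field after a pipe.

```python
import re

MOVE_VERBS = ("MOVE_CONTENTS", "MOVE_CONTENTS_INCL_CATS")

def parse_move_command(line):
    """Parse one command line; return None if it is not a move command.

    Assumption: the four fields are separated by whitespace and the last
    field (a CFD log date or the word "speedy") is a single token.
    """
    parts = line.split()
    if len(parts) != 4 or parts[0] not in MOVE_VERBS:
        return None
    verb, old_cat, new_cat, reason = parts
    return {
        "include_subcats": verb == "MOVE_CONTENTS_INCL_CATS",
        "old": old_cat,
        "new": new_cat,
        "reason": reason,
    }

def recategorize(wikitext, old_cat, new_cat):
    """Swap links to old_cat for links to new_cat, keeping any sort field.

    Returns None when the page contains nowiki or pre tags, meaning the
    member should be skipped.
    """
    if re.search(r"<\s*(nowiki|pre)\b", wikitext, re.IGNORECASE):
        return None
    old_name = old_cat.split(":", 1)[1]
    # Assumption: underscores in the command stand for spaces in the link.
    new_name = new_cat.split(":", 1)[1].replace("_", " ")
    # Spaces and underscores are treated as interchangeable in the old name.
    words = re.split(r"[ _]+", old_name)
    name_pattern = r"[ _]+".join(re.escape(w) for w in words)
    pattern = re.compile(
        r"\[\[\s*[Cc]ategory\s*:\s*" + name_pattern + r"\s*(\|[^\]]*)?\]\]"
    )
    # Group 1 is the optional "|sort field" part, copied through unchanged.
    return pattern.sub(
        lambda m: "[[Category:" + new_name + (m.group(1) or "") + "]]",
        wikitext,
    )
```

Returning `None` for skipped pages keeps the "do not touch" case distinct from "no change needed", which returns the text unmodified.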

Moving a category is the equivalent of deletion, so this function will only be run on commands that have been approved by Categories for deletion.


 * New feature: the edit summary now lists the CFD day page for easier reference.
 * Example: Recat per WP:CFD Category:Mad_scientists to Category:Fictional_scientists

Remove articles from a category

 * 1) Accept commands of the following form:
 * REMOVE_X_FROM_CAT Page_name Category:Category_name


 * 1) Download the wikisource of Page_name
 * 2) Remove the Category:Category_name link from the text
 * 3) Post the new text
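
A minimal sketch of the removal step, in Python rather than the bot's own code. The names `parse_remove_command` and `remove_category` are hypothetical, downloading and posting are left out, and dropping the newline that follows the removed link is an assumption of this sketch rather than documented behavior.

```python
import re

def parse_remove_command(line):
    """Return (page_name, category) for a REMOVE_X_FROM_CAT line, else None."""
    parts = line.split()
    if len(parts) != 3 or parts[0] != "REMOVE_X_FROM_CAT":
        return None
    return parts[1], parts[2]

def remove_category(wikitext, category):
    """Delete every link to the given category from the page text."""
    name = category.split(":", 1)[1]
    # Spaces and underscores are treated as interchangeable in the name.
    words = re.split(r"[ _]+", name)
    name_pattern = r"[ _]+".join(re.escape(w) for w in words)
    pattern = re.compile(
        r"\[\[\s*[Cc]ategory\s*:\s*" + name_pattern
        + r"\s*(\|[^\]]*)?\]\][ \t]*\n?"
    )
    return pattern.sub("", wikitext)
```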

Tag categories with the CFD tag
Categories nominated to Categories for deletion need to be tagged with the CFD tag or a similar template to inform watchers of the potential deletion or renaming. Pearle can do this with commands of the form:
 * ADD_CFD_TAG Category:Category_name_here

For nominations en masse, the tag should be changed to e.g.:

REMOVE_CFD_TAG

 * See Wikipedia talk:Bots

New category/interwiki style
Minor changes and bugfixes may occur in response to community complaints or suggestions.

Rules

 * Whobot should attempt to do a category/interwiki cleanup whenever it edits an article, but there will be no mass cleanup run (except for articles already edited by Whobot) unless requested.
 * HTML comments on the same line following a category or interwiki tag will remain there. Any other text there will trigger a flag for review.
 * If a category or interwiki tag is found in the "body text" area, it will be flagged for review.
 * Canonicalize "zh-cn" (Chinese simplified) and "zh-tw" (Chinese traditional) to "zh" because the simplified/traditional distinction is now being solved in software.
 * Canonicalize "minnan" to "zh-min-nan", since only the latter is in the official, automatically updated list.
 * Canonicalize "nb" to "no", since only the latter is in the official, automatically updated list. (Added after observing the need for this in practice. -- Beland 4 July 2005 17:03 (UTC))
 * Canonicalize dk to da. (Same as above. -- Beland 02:48, 25 August 2005 (UTC))
 * Multi-line HTML comments must be preserved
 * Separate category and interwiki links mashed together on the same line.
 * Don't change interwiki link sort order.
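
The canonicalization rules above amount to a small lookup table. The following Python sketch is illustrative only: the names `CANONICAL_PREFIX` and `canonicalize_interwiki` are hypothetical, and the regular expression assumes interwiki links have the form of a language prefix, a colon, and a title inside double brackets. Because each link is rewritten in place, the interwiki sort order is left unchanged.

```python
import re

# Prefix rewrites taken from the rules listed above.
CANONICAL_PREFIX = {
    "zh-cn": "zh",
    "zh-tw": "zh",
    "minnan": "zh-min-nan",
    "nb": "no",
    "dk": "da",
}

# Assumed link shape: [[prefix:Title]] with no pipe.
INTERWIKI = re.compile(r"\[\[\s*([A-Za-z-]+)\s*:([^\]|]+)\]\]")

def canonicalize_interwiki(wikitext):
    """Rewrite non-canonical language prefixes; leave other links alone."""
    def fix(match):
        prefix = match.group(1).lower()
        if prefix not in CANONICAL_PREFIX:
            return match.group(0)
        return "[[" + CANONICAL_PREFIX[prefix] + ":" + match.group(2) + "]]"
    return INTERWIKI.sub(fix, wikitext)
```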

Algorithm

 * Break the article up into segments, each of which is tagged. Use two arrays, one for content, and one for names.


 * Parse input into segments, each of which is labeled by type.
 * Find nowiki tags everywhere.
 * Find comment tags everywhere else.
 * Find HTML tags everywhere else.
 * Find category links everywhere else.
 * Find interwiki links everywhere else.
 * Find template tags everywhere else.
 * Lump HTML tags following a category segment (except category and interwiki links) until the next newline into the category segment.
 * Lump everything following an interwiki segment (except category and interwiki links) until the next newline into the interwiki segment.
 * The remainder of the page will be tagged as body text.
 * Move any category or interwiki links at the top of the page to the very bottom.
 * Move before the category links, preserving whatever whitespace preceded or followed them.
 * Delete these comments near the category/interwiki section (case and whitespace insensitive):


 * Determine whether or not the page should be flagged for manual review. Find the last non-category, non-interwiki segment.  If there are any interwiki or category links before this segment, flag the page for manual review by adding a template at the end.
 * If the page has not been flagged: consolidate all interwiki links at the end, preceded by category links, preceded by all other segment types. Be sure to retain the original order of segments in each of the three groups.
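
The last two steps (flagging and consolidation) can be sketched as below, assuming the page has already been split into labeled segments. This is an illustration in Python, not the bot's code: segments are held as `(kind, text)` pairs instead of the two parallel arrays mentioned above, the kind names and function names are hypothetical, and ignoring whitespace-only segments when looking for the last non-link segment is an assumption of this sketch.

```python
LINK_KINDS = ("category", "interwiki")

def needs_review(segments):
    """True if a category or interwiki link precedes the last other segment."""
    last_other = -1
    for i, (kind, text) in enumerate(segments):
        # Assumption: blank filler between links does not count as content.
        if kind not in LINK_KINDS and text.strip():
            last_other = i
    if last_other < 0:
        return False
    return any(kind in LINK_KINDS for kind, _ in segments[:last_other])

def consolidate(segments):
    """Order segments as: everything else, then categories, then interwikis.

    List comprehensions keep the original relative order within each group.
    """
    others = [s for s in segments if s[0] not in LINK_KINDS]
    categories = [s for s in segments if s[0] == "category"]
    interwikis = [s for s in segments if s[0] == "interwiki"]
    return others + categories + interwikis

def clean_page(segments):
    """Return (segments, flagged); flagged pages are left in their order."""
    if needs_review(segments):
        return segments, True
    return consolidate(segments), False
```

Adding the review template to flagged pages is omitted here because its name is not given in the text above.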

msg: syntax cleanup
The msg: syntax is deprecated in favor of the newer template syntax. Pearle is authorized to make this change wherever it is needed. The msg: syntax was rumored to break in MediaWiki 1.5, though it is apparently still working.
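
A minimal Python sketch of this cleanup. It rests on an assumption not spelled out in the text above: that the old form is a template call with a `msg:` prefix inside double braces and that the replacement is the same call without the prefix. The function name `strip_msg_prefix` is hypothetical.

```python
import re

def strip_msg_prefix(wikitext):
    """Drop a leading msg: prefix from template calls (assumed syntax)."""
    return re.sub(r"\{\{\s*msg:\s*", "{{", wikitext, flags=re.IGNORECASE)
```

For example, under that assumption a call written with the prefix, such as `{{msg:stub}}`, would come out as `{{stub}}`, while calls without the prefix are left as they are.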

Code
You may find the Whobot revised code here.