User:SuggestBot

SuggestBot is a program that attempts to help Wikipedia users find pages to edit. More detail is below.

If you would like personalized recommendations, please leave your name at User:SuggestBot/Requests.

About SuggestBot
I am a Wikipedia bot belonging to ForteTuba. My job in life is to match people with pages they might like to contribute to based on their past contributions. I use a variety of algorithms, including standard information retrieval and collaborative filtering techniques, to make suggestions. I also sometimes point people to the Community Portal, or their past edits, as a source of inspiration.

I usually run at the GroupLens Research Lab on various machines, mostly using a recent copy of the Wikipedia database. I have found that I need to download people's contributions when making recommendations (to avoid recommending pages they've edited since the last dump), and I also occasionally download recent contributions to check whether people are taking recommendations. I try to be laid back about this. I'm still under development, cobbled together from bits of Perl for now.
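As a rough illustration of why the fresh download matters, here is a minimal sketch of the "don't recommend what they've already edited" step. It is written in Python for readability, not taken from my actual Perl, and the function name and arguments are hypothetical.

```python
def filter_already_edited(candidates, dump_edits, recent_edits):
    """Drop candidate titles the user has already edited.

    candidates:   ordered list of recommended page titles
    dump_edits:   titles the user edited according to the (possibly stale) dump
    recent_edits: titles from a fresh download of the user's contributions
    All three are assumed to be plain lists of title strings.
    """
    # The dump alone misses anything edited after it was taken,
    # so combine it with the freshly downloaded contributions.
    already_edited = set(dump_edits) | set(recent_edits)
    return [title for title in candidates if title not in already_edited]
```

For example, with candidates A, B, and C, where A appears in the dump and C was edited last week, only B would be left to recommend.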

I tell people about suggestions in one of two ways:
 * For people who ask, I post the suggestions directly to their talk page.
 * For people I pick at random, I create a subpage of my user page and put the links there, then leave a brief note on their talk page.

If I made personalized recommendations for you, please tell me whether they were useful and how to make them better. Comments on recommendations, as well as general comments, suggestions, or complaints, are best left on my talk page. Comments are welcome and valuable, as they will help me do a better job of helping Wikipedia.

No one had strong objections on Wikipedia_talk:Bots when I was proposed, so I'm running off and on. I've run a couple of pilot tests, and I have a page where people can request suggestions. Eventually I want to become a Wikipedia Tool.

Limitations/issues

 * It's still not good with non-low-ASCII characters in usernames. Sigh.
 * Some people would like recommendations for wanted articles (redlinks). This is hard because the only info we have on a redlink is a title and the pages that link to it -- no edit history to work from. Might be able to do this.
 * Someone suggested removing section stubs from the stubs list. Probably the right thing to do.
 * Right now you have to make requests each time you want recommendations. It should have an easy way to support repeat customers.
 * Should probably remember what's been recommended to a person, and avoid re-recommending for repeat customers.
 * Eventually needs to re-download lists of articles automatically.
 * Automated posting of suggestions/notifications is broken for some talk pages, and I don't know why. Probably redirects; someone pointed this out to me.
 * Only reads up to N (=500 as of Mar 6 2006) of a person's most recent edits when making recs. It tries to get older edits from a dump, in order to not recommend articles people have edited in the past, but this isn't perfect because dumps go out of date (there might be a gap between your last 500 edits and any edits it finds in the dump, and articles in that space might be recommended).
 * Doesn't handle redirects (also leading to recommendation of already-edited pages). This appears to be a relatively minor issue based on a little bit of testing of recommended items.  I'd like to do this on the back end, so that if a person has edited several versions of a page, SuggestBot would "know" that they were all the same page (and maybe do better).
 * Ignores anything outside the main namespace. Taking article talk pages into account might be interesting, as a better representation of people's interests than direct article edits alone. On the other hand, people often post on talk pages of articles they'd like to see deleted.
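
The back-end redirect handling mentioned in the list above is not implemented yet. One way it might look, sketched in Python under the assumption that a redirect table (a plain mapping from redirect title to target title) can be extracted from a dump; all names here are hypothetical:

```python
def resolve_redirect(title, redirects, max_hops=10):
    """Follow redirects until reaching a title that is not itself a redirect.

    redirects: dict mapping a redirect title to its target title (assumed shape).
    max_hops and the `seen` set guard against redirect loops.
    """
    seen = set()
    while title in redirects and title not in seen and len(seen) < max_hops:
        seen.add(title)
        title = redirects[title]
    return title


def canonical_profile(edited_titles, redirects):
    """Collapse a user's edited titles so redirects count as their target page."""
    return {resolve_redirect(title, redirects) for title in edited_titles}
```

Applying the same `resolve_redirect` to candidate titles before comparing them with the profile would keep a page from being recommended under one title when the person has already edited it under another.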

Changelog

 * Try to improve profiles by ignoring minor and disambig edits. -- 11:28, 7 August 2006 (UTC)
 * Kick over coedit recommender to 7-17 dump. -- 11:28, 7 August 2006 (UTC)
 * Removed random recommendations because they were rarely followed. -- 05:01, 27 March 2006 (UTC)
 * Eliminate most previously edited articles by looking at a relatively recent local dump. Many dumps fail on en, and processing them takes days, so for now we're on a mostly-processed version of the 2-19-06 dump. -- 16:24, 15 March 2006 (UTC)
 * Maybe fixed all accented character issues? -- 15:58, 15 March 2006 (UTC)
 * Add a filter to not recommend articles in the top N% (N=1) of edited articles -- a better way to handle the controversial article problem, and consistent across recommendation algorithms. -- 00:11, 15 March 2006 (UTC)
 * Instead of recommending among all articles, focus on recommending articles tagged as stubs or needing work. (Somewhat like OpenTask but giving more weight to stubs since there are so many more of them.) -- 21:49, 14 March 2006 (UTC)
 * Expand edit removal to include protection actions. -- ForteTuba 16:16, 14 March 2006 (UTC)
 * Make some random recommendations, to make sure all articles get recommended eventually (a la User:Pearle's maintenance of Template:Opentask). -- 18:48, 13 March 2006 (UTC)
 * Harshly penalize articles with lots of links in the link-based recommender, to recommend fewer popular pages (which presumably offer less opportunity to contribute). -- 16:35, 10 March 2006 (UTC)
 * Remove many edits as input to recommendations, if the comment suggests they're reverting vandals. These edits appear to cause recommendations to zero in on controversial pages. -- 16:35, 10 March 2006 (UTC)
 * Fix many accented character issues. -- 21:41, 7 March 2006 (UTC)
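
Two of the changelog entries above, cleaning up the edit profile and filtering out the most-edited articles, can be sketched roughly as follows. This is illustrative Python, not my real Perl: the keyword list, the shape of the edit records, and the function names are all assumptions, and the disambiguation check is left out because it would need page metadata.

```python
import math

# Hypothetical keywords; the real comment matching may differ.
REVERT_HINTS = ("revert", "rvv", "vandal")


def clean_profile(edits):
    """Return the titles worth using as input to the recommenders.

    edits: list of dicts with 'title', 'comment', and 'minor' keys (assumed shape).
    Minor edits and edits whose comment looks like a vandalism revert are skipped.
    """
    kept = []
    for edit in edits:
        comment = edit.get("comment", "").lower()
        if edit.get("minor", False):
            continue
        if any(hint in comment for hint in REVERT_HINTS):
            continue
        kept.append(edit["title"])
    return kept


def drop_most_edited(candidates, edit_counts, top_percent=1.0):
    """Remove candidates that fall in the top `top_percent` percent of articles by edit count.

    edit_counts: dict mapping every article title to its total number of edits.
    """
    ranked = sorted(edit_counts, key=edit_counts.get, reverse=True)
    cutoff = math.ceil(len(ranked) * top_percent / 100.0)
    too_popular = set(ranked[:cutoff])
    return [title for title in candidates if title not in too_popular]
```

Because `drop_most_edited` works on the candidate list rather than inside any one recommender, the same filter can sit behind every algorithm, which is the consistency the changelog entry is after.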