You shall not publish: Edit filters on EN Wikipedia
Master's Thesis Defence
Lyudmila Vaseva
6 January 2020
Overview
- Motivation and research questions
- Analysis sources
- Findings
- Directions for future studies
Motivation
Source: Halfaker et al. "The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline"
Research questions
- Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?
- Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?
- Q3: Which type of tasks do filters take over?
- Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?
Analysis Sources
- Literature
- Documentation
- Data
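Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?
- edit filters are triggered before an edit is published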
- directly disallow certain types of obvious, pervasive (perhaps automated), and difficult-to-remove vandalism
- can target malicious users directly without restricting everyone (unlike page protection)
- historically faster and more reliable because they are a direct part of the core software
- people were fed up with bot governance
Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?
- introduced before most vandalism-fighting ML systems came along
- rule-based systems are more transparent and accountable
- easier to work with
- allow for finer levels of control than ML, e.g. disallowing specific users
- make collaboration easier
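A minimal sketch of what "rule-based" means here, assuming a toy model rather than the actual filter syntax used on Wikipedia. The names `Edit`, `Rule`, `RULES`, `review`, the user name "ExampleVandal", and the two example rules are all hypothetical. Every rule is an explicit, readable condition that is checked before the edit is published, and a rule can target a single user.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Edit:
    user: str
    added_text: str


@dataclass
class Rule:
    name: str
    matches: Callable[[Edit], bool]
    action: str  # toy model: either "disallow" or "log"


# Hypothetical rules for illustration only; not real Wikipedia filters.
RULES = [
    # Fine-grained control: a rule can single out one specific user.
    Rule("targeted user", lambda e: e.user == "ExampleVandal", "disallow"),
    # Pattern rule: 20 or more repetitions of the same character.
    Rule(
        "long character run",
        lambda e: re.search(r"(.)\1{19,}", e.added_text) is not None,
        "log",
    ),
]


def review(edit: Edit, rules: List[Rule] = RULES) -> Tuple[bool, List[str]]:
    """Check an edit against all rules BEFORE it is published.

    Returns (published, names of triggered rules). The edit is published
    unless at least one triggered rule has the action "disallow".
    """
    triggered = [rule for rule in rules if rule.matches(edit)]
    published = all(rule.action != "disallow" for rule in triggered)
    return published, [rule.name for rule in triggered]


print(review(Edit("ExampleVandal", "hello")))  # (False, ['targeted user'])
print(review(Edit("Alice", "a" * 25)))         # (True, ['long character run'])
print(review(Edit("Alice", "a normal edit")))  # (True, [])
```

Because each rule is a plain condition, anyone reading the list can see exactly why an edit was disallowed or logged. That is the transparency argument made above.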
Q3: Which type of tasks do filters take over?
Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?
Data source: R.S. Geiger and A. Halfaker. 2017. Code and Datasets for: Operationalizing Conflict and Cooperation Between Automated Software Agents in Wikipedia. Figshare (2017). https://doi.org/10.6084/m9.figshare.5362216
Directions for future studies
- Verify results
- What proportion of quality control work do filters take over?
- When should a task be implemented as a bot, and when as a filter?
- What are the repercussions on affected editors?
- What are the differences between how filters are governed on EN Wikipedia compared to other language versions?
There are 954 edit filters on EN Wikipedia: roughly 21% of them are active, 16% are disabled, and 63% are deleted
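As a quick worked example, the rounded shares above can be converted back into approximate filter counts. The counts are estimates derived from the rounded percentages, not exact figures from the data; the variable names are illustrative.

```python
TOTAL_FILTERS = 954

# Rounded shares as stated on the slide.
SHARES = {"active": 0.21, "disabled": 0.16, "deleted": 0.63}

# Approximate counts; exact values may differ slightly because of rounding.
approx_counts = {
    state: round(TOTAL_FILTERS * share) for state, share in SHARES.items()
}

for state, count in approx_counts.items():
    print(f"{state}: ~{count}")
# active: ~200
# disabled: ~153
# deleted: ~601
```

So roughly 200 filters are active at any given point, while about 600 have been deleted.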