<h2 class="title">You shall not publish: Edit filters on EN Wikipedia</h2>
<p>Master Thesis Defence</p>
<p class="author">Lyudmila Vaseva</p>
<p class="date">6 January 2020</p>
</section>
<section class="slide level1">
<h2 id="overview">Overview</h2>
<ul>
<li class="fragment">Motivation and research questions</li>
<li class="fragment">Analysis sources</li>
<li class="fragment">Findings</li>
<li class="fragment">Directions for future studies</li>
</ul>
</section>
<section class="slide level1">
<h2 id="what-is-an-edit-filter">What is an edit filter</h2>
</section>
<section class="slide level1">
<h2 id="motivation">Motivation</h2>
<p><img src="images/editors-rise-decline.png" height="450" class="stretch" alt="Rise and decline in numbers of editors on EN Wikipedia"><small>Source: Halfaker et al. "The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline"</small></p>
<ul>
<li class="fragment">Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?</li>
<li class="fragment">Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?</li>
<li class="fragment">Q3: Which type of tasks do filters take over?</li>
<li class="fragment">Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?</li>
</ul>
</section>
<section class="slide level1">
<h2 id="analysis-sources">Analysis Sources</h2>
<ul>
<li class="fragment">Literature</li>
<li class="fragment">Documentation</li>
<li class="fragment">Data</li>
</ul>
</section>
<section class="slide level1">
<h2 id="q1-what-is-the-role-of-edit-filters-among-existing-algorithmic-quality-control-mechanisms-on-wikipedia-bots-semi-automated-tools-ores">Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?</h2>
</section>
<section class="slide level1">
<p><img src="images/funnel-with-filters-new.png" class="stretch" height="500" alt="Funnel diagram of all vandal fighting mechanisms according to me"></p>
</section>
<section class="slide level1">
<ul>
<li class="fragment">edit filters are triggered <em>before</em> an edit is published</li>
<li class="fragment">directly disallow certain types of obvious, pervasive (perhaps automated), and difficult-to-remove vandalism</li>
<li class="fragment">historically faster and more reliable, because they are a direct part of the core software</li>
<li class="fragment">people were fed up with bot governance</li>
</ul>
</section>
<section class="slide level1">
<h2 id="q2-edit-filters-are-a-classical-rule-based-system.-why-are-they-still-active-today-when-more-sophisticated-ml-approaches-exist">Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?</h2>
</section>
<section class="slide level1">
<ul>
<li class="fragment">introduced before most vandalism-fighting ML systems came along</li>
<li class="fragment">rule-based systems are more transparent and accountable</li>
<li class="fragment">easier to work with</li>
<li class="fragment">allow for finer levels of control than ML, e.g. disallowing specific users</li>
<li class="fragment">make collaboration easier</li>
</ul>
</section>
<section class="slide level1">
<h2 id="q3-which-type-of-tasks-do-filters-take-over">Q3: Which type of tasks do filters take over?</h2>
</section>
<section class="slide level1">
<p><img src="images/all-actions-enabled-public-filters.png" alt="Filter actions for enabled public filters" align="left" width="450"><img src="images/all-actions-enabled-hidden-filters.png" alt="Filter actions for enabled hidden filters" align="right" width="450"></p>
</section>
<section class="slide level1">
<p><img src="images/manual-tags-distribution-enabled-filters.png" alt="Distribution of manually assigned labels for enabled filters"></p>
</section>
<section class="slide level1">
<h2 id="q4-how-have-these-tasks-evolved-over-time-are-there-changes-in-the-type-number-etc.">Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?</h2>
</section>
<section class="slide level1">
<p><img src="images/filter-hits-zoomed.png" alt="Number of filter hits per month, Mar 2009-Jan 2019"></p>
</section>
<section class="slide level1">
<p><img src="images/number-edits-over-the-years.png" alt="Number of edits over the years"></p>
</section>
<section class="slide level1">
<p><img src="images/reverts.png" alt="Number of reverts per month, Jul 2001-Jul 2017"><small>Data source: R.S. Geiger and A. Halfaker. 2017. Code and Datasets for: Operationalizing Conflict and Cooperation Between Automated Software Agents in Wikipedia. Figshare (2017). https://doi.org/10.6084/m9.figshare.5362216</small></p>
</section>
<section class="slide level1">
<p><img src="images/filter-hits-actions.png" alt="Number of filter hits per month, according to filter action"></p>
</section>
<section class="slide level1">
<p><img src="images/filter-hits-manual-tags.png" alt="Number of filter hits per month, according to manually assigned labels" height="250"><img src="images/filter-hits-editor-actions.png" alt="Number of filter hits per month, according to causing editor's action" height="250"></p>
<p class="fragment">
<small>filter 527 “T34234: log/throttle possible sleeper account creations”</small>
</p>
</section>
<section class="slide level1">
<h3 id="directions-for-future-studies">Directions for future studies</h3>
<ul>
<li class="fragment">Verify results</li>
<li class="fragment">What proportion of quality control work do filters take over?</li>
<li class="fragment">To implement a bot or to implement a filter?</li>
<li class="fragment">What are the repercussions on affected editors?</li>
<li class="fragment">What are the differences between how filters are governed on EN Wikipedia compared to other language versions?</li>
</ul>
</section>
<section id="thank-you" class="slide level1">
<h1>Thank you!</h1>
<p>These slides are licensed under the <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0 License</a>.</p>
<p>The project is available under: <a href="https://git.imp.fu-berlin.de/luvaseva/wikifilters" class="uri">https://git.imp.fu-berlin.de/luvaseva/wikifilters</a></p>
<p><img src="images/general-stats-donut.png" class="stretch" height="500" alt="There are 954 edit filters on EN Wikipedia: roughly 21% of them are active, 16% are disabled, and 63% are deleted"><small>There are 954 edit filters on EN Wikipedia: roughly 21% of them are active, 16% are disabled, and 63% are deleted</small></p>
</section>
<section class="slide level1">
<p><img src="images/detailed-manual-tags-distribution.png" class="stretch" height="500" alt="Distribution of detailed manual tags"></p>