<!DOCTYPE html> <html> <head> <meta charset="utf-8"> <meta name="generator" content="pandoc"> <meta name="author" content="Master Thesis Defence"> <title>You shall not publish: Edit filters on EN Wikipedia</title> <meta name="apple-mobile-web-app-capable" content="yes"> <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent"> <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui"> <link rel="stylesheet" href="reveal.js/css/reveal.css"> <style type="text/css">code{white-space: pre;}</style> <link rel="stylesheet" href="reveal.js/css/theme/white.css" id="theme"> <!-- Printing and PDF exports --> <script> var link = document.createElement( 'link' ); link.rel = 'stylesheet'; link.type = 'text/css'; link.href = window.location.search.match( /print-pdf/gi ) ? 'reveal.js/css/print/pdf.css' : 'reveal.js/css/print/paper.css'; document.getElementsByTagName( 'head' )[0].appendChild( link ); </script> <!--[if lt IE 9]> <script src="reveal.js/lib/js/html5shiv.js"></script> <![endif]--> </head> <body> <div class="reveal"> <div class="slides"> <section data-background="images/gandalf.png" data-background-opacity="0.2"> <h2 class="title">You shall not publish: Edit filters on EN Wikipedia</h2> <p >Master Thesis Defence</p> <p class="author">Lyudmila Vaseva</p> <p class="date">6 January 2020</p> </section> <section class="slide level1"> <h2 id="overview">Overview</h2> <ul> <li class="fragment">Motivation and research questions</li> <li class="fragment">Analysis sources</li> <li class="fragment">Findings</li> <li class="fragment">Directions for future studies</li> </ul> </section> <section class="slide level1"> <h2 id="what-is-an-edit-filter">What is an edit filter</h2> </section> <section class="slide level1"> <h2 id="motivation">Motivation</h2> <p><img src="images/editors-rise-decline.png" height="450" class="stretch" alt="Rise and decline in numbers of editors on EN Wikipedia"> <small>Source: 
Halfaker et al. "The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline"</small></p> </section> <section class="slide level1"> <h2 id="research-questions">Research questions</h2> <ul> <li class="fragment">Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?</li> <li class="fragment">Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?</li> <li class="fragment">Q3: Which type of tasks do filters take over?</li> <li class="fragment">Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?</li> </ul> </section> <section class="slide level1"> <h2 id="analysis-sources">Analysis sources</h2> <ul> <li class="fragment">Literature</li> <li class="fragment">Documentation</li> <li class="fragment">Data</li> </ul> </section> <section class="slide level1"> <h2 id="q1-what-is-the-role-of-edit-filters-among-existing-algorithmic-quality-control-mechanisms-on-wikipedia-bots-semi-automated-tools-ores">Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?</h2> </section> <section class="slide level1"> <p><img src="images/funnel-with-filters-new.png" class="stretch" height="500" alt="Funnel diagram of all vandal-fighting mechanisms according to me"></p> </section> <section class="slide level1"> <ul> <li class="fragment">edit filters are triggered <em>before</em> an edit is published</li> <li class="fragment">directly disallow certain types of obvious, pervasive (perhaps automated), and difficult-to-remove vandalism</li> <li class="fragment">can target malicious users directly without restricting everyone (in contrast to page protection)</li> <li class="fragment">historically faster and more reliable, because they are a direct part of the core software</li> <li 
class="fragment">people were fed up with bot governance</li> </ul> </section> <section class="slide level1"> <h2 id="q2-edit-filters-are-a-classical-rule-based-system.-why-are-they-still-active-today-when-more-sophisticated-ml-approaches-exist">Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?</h2> </section> <section class="slide level1"> <ul> <li class="fragment">introduced before most vandalism-fighting ML systems came along</li> <li class="fragment">rule-based systems are more transparent and accountable</li> <li class="fragment">easier to work with</li> <li class="fragment">allow for finer levels of control than ML, e.g. disallowing specific users</li> <li class="fragment">allow for collaboration more easily</li> </ul> </section> <section class="slide level1"> <h2 id="q3-which-type-of-tasks-do-filters-take-over">Q3: Which type of tasks do filters take over?</h2> </section> <section class="slide level1"> <p><img src="images/all-actions-enabled-public-filters.png" alt="Filter actions for enabled public filters" align="left" width="450"> <img src="images/all-actions-enabled-hidden-filters.png" alt="Filter actions for enabled hidden filters" align="right" width="450"></p> </section> <section class="slide level1"> <p><img src="images/manual-tags-distribution-enabled-filters.png" alt="Distribution of manually assigned labels for enabled filters"></p> </section> <section class="slide level1"> <h2 id="q4-how-have-these-tasks-evolved-over-time-are-there-changes-in-the-type-number-etc.">Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?</h2> </section> <section class="slide level1"> <p><img src="images/filter-hits-zoomed.png" alt="Number of filter hits per month, Mar 2009-Jan 2019"></p> </section> <section class="slide level1"> <p><img src="images/number-edits-over-the-years.png" alt="Number of edits over the years"></p> </section> <section class="slide 
level1"> <p><img src="images/reverts.png" alt="Number of reverts per month, Jul 2001-Jul 2017"> <small>Data source: R.S. Geiger and A. Halfaker. 2017. Code and Datasets for: Operationalizing Conflict and Cooperation Between Automated Software Agents in Wikipedia. Figshare (2017). https://doi.org/10.6084/m9.figshare.5362216</small></p> </section> <section class="slide level1"> <p><img src="images/filter-hits-actions.png" alt="Number of filter hits per month, according to filter action"></p> </section> <section class="slide level1"> <p><img src="images/filter-hits-manual-tags.png" alt="Number of filter hits per month, according to manually assigned labels" height="250"> <img src="images/filter-hits-editor-actions.png" alt="Number of filter hits per month, according to the triggering editor's action" height="250"></p> <p class="fragment"> <small>filter 527 “T34234: log/throttle possible sleeper account creations”</small> </p> </section> <section class="slide level1"> <h3 id="directions-for-future-studies">Directions for future studies</h3> <ul> <li class="fragment">Verify results</li> <li class="fragment">What proportion of quality control work do filters take over?</li> <li class="fragment">To implement a bot or to implement a filter?</li> <li class="fragment">What are the repercussions for affected editors?</li> <li class="fragment">What are the differences between how filters are governed on EN Wikipedia compared to other language versions?</li> </ul> </section> <section id="thank-you" class="slide level1"> <h1>Thank you!</h1> <p>These slides are licensed under the <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0 License</a>.</p> <p><img src="images/Cc-by_new_white.png" alt="by" /> <img src="images/Cc-sa_white.png" alt="sa" /></p> <p>The project is available under: <a href="https://git.imp.fu-berlin.de/luvaseva/wikifilters" class="uri">https://git.imp.fu-berlin.de/luvaseva/wikifilters</a></p> </section> <section id="questions-comments-thoughts" 
class="slide level1"> <h1>Questions? Comments? Thoughts?</h1> </section> <section class="slide level1"> <p><img src="images/general-stats-donut.png" class="stretch" height="500" alt="There are 954 edit filters on EN Wikipedia: roughly 21% of them are active, 16% are disabled, and 63% are deleted"> <small>There are 954 edit filters on EN Wikipedia: roughly 21% of them are active, 16% are disabled, and 63% are deleted</small></p> </section> <section class="slide level1"> <p><img src="images/detailed-manual-tags-distribution.png" class="stretch" height="500" alt="Distribution of detailed manual tags"></p> </section> </div> </div> <script src="reveal.js/lib/js/head.min.js"></script> <script src="reveal.js/js/reveal.js"></script> <script> // Full list of configuration options available at: // https://github.com/hakimel/reveal.js#configuration Reveal.initialize({ controls: false, slideNumber: 'c/t', // Optional reveal.js plugins dependencies: [ { src: 'reveal.js/lib/js/classList.js', condition: function() { return !document.body.classList; } }, { src: 'reveal.js/plugin/zoom-js/zoom.js', async: true }, { src: 'reveal.js/plugin/notes/notes.js', async: true } ] }); </script> </body> </html>