<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<meta name="author" content="Master Thesis Defence">
<title>You shall not publish: Edit filters on EN Wikipedia</title>
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
<link rel="stylesheet" href="reveal.js/css/reveal.css">
<style type="text/css">code{white-space: pre;}</style>
<link rel="stylesheet" href="reveal.js/css/theme/white.css" id="theme">
<!-- Printing and PDF exports -->
<script>
var link = document.createElement( 'link' );
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match( /print-pdf/gi ) ? 'reveal.js/css/print/pdf.css' : 'reveal.js/css/print/paper.css';
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
<!--[if lt IE 9]>
<script src="reveal.js/lib/js/html5shiv.js"></script>
<![endif]-->
</head>
<body>
<div class="reveal">
<div class="slides">
<section data-background="images/gandalf.png" data-background-opacity="0.2">
<h2 class="title">You shall not publish: Edit filters on EN Wikipedia</h2>
<p>Master Thesis Defence</p>
<p class="author">Lyudmila Vaseva</p>
<p class="date">6 January 2020</p>
</section>
<section class="slide level1">
<h2 id="overview">Overview</h2>
<ul>
<li class="fragment">Motivation and research questions</li>
<li class="fragment">Analysis sources</li>
<li class="fragment">Findings</li>
<li class="fragment">Directions for future studies</li>
</ul>
</section>
<section class="slide level1">
<h2 id="what-is-an-edit-filter">What is an edit filter?</h2>
</section>
<section class="slide level1">
<h2 id="motivation">Motivation</h2>
<p><img src="images/editors-rise-decline.png" height="450" class="stretch" alt="Rise and decline in numbers of editors on EN Wikipedia"> <small>Source: Halfaker et al. "The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline"</small></p>
</section>
<section class="slide level1">
<h2 id="research-questions">Research questions</h2>
<ul>
<li class="fragment">Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?</li>
<li class="fragment">Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?</li>
<li class="fragment">Q3: Which type of tasks do filters take over?</li>
<li class="fragment">Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?</li>
</ul>
</section>
<section class="slide level1">
<h2 id="analysis-sources">Analysis sources</h2>
<ul>
<li class="fragment">Literature</li>
<li class="fragment">Documentation</li>
<li class="fragment">Data</li>
</ul>
</section>
<section class="slide level1">
<h2 id="q1-what-is-the-role-of-edit-filters-among-existing-algorithmic-quality-control-mechanisms-on-wikipedia-bots-semi-automated-tools-ores">Q1: What is the role of edit filters among existing algorithmic quality-control mechanisms on Wikipedia (bots, semi-automated tools, ORES)?</h2>
</section>
<section class="slide level1">
<p><img src="images/funnel-with-filters-new.png" class="stretch" height="500" alt="Funnel diagram of all vandal-fighting mechanisms according to me"></p>
</section>
<section class="slide level1">
<ul>
<li class="fragment">edit filters are triggered <em>before</em> an edit is published</li>
<li class="fragment">directly disallow certain types of obvious, pervasive (perhaps automated), and difficult-to-remove vandalism</li>
<li class="fragment">can target malicious users directly without restricting everyone (in contrast to page protection)</li>
<li class="fragment">historically faster and more reliable, since they are a direct part of the core software</li>
<li class="fragment">people were fed up with bot governance</li>
</ul>
</section>
<section class="slide level1">
<h2 id="q2-edit-filters-are-a-classical-rule-based-system.-why-are-they-still-active-today-when-more-sophisticated-ml-approaches-exist">Q2: Edit filters are a classical rule-based system. Why are they still active today when more sophisticated ML approaches exist?</h2>
</section>
<section class="slide level1">
<ul>
<li class="fragment">introduced before most vandalism-fighting ML systems came along</li>
<li class="fragment">rule-based systems are more transparent and accountable</li>
<li class="fragment">easier to work with</li>
<li class="fragment">allow for finer levels of control than ML, e.g. disallowing specific users</li>
<li class="fragment">allow for collaboration more easily</li>
</ul>
</section>
<section class="slide level1">
<h2 id="q3-which-type-of-tasks-do-filters-take-over">Q3: Which type of tasks do filters take over?</h2>
</section>
<section class="slide level1">
<p><img src="images/all-actions-enabled-public-filters.png" alt="Filter actions for enabled public filters" align="left" width="450"> <img src="images/all-actions-enabled-hidden-filters.png" alt="Filter actions for enabled hidden filters" align="right" width="450"></p>
</section>
<section class="slide level1">
<p><img src="images/manual-tags-distribution-enabled-filters.png" alt="Distribution of manually assigned labels for enabled filters"></p>
</section>
<section class="slide level1">
<h2 id="q4-how-have-these-tasks-evolved-over-time-are-there-changes-in-the-type-number-etc.">Q4: How have these tasks evolved over time (are there changes in the type, number, etc.)?</h2>
</section>
<section class="slide level1">
<p><img src="images/filter-hits-zoomed.png" alt="Number of filter hits per month, Mar 2009-Jan 2019"></p>
</section>
<section class="slide level1">
<p><img src="images/number-edits-over-the-years.png" alt="Number of edits over the years"></p>
</section>
<section class="slide level1">
<p><img src="images/reverts.png" alt="Number of reverts per month, Jul 2001-Jul 2017"> <small>Data source: R.S. Geiger and A. Halfaker. 2017. Code and Datasets for: Operationalizing Conflict and Cooperation Between Automated Software Agents in Wikipedia. Figshare (2017). https://doi.org/10.6084/m9.figshare.5362216</small></p>
</section>
<section class="slide level1">
<p><img src="images/filter-hits-actions.png" alt="Number of filter hits per month, according to filter action"></p>
</section>
<section class="slide level1">
<p><img src="images/filter-hits-manual-tags.png" alt="Number of filter hits per month, according to manually assigned labels" height="250"> <img src="images/filter-hits-editor-actions.png" alt="Number of filter hits per month, according to causing editor's action" height="250"></p>
<p class="fragment">
<small>filter 527 “T34234: log/throttle possible sleeper account creations”</small>
</p>
</section>
<section class="slide level1">
<h2 id="directions-for-future-studies">Directions for future studies</h2>
<ul>
<li class="fragment">Verify results</li>
<li class="fragment">What proportion of quality control work do filters take over?</li>
<li class="fragment">To implement a bot or to implement a filter?</li>
<li class="fragment">What are the repercussions on affected editors?</li>
<li class="fragment">What are the differences between how filters are governed on EN Wikipedia compared to other language versions?</li>
</ul>
</section>
<section id="thank-you" class="slide level1">
<h1>Thank you!</h1>
<p>These slides are licensed under the <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0 License</a>.</p>
<p><img src="images/Cc-by_new_white.png" alt="by" /> <img src="images/Cc-sa_white.png" alt="sa" /></p>
<p>The project is available under: <a href="https://git.imp.fu-berlin.de/luvaseva/wikifilters" class="uri">https://git.imp.fu-berlin.de/luvaseva/wikifilters</a></p>
</section>
<section id="questions-comments-thoughts" class="slide level1">
<h1>Questions? Comments? Thoughts?</h1>
</section>
<section class="slide level1">
<p><img src="images/general-stats-donut.png" class="stretch" height="500" alt="There are 954 edit filters on EN Wikipedia: roughly 21% of them are active, 16% are disabled, and 63% are deleted"> <small>There are 954 edit filters on EN Wikipedia: roughly 21% of them are active, 16% are disabled, and 63% are deleted</small></p>
</section>
<section class="slide level1">
<p><img src="images/detailed-manual-tags-distribution.png" class="stretch" height="500" alt="Distribution of detailed manual tags"></p>
</section>
</div>
</div>
<script src="reveal.js/lib/js/head.min.js"></script>
<script src="reveal.js/js/reveal.js"></script>
<script>
// Full list of configuration options available at:
// https://github.com/hakimel/reveal.js#configuration
Reveal.initialize({
controls: false,
slideNumber: 'c/t',
// Optional reveal.js plugins
dependencies: [
{ src: 'reveal.js/lib/js/classList.js', condition: function() { return !document.body.classList; } },
{ src: 'reveal.js/plugin/zoom-js/zoom.js', async: true },
{ src: 'reveal.js/plugin/notes/notes.js', async: true }
]
});
</script>
</body>
</html>