\chapter{Background: Quality-control mechanisms on Wikipedia}
\label{chap:background}
\begin{comment}
- algorithmic governance
- code is law
\end{comment}
In the present chapter, we review the scientific literature on quality control mechanisms on Wikipedia in order to better understand the role edit filters play in this ecosystem.
There are works on vandalism detection and the identification of unencyclopedic content in general~\cite{PotSteGer2008},
several articles dedicated to bots and the role they play in maintaining quality on Wikipedia~\cite{GeiHal2013, Geiger2014, GeiHal2017, GeiRib2010, HalRied2012, Livingstone2016, MueDoHer2013, MuellerBirn2014},
a couple of works discussing the fight against vandalism by means of semi-automated tools such as Huggle, Twinkle and STiki~\cite{GeiRib2010, HalRied2012, WestKanLee2010, GeiHal2013},
and some accounts of the emerging machine learning service ORES~\cite{HalTar2015, HalGeiMorSarWig2018}.
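To make the mode of operation of such a service more tangible, the following minimal sketch (Python with the \texttt{requests} library) shows how a tool might ask ORES whether a particular revision is damaging. The revision ID is invented for illustration; endpoint and response format follow the service's public v3 API.
\begin{verbatim}
# A minimal sketch: asking ORES whether an enwiki revision is "damaging".
# The revision ID used below is made up for illustration purposes.
import requests

ORES_URL = "https://ores.wikimedia.org/v3/scores/enwiki/"

def damaging_probability(rev_id):
    """Return the probability that the given revision is damaging."""
    response = requests.get(ORES_URL,
                            params={"models": "damaging", "revids": rev_id})
    response.raise_for_status()
    scores = response.json()["enwiki"]["scores"][str(rev_id)]
    return scores["damaging"]["score"]["probability"]["true"]

print(damaging_probability(123456))  # e.g. 0.03 for an innocuous edit
\end{verbatim}
Semi-automated tools can consume scores of this kind to prioritise which edits human patrollers review first.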
Time and again, the literature also refers to more ``manual'' forms of quality control: editors using watchlists to keep an eye on articles they care about, or even accidentally discovering edits made in bad faith~\cite{Livingstone2016, AstHal2018}.
One mechanism, however, is conspicuously missing from all these accounts: edit filters.
At first, scientific studies on Wikipedia largely ignored algorithmic quality control mechanisms.
...
...
This has gradually changed since around 2009, when the first papers specifically dedicated to these mechanisms appeared.
In 2010, Geiger and Ribes insistently highlighted that the scientific community could no longer dismiss these mechanisms as insignificant or as mere noise in the data~\cite{GeiRib2010}.
For one, the relative usage of these mechanisms has continued to increase since their introduction: in an observed two-month period in 2009, bots made 16.33\% of all edits~\cite{Geiger2009}.
Others were worried that the way the encyclopedia functions was becoming increasingly opaque: ``[k]eeping traces obscure help[ed] the powerful to remain in power''~\cite{ForGei2012}, and entry barriers for new users were gradually raised, since newcomers not only had to learn wiki syntax and a myriad of technical tools, but also had to find their way around a complex system with a decentralised socio-technical mode of governance~\cite{Geiger2017}.
Ford and Geiger even cite a case in which an editor was not sure whether their article had been deleted by a person or by a bot~\cite{ForGei2012}.
What is more, Geiger and Ribes argue that algorithmic quality control mechanisms change the system not only as a matter of scale (bots and tools are faster, so more reverts become possible) but also as a matter of substance: they transform how all parts of the system interact~\cite{GeiRib2010}.
In quality control specifically, the introduction of bots and semi-automated tools was fairly revolutionary:
they enabled efficient patrolling of articles by users with little to no knowledge of the particular topic.
Thanks to Wikipedia's particular software architecture, this holds even for the most ``manual'' quality control work (e.g. patrolling articles via watchlists): representing changes as diffs allows editors to quickly spot content that deviates from its immediate context~\cite{GeiRib2010}.
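Since the argument hinges on this representation, a toy example may help. The sketch below (Python, standard library only; both article versions are invented) produces a unified diff much like the one MediaWiki shows its patrollers: the inserted line stands out even to a reader with no knowledge of the article's subject.
\begin{verbatim}
# A toy illustration of why diffs make patrolling cheap: the deviating
# line is visible without any knowledge of the article's topic.
import difflib

before = ["Chemistry is the scientific study of matter.\n",
          "It examines the elements that make up substances.\n"]
after  = ["Chemistry is the scientific study of matter.\n",
          "JOHNNY WAS HERE!!!\n",
          "It examines the elements that make up substances.\n"]

for line in difflib.unified_diff(before, after,
                                 fromfile="revision 41",
                                 tofile="revision 42"):
    print(line, end="")
\end{verbatim}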
\begin{comment}
%Why is it important we study these mechanisms?
...
...
% table placeholder: is_bot | edits | percentage of all edits
\section{Bots}
%todo also mention bot papers that discuss more general aspects of bots?
According to the literature, bots constitute the first line of defence against malicious edits~\cite{GeiHal2013}.
They are also undoubtedly the vandal fighting mechanism studied most in depth by the scientific community.
Geiger and Ribes~\cite{GeiRib2010} define bots as
``fully-automated software
...
...
There is an in-depth analysis of ClueBot NG and its place within the vandal fighting infrastructure,
as well as a historical review of bots' and semi-automated tools' involvement in vandal fighting~\cite{HalRied2012}.
... %TODO further works?
Further bots involved in vandal fighting, besides ClueBot NG, are also discussed in the literature~\cite{GeiHal2013, HalRied2012}.
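To make the bots' place in this pipeline more concrete, here is a deliberately naive sketch of the polling loop at the heart of such a bot, written against the public MediaWiki recent changes API. The size-drop heuristic is a toy stand-in: ClueBot NG, for instance, scores edits with a neural network, and a production bot would additionally authenticate itself, respect rate limits and actually carry out reverts.
\begin{verbatim}
# A deliberately naive sketch of a vandal fighting bot's main loop.
# The size-drop heuristic is a toy stand-in for a real classifier.
import time
import requests

API = "https://en.wikipedia.org/w/api.php"

def recent_changes():
    """Fetch the latest edits from the recent changes feed."""
    params = {"action": "query", "list": "recentchanges",
              "rcprop": "title|ids|sizes|user|comment",
              "rctype": "edit", "rclimit": 50, "format": "json"}
    return requests.get(API, params=params).json()["query"]["recentchanges"]

while True:
    for change in recent_changes():
        removed = change["oldlen"] - change["newlen"]
        if removed > 5000:  # large blanking is suspicious
            print("Suspicious edit on", change["title"],
                  "by", change["user"])
            # a real bot would score the diff and possibly revert here
    time.sleep(60)  # be polite to the API
\end{verbatim}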
Also crucial for the present analysis is Livingstone's observation, made in the preamble to his interview with the first large-scale bot operator Ram-man, that
``In the Wikimedia software, there are tasks that do all sorts of things [...].
If these things are not in the software, an external bot could do them. [...]
The main difference is where it runs and who runs it''~\cite{Livingstone2016}.
This note is further scrutinised by Geiger~\cite{Geiger2014}, who examines in detail the differences and repercussions between code that is part of the core software and code that runs alongside it (such as bots): such bespoke code is developed and operated by volunteers on their own infrastructure, and is therefore subject to different development, review and governance processes than the MediaWiki software itself.
...
...
Some of the quality control work is still done ``manually'' by human editors.
These are, on the one hand, editors who use the ``undo'' functionality from within a page's revision history.
On the other hand, there are editors who use the standard encyclopedia editing mechanism (click the ``edit'' button on an article, enter the changes in the editor that opens, write an edit summary, click ``save'') rather than relying on further automated tools.
When editors use these mechanisms to fight vandalism, they often have not noticed the vandalising edits by chance, but have been actively watching the pages in question via so-called watchlists~\cite{AstHal2018}.
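The watchlist itself is exposed through the same MediaWiki API. The sketch below shows how a client might fetch recent changes to an editor's watched pages; it assumes an already authenticated \texttt{requests} session, since the watchlist query is only answered for a logged-in user.
\begin{verbatim}
# A sketch of fetching recent changes to one's watched pages.
# Assumes `session` has already been logged in to the wiki --
# the watchlist query is only answered for authenticated users.
import requests

API = "https://en.wikipedia.org/w/api.php"
session = requests.Session()  # placeholder: login omitted in this sketch

params = {"action": "query", "list": "watchlist",
          "wlprop": "title|user|comment|timestamp",
          "wllimit": 50, "format": "json"}
watchlist = session.get(API, params=params).json()["query"]["watchlist"]
for change in watchlist:
    print(change["timestamp"], change["title"],
          "edited by", change["user"])
\end{verbatim}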
Watchlist-based patrolling also hints at the type of quality control work human editors take over: it is less obvious and less rapid, and editors who patrol pages via their watchlists typically have some relationship to, or deeper expertise on, the topic at hand. According to Asthana and Halfaker, increasingly complex judgement is required the further an edit travels down the quality assurance funnel~\cite{AstHal2018}.
\section{Conclusion}
...
...
%TODO maybe move this to the edit filters chapter, cf.~\cite{GeiHal2017}
As Müller-Birn et al. put it: ``In both cases of algorithmic governance
-- software features and bots -- making rules part of the infrastructure, to a certain extent, makes
them harder to change and easier to enforce''~\cite[p.~87]{MueDoHer2013}.