diff --git a/thesis/2-Background.tex b/thesis/2-Background.tex
index 80e4588f520eb852959c190653f6e9d390d4e1a5..42bde5e073d56c67039b5df818d33de01a8860fc 100644
--- a/thesis/2-Background.tex
+++ b/thesis/2-Background.tex
@@ -92,30 +92,28 @@ Semi-automated quality control tools are similar to bots in the sense that they
 The difference however is that with semi-automated tools humans do the final assessment and decide what happens with the edits in question.
 There is a scientific discussion of several tools:
-Huggle\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:Huggle}}, which is probably the most popular and widely used one is studied in~\cite{GeiHal2013},~\cite{HalRied2012}, and \cite{GeiRib2010}.
-Another very popular tool, Twinkle\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:Twinkle}}, is mentioned by ~\cite{GeiHal2013} (it's really just a mention),~\cite{GeiRib2010}..
-STiki\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:STiki}} is presented (syn!) by its authors in~\cite{WestKanLee2010}.
-Various older (and partially inactive) tools (syn!) are also mentioned (syn!) by the literature:
+Huggle\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:Huggle}}, which is probably the most popular and widely used one, is studied in~\cite{GeiHal2013},~\cite{HalRied2012}, and \cite{GeiRib2010}.
+Another very popular tool, Twinkle\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:Twinkle}}, is briefly covered by~\cite{GeiHal2013},~\cite{GeiRib2010}, and~\cite{HalGeiMorRied2013}.
+STiki\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:STiki}} is presented by its authors in~\cite{WestKanLee2010} and also discussed by~\cite{GeiHal2013}.
+Various older (and partially inactive) applications are also mentioned in the literature:
 Geiger and Ribes touch on Lupin's Anti-vandal tool\footnote{\url{https://en.wikipedia.org/wiki/User:Lupin/Anti-vandal_tool}}~\cite{GeiRib2010},
-Halfaker and Riedl discuss (syn!) VandalProof~\cite{HalRied2012}.
+Halfaker and Riedl describe VandalProof~\cite{HalRied2012}.
 
-Some of these tools are more automated than others: Huggle and STiki for instance are able to revert an edit, issue a warning to the offending editor, and post a report on the AIV dashboard (if the user has already exhausted the warning limit) upon a single click,
-whereas the javascript based browser extension Twinkle adds contextual links to other parts of Wikipedia which facilitates fulfilment of particular tasks (rollback multiple edits, report problematic users to AIV, nominate an article for deletion)~\cite{GeiRib2010}.
-The main feature of Huggle and STiki
-is that they both compile a central queue of potentially harmful edits for all their users to check.
+Some of these tools are more automated than others: Huggle and STiki for instance are able to revert an edit, issue a warning to the offending editor, and post a report on the AIV dashboard (if the user has already exhausted the warning limit) with a single click.
+The JavaScript-based browser extension Twinkle, on the other hand, adds contextual links to other parts of Wikipedia which facilitate the fulfilment of particular tasks such as rolling back multiple edits, reporting problematic users to AIV, or nominating an article for deletion~\cite{GeiRib2010}.
+The main feature of Huggle and STiki is that they both compile a central queue of potentially harmful edits for all their users to check.
 The difference between both programs are the heuristics they use for their queues:
-By default, Huggle sends edits by users with warnings on their user talk page to the top of the queue, places edits by IP editors higher and ignores edits made by bots and other Huggle users altogether\cite{GeiRib2010},
-while STiki relies on the ``spatio-temporal properties of revision metadata''~\cite{WestKanLee2010} for deciding the likelihood of an edit to be vandalism.
+By default, Huggle sends edits by users with warnings on their user talk page to the top of the queue, places edits by IP editors higher and ignores edits made by bots and other Huggle users altogether~\cite{GeiRib2010}.
+In contrast, STiki relies on the ``spatio-temporal properties of revision metadata''~\cite{WestKanLee2010} for deciding the likelihood of an edit to be vandalism.
 Huggle's queue can be reconfigured, however, some technical savvy and motivation is needed for this and thus, as~\cite{GeiRib2010} warn, it makes certain paths of action easier to take than others.
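+
+To make these default heuristics more concrete, the following sketch restates them as a priority function (illustrative Python pseudocode; neither the function nor the attribute names stem from Huggle's actual implementation):
+\begin{verbatim}
+def huggle_priority(edit):
+    """Restates Huggle's default queue heuristics;
+    purely illustrative, not the tool's real code."""
+    if edit["author_is_bot"] or edit["author_uses_huggle"]:
+        return None  # such edits are ignored altogether
+    priority = 0
+    if edit["author_has_talk_page_warnings"]:
+        priority += 2  # previously warned users go to the top
+    if edit["author_is_ip"]:
+        priority += 1  # edits by IP editors are placed higher
+    return priority
+\end{verbatim}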
+Another common trait of both programs is that, as standard, editors need the ``rollback'' permission in order to be able to use them~\cite{HalRied2012}. %TODO another source is STiki's doc
-Nonetheless, a trait common to all of them is that as a standard, editors need the ``rollback'' permission in order to be able to use the software~\cite{HalRied2012}. %TODO ist that so? I can't find with certainty any info about Twinkle
-
-Some critique or concerns that have been voiced regarding semi-automated anti-vandalism tools compare these to massively multiplayer online role-playing games (MMORPGs)~\cite{HalRied2012}.
-They fear(syn) that some of the users of said tools see themselves as vandal fighters on a mission to slay the greatest number of monsters (vandals) possible and by doing so excell in the ranks
-\footnote{STiki really has a leader board: \url{https://en.wikipedia.org/wiki/Wikipedia:STiki/leaderboard}}.
-This is for one a harmful way to view the project, neglecting the ``assume good faith'' guideline %TODO quote
-and also leads to such users seeking out easy to judge cases from the queues in order to move onto the next entry more swiftly
-leaving more subtle cases (syn!), which really require human judgement, to others.
+Some critique that has been voiced regarding semi-automated anti-vandalism tools compares them to massively multiplayer online role-playing games (MMORPGs)~\cite{HalRied2012}.
+The concern is that some users of said tools see themselves as vandal fighters on a mission to slay as many monsters (vandals) as possible and, by doing so, to climb the ranks%
+\footnote{STiki actually has a leaderboard: \url{https://en.wikipedia.org/wiki/Wikipedia:STiki/leaderboard}}.
+For one thing, this is a harmful way to view the project, neglecting the ``assume good faith'' guideline~\cite{Wikipedia:GoodFaith};
+for another, it leads to such users seeking out easy-to-judge instances from the queues in order to move on to the next entry more swiftly and gather more points,
+leaving more subtle cases, which really require human judgement, to others.
 
 \begin{comment}
 %Huggle
@@ -157,13 +155,13 @@ and VandalProof which
 \section{ORES}
-ORES is an API based free libre and open source (FLOSS) machine learning service ``designed to improve the way editors maintain the quality of Wikipedia''~\cite{HalTar2015} and increase the transparency of the quality control process.
+ORES is an API-based free/libre and open source (FLOSS) machine learning service ``designed to improve the way editors maintain the quality of Wikipedia''~\cite{HalTar2015} and increase the transparency of the quality control process.
 It uses learning models to predict a quality score for each article and edit based on edit/article quality assessments manually assigned by Wikipedians.
 Potentially damaging edits are highlighted, which allows editors who engage in vandal fighting to examine them in greater detail.
 The service was officially introduced in November 2015 by Aaron Halfaker\footnote{\url{https://wikimediafoundation.org/role/staff-contractors/}} (principal research scientist at the Wikimedia Foundation) and Dario Taraborelli\footnote{\url{http://nitens.org/taraborelli/cv}} (Head of Research at Wikimedia Foundation at the time)~\cite{HalTar2015}.
 Its development is ongoing, coordinated and advanced by Wikimedia's Scoring Platform team.
 Since ORES is API based, in theory a myriad of services can be developed that use the predicted scores or, new models can be trained and made available for everyone to use.
-The Scoring platform team reports that popular vandal fighting tools(syn?) such as Huggle have already adopted ORES scores for the compilation of their queues~\cite{HalTar2015}.
+The Scoring Platform team reports that popular vandal fighting tools such as Huggle have already adopted ORES scores for the compilation of their queues~\cite{HalTar2015}.
 What is unique about ORES is that all the algorithms, models, training data, and code are public, so everyone (with sufficient knowledge of the matter) can scrutinise them and reconstruct what is going on.
 This is certainly not true for machine learning services applied by commercial companies who have interest in keeping their models secret.
 Halfaker and Taraborelli express the hope that ORES would help hone quality control mechanisms on Wikipedia, and by decoupling the damage prediction from the actual decision how to deal with an edit make the encyclopedia more welcoming towards newcomers.
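+
+To give an impression of how a tool can consume these scores, the following minimal sketch (Python; the wiki, model name, and revision ID are merely examples) queries ORES's public v3 REST endpoint for the ``damaging'' score of a single revision:
+\begin{verbatim}
+import requests
+
+# Ask ORES to score one English Wikipedia revision with the
+# "damaging" model; the revision ID is an arbitrary example.
+resp = requests.get("https://ores.wikimedia.org/v3/scores/enwiki/",
+                    params={"models": "damaging",
+                            "revids": "871234567"})
+resp.raise_for_status()
+score = resp.json()["enwiki"]["scores"]["871234567"]["damaging"]["score"]
+print(score["prediction"], score["probability"])
+\end{verbatim}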
@@ -177,38 +175,35 @@ The researchers also warn that wording is tremendously important for the percept
 For completion, it should be noted at this point that despite the steady increase of the proportion of fully and semi-automated tools usage for fighting vandalism~\cite{Geiger2009}, some of the quality control work is still done ``manually'' by human editors.
 These are, on one hand, editors who use the ``undo'' functionality from within the page's revision history.
-On the other hand, there are also editors who engage with the classical/standard encyclopedia editing mechanism (click the ``edit'' button on an article, enter changes in the editor which opens, write an edit summary for the edit, click ``save'') rather than using further automated tools to aid them.
-When editors use these mechanisms for vandalism fighting, oftentimes they haven't noticed the vandalising edits by chance but rather have been actively watching the pages in question via the so-called watchlists~\cite{AstHal2018}.
+On the other hand, there are also editors who engage with the classic encyclopedia editing mechanism (click the ``edit'' button on an article, enter changes in the dialog which opens, write an edit summary, click ``save'') rather than using further automated tools to aid them.
+When Wikipedians use these mechanisms for vandalism fighting, they have most often not stumbled upon the vandalising edits by chance but have rather been actively watching the pages in question via so-called watchlists~\cite{AstHal2018}.
 This also gives us a hint as to what type of quality control work humans take over: less obvious and less rapid, editors who patrol pages via watchlists have some relationship to/deeper expertise on the topic. %TODO quote needed.
 according to~\cite{AstHal2018} along the funnel, increasingly complex judgement is required %TODO vgl also funnel diagram incoming edits quality assurance by Halfaker
 
 \section{Conclusion}
 
-\cite{AstHal2018} have a diagram describing the new edit review pipeline. Filters are absent.
-%TODO move funnel diagram here (descending degree of automacy
-%TODO find where in text to reference the graphic directly
+For clarity, I have summarised the various aspects of the algorithmic quality control mechanisms discussed in the present chapter in table~\ref{table:mechanisms-comparison-literature}.
+Their work can be fittingly illustrated by figure~\ref{fig:funnel-no-filters}; a similar diagram is also proposed by~\cite{AstHal2018}.
+%TODO what I haven't discussed so far is the temporal/pipeline dimension
+One thing is certain: so far, on the grounds of the literature study alone, it remains unclear what the role of edit filters is.
+
+%TODO is it better to introduce the graphic earlier?
 \begin{figure}
 \centering
 \includegraphics[width=0.9\columnwidth]{pics/funnel-diagramm-no-filters.JPG}
 \caption{State of the scientific literature: edit filters are missing from the quality control frame}~\label{fig:funnel-no-filters}
 \end{figure}
-%TODO merge with rise and decline graphic from~\cite{HalGeiMorRied2013}
-
-
-So far, on grounds of literature study alone it remains unclear what the role/purpose of edit filters is.
-Features of the algorithmic mechanisms summarised in table: %TODO reduce table to 1 page! (check which entries actually result from the text
 \begin{landscape}
- \begin{longtable}{ | p{4cm} | p{5cm} | p{5cm} | p{5cm} | }
+ \begin{longtable}{ | p{4cm} | p{5.5cm} | p{5.5cm} | p{5.5cm} | }
 \hline
 & Bots & Semi-Automated tools & ORES \\ \hline
 \multirow{7}{*}{Properties}
 & rule/ML based & rule/ML based & ML framework \\
 & run on user's infrastructure ("bespoke code") & extra infrastructure & not used directly, can be incorporated in other tools \\
-& no requirement for code to be public & most popular are open source (but it's not a hard requirement) & open source \\
+& no requirement for code to be public & most popular are open source (but not a hard requirement) & open source \\
 & & heuristics obfuscated by the interface & \\
-& trigger after an edit is published & trigger after an edit is published & \\
 & latency varies & generally higher latency than bots & \\
 & mostly single dev/operator (recently: bot frameworks) & few devs & few devs \\ \hline
diff --git a/thesis/references.bib b/thesis/references.bib
index 1f9a74cdc2503b99533883238d71077f1c9ae263..26c901bc6e526a46174de5b6b75b8ddffb1bd65e 100644
--- a/thesis/references.bib
+++ b/thesis/references.bib
@@ -317,6 +317,15 @@
 	\url{https://en.wikipedia.org/wiki/Wikipedia:Administrator_intervention_against_vandalism}}
 }
 
+@misc{Wikipedia:GoodFaith,
+	key = "Wikipedia Assume Good Faith",
+	author = {},
+	title = {Wikipedia: Assume good faith},
+	year = 2019,
+	note = {Retrieved March 26, 2019 from
+	\url{https://en.wikipedia.org/w/index.php?title=Wikipedia:Assume_good_faith&oldid=889253693}}
+}
+
 @misc{Wikipedia:DatBot,
 	key = "Wikipedia DatBot",
 	author = {},