From 3f80f664b79f1a6b7ebc73eaad68c03592ec4240 Mon Sep 17 00:00:00 2001
From: Lyudmila Vaseva <vaseva@mi.fu-berlin.de>
Date: Mon, 22 Jul 2019 14:25:53 +0200
Subject: [PATCH] Clean up chap 4 except conclusion

---
 thesis/2-Background.tex   | 12 +++++++++
 thesis/4-Edit-Filters.tex | 53 +++++++++++++--------------------------
 2 files changed, 29 insertions(+), 36 deletions(-)

diff --git a/thesis/2-Background.tex b/thesis/2-Background.tex
index dcd6f21..a0e5855 100644
--- a/thesis/2-Background.tex
+++ b/thesis/2-Background.tex
@@ -92,6 +92,18 @@ Present authors also signal that these tools still tend to reject the majority o
 The researchers also warn that wording is tremendously important for the perception of edits and people who authored them: labels such as ``good'' or ``bad'' are not helpful.
 %TODO Concerns?
+
+%TODO Incorporate this, moved from chap. 4
+\begin{comment}
+\subsection{Alternatives to Edit Filters}
+
+Since edit filters run against every edit saved on Wikipedia, rarely tripped filters are generally advised against, and a number of alternatives are offered to edit filter managers and editors proposing new filters.
+For example, there is the page protection mechanism, suitable for handling a higher number of incidents concerning a single page.
+Also, title and spam blacklists exist and these might be the way to handle disruptive page titles or link spam~\cite{Wikipedia:EditFilter}.
+(It is worth noting here that both blacklists are also rule-based.)
+Moreover, it is recommended to run in-depth checks (e.g. for single articles) separately, for example by using bots~\cite{Wikipedia:EditFilterRequested}.
+\end{comment}
+
 \section{Semi-automated}
 \label{section:semi-automated}
diff --git a/thesis/4-Edit-Filters.tex b/thesis/4-Edit-Filters.tex
index e1285f4..e84d24c 100644
--- a/thesis/4-Edit-Filters.tex
+++ b/thesis/4-Edit-Filters.tex
@@ -242,6 +242,7 @@ As all newly implemented filters, these are initially enabled in logging only mo
 It is not uncommon, that the action(s) a particular filter triggers change over time.
 Sometimes, when a wave of particularly persistent vandalism arises, a filter is temporarily set to ``warn'' or ``disallow'' and the actions are removed again as soon as the filter is not tripped very frequently anymore. %TODO examples?
+Such action changes, updates to an edit filter's pattern or warning template, as well as problems with filters' behaviour, are discussed on the Edit Filter Noticeboard~\cite{Wikipedia:EditFilterNoticeboard}.
 Last but not least, performance seems to be fairly important for the edit filter system: On multiple occasions, there are notes on recommended order of operations, so that the filter evaluates as resource sparing as possible~\cite{Wikipedia:EditFilterInstructions} or invitations to consider whether an edit filter is the most suitable mechanism for solving a particular issue at all~\cite{Wikipedia:EditFilter},~\cite{Wikipedia:EditFilterRequested}.
@@ -261,8 +262,9 @@ According to~\cite{Wikipedia:EditFilter} this right is given only to editors who
 Further down on the page it is clarified that it is administrators who can assign the permission to users (also to themselves) and they should only assign it to non-admins in exceptional cases, ``to highly trusted users, when there is a clear and demonstrated need for it''.
 If editors wish to be given this permission, they can hone and prove their skills by helping with requested edit filters and false positives~\cite{Wikipedia:EditFilter}.
-The formal process for requesting the \emph{abusefilter-modify} permission is to raise the request at the edit filter noticeboard~\cite{Wikipedia:EditFilterNoticeboard}.
-A discussion is held there, usually for 7 days, before a decision is reached~\cite{Wikipedia:EditFilter}.
+The formal process for requesting the \emph{abusefilter-modify} permission is to raise the request at the Edit Filter Noticeboard~\cite{Wikipedia:EditFilterNoticeboard}.
+A discussion is held there, usually for 7 days, before a decision is reached~\cite{Wikipedia:EditFilter}
+\footnote{According to the documentation, the Edit Filter Noticeboard is also the place to discuss potential permission withdrawals in cases of misuse where raising the issue directly with the editor concerned has not resolved the problem.}.
 As of 2017, when the ``edit filter helper'' group was introduced (editors in this group have the \emph{abusefilter-view-private} permission)~\cite{Wikipedia:EditFilterHelper}, the usual process seems to be that editors request this right first and only later the full \emph{abusefilter-modify} permissions\footnote{That is the tendency we observe at the Edit filter noticeboard~\cite{Wikipedia:EditFilterNoticeboard}.}.
@@ -287,21 +289,26 @@ The interesting patterns of collaboration between the two technologies are discu
 %TODO: Flowchart of the filtering process!
-\subsection{What happens when an editor triggers an edit filter? Do they notice this at all?}
+So what happens when an editor's action matches the pattern of an edit filter? Do they notice this at all?

-As described section~\ref{sec:mediawiki-ext}, a variety of different actions may occur when a filter gets tripped.
-Of these, only \emph{tag}, \emph{throttle}, \emph{warn}, and \emph{disallow} seem to be used today.
+As described in section~\ref{sec:mediawiki-ext}, a variety of different actions may occur when a filter's pattern matches.
+Of these, only \emph{tag}, \emph{throttle}, \emph{warn}, and \emph{disallow} seem to be used today (and \emph{log}, which is always enabled).
 If a filter is set to \emph{warn} or \emph{disallow}, the editor is notified that they hit a filter by a warning message (see figure~\ref{fig:screenshot-warn-disallow}).
 These warnings describe the problem that occurred and present the editor with possible paths of action: complain on the False Positives page~\cite{Wikipedia:EditFilterFalsePositives} in case of \emph{disallow} (the edit is not saved),
-or, complain on the False Positives page and publish the change anyway in case of \emph{warn}.
+or, complain on the False Positives page
+\footnote{Edit filter managers and other interested editors monitor the False Positives page and verify or disprove the reported incidents.
+Edit filter managers use actual false positives to improve the filters, give advice to good faith editors who tripped a filter and discourage authors of vandalism edits who reported these as false positives from continuing with their disruption.
+% who moderates the false positives page? where does the info come from that it is edit filter managers? I think this info comes from observation
+}
+and publish the change anyway in case of \emph{warn}.
 (Of course, in case of a warning, the editor can modify their edit before publishing it.)
 On the other hand, when the filter action is set to \emph{tag} or \emph{log} only, the editor doesn't really notice they tripped a filter unless they are looking more closely.
 Tagged edits are marked as such in the page's revision history for example (see figure~\ref{fig:tags-in-history})
 and all edits that trigger an edit filter are listed in the Abuse Log~\cite{Wikipedia:AbuseLog} (see figure~\ref{fig:screenshot-abuse-log}).
 %TODO How about throttling: the AbuseLog is currently timing out when I try to filter entries according to action(=throttle)
-\begin{figure}
+\begin{figure}[t]
 \centering
 \includegraphics[width=0.9\columnwidth]{pics/screenshots-filter-trigger/Screenshot-tags-in-revision-history.png}
 \caption{Tagged edits are marked as such in a page's revision history}~\label{fig:tags-in-history}
@@ -313,30 +320,13 @@ and all edits that trigger an edit filter are listed in the Abuse Log~\cite{Wiki
 \caption{Abuse Log showing all filter triggers by User:Schnuppi4223}~\label{fig:screenshot-abuse-log}
 \end{figure}
-\begin{figure}
+\begin{figure}[t]
 \centering
 \includegraphics[width=0.9\columnwidth]{pics/screenshots-filter-trigger/Screenshot-trigger-warning-filter.png}
 \caption{Editor gets notified their edit triggered multiple edit filters}~\label{fig:screenshot-warn-disallow}
 \end{figure}
-\subsection{How are problems handled?}
-
-There are several pages where problematic behaviour concerning edit filters are reported and potential solutions are considered.
-
-For instance, current filters behaviour is discussed on the Edit Filter Noticeboard~\cite{Wikipedia:EditFilterNoticeboard}.
-Issues handled here include changing the edit filter action of single filters, changing edit filter warning templates, problems with specific patterns or variables and proposals for filter deletions (or for introducing new filters).
-Furthermore, on the noticeboard discussions take place about giving edit filter manager rights to users, or withdrawing these if a misuse was observed and raising the issue with the editor directly didn't resolve the problem~\cite{Wikipedia:EditFilter}.
-
-False positives among the filter hits are reported and discussed on a separate page~\cite{Wikipedia:EditFilterFalsePositives}.
-Edit filter managers and other interested editors monitor this page and verify or disprove the reported incidents.
-Edit filter managers use true false positives to improve the filters, give advice to good faith editors who tripped a filter and discourage authors of vandalism edits who reported these as false positives from continuing with their disrtuption.
-% who moderates the false positives page? where does the info come from that it is edit filter managers? I think this info comes from observation
-
-Moreover, edit filter managers are advised to consult and comply with personal security best practices (such as choosing a strong password and using two-factor authentication).
-If their account is compromised, they lose their edit filter manager rights and get blocked, since this threatens site security.
-Edit filter managers are encouraged to actively report problems with their account so that in case of doubt these can be blocked~\cite{Wikipedia:EditFilter}.
-
 %************************************************************************
 \section{Edit filters' role in the quality control ecosystem}
@@ -349,20 +339,11 @@ In the present chapter, we aim to understand how edit filters work, who implemen
 The purpose of the present section is to review what we have learnt so far about edit filters and summarise how they fit in Wikipedia's quality control ecosystem.
 As timeline~\ref{fig:timeline} shows, the time span in which algorithmic quality control mechanisms (first vandal fighting bots and semi-automated tools, and later filters) were introduced fits logically the period after the exponential growth of Wikipedia took off in 2006 (compare figures~\ref{fig:editors-development},~\ref{fig:edits-development}).
-The surge in editors numbers and contributions implied a rapidly increasing workload for community members dedicated to quality assurance
+The surge in numbers of editors and contributions implied a rapidly increasing workload for community members dedicated to quality assurance
 which could not be feasibly handled manually anymore and thus the community turned to technical solutions.
-As shown elsewhere~\cite{HalGeiMorRied2013}, this shift had a lot of repercussions:
-one of the most severe of them being that newcomers' edits were reverted stricter than before (accepted or rejected on a yes-no basis with the help of automated tools, instead of manually seeking to improve the contributions and ``massage'' them into an acceptable form), which in consequence drove a lot of them away.
+As shown elsewhere~\cite{HalGeiMorRied2013}, this shift had a lot of repercussions, one of the most severe being that newcomers' edits were reverted more strictly than before (accepted or rejected on a yes-no basis with the help of automated tools, instead of manually seeking to improve the contributions and ``massage'' them into an acceptable form), which in consequence drove a lot of them away.
 %TODO sounds ending abruptly; maybe a kind of a recap with historical background, compare introduction
-%TODO decide what to do with this; I think it's already mentioned somewhere
-\begin{comment}
-- there is also the guideline "be bold" (or similar), so one could expect to be able to for example add unwikified text, which is then corrected by somebody else
-This tended to be the case in the early days of Wikipedia.
-Messy edits were done and others took them and re-modelled them.
- Since the rise of algorithmic quality contorl mechanisms though, edits are more often than not considered on an accept/reject basis but no "modelling" them into "proper" encyclopedic pieces of writing takes place anymore. %TODO find out which paper was making this case
-\end{comment}
-
 \begin{table}
 \begin{tabular}{ r | p{.8\textwidth}}
 Oct 2001 & automatically import entries from Easton’s Bible Dictionary by a script \\
-- 
GitLab