Geiger and Ribes~\cite{GeiRib2010} define bots as ``fully-automated software
agents that perform algorithmically-defined tasks involved
with editing, maintenance, and administration in Wikipedia''.
%TODO revise
Different aspects of bots and their involvement in quality control have been investigated:
In the paper referenced above, the researchers employ their method of trace ethnography (more on it in chapter~\ref{chap:methods}) to follow a disruptive editor around Wikipedia and trace the measures taken collaboratively by bots (ClueBot and HBC AIV helperbot7) and by humans using semi-automated tools (Huggle and Twinkle) until the malicious editor in question was banned~\cite{GeiRib2010}.
Halfaker and Riedl offer a historical review of bots and semi-automated tools and their involvement in vandal fighting~\cite{HalRied2012}, assembling a comprehensive list of tools and touching on their working principles (rule based vs.\ machine learning based).
They also develop a bot taxonomy we will come back to in chapter~\ref{chap:overview-en-wiki}. %TODO quote bot taxonomy here?
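To make this distinction concrete, the following sketch (in Python) contrasts the two working principles: a rule-based check flags edits matching hand-written heuristics, while a machine learning based check thresholds the score of a trained classifier. The sketch is purely illustrative and not the actual code of any of the cited bots; the function names, patterns, and threshold value are assumptions made for the example.
\begin{verbatim}
import re

# Toy word list and blanking heuristic -- illustrative only,
# not the rule set of AntiVandalBot or any other real bot.
PROFANITY = re.compile(r"\b(idiot|stupid)\b", re.IGNORECASE)

def rule_based_is_vandalism(old_text: str, new_text: str) -> bool:
    """Fixed, hand-written heuristics in the spirit of early rule-based bots."""
    if len(new_text) < 0.1 * len(old_text):
        return True                           # most of the page was blanked
    return bool(PROFANITY.search(new_text))   # obvious bad words present

def ml_based_is_vandalism(edit_features: dict, score_edit,
                          threshold: float = 0.95) -> bool:
    """A learned classifier (passed in as `score_edit`) yields a vandalism
    probability; the edit is only reverted above a confidence threshold,
    the general approach of machine learning based bots such as ClueBot NG."""
    return score_edit(edit_features) >= threshold

# Example usage with a stand-in classifier:
dummy_classifier = lambda features: 0.97
print(rule_based_is_vandalism("A long, well-sourced paragraph ...",
                              "you idiot"))                          # True
print(ml_based_is_vandalism({"chars_added": 9, "is_anonymous": True},
                            dummy_classifier))                       # True
\end{verbatim}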
In~\cite{GeiHal2013}, Geiger and Halfaker conduct an in-depth analysis of ClueBot NG, ClueBot's machine learning based successor, and its place within Wikipedia's vandal fighting infrastructure, concluding that quality control on Wikipedia is a robust process: most malicious edits are eventually reverted, albeit at a slower pace, even when some of the actors are temporarily inactive.
They discuss the mean time to revert of the different mechanisms, their observations coinciding with figure~\ref{fig:funnel-no-filters},
and also comment on the (un)reliability of the external infrastructure bots rely upon (they run on private computers, which causes downtimes).
Further bots involved in vandal fighting (besides ClueBot~\cite{GeiRib2010} and ClueBot NG~\cite{GeiHal2013,HalRied2012}) discussed in the literature include:
XLinkBot (which reverts edits containing links to domains blacklisted as spam)~\cite{HalRied2012},
HBC AIV Helperbots (responsible for various maintenance tasks that keep the entries on the Administrator intervention against vandalism (AIV) dashboard up to date)~\cite{HalRied2012,GeiRib2010},
MartinBot and AntiVandalBot (among the first rule-based bots, detecting obvious cases of vandalism)~\cite{HalRied2012},
DumbBOT and EmausBot (which do batch cleanup tasks)~\cite{GeiHal2013}.
Crucial for the current analysis is also Livingstone's observation, made in the preamble to his interview with the first large-scale bot operator Ram-man, that
``[i]n the Wikimedia software, there are tasks that do all sorts of things [...].
If these things are not in the software, an external bot could do them. [...]
The main difference is where it runs and who runs it''~\cite{Livingstone2016}.
This thought is also scrutinised by Geiger~\cite{Geiger2014}, who examines in detail how code that is part of the core software differs from code that runs alongside it (such as bots), which he calls ``bespoke code'', and what the repercussions of this difference are.
Geiger pictures Wikipedia as a large socio-technical assemblage of software pieces and social processes, often completely opaque to an outside observer, who cannot identify the individual components of this system, let alone how they interact with one another to deliver the end result to the public.
He underlines that components which are not strictly part of the server-side codebase but are run by various volunteers on their private infrastructure (true for large parts of Wikipedia, it being a community project) constitute a major part of Wikipedia, and also that they can experience downtime at any moment.
The vital tasks they perform, such as vandalism fighting, are often taken for granted, much to their developers' aggravation.
\begin{comment}
\cite{GeiRib2010}
community."
\end{comment}
%Concerns
A final aspect of the bot discussion relevant here is the concerns of the community.
People have long been sceptical (and some still are) about the employment of fully automated agents such as bots within Wikipedia.
Above all, there is a fear of bots (especially those with admin permissions) running rampant, with their operators not reacting fast enough to prevent damage.
This has led to the social understanding that ``bots ought to be better behaved than people''~\cite{Geiger2011}, which still plays a crucial role in bot development today.