Commit 2bb32305 authored by Lyudmila Vaseva's avatar Lyudmila Vaseva

Continue work on background

parent 06506431
Wikipedia and revert unwanted
contributions in one click."
"Several years and many iterations later [...] these tools have had an
increasing role in maintaining
article quality on Wikipedia."
"ClueBot _ NG has replaced
AntiVandalBot’s simple rules"
"Huggle has replaced VandalProof with
One humorous entry even argues that
Wikipedia has become a MMORPG—
a massively multiplayer online role-
playing game—with “monsters”
(vandals) to slay, “experience”
(edit or revert count) to earn, and
“overlords” (administrators) to submit
to (http://en.wikipedia.org/wiki/
To this end, in the current chapter we study scientific literature on vandalism
%TODO Literature review!
% How: within the subsections? as a separate section?
% Aim: I want to know why are there filters?
Distinction filters/Bots: what tasks are handled by bots and what by filters (and why)? What difference does it make for admins? For users whose edits are being targeted?
Why is it important that we study these mechanisms?
- their relative usage has increased since they were first introduced
\cite{GeiRib2010}
"at present, bots make 16.33\% of all edits."
%TODO more recent data? the last-month argument via recentchanges (cf.~\cite{Geiger2017}) doesn't hold here
- the whole ecosystem is not transparent, especially for new users (see~\cite{ForGei2012}: "As it is, Kipsizoo is not even
sure whether a real person who deleted the articles or a bot." )
"Keeping traces obscure help the powerful to remain in power"~\cite{ForGei2012}
- ``unofficial'', run and maintained by the community
\cite{GeiRib2010}
"often-unofficial technologies have fundamentally
transformed the nature of editing and administration in
Wikipedia"
"Of note is the fact that these tools are largely
unofficial and maintained by members of the Wikipedia
community."
- higher entry barriers: new users first have to orient themselves in the landscape and learn to use the software (decentralised mode of governance, often ``impenetrable for new editors'', cf.~\cite{ForGei2012})
- gamification concerns: is fighting vandalism becoming a game in which certain users aim to revert as many edits as possible in order to get a higher score? As a consequence, these same users often enforce reverts more rigorously than recommended, and pick cases that are easy and fast to arbitrate and do not require much additional research.
\cite{HalRied2012}
"Some Wikipedians feel that such
motivational measures have gone
too far in making Wikipedia like a
game rather than a serious project.
One humorous entry even argues that
Wikipedia has become a MMORPG—
a massively multiplayer online role-
playing game—with “monsters”
(vandals) to slay, “experience”
(edit or revert count) to earn, and
“overlords” (administrators) to submit
to (http://en.wikipedia.org/wiki/
Wikipedia:MMORPG)."
- they change the system not only in scale (using bots/tools is faster, hence more reverts are possible) but in substance: how everything interacts with everything else
- they enable efficient patrolling of articles by users with little to no knowledge about the particular contents (thanks to their representation of the edits/information: e.g. diffs)
\cite{GeiRib2010}
!! tools not only speed up the process but:
"These tools greatly lower certain barriers to participation and render editing
activity into work that can be performed by "average
volunteers" who may have little to no knowledge of the
content of the article at hand"
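The point about diffs can be illustrated with Python's standard difflib. The example below (article text invented for illustration, not taken from any actual tool) shows how a diff representation makes a damaging change visible even to a patroller with no knowledge of the article's subject:

```python
import difflib

# Two revisions of an article paragraph; the vandalism is obvious from
# the diff alone, without any expertise on the article's topic.
old = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris.\n",
    "It was completed in 1889.\n",
]
new = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris.\n",
    "JOSH WAS HERE lol\n",
]

diff = list(difflib.unified_diff(old, new, fromfile="rev_1", tofile="rev_2"))
for line in diff:
    print(line, end="")
```

The removed line (prefixed `-`) and the inserted line (prefixed `+`) are exactly what tools like Huggle foreground for their users.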
\cite{GeiRib2010}
Partial explanation why the literature has paid little attention to (semi-)automated tools to date:
- old data according to which bots accounted for a very small share of edits (2--4\%)
("that this number has grown
dramatically: at present, bots make 16.33\% of all edits.")
- "largely involved in single-use tasks like importing public domain material" (not the case anymore; check e.g. MusikBot)
- "characterized in the literature as mere force-multipliers,
increasing the speed with which editors perform their work
while generally leaving untouched the nature of the tasks
themselves"
critical discussion
"Such acts of inclusion and exclusion may be necessary, but
they are inherently moral in quality, speaking to questions of
who is left out and what knowledge is erased."
"It is for
this reason that the argument that bots and assisted editing
tools are merely force multipliers is narrow and dangerous"
"In and outside of the Wikipedian community, tools
like Huggle are often compared with video games in both
serious critiques and humorous commentaries:"
"We should not fall into the trap of speaking of bots and
assisted editing tools as constraining the moral agency of
editors"
"these tools makes certain pathways of action easier for vandal
fighters and others harder"
"Ultimately, these tools take their users
through standardized scripts of action in which it always
possible to act otherwise, but such deviations demand
inventiveness and time."
---
socio-technical assemblages (see Geiger)
* Huggle, Twinkle, AWB, and bots have existed nearly since the very beginning (2002?); why did the community introduce filters in 2009?
\url{https://en.wikipedia.org/wiki/Wikipedia:Recent_changes_patrol}
%Numbers
\cite{GeiRib2010}
Check Figure 1: Edits to AIV by tool (by now 10 years old; is there newer data on the topic??)
not really, see:
made about 20\% of all edits to encyclopedia articles."
Geiger's evidence:
https://quarry.wmflabs.org/query/20703
Percent of bot edits in previous month (enwiki, all pages)
\begin{verbatim}
is_bot edits Percentage of all edits
0 7619466 79.4974
1 1965083 20.5026
\end{verbatim}
https://quarry.wmflabs.org/query/20704
Percent of bot edits in previous month (enwiki, articles only)
\begin{verbatim}
is_bot edits Percentage of all edits
0 4273810 80.2025
1 1054966 19.7975
\end{verbatim}
However, a month is a relatively short period, and one cannot make an argument about general trends based on it.
For instance, the same queries, run on April 12, 2019, render the following results:
https://quarry.wmflabs.org/query/35104
Percent of bot edits in previous month (enwiki, all pages)
\begin{verbatim}
is_bot edits Percentage of all edits
0 6710916 89.7318
1 767948 10.2682
\end{verbatim}
https://quarry.wmflabs.org/query/35105
Percent of bot edits in previous month (enwiki, articles only)
\begin{verbatim}
is_bot edits Percentage of all edits
0 3426624 92.1408
1 292274 7.8592
\end{verbatim}
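As a sanity check, the percentages above can be recomputed from the raw edit counts; the short Python snippet below does just that (it is not the original SQL, which Quarry runs against the MediaWiki replica databases):

```python
# Recompute the bot-edit shares from the raw counts quoted above.
def pct_split(bot_edits: int, other_edits: int) -> tuple[float, float]:
    """Return (percent non-bot, percent bot) of all edits, to 4 decimals."""
    total = bot_edits + other_edits
    return (round(100 * other_edits / total, 4),
            round(100 * bot_edits / total, 4))

# March 2018, enwiki, all pages (quarry 20703)
print(pct_split(1965083, 7619466))   # → (79.4974, 20.5026)
# March 2019, enwiki, all pages (quarry 35104)
print(pct_split(767948, 6710916))    # → (89.7318, 10.2682)
```

The recomputed values match the Quarry output, so the tables above are internally consistent.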
\subsection{Humans}
Despite the steady increase in the share of fully and semi-automated tools used for fighting vandalism, %TODO quote!
some of the quality control work is still done ``manually'' by human editors.
These are, on one hand, editors who use the ``undo'' functionality from within the page's revision history.
On the other hand, there are also editors who engage with the standard encyclopedia editing mechanism (click the ``edit'' button on an article, enter changes in the editor which opens, write an edit summary for the edit, click ``save'') rather than using further automated tools to aid them.
When editors use these mechanisms for vandalism fighting, they often have not stumbled upon the vandalising edits by chance but rather have been actively watching the pages in question. %TODO: quote watchlist, current paper by Halfaker
This also gives us a hint as to what type of quality control work humans take over: the less obvious and less rapid cases; editors who patrol pages via their watchlists tend to have a deeper relationship with, or expertise on, the topic. %TODO quote needed.
%TODO cf. also the funnel diagram of incoming edits / quality assurance by Halfaker
also ARV, AIVer
\url{https://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser}
\subsection{Semi-automated tools}
Semi-automated tools used for vandalism fighting on Wikipedia have been discussed in prior literature.
more popular/widely used:
STiki~\cite{WestKanLee2010}
\url{http://en.wikipedia.org/wiki/Wikipedia:STiki}
Huggle~\cite{GeiHal2013},~\cite{HalRied2012},\cite{GeiRib2010}
Twinkle
AWB
\url{https://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser}
less popular/older, mentioned in older accounts or not discussed at all (there are also more tools, see for example \url{https://en.wikipedia.org/wiki/Category:Wikipedia_counter-vandalism_tools})
VandalProof~\cite{HalRied2012}
ARV
AIV
Lupin's Anti-vandal tool
\url{https://en.wikipedia.org/wiki/User:Lupin/Anti-vandal_tool}
"Please be aware that the original author of AVT (Lupin) is no longer active on Wikipedia. The script is very old and might stop working at any time."
"By using the RC feed to check a wiki-page's differences against a list of common vandal terms, this tool will detect many of the commonly known acts of online vandalism. "
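The quoted description of Lupin's tool amounts to matching a page's diff against a term list. A minimal Python sketch of the idea (the term list and function name here are invented for illustration; they are not Lupin's actual badwords list):

```python
import re

# A toy list of "common vandal terms"; a real tool would ship a much
# longer, community-maintained list of patterns.
VANDAL_TERMS = [r"\blol\b", r"\bwas here\b", r"!{3,}", r"\b(.)\1{4,}\b"]
VANDAL_RE = re.compile("|".join(VANDAL_TERMS), re.IGNORECASE)

def looks_like_vandalism(added_text: str) -> bool:
    """Flag an edit whose added text matches any common vandal term."""
    return VANDAL_RE.search(added_text) is not None

print(looks_like_vandalism("JOSH was here lol"))                  # True
print(looks_like_vandalism("Revised the 1889 completion date."))  # False
```

The approach is purely lexical, which explains both its speed and the false positives/negatives such tools are known for.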
In general, previous research seems to make a distinction of degree between ``more'' automated tools such as Huggle and STiki and ``less'' automated ones such as Twinkle~\cite{GeiHal2013}.
\cite{GeiHal2013}
"Huggle, the most widely-used, fully assisted, counter-
vandalism tool, were made within 1 minute of the
time-to-revert distribution that is closer to unassisted edits.
This suggests that Huggle and STiki are targeting different
kinds of edits"
They also suggest that Twinkle (on one side) and Huggle and STiki (on the other) are not in the same class of semi-automated vandal fighting tools, with Twinkle being more ``manual'' than the other two.
VandalProof~\cite{HalRied2012}
"Huggle, one of the most popular
antivandalism editing tools on
Wikipedia, is written in C\#.NET
and any user can download and
install it. Huggle lets editors roll back
administrator except append this incident of vandalism to his
original report, further attempting to enroll a willing
administrator into the ad-hoc vandal fighting network."
\cite{GeiRib2010}
"often-unofficial technologies have fundamentally
transformed the nature of editing and administration in
Wikipedia"
"Of note is the fact that these tools are largely
unofficial and maintained by members of the Wikipedia
community."
//refers also to bots
\cite{GeiRib2010}
Twinkle description:
removed the third vandal fighter's now-obsolete report."
\subsection{ORES}
%\section{Harassment and bullying}
\cite{HalTar2015}
"Today, we’re announcing the release of a new artificial intelligence service designed **to improve the way editors maintain the quality** of Wikipedia" (emphasis mine)
" This service empowers Wikipedia editors by helping them discover damaging edits and can be used to immediately “score” the quality of any Wikipedia article."
"these specs actually work to highlight potentially damaging edits for editors. This allows editors to triage them from the torrent of new edits and review them with increased scrutiny. " (probably triage the edits, not the specs)
"By combining open data and open source machine learning algorithms, our goal is to make quality control in Wikipedia more transparent, auditable, and easy to experiment with."
//so, purpose of ORES is quality control
"Our hope is that ORES will enable critical advancements in how we do quality control—changes that will both make quality control work more efficient and make Wikipedia a more welcoming place for new editors."
"ORES brings automated edit and article quality classification to everyone via a set of open Application Programming Interfaces (APIs). The system works by training models against edit- and article-quality assessments made by Wikipedians and generating automated scores for every single edit and article."
"English Wikipedians have long had automated tools (like Huggle and STiki ) and bots (like ClueBot NG) based on damage-detection AI to reduce their quality control workload. While these automated tools have been amazingly effective at maintaining the quality of Wikipedia, they have also (inadvertently) exacerbated the difficulties that newcomers experience when learning about how to contribute to Wikipedia. "
"These tools encourage the rejection of all new editors’ changes as though they were made in bad faith," //NB!!!
"Despite evidence on their negative impact on newcomers, Huggle, STiki and ClueBot NG haven’t changed substantially since they were first introduced and no new tools have been introduced. " //what about the edit filters? when were Huggle,STiki and ClueBotNG introduced?
"decoupling the damage prediction from the quality control process employed by Wikipedians, we hope to pave the way for experimentation with new tools and processes that are both efficient and welcoming to new editors. "
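For reference, a sketch of how a tool would consume an ORES score. The nesting below follows the shape of an ORES v3 scores response (wiki → scores → revision id → model → score); the concrete revision id and probabilities are invented for illustration:

```python
# Hand-written sample in the shape of an ORES v3 response for
# GET /v3/scores/enwiki/?models=damaging&revids=123456
# (revision id and probabilities are invented, not real API output).
sample_response = {
    "enwiki": {
        "scores": {
            "123456": {
                "damaging": {
                    "score": {
                        "prediction": True,
                        "probability": {"true": 0.91, "false": 0.09},
                    }
                }
            }
        }
    }
}

def damaging_score(response: dict, wiki: str, rev_id: str) -> tuple[bool, float]:
    """Extract (prediction, P(damaging)) for one revision from a response."""
    score = response[wiki]["scores"][rev_id]["damaging"]["score"]
    return score["prediction"], score["probability"]["true"]

prediction, p = damaging_score(sample_response, "enwiki", "123456")
print(prediction, p)  # True 0.91
```

This decoupling is exactly what the quote above describes: the tool only consumes a probability and decides for itself what to do with it.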
caution: biases in AI
" An algorithm that flags edits as subjectively “good” or “bad”, with little room for scrutiny or correction, changes the way those contributions and the people who made them are perceived."
"Examples of ORES usage. WikiProject X’s uses the article quality model (wp10) to help WikiProject maintainers prioritize work (left). Ra·un uses an edit quality model (damaging) to call attention to edits that might be vandalism (right)." //interesting for the memo
"Popular vandal fighting tools, like the aforementioned Huggle, have already adopted our revision scoring service."
further ORES applications:
" But revision quality scores can be used to do more than just fight vandalism. For example, Snuggle uses edit quality scores to direct good-faith newcomers to appropriate mentoring spaces,[4] and dashboards designed by the Wiki Education Foundation use automatic scoring of edits to surface the most valuable contributions made by students enrolled in the education program"
\section{Algorithmic Governance}
propriate moderator tools."
** that's positive! editors get immediate feedback and can adjust their (good faith) edit and publish it! which is psychologically better than publishing something and having it reverted two days later
* thought: filters are human-centered! (if a bot edits via the API, can it trigger a filter? Actually, I think yes, there were a couple of filters with something like "vandalbot" in their public comment)
\cite{GeiRib2010}
"these tools makes certain pathways of action easier for vandal
fighters and others harder"
"Ultimately, these tools take their users
through standardized scripts of action in which it always
possible to act otherwise, but such deviations demand
inventiveness and time."
%\subsection{Harassment and bullying}
\section{Limitations}
and 197: "amerikanisch -> US-amerikanisch ([[WP:RS\#Korrektoren]])"
Both are log-only filters;
and it is a political fight
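What ``log only'' means can be sketched as follows (a hypothetical Python illustration; real edit filters are written in the AbuseFilter rule language, and the function names here are invented): the filter recognises the amerikanisch → US-amerikanisch substitution, records a log entry, and otherwise lets the edit through untouched.

```python
import re

# Match "amerikanisch" only when not already preceded by "US-".
AMERIKANISCH = re.compile(r"(?<!US-)amerikanisch")

filter_log = []

def check_edit(old_text: str, new_text: str, user: str) -> str:
    """Always return 'allow'; append a log entry when the filter matches."""
    replaced = (AMERIKANISCH.search(old_text) is not None
                and AMERIKANISCH.search(new_text) is None
                and "US-amerikanisch" in new_text)
    if replaced:
        filter_log.append({"user": user,
                           "filter": "amerikanisch -> US-amerikanisch"})
    return "allow"  # log only: the filter never warns or disallows

check_edit("der amerikanische Präsident",
           "der US-amerikanische Präsident", "ExampleUser")
print(len(filter_log))  # 1
```

The political dimension lives entirely outside the code: the filter only makes the contested substitution visible in a log.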
\cite{GeiRib2010}
"We should not fall into the trap of speaking of bots and
assisted editing tools as constraining the moral agency of
editors"
\section{The bigger picture: Upload filters}
The planned introduction of upload filters by the EU copyright reform is seen critically by Wikimedia Germany:
Claudia: * A focus on the Good faith policies/guidelines is a historical development
* Read these pages
https://en.wikipedia.org/wiki/Category:Wikipedia_edit_filter
https://en.wikipedia.org/wiki/Wikipedia:Edit_warring
https://en.wikipedia.org/wiki/Wikipedia:Blocking_policy#Evasion_of_blocks