% 6-Discussion.tex (authored by Lyudmila Vaseva)
\chapter{Discussion and Limitations}
\label{chap:discussion}
\section{Discussion}
* So far, the whole inquiry is largely descriptive. It is fine that the status quo is captured, but then we should go a step further and ask "so what?". What do we gain from that? Explain the data
* maybe we won't be able to explain a lot of it, and those parts can be opened up further as interesting questions to be looked into by ethnographers
* think about what values we embed in which systems and how; --> Lessig
Difference between bots and filters: filters are part of the "platform" (cf. also ~\cite{Geiger2014} and the criticism towards the view of a holistic platform).
They are a MediaWiki extension, which means they run on official Wikimedia infrastructure (cf. \cite{Geiger2014} and "bespoke code").
This makes them more robust and bestows on them another kind of status.
Bots, on the other hand, are what Stuart Geiger calls "bespoke code": they are auxiliary programs developed, maintained, and run by individual community members, typically (at least historically) not on Wikimedia's infrastructure, but instead on private computers or third-party servers.
A key difference is also that while bots check already published edits, which they may eventually decide to revert, filters are triggered before an edit is ever published.
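The procedural contrast can be sketched in a few lines of Python (an illustrative model only; the actual MediaWiki/AbuseFilter and bot code paths look nothing like this, and the function and variable names are hypothetical):

```python
def save_edit(page_history, new_text, filters):
    """Filters run *before* the edit is published: a match can block it outright."""
    if any(f(new_text) for f in filters):
        return "disallowed"      # the edit never appears in the page history
    page_history.append(new_text)
    return "saved"

def bot_patrol(page_history, looks_like_vandalism):
    """Bots inspect *already published* revisions and may revert them."""
    if page_history and looks_like_vandalism(page_history[-1]):
        page_history.pop()       # revert: the bad revision was publicly visible
        return "reverted"
    return "clean"
```

The sketch makes the asymmetry visible: a disallowed edit leaves no trace in the page history, whereas a bot revert presupposes that the problematic revision was, at least briefly, public.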
* another difference between bots and filters: it is easier to DDoS the bot infrastructure than the filters: rent a cluster and edit until the revert table overflows
% Aim: I want to know why are there filters? How do they fit in the quality control ecosystem?
Distinction filters/bots: what tasks are handled by bots and what by filters (and why)? What difference does it make for admins? For users whose edits are being targeted? %TODO: good question, but move to analysis, since we won't be able to answer this on grounds of literature review only
\begin{itemize}
\item What can we filter with a regex? And what not? Are regexes a suitable technology for the ends the community is trying to achieve?
\item Filters are classical rule-based systems. What are suitable areas of application for such rule-based systems in contrast to ML systems?
\end{itemize}
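To make the first question concrete, here is a minimal sketch of what a regex-based rule can and cannot express (the patterns are hypothetical, and the real AbuseFilter rule language is richer than plain regexes):

```python
import re

# Surface patterns are easy to catch with a regex ...
RULES = {
    "repeated_chars": re.compile(r"(.)\1{9,}"),      # e.g. "aaaaaaaaaa" (10+ repeats)
    "shouting":       re.compile(r"\b[A-Z]{10,}\b"), # long all-caps words
}

def triggers(text):
    """Return the names of the rules an edit would trip."""
    return [name for name, rx in RULES.items() if rx.search(text)]

# ... but context-dependent judgements (is this claim false? is this phrasing
# subtly biased?) cannot be expressed as a surface pattern at all.
```

This is precisely where the rule-based/ML boundary from the second question lies: regexes excel at obvious, syntactic vandalism, while semantic judgements are out of their reach.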
Discuss results:
So I've now explored and gathered an understanding of the background (context), the general workings of the edit filter system, and the state of the art of edit filters on the English Wikipedia.
So what? What important/interesting insights have I gathered when contemplating all of this together?
* also comment on negative results!
* why do certain filters get implemented (and not others)?
* do filters effectively solve the task they were brought to life to fulfil?
* what kinds of biases/problems are there?
* who is allowed to edit edit filters?
Alternative approaches to community management:
compare with Surviving the Eternal September paper~\cite{KieMonHill2016}
"importance of strong
systems of norm enforcement made possible by leadership,
community engagement, and technology."
"emphasizing decentralized moderation" //all community members help enforce the norms
"ensuring enough leadership capacity is available
when an influx of newcomers is anticipated."
"Designers may
benefit by focusing on tools to let existing leaders bring others
on board and help them clearly communicate norms."
"designers should support an ecosystem of accessible and ap-
propriate moderator tools."
%***************************************
* a realisation: the number of filters cannot grow endlessly, since every edit is checked against all of them and this consumes computing power! (signalled in various places) (and this apparently has not been offset by Moore's law). Is this the reason why the number of filters has been more or less constant over the years?
* there seems to be a hard condition limit for filters, so the active ones are a best-of! Which filters make the best-of? A theory: "I've combated so and so many occurrences of vandalism X with my bot. Let us implement a filter for this."
* Claudia thinks it's easier to implement a filter than a bot (less technical knowledge needed)
* Filters trigger before a publication, bots trigger afterwards
** that's positive! Editors get immediate feedback and can adjust their (good faith) edit and publish it, which is psychologically better than publishing something and having it reverted two days later
* thought: filters are human-centred! (If a bot edits via the API, can it trigger a filter? Actually, I think yes; there were a couple of filters with something like "vandalbot" in their public comment)
Claudia: * A focus on the good faith policies/guidelines is a historical development. After the huge surge in edits Wikipedia experienced starting in 2005, the community needed a means to handle these (and the proportional amount of vandalism). They opted for automation. Automated systems branded a lot of good faith edits as vandalism, which drove newcomers away. A policy focus on good faith is part of the intentions to fix this.
It could be that the high hit count consisted of false positives, which would have led to the filter being disabled. (TODO: that's a very interesting question actually: how do we know the high number of hits were actually legitimate problems the filter wanted to catch and not false positives?)
From the talk archive:
//and one more user under the same impression
"The fact that Grawp-style vandalism is easily noticeable and revertible is precisely why we need this extension: because currently we have a lot of people spending a lot of time finding and fixing this stuff when we all have better things to be doing. If we have the AbuseFilter dealing with this simple, silly, yet irritating, vandalism; that gives us all more time to be looking for and fixing the subtle vandalism you mention. This extension is not designed to catch the subtle vandalism, because it's too hard to identify directly. It's designed to catch the obvious vandalism to leave the humans more time to look for the subtle stuff. Happy‑melon 16:35, 9 July 2008 (UTC) "
// and this is the most sensible explanation so far
\cite{GeiRib2010}
"these tools makes certain pathways of action easier for vandal
fighters and others harder"
"Ultimately, these tools take their users
through standardized scripts of action in which it always
possible to act otherwise, but such deviations demand
inventiveness and time."
%\subsection{Harassment and bullying}
* where is the thesis going?
* should there be some recommended guidelines based on the insights?
* or some design recommendations?
* or maybe just a framework for future research: which questions have we opened, whose answers we still don't know, that should be addressed by future research?
\section{Limitations}
This work presents a first attempt at analysing Wikipedia's edit filter system.
It has several limitations.
First, it focuses on English Wikipedia only.
We are convinced that there are valuable lessons to be learnt (about the communities, the usefulness of filters, etc.) from comparing edit filter use across different language versions.
Second, unfortunately, including an ethnographic analysis was not possible.
This is partly because we employ a computer science perspective on the question and partly due to limited time.
Third, the manual filter classification was undertaken by one person only, so this person's biases have certainly shaped the labels.
%TODO describe also negative results!
%Data
The following other pages looked interesting or related, but were left out, mainly because of insufficient time.
(Is there a better rationale for why I looked specifically at the pages I did, while leaving precisely these others for later?)