From a04de0bf4efff6038932b638b8594bef52fc4482 Mon Sep 17 00:00:00 2001
From: Lyudmila Vaseva <vaseva@mi.fu-berlin.de>
Date: Wed, 24 Jul 2019 09:18:49 +0200
Subject: [PATCH] Refactor intro and conclusion again

---
 thesis/6-Discussion.tex |  2 +-
 thesis/conclusion.tex   | 31 +++++++++++++++++++++++++------
 thesis/introduction.tex | 29 ++++++++++++-----------------
 thesis/references.bib   |  9 +++++++++
 4 files changed, 47 insertions(+), 24 deletions(-)

diff --git a/thesis/6-Discussion.tex b/thesis/6-Discussion.tex
index 0b0ede3..a12919e 100644
--- a/thesis/6-Discussion.tex
+++ b/thesis/6-Discussion.tex
@@ -23,7 +23,7 @@ What did the filters accomplish differently?
 % before vs after
 A key distinction is that while bots check already published edits which they eventually may decide to revert, filters are triggered before an edit is ever published.
 One may argue that nowadays this is not a significant difference.
-Whether a disruptive edit is outright disallowed or caught 2 seconds after its publication by ClueBot NG doesn't have a tremendous impact on the readers:
+Whether a disruptive edit is outright disallowed or caught and reverted 2 seconds after its publication by ClueBot NG doesn't have a tremendous impact on the readers:
 the vast majority of them will never see the edit either way.
 Still, there are several examples of hoaxes that didn't survive long on Wikipedia, but the few seconds before they were reverted were sufficient for the corrupted version to be indexed by various news aggregators and search engines. %TODO find them!
 
diff --git a/thesis/conclusion.tex b/thesis/conclusion.tex
index aac47ac..0bdd344 100644
--- a/thesis/conclusion.tex
+++ b/thesis/conclusion.tex
@@ -14,17 +14,29 @@ The role of edit filters in Wikipedia's quality control ecosystem, the tasks the
 It was further discussed why such an old-school rule-based technology is still actively used today when more advanced machine learning approaches exist.
 Additionally, interesting paths for future research were suggested.
 
+% TODO more detailed overview of results
+Summing up the most prominent results, edit filters are the first mechanism to verify incoming contributions.
+By acting on unpublished edits, they can disallow unconstructive ones directly and thus reduce the workload for other mechanisms.
+At the time of their introduction, the need was felt for a mechanism that swiftly prohibits obvious but difficult-to-remove vandalism, often caused by the same highly motivated malicious users.
+Although mass-scale page moves to nonsensical names could have been taken care of by admin bots, edit filters were viewed as the neater solution since this way such edits are not published at all.
+Moreover, given some dissatisfaction with bots' development processes (poorly tested and unavailable source code, low responsiveness of some bot operators), the opportunity for a clean start with a new tool was seized.
+Apart from targeting single highly motivated disrupting editors, edit filters take care of ``common newbie mistakes'' such as publishing text not formatted according to wikisyntax, or erasing an entire page instead of properly moving it to a different name or nominating it via the formal Articles for Deletion process.
+By issuing warnings with helpful pointers towards possible alternative actions, edit filters allow an unintentionally disrupting editor to improve their contribution before re-submitting it.
+With feedback provided immediately at submission, the revert-first-ask-questions-later approach of other mechanisms (which frustrates and alienates well-intentioned newcomers~\cite{HalGeiMorRied2013}) is inverted.
+Compared to machine learning techniques, rule-based systems such as the edit filters have the advantage of giving their operators a higher degree of control and of being easier to understand, which also enhances accountability.
+
+
 % TODO Refer back to title! Who is allowed to publish? Who decides?
 Taking a step back,
-according to the Wikipedian community people adding references to Brazilian aardvarks or <inser-another-hoax here> shall preferably not publish at all.
+according to the Wikipedian community, people adding references to Brazilian aardvarks or proclaiming themselves mayors of small Chinese towns~\cite{Wikipedia:ChenFang} should preferably not publish at all.
 If we are to handle this type of disruption with edit filters, two approaches seem feasible:
 Warn editors adding the information that their contribution does not contain any references, or outright disallow such edits
 (which does not solve the problem of freely invented sources)
-, but that was pretty much it. %TODO look into all filters tagges as "hoaxing"
-Albeit edit filters may not be the ideal mechanism to deal with hoaxes, what they can do more effectively is prevent someone from moving XXX pages to titles containing ``ON WHEELS'', thus sparing vandal fighters the need to track down and undo these changes, allowing them to use their time more productively by for example fact checking unverified claims and hence reducing the number of fake aardvarks and increasing the overall credibility of the project.
+, but that was pretty much it.
+Although edit filters may not be the ideal mechanism to deal with hoaxes, what they can do effectively is prevent someone from moving hundreds of pages to titles containing ``ON WHEELS'', thus sparing vandal fighters the need to track down and undo these changes. This allows them to use their time more productively, for example by fact-checking unverified claims, thereby reducing the number of fake aardvarks and increasing the overall credibility of the project.
 
 %Outlook: centralisation, censorship
-It is impressive how in under 20 years ``a bunch of nobodies created the world's greatest encyclopedia'' to quote Anrew Lih~\cite{Lih2009}.
+It is impressive how in under 20 years ``a bunch of nobodies created the world's greatest encyclopedia'', to quote new media researcher Andrew Lih~\cite{Lih2009}.
 This was possible, among other things, because there was one Wikipedia to which everybody contributed.
 As the project and its needs for quality control grew, a lot of processes became more centralised~\cite{HalGeiMorRied2013}.
 It is, at the end, easier to maintain power and control in a centralised infrastructure.
@@ -33,7 +45,7 @@ It is not an accident that at the very introduction of the AbuseFilter extension
 If there were multiple comparable projects, all of them had to be censored in order to silence people.
 With Wikipedia being the first go-to source of information for a vast number of people all over the world today, the debate about whose knowledge is included and who decides what knowledge is worth preserving is essential.
 In the present moment, it is more relevant than ever:
-The European Parliament basically voted the introduction of upload filters on the Internet just couple of months ago. %TODO give more details on Copyright directive
+In March 2019, the European Parliament adopted the new Copyright Directive, effectively voting for the introduction of upload filters all over the Internet.
 
 Since Wikipedia is distinctly relevant for the shaping of public opinion, despite its ``neutral point of view'' policy~\cite{Wikipedia:NeutralPointOfView} it is inherently political.
 At the beginnings of this research, I heard the rumour that there was an edit filter on the German Wikipedia targeting gendering.
@@ -41,10 +53,17 @@ At the beginnings of this research, I heard the rumour that there was an edit fi
 It is a political praxis aiming to uncover under-represented groups and their experiences through the conscious use of language.
 Even though no linguistic norm has established gendering to date, conscious decisions for or against the praxis are political, and so are technologies implementing these decisions.
 As it turned out, no such filter existed on the German Wikipedia
-\footnote{Although, as I have heard from women active in the German Wikipedia community, there is a strong general backlash against gendering. The community is also extremely men dominated.}.
+\footnote{Although, as I have heard from women active in the German Wikipedia community, there is a strong general backlash against gendering. The community is also extremely male-dominated.}.
 This illustrates a point though:
 Artefacts do have politics and as Lawrence Lessig puts it, it is up to us to decide what values we embed in the systems we create~\cite{Lessig2006}. %TODO Do Artefacts have politics?
 
+%TODO reuse this?
+\begin{comment}
+``Code 2.0 TO WIKIPEDIA, THE ONE SURPRISE THAT TEACHES MORE THAN EVERYTHING HERE.'' reads one of the inscriptions of Lawrence Lessig's ``Code Version 2.0'' (p.v)~\cite{Lessig2006}.
+And although I'm not quite sure what exactly Lessig meant by this regarding the update of his famous book, I readily agree that Wikipedia is important because it teaches us stuff.
+Not only in the literal sense, because it is, well, an encyclopedia.
+Being an open encyclopedia, which has grown to be one of the biggest open collaborative projects in the world, studying its complex governance, community building and algorithmic systems can teach us a lot about other, less open systems.
+\end{comment}
 
 \begin{comment}
 \cite{Lessig2006}
diff --git a/thesis/introduction.tex b/thesis/introduction.tex
index 8cf140f..13f4f76 100644
--- a/thesis/introduction.tex
+++ b/thesis/introduction.tex
@@ -22,27 +22,20 @@ It proved not trivial to erase the snippet from Wikipedia since there were all t
 By then, it was not exactly false either: the coati \emph{was} known as ``Brazilian aardvark'', at least on the Internet.
 
 Now, despite various accounts that Wikipedia seems to be just as accurate as and more complete than the Encyclopedia Britannica~\cite{}, %TODO quote!
-the stories like the one above are precisely why it is still maintained that information on Wikipedia cannot be trusted, or used as a serious bibliographic reference.
+stories like the one above are precisely why it is still maintained that information on Wikipedia cannot be trusted, or used as a serious bibliographic reference.
 
 %TODO transition is somewhat jumpy
 The Wikipedian community is well aware of its project's poor reliability reputation and has a long-standing history of quality control processes.
 Not only hoaxes, but profanities, malicious vandals, and spammers have been there since the very beginning and their numbers have increased with the rise of the project to prominence.
 %Since its conception in 2001, when nobody believed it was ever going to be a serious encyclopedia, the project has grown steadily.
 At the latest, with the exponential surge in the numbers of users and edits around 2006, the community began realising that they needed a more automated means for quality control.
-The same year, the first anti-vandal bots were implemented, followed by semi-automated revision patroling tools such as Twinkle (in 2007) and Huggle (in the beginnings of 2008).
+The same year, the first anti-vandal bots were implemented, followed by semi-automated tools facilitating revision verification, such as Twinkle (in 2007) and Huggle (in early 2008).
 In 2009, yet another mechanism dedicated to quality control was introduced.
 Its core developer, Andrew Garrett, known on Wikipedia as User:Werdna, has called it ``abuse filter'', and according to EN Wikipedia's newspaper, The Signpost, its purpose was to ``allow[...] all edits to be checked against automatic filters and heuristics, which can be set up to look for patterns of vandalism including page move vandalism and juvenile-type vandalism, as well as common newbie mistakes''~\cite{Signpost2009}.
 %TODO decide whether to cite the Signpost here already, since it appears again in chapter4
 
 %TODO right now, an abrupt end
 
-%TODO reuse this?
-\begin{comment}
-``Code 2.0 TO WIKIPEDIA, THE ONE SURPRISE THAT TEACHES MORE THAN EVERYTHING HERE.'' reads one of the inscriptions of Lawrence Lessig's ``Code Version 2.0'' (p.v)~\cite{Lessig2006}.
-And although I'm not quite sure what exactly Lessig meant by this regarding the update of his famous book, I readily agree that Wikipedia is important because it teaches us stuff.
-Not only in the literal sense, because it is, well, an encyclopedia.
-Being an open encyclopedia, which has grown to be one of the biggest open collaborative projects in the world, studying its complex governance, community building and algorithmic systems can teach us a lot about other, less open systems.
-\end{comment}
 
 \begin{comment}
 Idea: have opening quotes per chapter
@@ -104,13 +97,15 @@ Which stuff?
 \end{itemize}
 \end{comment}
 
-The present work can be embedded in the context of (algorithmic) quality-control on Wikipedia and in the more general context (syn!) of algorithmic governance.
+The present work can be embedded in the context of (algorithmic) quality-control on Wikipedia and in the more general research area of algorithmic governance.
 %TODO go into algorithmic governance!
-There is a whole ecosystem (syn?) of actors struggling to maintain the anyone-can-edit encyclopedia as accurate and free of malicious content as possible.
-The focus of the present work (syn!) are edit filters, the mechanism initially introduced by User:Werdna under the name of ``abuse filters'', previously unexplored by the scientific community.
-We want to be able to better understand the role of edit filters in the vandal fighting network of humans, bots, semi-automated tools, and the machine learning framework ORES.
-After all, edit filters were introduced to Wikipedia at a time when the majority of the afore mentioned mechanisms already existed and were involved in quality control: in 2009 (whereas the page of the semi-automated tool Twinkle~\cite{Wikipedia:Twinkle} was created in January 2007, the one of the tool Huggle~\cite{Wikipedia:Huggle}—in the beginning of 2008; bots have been around longer, but first records of vandal fighting bots come from 2006).
-%Why were filters introduced, when other mechanisms existed already?
+There is a whole ecosystem of actors struggling to maintain the anyone-can-edit encyclopedia as accurate and free of malicious content as possible.
+The focus of this work is edit filters, the mechanism initially introduced by User:Werdna under the name ``abuse filters'', previously unexplored by the scientific community.
+The goal of the current project is to better understand the role of edit filters in the vandal fighting network of humans, bots, semi-automated tools, and the Wikipedian machine learning framework ORES.
+After all, edit filters were introduced to Wikipedia at a time when the majority of the aforementioned mechanisms already existed and were involved in quality control
+\footnote{Edit filters were introduced in 2009.
+The page of the semi-automated tool Twinkle~\cite{Wikipedia:Twinkle} was created in January 2007, that of the tool Huggle~\cite{Wikipedia:Huggle} in early 2008.
+Bots have been around longer, but the first records of vandal-fighting bots date from 2006.}.
 
 \begin{comment}
 \section{Algorithmic Governance}
@@ -136,8 +131,8 @@ them harder to change and easier to enforce” (p. 87)"
 \section{Contributions}
 %\section{Aims of this work}
 
-The aim of this work is to find out why edit filters were introduced on Wikipedia and what role they assume in Wikipedia's quality control ecosystem.
-We want to unearth the tasks taken over by filters %in contrast to other quality control meachanisms
+The aim of this work is to find out why edit filters were introduced on Wikipedia and what role they assume in Wikipedia's quality control ecosystem, a topic on which there is so far a gap in academic research.
+Further, this research seeks to understand what tasks are taken over by filters %in contrast to other quality control meachanisms
 and—as far as practicable—track how these tasks have evolved over time (are there changes in type, numbers, etc.?).
 %and understand how different users of Wikipedia (admins/sysops, regular editors, readers) interact with these and what repercussions the filters have on them.
 Last but not least, it is discussed why a classic rule-based system such as the filters is still operational today when more sophisticated machine-learning approaches exist.
diff --git a/thesis/references.bib b/thesis/references.bib
index 638722a..2684535 100644
--- a/thesis/references.bib
+++ b/thesis/references.bib
@@ -517,6 +517,15 @@
                   \url{https://en.wikipedia.org/w/index.php?title=Wikipedia:Administrator_intervention_against_vandalism&oldid=891917401}}
 }
 
+@misc{Wikipedia:ChenFang,
+  key =          "Wikipedia Administrators Noticeboard",
+  author =       {},
+  title =        {Wikipedia: Administrators Noticeboard—Chen Fang Hoax},
+  year =         2019,
+  note =         {Retrieved 24 July 2019 from
+                  \url{https://en.wikipedia.org/w/index.php?title=Wikipedia:Administrators%27_noticeboard/Archive241&oldid=891675599#Fictional_entry?}}
+}
+
 @misc{Wikipedia:AntiVandalBot,
   key =          "Wikipedia AntiVandalBot",
   author =       {},
-- 
GitLab