diff --git a/thesis/3-Methods.tex b/thesis/3-Methods.tex
index 559f37f5c79450a91b578b211098f286d4063e20..b71d99286e8714e83b5d8c7d51606b36e916bb0b 100644
--- a/thesis/3-Methods.tex
+++ b/thesis/3-Methods.tex
@@ -64,7 +64,7 @@ Finally, a third coding phase took place--the so called axial coding which ``re
 \section{Open Science}
 The whole work tries to adhere to the principles of open science and reproducible research.
 %TODO what are the principle of open science? refs are missing
-All the computations I have done and other artefacts I have used or compiled are openly accessible in the project's repository~\cite{github}.
+All the computations I have done and other artefacts I have used or compiled are openly accessible in the project's repository~\cite{gitlab}.
 and can be re-used under a free license (which one?).
 And have been openly accessible since the very beginning.
 Everyone interested can follow the process and/or use the data or scripts in order to verify my computations (syn) or run their own and thus continue this research along one of the directions suggested in section~\ref{sec:further-studies} or in a completely new one.
diff --git a/thesis/5-Overview-EN-Wiki.tex b/thesis/5-Overview-EN-Wiki.tex
index d0f736dbb312034e36e1f065b94d2f186386e484..50b0c5dafabfc5a4bcd378f37405b1a5dc52c8d0 100644
--- a/thesis/5-Overview-EN-Wiki.tex
+++ b/thesis/5-Overview-EN-Wiki.tex
@@ -19,7 +19,7 @@ And finally, some historical patterns are observed in section~\ref{sec:5-history
 \label{sec:overview-data}
 
 A big part of the present analysis is based upon the \emph{abuse\_filter} table from \emph{enwiki\_p}(the database which stores data for the EN Wikipedia), or more specifically a snapshot thereof which was downloaded on January 6th, 2019 via quarry, a web-based service offered by Wikimedia for running SQL queries against their public databases~\footnote{\url{https://quarry.wmflabs.org/}}.
-The complete dataset can be found in the repository for the present paper~\cite{github}. % TODO add a more specific link
+The complete dataset can be found in the repository for the present paper~\cite{gitlab}.
 This table, along with \emph{abuse\_filter\_actions}, \emph{abuse\_filter\_log}, and \emph{abuse\_filter\_history}, are created and used by the AbuseFilter MediaWiki extension~(\cite{gerrit-abusefilter-tables}), as discussed in section~\ref{sec:mediawiki-ext}.
 Selected queries have been run via quarry against the \emph{abuse\_filter\_log} table as well.
 
@@ -82,7 +82,7 @@ Unfortunately, there was no time, so the analysis of the present section is base
 Comparing codes from both labeling sessions and refining the coding is one of the possibilities for future research.
 %TODO (re-formulate!)
 %TODO disclose links to 1st and 2nd labelling
-First round of labeling is available under \url{https://github.com/lusy/wikifilters/blob/master/filter-lists/20190106115600_filters-sorted-by-hits-manual-tags.csv}.
+First round of labeling is available under
 
 \begin{comment}
 % Kept as a possible alternative wording for private vs public and labeling decisions in ambiguous cases
@@ -121,6 +121,16 @@ a lot of the filters issue warnings intending to guide the editors towards ways
 
 Codes from this category often take into consideration the area the editor was intending to contribute to or respectively that they (presumably) unintentionally disrupted.
 
+%TODO decide what to do with this paragraph; most of it should be mentioned already
+\begin{comment}
+As I recently learned, apparently this guideline arose/took such a central position not from the very beginning of the existence of the collaborative encyclopedia.
+It rather arose at a time when, after a significant growth in Wikipedia, it wasn't manageable to govern the project (and most importantly fight emergent vandalism which grew proportionally to the project's growth) manually anymore.
+To counteract vandalism, a number of automated measures was applied.
+These, however, had also unforseen negative consequences: they drove newcomers away~\cite{HalKitRied2011}(quote literature) (since their edits were often classified as "vandalism", because they were not familiar with guidelines / wiki syntax / etc.)
+In an attempt to fix this issue, "Assume good faith" rose to a prominent position among Wikipedia's Guidelines.
+(Specifically, the page was created on March 3rd, 2004 and was originally refering to good faith during edit wars.
+An expansion of the page from December 29th 2004 starts refering to vandalism. https://en.wikipedia.org/w/index.php?title=Wikipedia:Assume_good_faith&oldid=8915036)
+\end{comment}
 
 \subsection{Maintenance}
 
diff --git a/thesis/references.bib b/thesis/references.bib
index 980a7f2698df1a7b164ba3a63bb55237cf11f74c..858b391cc20dc1cddeff0b1196b30ef48498274b 100644
--- a/thesis/references.bib
+++ b/thesis/references.bib
@@ -145,12 +145,12 @@
 \url{https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/498773/}}
 }
 
-@misc{github,
-    key = "Github Repository",
+@misc{gitlab,
+    key = "Gitlab Repository",
     author = {},
-    title = {Github Repository of the thesis},
+    title = {Gitlab Repository of the thesis},
     year = 2019,
-    note = {\url{https://github.com/lusy/wikifilters}}
+    note = {\url{https://git.imp.fu-berlin.de/luvaseva/wikifilters}}
 }
 
 @article{HalGeiMorRied2013,