Commit 549ff3c2 authored by Lyudmila Vaseva

Refactor labeling process section

parent c48b29c0
@@ -99,6 +99,7 @@ other spaces Wikipedia takes place
\end{comment}
\section{Grounded Theory}
\label{sec:gt}
Grounded theory describes a variety of frameworks for building a scientific theory \emph{grounded} in (mostly qualitative) data analysis.
@@ -121,6 +122,8 @@ Scholars regard this as useful because that way the danger of trying to press da
Instead, the codes emerge directly from observations of the data.
Since coding and analysis take place simultaneously, it is also common to come back later and re-code parts of the data with labels that emerged later in the process.
% Coding~\cite[42-71]{Charmaz2006}
\begin{comment}
Grounded Theory~\cite{Charmaz2006}
Chapter 2:
...
@@ -382,60 +382,64 @@ Although apparently there are determined trolls who ``work accounts up'' to admi
\label{sec:manual-classification}
The aim of this section is to gain a better understanding of what exactly it is that edit filters are filtering.
Based on the grounded theory methodology presented in chapter~\ref{chap:methods}, I applied emergent coding to all filters, scrutinising their patterns, comments and actions.
%TODO Comment on exact process of coding (check with coding book, I think a lot is explained there already)
Three big clusters of filters were identified, namely ``vandalism'', ``good faith'', and ``maintenance'' (along with the auxiliary cluster ``unknown''). %TODO define what each of them are; I actually work with 8 main clusters in the end; Unify this
These are discussed in more detail later in this section.
\subsection{Labeling process and challenges}
As already mentioned, I started coding strongly influenced by the coding methodologies applied by grounded theory scholars (see chapter~\ref{chap:methods}) and let the labels emerge during the process.
I looked through the data paying special attention to the name of the filters (``af\_public\_comments'' field of the \emph{abuse\_filter} table), the comments (``af\_comments''), the regular expression pattern constituting the filter (``af\_pattern''), and the designated filter actions (``af\_actions'').
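As an illustration, the following minimal Python sketch shows how these fields could be loaded for inspection. It was not part of the actual analysis; the CSV export and its file name are assumptions, while the column names are the database fields listed above.
\begin{verbatim}
# Minimal sketch: load a (hypothetical) CSV export of the
# abuse_filter table and print the fields consulted during coding.
import pandas as pd

filters = pd.read_csv("abuse_filter_export.csv")  # assumed file name
fields = ["af_public_comments", "af_comments",
          "af_pattern", "af_actions"]
for _, row in filters[fields].iterrows():
    print(row["af_public_comments"], "->", row["af_actions"])
\end{verbatim}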
The assigned codes emerged from the data: some are literal quotes of terms used in the description or comments of a filter, while others summarise the perceived filter functionality.
In addition to that, for vandalism-related labels, I used some of the vandalism types identified by the community in~\cite{Wikipedia:VandalismTypes}.
However, this typology served more as an inspiration than as a scheme to be adopted 1:1, since some of the types were quite general, whereas more specific categories seemed to yield more insight.
For instance, I have not applied the ``addition of text'' category, since it seemed more useful to have more specific labels such as ``hoaxing'' or ``silly\_vandalism'' (see the code book in appendix~\ref{app:code_book} for definitions).
Moreover, I found some of the proposed types redundant.
For example, ``sneaky vandalism'' seems to overlap partially with ``hoaxing'' and partially with ``sockpuppetry'', ``link vandalism'' mostly overlaps with ``spam'' or ``self promotion'' (although not always), and for some reason, ``personal attacks'' are listed twice.
I have labeled the dataset twice.
The motivation for this was to return to the data once I had gained better insight into it, and to use this knowledge to re-evaluate ambiguous cases, i.e. re-label some filters with codes that emerged later in the process.
This mode of labeling is congruous with the simultaneous coding and data analysis suggested by grounded theorists (compare section~\ref{sec:gt}).
The following challenges were encountered during the first round of labeling.
There were some ambiguous cases which I either tagged with the code I deemed most appropriate and a question mark, or assigned all possible labels (or both).
There were also cases for which I could not gather any insight relying on the name, comments and pattern, since the filters were hidden from public view and the name was not descriptive enough.
However, upon some further reflection, I think it is safe to assume that all hidden filters target a form of (more or less grave) vandalism, since the guidelines suggest that filters should not be hidden in the first place unless dealing with cases of persistent and specific vandalism where it could be expected that the vandalising editors will actively look for the filter pattern in their attempts to circumvent the filter~\cite{Wikipedia:EditFilter}.
Therefore, during the second round of labeling I tagged all hidden filters for which there weren't any more specific clues (for example in the name of the filter) as ``hidden\_vandalism''.
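Put as a rule of thumb, the decision can be sketched as follows; the helper function is hypothetical, while \emph{af\_hidden} is the database field marking hidden filters.
\begin{verbatim}
# Sketch of the hidden-filter rule: a hidden filter without a more
# specific clue (e.g. in its public name) is coded "hidden_vandalism".
def label_hidden(af_hidden, specific_label=None):
    if af_hidden and specific_label is None:
        return "hidden_vandalism"
    return specific_label
\end{verbatim}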
There were also cases, not necessarily hidden, where I could not determine any suitable label, because I did not understand the regex pattern, none of the existing categories seemed to fit, or I could not think of an insightful new category to assign.
During the first labeling, these were labeled ``unknown'', ``unclear'' or ``not sure''.
For the second round, I have unified all of them under ``unclear''.
For a number of filters, it was particularly difficult to determine whether they were targeting vandalism or good faith edits.
The only thing that would have distinguished between the two would have been the contributing editor's motivation, which we had no way of knowing.
During the first labeling session, I tended to label such filters with ``vandalism?'' and ``good\_faith?''.
For the second labeling, I stuck to the ``assume good faith'' guideline~\cite{Wikipedia:GoodFaith} myself and only labeled as vandalism cases where good faith was definitely out of the question.
One feature which guided me here was the filter action, which represents the judgement of the edit filter manager(s).
Since communication is crucial when assuming good faith, all ambiguous cases with a less ``grave'' filter action such as ``tag'' or ``warn'' (actions which seek to give feedback and thereby encourage a constructive contribution) received a ``good\_faith'' label.
On the other hand, filters set to ``disallow'' were tagged as ``vandalism'' or a particular type thereof, since this filter action is a clear sign that at least the edit filter managers have decided that seeking a dialog with the offending editor is no longer an option. %TODO check whether that's really the case
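This heuristic can be summarised in the following sketch; the helper function is hypothetical, and \emph{af\_actions} is assumed to hold the comma-separated action list as stored in the database.
\begin{verbatim}
# Sketch of the action-based heuristic for ambiguous filters:
# "disallow" rules out good faith; softer actions such as "warn"
# or "tag" keep the dialog with the editor open.
def label_ambiguous(af_actions):
    actions = set(af_actions.split(","))
    if "disallow" in actions:
        return "vandalism"
    if actions & {"warn", "tag"}:
        return "good_faith"
    return "unclear"
\end{verbatim}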
%TODO compare also with revising codes as the analysis goes along according to Grounded Theory
For the second round of labeling, I tagged the whole dataset again using the compiled code book (see appendix~\ref{app:code_book}) and assigned to every filter exactly one label, the one deemed most appropriate (although oftentimes alternative possibilities were listed as notes), without looking at the labels I had assigned the first time around.
I intended to compare the labels from both coding sessions and focus on the more ambiguous cases, re-evaluating them using all available information (patterns, public comments, labels from both sessions, and any notes I made along the way).
Unfortunately, there was no time for this comparison, so the analysis of the present section is based upon the second round of labeling.
Comparing the codes from both labeling sessions and refining the coding remains a possibility for future research.
%TODO disclose link to 2nd labeling
The first round of labeling is available under \url{https://github.com/lusy/wikifilters/blob/master/filter-lists/20190106115600_filters-sorted-by-hits-manual-tags.csv}.
\begin{comment}
% Kept as a possible alternative wording for private vs public and labeling decisions in ambiguous cases
It was not always a straightforward decision to determine what type of edits a certain filter is targeting.
This was of course particularly challenging for private filters where only the public comment (name) of the filter was there to guide the coding.
On the other hand, guidelines state up-front that filters should be hidden only in cases of particularly persistent vandalism, in so far it is probably safe to establish that all hidden filters target some type of vandalism.
However, the classification was difficult for public filters as well, since oftentimes what makes the difference between a good-faith and a vandalism edit is not the content of the edit but the intention of the editor.
While there are cases of juvenile vandalism (putting random swear words in articles) or character repetition vandalism which are pretty obvious, that is not the case for section or article blanking, for example.
For these, from the edit alone there is no way of knowing whether the deletion was malicious or the editor conducting it just wasn't familiar with say the correct procedure for moving an article.
\end{comment}
\subsection{Editors' motivation}
\begin{comment}
@@ -470,6 +474,7 @@ Only if the disrupting editor proves to be uncooperating, ignores warnings and c
In the subsections that follow, the salient properties of each manually labeled category are discussed.
\subsection{Vandalism}
% malicious
The vast majority of edit filters on the EN Wikipedia could be said to target (different forms of) vandalism, i.e. maliciously intended disruptive editing.
Examples thereof are filters for juvenile types of vandalism (inserting swear or obscene words or nonsense sequences of characters into articles), for hoaxing (inserting obvious or less obvious false information into articles), or for template vandalism (modifying a template in a disruptive way, which is quite severe since templates are displayed on various pages).
@@ -550,6 +555,7 @@ Filters targeting such behaviours (syn) were identified and grouped in the ``dis
\subsection{Good Faith}
% (mostly) disruptive, but not necessarily made with bad intentions
The second big cluster comprises filters targeting ``good faith'' edits.
``Good faith'' is a term adopted by the Wikipedia community itself, most prominently in the guideline ``assume good faith''~\cite{Wikipedia:GoodFaith}.
@@ -591,6 +597,7 @@ I actually think, a bot fixing this would be more appropriate.
\end{comment}
\subsection{Maintenance}
% Tracking bugs, etc.
Some of the edit filters encountered on the EN Wikipedia were targeting neither vandalism nor good faith edits.
Rather, they focused on (semi-)automated routine clean-up tasks.
...
@@ -13,6 +13,7 @@
\label{app:code_book}
This section provides a detailed overview of all the codes\footnote{Here, I use the words ``codes'', ``tags'' and ``labels'' interchangeably.} used for the manual tagging of edit filters.
The purpose of the coding was to gain insight into the specific tasks for which filters are applied on the English Wikipedia.
%TODO put all the labels in a table?
...