Commit 85610884 authored by Lyudmila Vaseva

Refactor chapter 3 and trace ethnography

parent 739d48db
\chapter{Methods}
\label{chap:methods}
This chapter describes the methodology applied throughout the thesis.
\section{Open Science}
The whole work tries to adhere to the principles of open science. %TODO what are the principle of open science? refs are missing
All the computations I have done and other artefacts I have used or compiled have been openly accessible in the project's repository~\cite{github} since the very beginning.
Everyone interested can follow the process and/or use the data or scripts in order to verify my computations (syn) or run their own and thus continue this research along one of the directions suggested in section~\ref{sec:further-studies} or in a completely new one.
This chapter describes the methodology applied for the study of edit filters.
\section{Trace Ethnography}
\label{sec:trace-ethnography}
A second important theoretical framework is trace ethnography.
The concept was first introduced/used by Geiger and Ribes in their 2010 work ``The work of sustaining order in Wikipedia: the banning of a vandal''~\cite{GeiRib2010} and introduced in detail in a 2011 paper~\cite{GeiRib2011}.
The scholars define trace ethnography as a methodology which
The main theoretical framework for the analysis presented in chapters~\ref{chap:filters} and~\ref{chap:overview-en-wiki} is trace ethnography.
The concept was first utilised by Geiger and Ribes in their 2010 work ``The work of sustaining order in Wikipedia: the banning of a vandal''~\cite{GeiRib2010} and introduced in detail in a 2011 paper~\cite{GeiRib2011} by the same authors.
They define trace ethnography as a methodology which
``combines the richness of participant-observation
with the wealth of data in logs so as to reconstruct
patterns and practices of users in distributed
sociotechnical systems''
and is especially practical for research in distributed technical systems (redundant with the quote) where direct participant observation is impractical, costly, and tends to miss phenomena due to..
They use documents and document traces: ... %TODO which ones
sociotechnical systems''.
It is supposedly especially practical for research in such distributed systems, since direct participant observation there is impractical, costly, and tends to miss phenomena which manifest themselves in the communication between spatially separated sites rather than in a single location.
In~\cite{GeiRib2011} the scholars use documents and document traces: MediaWiki revision data (more specifically, the edit summary fields of individual revisions and the markers and codes left within them) and the documentation of semi-automated software tools.
They even use the tools themselves (Huggle and Twinkle) to observe what traces these leave.
This allows them to reconstruct individual strands of action quite exactly and to comprehend how different agents on Wikipedia work together towards the blocking of a single malicious user.
They (syn!) refer to ``turn[ing] thin documentary traces into “thick descriptions” of actors and events''.
They refer to ``turn[ing] thin documentary traces into “thick descriptions” of actors and events''.
What is more, these traces are used by Wikipedians themselves in order to do their work efficiently.
Geiger and Ribes underline the importance of insider knowledge when reconstructing actions and processes based on the traces,
the need for ``an ethnographic understanding of the activities, people, systems, and technologies which contribute to their production''.
They caution that trace ethnography can only observe what the system records, and that records are always incomplete.
%TODO pitfalls of using data produced for other purposes?
This consideration is elaborated in~\cite{GeiHal2017}, where Geiger and Halfaker make the point that ``found data'' generated by a system for a particular purpose (e.g. a revision history, whose purpose is to keep track of who edited what and when, and possibly to revert (to) a particular revision) rarely fits ideally as a dataset for answering a scientist's particular research question.
The researchers also warn of possible privacy breaches through thickening traces:
Geiger and Ribes~\cite{GeiRib2011} also warn of possible privacy breaches through thickening traces:
although the records they use to reconstruct paths of action are all open, the thick descriptions they compile can suddenly expose a lot of information about individual users that never existed in this form before, and these users never gave their informed consent to their data being used this way.
\begin{comment}
\cite{GeiHal2017}
"when working with large-scale “found data” [36] of the traces
users leave behind when interacting on a platform, how do we best operationalize culturally-specific
concepts like conflict in a way that aligns with the particular context in which those traces were made?"
Star: "ethnography of infrastructure":
"discusses the “veridical” approach, in which “the information system
is taken unproblematically as a mirror of actions in the world, and often tacitly, as a complete
enough record of those actions” (p. 388).
She contrasts this with seeing the data as “a trace or record
of activities,” in which the information infrastructure “sits (often uneasily) somewhere between
research assistant to the investigator and found cultural artifact."
"Trace
ethnography is not “lurker ethnography” done by someone who never interviews or participates in
a community."
trace literacy --> get to know the community; know how to participate in it
thick description of different prototypical cases:
cf. \cite{GeiHal2017}
iterative mixed method
combination of:
* quantitative methods: mining big data sets/computational social science
"begin with one or
more large (but often thin) datasets generated by a software platform, which has recorded digital
traces that users leave in interacting on that platform. Such researchers then seek to mine as much
signal and significance from these found datasets as they can at scale in order to answer a research
question"
* more traditional social science/qualitative methods, e.g. interviews, observations, experiments
\end{comment}
\begin{comment}
cf. \cite{GeiHal2017}
iterative mixed method
combination of:
* quantitative methods: mining big data sets/computational social science
"begin with one or
more large (but often thin) datasets generated by a software platform, which has recorded digital
traces that users leave in interacting on that platform. Such researchers then seek to mine as much
signal and significance from these found datasets as they can at scale in order to answer a research
question"
* more traditional social science/qualitative methods, e.g. interviews, observations, experiments
\cite{Geiger2014}
"the idea that Wikipedia only takes place on wiki-
pedia.org – or even entirely on the Internet – is a huge misunderstanding (Konieczny, 2009;
Reagle, 2010). Wikipedia is not a virtual world, especially one located entirely on the wiki."
e.g. in order to get hold of abuse_filter_history I had to engage with
- wikipedia.org
- mediawiki.org
- irc channels
- phabricator
- gerrit
- toolserver/cloudservices
----
other spaces Wikipedia takes place
- mailinglists
- WomenEdit/offenes Editieren @Wikimedia
- Wikimania
- Wikimedia's office and daily work
\end{comment}
\section{Grounded Theory}
\section{Emergent Coding}
\label{sec:gt}
Grounded theory describes a myriad/... of frameworks/... for building a scientific theory \emph{grounded} in (mostly qualitative) data analysis.
In order to gain a detailed understanding of what edit filters are used for on English Wikipedia, in chapter~\ref{chap:overview-en-wiki} I applied emergent coding to all filters.
Different variations of the method are widely used by grounded theory scholars for making sense of (mainly qualitative) data.
Grounded theory describes a myriad/... of frameworks/... for building a scientific theory \emph{grounded} in (mostly qualitative) data analysis.
Here, I haven't developed a finished theory,
but instead just employed some methods used by grounded theory %TODO check whether it's written with caps
but instead just employed some methods used by grounded theory
scholars, above all their coding processes.
There are different branches of grounded theory that diverge, slightly or more distinctly, in their assumptions and proposed methods.
I followed the guidelines and .. of constructivist grounded theory described by Charmaz in~\cite{Charmaz2006}.
......@@ -279,7 +212,10 @@ what is in someone' s mind-particularly if he or she does not tell you."(p.68)
\end{comment}
%\section{Cooking Data With Care}
%or Critical data science? Or both?
\section{Open Science}
The whole work tries to adhere to the principles of open science. %TODO what are the principle of open science? refs are missing
All the computations I have done and other artefacts I have used or compiled have been openly accessible in the project's repository~\cite{github} since the very beginning.
Everyone interested can follow the process and/or use the data or scripts in order to verify my computations (syn) or run their own and thus continue this research along one of the directions suggested in section~\ref{sec:further-studies} or in a completely new one.
\section{Validation}
......@@ -21,6 +21,23 @@ Following pages were analysed in depth: \\
\url{https://en.wikipedia.org/wiki/Wikipedia_talk:Edit_filter/Archive_1} \\
<insert pages here>
\cite{Geiger2014}
"the idea that Wikipedia only takes place on wiki-
pedia.org – or even entirely on the Internet – is a huge misunderstanding (Konieczny, 2009;
Reagle, 2010). Wikipedia is not a virtual world, especially one located entirely on the wiki."
e.g. in order to get hold of abuse_filter_history I had to engage with
- wikipedia.org
- mediawiki.org
- irc channels
- phabricator
- gerrit
- toolserver/cloudservices
----
other spaces Wikipedia takes place
- mailinglists
- WomenEdit/offenes Editieren @Wikimedia
- Wikimania
- Wikimedia's office and daily work
%************************************************************************
\section{Definition}
......
......@@ -176,6 +176,8 @@ Fifth, no access to the details of hidden filters, so no insights into the areas
The following other pages looked interesting or related, but were left out, mainly for lack of time.
(Is there a better reasoning why I looked at the pages I looked at specifically, while left particularly these other pages for later?)
I really only used ``found data'' (compare section~\ref{sec:trace-ethnography}), although I also attempted to interpret the found data and link it; future studies can and should use the first insights of the current research as interview prompts.
%************************************************************************
\section{Directions for future studies}
......