diff --git a/den-wald-vor-lauter-baeume b/den-wald-vor-lauter-baeume
index 26a02e219ff8ac467734ce44a94df234fc5d8461..400d5395b1a782213926151f5f97d6531e439eed 100644
--- a/den-wald-vor-lauter-baeume
+++ b/den-wald-vor-lauter-baeume
@@ -1,5 +1,6 @@
 # What is important??
+* filters allow for regex based handling of edits and other editors' actions
 * filters check every edit at its publication; they are triggered *before* an edit is even published; effect is immediate
 * bots and semi-automated tools review edits *after* their publication. it takes time (however short it might be) till the edit is examined
 -> Q: Why are there mechanisms triggered before an edit gets published (such as edit filters), and such triggered afterwards (such as bots)? Is there a qualitative difference?
diff --git a/long-list-of-interesting-questions b/long-list-of-interesting-questions
index 7bd9e236fcfe235644ddfa7c1cce9da6362e970f..f0e4dea5d3d629c29876ec96fc1c18e0b01383f5 100644
--- a/long-list-of-interesting-questions
+++ b/long-list-of-interesting-questions
@@ -1,7 +1,7 @@
 * How have edit filters's tasks evolved over time? (abuse_filter_history table)
 * What are the differences between how filters are governed on EN Wikipedia compared to other language versions?
-* Are there filters targetting harassment?
+* Are there filters targetting harassment?: look into https://en.wikipedia.org/wiki/Wikipedia:Edit_filter_noticeboard/Archive_2#Exploring_how_the_Edit_filter_can_be_used_to_combat_harassment
 * Ethnographic analysis (e.g. IVs with edit filter managers/admins/users whose edits have been disallowed would be really interesting)
 * What is to be learned from studying the regex patterns in more detail?
 * what's filters' genesis story? why were they implemented? (compare with Rambot story) : try to reconstruct by examining traces and old page versions
@@ -15,3 +15,4 @@
 * What can we filter with a REGEX? And what not?
Are regexes the suitable technology for the means the community is trying to achieve?
 * GT is good for tackling controversial questions: e.g. are filters with disallow action a too severe interference with the editing process that has way too much negative consequences? (e.g. driving away new comers?)
 * What are the urgent situations in which edit filter managers are given the freedom to act as they see fit and ignore best practices of filter adoption? Who determines they are urgent?
+* is there a qualitative difference between complaints of bots and complaints of filters?
diff --git a/meeting-notes/20190509.md b/meeting-notes/20190509.md
index c84a64584731e8ad83ebd9343120c069f8fb81b4..c9b5211b8ea2cce53077752387fda92cfc1af284 100644
--- a/meeting-notes/20190509.md
+++ b/meeting-notes/20190509.md
@@ -4,8 +4,8 @@
 * beware of wording: "vandalism" is quite a harsh term (see also naming discussion edit filters), try to avoid it especially in contexts where it's not clear whether we are indeed dealing with vandalism (potential harmful edits); maybe replace with "quality assurance/control" wherever suitable
 * of the 152 edit filter managers on EN wikipedia:
- * how many are admins?
- * how many run their own bots?
+ * how many are admins? -- only ~11 are not admins
+ * how many run their own bots? -- no straight-forward way to find out
 * if an editor is both an edit filter manager and a bot developer: in which cases would they decide to implement a bot and in which a filter?
 * stick to research questions from Confluence, they are already carefully crafted and narrowed down as appropriate
 Q1 We wanted to improve our understanding of the role of filters in existing algorithmic quality-control mechanisms (bots, ORES, humans).
diff --git a/thesis/2-Background.tex b/thesis/2-Background.tex
index 2d6756199da9d9d0de9d5854714f5899979ad182..a60285b34f3346d3b303514e840f08168570e03c 100644
--- a/thesis/2-Background.tex
+++ b/thesis/2-Background.tex
@@ -94,7 +94,7 @@ is_bot edits Percentage of all edits
 \section{Bots}
 %todo also mention bot papers that discuss more general aspects of bots?
-According to literature, bots constitute the first line of defence against malicious edits~\cite{GeiHal2013}.
+According to literature, bots constitute the first line of defence against malicious edits~\cite{GeiHal2013}. %TODO but that's actually not true! edit filters are triggered first. Comment on this!
 They are also undoubtedly the vandal fighting mechanism studied most in depth by the scientific community.
 %TODO replace "vandal fighting" with "quality control"?
 Geiger and Ribes~\cite{GeiRib2010} define bots as
@@ -187,6 +187,7 @@ removed the third vandal fighter's now-obsolete report."
 \end{comment}
+%TODO: are there concerns about other mechanisms comparable to the gamification concerns about semi-automated tools?
 \section{Semi-automated tools}
diff --git a/thesis/6-Discussion.tex b/thesis/6-Discussion.tex
index 6a29c58e12d4c215dafaa1e0a27fbc75629fcae6..4dc6afc8fde705143d739fc4eb511ec04d446156 100644
--- a/thesis/6-Discussion.tex
+++ b/thesis/6-Discussion.tex
@@ -68,6 +68,10 @@ possible to act otherwise, but such deviations demand inventiveness and time."
 %\subsection{Harassment and bullying}
+* where is the thesis going?
+ * should there be some recommended guidelines based on the insights?
+ * or some design recommendations?
+ * or maybe just a framework for future research: what are questions we just opened?; we still don't know the answer to and should be addressed by future research?
 \section{Limitations}
diff --git a/thesis/introduction.tex b/thesis/introduction.tex
index b3baa97051da75579d43cd75e96fe3abc7635c42..f94b63a1d1f88d1359a8f149040e51fdb6674f6f 100644
--- a/thesis/introduction.tex
+++ b/thesis/introduction.tex
@@ -74,6 +74,13 @@ Questions from Confluence
 * GT is good for tackling controversial questions: e.g. are filters with disallow action a too severe interference with the editing process that has way too much negative consequences? (e.g. driving away new comers?)
 \end{comment}
+%TODO mention the most important insights several times: intro, conclusion, etc.; so that they don't get lost because I can't see the forest for the trees
+
+* where is the thesis going?
+ * should there be some recommended guidelines based on the insights?
+ * or some design recommendations?
+ * or maybe just a framework for future research: what are questions we just opened?; we still don't know the answer to and should be addressed by future research?
+
 %************************************************************
 \section{Methods}
@@ -83,6 +90,10 @@ Questions from Confluence
 \item Especially for a Master's thesis: how can goal attainment be ``measured''?
 \end{itemize}
+* methodology: what are the sources of knowledge
+ * literature: what insights have we won from it?
+ * documentation (Wikipedia, MediaWiki pages): what have we learnt here
+ * data (filters stats, REGEX patterns): what do the filters actually do?
 \section{Structure}
diff --git a/todo b/todo
index 52513cccd6c0d657779122416cc9c9e4f6c080aa..4a94da35577644c0d6806165f7725fc76c92e573 100644
--- a/todo
+++ b/todo
@@ -1,13 +1,3 @@
-# Feedback T
-
-* are there concerns about other mechanisms comparable to the gamification concerns about semi-automated tools?
-* highlight the difference: bots/semi-aut. tool: similar: automatic detection of potential vandalism; different: a person must click (in the tools)
-* filters: BEFORE an edit is published; everything else: AFTER
-* filters: REGEX!
-* mention the most important insights several times: intro, conclusion, etc.; so that they don't get lost because I can't see the forest for the trees
-* do bots check also entire article text and not only single edits? as a clever person with malicious intentions I could split my malicious stuff into several edits to make it more difficult to discover -- unclear. I have the feeling they are indeed edit-based
-
-
 # Papers I still want to read
 Check:
@@ -36,7 +26,13 @@ for fun
 # Next steps
+* an idea for the presi/written text: begin and end every part (section/paragraph) with a question: what question do I want to answer here? what question is still open?
+* How many of the edit filter managers also run bots. How do they decide in which case to implement a bot and in which a filter?
 * Why are there mechanisms triggered before an edit gets published (such as edit filters), and such triggered afterwards (such as bots)? Is there a qualitative difference?
+* do bots check also entire article text and not only single edits? as a clever person with malicious intentions I could split my malicious stuff into several edits to make it more difficult to discover -- unclear. I have the feeling they are indeed edit-based
+* how stable is the edit filter managers group? how often are new editors accepted? (who/how nominates them? maybe there aren't very many accepted, but then again if only 2 apply and both are granted the right, can you then claim it's exclusive?) -- I think it's somewhat stable.
In the last 3 months nothing has changed; for instance, mid-end 2017 there was somewhat high traffic of people requesting edit-filter-helper permissions, since it was newly implemented around that time (before that you could only get the full edit filter manager or nothing at all); it also seems that since then the practice was established that people would request edit filter helper first and only then be perhaps promoted to edit filter manager
+
+
 * I want to help people to do their work better using a technical system (e.g. the edit filters). How can I do this?
 * The edit filter system can be embedded in the vandalism prevention frame. Are there other contexts/frames for which it is relevant?
@@ -264,3 +260,13 @@ Claudia: * A focus on the Good faith policies/guidelines is a historical develop
 * Geiger et al - Defense Mechanisms
 * Halfaker et al - The rise and decline of an open collaboration system (possibly enough, don't have to read Suh at al in detail)
 Urquhardt - Bringing theory back to grounded theory
+
+# Feedback T
+
+* are there concerns about other mechanisms comparable to the gamification concerns about semi-automated tools?
+* highlight the difference: bots/semi-aut. tool: similar: automatic detection of potential vandalism; different: a person must click (in the tools)
+* filters: BEFORE an edit is published; everything else: AFTER
+* filters: REGEX!
+* mention the most important insights several times: intro, conclusion, etc.; so that they don't get lost because I can't see the forest for the trees
+* do bots check also entire article text and not only single edits? as a clever person with malicious intentions I could split my malicious stuff into several edits to make it more difficult to discover -- unclear. I have the feeling they are indeed edit-based
+