# Papers I still want to read
Check:
* Lovink, Tkacz - Wikipedia Reader
Mayo Fuster Morell - Wikimedia Foundation and Governance of Wikipedia's infrastructure
Shun-ling Chen - The Wikimedia Foundation and the Self-governing Wikipedia Community: A Dynamic Relationship Under Constant Negotiation
possibly for the conclusion
* Winner - Do artifacts have politics
* Gillespie - Politics of Platforms
* Harassment Survey Results Report
For algorithmic governance:
Musiani - Governance by algorithms
for fun
* Cathedral & Bazaar
* Lam et al - wp:clubhouse
* Litman - The exclusive right to read - copyright commentary
* Lovink, Tkacz - Wikipedia Reader
Liang - Brief History of the Internet 15th - 18th century
* Wu - When code isn't law
(Where Wizards Stay Up Late)
(Hands - Introduction politics, power and platformativity)
# Next steps
* make something out of this comment "Reset filter, it hit the 5% limit somehow. -- Shirik 14 May 2010" https://en.wikipedia.org/wiki/Special:AbuseFilter/history/79/diff/prev/5529 : what is the 5% limit? What does it mean to reset a filter?
* check via the exposed history endpoint which filters were introduced immediately (in the first couple of hours/days): https://en.wikipedia.org/wiki/Special:AbuseFilter/history?user=&filter=
* are there further examples of such collaborations? consider scripting something that parses the bot descriptions from https://en.wikipedia.org/wiki/Category:All_Wikipedia_bots and looks for "abuse" and "filter"
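A minimal sketch for the bot-parsing idea above, using only the standard MediaWiki API (`list=categorymembers`). The keyword check is split out as a pure function; a fuller version would additionally fetch each bot's user page wikitext and run the check over that rather than over titles alone.

```python
# Sketch: list pages in Category:All Wikipedia bots and flag descriptions
# that mention edit/abuse filters. Uses only the stdlib.
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"
KEYWORDS = ("abuse filter", "edit filter", "abusefilter")

def mentions_filter(text: str) -> bool:
    """True if a bot's description mentions edit/abuse filter keywords."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in lowered and KEYWORDS)

def bot_pages():
    """Yield titles of all pages in Category:All Wikipedia bots."""
    params = {
        "action": "query", "format": "json", "list": "categorymembers",
        "cmtitle": "Category:All Wikipedia bots", "cmlimit": "500",
    }
    while True:
        url = API + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        if "continue" not in data:  # no more result pages
            break
        params.update(data["continue"])  # standard API continuation

# Usage (network access required):
#   candidates = [t for t in bot_pages() if mentions_filter(t)]
```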
* consider adding permalinks with exact revision ID as sources!
* https://en.wikipedia.org/wiki/Category:Wikipedia_bot_operators
* an idea for the presi/written text: begin and end every part (section/paragraph) with a question: what question do I want to answer here? what question is still open?
* How many of the edit filter managers also run bots? How do they decide in which case to implement a bot and in which a filter?
* Why are there mechanisms triggered before an edit gets published (such as edit filters), and such triggered afterwards (such as bots)? Is there a qualitative difference?
* do bots also check entire article text and not only single edits? As a clever person with malicious intentions I could split my malicious stuff into several edits to make it more difficult to discover -- unclear. My feeling is that they are edit-based; confirmed by C.
* how stable is the edit filter managers group? how often are new editors accepted? (who/how nominates them? maybe there aren't very many accepted, but then again if only 2 apply and both are granted the right, can you then claim it's exclusive?) -- I think it's somewhat stable. In the last 3 months nothing has changed. For instance, mid-to-late 2017 there was somewhat high traffic of people requesting edit-filter-helper permissions, since that right was newly introduced around that time (before that you could only get the full edit filter manager right or nothing at all); it also seems that since then the practice has been established that people request edit filter helper first and only then are perhaps promoted to edit filter manager
* I want to help people to do their work better using a technical system (e.g. the edit filters). How can I do this?
* The edit filter system can be embedded in the vandalism prevention frame. Are there other contexts/frames for which it is relevant?
* Read these pages
https://lists.wikimedia.org/mailman/listinfo
https://en.wikipedia.org/wiki/Wikipedia:Edit_warring
https://en.wikipedia.org/wiki/Wikipedia:Blocking_policy#Evasion_of_blocks
https://en.wikipedia.org/wiki/Wikipedia:Blocking_IP_addresses
https://meta.wikimedia.org/wiki/Vandalbot
https://en.wikipedia.org/wiki/Wikipedia:Most_vandalized_pages
https://en.wikipedia.org/wiki/Wikipedia:The_motivation_of_a_vandal
https://en.wikipedia.org/wiki/Wikipedia:Flagged_revisions
https://en.wikipedia.org/wiki/User:Emijrp/Anti-vandalism_bot_census
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies/Study1
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies/Study2
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies/Obama_article_study
https://en.wikipedia.org/wiki/Wikipedia:TW
https://en.wikipedia.org/wiki/Wikipedia:Dispute_resolution
https://en.wikipedia.org/wiki/Wikipedia:Oversight
https://en.wikipedia.org/wiki/Wikipedia:Revision_deletion
https://en.wikipedia.org/wiki/Wikipedia:Linking_to_external_harassment
https://en.wikipedia.org/wiki/Wikipedia:Personal_security_practices
https://en.wikipedia.org/wiki/Wikipedia:On_privacy,_confidentiality_and_discretion
https://en.wikipedia.org/wiki/Wikipedia:How_to_not_get_outed_on_Wikipedia
https://en.wikipedia.org/wiki/Wikipedia:No_personal_attacks
https://en.wikipedia.org/wiki/Wikipedia:Newbies_aren%27t_always_clueless
https://en.wikipedia.org/wiki/Wikipedia:On_assuming_good_faith
* look at the AbuseFilter extension code: how is a filter trigger logged?
https://github.com/wikimedia/mediawiki-extensions-AbuseFilter/blob/master/includes/AbuseFilter.php
* understand how the stats are generated
* research filter development over time
* plot number of filters over time (maybe grouped by week instead of by year)
* get a feeling for the actions the filters have triggered over time
* ping aaron/amir for access to a backend db to look at filters; explanation how this is helping the community is important
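A sketch for the filters-over-time plot above, assuming filter creation timestamps have already been extracted (the exact source -- history endpoint, DB dump -- is still open). It buckets creation dates by ISO week and computes a running total, which can then be fed to any plotting library:

```python
# Sketch: cumulative number of filters per ISO week, from creation dates.
from collections import Counter
from datetime import date

def filters_per_week(creation_dates):
    """Return sorted ((iso_year, iso_week), cumulative_count) pairs."""
    buckets = Counter(d.isocalendar()[:2] for d in creation_dates)
    running, series = 0, []
    for week in sorted(buckets):
        running += buckets[week]
        series.append((week, running))
    return series

# Toy data (hypothetical creation dates, not real filter data):
dates = [date(2009, 3, 17), date(2009, 3, 19), date(2009, 4, 2)]
print(filters_per_week(dates))
# -> [((2009, 12), 2), ((2009, 14), 3)]
# For the actual plot, pass the weeks and counts to e.g. matplotlib's
# plt.step() to get a cumulative staircase over time.
```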
* questions from EN-state-of-the-art
"Non-admins in good standing who wish to review a proposed but hidden filter may message the mailing list for details."
// what is "good standing"?
// what are the arguments for hiding a filter? --> particularly obnoxious vandals could see how their edits are being filtered and circumvent the filters; (no written quote yet)
// are users still informed if their edit triggers a hidden filter?
Exemptions for "urgent situation" -- what/how are these defined?
Discussions may happen post factum, and a filter may be applied before having been thoroughly tested; in this case the corresponding editor is responsible for checking the logs regularly and making sure the filter acts as desired
"Because even the smallest mistake in editing a filter can disrupt the encyclopedia, only editors who have the required good judgment and technical proficiency are permitted to configure filters."
--> Who are these editors? Who decides they are qualified enough?
# Interesting pages
## Edit filters in different languages:
https://en.wikipedia.org/wiki/Wikipedia:Bots_are_annoying
https://de.wikipedia.org/wiki/Wikipedia:Bearbeitungsfilter
https://es.wikipedia.org/wiki/Wikipedia:Filtro_de_ediciones
https://ca.wikipedia.org/wiki/Viquip%C3%A8dia:Filtre_d%27edicions
https://ru.wikipedia.org/wiki/%D0%92%D0%B8%D0%BA%D0%B8%D0%BF%D0%B5%D0%B4%D0%B8%D1%8F:%D0%A4%D0%B8%D0%BB%D1%8C%D1%82%D1%80_%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%BA
(no bulgarian)
https://de.wikipedia.org/wiki/Spezial:Missbrauchsfilter
## Others
https://en.wikipedia.org/wiki/Wikipedia:Vandalism
https://en.wikipedia.org/wiki/Wikipedia:Vandalism_types
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Documentation
https://en.wikipedia.org/wiki/User:MusikBot/FilterMonitor/Recent_changes
https://en.wikipedia.org/w/index.php?title=Special:AbuseFilter&dir=prev
https://de.wikipedia.org/wiki/Hilfe:Bearbeitungsfilter
https://tools.wmflabs.org/ptwikis/Filters:dewiki
https://en.wikipedia.org/wiki/Wikipedia:Tags
https://en.wikipedia.org/wiki/Wikipedia:Usernames_for_administrator_attention (UAA)
https://en.wikipedia.org/wiki/Special:RecentChanges?hidebots=1&hidecategorization=1&hideWikibase=1&tagfilter=abusefilter-condition-limit&limit=50&days=7&urlversion=2
https://en.wikipedia.org/w/index.php?title=Wikipedia:Edit_filter&oldid=221158142 <--- 1st version ever of the Edit_filter page; created 23.06.2008
https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Abuse_filter&oldid=279022713#-modify_right_.28moved_from_AN.29
## In other languages
### Compare colors (actions) in the stats graphs!
https://tools.wmflabs.org/ptwikis/Filters:bgwiki <-- there are also global filters listed here! (what are those?)
https://tools.wmflabs.org/ptwikis/Filters:cawiki
https://tools.wmflabs.org/ptwikis/Filters:dewiki
https://tools.wmflabs.org/ptwikis/Filters:eswiki
## Software
## Questions
* not sure how relevant this is, but: how do the sighted revisions (gesichtete Versionen) on the German Wikipedia work?
# Questions to be asked to edit filter editors
* When did you join the edit filter group?
* How? (What was the process for joining)?
* Why?
* How active were you/what have you been doing since joining?
* What is an example of a typical case for which you will implement a filter?
* What is an example of a case where you'd rather not implement a filter but apply another process instead? Which process?
* Why does the edit filter mechanism exist? Aren't bots and ORES and semi-automatic tools such as Huggle or Twinkle enough for combating vandalism?
==========================================================
# Checked
## Edit filters in different languages:
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
## Others
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter_noticeboard <-- announce new filters and put them up for discussion before approving them "for coordination and discussion of edit filter use and management."
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Requested
https://de.wikipedia.org/wiki/Wikipedia:Bearbeitungsfilter/Antr%C3%A4ge
https://en.wikipedia.org/wiki/Wikipedia:Long-term_abuse
https://en.wikipedia.org/wiki/Special:AbuseLog (+DE/ES/CAT/BG)
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/False_positives
https://en.wikipedia.org/wiki/Special:AbuseFilter
https://en.wikipedia.org/wiki/Special:AbuseFilter/1
https://tools.wmflabs.org/ptwikis/Filters:enwiki:61
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2009-03-23/Abuse_Filter
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Instructions
https://en.wikipedia.org/wiki/Wikipedia:No_original_research
https://en.wikipedia.org/wiki/Wikipedia_talk:Edit_filter/Archive_1
https://en.wikipedia.org/wiki/Category:Wikipedia_edit_filter
## Vandalism
https://en.wikipedia.org/wiki/Wikipedia:Vandalism
https://en.wikipedia.org/wiki/Wikipedia:Vandalism_types
https://en.wikipedia.org/wiki/Wikipedia:Administrator_intervention_against_vandalism
https://en.wikipedia.org/wiki/Wikipedia:Disruptive_editing
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Vandalism_studies
https://en.wikipedia.org/wiki/Wikipedia:Requests_for_page_protection
https://en.wikipedia.org/wiki/Wikipedia:Offensive_material
https://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view
https://en.wikipedia.org/wiki/Wikipedia:Harassment
https://en.wikipedia.org/wiki/Wikipedia:Assume_good_faith
https://en.wikipedia.org/wiki/Wikipedia:STiki
https://en.wikipedia.org/wiki/Vandalism_on_Wikipedia
https://en.wikipedia.org/wiki/Category:Wikipedia_counter-vandalism_tools
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit
https://en.wikipedia.org/wiki/Wikipedia:Cleaning_up_vandalism
https://en.wikipedia.org/wiki/Wikipedia:Counter-Vandalism_Unit/Academy
https://en.wikipedia.org/wiki/Wikipedia:Long_term_abuse
https://en.wikipedia.org/wiki/Wikipedia:Sockpuppet_investigations
https://en.wikipedia.org/wiki/Wikipedia:Sock_puppetry
https://en.wikipedia.org/wiki/Wikipedia:Purpose
## Software
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Actions
--> interestingly, this exists in all the languages I'm interested in
https://www.mediawiki.org/wiki/Extension:AbuseFilter
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format
https://phabricator.wikimedia.org/project/view/217/ <-- project tickets for the AbuseFilter extension
# Done
* Set up the CSCW LaTeX template
* add "af_deleted" column to filter list
* Look at filters: what different types of filters are there? how do we classify them?
* add a special tag for filters targeting spam bots? (!!! important: do research on distinction/collaboration bots/filters)
* consider all types of vandalism (https://en.wikipedia.org/wiki/Wikipedia:Vandalism#Types_of_vandalism) when refining the self assigned tags
(Abuse of tags; Account creation, malicious; Avoidant vandalism; Blanking, illegitimate; Copyrighted material, repeated uploading of; Edit summary vandalism; Format vandalism; Gaming the system; Hidden vandalism; Hoaxing vandalism; Image vandalism; Link vandalism; Page creation, illegitimate; Page lengthening; Page-move vandalism; Silly vandalism; Sneaky vandalism; Spam external linking; Stockbroking vandalism; talk page vandalism; Template vandalism; User and user talk page vandalism; Vandalbots;)
* consider also other forms of (unintentionally) disruptive behaviour: boldly editing; copyright violation; disruptive editing or stubbornness --> edit warring; edit summary omission; editing tests by experimenting users; harassment or personal attacks; Incorrect wiki markup and style; lack of understanding of the purpose of Wikipedia; misinformation, accidental; NPOV contraventions (Neutral point of view); nonsense, accidental; Policy and guideline pages, good-faith changes to; Reversion or removal of unencyclopedic material, or of edits covered under the biographies of living persons policy; Deletion nominations;
-----
* classify in "vandalism"|"good_faith"|"biased_edits"|"misc" for now
* syntactic vs semantic vs ? (ALL CAPS is syntactic)
* are there ontologies?
* how is spam classified for example?
* add a README to github repo
// do the users notice the logging? or only "bigger" actions such as warnings/being blocked, etc.?
* look for db dumps
https://meta.wikimedia.org/wiki/Research:Quarry
https://meta.wikimedia.org/wiki/Toolserver
https://quarry.wmflabs.org/query/runs/all?from=7666&limit=50
https://upload.wikimedia.org/wikipedia/commons/9/94/MediaWiki_1.28.0_database_schema.svg
https://tools.wmflabs.org/
https://tools.wmflabs.org/admin/tools
https://www.mediawiki.org/wiki/API:Main_page
* create a developer account
Do smth with this info:
Claudia: * A focus on the good-faith policies/guidelines is a historical development. After the huge surge in edits Wikipedia experienced starting in 2005, the community needed a means to handle these (and the proportional amount of vandalism). They opted for automation. Automated systems branded a lot of good-faith edits as vandalism, which drove newcomers away. A policy focus on good faith is part of the intention to fix this.
* We need a description of the technical workings of the edit filter system!
* How can we improve it from a computer scientist's/engineer's perspective?
* What task do the edit filters try to solve? Why does this task exist?/Why is it important?
* Think about: what's the computer science take on the field? How can we design a "better"/more efficient/more user friendly system? A system that reflects particular values (vgl Code 2.0, Chapter 3, p.34)?
* go over notes in the filter classification and think about interesting controversies, things that attract the attention
* what are useful categories
* GT is good for tackling controversial questions: e.g. is a filter with a disallow action too severe an interference with the editing process, with too many negative consequences (e.g. driving away newcomers)?
* What can we study?
* Discussions on filter patterns? On filter repercussions?
* Do filters work as intended and help Wikipedia run more smoothly, or are they a lot of work to maintain with questionable usefulness?
* Question: Is it worth it to use a filter which has many side effects?
* What can we filter with a REGEX? And what not? Are regexes the suitable technology for the means the community is trying to achieve?
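To make the regex question above concrete, a toy sketch: a regular expression handles the syntactic case (ALL CAPS, as noted in the classification ideas) easily, but offers no handle on semantic vandalism. The pattern and length threshold are arbitrary illustrations, not taken from any real filter.

```python
import re

# Purely syntactic check: flag an added line that consists (almost)
# entirely of uppercase letters -- the "ALL CAPS is syntactic" example.
ALL_CAPS = re.compile(r"^[A-Z\s!?.,']{15,}$")

def looks_like_shouting(line: str) -> bool:
    return bool(ALL_CAPS.match(line.strip()))

print(looks_like_shouting("JOSH IS THE BEST EDITOR EVER!!!"))  # True
print(looks_like_shouting("Paris is the capital of France."))  # False
# No regex gives a handle on *semantic* vandalism: the sentence below is
# syntactically unremarkable but factually wrong, so it matches nothing.
print(looks_like_shouting("Paris is the capital of Italy."))   # False
```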
* What other data sources can I explore?
* Interview with filter managers? with admins? with new editors?
* check filter rules for edits in user/talks name spaces (may be indication of filtering harassment)
* There was the section of what are filters suitable for; should we check filters against this list?
* also add an "af_enabled" column to the filter list; it could be that a high hit count was caused by false positives, which may have led to the filter being disabled (TODO: that's a very interesting question actually: how do we know the high number of hits were actually legit problems the filter wanted to catch and not false positives?)
* https://ifex.org/international/2019/02/21/technology-block-internet/ <-- filters
* Geiger et al - Defense Mechanisms
* Halfaker et al - The rise and decline of an open collaboration system (evtl enough, don't have to read Suh at al in detail)
Urquhart - Bringing theory back to grounded theory
# Feedback T
* are there concerns comparable to the gamification concerns around semi-automated tools for other mechanisms?
* highlight the difference: bots/semi-automated tools are similar in the automatic detection of potential vandalism; different in that a person must click (in the tools)
* filters: BEFORE an edit is published; everything else: AFTER
* filters: REGEX!
* mention the most important findings several times: intro, conclusion, and so on; so that they don't get lost because I can no longer see the forest for the trees
* do bots also check entire article text and not only single edits? As a clever person with malicious intentions I could split my malicious stuff into several edits to make it more difficult to discover -- unclear. My feeling is that they are edit-based
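For illustration of the "filters: REGEX!" point, a hypothetical rule in the AbuseFilter rules format (see the Extension:AbuseFilter/Rules_format page listed above). The variable names follow that documentation; the rule itself and its pattern are made up, not taken from any real filter:

```
page_namespace == 0 &          /* article namespace only */
user_age < 86400 &             /* account younger than one day */
added_lines irlike "buy (cheap|discount) (viagra|pills)"
```

Note the BEFORE/AFTER distinction above: a rule like this is evaluated before the edit is saved, so a matching edit can be tagged, warned about, or disallowed outright, whereas bots and semi-automated tools only see the edit after publication.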