Lyudmila Vaseva authored
Meeting 09.05.2019 - Presentation at the HCC Research Meeting
Feedback
- beware of wording: "vandalism" is quite a harsh term (see also the naming discussion around edit filters); try to avoid it, especially in contexts where it's not clear whether we are indeed dealing with vandalism (potentially harmful edits); maybe replace it with "quality assurance/control" wherever suitable
- of the 152 edit filter managers on EN Wikipedia:
  - how many are admins?
  - how many run their own bots?
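A minimal sketch of how the "how many are admins?" question could be approached via the MediaWiki API (`list=allusers`). Assumptions, not from the meeting: the edit filter managers group is called `abusefilter` and the admin group `sysop` on EN Wikipedia; the endpoint URL, the User-Agent string and the helper names are illustrative. Whether someone runs their own bots cannot be read off group membership (bots are separate accounts), so that question would need another source.

```python
import json
import urllib.parse
import urllib.request

# Assumption: EN Wikipedia API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_group_members(group, api_url=API_URL):
    """Yield user records (including their groups) for all members of `group`."""
    params = {
        "action": "query",
        "list": "allusers",
        "augroup": group,
        "auprop": "groups",
        "aulimit": "max",
        "format": "json",
    }
    while True:
        url = api_url + "?" + urllib.parse.urlencode(params)
        # Hypothetical User-Agent; replace with a descriptive one of your own.
        request = urllib.request.Request(
            url, headers={"User-Agent": "efm-count-sketch/0.1"}
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        yield from data["query"]["allusers"]
        # The API signals further result pages via a "continue" object.
        if "continue" not in data:
            break
        params.update(data["continue"])


def count_with_group(users, group):
    """Count the user records that list `group` among their groups."""
    return sum(1 for user in users if group in user.get("groups", []))


# Usage (needs network access, so it is left commented out):
# managers = list(fetch_group_members("abusefilter"))
# print(len(managers), count_with_group(managers, "sysop"))
```

The counting is kept separate from the fetching so that it can be checked on hand-made records without touching the live wiki.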
- if an editor is both an edit filter manager and a bot developer: in which cases would they decide to implement a bot and in which a filter?
- stick to research questions from Confluence, they are already carefully crafted and narrowed down as appropriate
- aggregation: clarify for myself what is the argument/the story/the big picture; make sure I still see the forest and not only a bunch of single trees
  - start with headers from the thesis's outline
- methodology: what are the sources of knowledge?
  - literature: what insights have we gained from it?
  - documentation (Wikipedia, MediaWiki pages): what have we learnt here?
  - data (filter stats, REGEX patterns): what do the filters actually do?
- people didn't like the "1st line of defense" wording (concerning bots)^^; as a matter of fact, it's also factually incorrect, since the actual "1st line of defense" are the edit filters
- how stable is the edit filter managers group? how often are new editors accepted? (who nominates them, and how? maybe not very many are accepted, but then again, if only 2 apply and both are granted the right, can you then claim it's exclusive?)
- Question: why are the filters still used, when there are all these fancy ML mechanisms?
  - hypothesis: because it's simple! REGEXes are generally easier to understand than ML algorithms; this lowers the participation barriers
- add to Long List of Interesting Questions: is there a qualitative difference between complaints about bots and complaints about filters?
- where is the thesis going?
  - should there be some recommended guidelines based on the insights?
  - or some design recommendations?
  - or maybe just a framework for future research: which questions have we just opened that we still can't answer and that future research should address?
- an idea for the presi/written text: begin and end every part (section/paragraph) with a question: what question do I want to answer here? what question is still open?
- make a comparison table for the mechanisms
                      | Filters                           | Bots                             | Semi-Automated tools        | ORES
                      |                                   |                                  |                             |
                      | - based on REGEXes                | - rule/ML based                  | - rule/ML based             | - ML framework
properties            | - part of the "software"/         | - "bespoke code": run on         | - heuristics obfuscated by  | - can be used by other tools
                      |   platform (MediaWiki ext)        |   user's infrastructure          |   the interface (but often  |
                      | - public filters are directly     | - no requirement for the code    |   configurable)             |
                      |   visible for everyone            |   to be made public              |                             |
                      |   interested                      | - you can relatively easily      |                             |
                      |                                   |   get all the filters; you       |                             |
                      |                                   |   easily get all the bots        |                             |
                      |                                   |                                  |                             |
----------------------|-----------------------------------------------------------------------------------------------------------------------------------
                      |                                   |                                  |                             |
                      | - edit filter managers            | - no special permissions/        | - rollback perm             | - mostly Scoring platform
                      |   group (EN Wiki)                 |   rights needed                  |                             |   team (?)
Who does this?        |   (abusefilter-modify perm)       | - a bot gets a bot flag          |                             |
                      |                                   |                                  |                             |
----------------------|-----------------------------------------------------------------------------------------------------------------------------------
                      |                                   |                                  |                             |
                      | - become an edit filter manager   | - get an approval to run the     | - learn the tool            | - understand ML
What are the hurdles  | - you have to only understand     |   bot from the BAG               | - install Windows^^         | - formal requirements to
to participate        |   REGEXes (relatively simple?     | - programming knowledge          |   (some don't support       |   develop ORES?
                      |   although relatively fast quite  | - understand APIs, ..            |   other OS)                 |
                      |   confusing)                      | - (but there is a lot to         | - get the rollback perm     |
                      |                                   |   understand with all the        |                             |
                      |                                   |   mechanisms)                    |                             |
                      |                                   |                                  |                             |
----------------------|-----------------------------------------------------------------------------------------------------------------------------------
                      |                                   |                                  |                             |
                      | - censorship infrastructure?      | - "botophobia"                   | - gamification              |
Concerns              | - powerful, can in theory block   |                                  |                             |
                      |   editors based on hidden         |                                  |                             |
                      |   filters                         |                                  |                             |
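To make the "based on REGEXes" property and the "REGEXes lower the participation barrier" hypothesis a bit more concrete, here is a minimal sketch of what a purely pattern-based check looks like. Assumptions: the rule (flagging a character repeated ten or more times in a row) is hypothetical and not taken from any real edit filter, and it is written with Python's `re` module for illustration only; actual edit filters are written in the AbuseFilter extension's own rule language.

```python
import re

# Hypothetical rule, not an actual edit filter:
# flag text in which a single character is repeated 10 or more times in a row.
REPEATED_CHARS = re.compile(r"(.)\1{9,}")


def matches_filter(added_text):
    """Return True if the added text triggers the (hypothetical) rule."""
    return REPEATED_CHARS.search(added_text) is not None
```

For example, `matches_filter("lol" + "o" * 12)` is `True`, while `matches_filter("Berlin is the capital of Germany.")` is `False`. The whole rule fits in one readable line, which is the kind of simplicity the hypothesis refers to; an ML-based mechanism has no comparable single line to inspect.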
Form Remarks presi
- make an outline/structure slide for people to be able to follow more easily
- insert page numbers!
- keep the literature/documentation part as short as possible; for example, for presenting the literature the diagram is sufficient
TODO
- present again, this time with a storyline