Lyudmila Vaseva authored
Meeting 09.05.2019 - Presentation at the HCC Research Meeting
Feedback
- beware of wording: "vandalism" is quite a harsh term (see also the naming discussion for edit filters); try to avoid it, especially in contexts where it's not clear whether we are indeed dealing with vandalism (potentially harmful edits); maybe replace with "quality assurance/control" wherever suitable
- of the 152 edit filter managers on EN Wikipedia:
  - how many are admins?
  - how many run their own bots?
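One way to answer the two questions above, once the member lists of the groups are available, is a simple set overlap. This is only a minimal sketch: the function name `group_overlap` and all user names are hypothetical, and how the lists are obtained is left open.

```python
def group_overlap(filter_managers, admins, bot_operators):
    """Count how many edit filter managers also belong to the other groups."""
    managers = set(filter_managers)
    return {
        "total": len(managers),
        "also_admin": len(managers & set(admins)),
        "also_bot_operator": len(managers & set(bot_operators)),
    }


# Hypothetical user names, for illustration only (not real data).
example = group_overlap(
    filter_managers=["Alice", "Bob", "Carol"],
    admins=["Alice", "Carol", "Dave"],
    bot_operators=["Bob"],
)
print(example)
```

With the real lists, the same counts could be reported relative to the 152 edit filter managers.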
- if an editor is both an edit filter manager and a bot developer: in which cases would they decide to implement a bot and in which a filter?
- stick to the research questions from Confluence; they are already carefully crafted and narrowed down as appropriate:
  - Q1: We wanted to improve our understanding of the role of filters in existing algorithmic quality-control mechanisms (bots, ORES, humans).
  - Q2: Which types of tasks do these filters take over in comparison to the other mechanisms? How do these tasks evolve over time (are there changes in the type, number, etc.)?
  - Q3: Since filters are classical rule-based systems, what are suitable areas of application for such rule-based systems in contrast to the other ML-based approaches?
- aggregation: clarify for myself what the argument/the story/the big picture is; make sure I still see the forest and not only a bunch of single trees
  - start with headers from the thesis's outline
- methodology: what are the sources of knowledge?
  - literature: what insights have we gained from it?
  - documentation (Wikipedia, MediaWiki pages): what have we learnt here?
  - data (filter stats, REGEX patterns): what do the filters actually do?
- people didn't like the "1st line of defense" wording (concerning bots)^^; as a matter of fact, if we stick with it, it's factually incorrect, since the "1st line of defense" is actually the edit filters
- how stable is the edit filter managers group? how often are new editors accepted? (who nominates them, and how? maybe not very many are accepted, but then again, if only 2 apply and both are granted the right, can you then claim it's exclusive?)
- Question: why are the filters still used, when there are all these fancy ML mechanisms?
  - hypothesis: because it's simple! REGEXes are generally easier to understand than ML algorithms, which lowers the participation barriers
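To illustrate the hypothesis, a single regular expression can be read and checked by hand. The following is a minimal sketch in plain Python, not actual edit filter syntax; the rule itself (flagging a run of ten or more identical characters) and the names `REPEATED_CHARS` and `matches_rule` are made up for illustration.

```python
import re

# Hypothetical rule: flag text containing the same character
# repeated at least ten times in a row.
REPEATED_CHARS = re.compile(r"(.)\1{9,}")


def matches_rule(added_text: str) -> bool:
    """Return True if the added text triggers the hypothetical rule."""
    return REPEATED_CHARS.search(added_text) is not None
```

The whole rule fits on one line and its behaviour can be verified by reading the pattern, which is the kind of transparency the hypothesis refers to.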
- add to Long List of Interesting Questions: is there a qualitative difference between complaints about bots and complaints about filters?
- where is the thesis going?
  - should there be some recommended guidelines based on the insights?
  - or some design recommendations?
  - or maybe just a framework for future research: which questions have we just opened? which questions do we still not know the answer to and should be addressed by future research?
- an idea for the presi/written text: begin and end every part (section/paragraph) with a question: what question do I want to answer here? what question is still open?
- make a comparison table for the mechanisms

                     | Filters                          | Bots                           | Semi-Automated tools     | ORES
---------------------|----------------------------------|--------------------------------|--------------------------|------------------------------
Properties           | - based on REGEXes               | - rule/ML based                | - rule/ML based          | - ML framework
                     | - part of the "software"/        | - "bespoke code": run on       | - heuristics obfuscated  | - can be used by other tools
                     |   platform (MediaWiki ext)       |   user's infrastructure        |   by the interface (but  |
                     | - public filters are directly    | - no requirement for the code  |   often configurable)    |
                     |   visible for everyone           |   to be made public            |                          |
                     |   interested                     | - you can relatively easily    |                          |
                     |                                  |   get all the filters; you     |                          |
                     |                                  |   easily get all the bots      |                          |
---------------------|----------------------------------|--------------------------------|--------------------------|------------------------------
Who does this?       | - edit filter managers           | - no special permissions/      | - rollback perm          | - mostly Scoring platform
                     |   group (EN Wiki)                |   rights needed                |                          |   team (?)
                     |   (abusefilter-modify perm)      | - a bot gets a bot flag        |                          |
---------------------|----------------------------------|--------------------------------|--------------------------|------------------------------
What are the hurdles | - become an edit filter manager  | - get an approval to run the   | - learn the tool         | - understand ML
to participate       | - you only have to understand    |   bot from the BAG             | - install Windows^^      | - formal requirements to
                     |   REGEXes (relatively simple?    | - programming knowledge        |   (some don't support    |   develop ORES?
                     |   though they quickly become     | - understand APIs, ..          |   other OS)              |
                     |   quite confusing)               | - (but there is a lot to       | - get the rollback perm  |
                     |                                  |   understand with all the      |                          |
                     |                                  |   mechanisms)                  |                          |
---------------------|----------------------------------|--------------------------------|--------------------------|------------------------------
Concerns             | - censorship infrastructure?     | - "botophobia"                 | - gamification           |
                     | - powerful, can in theory block  |                                |                          |
                     |   editors based on hidden        |                                |                          |
                     |   filters                        |                                |                          |
Form Remarks presi
- make an outline/structure slide for people to be able to follow more easily
- insert page numbers!
- keep the literature/documentation part as short as possible; for example, for presenting the literature the diagram is sufficient
TODO
- present again, this time with a storyline