-
Lyudmila Vaseva authoredLyudmila Vaseva authored
Code owners
Assign users and groups as approvers for specific file changes. Learn more.
EN-state-of-the-art 29.35 KiB
# Edit Filters: State of the Art on the EN Wikipedia
## When/Why were edit filters introduced?
"The AbuseFilter extension was enabled on the English Wikipedia in 2009."
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
https://en.wikipedia.org/w/index.php?title=Wikipedia:Edit_filter was created on 23.06.2008
* Why?
"The AbuseFilter extension, developed by User:Werdna, is now enabled on English Wikipedia. The extension allows all edits to be checked against automatic filters and heuristics, which can be set up to look for patterns of vandalism including page move vandalism and juvenile-type vandalism, as well as common newbie mistakes. When a match is found, the extension can take specified actions, ranging from logging the edit to be checked, giving a warning (e.g. "did you really intend to blank a page?"), to more serious actions such as blocking users."
from https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2009-03-23/Abuse_Filter
("The Signpost is a monthly community-written and -edited online newspaper covering the English Wikipedia, its sister projects, the Wikimedia Foundation, and the Wikimedia movement at large." https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/About)
--> so, purpose (at least at the beginning): fight vandalism; find/.. common newbie mistakes
Note: User:Werdna
https://en.wikipedia.org/wiki/User:Werdna
"I'm Andrew Garrett. I started volunteering in 2005, and I worked at the Wikimedia Foundation from 2009 until I left in 2015 because 7 years is a long time. This is my personal account, so while I'll make mistakes, I intend for actions taken with this account to be in my personal capacity only. My work account was Andrew Garrett."
http://www.andrewjgarrett.com/
## What is an edit filter?
* Definition
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
"The edit filter is a tool that allows editors in the edit filter manager group to set controls mainly[1] to address common patterns of harmful editing.
[1] Edit filters can and have been used to track or tag certain non-harmful edits, for example addition of WikiLove."
DEF:
"A filter automatically compares every edit made to Wikipedia against a defined set of conditions. If an edit matches the conditions of a filter, that filter will respond by logging the edit. It may also tag the edit summary, warn the editor, revoke his/her autoconfirmed status, and/or disallow the edit entirely.[2]"
Footnote 2: "The extension also allows for temporary blocking, but these features are disabled on the English Wikipedia." <-- TODO: Is there wikipedia on which it isn't disallowed?
// do the users notice the logging? or only "bigger" actions such as warnings/being blocked, etc.?
* Has the definition changed over time (Abuse filters --> edit filters)
"The term "edit filter" rather than "abuse filter" is currently used for user-facing elements of the filter as some of the edits it flags are not harmful;[1] the terms are otherwise synonymous."
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
* Typology: is there one already? Otherwise we can label them per hand (both of us do this separatly)
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
Hidden filters!
"Non-admins in good standing who wish to review a proposed but hidden filter may message the mailing list for details."
// what is "good standing"?
// what are the arguments for hiding a filter? --> particularly obnoctious vandals can see how their edits are being filtered and circumvent them; security through obscurity
"Filters should only be hidden where necessary, such as in long-term abuse cases where the targeted user(s) could review a public filter and use that knowledge to circumvent it. Filters should not generally be named after abusive editors, but rather with a simple description of the type of abuse, provided not too much information is given away."
// are users still informed if their edit triggers a hidden filter?
"For all filters, including those hidden from public view, a brief description of what the rule targets is displayed in the log, the list of active filters, and in any error messages generated by the filter. "
"Be careful not to test sensitive parts of private filters in a public test filter (such as Filter 1): use a private test filter (for example Filter 2) if testing is required."
harassment! mailinglist
"If it would not be desirable to discuss the need for a given edit filter on-wiki, such as where the purpose of the filter is to combat harassment by an abusive banned user who is likely to come across the details of the request, edit filter managers can be emailed directly or on the wikipedia-en-editfilters mailing list at wikipedia-en-editfilters@lists.wikimedia.org."
https://lists.wikimedia.org/mailman/listinfo/wikipedia-en-editfilters
"private mailing list used by English Wikipedia edit filter managers, "
"primarily for discussing hidden filters."
"The mailing list should not be used as a venue for discussions that could reasonably be held on-wiki."
There's also separate documentation of long term abuse (see notes)
## How do edit filters work?
### Governance/Community Layer
* How is a new filter introduced?
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
"Except in urgent situations, new edit filters should generally be tested without any actions specified (simply enabled) until a good number of edits have been logged and checked before being implemented in "warn" or "disallow" modes. If the filter is receiving more than a very small percentage of false positives it should usually not be placed in 'disallow' mode."
Alternatives:
"Edit filter managers should be familiar with alternatives that might be more appropriate in a given situation. For example, problems on a single page might be better served with page protection, and problems with page titles or link spam may find the title blacklist and spam blacklist more effective respectively. Because edit filters check every edit in some way, filters that are tripped only rarely are discouraged. "
Exemptions for "urgent situation" -- what/how are these defined?
Discussions may happen postfactum here and filter may be applied before having been thoroughly tested; in this case the corresponding editor is responsible for checking the logs regularly and making sure the filter acts as desired
----
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Requested
"This page is for people without the abusefilter-modify permission or people without sufficient knowledge about the coding involved to make requests to enact edit filters."
There's a "Bear the following in mind:" checklist (see also Alternatives above)
- "Filters are applied to all edits. Therefore, problematic changes that apply to a single page are likely not suitable for an edit filter."
- filters, after adding up, make editing slower
- in depth checks should be done by a separate software that users run on their own machines
- no trivial errors should be catched by filters (ala style guidelines)
- there are Titles Blacklist and Link/Spam Blacklist
* Who can edit filters?
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
"Because even the smallest mistake in editing a filter can disrupt the encyclopedia, only editors who have the required good judgment and technical proficiency are permitted to configure filters."
--> Who are these editors? Who decides they are qualified enough?
"Filters are created and configured by edit filter managers, but they can be requested by any editor."
"all administrators can view private filters"
"This group is assignable by administrators, who may also assign the right to themselves"
"The assignment of the edit filter manager user right to non-admins is highly restricted. It should only be requested by and given to highly trusted users, when there is a clear and demonstrated need for it."
"demonstrated knowledge of the extension's syntax and in understanding and crafting regular expressions is absolutely essential"
"Editors who are not edit filter managers should consider helping out at requested edit filters and troubleshooting at false positives to help gain experience and demonstrate these skills"
"Requests for assignment of the group to non-admins can be made at the edit filter noticeboard, where a discussion will be held before a decision is made;discussions are normally held open for 7 days."
"If an edit filter manager is misusing the user right, the concern should first be raised with them directly. If discussion does not resolve the issue, a request for discussion or removal of the user right may be made at the edit filter noticeboard. "
"If you have the edit filter manager user right, please ensure you follow the Password strength requirements and appropriate personal security practices. Two-factor authentication enrollment is available for edit filter managers. Because edit filters affect every edit made, a compromised account will be blocked and its privileges removed on grounds of site security. In the unlikely event that your account is compromised, notify an administrator or bureaucrat (for administrators) immediately so they can block your account and remove any sensitive privileges to prevent damage. "
//interessanterweise is 2factor-auth auch nur für diese speziellen Benutzer*innen erlaubt; sonst kann man die Seite nicht ansehen
List of current edit filter managers
EN: https://en.wikipedia.org/wiki/Special:ListUsers/abusefilter (currently: 155)
CAT: https://ca.wikipedia.org/wiki/Especial:Usuaris/abusefilter (currently: 4 users)
-- auf Spanisch/Deutsch/Russisch existiert die Rolle nicht; interessant zu wissen, ob sie iwo subsumiert wurde
-- auf Bulgarisch übrigens auch nicht, aber da existiert auch die gesamte EditFilter seite nicht
* How are problems with existing filters handled?
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter_noticeboard
- discuss current filter behaviour
- suggest filter for deletion, since it's not particularly helpful: " unnecessary, is preventing good edits, or is otherwise problematic,"
(you can also raise the issue directly with the filter manager who created or enabled the filter)
apart from that: current ongoing discussions on single filters/problems that may require a filter
### Technical Layer
Software: https://www.mediawiki.org/wiki/Extension:AbuseFilter
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format
TODO: Flowchart of the filtering process!
* How is a new filter introduced?
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/Instructions
"This section explains how to create a filter with some preliminary testing, so you don't flood the history page."
- read the docs https://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format
- test with debugging tools https://en.wikipedia.org/wiki/Special:AbuseFilter/tools (visible only for users who are already in the edit filter managers user group)
- test with batch testing interface (dito)
- create logging only filter: https://en.wikipedia.org/wiki/Special:AbuseFilter/new (needs permissions)
- Post a message at WP:EFN (edit filter notice board), so other edit filter managers have a chance to improve it
- Finally, fully enable your filter, e.g. add warning, prevention, tagging, etc.
tips on controlling efficiency/order of operations
lazy evaluation: when 1st negative condition is met, filter terminates execution
"You should always order your filters so that the condition that will knock out the largest number of edits is first. Usually this is a user groups or a user editcount check; in general, the last condition should be the regex that is actually looking for the sort of vandalism you're targeting. "
https://en.wikipedia.org/wiki/Special:AbuseFilter
"PLEASE be careful. This is potent stuff. Unless it's urgent, always test your filters with no actions enabled first."
there seems to be a batch testing interface: https://en.wikipedia.org/wiki/Special:AbuseFilter/test
however, it says: "For security reasons, only users with the right to view private abuse filters or modify filters may use this interface."
so, you should already be aproved as a filter manager in order to test?
shouldn't all filter editors be able to test??
same goes for the debugging tools: https://en.wikipedia.org/wiki/Special:AbuseFilter/tools
* What happens when a filter gets triggered?
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter
What do filters do?/What actions they trigger (vgl DEF) in order of graveness:
- disallow -- editor is informed, if their edit is being disallowed and offered the option to report a false positive;
"It is also possible to have a user's autoconfirmed status revoked if a user trips the filter."
caution to use it seldomly and after a thorough discussion on what is a undesirable edit
- warn -- editor is informed that their edit may be problematic and given the option to save or abort the edit (and in report the false positive trigerred by the filter)
- add a tag - "edit is tagged for review by patrollers." -- TODO who are patrollers? are there some in lang versions other than EN?
"Patrols are a specialized type of WikiProject used in the English Wikipedia to watch over a class of pages and take any appropriate actions. Most patrol actions are performed by individual Wikipedians, but some are performed by bots—computer programs or preprogrammed scripts that make automated edits without a need for real time human decision-making. " https://en.wikipedia.org/wiki/Wikipedia:Patrols
- log the edit - "In this case, the edit is merely added to the AbuseLog. When testing new filters, this is the suggested setting to use."
- "throttle"? (mentioned somewhere else)
- https://tools.wmflabs.org/ptwikis/Filters:enwiki::102&11:102&11 mentions "block" as a possible action in the legend
9 different actions possible according to the extention docu (are users whose edits tripped the filters notified for all of them?)
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Actions
2.1 Logging: All filter matches are logged in the abuse log. This cannot be turned off. (so, every filter trigger is always being logged?)
2.2 Warning: The user is warned that their edit may not be appreciated, and is given the opportunity to submit it again. You may specify a specific system message containing the warning to display.
2.3 Throttling: The filter will only match if a rate limit is tripped. You can specify the number of actions to allow, the period of time in which these actions must occur, and how those actions are grouped.
The groupings are which sets of people should have aggregate (shared) throttles. That is, if you type "user", then the same user must match the filter a certain number of times in a certain period of time. You may also combine groups with commas to specify that throttle matches sharing all criteria will be aggregated. For example, using "ip,page", X filter matches in Y seconds from the same IP address to the same page will be required to trip the remainder of the actions.
(So this is something like, do this and that if the user has always received X warnings?)
2.4 Disallowing: Actions matching the filter will be prevented, and a descriptive error message will be shown.
2.5 Revoking auto-promoted groups: Actions matching the filter will cause the user in question to be barred from receiving any extra groups from $wgAutopromote for a period ranging from 3 to 7 days (random). This can be restored at the debug tools page.
2.6 Blocking: Users matching the filter will be blocked indefinitely, with a descriptive block summary indicating the rule that was triggered.
2.7 Removing from privileged groups: Users matching the filter will be removed from all privileged groups (sysop, bureaucrat, etc). A descriptive summary will be used, detailing the rule that was triggered.
2.8 Range-blocking:Somewhat of a "nuclear option", the entire /16 range from which the rule was triggered will be blocked for 1 week.
2.9 Tagging: The edit or change can be 'tagged' with a particular tag, which will be shown on Recent Changes, contributions, logs, new pages, history, and everywhere else. These tags are styleable, so you can have items with a certain tag appear in a different colour or similar.
Collaboration with bots:
"There is a bot reporting users tripping certain filters at WP:AIV and WP:UAA; you can specify the filters here."
https://en.wikipedia.org/wiki/User:DatBot/filters
* what happens afterwards?
If a user disagrees with the filter decision, they have the posibility of reporting a false positive
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter/False_positives
* modifying a filter
each filter has a designated page: e.g. https://en.wikipedia.org/wiki/Special:AbuseFilter/61
where following information can be viewed:
Filter id; public description; filter hits; statistics; code (conditions); notes (left by filter editors, generally to log changes); flags ("Hide details of this filter from public view", "enable this filter", "mark as deleted");
links to: last modified (with diff and user who modified it), edit filter's history; "export this filter to another wiki" tool;
Actions to take when matched:
Trigger actions only if the user trips a rate limit
Trigger these actions after giving the user a warning
Prevent the user from performing the action in question
Revoke the user's autoconfirmed status
Tag the edit in contributions lists and page histories
and the filter can be modified if the viewing editor has the right permissions
statistics are info such as "Of the last 1,728 actions, this filter has matched 10 (0.58%). On average, its run time is 0.34 ms, and it consumes 3 conditions of the condition limit." // not sure what the condition limit is
## Taking stock (a descriptive overview/stats of the current situation)
* how many filters are there (were there over the years)
EN: "There are currently 203 enabled filters, and 11 stale filters with no hits in the past 30 days (Purge). " (06.12.2018)
* and of the development over time
https://en.wikipedia.org/wiki/Wikipedia:Edit_filter_noticeboard
According to the Edit filter Notice board:
"There are currently 196 enabled filters and 13 stale filters with no hits in the past 30 days (Purge). See also the edit filter graphs." (Stand: 24.11.2018)
"There are currently 198 enabled filters and 11 stale filters with no hits in the past 30 days (Purge). See also the edit filter graphs." (Stand: 25.11.2018, seems to change frequently!)
EN: There are currently 201 enabled filters, and 12 stale filters with no hits in the past 30 days (Purge). from https://en.wikipedia.org/wiki/Special:AbuseFilter (29.11.2018)
--> upward tendency^^; not particularly significant
owing to quarries we have all the filters that were triggered from the filter log per year, from 2009 (when filters were first introduced/the MediaWiki extension was enabled) till end of 2018 with their corresponding number of times being triggered:
$ wc quarry-32489-en-wp-all-log-entries-before-20100101-run318768.csv
220 220 1879 quarry-32489-en-wp-all-log-entries-before-20100101-run318768.csv
$ wc quarry-32492-en-wp_-all-abuse-filter-log-entries-in-2010-run318774.csv
163 163 1476 quarry-32492-en-wp_-all-abuse-filter-log-entries-in-2010-run318774.csv
$ wc quarry-32493-en-wp_-all-abuse-filter-log-entries-in-2011-run318775.csv
161 161 1480 quarry-32493-en-wp_-all-abuse-filter-log-entries-in-2011-run318775.csv
$ wc quarry-32493-en-wp_-all-abuse-filter-log-entries-in-2012-run318778.csv
170 170 1576 quarry-32493-en-wp_-all-abuse-filter-log-entries-in-2012-run318778.csv
$ wc quarry-32495-en-wp_-all-abuse-filter-log-entries-in-2013-run318779.csv
178 178 1632 quarry-32495-en-wp_-all-abuse-filter-log-entries-in-2013-run318779.csv
$ wc quarry-32496-en-wp_-all-abuse-filter-log-entries-in-2014-run318780.csv
154 154 1434 quarry-32496-en-wp_-all-abuse-filter-log-entries-in-2014-run318780.csv
$ wc quarry-32497-en-wp_-all-abuse-filter-log-entries-in-2015-run318782.csv
200 200 1845 quarry-32497-en-wp_-all-abuse-filter-log-entries-in-2015-run318782.csv
$ wc quarry-32499-en-wp_-all-abuse-filter-log-entries-in-2016-run318789.csv
204 204 1902 quarry-32499-en-wp_-all-abuse-filter-log-entries-in-2016-run318789.csv
$ wc quarry-32500-en-wp_-all-abuse-filter-log-entries-in-2017-run318797.csv
231 231 2135 quarry-32500-en-wp_-all-abuse-filter-log-entries-in-2017-run318797.csv
$ wc quarry-32503-en-wp_-all-abuse-filter-log-entries-in-2018-run318831.csv
254 254 2353 quarry-32503-en-wp_-all-abuse-filter-log-entries-in-2018-run318831.csv
data is still not enough for us to talk about a tendency towards introducing more filters (after the initial dip)
10 most active filters per year:
==> quarry-32489-en-wp-all-log-entries-before-20100101-run318768.csv <==
afl_filter,count(*)
135,175455
30,160302
61,147377
18,133640
3,95916
172,89710
50,88827
98,80434
65,74098
132,68607
==> quarry-32492-en-wp_-all-abuse-filter-log-entries-in-2010-run318774.csv <==
afl_filter,count(*)
61,245179
135,242018
172,148053
30,119226
225,109912
3,105376
50,101542
132,78633
189,74528
98,54805
==> quarry-32493-en-wp_-all-abuse-filter-log-entries-in-2011-run318775.csv <==
afl_filter,count(*)
61,218493
135,185304
172,119532
402,109347
30,89151
3,75761
384,71911
225,68318
50,67425
432,66480
==> quarry-32493-en-wp_-all-abuse-filter-log-entries-in-2012-run318778.csv <==
afl_filter,count(*)
135,173830
384,144202
432,126156
172,105082
30,93718
3,90724
380,67814
351,59226
279,58853
225,58352
==> quarry-32495-en-wp_-all-abuse-filter-log-entries-in-2013-run318779.csv <==
afl_filter,count(*)
135,133309
384,129807
432,94017
172,92871
30,85722
279,76738
3,70067
380,58668
491,55454
225,48390
==> quarry-32496-en-wp_-all-abuse-filter-log-entries-in-2014-run318780.csv <==
afl_filter,count(*)
384,111570
135,111173
279,97204
172,82042
432,75839
30,62495
3,60656
636,52639
231,39693
380,39624
==> quarry-32497-en-wp_-all-abuse-filter-log-entries-in-2015-run318782.csv <==
afl_filter,count(*)
650,226460
61,196986
636,191320
527,189911
633,162319
384,141534
279,110137
135,99057
686,95356
172,82874
==> quarry-32499-en-wp_-all-abuse-filter-log-entries-in-2016-run318789.csv <==
afl_filter,count(*)
527,437099
61,274945
650,229083
633,218696
636,179948
384,179871
279,106699
135,95131
172,79843
30,68968
==> quarry-32500-en-wp_-all-abuse-filter-log-entries-in-2017-run318797.csv <==
afl_filter,count(*)
61,250394
633,218146
384,200748
527,192441
636,156409
650,151604
135,80056
172,70837
712,59537
833,58133
==> quarry-32503-en-wp_-all-abuse-filter-log-entries-in-2018-run318831.csv <==
afl_filter,count(*)
527,358210
61,234867
633,201400
384,177543
833,161030
636,144674
650,79381
135,75348
686,70550
172,64266
what do the most active filters do?
135 publicly available description: "repeating characters"; tag, warn
30 "large deletion from article by new editors"; tag, warn
61 "new user removing references" ("new user" is handled by "!("confirmed" in user_groups)"); tag
18 "test type edits from clicking on edit bar" (people don't replace Example texts when click-editing); filter seems to have been deleted in Feb 2012
3 "new user blanking articles"; tag, warn
172 "section blanking"; tag
50 "shouting" (contribution consists of all caps, numbers and punctuation); tag, warn
98 "creating very short new article"; tag
65 "excessive whitespace" (note: "associated with ascii art and some types of vandalism"); seems to have been deleted in Jan 2010
132 "removal of all categories"; tag, warn
225 "vandalism in all caps" (difference to 50? seems to be swear words, but shouldn't they be catched by 50 anyway?); k, action is "disallow"
189 "BLP vandalism or libel" (äh.. wat? seems to be insulting living people); tag
402 "new article without references"; seems to have been deleted in Apr 2013, before that disabled with comment "disabling, no real use"
384 "addition of bad words or other vandalism" (seems to be a blacklist); disallow
432 "starting new line with lower case letters"; tag, warn //I recall there was a rule of thumb recommending not to user filters for style things? although that's not really style, but rather wrong grammar..
380 hidden; public comment "multiple obscenities"; disallow
351 "text added after categories and interwiki"; tag, warn
279 "repeated attempts to vandalise"; tag, throttle (triggered when someone hits "edit" repeatedly in a short ammount of time)
491 "edits ending with emoticons or !"; tag, warn
636 "unexplained removal of sourced content"; warn (that, together with 634 and 635 refutes my theory that warn always goes together with tag)
231 "long string of characters containing no spaces" (that's surely english though^^); tag, warn
650 "creation of a new article without any categories"; weird, it's markes as enabled here https://en.wikipedia.org/wiki/Special:AbuseFilter/650 , but does not appear in the actions data set; ah, ok, that is because there are no actions (other than logging probably)
527 hidden; public comments "T34234: log/throttle possible sleeper account creations"; throttle
633 "possible canned edit summary" (I think that's an edit summary that does not reflect the real edit; pre-filled on mobile though); tag
686 "IP adding possible unreferenced material to BLP" (BLP= biography of living people? I thought, it was forbidden to edit them without a registered account); no actions
712 "possibly changing date of birth in infobox" ("possibly"? and I thought infoboxes were pre-generated from wikidata?); no actions
833 "newer user possibly adding a unreferenced or improperly referenced material"; no actions
* get a sense of what gets filtered (more qualitative)
* vandalism
* unintentional suboptimal behavior from new users who don't know better (blanking an article/section; creating an article without categories; adding larger texts without references; large unwikified new article (#180)); or from users who are too lazy (to write proper edit summaries)
* editing behaviours and styles not suitable for an encyclopedia (poor grammar/not commiting to ortography norms; use of emoticons and !; ascii art?)
* "unexplained removal of sourced content" (#636) may be an attempt to silence a view point the editor doesn't like
* related: "IP adding possibly unreferenced material to BLP" (#686): similar to previous one; attempt at self-promotion? or libel
* and "users creating autobiographies" (#148)
* "log/throttle possible sleeper account creations" (#527): attempt to prevent sock-puppeting
* 712 is kinda weird^^
* 364 is a hidden filter with "changing the name in a BLP infobox": targeted vandalism?
* possible copyright issues/link spam for SEO: "adding external images/links" (#220)
* why get certain filters
(and not others?)
* has the willingness of the community to use filters increased over time?
looking at aggregated values of number of triggered filters per year, the answer is rather it's quite constant
* how often were (which) filters triggered
https://tools.wmflabs.org/ptwikis/Filters:enwiki
--> stats for the last 30 days: filters sorted alphabetically, stats according to number of actions of each type (no action match, warn, tag, disallow, block)
https://tools.wmflabs.org/ptwikis/Filters:enwiki:102
--> for single filters you can view a graph of filter activity for the last year
also possible to compare a small amount of filters on a graph:
https://tools.wmflabs.org/ptwikis/Filters:enwiki:102&11
https://en.wikipedia.org/wiki/Special:AbuseFilter
Sortable table of all filters with following columns:
Filter ID Public description Actions Status Last modified Visibility Hit count
links to single filters, e.g. --> https://en.wikipedia.org/wiki/Special:AbuseFilter/1 (see bellow for detailed filter page)
"Actions" is one of: warn | tag | disallow | throttle | ?? (possibly more, not directly visible)
"Status" is: enabled | disabled
"Last modified" provides a link to diff between versions and the user who did the modification
"Visibility" is: private | public
"Hit count": which period is counted? total number of hits since the filter was enabled? (for all enabled periods, in case it was enabled/disabled multiple times?)
Filter with most hits (altogether):
Filter ID Public description Actions Status Last modified Visibility Hit count
61 New user removing references Tag Enabled 12:43, 14 May 2017 by Zzuuzz (talk | contribs) Public 1,593,851 hits
see also quarry-32518;
the thing is, we can't really classify hits by filter actions since actions triggered by the filters change
https://en.wikipedia.org/wiki/Special:AbuseFilter/61
statistics are info such as "Of the last 1,728 actions, this filter has matched 10 (0.58%). On average, its run time is 0.34 ms, and it consumes 3 conditions of the condition limit." // not sure what the condition limit is
List information about filters:
https://en.wikipedia.org/w/api.php?action=query&list=abusefilters&abfshow=!private&abfprop=id%7Chits
or in the sandbox:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=abusefilters&abfshow=!private&abfprop=id%7Chits
List instances where actions triggered an abuse filter.
https://en.wikipedia.org/w/api.php?action=query&list=abuselog&afluser=SineBot&aflprop=ids
or in the sandbox:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=abuselog&afluser=SineBot&aflprop=ids
* percentage of triggered filters/all edits
* break down triggered filters according to typology
* percentage filters of different types over the years
the thing is, we can't really classify hits by filter actions since actions triggered by the filters change
we know what the actions of a given filter are just for now...
maybe the dumps can answer this question?
We can try to map some of the descriptive statistics with the WM quarry service:
https://quarry.wmflabs.org/query/32483
https://quarry.wmflabs.org/query/32489
https://quarry.wmflabs.org/query/32487
give us an idea of what data the abuse filter related tables contain
Results: see above