%TODO: where should this go? Already kind of mentioned in the introducing a filter part
Since edit filters run against every edit saved on Wikipedia, creating rarely tripped filters is generally advised against, and a number of alternatives are offered to edit filter managers and editors proposing new filters.
%TODO: the number of filters cannot grow endlessly: every edit is checked against all of them and this consumes computing power (and the limits apparently have not been lifted in step with Moore's law). Is this the reason why the number of filters has been more or less constant over the years?
"Each filter takes time to run, making editing (and to some extent other things) slightly slower. The time is only a few milliseconds per filter, but with enough filters that adds up. When the system is near its limit, adding a new filter may require removing another filter in order to keep the system within its limits."
\end{comment}
For example, there is the page protection mechanism that addresses problems on a single page.
Also, title and spam blacklists exist and these might be the way to handle problems with page titles or link spam~\cite{Wikipedia:EditFilter}.
Moreover, it is advised to run in-depth checks (for single articles) separately, e.g. by using bots~\cite{Wikipedia:EditFilterRequested}.
It is, however, worth mentioning that these mechanisms not only operate alongside each other but also cooperate on occasion.
%TODO see Geiger's paper (banning of a vandal?) where the process of tracing the same malicious editor throughout Wikipedia and reverting their edits with various tools (and issuing them warnings) leads to the (temporary) block of the editor
* consider collaborations between filters and bots (e.g. MrZ Bot, which often puts editors found in the abuse log on the AIV noticeboard). Are there further examples of this kind of collaboration?
For instance, DatBot~\cite{Wikipedia:DatBot} monitors the abuse log\footnote{\url{https://en.wikipedia.org/wiki/Special:AbuseLog}}
and reports users tripping certain filters to WP:AIV (Administrator intervention against vandalism)\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:Administrator_intervention_against_vandalism}} and WP:UAA (usernames for administrator attention)\footnote{\url{https://en.wikipedia.org/wiki/Wikipedia:Usernames_for_administrator_attention}}.
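To make this collaboration concrete, the following is a minimal sketch of how a patrolling bot in the vein of DatBot might consume the public abuse log via the MediaWiki API's \texttt{list=abuselog} module and decide which users to report. This is not DatBot's actual source code; the watched filter IDs, the reporting threshold, and the shape of the mock log entries are illustrative assumptions, and the network call itself is omitted.

```python
# Hedged sketch (not DatBot's actual implementation): poll the public
# abuse log (Special:AbuseLog) via the MediaWiki API and pick out
# users who repeatedly trip watched filters. The request parameters
# follow the API's list=abuselog module; filter IDs, threshold, and
# entry field names are illustrative assumptions.
from collections import Counter

API_PARAMS = {
    "action": "query",
    "list": "abuselog",                      # entries from Special:AbuseLog
    "aflprop": "user|filter_id|action|timestamp",
    "afllimit": 50,
    "format": "json",
}

WATCHED_FILTERS = {"61", "135"}  # illustrative filter IDs, not DatBot's
REPORT_THRESHOLD = 3             # hits before reporting to WP:AIV (assumed)

def users_to_report(log_entries, watched=WATCHED_FILTERS,
                    threshold=REPORT_THRESHOLD):
    """Count abuse-log hits per user for the watched filters and
    return the users at or above the reporting threshold."""
    hits = Counter(
        entry["user"]
        for entry in log_entries
        if str(entry.get("filter_id")) in watched
    )
    return sorted(user for user, n in hits.items() if n >= threshold)

# Mock log entries shaped roughly like API response items:
sample = (
    [{"user": "VandalX", "filter_id": 61}] * 3
    + [{"user": "GoodFaithEditor", "filter_id": 135}]
)
print(users_to_report(sample))  # ['VandalX']
```

The design point the sketch illustrates is the division of labour discussed above: the filter itself only logs (or tags) the edit, while a bot reacts to the accumulated log entries after the fact.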
%TODO are there further examples of such collaborations: consider scripting something that parses the bot descriptions from https://en.wikipedia.org/wiki/Category:All_Wikipedia_bots and looks for "abuse" and "filter"
Apparently, Twinkle at least has the possibility of using heuristics from the abuse filter log for its queues.
%TODO check. how about other tools
(Interesting side note: editing via TOR is disallowed altogether: "Your IP has been recognised as a TOR exit node. We disallow this to prevent abuse" or similar, check again for wording. Compare: "Users of the Tor anonymity network will show the IP address of a Tor "exit node". Lists of known Tor exit nodes are available from the Tor Project's Tor Bulk Exit List exporting tool." \url{https://en.wikipedia.org/wiki/Wikipedia:Vandalism})
the data is still not sufficient for us to speak of a tendency towards introducing more filters (after the initial dip)
\end{comment}
\textbf{Most frequently triggered filters for each year:}
* consider adding permalinks with exact revision ID as sources!
* an idea for the presi/written text: begin and end every part (section/paragraph) with a question: what question do I want to answer here? what question is still open?
* How many of the edit filter managers also run bots? How do they decide in which cases to implement a bot and in which a filter?
* Why are there mechanisms triggered before an edit is published (such as edit filters), and others triggered afterwards (such as bots)? Is there a qualitative difference?