diff --git a/notes b/notes
index 2039175bff12d25d27e27818fdf506c7bb07704a..2d612ce8eacf0da37959ab4fc71bf7286f64bc86 100644
--- a/notes
+++ b/notes
@@ -1724,3 +1724,9 @@ is_bot edits Percentage of all edits
  \item how many different edit filter editors are there (af\_user)?
  \item categorise filters according to which name spaces they apply to; pay special attention to edits in user/talks name spaces (may be indication of filtering harassment)
 \end{itemize}
+
+\textbf{Questions on abuse\_filter\_action table}
+\begin{itemize}
+ \item how many filters trigger any particular action (at the moment)?
+ \item how many different parameters are there (i.e. tags when tagging, or templates to show upon a warning)?
+\end{itemize}
diff --git a/thesis/5-Overview-EN-Wiki.tex b/thesis/5-Overview-EN-Wiki.tex
index 3d9f874b3849ab9334af1a3073ed2c54324292f0..1035b6b6465f08a3371f8820b3821735e138f90b 100644
--- a/thesis/5-Overview-EN-Wiki.tex
+++ b/thesis/5-Overview-EN-Wiki.tex
@@ -62,11 +62,22 @@ abuse_filter
 In this section, we explore some general patterns of the edit filters on Engish Wikipedia, or respectively the data from the \emph{abuse\_filter} table.
 The scripts that generate the statistics discussed here, can be found in the jupyter notebook in the project's repository.
 %TODO add link after repository has been cleaned up
-As of January 6th, 2019 there are 954 filters in this table.
+As of January 6th, 2019 there are $954$ filters in this table.
 It should be noted, that if a filter gets deleted, merely a flag is set to indicate so, but no entries are removed from the database.
-So, the above mentioned 954 filters are all filters ever made up to this date.
+So, the above mentioned $954$ filters are all filters ever made up to this date.
 This doesn't mean that it never changed what the filters are doing, since, as pointed out in chapter~\ref{}, edit filter managers can freely modify filter patterns, so at some point the filter could be doing one thing and in the next moment, it is filtering a completely different phenomenon.
 This doesn't happen very often though.
+$361$ of all filters are public, the remaining $593$–hidden.
+$110$ of the public ones are active, $35$ are disabled, but not marked as deleted, and $216$ are flagged as deleted.
+Out of the $593$ hidden filters $91$ are active, $118$ are disabled (not deleted), and $384$ are deleted.
+The relative proportion of these groups to each other can be viewed on figure~\ref{fig:general-stats}.
+
+\begin{figure}
+\centering
+ \includegraphics[width=0.9\columnwidth]{pics/general_stats.png}
+ \caption{EN Wikipedia edit filters: hidden, disabled and deleted filters}~\label{fig:general-stats}
+\end{figure}
+
 Tables ... show how many new filters have been introduced over the years.
 And how many filters have been active (``enabled'') over the years.
 %TODO do I have data for this
@@ -115,10 +126,10 @@ There are in the meantime over 5 pages of them, it is definitely happening autom
 TODO: download data; write script to identify actions that triggered the filters (accountcreations? edits?) and what pages were edited
 Note: do hidden filters appear in this numbers and in the table? (They are definitely not displayed in the front end of the AbuseLog)
 \end{comment}
-%TODO strectch plot so months are readable
+%TODO strectch plot so months are readable; darn. now it's too small on the pdf. Fix it! May be rotate to landscape?
 \begin{figure}
 \centering
- \includegraphics[width=0.9\columnwidth]{pics/number-filter-hits.png}
+ \includegraphics[width=0.9\columnwidth]{pics/filter-hits-zoomed.png}
 \caption{EN Wikipedia edit filters: Number of hits per month}~\label{fig:filter-hits}
 \end{figure}
 
@@ -205,11 +216,6 @@ Most public filters on the other hand still assume good faith from the editors a
  \item in which namespaces get filters triggered most frequently?
 \end{itemize}
 
-\textbf{Questions on abuse\_filter\_action table}
-\begin{itemize}
- \item how many filters trigger any particular action (at the moment)?
- \item how many different parameters are there (i.e. tags when tagging, or templates to show upon a warning)?
-\end{itemize}
 \end{comment}
 
 
@@ -313,12 +319,6 @@ It draws attention that currently nearly $2/3$ of all edit filters are not viewa
 Unfortunately, without the full \emph{abuse\_filter\_history} table we cannot know how this ration has developed historically.
 However, the numbers fit the assertion of the extension's core developer according to whom edit filters target particularly determined vandals.
 
-\begin{figure}
-\centering
- \includegraphics[width=0.9\columnwidth]{pics/general_stats.png}
- \caption{EN Wikipedia edit filters: hidden, disabled and deleted filters}~\label{fig:general-stats}
-\end{figure}
-
 Although the initial plan was to make all filters hidden, the community discussions rebutted that so a guideline was drafted calling for hiding filters ``only where necessary, such as in long-term abuse cases where the targeted user(s) could review a public filter and use that knowledge to circumvent it.''~\cite{Wikipedia:EditFilter}.
 Further, caution in filter naming is suggested for hidden filters and editors are encouraged to give such filters just simple description of the overall disruptive behaviour rather than naming a specific user that is causing the disruptions.
 
diff --git a/thesis/conclusion.tex b/thesis/conclusion.tex
index 8c6e088c873aa8d958392b26fc22ad2bfdef2dd9..96f3a62acc42d11099e6c0c2d55f01265f4f04c7 100644
--- a/thesis/conclusion.tex
+++ b/thesis/conclusion.tex
@@ -71,6 +71,7 @@ timeline
 
 Interesting fact: there are edit filters that try to precisely identify the upload of media violating copyrights
 %TODO refer to Lessig, Chapter 10 when making the upload filter commentary
+% think about what values are embedded how in what systems (Lessig)
 
 From talk archive: "Automatic censorship won't work on a wiki. " // so, people already perceive this as censorship; user goes on to basically provide all the reasons why upload filters are bad idea (Interlanguage problems, no recognition of irony, impossibility to discuss controversial issues); they also have a problem with being blocked by a technology vs a real person
 