@@ -346,14 +346,14 @@ So far, I haven't managed to trigger a filter with a different action.
\textbf{Interesting questions}
\begin{itemize}
\item how many filters are there (were there over the years)
\item what do the most active filters do?
\item get a sense of what gets filtered (more qualitative)
\item has the willingness of the community to use filters increased over time?: judging by the aggregated number of triggered filters per year, rather not; it has stayed fairly constant
\item how often were (which) filters triggered
\item percentage of triggered filters/all edits; break down triggered filters according to typology
\item percentage of filters of different types over the years
\item what gets classified as vandalism? has this changed over time? (look at words and patterns triggered by the vandalism filters; read vandalism policy page)
\item how many filters are there (were there over the years): 954 filters (as of 06.01.2019); TODO: historical numbers?
\item what do the most active filters do?: see~\ref{tab:most-active-actions}
\item get a sense of what gets filtered (more qualitative): TODO: refine after sorting through manual categories; preliminary: vandalism; unintentional suboptimal behavior from new users who don't know better ("good faith edits"), such as blanking an article/section, creating an article without categories, adding larger texts without references, or creating a large unwikified new article (\#180); or from users who are too lazy (to write proper edit summaries); editing behaviors and styles not suitable for an encyclopedia (poor grammar/not adhering to orthography norms; use of emoticons and "!"; ASCII art?); "unexplained removal of sourced content" (\#636), which may be an attempt to silence a viewpoint the editor doesn't like; self-promotion (adding unreferenced material to BLP; "users creating autobiographies", \#148); harassment; sockpuppetry; potential copyright violations
\item has the willingness of the community to use filters increased over time?: judging by the aggregated number of triggered filters per year, rather not; it has stayed fairly constant; TODO: plot it at a finer granularity
\item how often were (which) filters triggered: see \url{filter-lists/20190106115600_filters-sorted-by-hits.csv} and~\ref{tab:most-active-actions}; TODO: aggregate hit counts over tagged categories once tagging is finished
\item percentage of triggered filters/all edits; break down triggered filters according to typology: TODO: still need the complete abuse\_filter\_log table, and probably further dumps in order to determine the total number of edits
\item percentage of filters of different types over the years: TODO: according to actions (I need a complete abuse\_filter\_log table for this!); according to self-assigned tags (finish tagging!)
\item what gets classified as vandalism? has this changed over time? TODO: look at words and patterns matched by the vandalism filters; read vandalism policy page; pay special attention to filters labeled as vandalism by the edit filter editors (i.e. in the public description) vs. those I labeled as vandalism
\end{itemize}
\textbf{Questions on abuse\_filter table}
...
...
@@ -369,7 +369,8 @@ So far, I haven't managed to trigger a filter with a different action.
\item what are the values in the "group" column? what do they mean?
\item which are the most frequently triggered filters of all time?
\item is it new filters that get triggered most frequently? or are there also very active old ones?
\item how many different log filter editors are there (af\_user)?
\item how many different edit filter editors are there (af\_user)?
\item categorise filters according to which namespaces they apply to; pay special attention to edits in the user/talk namespaces (may be an indication of filtering harassment)