EDA and Quantifying Edit Wars
Topics
This week’s assignments will guide you through the following topics:
- EDA: understanding contributions to Wikipedia
- Defining the M-statistic
Reading
Please read the following:
- Edit Wars on Wikipedia, Section 02. Read it carefully, make sure you
really understand it, and try to verify its claims against the data.
- Aaron Halfaker, R. Stuart Geiger, Jonathan T. Morgan, and John
Riedl. “The Rise and Decline of an Open Collaboration System: How
Wikipedia’s Reaction to Popularity Is Causing Its Decline.” American
Behavioral Scientist. [Link]
Tasks
Complete the following tasks:
- Calculate the M-statistic for a single article with an ‘edit war’
(you may find the script on WikiWarMonitor useful).
- Attempt an EDA on the light-dump data on the WikiWarMonitor website
(you may fall back to a smaller Wikipedia, such as Simple English, if
needed). Your EDA should attempt to:
- Orient the reader to what edit sequences look like in time and how
they vary from article to article.
- Bonus: do a preliminary analysis of the M-statistic on many articles.
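As a starting point for the M-statistic task, here is a minimal sketch in Python. It is not the WikiWarMonitor script. It assumes one common reading of the definition in the Edit Wars reading: a revert occurs when a revision restores the exact text of an earlier revision, the editor of the revision immediately after the restored one is the reverted editor, and M = E * (sum over mutual-revert pairs of min(N_d, N_r), leaving out the pair with the largest value), where E is the number of editors involved in mutual reverts and N_d, N_r are the two editors' total edit counts on the article. The input format (a chronological list of (editor, text_hash) tuples) and the function names find_reverts and m_statistic are hypothetical; adapt them to whatever the light dump actually provides, and check the details against the paper.

```python
from collections import Counter


def find_reverts(revisions):
    """Return (reverter, reverted) pairs for one article.

    revisions: chronological list of (editor, text_hash) tuples.
    Assumption: revision j is a revert if its text matches an earlier
    revision i-1; the editor of revision i is treated as the one reverted.
    """
    reverts = []
    last_seen = {}  # text_hash -> index of the latest revision with that text
    for j, (editor, text_hash) in enumerate(revisions):
        if text_hash in last_seen:
            i = last_seen[text_hash] + 1  # first revision that got undone
            if i < j and revisions[i][0] != editor:  # skip null edits and self-reverts
                reverts.append((editor, revisions[i][0]))
        last_seen[text_hash] = j
    return reverts


def m_statistic(revisions):
    """Compute M = E * sum of pair weights, excluding the heaviest pair."""
    edit_counts = Counter(editor for editor, _ in revisions)
    directed = set(find_reverts(revisions))
    # A pair is mutual if each editor has reverted the other at least once.
    mutual = {frozenset(pair) for pair in directed
              if (pair[1], pair[0]) in directed}
    if not mutual:
        return 0
    weights = sorted(min(edit_counts[a], edit_counts[b])
                     for a, b in map(tuple, mutual))
    editors = set().union(*mutual)  # E: editors involved in mutual reverts
    return len(editors) * sum(weights[:-1])  # drop the top-weighted pair


# Toy history (made up for illustration, not real Wikipedia data).
example = [
    ("alice", "h0"), ("bob", "h1"), ("alice", "h0"), ("bob", "h1"),
    ("carol", "h2"), ("alice", "h1"), ("carol", "h2"), ("alice", "h1"),
]
print(m_statistic(example))  # 6
```

In the toy history, alice and bob revert each other, as do alice and carol, so E = 3. Both pairs have weight min(4, 2) = 2; dropping one as the maximum leaves a sum of 2, giving M = 6. Note that an article with only one mutually reverting pair gets M = 0 under this reading, since that pair is the one excluded.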
Weekly Questions
Answer the following questions:
- How is collaboration defined in the Halfaker et al. reading?
- What article did you choose? What was the M-statistic?
- Give one observation from your EDA that was interesting. Be
specific and give specific values!
- What was one difficulty that you encountered this week?