Hacker News
Who Writes Wikipedia? (2006) (aaronsw.com)
92 points by luu on March 3, 2020 | 68 comments


    Wales is right about one thing, though. This fact does have enormous policy
    implications. If Wikipedia is written by occasional contributors, then 
    growing it requires making it easier and more rewarding to contribute 
    occasionally. Instead of trying to squeeze more work out of those who spend
    their life on Wikipedia, we need to broaden the base of those who 
    contribute just a little bit.
The problem is that, as a community, Wikipedia has gone out of its way to do the opposite of that. New content is often treated as "guilty until proven innocent", and it's up to the contributor to wade through Wikipedia's idiosyncratic rules and definitions in order to justify to the moderators why their edits should not be reverted or their articles deleted.


Please don't quote text in code blocks. It's impossible to read on mobile and narrow viewports.

---

Properly formatted:

> Wales is right about one thing, though. This fact does have enormous policy implications. If Wikipedia is written by occasional contributors, then growing it requires making it easier and more rewarding to contribute occasionally. Instead of trying to squeeze more work out of those who spend their life on Wikipedia, we need to broaden the base of those who contribute just a little bit.


> New content is often treated as "guilty until proven innocent", and it's up to the contributor to wade through Wikipedia's idiosyncratic rules and definitions in order to justify to the moderators why their edits should not be reverted or their articles deleted

This has been EXACTLY my experience and it really really turned me off from even bothering in the future. :(


Isn't that the same for most open-source software[1]? To a Wikipedia editor going into a programming project to correct documentation, it might seem absurd that we want them to 'pass CI', 'squash the commits and correct the message column width to pass the code style', or 'sign the CLA'.

[1] It's a very prettied-up domain-specific language but it's still copyleft software.


It would be, if the requirements were clearly documented, consistent for all pages, and consistently enforced. They are none of those things.

EDIT: the other big difference is that every time I've attempted to contribute to an open-source project, people have, in general, been helpful and willing to explain what needed improvement. On Wikipedia, it's a much more hostile attitude. People are less willing to explain why your content was reverted or nominated for deletion, and if you protest, there's a decent chance that you'll be sanctioned (either topic-banned or temp-banned from Wikipedia). Imagine if forgetting to run a linter got your SSH key tempbanned from the project and you'll have a closer analogy to the current process.


It really depends on what you edit. Sci/tech/math is pretty much un-monitored in my experience. The economics pages are full of weird pet theories. Meanwhile, as you say, current events, biographies, and a lot of other less technical pages (that don't require the deep background that the article refers to) are infested with deletionists.


It really seems to come down to politics, specifically whether or not /any/ group considers a topic political. I've made a few reasonably long albeit anonymous edits to the pages for specific regional cuisines and so far the only thing that's been changed is someone altered my wording a little once by breaking a sentence up into two.


I noticed the same pattern. Pages surrounding software architecture and companies always have some amount of FUD in them.

Do we have any surveys for Wikipedia, like Stack Overflow's, about the demographics of editors and how many people visit pages in particular categories? How often will someone notice a mistake and try to fix it?


Sadly, there are a couple of paradoxes.

Some occasional contributions only press others to do something, helpfully pointing out what they think are mistakes and errors. This is someone with an axe to grind in denigrating wiki lingo.

As someone who's been told off there before, I think I understand the problem, and I'd like to agree with Aaron's sentiment--I don't know where it's coming from, though.

The quote is very much related. Who is "we"?

> we need to broaden the base of those who contribute just a little bit.

Easing entry into the bureaucracy would likely increase its burden significantly, and it's not strictly separable from content contribution.

Nevertheless, being a frequent and fair editor is not a sufficient criterion for good administration, sadly.


> "guilty until proven innocent"

I think that's still a good approach. You want Wikipedia to be a trustworthy source, not one where everyone can add their half-knowledge without any checks.

Maybe making the discussion pages more "interactive" might help a bit. People could share their ideas, interesting sources or other inputs in a casual way and someone more skilled in writing and researching could use that to write the proper article.


From my experience trying to contribute to the Hebrew version: try expanding an existing article and watch what happens. In my case there was very strong pushback even against adding links that expanded on issues already mentioned, which IMHO helped to balance an obvious slant. I gave up after a while. At university of course you’re warned not to use Wikipedia, it’s not acceptable as a reference. Many examples are presented of experts in their field who contributed articles that were then rewritten to the point where nothing was left other than the revised version of the moderators and the small mafia that runs the Israeli Wikipedia. There’s also the example of the Croatian version, which was taken over by neo-Nazis, to the point that even the ministry of education had to publish a warning.


The political bias of wikipedia editors is horrible in most smaller countries. You mention Israel and Croatia. I can add Sweden. For an example, compare the wikipedia entries of the two largest parties in the Swedish parliament. One, Socialdemokraterna, starts with a blurb on how the party provides public welfare and the party slogan ("av var och en efter förmåga, åt var och en efter behov"). Compare with the second largest party, Sverigedemokraterna, which has an entire section devoted to listing scandals, controversial quotes and shunnings. Whatever you think about the two ideologies, these two wikipedia entries alone should be enough proof that there is a (left) political bias of wikipedia editors.

I use wikipedia often because it has a lot of information in it, but would never rely on it as a source of fact. It is a google result like any other, and must be treated with caution.


I can echo your concern. German Wikipedia has the same troubles. That's why it's not uncommon, among the people I've asked or observed, to habitually fact-check a German wiki page with a quick hop over to the English version (if there is any ideological classification involved).

There, at least, Wikipedia's promise that the heterogeneity of its writers creates informational breadth and balance seems to be better kept.

In contrast to the German version, the English articles often have dedicated "criticism" sections that leave the criticism uncommented. The German versions often start with a political classification right away. If the classification is contested, then it stays there and just gets slightly changed into reported speech. If any counter-arguments to that classification make it into an article, then they are almost always commented on, often with a counter-argument to the counter-argument from the groups that created that classification in the first place, literally giving them the last word.


>The German versions often start with a political classification right away. If the classification is contested, then it stays there and just gets slightly changed into reported speech. If any counter-arguments to that classification make it into an article, then they are almost always commented on, often with a counter-argument to the counter-argument from the groups that created that classification in the first place, literally giving them the last word.

It's exactly the same in English Wikipedia for anything to do with current politics. Try reading any article about President Trump.


Almost all social platforms on the internet have a left-leaning tilt because, at least partially, the left (historically) organizes much better than the right does. This is one of those "well duh" statements once it's said out loud, considering the left's entire platform is about unification and social issues in most countries. That said, there are signs that point to a change in that recently (the second amendment rallies and sanctuaries around the US as examples).

This is extremely visible on Reddit, to the point that even moderates on the left can find it exhausting.


I was under the impression that this was because the left skews younger, and Gen X and millennials are more active and capable online.


Age is a component, but it doesn't adequately explain the difference in movements that occur in the real world as well as online. Universities have students organizing protests against right-wing figures quite frequently, and it was almost exclusively the left who pushed for the $15 minimum wage, despite both parties' populations being heavily affected.

From what I've seen (living in the US, Germany, and Sweden), no matter where you go there is a difference in 'wiring' [0]. And according to some sources I've read, there appears to be some science behind that as well.

[0]https://www.theatlantic.com/politics/archive/2013/10/can-you...


Universities skew younger even more than the internet does.

I think there is an element of the left wanting to "fix" the world which appeals to young people, and the right wanting to "protect" the world from bad changes, which appeals to older more cautious people.


The US military also skews young but leans more towards the Republican Party.

https://news.gallup.com/poll/118684/military-veterans-ages-t...


Looks like there are 1.29 million in the US military (roughly 0.4% of the US population), with an average age of 34.5. (All figures from the top search result on Google.)

59% of millennials lean or identify as Democrat compared to 35% Republican https://www.people-press.org/2018/03/01/1-generations-party-...


There were 18.8 million veterans living in the US in 2017 [0].

I don't know a single person in the military today who is over 30, and I live right by Nellis (not exactly hard data, of course). Most people go into the military right after high school. I'd be surprised if the average age is actually that high, especially considering you can't even join the military over certain ages, though that depends on the branch and other factors. IIRC you cannot join the Marines if you are over 30.

I suspect you probably are specifically looking at officers, which is an entirely different story.

[0]https://www.ncsl.org/blog/2017/11/10/veterans-by-the-numbers...


Is that true? I've always thought that the right was mildly more hierarchical (which I would describe as "organized") and the left a little bit more grassroots. That said, I have no sources, and another anecdote doesn't mean much.


The vast majority of large social movements have their origins in the left. Climate change, most social justice movements, the push for higher minimum wage, immigration reform, etc.

In general, people leaning 'right' are less interested in social initiatives on the whole. In the US, one of the most common talking points from the right is the desire to be 'left alone' by the government or social movements.

The only issue the right seems to really be organized about (in the US) is guns, and that's because a huge percentage of military folk lean that direction.


Large grassroots social movements are not the only way to be organized, though. I agree that the left seems better able to capitalize on that type of movement in recent history, but I don't think it follows that the right is disorganized, just organized differently. (Then again, this is the Trump era, and he is not really the establishment right wing I'm thinking of. To me at least he seems to be riding a wave of fairly disorganized [but powerful] right-wing populism, so maybe you are right in regards to him.)


In what way is, or was, the right more disorganized? Both the extreme right and the moderate right are full of groups, think tanks, movements, grassroots movements and whatnot. Whether you define the right in terms of economics, social issues (women, race), or religion, it was never disorganized.


On English Wikipedia, the tilt is not exactly "left." It's generally more "liberal" (in the American sense). The group of editors that controls American politics articles pretty clearly isn't feeling the Bern, for example.

But this is highly dependent on the topic area. Some pages are controlled by relatively right-wing pro-Israeli editors, others are controlled by pro-Palestinian editors (and many are caught in the tug-of-war between these groups). Different groups control different topic areas, and some of those groups can be extremely right-wing.


I don't think it's left vs. right in this case.

Without going into specifics, and considering the description of the process given by one of the commenters, the example I'm referring to was specifically a link to the academic paper from which a certain quote was paraphrased ...

I first tried to add a direct quote from the writer's paper, which was removed, and then a link to her paper, which was removed.

What they were trying to achieve was to minimize the importance of that source by qualifying it with a "some researchers claim" sort of qualifier.

It's really not about the content, but about equal and balanced representation of authentic and reliable sources, i.e. a research paper which was already referenced. Not every "opinion" and slant is important, but preferring a paraphrase to a direct quote is obviously an attempt to minimize the source.


You may be correct, but your proof suffers from the fallacy of the middle ground.

Unbiased reality by no means implies that the number of scandals, controversial quotes, achievements or even the relevance of the party slogan must be similar for both sides.


Given that it is politics, there could hardly be no scandals at all, however it is framed. I can't read it, so I won't say there were none present, though.


Do your centrist parties have as many outrageous scandals as your far-right party? Because if they don’t, what you say doesn’t suggest any sort of bias.

I know in my country things are somewhat similar, and you’ll find fewer scandals listed for the centrist parties than for the left- or right-wing parties. That is mainly because the centrist parties don’t go around talking about how the Holocaust never happened (right wing) or how violence against the police is warranted (left wing), though.


The centrism bias should be considered bias too. The "truth is in the middle" or "both sides must be the same" is quite often just not true - in politics, work and personal life.


Scandals from one party or the other are rarely covered equally in the news of any country. The US is not alone in the divisiveness of our politics, nor in the lack of unbiased reporting.


>The political bias of wikipedia editors

There is the problem of sourcing. Per Wikipedia's policies [0], any contribution must be grounded in reliable sources, which in the case of recent events boils down to big-name news publications. Given that most of the big-name media seem to gravitate towards either progressive or socialist views, the contributions are likely to end up with a derived slant.

The other aspect is notable in cases of a conflict between the netizens (or generally the consumers) and the media[1], where Wikipedia is self-bound to side with the media's reporting on themselves. As far as I can tell, Wikipedia has no provision for resolution of this conflict of interest.

--

[0] "If no reliable sources can be found on a topic, Wikipedia should not have an article on it", https://en.wikipedia.org/wiki/Wikipedia:Reliable_sources

[1] terribly sorry about bringing up the specter of echo Tnzretngr pbagebirefl | rot13 in 2020


This was pretty similar to my experience trying to add up to date participation numbers to Australian sport.

There is a big survey done in Australia, the Ausplay survey:

https://www.clearinghouseforsport.gov.au/research/smi/auspla...

They have extensive tables on adult participation.

Two editors would not let this data into the Sport in Australia article. The reason was fairly obvious, it shows that soccer/football is the most played team sport in Australia. Roy Morgan, a statistical agency, also had similar figures.

Much time was spent on the talk pages asking these two what would be acceptable for quoting these sporting statistics. The answer was nothing.

If you can't get fairly unobjectionable material like that into wikipedia what else is being blocked?


> The reason was fairly obvious, it shows that soccer/football is the most played team sport in Australia

I'm not familiar with Australian culture; can you explain why that would be obvious?


Good point. Sorry.

Some older Australians are parochial about sport. Soccer is seen as foreign and something not really Australian.

It's similar to the way soccer has been described as un-American in the US.


You can learn a lot about a topic you care about by reading the Talk page and its archives; most relevant information that people want to add has been discussed at length there, weighing the pros and cons of the proposed text, the biases, and the quality of the sources. A small amount of talk is "oversighted" (hidden from view to all but administrators), but that's typically only for scandals that affect living persons. You'll also have to wade through inflamed battles with unbridled passions, but the main issues of conflict become clear.

That said, there are two kinds of articles: hot topics which get the above treatment for any suggested change (either for political or fanboyism reasons); and abandoned gardens where no one cares and weeds grow, marked with warning signs. These were created either by copying an old public-domain encyclopedia or by a passionate editor in the early days, and have seen few edits since then.

Either way, the most valuable resource of Wikipedia has always been the References section. You can skim the definition and safely ignore the rest of the article, and still use it as the starting point for learning about an unknown topic by searching the most promising sources of a well-curated collection of relevant links, that would be hard to find anywhere else using a search engine.


> At university of course you’re warned not to use Wikipedia, it’s not acceptable as a reference

The reason for this is too often misconstrued as being that Wikipedia is somehow misleading. The reality is that you do not reference something that is not a primary source. Since everything on Wikipedia ostensibly has to link to something, it is just lazy to cite "wikipedia" as the source for something.

I mean, I imagine you couldn't just say "New York Library" as a source for a fact either in university. You'd cite the actual source.


Interestingly Wikipedia itself promotes the use of secondary sources, discouraging over-reliance on primary and tertiary sources. https://en.wikipedia.org/wiki/Wikipedia:Identifying_primary_...


It would be interesting to redo this analysis now. 2006 was a long time ago. In 2006 wikipedia was in a very rapid growth phase; lots of topics still didn't have good coverage yet. In 2020, I can't remember a recent time where a mainstream topic didn't have extensive coverage on en wikipedia. Do the masses still write most of Wikipedia now that the low hanging fruit has been written?


I recently amazed several non-tech friends by mentioning that I, personally, have edited Wikipedia. It's not something a lot of people think of as something they can even do. They're not fixing typos, let alone making substantive edits or writing new articles.

I realize "do most people edit Wikipedia" is different from "are most Wikipedia edits from casual users", but it was an eye-opening interaction for me.


None of the comments in this submission's thread so far seem to be related to the article's content (the headline was tantalizing, but the article itself is really cool), so to try and counteract that, here's one that's directly related:

As it says, Wikipedia is written by vast amounts of people. The myth that only a small number of people contribute is just that: a myth. Checking edit histories on anything but the most niche of articles would demonstrate this, but you could also do the same analysis he did today and see if you can replicate his result; it's been a few years now, things probably look slightly different.

This has gotten more relevant over time even though the people perpetuating the myth have changed, along with their motives for doing so. One of the more popular and long-lasting myths!
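For anyone who wants to try replicating it, here is a minimal sketch of the idea of crediting surviving text, rather than edit counts, to each contributor. It is an illustration under stated assumptions, not the script used in the article: the function name, the `(author, full_text)` input format, and the sample history are all made up, and character-level diffing with Python's `difflib` is a simplification that will misattribute text that was merely moved.

```python
from collections import Counter
from difflib import SequenceMatcher


def surviving_chars_by_author(revisions):
    """Credit each character of the latest revision to the author who added it.

    `revisions` is a hypothetical input format: a chronological list of
    (author, full_text) pairs for a single article.
    """
    owners = []  # owners[i] = author credited with character i of the current text
    previous = ""
    for author, text in revisions:
        matcher = SequenceMatcher(None, previous, text, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged span: keep whoever was already credited.
                updated.extend(owners[i1:i2])
            else:
                # "insert"/"replace" add j2 - j1 characters by this author;
                # "delete" adds none because j1 == j2.
                updated.extend([author] * (j2 - j1))
        owners, previous = updated, text
    return Counter(owners)


# Made-up two-revision history, purely for illustration.
history = [
    ("198.51.100.7", "Cats are mammals."),
    ("RegisteredUser", "Cats are small mammals."),
]
credit = surviving_chars_by_author(history)
```

Comparing `credit.most_common()` with a plain count of edits per author would give the two rankings the article contrasts. Real revision histories would have to be fetched separately, and the numbers should be treated as rough.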


Each subject area is controlled by a relatively small number of people. There are about a dozen editors on each side of the Israeli-Palestinian issue who duke it out at each related article. There are about two dozen centrist Democrats who own most modern American politics articles. As a whole, Wikipedia may be edited by a large number of people, but if you spend any time editing any given subject area, you get to know the one or two dozen people who control it pretty quickly.


I don't get why you'd want to perpetuate the myth of Wikipedia editors being this small, powerful in-group of people anyways…


Possibly to discredit it as having a political identity of its own or something, because a small group of people can be more openly biased than a large group?


Right, but this is Jimmy Wales pushing this.


Ego in his case, though modern forms of it are generally what the above throwaway said.


The small number of people who work on Wikipedia is its greatest weakness. Petty power plays and individual politics all bubble up from those few into a resource so many millions use. It deserves more but how can anyone even start with all the rules, bots and reverts. It's really demoralizing to try and put in any effort there as you'll quickly encroach on some editor's "territory".


There's wikipedia the product, and wikipedia the game. A lot of people really enjoy playing wikipedia. Even so, the product's really not that bad by most metrics.


> It deserves more but how can anyone even start with all the rules, bots and reverts.

That's why I prefer contributing to smaller wikis. They have far fewer political issues. However, I'd say more than 90% of the edits I've made to the English Wikipedia are still live.

It really depends where you choose to edit. Some articles just have asshole editors who think they own the article. And not everyone has the energy to fight back against that.


Is there a simple way to check if one's edits are live, so those that aren't can be reviewed?


You can filter article history by user [1] to check whether their contributions are still in the current version, or use Wikiblame creatively to find who wrote some specific part of a live article [2].

[1] https://tools.wmflabs.org/sigma/usersearch.py

[2] https://en.wikipedia.org/wiki/Wikipedia:WikiBlame

http://wikipedia.ramselehof.de/wikiblame.php


Incognito mode?


So you disagree with the numbers presented in the article?


He may accept the numbers in the article and still think that 14 years later things are different. Wikipedia has changed its policies in the meantime, and society has changed too.

He may also think that running the test on a more representative selection of articles would turn out differently.


How would a more representative test end up? Are there better numbers available?


I write/edit on Wikipedia most days. I've contributed to over 100 English articles, on topics ranging from science to music to art. In my experience, science/mathematics articles are VERY high level, often graduate-level content, and they can use some softening of jargon and domain-specific verbiage.

In the arts and music, many paintings, musical compositions, etc. don't have much written, or it's poorly written, by a non-English writer, etc.

I tried to write an article about my college a cappella group and it got rejected for not being famous enough. I thought that was a bit silly because my group was just as famous as random tiny towns in the middle of nowhere...shrug.

I'd encourage you all to join Wikipedia and make edits to anything you see that's amiss!


I miss Aaron so much


Twice this morning, in less than an hour I am hit with profound sadness when reminded of the injustice of Aaron’s treatment at the hands of a system we call “land of the free”.

I started the morning curious about a Markdown syntax question and was reminded that Aaron also contributed to md. [https://daringfireball.net/projects/markdown/]


I miss him, and I miss people like him. At one point he embodied "web" culture. Seems like a long time ago now.


> At one point he embodied "web" culture

And is now the polar opposite of anything you might consider "web culture" today.


2006, using data primarily from Wikipedia articles written before then, was a very long time ago. Things have changed a lot: https://arxiv.org/pdf/1407.0323.pdf


Small special wikis are very different from Wikipedia.


For those who haven't read the linked article, it states that unregistered and anonymous users create the vast majority of Wikipedia content; the registered editors mostly move things around, delete commas and the like.


Is there valid criticism on the accuracy of certain wikipedia articles?

It could also be interesting to have some study or writing on certain "edit wars" on certain controversial subjects on wikipedia.

In general, the accuracy of wikipedia is pretty good, and generally wikipedia is still very valuable.

I just wish Wikimedia would do more to promote its high-quality articles and bundle them by field to make quality textbooks on particular subjects. It also seems articles are not really indexed by category, which makes it hard to gather articles on a particular subject. Anyway, it would involve a lot of work.


Wikipedia does support article 'bundles' (they're known as Featured/Good Topics) but open textbooks are covered by a separate effort, namely Wikibooks.


R.I.P Aaron.


Wikipedia is written by a much smaller (from what I've seen) and far more cliquish group than in the old days. The novelty of editing an open encyclopedia has worn off, and an even larger majority now just visit for the questionable facts, while only the people with too much time on their hands still edit.

Coincidentally, there's even more territoriality than before, with people setting up fiefdoms on prime articles, and don't you dare flout their authority. It's worst, of course, on the politically relevant topics. A currently favored tactic is to front-load the very beginning of articles about organizations and people they don't like with negative or inflammatory information. For example, compare the current versions of Breitbart News, Conservapedia, One America News Network, and Stephen Miller with Daily Kos, RationalWiki, and Huffington Post.

The defense, if they're called on it, is a tortured appeal to 'authoritative consensus': an editor will go on a fishing expedition for negative quotes from the left-of-center media bloc, like CNN or HuffPo, and anything they find there, even blatant opinion, is automatically sacrosanct, regardless of whether it actually reflects a consensus among the entire media.

So basically the political articles are more trash than ever. Again, you can draw whatever connection you want to the type of people left editing this mess. I feel sorry for anyone who actually reads and believes it.

Another annoying thing is that they still haven't fixed their scientific articles, which for anything beyond the basics tend to be overly jargon-heavy and technical yet uninformative at the same time.



