Wikimedia Conference 2011: Cultural Heritage, Commons and lots of Data. Pt. 1

November 8, 2011

On Saturday 5 November, the Wikimedia Foundation held a conference in Utrecht. I took the opportunity to attend and write this report about it. Because of its length I decided to split it into two parts. The first is mainly about cultural heritage and Creative Commons; the second is about Wikipedia itself.

The Wikimedia Foundation is a non-profit organization that oversees several open source projects dedicated to bringing free content to the world. Its most famous project is of course Wikipedia, but there are several other projects that deserve attention, such as Wiktionary and Wikimedia Commons, which was discussed a lot at the conference.

The conference opened with a speech by a man who introduced himself as the CEO of the Wikimedia Foundation and talked about the commercial successes it had achieved. These came from selling ad space on Wikipedia pages and from receiving sponsor money from, for example, Neelie Kroes in order to keep her page clean. It soon became clear that this was all a comedy act about everything that Wikimedia is not.

Jill Cousins – Europeana without Walls

After this little piece of comedy theater it was time for Jill Cousins, executive director of the Europeana project, to open the conference with a keynote. Cousins presented the current status of the project and its relation to the Wikimedia Foundation. Europeana’s goal is to digitize all of Europe’s heritage and to make it publicly available. Europeana aggregates objects and their metadata from institutions all over Europe. Here Cousins addressed the copyright problem. The goal is to release all the metadata collected by Europeana under CC0, a Creative Commons public domain dedication that allows commercial use by other parties. The institutions are quite anxious about this because they believe they will lose control of their material, and they fear a loss of income if others can use their content for free. However, as Cousins mentioned, without the possibility of commercial use the objects can barely be used, because the material cannot be embedded on sites that carry out commercial activities, such as showing ads. It also means that the objects cannot be used in Wikipedia articles, since Wikipedia’s rules require media content to be openly available, including for commercial use.
Europeana realizes that most of its objects are found not on its own portal website but on other sites that embed its content, so being able to work together with other sites is vital for the project.
Another issue for Europeana is the lack of good metadata (I also described this in my MA thesis, which can be found here). Good metadata is essential for making full use of the semantic possibilities of the web, especially with more than 15 million objects. Europeana has recently launched several projects and a handbook to encourage the institutions to fill in their metadata in a correct and uniform way. Cousins also noted that no matter what the status of the information is, the metadata should always be in the public domain.

After the plenary opening, visitors could choose from three different ‘tracks’. The first was purely focussed on cultural heritage, the second was about the world and data around the different Wikimedia projects, and the third, the ‘Incore Wikimedia track’, consisted of technical sessions for Wikipedia editors. Because of my focus on digital heritage and Europeana, I chose the first.

Maarten Dammers – GLAMWiki Overview

The first speaker was Maarten Dammers (@mdammers), a very active Wikimedia volunteer. He presented the GLAMwiki project, which stands for Galleries, Libraries, Archives, Museums & Wikimedia. The goal of this project is to build a bridge between these cultural institutions and Wikimedia. To achieve this, several projects were set up. The first project Maarten talked about was Wiki loves Art. In this project users were asked to go to museums, take pictures of art objects and upload them to the Wikimedia Commons image bank. Because these pictures are under a CC-BY-SA license, which allows commercial use, they can be embedded in Wikipedia pages about the artist or the object itself. By crowdsourcing these images and making a contest out of it, the image bank was quickly filled with thousands of pictures. Other Wikipedia users started to add metadata to the images and to place them in articles, which greatly enriched the Wikipedia pages.
In the second part of the presentation, Lodewijk Gelauff (@effeietsanders) joined Maarten to talk about Wiki loves Monuments, the successor of the Wiki loves Art project. Where the Art project was limited to the Netherlands, the monuments project focussed on monuments from all over Europe. By the time the project finished, it had resulted in 165,000 photos in the Wikimedia Commons image bank.

Maarten Zeinstra – The value of putting photo collections in the public domain

After a short break, Knowledgeland employee Maarten Zeinstra (@mzeinstra) presented the results of his research into the benefits for institutions of putting their (photo) collections in the public domain. Maarten analyzed 1,200 photos that were released by the Dutch National Archive. All pictures were of Dutch political party members from the history of the Netherlands. When these photos were put in the Wikimedia Commons image bank, Wikipedia users quickly started to add them to Wikipedia articles. As a result, the pictures of the National Archive gained a lot more attention and new metadata was added along the way. To analyze this, Maarten made use of different tools created by members of the Wikimedia community. Several of these tools can be very helpful when analyzing Wikipedia, also at an academic level.
What made this presentation interesting is that the analysis actually showed to what extent the materials put in the image bank are used. This information is extremely helpful for institutions that are in doubt about whether they should put their collection in the public domain. Maarten’s research also showed that materials are more likely to be used when a specific set is chosen. Maarten compared the collection of the National Archive with a much larger, uncategorized one from the Deutsche Fotothek. Of the 1,200 photos from the National Archive, 55% were used in Wikipedia articles in different languages. Of the Deutsche Fotothek collection, only around 3.5% was used. The main reason for this is that an uncategorized collection requires more effort from the Wikipedia editors to sort out. The full report can be found on the website of Images for the Future.
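To make the usage figures concrete, here is a minimal Python sketch of how such a usage share can be computed. The function name `usage_share` is my own and is not one of the tools Maarten used. The count of used photos is derived from the 55% figure above; the size of the Deutsche Fotothek collection is not given here, so it is left out.

```python
def usage_share(used: int, total: int) -> float:
    """Return the percentage of a collection that appears in at least one article."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


# National Archive figures from the text: 55% of 1200 photos were in use.
national_archive_total = 1200
national_archive_used = round(0.55 * national_archive_total)  # about 660 photos

share = usage_share(national_archive_used, national_archive_total)
print(f"{national_archive_used} of {national_archive_total} photos used ({share:.1f}%)")
```

The same function can be applied to any other collection once both the number of photos in use and the total size of the collection are known.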

Sebastiaan ter Burg – Creative Commons in practice

Sebastiaan ter Burg (@ter_burg) is a photographer who works independently. When he makes a photo report for a company he has one clear condition: all his work becomes, directly or with a delay, freely available under a CC-BY-SA license on his Flickr account. This means that all his work can be freely used and spread, even for commercial purposes. In his presentation, Sebastiaan talked about the benefits this way of working has for him. First of all, it saves him a lot of paperwork. In the ‘old’ way of making money with photos, an invoice and a contract are created for each picture that is sold. By releasing the material under a Creative Commons license, this is no longer necessary. Sebastiaan sets a fixed price for a photoshoot, so there is only one contract. The more important advantage is that his work is spread and used in all kinds of other media. He noted that he has a better income than most freelance photographers. It has to be noted, however, that Sebastiaan’s business model is not better than the old one per se. Quality is still the most important aspect when making money in the creative industry. It will, however, become harder for photographers who are not at the top to generate an income. When more photos are released under a Creative Commons license, fewer photographers are needed to report on an event: when a couple of photographers take good pictures, other media can use them. Sebastiaan’s presentation showed that a business model built on open data can work, which is a refreshing thought.

Johan Oomen – Open Images

Johan Oomen (@johanoomen) is the head of the research and development department at the Netherlands Institute for Sound and Vision. He presented Open Images, a sub-project of the Images for the Future project. This is a Dutch project that aims to digitize the Dutch audio-visual heritage and to make it publicly available under an open license. Oomen explained that ‘open’ has to be understood in its broadest meaning: open source, open media formats (ogg), open standards (html5) and open content (CC). This way of working stimulates the reuse and remixing of old content. The project will also work together with the Europeana project in order to make the content more easily accessible. The project will continue for two more years and will mainly focus on this reuse and on the possibilities of crowdsourcing new information, metadata and products.

Jan-Bart de Vreede – Wikiwijs, use of open content for education purposes

Jan-Bart de Vreede is active in the Wikiwijs project, which aims to let teachers use more open source content in the classroom. This content varies from images and videos to complete lessons. Different objects or parts of lessons can also be combined to create new lessons, as long as these are also shared under a Creative Commons license. To guarantee quality, educational institutes and teachers can add their opinion about the material. It was interesting to hear that the number one reason for teachers not to share their content is that they think their material is not good enough, which is rather strange when they have been using it themselves for years.

This is the end of part 1, which is all about presentations from the Cultural Heritage track of the conference. In part 2, presentations from the ‘Wiki-World’ track are discussed.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Netherlands license

Wikipedia and the Utopia of Openness: How Wikipedia Becomes Less Open to Improve its Quality

October 15, 2011

I found out today that I never posted my final paper for the Digital Methods of Internet Research course. During my year in the Master New Media at the UvA, this was one of the most interesting research projects I worked on. It received a final grade of 8.5, and I was also asked to present it at the Digital Methods Conference. In this blog post, I have put down the abstract and the method. If you find it interesting, the full paper can be found here under a CC-BY-SA license.

Abstract

Wikipedia has become an enormous source of information in the last decade. Because of its ubiquitous presence on the internet and the speed at which it is updated, it has become more than a reference. It becomes ‘a first rough draft of history’. In this study the changing politics of openness are analyzed. By looking at both small articles and one extremely popular article, the role of openness and transparency within Wikipedia is discussed. In this study I point out that in order to improve the quality of Wikipedia, it is sometimes necessary to limit the amount of openness, which is not a problem as long as the process remains completely transparent. At the same time, more transparency is needed to improve the smaller articles, which are often created by a single person.

Method

In this paper, I want to take a deeper look inside Wikipedia and the way its articles are created. Who is responsible for the content that can be found on Wikipedia? What is the consequence of the fact that ‘anyone can edit’ at any time, and how does one deal with a project that has become so incredibly large? In the first part I will point out how Wikipedia works. The basics of Wikipedia will be explained, followed by a more in-depth analysis of the politics of Wikipedia. By looking at the rules and regulations of Wikipedia, as well as how they are actually enforced by the community, I will point out how Wikipedia has managed to control such a large group of editors and create an encyclopedia of high quality instead of anarchic chaos.

In the second part, a closer look is taken at how an article is created and how it develops. Who creates the article? Is it a dedicated member of the community or an anonymous user who believes he can add something to the encyclopedia? It is also interesting to see what happens after the creation. How does the community respond and what kind of edits are made? A couple of articles are taken as case studies to answer these questions. They show that a user should look at the average Wikipedia article more critically. Since this is hard for the average, not so media-savvy Wikipedia user, Wikipedia should make this process of creation more transparent.

In the third part, a closer look is taken at articles that are subject to heavy editing. A deeper look into the Wikipedia article about Julian Assange makes clear how the community responds to a topic like this and what this means for the idea of the ‘open’ and collaboration.

From this analysis, I conclude that the role of Wikipedia has changed: it has become more than an encyclopedia, as it now functions as an up-to-date news source. This has implications for the openness of Wikipedia and other ideas from the early days. To make sure Wikipedia can remain, and increasingly become, a reliable source of information, transparency is the key.

Discussion

The fact that Wikipedia is becoming bigger every day, both in size and in its ubiquitous presence, makes it an important object of study. On a daily basis, millions of people use Wikipedia as a source of knowledge. The Wikipedia community is well aware of this and does its utmost to create articles of better quality. This is done not only by having both humans and bots check new edits, but also by creating new policies and guidelines. It seems that in the ten years of its existence, the ideology of the early days has been abandoned. Rules can in fact be made and changed, and the amount of openness can be reduced, as long as it benefits the quality of the content.

Wikipedia has developed from a small and open project into a huge bureaucracy. This has several implications. It has become harder to start editing Wikipedia: new users are often frustrated by the wall of bureaucracy they run into and are therefore discouraged from becoming Wikipedians. The consequence is that a shrinking group of people is shaping one of the biggest sources of knowledge. At the moment this does not affect the popular articles. As shown in the study of Julian Assange’s page, it is checked and discussed more than ever, despite the limited accessibility. It can, however, affect the quality of smaller articles, since more expertise is required, and it may also lead to more conflicts between editors.

The increasing bureaucracy has two effects. On the one hand it decreases the amount of transparency. Because of the enormous growth of the policies and guidelines, it becomes harder to grasp the basic rules of Wikipedia and to see why a decision is made. At the same time, the user can assume that the article is of better quality because the content that is actually in the article complies with all the rules. This, however, does not apply to articles where only one editor created all the content, because compliance with most of the rules has to be checked by other users. As this research has shown, the text created in less popular articles is usually not changed much after that. The only edits made were formatting changes or the addition of categories and inlinks.

Therefore, I suggest that Wikipedia must give more attention to how the specific article is created and make this visible to every visitor. This brings back the transparency that has always been so important and improves the knowledge of the reader. The article should show how many users created it. For example, a percentage at the top could show how much of the content of the article was written by the same person, together with how many edits were made altogether. This gives the user a better idea of whether an article is trustworthy and unbiased. By making the creation process even more transparent, it becomes easier for the user to decide how to approach the given information.
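A minimal Python sketch of such an indicator could look like the following. It is only an illustration under stated assumptions, not an existing Wikipedia feature or API: the `Revision` record and the `authorship_summary` function are hypothetical, and counting the characters an edit added is a crude proxy for authorship, since later edits may rewrite or remove earlier text.

```python
from collections import Counter
from typing import NamedTuple


class Revision(NamedTuple):
    editor: str
    chars_added: int  # hypothetical field: characters this edit contributed


def authorship_summary(revisions):
    """Return (top editor, their share of added text in percent, total edits)."""
    added = Counter()
    for rev in revisions:
        # Count only additions; removals and pure formatting edits add nothing.
        added[rev.editor] += max(rev.chars_added, 0)
    total = sum(added.values())
    if total == 0:
        return None, 0.0, len(revisions)
    top_editor, top_chars = added.most_common(1)[0]
    return top_editor, 100.0 * top_chars / total, len(revisions)


# Made-up edit history of a small article, for illustration only.
history = [
    Revision("Alice", 1800),  # creates the article
    Revision("Bob", 0),       # adds a category
    Revision("Carol", 200),   # adds a short paragraph
]
editor, share, edits = authorship_summary(history)
print(f"{share:.0f}% of the text was added by one editor; {edits} edits in total")
```

A banner at the top of an article could then show the share and the edit count, so a reader sees at a glance when an article is essentially the work of one person.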

It is up to Wikipedia as well as scholars to study better ways of indicating the quality of an article. With more than 3.5 million articles in the English-language Wikipedia, this cannot be done efficiently by the human contributors, whose numbers are slowly declining. New ways have to be found to automatically identify the quality of an article, as some researchers have already started discussing. This way, Wikipedia can indicate the quality of the article and show this to the user. This not only makes the user more aware of the fact that the content of Wikipedia is not perfect, it also makes it possible to automatically generate lists for the Wikipedians of articles that need to be checked for quality. It might even be possible to regulate the edit options automatically, giving more access when an article has proven to be of lower quality and so decreasing the amount of bureaucracy for starting editors.

This study has shown that Wikipedia has transformed since it was founded, becoming a more bureaucratic organization. This has several implications, mainly for the openness of Wikipedia. As pointed out, limiting openness can benefit the quality of Wikipedia, as long as the process remains completely transparent. Making less popular articles more transparent as well will not only improve the quality of the content, but also tell the reader how reliable an article is.