Word cloud

This blog will look at the strengths and weaknesses of word clouds. We will look at how helpful a tool they can be for historians, as well as for others who encounter them in their studies or leisure.

Word clouds have become a popular tool for identifying common words within many different sources. A word cloud gives the reader an idea of how frequently certain words occur within a particular text. The more common the word, the bigger it appears. Less common words are noticeably smaller, especially as you go down in frequency. This system of sizing is handy, and makes it simpler to tell frequent words from infrequent ones.
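To make the idea of size following frequency concrete, here is a minimal sketch in Python. It is not how any particular word cloud website works; the function name `word_sizes`, the font-size range and the sample sentence are all assumptions made for illustration. It simply counts each word and scales a font size linearly between a minimum and a maximum.

```python
from collections import Counter
import re


def word_sizes(text, min_size=10, max_size=60):
    """Map each word to a font size that grows with its frequency.

    Hypothetical helper: the size range and linear scaling are
    illustrative assumptions, not the method of any real tool.
    """
    # Lower-case the text and pull out runs of letters (and apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}

    low = min(counts.values())
    span = max(counts.values()) - low

    sizes = {}
    for word, n in counts.items():
        # If every word is equally common, give them all the largest size.
        scale = (n - low) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


# Made-up sample text, purely for demonstration.
sample = "king parliament king army king parliament tax"
sizes = word_sizes(sample)
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(word, size)
```

In this sample, ‘king’ appears most often and so receives the largest size, while ‘army’ and ‘tax’ appear once each and receive the smallest.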

For historians this tool can be useful, as it saves them the time and effort of reading through sources that may seem helpful but have little relevance to what they are researching. It allows historians to decide which sources are of more importance, and what could potentially be extracted from them. Another way in which the tool is useful is its simplicity. Historians are in some cases guilty of being easily put off by technology that proves difficult to understand. This tool is simple on the eye, and makes it easy to get a sense of what the source could contain.

On the other hand, this basic nature may well reduce its usefulness for other historians. The fact that only words are provided means that any real sense of context is lacking. Yes, the historian knows roughly what to expect, but does not really gain an understanding of how these words are being used. The randomness may not be as helpful as some think. Also, this tool has a chance of picking out a variety of random words which are meaningless. Words such as ‘the’ and ‘because’ are likely to be frequent, and therefore unhelpful since they serve no historical purpose. Sometimes a word cloud can be programmed to filter out these undesirable instances, but there is still a chance that unhelpful words will continue to feature. The chances of benefitting from this tool vary every time it is used, so patience and caution are required of historians who are thinking of using it.
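The filtering of words like ‘the’ and ‘because’ can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the `STOP_WORDS` list, the `top_words` function and the sample sentence are all made up for the example, and a real tool would use a much longer list.

```python
from collections import Counter
import re

# A tiny, hand-picked list of filler words (an assumption for this sketch;
# real stop-word lists are far longer).
STOP_WORDS = {"the", "and", "because", "a", "of", "to", "in", "was"}


def top_words(text, stop_words=STOP_WORDS, limit=5):
    """Return the most frequent words in text, ignoring stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    kept = [w for w in words if w not in stop_words]
    return Counter(kept).most_common(limit)


# Made-up sample sentence, purely for demonstration.
sample = "The king raised the tax because the army and the king needed pay"

unfiltered = top_words(sample, stop_words=set())
filtered = top_words(sample)
print(unfiltered[0])  # most frequent word with no filtering
print(filtered[0])    # most frequent word once filler words are removed
```

Without the filter, ‘the’ comes out on top; with it, ‘king’ does. Even so, words such as ‘needed’ still get through, which shows why a filter alone cannot guarantee that every word in the cloud is meaningful.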

Overall, it is fair to say that the helpfulness of this tool depends entirely on how and why it is being used. If a historian is using the tool as a basic guide to give them rough ideas on what to expect, then it may well be useful. However, if expectations are set higher, and historians use it to get a clear idea of a source’s contents, then problems may arise. This is because a word and its size alone do not provide context or explain anything. With this in mind, historians could use the tool, or decide to look elsewhere and see if other tools similar to word clouds can provide a better service.

Week 4- Programming and its uses for Historians

Programming in the context of this blog is the process of making the computer do tasks that you want it to do. By inputting instructions or coding, you are getting more out of the computer, as it helps you to develop your experiences with history. The usefulness programming has for historians will be considered, along with its strengths and its shortcomings.

Digital history has clearly risen in popularity over the last decade. Many historians, such as Adam Crymble and Daniel Cohen, have utilised the Internet in an attempt to expose people to different areas of history. Through blog posts and social outlets like Twitter, history now comes in more modernised forms, which cannot be all bad since it is made available to a much larger audience. As well as this, information is now being made more accessible online for historians themselves. In certain circumstances, academics can now gain access to information from archives and libraries without travelling to get it. They can also, if fortunate enough, gain access to more sensitive material online. Through the digitisation of sources, the process of analysing historical data is therefore made quicker and easier for the historian.

The transition from books to the Internet has been a debated topic amongst historians for a long time. Traditional historians favour history being portrayed through long, well thought out books which have been professionally reviewed. It is arguable that they prefer it this way because this process is simpler and more familiar to them. The idea of learning the ways of computing deters certain historians. An attitude that using computer programs is too complex, and takes too long to grasp, means that certain academics stick to what they know: the ‘old fashioned’, tried and tested way. The fact that programming is a skill which takes many months to develop means unwilling historians will either give up, or steer away from taking their history online.

On the other hand, for those who stick with their attempts to learn programming, the computer can turn out to be more of a friend than a foe, and help in the quest to present better work.[1] Persistent historians over the years have been able to use programming to their advantage, and present their findings in forms such as charts and spreadsheets. In different instances these forms bring out the best in historical information, and sometimes leave a lasting effect that words on a page are not capable of leaving. This is arguably a crucial aspect of ensuring that history is conveyed in the best possible way. Presenting historical information in modernised forms allows understanding of and interest in the subject to grow, whilst giving it a reputation for being enjoyable. Surely the goal of engaging as many people with history as possible needs to be a priority. If changing how history is presented ensures this, and if social networking allows for historical communication, then this process needs to occur to preserve history.

In response, traditional historians may find it hard to embrace modern ideas for digitising history. Some find the thought of replacing libraries implausible. Dan Cohen reiterates the argument that digitisation is not a replacement, but rather acts as a bigger library, one that extends history to a larger audience.[2] This effort to increase public interest in history is a goal that digital historians and traditional historians share.

There is no time like the present to look at which types of methods are working, and which types give history the best chance in the future. This may indicate why we see librarians and archivists showing greater interest in making digital forms of history more common.[3] If so, there is a chance that the computer will become the historian’s pencil in the next century.[4]

[1] Janine Noack, ‘Why historians should learn how to code (at least a bit)’ Doing History in Public, (2014) consulted 29/03/15

[2] Dan Cohen, ‘The Digital Public Library of America, Me and You.’ (2013) consulted 29/03/15

[3] Dan J. Cohen & Roy Rosenzweig, ‘Digital History: A Guide to Gathering, Preserving and Presenting the Past on the Web’ consulted 29/03/15

[4] Cohen & Rosenzweig, ‘Digital History: A Guide’ consulted 29/03/15

Week 2- Hoaxes & Lies

This blog post will focus on the Internet and its ability to provide reliable information. In particular, the Bracero History Archive will be analysed as an example of what we as users need to look for in the online sources that we decide to use.

During our studies, the Internet can provide us with a wealth of extra information that may be inaccessible in books. Whilst this may seem good for the work you are doing, the information could cause more harm than you think. When we read books, we have a greater assurance that the information is reliable, since the book has been reviewed by a publishing house. However, in the search for more information online, we run the risk of coming across unreliable data. This is because, simply put, the role of regulating information online belongs to everybody, and not just a group of professional reviewers.[1] Therefore, anybody can publish information on a topic, meaning that they have the power to present it accurately or falsely.

With this in mind, I was given the opportunity to review the Bracero History Archive website. The site was presented in a neat fashion, with relevant pictures, as well as an about page and many descriptions of the historians involved and their work. Also, the site had long lists of relevant sources in the bibliography section, and links to academic partners such as Texas University and the Smithsonian Institution’s National Museum of American History. This allows students to instantly become aware of other sources, as well as to build up a network around the historians mentioned. The site gives users a sense of direction as to where to find information after they leave and search elsewhere.

However, after further scrutiny, several things made me more cautious about using the website as a source. Most alarmingly, Bracero had links with George Mason University, the same university which had damaged its credibility by circulating a hoax story on pirates. Furthermore, it had a reputation for tricking people through the use of false information, usually on historical topics which were less well known. Admittedly, I had no prior knowledge of this subject before using the site, meaning that I was vulnerable to groups which create ‘official looking’ websites to spread misinformation.[2] The link could be a mistake, or this fact may simply have been unknown to the website creator. Even so, I came across other causes for concern, such as anybody having the ability to create an account and add notes. This opens the gateway for an uneducated or misinformed individual, or somebody with a biased view, to contribute information. Going further, these notes are not curated by the historians, meaning that they are not regularly checked or monitored.

Alongside the questionable standards of site regulation, I came across a broken link under the resources tab. This may be down to maintenance issues, but it still raises concerns, since the tab is an important part of the site for its users. Meanwhile, on the home page the site boasts about winning an award, but fails to explain in any detail what it took to receive the award.[3] These, as well as the previous instances, made me question the reliability of the website, especially since adding information to the site is not exclusive to academics.

Overall, I would be unlikely to use this website as a source. It is possible that this website provides good information, but allowing non-academics to contribute, and having links with a hoax website, should be enough to deter people from risking using this site.

[1] Joe Barker, ‘Evaluating Web Pages, Techniques to Apply & Questions to Ask’ UC Berkeley- Teaching Library Internet Workshops (1995) consulted 04/02/15

[2] Paul S. Piper, ‘Better Read That Again, Web Hoaxes and Misinformation’ vol 8 no.8 (2000) consulted 04/02/15

[3] Larry Johnson & Annette Lamb ‘Evaluating Internet Resources’ consulted 04/02/15

Crowdsource transcription critical reflection

Crowdsourcing is the process of obtaining or inputting information into a project by calling on the help of large numbers of people. These people do not necessarily have to be academics, but they tend to be enthusiasts. Crowdsourcing can be beneficial to everybody who is involved. Most noticeably, those with large collections that need editing, whether they be librarians or archivists, will get their collections edited without doing it themselves(1). Meanwhile, the people editing gain access to material that they usually would not be able to view, since it would be deemed material that requires an expert in the field to use it(2). Therefore, this process fulfills the needs of both parties. The scholars have their sources ‘cleaned up’ and completed for them, quickly and cheaply, without the stress of doing it themselves. Enthusiasts, on the other hand, are introduced to a whole new world. Their interests can be taken to new levels, whilst their knowledge and passion can grow at the same time. This process is a healthy one, and if enough interest is raised, it could encourage scholars to provide even greater access to sources.

For instance, Old Weather and Transcribe Bentham are crowdsourcing projects with rising popularity. Old Weather can partly put its success down to making its site interactive. Users are almost playing a game whilst they are transcribing, and whilst the tasks can be challenging, they remain interesting and are therefore likely to keep users going. Furthermore, there are moving pictures with snippets of factual information written below. This teases the users with interesting knowledge, and encourages them to engage with the site further. The site stays as simple as possible, and also reminds the users of how much they can help scholars. This reminder creates a sense of appreciation that may keep users coming back to the site.

The Bentham project has used other means to gain popularity. Wisely, social media networks like Facebook and Twitter were utilized to attract people to the site. Marketing was also successful elsewhere, with younger students familiar with the ‘Internet age’ being targeted(3). The website has also been modeled on an easy-to-use format, so users are unlikely to feel confused when transcribing. These factors allow the project to reach users who are likely to feel more positive about working with digital sources. The importance of overcoming the idea that digital sources are complicated should not be underestimated, since that idea is what deters some traditional historians.

Broadly speaking, projects need to ensure that their service can keep people satisfied, and persuade more to get involved. Keeping things simple is key to this persuasion. Christine Borgman suggests that researchers and scholars will not be attracted to contribute to projects until the online tools are made easy(4). This applies as well to non-academics who are enthusiastic, but unsure about digital sources. Their contribution may be a great help to scholars, but an easy-to-use and simpler presentation of sources may be needed to gain their help. If changes like these can be made, then the average citizen could feel empowered to contribute significantly to any subject.

(1), (2) & (3) ‘Building a Volunteer Community: Results and Findings from Transcribe Bentham’

(4) ‘Building Better Digital Humanities Tools: Towards broader audiences and user-centred designs’

Week 6- Oxford Knights mapping exercise

This blog entry will discuss the work I did during week 6 of my Digital History course. We focused on a dataset called ‘The Oxford Knights Archive (2014-15)’, which was created by Corey Alborne, Jack Dunne, Namiluko Indie and Bethany Reid. These were some of last year’s students who did this same course. The dataset contains information on the different students who graduated from Oxford University before the year 1715. The information was obtained from a published book called Alumni Oxonienses that was digitised by British History Online. The students worked together to extract the birth places of all of the graduates who went on to be given a knighthood. In class, we were given the task of importing this data into a Google Fusion Table that we made, and then making the changes that the lecturer highlighted.

What became noticeable early on was the variance in ambiguous percentages amongst us in the class. By this, I mean the percentage of results that Google ruled to be open to more than one interpretation. Because of this, these particular results were not a certainty. I myself had 7%, but others in my class had percentages varying from 1 to 7. So when we were importing our data into Google Fusion Tables, some of our findings looked slightly different. Although generally they all looked similar, we did not all get the same outcome from the same data.

After this, we used the information to see what we could conclude from a heat map. The heat map revealed where the graduates from Oxford were born, which would help us to judge whether these graduates were willing to travel, or keen to stay local. We found that graduates from Oxford mainly came from areas local to the university, but there were some who did travel from areas in the north west, such as Liverpool. Interestingly, there were not any graduates born in East Anglia, so we speculated as to whether Cambridge was more favourable to those in East Anglia due to it being closer than Oxford. In far away places like Exeter and Wales, very few if any came to Oxford, once again likely due to those in the area making do with their local universities. It is worth mentioning that this attitude is understandable, since during the time period travel would have been far more difficult and time consuming than it is now.

There were also mistakes which, although minor, would have an impact on the results. A man called Richard Breame was recorded as being born in Surrey, Vancouver, and not Surrey, England. Furthermore, there was another case where one graduate was mistakenly recorded as coming from the wrong part of Yorkshire. Although these errors are few and minor, they would still contribute to wrong results. Striving for accurate and reliable results is something prioritised by all historians, so mistakes like these becoming more common would not be an ideal scenario. However, this dataset, as well as Locating London’s Past, are initiatives which make geographical analysis possible and accessible.[1] Therefore, despite the mistakes, we must remain grateful.

Overall, however, the tools used would definitely be helpful to historians. Data visualisation allows historians to present information in a way which is more interesting than usual.[2] Using a fusion table gets quicker results, and saves the historian time. It prevents them from having to concentrate on each graduate specifically. Heat maps are also very handy when looking for patterns. Ours was revealing when looking at whether the location of the university had anything to do with where the graduates chose to study. As with most tools, though, caution must be taken. Mistakes with data can be made, and ambiguous percentages can reduce the reliability of the results that are produced.

[1] ‘Locating London’s Past: a geo-referencing tool for mapping historical and archaeological evidence, 1600-1800’ consulted 15/04/15

[2] Ben Schmidt ‘Data narratives and structural histories: Melville, Maury and American whaling’, Sapping Attention (2012) consulted 15/04/15

Week 1- Past, Present, Future…

Whether they like to admit it or not, Historians want their ideas read by as many people as possible. Those interested in the topic tune in, but others with a more casual interest require the Historian to make a greater effort. In this current generation of technology, the Internet is a place that appeals more to some, especially the casual reader. The quick and easy way of finding out online about World War 2 sounds more tempting than spending hours sifting through a tremendous number of books in the library.

Appealing to a different type of generation is a plunge some Historians are making, with success too. Getting over the negative connotations that accompany social media (e.g. Twitter & Facebook) means qualified experts are allowing historical debate to blossom in a newer form of communication. The interactive nature allows for the creation of different ideas, and these conversations could set the groundwork for serious thinking.[1] That said, it is likely that biased or uneducated people will undesirably engage in these conversations at times. This use of modern technology may lead to some Historians becoming victims of criticism within their field. However, simply put, it gives them and their work an advantage over those more traditional Historians who are criticizing them. This is due to the fact that opinions and job opportunities are commonly found on Twitter.[2]

The Historian has the ability to post whatever they want, whether that be an advertisement for their book, or a brief comment on a topic. To their advantage, they can also direct their audience to certain content, or even take the conversation to another site.[3] Their thoughts will then be published on a site used by millions of viewers. Obviously only a select few out of the millions will see them, but nevertheless this exposure should not be underestimated, since there is always a chance of more people seeing them. In comparison, those Historians who are not on social media do not have the same opportunity, and instead find themselves having academic conversations with the same circle. Likewise, they find themselves appealing to dedicated readers, rather than readers with the potential to get involved.

As with everything, the use of social media has its drawbacks for the Historian. Most obvious is the lack of space and words that they can utilize to start a topic. Limited word counts mean ideas cannot get a great deal of attention, and in the case of Twitter, a lack of characters means the Historian has to take the conversation elsewhere, or shorten it drastically. Along with the unproductive nature of social media for some, others are concerned with the audience. Although appealing to the casual reader is a good thing, not having enough room to explain things to them in words sometimes results in a poorer quality of understanding. This is one of the major concerns for those Historians who have not taken the ‘plunge’ into social media yet.

Overall, I think it is fair to conclude that social media for Historians is something with potential. Although it has its drawbacks, it gets more people involved with the subject, and can act as a catalyst to encourage further reading by other means. The success depends on how well the Historian goes about using social media outlets. If, for example, Twitter is used for advertising their ideas, and planting the seeds for a discussion elsewhere, then it is hard to see it failing. However, if their roots are abandoned in favour of solely posting online, they may run into trouble, and find themselves inadvertently damaging their credibility.

[1] Rachel Herrmann, ‘Twitter for Historians: Atop my Hobby-horse Twitter’ (2012) consulted 30/01/15

[2] Miriam Posner, Brian Croxall, ‘Creating your Web Presence: A Primer for Academics’, Chronicle of Higher Education (2011) consulted 30/01/15

[3] Heather Cox Richardson, ‘Should Historians Use Twitter? Part 1’ (2013) consulted 30/01/15