Case studies 
These emails were sent to the ANDS Partners List and are copied here with permission from the senders.

From: Sam Searle
Sent: Friday, 4 May 2012 4:27 PM
Subject: Workshop activity for Library staff - give this a try!

Hi all - we recently had two successful workshops with 40 Library staff (mostly subject librarians and learning skills advisors) at the Clayton and Caulfield campuses of Monash.

I thought that I would share some scenarios (cut and pasted below) that we used in an interactive discussion session in one part of the workshop. Attendees were divided up roughly into subject area groupings (about 6-8 in each group) and were asked to read over a case study relating to a higher degree by research (HDR) student from that area and to identify at least one technical and one non-technical data management issue that the HDR student might face. After about 30 minutes for discussion, the groups were asked to summarise the case study for the other groups and to report back on the issues that had been identified.

The scenarios are not based on real examples, but represent an aggregate view of many of the real-life issues that Library staff have observed in email inquiries, consultations and interviews, or that have come up during the past three years of PhD seminars that have been attended by around 150 PhD students from all ten faculties at Monash. In particular, we have found that the scenarios highlight two things: use of third party data is far more common than you might expect, and many if not most projects use more than one method of data capture and analysis.

We had discussed the scenario idea a while back in relation to drawing out 'personas' based on our ANDS Seeding the Commons interviews. We thought this might be a way of discussing the issues raised in interviews while protecting the particulars of the research processes of our real-life interviewees. We had seen this idea of personas in some of the literature (it is common in web usability testing) but had not tried it out in this context before the recent workshop.

Another thing that we have tried recently and found very successful is focusing on dissemination first and then working back, which is why the scenarios all include details about the researchers' motivations and dissemination goals. This flips the 'lifecycle' idea around and the non-chronological approach seems to work better, not least because it's usually more fun to talk about research impact than about some of the drier rules and regulations!

This exercise generated a huge amount of discussion and was rated in many of the workshop evaluations as the most useful part. All groups were able to identify multiple data management issues in areas such as storage, file formats, software/hardware obsolescence, ownership, third party copyright, ethics and dissemination. Groups not only enjoyed doing the case studies in their own subject areas, but said that they found the scenarios from other subject areas very valuable too. The exercise is now being considered by some of our staff as a model for running more participatory workshops with other staff and students.

In my four years at Monash, this is by far the most successful activity run with Library staff. Everyone had a fantastic time with it and I'd encourage you to give this a whirl either with these examples or ones that you have generated yourself to suit your institution's situation.

Please let us know how it goes!


Scenario: Medicine

Louise is a new PhD student in the School of Public Health and Preventive Medicine. Her PhD topic relates to policy interventions to prevent the outbreak of infectious diseases like bird flu. She is interested in this topic because of her work as a policy analyst with the Victorian Department of Health and her background in volunteering in developing countries, and sees completing the PhD as a good way to further her policy career as well as her interests in social development.

Louise’s research will involve a number of field interviews with health workers and policy makers in Australia, Vietnam, Indonesia and China. She has an iPod and thought that she would use this to make audio recordings of the interviews, which she will later analyse (possibly using NVivo).

Louise also wants to access the policy documents of government agencies and health service providers (including hospitals) in Victoria and other jurisdictions in Australia and overseas. She thinks she will do some kind of content analysis on these, probably also using NVivo, for which Monash has a site licence. Some agencies freely provide these documents on their websites, while other agencies have internal documents that are not readily available to the general public, which she may have to approach the organisations for directly.

Louise wants to test her hypothesis that a speedy response from policy makers can reduce the spread of infectious diseases. This will require doing some cross-analysis of her findings from the policy documentation and interviews along with the World Health Organisation’s Cumulative number of confirmed human cases of avian influenza A(H5N1) dataset, which is available for download from the WHO website as a series of PDFs published monthly.

In doing her literature review there are a number of industry publications and academic journals that Louise has identified as potential places in which she might try to publish later. There are also some big international conferences coming up, and her supervisor has encouraged her to consider presenting her results at these.


Identify at least two data management issues that the student may need to consider if they are to avoid problems.

Try to identify at least one potential technical issue, and one potential non-technical issue.

Hint: You may want to start your discussion by thinking about what the student wants to do with their research at the end of the project and working your way back from there.


Scenario: Business and Economics

Gemma is about to start a PhD in the Faculty of Business and Economics. Gemma worked as a stockbroker in London for several years, but is increasingly interested in environmental issues. For her PhD, she wants to track the relative success of shares included in ‘ethical investment’ portfolios, compared to more general investments. She also wants to look at the newspaper coverage given to ethical investment in the financial sections of major Australian newspapers to see if it has grown at the same rate as the number of ethical products in the market has grown.

Gemma has already discovered that she can access ASX information through the Australian Equities Tick History database hosted by Sirca (a not-for-profit company limited by guarantee to host and manage ASX data for a small group of collaborating Australian universities, including Monash). This data goes back to 1991 but the most recent results can take several months to appear. The data is accessed via a web interface and the results that Gemma receives from her searches (which have a certain number of parameters) are put up on a server from where she can download them as a .csv file. The files only stay on Sirca’s server for a month - after that time they are deleted.

Gemma thinks she will probably only need Excel to do her analysis on the stock data - she got a copy of Microsoft Office installed on her laptop by eSolutions when she started working at Monash as a research assistant and plans to continue using that once she finishes her contract and starts the PhD full-time.

Gemma thinks that the best way to investigate the newspaper coverage would be to download the full text of lots of newspaper articles from the Library’s databases and then load these into a software program called Leximancer, which is designed for textual analysis of the kind she wants to do. This tool was developed by UQ researchers but has since been spun out into a small company. Gemma asked eSolutions about Leximancer but they said the tool is not supported at Monash because there are only a few users of it locally. Nevertheless, a friend of Gemma’s has found it so useful that he is paying the monthly subscription out of his own pocket and has recommended that Gemma do the same.

Gemma thought her project was going really well, but her supervisor recently suggested that it might be better if she focused on more than one national market, and has suggested that she should think about including other countries such as New Zealand, Denmark and Canada as part of her study.


Identify at least two data management issues that the student may need to consider if they are to avoid problems.

Try to identify at least one potential technical issue, and one potential non-technical issue.

Hint: You may want to start your discussion by thinking about what the student wants to do with their research at the end of the project and working your way back from there.


Scenario: Arts

Lachlan has recently started a PhD in the School of English, Communications and Performance Studies. He is interested in the history of circus arts in Australia, and developed this interest while doing paid and voluntary work as an arts administrator.

Lachlan will be doing archival research in state and city archives in Victoria, Adelaide and Brisbane. His supervisor has suggested that he use a digital camera to make copies of as much material as he can while doing his fieldwork in the archives, so that hopefully he will not have to do multiple trips to the different cities (his budget for the fieldwork is very limited). He will end up with hundreds, if not thousands, of images of archival documents, programs, posters, and photographs.

He also plans to interview present and past performers, administrators and Board members of a number of circus companies, and to document a number of performances using a digital video camera. Interviews will be analysed, possibly using NVivo software, for which Monash has a site licence.

Lachlan is an aspiring writer and would eventually like to publish a social and pictorial history of circus arts for a general, rather than academic, audience. If he cannot find a publisher prepared to publish this as a book, he might try to get the information out via a website or via his blog, which he also plans to use to promote the project while he is doing it. He has also been approached by the ABC to produce a radio documentary, and plans to use snippets from his interviews as part of this 1-hour show. He thinks the interviews might constitute an interesting oral history collection in their own right and wonders whether the National Library or State Library of Victoria or some other institution may be interested in having these at the end of the project.


Identify at least two data management issues that the student may need to consider if they are to avoid problems.

Try to identify at least one potential technical issue, and one potential non-technical issue.

Hint: You may want to start your discussion by thinking about what the student wants to do with their research at the end of the project and working your way back from there.

Scenario: Science and Engineering

Paul is just starting out on his PhD in Engineering. He is investigating the properties of certain metals in the context of more efficient car design. Paul is interested in pursuing a career as an academic researcher and is more interested in the fundamentals of surface science than he is in cars, but he was pleased to receive a scholarship from the car manufacturer that is supporting the research in the hope that the results will give it a competitive edge.

Paul is one of four PhD students using this project as the means of completing their PhD - they have the same supervisor, who is the Primary Chief Investigator on the ARC Linkage Project that the PhD students are all part of. Paul will be working with samples of various kinds of metals, which will undergo different treatments in the lab. Each student in the lab will be treating the same metals slightly differently and they will need to be able to compare results with each other. The treatment processes vary, and Paul’s is one of the most complex - it can take him up to a month to generate a very small number of samples.

The treated samples will be run through a scientific instrument that produces very large images and lots of them - one experiment might generate hundreds of images. This piece of scientific equipment is provided by a commercial supplier, who also licenses the software needed to perform the analysis and visualisation on the images. The machine has been in use in the department for a while and is pretty slow: there has been talk that it will be upgraded sometime soon, which everyone is really looking forward to as this will speed up the research.

The second stage of Paul’s research will be to model the effects on car efficiency of using metals that have received the treatments. The car manufacturer that is sponsoring his research has a computer model that they have developed themselves and want to validate. Paul will feed his lab-generated data into the models, producing new derived data that may point to design changes that the company could make to improve the efficiency of their vehicles.

It is likely that prototype cars made from the new materials might be produced as a result of this work, but this would probably not happen in the timeframe that Paul is doing his PhD (he is aiming to complete in 3 years, but the project has at least 5 years of funding). When he finishes, Paul thinks he will seek a post-doc in another institution, and try to further his work using the data that he has derived during his PhD, perhaps applying the findings to another area of transport manufacturing (e.g. high speed rail).


Identify at least two data management issues that the student may need to consider if they are to avoid problems.

Try to identify at least one potential technical issue, and one potential non-technical issue.

Hint: You may want to start your discussion by thinking about what the student wants to do with their research at the end of the project and working your way back from there.


Hello Sam and all

Thanks for sharing. These are wonderful case studies.

Your email reminded me of some scenarios we created for an early-career researcher training session last year. Participants were from a variety of disciplines, and as we were guest trainers we didn’t have access to data about their areas of research. As such we tried to create three very simple scenarios that we could ask a series of questions about. I’ve copied the questions and scenarios below for everyone’s use/interest.
If I were to run the training session again I would make some changes to the wording of the questions and the details of the different scenarios so that the exercise was more focused. For example, the first question sidetracked most groups. But the exercise was useful; in particular, the questions about IP and data archiving and disposal generated some good discussion. Like you, Sam, I really recommend this approach.

1. What data will be collected? Sourced or generated? Format? Collection method?
2. How will the data be organized?
3. What descriptive information, or ‘metadata’, will be included to understand the data?
4. Are there any intellectual property considerations? Who owns the data? Are there any legal or ethical issues?
5. What are the sustainability arrangements? How will the data be archived? What is the disposal plan? How can the data (the raw data, not results!) be shared?

Scenario One
Project title: Key drivers of corporate-charity sponsorship relationships and the impact of organisational values: a dyadic study
Project team: Dr Amy Booth (Chief Investigator), Caleb Detmar (Honours student)
This study will identify and contact Australian commercial enterprises and their not-for-profit charity partners that have been involved in a sponsorship relationship in the last twelve months. A contact person from the commercial business and the not-for-profit charity who have been involved in the sponsorship relationship will be asked to complete a web-based survey. Primary surveys will be issued to commercial participants and mirror surveys will be issued to charity participants. The surveys will investigate organisational compatibility on a relational level, focusing on characteristics of organisational values, and outcomes of the relationship including sponsorship satisfaction and intention to maintain the sponsorship relationship.

Scenario Two
Project title: A meteorological case study of 2010 Mount Lofty Ranges bushfires and WRF-fire simulation
Project team: Katy Lamond (PhD candidate)
This study will use Bureau of Meteorology data and other available data to investigate the influence of local meteorology on the fire behaviour observed during the three days of the 2010 Mount Lofty Ranges bushfires. The study will then simulate the fires using WRF-Fire: the Weather Research and Forecasting model coupled with a fire behaviour model, and output the simulations using a range of visualisation tools.

Scenario Three
Project title: Movement patterns of Underwater Drop Bears: a long-term monitoring study using acoustic telemetry
Project team: Dr Evan Furnell (Chief Investigator); Dr Gavin Ho and Ingrid Jurgen (industry partners from the Department of Mythical Creatures)
This study will track patterns of movement of Underwater Drop Bears in the Torrens River in South Australia over a five year period. Sensors will be attached to between 20 and 60 Underwater Drop Bears. The sensors will continually transmit observations via acoustic signal, including location, depth, and body temperature. Receivers will be deployed along the river and will record data whenever an Underwater Drop Bear with a sensor is in range. Data will be retrieved from the receivers on a monthly basis.


Cathy Miller
Research Data Project Officer
Barr Smith Library and IT Strategy and Architecture
The University of Adelaide, AUSTRALIA 5005
Ph: +61 8 8313 5069

Tue May 08, 2012 10:58 am
