Open Research Pilot project: the funder view

This is the sixth and final piece in our series of blogs to mark the end of the Open Research Pilot Project – a two-year initiative through which four University of Cambridge research groups worked with University Research Support and the Wellcome Trust’s Open Research Team to assess what would be needed to make all their research outputs openly available. In this blog, David Carr, Programme Manager for Open Research at Wellcome, provides his perspectives on the project.

START OF THE PROJECT

As a global research foundation dedicated to improving health for everyone through enabling great ideas to thrive, the Wellcome Trust has been a long-standing and passionate champion of open access to research publications and research data sharing. We are committed to ensuring that the outputs of the research we support can be accessed and used by bright minds around the world to help accelerate research and its application to improve health.

When we were approached by the team at Cambridge in 2016 with the idea for the Open Research Pilot, we were in the process of establishing a new dedicated Open Research team at Wellcome to help spearhead our work to advance openness. We were also developing a new policy on managing and sharing data, software and materials (subsequently published in July 2017), and had just launched Wellcome Open Research with F1000 as a new platform to enable our funded researchers to rapidly publish any research finding they wished to make available using a fully open and transparent review process.

We were quick to accept their proposal and join forces. The Pilot offered a chance to explore the opportunities and challenges facing our funded researchers in adopting open approaches, and the resources and support they require to do this. It also offered the potential to explore whether the Wellcome Open Research platform could help researchers in making their outputs available.

 

PROJECT IN PROGRESS

Our main input during the Pilot was through periodic update meetings with the research groups and Cambridge team to review progress and discuss emerging themes and issues. I found these discussions enormously valuable.

There is simply no substitute for hearing about and discussing the practical challenges and barriers to open science first-hand from our researchers. As the previous posts have described, these highlighted important issues around data size and complexity; incentives and recognition; skills and training; and the funding and sustainability of data resources. These conversations were very timely for us and have helped to inform the development of several aspects of our Open Research activity at Wellcome.

The Pilot also helped to build on our relationship with the research support team at Cambridge and provided valuable insights on the issues facing universities in supporting researchers to manage and share their research outputs. The expertise and dedication of the team to supporting the researchers with whom they work was hugely impressive, and they must take considerable credit for proactively initiating and taking forward this project. It is great to see that Cambridge has now formalized its support for open research at an institutional level through its recently published position statement, and I hope it continues to adopt a leadership role in this space.

While I think the Pilot was hugely worthwhile, there were inevitably challenges and lessons to learn. Personnel changes in the team at Cambridge caused some minor disruptions, and it is fair to say that, from Wellcome’s perspective, other priorities sometimes meant our contribution to and focus on the Pilot were less than they could have been. On reflection, there were probably missed opportunities to better align and link the Pilot with other activities at Wellcome.

We were a little disappointed that the groups involved didn’t utilize our new publishing platform, although delighted that the first Wellcome Open Research data note resulting from the Pilot has now arrived (as highlighted in the first blog in this series). We hope that other Wellcome-funded research groups at Cambridge will consider trying out Wellcome Open Research, and see if it adds value for them in rapidly and openly sharing their research findings.

 

LOOKING AHEAD

Wellcome is committed to supporting our researchers in adopting open research approaches in ways that maximise the value of research outputs and enrich the research enterprise.

Through our Open Research programme, we are taking forward a range of activities to support this goal, including:

  • providing the Research Enrichment – Open Research funding scheme to enable existing Wellcome grantholders to apply for additional funds to enhance the impact of their funded research through opening up their research outputs;
  • running the Open Research Fund – an annual competition to support cutting-edge, innovative approaches to open research around the world;
  • enhancing our support for researchers to manage and share data, including through an ongoing pilot with Springer Nature to make its Research Data Support service available to our funded researchers;
  • taking a lead as a funder in incentivizing open research and working with others to accelerate implementation of the San Francisco Declaration on Research Assessment (DORA);
  • refining our guidance on developing and funding output management plans as part of grant applications, and introducing new approaches;
  • continuing to develop the Wellcome Open Research platform as a venue for Wellcome researchers to share their research findings.

In addition, of course, we are also focusing on accelerating the global transition towards full and immediate open access to research publications with our partners in cOAlition S.

We are committed to continuing to work with our funded researchers and institutions to advance open research. We got a lot out of our participation in this Pilot, and look forward to continuing to work closely with our colleagues at Cambridge and across the institutions we support.

 

Published 13 March 2019

Written by David Carr, Programme Manager for Open Research at The Wellcome Trust

Creative Commons License

Open Research Pilot project: reflections from the Research Support Team

In the fifth blog in the series from the end of the Open Research Pilot, the project’s latest Research Support team, Georgina Cronin, Dr Debbie Hansen and Dr Lauren Cadwallader, reflect on their individual contributions and thoughts about the pilot. The team’s knowledge and skills include those related to open access and research data management as well as general research librarian and scholarly communication support.

Georgina Cronin

I became involved in the project during the initial launch preparation stage in October 2016. I had been in my post as the Research Support Librarian at the Betty & Gordon Moore Library for seven months by this point, and so felt that being involved with the pilot would not only allow me to assist researchers in developing good open research practices, but would also allow me to gain insights as a research support professional.

Once the launch and recruitment phase was completed, I was assigned Dr Laurent Gatto as the researcher that I would be supporting throughout the pilot. I was already familiar with Laurent and his commitment to open research as I had previously worked with him in organising the Cambridge offshoot of the OpenCon conference. I was grateful for the opportunity to learn more throughout this pilot about his work in proteomics and about how he uses open research practices to share his work.

However, after several meetings, it soon became clear that Laurent was less in need of research data management and open practices support from me, and more in need of facilitated opportunities to discuss issues surrounding open research, funder insights (especially from the Wellcome Trust), and tools to facilitate these discussions throughout the pilot project. Whilst I tried to enable this with the project management team – and some progress was made during group meetings – the unique needs of each research group within the pilot and the wider focus on using the Wellcome Trust’s Open Research platform meant that some of this discussion failed to get the traction that we had been seeking. Perhaps this was due to the fact that open research means different things to different people, and the unique context of a group’s research area and funding, plus their existing knowledge and priorities in practicing open science, determine whether such extensive discussions can take place.

As the project progressed, several members of the project management team left for roles in other institutions and so, in mid-2017, I took over supporting Dr David Savage’s research group which was looking at type 2 diabetes and insulin resistance. David had been excellently supported so far by my colleague Dr Marta Teperek and so there was little additional support that I could offer in the first instance other than offering further advice on the publication process and establishing a public-facing website for David’s research group.

Whilst I valued the opportunity to work more closely with two research groups as part of this pilot, there were many occasions when I felt out of my depth in the research support that I was able to offer and in the motivation that I was able to provide to my groups. Whilst open research as a concept was the main goal for the pilot, the intended steps towards achieving that goal were vague at times, and I think that this is shown in the feedback that both of my groups gave as part of their reflections on the project. They were both initially keen to be involved but, as the pilot progressed, their own motivations did not always align with the overall focus of the project. I think that we all learned a lot from one another, from the other research groups, and from Wellcome Trust, but the two-year duration of the pilot meant that maintaining interest and focus while also managing other demands on time was a challenge.

Dr Debbie Hansen

When I took on the role of Research Support to Ben Steventon’s group, the project had been running for over a year and there had already been two others in this role. My two predecessors had worked with Ben to explore and identify existing and potential repositories appropriate for his team’s imaging and tracking data and put him in touch with others in the University and field who would possibly be able to help. There is further information about the team’s challenges around the open sharing of their data in this blog.

I was familiar with the Open Research Pilot as I was involved in a secretarial capacity, and so was aware of the open data issues Ben’s team were facing. By this stage, it had become clear to Ben that an existing single solution to the problem was not available. However, through his and Wellcome Trust’s engagement with the project, Ben became acquainted with the various schemes Wellcome Trust were running in support of open research, and what they were looking for in submitted proposals. I was pleased when Ben approached me to review an application and was subsequently thrilled to hear that he had been successfully awarded Open Research Enrichment funding as a result of his application. It will be interesting to follow how this project develops towards the sharing of his team’s image data and related data.

For me, being part of the project has been an opportunity to find out about the range of issues related to open research, and how working openly requires the backing of government, funders and institutions. I have also learnt more about how ‘on-the-ground’ research support can help advise on open research policy and on particulars related to Open Access, research data management and research tools.

Dr Lauren Cadwallader

Upon starting this project, I wasn’t really sure what to expect. I was keen to help researchers with open research and find out what they needed from us, but as to the actual, practical day-to-day stuff – I wasn’t sure what to do. I was keen to work with the Jefferis group because they work collaboratively, and I was interested in how open research works in that kind of setting when your co-researchers aren’t necessarily bound by the same open research requirements as yourselves. I thought my involvement would be much more focused on this line – for example, helping with how to negotiate where to publish based on OA requirements, or on data ownership/authorship queries.

In reality, the advice and support I gave was much more locally focused. At the beginning we identified an immediate area of support I could help with: Electronic Lab Notebooks (ELNs). The group was already participating in the University-wide trial but I could help with providing further advice on the products, which I attempted to tailor to their specific needs. However, it was difficult to get a handle on their workflows to establish how an ELN might fit and therefore which solution would be most suitable. This was in part due to time – I was unable to fully immerse myself in their group and practices – and also to expertise. Had I been able to get immersed, I’m not sure I would necessarily have understood what was going on! At times I felt a bit out of my depth – I knew about open research in general but not in enough detail to advise on specific aspects, especially technical systems. It was also difficult to really understand what issues the group needed help with. I think in reality they were already quite ahead with open research; the fact that this pilot enabled them to have new conversations with funders was enough, but as I was expecting to do more for them, I felt a bit lost.

This feeling has been redeemed now we are at the end of the project. I’ve realised just how useful it has been for the group to have a way to talk about the funding of resources. I’ve also been able to link them up to our Data Champions so they can receive training. This is a really nice outcome for me as it is marrying up two projects we are running for the benefit of all involved. I think this is a nice example of how networks can really help facilitate open research across the University.

What do I think this means for the future of supporting open research in Cambridge? Firstly, that it can be difficult to offer really focused support if you are not fully integrated with the research group. I think that the embedded librarian model has been a lot more successful at this. Secondly, that groups don’t necessarily need this type of embedded support! Actually, just knowing who you can ask a question to, even if your question is then redirected to the right person, is very helpful. It was surprising for my group to learn that librarians offer this type of support, so I think making researchers more aware of the research support aspect of our roles is important. Finally, I think that having conversations around open research is really valuable – not only to get someone else’s perspective on it and how it relates to their discipline, but to also highlight issues and know that others are thinking about them and looking for solutions.

Open Research Pilot case studies: sharing all research outputs and future sustainability of data repositories

In this final blog from the researchers involved in the Open Research Pilot, the Jefferis group discuss their participation.  During this time, their open interests have been focused particularly on how to share all outputs from the research process, and on issues around sustainability of data repositories into the future. 

START OF PROJECT

Dr Greg Jefferis believes that sharing research outputs fully – including data and code – is essential to accelerate research, and he has himself benefited from others sharing unpublished research outputs during his career. He hopes to create a standard in the field of neural circuits / connectomics that echoes the very high standards of sharing amongst Drosophila researchers in the past. He, together with Matthias Landgraf (Department of Zoology), recently started a new group funded by the Wellcome Trust, focusing on Drosophila connectomics. This project is an international collaboration, and will generate outputs besides publications e.g. neuronal skeletons and analysis code, that will also have great significance for over 50 labs working in this area. With this in mind they are collaborating with Virtual Fly Brain (VFB), a Wellcome Trust-funded web resource that curates and disseminates Drosophila neuroscience data to make their research freely accessible. They already plan to release and share key data via VFB on publication, and already use practices that fit with the type of data they use and produce.

They joined the Open Research Pilot to look for ways to share additional data and interim results with collaborators, and to start a discussion on what resources are needed to maintain data repositories that are functional long-term and accessible to all.

PROJECT IN PROGRESS

As part of their effort to identify ways to easily share results with collaborators, and ultimately with their wider community, the Jefferis group also took part in the Electronic Lab Notebook (ELN) trial run by the University. Unfortunately, they did not find an ELN that worked better than the tools they already use. The trial process was, however, helpful in raising awareness in the group of the available ELNs, and in identifying the features and functionality that are the most important for the group members. Greg and his group will continue to follow the development of ELNs, as they might become relevant to them in the future.

The group became more aware of the support that the Office of Scholarly Communication (OSC) provides or can facilitate for researchers. Dr Marta Costa (the Project Lead for the group) and Greg felt that their group would benefit from some dedicated training on using GitHub, as this is an important tool in the group’s research practices. The OSC doesn’t run this training but was able to ask for help from the Data Champion community that they facilitate. Data Champions are volunteers from around the University who want to help foster good data management practices. Three members were happy to collaborate to provide the Jefferis group with the training and support they needed. This benefited the group members but also proved interesting for the OSC, who were able to understand more about how or where researchers were looking for support, as Marta and Greg were initially unaware of the Data Champions and the support they could offer.

Both Greg and Marta found it fruitful to engage with the Wellcome Open Research team. Specifically, they found discussions useful around how open research (including preparing, sharing and managing it) should be funded; this issue is very relevant to the subject-specific data repositories they use, such as VFB. As part of this discussion, Greg and Marta, along with Dr Lauren Cadwallader and Dr Dave Gerrard from Cambridge University Libraries and David Carr from the Wellcome Open Research Team, authored a series of three blog posts on this issue, each from a different point of view: resource, institution and funder.

Greg and Marta also found the pilot project thought-provoking because they were exposed to other research groups and their needs around open research. They found it interesting to see how the demands and practices for open research need to serve various types of data and outputs.

LOOKING AHEAD

Even prior to the pilot, Greg and Marta believed that open research should be the norm. Being involved in the project has, however, reinforced their view that researchers need more support to engage with and implement open research as standard practice. This support should take the form of funding, training and/or infrastructure, and should focus not only on targeting the end point of a research project, but on developing awareness and implementing an open research culture for new and existing students and staff.

From a research group point of view, Greg and Marta think it would be helpful to have training (online and/or in person) available for new staff and students. This training would need to be discipline-specific or specific to the group. In the case of the Jefferis group, for example, there is a need for training on writing code that can be shared and reused (similar to the training session organised during the pilot), and on data management.

From the point of view of resources such as VFB, the issue that still needs addressing is funding. There are currently no mechanisms in the UK or worldwide for the long-term support of resources that are used by an international community. In fact, some of the genomic resources funded in the US by NHGRI have recently seen their funding significantly reduced. Although the integration and curation of data that is integral to the work of groups such as this one increases data reuse and accelerates research, there is no current funding mechanism that recognises the added value of these resources. To add to the complexity of this issue, the users of these resources are international – not bound by country borders, but by research subject.

 

Published 28 February 2019

As told by the Jefferis group to the Open Research Pilot Research Support Team.


Open Research Pilot case studies: research integrity and reproducibility

This is the third blog in our series marking the end of the Open Research Pilot (a two-year initiative involving University of Cambridge research groups, University Research Support, and Wellcome Trust’s Open Research Team). Professor David Savage tells the Research Support Team about his concern with improving and ensuring research integrity and reproducibility.

START OF PROJECT

Professor Savage applied for the project as a direct result of two points mentioned in the original project call for participants:

  • ‘are you in favour of more transparency in research?’
  • ‘are you concerned about research reproducibility?’

He hoped to gain further insight into ways in which the appropriateness and authenticity of scientific research could be improved. He also hoped that through discussion, his group might be able to come up with new and – hopefully – effective ways to improve the current state of play in this respect.

PROJECT IN PROGRESS

At the start of the project, Professor Savage enjoyed the discussions about research integrity – his interest in this aspect was shared by a member of the project support team, Dr Marta Teperek. However, after Dr Teperek’s departure in June 2017, and as the project progressed, it became increasingly evident to Professor Savage that the project, whilst motivated by the ideals around transparency and reproducibility, was more directly focused on other more practical aspects of open research such as data sharing.

Although he appreciates that these practical considerations were a major motivation for the other participants, he felt he did not have as much to contribute or gain from this focus. Research in his field of human genetics has been conducted in an open way for many years: the benefits have been huge; for example, studies can be done on much larger populations than previously practical due to open data.

LOOKING AHEAD

Whilst his own views of open research have not changed much over the course of the project itself, what he did learn from the project is that the area of open research is generally of increasing interest and importance, and is likely to be mandated by funding bodies and academic institutions shortly.

As told to the Open Research Pilot Research Support Team

Open Research Pilot case studies: promoting greater openness in research

In the second blog in our series marking the end of the Open Research Pilot (a two-year initiative involving University of Cambridge research groups, University Research Support, and Wellcome Trust’s Open Research Team), Dr Laurent Gatto tells us about his group’s involvement with the project. His particular Open interests during this time have been how to influence the research community in general towards greater openness.

START OF PROJECT

Dr Gatto initially applied to be part of the project so he could learn about other researchers’ views on open research, and to contribute to and promote open research. In particular, he thought that participating in a project initiated by the Office of Scholarly Communication (OSC) at the University of Cambridge and the Wellcome Trust seemed a good opportunity to influence the UK research environment towards greater openness. His greatest hope was to promote – directly or indirectly – greater openness in research as widely as possible, thanks to the reputation of the project organisers.

PROJECT IN PROGRESS

While some participating groups may have progressed towards greater openness individually, it was Dr Gatto’s hope to achieve a wider impact, beyond those around the table. However, for him, the project has been arguably too contained for that and so he is unclear about what has been achieved overall.

All participants had interesting inputs and some were already well versed in open research. He thought that the project could therefore have been much more ambitious through making use of the collective wisdom and experience by being open and collaborative, such as through asking for input from the community at large, opening up the discussion channels, and when specific questions arose, asking experienced members from the open community for advice.

LOOKING AHEAD

Dr Gatto suggests that there are two types of support that researchers need:

Firstly, technical support, helping researchers to discover and use open research platforms. In a minority of cases, new platforms might need to be developed (for really massive data for example, or for distributed computing requirements), but for the vast majority of researchers, reasonable technical solutions and support are readily available on-line. Local, in-person support is helpful for providing a point of contact for face-to-face training, and for redirecting researchers to the right resources.

The second type of support needed should come from the institutions – senior academics, funders, etc. – to support researchers in being open and making them successful by being open. For example, funders are in a position to redefine priorities in research by promoting and funding researchers that demonstrate open and reproducible research. This type of support is something that has been generally missing in Dr Gatto’s experience; he believes the current priorities of senior management do not support the provision of adequate rewards for open researchers. This is the kind of support he would have needed as a researcher in Cambridge.

Dr Gatto welcomes the publication of peer review reports (signed or anonymous) and the promotion of pre-prints (including open, public review and discussion of pre-prints) as important current advances. He likes the Wellcome Trust’s recent call for Open Research Projects and the publication of all proposals. He thinks that such efforts promote open research throughout the community, across senior and early career researchers and students, demonstrating that openness is no longer merely an afterthought, but is becoming the default practice. He believes these measures will drive researchers to explore how to implement their research openly and explore technical solutions. Finally, he believes that educating under- and post-graduates about open research, either explicitly, or as part of other courses, should also lead to greater openness.

As told to the Open Research Pilot Research Support Team

Open Research Pilot case studies: sharing image data

January 2017 saw the launch of the Open Research Pilot Project. This two-year initiative comprised four volunteer University of Cambridge research groups, University Research Support, and Wellcome Trust’s Open Research Team. The aims of the project were to understand what is needed for researchers to make openly available, and be rewarded for, all outputs of the research process (e.g. along with traditional publications, other outputs include negative results, protocols, source code).

In the first of a series of blogs, Dr Ben Steventon talks to the research support team for the pilot about his group’s involvement with the project. His particular Open interests through this project have been the how, where and what of sharing very large image data files.

START OF PROJECT

Dr Ben Steventon applied to join the project because, while considerable advances have been made in the open sharing of sequencing data over the last ten years, the same cannot be said for the sharing of image data. He thinks that this is due to a few practical reasons that make it very time-consuming for researchers to get their imaging data ready for uploading to repositories.

Firstly, there is an increasing number of different imaging modalities available to researchers. While this is undoubtedly a very good thing for research, it does mean that it is very difficult to think of a standardized way to perform data annotation and descriptions of each image dataset. In itself, the fact that different labs will have different microscopy images of the same samples, or different samples with the same microscope is a good motivation for sharing image data in the first place. So much time is spent trying out different imaging set-ups at the start of a particular experiment and being able to access the trial-and-error periods of other research groups would be a major advantage to research. The process of going back over all the imaging data relating to a particular project or publication and annotating it in a way that is ready to share is a very time-intensive task. Perhaps the way forward would be to think of a way to integrate data collection with appropriate annotation as it is generated, so that the data would be ready to deposit at a click of a button. But how does one predict the specific repository at the start of a project that may last several years? In many cases there are no guarantees that data format and annotation templates will remain standard throughout this time.

A second barrier to data sharing is the overall size of imaging datasets, particularly now that image acquisition speed is increasing dramatically with new technologies such as light-sheet imaging. A single raw image dataset could exceed several terabytes. While some repositories are willing to take data of such size, it is not at all clear that researchers would be interested in starting with this level of the data. Many would be happy to start with already-processed imaging datasets, or even directly with the feature-extracted and analyzed data that come from them. How do we go about sharing data at multiple levels of analysis, when different repositories would be interested in only one or two of them? There has to be some centralized place for interested researchers to go to, and from there follow links to the various other places that host the data. Should such a website be hosted at the level of an individual lab, imaging facility, or should it be community based?

Ben saw the project as a chance to learn more from people within Cambridge University Libraries about the local resources available, and how they might be able to support the specific challenges outlined above. He was also interested to hear more from Wellcome Trust about new ways to share research data in a more general sense, in the hope that this would sharpen up his understanding of the challenges facing the open sharing of imaging data and how these might be overcome.

PROJECT IN PROGRESS

Through the interactions with the library, Ben was introduced to a number of different repositories in the local area that might be able to host the data being generated in the lab. These interactions have focused his attention on thinking about a) how to build an integrated website that will allow researchers to link out to these various repositories and b) how to use tools such as electronic lab books to keep track of experimental and imaging information so that the required data annotation can be streamlined at the point of data deposition. Through interactions with Wellcome Trust, he has learnt of their researcher enhancement scheme for open research funding, and crucially about the specific things that they are looking for in such proposals. He feels in a much better position to take on the challenges outlined above and thinks that his team can make progress from this point on.

Recently, Ben has published a paper in Development that in part rests on results gained from a modified light-sheet microscope in the Cambridge Advanced Imaging Centre. Keen to share this data with a broader audience, he made use of both the Image Data Repository to share the raw data and a Wellcome Open Research Data Note to describe the data itself directly, taking it out of the context of the research article in Development. Making use of the Wellcome Open Research platform to publish data notes alongside resources such as IDR promises to be an effective way of gaining increased feedback and audience for the data, as a series of additional reviewers are invited to peer-review the manuscript openly online. One difficulty of this approach was in coordinating the acceptance and publication of the data note and data with the research article. However, Development was very helpful in this respect as it was willing to add in the data sharing descriptions and citations as a correction post-publication.

The biggest surprise to Ben has been the degree of interest from Cambridge University Libraries and funders such as Wellcome Trust in trying to support Open Research in a broad sense. He observes that it seems to be a priority area for both groups, with researchers actually lagging behind a little in thinking about the problems associated with making this happen for the specific aspects relating to their research: clearly a lot more needs to be done to get these groups of people interacting more.

Over the course of the project, Ben has become much more aware of the specific problems involved in the open sharing of his research. He realises that the issues are going to be very field-specific, but at the same time there will always be shared aspects between different fields. One positive aspect of the pilot was having researchers from these different fields together in the same room.

LOOKING AHEAD

The principal thing Ben needs to get going is a person in the lab with the appropriate skills and the time to focus on developing the framework required to have a sustainable system in place to share the team’s research outputs. Fortunately, this has now been provided by enhancement funding from Wellcome Trust. The specific issues of locating repositories and obtaining the appropriate support have been addressed through his interactions with Cambridge University Libraries staff. He very much hopes that this relationship can continue.

Furthermore, he is very interested in working with the Cambridge Advanced Imaging Centre in the hope that the framework that is developed in his lab can be expanded to other users of the facility. Having Open Research practices at the level of initial data acquisition and processing could be a very interesting way to move forward.

As told to the Open Research Pilot Research Support Team

Me, Myself and Data – David Marshall

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to David Marshall, FutureLib Project Coordinator, University Library.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

I deal primarily with qualitative data collected through working with people, gathered in a number of different ways. Research methods tend to include things such as in-depth interviews, observation and shadowing, diary studies, as well as various remote data capture mechanisms. I am usually looking for a mix of attitudinal and behavioural data; comparing what people say with what they actually do. I use the insights and findings arrived at through the analysis of this data to make recommendations for service design and delivery on behalf of Cambridge University libraries.

Tell us how you think you can use data to make a difference in your field.

I am an advocate for the importance of using research and research data to understand the wider lives of people who use a product or service. This has long been an established principle in service design and delivery in the commercial sector, and libraries in UK Higher Education are learning to adopt this in order to tailor their services to the approaches, goals, needs and behaviours of their users. The data I work with often highlights aspects of the study and research lives of Cambridge students and academic staff which it would be difficult to fully uncover and explore through more ‘traditional’, quantitative methods, such as usage statistics and surveys. This in-depth, qualitative study of people provides valuable insights which can be used to inform the development of services and working practices that affect those people.

My ‘field’ is working within and for University of Cambridge library services; slightly oddly I am often conducting research, with researchers as the subjects of that research, with the aim of developing services that support research!

How do you talk about your data to someone outside of academia?

I’m going to turn this one on its head as, although I work within academia, I’m not involved in what would typically be described as academic research. I tend to refer to what I do as design research, i.e. with the end goal of using the data gathered and insights arrived at to inform service design and delivery. I often talk of ‘stealing’ methods from academic disciplines and areas such as anthropology and ethnography, and from the commercial design world. This can involve immersive research techniques such as ethnographic observation, or quick, easily-deployed techniques such as card sorting exercises and ad-hoc interviews. In terms of the data itself, I often talk of patterns emerging and insights developing. Immersing myself in the data over the course of its collection, through activities and tasks such as transcription, and again through the analysis process helps things to ‘take root’ and for these patterns and insights to become more clear.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

Collecting the data in the first place is one of the biggest challenges for me. To do my work I need data from real people, leading real, busy lives. Finding and connecting with the people I need to work with is a constant challenge. Happily, the need to go to people where they are and work around their schedules in fact leads to better data; I would much rather talk to people about their studies, research, or other aspects of their lives when they are in the middle of doing those things! Another challenge is finding the right tool for the job; over the years I have been lucky enough to work with and learn from people who have extensive experience in a wide variety of research techniques, but it can still be tricky to match the appropriate method/s to a specific question or area of study.

To add a more data-related challenge to the list of data-related challenges…: I deal with a lot of personal data, not just names and demographic information but a large amount of qualitative data gathered from individuals about their lives; their goals, motivations, points of frustration, and so on. This leads to challenges in terms of how data is collected, stored and used, even how it is considered during the analysis process.

How do you think these challenges might be overcome?

For my first points: the old Carnegie Hall adage… practice, practice, practice! Relationship building and communication is a huge part of what I do day-to-day; each time I need to find research participants it becomes a little easier due to the continuous work done in this regard over the years I have been working in my role.

For the latter: I think appropriate awareness is part of the battle. Working with research data, particularly that gained from working with people, demands high levels of awareness and an emphasis on reflection, and so it should! It is important to see qualitative data in context, for many reasons, and to be constantly aware of the ethical implications of its analysis and use.

If you were in charge what data-related rule would you introduce?

That every person I’m interested in finding out more about needs to supply me with it tout de suite, please and thank you. No, that might be going too far…!

Without being specific, anything which increases the transparency of what will happen to data after it has been gathered is a good thing. I rarely struggle to get people to consent to participating in research once I have found and approached them, and am as transparent as I can be about why I need their data, what I’m planning to do with it, and where it will end up. Maybe I’m blessed by the context within which I work, and might be slightly naïve, but I can’t help but think that on any scale and in any circumstance this emphasis on transparency might be quite a useful thing.

We are Data

Tell us about your happiest data moment.

Around two years ago we (Futurelib) finished the data gathering phase of a project, Protolib, looking at the design of physical study spaces. We had prototyped different study spaces based on the findings of a collaborative design process conducted with Cambridge students and researchers. We conducted hours (as in 300+ hours…) of observation in these prototype spaces, and gathered data in various other ways, such as interviews with people leaving the spaces, feedback walls, comment cards and questionnaires. The first thing we did as researchers after this was to brainstorm the insights we had arrived at from this work. To see themes and ideas emerging so quickly, and to see them backed up and added to by the research data, was amazingly fulfilling. This is what ‘sold’ me on the value of ethnographic techniques; we had immersed ourselves so fully in the environments under study that we understood them to an extent which I would not have previously thought possible.

What advice do you have for someone who is just embarking on a career in your field?

Want to learn. Get interested in people; who they are, how they think and what they do. I don’t much like the idea of the cold, disinterested researcher. Whilst being aware of your own potential biases, and biases based on what you learn and uncover, care about the people you are working with and try to empathise as far as is possible with what is important to them. If you don’t like talking to people and finding out about the way they work, this is possibly not quite the right job for you. Of course, there are areas of research in which disinterestedness is probably a very valuable characteristic; I just don’t think this applies to what I do.

What do you think the future of research data looks like?

Speaking about the context within which I work day-to-day, I think the future looks bright! Libraries and HE institutions are becoming increasingly interested in finding out more about the people their services support. In my area of work, usage statistics, quantitative survey mechanisms and other similar methods will always provide the broad strokes, and this is great. It is, however, absolutely not where gathering data should stop. I cannot over-emphasise the value of qualitative approaches and qualitative data in providing actionable evidence for service design.

In terms of the future of research data more generally, I don’t feel too qualified to comment… I would tentatively assume and hope that data will become more accessible, less owned, more malleable, and through this invite more discussion, criticism and conversation.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

From a personal standpoint, potentially losing focus. It is almost the modus operandi of what I do to collect as much data as possible about the lives of people studying and working at the University of Cambridge, so I do feel that I collect ‘A LOT’. I sometimes wonder about the nature of the data I gather, as I’m keen to emphasise with participants that I’m interested in all aspects of the ways in which they work, and more widely, the ways in which they live. This does, on occasion, lead to people sharing quite personal aspects of their lives. There are obvious concerns around how this data is handled and used, but, as mentioned previously, I feel that an appropriate level of awareness and diligence in this regard is a good starting point for working with this kind of data in a sensible, conscientious way.

Published 16 February 2018
Written by David Marshall
@futurelib
Creative Commons License

Me, Myself and Data – Keren Limor-Waisberg

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to Keren Limor-Waisberg, Founder and CEO of the Scientific Literacy Tool, who advocates for open access, citizen science and scientific literacy.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

I help people from all walks of life access, understand, and/or use scientific data and literature concerning their scientific topics of interest.

Tell us how you think you can use data to make a difference in your field.

A scientifically literate society is a society in which people are empowered with knowledge they can use to achieve their different goals. As we look at data and understand it, we acquire skills that are essential for both our personal and our societal development.

How do you talk about your data to someone outside of academia?

When I talk about data with someone outside academia, I will take the time to define any new terminology and make sure we understand each other.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

The main data-related challenge non-academics have is the access to data. Many datasets are simply not accessible to the public. Sadly, some datasets are not even available for other academics.

Once access is achieved, people will struggle with different formats, lack of metadata, different units and/or the lack of tools to process and analyse the data.

How do you think these challenges might be overcome?

The challenge of open data, making data accessible, is currently being addressed by many countries. The European Commission’s Directorate-General for Communications Networks, Content and Technology (DG CNECT), for example, is formulating directives that aim to open up and help reuse publicly funded research data in Europe.

Different organisations are now developing tools and packages that will help people work with datasets.

If you were in charge what data-related rule would you introduce?

As a citizen of the World, I advocate for open access, citizen science and scientific literacy so as to promote the understanding that knowledge empowers both individuals and societies to develop and prosper. To make this progress, I think we need to agree on common ethical guidelines – from the right of access to the right of use of publicly funded data.

We are Data

Tell us about your happiest data moment.

My happiest data moment was during my PhD. I calculated the performance of some viral elements using different tests. I had a lot of data and it took a while for the scripts to run. It was nerve-racking. I can still remember sitting there listening to the screeching sounds of the computer. And then one by one I got the results, and they all confirmed my hypothesis. It was great. It was a small piece of scientific knowledge, but I was the first person in the world to know about it.

What advice do you have for someone who is just embarking on a career in your field?

For someone embarking on a career in the field of promoting scientific literacy, I would recommend being very patient. It is a slow process and there are many obstacles, but in the end it is a very rewarding profession.

What do you think the future of research data looks like?

I think that in the near future we will have much more publicly funded research data accessible. We will see more and more tools emerging to handle this data. More and more people will use this data and tools to make their statements, to dispute ideas, to create products and services, to entertain, or perhaps just to enjoy finding something new.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

I think it is important to make sure privacy and identities are protected when data is collected and shared.

Published 15 February 2018
Written by Keren Limor-Waisberg
@TheLiteracyTool, @OpenResCam

Me, Myself and Data – Melissa Scarpate

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to Melissa Scarpate, Research Associate in the Faculty of Education, PEDAL Centre.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

I work with large longitudinal data sets and large cross-cultural data. I really enjoy running latent growth models with the longitudinal data to assess changes over time in my variables of interest (primarily child/adolescent self-regulation and parenting). I use the cross-cultural data to test for differences or similarities in parenting and adolescent developmental outcomes.

Tell us how you think you can use data to make a difference in your field.

By using large data sets that either have many time points or have many different countries and cultures represented, I am able to assess relationships between study variables in an impactful way. For instance, if I find that parental monitoring predicts lower levels of adolescent anxiety in 13,000 adolescents across 10 countries then I feel this information has a larger impact on families in a more global way than using a local data set with a small sample size.

How do you talk about your data to someone outside of academia?

Very carefully! I eliminate jargon and speak about the data in broad, general terms rather than getting into details. I would rather the person come away from our conversation with an understanding of what I do, what I have found in my research, and how this impacts society than to impress them with fancy words and statistics.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

I work in an office with others and it is important for each of us to keep the data confidential and out of sight of one another.

How do you think these challenges might be overcome?

Screen protectors, earplugs in the case of video coding, not printing any data, etc.

If you were in charge what data-related rule would you introduce?

If I were in charge of all data then I would create a rule that all data could be shared in an easy and collaborative way whilst maintaining study participants’ anonymity.

We are Data

Your happiest data moment?

When I finally got my latent growth model to run!

What advice do you have for someone who is just embarking on a career in your field?

Take as many classes in data management, methods, and statistics as you can, and while in graduate school get experience in these areas with researchers who have excellent skills and training in them.

What do you think the future of research data looks like?

Open, transparent, simplified with data visualisation techniques, and impactful.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

Technological advances such as my phone being able to predict where I am driving before I leave and my Echo Dot/Alexa picking up all of my conversations make me nervous. The benefits outweigh the potential negatives (at least as I have experienced so far).

Published 14 February 2018
Written by Melissa Scarpate

Me, Myself and Data – Kirsten Lamb

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to Kirsten Lamb, Deputy Librarian, Department of Engineering.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

I work with bibliometric data for the most part. I’m not very systematic and tend to be rather intuitive about how I gather and work with it in order to tease out insights. Because I don’t usually have a specific question to get out of the data I tend to just explore to find out what is interesting about a body of literature.

Tell us how you think you can use data to make a difference in your field.

The idea that librarians can help define a research landscape and identify gaps is relatively new. I like to think that by learning to work with bibliometric data I can help researchers better engage with information professionals and give librarians the confidence to use their skills in the research context.

How do you talk about your data to someone outside of academia?

I tell them that by looking at patterns in publishing researchers can see where trends and gaps are, as well as exploiting those patterns to have a larger impact. But I also make sure to point out the fact that basing insights off of metrics is flawed. You have to understand what each metric is and isn’t measuring. None of the metrics are an indication of the quality of an individual piece of research and there’s no replacement for critical analysis to determine that.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

First, I don’t have a background in statistics or programming, so there’s a limit to how complex an analysis I can do. Second, the metrics themselves are limited, so communicating the value of the information embedded in the data is a challenge. Third, a lot of bibliometric software is based on use of a particular database’s API so it’s difficult to combine results from different databases to give a broader picture.

How do you think these challenges might be overcome?

All of them would be helped if I could learn to programme! Collaborating with someone who knew how to do the actual analysis bit would be great because that way I could provide the insights and figure out exactly what I wanted to measure and they could make it happen.

If you were in charge what data-related rule would you introduce?

People who write software that does data analysis would make it more user-friendly for people who don’t know how to code. Basically there’d be a WYSIWYG/Microsoft Excel-style programme for doing bibliometric analyses and generating beautiful graphics based on it that didn’t require any coding.

We are Data

Tell us about your happiest data moment.

I was pleased when I discovered that Web of Science does a lot of the analysis I wanted to do but thought I could only do if I had InCites or similar. As much as I like knowing what’s being measured and having an intimate knowledge of the data, sometimes it’s nice to just be able to click a few buttons and get a nice graph!

What advice do you have for someone who is just embarking on a career in your field?

I’d want to tell them that they don’t need to be a maths or programming whiz to do it, but I’m not yet convinced of that myself! I think the main thing is not to think of some metrics as good and some as bad. They’re all just tools and you need to know what they do in order to pick which ones you want to use. Always look under the bonnet!

What do you think the future of research data looks like?

While I’d love for it to be open, interoperable, integrated and well-indexed, I’m not sure that’s going to happen any time soon. Each time someone develops a new standard to rule them all, it just gets added to the growing list of standards.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

Yes. I don’t think that as a species we’ve really figured out what it means to live in a data-rich ecosystem, and I mean that both metaphorically and literally. The rate at which data is growing is currently unsustainable from the perspectives of preservation, legislation, interpretation and energy use. While I’m definitely uncomfortable with how much certain companies know about me, I’m more concerned with the fact that collecting and managing that amount of data about everyone and everything is bad for the planet and we haven’t figured out how to make sure it’s safe. We need legislation and curation to catch up with technology instead of lagging about a decade behind.

Published 13 February 2018
Written by Kirsten Lamb
@library_sphinx