Tag Archives: Love Data Week

Me, Myself and Data – Keren Limor-Waisberg

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to Keren Limor-Waisberg, Founder and CEO of the Scientific Literacy Tool, who advocates for open access, citizen science and scientific literacy.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

I help people from all walks of life access, understand, and/or use scientific data and literature concerning their scientific topics of interest.

Tell us how you think you can use data to make a difference in your field.

A scientifically literate society is a society in which people are empowered with knowledge they can use to achieve their different goals. As we look at data and understand it, we acquire skills that are essential for both our personal and our societal development.

How do you talk about your data to someone outside of academia?

When I talk about data with someone outside academia, I will take the time to define any new terminology and make sure we understand each other.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

The main data-related challenge non-academics have is access to data. Many datasets are simply not accessible to the public. Sadly, some datasets are not even available to other academics.

Once access is achieved, people will struggle with different formats, lack of metadata, different units and/or the lack of tools to process and analyse the data.

How do you think these challenges might be overcome?

The challenge of open data, making data accessible, is currently being addressed by many countries. The European Commission's Directorate-General for Communications Networks, Content and Technology (CNECT), for example, is formulating directives that aim to open up publicly funded research data in Europe and help people reuse it.

Different organisations are now developing tools and packages that will help people work with datasets.

If you were in charge what data-related rule would you introduce?

As a citizen of the World, I advocate for open access, citizen science and scientific literacy so as to promote the understanding that knowledge empowers both individuals and societies to develop and prosper. To make this progress, I think we need to agree on common ethical guidelines – from the right of access to the right of use of publicly funded data.

We are Data

Tell us about your happiest data moment.

My happiest data moment was during my PhD. I calculated the performance of some viral elements using different tests. I had a lot of data and it took a while for the scripts to run. It was nerve-racking. I can still remember sitting there listening to the screeching sounds of the computer. And then one by one I got the results, and they all confirmed my hypothesis. It was great. It was a small piece of scientific knowledge, but I was the first person in the world to know about it.

What advice do you have for someone who is just embarking on a career in your field?

For someone embarking on a career in the field of promoting scientific literacy, I would recommend being very patient. It is a slow process and there are many obstacles, but in the end it is a very rewarding profession.

What do you think the future of research data looks like?

I think that in the near future we will have much more publicly funded research data accessible. We will see more and more tools emerging to handle this data. More and more people will use this data and these tools to make their statements, to dispute ideas, to create products and services, to entertain, or perhaps just to enjoy finding something new.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

I think it is important to make sure privacy and identities are protected when data is collected and shared.

Published 15 February 2018
Written by Keren Limor-Waisberg
@TheLiteracyTool, @OpenResCam
Creative Commons License

Me, Myself and Data – Melissa Scarpate

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to Melissa Scarpate, Research Associate in the Faculty of Education, PEDAL Centre.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

I work with large longitudinal data sets and large cross-cultural data. I really enjoy running latent growth models with the longitudinal data to assess changes over time in my variables of interest (primarily child/adolescent self-regulation and parenting). I use the cross-cultural data to test for differences or similarities in parenting and adolescent developmental outcomes.

Tell us how you think you can use data to make a difference in your field.

By using large data sets that either have many time points or have many different countries and cultures represented, I am able to assess relationships between study variables in an impactful way. For instance, if I find that parental monitoring predicts lower levels of adolescent anxiety in 13,000 adolescents across 10 countries then I feel this information has a larger impact on families in a more global way than using a local data set with a small sample size.

How do you talk about your data to someone outside of academia?

Very carefully! I eliminate jargon and speak about the data in broad, general terms rather than getting into details. I would rather the person come away from our conversation with an understanding of what I do, what I have found in my research, and how this impacts society than to impress them with fancy words and statistics.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

I work in an office with others and it is important for each of us to keep the data confidential and out of sight of one another.

How do you think these challenges might be overcome?

Screen protectors, earplugs in the case of video coding, not printing any data, etc.

If you were in charge what data-related rule would you introduce?

If I were in charge of all data then I would create a rule that all data could be shared in an easy and collaborative way whilst maintaining study participants’ anonymity.

We are Data

Your happiest data moment?

When I finally got my latent growth model to run!

What advice do you have for someone who is just embarking on a career in your field?

Take as many classes in data management, methods, and statistics as you can, and while in graduate school get experience in these concepts with researchers who have excellent skills and training in these areas.

What do you think the future of research data looks like?

Open, transparent, simplified with data visualisation techniques, and impactful.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

Technological advances such as my phone being able to predict where I am driving before I leave and my Echo Dot/Alexa picking up all of my conversations make me nervous. So far, the benefits outweigh the potential negatives (at least in my experience).

Published 14 February 2018
Written by Melissa Scarpate

Me, Myself and Data – Kirsten Lamb

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to Kirsten Lamb, Deputy Librarian, Department of Engineering.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

I work with bibliometric data for the most part. I’m not very systematic and tend to be rather intuitive about how I gather and work with it in order to tease out insights. Because I don’t usually have a specific question to answer with the data, I tend to just explore to find out what is interesting about a body of literature.

Tell us how you think you can use data to make a difference in your field.

The idea that librarians can help define a research landscape and identify gaps is relatively new. I like to think that by learning to work with bibliometric data I can help researchers better engage with information professionals and give librarians the confidence to use their skills in the research context.

How do you talk about your data to someone outside of academia?

I tell them that by looking at patterns in publishing, researchers can see where the trends and gaps are and can exploit those patterns to have a larger impact. But I also make sure to point out that basing insights on metrics is flawed. You have to understand what each metric is and isn’t measuring. None of the metrics is an indication of the quality of an individual piece of research, and there’s no replacement for critical analysis to determine that.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

First, I don’t have a background in statistics or programming, so there’s a limit to how complex an analysis I can do. Second, the metrics themselves are limited, so communicating the value of the information embedded in the data is a challenge. Third, a lot of bibliometric software is based on use of a particular database’s API so it’s difficult to combine results from different databases to give a broader picture.

How do you think these challenges might be overcome?

All of them would be helped if I could learn to programme! Collaborating with someone who knew how to do the actual analysis bit would be great because that way I could provide the insights and figure out exactly what I wanted to measure and they could make it happen.

If you were in charge what data-related rule would you introduce?

People who write software that does data analysis would make it more user-friendly for people who don’t know how to code. Basically there’d be a WYSIWYG/Microsoft Excel-style programme for doing bibliometric analyses and generating beautiful graphics based on it that didn’t require any coding.

We are Data

Tell us about your happiest data moment.

I was pleased when I discovered that Web of Science does a lot of the analysis I wanted to do but thought I could only do if I had InCites or similar. As much as I like knowing what’s being measured and having an intimate knowledge of the data, sometimes it’s nice to just be able to click a few buttons and get a nice graph!

What advice do you have for someone who is just embarking on a career in your field?

I’d want to tell them that they don’t need to be a maths or programming whiz to do it, but I’m not yet convinced of that myself! I think the main thing is not to think of some metrics as good and some as bad. They’re all just tools and you need to know what they do in order to pick which ones you want to use. Always look under the bonnet!

What do you think the future of research data looks like?

While I’d love for it to be open, interoperable, integrated and well-indexed, I’m not sure that’s going to happen any time soon. Each time someone develops a new standard to rule them all, it just gets added to the growing list of standards.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

Yes. I don’t think that as a species we’ve really figured out what it means to live in a data-rich ecosystem, and I mean that both metaphorically and literally. The rate at which data is growing is currently unsustainable from the perspectives of preservation, legislation, interpretation and energy use. While I’m definitely uncomfortable with how much certain companies know about me, I’m more concerned with the fact that collecting and managing that amount of data about everyone and everything is bad for the planet and we haven’t figured out how to make sure it’s safe. We need legislation and curation to catch up with technology instead of lagging about a decade behind.

Published 13 February 2018
Written by Kirsten Lamb
@library_sphinx

Me, Myself and Data – Dr Sudhakaran Prabakaran

For Love Data Week (12th-16th February 2018) we are featuring data-related people. Today we talk to Dr Sudhakaran Prabakaran, Lecturer/Group Leader, Department of Genetics.

Telling Stories with Data

Let’s start with an easy one. What kind of data do you work with and what do you do with it?

We use population-level sequencing datasets from TCGA; mutation datasets from COSMIC, ClinVar and HGMD; and curated databases from other labs. We use discarded datasets, negative datasets, already published datasets, anything and everything. We develop and use structural genomics, mathematical modelling and machine learning tools to analyse mutations that map to noncoding regions of the human genome.

Tell us how you think you can use data to make a difference in your field.

We live on these datasets. Biological data is going to exceed 2.5 exabytes in the next two years, and the bottleneck is the analysis of these datasets. Our job is to find patterns in these datasets. Rare variants and driver mutations become significant and identifiable only when we look for them in a population context.

How do you talk about your data to someone outside of academia?

For us it is not difficult. The datasets we are using are generated and curated by governmental and international consortiums. They have done the bulk of the publicity. For example, the TCGA dataset has all kinds of data from thousands of cancer patients and is curated by the NIH. The power of this data is for all to see. I just say we try to aid in cancer diagnosis by crawling through these datasets to find patterns.

Connected Conversations

What data-related challenges do you have to deal with in your research environment?

We are happy with the publicly available datasets. Our problem starts with the datasets we collect. How to store and analyse them, and how to make them available for everyone to use, are the questions we are trying to answer all the time.

How do you think these challenges might be overcome?

I am an ardent proponent of cloud-storage and computation. I believe that is the future. I am also aware that some countries are concerned with data migration outside their geographical boundaries.

If you were in charge what data-related rule would you introduce?

I am not going to make up anything new. Past US Presidents have made laws to the effect that any data generated with public funds should be made available.

Governmental organisations should demystify cloud-based storage and computation processes. People are unduly worried. People are wilfully giving away more personal data on Facebook, Twitter and Instagram than through genome sequences collected by public consortiums.

We are Data

Tell us about your happiest data moment.

It is not one moment, it is a series of moments up until now. I can run a viable research program with no startup money or funds just by scavenging through publicly available datasets.

What advice do you have for someone who is just embarking on a career in your field?

Learn machine learning and cloud computing.

What do you think the future of research data looks like?

A lot more data analysis than data generation.

There is A LOT of data out there about all sorts of things and it is being collected all the time. Does anything frighten you about data?

I am in fact excited. I believe we need to train more data scientists. We are in good times. Data is becoming truly democratic!

Published 12 February 2018
Written by Dr Sudhakaran Prabakaran
@wk181