Big data is something we hear more and more about, and something I think we should all be paying attention to. The recent explosion in computing power has enabled the production of more, better and higher resolution data in various fields, opening up a lot of opportunities for ecologists while posing a whole new set of problems. But this does not mean we should shy away from it. Far from it – these tools could allow us to make better, more accurate and cheaper predictions across all fields of ecology (and other fields besides).
I have recently returned from ForestSat 2014, a conference aimed at showcasing, developing and exploring the potential of remote sensing for forest ecology. It also didn’t hurt that the conference was in Riva del Garda in Italy. We all know that if there is one thing the Italians excel at, it is food and wine. I have never had such delicious food at a conference – and incredibly tasty wine at lunch, let alone the poster sessions. It was an absolute treat. But the wine wasn’t the only reason I was there. The main reason was that remote sensing is opening up new fields for ecologists, yet still seems relatively underexplored as a tool. While two sessions did concentrate on how remote sensing is used for biodiversity monitoring and wildlife studies, what was clear is that there is currently a gap between the level of precision and detail with which it is used for estimating biomass and structure, and how it is being applied to ecological questions. This is a great opportunity for ecologists and remote sensors alike. Costs in terms of time and money can be greatly reduced compared to a total reliance on field data, and higher precision data can be produced. But of course, for this to be effective there needs to be collaboration between the remote sensing experts who understand the intricate details of processing and interpreting this data, and ecologists who can frame the questions in an ecological context and relate the outputs to field results. Often this data was not originally collected for a primarily ecological use, and compromises need to be made as a result.
But this is changing. Soon, the launch of the GEDI mission (whose starting remit is collecting data for biodiversity use using LIDAR) will open up a whole new and exciting range of options, along with other missions such as NISAR (a collaboration between NASA and ISRO, the Indian Space Research Organisation). And with open data options increasingly being considered, it seems there is no better time to start exploring this exciting technology.
Advances in next generation sequencing are also happening apace. Genetic studies are becoming cheaper and more accessible, and approachable protocols are being developed and shared, which opens these methods up to people less experienced with molecular techniques rather than just a few specialists. This generates huge amounts of potentially incredibly useful data, but the skills to actually manage this data are vital. Again, cross-disciplinary cooperation will be crucial, and a change in perspective on data ownership will likely have to follow.
Increasingly, projects are focusing on both regional or local and global scales. One such project is the PREDICTS database, which I was lucky enough to work on during my Master’s. Recent publications in high impact journals such as Proceedings of the Royal Society B, and upcoming in Ecology and Evolution, demonstrate the interest in and impact of these ambitious and far reaching projects, and what has been achieved is truly remarkable. But this would not be possible without a change in attitude towards data sharing: the precise and detailed database that has been developed would not exist without researchers donating their raw data. This is an important shift in my opinion. After all, should we consider that scientific data, especially data collected for biodiversity conservation and research, is solely ours? Or should there be a more open commitment that once you have published from that data, it is time to open it to the wider scientific community?
The most enjoyable and impressive part of ForestSat for me was the global representation. Nearly every continent had at least one representative, and the sense of a global community was strong. These are clearly exciting times for big data. Time for us ecologists to take full advantage.