Post provided by Valerie Steen

Every year Methods in Ecology and Evolution awards the Robert May Prize to the best paper published in the journal by an author at the start of their career. Ten Early Career Researchers made the shortlist for this year’s prize, including Valerie Steen, a postdoctoral researcher at Oregon State University in the USA. In this interview, Valerie shares insights on her article ‘Spatial thinning and class balancing: Key choices lead to variation in the performance of species distribution models with citizen science data’.

Tell us about your career stage, what you do, your hobbies and interests

I am a postdoctoral fellow at Oregon State University. I work on tools for better forecasting of species distributions and for detecting population declines from noisy data. Currently, I am researching the ability of multi-state occupancy models to detect population declines when animals have low detectability and counts contain additional heterogeneity.

My hobbies and interests include salsa / bachata dancing, authentic relationship practices, and outdoor experiences through walking, backpacking, sailing, running, and Nordic skiing.

How would you pitch your article to someone if you only had 30 seconds in an elevator?

We looked at rule sets for sub-sampling citizen science data to create an optimal data set for modeling species distributions. The main considerations are whether to use all the data, to keep all the presence data, to spatially thin the data to remove possible spatial biases, and/or to balance the presence and absence classes.

We looked at 102 breeding bird species in the Northeastern US and considered whether the rarity of the species affected which rule set was optimal, given the modeling algorithm used and the intended use of the model predictions. We found that there is no single best data set and that the best data set(s) depend on all the factors we considered. However, our findings underscore the importance of knowing from the outset how the model results will be used.
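To make the sub-sampling choices above concrete, here is a minimal Python sketch of grid-based spatial thinning and class balancing. It is an illustration only, not the code or the exact rule sets used in the paper: the record layout (`x`, `y`, `present`), the function names, the one-record-per-grid-cell thinning rule, and the down-sampling approach to balancing are all assumptions made for this example.

```python
import random
from collections import defaultdict


def spatially_thin(records, cell_size, rng):
    """Keep one randomly chosen record per square grid cell.

    One simple form of spatial thinning (an assumption for this sketch).
    """
    cells = defaultdict(list)
    for rec in records:
        key = (int(rec["x"] // cell_size), int(rec["y"] // cell_size))
        cells[key].append(rec)
    return [rng.choice(group) for group in cells.values()]


def balance_classes(records, rng):
    """Down-sample the larger class so presences and absences are equal in number."""
    presences = [r for r in records if r["present"]]
    absences = [r for r in records if not r["present"]]
    n = min(len(presences), len(absences))
    return rng.sample(presences, n) + rng.sample(absences, n)


def thin_absences_only(records, cell_size, rng):
    """Keep every presence, but spatially thin the absences."""
    presences = [r for r in records if r["present"]]
    absences = [r for r in records if not r["present"]]
    return presences + spatially_thin(absences, cell_size, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical checklist locations: roughly 10% presences on a 100 x 100 area.
    data = [
        {"x": rng.uniform(0, 100), "y": rng.uniform(0, 100), "present": rng.random() < 0.1}
        for _ in range(500)
    ]
    variants = {
        "all data": data,
        "thinned": spatially_thin(data, 10.0, rng),
        "presences kept, absences thinned": thin_absences_only(data, 10.0, rng),
        "balanced": balance_classes(data, rng),
    }
    for name, subset in variants.items():
        n_present = sum(r["present"] for r in subset)
        print(f"{name}: {len(subset)} records, {n_present} presences")
```

Each variant would then be passed to the same modeling algorithm and scored against independent data, which is the kind of comparison the answer above describes.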

Where did the idea to develop this method come from?

We were interested in some previous research that had found that keeping all detections and spatially thinning non-detections produced the best species distribution model for a rare species. Fortunately, we had access to a large set of independent, systematically collected data. We saw this as an opportunity to independently test species distribution models created by different thinning and balancing rules, plus we could extend previous research to include species with different prevalence and multiple modeling algorithms.

Decisions to spatially thin and to balance classes produce different data sets. From a single set of original observations (stars = presences, circles = absences), different approaches create five different types of data sets for modeling. Credit: Steen et al. 2020

What were the main challenges in developing this method? How did you overcome this?

Because we had worked with over 100 species, five sets of data sampling rules, four different modeling algorithms, and five different performance metrics, a major challenge was to distill the results into straightforward guidance for future researchers.

We first came up with a decision tree because, by its very nature, it can walk the user through multiple layers of decisions to arrive at an answer. This was the great suggestion of our associate editor. Ultimately, we decided to use a chart that clearly displayed the multiple dimensions of the problem.

How do you plan to apply the method you have published / what have you been working on since its publication?

I used our recommendations from this paper when selecting data to model wintering bird distributions for the Rhode Island bird atlas. Without the guidance in this paper, I would probably have used a data sub-sampling routine that was much more complex than our paper recommended.

Since the publication of this paper, in addition to the bird atlas in Rhode Island, I have worked on modeling bird population trends for the Chicago metropolitan area. I am currently researching the ability of multi-state occupancy models to model an ‘abundant’ state (as well as occupied and not occupied states) and to detect declining population trends. I am applying this to fish populations in the San Francisco estuary. This has involved months of simulations, but the draft manuscript is now almost complete.

Who will benefit from your method?

I think scientists who use citizen science data (or other non-systematically collected presence/absence data) to model species distributions will benefit from our recommendations because they simplify what could be a tricky decision-making process.

Application of different thinning and balancing methods to a citizen science data set for an uncommon bird species (red dots = presences, gray dots = absences). Credit: Steen et al. 2020

Species can also benefit. Optimal data sub-samples should provide better species distribution predictions, which may help improve land conservation decisions.

If you could travel back in time, would you add or change anything about your method?

I would probably remove a performance metric called precision-recall AUC. Although the features of the metric seemed appropriate to include, it has not been widely vetted in the species distribution modeling world, and the results from this metric were difficult to interpret. Plus, we had already included four well-known and popular metrics.

You can read Valerie’s full article here.

Learn more about the Robert May 2022 shortlist here.
