How do you refresh a butterfly? You LepiMAP it in 2021!

The take-home message of this blog is one of disappointment! But the reality is that we have never looked at the butterflies in LepiMAP in this way before. Because we have never been aware of how disappointing it is, we (= project coordinators) have not been encouraging corrective action. So the sense of disappointment is directed at ourselves, and not at the citizen scientists who are LepiMAPpers!

We have just invented ways to measure the “age” of the data for a project. There is a long explanation in an OdonataMAP blog and there is a video that covers the same ground in the BDI YouTube channel.

In a nutshell, choose a quarter degree grid cell in a region of interest. Find the most recent date on which each species was recorded, sort these dates, and pick the middle (the median). This gives an estimate of the “age” of the data for that grid cell. Find the median for every grid cell in the region. Find the median of all the medians. This is a good handle on the “age” of the data in the region. It is quite a challenge to wrap the brain around what this date means, but once you have got it, it’s easy. In half the grid cells, the median date falls before the median of the medians, and in the other half it falls after!
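For readers who like to see the recipe spelled out, here is a minimal Python sketch of the calculation described above. It is not the software used for LepiMAP. The function names and the records are invented for illustration (only the grid cell code 3319DA comes from this blog), and taking the lower of the two middle dates when there is an even number of dates is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import median_low


def cell_age(cell_records):
    """Median of the most recent record date of each species in one grid cell.

    cell_records is an iterable of (species, date) pairs.
    """
    latest = {}
    for species, when in cell_records:
        if species not in latest or when > latest[species]:
            latest[species] = when
    # median_low picks an actual date from the data (assumption: for an even
    # number of species, use the lower of the two middle dates).
    return median_low(latest.values())


def region_age(records):
    """Median of the grid cell medians for a region.

    records is an iterable of (grid_cell, species, date) triples.
    """
    by_cell = defaultdict(list)
    for cell, species, when in records:
        by_cell[cell].append((species, when))
    return median_low(cell_age(recs) for recs in by_cell.values())


# Invented records, purely for illustration.
records = [
    ("3319DA", "species A", date(1967, 10, 18)),
    ("3319DA", "species A", date(2021, 2, 6)),   # species A refreshed
    ("3319DA", "species B", date(1995, 3, 1)),
    ("3319DA", "species C", date(2003, 3, 1)),
    ("CELL_B", "species A", date(1988, 11, 27)),
    ("CELL_C", "species B", date(2010, 1, 30)),
]

print(region_age(records))
```

In this toy region the three grid cell medians are in 1988, 2003 and 2010, so the median of the medians is the 2003 date.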

Would you like to hazard a guess what this date is for the butterflies in LepiMAP? Remember that the butterflies were last given a concerted refreshment drive during the SABCA project (fieldwork finished 2010, in South Africa, Lesotho and eSwatini). Even worse is to bear in mind that 90% of the SABCA data was specimen data, digitized from museums and private collections. Many records go back to the early part of the 20th century, and some to the 19th! A lot of species have not been “refreshed” since they were first recorded. It sounds like I am psyching you up for a really bleak picture.

It is hard to know whether the answer is “good” or “bad”, because we have never done these calculations before. The median of the medians for the LepiMAP data for South Africa, Lesotho and eSwatini is 1 March 2003. So in February 2021, it is almost exactly 18 years old. In broad brush terms, half the records in half the grid cells were made before that date, and half afterwards.
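The “almost exactly 18 years” can be checked with a few lines of Python. This is just a sketch: the text says only “February 2021”, so the 15th is an assumed date, and dividing by 365.25 is a rough conversion from days to years.

```python
from datetime import date

median_of_medians = date(2003, 3, 1)  # from the text above
today = date(2021, 2, 15)             # assumption: a day in mid-February 2021

age_days = (today - median_of_medians).days
age_years = age_days / 365.25         # rough conversion from days to years

print(f"{age_days} days, about {age_years:.2f} years")
```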

The problem with data getting long-in-the-tooth is that the distribution maps we produce can no longer really be claimed to be up-to-date! It would be nice to produce distribution maps using data from only the past decade, say. But they would look very sparse. We have never really thought in these categories before! But in an era of rapid development and climate change, old records do not provide evidence that a species still occurs in a grid cell. Ideally, it would be nice to make distribution maps using records from just the past five years!

Sharon Stanton and Heleen Louw “refreshed” this Lysander Opal Chrysoritis pan lysander on 6 February 2021. The previous record of this species in the Nuy quarter degree grid cell (3319DA), near Worcester, Western Cape, was on 18 October 1967. Gosh, that is 53 years ago. The record is archived at http://vmus.adu.org.za/?vm=LepiMAP-737335

So we need to make a determined effort to “refresh” records, and nudge that median of the medians in the direction of now!

How do we do this? Autumn is around the corner. With it comes a big peak in butterfly abundance. So plan to make a big effort to refresh records over the next few months! Start with the grid cells close to you. The secret is to upload every record you can. If we can refresh all the common species in the well-covered grid cells, that will make an awesome start.

Try to visit your favourite sites this autumn. In the past, our mindset has largely been on adding species not yet on the list for the grid cell. Now, the approach needs to be to aim to refresh everything!

The maps below provide some sort of context. The red-yellow maps for the provinces of South Africa provide the long-term overview, and give the total number of butterfly species in the LepiMAP database per grid cell. The harsh reality is that in all provinces there is lots of fieldwork to be done, and this big picture stuff must not be forgotten in the drive to refresh records! The blue maps are the scary ones. They show the number of butterfly species per grid cell so far in the current butterfly year, taken as starting on 1 July 2020. We are just over seven months into that year, with the best months just ahead. So we still have the opportunity to redeem ourselves!

We’ll work clockwise round the provinces. In the previous blog, we started in Limpopo and ended in North West. This time we’ll begin in the Western Cape, followed by Northern Cape, Free State, North West, Limpopo, Mpumalanga and Gauteng, KwaZulu-Natal and ending in the Eastern Cape.

Western Cape

The quick summary for LepiMAP butterflies in the Western Cape runs like this: 351 species recorded; 54,754 records identified (IDed) to species (or subspecies); 6,786 species-grid cells***; 240 out of 263 grid cells have data (this excludes the three shown as 0 in the map – these have records, but could only be IDed to genus). The median of the medians in the Western Cape is 13 November 1996. In other words, there is a huge amount of historical data that has never been refreshed. The message here is simple, and does not need to be put into words of one syllable! [*** 6,786 is the total you get when you add together all the numbers in the map above; it is the sum of the species richness number in all the grid cells, so we call it the species-grid cell number!]
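The species-grid cell number is easy to compute if you think of it as the number of distinct (grid cell, species) combinations. Here is a small Python sketch; the function name and the records are made up for illustration and are not taken from the LepiMAP database.

```python
def species_grid_cells(records):
    """Sum of species richness over all grid cells.

    records is an iterable of (grid_cell, species) pairs; repeat records of
    the same species in the same grid cell are counted only once.
    """
    richness = {}
    for cell, species in records:
        richness.setdefault(cell, set()).add(species)
    return sum(len(species_set) for species_set in richness.values())


# Made-up records: CELL_A has 2 species, CELL_B has 1, so the total is 3.
example = [
    ("CELL_A", "species A"),
    ("CELL_A", "species A"),  # a repeat record does not add to richness
    ("CELL_A", "species B"),
    ("CELL_B", "species A"),
]
print(species_grid_cells(example))
```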

This is the Western Cape since 1 July last year. Lots of grid cells have just a few species. A handful are in the twenties! Lots of opportunities everywhere to help bring the median dates for grid cells closer to the present!

Northern Cape

This is the overall summary for LepiMAP butterflies in the Northern Cape: 249 species recorded; 19,302 records IDed to species (or subspecies); 4,810 species-grid cells (the sum of the numbers in the map); 431 out of 654 grid cells in the Northern Cape have records IDed to (sub)species. The median of the medians in the Northern Cape is 27 September 2008. The Northern Cape is 12 years ahead of the Western Cape; a determined effort to collect data was made in the Northern Cape during the SABCA period.

This is the species richness per grid cell in the Northern Cape since July last year. If you can plan a trip there this autumn, especially to the areas which have had rain recently, you can make an enormous difference to this map.

Free State

This is the overall summary for LepiMAP butterflies in the Free State: 278 species recorded; 12,472 records IDed to species (or subspecies); 3,405 species-grid cells (the sum of the numbers in the map); 204 out of 238 grid cells in the Free State have records IDed to (sub)species. The median of the medians in the Free State is 28 January 1997, essentially the same as the Western Cape. There’s a handful of grid cells with impressive species richness. The biggest thing that these grid cells achieve is to highlight what ought to be feasible in the grid cells with very thin coverage.

The only thing needed to achieve a drastic improvement in the “age” of the data in Free State is to visit a lot of grid cells, and get a few records from each. When only a handful of species have been recorded in a grid cell, it is easy to bring the median of last records of species right up to the date of fieldwork!
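To see why a few records go such a long way in a thinly covered grid cell, here is a hedged Python sketch. All the species names and dates are invented, the function name is hypothetical, and the lower middle date is used when the number of species is even.

```python
from datetime import date
from statistics import median_low


def refreshed_median(last_seen, refreshed_species, fieldwork_date):
    """Grid cell median after some species are re-recorded on fieldwork_date.

    last_seen maps each species to its most recent record date.
    """
    updated = dict(last_seen)
    for species in refreshed_species:
        # A new record only matters if it is more recent than the old one.
        previous = updated.get(species, fieldwork_date)
        updated[species] = max(previous, fieldwork_date)
    return median_low(updated.values())


fieldwork = date(2021, 4, 10)  # invented autumn fieldwork date

# A thin grid cell: three species, all last seen long ago.
thin = {"A": date(1985, 1, 10), "B": date(1997, 1, 28), "C": date(2001, 5, 2)}
print(refreshed_median(thin, ["A", "B"], fieldwork))

# A richer grid cell: nine species, last seen 1990 to 1998.
rich = {f"sp{i}": date(1990 + i, 1, 1) for i in range(9)}
print(refreshed_median(rich, ["sp0", "sp1"], fieldwork))
```

Refreshing two species pulls the median of the thin grid cell right up to the fieldwork date, while the same effort in the richer grid cell only moves its median from 1994 to 1996.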

We hope that a big effort is feasible in the Free State this autumn.

North West

This is the overall summary for LepiMAP butterflies in the North West: 259 species recorded; 18,569 records IDed to species (or subspecies); 4,126 species-grid cells (the sum of the numbers in the map); 157 out of 202 grid cells in the North West have records IDed to (sub)species. The median of the medians in the North West is 30 January 2010, so the data can be described as reasonably up-to-date. The concern here is that the coverage is rather thin, especially in the western two-thirds of the province!

Please go west in North West!

Limpopo

Here is the overview for LepiMAP butterflies in Limpopo: 454 species recorded; 71,373 records IDed to species (or subspecies); 13,471 species-grid cells (the sum of the numbers in the map); only five grid cells entirely inside Limpopo have no data, out of 221 grid cells. The median of the medians in Limpopo is 1 November 2008.

Compared to the long-term coverage map, LepiMAPping in Limpopo since July last year has been quite sparse. There’s an opportunity to make a big improvement this autumn!

Mpumalanga and Gauteng

Gauteng is on the western edge. Mpumalanga is across the centre and the east.

This is the overall summary for LepiMAP butterflies in Mpumalanga: 442 species recorded; 44,809 records IDed to species (or subspecies); 9,176 species-grid cells (the sum of the numbers in the map); Mpumalanga touches 157 grid cells. Only two grid cells falling entirely within the province have no data, and there is another half a grid cell shared with the Free State without data. The median of the medians in Mpumalanga is 4 December 2007.

The overall summary for LepiMAP butterflies in Gauteng shows 263 species recorded. 30,715 records have been IDed to species (or subspecies). There are 3,777 species-grid cells (the sum of the numbers in the map). Every one of the 47 grid cells in Gauteng has records IDed to (sub)species. The median of the medians in Gauteng is 8 January 2009.

This is the map of species richness in Mpumalanga and Gauteng since July last year. The southwestern corner of Gauteng badly needs attention and there is plenty of scope for big improvements in the northwest. No matter where you go in Mpumalanga, you will make a substantive contribution. There is lots of potential for autumn fieldwork in the next few months!

KwaZulu-Natal

This is the overall summary for LepiMAP butterflies in KwaZulu-Natal: 489 species recorded; 189,921 records IDed to species (or subspecies); 16,269 species-grid cells (the sum of the numbers in the map); only one of the 174 grid cells in KwaZulu-Natal has no records IDed to (sub)species (and this is in the far south, and most of the grid cell is in the Eastern Cape!). In spite of the wealth of data, the median of the medians in KwaZulu-Natal is 28 September 1994, getting on for three decades ago. The inescapable conclusion is that the KwaZulu-Natal data is getting a bit long in the tooth!

The pattern of LepiMAPping in KwaZulu-Natal is for a handful of grid cells to be done superbly well every year. It is easy to pick these out on the map! This is incredibly valuable data, because it will enable amazing analyses of long term trends and how phenology*** is changing through time. There are PhDs lurking in that part of the database, and we encourage the LepiMAPpers in these grid cells to keep on keeping on. But we really badly need some mobile LepiMAPpers to visit lots of different grid cells. To achieve a large shift to the medians in many of the grid cells will need quite intensive fieldwork, because the species richness in the grid cell is substantial. Getting the KwaZulu-Natal data up-to-date represents a really big challenge! (*** In this context, the “phenology” of a butterfly species means the period when adults are in flight.)

Eastern Cape

In the Eastern Cape, the quick summary for LepiMAP butterflies runs like this: 445 species recorded; 43,447 records IDed to species (or subspecies); 8,245 species-grid cells (the sum of the numbers in the map); 288 out of 314 grid cells in the Eastern Cape have records IDed to (sub)species. The median of the medians in the Eastern Cape is 27 November 1988. In a nutshell, the coverage is great, but age is an issue.

… this autumn, aim for the interior of the Eastern Cape.

… this is only part of “quality”

This blog has focused on one aspect of the quality of biodiversity data, and that is “up-to-dateness”. There are lots of other dimensions of quality. Spatial quality is important too. It is quite hard to measure; the obvious coverage statistic is the percentage of grid cells with data, but we really need something more subtle which takes account of whether the data is well-scattered through the region, or whether it is concentrated in just one part of the region.
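The obvious coverage statistic is simple to calculate from the provincial summaries above. The sketch below uses only the five provinces for which this blog gives the numbers as “X out of Y grid cells”; the variable names are illustrative. It says nothing about how well the data is scattered within a province, which is exactly the limitation described above.

```python
# (grid cells with records IDed to species, total grid cells), from the text
grid_cells = {
    "Western Cape": (240, 263),
    "Northern Cape": (431, 654),
    "Free State": (204, 238),
    "North West": (157, 202),
    "Eastern Cape": (288, 314),
}

# Percentage of grid cells with data, per province
coverage = {
    province: 100 * with_data / total
    for province, (with_data, total) in grid_cells.items()
}

# List the provinces from lowest to highest coverage
for province, percent in sorted(coverage.items(), key=lambda item: item[1]):
    print(f"{province}: {percent:.1f}%")
```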

For butterflies, we would also like to have the data well scattered in time through the overall flight period, so we have a good chance of recording the species that only fly for a short period in the season. So we need a measure of seasonal quality.

We don’t only want up-to-date data. Old data is also important, because it is critical for detecting expansions and contractions in distributions. It is especially valuable to have a sample of grid cells which get a lot of records, consistently, every year. So we need a measure of long-term data quality. If this dimension of quality for a region is good, then it means that the data is likely to be valuable for looking for range changes through time.

But for autumn 2021, let us focus on bringing the data in the grid cells within our reach up-to-date. No one has to do everything or be everywhere. Citizen science is about teamwork, and if we all do what we are able to do in our areas, then cumulatively we can make a big difference. Perhaps not in a single autumn, but if we keep this new mindset working into the next few years, then we will see huge changes.

As the leadership of LepiMAP (and the Virtual Museum in general), we ought to have done this important thinking years ago, and we are sorry that we didn’t. On the other hand, no other projects seem to have done any quantitative thinking along these lines, so the Virtual Museum retains global leadership in this area.

Acknowledgements

Rene Navarro developed the software used for these brand new measures of up-to-dateness. Karis Daniel generated the maps.

There would not be anything to write about without the contributions of Team LepiMAP. Thank you for your support.

Les Underhill
Prof Les Underhill was Director of the Animal Demography Unit (ADU) at the University of Cape Town from its start in 1991 until he retired. Although citizen science in biology is Les’s passion, his academic background is in mathematical statistics. He was awarded his PhD in abstract multivariate analyses in 1973 at UCT and what he likes to say about his PhD is that he solved a problem that no one has ever had. He soon grasped that this was not the field to which he wanted to devote his life, so he retrained himself as an applied statistician, solving real-world problems.