
I calculated the Growing Season Length (GSL) index for the 1950–2023 period in Turkey using ERA5-Land daily mean temperature data. According to the definition:

  • First, find the first occurrence of at least 6 consecutive days with daily mean temperature > 5 °C.
  • Then, after July 1st (NH), find the first occurrence of at least 6 consecutive days with daily mean temperature < 5 °C.
  • The number of days between these two dates is recorded as GSL.
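
The three steps above can be sketched for a single pixel and a single calendar year as follows. This is a minimal sketch, not the code used for the results described here. The function names `first_run_start` and `gsl_strict`, the 0-based index 181 for 1 July (valid for a non-leap year), and the choice to start the cold-spell search no earlier than the season start are all assumptions.

```python
import numpy as np

def first_run_start(mask, run=6):
    """Index of the first day that starts `run` consecutive True values, or -1."""
    mask = np.asarray(mask, dtype=bool)
    if mask.size < run:
        return -1
    # counts[i] is the number of True values in mask[i:i+run].
    counts = np.convolve(mask.astype(int), np.ones(run, dtype=int), mode="valid")
    hits = np.flatnonzero(counts == run)
    return int(hits[0]) if hits.size else -1

def gsl_strict(tmean, thresh=5.0, run=6, july1=181):
    """GSL in days for one year of daily mean temperature (deg C).

    Returns NaN when either the warm spell or the cold spell is not found.
    july1=181 assumes a 365-day year (hypothetical default).
    """
    tmean = np.asarray(tmean, dtype=float)
    start = first_run_start(tmean > thresh, run)
    if start < 0:
        return np.nan
    # Assumption: the cold spell is searched from 1 July or the season start,
    # whichever is later.
    search_from = max(july1, start)
    end_rel = first_run_start(tmean[search_from:] < thresh, run)
    if end_rel < 0:
        return np.nan  # no qualifying cold spell: GSL is undefined
    return float(search_from + end_rel - start)
```

For a year that stays above 5 °C from July onward, `gsl_strict` returns NaN, which is exactly the case that produces the gaps.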

The issue:

  • In coastal areas or warmer regions, the second condition (6 consecutive days < 5 °C) may never occur in some years, so GSL cannot be calculated and remains NaN.
  • For example, between 2000–2023, in some pixels GSL is missing for years like 2004, 2007, 2009, and 2012.
  • When I compute trends (e.g., Mann-Kendall), these missing years result in artificially high positive or negative trends in some pixels, which are not realistic.

My question:

  • How should I handle such missing years before performing trend analysis?

What I tried:

  • Implemented GSL following the ETCCDI definition with ERA5-Land daily mean temperatures (1950–2023).
  • Calculated GSL pixel-wise using xarray + numpy.
  • Applied Mann-Kendall trend test on the yearly GSL time series.

What I expected:

  • A realistic spatial distribution of GSL trends, e.g., gradual lengthening or shortening in certain regions.
  • Missing years to have minimal influence on long-term trends.

What happened instead:

  • Pixels with missing years (due to no 6-day < 5 °C period) show very large artificial trends (sometimes >±10 days/decade), which are not climatologically reasonable.
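
One thing that can be checked independently of how the gaps are eventually filled is whether the trend step uses the real calendar years as the time axis and refuses to fit pixels with too few valid years. The sketch below is one possible way to do that with plain numpy. It is not the trend code used above, and the function name `sen_slope_with_gaps` and the `min_valid_frac=0.8` cutoff are hypothetical choices.

```python
import numpy as np

def sen_slope_with_gaps(years, values, min_valid_frac=0.8):
    """Sen's slope and Mann-Kendall S statistic, skipping NaN years.

    Returns (slope per year, S). Both are NaN when the fraction of valid
    years is below min_valid_frac (an arbitrary, hypothetical threshold).
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    ok = np.isfinite(values)
    if ok.sum() < 2 or ok.mean() < min_valid_frac:
        return np.nan, np.nan
    # Keep the true year of every valid value so gaps do not compress time.
    x, y = years[ok], values[ok]
    i, j = np.triu_indices(x.size, k=1)  # all pairs with i < j
    dy = y[j] - y[i]
    slope = np.median(dy / (x[j] - x[i]))  # Sen's slope, days per year
    s_stat = np.sign(dy).sum()             # Mann-Kendall S statistic
    return float(slope), float(s_stat)
```

Multiplying the slope by 10 gives days per decade. Applying this pixel-wise (for example through `xarray.apply_ufunc`) is left out of the sketch.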
  • "In coastal areas or warmer regions, the second condition (6 consecutive days < 5 °C) may never occur in some years, so GSL cannot be calculated and remains NaN." Why NaN? Why not say that if there are no instances of 6 consecutive cold days, then the growing season ends on Dec 31st? That would lead to a very long GSL, but that is maybe reasonable if it is a region with very warm and stable temperatures. Commented Sep 16, 2025 at 15:57
  • There are many techniques for handling missing data in time series... growth-onomics.com/… Commented Sep 16, 2025 at 19:29
  • You are right. It may be possible to fill in the missing years using statistical methods (e.g., imputation techniques). However, selecting an appropriate method for a climate index such as GSL is somewhat difficult, because this is not just a numerical gap, but a deficiency arising from the definition. Commented Sep 17, 2025 at 18:04

1 Answer

GSL is a rather "dumb" indicator, in the sense that it has fixed thresholds not tied to plant requirements. It is good for defining plant-growing zones at higher latitudes. Apparently, the latitude limit is above the south (?) coast of Türkiye. You basically have two options:

  1. Modify your code to end the season at the first run of 6 consecutive cold days after 1 July or, if no such run occurs, at 31 December. That stays close to your original question but it is a bit artificial: January 1 may still be "warm".
  2. Use an algorithm which is better suited to what you want to do with the GSL data. Growing Degree Days, for instance, is specific to certain crops and you may find (from FAO, for instance) parameters that are specific to the variety of the crop you want to study.
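
Both options could look roughly like this for one pixel and one year of daily means. Treat it as a sketch under stated assumptions: the names `gsl_with_fallback` and `growing_degree_days` are made up for illustration, index 181 for 1 July assumes a 365-day year, and `t_base=10.0` is only a placeholder, not a value taken from FAO or any crop table.

```python
import numpy as np

def _first_run(mask, run):
    """Index of the first day starting `run` consecutive True values, or -1."""
    mask = np.asarray(mask, dtype=bool)
    if mask.size < run:
        return -1
    counts = np.convolve(mask.astype(int), np.ones(run, dtype=int), mode="valid")
    hits = np.flatnonzero(counts == run)
    return int(hits[0]) if hits.size else -1

def gsl_with_fallback(tmean, thresh=5.0, run=6, july1=181):
    """Option 1: GSL where a missing cold spell ends the season on 31 December."""
    tmean = np.asarray(tmean, dtype=float)
    start = _first_run(tmean > thresh, run)
    if start < 0:
        return np.nan  # no warm spell at all: still undefined
    search_from = max(july1, start)
    end_rel = _first_run(tmean[search_from:] < thresh, run)
    # Fallback (assumption): count through the last day of the year.
    end = search_from + end_rel if end_rel >= 0 else tmean.size
    return float(end - start)

def growing_degree_days(tmean, t_base=10.0):
    """Option 2: annual GDD as the sum of daily (tmean - t_base) above zero.

    t_base is crop-specific; 10.0 is a placeholder only.
    """
    tmean = np.asarray(tmean, dtype=float)
    return float(np.clip(tmean - t_base, 0.0, None).sum())
```

With the fallback, a pixel that never cools below 5 °C gets a GSL equal to the number of days from the season start to the end of the year instead of NaN, so the yearly series has no gaps. The value is capped by the calendar year, which is the artificial part mentioned in option 1.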

1 Comment

Thanks a lot. Yes, treating December 31 as the end of the season is an option. Initially I left it as NaN to adhere to the original ETCCDI definition. However, this alternative may be more meaningful in warm regions, as otherwise the trends are artificially distorted.
