[Thread Prev][Thread Next][Index]

Update: Issue with sortl: [ferret_users] Percentile along time axis for gridded data



It turns out that the original script only works for sorted data.

 

For non-sorted data, an additional step is required. Russ may provide a more general script in the future. For now, I’d like to provide a quick update to the calculation for this mailing list (see the ferret script below) – the lines highlighted in red are the modified code. 

 

Hope it helps.

 

Regards,

 

--- Peng

 

use coads_climatology

let numpoints=`sst,return=lend`

let/title="sorted indices" sorted_by_time=sortl(sst[l=1:`numpoints`])

save/clob/file=sorted.nc sorted_by_time

 

can var sorted_by_time

use sorted

 

let index_2d=0*x[g=sorted_by_time] + 0*y[g=sorted_by_time] + l[g=sorted_by_time]

let k25 = if l[g=index_2d] eq max(int(0.25*ignore0(sorted_by_time[l=1:`numpoints`@ngd])),1) then 1

 

let index_2d_sst=0*x[g=sorted_by_time] + 0*y[g=sorted_by_time] + l[g=sorted_by_time]

let k25_sorted=k25[gl=sorted_by_time@asn]*sorted_by_time

let k25_sst= if l[g=index_2d_sst] eq k25_sorted[l=1:`numpoints`@sum] then 1

let sst_k25 = k25_sst[gl=sst[d=coads_climatology]@asn]*sst[d=coads_climatology]

 

let/title="sst at 25 percentile" sst25=sst_k25[l=1:`numpoints`@sum]



On Tue, Feb 5, 2019 at 8:13 AM Ge Peng - NOAA Affiliate <ge.peng@xxxxxxxx> wrote:
Hi Ansley,

Thanks a lot for your detailed explanation - it makes a perfect sense to me know. The next one is how it deals with step-like values. I'll continue to explore whenever I can and may get back to the mailing list for more guidance.

Yes, it will be great and very helpful if the ferret documentation can remain online in the event of future government shutdowns.

Regards,

--- Peng

On Mon, Feb 4, 2019 at 5:43 PM Ansley C. Manke <ansley.b.manke@xxxxxxxx> wrote:
Hi Peng,
The results are correct.  The SORTL  values are telling how to sample the original list to get the data into ascending order.  So,
yes? list sst[i=90,j=45],sorted_by_time[i=90,j=45]
 WARNING: Listed variables have ambiguous coordinates on axes: T
             DATA SET: /home/users/tmap/ferret/linux/fer_dsets/data/coads_climatology.cdf
             LONGITUDE: 161W
             LATITUDE: 1S
 Column  1: SST[T=01-JAN 00:45:31-DEC 06:34] is SEA SURFACE TEMPERATURE (Deg C)
 Column  2: SORTED_BY_TIME[T=0.5:12.5] is sorted indices
          SST  SORTED_BY_TIME
L /  1:  26.82    1.00
L /  2:  26.99    2.00
L /  3:  27.47   12.00
L /  4:  27.83   11.00
L /  5:  27.96    9.00
L /  6:  28.22    8.00
L /  7:  27.96    3.00
L /  8:  27.42   10.00
L /  9:  27.33    4.00
L / 10:  27.55    5.00
L / 11:  27.22    7.00
L / 12:  27.02    6.00

says, to sort the data, we will first take element 1, then element 2, then element 12, then element 11, and so forth, ending with element 7 then element 6 which is the largest.  Like you I first thought they were a ranking of the sizes, but they're the index locations of the sorted data.

By the way, the note,
 WARNING: Listed variables have ambiguous coordinates on axes: T

is telling us that the SST data is on a formatted-time axis, time in hours since the origin, or whatever it is for a given axis, but the SORTL function returns a simple list L=1:12 on an abstract axis. Ferret is able to reconcile those two time axes, but the warning's there in case the user doesn't want it to assume it can associate the two variables just index by index.

Ansley

On 2/4/2019 1:44 PM, Ansley C. Manke wrote:

Hi Peng,

I want to add my thanks to those of you who conversed among yourselves during the time when we were shut down. We're looking into possibilities for making the documentation remaining available even when the server where it resides is off-line.

I don't see why those indices are showing up as the result of the sortl function.  It seems to get the right answer when they're used to sample the data and sort the values themselves.  However as I'm sure you noticed, the sampled result is correct!

I'll look into what's happening here.

yes? list sst[i=90,j=45],sorted_by_time[i=90,j=45], samplel(sst[i=90,j=45],sorted_by_time[i=90,j=45])
 WARNING: Listed variables have ambiguous coordinates on axes: T,T
             DATA SET: /home/users/tmap/ferret/linux/fer_dsets/data/coads_climatology.cdf
             LONGITUDE: 161W
             LATITUDE: 1S
 Column  1: SST[T=01-JAN 00:45:31-DEC 06:34] is SEA SURFACE TEMPERATURE (Deg C)
 Column  2: SORTED_BY_TIME[T=0.5:12.5] is sorted indices
 Column  3: SAMPLEL(SST[I=90,J=45],SORTED_BY_TIME[I=90,J=45])[T=0.5:12.5] is SAMPLEL(SST[I=90,J=45],SORTED_BY_TIME[I=90,J=45])
          SST  SORTED_ (C001,V008)
L /  1:  26.82    1.00   26.82
L /  2:  26.99    2.00   26.99
L /  3:  27.47   12.00   27.02
L /  4:  27.83   11.00   27.22
L /  5:  27.96    9.00   27.33
L /  6:  28.22    8.00   27.42
L /  7:  27.96    3.00   27.47
L /  8:  27.42   10.00   27.55
L /  9:  27.33    4.00   27.83
L / 10:  27.55    5.00   27.96
L / 11:  27.22    7.00   27.96
L / 12:  27.02    6.00   28.22


On 2/4/2019 11:23 AM, Ge Peng - NOAA Affiliate wrote:


 I am using the method provided by Russ a couple of years ago to compute percentile along time axis for gridded data of sea ice concentrations.

 

The resultant Q-tiles are a bit of too scattered in the seasonal varying ice zone. I have worked with Russ directly during the shutdown – Thank you, Russ, for your helpful suggestions.  

 

looking into it in more details, I am a bit of confused by the behavior of function sortl which does not seem to sort correctly in ldim at each I,j grid cell in the sense that the largest value should be sorted as the last. It can be demonstrated by the following ferret output. (I was using a much earlier version but have reinstalled the latest per Russ’s suggestion. The results are the same.)

 

Is this correct?

 

Thanks,

 

--- Peng

 

$ ferret

            NOAA/PMEL TMAP

            FERRET v7.44 (optimized)

            Linux 3.10.0-957.1.3.el7.x86_64 64-bit - 12/07/18

             4-Feb-19 14:01

 

yes? use coads_climatology

yes? let numpoints=`sst,return=lend`

 !-> DEFINE VARIABLE numpoints=12

yes? let numlats=`sst,return=jend]`

 !-> DEFINE VARIABLE numlats=90

yes? let/title="sorted indices" sorted_by_time=sortl(sst[l=1:`numpoints`])

 !-> DEFINE VARIABLE/title="sorted indices" sorted_by_time=sortl(sst[l=1:12])

yes? list sst[i=90,j=45],sorted_by_time[i=90,j=45]

 WARNING: Listed variables have ambiguous coordinates on axes: T

             DATA SET: /snfs1/users/gpeng/myFerret/fer_dsets/FerretDatasets-7.4/data/coads_climatology.cdf

             LONGITUDE: 161W

             LATITUDE: 1S

 Column  1: SST[T=01-JAN 00:45:31-DEC 06:34] is SEA SURFACE TEMPERATURE (Deg C)

 Column  2: SORTED_BY_TIME[T=0.5:12.5] is sorted indices

          SST  SORTED_BY_TIME

L /  1:  26.82    1.00

L /  2:  26.99    2.00

L /  3:  27.47   12.00

L /  4:  27.83   11.00

L /  5:  27.96    9.00

L /  6:  28.22    8.00

L /  7:  27.96    3.00

L /  8:  27.42   10.00

L /  9:  27.33    4.00

L / 10:  27.55    5.00

L / 11:  27.22    7.00

L / 12:  27.02    6.00

 


On Tue, Mar 29, 2016 at 10:16 PM Russ Fiedler <russell.fiedler@xxxxxxxx> wrote:

Hi Peng,

I did this in the dim distant past by saving the sorted data to a file and then picked the appropriate values out.

use coads_climatology
let numpoints=`sst,return=lend`
let numlats=`sst,return=jend]`
let/title="sorted indices" sorted_by_time=sortl(sst[l=1:`numpoints`])

! I found that I ran out of memory if I tried to sort too much for some GB sized datasets. It depends on the size of your dataset.
! You might be able to make larger windows or even do it in one go.

save/j=1/jlimits=1:`numlats`/clob/file=sorted.nc sorted_by_time
repeat/j=2:`numlats` save/app/file=sorted.nc sorted_by_time

can var sorted_by_time
use sorted

! Make a variable with time indices

let index_2d=0*x[g=sorted_by_time] + 0*y[g=sorted_by_time] + l[g=sorted_by_time]

! Create an integrating kernel for the percentile wanted 1 for the limit we want and missing elsewhere.

!25 percentile say. Make sure at least 1 valid point is used

let k25 = if l[g=index_2d] eq max(int(0.25*ignore0(sorted_by_time[l=1:`numpoints`@ngd])),1) then 1

! Show the indices used in the sorted data. Note that missing values are accounted for.


shade k25[l=@loc:1]

! Multiply by SST and sum  up

let sst_k25 = k25[gl=sst[d=coads_climatology]@asn]*sst[d=coads_climatology]

let/title="sst at 25 percentile" sst25=sst_k25[l=1:`numpoints`@sum]

shade sst25


Cheers,
Russ


On 30/03/16 05:32, Ge Peng - NOAA Affiliate wrote:

Found this message showing how to find percentiles for 2-dimensional gridded data:

http://ferret.pmel.noaa.gov/Ferret/Mail_Archives/fu_2004/msg00632.html


However, I would like to compute quantiles/percentiles along my time axis at each grid cell of the 2-dimensional gridded data. I.e., for each x and y, the percentiles are done along the time axis with valid data points. (We can ignore the z dimension for now.)


I could take the time series at each grid cell using nested repeat loop in x and y dimensions, following the above example to sort the data and place the ordered data onto the tiled axis.


It does not sound very efficient to me. Has anyone done something similar in ferret? Is there a better way to do this, perhaps without nested repeat loops in both x and y directions?


Appreciate any help.


--- Peng


--
Ge Peng, Ph.D
Research Scholar

Cooperative Institute for Climate and Satellites, NC (CICS-NC)

North Carolina State University (NCSU) and

NOAA’s National Centers for Environmental Information (NCEI)

Center for Weather and Climate (CWC)


151 Patton Ave, Asheville, NC 28801
ge.peng@xxxxxxxx
o: +1 828 257 3009
f:  +1 828 257 3002

Following CICS-NC on Facebook






--

Ge Peng, PhD
Research Scholar
Cooperative Institute for Climate and Satellites - NC (CICS-NC)/NCSU at

NOAA’s National Centers for Environmental Information (NCEI)

Center for Weather and Climate (CWC)

151 Patton Ave, Asheville, NC 28801
+1 828 257 3009; ge.peng@xxxxxxxx

ORCID: http://orcid.org/0000-0002-1986-9115

Following CICS-NC on Facebook





--

Ge Peng, PhD
Research Scholar
Cooperative Institute for Climate and Satellites - NC (CICS-NC)/NCSU at

NOAA’s National Centers for Environmental Information (NCEI)

Center for Weather and Climate (CWC)

151 Patton Ave, Asheville, NC 28801
+1 828 257 3009; ge.peng@xxxxxxxx

ORCID: http://orcid.org/0000-0002-1986-9115

Following CICS-NC on Facebook





--

Ge Peng, PhD
Research Scholar
Cooperative Institute for Climate and Satellites - NC (CICS-NC)/NCSU at

NOAA’s National Centers for Environmental Information (NCEI)

Center for Weather and Climate (CWC)

151 Patton Ave, Asheville, NC 28801
+1 828 257 3009; ge.peng@xxxxxxxx

ORCID: http://orcid.org/0000-0002-1986-9115

Following CICS-NC on Facebook




[Thread Prev][Thread Next][Index]
Contact Us
Dept of Commerce / NOAA / OAR / PMEL / Ferret

Privacy Policy | Disclaimer | Accessibility Statement