Jerome,

<rant> This topic has me immediately climbing onto my soapbox to talk about data management in general. The following opinions are therefore my own and not necessarily shared by others in the LAS group.

THREDDS aggregation is a wonderful thing, but it can mask some fundamental problems in the way data is accessed. Many data providers generate data in a time-sliced fashion, with each 1-3D file containing 1-N separate variables. This is convenient for the data providers but inconvenient for those wishing to provide interactive access to the data. LAS gives users the illusion that they are working with a 4D file containing N variables. Typical requests might be for an XY slice at a particular time and height/depth, or for a time series or profile at a particular location. More than one variable may be requested, but it would be atypical for a user to request hundreds of variables at once.

Thus, for LAS, the optimal data storage strategy would be to put each variable in its own 4D file. For very long time series you might break the file into yearly segments of ~1 GB and then aggregate the segments; data requests will then force remote servers to open at most a few files. In your case it seems you have a THREDDS aggregation with thousands of irregular timesteps, so any time series request will force the THREDDS aggregation server to open thousands of separate files -- an expensive bunch of I/O that will most likely result in non-interactive performance.

So, even though you can create THREDDS aggregations of many separate temporal snapshots, it's not necessarily a wise thing to do if you want to provide time series access. (You could, of course, configure a special LAS UI behavior that allows users to select a time but does not provide access to 'views' with a time axis.
Check the bottom of this page for an example: http://ferret.pmel.noaa.gov/LASdoc/serve/cache/50.html)

In the best of all possible worlds, data managers would take the data created by data providers and, where necessary, reformat it to provide optimal performance for data users. After all, the work of reformatting only has to be done once, but the work of opening 10K separate snapshot files has to be done every single time a user makes a time series request. As it turns out, for irregular time axes Ferret will have to open all those files twice -- once to read in the time axes and once to read in the data. Yes, caching inside Ferret and OPeNDAP will improve performance, but the right way to solve the problem is to manage your data for the benefit of the end users, not the data providers. </rant>

-- Jon

Jerome King wrote:
> Hi Ferreters and LASers,
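P.S. The yearly-segment strategy above can be sketched as a THREDDS NcML joinExisting aggregation. A minimal sketch, assuming per-variable yearly files (the file names here are hypothetical):

```xml
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- Aggregate per-variable yearly segments along the time dimension.
       Each segment file holds one variable (~1 GB) and its own time axis. -->
  <aggregation dimName="time" type="joinExisting">
    <netcdf location="sst_1999.nc"/>
    <netcdf location="sst_2000.nc"/>
    <netcdf location="sst_2001.nc"/>
  </aggregation>
</netcdf>
```

A time series request against an aggregation like this touches a handful of yearly files instead of thousands of snapshots.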