[ferret_users] Re: [las_users] limit on storage
On Aug 23, 2006, at 3:47 PM, Jonathan Callahan wrote:
<rant>
This topic has me immediately climbing on my soapbox to talk about
data management in general. The following opinions are therefore
my own and not necessarily shared by others in the LAS group.
The same applies to my comments.
[...]
In the best of all possible worlds, data managers would take the
data that is created by data providers and, where necessary,
reformat it so as to provide optimal performance for data users.
After all, the work of reformatting only has to be done once but
the work of opening 10K separate snapshot files has to be done
every single time a user makes a time series request.
I concur; however, there are a number of issues involved in
transposing data from a many-fields-one-time format to a one-field-
many-times format. One concern for us, as data providers, is that many
of our analysis packages require multiple fields for each time sample
processed, and it is not a trivial exercise to rewrite all of our codes.
The bigger concern is archival storage cost. Basically, we end up
with twice the data volume: the original data plus the transposed
data. Given that, as a data manager, I have to keep track of
literally hundreds of terabytes of data, and that we are charged for
each and every byte, it is simply not practical at this time for us
to double our data storage charges.
As usual, there is nothing technically difficult about creating long
time-series files from single-time, multi-field files; however, there
are policy and other issues that make the "best of all possible
worlds" a difficult one to attain.
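
To illustrate the kind of transposition under discussion, here is a
minimal sketch in Python using the xarray library (just one possible
tool; the file pattern and the variable name TSURF are purely
illustrative, not anything from our actual archive). It reads a run of
single-time, multi-field snapshot files and writes one field out as a
single time-series file:

import glob
import xarray as xr

# Illustrative layout: many snapshot files, each holding one time step
# and many fields. Paths and the variable name are assumptions.
snapshot_files = sorted(glob.glob("snapshots/model.*.nc"))

# Open each snapshot (lazily) and concatenate along the time axis.
datasets = [xr.open_dataset(path) for path in snapshot_files]
combined = xr.concat(datasets, dim="time")

# Write a single field out as one long time-series file, so a
# time-series request opens one file instead of thousands.
combined[["TSURF"]].to_netcdf("TSURF_timeseries.nc")

Of course, for archives of hundreds of terabytes the real obstacles
are the I/O and the doubled storage, not the script itself.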
Gary Strand
strandwg@ucar.edu
http://www.cgd.ucar.edu/ccr/strandwg