HOST: Next talk is from Eugene Burger from the Science Data Integration Group. EUGENE BURGER: Good morning. My name is Eugene Burger. I lead the Science Data Integration Group with Kevin O'Brien. It's a group of six developers and data scientists here at PMEL. I'll speak to you about our data workflows, from observing platform to scientific paper. In December 1872, the HMS Challenger sailed from Portsmouth, England, on what many consider the first oceanographic expedition. During this expedition, they collected many samples and made many observations. These were documented, and many of these were preserved--all of this is preserved--at the British Natural History Museum. Earlier this year, a paper was published that used samples from the HMS Challenger, 150-year-old samples. We were able to discover these samples and use them to quantify the effect of ocean acidification on oceanic microorganisms. This paper clearly showed the value of well-documented data and of data discovery. And this is one of the objectives of the Data Integration Group here at PMEL with our integrated data management strategy. It is a strategy where we look at data management and its processes: combining the processes that manage data, bringing together data from multiple sources toward an integrated view of these data, and at the same time reducing the number of processes in our data workflows. Looking at how we approach this, we can put it into three categories. Firstly, in the last few years, we have had to accommodate data from autonomous platforms that deliver data not only much more frequently but also in much higher volumes, so we had to build new workflows for them. We also had existing workflows here at PMEL, and we were able to combine some of them to reduce the number of data management workflows that we have here.
But at the same time, there are many projects here at PMEL that have innovative solutions in their data management efforts. And we did not seek to replace or change those, in respect of [INAUDIBLE] in data management efforts. If we look at the building blocks of our data management efforts, there is data manipulation, data visualization, metadata collection, and data archival. Many of the tools that we use to accomplish those goals are developed here at PMEL. Some of them we learned from other agencies, such as [INAUDIBLE], such as the ERDDAP data server and we... These are not developed only for these efforts here at PMEL. They are developed for use across multiple projects. And the reason we have these tools available-- [INAUDIBLE] That's great. OK. The reason we have housed these tools and have the ability to develop these tools is because of heavily leveraged efforts within the Data Integration Group. On this slide, I highlight some of our collaboration with other entities. We've seen OAR, we've seen NOAA with the Integrated Ocean Observing System, and other efforts and work that we do for the Global Ocean Monitoring and Observing Division and the Ocean Acidification Program. These allow us to develop these metadata collection and archive automation efforts, et cetera. So heavy leveraging is the reason that we are able to do this data management effort within PMEL. Just a snapshot of some of what we've done in the last few years: we've daylighted data for one of our projects, Earth-Ocean Interactions. We've also provided additional data access and visualization services for the Atmospheric Chemistry group, and updated data formats to more modern data formats for our EcoFOCI group on the Arctic and Alaska profile data. We've also established new workflows with the autonomous unmanned platforms. We've created automated workflows for data that we receive frequently here at PMEL.
And once the data from Saildrone lands here at PMEL, these data are available to our scientists through interoperable data access services and data visualization services within minutes. At the same time, leveraging an open GTS project here at PMEL, we're able to migrate these data, or make these data available, to our partner at the National Data Buoy Center, where these data are uplinked onto the World Meteorological Organization Global Telecommunication System. And these data are available for operational use, weather forecasting, and ocean modeling. Over the last few years, we've received in excess of 57,000 files, or more than 6 million 1-minute records, and each record contains 32 variables. So substantial amounts of high-frequency data are received here at PMEL through automated efforts, with minimal human intervention, thanks to these workflows that we've developed. But it's not just about data management; it's about optimizing processes and dealing with that data more efficiently. Chris mentioned some of the collaborations with the Engineering group. This is one effort at TELOS where the Data Integration Group and Engineering came together to reduce the latency from the time that the measurement is made on the remote observing platform to when the data lands here at PMEL. Through changes in data formats and data handling protocols, we've been able to reduce the latency to less than 2 minutes from measurement to when the data are available through interoperable services to the science groups here at PMEL. The Surface Ocean CO2 Atlas produces a global synthesis product. They used to release this synthesis product every four years, but after the adoption of the PMEL-developed data automation platform, and the efficiencies this allowed in data assembly and quality control, they now release this product every year. So it's a fourfold increase in efficiency due to development here at PMEL. It is an exciting, interesting time in data management.
Within the federal government and NOAA, we have a new data policy landscape we have to deal with. And this is [INAUDIBLE] what we have to adapt to in the next year. At the same time, there are new technology initiatives within NOAA. Two of them are highlighted here that will change the way that we interact with our data and extract additional funding from our data. So it's an exciting time, but there are challenges. We have a greater volume of data. And we also have more diverse data. And there's a greater demand for more rapid access to quality-controlled data. At the same time, there are resource limitations within PMEL in assisting PMEL projects as we move these data from data silos to data lakes. And of course, the answer is more resources. That's always the answer. More resources to establish a data curator position here at PMEL. And also the ability to fill long-standing vacancies, federal vacancies that we have within the Data Integration Group. And something we frequently heard at the OceanObs19 conference last year was that if you don't spend 10% on data management, you stand to lose 90% of the data. I bring you back to my first slide, the HMS Challenger expedition, where they spent those resources on data management. And 150 years later, we still reap the benefits of those well-documented, well-preserved data. So with that, thank you very much. [CLAPPING]