My message to folks about a potential HPCC proposal

All,

Allow me to throw a crazy idea into the ring. It may not be possible in FY09 (or possible at all), but I'll put it out there for some discussion if desired.

Earlier we had been discussing the idea of distributed CUBE/bathy processing aboard the ships, with the goal of more efficient data processing in the field. That is, each ship would have one high-performance multi-processor system which could accept bathy computation jobs from survey techs and other people doing the data processing.

What if we took this one step further, and said that during data processing, the clients submit their compute jobs to a CUBE/bathy processing daemon hosted at one of NOAA's high-performance computing (HPC) facilities, where we have petaflops of computational power at our disposal? (The cost would be the network transfer, which we would have to weigh against the benefits.) An R&D effort investigating this path would fit well with the HPCC program, clearly meeting 3 out of 4 of its themes/topics, as follows:

  1. Remote Usage of NOAA HPC
    This project would be doing exactly that: finding another way to utilize NOAA's existing HPC infrastructure and resources.
  2. Advanced Networking Technologies
    This project would require substantially increasing the network capacity between the designated R&D field unit and the NOAA HPC facility.
  3. Technologies for Modeling, Analysis, or Visualization
    We could discuss how creating bathy surfaces from a set of soundings is a surface modeling problem and this project would create a high-performance framework for calculating that model.
  4. Disaster Planning, Mitigation, Response, and Recovery
    I can't speak specifically to this, but maybe someone would be able to make the case that more efficient data processing would reduce ping-to-chart times, aiding in disaster mitigation, or the typical OCS line of providing safe entry into ports, etc.

Here's a rough back-of-the-envelope feasibility calculation. Assume that on an average ship desktop machine it takes about 15 minutes to calculate a CUBE surface from a day's worth of multibeam and ancillary (POS, navigation, tides, svp, etc.) data. Let's say the data required for the calculation is about 4 GB. Assume that the time to calculate the surface on an array of NOAA HPC machines is negligible (0). Your primary cost in time is then a one-time data transfer to HPCS at bandwidth b. To break even with the local computation, that transfer has to finish within the same 15 minutes: 4 GB * 8 bits/byte / (15 minutes * 60 seconds/minute) is about 36 Mbps, so you would need about a 36 Mbps connection to the HPC center (could someone else shed some light on whether such connections exist?). Note that the data transfer would be a one-time cost, and re-computing surfaces would take almost no time (just the computation time plus downloading the re-computed surface, which I think would be substantially smaller than the raw soundings/nav data).
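
For anyone who wants to play with the numbers, here is a minimal Python sketch of the estimate above. It assumes decimal units (1 GB = 10^9 bytes), zero compute time at the HPC end, and no protocol overhead or link contention; the function names are purely illustrative and not part of any existing tool.

```python
"""Back-of-the-envelope bandwidth estimate for shipping a day's survey data.

Assumptions (not measurements): 1 GB = 1e9 bytes, HPC compute time is zero,
and the link delivers its full nominal rate with no protocol overhead.
"""


def required_bandwidth_mbps(data_gb: float, window_minutes: float) -> float:
    """Bandwidth (megabits/s) needed to move data_gb within window_minutes."""
    bits = data_gb * 1e9 * 8          # gigabytes -> bits
    seconds = window_minutes * 60     # minutes -> seconds
    return bits / seconds / 1e6       # bits/s -> megabits/s


def transfer_minutes(data_gb: float, bandwidth_mbps: float) -> float:
    """Minutes needed to move data_gb over a link of bandwidth_mbps."""
    bits = data_gb * 1e9 * 8
    return bits / (bandwidth_mbps * 1e6) / 60


if __name__ == "__main__":
    # Break-even case from the text: 4 GB in the 15 minutes a desktop needs.
    print(f"break-even bandwidth: {required_bandwidth_mbps(4, 15):.1f} Mbps")
    # How long the same upload would take on a few hypothetical link speeds.
    for mbps in (1.5, 10, 100):
        print(f"{mbps:>6} Mbps -> {transfer_minutes(4, mbps):.1f} minutes")
```

Any link slower than the break-even figure means the one-time upload alone takes longer than computing the surface locally, so the payoff would have to come from the subsequent re-computations.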

There would obviously be a number of incidental benefits to having high-bandwidth connections from the ships. Full survey submission could happen from the field, potentially improving ping-to-chart times.

Another thought is that even if we decide that this isn't feasible on the ship, it would definitely be feasible from the branches (and Shep, I'm guessing that you could probably provide quantitative analysis as to what could be gained from faster CUBE/bathy processing at the branches?), and could possibly be used to upgrade their network connections, which has been under discussion.

-----




Published

16 December 2008

Category

work
