Three providers of GPU-powered machine learning and analytics solutions are collaborating on a strategy that lets multiple programs access the same data on a GPU and process it in place, without transforming it, copying it, or performing other performance-killing steps.
Continuum Analytics, maker of the Anaconda distribution for Python; machine learning/AI specialist H2O.ai; and GPU-powered database creator MapD (now open source) have formed a new consortium, called the GPU Open Analytics Initiative (GOAI).
Their plan, as detailed in a press release and GitHub repository, is to create a common API for storing and accessing GPU-hosted data for machine learning/AI workloads. The GPU Data Frame would keep the data on the GPU at every step of its lifecycle: ingestion, analysis, model generation, and prediction.
Because everything stays on the GPU, data doesn't bounce to and from other parts of the system and can be processed faster. The GPU Data Frame also provides a common, high-level way for any data processing application—not only ML/AI applications—to talk to GPU-bound data, so individual stages in the pipeline have less need to deal with the GPU on their own.
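To make the idea concrete, here is a minimal Python sketch of the kind of pipeline the GPU Data Frame is meant to enable. The module and function names (gpu_db, gpu_ml, sql_to_gpu_df, train_gbm) are illustrative placeholders, not the actual GOAI API; the point is that each stage hands off a GPU-resident data frame instead of copying results back to host memory.

```python
# Hypothetical sketch of a GPU Data Frame workflow; all names below are
# placeholders, not the real GOAI interfaces.

import gpu_db   # stand-in for a GPU-accelerated database client (MapD-like)
import gpu_ml   # stand-in for a GPU machine learning library (H2O.ai-like)

# 1. Ingestion/analysis: run a SQL query whose result set stays in GPU
#    memory and is exposed as a GPU Data Frame rather than copied to the host.
gdf = gpu_db.connect("localhost").sql_to_gpu_df(
    "SELECT feature_a, feature_b, label FROM events WHERE label IS NOT NULL"
)

# 2. Model generation: the ML library receives a handle (device pointers plus
#    schema) to the same GPU-resident columns, so there is no host round trip.
model = gpu_ml.train_gbm(
    data=gdf,                              # zero-copy handoff of GPU columns
    features=["feature_a", "feature_b"],
    target="label",
)

# 3. Prediction: scoring also reads and writes GPU Data Frames, keeping the
#    whole ingestion -> analysis -> training -> prediction pipeline on the GPU.
predictions = model.predict(gdf)
```

Under this model, the only thing that moves between libraries is a lightweight descriptor of data already sitting in GPU memory, which is what removes the copy and transformation overhead described above.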