Informatica has introduced its serverless, Spark-based Cloud Data Integration engine that offers accelerated performance using the NVIDIA RAPIDS Accelerator for Apache Spark with NVIDIA accelerated computing.
For the first time, users can operationalise machine learning models with end-to-end machine learning operations (MLOps) capabilities, and can combine the power of data management with the scalability and speed delivered by RAPIDS data science software and NVIDIA infrastructure. This is a major milestone on the road toward data democratisation and a critical step in scaling up digital transformation efforts.
According to Gartner, “Forty-one percent of employees outside of corporate IT are no longer just ‘end users’ of technology. They are technology producers who customise or build their own analytics or technology solutions to support their work”[1]. These technology producers perform advanced analytics and manage vast datasets, which drives data democratisation. For this to succeed, however, companies need to give these users access to timely and accurate data.
Informatica is the industry’s first cloud data management company to offer citizen integrators, data engineers, machine learning engineers, and data scientists alike zero-overhead, zero-coding data access through serverless multicloud data management, while applying NVIDIA’s revolutionary GPU acceleration to Informatica’s MLOps and DataOps workloads.
“Data science is the backbone of AI, as it is key to transforming oceans of enterprise data into business opportunities,” said Manuvir Das, Head of Enterprise Computing, NVIDIA. “Informatica’s integration of RAPIDS Accelerator for Apache Spark with NVIDIA accelerated computing brings the world’s most advanced infrastructure to the many industries that rely on Informatica’s enterprise cloud data management solutions, enabling customers to speed their data science and AI pipelines across their cloud and on-prem data centers.”
With this product milestone, customers will now experience:
1. Increased Data Processing Speed up to 5X: Data analytics, machine learning, and data science projects all generate business insights from clean, processed data, which comes from data pipelines that collect, transform, cleanse, and prepare it for extraction. Traditionally, these pipelines have run on CPUs, whereas GPUs can process such workloads faster by executing many threads in parallel. With this announcement, Informatica customers can accelerate their data management workloads and operationalise machine learning models using NVIDIA GPUs to ingest and process data up to 5X faster and at scale, enabling faster insights for critical business decisions.
2. Accelerated Data Democratisation Across the Enterprise: The accelerated computing made possible by NVIDIA GPUs and software has been used to improve the performance of compute-intensive AI and machine learning workloads, but it has traditionally required sophisticated Spark expertise and highly skilled developers. Informatica’s simple drag-and-drop GUI-based development experience removes that complexity by converting simple mappings into sophisticated Spark code that can execute on GPUs at scale. Informatica has been democratising data access with its data integration products for years and is now pushing the frontier by democratising GPU access for data consumers at large.
3. Up to 72% Lower Total Cost of Ownership: Data analytics and data science projects are compute-intensive and data-heavy. Operationalising these projects at scale requires a constant, high-velocity feed of cleansed data from various sources, often at a high cost. By leveraging the power of GPU-accelerated software and computing, data management pipelines and MLOps and DataOps frameworks built on Informatica can lower total cost of ownership (TCO) by up to 72%, allowing customers to accelerate their data delivery while reducing costs.
“Data democratisation is the holy grail of digital transformation initiatives,” said Jitesh Ghai, Chief Product Officer, Informatica. “You can’t leverage the power of data and gain valuable insights if you are restricted in your data access. Our collaboration with NVIDIA is valuable to us in bringing enterprise-scale data democratisation and narrowing the gap between the data-haves and the data-have-nots within the enterprise. This important milestone with NVIDIA shows our continued commitment to unlock the value of data embedded in organisations across all levels and more importantly empower all key users to gain faster business-critical insights and operationalise data analytics and data science projects at scale.”