Why optimising storage infrastructure is a minimum requirement for the AI era

By Steve Leeper, VP of Product Marketing, Datadobi.


If you look at the biggest technology trends of the past decade or two, the ubiquitous business obsession with data is high on the list of transformative issues. Across the enterprise sector in particular, there is barely an organisation out there that hasn’t claimed to be “data-centric” or “data-driven” at some point.

The arrival of GenAI has only given data collection and management efforts even more momentum, with businesses everywhere rushing to train AI systems, a process that requires large volumes of good-quality data. As a result, many are gathering and storing data as fast as possible, often with little regard for the financial and compliance implications.

A major part of the challenge they face is that 80-90% of this information is unstructured, i.e. spread across formats such as documents, images, videos, emails, and sensor outputs, which only adds to the difficulty of organising and controlling it.

In many situations, the lack of data management systems and processes is adding to the problem. Data is collected from a wide range of sources, for a wide range of reasons. It then resides across various hybrid environments (on-premises, cloud, or both) for indeterminate periods, with many businesses reluctant to delete it in case it harbours latent business or regulatory value.

The net result is that organisations everywhere are storing vast amounts of data with little or no visibility into what they actually have, where it came from, where it resides, how it is being used, or whether they need to keep it or not. This leaves them with no meaningful way to optimise their storage infrastructure and processes, keep control of their storage costs, or control how their environments evolve over time, let alone how to derive value from their data.

Clearly, something has to give. Organisations need to see what data exists across the entire storage estate, including details such as age, location, ownership, activity levels, and type, to understand how it contributes to, or undermines, storage system optimisation.

A monumental management headache

To break this down, detailed metadata insight is essential for revealing how storage is actually being used. Information such as creation dates, last accessed timestamps, and ownership highlights which data is active and requires performance storage, and which has aged out of use or no longer relates to current users.

This level of clarity exposes large volumes of data that consume capacity without delivering value, giving organisations a realistic picture of what should remain on primary systems and what can be relocated or archived.
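To make the idea concrete, the minimal sketch below (in Python, assuming a POSIX filesystem, a hypothetical /mnt/share mount point, and an arbitrary 180-day inactivity window, none of which come from the article) walks a directory tree and uses basic metadata, last-access and last-modified timestamps, size, and owner, to split capacity into active and dormant data:

```python
#!/usr/bin/env python3
"""Minimal sketch: inventory file metadata to separate active from dormant data.

Assumptions (illustrative only): a POSIX filesystem, a 180-day inactivity
threshold, and a scan rooted at a hypothetical /mnt/share path. Real estates
span many platforms and petabytes, so treat this as a sketch, not tooling.
"""
import os
import time
from datetime import timedelta

ROOT = "/mnt/share"                                  # hypothetical mount point
INACTIVE_AFTER = timedelta(days=180).total_seconds()  # arbitrary threshold

def walk_metadata(root):
    """Yield (path, size, last_access, last_modified, owner_uid) per file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # skip files that vanish or deny access mid-scan
            yield path, st.st_size, st.st_atime, st.st_mtime, st.st_uid

def summarise(root):
    now = time.time()
    active_bytes = dormant_bytes = 0
    for _path, size, atime, mtime, _uid in walk_metadata(root):
        # A file is "dormant" if neither accessed nor modified within the window.
        if now - max(atime, mtime) > INACTIVE_AFTER:
            dormant_bytes += size
        else:
            active_bytes += size
    print(f"active:  {active_bytes / 1e9:.1f} GB")
    print(f"dormant: {dormant_bytes / 1e9:.1f} GB")

if __name__ == "__main__":
    summarise(ROOT)
```

Even a crude report of this kind tends to show how much primary capacity is consumed by data nobody has touched in months or years.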

So, how can this be achieved? At a fundamental level, storage optimisation hinges on adopting a technology approach that manages data, not storage devices; simply adding more and more capacity is no longer viable.

Instead, organisations must have the ability to work across heterogeneous storage environments, including multiple vendors, locations, and clouds. Tools should support vendor-neutral management, allowing data to be monitored and moved regardless of the underlying platform. Clearly, this has to take place at petabyte scale.

Optimisation also relies on policy-based data mobility, which enables data to be moved according to defined rules such as age or inactivity. Under such policies, files that have not been accessed or modified for long periods are relocated to lower-tier storage or deleted altogether.
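As a rough illustration of what such a policy engine might look like in miniature, the sketch below (Python again, with hypothetical /mnt/primary and /mnt/archive directories standing in for storage tiers and arbitrary one-year and seven-year thresholds, none taken from the article) matches each file against simple age-based rules and either archives or deletes it, defaulting to a dry run:

```python
#!/usr/bin/env python3
"""Minimal sketch of policy-based data mobility.

Assumptions (illustrative only): policies are simple idle-age thresholds,
"tiers" are just directories, and the run is dry by default. Production
tooling would operate vendor-neutrally across platforms, locations, and
clouds at petabyte scale rather than over a single path.
"""
import os
import shutil
import time
from dataclasses import dataclass

@dataclass
class Policy:
    name: str
    min_idle_days: int   # act only on files idle at least this long
    action: str          # "archive" or "delete"

POLICIES = [
    Policy("cold-to-archive", min_idle_days=365, action="archive"),
    Policy("stale-delete", min_idle_days=7 * 365, action="delete"),
]

PRIMARY = "/mnt/primary"   # hypothetical primary tier
ARCHIVE = "/mnt/archive"   # hypothetical archive tier

def idle_days(path):
    """Days since the file was last accessed or modified, whichever is later."""
    st = os.stat(path)
    return (time.time() - max(st.st_atime, st.st_mtime)) / 86400

def apply_policies(root, dry_run=True):
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            age = idle_days(path)
            # Check the most aggressive (oldest-threshold) policy first.
            for policy in sorted(POLICIES, key=lambda p: -p.min_idle_days):
                if age >= policy.min_idle_days:
                    if dry_run:
                        print(f"[dry-run] {policy.action}: {path} ({age:.0f} days idle)")
                    elif policy.action == "archive":
                        dest = os.path.join(ARCHIVE, os.path.relpath(path, root))
                        os.makedirs(os.path.dirname(dest), exist_ok=True)
                        shutil.move(path, dest)
                    elif policy.action == "delete":
                        os.remove(path)
                    break

if __name__ == "__main__":
    apply_policies(PRIMARY, dry_run=True)
```

The dry-run default matters: policy-driven movement should be reviewed against ownership and compliance requirements before any data actually changes tier or is removed.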

Then there is the question of governance, where effective, optimised processes (or the lack of them) directly affect whether businesses can properly meet their compliance obligations. In this context, good governance assigns ownership and responsibility for data, reducing the volume of orphaned or unmanaged files. In doing so, it also helps address security vulnerabilities and operational inefficiencies associated with poorly managed data.

Optimising the environment requires systems and processes that document how data is created, stored, retained, and archived, supported by regular audits and clear visibility into ownership, age, and activity. It also depends on tools that can classify and tag data consistently and apply policy-based movement across all storage environments, ensuring information is managed in line with business and regulatory requirements.

Putting the right technologies and processes in place is now imperative. Organisations that continue to kick the can down the road will, sooner or later, find themselves on the wrong end of a serious compliance breach, or will continue to throw good money after bad as they add more and more storage capacity to meet their needs. Either way, getting data management back under control should be as much of a priority as other mission-critical technology issues, particularly for those businesses with a heavy focus on AI.
