Bring the industry focus to dark data
Virendra Shekhawat, Group Manager – Hybrid Cloud Services, HCL Technologies | July 2, 2020

Business and IT administrators mostly ignore the value of dark data and treat it as a problem. But it can be a business opportunity for organizations. Let’s understand how.

Most organizations are good at managing their structured data, but a lot of industry-critical data still exists in an unstructured format. This causes an exponential increase in data storage costs, which can be attributed to the maintenance of hot-tier storage and data center (DC) footprint expenses.

According to a recent IDC survey, 80% of data worldwide will be unstructured by 2025, as it is growing at a rate of 25% annually and doubling in capacity every 40 months. Managing this unexpected and unpredictable growth in capacity is a key challenge for organizations.
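
As a quick sanity check on these growth figures, the doubling time implied by a compound growth rate can be computed directly. The sketch below assumes the 25% annual rate compounds once a year and stays constant, which is a simplification; the function name is illustrative only.

```python
import math


def doubling_time_months(annual_growth_rate):
    """Months needed for capacity to double at a constant compound annual rate."""
    years = math.log(2) / math.log(1 + annual_growth_rate)
    return years * 12


months = doubling_time_months(0.25)
print(f"Doubling time at 25% annual growth: {months:.1f} months")
```

At a steady 25% per year this works out to a little over 37 months, in the same range as the 40 months quoted above.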

Data is critical to every business’s success and is considered one of the most valuable resources for any organization. However, not all data is active: more than 40% of customer data is deemed to be ROT (redundant, obsolete, or trivial). It is not well known to anyone, concealed from view, easy to ignore, and hard to access and analyze. Some categories of dark data are aged data, ex-employee information, MP3 files, video files, text files, PDF files, log files, survey data, notes, presentations, emails, and email attachments.

The primary reason data is improperly used is a lack of visibility. Organizations collect data from various systems, terabytes of data are added to enterprise servers daily, and there is no mechanism to keep track of these data sources. Another reason data goes dark is that organizations do not know how to utilize unstructured data and lack proper access to data integration and analytics tools, which may call their digital transformation initiatives into question. Organizations should take advantage of dark data to acquire new insights while creating new business and revenue opportunities, building cost-optimized storage solutions, and developing new partnerships to move toward data-driven approaches.

How to Deal with It?

There are 3 steps designed to deal with dark data:

Data Discovery & Analysis – This process runs file metadata discovery on large volumes of unstructured data to get complete visibility of an organization’s overall data landscape, and identifies dark data with the help of data analytics tools or by applying various data pattern algorithms, queries, and metadata analysis.

Classify your data – Classify enterprise data with the help of a data categorization engine. This allows organizations to label each data set with the business it belongs to, its application context, data value, security, risk, etc.

Policy-based data management – Defining policy-based management helps organizations decide what to do with classified data: whether to cleanse it or archive it. Cold data can be tiered to the lowest-value storage platform and hot, critical data to the highest-value storage platforms, instead of hosting all data on Tier 1 storage.
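
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular product: the `FileRecord` type, the extension list, the 365-day threshold, and the tier names are all hypothetical placeholders that a real organization would replace with its own classification rules and policies.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    days_since_use: float
    extension: str


def discover(root, now=None):
    """Step 1: walk a file tree and collect metadata only (no file contents)."""
    now = time.time() if now is None else now
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                st = os.stat(full_path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            # Access times are unreliable on some mounts, so take the most
            # recent of the access and modification timestamps.
            last_used = max(st.st_atime, st.st_mtime)
            records.append(FileRecord(
                path=full_path,
                size_bytes=st.st_size,
                days_since_use=(now - last_used) / 86400,
                extension=os.path.splitext(name)[1].lower(),
            ))
    return records


# Hypothetical classification rules, for illustration only.
TRIVIAL_EXTENSIONS = {".mp3", ".tmp"}
COLD_AFTER_DAYS = 365


def classify(record):
    """Step 2: label a file as trivial, cold, or hot."""
    if record.extension in TRIVIAL_EXTENSIONS:
        return "trivial"
    if record.days_since_use > COLD_AFTER_DAYS:
        return "cold"
    return "hot"


# Step 3: hypothetical policy mapping each label to an action or tier.
POLICY = {
    "trivial": "review-for-deletion",
    "cold": "archive-tier",
    "hot": "tier-1",
}


def plan(records):
    """Return the policy decision for every discovered file."""
    return {r.path: POLICY[classify(r)] for r in records}
```

Calling `plan(discover("/some/share"))` on a mounted file share would return a path-to-action mapping. A real deployment would add richer labels (business owner, application context, risk) in place of the single age-and-extension rule used here.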

As you can see, the three data management measures above would help organizations move data from high-cost storage platforms to low-cost (unstructured data storage) platforms. On the other hand, spinning disks are evolving drastically. In the coming days, we can expect 18TB to 20TB hard drives and dual-actuator hard drives to enhance the marketplace and significantly reduce storage cost. However, putting inactive data on dual-actuator hard drives will again add unnecessary cost for the organization.


A strong dark data assessment solution provides an enterprise with a 360° view of corporate content for more advanced data analytics. This enables service integrators (like HCL) to go beyond cost management and begin innovating and creating new value for customers with the insight extracted from massive and growing content. Analyzing file shares (SMB/CIFS/NFS) to understand an enterprise’s file data brings business value across the overall data life cycle (visualize, analyze, move, and manage): analysis-driven data placement to discard unwanted data, archiving to S3, DC consolidation to lower overall TCO, risk mitigation, and simplified data management to modernize the customer’s digital journey.

Dark data can be brought into focus to achieve maximum potential ROI. On top of that, a simple approach to data classification and management, backed by cost-benefit data analytics, eliminates the complexity closely associated with the mystery of dark data. Manual assessment of dark data can be a tedious task, so organizations need to choose the right dark data analytics tool and methodology to shed light on it, which opens up tremendous opportunities in economics, compliance, and productivity.

It is high time for organizations to realize, think about, and adopt a strategy for dark data management!

