From the contents of emails to intellectual property, business plans, proprietary training documentation, and much more, most enterprises manage vast amounts of unstructured data containing valuable and sensitive information. The sheer volume of unstructured data created and managed by most companies can be enough to drive up storage costs substantially. Beyond managing this data, understanding which of it is sensitive, and protecting it, is a crucial concern for the modern enterprise.
But protecting this data can be difficult. By its nature, unstructured data is hard to locate within the enterprise network, hard to protect from unauthorized access, and hard to keep from leaving the secure company environment.

What exactly is unstructured data?

Unstructured data is data that has no predefined data model or schema and is typically created by people for people to read. Unlike structured data, which follows a fixed schema and is readily machine readable (e.g. records in a database), human-created data such as email, text documents, spreadsheets, video, and pictures is referred to as unstructured. It is often estimated to make up as much as 80% of the data generated today.

A critical characteristic of unstructured data is that it often contains a company’s sensitive data or IP and, in one form or another, probably represents the sum total of the knowledge that has been created or collected in your organisation. If this data were leaked, it could damage the company’s image or even result in heavy fines if you are subject to legislation such as GDPR.
Unstructured data therefore needs to be handled securely, and your data protection policies should state how.


How do you start protecting unstructured data?

Unstructured data protection starts with asking and answering some key questions like:

  • Where are the sources of your unstructured data, and how are they protected?
  • Who has been modifying the files that contain it?
  • Who currently has access to sensitive unstructured data, what access controls are in place, and how do you control super user rights?

Answering these questions completely often takes an unbiased outside party reviewing the company’s IT infrastructure. In larger companies, the internal audit group can often handle this task. Most other companies should consider outside advisors, whether outside counsel or consultants.

To answer these questions well, you first need to identify what kind of sensitive information exists in your unstructured data and where it resides. A key step is to scan your data stores, and even endpoints, for key company terminology (e.g. product IDs, product descriptions, keywords in documents) or for personal data such as names, dates of birth, addresses, and biometric data.
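
As a rough illustration of such a scan, the Python sketch below walks a directory tree and counts pattern matches in plain-text files. The pattern names, the "PRD-12345" product ID format, the date format, and the file suffixes are all assumptions made for this example; a real scan would use your own terminology and would also need to extract text from office documents, PDFs, and mail stores.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration only; replace them with the
# product IDs, keywords, and personal-data formats your company uses.
PATTERNS = {
    "product_id": re.compile(r"\bPRD-\d{5}\b"),
    "date_of_birth": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scan_text(text):
    """Return {pattern_name: match_count} for every pattern found in text."""
    hits = {}
    for name, pattern in PATTERNS.items():
        count = len(pattern.findall(text))
        if count:
            hits[name] = count
    return hits


def scan_tree(root, suffixes=(".txt", ".csv", ".md")):
    """Scan plain-text files under root; return {path: hits} for files with matches."""
    findings = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        hits = scan_text(text)
        if hits:
            findings[str(path)] = hits
    return findings
```

Calling scan_tree on a file share or a synced folder returns only the files that matched, which gives a first inventory of where sensitive content may be sitting. Simple patterns like these will produce false positives, so treat the output as a list of places to review rather than a verdict.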

This understanding helps you target controls according to the nature of the assets and their criticality. Once you understand the unstructured data, you can start putting controls in place to monitor and restrict who modifies the files, copies them, or even changes their access permissions.
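
One very simple form of such monitoring is to take periodic snapshots of a sensitive folder and compare them. The Python sketch below is a minimal example under stated assumptions: the function names are invented for this illustration, the permission bits are only meaningful on POSIX-style file systems, and the approach shows what changed, not who changed it. Attributing changes to a user requires the audit logging of your operating system or file server.

```python
import hashlib
import stat
from pathlib import Path


def snapshot(root):
    """Record a content hash and the permission bits of every file under root."""
    state = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        mode = stat.S_IMODE(path.stat().st_mode)
        state[str(path)] = {"sha256": digest, "mode": mode}
    return state


def diff_snapshots(old, new):
    """Compare two snapshots and list what changed between them."""
    changes = {"added": [], "removed": [], "modified": [], "permissions_changed": []}
    for path, entry in new.items():
        if path not in old:
            changes["added"].append(path)
            continue
        if entry["sha256"] != old[path]["sha256"]:
            changes["modified"].append(path)
        if entry["mode"] != old[path]["mode"]:
            changes["permissions_changed"].append(path)
    changes["removed"] = [path for path in old if path not in new]
    return changes
```

Taking a snapshot on a schedule and diffing it against the previous one flags files whose content or permissions changed in between. Hashing every file is slow on large shares, so in practice this would be limited to the folders identified as most critical.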

Design training programs related to unstructured data

Many employees do not understand risks that are unseen or that build up over long periods of time. For example, when someone downloads data from a secure environment into an Excel spreadsheet or onto a thumb drive, all the controls are gone. Technology can’t solve this; it is a human problem. It can only reasonably be addressed through appropriate use policies and extensive, ongoing user awareness training.

Develop training programs that deal specifically with the security risks related to unstructured information. The programs should use concrete illustrations and convey the benefits an employee gains by taking precautions. Above all, employees need to understand that they must not take sensitive data out of its controlled environment.