Updated: May 24, 2020
The largest collection of unstructured and ungoverned content in most organizations is the shared network drive. The mere existence of this content means that your organization is likely under-compliant in a number of critical areas, mostly because this information is so hard to see and manage. It is therefore also difficult to retain and protect. We want to shine light on this dark data, and transform it into useful, compliant, and protected data. But before we do that, we need to do a little prep work.
One of the things you have to do before you paint a house is to tape up all the doors and frames. Taping doors is one of the less satisfying household tasks, since it takes time and, by itself, doesn’t actually beautify the room. It does, however, provide huge benefits. It starts by giving you a good perspective on the job at hand and any problems that might arise. You also end up with straight edges and a lot less cleaning afterwards. A similar lesson can be applied to the process of transforming your shared drives: some prep work can add a lot of benefits. Unfortunately, many organizations decide to just “lift and shift” their content to the cloud, i.e., skip the taping job in order to save time. The longer-term results are less than satisfying.
My goal is to make some recommendations about what you can do at the enterprise, workgroup, and personal level to clean and prep your shared drives for better information management and potential migration. The results will improve classification efforts and reduce the time involved. To understand how these recommendations impact your efforts, we first need to look more deeply into what is on your shares and at the technologies available to clean, tag and migrate.
What is on a shared drive?
Large organizations are transforming shared drives as part of the overall program of adopting information governance capabilities in M365. Transformation means cleaning, organizing, and managing information that currently resides on network shares. Transformation, when done with classification and file analysis tools, can be simplified and made more accurate with a good taping job.
We think of shared drives as being the place where office computer files are stored, and collaboration occurs. Most companies are surprised to find out that documents for collaboration represent only about 35-45% of the files found on shared drives. The rest, which should not or cannot be directly migrated to M365, comes from when we:
Create, download, and distribute software
Put convenience copies in multiple places
Create, share, and run databases
Store our stock photo investments for marketing purposes
Keep documents for long term preservation
Link spreadsheets together
Store emails and PST email archives so they don’t fill up the inbox
Back up our desktop when we get a new computer or leave the organization
Manage CAD drawings or other compound files
Store our company happy hour photo artifacts
Develop and publish web content to the company
Copy our iTunes library
Keep documents because we have forgotten about them, or don’t know if someone else needs them
What we know about these files is that many of them have value but will lose that value if migrated without conversion. Many files do not have value to the company but will cost more time, storage, and resources if they are moved to costly platforms.
A proper information management strategy for shared drives is focused on segregating the different types of content so that each can be managed more effectively. Put the databases, install files, and applications on dedicated servers separate from electronic records. Put the good content into structures that allow you to manage, classify, or purge it. Put the risky content in a place with appropriate labels so that it can be effectively protected. Put the archive content in a place that is designed to preserve it. Get rid of useless non-business content so that managing everything else becomes easier. Everything else can then be classified, clustered, labeled, validated, and migrated to add the most value. All this can be done using the right tools to map out what information you have, and a solid approach to cleaning.
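To make the idea of segregation concrete, here is a minimal Python sketch that routes a file to a handling bucket based on its extension. The bucket names and the extension lists are illustrative assumptions, not a standard and not the behavior of any particular product; a real file analysis tool would also look at content and metadata, not just the file name.

```python
from pathlib import PurePath

# Hypothetical extension-to-bucket rules (assumed for illustration only).
# Adjust both the bucket names and the extensions to your own environment.
BUCKETS = {
    "software": {".exe", ".msi", ".dll", ".iso"},
    "database": {".mdb", ".accdb", ".sqlite"},
    "email_archive": {".pst", ".ost", ".msg"},
    "media": {".jpg", ".png", ".mp3", ".m4a", ".mp4"},
    "cad": {".dwg", ".dxf"},
    "document": {".docx", ".xlsx", ".pptx", ".pdf", ".txt"},
}


def bucket_for(path: str) -> str:
    """Return the handling bucket for a file path, or 'review' if unknown."""
    suffix = PurePath(path).suffix.lower()
    for bucket, extensions in BUCKETS.items():
        if suffix in extensions:
            return bucket
    # Anything unrecognized is left for a person to look at.
    return "review"
```

For example, `bucket_for("finance/budget.XLSX")` falls into the "document" bucket, while a file with no recognized extension is sent to "review" rather than being silently migrated or deleted.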
A case for Information Mapping
Tools that can help with the investigation, classification, and cleanup of unstructured content exist in the market. Whether you plan to move some or all of your content to M365, that digital shared drive landfill can provide some useful insights about what you might run into along the way. Information mapping tools provide useful insights at different levels:
Discovery Check – Microsoft provides this solution as part of their Compliance Workshop. It looks at content already in an M365 environment to see where risks and specific content types might reside that require more advanced protection. It maps risk to location.
Case Management – This is not really the right description for these tools, since their creators would use different terms, but it is what they do. Basically, these tools track particular events that involve certain classifications of content, and they map each event to the repositories where that content resides. One example is a solution for GDPR (or another privacy regulation). When a subject access request (SAR) arrives asking you to produce content containing personal data related to a specific individual, the solution will track that event and related communications, and point to all the systems or repositories that need to be examined for responsive data. When you have 237 SARs, it tracks which have been responded to, how long each took, and what is left to do. The same information mapping capabilities are found in solutions for legal holds, records classifications, FOIA requests, “meet and confer ESI maps,” and so on. They do not map to the individual digital object, but they could be used to help organize functional mapping of content for SharePoint or Teams site deployments.
File Analysis – These Information Governance (IG) tools will inventory a network share and gather information from each file (like dates and size), the file attributes (like author), and file content (like words, phrases, and numbers). All these pieces of metadata can be combined to categorize information in an actionable way. Tools can support multiple taxonomy facets (such as retention category, content type, security, status), which can be simultaneously applied to content across the enterprise. The actions you can take range from deleting and moving files to outputting migration scripts that allow you to migrate files and metadata into content repositories.
The quickest and most beneficial service these tools provide is insight into what your content actually looks like. It is hard to size a content repository and plan migration unless you know how much content you have.
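As a rough illustration of the kind of sizing insight described above, the following Python sketch walks a folder tree and totals file counts, bytes, and stale files per extension. It is a simplified stand-in, not how any commercial file analysis tool works: the `inventory` function name, the `stale_years` parameter, and the seven-year default are assumptions made for the example, and it reads only file system metadata (size and last-modified date), not authorship or content.

```python
import os
import time
from collections import defaultdict


def inventory(root: str, stale_years: float = 7.0) -> dict:
    """Summarize files under root: count, bytes, and stale count per extension.

    A file counts as stale when its last-modified time is older than
    stale_years (an assumed threshold; use your own retention policy).
    """
    cutoff = time.time() - stale_years * 365 * 24 * 3600
    summary = defaultdict(lambda: {"files": 0, "bytes": 0, "stale_files": 0})
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            ext = os.path.splitext(name)[1].lower() or "(none)"
            row = summary[ext]
            row["files"] += 1
            row["bytes"] += info.st_size
            if info.st_mtime < cutoff:
                row["stale_files"] += 1
    return dict(summary)
```

Running `inventory` against a share path and sorting the result by bytes gives a first estimate of how much content there is and how much of it has not been touched in years, which is the starting point for sizing a target repository.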
Transforming shared drives is much more than installing mapping tools. Buying the paint brush is only one step in painting a house. Transforming shared drives compliantly and effectively requires a strategy, a methodology, policies, communication, a number of M365 design decisions, and appropriate approvals.