Updated: Sep 6, 2020
In my last blog post, I talked about accelerators that help streamline the design and deployment of O365 using real-life evidence in your shared drives. In this entry, I will discuss some accelerators that address the migration of unstructured content.
Migrating data to O365 is a fairly complex process, far more complex than people who adopt a lift-and-shift approach would have you believe. Previous versions of SharePoint, for example, had over 100 file extensions that were administratively blocked from migration. These files were blocked for good reasons: some could contain malware, some didn’t function in a web environment, and others were too big, were code or system files, or were otherwise dangerous or ineffectual.
Microsoft has a fairly new software tool, the SharePoint Migration Tool, to help you get your information into O365. It migrates older SharePoint sites as well as shared drives. It does not currently try to add value to your content before moving it, such as analyzing, culling, classifying or labeling, or blocking dangerous content. The assumption is that once the content is in O365, you can use the Keyword Query Language and AI capabilities to label and categorize it and, if necessary, purge expired content.
So, the question is: move-then-clean or clean-then-move?
Lift and shift is a straightforward approach: all you have to do is point the tool at the source content, pick a destination, and wait while everything gets uploaded. This is where the sheer size of the numbers comes into play, and the migration technology used doesn’t really change the speed equation. According to Statista.com, average corporate upload speeds are around 40 Mbps. According to the Mover website, it takes half a second of system overhead per file to “receive” it in SharePoint.
If you had a petabyte of data, you would have to wait about 29 years, given average file sizes. Of course, there are ways to parallelize the process and invest more in hardware and bandwidth, but for many of you, this speed will be a reality.
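To make the arithmetic behind that estimate concrete, here is a minimal Python sketch. It assumes a single serial upload in which bandwidth time and per-file overhead simply add up. The 40 Mbps and half-second figures come from the sources cited above; the average file size of about 700 KB is purely an assumption chosen for illustration (the article does not state one), and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def migration_years(total_bytes, avg_file_bytes,
                    upload_mbps=40.0, overhead_s_per_file=0.5):
    """Rough serial migration time in years: bandwidth time plus per-file overhead."""
    # Time to push the raw bits through the upload link.
    transfer_s = total_bytes * 8 / (upload_mbps * 1_000_000)
    # Fixed "receive" overhead paid once for every file.
    file_count = total_bytes / avg_file_bytes
    overhead_s = file_count * overhead_s_per_file
    return (transfer_s + overhead_s) / SECONDS_PER_YEAR


PETABYTE = 1e15            # decimal petabyte, in bytes
ASSUMED_AVG_FILE = 700_000  # ASSUMPTION: ~700 KB average file, not from the article

print(f"Total: {migration_years(PETABYTE, ASSUMED_AVG_FILE):.1f} years")
print(f"Bandwidth only: "
      f"{migration_years(PETABYTE, ASSUMED_AVG_FILE, overhead_s_per_file=0.0):.1f} years")
```

Under these assumptions the total comes out at roughly 29 years, of which only a little over 6 years is raw bandwidth. The rest is per-file overhead, which is why the number of files matters at least as much as their total size.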
While you are waiting 29 years, you can think about the fact that:
7.2 years will be spent moving garbage, duplicates, old logs, Christmas party photos and other things that you probably don’t want or need to move.
3 years will be spent breaking valuable content that is complex or linked in some way (e.g. embedded links in a spreadsheet, a CAD drawing, a database application) because moving the file means links are no longer valid. These files require more careful consideration than lift and shift.
5.8 years will be spent moving already expired records, which, as soon as they arrive, will then need to be deleted.
3 years will be spent migrating those previously blocked dangerous files even though they might still be dangerous.
Without some pre-migration intelligence, you may find yourself wasting 19 of the 29 years, and you might still not know if your Keyword Query Language classifications were accurate.
By contrast, leveraging our indexing and migration expertise and advanced indexing tools, you could skim through that petabyte in about 3 months to cull out the garbage and low-value content. You could then deep index the complex content for remediation and the expired records for deletion in less than a year. The actual migration would then take 65% less time, quite an acceleration.
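The 19-year and 65% figures follow directly from the four categories listed earlier. A short Python sketch of that arithmetic, assuming all four categories are taken out of the migration workload (the labels and function names are illustrative only):

```python
TOTAL_YEARS = 29.0

# Years of migration time attributed to each category in the breakdown above.
WASTE_YEARS = {
    "garbage, duplicates, old logs, party photos": 7.2,
    "complex or linked content broken by the move": 3.0,
    "already expired records": 5.8,
    "previously blocked dangerous files": 3.0,
}


def wasted_years(breakdown):
    """Sum the years spent on content that should not be lifted and shifted."""
    return round(sum(breakdown.values()), 1)


def reduction_percent(breakdown, total_years):
    """Share of the total migration time that the wasted years represent."""
    return 100 * wasted_years(breakdown) / total_years


print(f"Wasted: {wasted_years(WASTE_YEARS)} of {TOTAL_YEARS:.0f} years")
print(f"Reduction: {reduction_percent(WASTE_YEARS, TOTAL_YEARS):.1f}%")
```

The categories sum to 19 years, or about 65.5% of the 29-year total, which is where the 65% reduction comes from.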
Without some forethought, planning, categorization and culling, you may be faced with a very long and useless process. If you want a realistic assessment of your content to understand the effort of migration, or how you can leverage the migration process to strategically enhance your Advanced Data Governance capabilities in O365, please feel free to reach out.