Last November, Microsoft announced a lot of new enhancements to the Microsoft Information Protection portfolio. In a short series of blogs I will explain some of these. In this blog: auto-labelling your documents at rest in SharePoint Online and OneDrive. This is the second and final part of the blog.
In my previous blog I described how you can configure the auto-classification scheme and create policies. It took me some time to get them working. In the end, I learned a few things, and one of them is important: the classification uses the DLP engine of Office 365. Which means, among other things, that it’s quick!
You can see the video at the end of this blog 🙂
In the end I created several policies. One of these I used to detect and classify credit card information. I only added the credit card sensitive information type. After 24 hours, the policy had finished its run in simulation mode, and my documents were detected.
After turning on the policy, I checked the SharePoint site. And yes: the Word document was classified. Just to check, I added another Word document and a PowerPoint document. In just a couple of minutes, these were labelled as well.
Again, it took some waiting to get this to work. But in the end auto-classification with labels worked for all locations (SharePoint, OneDrive and Exchange). I’m happy 🙂 I also like the notification you get when hovering over the document.
What if I want to label EVERY document in a SharePoint site, and not only documents with, e.g., credit card information? Should I create a ‘sensitive type’ with a regex that matches everything? Or is there a way to just apply auto-labeling to all information?
No, at this moment there is no method of classifying all documents in a document library with the same label. There is such a function for retention labels (a default for all new documents), but not for these types of labels. You do need to look at the content of the document to set the label. For a retention label this is different, as the location can be used to store all documents which need the same retention period.
Microsoft Information Protection has a different approach. It does not matter where the document is stored. Classification and protection are based on the content of the document. So auto-labeling, either in the Office apps or in SharePoint and OneDrive, will use properties of the document. For labeling data at rest, you can only use sensitive information types.
These types support the use of keywords, and that might work here, because keywords allow you to work with SharePoint content types. So this might be helpful in your situation: you can set the sensitive information type to detect documents with a specific content type. You use the keyword query for this, with an entry of the form “contenttype:document”. For example: “contenttype:legaldocument”.
Once this is in place, you can select this sensitive information type for classifying the information.
I know, this is not an easy solution and a “default label” might be more handy. But this is not possible in the GUI.
I wouldn’t use the PowerShell cmdlets for this scenario – as SharePoint Online is not supported (I think). There are some other remarks for the auto-classification:
Office files for Word, PowerPoint, and Excel are supported;
Maximum of 25,000 automatically labeled files in your tenant per day;
Maximum of 10 auto-labeling policies per tenant, each targeting up to 10 sites (SharePoint or OneDrive);
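To get a feel for what these limits mean in practice, here is a minimal Python sketch. It is not an official tool and it does not talk to Office 365; the function names (`days_to_label`, `check_plan`) are made up for illustration, and the numbers are simply the ones from the list above, which may have changed since. It also assumes the whole daily budget of the tenant goes to one backlog.

```python
import math

# Limits as quoted in the list above (assumption: still current for your tenant).
MAX_FILES_PER_DAY = 25_000
MAX_POLICIES = 10
MAX_SITES_PER_POLICY = 10


def days_to_label(file_count: int, daily_limit: int = MAX_FILES_PER_DAY) -> int:
    """Lower bound on the number of days needed to auto-label a backlog."""
    if file_count < 0:
        raise ValueError("file_count must not be negative")
    return math.ceil(file_count / daily_limit)


def check_plan(policies: dict[str, list[str]]) -> list[str]:
    """Return a list of problems for a planned {policy name: [site URLs]} layout."""
    problems = []
    if len(policies) > MAX_POLICIES:
        problems.append(
            f"{len(policies)} policies planned, maximum is {MAX_POLICIES}"
        )
    for name, sites in policies.items():
        if len(sites) > MAX_SITES_PER_POLICY:
            problems.append(
                f"policy '{name}' targets {len(sites)} sites, "
                f"maximum is {MAX_SITES_PER_POLICY}"
            )
    return problems
```

So a backlog of 60,000 matching files would need at least `days_to_label(60_000)`, which is three days, and `check_plan` returns an empty list when a planned layout stays within the policy and site limits.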
What if the file is auto-labelled already and you happen to remove the keyword which triggered the auto-labelling? Does that re-trigger the classification, so that the file gets auto-labelled again according to the current condition match? If so, how quickly does that happen?
I think this should be the case, but I haven’t checked this out myself.
I will look into this and will publish the results.
Thanks for the question!