Use content search and PowerShell to report on (non)labelled documents.
Last Saturday I was honored to speak at SharePoint Saturday Netherlands. During my talk I was asked whether it is possible to get an overview of all documents that have not been labelled. It is, but because of time constraints I could not demonstrate it.
To get this kind of overview, we need the names of all labels and Office 365 content search. Why the names? Because in content search we’re going to use the ComplianceTag field: we’ll search for any document whose ComplianceTag doesn’t equal any of the labels. Is this handy? Perhaps not, but it’s one of the few ways to accomplish what we want.
Get the label names
For the search query you will need the names of the labels. The easiest way to get these is with PowerShell: connect to the Security & Compliance Center and run the Get-ComplianceTag cmdlet. The output lists your labels, including the name of each one.
Now that we have the names of the labels, we can use them in a search. Let’s go to the Security & Compliance Center.
Let’s say we want to know which Word and PowerPoint documents in our SharePoint Online environment (including OneDrive, Teams and Office 365 Groups) have not been labelled. We can find out using content search.
Go to Search & Investigation | Content search. Here we’ll start a new search query.
We will use two conditions. The first selects the Word and PowerPoint files (file type equals .docx or .pptx). The second finds all non-labelled documents, and this is where the label names come in: the condition is based on the compliance tag, which should not be equal to any of the labels.
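If you have many labels, typing them into the condition by hand gets tedious. Below is a minimal Python sketch that builds the equivalent keyword query as a string. Treat it as an illustration rather than the exact query the portal generates: the label names are made up, the helper names are my own, and I am assuming the FileExtension and ComplianceTag properties combined with a NOT(...) clause, in the same style as the ComplianceTag queries shown in the comments below.

```python
from typing import Iterable


def quote(value: str) -> str:
    """Wrap a value in straight double quotes for a property filter."""
    if '"' in value:
        # A double quote inside the value would break the query string.
        raise ValueError(f"value must not contain a double quote: {value!r}")
    return f'"{value}"'


def build_unlabelled_query(label_names: Iterable[str],
                           extensions: Iterable[str]) -> str:
    """Build a query for files of the given types that carry none of the labels.

    Hypothetical helper: the property names (FileExtension, ComplianceTag)
    and the overall query shape are assumptions, not confirmed portal output.
    """
    labels = list(label_names)
    exts = [ext.lstrip(".") for ext in extensions]
    if not labels or not exts:
        raise ValueError("at least one label name and one extension are required")

    # Condition 1: the file is one of the requested types.
    file_clause = " OR ".join(f"FileExtension:{ext}" for ext in exts)
    # Condition 2: the compliance tag equals none of the known labels.
    label_clause = " OR ".join(f"ComplianceTag:{quote(name)}" for name in labels)
    return f"({file_clause}) AND (NOT({label_clause}))"


if __name__ == "__main__":
    # Made-up label names; use the names returned by Get-ComplianceTag.
    print(build_unlabelled_query(["Contract", "HR record"], [".docx", ".pptx"]))
```

The resulting string can be pasted into the keyword box of a new content search instead of building the two conditions in the condition editor, assuming your tenant accepts these property names.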
When you’ve entered these details, you can save and run the query. It will show all documents that have not been labelled. If needed (and probably you will need this), you can download the (report of the) results. You need to use the same procedure I described in this article: Data Subject Requests – news for Office 365 and GDPR
You can use this export in Excel and get the required information. This will look something like this.
Does this process also work in reverse? Can I find any documents which have been labelled? Yes you can, but take a look at the new label explorer function first. If you do want to use the content search, just use the label-name in the condition with the “equals to” operator.
This should work like this. I’ve added the SharePoint document library to show that this works 😉
Azure Information Protection
Like I mentioned during my session a couple of times: Office 365 labels are not the same as Azure Information Protection labels 🙂 But what if you want to do the same with these kinds of labels? So: get an overview of all documents which have not been labeled?
In a nutshell: using Cloud App Security. Here you can create a file policy to check for the AzureIP label. You can even set a scope (which locations to look in) or choose which labels you want to find. If you simply want to find all non-labeled documents, use this condition.
Neat trick: a policy can also have actions added. For instance: sending an e-mail to the owner of the document that it isn’t labelled.
I tried this for e-mail messages and the behavior is strange:
When I use “equals any of” with the tag “mytag” it works correctly. When I use “equals none of” with the tag “mytag”, I also get the tagged content in the results.
Do you know something about that behavior?
I got the same just now. The way I fixed this was to add the “Type” column and then select “e-mail”. Then only the content which doesn’t have the tag will be shown. Weird, but it works…
Thanks for your hint, but it does not work here. My search has these conditions:
a keyword, to reduce the number of messages
compliance tag with “equals none of”
type with only “E-Mail” checked.
This is my first query: (("DRC") AND (NOT(ComplianceTag:"mytag")))
This is my query limited to e-mail: (("DRC") AND (((ItemClass:"IPM.Note") (NOT(ComplianceTag:"mytag"))))
Even weirder: today it works!?