Azure Information Protection Scanner

It has been a while since I last wrote about the Azure Information Protection Scanner. I still love the functionality, although there’s always some room for improvement.

Not too long ago, Microsoft released a version of the Unified Labeling client that supports the scanner. So now you can use the scanner to scan, classify and protect documents on your on-premises file shares, NAS and SharePoint environments using Microsoft 365 sensitivity labels.


For those of us who have not seen or used the scanner, I decided to create a little video on this. I hope you enjoy 🙂 But I also want to share some more information.


As you can see in the figure above, there are a couple of specific components involved in configuring the scanner. Let’s take a look.

This server is the scanner itself (also called a node). To configure the scanner, you need a Windows Server on which you install the Unified Labeling client. Using PowerShell, you then install (or upgrade) the scanner itself. When the scanner is installed for the first time, a SQL database is created to store the policies. The installation also creates a Windows service, called Azure Information Protection Service, which runs under an Active Directory account.


You can have multiple servers or nodes, which together form a cluster. This cluster is configured to use one or more content scan jobs. A content scan job contains the overall policy settings of the job (its run schedule, for example) as well as the content repositories to be scanned by the cluster.


Components required

To get the scanner to work, you will need several components. Most are straightforward:

  • Windows Server
  • SQL Server or SQL Server Express
  • An Active Directory account, with access to the repositories
  • An Azure Active Directory account with the Azure Information Protection license
  • An Azure Application registration with the required permissions
  • Azure Information Protection UL client

Scanner service account

Some quick notes on the account. There are three ways to configure the account that runs the scanner. All of them require this account to have access to the repositories.

  1. The preferred method is using an Active Directory account which is synched to Azure AD and has the AIP license;
  2. The alternative (example below) is to use an Active Directory account which is not synched to Azure AD and works “on behalf of” a separate account in Azure Active Directory;
  3. The last alternative is to use a local machine account that works “on behalf of” a separate account in Azure Active Directory. But this will not work for SharePoint on-premises repositories.

On the Windows Server, the scanner service account requires the Log on locally right (for installation and configuration). This right can be removed once the scanner is working properly. The Log on as a service right is granted automatically during installation and is required throughout.

For the two alternatives you need an Azure app registration so that the scanner can work in the background (unattended). This app registration needs specific permissions for Azure RMS and Microsoft Information Protection. You use a PowerShell cmdlet to connect the app registration, the Azure Active Directory account and the scanner service account:

Set-AIPAuthentication -AppId <ID of the registered app> -AppSecret <client secret string> -TenantId <your tenant ID> -DelegatedUser <Azure AD account> -OnBehalfOf <scanner service account credentials>

For example:

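# Credentials of the on-premises account that runs the scanner service
$pscreds = Get-Credential CONTOSO\AIPScanner
# Replace <Azure AD account> with the Azure AD account the scanner works on behalf of
Set-AIPAuthentication -AppId "77c3c14e-8b2b-4652836c8c66" -AppSecret "OAkk+rnuYc/u+]ah2kNxL4" -DelegatedUser <Azure AD account> -TenantId "9c11c87a-ac8b-46a3-8d5c-f4d0b72ee29a" -OnBehalfOf $pscreds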

This should return a value of:

Acquired application access token on behalf of CONTOSO\AIPScanner.

But this cmdlet can only be run once the other steps have been completed. So let’s take a look at these.

Steps required

These are the steps to create the scanner:

  1. Configure the scanner in the Azure portal (cluster, content scan job);
  2. Install the scanner on the Windows Server;
  3. Configure the app registration in order to get an Azure AD token (if needed);
  4. Configure the scanner (fine-tune the content scan jobs).

Installing the scanner

When you install or update the scanner on the Windows Server, it connects to a specific cluster and downloads that cluster’s content scan jobs into the SQL database. If the server isn’t connected to the internet, you can export the cluster configuration and import it into the scanner using PowerShell.

For this installation process, you will use this PowerShell cmdlet:

Install-AIPScanner -SqlServerInstance <name> -Profile <cluster name>

For example:

Install-AIPScanner -SqlServerInstance SQLSERVER1 -Profile AzureIPScanCluster1

By the way: all the cmdlets in this post are included in the Azure Information Protection client installation 🙂
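
For the offline case mentioned above, the import could look roughly like this. It is only a sketch under a few assumptions: the cluster configuration has already been exported from the Azure portal, the file path and name are hypothetical, and the exact parameters available may differ between client versions.

```powershell
# Sketch: configure a scanner node that cannot reach the internet.
# Assumption: the cluster configuration was exported from the Azure portal
# and copied to this (hypothetical) location on the scanner server.
$exportedConfig = 'C:\Temp\AzureIPScanCluster1.json'

# Tell the scanner not to look for its configuration online.
Set-AIPScannerConfiguration -OnlineConfiguration Off

# Load the exported cluster and content scan job settings into the scanner.
Import-AIPScannerConfiguration -FileName $exportedConfig
```

Keep in mind that an offline scanner will not pick up later changes made in the portal, so the export and import have to be repeated whenever the configuration changes.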

Now the scanner should be up and running and will appear in the Azure Information Protection dashboard – under Nodes. You can check if the Windows service Azure Information Protection Service is running. You can also use several PowerShell cmdlets to check this, for example:

  • Start-AIPScannerDiagnostics
  • Get-AIPScannerStatus
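
Put together, a quick health check on a scanner node might look like the sketch below. The wildcard on the service display name is my own choice so the check does not depend on the exact name, and CONTOSO\AIPScanner is simply the example account used earlier in this post.

```powershell
# Sketch: quick health check on a scanner node after installation.

# Look up the Windows service; the wildcard avoids relying on the exact display name.
Get-Service -DisplayName 'Azure Information Protection*' |
    Select-Object Name, DisplayName, Status

# Current status of the scanner as reported by the AIP cmdlets.
Get-AIPScannerStatus

# Run the diagnostics in the context of the scanner service account.
$scannerCreds = Get-Credential CONTOSO\AIPScanner
Start-AIPScannerDiagnostics -OnBehalfOf $scannerCreds
```

Running the diagnostics on behalf of the service account is useful because it checks what the scanner itself can reach, rather than what your own admin account can reach.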

In all – if you have on-premises repositories you want to scan and classify, and you have the relevant licenses :-), then please check out the scanner.








  1. Hi Albert, thank you for the nice article. However, I have a question: which user should be logged in to the system to run the “Install-AIPScanner -SqlServerInstance -Profile ” command? I tried to install the AIP scanner using a local machine user and to get the token using the set-authentication cmdlet with an Azure AD user who has access to pull the policies. I received the token after a successful run of the “set-authentication” cmdlet, but I get an error – “no policy found” – on the Azure portal. Could you please point out if I am missing something here? Thank you.

    1. Hi Sunil,

      Thanks for your question. I’ve modified the post, as there are more options for configuring the account. In my initial blog I only mentioned one: an Active Directory account working on behalf of an Azure Active Directory account. And this works fine. But there are alternatives, like running using a local Windows account.

      I’ve added the information to the blog.

      As to your error: I think you mean the “Set-AIPAuthentication” cmdlet. If so, do you have an Azure Information Protection policy and labels configured? You will need to have several labels configured. If not, you will not be able to use the scanner.

      By default, you should have a “Global” policy – without any labels in there :

      But you can add these.

      I hope this helps. This article also provides more information:

  2. Hi there,

    Great content,

    Was wondering if you have applied sensitivity labels based on a CSV?

    If so what is the best way to do this ?

    Thanks in advance.

  3. Hello Albert,

    First of all, thanks! It was a great tutorial. I have several doubts regarding the offline scenario. As you have mentioned it, maybe you can help me with the configuration that I’m trying to do. The question is clear: is it possible to deploy the AIP Scanner (only for labeling and classification) on an offline server? The official Microsoft Doc:

    says that it is possible, but it talks about “computers that cannot connect to the internet for a period of time”. I’m talking about no connection at all (not even for a period of time). Totally 100% offline… Does it require at least a minimum window of internet connectivity in order to get an Azure AD token for the scanner?

    Best regards and thanks in advance

    1. Hi Paul,

      Thanks for your reply. Yes, it is possible to run the scanner in offline mode for a longer period of time.
      The scanner will need to receive your (IRM) policies, label configuration, etcetera. For this you can export this configuration (content scan jobs) and import it on the offline scanner.

      When the scanner is not connected to the internet, it will not receive any updates to the policies and you will not be able to use the label explorer, statistics and more.

      But it should work.

  4. Hello Albert, thank you for the great post.
    I have a quick question – we are looking to label a document with unified labelling based on the custom property set in the advanced properties section of the office document, is this doable?
    If yes, we have millions of documents in the repository that we need to scan, can you please advise on best possible way to achieve this in an efficient manner?

    1. Hi Jet,

      The unified labeling solution looks for information in the document itself for detecting and classifying. It will not look at the (advanced) properties section of the documents. So I’m afraid I cannot help you with that. For scanning a large number of documents, I would recommend using multiple content scan jobs and repositories. You will need to be able to handle the output of the scanner, so breaking the jobs into smaller bits is recommended. If you want more information or help, then I highly recommend looking at this Yammer-group: It’s hosted by the Microsoft people responsible for the platform and many answers are given there 🙂

  5. Hi Albert, Thank you for the nice article.
    I performed the AIP Scanner installation properly. Can I assign “Confidential” label to only dwg extension files on File Server? Is something like this possible?

    Does auto-labeling only apply to text-based files?

    1. Hi there,

      Auto-labeling for documents at rest (SharePoint Online, Exchange Online or OneDrive for Business) can be used to classify and protect information in Office files (Word, PowerPoint, and Excel – in Open XML format) and PDF format. The latter is only supported for Exchange Online.

      All information protection auto-labeling functionality will look at the content in the files to determine the sensitivity. Unfortunately, this means that a file extension alone will not do the trick, I’m afraid. For supported file types you could probably do this using PowerShell or the unified labeling client. For example:

      Get-ChildItem C:\Projects\*.docx -File -Recurse | Get-AIPFileStatus | where {$_.IsLabeled -eq $False} | Set-AIPFileLabel -LabelId d9f23ae3-1234-1234-1234-f515f824c57b

      But I’m afraid that dwg is not on the supported file-type list.
      There is an integration option though. Please see the link.
