Azure RBAC Custom Roles

Microsoft Azure

 

Introduction

Azure supports Role-Based Access Control (RBAC) for controlling what actions a principal (user, service principal, etc.) can perform via the Azure Portal, the XPlat CLI or the Azure PowerShell module.

Azure provides quite a few built-in roles (48 at this time) but it is also possible to define your own custom roles. In this post I will provide a few general tips on RBAC and also how to go about creating your own custom roles.

Actions and NotActions

Actions are the permissions/operations that you wish to allow, and NotActions are operations that you wish to exclude from that allowed set. When assigning roles, be aware that NotActions are not deny rules, as the Microsoft documentation states:

If a user is assigned a role that excludes an operation in NotActions, and is assigned a second role that grants access to the same operation, the user will be allowed to perform that operation. NotActions is not a deny rule – it is simply a convenient way to create a set of allowed operations when specific operations need to be excluded.
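The behaviour described in that quote can be illustrated with a small model. This is a minimal sketch in Python rather than PowerShell, and it is an assumption-laden simplification: the two role dictionaries are hypothetical, and the matching ignores things a real evaluation would consider (scopes, data actions, deny assignments).

```python
from fnmatch import fnmatchcase


def matches(pattern, operation):
    # '*' is treated as a wildcard and the comparison is case-insensitive.
    return fnmatchcase(operation.lower(), pattern.lower())


def role_allows(role, operation):
    # A single role grants an operation if it is in Actions and not in NotActions.
    allowed = any(matches(p, operation) for p in role["Actions"])
    excluded = any(matches(p, operation) for p in role["NotActions"])
    return allowed and not excluded


def is_allowed(roles, operation):
    # Roles are additive: one role granting the operation is enough.
    return any(role_allows(role, operation) for role in roles)


# Hypothetical role definitions, for illustration only.
limited = {"Actions": ["*"], "NotActions": ["Microsoft.Authorization/*/Write"]}
access_admin = {"Actions": ["Microsoft.Authorization/*"], "NotActions": []}

op = "Microsoft.Authorization/roleAssignments/write"
print(is_allowed([limited], op))                # False
print(is_allowed([limited, access_admin], op))  # True
```

The second call returns True because the exclusion in the first role does not override the grant in the second, which is exactly why NotActions cannot be relied on as a deny rule.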

 

View a list of the built-in roles

You can use the Get-AzureRmRoleDefinition cmdlet to view a list of built-in roles:

View a list of the custom roles

You can view the list of custom roles (ones that you have created) available in the currently selected subscription by using the -Custom switch of the same cmdlet.

How to view the possible operations for a particular resource

When you are creating your own roles, you might want to see all the possible operations that can be permissioned for a particular resource type. In the example below, Microsoft.Sql/ represents Azure SQL Database and we use the Get-AzureRmProviderOperation cmdlet to search for all operations that begin with Microsoft.Sql/.

Creating a custom role

There are two ways to create a custom role:

  1. Write a role definition in JSON as shown in the Microsoft documentation; or
  2. If there is a built-in role close to what you need, you can create a custom role based on that existing built-in role (or indeed another custom role) and just modify the Actions/NotActions.

If you have a JSON role definition file you can create a new role definition using the command:

The PowerShell code below shows how you can create a custom role based on an existing one:

NOTE

There are two important points to be aware of when creating custom roles:

  • A custom role defined in one subscription is not visible in other subscriptions.
  • The role name must be unique within your Azure AD tenant – e.g. if you want to use the same role definition across different subscriptions you will need to use a different name in each subscription. Yes, this is a pain and could cause some confusion.
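To make the "base a custom role on an existing one" approach concrete, here is a minimal Python sketch that manipulates a role definition as plain JSON-style data. The field names follow the JSON role definition format (Name, Id, IsCustom, Description, Actions, NotActions, AssignableScopes); the base role shown is a trimmed, hypothetical stand-in rather than a real built-in definition, and the subscription ID is a placeholder.

```python
import copy
import json


def derive_custom_role(base, name, description, scopes, extra_not_actions=()):
    """Return a new custom role definition derived from an existing one."""
    role = copy.deepcopy(base)
    role.pop("Id", None)              # a new role must not reuse the old ID
    role["Name"] = name               # must be unique within the tenant
    role["Description"] = description
    role["IsCustom"] = True
    role["AssignableScopes"] = list(scopes)
    role["NotActions"] = list(role.get("NotActions", [])) + list(extra_not_actions)
    return role


# Hypothetical, heavily trimmed base role for illustration.
base_role = {
    "Name": "Example Base Role",
    "Id": "11111111-1111-1111-1111-111111111111",
    "IsCustom": False,
    "Description": "Example only",
    "Actions": ["Microsoft.Compute/virtualMachines/*"],
    "NotActions": [],
    "AssignableScopes": ["/"],
}

custom = derive_custom_role(
    base_role,
    name="VM Operator (Subscription A)",
    description="Start, stop and restart virtual machines",
    scopes=["/subscriptions/00000000-0000-0000-0000-000000000000"],
    extra_not_actions=["Microsoft.Compute/virtualMachines/delete"],
)
print(json.dumps(custom, indent=2))
```

Because of the uniqueness constraint on role names, the name is a required argument here rather than being copied from the base role.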

Scopes

As you may have noticed from the code snippet above, roles can be applied at multiple scopes, e.g. at the subscription level, at the resource group level or to an individual resource.

It is important to remember that access that you grant at parent scopes is inherited by child scopes.
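Scope inheritance can be pictured as a path-prefix check on resource IDs. The following Python sketch is an illustration under that assumption, not how Azure implements it; the subscription ID and resource names are placeholders.

```python
def scope_applies(assignment_scope, resource_id):
    """True if a role assigned at assignment_scope is inherited by resource_id."""
    parent = assignment_scope.rstrip("/").lower()
    child = resource_id.rstrip("/").lower()
    # Match the scope itself, or anything below it on a segment boundary.
    return child == parent or child.startswith(parent + "/")


sub = "/subscriptions/00000000-0000-0000-0000-000000000000"
rg = sub + "/resourceGroups/rg-web"
vm = rg + "/providers/Microsoft.Compute/virtualMachines/vm01"

print(scope_applies(sub, vm))                            # True
print(scope_applies(rg, vm))                             # True
print(scope_applies(sub + "/resourceGroups/rg-db", vm))  # False
```

The segment-boundary check matters: without it, an assignment on rg-web would appear to apply to a resource group named rg-web2.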

 

Modifying an existing custom role

The simplest way to modify an existing custom role is by retrieving the role definition via Get-AzureRmRoleDefinition and storing it in a variable, then adding/removing actions or changing the scope as required, and finally applying the changes with Set-AzureRmRoleDefinition.

Example custom roles

I have added a few example custom roles to my GitHub repo here:
https://github.com/vijayjt/AzureScripts/tree/master/rbac/role-definitions

The only thing you’d need to change is the assignable scopes in order to make use of the role definitions.

There are only two roles in the repo at the moment:

  1. A custom virtual machine operator role: I created this role to meet a requirement I had to allow particular users to start/stop/restart VMs in a particular resource group.
  2. A custom limited subscription contributor role: this role was created to remove some types of sensitive operations from a subscription contributor. Of course what you deem as sensitive will change based on your context and the users involved. The custom role just adds sensitive operations into the NotActions. Ideally you should use more specific roles and scope them appropriately – but you may be asked to provide such broad access. One of the problems with this approach is that new resource types that may be sensitive are added frequently, so you have to constantly update the role definition.

ARM Template Plaster Template Manifest

Microsoft Azure

Plaster is a template-based file and project generator written in PowerShell. It is commonly used to create the scaffolding for the typical directories and files that are required to create a PowerShell module. However, it can also be used to create the scaffolding for a typical ARM template, e.g. the azuredeploy.json, azuredeploy.parameters.json and metadata.json files, a Pester test script, etc.

You can find an example Plaster manifest for creating the scaffolding for an ARM template in my GitHub repo: https://github.com/vijayjt/PlasterTemplates/tree/master/AzureResourceManagerTemplate

Azure ARM Templates and Testing with Pester

Microsoft Azure

I have recently been working with Azure Resource Manager (ARM) templates and using Pester for testing. My friend Sunny showed me a very basic example of using Pester for testing an ARM template that is available as part of a template for VM Scale Sets managed by Azure Automation DSC. The Pester test script provided with this template does a few things:

  • Tests that an azuredeploy.json file exists
  • Tests that an azuredeploy.parameters.json file exists
  • Tests that a metadata.json file exists
  • Tests that the azuredeploy.json has the expected properties

This is a good start, but in this post I will walk through some additional types of tests that you can run and also some gotchas I found with the example in the Azure Quickstart templates GitHub repo.

 

Checking for expected properties in a JSON file

 

The example in the Azure Quickstart templates GitHub repo uses the code below to check for expected properties:

The problem with this code is that the order in which Get-Member returns the properties of the object produced by ConvertFrom-Json may not match the order used in the expectedProperties variable. This issue can be solved by sorting the properties, both when you store them in the expectedProperties variable and after the call to Get-Member.
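The ordering problem and its fix are independent of PowerShell, so here is a minimal Python sketch of the same idea: compare sorted lists of top-level property names rather than relying on the order in which they happen to be returned. The template text and schema URL are placeholders made up for the example.

```python
import json


def top_level_properties(json_text):
    """Return the top-level property names of a JSON document, sorted."""
    return sorted(json.loads(json_text).keys())


expected_properties = sorted(
    ["$schema", "contentVersion", "parameters", "variables", "resources", "outputs"]
)

# Placeholder template whose properties are deliberately in a different order.
template_text = """
{
  "resources": [],
  "outputs": {},
  "$schema": "https://example.invalid/deploymentTemplate.json#",
  "variables": {},
  "contentVersion": "1.0.0.0",
  "parameters": {}
}
"""

print(top_level_properties(template_text) == expected_properties)  # True
```

Sorting both sides makes the test insensitive to property order while still failing if a property is missing or unexpected.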

Dealing with multiple parameter files

 

Another shortcoming of the example is that it assumes only one parameter file per template. So how do you deal with multiple parameter files, e.g. azuredeploy.parameters.dev.json and azuredeploy.parameters.test.json?

First we need to modify the test that checks for the existence of parameter files to allow for multiple files, like so:

Next we need to deal with multiple parameter files when checking whether the parameter files have the expected properties. To do this, at the top of the test script we create an array of hashtables, one for each parameter file.

Then we put the tests for parameter files in a separate Context block and use the TestCases parameter of an It block.
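The "array of hashtables, one per parameter file" idea amounts to discovering the files by pattern and turning each into a test case. A minimal Python sketch of that discovery step is shown below; the ParameterFile key and the azuredeploy.parameters*.json pattern are assumptions chosen to match the file names mentioned above.

```python
from pathlib import Path


def parameter_file_cases(template_dir):
    """Build one test case (a dict) per parameter file found in template_dir."""
    files = sorted(Path(template_dir).glob("azuredeploy.parameters*.json"))
    return [{"ParameterFile": f.name} for f in files]


# Each dict can then drive one parameterised test run.
for case in parameter_file_cases("."):
    print(case["ParameterFile"])
```

Building the list once at the top of the script means every test that needs the parameter files (existence, expected properties, validation) iterates over the same set.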

Testing a resource has the expected properties

We can extend the method used to check that an azuredeploy.json template file has the expected resources to also check that a resource has the expected properties. In the example below, we first check that the azuredeploy.json contains a virtual network resource, then we check that the virtual network has properties for address space, DHCP options and subnets.
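The two-step check (resource present, then properties present) can be sketched in Python as follows. The template is a made-up fragment, and the property names addressSpace, dhcpOptions and subnets are the ones a virtual network resource normally carries; treat the rest as illustrative.

```python
def find_resources(template, resource_type):
    """Return all resources of the given type (case-insensitive match)."""
    wanted = resource_type.lower()
    return [r for r in template.get("resources", []) if r.get("type", "").lower() == wanted]


def missing_properties(resource, expected):
    """Return the expected property names that the resource does not define."""
    properties = resource.get("properties", {})
    return [name for name in expected if name not in properties]


# Made-up template fragment for illustration.
template = {
    "resources": [
        {
            "type": "Microsoft.Network/virtualNetworks",
            "name": "example-vnet",
            "properties": {
                "addressSpace": {"addressPrefixes": ["10.0.0.0/16"]},
                "dhcpOptions": {"dnsServers": ["10.0.0.4"]},
                "subnets": [{"name": "default", "properties": {"addressPrefix": "10.0.1.0/24"}}],
            },
        }
    ]
}

vnets = find_resources(template, "Microsoft.Network/virtualNetworks")
print(len(vnets))                                                                # 1
print(missing_properties(vnets[0], ["addressSpace", "dhcpOptions", "subnets"]))  # []
```

Returning the list of missing names, rather than a bare boolean, gives a more useful failure message when a test does not pass.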

Validating Templates

Another test we can add as part of our Pester testing script is to use the Test-AzureResourceGroupDeployment cmdlet to validate the template with each parameter file.  This requires creating a resource group.

When creating a resource group you should try to randomise part of the resource group name to avoid clashes, so for example you could use something like:

Here we use Pester-Validation-RG to easily identify what the purpose of the resource group is. We then prefix this with the first 5 characters from a GUID – to avoid clashes in the event you have multiple users or automated tests running at the same time in the same subscription.
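The naming scheme just described is easy to reproduce in any language. A minimal Python sketch, assuming a hyphen is used to join the GUID fragment and the fixed name (the separator is my assumption):

```python
import uuid


def validation_resource_group_name(base="Pester-Validation-RG"):
    """Prefix the base name with the first 5 characters of a random GUID."""
    prefix = str(uuid.uuid4())[:5]
    return f"{prefix}-{base}"


print(validation_resource_group_name())
```

Five hexadecimal characters give roughly a million possible prefixes, which is enough to make clashes between a handful of concurrent test runs unlikely, though not impossible.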

We can then use the BeforeAll block to create the resource group before running the tests and the AfterAll block to delete it after all tests have run.

We then run Test-AzureResourceGroupDeployment with the template and each parameter file in turn, using the TestCases parameter of the It block.

There are a few things to note with this:

  • It obviously requires that we create a resource group – because although the Test-AzureResourceGroupDeployment cmdlet doesn’t actually create the resources in the template it requires a resource group in order to use it.
  • While there is an AfterAll block that deletes the temporary resource group created to validate the template, it may not clean up the resource group if you Ctrl-C the test script or there is some other problem, such as a corrupted test group stack.
  • The deployment of the template can still fail – this test simply checks that the schema for each of the resources is correct and that the parameter file is valid for the template. Deployments can still fail for other reasons and the parameter values may still be wrong, e.g. we specify a subnet address prefix in the parameter file that does not fall within the VNET address spaces.
  • This will increase the time it takes for the tests to run because creating and deleting a resource group, even if it’s empty, takes a little time.
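The subnet example in the list above is the kind of error that template validation does not catch but that a cheap local check can. Here is a minimal sketch using Python's standard ipaddress module; the prefixes are made-up values.

```python
import ipaddress


def subnet_within_vnet(subnet_prefix, vnet_prefixes):
    """True if the subnet falls entirely inside one of the VNET address spaces."""
    subnet = ipaddress.ip_network(subnet_prefix)
    for prefix in vnet_prefixes:
        vnet = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError when mixing IPv4 and IPv6, so check first.
        if subnet.version == vnet.version and subnet.subnet_of(vnet):
            return True
    return False


print(subnet_within_vnet("10.0.1.0/24", ["10.0.0.0/16"]))     # True
print(subnet_within_vnet("192.168.1.0/24", ["10.0.0.0/16"]))  # False
```

A check like this could be run against each parameter file alongside the validation test, so that address mistakes fail fast instead of surfacing at deployment time.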

Azure ASEs ARM Templates and the resourceGroup().location expression

Microsoft Azure

In a recent post I wrote about Azure App Service Environments (ASEs) and AD Integration. If you look at the Azure Quickstart template for a Web App in an ASE, you will notice that the location is passed in as a parameter instead of using the resourceGroup().location expression. This is because there is a known issue where the backend infrastructure for ASEs does not correctly handle the location string returned by this function call. This is mentioned in the following Stack Overflow question: http://stackoverflow.com/questions/42490728/azure-arm-cant-create-hostingenvironments-location-has-an-invalid-value

Azure App Service Environments (ASEs) and AD Integration

Microsoft Azure, Powershell

Recently I had to look at a case where there was a requirement to communicate with an Active Directory Domain Controller from an Azure Web App. We were looking to use App Service Environments, and the documentation published here https://docs.microsoft.com/en-us/azure/app-service-web/web-sites-integrate-with-vnet stated:

This caused some confusion as it appeared to suggest you could not communicate with domain controllers but it appears this is actually more in reference to domain joining.

Furthermore, there is a Microsoft blog post on how to load an LDAP module for PHP with an Azure Web App – which indicates that it is a supported scenario.

You can relatively easily verify this by deploying an Azure Web App with VNET integration or in an ASE. I used a modified version of the template published here https://github.com/Azure/azure-quickstart-templates/tree/master/201-web-app-ase-create to create a Web App in an ASE.

I then created a domain controller via PowerShell in this Gist:

Then I used the PowerShell code in this Gist to install AD related roles and promoted the server to a Domain Controller via an answer file – change the forest/domain functional level and other settings to suit your needs.

At this point you can perform a rudimentary test of AD integration via Kudu/SCM PowerShell console.

If you wish to test using PHP, you will need to download the PHP binaries from http://windows.php.net/download/ and extract them on your computer; in the ext directory you will find the php_ldap.dll file. Note that the version you download needs to match the version of PHP you have configured your Web App with, which in my case was 5.6.

Next, from Kudu/SCM, I created a directory named bin under /site/wwwroot. Then I used FTPS (I used FileZilla, but you will need to create a deployment account first) to upload the php_ldap.dll file into that directory.

Then create a file named ldap-test.php with the following php code:

If you then browse to your web app domain and the file e.g. http://mywebapp.azurewebsites.net/ldap-test.php

Auditing Azure RBAC Assignments

Microsoft Azure, Powershell

I recently had a need to create a script to generate a report on Azure RBAC role assignments. The script does a number of things given the domain for your Azure AD tenant:

  • Reports on which users or AD groups have which role;
  • The scope that the role applies to (e.g. subscription, resource group, resource);
  • Where the role is assigned to an AD group, it uses the function from this blog post to recursively obtain the group members: http://spr.com/azure-arm-group-membership-recursively-part-1/
  • Reports on whether a user is a Co-Administrator, Service Administrator or Account Administrator;
  • Reports on whether a user is sourced from the Azure AD tenant or an external directory, or if it appears to be an external account.

The user running the script must be able to read permissions, e.g. by holding the ‘Microsoft.Authorization/*/read’ permission.

The script can output the results either as an array of custom objects or in CSV format, which can then be redirected to a file and manipulated in Excel.

The script could be run as a scheduled task or via Azure Automation if you wanted to run it periodically in an automated fashion. It can also be extended to alert on certain cases, such as when users from outside your Azure AD tenant have access to a subscription, resource group or individual resource. The latter is not a default feature of the script because, depending on your organisation, you may legitimately have external accounts (e.g. if you’re using 3rd parties to assist you with deploying, building or managing Azure).

The script has been published to my GitHub repo. Hopefully it will be of use to others.
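As an illustration of the "tenant or external" classification and the CSV output, here is a minimal Python sketch. It is not the logic of the published script: the rules (a sign-in name containing #EXT# is a guest, otherwise the domain is compared with the tenant's domains), the column names and the sample accounts are all assumptions made for the example.

```python
import csv
import io


def classify_account(sign_in_name, tenant_domains):
    """Classify an account by its sign-in name (assumed rules, see lead-in)."""
    name = sign_in_name.lower()
    if "#ext#" in name:
        return "External (guest account)"
    domain = name.rsplit("@", 1)[-1]
    if domain in {d.lower() for d in tenant_domains}:
        return "Tenant"
    return "External directory"


def to_csv(rows):
    """Render report rows (dicts) as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["SignInName", "Role", "Scope", "Source"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tenant_domains = ["contoso.com", "contoso.onmicrosoft.com"]
assignments = [
    ("alice@contoso.com", "Reader", "/subscriptions/00000000-0000-0000-0000-000000000000"),
    ("bob_fabrikam.com#EXT#@contoso.onmicrosoft.com", "Contributor", "/subscriptions/00000000-0000-0000-0000-000000000000"),
]
rows = [
    {"SignInName": s, "Role": r, "Scope": sc, "Source": classify_account(s, tenant_domains)}
    for s, r, sc in assignments
]
print(to_csv(rows))
```

An alerting extension of the kind mentioned above could simply filter the rows whose Source is not "Tenant".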

HDInsight and WebSSH Security Issue

HDInsight, Microsoft Azure

Background

This post relates to an unpublished ‘feature’ of Microsoft Azure HDInsight Linux clusters that is misconfigured such that it allows users to obtain root access to clusters without having knowledge of the ‘admin’ account name or password via a web console.

I originally raised this with Microsoft Support around the end of October / beginning of November 2016. Initially, support informed me that they had discussed it with the product team and that the security issue that I was reporting was not a security issue because:

  • The security boundary of HDInsight is the Virtual Network (VNET) and
  • The clusters are only intended for single user tenancy (ironically a MSFT Cloud Data Solution Architect recently said to me that HDInsight fully supports multiple users – which I guess is sort of true now with secure clusters being in preview).

Eventually they agreed that it was indeed an issue and disabled the feature on all new clusters as an interim measure.

 

What is the issue?

An Azure HDInsight Linux cluster consists of head, worker and zookeeper nodes. These nodes are Azure VMs; although the VMs are not visible in the Azure Portal and cannot be managed individually there, you can SSH to the cluster nodes.

When you provision a cluster you are prompted to set two credentials:

  • One that will be used for the Ambari web interface, which you can log in to over HTTPS at <cluster name>.azurehdinsight.net.
  • The other for a local account that will be created on ALL nodes in the cluster, which you can then use to SSH to the cluster: ssh <user>@<cluster name>-ssh.azurehdinsight.net

The SSH account by default has passwordless sudo – that is you can run sudo su and become root without being prompted for your password.

One of the packages that is installed when you provision an HDInsight cluster is hdinsight-webssh. Running apt-cache show hdinsight-webssh shows us that it is a Microsoft package (there are other Microsoft HDInsight packages; they are all prefixed with hdinsight-):

Running netstat you can see that there is a nodejs-based web terminal running and listening on TCPv6 port 3000:

If you run

you will see the process (which incidentally also runs as root!).

The configuration for the service/application is here:

/etc/websshd/conf.json

It looks like a number of Python scripts are run when you provision a cluster to start Ambari, configure Hive, etc., one of which starts this websshd service: /opt/startup_scripts/startup_webssh.py

Impact of the issue

The issue cannot be easily exploited by an external attacker, i.e. one that does not already have access to infrastructure in the Azure Virtual Network (VNET) that the HDInsight cluster resides in. Such an attacker would first need to gain access to an account (it doesn’t need to be a privileged one) on any other system hosted in the same VNET. From that point they can easily gain root access on the HDInsight cluster by simply browsing to http://<clusternodeipaddress>:3000, which would automatically give them a web-based shell as the user that has passwordless sudo, without entering any username or password.

However, since the default NSG rules allow connectivity within a VNET (as opposed to a default deny that requires all traffic to be explicitly allowed) this makes it easier for an attacker to extend their reach.

Another possibility is that an external attacker would need to find a vulnerability in the proxy servers and/or the various web interfaces that are accessible via the proxies.

In the case of a malicious user who has authorised access to say an application or web server, they would be able to take advantage of the misconfiguration to obtain root access to the HDInsight cluster as described above.

In either case an external attacker or malicious user can then use the root access to exfiltrate data, plant malicious software etc.

Summary

Microsoft have since disabled the service (although the last time I checked, back in December 2016, the package was still installed, but the service was not running and there was no systemd unit file installed).

Microsoft didn’t explain why the package is installed in the first place but I can only assume it was added as a convenience when the product team were developing or testing.

Browser-based terminals are problematic when it comes to security, but it’s worse when the endpoint:

  1. Is unencrypted
  2. Performs no authentication
  3. Drops you in as a user that has passwordless sudo

As an added measure you can disable passwordless sudo for the admin account – which probably shouldn’t be enabled anyway.
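If you want to confirm from another machine in the same VNET that nothing is listening on the web terminal port, a simple TCP connect test is enough. This is a minimal Python sketch; port 3000 comes from the netstat output discussed earlier, while the host and timeout values are placeholders.

```python
import socket


def port_is_open(host, port=3000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Replace 127.0.0.1 with the address of a cluster node.
print(port_is_open("127.0.0.1", 3000, timeout=0.5))
```

A False result only shows that the port is not reachable from where the check runs; an NSG or host firewall could be blocking it while the service is still running on the node.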

KVM Automation

KVM

Introduction

This blog post describes one option for automating the build of a KVM guest.

There are alternative ways to automate the build, but the method described here uses the ability to pass a kickstart file to the virt-install command when creating a new VM. A kickstart file contains the answers to all the normal questions that an interactive installer would ask during installation. The kickstart script installs software packages, configures SELinux, auditd, rsyslog, etc.

Using virt-install with a kickstart file

The virt-install command is used to create new virtual machines / guests; it supports a parameter that allows you to specify the path to an Anaconda Kickstart file.

An example virt-install command is provided below:

The key parameters, as they pertain to automating the install, are:

  • One parameter specifies the path to the kickstart file on the host machine.
  • Another parameter then specifies where the kickstart file is on the VM.
  • A third parameter tells virt-install not to enter a console (entering one is the default behaviour). We disable this because entering a console requires manual intervention to exit from it after the kickstart installation completes in order to continue with the rest of the script for building an encrypted VM.
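The three roles described above plausibly correspond to virt-install's --initrd-inject, --extra-args and --noautoconsole options; that mapping is my assumption, as are the guest name, sizes, paths and install URL in this Python sketch, which only builds and prints the command line rather than running it.

```python
import os
import shlex


def build_virt_install_command(name, kickstart_path, location, disk_path,
                               memory_mb=2048, vcpus=2, disk_gb=20):
    """Build a virt-install argument list for an unattended kickstart install."""
    ks_name = os.path.basename(kickstart_path)
    return [
        "virt-install",
        "--name", name,
        "--memory", str(memory_mb),
        "--vcpus", str(vcpus),
        "--disk", f"path={disk_path},size={disk_gb}",
        "--location", location,
        # Copy the kickstart file from the host into the installer's initrd.
        "--initrd-inject", kickstart_path,
        # Tell the installer where to find it inside the guest (newer
        # installers use inst.ks= instead of ks=).
        "--extra-args", f"ks=file:/{ks_name}",
        # Return immediately instead of attaching to the guest console.
        "--noautoconsole",
    ]


command = build_virt_install_command(
    name="guest01",
    kickstart_path="/srv/kickstart/ks.cfg",
    location="http://mirror.example.invalid/centos/7/os/x86_64/",
    disk_path="/var/lib/libvirt/images/guest01.qcow2",
)
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting problems if it is later passed to a process launcher.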

 

The Kickstart file

The format of the Kickstart file will not be covered in detail here; however, the key configuration lines that are important for automating the KVM VM build are described below:

  • One line specifies that a text-based installation should be performed.
  • One line specifies that the VM should be shut down after the kickstart installation completes – this is important as we use this to detect when the installation and configuration is complete before we move on to encrypting the VM operating system disk.
  • One line specifies the URL for the package repository and that it can be reached via the proxy 192.168.0.20 (if you have direct internet access then this line is not required; also, if you are using an internal repo then the URL should be modified accordingly).
  • One line specifies the location of an additional repo, in this case the EPEL repo, and that it can be accessed via the proxy.
  • One line sets the password for the root user. The password is stored in hashed form; you can generate the hash by running the command below:

There was also a requirement to configure auditing and logging. Some of these files were quite long, so it was too unwieldy to simply hardcode their entire contents into the kickstart file and use heredocs to write them out to files on the guest. In light of this I used gzip compression and base64 encoding to embed the files.

The Kickstart file includes blocks of code such as the example below:

This command decodes a base64 encoded string, decompresses it and writes the result to a file; the string contains the code for a shell script. This is a convenient way to include scripts without embedding their entire code using heredocs.

To create the base64 encoded and compressed script enter the script as is into a file, then run the command:
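The encode and decode steps are symmetric: compress then base64 encode on the way in, base64 decode then decompress on the way out. A minimal Python sketch of the round trip, using a made-up two-line script as input:

```python
import base64
import gzip


def encode_script(text):
    """Gzip-compress a script and return it as a single base64 string."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decode_script(encoded):
    """Reverse encode_script: base64-decode, then decompress."""
    return gzip.decompress(base64.b64decode(encoded)).decode("utf-8")


script = "#!/bin/bash\necho 'configuring auditd'\n"
encoded = encode_script(script)
print(encoded)
print(decode_script(encoded) == script)  # True
```

The payoff is for long files: gzip shrinks them considerably, and the base64 output contains no characters that need escaping inside the kickstart file.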

 

Detecting completion of the kickstart script

After the virt-install command is run, the virtual machine build script virt-create-guest.sh waits for the VM to enter the shutdown state (recall that the kickstart file specified that the machine should be shut down after installation). It does this by running virsh domstate <guest vm name> and checking whether it returns “shut off”.
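The wait-for-shutdown step is a polling loop around virsh domstate. Below is a minimal Python sketch of such a loop, not the author's shell script; the polling interval, the timeout and the injectable get_state and sleep parameters are my additions so that the loop can be exercised without a hypervisor.

```python
import subprocess
import time


def domain_state(name):
    """Return the state reported by 'virsh domstate <name>'."""
    result = subprocess.run(
        ["virsh", "domstate", name], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def wait_for_shut_off(name, get_state=domain_state, poll_seconds=10,
                      timeout_seconds=3600, sleep=time.sleep):
    """Poll until the guest reports 'shut off'; return False on timeout."""
    waited = 0
    while waited <= timeout_seconds:
        if get_state(name) == "shut off":
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False


# Demonstration with a fake state source instead of a real guest.
fake_states = iter(["running", "running", "shut off"])
print(wait_for_shut_off("guest01", get_state=lambda n: next(fake_states),
                        sleep=lambda s: None))  # True
```

A timeout is worth having because a kickstart install that stops at an unexpected prompt never shuts down, and without one the build script would wait forever.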

 

Azure AD Authentication (Connect-AzureAD) in Azure Automation

Microsoft Azure

It is now possible (and has been for a while) to modify Azure AD via Azure Automation. The example below uses the Automation Run As account to first connect to Azure AD and then run the appropriate commands. You can also create a dedicated Run As account if you want, or use a username and password (less secure).

Before you write your code make sure that you:

  • Add the “AzureAD” module to the Automation Account
  • Give the Azure Automation Run As account the appropriate permissions, as shown at the end of this article

Automation Code example (list all the groups in AD):

Give the Azure Automation Run As account the appropriate permissions:

  • Go to Azure Active Directory -> App registrations -> the Run As account.
  • Then go to the API access as shown:

  • Give the appropriate access, example below:

Don’t forget to click grant permissions!

Azure ASR Error- 78052 Master target contains different types of scsi controllers.

Microsoft Azure

This is a bit of a self-explanatory one, but I thought I would mention it anyway. When you build an ASR Master Target server, make sure that if you have more than one SCSI controller they are all of the same type. It doesn’t matter what type they are (LSI Logic SAS, VMware Paravirtual, etc.), but they need to be the same or you will get the following error in the Azure portal when you attempt to fail back the machine to on-premises.