How to integrate Synapse pipeline with Cosmos DB
Prerequisite
- You have account in Azure
- You have Contributor role on the resource group pertaining to Synapse workspace and Cosmos DB
- You already has Cosmos DB create with Database.
Initial setup with Private link
Before you connect to Cosmos DB connect setup a linked Service in Synapse to Cosmos DB
1. Open Synapse workspace and go to Linked Service under Manage option [this icon in right side]
2. Click on New and search for “Cosmos” and select Azure Cosmos
DB for NoSQL”
2. Fill up the necessary details as followings,
3. To test the connection enable Interactive Authoring as
following from the following screen,
4. Select Enabled and Apply. This is valid for one hour.
5. After enabling Interactive authoring Test the Linked service
to confirm whether connectivity was successful. Incase failure, for e.g. with following
error,
CosmosDbSqlApi operation Failed. ErrorMessage: Request
blocked by Auth dummycosmosdbsynapse : Request is blocked because principal
[----] does not have required RBAC permissions
to perform action [Microsoft.DocumentDB/databaseAccounts/readMetadata] on
resource [/]. Learn more: https://aka.ms/cosmos-native-rbac.
ActivityId: 4ba817bc-fef0-424b-aa05-445d04d81228,
Microsoft.Azure.Documents.Common/2.14.0, Windows/10.0.17763
cosmos-netstandard-sdk/3.19.3.
Request blocked by Auth dummycosmosdbsynapse : Request is
blocked because principal [----] does not have
required RBAC permissions to perform action
[Microsoft.DocumentDB/databaseAccounts/readMetadata] on resource [/]. Learn
more: https://aka.ms/cosmos-native-rbac.
ActivityId: 4ba817bc-fef0-424b-aa05-445d04d81228,
Microsoft.Azure.Documents.Common/2.14.0, Windows/10.0.17763
cosmos-netstandard-sdk/3.19.3
Activity ID:
114be62d-0f3b-47f5-b127-d85cced834c3.
👀This error message says synapse workspace/Service Principal does
have access to CosmosDB. Which means we need assig role to Synapse service
principal.
To do so we need to run few Azure command which can be done
through automation but for sake of understanding , I will be showing how to do
it using Powershell from Azure Command Shell. The steps will be followings,
1. 1. Create custom role which is having permission to
CosmosDB for readMetadata,execute and readchangefeed, item Assign the custom
role to Synapse principal in Cosmos DB role assignment
2. 2. Check if assignment is success full
Followings are 3 script to perform the same
2.1 Create Custom Role using following steps
2.2 Open Azure Shell from top left
2.3 And Copy paste the following code,
$resourceGroupName="rg-dev-synapse" #rg of synpase ws
$accountName="dummycosmosdbsynapse" #Cosmos DB account Name
New-AzCosmosDBSqlRoleDefinition -AccountName $accountName `
-ResourceGroupName $resourceGroupName `
-Type CustomRole -RoleName TestAllRole `
-DataAction @( `
'Microsoft.DocumentDB/databaseAccounts/readMetadata',
'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/read', `
'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/executeQuery', `
'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/readChangeFeed') `
-AssignableScope "/"
2.4 Run the following command to check whether following role is created in same Azure Shell Session,
az cosmosdb sql role definition list --account-name $accountName --resource-group $resourceGroupName
2.5 Assign Custom role to Synapse Principal. Run the following code in Azure Shell under same session,
$resourceGroupName = "rg-dev-synapse"
$accountName = "dummycosmosdbsynapse"
$readOnlyRoleDefinitionId = "d8bd42e4-bbb2-4c2b-91c1-d21fc6dea694" # This can be found using above command
# For Service Principals ID you can find it home page of
the Synpase WS[See the home page screen shot below]
$principalId = "54d15205-5789-48d8-869e-1381092fb7ce"
New-AzCosmosDBSqlRoleAssignment -AccountName $accountName `
-ResourceGroupName $resourceGroupName `
-RoleDefinitionId $readOnlyRoleDefinitionId `
-Scope "/" `
-PrincipalId $principalId
Output: Note the similar Yellow mark id which will be used later
<#
Id :
/subscriptions/<some-guid>/resourceGroups/rg-dev-synapse/providers/Microsoft.DocumentDB/databaseAccounts/dummycosmosdbsynapse/s
qlRoleAssignments/511ba438-eea3-4343-8a92-c0d482e918c1
Scope :
/subscriptions/<some-guid>/resourceGroups/rg-dev-synapse/providers/Microsoft.DocumentDB/databaseAccounts/dummycosmosdbsynapse
RoleDefinitionId :
/subscriptions/a<some-guid>/resourceGroups/rg-dev-synapse/providers/Microsoft.DocumentDB/databaseAccounts/dummycosmosdbsynapse/s
qlRoleDefinitions/d8bd42e4-bbb2-4c2b-91c1-d21fc6dea694
PrincipalId :
54d15205-5789-48d8-869e-1381092fb7ce
#>
2.6 If you check the output of the above create command, you would find the role definition ID which can be used to check if role assignment was successful using following command,
az cosmosdb sql role assignment exists --account-name $accountName --resource-group $resourceGroupName --role-assignment-id 511ba438-eea3-4343-8a92-c0d482e918c1
Create Synapse Pipeline
1. Once linked service is connectivity is
successful with Cosmos DB then create a Synapse pipeline to test out whether
the data retrieval is successful.
2. Create New pipeline from Synapse Integrate
Option
3. Drag and drop Lookup activity from General
Option,
4. Now Add Dataset from Settings tab. Here if you
do not have existing dataset , create new
5. Select Query Option from following
6. Now click on Validate to find any issue with
pipeline and then publish it by clicking on publish All
7. Debug the pipeline to see if you are able to retrieve
the data from Cosmos
Click on output to see the data received from DB
8.
Conclusion
The above steps show how setup Synapse pipeline with Cosmos
DB and how to assign Custom Role in Cosmos DB side to get the access from Synapse
work space.
For more details , you can check following link,
End of Document