Thursday, July 20, 2023

How to integrate Synapse pipeline with Cosmos DB

 

How to integrate Synapse pipeline with Cosmos DB






Prerequisite

  • You have account in Azure
  • You have Contributor role on the resource group pertaining to Synapse workspace and Cosmos DB
  • You already has Cosmos DB create with Database.

Initial setup with Private link

Before you connect to Cosmos DB connect setup a linked Service in Synapse to Cosmos DB

1. Open Synapse workspace and go to Linked Service under Manage option [this icon in right side] 






2. Click on New and search for “Cosmos” and select Azure Cosmos DB for NoSQL”



2. Fill up the necessary details as followings,

 

Azure Cosmos DB for Linked Service




3. To test the connection enable Interactive Authoring as following from the following screen,

 



4. Select Enabled and Apply. This is valid for one hour.



5. After enabling Interactive authoring Test the Linked service to confirm whether connectivity was successful. Incase failure, for e.g. with following error,

 

CosmosDbSqlApi operation Failed. ErrorMessage: Request blocked by Auth dummycosmosdbsynapse : Request is blocked because principal [----] does not have required RBAC permissions to perform action [Microsoft.DocumentDB/databaseAccounts/readMetadata] on resource [/]. Learn more: https://aka.ms/cosmos-native-rbac.

ActivityId: 4ba817bc-fef0-424b-aa05-445d04d81228, Microsoft.Azure.Documents.Common/2.14.0, Windows/10.0.17763 cosmos-netstandard-sdk/3.19.3.

Request blocked by Auth dummycosmosdbsynapse : Request is blocked because principal [----] does not have required RBAC permissions to perform action [Microsoft.DocumentDB/databaseAccounts/readMetadata] on resource [/]. Learn more: https://aka.ms/cosmos-native-rbac.

ActivityId: 4ba817bc-fef0-424b-aa05-445d04d81228, Microsoft.Azure.Documents.Common/2.14.0, Windows/10.0.17763 cosmos-netstandard-sdk/3.19.3

 Activity ID: 114be62d-0f3b-47f5-b127-d85cced834c3.

 

👀This error message says synapse workspace/Service Principal does have access to CosmosDB. Which means we need assig role to Synapse service principal.

 

To do so we need to run few Azure command which can be done through automation but for sake of understanding , I will be showing how to do it using Powershell from Azure Command Shell. The steps will be followings,

1.      1. Create custom role which is having permission to CosmosDB for readMetadata,execute and readchangefeed, item Assign the custom role to Synapse principal in Cosmos DB role assignment

2.      2. Check if assignment is success full

Followings are 3 script to perform the same

2.1 Create Custom Role using following steps

2.2 Open Azure Shell from top left



2.3 And Copy paste the following code,

$resourceGroupName="rg-dev-synapse" #rg of synpase ws

$accountName="dummycosmosdbsynapse" #Cosmos DB account Name

 

New-AzCosmosDBSqlRoleDefinition -AccountName $accountName `

    -ResourceGroupName $resourceGroupName `

    -Type CustomRole -RoleName TestAllRole `

    -DataAction @( `

        'Microsoft.DocumentDB/databaseAccounts/readMetadata',

        'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/read', `

        'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/executeQuery', `

        'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/readChangeFeed') `

    -AssignableScope "/"

 

 

Output of Create custom role


 

2.4 Run the following command to check whether following role is created in same Azure Shell Session,

 

az cosmosdb sql role definition list --account-name $accountName --resource-group $resourceGroupName


 



2.5 Assign Custom role to Synapse Principal. Run the following code in Azure Shell under same session,

$resourceGroupName = "rg-dev-synapse"

$accountName = "dummycosmosdbsynapse"

$readOnlyRoleDefinitionId = "d8bd42e4-bbb2-4c2b-91c1-d21fc6dea694" # This can be found using above command

# For Service Principals ID you can find it home page of the Synpase WS[See the home page screen shot below]

$principalId = "54d15205-5789-48d8-869e-1381092fb7ce"

New-AzCosmosDBSqlRoleAssignment -AccountName $accountName `

    -ResourceGroupName $resourceGroupName `

    -RoleDefinitionId $readOnlyRoleDefinitionId `

    -Scope "/" `

    -PrincipalId $principalId

 

 

 

Output: Note the similar Yellow mark id which will be used later

 

<#

 

Id               : /subscriptions/<some-guid>/resourceGroups/rg-dev-synapse/providers/Microsoft.DocumentDB/databaseAccounts/dummycosmosdbsynapse/s

                   qlRoleAssignments/511ba438-eea3-4343-8a92-c0d482e918c1

Scope            : /subscriptions/<some-guid>/resourceGroups/rg-dev-synapse/providers/Microsoft.DocumentDB/databaseAccounts/dummycosmosdbsynapse

RoleDefinitionId : /subscriptions/a<some-guid>/resourceGroups/rg-dev-synapse/providers/Microsoft.DocumentDB/databaseAccounts/dummycosmosdbsynapse/s

                   qlRoleDefinitions/d8bd42e4-bbb2-4c2b-91c1-d21fc6dea694

PrincipalId      : 54d15205-5789-48d8-869e-1381092fb7ce

 

#>

2.6 If you check the output of the above create command, you would find the role definition ID which can be used to check if role assignment was successful using following command,

 

az cosmosdb sql role assignment exists --account-name $accountName --resource-group $resourceGroupName --role-assignment-id 511ba438-eea3-4343-8a92-c0d482e918c1

 




               

 

 

 

 

Create Synapse Pipeline

1.      Once linked service is connectivity is successful with Cosmos DB then create a Synapse pipeline to test out whether the data retrieval is successful.



2.      Create New pipeline from Synapse Integrate Option



3.      Drag and drop Lookup activity from General Option,





4.      Now Add Dataset from Settings tab. Here if you do not have existing dataset , create new





 

Edit Linked Service




5.      Select Query Option from following



6.      Now click on Validate to find any issue with pipeline and then publish it by clicking on publish All



7.      Debug the pipeline to see if you are able to retrieve the data from Cosmos



 

Click on output to see the data received from DB





8.       

 

Conclusion

The above steps show how setup Synapse pipeline with Cosmos DB and how to assign Custom Role in Cosmos DB side to get the access from Synapse work space.

 

For more details , you can check following link,

 az-cosmosdb-sql-role-assignment-exists

how-to-setup-rbac-CosmosDB


End of Document