Azure Data Factory (ADF) v2 is the second generation of Microsoft’s cloud data integration service, which lets customers work with data wherever it lives. ADF v2 comes in handy if you work with SaaS data and want to bring it into your data lake or data warehouse for current or future use.

In this article, we’ll review how to get started with ADF v2. In future articles, we will review each feature ADF provides, along with its use cases.

Prerequisites

Azure Subscription

To get started, you will need an Azure subscription, either through your workplace or through a free account, which includes a $200 Azure credit to use within the first 30 days of sign-up and 12 months of select free services. More information about the free account is available at https://azure.microsoft.com/en-us/offers/ms-azr-0044p/

Azure Roles

To create an ADF instance, the user account you sign in with must be a contributor, administrator, or owner of the subscription. Permissions can be assigned from the Azure Portal (https://docs.microsoft.com/en-us/azure/billing/billing-add-change-azure-subscription-administrator).

Once you have the account in place, sign in to Azure (www.azure.com) and open the portal section of the platform.

In the portal, click Create a resource, search for Data Factory, and then click Create. Fill in the following settings:

  1. Name your Data Factory
  2. Pick the Azure subscription to use for ADF
  3. Create a Resource Group (or select an existing one) – a collection of resources that share the same permissions, policies, and lifecycle
  4. Select the ADF version – it needs to be V2 to take advantage of the improvements in ADF
  5. Select the location closest to you
  6. Pin to dashboard (optional, for easier access)
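
The same settings can also be captured outside the portal. As a rough sketch, the Python snippet below builds an ARM template for a v2 Data Factory. The resource type Microsoft.DataFactory/factories and API version 2018-06-01 are the ones v2 factories use; the factory name, the location, and the helper names (is_valid_factory_name, build_factory_template) are placeholders made up for illustration. The name check assumes the documented naming rules (3 to 63 characters; letters, digits, and single hyphens; starting and ending with a letter or digit), so verify it against the current documentation before relying on it.

```python
import json
import re

# Assumed naming rule: letters, digits, and hyphens only; every hyphen must
# sit between two letters/digits (so no leading, trailing, or double hyphens).
_NAME_PATTERN = re.compile(r"[A-Za-z0-9]+(-[A-Za-z0-9]+)*")


def is_valid_factory_name(name: str) -> bool:
    """Return True if the name fits the assumed 3-63 character naming rule."""
    return 3 <= len(name) <= 63 and _NAME_PATTERN.fullmatch(name) is not None


def build_factory_template(name: str, location: str) -> dict:
    """Build a minimal ARM template containing one v2 Data Factory resource."""
    if not is_valid_factory_name(name):
        raise ValueError(f"invalid Data Factory name: {name!r}")
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": "Microsoft.DataFactory/factories",
                "apiVersion": "2018-06-01",  # API version used by ADF v2
                "name": name,                # step 1: factory name
                "location": location,        # step 5: region
                "identity": {"type": "SystemAssigned"},
                "properties": {},
            }
        ],
    }


if __name__ == "__main__":
    # "my-adf-demo" and "eastus" are placeholder values.
    print(json.dumps(build_factory_template("my-adf-demo", "eastus"), indent=2))
```

The subscription and resource group (steps 2 and 3) do not appear in the template because they are chosen when the template is deployed, for example with az deployment group create --resource-group <your-group> --template-file <file>.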

Once ADF is created, you can access it from the home screen of your environment. Click its name to navigate to the ADF settings page. The page offers two options, Documentation and Author & Monitor. Click the latter.

In this section, we’ll go over each area of the canvas and define some terminology along the way.

  1. Easy access to create a pipeline – a pipeline is the process that moves data from one place to the next
  2. Copy Data section – a wizard-driven process for creating a pipeline
  3. Configure SSIS Integration Runtime
  4. Set up Code Repository – allows you to integrate your ADF environment with VSTS or GitHub (coming soon)
  5. Author a new pipeline or edit an existing one
  6. Monitor pipelines (in progress, succeeded, failed, etc.)
  7. Navigate back to the canvas home
  8. Notifications from pipelines
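
To make the term "pipeline" from item 1 more concrete, here is a minimal sketch of the JSON definition behind a pipeline with a single Copy activity, built in Python. It follows the general shape of an ADF v2 pipeline (a name plus a list of activities that reference datasets). The pipeline, activity, and dataset names and the build_copy_pipeline helper are hypothetical, and the sketch assumes the two blob datasets have already been defined in the factory.

```python
import json


def build_copy_pipeline(name: str, source_dataset: str, sink_dataset: str) -> dict:
    """Build a pipeline definition that copies data from one dataset to another."""
    copy_activity = {
        "name": "CopySourceToSink",  # hypothetical activity name
        "type": "Copy",
        # Datasets are referenced by name; they must already exist in the factory.
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        # Assumes both datasets point at Azure Blob storage.
        "typeProperties": {
            "source": {"type": "BlobSource"},
            "sink": {"type": "BlobSink"},
        },
    }
    return {"name": name, "properties": {"activities": [copy_activity]}}


if __name__ == "__main__":
    # All three names below are placeholders.
    pipeline = build_copy_pipeline("CopyDemoPipeline", "SourceBlobDataset", "SinkBlobDataset")
    print(json.dumps(pipeline, indent=2))
```

The Copy Data wizard (item 2) produces a definition of this general kind for you, so writing it by hand is optional.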

To create a pipeline, click Create pipeline (1) or Author (5); either one opens the pipeline authoring section.

In conclusion, ADF v2 is a powerful offering from the Microsoft Azure team that lets you move data to and from the cloud. In the next article, we’ll start building pipelines around use cases to help drive it home.