
Tuesday, March 7, 2017

How to download Azure blob storage contents in Azure Linux VM using Azure CLI

Abstract

I always get this question – how can I download Azure blob storage files in an Azure Linux VM? When I say use the Azure CLI (Command Line Interface), the next question asked is – do you have a step by step guide?
Well, this blog post is the answer to both questions.

The high level approach is outlined below –
  1. Provision a Linux Azure VM in a subnet. [Of course, this step is out of scope of this article. For detailed steps refer to this guide.]
  2. Install the Azure CLI in the Linux VM.
  3. Upload sample files to Azure storage, then download them to a folder in the Linux VM.

This article assumes you understand Azure storage and related concepts.

Wow, this is the first blog post from me on Linux and Azure.

Prepare your Azure Linux VM

I have provisioned an Azure Linux VM running CentOS 7.2, added it to a subnet with an NSG that has only port 22 open inbound, and attached a public IP so that I can SSH to the VM from anywhere over port 22. The step by step guide link is already shared above.
If you have a different OS than CentOS, the commands in the steps below will change; however, the high level approach remains the same.

Install CLI in Azure CentOS VM

First, SSH to your Linux VM and run the command "sudo su" [without double quotes], so that in the subsequent steps we will not face the awesome "access denied" or "permission denied" errors, and we don't have to add "sudo" to every command we run.

There are two versions of Azure CLI –
1.0    – This stuff is written in Node.js and supports both Classic [the old way of doing things on Azure] and ARM [the new fancy way of doing things on Azure] modes.
2.0    – To make this version sound impressive Microsoft calls it the "next generation CLI"; it is written in Python and supports only ARM mode.

I will be using version 2.0, hence I also need Python installed on the Linux Azure VM. So let's first install the latest Python version on the Azure CentOS VM.
Let's make sure that yum is up to date by running the command below –

sudo yum -y update

The -y flag tells the system "relax, we are aware that we are making changes, hence do not prompt for confirmation and save our valuable time". This command execution will take a good amount of time.
Next, install yum-utils using the command below –

sudo yum -y install yum-utils

Now we need to install IUS (Inline with Upstream Stable). Don't get scared by the name. This is a community project which ensures that whatever version of Python 3 we install is the most stable one available. Run the command below to install IUS –
sudo yum -y install https://centos7.iuscommunity.org/ius-release.rpm

With IUS in place we can now install a recent version of Python. As of writing, the most recent version is 3.6, but I will install 3.5 to be on the safer side. In the 3.5 line, 3.5.3 is the latest, so let's install it.
sudo yum -y install python35u-3.5.3 python35u-pip

To verify, simply run the command below; its output should be 3.5.3.
python3.5 -V

Now install the required prerequisites on CentOS using the command below –
sudo yum check-update; sudo yum install -y gcc libffi-devel python-devel openssl-devel

Finally, back to the installation of CLI 2.0 –

curl -L https://aka.ms/InstallAzureCli | bash
This may prompt you for the directory in which to install the CLI. Press Enter to keep the default installation path, which is "/root/lib/azure-cli". Similarly, keep pressing Enter if more prompts are displayed.
Restart the command shell for the changes to take effect –
exec -l $SHELL
Just type "az" [without quotes] and it should show you Azure CLI command information on CentOS. This means the installation of Azure CLI 2.0 on Linux was successful.

Run the command below to list the storage related commands.
az storage -h

Upload sample files to Azure Storage

This step is straightforward. Using the Azure portal, create one standard [not premium] ARM based storage account. Create a container [mine is named sample] and upload 4 sample files into it.
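
By the way, once you have logged in the CLI and set the storage credentials [both are covered in the sections below], the container and the uploads can also be done from the terminal itself. A quick sketch – sample is the container name used later in this post, and File1.txt is one of my sample files –

az storage container create -n sample
az storage blob upload -c sample -n File1.txt -f File1.txt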
Add Azure account to CLI

Run the az login command as shown below. It will prompt you with a code and a link where you enter that code. After this you will be asked to log in using your existing Azure credentials. A successful login will show the subscriptions associated with your account.

az login

Set storage account and download the blob

Now set the credentials for the storage account as environment variables.
export AZURE_STORAGE_ACCOUNT=YourStorageAccount
export AZURE_STORAGE_ACCESS_KEY=YourStorageAccountKey
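
If you don't have the key handy, it can also be fetched with the CLI itself. A quick sketch – MyResourceGroup here is just a placeholder for your own resource group name –

az storage account keys list -g MyResourceGroup -n YourStorageAccount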

Create a directory named test1 using the command below. This is the directory into which we will download the blob contents –
mkdir test1/

After this, run the command below to download the blob file into the test1 folder –
az storage blob download -c sample -n File1.txt -f test1/File1.txt
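
Alternatively, if you prefer not to export the environment variables, the same credentials can be passed directly on the command line via the --account-name and --account-key options [run az storage blob download -h to confirm the options on your CLI version] –

az storage blob download -c sample -n File1.txt -f test1/File1.txt --account-name YourStorageAccount --account-key YourStorageAccountKey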

Now change into the directory and verify the download –
cd test1
ls -l
This should list File1.txt.

Limitation

Using the Azure CLI you can't download all the blobs from a container in one command. You have to download each and every blob individually; bulk download of Azure blobs is not supported.
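
That said, you can script your way around this limitation from the shell: list the blob names and download them one by one in a loop. Below is a minimal sketch – it still downloads every blob individually, just automated. It assumes the credentials are exported as above and that blob names contain no spaces; the --query and -o tsv options use the CLI's built-in JMESPath output support –

for blob in $(az storage blob list -c sample --query "[].name" -o tsv); do
    az storage blob download -c sample -n "$blob" -f "test1/$blob"
done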

Resolution for bulk download

To download all blobs from a container in a single command, instead of the Azure CLI we will need to use the Azure XPlat CLI. Or we can also use PowerShell, as it is open source now [although I have not tried it yet]. It's common to come across many approaches to achieve one task when you are in open source. :)
The Azure XPlat CLI is a project that provides a cross platform command line interface to manage Azure. Refer to the documentation here - https://github.com/Azure/azure-xplat-cli. But that is another blog for another day.

Conclusion

So I hope you now understand how easy it is to download Azure blob storage contents in a Linux virtual machine.
Please provide your valuable comments. The good news is they are free!!
Keep downloading!!

Friday, January 16, 2015

Azure Blob Snapshots using REST API and Client Library


My article doing a deep dive on Azure blob snapshots has been published. The article outlines various Azure blob snapshot operations using the client library and the REST API. The full source code is also available to download from GitHub at the bottom of the article. Here is the link –
http://www.dotnetcurry.com/showarticle.aspx?ID=1072

Cheers...
Happy Snapshotting!!

Monday, August 25, 2014

Using SAS, renew SAS and REST API to Upload large files to Azure blob storage in parallel and async


First of all, thanks for the overwhelming response to my earlier blog post on uploading large files to azure block blob in parallel and async.
At the bottom of this post you will find the link for downloading the code sample.
In the current post I will extend the same code library to support performing the azure blob upload over the REST API. Also, as a best practice you should always use SAS (Shared Access Signature) for performing any operation on blob storage, therefore in the current post I will extend the code to support SAS as well.
Why we need to renew or extend an Azure SAS –
Using SAS with blob storage is a great option for adding one more level of security to azure storage operations; however, anyone who gets access to the SAS url can perform malicious operations. Therefore keeping the SAS expiry time to a minimum is always a good practice. But again, if we are uploading fairly large files, keeping the SAS expiry to a minimum will not help.
For example, let's consider a scenario. Say I upload a file of size 100MB to azure blob storage using the REST API and SAS. As a best practice, and as already stated in my previous blog post on uploading large files to azure block blob in parallel, the file to be uploaded to azure needs to be sliced into chunks.
Now suppose I have set the expiry time to 5 minutes. Of course the upload of 100MB will not complete in 5 minutes on most connections [at 2 Mbps, for instance, the transfer alone needs roughly 7 minutes], and hence my block upload will fail partway through.
Solution -
To continue to upload large files to azure block blob seamlessly after SAS expiry time, I need to renew the SAS token again. This is what exactly I am doing in this blog post and code sample.
So essentially, I check the response status code of every block upload. If the response status code is 201 Created, the current block is uploaded and we continue to the next one. If the response status code is 403 Forbidden, the upload of the block has failed due to an invalid SAS; this is where I mark that particular failed block as Not Added. Once the loop is over, I retrieve the list of failed blocks and perform the same operation with a new SAS. To perform the same upload operation with the new SAS I am using a popular and basic concept you might also know, called a RECURSIVE FUNCTION.
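
To make the flow concrete, here is a minimal shell sketch of the same retry idea against the Put Block REST operation. This is an illustration only, separate from the downloadable code sample – get_new_sas is a hypothetical helper standing in for your own SAS issuing logic, and the block id must be a URL safe Base64 string. Once every block has returned 201 Created, a final Put Block List call commits the blob.

upload_block() {
    local sas_url=$1 block_id=$2 chunk_file=$3
    # Put Block: append comp=block and the Base64 block id to the SAS signed blob URL
    status=$(curl -s -o /dev/null -w "%{http_code}" -X PUT \
        --data-binary @"$chunk_file" \
        "${sas_url}&comp=block&blockid=${block_id}")
    if [ "$status" = "201" ]; then
        echo "block ${block_id} uploaded"                   # Created – continue to the next block
    elif [ "$status" = "403" ]; then
        sas_url=$(get_new_sas)                              # hypothetical helper – issues a fresh SAS
        upload_block "$sas_url" "$block_id" "$chunk_file"   # recursive retry with the new SAS
    fi
}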