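How to Use curl to Download Files From the Linux Command Line

The Linux curl command can do a whole lot more than download files. Find out what curl is capable of, and when you should use it instead of wget.

curl vs. wget: What's the Difference?

People often struggle to identify the relative strengths of the wget and curl commands. The commands do have some functional overlap. Each can retrieve files from remote locations, but that is where the similarity ends.

wget is a fantastic tool for downloading content and files. It can download files, web pages, and directories. It contains intelligent routines to traverse links in web pages and recursively download content across an entire website. It is unsurpassed as a command-line download manager.

curl satisfies an altogether different need. Yes, it can retrieve files, but it cannot recursively navigate a website looking for content to retrieve. What curl actually does is let you interact with remote systems by making requests to those systems, then retrieving and displaying their responses. Those responses might well be web page content and files, but they can also contain data provided by a web service or API as a result of the "question" asked by the curl request.

And curl isn't limited to websites. curl supports over 20 protocols, including HTTP, HTTPS, SCP, SFTP, and FTP. And arguably, thanks to its superior handling of Linux pipes, curl can be more easily integrated with other commands and scripts.

The author of curl has a web page that describes the differences he sees between curl and wget.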

Installing curl

Of the computers used to research this article, Fedora 31 and Manjaro 18.1.0 had curl already installed. curl had to be installed on Ubuntu 18.04 LTS. On Ubuntu, run this command to install it:

sudo apt-get install curl

The curl Version

The --version option makes curl report its version. It also lists all the protocols that it supports.

curl --version
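
Because the protocol list is plain text, a script can check it before relying on a particular protocol. The following is a minimal sketch, assuming the "Protocols:" line that curl prints in its version output; supports_protocol is a hypothetical helper name, not part of curl.

```shell
# Hypothetical helper: succeed if the named protocol appears in the
# "Protocols:" line of `curl --version` output read from stdin.
supports_protocol() {
    awk -v p="$1" '
        $1 == "Protocols:" { for (i = 2; i <= NF; i++) if ($i == p) found = 1 }
        END { exit !found }
    '
}

# Usage:
#   curl --version | supports_protocol sftp && echo "SFTP is available"
```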

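Retrieving a Web Page

If we point curl at a web page, it will retrieve it for us.

curl https://www.bbc.com

But its default action is to dump the page to the terminal window as source code.

Beware: If you don't tell curl you want something stored as a file, it will always dump it to the terminal window. If the file it is retrieving is a binary file, the outcome can be unpredictable. The terminal may try to interpret some of the byte values in the binary file as control characters or escape sequences.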

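Saving Data to a File

Let's tell curl to redirect the output into a file:

curl https://www.bbc.com > bbc.html

This time we don't see the retrieved information; it is sent straight to the file. Because there is no page content competing for the terminal window, curl outputs a set of progress information instead.

It didn't do this in the previous example because the progress information would have been scattered throughout the web page source code, so curl automatically suppressed it.

In this example, curl detects that the output is being redirected to a file and that it is safe to generate the progress information.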

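The information provided is:

% Total: The total amount to be retrieved.
% Received: The percentage and actual values of the data retrieved so far.
% Xferd: The percentage and actual values of the data sent, if data is being uploaded.
Average Speed Dload: The average download speed.
Average Speed Upload: The average upload speed.
Time Total: The estimated total duration of the transfer.
Time Spent: The elapsed time so far for this transfer.
Time Left: The estimated time left for the transfer to complete.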
Current Speed: The current transfer speed for this transfer.
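
How curl decides whether its output is going to a terminal is internal to the program, but a shell script can ask the same question with the standard `[ -t 1 ]` test. This is only an illustrative sketch of the idea; describe_stdout is a hypothetical function name.

```shell
# Hypothetical helper: report whether standard output (file descriptor 1)
# is attached to a terminal or has been redirected to a file or pipe.
describe_stdout() {
    if [ -t 1 ]; then
        echo "terminal"
    else
        echo "redirected"
    fi
}

# Usage:
#   describe_stdout            # run directly in a terminal window
#   describe_stdout | cat      # output is now a pipe
```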

Because we redirected the output from curl to a file, we now have a file called “bbc.html.”

Double-clicking that file will open your default browser so that it displays the retrieved web page.

Note that the address in the browser address bar is a local file on this computer, not a remote website.

We don’t have to redirect the output to create a file. We can create a file by using the -o (output) option, and telling curl to create the file. Here we’re using the -o option and providing the name of the file we wish to create “bbc.html.”

curl -o bbc.html https://www.bbc.com

Using a Progress Bar To Monitor Downloads

To have the text-based download information replaced by a simple progress bar, use the -# (progress bar) option.

curl -# -o bbc.html https://www.bbc.com

Restarting an Interrupted Download

It is easy to restart a download that has been terminated or interrupted. Let’s start a download of a sizeable file. We’ll use the latest Long Term Support build of Ubuntu 18.04. We’re using the --output option to specify the name of the file we wish to save it into: “ubuntu18043.iso.”

curl --output ubuntu18043.iso https://releases.ubuntu.com/18.04.3/ubuntu-18.04.3-desktop-amd64.iso

The download starts and works its way towards completion.

If we forcibly interrupt the download with Ctrl+C , we’re returned to the command prompt, and the download is abandoned.

To restart the download, use the -C (continue at) option. This causes curl to restart the download at a specified point or offset within the target file. If you use a hyphen - as the offset, curl will look at the already downloaded portion of the file and determine the correct offset to use for itself.

curl -C - --output ubuntu18043.iso https://releases.ubuntu.com/18.04.3/ubuntu-18.04.3-desktop-amd64.iso

The download is restarted. curl reports the offset at which it is restarting.
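
On an unreliable connection, the resume step can be repeated automatically. The following is a rough sketch rather than a definitive recipe: download_with_resume is a hypothetical helper, and RETRY_DELAY is an assumed environment variable giving the pause between attempts in seconds.

```shell
# Hypothetical helper (not part of curl): retry an interrupted download,
# resuming from the partial file each time.
# Usage: download_with_resume URL OUTPUT_FILE [MAX_ATTEMPTS]
download_with_resume() {
    attempt=1
    max_attempts="${3:-5}"
    while [ "$attempt" -le "$max_attempts" ]; do
        # -C - lets curl work out the resume offset from the partial file.
        if curl -C - --output "$2" "$1"; then
            return 0
        fi
        echo "Attempt $attempt of $max_attempts failed." >&2
        attempt=$((attempt + 1))
        # RETRY_DELAY is an assumed variable; it defaults to 2 seconds.
        sleep "${RETRY_DELAY:-2}"
    done
    return 1
}
```

The function returns success as soon as one curl run completes, and failure once the attempt limit is used up, so it can be chained with && in a script.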

Retrieving HTTP headers

With the -I (head) option, you can retrieve the HTTP headers only. This is the same as sending the HTTP HEAD command to a web server.

curl -I www.twitter.com

This command retrieves information only; it does not download any web pages or files.
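
Header output is line-oriented text, so a single field can be picked out in a pipeline. Here is a minimal sketch, assuming the usual "Name: value" header layout; header_value is a hypothetical helper, and the -s (silent) option is added in the usage line only to keep the progress information out of the way.

```shell
# Hypothetical helper: print the value of one header from HTTP header
# text read on stdin. Header names are matched case-insensitively.
header_value() {
    # HTTP lines end in CR LF, so drop the carriage returns first.
    tr -d '\r' | awk -v name="$1" '
        {
            i = index($0, ":")
            if (i && tolower(substr($0, 1, i - 1)) == tolower(name)) {
                v = substr($0, i + 1)
                sub(/^[ \t]+/, "", v)   # trim leading whitespace
                print v
                exit
            }
        }
    '
}

# Usage:
#   curl -sI www.twitter.com | header_value content-type
```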

Downloading Multiple URLs

Using xargs we can download multiple URLs at once. Perhaps we want to download a series of web pages that make up a single article or tutorial.

Copy these URLs to an editor and save it to a file called “urls-to-download.txt.” We can use xargs to treat the content of each line of the text file as a parameter which it will feed to curl, in turn.

https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#0
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#1
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#2
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#3
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#4
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#5

This is the command we need to use to have xargs pass these URLs to curl one at a time:

xargs -n 1 curl -O < urls-to-download.txt

Note that this command uses the -O (remote file) output command, which uses an uppercase “O.” This option causes curl to save the retrieved file with the same name that the file has on the remote server.

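The -n 1 option tells xargs to pass a single argument to each curl invocation. Because each line of the text file holds exactly one URL with no spaces in it, every line is handed to its own curl command.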
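
Before starting a long batch, it can help to see exactly which commands xargs will build. One way to do that, sketched here with a hypothetical preview_downloads helper, is to put echo in front of curl so that each command is printed instead of run.

```shell
# Hypothetical helper: dry run. Print the curl command that xargs would
# run for each URL in the given file, without downloading anything.
preview_downloads() {
    xargs -n 1 echo curl -O < "$1"
}

# Usage:
#   preview_downloads urls-to-download.txt
```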

When you run the command, you’ll see multiple downloads start and finish, one after the other.

Checking in the file browser shows the multiple files have been downloaded. Each one bears the name it had on the remote server.

Downloading Files From an FTP Server

Using curl with a File Transfer Protocol (FTP) server is easy, even if you have to authenticate with a username and password. To pass a username and password with curl use the -u (user) option, and type the username, a colon “:”, and the password. Don’t put a space before or after the colon.

This is a free-for-testing FTP server hosted by Rebex. The test FTP site has a pre-set username of “demo”, and the password is “password.” Don’t use this type of weak username and password on a production or “real” FTP server.

curl -u demo:password ftp://test.rebex.net

curl figures out that we’re pointing it at an FTP server, and returns a list of the files that are present on the server.

The only file on this server is a “readme.txt” file, of 403 bytes in length. Let’s retrieve it. Use the same command as a moment ago, with the filename appended to it:

curl -u demo:password ftp://test.rebex.net/readme.txt

The file is retrieved and curl displays its contents in the terminal window.

In almost all cases, it is going to be more convenient to have the retrieved file saved to disk for us, rather than displayed in the terminal window. Once more we can use the -O (remote file) output command to have the file saved to disk, with the same filename that it has on the remote server.

curl -O -u demo:password ftp://test.rebex.net/readme.txt

The file is retrieved and saved to disk. We can use ls to check the file details. It has the same name as the file on the FTP server, and it is the same length, 403 bytes.

ls -hl readme.txt
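
Typing a password directly into a command leaves it in your shell history. As a rough sketch of one alternative, the credentials can be read from variables instead. ftp_fetch is a hypothetical wrapper, and FTP_USER and FTP_PASS are assumed variable names chosen for this example.

```shell
# Hypothetical wrapper: fetch one file from an FTP server, reading the
# credentials from the (assumed) FTP_USER and FTP_PASS variables.
# Usage: ftp_fetch HOST REMOTE_PATH
ftp_fetch() {
    # ${VAR:?} stops with an error if the variable is unset or empty.
    curl -O -u "${FTP_USER:?}:${FTP_PASS:?}" "ftp://$1/$2"
}

# Usage, with the variables set beforehand:
#   ftp_fetch test.rebex.net readme.txt
```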

Sending Parameters to Remote Servers

Some remote servers will accept parameters in requests that are sent to them. The parameters might be used to format the returned data, for example, or they may be used to select the exact data that the user wishes to retrieve. It is often possible to interact with web application programming interfaces (APIs) using curl.

As a simple example, the ipify website has an API that can be queried to ascertain your external IP address.

curl https://api.ipify.org

By adding the format parameter to the command, with the value of “json” we can again request our external IP address, but this time the returned data will be encoded in the JSON format.

curl "https://api.ipify.org?format=json"

Here’s another example that makes use of a Google API. It returns a JSON object describing a book. The parameter you must provide is the International Standard Book Number (ISBN) of a book. You can find these on the back cover of most books, usually below a barcode. The parameter we’ll use here is “0131103628.”

curl "https://www.googleapis.com/books/v1/volumes?q=isbn:0131103628"

The returned data is comprehensive.
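
For queries with several parameters or special characters, curl can assemble and encode the query string itself through its -G and --data-urlencode options. A minimal sketch, assuming the same Google Books endpoint as above; lookup_isbn is a hypothetical helper name.

```shell
# Hypothetical helper: look up a book by ISBN via the Google Books API.
# Usage: lookup_isbn ISBN
lookup_isbn() {
    # -G appends the --data-urlencode value to the URL as a query string,
    # so the shell never sees a bare "?" or "&" that it might misinterpret.
    # -s suppresses the progress information.
    curl -s -G --data-urlencode "q=isbn:$1" \
        "https://www.googleapis.com/books/v1/volumes"
}

# Usage:
#   lookup_isbn 0131103628
```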

Sometimes curl, Sometimes wget

If I wanted to download content from a website and have the tree-structure of the website searched recursively for that content, I’d use wget.

If I wanted to interact with a remote server or API, and possibly download some files or web pages, I’d use curl. Especially if the protocol was one of the many not supported by wget.