How can I efficiently copy files from machineB and machineC to machineA using rsync or scp?

Question 1

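The main problem with your script is that it opens a separate scp connection for every file. That is a lot of unnecessary overhead. You could try something like this instead: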

#!/usr/bin/env bash

readonly PRIMARY=/export/home/david/dist/primary
readonly SECONDARY=/export/home/david/dist/secondary
readonly FILERS_LOCATION=(machineB machineC)
readonly MEMORY_MAPPED_LOCATION=/data/pe_t1_snapshot

PRIMARY_PARTITION=(0 548 272 4 544 276 8 556 280 12 552 284 16 256 564 20 260 560 24 264 572)
SECONDARY_PARTITION=(1101 1374 1641 1371 1647 1098 1635 1365 1095 1638 1089 1362 1659 1359)

## Quote the remote command so that the glob is expanded on the remote host, not locally.
dir1=$(ssh -o "StrictHostKeyChecking no" "david@${FILERS_LOCATION[0]}" "ls -dt1 $MEMORY_MAPPED_LOCATION/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]" | head -n1)
dir2=$(ssh -o "StrictHostKeyChecking no" "david@${FILERS_LOCATION[1]}" "ls -dt1 $MEMORY_MAPPED_LOCATION/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]" | head -n1)

## Build the lists of remote filenames before calling rsync.
## Every entry starts with ":" so that rsync reads it as a path on the
## host named in the first source argument.
primary_files=()
for n in "${PRIMARY_PARTITION[@]}"
do
    primary_files+=(":$dir1/t1_weekly_1680_${n}_200003_5.data")
done

## Repeat for $SECONDARY_PARTITION
secondary_files=()
for n in "${SECONDARY_PARTITION[@]}"
do
    secondary_files+=(":$dir2/t1_weekly_1680_${n}_200003_5.data")
done

if [ "$dir1" = "$dir2" ]
then
    ## I am using find largely because the *
    ## in rm -rf "$PRIMARY"/* screws up the syntax
    ## highlighting on the site and it is a good habit to
    ## get into anyway. Feel free to use rm -rf in your script.
    find "$PRIMARY" -mindepth 1 -delete
    find "$SECONDARY" -mindepth 1 -delete

    ## rsync can be run with this format:
    ##   rsync user@host:/target/path1 :/target/path2 :/target/pathN /dest/path
    ##
    ## which is why I added the : in the loops above. So, these commands will
    ## open only 2 connections per file list, one per host. First you try to copy
    ## all $PRIMARY_PARTITION files from machineB, then the same list from machineC.
    ## rsync will complain about files not found (which is why I'm redirecting standard
    ## error to /dev/null) but will continue.
    rsync -avz "david@${FILERS_LOCATION[0]}${primary_files[0]}" "${primary_files[@]:1}" "$PRIMARY"/ 2>/dev/null
    rsync -avz "david@${FILERS_LOCATION[1]}${primary_files[0]}" "${primary_files[@]:1}" "$PRIMARY"/ 2>/dev/null

    ## Do the same for the $SECONDARY_PARTITION files
    rsync -avz "david@${FILERS_LOCATION[0]}${secondary_files[0]}" "${secondary_files[@]:1}" "$SECONDARY"/ 2>/dev/null
    rsync -avz "david@${FILERS_LOCATION[1]}${secondary_files[0]}" "${secondary_files[@]:1}" "$SECONDARY"/ 2>/dev/null
fi
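
To check the source list before copying anything, the multi-source form described in the script comments can be assembled separately and printed. This is only a sketch: `build_sources` is a hypothetical helper, and the snapshot directory `20140317` is a made-up example of the eight-digit directory name, not a value from the question.

```shell
#!/usr/bin/env bash
# Build rsync's multi-source argument list:
#   "user@host:/first/file" ":/second/file" ":/third/file" ...
# build_sources is a hypothetical helper; it only fills the global array
# "sources" and never contacts a remote machine.
build_sources() {
    local remote=$1 dir=$2
    shift 2
    sources=()
    local n prefix=$remote
    for n in "$@"; do
        sources+=("${prefix}:${dir}/t1_weekly_1680_${n}_200003_5.data")
        prefix=""   # later sources reuse the host of the first one
    done
}

# Assumed example values: the directory name is invented for illustration.
build_sources david@machineB /data/pe_t1_snapshot/20140317 0 548 272
printf '%s\n' "${sources[@]}"

# To see what rsync would transfer without copying anything, add -n (--dry-run):
#   rsync -avzn "${sources[@]}" /export/home/david/dist/primary/
```

Keeping the sources in an array means each path stays a separate argument, so the first one carries the host and the rest begin with a bare colon.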

Question 2

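What rsync is good at: it copies only the files that have changed and ignores files you do not want copied (you can specify any pattern; the -C switch, for example, excludes the same files that CVS would exclude from a repository). It copies whole directory structures recursively (transferring only the necessary changes, of course, not everything). It has an option to compress the stream to speed up the transfer. It also performs the complete copy within a single connection, which makes it faster.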
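
A small local experiment can make these features concrete. The sketch below assumes rsync is installed and uses throwaway directories created with `mktemp`; all file names are invented for the demonstration, and `-z` is left out because compression only matters over a network link.

```shell
#!/usr/bin/env bash
# Hypothetical demo on scratch directories; nothing here touches a remote host.
if command -v rsync >/dev/null 2>&1; then
    src=$(mktemp -d)
    dst=$(mktemp -d)
    mkdir "$src/sub"
    printf 'hello\n' > "$src/a.txt"
    printf 'world\n' > "$src/sub/b.txt"
    printf 'junk\n'  > "$src/build.o"   # *.o is on rsync's built-in CVS-style exclude list

    # -a: recurse and preserve permissions and timestamps; -C: CVS-style excludes.
    rsync -aC "$src/" "$dst/"

    # Change one file, then sync again; --itemize-changes lists what rsync updates.
    printf 'hello again\n' > "$src/a.txt"
    second_run=$(rsync -aC --itemize-changes "$src/" "$dst/")
    printf '%s\n' "$second_run"
else
    echo "rsync is not installed; skipping the demo"
fi
```

The second run should list only `a.txt`, since `sub/b.txt` is unchanged and `build.o` is excluded by `-C`.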

Since you are copying only a single file, most of these features go unused. You would use

rsync -avz "$firstfile" "$secondfile"

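For a single file this works just like scp, apart from the flags (a is archive mode, which preserves permissions and timestamps; v is verbose; z is compression).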

However, you can also use scp with compression:

scp -p -C …

I think this is the simplest solution here. Just add the flags and you are done.

Answer