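First, here is what I want to do:
- I want to receive the (converted) handwritten notes from my reMarkable by e-mail in my mailbox.
- I use getmail to fetch the mailbox and store the e-mails locally.
- After that, I want to convert the message (mail format) to text. I then send the text file via WebDAV to my own Nextcloud server so that I can use the notes in an app.
I found that the following command works really well: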
F=/path/to/file
reformime -e -s "$(reformime -i <"$F" | fgrep -B1 'content-type: text/plain' | head -n1 | cut -c 10- )" <"$F" > file.txt
The command above works for e-mails sent from a normal mail client. For e-mails sent from the reMarkable it fails with the following error: Segmentation fault (core dumped)
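One plausible cause, which is an assumption and not verified here, is that the lookup finds no text/plain section in the reMarkable mail, so reformime -e -s is called with an empty section ID. The sketch below separates the lookup from the extraction so that an empty result can be caught. The helper name first_section_of_type is hypothetical, and the "section:" / "content-type:" line layout is assumed from the fgrep pattern in the command above.

```shell
# first_section_of_type TYPE
# Reads a `reformime -i`-style listing on stdin and prints the ID of the
# first section whose content-type equals TYPE. Prints nothing if no
# section matches. (Hypothetical helper, not part of reformime.)
first_section_of_type() {
    awk -v want="$1" '
        $1 == "section:"                    { id = $2 }        # remember the current section ID
        $1 == "content-type:" && $2 == want { print id; exit } # first match wins
    '
}

# Possible use with a guard (commented out; F is the mail file as above):
# id=$(reformime -i <"$F" | first_section_of_type text/plain)
# if [ -n "$id" ]; then
#     reformime -e -s "$id" <"$F" >file.txt
# else
#     echo "no text/plain section in $F" >&2
# fi
```

With the guard in place, a mail without a text/plain section produces a readable message instead of passing an empty argument to reformime. The same helper could be called with text/html to look for an HTML section instead.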
Excerpt from the e-mail sent by the mail client:
X-CMAE-Envelope:
MS4xfA2xbyQ//lzABNRHb8Owbn2q85OGHF5woA2zecQmWfP34ITLP1+7URNVzTP8FH22v0GsDm468r+OHNHUg2mXUPylSWAjPhu7os3WCo8tzghm1RWpN7U0
oz8YZDkiHRSOmCEYq3Vy8sw5QyTnqPEHp5Pi04lCBrhILKiGfkMCW/WeDhgBrlnoscipapZOym4H1IHf616HEG/OySE4QQRaUpg=
X-getmail-retrieved-from-mailbox: INBOX
----_com.samsung.android.email_11924222509547450
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: base64
TGVvbiBXYXJlIHZvbiBNaXI6LSBMb2dpdGVjaCBMZW5rcmFkIHQgSGFuZGVsLSBQUyA0IGlua2wu
IDIgY29udHJvbGxlci0gTGVub3ZvIE1vbml0b3JWb24gbWVpbmVtL21laW5lciBHYWxheHkgZ2Vz
ZW5kZXQ=
----_com.samsung.android.email_11924222509547450
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: base64
PGh0bWw+PGhlYWQ+PG1ldGEgaHR0cC1lcXVpdj0iQ29udGVudC1UeXBlIiBjb250ZW50PSJ0ZXh0
L2h0bWw7IGNoYXJzZXQ9VVRGLTgiPjwvaGVhZD48Ym9keSBkaXI9ImF1dG8iPkxlb24gV2FyZSB2
b24gTWlyOjxicj4tIExvZ2l0ZWNoIExlbmtyYWQgdCBIYW5kZWw8YnI+LSBQUyA0IGlua2wuIDIg
Y29udHJvbGxlcjxicj4tIExlbm92byBNb25pdG9yPGRpdj48YnI+PC9kaXY+PGRpdj48YnI+PC9k
aXY+PGRpdj48YnI+PC9kaXY+PGRpdj48YnI+PC9kaXY+PGRpdiBpZD0iY29tcG9zZXJfc2lnbmF0
dXJlIj48ZGl2IHN0eWxlPSJmb250LXNpemU6MTJweDtjb2xvcjojNTc1NzU3IiBkaXI9ImF1dG8i
PlZvbiBtZWluZW0vbWVpbmVyIEdhbGF4eSBnZXNlbmRldDwvZGl2PjwvZGl2PjxkaXY+PGJyPjwv
ZGl2PjwvYm9keT48L2h0bWw+
----_com.samsung.android.email_11924222509547450--
Excerpt from the reMarkable e-mail:
X-CMAE-Envelope:
MS4xfNVk55TRNs2B6C8cpuJmkKcvMSRZwxUbpMZhZ7Y8lvumn8xQKEhAdfqgdtv2wCFb2LC7Gxv92llU3SvG1Ut5ElO+JO/4AQwXBNUmDk+lrwcnKvN807J8
w8yA8FUIN2wC5BtLNwazEb5tb12PQREN8jqRc1N3eIA4NpZPPcAx+ZwOpRY03yJn87lAAnzF7l75ZFwTxxLYKu8pZz/ATyZxQ1aFWX5Q7S9WNncfx6GPzK67
yFOrkkPC9honahTHu2RGig==
X-getmail-retrieved-from-mailbox: INBOX
DOCTYPE html>
html>
body>p>**This is the Note**/p>p>br/><p>/body>
/html>br>br>--br>Sent from my reMarkable paper tabletbr>Get yours at www.remarkable.combr>br>PS: You cannot reply to this emailbr>
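The reMarkable message contains a !DOCTYPE html block.
I had to edit the HTML part slightly, because otherwise this page would interpret it as markup and not display it. That HTML block is the note itself, and it is the part the command cannot convert.
Can both types be converted with the above command?
Many thanks, Bruno
Answer 1
Answer revised!
Save the following script to a file named "mail2text".
Usage: bash mail2text --help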
#!/usr/bin/bash
declare mail_file output_file output_dir append_prefix\
section_ID section_mimeType section_charset\
choosen_Indices section_ID skip_RegExp section_charset conv_charset="UTF-8"\
command_name option report help\
stderrC=$'\e[38;2;240;80;0m' indexC=$'\e[38;2;0;200;0m' previewC=$'\e[38;2;200;200;0m'
declare -a sections contents opt_subArgs
declare -A skip_type=(["typeID"]=1 ["invert"]=0 ["ignoreCase"]=1 ["0"]="typeID" ["1"]="invert" ["2"]="ignoreCase")
declare -i i sections_L preview_chars=200 convHTMLToText=1 doConvCharset dontOverWrite=0 append_type append_digits=4 main_exit_code=-1 HTMLtoText_width=79
command_name=$(readlink -f "$0"); command_name=${command_name##*/}
if [[ $1 == "--help" ]]; then
help=$(cat << EOF
\e[38;2;123;183;51m\e[1;4mUsage:\e[39m\e[22;24;4:0m $command_name [\e[38;2;240;240;0m\e[3mOPTION...\e[39m\e[23m] [\e[38;2;240;240;0m\e[3mMAIL-FILE\e[39m\e[23m] [\e[38;2;240;240;0m\e[3mOUTPUT-FILE\e[39m\e[23m]
Extracts mail mimeType sections of text/plain or text/html and converts text/html to text/plain.
If more than one section is found, lets you choose the sections - \e[3mseparator ";" (x;y;...)\e[23m
If no OUTPUT-FILE is given, writes to STDOUT.
\e[49m
\e[38;2;123;183;51m\e[1;4mDepends On:\e[39m\e[22;24;4:0m commands - reformime, html2text
\e[38;2;123;183;51m\e[1;4mOptions:\e[39m\e[22;24;4:0m
The option argument separator ':' can be escaped with '\\\\'.
Dont escape '\\\\' with '\\\\\\\\' to prohibit separator escaping and keep the literal meaning, it will count as double '\\\\''\\\\'. Use double quotes on part!
Double quotes will be removed at start and end position of splitted part. If option argument ends with '\\\\', use double quotes on the last part.
\e[3mArgument examples: $command_name -a \e[38;2;240;240;0m\e[3m'\e[39m1\e[38;2;240;240;0m\e[3m:\e[39mapp\\\\:en\\\\:d me\e[38;2;240;240;0m\e[3m:\e[39m\e[23m6\e[38;2;240;240;0m\e[3m'\e[39m or -a \e[38;2;240;240;0m\e[3m'\e[39m1\e[38;2;240;240;0m\e[3m:\e[39m"append me\\\\"\e[38;2;240;240;0m\e[3m:\e[39m6\e[38;2;240;240;0m\e[3m'\e[39m or \e[38;2;240;240;0m\e[3m'\e[39m1\e[38;2;240;240;0m\e[3m:\e[39m"app\\\\\\\\en\\\\\\\\:d me\\\\\\\\"\e[38;2;240;240;0m\e[3m'\e[39m
Argument parts: '1', 'app:en:d me', '6' - '1', 'append me\\\\', '6' - 1, 'app\\\\\\\\en\\\\:d me\\\\\\\\'\e[39m\e[23m
-\e[38;2;145;246;214ma\e[39m \e[38;2;240;240;0m\e[3mappend\e[39m\e[23m \e[38;2;103;134;250mSTR\e[39m ... do not overwrite existing file! - [\e[38;2;240;240;0m\e[3mtypeID\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m]:[\e[38;2;240;240;0m\e[3mprefix\e[39m\e[23m \e[38;2;103;134;250mSTR\e[39m]:[\e[38;2;240;240;0m\e[3mdigits\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m]
\e[38;2;240;240;0m\e[3mtypeID\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m [1-2]
1 ... append data to file using the defined prefix.
2 ... change filename by appending a continuing number at end,
respecting a filename extension. (filenameXXXX.ext)
\e[38;2;240;240;0m\e[3mprefix\e[39m\e[23m \e[38;2;103;134;250mSTR\e[39m optional (Standard: "")
prefix for appending mail section content.
\e[38;2;240;240;0m\e[3mdigits\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m optional (Standard: 4)
minimum digits of continuing number
-\e[38;2;145;246;214mc\e[39m \e[38;2;240;240;0m\e[3mconv_charset\e[39m\e[23m \e[38;2;103;134;250mSTR\e[39m ... convert section content to this charset. (Standard: UTF-8)
for supported charsets execute "iconv -l". The empty string ("") means keep charset.
-\e[38;2;145;246;214md\e[39m do not convert text/html to text/plain!
-\e[38;2;145;246;214mp\e[39m \e[38;2;240;240;0m\e[3mpreview_chars\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m ... character length of section content preview, if more sections are found! (Standard: 200)
-\e[38;2;145;246;214mr\e[39m \e[38;2;240;240;0m\e[3mskip_RegExp\e[39m\e[23m \e[38;2;103;134;250mSTR\e[39m ... GAWK regular expression pattern for skipping data in mail section content.
-\e[38;2;145;246;214mt\e[39m \e[38;2;240;240;0m\e[3mskip_type\e[39m\e[23m \e[38;2;103;134;250mSTR\e[39m ... define skip type - [\e[38;2;240;240;0m\e[3mtypeID\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m]:[\e[38;2;240;240;0m\e[3minvert\e[39m\e[23m \e[38;2;103;134;250mBOOL\e[39m]:[\e[38;2;240;240;0m\e[3mignoreCase\e[39m\e[23m \e[38;2;103;134;250mBOOL\e[39m]
\e[38;2;240;240;0m\e[3mtypeID\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m [1-2] (Standard: 1)
1 ... skip from first-match-start-position to section-content-end
or from section-content-start to first-match-start-position. (inverse skipping)
2 ... remove globally matches from section content
or print only globally matches from section content. (inverse skipping)
\e[38;2;240;240;0m\e[3minvert \e[38;2;103;134;250mBOOL\e[39m [0 or >0] optional (Standard: 0)
>0 ... invert skipping
\e[38;2;240;240;0m\e[3mignoreCase\e[39m\e[23m \e[38;2;103;134;250mBOOL\e[39m [0 or >0] optional (Standard: 1)
>0 ... ignore case
-\e[38;2;145;246;214mw\e[39m \e[38;2;240;240;0m\e[3mHTML_width\e[39m\e[23m \e[38;2;103;134;250mINT\e[39m ... By default, html2text formats the HTML documents for a screen width of 79 characters
GAWK RegExp description - https://www.gnu.org/software/gawk/manual/html_node/Regexp.html
For mimeType "text/html" the pattern search is executed on HTML, before converting to "text/plain".
\e[38;2;123;183;51m\e[1;4mExamples:\e[39m\e[22;24;4:0m
$command_name -p 300 -r 'hello' -t 1:1 [\e[38;2;240;240;0m\e[3mMAIL-FILE\e[39m\e[23m]
sets preview characters to 300 - A preview will be shown, if more sections are found.
sets a RegExp, which will find "hello" in content data,
sets typeID to 1 and invert to 1 - the data from section-content-start to first-match-start of "hello" will be skipped.
$command_name -r 'he+llo' -t 2 [\e[38;2;240;240;0m\e[3mMAIL-FILE\e[39m\e[23m]
sets a RegExp, which will find "hello", "heello", "heeello", ... in content data,
sets typeID to 2 - the matches on content data will be removed globally.
$command_name -r '[^\\\\n]*hello[^\\\\n]*\\\\n' -t 2:1 [\e[38;2;240;240;0m\e[3mMAIL-FILE\e[39m\e[23m]
sets a RegExp, which will match a line with hello,
sets typeID to 2 and invert to 1 - only the globally matches on content data will be printed.
\e[48;2;245;0;10mBug reports to: https://unix.stackexchange.com/questions/773264/convert-e-mail-to-text-reformime\e[49m
EOF
); printf "%b" "$help" | less -rXP "Hit 'q' to exit ---"; exit
fi
str_split_ES ()
{
local -n LOC_ARR=$2; local -i LOC_i=${#LOC_ARR[*]}
local LOC_part LOC_partCarry=""
for LOC_part in $1$IFS; do
if [[ $LOC_part && ${LOC_part: -1} == '\' ]]; then
LOC_partCarry+=${LOC_part:0: -1}$IFS
else
LOC_part=${LOC_partCarry}${LOC_part%'"'};
LOC_ARR[LOC_i++]=${LOC_part#'"'}
LOC_partCarry=""
fi
done
[[ -z ${LOC_ARR[--LOC_i]} ]] && unset "LOC_ARR[LOC_i]";
}
echo -en "$stderrC" >/dev/tty; set -o noglob
while getopts "a:c:dp:r:t:w:" "option"; do
case $option in
"a") dontOverWrite=1; opt_subArgs=(); IFS=":" str_split_ES "$OPTARG" "opt_subArgs"; append_type=${opt_subArgs[0]}; append_prefix=${opt_subArgs[1]}; append_digits=${opt_subArgs[2]:-$append_digits};;
"c") conv_charset=$OPTARG;;
"d") convHTMLToText=0;;
"p") preview_chars=$OPTARG;;
"r") skip_RegExp=$OPTARG;;
"t") opt_subArgs=(); IFS=":" str_split_ES "$OPTARG" "opt_subArgs"; for ((i=0;i<${#opt_subArgs[*]};i++)); do skip_type[${skip_type[$i]}]=${opt_subArgs[i]}; done;;
"w") HTMLtoText_width=$OPTARG;;
"?") exit 2;;
esac
done
shift $((OPTIND-1)); [[ ${#*} -gt 2 ]] && { echo "$command_name - too many arguments after recognized options - allowed [MAIL-FILE] [OUTPUT-FILE]- check your options!" 1>&2 ; exit 2; }
mail_file=$1; output_file=$2;
if [[ $output_file ]]; then
output_dir=$(dirname "$output_file")'/'
[[ ! -d $output_dir ]] && mkdir -p "$output_dir"
fi
skip_text ()
{
awk -v skip_RegExp="$skip_RegExp" '
BEGIN {
RS="^$"; FS=""; skip_type='${skip_type[typeID]}'; skip_invert='${skip_type[invert]}'; skip_ic='${skip_type[ignoreCase]}';
section_mimeType="'"$section_mimeType"'"; section_charset="'"$section_charset"'"; convHTMLToText='$convHTMLToText'; conv_charset="'"$conv_charset"'";
}
{ if (section_mimeType=="text/html")
{
IGNORECASE=1;
if (conv_charset && conv_charset != section_charset)
{
gsub(/<meta *http-equiv[^>]*>/, "");
sub(/<head>/, "<head>\n<meta http-equiv=\"content-type\" content=\"text/html; charset=" conv_charset "\">");
}
else if (convHTMLToText && match($0, /<meta *http-equiv[^>]*charset/)==0)
sub(/<head>/, "<head>\n<meta http-equiv=\"content-type\" content=\"text/html; charset=" section_charset "\">");
}
if (skip_RegExp=="") { printf "%s", $0; exit }
IGNORECASE=skip_ic;
switch (skip_type)
{
case 1: match($0, skip_RegExp);
if (RSTART>0)
{
if (skip_invert) printf "%s", substr($0, RSTART, length($0)-RSTART+1);
else printf "%s", substr($0, 1, RSTART-1);
}
else printf "%s", $0;
break;
case 2: if (skip_invert)
{
text=$0; while (match(text, skip_RegExp))
{
printf "%s", substr(text, RSTART, RLENGTH);
text=substr(text, RSTART+RLENGTH, length(text)-RSTART-RLENGTH+1);
}
}
else { gsub(skip_RegExp, ""); printf "%s", $0 }
break;
}
}'
}
convert_charset ()
{
if ((doConvCharset)); then echo -n "${contents[$1]}" | iconv -f "$section_charset" -t "$conv_charset"
else echo -n "${contents[$1]}"
fi
}
print_sectionContent ()
{
local conv_report="" data filename filename_wext filename_ext append_number_STR
local -i i pipe_failExit append_number
local -a pipe_exits
((main_exit_code<0)) && main_exit_code=0;
section_charset=${sections[$1]#*charset:}; section_charset=${section_charset%%$'\n'*}
section_charset=${section_charset//['"' ]/}; section_charset=${section_charset^^};
if [[ $conv_charset && $conv_charset != "$section_charset" ]]; then
doConvCharset=1; conv_report="charset:$conv_charset"
else doConvCharset=0
fi
if [[ ${sections[$1]} == *text/html* ]]; then
section_mimeType="text/html"
if ((convHTMLToText)); then
conv_report="text/plain-"$conv_report
data=$(convert_charset $1 | skip_text | html2text -width "$HTMLtoText_width" | iconv -c -f "${conv_charset:-$section_charset}" -t "${conv_charset:-$section_charset}"; echo -n ";${PIPESTATUS[*]}")
else
data=$(convert_charset $1 | skip_text; echo -n ";${PIPESTATUS[*]}")
fi
else
section_mimeType="text/plain"
data=$(convert_charset $1 | skip_text; echo -n ";${PIPESTATUS[*]}")
fi
: "${data%;*}"; i=${#_}; pipe_exits=(${data:i+1}); data=${data:0:i}
for ((i=0;i<${#pipe_exits[*]};i++)); do ((pipe_failExit=pipe_failExit | pipe_exits[i])); done
if ((pipe_failExit)); then
report+="pipe failed (command in pipe, exit status >0) - section [$(($1+1))] of mimeType ${section_mimeType}-charset:$section_charset!"$'\n'
return 4
else
if [[ -z $output_file ]]; then echo -ne "\e[1;0m" >/dev/tty; echo -n "$data"; echo -en "$stderrC" >/dev/tty;
elif [[ -f "$output_file" ]]; then
if ((dontOverWrite)); then
case $append_type in
1) [[ -z $append_prefix ]] && echo -n "$data" >> "$output_file" || echo -n "$append_prefix$data" >> "$output_file";;
2) filename=${output_file##*/}; filename_wext=${filename%"."*}; filename_ext=${filename:${#filename_wext}}
filename=${filename_wext%%*([0-9])};
append_number_STR=${filename_wext:${#filename}}; append_number=${append_number_STR##*([0])};
while :; do
((append_number++))
printf -v append_number_STR "%0.${append_digits}d" "$append_number"
output_file="$output_dir$filename$append_number_STR$filename_ext"
[[ ! -f $output_file ]] && break;
done
echo -n "$data" > "$output_file";;
*) echo "append typeID not found - stopped executing at mail-file < ${mail_file}" 1>&2 ; exit 255;;
esac
else echo -n "$data" >"$output_file"
fi
else echo -n "$data" >"$output_file"
fi
if (($?==0)); then
if [[ $conv_report ]]; then
[[ ${conv_report:${#conv_report}-1:1} == "-" ]] && conv_report=${conv_report:0:-1}
conv_report="converted to $conv_report "
fi
report+="section [$(($1+1))] of mimeType ${section_mimeType}-charset:$section_charset ${conv_report}written to ${output_file:-stdout}!"$'\n'
else
report+="write error - wasnt able to write section [$(($1+1))] of mimeType ${section_mimeType}-charset:$section_charset!"$'\n'
return 3
fi
fi
}
[[ -z $mail_file || ! -f $mail_file ]] && { echo "$command_name - mail-file < ${mail_file}"$'\n'"no mail file found!" 1>&2; exit 127; }
IFS=$'\x1e'
sections=($(reformime -i <"$mail_file"| { (($?)) && exit; grep -zoP "section[^\n]*\n[^\n]*text/(plain|html)\n([^\n]*\n[^\n])*harset:[^\n]*"; } | sed "s/\x00/\x1e/g"))
sections_L=${#sections[*]}
shopt -s extglob
for ((i=0;i<$sections_L;i++)); do
section_ID=${sections[i]%%$'\n'*}; section_ID=${section_ID#*: }
contents[i]=$(reformime -e -s "$section_ID" < "$mail_file")
done
case $sections_L in
0) echo "$command_name - mail-file < ${mail_file}"$'\n'"no mimeType text/plain or text/html sections found!" 1>&2 ; exit 5;;
1) IFS=":" print_sectionContent 0;;
*) echo $'\e[1;0mFound sections of mimeType text/(plain|html):\n-\e[44b' >/dev/tty
for ((i=0;i<$sections_L;i++)); do
echo -e "\n[$indexC$((i+1))\e[1;0m]${sections[i]}\n$previewC${contents[i]:0:$preview_chars}\e[1;0m" >/dev/tty
done
echo -ne "\nChoose section(s) [x] :\e7" >/dev/tty
while [[ -z $choosen_Indices || $choosen_Indices != +([0-9]*(;)) ]]; do echo -n $'\e8\e[0K' >/dev/tty; read choosen_Indices; done
echo -en "$stderrC" >/dev/tty; IFS=";"; for i in $choosen_Indices; do ((i>0 && i<=sections_L)) && { IFS=":" print_sectionContent $((i-1)); ((main_exit_code+=$?)); }; done ;;
esac
((main_exit_code<0)) && { report="try to enter a section index, that exists!\n"; main_exit_code=6; }
echo -en "\n$command_name - mail-file < ${mail_file}\n$report" 1>&2; echo -en "\e[1;0m" >/dev/tty; exit $main_exit_code
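To connect the script to the workflow from the question (getmail, then text, then Nextcloud over WebDAV), here is a minimal sketch of a batch wrapper. Everything in it is an assumption rather than part of the script above: the Maildir path, the output directory, the Nextcloud URL, the NC_USER and NC_PASS environment variables, and the helper names txt_name_for and convert_and_upload. Note that mail2text writes colour codes to /dev/tty and prompts when it finds several text sections, so this sketch suits runs from a terminal on mails with a single text section, such as the reMarkable mail in the question appears to be.

```shell
# Hypothetical settings, not from the original post: adjust to your setup.
MAILDIR_NEW="${MAILDIR_NEW:-$HOME/Maildir/new}"
OUT_DIR="${OUT_DIR:-$HOME/notes}"
NC_URL="${NC_URL:-https://cloud.example.com/remote.php/dav/files/USER/Notes}"

# Map a Maildir file name to a .txt name: drop the directory part and any
# ":2,<flags>" suffix that Maildir may append, then add ".txt".
txt_name_for() {
    base=${1##*/}
    printf '%s.txt\n' "${base%%:*}"
}

# Convert one mail with mail2text, then upload the result over WebDAV.
# NC_USER and NC_PASS are expected in the environment.
convert_and_upload() {
    txt="$OUT_DIR/$(txt_name_for "$1")"
    bash mail2text "$1" "$txt" || return      # stop if the conversion failed
    curl -fsS -u "$NC_USER:$NC_PASS" -T "$txt" "$NC_URL/${txt##*/}"
}

# Example loop (commented out so that sourcing this file has no side effects):
# for mail in "$MAILDIR_NEW"/*; do
#     [ -f "$mail" ] && convert_and_upload "$mail"
# done
```

The upload uses curl -T, which sends the file with an HTTP PUT. Whether the target folder has to exist beforehand depends on the server, so treat the URL layout as something to check against your own Nextcloud instance.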