How do I extract the table of contents from an fb2 book?

Question 1

Using xmlstarlet:

xmlstarlet select --template \
    --value-of '//_:section/_:title/_:p | //_:subtitle' \
    -nl file.xml

Or, using short options:

xmlstarlet sel -t \
    -v '//_:section/_:title/_:p | //_:subtitle' \
    -n file.xml

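The XPath query used here extracts the value of every p node beneath the title node of each section node, as well as the value of every subtitle node.

In the expressions, the _: prefix before each node name is an anonymous placeholder for the namespace identifier used in the document.

Given the sample document, either of the two commands above would output the following: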

Part 1
Some name of Part 1
Chapter 1
Some name of Chapter 1
Episode 1
Episode 2
Part 2
Some name of Part 2
Chapter 3
Some name of Chapter 3
Episode 3
Episode 4
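
For readers who want to see what the query selects without installing xmlstarlet, here is a minimal Python sketch using only the standard library. The sample document is a hypothetical stand-in (the real file from the question is not reproduced here), and the helper names q and toc_entries are invented for illustration. ElementTree's path language has no union operator, so the sketch walks the tree in document order and applies the two conditions by hand.

```python
import xml.etree.ElementTree as ET

# Assumption: the usual FictionBook 2.0 namespace URI.
FB2_NS = "http://www.gribuser.ru/xml/fictionbook/2.0"

# Hypothetical sample document, much smaller than a real book.
SAMPLE = f"""<FictionBook xmlns="{FB2_NS}">
  <body>
    <title><p>Book title</p></title>
    <section>
      <title><p>Part 1</p><p>Some name of Part 1</p></title>
      <section>
        <title><p>Chapter 1</p></title>
        <subtitle>Episode 1</subtitle>
        <p>Body text.</p>
        <subtitle>Episode 2</subtitle>
      </section>
    </section>
  </body>
</FictionBook>
"""


def q(name):
    """Return the tag name in ElementTree's '{namespace-uri}local' form."""
    return f"{{{FB2_NS}}}{name}"


def toc_entries(root):
    """Collect section/title/p and subtitle values in document order."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: par for par in root.iter() for child in par}
    entries = []
    for el in root.iter():  # pre-order traversal, i.e. document order
        if el.tag == q("subtitle"):
            entries.append("".join(el.itertext()))
        elif el.tag == q("p"):
            title = parent.get(el)
            section = parent.get(title) if title is not None else None
            if (title is not None and title.tag == q("title")
                    and section is not None and section.tag == q("section")):
                entries.append("".join(el.itertext()))
    return entries


if __name__ == "__main__":
    for line in toc_entries(ET.fromstring(SAMPLE)):
        print(line)
```

The book title is skipped because its title node sits under body rather than under a section, which is the same restriction the _:section step imposes in the XPath expression.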

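Want the book title too? Then remove the _:section restriction from the expression, so that its first part becomes //_:title/_:p (doing so will also match the p node of the book title).

A cleaner-looking alternative for getting the titles and subtitles of each section (still excluding the book title), which also makes it explicit that the subtitles are selected from within sections rather than from anywhere in the document, is to first restrict the matching to the sections and then pull the data from each section.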

xmlstarlet select --template \
    --match '//_:section' \
    --value-of '_:title/_:p | _:subtitle' \
    -nl file.xml
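
The match-then-extract idea above can be sketched in Python as well: visit every section first, then read only that section's own title paragraphs and subtitles. This is an illustration under assumptions, not a replacement for the command. The sample document, the fb prefix, and the function names section_entries and toc_by_section are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the usual FictionBook 2.0 namespace URI; the prefix "fb" is arbitrary.
FB = "http://www.gribuser.ru/xml/fictionbook/2.0"
NS = {"fb": FB}

# Hypothetical sample document.
SAMPLE = f"""<FictionBook xmlns="{FB}">
  <body>
    <title><p>Book title</p></title>
    <section>
      <title><p>Part 1</p><p>Some name of Part 1</p></title>
      <section>
        <title><p>Chapter 1</p></title>
        <subtitle>Episode 1</subtitle>
        <subtitle>Episode 2</subtitle>
      </section>
    </section>
  </body>
</FictionBook>
"""


def section_entries(section):
    """Values of title/p and subtitle that are direct children of one section."""
    out = []
    for child in section:  # direct children only, in document order
        if child.tag == f"{{{FB}}}title":
            out.extend("".join(p.itertext()) for p in child.findall("fb:p", NS))
        elif child.tag == f"{{{FB}}}subtitle":
            out.append("".join(child.itertext()))
    return out


def toc_by_section(root):
    """One list of entries per section, outer sections before nested ones."""
    return [section_entries(s) for s in root.iterfind(".//fb:section", NS)]


if __name__ == "__main__":
    for entries in toc_by_section(ET.fromstring(SAMPLE)):
        print("\n".join(entries))
```

Grouping the entries per section keeps the book title out by construction and leaves room to indent or number nested sections later.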

Question 2

With an XPath 3-aware FOSS (GPLv3) command-line tool, xidel:

XPath 2, building a sequence:

xidel -e '(//section/title/p, //subtitle)'  file.xml

XPath1:

xidel -e '//section/title/p | //subtitle'  file.xml

Part 1
Some name of Part 1
Chapter 1
Some name of Chapter 1
Episode 1
Episode 2
Part 2
Some name of Part 2
Chapter 3
Some name of Chapter 3
Episode 3
Episode 4

xidel is a Swiss Army knife for querying XML/HTML/JSON. It is smart enough to handle namespaces on its own.
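
Two details of the xidel queries can be illustrated with a short Python sketch: matching on local names only (no namespace prefix), and the ordering difference between an XPath 2 sequence and an XPath 1 union. In XPath 2 and later the comma operator concatenates sequences, so (A, B) lists every result of A before any result of B, whereas the union A | B returns nodes in document order. The two queries may therefore list the same entries in a different order. The sample document and the helper names local and select are hypothetical, and the sketch imitates the semantics rather than calling xidel.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the namespace URI is assumed.
SAMPLE = """<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">
  <body>
    <section>
      <title><p>Part 1</p></title>
      <subtitle>Episode 1</subtitle>
      <section>
        <title><p>Chapter 1</p></title>
        <subtitle>Episode 2</subtitle>
      </section>
    </section>
  </body>
</FictionBook>
"""


def local(el):
    """Element name with any '{namespace-uri}' part stripped."""
    return el.tag.rsplit("}", 1)[-1]


def select(root):
    """Return (union_order, sequence_order) for the two query styles."""
    parent = {child: par for par in root.iter() for child in par}
    hits = []  # (kind, text) pairs in document order
    for el in root.iter():
        if local(el) == "subtitle":
            hits.append(("subtitle", "".join(el.itertext())))
        elif local(el) == "p":
            t = parent.get(el)
            s = parent.get(t) if t is not None else None
            if (t is not None and local(t) == "title"
                    and s is not None and local(s) == "section"):
                hits.append(("p", "".join(el.itertext())))
    # Union (A | B): plain document order.
    union_order = [text for _, text in hits]
    # Sequence (A, B): all title paragraphs first, then all subtitles.
    sequence_order = ([text for kind, text in hits if kind == "p"]
                      + [text for kind, text in hits if kind == "subtitle"])
    return union_order, sequence_order


if __name__ == "__main__":
    union_order, sequence_order = select(ET.fromstring(SAMPLE))
    print("union:   ", union_order)
    print("sequence:", sequence_order)
```

For a table of contents, document order is usually what is wanted, which is what the union form provides.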

Answer