Backup diff of "jparsec入門" (Introduction to jparsec), No.5

[[List of parsing articles>構文解析の記事一覧]]
*Table of Contents [#yca86807]
#contents
*jparsec [#e9851712]
http://jparsec.codehaus.org/

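A framework for building parsers.

Unlike YACC, it does not require an external grammar file.

There is also a Ruby version, called rparsec. The first letter of the language name is used to tell the versions apart.
There is a Haskell version too, but since that one is the original, it is simply called parsec.

*Tutorial [#sf63f7e4]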
http://jparsec.codehaus.org/jparsec2+Tutorial

**A Japanese blog post that uses jparsec [#x951ebb7]

You can think of it as roughly a Japanese translation of the tutorial.

http://d.hatena.ne.jp/taichitaichi/20071008/1191808121

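*My own tutorial [#la10fd7b]
As it turns out, when you download the jparsec source code, it comes with sample code for a calculator,

and that sample is much cleaner and more refreshing than the code explained in the tutorial.

So I will work backwards from this sample, using observation plus imagination to reconstruct the steps that lead to it, and turn that into my own tutorial! That's what a real man does.

**Goal [#k0421c8f]
Having a clear goal is, all by itself, a happy thing.

Having written that, this part is still a work in progress.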

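 import static org.codehaus.jparsec.Scanners.isChar;
 
 import org.codehaus.jparsec.OperatorTable;
 import org.codehaus.jparsec.Parser;
 import org.codehaus.jparsec.Scanners;
 import org.codehaus.jparsec.functors.Binary;
 import org.codehaus.jparsec.functors.Map;
 import org.codehaus.jparsec.functors.Unary;
 
 /**
 * The main calculator parser.
 * 
 * @author Ben Yu
 */
 public final class Calculator {
  
  /** Parses {@code source} and evaluates to an {@link Integer}. */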
  public static int evaluate(String source) {
    return parser().parse(source);
  }
  
  static final Parser<Integer> NUMBER = Scanners.INTEGER.map(new Map<String, Integer>() {
    public Integer map(String text) {
      return Integer.valueOf(text);
    }
  });
  
  static final Binary<Integer> PLUS = new Binary<Integer>() {
    public Integer map(Integer a, Integer b) {
      return a + b;
    }
  };
  
  static final Binary<Integer> MINUS = new Binary<Integer>() {
    public Integer map(Integer a, Integer b) {
      return a - b;
    }
  };
  
  static final Binary<Integer> MUL = new Binary<Integer>() {
    public Integer map(Integer a, Integer b) {
      return a * b;
    }
  };
  
  static final Binary<Integer> DIV = new Binary<Integer>() {
    public Integer map(Integer a, Integer b) {
      return a / b;
    }
  };
  
  static final Binary<Integer> MOD = new Binary<Integer>() {
    public Integer map(Integer a, Integer b) {
      return a % b;
    }
  };
  
  static final Unary<Integer> NEG = new Unary<Integer>() {
    public Integer map(Integer i) {
      return -i;
    }
  };
  
  private static <T> Parser<T> op(char ch, T value) {
    return isChar(ch).retn(value);
  }
  
  static Parser<Integer> parser() {
    Parser.Reference<Integer> ref = Parser.newReference();
    Parser<Integer> term = ref.lazy().between(isChar('('), isChar(')')).or(NUMBER);
    Parser<Integer> parser = new OperatorTable<Integer>()
        .prefix(op('-', NEG), 100)
        .infixl(op('+', PLUS), 10)
        .infixl(op('-', MINUS), 10)
        .infixl(op('*', MUL), 20)
        .infixl(op('/', DIV), 20)
        .infixl(op('%', MOD), 20)
        .build(term);
    ref.set(parser);
    return parser;
  }
 }
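
To see what the precedence numbers in the operator table above mean, here is a minimal plain-Java sketch that does not use jparsec at all. The class name PrecedenceSketch and all of its methods are made up for this illustration. It assumes the same rules as the sample (10 for + and -, 20 for *, / and %, prefix minus binding tightest, everything left-associative) and implements them with ordinary precedence climbing, which is not necessarily how OperatorTable works internally.

```java
/** Plain-Java sketch of the calculator grammar above; this is not jparsec code. */
final class PrecedenceSketch {
    private final String src;
    private int pos = 0;

    private PrecedenceSketch(String src) {
        this.src = src;
    }

    /** Evaluates an expression such as "1+2*3" (no whitespace, as in the sample). */
    static int evaluate(String source) {
        PrecedenceSketch p = new PrecedenceSketch(source);
        int value = p.expr(0);
        if (p.pos != source.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return value;
    }

    /** Infix precedence, mirroring the table: 10 for + and -, 20 for *, / and %. */
    private static int precedence(char op) {
        switch (op) {
            case '+': case '-': return 10;
            case '*': case '/': case '%': return 20;
            default: return -1; // not an infix operator
        }
    }

    /** Parses a chain of infix operators whose precedence is at least minPrec. */
    private int expr(int minPrec) {
        int left = unary();
        while (pos < src.length()) {
            char op = src.charAt(pos);
            int prec = precedence(op);
            if (prec < 0 || prec < minPrec) {
                break;
            }
            pos++;
            // Left associativity: the right operand may only contain tighter operators.
            int right = expr(prec + 1);
            left = apply(op, left, right);
        }
        return left;
    }

    /** Prefix minus binds tighter than every infix operator. */
    private int unary() {
        if (pos < src.length() && src.charAt(pos) == '-') {
            pos++;
            return -unary();
        }
        return term();
    }

    /** term = '(' expr ')' | NUMBER */
    private int term() {
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++;
            int value = expr(0);
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("')' expected at " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("number expected at " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    private static int apply(char op, int a, int b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            case '/': return a / b;
            default: return a % b; // '%'
        }
    }
}
```

For example, PrecedenceSketch.evaluate("1+2*3") multiplies first because 20 is higher than 10, and PrecedenceSketch.evaluate("10-4-3") groups to the left as (10-4)-3.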

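*Example [#qc0d3069]
The article I was using as an example was still readable this morning, but it now returns a 404, so I hurriedly saved it from Google's cache.
The article is "I Hate Anonymous Classes".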

**URL [#c041eb23]
http://docs.codehaus.org/display/JPARSEC/I+Hate+Anonymous+Classes 

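It may return a 404 and no longer be viewable.

**Body [#sa51b147]
*On the core idea of jparsec (original title: I Hate Anonymous Classes!) [#v1db40ab]

As you may have guessed, jparsec as a whole is built on a functional way of thinking. Once you start using jparsec, you run into implementations of Map, Map2, Map3, and so on. And filling in all the types gets tedious.

For example, suppose you have the following parsers:
-Parser<A>
-Parser<B>
-Parser<C>
and you want to run them in sequence and use their results to build an instance of class D. That looks like this: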

 Parser<D> d = Parsers.sequence(a, b, c, new Map3<A, B, C, D>() {
  public D map(A a, B b, C c) {
    return new D(a, b, c);
  }
 });

Well, that code isn't so bad, is it? Now let's substitute concrete class names for A, B, and C.
|A|UnbelievableGadget<Ipod>|
|B|IncredibleCartoon<Panda<KungFu>>|
|C|ViciouslyBeautiful<KingKong>|

After the substitution, the code becomes:

 Parser<D> d = Parsers.sequence(a, b, c, new Map3<UnbelievableGadget<Ipod>,  IncredibleCartoon<Panda<KungFu>>, ViciouslyBeautiful<KingKong>, D>() {
  public D map(UnbelievableGadget<Ipod> ipod, IncredibleCartoon<Panda<KungFu>> panda,  ViciouslyBeautiful<KingKong> kingkong) {
    return new D(ipod, panda, kingkong);
  }
 });


Does that still look easy to use?
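
To make it concrete what a sequence call with a Map3 does, here is a minimal plain-Java sketch that does not depend on jparsec. Everything in it (SequenceSketch, Result, the tiny Parser interface, integer, isChar, parseSum) is made up for this illustration and only borrows names from the text above. It runs three parsers one after another and hands their three results to a Map3, written as an anonymous class just like in the article.

```java
/** Plain-Java sketch of sequencing three parsers and mapping their results; not jparsec code. */
final class SequenceSketch {

    /** A successful parse: the produced value and the index of the next unread character. */
    static final class Result<T> {
        final T value;
        final int next;

        Result(T value, int next) {
            this.value = value;
            this.next = next;
        }
    }

    /** A parser returns a Result, or null when it does not match at pos. */
    interface Parser<T> {
        Result<T> parse(String input, int pos);
    }

    /** Three-argument mapping function, named after the Map3 used in the text. */
    interface Map3<A, B, C, D> {
        D map(A a, B b, C c);
    }

    /** Matches one or more digits and yields them as an Integer. */
    static Parser<Integer> integer() {
        return (input, pos) -> {
            int end = pos;
            while (end < input.length() && Character.isDigit(input.charAt(end))) {
                end++;
            }
            if (end == pos) {
                return null;
            }
            return new Result<Integer>(Integer.valueOf(input.substring(pos, end)), end);
        };
    }

    /** Matches exactly the character ch. */
    static Parser<Character> isChar(char ch) {
        return (input, pos) -> {
            if (pos < input.length() && input.charAt(pos) == ch) {
                return new Result<Character>(ch, pos + 1);
            }
            return null;
        };
    }

    /** Runs pa, pb and pc in order; if all three match, combines their values with f. */
    static <A, B, C, D> Parser<D> sequence(
            Parser<A> pa, Parser<B> pb, Parser<C> pc, Map3<A, B, C, D> f) {
        return (input, pos) -> {
            Result<A> a = pa.parse(input, pos);
            if (a == null) {
                return null;
            }
            Result<B> b = pb.parse(input, a.next);
            if (b == null) {
                return null;
            }
            Result<C> c = pc.parse(input, b.next);
            if (c == null) {
                return null;
            }
            return new Result<D>(f.map(a.value, b.value, c.value), c.next);
        };
    }

    /** Parses input of the form "3+4" and returns the sum, or null if it does not match. */
    static Integer parseSum(String input) {
        Parser<Integer> sum = sequence(integer(), isChar('+'), integer(),
                new Map3<Integer, Character, Integer, Integer>() {
                    public Integer map(Integer x, Character plus, Integer y) {
                        return x + y;
                    }
                });
        Result<Integer> r = sum.parse(input, 0);
        return r == null ? null : r.value;
    }
}
```

Even in this toy version, the type parameters of the anonymous Map3 take up more space than the one line that actually does the work, which is exactly the complaint of the article.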

**The Mapper class: writing Haskell-like code in Java [#r2d724a4]

In a dynamic language such as Ruby, the same thing is not nearly so unwieldy. It looks like this:

 d = sequence(a, b, c) do |ipod, panda, kingkong|
   D.new(ipod, panda, kingkong)
 end

And in Haskell, the favorite of functional-language geeks, you can write it like this:

 d = sequence a b c D

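Argh, that is way too concise. What's that? You say you are a Java programmer?

Before you give up in despair, hold on. There is still hope. Some coders have built something that lets you write equivalent code in Java, too!

For you, we present the Mapper class! Written Ruby-style, it goes like this: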

 Parser<D> d = new Mapper<D>() {
  D map(UnbelievableGadget<Ipod> ipod, IncredibleCartoon<Panda<KungFu>> panda, ViciouslyBeautiful<KingKong> kingkong) {
    return new D(ipod, panda, kingkong);
  }
 }.sequence(a, b, c);

What's that? "That's not what you promised! There are still angle brackets all over the place!", you say?

Java programmers, don't despair. There is one more interesting thing.

It is called "curry".

 Parser<D> d = Mapper.curry(D.class).sequence(a, b, c);
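The curry() method can also take arguments to be curried. What does that mean? For example,

suppose some of the arguments to class D's constructor are already known before parsing. In that case you pass them in up front, like this: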

 Parser<D> d = Mapper.curry(D.class, true).sequence(a, b, c);
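
One way to picture what a call like curry(D.class, true) could be doing is constructor lookup by reflection. The following is a minimal plain-Java sketch of that idea, not jparsec's actual implementation. CurrySketch, its curry method, and the Point class (standing in for D) are all made up for this illustration, and the sketch simply assumes that the curried arguments come first in the constructor, followed by the values that would later come from the parsers.

```java
import java.lang.reflect.Constructor;
import java.util.function.Function;

/** Plain-Java sketch of "currying" a constructor by reflection; not jparsec code. */
final class CurrySketch {

    /** Hypothetical target class standing in for D. */
    public static final class Point {
        final boolean visible;
        final int x;
        final int y;

        public Point(boolean visible, int x, int y) {
            this.visible = visible;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Returns a function that calls the only public constructor of type.
     * Assumption of this sketch: the curried values fill the leading
     * parameters and the values supplied later fill the remaining ones.
     */
    static <T> Function<Object[], T> curry(Class<T> type, Object... curried) {
        Constructor<?>[] constructors = type.getConstructors();
        if (constructors.length != 1) {
            throw new IllegalArgumentException("expected exactly one public constructor");
        }
        Constructor<?> constructor = constructors[0];
        return parsed -> {
            // Concatenate the known-up-front arguments and the later ones.
            Object[] args = new Object[curried.length + parsed.length];
            System.arraycopy(curried, 0, args, 0, curried.length);
            System.arraycopy(parsed, 0, args, curried.length, parsed.length);
            try {
                return type.cast(constructor.newInstance(args));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        };
    }
}
```

With this sketch, CurrySketch.curry(CurrySketch.Point.class, true).apply(new Object[] {3, 4}) builds a Point whose visible flag was fixed before any "parsing" happened, while x and y are supplied afterwards.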

**Currying: an operator example [#z5b6af9d]
A real example is parsing the Java ternary "?:" operator. Let's first assume that the conditional expression is modeled as:

 public class ConditionalExpression implements Expression {
  // ...
  public ConditionalExpression(Expression cond, Expression consequence, Expression alternative) {
    // ...
  }
 }

After careful observation of the precedence and associativity, the "? consequenceExpression :" part is indeed a right-associative binary operator. Any expression can be the consequenceExpression, but the "?:" binds more tightly to the alternativeExpression than to the condExpression.

In order to declare the "?:" operator as a binary right-associative operator, we'll need to create a parser that parses the consequence expression between a "?" and a ":". This parser should return a Map2 (in the code below, a Binary<Expression>) that transforms the left operand (condition) and the right operand (alternative) into the conditional expression. Like this:

 static Parser<Binary<Expression>> conditionalOperator(Parser<Expression> consequence) {
  return Parsers.between(terminals.token("?"), consequence, terminals.token(":")).map(new Map<Expression, Binary<Expression>>() {
    public Binary<Expression> map(final Expression consequenceExpr) {
      return new Binary<Expression>() {
        public Expression map(Expression condExpr, Expression alternativeExpr) {
          return new ConditionalExpression(condExpr, consequenceExpr, alternativeExpr);
        }
      };
    }
  });
 }

I'll pause for 5 minutes for you to read through the above code snippet and understand the wits buried in the anonymous class nested in the outer anonymous class, and of course, to understand this sentence.

Okay, time's up.

The returned Parser<Binary<Expression>> can then be used in an OperatorTable, as:

 Parser.Reference<Expression> ref = Parser.newReference();
 Parser<Expression> expression = new OperatorTable<Expression>()
  .prefix(...)
  .postfix(...)
  .infixr(conditionalOperator(ref.lazy()), 50)
  ....;
 ref.set(expression);
And if you now see what I'm really up to, and are unsurprisingly not impressed by the extra noise in the anonymous classes, here's how we can do it differently with Mapper:

 static Parser<Binary<Expression>> conditionalOperator(Parser<Expression> consequence) {
  return  Mapper.<Expression>curry(ConditionalExpression.class).infix(consequence.between(terminals.token("?"), terminals.token(":")));
 }
This code does exactly the same thing as our super duper anonymous classes above.

And using the _ method to explicitly ignore the return values of the "?" and ":" operators, we can make it look even more intuitive:

 import static org.codehaus.jparsec.misc.Mapper._;
 
 static Parser<Binary<Expression>> conditionalOperator(Parser<Expression> consequence) {
  return Mapper.<Expression>curry(ConditionalExpression.class).infix(_(terminals.token("?")), consequence, _(terminals.token(":")));
 }
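
The idea that "? consequence :" acts as a right-associative binary operator between the condition and the alternative can be checked without jparsec. Here is a minimal plain-Java sketch under simplifying assumptions: expressions are plain integers, a non-zero value counts as true, and the names TernarySketch, conditional and rightAssociative are made up for this illustration.

```java
import java.util.function.BinaryOperator;

/** Plain-Java sketch of "?:" as a right-associative binary operator; not jparsec code. */
final class TernarySketch {

    /**
     * Models the "? consequence :" part as a binary operator whose left operand
     * is the condition and whose right operand is the alternative.
     * A non-zero condition counts as true in this sketch.
     */
    static BinaryOperator<Integer> conditional(int consequence) {
        return (cond, alternative) -> cond != 0 ? consequence : alternative;
    }

    /** Evaluates a ? b : c ? d : e by grouping to the right: a ? b : (c ? d : e). */
    static int rightAssociative(int a, int b, int c, int d, int e) {
        return conditional(b).apply(a, conditional(d).apply(c, e));
    }
}
```

Grouping to the right gives the same answer as Java's own a != 0 ? b : c != 0 ? d : e, which is why the operator is registered with infixr rather than infixl in the operator table.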
**The End [#q129725e]

For something as cunning as Mapper, I hope you aren't surprised by its extra requirement of cglib to stay sane performance-wise.

The above is my own rough translation of the article.



*About the SQL parsing sample [#of950202]
When you download jparsec and extract it, you will find the following folder:
[jparsec-2.0_src]-[examples]-[src]-[org]-[codehaus]-[jparsec]-[examples]-[sql]
This is where the SQL sample lives.

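**Steps to import into Eclipse [#o49a3671]
First, extract the archive you downloaded from the jparsec site.

If you do not have the JUnit jar at hand, download that too.

The downloaded JUnit jar has a version in its name, something like junit-4.18.jar,

so rename it to junit.jar.

Put the JUnit jar into jparsec's lib folder.

Now let's prepare the Eclipse side.

Create a new Java project in Eclipse.

From the File menu choose Import, select the folder you extracted earlier, and import it with the project's src directory as the destination.

There are four src folders: main, main tests, examples, and example tests.

Right after the import,

the jars are not yet registered on the project's build path, so you get compile errors.

To fix that, register every jar in jparsec's lib folder in the build path settings.

Most of the compile errors disappear.

However, one error remains, in the AllTests class.

The reason is that the author did not want to upload one of the libraries,
as stated on line 80 of build.xml.

It reads like this: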

 AllTests uses jtc, which is an extra dependency that I don't want to upload.

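My guess is that the author thought it could not be uploaded for copyright reasons because the code was borrowed from the Android sources.

That aside,

build.xml contains a setting that excludes only this class from compilation.

In short, this class is not needed.

Since I did not want to fiddle with build.xml, I rewrote the class as follows.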


 package org.codehaus.jparsec;
 
 //import org.openqa.jtc.junit.TestSuiteBuilder;
 
 import junit.framework.TestSuite;
 
 /**
 *
 * @author benyu
 */
 public class AllTests extends TestSuite {
  public static TestSuite suite() {
    //return TestSuiteBuilder.suite(AllTests.class);
    return null;
  }
 }




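**How to compile [#v14b1801]
A build.xml is included. The source for the part called tool does not seem to be published, so that may be an ant task meant for the original developers. Everything else compiled.

**How to use the SQL parser in the examples [#ded7b6e6]
The test cases for the examples show how to use it.

***Where it is written [#fae6fde4]
***Package name: [#n1f6c4af]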
package org.codehaus.jparsec.examples.sql.parser;

***Class name: [#i643a3ac]
RelationParserTest

***Method name: [#s819af7a]
  public void testSelect() {

***Excerpt: [#p5e84e0f]
The SQL query
 select distinct 1, 2 as id from t1, t2
is checked by the following test code, which verifies that it is parsed into the class structure shown here.
    Parser<Relation> parser = RelationParser.select(NUMBER, NUMBER, TABLE);
    assertParser(parser, "select distinct 1, 2 as id from t1, t2",
        new Select(true, 
            Arrays.asList(new Projection(number(1), null), new Projection(number(2), "id")),
            Arrays.asList(table("t1"), table("t2")),
            null, null, null));

*Reading material on Parsec [#g2d334d4]
**Parsec, a fast combinator parser [#rf72fbda]

Set the character encoding to EUC, or the text will be garbled.

http://www.lab2.kuis.kyoto-u.ac.jp/~hanatani/tmp/Parsec.html