Monday, 8 September 2014

Parse MIB file

This is the initial Xtext grammar that I wrote to load RFC1213-MIB. To get a copy of RFC1213-MIB, take rfc1213.txt, extract the content of Section 6, "Definitions", and save it as RFC1213-MIB.mib.
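The extraction step can also be scripted. Below is a minimal Java sketch of it, not the way the file was actually produced for this post. It assumes the module starts at the line ending in "DEFINITIONS ::= BEGIN" and ends at a line containing only "END", and that the RFC text carries form feeds plus page header/footer lines ("RFC 1213   MIB-II ..." and "... [Page N]") that should be dropped. The class name MibExtractSketch and both patterns are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class MibExtractSketch {

    // Assumed page furniture of the RFC text layout (not taken from the post).
    private static final Pattern PAGE_FOOTER = Pattern.compile(".*\\[Page \\d+\\]");
    private static final Pattern PAGE_HEADER = Pattern.compile("RFC \\d+\\s{2,}.*");

    /** Returns the lines from "... DEFINITIONS ::= BEGIN" through "END", without page headers/footers. */
    static List<String> extractModule(List<String> rfcLines) {
        List<String> out = new ArrayList<>();
        boolean inside = false;
        for (String raw : rfcLines) {
            String line = raw.replace("\f", "");   // drop form-feed page breaks
            String trimmed = line.trim();
            if (!inside) {
                if (!trimmed.endsWith("DEFINITIONS ::= BEGIN")) {
                    continue;                      // still before the module
                }
                inside = true;
            }
            if (PAGE_FOOTER.matcher(trimmed).matches() || PAGE_HEADER.matcher(trimmed).matches()) {
                continue;                          // skip page header/footer lines
            }
            out.add(line);
            if (trimmed.equals("END")) {
                break;                             // end of the module
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // File names as used in the post; both are expected in the working directory.
        List<String> rfc = Files.readAllLines(Paths.get("rfc1213.txt"), StandardCharsets.ISO_8859_1);
        Files.write(Paths.get("RFC1213-MIB.mib"), extractModule(rfc), StandardCharsets.ISO_8859_1);
    }
}
```

Blank lines left behind by removed page breaks are harmless here, since the grammar treats whitespace as hidden.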

The grammar is the product of a few iterations of improvement, so some rules may not be straightforward. I will try to explain the reasoning as best I can.

Project Info

An Xtext project was created with the following attributes:
  • Project Name: com.ravi.mib.xtext
  • Language Name: com.ravi.mib.xtext.Mib
  • Language extensions: mib
Eclipse Modeling Tools Version: Luna Release (4.4.0)
Xtext SDK Version: 2.6.0

Overwrite the generated Mib.xtext with the following content:

Mib.xtext
grammar com.ravi.mib.xtext.Mib hidden(WS, ML_COMMENT, SL_COMMENT)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate mib "http://www.ravi.com/mib/xtext/Mib"

Definition:
 name=ID 'DEFINITIONS' '::=' 'BEGIN'
 Export?
 imports=Import?
 (identifiers+=ObjectType | identifiers+=Identifier | DataType)+
 'END';

Import:
 'IMPORTS' defs+=ImpDef+ ';';

ImpDef:
 (objects+=Object ','?)+ 'FROM' def=ID;

Object:
 name=ID;

Export:
 'EXPORTS' (ID ','?)+ ';';

ObjectType returns Identifier:
 name=Object imp=[Object] 'SYNTAX' mibType 'ACCESS' ID 'STATUS' ID
 ('DESCRIPTION' STRING)?
 ('REFERENCE' STRING)?
 ('INDEX' '{' (mibType ','?)+ '}')?
 ('DEFVAL' '{' (INT | STRING) '}')?
 value=OidValue;

Identifier returns Identifier:
 name=Object mibType value=OidValue;

OidValue:
 '::=' '{' parent=[Object] oidnum=INT '}';

DataType:
 ID '::=' ('[' 'APPLICATION' INT ']')? 'IMPLICIT'? (Choice | Sequence | mibType);

Choice:
 'CHOICE' '{' (ID mibType ','?)+ '}';

Sequence:
 'SEQUENCE' '{' (ID mibType ','?)+ '}';

mibType:
 ('SEQUENCE' 'OF')?
 ('OCTET' 'STRING' | ID | 'INTEGER')
 ('(' 'SIZE'? '('? (INT '..')? INT ')'? ')')?
 ('{' (ID '(' INT ')' ','?)+ '}')? |
 'OBJECT' 'IDENTIFIER';

/*
 * -------------------------------------------------------------
 * Unfortunately, the following terminal rules need to be overridden
 * -------------------------------------------------------------
 */
terminal ID:
 '^'? ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '-' | '0'..'9')*;

terminal INT returns ecore::EInt:
 ('0'..'9')+;

terminal STRING:
 '"' ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | '"'))* '"' |
 "'" ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | "'"))* "'";

terminal ML_COMMENT:
 '/*'->'*/';

terminal SL_COMMENT:
 '--' !('\n' | '\r')* ('\r'? '\n')?;

terminal WS:
 (' ' | '\t' | '\r' | '\n')+;

terminal ANY_OTHER:
 .;

Snapshot of the editor

The above grammar produces a pretty neat editor, with syntax highlighting, an outline and somewhat limited auto-completion.



Some explanations on the grammar:

Ecore

grammar com.ravi.mib.xtext.Mib hidden(WS, ML_COMMENT, SL_COMMENT)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate mib "http://www.ravi.com/mib/xtext/Mib"

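The generate line tells Xtext to derive the Ecore model (package mib, namespace URI http://www.ravi.com/mib/xtext/Mib) from the grammar on its own. The import line makes the Ecore data types, such as ecore::EInt used by the INT terminal, available to the grammar.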


Terminal Rules

grammar com.ravi.mib.xtext.Mib hidden(WS, ML_COMMENT, SL_COMMENT)

/*
 * -------------------------------------------------------------
 * Unfortunately, the following terminal rules need to be overridden
 * -------------------------------------------------------------
 */
terminal ID:
 '^'? ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '-' | '0'..'9')*;

terminal INT returns ecore::EInt:
 ('0'..'9')+;

terminal STRING:
 '"' ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | '"'))* '"' |
 "'" ('\\' ('b' | 't' | 'n' | 'f' | 'r' | 'u' | '"' | "'" | '\\') | !('\\' | "'"))* "'";

terminal ML_COMMENT:
 '/*'->'*/';

terminal SL_COMMENT:
 '--' !('\n' | '\r')* ('\r'? '\n')?;

terminal WS:
 (' ' | '\t' | '\r' | '\n')+;

terminal ANY_OTHER:
 .;


By default, Xtext generates a grammar that extends "org.eclipse.xtext.common.Terminals", which defines the most common terminal (lexer) rules. In this case, however, I had to drop that and define the terminals myself in order to override:

  • ID - In MIBs, it is common for identifiers to contain the hyphen character.
  • SL_COMMENT - MIB uses '--' to start a comment.
Note that we need to specify hidden() as well. It tells Xtext to skip those terminal rules during parsing, which keeps the parsing stage free of this noise, i.e. the parser rules can be written as if the hidden terminals never existed.
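To make the two overridden terminals easier to try out, here is a rough Java sketch that mirrors ID and SL_COMMENT as regular expressions. It is only an approximation for experimenting outside Eclipse, not how Xtext or its generated lexer works internally. The class name MibTerminalsSketch and its helper methods are invented for this illustration.

```java
import java.util.regex.Pattern;

final class MibTerminalsSketch {

    // Mirrors: '^'? ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '-' | '0'..'9')*
    static final Pattern ID = Pattern.compile("\\^?[A-Za-z_][A-Za-z_0-9-]*");

    // Mirrors: '--' !('\n' | '\r')* ('\r'? '\n')?
    static final Pattern SL_COMMENT = Pattern.compile("--[^\\n\\r]*(\\r?\\n)?");

    /** True if the whole string is a single ID token under the overridden rule. */
    static boolean isId(String s) {
        return ID.matcher(s).matches();
    }

    /** True if the whole string is a single '--' line comment. */
    static boolean isLineComment(String s) {
        return SL_COMMENT.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Hyphenated names from RFC1213-MIB are accepted by the overridden ID rule.
        System.out.println("mib-2       -> " + isId("mib-2"));
        System.out.println("RFC1213-MIB -> " + isId("RFC1213-MIB"));
        // An identifier may not start with a digit.
        System.out.println("2mib        -> " + isId("2mib"));
        // A '--' comment runs to the end of the line.
        System.out.println("comment     -> " + isLineComment("-- the System group\n"));
    }
}
```

One thing the sketch makes visible: because '-' is allowed inside ID, a string such as "foo--" matches the ID pattern as a whole, so a comment written directly after an identifier without a space may end up being read as part of the identifier. I have not checked how the generated lexer behaves in that case.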
