Bright cheery words of encouragement: if you put off this assignment until shortly before it is due, you will probably fail to complete it.
Use bison to develop a parser (syntax checker), syntax tree, and pretty printer for the JSON language. Develop your work for this assignment as a Bison grammar file (you write json.y), plus a tweaked version of your previous flex assignment. You may add .c or .h files; if you do, you must add correct dependency rules to your makefile.
The file json.h that defines integer codes for different categories of words/tokens must be modified to take those codes from bison, which generates them with the -d option as a .tab.h file.
In order to print a useful syntax error message, your .y file's helper functions section should define a function yyerror(char *s) per YACC and Bison standards.
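A minimal sketch of such a function follows. It assumes the lexer keeps a current line number in yylineno (for example via flex's %option yylineno) and that yyfilename is set as in main.c. The helper format_syntax_error and the message layout are illustrative choices, not requirements. The three globals are defined here only so the sketch compiles by itself; in the real project they come from flex and main.c.

```c
#include <stdio.h>

/* Stand-ins so this sketch compiles by itself. In the real project these
 * are defined by flex (yytext, yylineno) and by main.c (yyfilename). */
char *yyfilename = "example.json";
char *yytext = "";
int yylineno = 1;

/* Hypothetical helper: build "file:line: message near token 'tok'". */
int format_syntax_error(char *buf, size_t size, const char *file,
                        int line, const char *msg, const char *tok)
{
    return snprintf(buf, size, "%s:%d: %s near token '%s'",
                    file, line, msg, tok);
}

/* Called by the bison-generated parser when it detects a syntax error.
 * At that point yytext holds the most recently scanned token. */
void yyerror(char *s)
{
    char buf[512];
    format_syntax_error(buf, sizeof buf, yyfilename, yylineno, s, yytext);
    fprintf(stderr, "%s\n", buf);
}
```

Keeping the formatting in a separate helper makes the message easy to check without running the parser.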
makefile:

    jsonpp: main.o lex.yy.o json.tab.o
    	cc -o jsonpp main.o lex.yy.o json.tab.o -ll

    main.o: main.c json.tab.h json.h
    	cc -c -g main.c

    lex.yy.o: lex.yy.c
    	cc -c -g lex.yy.c

    lex.yy.c: json.l json.h json.tab.h
    	flex json.l

    json.tab.o: json.tab.c
    	cc -c -g json.tab.c

    json.tab.c: json.y
    	bison json.y

    json.h: json.tab.h
    	touch json.h

    json.tab.h: json.y
    	bison -d json.y
main.c:

    #include "json.tab.h"
    #include <stdio.h>
    #include <stdlib.h>

    extern FILE *yyin;
    extern char *yytext;
    extern int yyparse(void);  /* declared here in case json.tab.h does not */
    char *yyfilename;

    int main(int argc, char *argv[])
    {
       int i;
       if (argc < 2) { printf("usage: jsonpp file.json\n"); exit(-1); }
       yyin = fopen(argv[1], "r");
       if (yyin == NULL) { printf("can't open/read '%s'\n", argv[1]); exit(-1); }
       yyfilename = argv[1];
       if ((i = yyparse()) != 0) {
          printf("parse failed\n");
          }
       else printf("no errors\n");
       return 0;
    }
json.h:

    #include "json.tab.h"
For a syntactically valid input file, the program prints:

    no errors
Note that the grammar given here imposes some restrictions that would require changes to the examples in Homework #1. Since the title is required, the Homework #1 examples would produce errors, as described below.
    struct treenode {
       int label;                 /* terminal symbol, or production rule # */
       int nkids;                 /* 0 for tokens (tree leaves) */
       struct treenode *kids[5];  /* sized for muth.y */
       struct token lexinfo;
    };

For every grammar rule, you are writing something like:
    mynonterminal : MYTERM1 mynont2 MYTERM2 {
          $$ = calloc(1, sizeof(struct treenode));
          $$->label = PRODRULE;
          $$->nkids = 3;
          $$->kids[0] = $1;
          $$->kids[1] = $2;
          $$->kids[2] = $3;
          }
       ;

For the leaves of your tree, you either have to modify HW#1 to create leaves in yylex(), or have to write your JSON grammar to encapsulate and create leaves every time a terminal symbol is recognized. The "modify HW#1" option looks like:
    myregex { yylval.tree = calloc(1, sizeof(struct treenode));
              yylval.tree->label = MYTERMINALSYMBOL;
              yylval.tree->nkids = 0;
              yylval.tree->lexinfo.lexeme = strdup(yytext);
              ... /* insert code to preserve line and column #, filename, etc. */
              return MYTERMINALSYMBOL;
            }

The "encapsulate terminal symbols in the grammar" option looks like the following additional rules, for each terminal symbol, in your grammar:
    myterminalsymbol : MYTERMINALSYMBOL {
          $$ = calloc(1, sizeof(struct treenode));
          $$->label = MYTERMINALSYMBOL;
          $$->nkids = 0;
          $$->lexinfo.lexeme = strdup(yytext);
          ... /* insert code to preserve line and column #, filename, etc. */
          }
       ;

With either option, you then have to plug these leaves into the larger internal nodes as children, when larger-scale non-terminal rules occur during the parse.
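Because every action repeats the same allocation steps, it may help to factor them into two helpers. The following is only a sketch: the names make_leaf and make_node, the copy_text helper (a stand-in for strdup), and the fields of struct token are all assumptions made for illustration, since the assignment only requires that the token record the lexeme and position information.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Assumed token layout; adapt to whatever your HW#1 struct token holds. */
struct token {
    int category;
    char *lexeme;
    int lineno;
};

struct treenode {
    int label;                 /* terminal symbol, or production rule # */
    int nkids;                 /* 0 for tokens (tree leaves) */
    struct treenode *kids[5];
    struct token lexinfo;
};

/* Hypothetical helper: heap copy of a string (same job as strdup). */
static char *copy_text(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = malloc(len);
    if (p != NULL) memcpy(p, s, len);
    return p;
}

/* Hypothetical helper: allocate a leaf for one recognized token. */
struct treenode *make_leaf(int category, const char *text, int lineno)
{
    struct treenode *n = calloc(1, sizeof(struct treenode));
    if (n == NULL) return NULL;
    n->label = category;
    n->nkids = 0;
    n->lexinfo.category = category;
    n->lexinfo.lexeme = copy_text(text);
    n->lexinfo.lineno = lineno;
    return n;
}

/* Hypothetical helper: allocate an internal node with up to 5 children,
 * passed as struct treenode * arguments after nkids. */
struct treenode *make_node(int label, int nkids, ...)
{
    va_list ap;
    int i;
    struct treenode *n;
    if (nkids < 0 || nkids > 5) return NULL;
    n = calloc(1, sizeof(struct treenode));
    if (n == NULL) return NULL;
    n->label = label;
    n->nkids = nkids;
    va_start(ap, nkids);
    for (i = 0; i < nkids; i++)
        n->kids[i] = va_arg(ap, struct treenode *);
    va_end(ap);
    return n;
}
```

With helpers like these, the action for a three-symbol rule could shrink to a single call such as make_node(PRODRULE, 3, $1, $2, $3), and a flex action could set yylval.tree from make_leaf.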
    void print_tree(struct treenode *n)
    {
       int i;
       if (n == NULL) return; /* don't segfault on NULLs */
       printf("node %d\n", n->label);
       if (n->nkids == 0) {
          /* print stuff about leaf */
          }
       else {
          for (i = 0; i < n->nkids; i++)
             print_tree(n->kids[i]);
          }
    }
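When debugging, a variant that indents each node by its depth makes the shape of the tree easier to see. The sketch below is one possible way to do that. The name print_tree_indented, the returned node count, and the fields of struct token are assumptions made for illustration.

```c
#include <stdio.h>

/* Assumed token layout; adapt to your own struct token. */
struct token {
    char *lexeme;
    int lineno;
};

struct treenode {
    int label;                 /* terminal symbol, or production rule # */
    int nkids;                 /* 0 for tokens (tree leaves) */
    struct treenode *kids[5];
    struct token lexinfo;
};

/* Print the subtree rooted at n, two spaces of indentation per level.
 * Returns the number of nodes printed (0 for a NULL subtree). */
int print_tree_indented(FILE *out, struct treenode *n, int depth)
{
    int i, count = 1;
    if (n == NULL) return 0;   /* don't segfault on NULLs */
    fprintf(out, "%*snode %d", depth * 2, "", n->label);
    if (n->nkids == 0 && n->lexinfo.lexeme != NULL)
        fprintf(out, " \"%s\" (line %d)", n->lexinfo.lexeme, n->lexinfo.lineno);
    fprintf(out, "\n");
    for (i = 0; i < n->nkids; i++)
        count += print_tree_indented(out, n->kids[i], depth + 1);
    return count;
}
```

A typical call after a successful parse would pass stdout, the root of the tree, and a starting depth of 0.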