bblearn.uidaho.edu to submit a .zip archive file containing a Flex .l source
code file, with your name in a (C-style) comment at the top.
Use the declarative lexical analysis language Flex to develop a scanner for
JSON files. JSON is described at json.org.
Your solution to this homework will be hooked together with a parser in a
subsequent homework assignment. Note that you are to develop your work for
this assignment in a single file (json.l); it will be built with the
following makefile and linked with main.c. The main.c will include a file json.h to
define integer codes for different categories of words/tokens. You may not
substitute your own main() procedure. Your yylex() must return an integer
code each time it recognizes a token; it may not just print the output itself.
makefile:

    #
    # flex makefile.
    #
    json: main.o lex.yy.o
    	cc -o json main.o lex.yy.o

    main.o: main.c json.h
    	cc -c -g -DLEX main.c

    lex.yy.o: lex.yy.c json.h
    	cc -c -g lex.yy.c

    lex.yy.c: json.l
    	flex json.l

main.c:

    #include "json.h"
    #include <stdio.h>
    #include <stdlib.h>

    extern FILE *yyin;
    extern char *yytext;
    char *yyfilename;

    int main(int argc, char *argv[])
    {
       int t;
       if (argc < 2) { printf("usage: json file.json\n"); exit(-1); }
       yyin = fopen(argv[1],"r");
       if (yyin == NULL) { printf("can't open/read '%s'\n", argv[1]); exit(-1); }
       yyfilename = argv[1];
       while ((t=yylex()) > 0) {
          if (t <= 32) {
             printf("token %d text %s\n", t, yytext);
          }
          else {
             printf("token %c\n", t);
          }
       }
       return 0;
    }

json.h:

    /* header file for JSON tokens, codes from json.org */
    #define TRUE 1
    #define FALSE 2
    #define NULL 3
    #define LCURLY '{'
    #define RCURLY '}'
    #define COMMA ','
    #define COLON ':'
    #define LBRACKET '['
    #define RBRACKET ']'
    #define STRINGLIT 4
    /* #define CHARLIT 5 json does not have character literals! */
    #define NUMBER 6

In this homework, you write out the category found for each "word" of a JSON file, one per line. If the word is from one of the lower integer categories (values 1-6), you also write out the letters that were matched for that word.
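
The printing rule can be read off the loop in main.c: a code of 32 or less is printed as a number followed by the matched text, and any larger code is printed as the character it encodes. The following is a minimal C sketch of that rule only. The helper name format_token_line is hypothetical and is not part of the provided files.

```c
#include <stdio.h>

/* Hypothetical helper (not in main.c): builds the same line that the
   main.c loop prints for one token, minus the trailing newline.
   code is the value returned by yylex(); text is the matched lexeme. */
int format_token_line(char *buf, size_t size, int code, const char *text)
{
    if (code <= 32) {
        /* Low integer categories: show the code and the matched text. */
        return snprintf(buf, size, "token %d text %s", code, text);
    }
    /* Punctuation categories are their own character codes. */
    return snprintf(buf, size, "token %c", code);
}
```

Under these assumptions, a call with code 4 and text "name" (including its quotes) fills the buffer with the same line that main.c would print for a string literal, and a call with code '{' yields just the punctuation character after the word token.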
If the input file contains things that are not legal in JSON, you should
write a line containing "lexical error on line n", where n
is the line number. The easiest way to do this, as seen in class, is to use
%option yylineno
in your .l file, so you are required to do that.
Lexical errors do not return integer categories from yylex(); they are treated
as whitespace.
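
The report-and-skip behavior can be illustrated with a small hand-written C analogue. This is not Flex and not a solution to the assignment: the function mini_lex and the counter line_no are hypothetical stand-ins for yylex() and yylineno, and the sketch only knows the six punctuation tokens. It is meant to show that an illegal character produces a message but no return value, so scanning continues to the next legal token.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for yylineno; in a Flex scanner, %option yylineno
   maintains the real counter automatically. */
static int line_no = 1;

/* Hypothetical miniature scanner over a NUL-terminated string. It only
   recognizes the six punctuation tokens and returns 0 at end of input. */
int mini_lex(const char **cursor, FILE *out)
{
    for (;;) {
        char c = **cursor;
        if (c == '\0')
            return 0;                 /* end of input */
        (*cursor)++;
        if (c == '\n') {              /* count lines as they go by */
            line_no++;
            continue;
        }
        if (c == ' ' || c == '\t' || c == '\r')
            continue;                 /* ordinary whitespace: skip */
        if (strchr("{}[],:", c) != NULL)
            return c;                 /* legal token: return its code */
        /* Illegal character: report it, then keep scanning (no return). */
        fprintf(out, "lexical error on line %d\n", line_no);
    }
}
```

Called repeatedly with stdout on an input consisting of `{`, a newline, and then `@ }`, this sketch returns '{', writes one error line for line 2, and then returns '}'. In json.l, a comparable effect would presumably come from a catch-all rule whose action prints the message and does not return.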
For example, given the input file

    { "name":"Clint", "age":54, "cars": { "car1":"Ford", "car2":"Pontiac", "car3":"Pontiac" } }

You don't have to worry yet about the syntax or semantics of this file; your job is to break it into lexical tokens: adjacent sequences of characters that form indivisible values or entities. Your program (linked with the main.c above) would print out

    token {
    token 4 text "name"
    token :
    token 4 text "Clint"
    token ,
    token 4 text "age"
    token :
    token 6 text 54
    token ,
    token 4 text "cars"
    token :
    token {
    token 4 text "car1"
    token :
    token 4 text "Ford"
    token ,
    token 4 text "car2"
    token :
    token 4 text "Pontiac"
    token ,
    token 4 text "car3"
    token :
    token 4 text "Pontiac"
    token }
    token }