yacc - Input buffer overflow in spite of reading character by character
In order to overcome the issue of input buffer overflow in lex, I wrote code to read the incoming stream character by character whenever I expect to see a long string. However, I still get the error "input buffer overflow, can't enlarge buffer because scanner uses REJECT".
Code snippet:
<state>{identifier} {
    string str = yytext;
    if(str == "expectedstr")
        handlelongstr(str);
    copystring(yylval.str, str);
    return identifier;
}

void handlelongstr(string &str)
{
    str.clear();
    char ch;
    while((ch = yyinput()) != '\n')
        str.push_back(ch);
    unput(ch);
}
yyinput uses up space in the buffer, although it doesn't let you recover the data it read from yytext. The only reason for this behaviour that I've ever come up with is that it allows you to unput() as many characters as you input() without destroying yytext, which is useful if you're using input() as a way of peeking at the next input.
For whatever reason, it means that you cannot use yyinput to avoid buffer reallocation. So you need the next best thing: handle long tokens in smaller pieces. For example, something like this:
%%
    /* Variable local to the call to yylex */
    std::string longtoken;

<state>{identifier} {
        /* I'd prefer to use a regex pattern if possible here */
        if (is_long_prefix(yytext)) {
            longtoken.clear();
            BEGIN(state_long_identifier);
        } else {
            yylval.str = strdup(yytext);
            return identifier;
        }
        // ...
    }

<state_long_identifier>{
    /* Here we handle subtokens of up to 100 characters. The number is
     * arbitrary, but the nature of flex is that the resulting DFA will
     * have one state per repetition, and large repetitions create a
     * lot of states. */
    .{1,100}  { longtoken.append(yytext, yyleng); }
    \n        { yylval.str = strdup(longtoken.c_str());
                BEGIN(state);
                return identifier; }
    <<EOF>>   { error("unterminated long identifier"); }
}