c++ - Specify a charset without interpreting ranges
I'm quite puzzled about parsing strings when I have to define a minus in a rule and it should be just a minus character, not a range of characters between two endpoints.
For example, when you write a rule to percent-encode a string of characters, you would normally write
*(bk::char_("a-zA-Z0-9-_.~") | '%' << bk::right_align(2, 0)[bk::upper[bk::hex]]);
This is meant to be "letters, capital letters, digits, minus sign, underscore, dot and tilde", but the third minus sign creates a range between 9 and underscore (or something like that), so you have to put the minus at the end: bk::char_("a-zA-Z0-9_.~-").
That solves the current problem, but what should one do when the input is dynamic, such as user input, and a minus sign just means the minus character?
How do I prevent Spirit from assigning a special meaning to any of the possible characters?
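To make the range problem concrete, here is a small model of how a range-style set definition can be read. It is only a sketch: expand_definition is a hypothetical helper written for illustration, it uses nothing from Spirit, and Spirit's actual handling may differ in details. It assumes that a minus between two characters forms a range and that a trailing minus is taken literally.

```cpp
#include <set>
#include <string>

// Hypothetical helper, not part of Spirit: a simplified model of reading a
// set definition where "x-y" is a range and a trailing '-' is a literal.
std::set<char> expand_definition(const std::string& def)
{
    std::set<char> out;
    const char* p = def.c_str();   // null-terminated, so the terminator may be read
    char ch = *p++;
    while (ch != '\0') {
        char next = *p++;
        if (next == '-') {
            next = *p++;
            if (next == '\0') {    // the minus was the last character: literal
                out.insert(ch);
                out.insert('-');
                break;
            }
            // the minus sits between two characters: insert the whole range
            for (int c = static_cast<unsigned char>(ch);
                 c <= static_cast<unsigned char>(next); ++c)
                out.insert(static_cast<char>(c));
        } else {
            out.insert(ch);
        }
        ch = next;                 // the range end also starts the next step
    }
    return out;
}
```

Under this model, "0-9-_" yields every character from '9' to '_' (including ':' and the capital letters) but not the minus itself, whereas "0-9_-" yields the digits, the underscore and the minus.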
EDIT001: I'll resort to a more concrete example, in light of @sehe's answer:
void spirit_direct(std::vector<std::string>& result, const std::string& input, char const* delimiter)
{
    result.clear();
    using namespace bsq;
    if(!parse(input.begin(), input.end(), raw[*(char_ - char_(delimiter))] % char_(delimiter), result))
        result.push_back(input);
}
In case I want to ensure that a minus is treated as a minus and not as a range, one would alter the code as follows (according to @sehe's proposal below):
void spirit_direct(std::vector<std::string>& result, const std::string& input, char const* delimiter)
{
    result.clear();
    bsq::symbols<char, bsq::unused_type> sym_;
    std::string separators = delimiter;
    for(auto ch : separators)
    {
        sym_.add(std::string(1, ch));
    }
    using namespace bsq;
    if(!parse(input.begin(), input.end(), raw[*(char_ - sym_)] % sym_, result))
        result.push_back(input);
}
This looks quite elegant. In the case of a static constant rule, I guess I can escape characters with '\', and square brackets are apparently among the "special" characters that need to be escaped. Why? What is the meaning of []? Are there additional characters to escape?
Simple.
You devise and specify which patterns the user can supply and what they mean.
Next, you write the code that transforms them into a character set (e.g. expand ranges, if they are supported in the user input, and sort '-' to be the first character in the set definition).
Alternatively, do not use a character set at all:
- Why not use
char_ [ _pass = my_match_predicate(_1) ]
- Why not make an alternation of literal characters?
lit('a') | 'b' | '-' | '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
- Why not use
qi::symbols<char, char>
(or
qi::symbols<char, qi::unused_type> sym_;
raw [ sym_ ]
or similar)
Update: the
qi::symbols<>
approach is surprisingly fast: Live On Coliru. In a recent optimization job it had disappointed me: see the answer (under "spirit (trie)") to "binary string hex c++".
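The transformation step suggested at the start of this answer could be sketched as follows. This is an illustration only: make_literal_set_definition is a hypothetical helper, not part of Spirit. It assumes that every user-supplied character is meant literally (no user-level ranges), and that '-' is the only character with a special meaning in a set definition and loses it when it comes first.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper, not part of Spirit: build a set definition in which
// every character of `chars` is meant literally.
std::string make_literal_set_definition(std::string chars)
{
    // Sort and drop duplicates so that at most one '-' remains.
    std::sort(chars.begin(), chars.end());
    chars.erase(std::unique(chars.begin(), chars.end()), chars.end());

    // Move the single '-' (if any) to the front, where it cannot sit
    // between two other characters and be read as a range.
    auto minus = std::find(chars.begin(), chars.end(), '-');
    if (minus != chars.end())
        std::rotate(chars.begin(), minus, minus + 1);
    return chars;
}
```

The returned string would then be handed to char_ in place of the raw user input, for example the delimiter argument in the question's spirit_direct.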
In general, I don't know what you're trying to achieve, but Spirit is not well-suited to generating rules on the fly. See any of the existing boost-spirit answers on this site.
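When the set is only known at run time, the predicate route from the list above avoids set-definition syntax altogether. Below is a minimal sketch of what such a predicate might look like. The answer only names my_match_predicate, so the type literal_char_set and its body are assumptions made for illustration, and wiring it into a semantic action is not shown.

```cpp
#include <bitset>
#include <string>

// Hypothetical predicate: every character of the input string is a literal
// member of the set, so '-' never denotes a range.
struct literal_char_set
{
    std::bitset<256> members;

    explicit literal_char_set(const std::string& chars)
    {
        for (unsigned char c : chars)
            members.set(c);
    }

    // True if c was one of the characters passed to the constructor.
    bool operator()(char c) const
    {
        return members.test(static_cast<unsigned char>(c));
    }
};
```

With literal_char_set s("a-z"), the set contains exactly 'a', '-' and 'z'; a character such as 'b' is not a member.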