NAME
Text::ParseWords - parse text into an array of tokens or array of arrays

SYNOPSIS
use Text::ParseWords;
@lists = &nested_quotewords($delim, $keep, @lines);
@words = &quotewords($delim, $keep, @lines);
@words = &shellwords(@lines);
@words = &parse_line($delim, $keep, $line);
@words = &old_shellwords(@lines); # DEPRECATED!

DESCRIPTION
The &nested_quotewords() and &quotewords() functions accept a delimiter (which can be a regular expression) and a list of lines, and then break those lines up into a list of words, ignoring delimiters that appear inside quotes. &quotewords() returns all of the tokens in a single long list, while &nested_quotewords() returns a list of token lists corresponding to the elements of @lines. &parse_line() does tokenizing on a single string. The &*quotewords() functions simply call &parse_line(), so if you're only splitting one line you can call &parse_line() directly and save a function call.

The $keep argument is a boolean flag. If true, then the tokens are split on the specified delimiter, but all other characters (quotes, backslashes, etc.) are kept in the tokens. If $keep is false then the &*quotewords() functions remove all quotes and backslashes that are not themselves backslash-escaped or inside of single quotes (i.e., &quotewords() tries to interpret these characters just like the Bourne shell). NB: these semantics are significantly different from the original version of this module shipped with Perl 5.000 through 5.004. As an additional feature, $keep may be the keyword "delimiters", which causes the functions to preserve the delimiters in each string as tokens in the token lists, in addition to preserving quote and backslash characters.

&shellwords() is written as a special case of &quotewords(), and it does token parsing with whitespace as a delimiter, similar to most
Unix shells.

EXAMPLES
The sample program:

    use Text::ParseWords;
    @words = &quotewords('\s+', 0, q{this   is "a test" of\ quotewords \"for you});
    $i = 0;
    foreach (@words) {
        print "$i: <$_>\n";
        $i++;
    }

produces:

    0: <this>
    1: <is>
    2: <a test>
    3: <of quotewords>
    4: <"for>
    5: <you>

demonstrating:

0 a simple word
1 multiple spaces are skipped because of our $delim
2 use of quotes to include a space in a word
3 use of a backslash to include a space in a word
4 use of a backslash to remove the special meaning of a double-quote
5 another simple word (note the lack of effect of the backslashed double-quote)

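The three $keep settings and the difference between the flat and nested return shapes can be sketched as follows. This is an illustrative sketch only: the sample input strings, the comma delimiter, and the variable names (@plain, @kept, @delims, @lists) are assumptions made for the example and do not come from the module itself.

```perl
use strict;
use warnings;
use Text::ParseWords;   # exports parse_line and nested_quotewords by default

# Hypothetical sample input: the second field contains a quoted comma.
my $line = q{a,"b,c",d};

# $keep false: quotes are removed from the tokens.
my @plain = parse_line(',', 0, $line);

# $keep true: quotes (and backslashes) are kept in the tokens.
my @kept = parse_line(',', 1, $line);

# $keep eq 'delimiters': as for true, and the delimiters become tokens too.
my @delims = parse_line(',', 'delimiters', $line);

print join('|', @plain),  "\n";   # a|b,c|d
print join('|', @kept),   "\n";   # a|"b,c"|d
print join('|', @delims), "\n";   # a|,|"b,c"|,|d

# nested_quotewords returns one array reference per input line,
# where quotewords would flatten everything into a single list.
my @lists = nested_quotewords(',', 0, q{a,"b c"}, q{d,e});
for my $i (0 .. $#lists) {
    print "$i: ", join('|', @{ $lists[$i] }), "\n";
}
```

The quoted comma in the second field is never treated as a delimiter, whichever $keep setting is used. Only the treatment of the quote characters and of the real delimiters changes.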
Replacing "&quotewords('\s+', 0, q{this is...})" with "&shellwords(q{this is...})" is a simpler way to accomplish the same thing.

AUTHORS
Maintainer is Hal Pomeranz, 1994-1997 (original author unknown). Much of the code for &parse_line() (including the primary regexp) is from Joerk Behrends.

Examples section and other documentation provided by John Heidemann.

Bug reports, patches, and nagging provided by lots of folks; thanks everybody! Special thanks to Michael Schwern for assuring me that a &nested_quotewords() would be useful, and to Jeff Friedl for telling me not to worry about error-checking (sort of; you had to be there).

perl v5.8.8 2001-09-21 Text::ParseWords(3pm)