parsec-3.1.13.0: Monadic parser combinators

Copyright: (c) Daan Leijen 1999-2001, (c) Paolo Martini 2007
License: BSD-style (see the LICENSE file)
Maintainer: [email protected]
Stability: provisional
Portability: non-portable (uses local universal quantification: PolymorphicComponents)
Safe Haskell: Safe
Language: Haskell2010

Text.Parsec.Token

Description

A helper module to parse lexical elements (tokens). See makeTokenParser for a description of how to use it.

# Documentation

data GenLanguageDef s u m

The GenLanguageDef type is a record that contains all parameterizable features of the Text.Parsec.Token module. The module Text.Parsec.Language contains some default definitions.

Constructors

LanguageDef

Fields

commentStart :: String
Describes the start of a block comment. Use the empty string if the language doesn't support block comments. For example "/*".

commentEnd :: String
Describes the end of a block comment. Use the empty string if the language doesn't support block comments. For example "*/".

commentLine :: String
Describes the start of a line comment. Use the empty string if the language doesn't support line comments. For example "//".

nestedComments :: Bool
Set to True if the language supports nested block comments.

identStart :: ParsecT s u m Char
This parser should accept any start characters of identifiers. For example letter <|> char '_'.

identLetter :: ParsecT s u m Char
This parser should accept any legal tail characters of identifiers. For example alphaNum <|> char '_'.

opStart :: ParsecT s u m Char
This parser should accept any start characters of operators. For example oneOf ":!#$%&*+./<=>?@\\^|-~".

opLetter :: ParsecT s u m Char
This parser should accept any legal tail characters of operators. Note that this parser should be defined even if the language doesn't support user-defined operators; otherwise the reservedOp parser won't work correctly.

reservedNames :: [String]
The list of reserved identifiers.

reservedOpNames :: [String]
The list of reserved operators.

caseSensitive :: Bool
Set to True if the language is case sensitive.
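The fields above can be filled in by overriding the defaults from Text.Parsec.Language. The following is a minimal sketch, not a definitive language definition: toyDef, lexer and ident are hypothetical names, and the choice of comment markers and reserved words is an assumption made only for illustration.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as P

-- Hypothetical toy language: start from emptyDef and override every field.
toyDef :: P.LanguageDef ()
toyDef = emptyDef
  { P.commentStart    = "/*"
  , P.commentEnd      = "*/"
  , P.commentLine     = "//"
  , P.nestedComments  = False
  , P.identStart      = letter <|> char '_'
  , P.identLetter     = alphaNum <|> char '_'
  , P.opStart         = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , P.opLetter        = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , P.reservedNames   = ["let", "in"]   -- assumed reserved words
  , P.reservedOpNames = ["="]           -- assumed reserved operator
  , P.caseSensitive   = True
  }

lexer :: P.TokenParser ()
lexer = P.makeTokenParser toyDef

-- Parse exactly one identifier, skipping leading white space and comments.
ident :: String -> Either ParseError String
ident = parse (P.whiteSpace lexer *> P.identifier lexer <* eof) "<input>"

main :: IO ()
main = do
  print (ident "/* block */ foo_1 // trailing")  -- an ordinary identifier
  print (ident "let")                            -- reserved word, so a Left
```

Starting from emptyDef keeps the record total, so a field that is not overridden still has a usable default.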

data GenTokenParser s u m

The type of the record that holds lexical parsers that work on s streams with state u over a monad m.

Constructors

TokenParser

Fields

identifier :: ParsecT s u m String
This lexeme parser parses a legal identifier. Returns the identifier string. This parser will fail on identifiers that are reserved words. Legal identifier (start) characters and reserved words are defined in the LanguageDef that is passed to makeTokenParser. An identifier is treated as a single token using try.

reserved :: String -> ParsecT s u m ()
The lexeme parser reserved name parses symbol name, but it also checks that the name is not a prefix of a valid identifier. A reserved word is treated as a single token using try.

operator :: ParsecT s u m String
This lexeme parser parses a legal operator. Returns the name of the operator. This parser will fail on any operators that are reserved operators. Legal operator (start) characters and reserved operators are defined in the LanguageDef that is passed to makeTokenParser. An operator is treated as a single token using try.

reservedOp :: String -> ParsecT s u m ()
The lexeme parser reservedOp name parses symbol name, but it also checks that the name is not a prefix of a valid operator. A reservedOp is treated as a single token using try.

charLiteral :: ParsecT s u m Char
This lexeme parser parses a single literal character. Returns the literal character value. This parser deals correctly with escape sequences. The literal character is parsed according to the grammar rules defined in the Haskell report (which matches most programming languages quite closely).

stringLiteral :: ParsecT s u m String
This lexeme parser parses a literal string. Returns the literal string value. This parser deals correctly with escape sequences and gaps. The literal string is parsed according to the grammar rules defined in the Haskell report (which matches most programming languages quite closely).

natural :: ParsecT s u m Integer
This lexeme parser parses a natural number (a non-negative whole number). Returns the value of the number. The number can be specified in decimal, hexadecimal or octal. The number is parsed according to the grammar rules in the Haskell report.

integer :: ParsecT s u m Integer
This lexeme parser parses an integer (a whole number). This parser is like natural except that it can be prefixed with a sign (i.e. '-' or '+'). Returns the value of the number. The number can be specified in decimal, hexadecimal or octal. The number is parsed according to the grammar rules in the Haskell report.

float :: ParsecT s u m Double
This lexeme parser parses a floating point value. Returns the value of the number. The number is parsed according to the grammar rules defined in the Haskell report.

naturalOrFloat :: ParsecT s u m (Either Integer Double)
This lexeme parser parses either a natural or a float. Returns the value of the number. This parser deals with any overlap in the grammar rules for naturals and floats. The number is parsed according to the grammar rules defined in the Haskell report.

decimal :: ParsecT s u m Integer
Parses a non-negative whole number in the decimal system. Returns the value of the number.

hexadecimal :: ParsecT s u m Integer
Parses a non-negative whole number in the hexadecimal system. The number should be prefixed with "0x" or "0X". Returns the value of the number.

octal :: ParsecT s u m Integer
Parses a non-negative whole number in the octal system. The number should be prefixed with "0o" or "0O". Returns the value of the number.

symbol :: String -> ParsecT s u m String
Lexeme parser symbol s parses string s and skips trailing white space.

lexeme :: forall a. ParsecT s u m a -> ParsecT s u m a
lexeme p first applies parser p and then the whiteSpace parser, returning the value of p. Every lexical token (lexeme) is defined using lexeme, so every parse starts at a point without white space. Parsers that use lexeme are called lexeme parsers in this document.

The only point where the whiteSpace parser should be called explicitly is the start of the main parser, in order to skip any leading white space.

mainParser = do { whiteSpace
                ; ds <- many (lexeme digit)
                ; eof
                ; return (sum (map digitToInt ds))  -- digitToInt is from Data.Char
                }

whiteSpace :: ParsecT s u m ()
Parses any white space. White space consists of zero or more occurrences of a space, a line comment or a block (multi line) comment. Block comments may be nested. How comments are started and ended is defined in the LanguageDef that is passed to makeTokenParser.

parens :: forall a. ParsecT s u m a -> ParsecT s u m a
Lexeme parser parens p parses p enclosed in parentheses, returning the value of p.

braces :: forall a. ParsecT s u m a -> ParsecT s u m a
Lexeme parser braces p parses p enclosed in braces ('{' and '}'), returning the value of p.

angles :: forall a. ParsecT s u m a -> ParsecT s u m a
Lexeme parser angles p parses p enclosed in angle brackets ('<' and '>'), returning the value of p.

brackets :: forall a. ParsecT s u m a -> ParsecT s u m a
Lexeme parser brackets p parses p enclosed in brackets ('[' and ']'), returning the value of p.

squares :: forall a. ParsecT s u m a -> ParsecT s u m a
DEPRECATED: Use brackets.

semi :: ParsecT s u m String
Lexeme parser semi parses the character ';' and skips any trailing white space. Returns the string ";".

comma :: ParsecT s u m String
Lexeme parser comma parses the character ',' and skips any trailing white space. Returns the string ",".

colon :: ParsecT s u m String
Lexeme parser colon parses the character ':' and skips any trailing white space. Returns the string ":".

dot :: ParsecT s u m String
Lexeme parser dot parses the character '.' and skips any trailing white space. Returns the string ".".

semiSep :: forall a. ParsecT s u m a -> ParsecT s u m [a]
Lexeme parser semiSep p parses zero or more occurrences of p separated by semi. Returns a list of values returned by p.

semiSep1 :: forall a. ParsecT s u m a -> ParsecT s u m [a]
Lexeme parser semiSep1 p parses one or more occurrences of p separated by semi. Returns a list of values returned by p.

commaSep :: forall a. ParsecT s u m a -> ParsecT s u m [a]
Lexeme parser commaSep p parses zero or more occurrences of p separated by comma. Returns a list of values returned by p.

commaSep1 :: forall a. ParsecT s u m a -> ParsecT s u m [a]
Lexeme parser commaSep1 p parses one or more occurrences of p separated by comma. Returns a list of values returned by p.
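To illustrate how the lexeme parsers compose, here is a small sketch that combines brackets, commaSep and integer. The names lexer, intList and parseInts are hypothetical, and emptyDef is used only because this example needs no comments or reserved words.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as P

lexer :: P.TokenParser ()
lexer = P.makeTokenParser emptyDef

-- A bracketed, comma-separated list of signed integers.
intList :: Parsec String () [Integer]
intList = P.brackets lexer (P.commaSep lexer (P.integer lexer))

-- whiteSpace is called once, at the start; each lexeme parser
-- then skips the white space that follows its own token.
parseInts :: String -> Either ParseError [Integer]
parseInts = parse (P.whiteSpace lexer *> intList <* eof) "<input>"

main :: IO ()
main = do
  print (parseInts "  [ 1, -2 , 0x1F ]  ")  -- Right [1,-2,31]
  print (parseInts "[1 2]")                 -- a Left: the comma is missing
```

Because every token parser consumes trailing white space, the grammar itself never has to mention spaces between elements.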

makeTokenParser :: Stream s m Char => GenLanguageDef s u m -> GenTokenParser s u m

The expression makeTokenParser language creates a GenTokenParser record that contains lexical parsers that are defined using the definitions in the language record.

The use of this function is quite stylized: one imports the appropriate language definition and selects the lexical parsers that are needed from the resulting GenTokenParser.

module Main where

import Text.Parsec
import Text.Parsec.Language (haskellDef)  -- provides the language definition used below
import qualified Text.Parsec.Token as P

-- The parser
...

expr  =   parens expr
      <|> identifier
      <|> ...

-- The lexer
lexer       = P.makeTokenParser haskellDef

parens      = P.parens lexer
braces      = P.braces lexer
identifier  = P.identifier lexer
reserved    = P.reserved lexer
...
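A complete, compilable variant of the pattern above might look as follows. This is only a sketch: the grammar (an identifier wrapped in any number of parentheses), the explicit type signatures, and the parseExpr helper are assumptions added for illustration, and the elided alternatives of the original expr are left out.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.Language (haskellDef)
import qualified Text.Parsec.Token as P

-- The lexer
lexer :: P.TokenParser ()
lexer = P.makeTokenParser haskellDef

parens :: Parsec String () a -> Parsec String () a
parens = P.parens lexer

identifier :: Parsec String () String
identifier = P.identifier lexer

whiteSpace :: Parsec String () ()
whiteSpace = P.whiteSpace lexer

-- The parser (hypothetical grammar: an identifier in nested parentheses)
expr :: Parsec String () String
expr = parens expr <|> identifier

-- Skip leading white space once, then require the whole input to be consumed.
parseExpr :: String -> Either ParseError String
parseExpr = parse (whiteSpace *> expr <* eof) "<input>"

main :: IO ()
main = do
  print (parseExpr "((foo))")
  print (parseExpr "{- a comment -} ( bar ) -- done")
  print (parseExpr "(where)")  -- "where" is reserved in haskellDef, so a Left
```

The explicit signatures are a deliberate choice: without them, the monomorphism restriction can keep bindings such as parens from being used at more than one result type.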