pleroma.debian.social

Haskell coders:
Say I want to parse a string and I don't need a fancy proper "Parser", I need something similar to a regex. I am happy using a PEG/combinator library if it is accessible. I am a pure ("I am fluent in ML") beginner.

I have found two library candidates:
- SimpleParser
- Text.Regex.Applicative

Both have examples but not commented examples, and the docs are hyper-terse and assume you're already familiar with Haskell idioms. I might be able to decode them but am unsure they're Best.

data RE s a

Type of regular expressions that recognize symbols of type s and produce a result of type a.

Regular expressions can be built using Functor, Applicative, Alternative, and Filtrable instances in the following natural way:

- f <$> ra matches iff ra matches, and its return value is the result of applying f to the return value of ra.
- pure x matches the empty string (i.e. it does not consume any symbols), and its return value is x.
- rf <*> ra matches a string iff it is a concatenation of two strings: one matched by rf and the other matched by ra. The return value is f a, where f and a are the return values of rf and ra respectively.
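
To make the quoted instances concrete, here is a minimal sketch of that style with Text.Regex.Applicative; the key=value shape and the keyValue name are only illustrative, while sym, psym, =~ and the standard Applicative/Alternative combinators are the pieces the docs above describe:

    import Control.Applicative (many, some)
    import Data.Char (isDigit)
    import Text.Regex.Applicative

    -- Recognize "key=digits" over Char symbols and produce a (String, Int):
    -- psym matches one symbol satisfying a predicate, sym matches one exact
    -- symbol, and <$> / <*> / many / some combine them as described above.
    keyValue :: RE Char (String, Int)
    keyValue =
      (,) <$> many (psym (/= '='))
          <*  sym '='
          <*> (read <$> some (psym isDigit))

    main :: IO ()
    main = print ("width=42" =~ keyValue)  -- Just ("width",42)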

If a beginner was like "I am looking for a Haskell library that's the equivalent of Python PyParsing, OCaml sedlex or Rust pom¹", in other words a basic-capability workhorse parser with a focus on ergonomics, is there an obvious go-to in Haskell? I have only Google results, so I'm looking for human insight.

(If I get a reply that begins with "I asked ChatGPT and…" I block you.)

@dysfun ok. isn't that the super industrial strength one? at least one of the other libraries described itself as "megaparsec but simpler".

Although I guess megaparsec *does* have the kind of elaborate, extensive documentation I'm looking for… :O

@dysfun Great. Thank you.

@mcc as an outsider, I didn’t get into Haskell because answers to this seemed to be “well, if you’re using this compiler, and these extensions, and you’ve entirely structured your code to comply with this very specific, neat, mathematical theory, this parser is great. Theoretically. Errors suck though, and it uses quadratic space.”

Or that’s the unfair impression I got anyway.

@mcc that last sentence is a good general policy

@EMR In fact it is the general policy but I felt it was good to explicitly vocalize it in this case

@mcc If the infix operators are unclear: <$> is the infix version of fmap, applying f : a -> b underneath F in F a to yield an F b. <*> applies a function embedded underneath F to a value also embedded underneath F.

Given a function f : a -> b -> c and values x : F a and y : F b you can then write f <$> x <*> y to get an F c and stay pretty close to the notation f x y even though x and y are underneath F.
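
As a concrete (if trivial) instance of that pattern, take F = Maybe; this compiles with just the Prelude:

    -- (+) :: Int -> Int -> Int lifted over two Maybe-wrapped arguments:
    -- the result is Nothing as soon as either argument is Nothing.
    addBoth :: Maybe Int -> Maybe Int -> Maybe Int
    addBoth x y = (+) <$> x <*> y

    main :: IO ()
    main = do
      print (addBoth (Just 1) (Just 2))  -- Just 3
      print (addBoth (Just 1) Nothing)   -- Nothing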

@mcc @dysfun I'd second this. I had a look at some regex libraries in Haskell and ended up deciding to use parsec instead. Megaparsec is one of a couple of libraries that are actually maintained successors to parsec, from what I remember.

@mcc From personal experience with parsec and attoparsec (plus more arcane stuff with Template Haskell and core), I'd say go for parsec - there's more modern/less standard stuff, but you'd probably prefer stability and good error messages with the option of changing to something else. It's a bit of investment to go that route, but better than handrolling or more researchy parsers.
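
For comparison with the regex-applicative sketch further up, roughly the same toy key=value parser in parsec could look like the following; keyValue is again just an illustrative name, and many1, noneOf, digit, char and parse are parsec's own exports:

    import Text.Parsec
    import Text.Parsec.String (Parser)

    -- Parse "key=digits"; on failure parsec reports the source position plus
    -- what it found and what it expected.
    keyValue :: Parser (String, Int)
    keyValue = do
      key <- many1 (noneOf "=")
      _   <- char '='
      val <- many1 digit
      return (key, read val)

    main :: IO ()
    main = do
      print (parse keyValue "" "width=42")   -- Right ("width",42)
      print (parse keyValue "" "width=ten")  -- Left (error at line 1, column 7)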

@mcc optparse-applicative is also a kinda neat special case, see https://github.com/mfourne/signify-hs/blob/main/src/Main.hs for a small example, but I would not want to parse more stuff than some options with an applicative interface.
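
A minimal sketch of that optparse-applicative style, for the record; the Opts record and its fields are hypothetical, while strOption, option auto, long, value, help, info, helper and execParser are the library's combinators:

    import Control.Applicative ((<**>))
    import Options.Applicative

    -- Two command-line options assembled with the same <$> / <*> shape
    -- discussed above.
    data Opts = Opts { optName :: String, optCount :: Int } deriving Show

    opts :: Parser Opts
    opts = Opts
      <$> strOption (long "name" <> metavar "NAME" <> help "who to greet")
      <*> option auto (long "count" <> value 1 <> help "how many times")

    main :: IO ()
    main = print =<< execParser (info (opts <**> helper) fullDesc)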

@mcc @dysfun there’s quite a good chapter on parsec (the megaparsec precursor) in “Real World Haskell” (which is free online). I concur that a parsec variant is the best choice for this, even for micro-scale parsing. And I found that counterintuitive the first time.