06 Aug 2013
VerbalExpressions is going to make it much easier to parse datasets with idiosyncratic conventions. See how the Ruby implementation handles the location headings in the ULAN (the Getty's Union List of Artist Names), which annoyingly concatenate unique id numbers with preferred terms:
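```ruby
require 'verbal_expressions'

location = "5600392409/New York City (New York state, United States) (inhabited place)"

# Match everything from the start of the line up to the first slash (the id number)
num_query = VerEx.new do
  start_of_line
  anything_but "/"
end
puts num_query.source # => ^(?:[^/]*)

# Match the slash and everything after it up to the first opening parenthesis
content_query = VerEx.new do
  find "/"
  anything_but "("
end
puts content_query.source # => (?:/)(?:[^\(]*)

puts location.slice(num_query) # => 5600392409
# The match includes the leading slash and the trailing space, so trim both
puts location.slice(content_query).sub("/", "").strip # => New York City
```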
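For comparison, here is a minimal sketch of the same split using only Ruby's built-in `Regexp` with named captures, so it runs without the gem. The constant `ULAN_HEADING` and the helper `split_ulan_heading` are hypothetical names introduced for illustration, and the pattern assumes every heading follows the `id/term (qualifier)` layout of the example string.

```ruby
# Hypothetical pattern, not part of verbal_expressions:
# capture the id before the first slash and the term before the first "("
ULAN_HEADING = %r{\A(?<id>[^/]*)/(?<term>[^(]*)}

# Hypothetical helper: returns a hash with :id and :term,
# or nil when the heading contains no slash
def split_ulan_heading(heading)
  match = ULAN_HEADING.match(heading)
  return nil unless match

  # Strip the space that precedes the opening parenthesis
  { id: match[:id], term: match[:term].strip }
end

location = "5600392409/New York City (New York state, United States) (inhabited place)"
parts = split_ulan_heading(location)
puts parts[:id]   # => 5600392409
puts parts[:term] # => New York City
```

The plain pattern is shorter, but the VerbalExpressions version spells out each step in words, which is the appeal when the convention being parsed is an odd one.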
Lincoln, Matthew D. "VerbalExpressions."
Matthew Lincoln, PhD (blog), 06 Aug 2013, https://matthewlincoln.net/2013/08/06/verbalexpressions.html.