mirror of https://github.com/stedolan/jq.git synced 2024-05-11 05:55:39 +00:00

Added regex support as per issue #164.

jq now depends on oniguruma for regex support.
Modified configure.ac accordingly.

Added valgrind suppression file for oniguruma to prevent one-time and bounded
leaks from causing tests to fail.

Signed-off-by: Nicolas Williams <nico@cryptonector.com>
William Langford
2014-06-18 19:49:38 -04:00
committed by Nicolas Williams
parent 5d9d1b1020
commit 8ff935c01a
7 changed files with 420 additions and 2 deletions

@@ -1087,6 +1087,78 @@ sections:
input: '["foobar", "barfoo"]'
output: ['[false, true, true, false, false]']
- title: "`match(val)`, `match(regex; modifiers)`"
body: |
The filter `match(val)` performs PCRE-style regex matching (via Oniguruma) on its input.
`val` can be either a string or an array. If it is an array,
the first element is the regex specifier and the optional
second element is the modifier flags.
The accepted modifier flags are:
* `g` - Global search (find all matches, not just the first)
* `i` - Case insensitive search
* `x` - Extended regex format (ignore whitespace)
* `m` - Multi line mode ('.' will match newlines)
* `s` - Single line mode ('^' -> '\A', '$' -> '\Z')
* `p` - Both s and m modes are enabled
* `l` - Find longest possible matches
* `n` - Ignore empty matches
The filter outputs an object for each match it finds. Matches have
the following fields:
* `offset` - offset in UTF-8 codepoints from the beginning of the input
* `length` - length in UTF-8 codepoints of the match
* `string` - the string that it matched
* `captures` - an array of objects representing capturing groups.
Capturing group objects have the following fields:
* `offset` - offset in UTF-8 codepoints from the beginning of the input
* `length` - length in UTF-8 codepoints of this capturing group
* `string` - the string that was captured
* `name` - the name of the capturing group (or `null` if it was unnamed)
Capturing groups that did not match anything return an offset of -1.
examples:
- program: 'match("(abc)+"; "g")'
input: '"abc abc"'
output:
- '{"offset": 0, "length": 3, "string": "abc", "captures": [{"offset": 0, "length": 3, "string": "abc", "name": null}]}'
- '{"offset": 4, "length": 3, "string": "abc", "captures": [{"offset": 4, "length": 3, "string": "abc", "name": null}]}'
- program: 'match("foo")'
input: '"foo bar foo"'
output: ['{"offset": 0, "length": 3, "string": "foo", "captures": []}']
- program: 'match(["foo", "ig"])'
input: '"foo bar FOO"'
output:
- '{"offset": 0, "length": 3, "string": "foo", "captures": []}'
- '{"offset": 8, "length": 3, "string": "FOO", "captures": []}'
- program: 'match("foo (?<bar123>bar)? foo"; "ig")'
input: '"foo bar foo foo  foo"'
output:
- '{"offset": 0, "length": 11, "string": "foo bar foo", "captures": [{"offset": 4, "length": 3, "string": "bar", "name": "bar123"}]}'
- '{"offset": 12, "length": 8, "string": "foo  foo", "captures": [{"offset": -1, "length": 0, "string": null, "name": "bar123"}]}'
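# Illustrative sketch, not part of the original commit: because each
# capture object carries `name` and `string`, named groups can be
# gathered into a single object. The group names `first` and `second`
# are hypothetical, and the program assumes the match-object layout
# documented in the body above.
- program: 'match("(?<first>[a-z]+) (?<second>[a-z]+)") | .captures | map({(.name): .string}) | add'
input: '"hello world"'
output: ['{"first": "hello", "second": "world"}']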
- title: "`test(val)`, `test(regex; modifiers)`"
body: |
Like `match`, but does not return match objects, only `true` or `false`
for whether or not the regex matches the input.
examples:
- program: 'test("foo")'
input: '"foo"'
output: ['true']
- program: 'test("foo"; "i")'
input: '"Foo"'
output: ['true']
- program: 'test("foo")'
input: '"bar"'
output: ['false']
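# Illustrative sketch, not part of the original commit: since `test`
# yields a boolean, it can drive `select` to filter strings by pattern.
# This assumes `select` and the `i` modifier behave as documented in
# this manual; the input values are made up for the example.
- program: '.[] | select(test("bar$"; "i"))'
input: '["foobar", "fooBAR", "barfoo"]'
output:
- '"foobar"'
- '"fooBAR"'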
- title: "`ltrimstr(str)`"
body: |