libhext: C++ Library Documentation
1.0.5-1fd8a07
Classes
class | AppendPipe
Appends a given string to a string.
class | AttributeCapture
Captures an HTML element's attribute.
class | AttributeCountMatch
Matches HTML elements that have a certain number of HTML attributes.
class | AttributeMatch
Matches HTML elements having an HTML attribute with a certain name and, optionally, whose value is matched by a ValueTest.
class | BeginsWithTest
Tests whether a string begins with a given literal.
class | Capture
Abstract base for every Capture.
class | CasePipe
Changes the case of a string. Changes to lower case by default.
class | ChildCountMatch
Matches HTML elements that have a certain number of children of type element (excluding text nodes, document nodes and others).
class | Cloneable
Curiously recurring template pattern that extends a base class to provide a virtual method Cloneable::clone().
class | CollapseWsPipe
Removes whitespace from the beginning and end of a string and collapses runs of whitespace into a single space.
class | ContainsTest
Tests whether a string contains a given literal.
class | ContainsWordsTest
Tests whether a string contains all given words.
class | EndsWithTest
Tests whether a string ends with a given literal.
class | EqualsTest
Tests whether a string equals a given literal.
class | FunctionCapture
Captures the result of applying a function to an HTML node.
class | FunctionMatch
Matches if applying a given MatchFunction to an HTML node returns true.
class | FunctionValueMatch
Matches if the result of applying a given CaptureFunction to an HTML node passes a ValueTest.
class | Html
A RAII wrapper for Gumbo.
class | Match
Abstract base for every Match.
class | MaxSearchError
The exception that is thrown when max_searches reaches zero while calling Rule::extract.
class | NegateMatch
Matches HTML nodes for which every given Match returns false.
class | NegateTest
Negates the result of another ValueTest.
class | NthChildMatch
Matches HTML nodes having a certain position within their parent HTML element.
class | OnlyChildMatch
Matches HTML nodes that are the only child of their parent HTML element.
class | PrependPipe
Prepends a given string to a string.
class | RegexPipe
Filters a string according to a given regex.
class | RegexReplacePipe
Replaces a string within a string according to a given regex.
class | RegexTest
Tests whether a string matches a given regex.
class | Rule
Extracts values from HTML.
class | StringPipe
Abstract base for every StringPipe.
class | SyntaxError
The exception that is thrown when parsing invalid hext.
class | TrimPipe
Trims characters from the beginning and end of a string.
class | ValueTest
Abstract base for every ValueTest.
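The Cloneable entry in the class list describes a curiously recurring template pattern that adds a virtual clone() to a base class. The following is a minimal sketch of that general technique, not libhext's actual implementation: TestBase, CloneableSketch and StartsWith are hypothetical names introduced only for illustration, and the real Cloneable template may take different parameters.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical abstract base; stands in for a base such as ValueTest.
class TestBase {
public:
  virtual ~TestBase() = default;
  virtual bool test(const std::string& subject) const = 0;
  virtual std::unique_ptr<TestBase> clone() const = 0;
};

// CRTP helper (hypothetical): derives from Base and implements clone() by
// copy-constructing the most derived type, known through the Derived parameter.
template<typename Derived, typename Base>
class CloneableSketch : public Base {
public:
  std::unique_ptr<Base> clone() const override {
    return std::make_unique<Derived>(static_cast<const Derived&>(*this));
  }
};

// Hypothetical concrete test: inherits clone() instead of writing it by hand.
class StartsWith : public CloneableSketch<StartsWith, TestBase> {
public:
  explicit StartsWith(std::string prefix) : prefix_(std::move(prefix)) {}

  bool test(const std::string& subject) const override {
    return subject.compare(0, prefix_.size(), prefix_) == 0;
  }

private:
  std::string prefix_;
};

// Usage: a copy made through the base interface keeps the derived behaviour.
void clone_demo() {
  StartsWith original("ab");
  std::unique_ptr<TestBase> copy = original.clone();
  assert(copy->test("abc"));
  assert(!copy->test("xabc"));
}
```

The point of the pattern is that each concrete class gets a correct polymorphic copy without repeating the same clone() body.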
Typedefs
using | CaptureFunction = std::function< std::string(const GumboNode *)>
A type of std::function that receives an HTML element and returns a string.
using | MatchFunction = std::function< bool(const GumboNode *)>
A type of std::function that receives an HTML element and returns a bool.
using | ResultPair = std::pair< std::string, std::string >
A string-pair containing a name and a value.
using | ResultMap = std::multimap< ResultPair::first_type, ResultPair::second_type >
A multimap containing the values produced by capturing.
using | Result = std::vector< ResultMap >
A vector containing ResultMap.
Functions
HEXT_PUBLIC NthChildMatch::Option | operator| (NthChildMatch::Option left, NthChildMatch::Option right) noexcept
Applies Bitwise-OR to NthChildMatch::Option.
HEXT_PUBLIC NthChildMatch::Option | operator& (NthChildMatch::Option left, NthChildMatch::Option right) noexcept
Applies Bitwise-AND to NthChildMatch::Option.
HEXT_PUBLIC OnlyChildMatch::Option | operator| (OnlyChildMatch::Option left, OnlyChildMatch::Option right) noexcept
Applies Bitwise-OR to OnlyChildMatch::Option.
HEXT_PUBLIC OnlyChildMatch::Option | operator& (OnlyChildMatch::Option left, OnlyChildMatch::Option right) noexcept
Applies Bitwise-AND to OnlyChildMatch::Option.
HEXT_PUBLIC Rule | ParseHext (const char *hext)
Parses a null-terminated string containing hext rule definitions.
HEXT_PUBLIC Rule | ParseHext (const char *hext, std::size_t size)
Parses a buffer containing hext rule definitions.
Variables
HEXT_PUBLIC const CaptureFunction | TextBuiltin
A CaptureFunction that returns the text of an HTML element.
HEXT_PUBLIC const CaptureFunction | InnerHtmlBuiltin
A CaptureFunction that returns the inner HTML of an HTML element.
HEXT_PUBLIC const CaptureFunction | StripTagsBuiltin
A CaptureFunction that returns the inner HTML of an HTML element without tags.
HEXT_PUBLIC const int | version_major
Major version number.
HEXT_PUBLIC const int | version_minor
Minor version number.
HEXT_PUBLIC const int | version_patch
Patch version number.
using hext::CaptureFunction = std::function<std::string (const GumboNode *)>
A type of std::function that receives an HTML element and returns a string.
Definition at line 31 of file CaptureFunction.h.
using hext::MatchFunction = std::function<bool (const GumboNode *)>
A type of std::function that receives an HTML element and returns a bool.
Definition at line 30 of file MatchFunction.h.
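As an illustration of the two aliases, the sketch below assigns lambdas to them. It is written to compile without gumbo.h, so GumboNode here is a stand-in struct with a hypothetical tag field; the real GumboNode has a different layout. TagName and IsAnchor are likewise hypothetical and not part of libhext.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Stand-in for Gumbo's node type so the sketch compiles on its own.
// The real GumboNode is declared in gumbo.h and looks different.
struct GumboNode {
  std::string tag;  // hypothetical field, for illustration only
};

// The two aliases, with the signatures documented above.
using CaptureFunction = std::function<std::string(const GumboNode*)>;
using MatchFunction = std::function<bool(const GumboNode*)>;

// Hypothetical capture: returns the tag name, or an empty string for null.
const CaptureFunction TagName = [](const GumboNode* node) {
  return node ? node->tag : std::string();
};

// Hypothetical match: true only for non-null nodes tagged "a".
const MatchFunction IsAnchor = [](const GumboNode* node) {
  return node != nullptr && node->tag == "a";
};

// Usage: both aliases are called like ordinary functions.
void function_demo() {
  GumboNode link{"a"};
  assert(TagName(&link) == "a");
  assert(IsAnchor(&link));
}
```

Because both aliases are std::function types, any callable with a matching signature (lambda, free function, function object) can be stored in them.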
using hext::Result = std::vector<ResultMap>
A vector containing ResultMap.
using hext::ResultMap = std::multimap<ResultPair::first_type, ResultPair::second_type>
A multimap containing the values produced by capturing.
using hext::ResultPair = std::pair<std::string, std::string>
A string-pair containing a name and a value.
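To show how the three result aliases nest, here is a small sketch that re-declares them locally (outside the hext namespace) and walks a hand-built Result. The helper values_for and the sample data are hypothetical; nothing here is produced by libhext itself.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Local copies of the aliases documented above.
using ResultPair = std::pair<std::string, std::string>;
using ResultMap = std::multimap<ResultPair::first_type, ResultPair::second_type>;
using Result = std::vector<ResultMap>;

// Hypothetical helper: collects every value stored under `name`,
// visiting the ResultMaps in vector order.
std::vector<std::string> values_for(const Result& result, const std::string& name) {
  std::vector<std::string> values;
  for (const ResultMap& map : result) {
    auto range = map.equal_range(name);  // a multimap may hold a name repeatedly
    for (auto it = range.first; it != range.second; ++it) {
      values.push_back(it->second);
    }
  }
  return values;
}

// Hand-built sample data for illustration.
Result sample_result() {
  return Result{
    ResultMap{{"href", "/a"}, {"title", "First"}},
    ResultMap{{"href", "/b"}, {"href", "/c"}},
  };
}

// Usage: three values are stored under "href" across the two maps.
void result_demo() {
  assert(values_for(sample_result(), "href").size() == 3);
}
```

Since ResultMap is a multimap, one name can map to several values, which is why the lookup uses equal_range rather than find.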
enum class hext::HtmlTag
An enum containing all valid HTML tags.
With the exception of HtmlTag::ANY, every HtmlTag can be cast to its GumboTag counterpart (same int value).
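The cast described above can be sketched as follows. Both enums here are stand-ins with made-up members and numeric values (GumboTagSketch, HtmlTagSketch); the real GumboTag and HtmlTag have many more members, and the value chosen for ANY is an assumption. The helper to_gumbo_tag is hypothetical as well.

```cpp
#include <cassert>

// Stand-in for Gumbo's C enum; values are illustrative only.
enum GumboTagSketch {
  SKETCH_TAG_HTML = 0,
  SKETCH_TAG_HEAD = 1,
  SKETCH_TAG_TITLE = 2
};

// Stand-in scoped enum mirroring the values above, plus ANY, which has no
// counterpart in the C enum. The value -1 is an assumption.
enum class HtmlTagSketch : int { ANY = -1, HTML = 0, HEAD = 1, TITLE = 2 };

// Converts to the C enum. Returns false for ANY and leaves `out` untouched.
bool to_gumbo_tag(HtmlTagSketch tag, GumboTagSketch& out) {
  if (tag == HtmlTagSketch::ANY) {
    return false;
  }
  out = static_cast<GumboTagSketch>(static_cast<int>(tag));
  return true;
}

// Usage: an ordinary tag converts, ANY does not.
void tag_demo() {
  GumboTagSketch out = SKETCH_TAG_HTML;
  assert(to_gumbo_tag(HtmlTagSketch::HEAD, out));
  assert(out == SKETCH_TAG_HEAD);
}
```

Guarding ANY before the cast keeps a value without a counterpart from ever reaching code that expects a valid tag of the C enum.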
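HEXT_PUBLIC NthChildMatch::Option hext::operator& (NthChildMatch::Option left, NthChildMatch::Option right)
inline, noexcept
Applies Bitwise-AND to NthChildMatch::Option.
Definition at line 169 of file NthChildMatch.h.
HEXT_PUBLIC OnlyChildMatch::Option hext::operator& (OnlyChildMatch::Option left, OnlyChildMatch::Option right)
inline, noexcept
Applies Bitwise-AND to OnlyChildMatch::Option.
Definition at line 76 of file OnlyChildMatch.h.
HEXT_PUBLIC NthChildMatch::Option hext::operator| (NthChildMatch::Option left, NthChildMatch::Option right)
inline, noexcept
Applies Bitwise-OR to NthChildMatch::Option.
Definition at line 160 of file NthChildMatch.h.
HEXT_PUBLIC OnlyChildMatch::Option hext::operator| (OnlyChildMatch::Option left, OnlyChildMatch::Option right)
inline, noexcept
Applies Bitwise-OR to OnlyChildMatch::Option.
Definition at line 67 of file OnlyChildMatch.h.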
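Operators like these are typically needed because a scoped enum does not convert implicitly to an integer, so flags cannot be combined with the built-in | and &. The sketch below shows the usual way to write them. The Option enum and its members are hypothetical; the actual members of NthChildMatch::Option and OnlyChildMatch::Option may differ, and has_flag is not part of libhext.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical flag enum, for illustration only.
enum class Option : unsigned { None = 0, Last = 1u << 1, OfType = 1u << 2 };

// Bitwise-OR: combines two sets of flags.
inline Option operator|(Option left, Option right) noexcept {
  using T = std::underlying_type<Option>::type;
  return static_cast<Option>(static_cast<T>(left) | static_cast<T>(right));
}

// Bitwise-AND: keeps only the flags present in both operands.
inline Option operator&(Option left, Option right) noexcept {
  using T = std::underlying_type<Option>::type;
  return static_cast<Option>(static_cast<T>(left) & static_cast<T>(right));
}

// Hypothetical helper: true if every bit of `flag` is set in `options`.
inline bool has_flag(Option options, Option flag) noexcept {
  return (options & flag) == flag;
}

// Usage: combine two flags, then test for one of them.
void option_demo() {
  Option both = Option::Last | Option::OfType;
  assert(has_flag(both, Option::Last));
}
```

Casting through the underlying type keeps the operators valid for any integral base the enum declares.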
HEXT_PUBLIC Rule hext::ParseHext (const char *hext)
Parses a null-terminated string containing hext rule definitions.
Throws SyntaxError with a detailed error message on invalid input.
Exceptions
SyntaxError | Thrown on invalid input.
Parameters
hext | A null-terminated string containing hext rule definitions.
HEXT_PUBLIC Rule hext::ParseHext (const char *hext, std::size_t size)
Parses a buffer containing hext rule definitions.
Throws SyntaxError with a detailed error message on invalid input.
Exceptions
SyntaxError | Thrown on invalid input.
Parameters
hext | A string containing hext rule definitions.
size | The length of the string.
HEXT_PUBLIC const CaptureFunction hext::InnerHtmlBuiltin
extern
A CaptureFunction that returns the inner HTML of an HTML element.
The intent is to mimic the DOM property innerHTML.
Parameters
node | A pointer to a GumboNode.
HEXT_PUBLIC const CaptureFunction hext::StripTagsBuiltin
extern
A CaptureFunction that returns the inner HTML of an HTML element without tags.
Parameters
node | A pointer to a GumboNode.
HEXT_PUBLIC const CaptureFunction hext::TextBuiltin
extern
A CaptureFunction that returns the text of an HTML element.
The intent is to mimic jQuery's text(), IE's innerText property, or the DOM property textContent.
Parameters
node | A pointer to a GumboNode.
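HEXT_PUBLIC const int hext::version_major
extern
Major version number.
HEXT_PUBLIC const int hext::version_minor
extern
Minor version number.
HEXT_PUBLIC const int hext::version_patch
extern
Patch version number.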