libhext: C++ Library Documentation  1.0.9-c8ac8b6
hext Namespace Reference

Classes

class  AppendPipe
 Appends a given string to a string.

class  AttributeCapture
 Captures an HTML Element's attribute.

class  AttributeCountMatch
 Matches HTML elements that have a certain number of HTML attributes.

class  AttributeMatch
 Matches HTML elements having an HTML attribute with a certain name and, optionally, whose value is matched by a ValueTest.

class  BeginsWithTest
 Tests whether a string begins with a given literal.

class  Capture
 Abstract base for every Capture.

class  CasePipe
 Changes the case of a string. Changes to lower case by default.

class  ChildCountMatch
 Matches HTML elements that have a certain number of children of type element (excluding text nodes, document nodes and others).

class  Cloneable
 Curiously recurring template pattern that extends a base class to provide a virtual method Cloneable::clone().

class  CollapseWsPipe
 Removes whitespace from beginning and end and collapses multiple whitespace to a single space.

class  ContainsTest
 Tests whether a string contains a given literal.

class  ContainsWordsTest
 Tests whether a string contains all given words.

class  EndsWithTest
 Tests whether a string ends with a given literal.

class  EqualsTest
 Tests whether a string equals a given literal.

class  FunctionCapture
 Captures the result of applying a function to an HTML node.

class  FunctionMatch
 Matches if the result of applying a given MatchFunction to an HTML node returns true.

class  FunctionValueMatch
 Matches if the result of applying a given CaptureFunction to an HTML node passes a ValueTest.

class  Html
 A RAII wrapper for Gumbo.

class  Match
 Abstract base for every Match.

class  MaxSearchError
 The exception that is thrown when max_searches reaches zero while calling Rule::extract.

class  NegateMatch
 Matches HTML nodes for which every given Match returns false.

class  NegateTest
 Negates the result of another ValueTest.

class  NthChildMatch
 Matches HTML nodes having a certain position within their parent HTML element.

class  OnlyChildMatch
 Matches HTML nodes that are the only child of their parent HTML element.

class  PrependPipe
 Prepends a given string to a string.

class  RegexPipe
 Filters a string according to a given regex.

class  RegexReplacePipe
 Replaces a string within a string according to a given regex.

class  RegexTest
 Tests whether another string matches a given regex.

class  Rule
 Extracts values from HTML.

class  StringPipe
 Abstract base for every StringPipe.

class  SyntaxError
 The exception that is thrown when parsing invalid hext.

class  TrimPipe
 Trims characters from the beginning and end of a string.

class  TypeRegexMatch
 Matches the name of an HTML element against a regular expression.

class  ValueTest
 Abstract base for every ValueTest.


Typedefs

using CaptureFunction = std::function< std::string(const GumboNode *)>
 A type of std::function that receives an HTML element and returns a string.

using MatchFunction = std::function< bool(const GumboNode *)>
 A type of std::function that receives an HTML element and returns a bool.

using ResultPair = std::pair< std::string, std::string >
 A string-pair containing a name and a value.

using ResultMap = std::multimap< ResultPair::first_type, ResultPair::second_type >
 A multimap containing the values produced by capturing.

using Result = std::vector< ResultMap >
 A vector containing ResultMap.
 

Enumerations

enum class  HtmlTag : int {
  HTML = GUMBO_TAG_HTML , HEAD = GUMBO_TAG_HEAD , TITLE = GUMBO_TAG_TITLE , BASE = GUMBO_TAG_BASE ,
  LINK = GUMBO_TAG_LINK , META = GUMBO_TAG_META , STYLE = GUMBO_TAG_STYLE , SCRIPT = GUMBO_TAG_SCRIPT ,
  NOSCRIPT = GUMBO_TAG_NOSCRIPT , TEMPLATE = GUMBO_TAG_TEMPLATE , BODY = GUMBO_TAG_BODY , ARTICLE = GUMBO_TAG_ARTICLE ,
  SECTION = GUMBO_TAG_SECTION , NAV = GUMBO_TAG_NAV , ASIDE = GUMBO_TAG_ASIDE , H1 = GUMBO_TAG_H1 ,
  H2 = GUMBO_TAG_H2 , H3 = GUMBO_TAG_H3 , H4 = GUMBO_TAG_H4 , H5 = GUMBO_TAG_H5 ,
  H6 = GUMBO_TAG_H6 , HGROUP = GUMBO_TAG_HGROUP , HEADER = GUMBO_TAG_HEADER , FOOTER = GUMBO_TAG_FOOTER ,
  ADDRESS = GUMBO_TAG_ADDRESS , P = GUMBO_TAG_P , HR = GUMBO_TAG_HR , PRE = GUMBO_TAG_PRE ,
  BLOCKQUOTE = GUMBO_TAG_BLOCKQUOTE , OL = GUMBO_TAG_OL , UL = GUMBO_TAG_UL , LI = GUMBO_TAG_LI ,
  DL = GUMBO_TAG_DL , DT = GUMBO_TAG_DT , DD = GUMBO_TAG_DD , FIGURE = GUMBO_TAG_FIGURE ,
  FIGCAPTION = GUMBO_TAG_FIGCAPTION , MAIN = GUMBO_TAG_MAIN , DIV = GUMBO_TAG_DIV , A = GUMBO_TAG_A ,
  EM = GUMBO_TAG_EM , STRONG = GUMBO_TAG_STRONG , SMALL = GUMBO_TAG_SMALL , S = GUMBO_TAG_S ,
  CITE = GUMBO_TAG_CITE , Q = GUMBO_TAG_Q , DFN = GUMBO_TAG_DFN , ABBR = GUMBO_TAG_ABBR ,
  DATA = GUMBO_TAG_DATA , TIME = GUMBO_TAG_TIME , CODE = GUMBO_TAG_CODE , VAR = GUMBO_TAG_VAR ,
  SAMP = GUMBO_TAG_SAMP , KBD = GUMBO_TAG_KBD , SUB = GUMBO_TAG_SUB , SUP = GUMBO_TAG_SUP ,
  I = GUMBO_TAG_I , B = GUMBO_TAG_B , U = GUMBO_TAG_U , MARK = GUMBO_TAG_MARK ,
  RUBY = GUMBO_TAG_RUBY , RT = GUMBO_TAG_RT , RP = GUMBO_TAG_RP , BDI = GUMBO_TAG_BDI ,
  BDO = GUMBO_TAG_BDO , SPAN = GUMBO_TAG_SPAN , BR = GUMBO_TAG_BR , WBR = GUMBO_TAG_WBR ,
  INS = GUMBO_TAG_INS , DEL = GUMBO_TAG_DEL , IMAGE = GUMBO_TAG_IMAGE , IMG = GUMBO_TAG_IMG ,
  IFRAME = GUMBO_TAG_IFRAME , EMBED = GUMBO_TAG_EMBED , OBJECT = GUMBO_TAG_OBJECT , PARAM = GUMBO_TAG_PARAM ,
  VIDEO = GUMBO_TAG_VIDEO , AUDIO = GUMBO_TAG_AUDIO , SOURCE = GUMBO_TAG_SOURCE , TRACK = GUMBO_TAG_TRACK ,
  CANVAS = GUMBO_TAG_CANVAS , MAP = GUMBO_TAG_MAP , AREA = GUMBO_TAG_AREA , MATH = GUMBO_TAG_MATH ,
  MI = GUMBO_TAG_MI , MO = GUMBO_TAG_MO , MN = GUMBO_TAG_MN , MS = GUMBO_TAG_MS ,
  MTEXT = GUMBO_TAG_MTEXT , MGLYPH = GUMBO_TAG_MGLYPH , MALIGNMARK = GUMBO_TAG_MALIGNMARK , ANNOTATION_XML = GUMBO_TAG_ANNOTATION_XML ,
  SVG = GUMBO_TAG_SVG , FOREIGNOBJECT = GUMBO_TAG_FOREIGNOBJECT , DESC = GUMBO_TAG_DESC , TABLE = GUMBO_TAG_TABLE ,
  CAPTION = GUMBO_TAG_CAPTION , COLGROUP = GUMBO_TAG_COLGROUP , COL = GUMBO_TAG_COL , TBODY = GUMBO_TAG_TBODY ,
  THEAD = GUMBO_TAG_THEAD , TFOOT = GUMBO_TAG_TFOOT , TR = GUMBO_TAG_TR , TD = GUMBO_TAG_TD ,
  TH = GUMBO_TAG_TH , FORM = GUMBO_TAG_FORM , FIELDSET = GUMBO_TAG_FIELDSET , LEGEND = GUMBO_TAG_LEGEND ,
  LABEL = GUMBO_TAG_LABEL , INPUT = GUMBO_TAG_INPUT , BUTTON = GUMBO_TAG_BUTTON , SELECT = GUMBO_TAG_SELECT ,
  DATALIST = GUMBO_TAG_DATALIST , OPTGROUP = GUMBO_TAG_OPTGROUP , OPTION = GUMBO_TAG_OPTION , TEXTAREA = GUMBO_TAG_TEXTAREA ,
  KEYGEN = GUMBO_TAG_KEYGEN , OUTPUT = GUMBO_TAG_OUTPUT , PROGRESS = GUMBO_TAG_PROGRESS , METER = GUMBO_TAG_METER ,
  DETAILS = GUMBO_TAG_DETAILS , SUMMARY = GUMBO_TAG_SUMMARY , MENU = GUMBO_TAG_MENU , MENUITEM = GUMBO_TAG_MENUITEM ,
  APPLET = GUMBO_TAG_APPLET , ACRONYM = GUMBO_TAG_ACRONYM , BGSOUND = GUMBO_TAG_BGSOUND , DIR = GUMBO_TAG_DIR ,
  FRAME = GUMBO_TAG_FRAME , FRAMESET = GUMBO_TAG_FRAMESET , NOFRAMES = GUMBO_TAG_NOFRAMES , ISINDEX = GUMBO_TAG_ISINDEX ,
  LISTING = GUMBO_TAG_LISTING , XMP = GUMBO_TAG_XMP , NEXTID = GUMBO_TAG_NEXTID , NOEMBED = GUMBO_TAG_NOEMBED ,
  PLAINTEXT = GUMBO_TAG_PLAINTEXT , RB = GUMBO_TAG_RB , STRIKE = GUMBO_TAG_STRIKE , BASEFONT = GUMBO_TAG_BASEFONT ,
  BIG = GUMBO_TAG_BIG , BLINK = GUMBO_TAG_BLINK , CENTER = GUMBO_TAG_CENTER , FONT = GUMBO_TAG_FONT ,
  MARQUEE = GUMBO_TAG_MARQUEE , MULTICOL = GUMBO_TAG_MULTICOL , NOBR = GUMBO_TAG_NOBR , SPACER = GUMBO_TAG_SPACER ,
  TT = GUMBO_TAG_TT , RTC = GUMBO_TAG_RTC , UNKNOWN = GUMBO_TAG_UNKNOWN , ANY = 512
}
 An enum containing all valid HTML tags.
 

Functions

HEXT_PUBLIC NthChildMatch::Option operator| (NthChildMatch::Option left, NthChildMatch::Option right) noexcept
 Applies Bitwise-OR to NthChildMatch::Option.

HEXT_PUBLIC NthChildMatch::Option operator& (NthChildMatch::Option left, NthChildMatch::Option right) noexcept
 Applies Bitwise-AND to NthChildMatch::Option.

HEXT_PUBLIC OnlyChildMatch::Option operator| (OnlyChildMatch::Option left, OnlyChildMatch::Option right) noexcept
 Applies Bitwise-OR to OnlyChildMatch::Option.

HEXT_PUBLIC OnlyChildMatch::Option operator& (OnlyChildMatch::Option left, OnlyChildMatch::Option right) noexcept
 Applies Bitwise-AND to OnlyChildMatch::Option.

HEXT_PUBLIC Rule ParseHext (const char *hext)
 Parses a null-terminated string containing hext rule definitions.

HEXT_PUBLIC Rule ParseHext (const char *hext, std::size_t size)
 Parses a buffer containing hext rule definitions.
 

Variables

HEXT_PUBLIC const CaptureFunction TextBuiltin
 A CaptureFunction that returns the text of an HTML element.

HEXT_PUBLIC const CaptureFunction InnerHtmlBuiltin
 A CaptureFunction that returns the inner HTML of an HTML element.

HEXT_PUBLIC const CaptureFunction StripTagsBuiltin
 A CaptureFunction that returns the inner HTML of an HTML element without tags.

HEXT_PUBLIC const int version_major
 Major version number.

HEXT_PUBLIC const int version_minor
 Minor version number.

HEXT_PUBLIC const int version_patch
 Patch version number.
 

Typedef Documentation

◆ CaptureFunction

using hext::CaptureFunction = std::function<std::string (const GumboNode *)>

A type of std::function that receives an HTML element and returns a string.

Definition at line 31 of file CaptureFunction.h.

◆ MatchFunction

using hext::MatchFunction = std::function<bool (const GumboNode *)>

A type of std::function that receives an HTML element and returns a bool.

Definition at line 30 of file MatchFunction.h.

◆ Result

using hext::Result = std::vector<ResultMap>

A vector containing ResultMap.

Definition at line 45 of file Result.h.

◆ ResultMap

using hext::ResultMap = std::multimap<ResultPair::first_type, ResultPair::second_type>

A multimap containing the values produced by capturing.

Why std::multimap?

  • The value of a Capture should be accessible by key => associative
  • There may be Captures with duplicate names => multi key
  • The order of Captures should be predictable => sorted

Definition at line 40 of file Result.h.
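
The three properties listed above can be seen with the standard library alone. The following is a minimal sketch, not taken from libhext: the aliases are redeclared locally so that the snippet compiles without the hext headers, and the helper names values_for and make_example_map, as well as the capture names and values, are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Local stand-ins that mirror the aliases documented on this page.
using ResultPair = std::pair<std::string, std::string>;
using ResultMap =
    std::multimap<ResultPair::first_type, ResultPair::second_type>;

// Collects every value stored under `name` (associative lookup by key).
// Values sharing a key are returned in the order they were inserted.
std::vector<std::string> values_for(const ResultMap& map,
                                    const std::string& name)
{
  std::vector<std::string> values;
  auto range = map.equal_range(name);
  for( auto it = range.first; it != range.second; ++it )
    values.push_back(it->second);
  return values;
}

// Builds a hypothetical ResultMap in which the name "link" occurs twice.
ResultMap make_example_map()
{
  ResultMap map;
  map.insert(ResultPair("link", "/first"));
  map.insert(ResultPair("title", "Example"));
  map.insert(ResultPair("link", "/second"));
  return map;
}
```

Iterating the map returned by make_example_map() visits both "link" entries before "title", because keys are kept sorted, and values_for(map, "link") yields both values. A std::map would have kept only one "link" entry, and a std::unordered_multimap would give no ordering guarantee.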

◆ ResultPair

using hext::ResultPair = std::pair<std::string, std::string>

A string-pair containing a name and a value.

A ResultPair is produced by a Capture.

Definition at line 32 of file Result.h.

Enumeration Type Documentation

◆ HtmlTag

enum class hext::HtmlTag : int

An enum containing all valid HTML tags.

With the exception of HtmlTag::ANY, every HtmlTag can be cast to its GumboTag counterpart (same int value).
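
A scoped enum does not convert implicitly, so the cast has to be written out, and HtmlTag::ANY needs a guard because it has no GumboTag counterpart. The sketch below illustrates the pattern with stand-in types so that it compiles without gumbo.h or the hext headers: FakeGumboTag, FakeHtmlTag, their numeric values (other than ANY = 512) and the helper to_gumbo_tag are all hypothetical.

```cpp
#include <cassert>

// Stand-in for Gumbo's C enum. The real GumboTag comes from gumbo.h and
// its numeric values differ from the ones chosen here.
enum FakeGumboTag {
  FAKE_TAG_HTML = 0,
  FAKE_TAG_HEAD = 1,
  FAKE_TAG_UNKNOWN = 150
};

// Stand-in for hext::HtmlTag, reduced to a few enumerators.
enum class FakeHtmlTag : int {
  HTML = FAKE_TAG_HTML,
  HEAD = FAKE_TAG_HEAD,
  UNKNOWN = FAKE_TAG_UNKNOWN,
  ANY = 512 // no counterpart in the C enum
};

// Writes the matching C enum value to `out` and returns true.
// Returns false and leaves `out` untouched for ANY.
bool to_gumbo_tag(FakeHtmlTag tag, FakeGumboTag& out)
{
  if( tag == FakeHtmlTag::ANY )
    return false;
  out = static_cast<FakeGumboTag>(static_cast<int>(tag));
  return true;
}
```

With the real types, the same shape would apply to a cast from hext::HtmlTag to GumboTag, under the assumption stated above that the int values match.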

Enumerator
HTML 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/html

HEAD 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/head

TITLE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/title

BASE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/base

LINK 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/link

META 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta

STYLE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/style

SCRIPT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script

NOSCRIPT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/noscript

TEMPLATE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/template

BODY 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/body

ARTICLE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/article

SECTION 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/section

NAV 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/nav

ASIDE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/aside

H1 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/h1

H2 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/h2

H3 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/h3

H4 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/h4

H5 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/h5

H6 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/h6

HGROUP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/hgroup

HEADER 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/header

FOOTER 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/footer

ADDRESS 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/address

P 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p

HR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/hr

PRE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/pre

BLOCKQUOTE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/blockquote

OL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ol

UL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ul

LI 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/li

DL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dl

DT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dt

DD 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dd

FIGURE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/figure

FIGCAPTION 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/figcaption

MAIN 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/main

DIV 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/div

A 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a

EM 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/em

STRONG 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/strong

SMALL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/small

S 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/s

CITE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/cite

Q 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/q

DFN 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dfn

ABBR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/abbr

DATA 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/data

TIME 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time

CODE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/code

VAR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/var

SAMP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/samp

KBD 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/kbd

SUB 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/sub

SUP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/sup

I 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/i

B 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/b

U 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/u

MARK 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/mark

RUBY 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ruby

RT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/rt

RP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/rp

BDI 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bdi

BDO 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bdo

SPAN 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/span

BR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/br

WBR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/wbr

INS 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ins

DEL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/del

IMAGE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/image

IMG 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img

IFRAME 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe

EMBED 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/embed

OBJECT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/object

PARAM 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/param

VIDEO 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/video

AUDIO 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/audio

SOURCE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/source

TRACK 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/track

CANVAS 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/canvas

MAP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/map

AREA 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/area

MATH 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/math

MI 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/mi

MO 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/mo

MN 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/mn

MS 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/ms

MTEXT 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/mtext

MGLYPH 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/mglyph

MALIGNMARK 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element

ANNOTATION_XML 

https://developer.mozilla.org/en-US/docs/Web/MathML/Element/semantics

SVG 

https://developer.mozilla.org/en-US/docs/Web/SVG/Element/svg

FOREIGNOBJECT 

https://developer.mozilla.org/en-US/docs/Web/SVG/Element/foreignObject

DESC 

https://developer.mozilla.org/en-US/docs/Web/SVG/Element/desc

TABLE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/table

CAPTION 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/caption

COLGROUP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/colgroup

COL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/col

TBODY 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/tbody

THEAD 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/thead

TFOOT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/tfoot

TR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/tr

TD 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td

TH 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th

FORM 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/form

FIELDSET 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/fieldset

LEGEND 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/legend

LABEL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/label

INPUT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input

BUTTON 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/button

SELECT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/select

DATALIST 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/datalist

OPTGROUP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/optgroup

OPTION 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/option

TEXTAREA 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/textarea

KEYGEN 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/keygen

OUTPUT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/output

PROGRESS 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/progress

METER 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meter

DETAILS 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/details

SUMMARY 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/summary

MENU 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/menu

MENUITEM 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/menuitem

APPLET 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/applet

ACRONYM 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/acronym

BGSOUND 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bgsound

DIR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dir

FRAME 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/frame

FRAMESET 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/frameset

NOFRAMES 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/noframes

ISINDEX 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/isindex

LISTING 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/listing

XMP 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/xmp

NEXTID 

https://developer.mozilla.org/en-US/docs/Web/API/HTMLUnknownElement

NOEMBED 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/noembed

PLAINTEXT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/plaintext

RB 

https://developer.mozilla.org/en-US/docs/Web/API/HTMLUnknownElement

STRIKE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/strike

BASEFONT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/basefont

BIG 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/big

BLINK 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/blink

CENTER 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/center

FONT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/font

MARQUEE 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/marquee

MULTICOL 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/multicol

NOBR 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/nobr

SPACER 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/spacer

TT 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/tt

RTC 

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/rtc

UNKNOWN 

Unknown (i.e. non-standard) tag.

ANY 

Any html tag.

Definition at line 30 of file HtmlTag.h.

Function Documentation

◆ operator&() [1/2]

HEXT_PUBLIC NthChildMatch::Option hext::operator& ( NthChildMatch::Option  left,
NthChildMatch::Option  right 
)
inline noexcept

Applies Bitwise-AND to NthChildMatch::Option.

Definition at line 169 of file NthChildMatch.h.

◆ operator&() [2/2]

HEXT_PUBLIC OnlyChildMatch::Option hext::operator& ( OnlyChildMatch::Option  left,
OnlyChildMatch::Option  right 
)
inline noexcept

Applies Bitwise-AND to OnlyChildMatch::Option.

Definition at line 76 of file OnlyChildMatch.h.

◆ operator|() [1/2]

HEXT_PUBLIC NthChildMatch::Option hext::operator| ( NthChildMatch::Option  left,
NthChildMatch::Option  right 
)
inline noexcept

Applies Bitwise-OR to NthChildMatch::Option.

Definition at line 160 of file NthChildMatch.h.

◆ operator|() [2/2]

HEXT_PUBLIC OnlyChildMatch::Option hext::operator| ( OnlyChildMatch::Option  left,
OnlyChildMatch::Option  right 
)
inline noexcept

Applies Bitwise-OR to OnlyChildMatch::Option.

Definition at line 67 of file OnlyChildMatch.h.
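
Bitwise operators have to be declared explicitly for a scoped enum before its enumerators can be combined as flags. The sketch below shows the general pattern on a stand-in enum; it is not the library's implementation, and the enum Option, its enumerators and the helper has_option are hypothetical names that do not reflect the actual members of NthChildMatch::Option or OnlyChildMatch::Option.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical flag enum; each enumerator occupies its own bit.
enum class Option : unsigned {
  None = 0,
  First = 1 << 0,
  Second = 1 << 1
};

// Bitwise-OR: the result has every bit that is set in either operand.
inline Option operator|(Option left, Option right) noexcept
{
  using T = std::underlying_type<Option>::type;
  return static_cast<Option>(static_cast<T>(left) | static_cast<T>(right));
}

// Bitwise-AND: the result has only the bits set in both operands.
inline Option operator&(Option left, Option right) noexcept
{
  using T = std::underlying_type<Option>::type;
  return static_cast<Option>(static_cast<T>(left) & static_cast<T>(right));
}

// True if every bit of `flag` is also set in `options`.
inline bool has_option(Option options, Option flag) noexcept
{
  return (options & flag) == flag;
}
```

Flags are combined with operator| when they are passed in and tested with operator& when they are read back, as has_option does.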

◆ ParseHext() [1/2]

HEXT_PUBLIC Rule hext::ParseHext ( const char *  hext)

Parses a null-terminated string containing hext rule definitions.

Throws SyntaxError with a detailed error message on invalid input.

Example:
try {
  Rule rule = ParseHext("<a href:link />");
} catch( SyntaxError& e ) {
  // e.what() will contain a detailed error message.
}
Exceptions
SyntaxError
Parameters
hext: A null-terminated string containing hext rule definitions.
Returns
The parsed Rule.

◆ ParseHext() [2/2]

HEXT_PUBLIC Rule hext::ParseHext ( const char *  hext,
std::size_t  size 
)

Parses a buffer containing hext rule definitions.

Throws SyntaxError with a detailed error message on invalid input.

Example:
std::string hext_str("<a href:link />");
try {
  Rule rule = ParseHext(hext_str.c_str(), hext_str.size());
  // ... do something with rule ...
} catch( SyntaxError& e ) {
  // e.what() will contain a detailed error message.
}
Exceptions
SyntaxError
Parameters
hext: A string containing hext rule definitions.
size: The length of the string.
Returns
The parsed Rule.

Variable Documentation

◆ InnerHtmlBuiltin

HEXT_PUBLIC const CaptureFunction hext::InnerHtmlBuiltin
extern

A CaptureFunction that returns the inner HTML of an HTML element.

The intent is to mimic the DOM's innerHTML property.

Example:
GumboNode * node = ...; // <div> like<div>a</div>rolling stone</div>
assert(InnerHtmlBuiltin(node) == " like<div>a</div>rolling stone");
Parameters
node: A pointer to a GumboNode.
Returns
A string containing the HTML element's inner HTML.

◆ StripTagsBuiltin

HEXT_PUBLIC const CaptureFunction hext::StripTagsBuiltin
extern

A CaptureFunction that returns the inner HTML of an HTML element without tags.

Example:
GumboNode * node = ...; // <div> like<div>a</div>rolling stone</div>
assert(StripTagsBuiltin(node) == " likearolling stone");
Parameters
node: A pointer to a GumboNode.
Returns
A string containing the HTML element's inner HTML without tags.

◆ TextBuiltin

HEXT_PUBLIC const CaptureFunction hext::TextBuiltin
extern

A CaptureFunction that returns the text of an HTML element.

The intent is to mimic jQuery's text() function, IE's innerText property, or the DOM's textContent property.

Example:
GumboNode * node = ...; // <div> like<div>a</div>rolling stone</div>
assert(TextBuiltin(node) == "like a rolling stone");
Parameters
node: A pointer to a GumboNode.
Returns
A string containing the HTML element's text.

◆ version_major

HEXT_PUBLIC const int hext::version_major
extern

Major version number.

◆ version_minor

HEXT_PUBLIC const int hext::version_minor
extern

Minor version number.

◆ version_patch

HEXT_PUBLIC const int hext::version_patch
extern

Patch version number.