markdown.inlinepatterns

In version 3.0, a new, more flexible inline processor was added: markdown.inlinepatterns.InlineProcessor. The original inline patterns, which inherit from markdown.inlinepatterns.Pattern or one of its children, are still supported, though users are encouraged to migrate.

The new InlineProcessor provides two major enhancements to Patterns:

  1. Inline Processors no longer need to match the entire block, so regular expressions no longer need to start with r'^(.*?)' and end with r'(.*?)$'. Matching only the relevant span runs faster. The returned Match object will only contain what is explicitly matched in the pattern, and extension pattern groups now start with m.group(1) rather than m.group(2).

  2. The handleMatch method now takes an additional argument, data, which is the entire block under analysis, not just the text matched by the pattern. The method now returns the element together with the start and end indexes, relative to data, of the text that the element replaces (usually m.start(0) and m.end(0)). If the boundaries are returned as None, the match is treated as not having taken place, and nothing in data is altered.

    This allows handling of more complex constructs than regular expressions can handle, e.g., matching nested brackets, and explicit control of the span “consumed” by the processor.
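
The two enhancements above can be sketched with a small custom processor. This is a minimal illustration rather than part of the library: the class name DelInlineProcessor, the --text-- syntax, the registered name 'del', and the priority 175 are all assumptions chosen for the example.

```python
import xml.etree.ElementTree as etree

import markdown
from markdown.inlinepatterns import InlineProcessor


class DelInlineProcessor(InlineProcessor):
    """Hypothetical processor: wrap --text-- in a <del> element."""

    def handleMatch(self, m, data):
        # `m` covers only the pattern itself, so the first user group is group(1).
        inner = m.group(1)
        if not inner.strip():
            # None boundaries reject the match; `data` is left unaltered.
            return None, None, None
        el = etree.Element('del')
        el.text = inner
        # Report the span of `data` that the new element replaces.
        return el, m.start(0), m.end(0)


md = markdown.Markdown()
# The name 'del' and the priority 175 are illustrative choices.
md.inlinePatterns.register(DelInlineProcessor(r'--(.+?)--', md), 'del', 175)

accepted = md.convert('some --deleted-- text')
md.reset()
rejected = md.convert('a -- -- b')
print(accepted)
print(rejected)
```

In this sketch the first input should come back with the marked text wrapped in a del element, while the second is rejected by handleMatch (the captured text is only whitespace) and passes through unchanged.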

Attributes:

  • NOIMG

    Match not an image. Partial regular expression which matches if not preceded by !.

  • BACKTICK_RE

    Match backtick quoted string (`e=f()` or ``e=f("`")``).

  • ESCAPE_RE

    Match a backslash escaped character (\< or \*).

  • EMPHASIS_RE

    Match emphasis with an asterisk (*emphasis*).

  • STRONG_RE

    Match strong with an asterisk (**strong**).

  • SMART_STRONG_RE

    Match strong with underscore while ignoring middle word underscores (__smart__strong__).

  • SMART_EMPHASIS_RE

    Match emphasis with underscore while ignoring middle word underscores (_smart_emphasis_).

  • SMART_STRONG_EM_RE

    Match strong emphasis with underscores (__strong _em__).

  • EM_STRONG_RE

    Match emphasis strong with asterisk (***strongem*** or ***em*strong**).

  • EM_STRONG2_RE

    Match emphasis strong with underscores (___emstrong___ or ___em_strong__).

  • STRONG_EM_RE

    Match strong emphasis with asterisk (***strong**em*).

  • STRONG_EM2_RE

    Match strong emphasis with underscores (___strong__em_).

  • STRONG_EM3_RE

    Match strong emphasis with asterisk (**strong*em***).

  • LINK_RE

    Match start of in-line link ([text](url) or [text](<url>) or [text](url "title")).

  • IMAGE_LINK_RE

    Match start of in-line image link (![alttxt](url) or ![alttxt](<url>)).

  • REFERENCE_RE

    Match start of reference link ([Label][3]).

  • IMAGE_REFERENCE_RE

    Match start of image reference (![alt text][2]).

  • NOT_STRONG_RE

    Match a stand-alone * or _.

  • AUTOLINK_RE

    Match an automatic link (<http://www.example.com>).

  • AUTOMAIL_RE

    Match an automatic email link (<me@example.com>).

  • HTML_RE

    Match an HTML tag (<...>).

  • ENTITY_RE

    Match an HTML entity (&#38; (decimal) or &#x26; (hex) or &amp; (named)).

  • LINE_BREAK_RE

    Match two spaces at end of line.
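
Each attribute above is a plain regular-expression string, so it can be tried out directly with the standard re module. The following sketch checks a few of them against sample strings modeled on the descriptions; the sample sentences are assumptions made for illustration, and the processors themselves compile these patterns with their own flags.

```python
import re

from markdown import inlinepatterns

# Sample inputs modeled on the examples in the attribute descriptions.
samples = {
    'AUTOLINK_RE': 'See <http://www.example.com> for details.',
    'AUTOMAIL_RE': 'Write to <me@example.com> today.',
    'ENTITY_RE': 'Fish &amp; chips',
}

matches = {}
for name, text in samples.items():
    pattern = getattr(inlinepatterns, name)  # each attribute is a regex string
    found = re.search(pattern, text)
    # group(0) is the full text matched by the pattern, if any.
    matches[name] = found.group(0) if found else None
    print(name, '->', matches[name])
```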

markdown.inlinepatterns.build_inlinepatterns(md: Markdown, **kwargs: Any) -> util.Registry[InlineProcessor]

Build the default set of inline patterns for Markdown.

The order in which processors and/or patterns are applied is very important. For example, if we first replaced http://.../ links with <a> tags and then tried to replace inline HTML, we would end up with a mess. So, we apply the expressions in the following order:

  • backticks and escaped characters have to be handled before everything else so that we can preempt any markdown patterns by escaping them;

  • then we handle the various types of links (auto-links must be handled before inline HTML);

  • then we handle inline HTML. At this point we will simply replace all inline HTML strings with a placeholder and add the actual HTML to a stash;

  • finally we apply strong, emphasis, etc.

Returns a Registry instance containing the following class instances with their assigned names and priorities.

| Class Instance | Name | Priority |
| --- | --- | --- |
| BacktickInlineProcessor(BACKTICK_RE) | backtick | 190 |
| EscapeInlineProcessor(ESCAPE_RE) | escape | 180 |
| ReferenceInlineProcessor(REFERENCE_RE) | reference | 170 |
| LinkInlineProcessor(LINK_RE) | link | 160 |
| ImageInlineProcessor(IMAGE_LINK_RE) | image_link | 150 |
| ImageReferenceInlineProcessor(IMAGE_REFERENCE_RE) | image_reference | 140 |
| ShortReferenceInlineProcessor(REFERENCE_RE) | short_reference | 130 |
| ShortImageReferenceInlineProcessor(IMAGE_REFERENCE_RE) | short_image_ref | 125 |
| AutolinkInlineProcessor(AUTOLINK_RE) | autolink | 120 |
| AutomailInlineProcessor(AUTOMAIL_RE) | automail | 110 |
| SubstituteTagInlineProcessor(LINE_BREAK_RE) | linebreak | 100 |
| HtmlInlineProcessor(HTML_RE) | html | 90 |
| HtmlInlineProcessor(ENTITY_RE) | entity | 80 |
| SimpleTextInlineProcessor(NOT_STRONG_RE) | not_strong | 70 |
| AsteriskProcessor("\*") | em_strong | 60 |
| UnderscoreProcessor("_") | em_strong2 | 50 |
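
The ordering described above can be inspected on the returned registry. The sketch below assumes a default Markdown instance with no extensions loaded and looks up only names listed in the table; a lower index means the processor is applied earlier.

```python
import markdown
from markdown.inlinepatterns import BacktickInlineProcessor, build_inlinepatterns

md = markdown.Markdown()
registry = build_inlinepatterns(md)

# Look up a processor by its registered name.
backtick = registry['backtick']
print(type(backtick).__name__)

# Items are sorted by descending priority, so a lower index runs earlier.
names = ('backtick', 'escape', 'autolink', 'html', 'em_strong')
order = {name: registry.get_index_for_name(name) for name in names}
print(order)
```

An extension would typically adjust this collection through md.inlinePatterns.register(...) or md.inlinePatterns.deregister(...) rather than by calling build_inlinepatterns itself.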

markdown.inlinepatterns.NOIMG module-attribute

Match not an image. Partial regular expression which matches if not preceded by !.

Defined Value:

NOIMG = r'(?<!\!)'
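
Because NOIMG is only a partial expression (a negative lookbehind), it is meant to be prepended to another pattern. A minimal sketch using only the standard re module, with the value copied from the definition above; the name link_start and the sample text are assumptions for illustration.

```python
import re

NOIMG = r'(?<!\!)'  # value copied from the definition above

# Hypothetical composite pattern: an opening bracket that does not start an image.
link_start = re.compile(NOIMG + r'\[')

text = 'An ![image](a.png) and a [link](b.html).'
match = link_start.search(text)

# The bracket after "!" is skipped; the match lands on the plain link.
remainder = text[match.start():]
print(remainder)
```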