hissp.munger module

Lissp’s symbol munger.

Encodes Lissp symbols with special characters into valid, human-readable (if ugly) Python identifiers, using NFKC normalization and x-codes.

E.g. *FOO-BAR* becomes xSTAR_FOOxH_BARxSTAR_.

X-codes are written in upper case and wrapped in an x and _. This format contains an underscore and both lower-case and upper-case letters, which makes it distinct from the standard Python naming conventions: lower_case_with_underscores, UPPER_CASE_WITH_UNDERSCORES, and CapWords. That distinctness makes the x-encoding (but not the normalization) reversible in the usual cases. The format also cannot introduce a leading underscore, which can have special meaning in Python.

Characters can be encoded in one of three ways: short names, Unicode names, and ordinals.

The demunge function will accept any of these encodings, while the munge function will prioritize short names, then fall back to Unicode names, then fall back to ordinals.

Short names are given in the TO_NAME table in this module.

Any spaces in the Unicode names are replaced with an x and any hyphens are replaced with an h. (Unicode names are in all caps and these substitutions are lower-case.)

Ordinals are given in base 10.
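The fallback order described above can be sketched as follows. This is a minimal illustration, not the module's actual source: `SHORT_NAMES` is a three-entry excerpt of the documented table, and `encode_char` is a hypothetical helper name. Only the standard-library `unicodedata` module is used.

```python
import unicodedata

# Excerpt of the short-name table (assumption: the real table is TO_NAME).
SHORT_NAMES = {"*": "xSTAR_", "-": "xH_", "?": "xQUERY_"}


def encode_char(c: str) -> str:
    """Hypothetical helper: short name, else Unicode name, else ordinal."""
    if c in SHORT_NAMES:
        return SHORT_NAMES[c]
    try:
        name = unicodedata.name(c)
    except ValueError:
        # The character has no Unicode name: use its base-10 ordinal.
        return f"x{ord(c)}_"
    # Spaces become a lower-case x and hyphens a lower-case h.
    return "x" + name.replace(" ", "x").replace("-", "h") + "_"


print(encode_char("*"))       # short name
print(encode_char("~"))       # Unicode name: TILDE
print(encode_char("\u2011"))  # Unicode name: NON-BREAKING HYPHEN
print(encode_char("\x00"))    # ordinal fallback
```

Under these assumptions the four calls print `xSTAR_`, `xTILDE_`, `xNONhBREAKINGxHYPHEN_`, and `x0_`.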

hissp.munger.munge(s: str) → str

Lissp’s symbol munger.

Encodes Lissp symbols with special characters into valid, human-readable (if ugly) Python identifiers, using NFKC normalization and x-codes.

Inputs that begin with : are assumed to be control words and returned unmodified. Full stops are handled separately, as those are meaningful to Hissp.
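The behavior described here might be approximated by the sketch below. It is an assumption-laden illustration rather than the real implementation: `sketch_munge`, `munge_part`, and `encode_char` are hypothetical names, `SHORT_NAMES` is only an excerpt of the short-name table, and handling full stops by splitting on them is one plausible reading of "handled separately".

```python
import unicodedata

# Excerpt of the short-name table (assumption: the real table is TO_NAME).
SHORT_NAMES = {"*": "xSTAR_", "-": "xH_", "+": "xPLUS_"}


def encode_char(c: str) -> str:
    """Hypothetical helper: short name, else Unicode name, else ordinal."""
    if c in SHORT_NAMES:
        return SHORT_NAMES[c]
    try:
        name = unicodedata.name(c)
    except ValueError:
        return f"x{ord(c)}_"
    return "x" + name.replace(" ", "x").replace("-", "h") + "_"


def munge_part(part: str) -> str:
    """Normalize one dot-free part and x-encode whatever is not identifier-safe."""
    part = unicodedata.normalize("NFKC", part)
    if not part or part.isidentifier():
        return part
    # A character is safe if it may appear after the first position of an identifier.
    out = "".join(c if ("x" + c).isidentifier() else encode_char(c) for c in part)
    if not out.isidentifier():
        # The first character cannot start an identifier (e.g. a digit): encode it too.
        out = encode_char(out[0]) + out[1:]
    return out


def sketch_munge(s: str) -> str:
    if s.startswith(":"):
        return s  # control word: returned unmodified
    # Full stops are kept as-is; each part between them is munged on its own.
    return ".".join(munge_part(p) for p in s.split("."))


print(sketch_munge("*FOO-BAR*"))
print(sketch_munge(":foo-bar"))
print(sketch_munge("foo.bar-baz"))
```

With this sketch, `*FOO-BAR*` yields `xSTAR_FOOxH_BARxSTAR_`, the control word `:foo-bar` comes back unchanged, and `foo.bar-baz` yields `foo.barxH_baz`.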

hissp.munger.TO_NAME = {'!': 'xBANG_', '"': 'x2QUOTE_', '#': 'xHASH_', '$': 'xDOLR_', '%': 'xPCENT_', '&': 'xET_', "'": 'x1QUOTE_', '(': 'xPAREN_', ')': 'xTHESES_', '*': 'xSTAR_', '+': 'xPLUS_', '-': 'xH_', '/': 'xSLASH_', ';': 'xSCOLON_', '<': 'xLT_', '=': 'xEQ_', '>': 'xGT_', '?': 'xQUERY_', '@': 'xAT_', '[': 'xSQUARE_', '\\': 'xBSLASH_', ']': 'xBRACKETS_', '^': 'xCARET_', '`': 'xGRAVE_', '{': 'xCURLY_', '|': 'xBAR_', '}': 'xBRACES_'}

Shorter names for X-encoding.

hissp.munger.x_encode(c: str) → str

Converts a character to its short x-encoding, unless it’s already valid in a Python identifier.

hissp.munger.force_x_encode(c: str) → str

Converts a character to its x-encoding, even if it’s valid in a Python identifier.

hissp.munger.LOOKUP_NAME = {'x1QUOTE_': "'", 'x2QUOTE_': '"', 'xAT_': '@', 'xBANG_': '!', 'xBAR_': '|', 'xBRACES_': '}', 'xBRACKETS_': ']', 'xBSLASH_': '\\', 'xCARET_': '^', 'xCURLY_': '{', 'xDOLR_': '$', 'xEQ_': '=', 'xET_': '&', 'xGRAVE_': '`', 'xGT_': '>', 'xHASH_': '#', 'xH_': '-', 'xLT_': '<', 'xPAREN_': '(', 'xPCENT_': '%', 'xPLUS_': '+', 'xQUERY_': '?', 'xSCOLON_': ';', 'xSLASH_': '/', 'xSQUARE_': '[', 'xSTAR_': '*', 'xTHESES_': ')'}

The inverse of TO_NAME.

hissp.munger.demunge(s: str) → str

The inverse of munge. Decodes any x-codes into characters.
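As a rough illustration of decoding, the sketch below finds x-codes with a regular expression and tries each of the three encodings in turn. The pattern, the `LOOKUP` excerpt, and the names `sketch_demunge` and `decode_match` are assumptions made for this example and are not taken from the module's source.

```python
import re
import unicodedata

# Excerpt of the inverse table (assumption: the real table is LOOKUP_NAME).
LOOKUP = {"xSTAR_": "*", "xH_": "-", "xQUERY_": "?"}

# An x, then digits or upper-case letters (plus the lower-case x and h
# substitutions), then an underscore.
X_CODE = re.compile(r"x([0-9A-Zhx]+?)_")


def decode_match(match: re.Match) -> str:
    code, body = match.group(0), match.group(1)
    if code in LOOKUP:
        return LOOKUP[code]  # short name
    try:
        if body.isdigit():
            return chr(int(body))  # base-10 ordinal
        # Undo the substitutions: x was a space, h was a hyphen.
        return unicodedata.lookup(body.replace("x", " ").replace("h", "-"))
    except (KeyError, ValueError, OverflowError):
        return code  # not a recognizable x-code: leave the text alone


def sketch_demunge(s: str) -> str:
    return X_CODE.sub(decode_match, s)


print(sketch_demunge("xSTAR_FOOxH_BARxSTAR_"))
print(sketch_demunge("max_value"))
```

Here the first call recovers `*FOO-BAR*`, while an ordinary name such as `max_value` contains no x-code and passes through unchanged.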