Laurence Tennant

Unicode Shift Cipher

Security by obscurity isn’t security, but it can make for a confounding layer on top of strong cryptography. Unfamilar characters like the Korean Hangul or Chinese Hanzi tend to be filtered out by people who can’t read them, which presents a fun opportunity to hide information in plain sight.

The Unicode Shift Cipher converts plaintext in one language to ciphertext that appears to be in another language. By default, an English message is converted to Unicode code points and then shifted onto the Chinese Unicode block. A random multiple of the plaintext Unicode code point range is added in order to create variation in the ciphertext, which modulo arithmetic can reverse. This works best when the ciphertext code point range is much larger than the plaintext code point range.

The code is faciliated by Python 3’s excellent Unicode support.

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import random

code_points = {
    'english': (32, 126),
    'burmese': (1024, 1279),
    'arabic': (1536, 1791),
    'canadian_aboriginal': (5120, 5759),
    'chinese': (19968, 40869),
    'korean': (44032, 55216),
}


def unicodeCipher(
    message,
    mode,
    plaintext=(32, 126),
    ciphertext=(19968, 40869),
    ):
    """Unicode shift cipher
    Args:
        message (str): The plaintext to be enciphered.
        mode (str): 'e' for encryption, 'd' for encryption.
        plaintext (tuple): Tuple of the plaintext unicode start and end points. Defaults to ASCII plaintext. You can use one of the strings defined in the code_points dictionary.
        ciphertext (tuple): Tuple of the ciphertext unicode start and end points. Defaults to Chinese. You can use one of the strings defined in the code_points dictionary.
    """

    if mode != 'e' and mode != 'd':
        raise ValueError('Mode must be e for encryption or d for decryption')

    if type(plaintext) is str:
        plaintext = code_points[plaintext]
    if type(ciphertext) is str:
        ciphertext = code_points[ciphertext]

    plaintext_range = plaintext[1] - plaintext[0] + 1
    ciphertext_range = ciphertext[1] - ciphertext[0]

    ciphertext_size = ciphertext_range / plaintext_range

    if ciphertext_size < 1:
        raise ValueError('Ciphertext code point range needs to be larger than plaintext code point range')

    output = []

    if mode == 'e':
        for letter in message:
            unicode_decimal = ord(letter) - plaintext[0] + ciphertext[0] + plaintext_range * int(random.uniform(0, ciphertext_size) - 1)
            output.append(chr(unicode_decimal))

    elif mode == 'd':
        for letter in message:
            unicode_decimal = (ord(letter) - ciphertext[0]) % plaintext_range + plaintext[0]
            output.append(chr(unicode_decimal))

    return ''.join(output)


if __name__ == '__main__':
    example_plaintext = 'This is a Unicode cipher'

    example_ciphertext = unicodeCipher(example_plaintext, 'e', 'english', 'korean')
    print('Ciphertext: ' + example_ciphertext)

print('Decrypted: ' + unicodeCipher(example_ciphertext, 'd', 'english', 'korean'))