Small Wiredyne Logo

Permutation Steganography Details

Steganography using permutations.

Traditional steganography hides information in noisy spaces in data,
such as the low bits of a sound sample.  Permutation steganography
hides information in the ordering of elements of a set.  For examples,
the orders of entries in a database, in which pre-arranged messages
are sent, of the headers of a mail message, of books on a shelf, of
cards in a deck, of people posing in a picture, of cars parked in a
parking lot, or anything else.

The number of possible permutations of a set of n elements is n!.  In
order to be conveniently used for steganography, it is necessary to
uniquely map each possible permutation to a number from 0 to n! - 1.
That number is the message being sent.

API
---

A permutation is a bijective mapping from a set to itself.  We
represent permutations with lists of natural numbers.  For example,
[2, 1, 3, 4, 0] is a permutation of [4, 3, 2, 1, 0].

The programmer converts between numeric "permutations" and the
particular set being used for communication.  The example below
demonstrates that this is not difficult in Python.

permutation_to_nat(permutation) and nat_to_permutation(number,
set_size) convert between permutations and natural numbers.  (It is
necessary to specify set_size.  Only the minimum set size can be
determined from the number itself.)

>>> PStego.permutation_to_nat([2, 1, 3, 4, 0])
22
>>> PStego.nat_to_permutation(22, 5) # Set has five elements
[2, 1, 3, 4, 0]

permutation_to_int(permutation) and int_to_permutation(number,
set_size) convert between permutations and integers.  Almost identical
to the above functions, except integers can be negative.

permuted_set_size(number) is used to see how big a set is needed to
store non-negative number.

factorial(set_size) - 1 gives the largest natural number representable
by a set of set_size elements.

string_to_nat(string) and nat_to_string(number) make it convenient to
convert between numbers and strings.

>>> PStego.string_to_nat("G. H. Hardy")
301496875671042235064921671L
>>> PStego.nat_to_string(301496875671042235064921671L)
'G. H. Hardy'

All exceptions raised are of class PStegoError, which inherits from
Exception.

Example
-------

This code fragment demonstrates encoding a large integer message into
a particular ordering of a set of cards and back.

---------------
#!/usr/bin/env python

import PStego

suits = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
vals = [str(val) for val in range(2,11) + ['Jack', 'Queen', 'King', 'Ace']]
deck = [ val + ' of ' + suit for suit in suits for val in vals]

# Encode
msg = 71753659453958351496053366505932007005969038565701322956079243329L
bynumber = PStego.nat_to_permutation(msg, len(deck))
arranged = [deck[i] for i in bynumber]

# Decode
bynumber_II = [deck.index(card) for card in arranged]
msg_II = PStego.permutation_to_nat(bynumber_II)

# Output
print "Original Message:", msg
print "Decoded Message:", msg_II

print

print "Index\tRaw No.\tCard"
print "-----\t-------\t----"
for i in xrange(len(deck)):
    print "%d:\t%d\t%s" %(i, bynumber[i], arranged[i])

Application Notes
-----------------

Care should be taken to ensure that permutations are equally probable
if they are to be difficult to identify.  This can be inconvenient
since most messages are in binary form.  Nearly all crypto algorithms
are designed for binary messages.

One approach is to use permutations as one time pads.  If the set has
n objects, then the key should be a randomly chosen value from 0 to
n! - 1.  The message falls into the same range.  Encryption is
performed by adding the key and the message mod n!.

If the key is a randomly chosen value throughout the domain of
possible numbers, then it doesn't matter if the message is not evenly
distributed.  This means that at least half of the message space can
be used to represent a string of bits, depending on the choice of set
size.

The above approach could also be used with a keyed pseudo-random
number generator.  It would be necessary to agree in advance on how to
use it to generate evenly distributed numbers in the unusual ranges
required.  In effect this would be an XORing stream cipher - so be
careful!  Don't reuse the key. ;-)

A less efficient approach is to use the prime factors of value 2 to
store the binary data and mask the other factors with randomly chosen
numbers.  For example, in a system using ten objects we have 10! =
3,628,800 possible messages.  3,628,800 = 2^8 * 3^4 * 5^2 * 7 = 256 *
14,175, so we can store 8 bits.  Let M be the message value in the
range 0-255, random from the point of view of the eavesdropper.  Let R
be a random value in the range 0-14174.

These values can be used in two ways:
a. c1 = R * 256 + M which is decrypted with M = c1 mod 256
b. c2 = M * 14175 + R which is decrypted with M = c2 // 14175

8bpqsY0U3M1SLzaNXJho9jV2ftBGO5rIvclQu76Eg4CiZmnWyPFHKATRwxeDdk@

Behind the Curtain
------------------

The mapping between the set of integers and the set of permutations is
accomplished by creating an alternate number system.

Cartesian Products can be used when clearly describing number systems.
It's important to use a precise specification because it is easy to
get confused otherwise.

A cartesian product is the set of ordered tuples of sets.  For
example, if set S = {a, b}, then a cartesian product S X S = {(a,a),
(a,b), (b,a), (b,b)}.  Let T = {c, d, e}.  Then, S X T = {(a, c), (a,
d), (a, e), (b, c), (b, d), (b, e)}.

Let D be the set of decimal digits, 0 through 9.  The set of three
place decimal numbers could be seen as the cartesian product D X D X
D.  We write "123" for the element (1,2,3).  Leading zeroes are
traditionally dropped, so we would write "37" for (0,3,7).  Let B be
the set of binary digits: {0, 1}.  The set of four place binary
numbers could seen as the cartesian product B X B X B X B.

We are most comfortable working with decimal numbers.  Therefore, we
will use decimal numbers to describe numbers and the digits of numbers
in other bases.  For example, let H be the set of hexadecimal digits.
The set of four place numbers could be described by the cartesian
product H X H X H X H.  The number 0xBABE would be represented as
(11,10,11,14).

This becomes convenient when we want to convert the number to decimal
- a format which is second nature to most people.  In the case of
0xBABE, it is 11 * 16^3 + 10 * 16^2 + 11 * 16^1 + 14 = 47,806.

Traditional number systems use the same digits in each column.  For
example, in base 10 the digits used are 0 through 9 and the columns
are increasing integral powers of 10: ones, tens, hundreds, and so on,
or <...,100,10,1>.  In hexadecimal the columns are ones, sixteens,
256s, etc., or <...,256,16,1>.

Note that the value of each column is actually the number of possible
numbers that can be represented by the remaining digits.  In the
decimal system the third column is the "hundreds" because there are
exactly one hundred different decimal two digit numbers: 00 - 99.  If
a number system does not have this property, it will either have more
than one representation for certain numbers or it will be unable to
represent some numbers.  Neither possibility is what we want.

We are used to number systems which use the same digits for each
column, but to make the above points clear let's consider a strange
five place number system described by this cartesian product:
{0,1,2,3,4} X {0} X {0,1,2,3} X {0} X {0,1,2}

What values should we assign to each column?  The rightmost must be
ones.  The ones column has three possibilities, so the second (from
the right) column is the threes.  Since the second column has only one
possible digit, the first two columns can express only three two place
numbers, so the third column will be the threes, too.  The first
through third columns have 12 possible combinations, so the fourth
column is the twelves.  And since the fourth column has only one
possible digit, the last column must be the twelves, too:
<12,12,3,3,1>. The numbers from 0 through 59 can be represented by
this system.  For example, the number 45 is represented by
(3,0,3,0,0).

The system we want to use is tame by comparison.  Each additional
column gets one additional digit.  Five place numbers are described by
this cartesian product: {0,1,2,3,4,5} X {0,1,2,3,4} X {0,1,2,3} X
{0,1,2} X {0,1}

The columns will have values <120,24,6,2,1>, or <5!,4!,3!,2!,1!>.  The
numbers from 0 through 6! - 1 can be represented.  The number 219 is
(1,4,0,1,1).  In other words, 219 = 1*5! + 4*4! + 0*3! + 1*2! + 1*1! =
120 + 96 + 2 + 1.

We call these numbers "permutation numbers".  Permutation numbers are
useful because an n digit number can represent (n + 1)! different
values, which also happens to be the number of permutations of a set
of n + 1 elements.

It is easy to map a permutation number to a permutation and vice
versa.  Let's represent 219 in terms of a permutation of a set of six
objects: {0,1,2,3,4,5}.  As described above, the permutation number is
(1,4,0,1,1).

The little puzzle is how to unambiguously choose each element from the
set.  The solution is to order the remaining elements of the set and
reassign their values each time a digit is chosen.

We start with this mapping:

   0 1 2 3 4 5
  {0,1,2,3,4,5}

The first digit of (1,4,0,1,1) is 1, so we select the element "1"
from the set, making the interim result (1).

The remaining elements of the set are assigned these values:

   0 1 2 3 4
  {0,2,3,4,5}

As the second digit of (1,4,0,1,1) is 4, we select the element "5"
from the set, making the interim result (1,5).

The remaining elements then look like this:

   0 1 2 3
  {0,2,3,4}

Since the third digit is 0, we select the element "0" from the set,
and so forth.  The final result is (1,5,0,3,4,2).

There is a cultural inconsistency between the order in which we write
the digits of a number and the order in which Python displays the
elements of a list; that is, when a number is written the highest
valued digits are written first, but when an array is printed, the
lowest indexed elements are printed first.

In the code below, permutation numbers are represented in a manner
natural to Python.  perm_number[0] holds the least significant digit.
For example, the number 3301 which can be represented by the
permutation (4,3,2,2,0,1), is stored as the list [1, 0, 2, 2, 3, 4] in
the Python code.  (perm_number_repr(perm_number) is available to get a
printable representation of permutation numbers.)

References:

These Wikipedia pages have a more complete discussion of these
numbering systems:

http://en.wikipedia.org/wiki/Permutation#Numbering_permutations
http://en.wikipedia.org/wiki/Mixed_radix
http://en.wikipedia.org/wiki/Factorial_number_system

We do not implement a combinatorial number system, but it may be of
interest all the same:
http://en.wikipedia.org/wiki/Combinatorial_number_system

Georg Cantor worked with mixed radix number systems:
http://en.wikipedia.org/wiki/Georg_Cantor

D.N. Lehmer and D.H. Lehmer both worked on permutation numbering
systems:
http://en.wikipedia.org/wiki/Derrick_Norman_Lehmer
http://en.wikipedia.org/wiki/Derrick_Henry_Lehmer

Home     Software     Downloads     About