Nick | Post | Date
---|---|---
tango | Meteor: Cryptographically Secure Steganography for Realistic Distributions

Meteor is a symmetric-key steganography protocol. A sender and receiver who share a secret key can exchange messages that conform to a certain text generation model, and that also encode another, hidden message. An observer that does not know the secret key cannot distinguish the steganographically encoded messages from any other output of the text generation model, even if the observer knows the model. Meteor naturally adapts its encoding rate to the local entropy of the text generation model, which is an advantage over past schemes that can fail when starved of entropy. The authors show how to use Meteor with sophisticated generative models of natural language text, like GPT-2, to provide realistic covertexts.

The text generation models that are compatible with Meteor produce text one word at a time. Given a context of words that have been output so far, the model computes a probability distribution of candidates for the next word; i.e., a set of words with associated weights, where the weights sum to 1.0. The model then makes a weighted random selection from among the candidates. Suppose, at a certain point in text generation, the probability distribution for the next word is 40% little, 20% gray, 20% happy, and 20% nimble. Stack up those probabilities, in some defined order, to assign each word an interval of the numbers between 0.0 and 1.0. Sample a random number r between 0.0 and 1.0, and output the word whose interval contains r. In this case, for example, if the random number generator produced r = 0.626, the model would output the word happy; if r = 0.187, it would output little.
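The interval-based sampling just described can be sketched in a few lines of Python; this is a minimal illustration, and `sample_next_word` and the fixed `r` values are our own naming, not anything from the paper:

```python
import bisect
import random

def sample_next_word(distribution, r=None):
    """Stack each word's probability into an interval over [0.0, 1.0)
    and return the word whose interval contains r."""
    words, probs = zip(*distribution)
    # Cumulative upper bounds of each word's interval.
    bounds = []
    total = 0.0
    for p in probs:
        total += p
        bounds.append(total)
    if r is None:
        r = random.random()
    # Index of the first interval whose upper bound exceeds r.
    return words[bisect.bisect_right(bounds, r)]

dist = [("little", 0.40), ("gray", 0.20), ("happy", 0.20), ("nimble", 0.20)]
sample_next_word(dist, r=0.626)  # "happy": 0.626 falls in [0.60, 0.80)
sample_next_word(dist, r=0.187)  # "little": 0.187 falls in [0.00, 0.40)
```

Passing `r=None` makes it an ordinary weighted sampler; Meteor's trick, described next, is to control where `r` comes from.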
Meteor replaces the text generation model’s usual random number generator with pseudorandom bits from a ciphertext of the message. Instead of an interval of real numbers, assign each candidate word a range of consecutive bit strings, in proportion to its probability:
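One way to realize that assignment is the following sketch, under our own rounding assumption (truncate each cumulative boundary to a whole number of strings; the paper's exact arithmetic may differ):

```python
def bit_string_ranges(distribution, beta=4):
    """Assign each word an inclusive range [lo, hi] of beta-bit strings,
    in proportion to its probability, by truncating cumulative sums."""
    n = 1 << beta  # number of distinct beta-bit strings
    ranges = {}
    lo = 0
    cum = 0.0
    for word, prob in distribution:
        cum += prob
        hi = min(int(cum * n), n)  # scaled cumulative boundary, clamped to n
        ranges[word] = (lo, hi - 1)
        lo = hi
    return ranges

dist = [("little", 0.40), ("gray", 0.20), ("happy", 0.20), ("nimble", 0.20)]
bit_string_ranges(dist, beta=4)
# little → (0, 5), i.e. 0000–0101; gray → (6, 8); happy → (9, 11); nimble → (12, 15)
```

With β = 4 there are 16 strings, so little's 40% share covers six of them.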
The length of the bit strings is a parameter, β. In this example, β = 4; the authors report using β = 32 in practice. Suppose the sender wants to steganographically encode a message: it reads the next β bits of the ciphertext and outputs the word whose range contains that bit string, say little.

When the receiver observes the word little, it does a reverse lookup in the probability table, to find out what values of r could have mapped to that word. In this case, r might have been any of the six values from 0000 to 0101. Those six strings share the common prefix 0, so the receiver recovers one ciphertext bit. The common prefix is not always one bit long. If the word had been happy, for example, the common prefix would have been 10. Because the sender and the receiver have the same text model, the sender also knows the range of possible r values for the word it has just output, and therefore the sender knows how many bits the receiver will have decoded from little. In this case it's one bit, so the sender strips one bit from its message buffer, leaving the rest to be encoded in later words.

English text is not the only possible source of covertexts. Anything that satisfies the token-at-a-time generation model can be used, conceivably even things like network protocols. Besides GPT-2, which is word-oriented, the authors applied Meteor to character-oriented models trained on Wikipedia articles and HTTP headers. There is a comparison of the space and time costs of encoding in Table 3. You can see examples of Meteor stegotexts in Appendix C. The GPT-2 model encodes 160 message bytes into about 300–350 words, not counting the initial context. The following is an example of Meteor encoding using GPT-2:
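Separate from the GPT-2 sample the post refers to, the receiver's reverse-lookup step can be sketched as follows. `common_prefix_bits` is our own helper, and the ranges used in the examples (0000–0101 for little, 1001–1011 for happy) are derived from the example distribution with β = 4 under a truncating-rounding assumption:

```python
def common_prefix_bits(lo, hi, beta=4):
    """Return the bits a receiver can decode from one word: the longest
    prefix shared by every beta-bit string in the inclusive range [lo, hi].
    The range is contiguous, so comparing lo and hi suffices."""
    prefix = ""
    for i in range(beta - 1, -1, -1):  # most significant bit first
        b_lo = (lo >> i) & 1
        b_hi = (hi >> i) & 1
        if b_lo != b_hi:
            break
        prefix += str(b_lo)
    return prefix

common_prefix_bits(0b0000, 0b0101)  # little's range → "0", one bit decoded
common_prefix_bits(0b1001, 0b1011)  # happy's range → "10", two bits decoded
```

A wide range (a high-probability word) yields a short prefix and few bits; a narrow range yields many bits. This is how the encoding rate tracks the model's local entropy.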
| 2022-02-09T19:54:21.624Z |