[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]
[freehaven-cvs] clean and tighten sec 3.1
Update of /home/freehaven/cvsroot/doc/e2e-traffic
In directory moria.mit.edu:/home2/arma/work/freehaven/doc/e2e-traffic
Modified Files:
e2e-traffic.tex
Log Message:
clean and tighten sec 3.1
Index: e2e-traffic.tex
===================================================================
RCS file: /home/freehaven/cvsroot/doc/e2e-traffic/e2e-traffic.tex,v
retrieving revision 1.25
retrieving revision 1.26
diff -u -d -r1.25 -r1.26
--- e2e-traffic.tex 24 Jan 2004 21:40:33 -0000 1.25
+++ e2e-traffic.tex 24 Jan 2004 22:53:55 -0000 1.26
@@ -324,10 +324,10 @@
Danezis also derives a precondition that the attack will only succeed when
\( m < \frac{N}{b-1} \), and calculates the expected number of rounds to
-succeed (with $95\%$ confidence for security parameter $\ell=2$ and $99\%$
-confidence for $\ell=3$):
+succeed (with $95\%$ confidence for security parameter $\l=2$ and $99\%$
+confidence for $l=3$):
-\[t > \left[ m\ell \left(\sqrt{\frac{m-1}{m}} +
+\[t > \left[ ml \left(\sqrt{\frac{m-1}{m}} +
\sqrt{\frac{N-1}{N^2}(b-1)} \right) \right]^2 \]
%======================================================================
@@ -335,13 +335,13 @@
\label{sec:extending}
\subsection{Broadening the attack}
\label{subsec:broadening}
-In this subsection, we examine ways to extend Danezis's Statistical
-Disclosure Attack to systems more closely resembling real-world mix-nets. In
+Here we examine ways to extend Danezis's statistical
+disclosure attack to systems more closely resembling real-world mix-nets. In
Section~\ref{sec:simulation}, we examine the time and information
requirements for these attacks against simulated networks.
-\subsubsection{Complex senders, unknown background traffic}
-\label{subsubsec:complex-senders}
+\subsubsection{Complex senders, unknown background traffic:}
+% \label{subsubsec:complex-senders}
First, we relax the requirements related to sender behavior.
We allow Alice to choose among her recipients with non-uniform
probability, and to send multiple messages in a single batch. We also
@@ -361,8 +361,9 @@
$t'$ of batches to
which Alice has {\it not} contributed any messages.\footnote{The attack can
still proceed if few such Alice-free batches exist, so long as Alice
- contributes more to some batches than to others. Specifically, the approach
- described in Section~\ref{subsubsec:complex-mix} can exploit differences
+ contributes more to some batches than to others. Specifically, the
+ approach described on the next page (against pool mixes and mix-nets)
+ can exploit differences
between low-Alice and high-Alice batches to infer background behavior.}
For each such
batch $i$, the attacker constructs a vector $\V{u_i}$, whose elements are
@@ -384,32 +385,33 @@
\[\V{v} \approx \frac{1}{\B{m}}
\left[ b\B{O} - (b-\B{m})\B{U} \right] \]
-\subsubsection{Attacking pool mixes and mix-nets}
-\label{subsubsec:complex-mix}
+\subsubsection{Attacking pool mixes and mix-nets:}
+% \label{subsubsec:complex-mix} can't usefully label subsubsections.
Most mix-net designs have already abandoned fixed-batch mixes in
favor of other algorithms that better hide the relation
between senders and recipients. Such improved algorithms include
timed dynamic-pool mixes, generalized mixes, and randomized versions of
-each \cite{trickle02}\cite{pet2003-diaz}. Rather than reordering and
-relaying all of their messages whenever they have received a fixed number
-$b$, these algorithms store received messages in a {\it pool}, and at fixed
-intervals relay a {\it fraction} of the pooled messages, based on the pool's
+each \cite{pet2003-diaz,trickle02}. Rather than reordering and
+relaying all the messages whenever a fixed number $b$ have arrived,
+these algorithms store received messages in a {\it pool}, and at fixed
+intervals relay a {\it fraction} of the pooled messages based on the pool's
current size.
When attacking such a mix, the attacker no longer knows for certain
which batches of recipients contain a message from Alice. Instead,
-the attacker can only estimate, for each batch of exiting messages,
+the attacker can only estimate, for each batch of output messages,
the probability that the batch includes one of Alice's messages.
Following D\'iaz and Serjantov's approach in \cite{pet2003-diaz}, we treat
these mixing algorithms generically as follows: a mix relays a
-number of messages at the end of each round, depending on the number of
+number of messages at the end of each round, depending on how many
messages it is currently storing. All messages in the mix's pool at the end
of a round have an equal probability of being included in that round's batch.
Thus, we can characterize the mix's batching algorithm as a probability
function $\PMIX(b|s)$---the probability that the mix relays $b$ messages
-when it has $s$ messages in the pool. Clearly, $\sum_{b=0}^{s}\PMIX(b|s) =
-1$, for all values of s.
+when it has $s$ messages in the pool. Clearly, $\forall s,
+\sum_{b=0}^{s}\PMIX(b|s) = 1$: the mix will always output between $0$
+and $s$ messages.
%In the case of a timed dynamic-pool mix, this distribution is:
%\[ P_{TDP}(b|s) =
@@ -432,21 +434,21 @@
We assume that the attacker has a fair estimate of $P_R$.\footnote{The
attacker can estimate $P_R$ by
sending test messages through the mix, or by counting the
-messages entering and leaving the mix and deducing the pool size.
-Dummy messages can hamper these techniques to an extent discussed
-in \cite{dummy-msgs}.}
+messages entering and leaving the mix and deducing the pool size.}
+%Dummy messages can hamper these techniques to an extent discussed
+%in \cite{dummy-msgs}.
Now, when Alice sends a message in round
$i$, the attacker observes rounds $i$ through some later round $i+k$,
choosing $k$ so that $\sum_{j=k+1}^{\infty} P_R^i(j)$ is negligible.
The attacker then uses $P_R$ to
-computes a weighted mean $\B{O_w}$ of the observations from all of these
-rounds, weighting by the expected number of messages from Alice that will exit
+compute $\B{O_w}$, the mean of the observations from all of these rounds
+weighted by the expected number of messages from Alice that will exit
that round:
\[ \B{O_w} = \sum_i \sum_{r=0}^k P_R^i(r) m_i \V{o_{i+r}}
\approx \frac{\B{m}\V{v} + (\B{n}-\B{m})\V{u}}{\B{n}} \]
To solve for Alice's behavior $\V{v}$, the attacker now needs a good estimate
-for estimate for the background $\V{u}$. The attacker gets this by
+for the background $\V{u}$. The attacker gets this by
averaging observations $\V{u_i}$ from batches with a negligible probability of
including messages from Alice. Such batches, however, are not
essential: If the attacker chooses a set of $\V{u_i}$ such that each
@@ -454,21 +456,22 @@
Alice, the attacker will obtain
\[\B{U'} \approx \frac{\delta_a}{\B{n}} \V{v} +
\frac{1-\delta_a}{\B{n}} \V{u} \]
-and can thus substitute
+and can thus solve again for $\V{v}$ in the earlier equation for
+$\B{O_w}$, now using
\[\V{u} \approx \frac{1}{1-\delta_a}
\left[ \B{n} \B{U'} - \delta_a \V{v} \right] \]
-into the earlier equation for $\B{O_w}$ and again solve for $\V{v}$.
-Another way senders behave differently from the model of the original
-disclosure attack is by directing their messages through a chosen path
+Senders can also behave differently from the original model
+by directing their messages through a chosen path
in a network of mixes. While using a mix-net increases the effort an
attacker must spend to observe all messages leaving the system, it
has no additional effect on intersection attacks beyond changing the
delaying characteristics $P_R$ of the anonymity system as introduced
above.
-Assume for the sake of simplicity that all mixes share a single
-$P_R$, and that Alice chooses a path of length $\ell_0$. The chance of
+Assume for the sake of simplicity that all mixes have the same
+expected delay $P_R$, and that Alice chooses a path of length $\ell_0$.
+The chance of
the message being delayed by a further $d$ rounds is now
\[ P_R'(\ell_0+d) = \binom{\ell_0+d-1}{d} (1-P_D)^{\ell_0} P_D^d \]
If Alice chooses her path length probabilistically according to
@@ -481,16 +484,16 @@
Danezis has independently extended statistical disclosure to pool mixes
\cite{gd-thesis}.
-\subsubsection{Dummy traffic}
-\label{subsubsec:dummy-traffic}
-One strategy for reducing the impact of traffic analysis is for Alice
-to periodically send messages into the network that arrive at no
-recipient, or to periodically send messages into the network that
+\subsubsection{Dummy traffic:}
+%\label{subsubsec:dummy-traffic}
+Alice can also reduce the impact of traffic analysis by
+periodically sending messages into the network that are dropped inside
+the network, or by periodically sending messages into the network that
arrive at recipients other than those with whom she wants to
communicate.
Although these methods can succeed in slowing or stopping the attacker (as
-discussed below in section \ref{sec:simulation}), the change in the attack
+discussed below in Section \ref{sec:simulation}), the change in the attack
itself is trivial: Alice's behavior
vector $\V{v}$ no longer adds to $1$, since there is now a probability that a
message from Alice will not reach any recipient. Aside from this, the
@@ -530,43 +533,42 @@
%% regime results in traffic quadratic in the number of participants in
%% the system, and thus scales poorly.
-\subsubsection{Partial observation}
-\label{subsubsec:partial}
+\subsubsection{Partial observation:}
+%\label{subsubsec:partial}
Until now, we have required that the attacker, as a global passive
-adversary, observe all the messages entering and leaving the system.
-(At least, all the messages sent by Alice, and all the messages
-reaching Alice's recipients.) This requirement is not so difficult
-as it might seem: in order to be a ``global'' adversary against
+adversary, observe all the messages entering and leaving the system
+(at least, all the messages sent by Alice, and all the messages
+reaching Alice's recipients). This requirement is not so difficult
+as it might seem: to be a ``global'' adversary against
Alice, an attacker need only eavesdrop upon Alice, and upon the mixes
that deliver messages to recipients. (Typically, not all mixes do
-so.) In this section, we examine the consequences for a non-global
-attacker.
+so.)
Depending on which points of the network the attacker can observe, a
non-global attacker has different characteristics. If an attacker
eavesdrops on a fraction of the {\it mixes} in the system, the
attacker receives a random sample\footnote{But possibly a biased
sample, depending on Alice's path selection algorithm.} of the
-messages entering or leaving the system. Such an attacker
-will still converge on the same $\B{O}$ and thus the same
-estimation of Alice's behavior, but will require more rounds of
-observation to do so.
+messages entering or leaving the system. If such an attacker can see some
+messages from Alice and some messages to her recipients, he will still
+converge on the same $\B{O}$ and thus the same estimation of Alice's
+behavior, but the attack will require more rounds of observation.
Alternatively, an attacker who eavesdrops on a fraction of the {\it
users} in the system receives {\it all} of the messages sent to or
-from those users but no messages sent to or from other users. So long as
-one of these users is Alice, the attack can still proceed: To such an
-attacker, the network is as if the messages sent by Alice to
+from those users but no messages sent to or from other users. So long
+as one of these users is Alice, the network (to such an attacker) is as
+if the messages sent by Alice to
unobserved recipients were dummy messages. Now the attack converges
as before, but with only information concerning the observed
recipients: the attacker learns which of the observed recipients
receive messages from Alice, and which do not.
-\subsubsection{Time-variant background traffic}
-\label{subsubsec:time-variant}
+\subsubsection{Time-variant background traffic:}
+%\label{subsubsec:time-variant}
If Alice's behavior changes completely and radically over time, a long-term
intersection attack cannot proceed: the attacker cannot make enough
-observations of any version of Alice's behavior in order to converge
+observations of any version of Alice's behavior to converge
on a $\B{v}$ for any of them.
On the other hand, if Alice's behavior $\V{v}$ remains consistent
@@ -582,14 +584,17 @@
So if an attacker can get good local estimates to $\V{u}$, the
intersection attack proceeds as before.
-\subsubsection{Attacking recipients}
-\label{subsubsec:recipients}
-As a final (and somewhat anticlimactic) extension to the original
-attack, we note that an attacker can mount attacks against recipients
-as well as senders with only slightly higher storage, and no increase
-in computational cost.
+\subsubsection{Attacking recipients:}
+%\label{subsubsec:recipients}
+%As a final (and somewhat anticlimactic) extension to the original
+%attack,
+Finally,
+we note that an attacker can % mount attacks against
+find recipients
+as well as senders with slightly higher storage and the same %no increase in
+computational cost.
-In this model, the attacker wishes to know which senders are sending
+The attacker wishes to know which senders are sending
anonymous messages to a given recipient Bob. The analysis remains the
same: the attacker compares sender behavior in rounds from which Bob
probably receives messages with behavior in rounds from which Bob
@@ -597,7 +602,7 @@
attacker cannot tell in advance when Bob will receive a message.
Therefore, the attacker must remember a window of recent observations
at all times,
-such that if Bob later receives a message, the probability is negligible that
+such that if Bob later receives a message, the chance is negligible that
the message was sent before the first round in the window.
\subsection{Strengthening the attack}
@@ -613,8 +618,8 @@
increased traffic, we discuss ways to reduce the amount of traffic required
for the attack by incorporating additional information.
-\subsubsection{Exploiting message partitioning}
-\label{subsubsec:full-linkability}
+\subsubsection{Exploiting message partitioning:}
+%\label{subsubsec:full-linkability}
The attacker's work can be greatly simplified if some messages leaving
the system are
{\it linkable}. Two messages are said to be {\it linkable} if they are
@@ -667,8 +672,8 @@
% to $<$Word6, Bob$>$, she is more likely to send to $<$Word6,
% Carol$>$ than to $<$Word2K, Carol$>$.} -NM
-\subsubsection{Exploiting {\it a priori} suspicion}
-\label{subsubsec:suspicion}
+\subsubsection{Exploiting {\it a priori} suspicion:}
+%\label{subsubsec:suspicion}
Finally, the attacker may have reason to believe that some messages
are more likely to have been sent by the target user than others. For
example, if we believe that Alice speaks Urdu but not Arabic, or
@@ -697,7 +702,7 @@
model of the original statistical disclosure attack, then against more
sophisticated models. We describe our simulations and present results below.
-\subsubsection{The original statistical disclosure attack}
+\subsubsection{The original statistical disclosure attack:}
%trial1
First, we simulated Danezis's original statistical disclosure attack,
varying the parameters $N$ (the number of recipients), $m$ (the number
@@ -729,7 +734,7 @@
recipients (large $m$); when there are fewer recipients for Alice to hide
hers among (small $N$); or when batch sizes are small (small $b$).
-\subsubsection{Complex sender behavior and unknown background traffic}
+\subsubsection{Complex sender behavior and unknown background traffic:}
% trial2
The next simulation examines the consequences of a more complex model for
background traffic, and of several related models for Alice's behavior.
@@ -793,7 +798,7 @@
more than they did before.
%XXX say more? -NM
-\subsubsection{Attacking pool mixes and mix-nets}
+\subsubsection{Attacking pool mixes and mix-nets:}
\label{subsec:sim-complex-mixes}
%trials 3,4
Pooling slows an attacker by increasing the number of output messages
@@ -843,8 +848,8 @@
% XXX can you explain this better? -NM
-\subsubsection{The impact of dummy traffic}
-\label{subsec:sim-dummies}
+\subsubsection{The impact of dummy traffic:}
+%\label{subsec:sim-dummies}
%trials 5a,5b
Several proposals exist for using dummy messages to frustrate traffic
analysis. Although several of them have been examined in the context of
@@ -894,8 +899,8 @@
``imperfect threshold padding'' (Alice always tries to pad up to a threshold
of $M$ messages per round, but is sometimes offline).
-\subsubsection{The impact of partial observation}
-\label{subsec:sim-partial}
+\subsubsection{The impact of partial observation:}
+%\label{subsec:sim-partial}
%trial 6
Finally, we examine the degree to which a non-global passive adversary can
mount the statistical disclosure attack. Again, we base our simulation on
@@ -937,8 +942,8 @@
closing, we offer suggestions for mix-net designs, and suggest several open
questions for future work.
-\subsubsection{Implications for mix-net design}
-\label{subsubsec:implications}
+\subsubsection{Implications for mix-net design:}
+%\label{subsubsec:implications}
If we were to design a mix network based on our findings here, what steps
should we take in order to frustrate intersection attack?
@@ -985,8 +990,8 @@
% connection with an attacker on each end compromises a sender--receiver
% link.
-\subsubsection{Questions for future work}
-\label{subsubsec:future-work}
+\subsubsection{Questions for future work:}
+%\label{subsubsec:future-work}
Many questions remain to be answered before the effectiveness of long-term
intersection attacks can be considered a closed problem.
***********************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe freehaven-cvs in the body. http://freehaven.net/