Abstract: In this piece, BitMEX grantee Gleb Naumenko talks through approaches to mitigating channel jamming on the lightning network. Channel jamming is when a malicious entity blocks up liquidity in the lightning network by making a payment to themselves via third-party channels and then never revealing the secret, so that the payment never completes. Gleb discusses the potential solutions to this problem, both short-term fixes and long-term protocol changes. He concludes by explaining that there is no simple, all-encompassing solution and that some of the effective mitigation systems may be complicated to implement. Therefore, the path forward may require substantial extra research and discussion.
Since LN is a permissionless system (meaning no central party restricts who can use it), it is prone to Denial-of-Service attacks. Channel jamming is one such attack.
Mitigating channel jamming while maintaining the permissionless nature of the LN is not trivial. In this post, I give an overview of the design space of the proposed solutions.
This post is based on the conversation among a group of LN and Bitcoin protocol developers that took place in Fall 2021. It is targeted towards those curious about LN protocol design and tech-savvy LN users and followers.
What is Channel Jamming?
The Lightning Network is a network of nodes that help each other (usually for a fee) make payments via payment channels: if Alice and Bob don’t have a direct channel, they can use a path of routing nodes between them to forward the payment by adjusting the balances in the channels along the path.
It’s important for these multi-hop payments to be atomic, otherwise there is a risk of routing nodes taking the funds without forwarding them. Multi-hop payments happen in two phases: locking the funds on the path from sender to receiver, and then shifting the balances by propagating the secret from receiver to sender.
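The two phases can be illustrated with a toy model. This is a minimal sketch, not the actual LN state machine: the `Hop` class, the channel names and the 1000-sat amount are assumptions made for illustration, and a SHA-256 hash lock stands in for the full payment contract.

```python
import hashlib
import os

def payment_hash(preimage: bytes) -> bytes:
    """Hash lock: funds can only be claimed by revealing the preimage."""
    return hashlib.sha256(preimage).digest()

class Hop:
    """One channel on the route (hypothetical, holds a single pending payment)."""

    def __init__(self, name: str):
        self.name = name
        self.locked = None   # (hash, amount) while the payment is in flight
        self.settled = 0     # total amount shifted across this channel

    def lock(self, h: bytes, amount: int) -> None:
        self.locked = (h, amount)

    def settle(self, preimage: bytes) -> None:
        h, amount = self.locked
        if payment_hash(preimage) != h:
            raise ValueError("wrong preimage")
        self.settled += amount
        self.locked = None

route = [Hop("Alice-Bob"), Hop("Bob-Carol"), Hop("Carol-Dave")]
secret = os.urandom(32)        # known only to the receiver
h = payment_hash(secret)

# Phase 1: lock funds hop by hop, from sender to receiver.
for hop in route:
    hop.lock(h, 1000)
in_flight = [hop.name for hop in route if hop.locked]

# Phase 2: the receiver reveals the secret, which propagates back to the sender.
for hop in reversed(route):
    hop.settle(secret)
```

In this model, a payment whose receiver never runs phase 2 leaves every hop in the `locked` state, which is exactly the situation a jamming attacker creates.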
The main idea behind jamming is to exhaust the capacity of routing nodes to forward payments by making fake payments and never finalizing them. For the duration of the attack, the routing nodes cannot forward other (honest) payments.
To jam certain channels, an attacker pretends to make a payment to themselves via those channels, and never releases the secret on the receiver side.
Types of Channel Jamming
There are two kinds of jamming:
– amount jamming, in which an attacker locks a significant portion of the target channel’s capacity
– slot jamming, in which an attacker locks the forwarding capabilities of the target channel by exhausting the limit of in-flight payments (we explain this in more detail in “Baseline cost”)
For more detailed information on channel jamming, please refer to https://github.com/t-bast/lightning-docs/blob/master/spam-prevention.md. For understanding this write-up, you don’t have to fully understand all low-level details presented there.
The impossibility of timing-out
The most obvious solution to jamming is to cancel payment forwarding at a routing node after the routing node detects that the payment is stuck in-flight for too long.
Unfortunately, once a routing node has forwarded a payment, it cannot fail that payment on its own. The failure can only propagate back from the later hops of the route. This is a fundamental part of the LN protocol, and it preserves the security of the routing nodes’ funds.
As a result, since the final hop belongs to the attacker, they would not propagate this failure back until they want to finish the attack.
Alternatively, routing nodes could just negotiate shorter timeouts from the beginning. This also doesn’t work: an attacker would just have to repeat the attack more frequently (re-initiate fake payments once a minute instead of once an hour).
Is holding funds even an attack?
The LN protocol stack does not explicitly state that making someone’s funds stuck in-flight is malicious, although it’s probably true at the moment because the LN is mainly used for low-latency payments.
However, this might change in the future. Locking funds in-flight might be a genuine activity, for example, in the context of DLC over Lightning.
From my observations, there seems to be no consensus yet about holding funds for a non-negligible time (not about failing payments!), as there are two opinions:
1) The protocol should discourage long-hanging payments
2) The protocol should move towards a more flexible pay-for-liquidity approach, which would also account for the time funds are locked in-flight
Fortunately, it seems that both of the most feasible long-term solutions (discussed later) involve some form of paying (either with native lightning, or with reputation tokens), so they should be more or less suitable for (2) with slight protocol changes, or otherwise just remain a measure to achieve (1).
At the same time, it’s not really clear how to make this work in the current protocol: a payment sender could pretend the payment is going to be fast, and there will be no way to punish them for their payment taking longer than expected, nor to cancel the payment exceeding the anticipated time.
However, if we end up figuring out how to charge for hanging payments, it might turn out that it will often make sense for the sender to open a dedicated channel to the receiver and spend on-chain tx fees instead of paying for the liquidity. Even though this would mean fewer people will use the pay-for-locked-liquidity feature, we might have to implement it just to observe this phenomenon.
Baseline cost
Currently, there is already a bonding requirement to jamming: an attacker must open an LN channel, which means locking some funds on-chain (opportunity cost) and paying on-chain transaction fees for opening/closing a channel.
The opening/closing cost is O(1), meaning it is the same no matter how many funds an attacker jams and for how long. Thus, this cost can be negligible for any serious attack (unless it involves opening many channels).
The opportunity cost, however, might be sufficient to protect against amount jamming: to jam X sats, an attacker would have to lock X sats (looping could optimize the attack ~20x).
For slot-jamming, the cost is much lower. For every channel, the number of payments flowing through it concurrently is limited to 483 (exceeding that is unsafe because a transaction closing this channel won’t be able to secure more than 483 inputs).
Thus, it’s sufficient for an attacker to lock 483*min-payment sats in a channel and jam a channel of any size with 483 minimal payments. (The attacker would also have to allocate some sats for routing fees. These fees are never actually paid, because the payments fail, but the funds must be there for the routing nodes to forward the payments.)
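To make the gap between the two costs concrete, here is a small back-of-the-envelope sketch. The channel size (0.1 BTC) and the 1-sat minimum payment are assumed values chosen for illustration, the function names are hypothetical, and routing-fee reserves are ignored.

```python
MAX_IN_FLIGHT = 483  # per-channel limit on concurrent payments, as cited above

def amount_jam_cost(target_sats: int, looping_factor: int = 1) -> int:
    """Sats to lock in order to jam `target_sats` of channel capacity.

    `looping_factor` models the roughly 20x saving from looping mentioned
    above; the default of 1 means no looping.
    """
    return -(-target_sats // looping_factor)  # ceiling division

def slot_jam_cost(min_payment_sats: int, slots: int = MAX_IN_FLIGHT) -> int:
    """Sats to lock in order to occupy every slot with a minimal payment."""
    return slots * min_payment_sats

capacity = 10_000_000  # assumed channel size: 0.1 BTC
min_payment = 1        # assumed minimum payment size, in sats

print(amount_jam_cost(capacity))                     # amount jamming, no looping
print(amount_jam_cost(capacity, looping_factor=20))  # amount jamming with looping
print(slot_jam_cost(min_payment))                    # slot jamming
```

Under these assumptions, amount jamming requires 10,000,000 sats (500,000 with looping), while slot jamming the same channel requires only 483 sats.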
Increasing slot limits (by @niftynei)
As we demonstrated earlier, the opportunity cost of slot-jamming any payment channel in the network depends on the slot limit, which is set by the LN protocol.
The protocol, however, could be changed to allow having more than 483 concurrent payments over the same channel. This can be achieved by:
- Modifying the LN on-chain transaction structure, for example via nesting (I can imagine a channel being closed by two non-conflicting transactions instead of one, which would allow doubling the limit).
- Modifying the underlying Bitcoin layer to support LN-related transactions with more than 483 inputs.
Both of these solutions require substantial effort from protocol researchers to analyze and implement, and the merely linear increase in the attack cost might not justify this amount of work.
Splitting slot limits in categories (by @niftynei)
We could have two different in-flight-payment limits/buckets for smaller and for larger payments.
The limit of 483 in-flight payments per channel doesn’t make sense for payments below dust, because they can’t be claimed on-chain anyway (although they are allowed by the spec). At the same time, counting those towards the same limit as above-dust payments is what makes it very cheap to slot-jam.
Thus, we could impose this limit only on above-dust payments, while applying some other limit to sub-dust payments.
After this change:
- It won’t be possible to slot-jam above-dust payment capabilities with sub-dust payments.
- It would be possible to slot-jam above-dust payment capabilities with above-dust payments. In other words, to disable routing of above-dust payments, an attacker would have to lock at least 483*min-dust-limit, at which point it might be cheaper for them to amount-jam rather than slot-jam.
- It would be possible to slot-jam sub-dust payments with sub-dust payments (according to a new sub-dust payment in-flight limit)
Thus, in practice this change only prevents slot jamming with payments below a certain size.
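The bucket idea can be sketched as two independent counters. This is only an illustration under stated assumptions: the 546-sat dust threshold and the sub-dust limit of 100 are hypothetical values (the proposal does not fix them), and the class name `BucketedSlots` is made up for this example.

```python
from dataclasses import dataclass

@dataclass
class BucketedSlots:
    dust_limit_sats: int = 546   # assumed dust threshold
    above_dust_limit: int = 483  # existing per-channel slot limit
    sub_dust_limit: int = 100    # hypothetical separate limit for sub-dust payments
    above_dust_used: int = 0
    sub_dust_used: int = 0

    def try_add(self, amount_sats: int) -> bool:
        """Reserve a slot in the bucket matching the payment size, if one is free."""
        if amount_sats >= self.dust_limit_sats:
            if self.above_dust_used >= self.above_dust_limit:
                return False
            self.above_dust_used += 1
        else:
            if self.sub_dust_used >= self.sub_dust_limit:
                return False
            self.sub_dust_used += 1
        return True

slots = BucketedSlots()
# An attacker floods the channel with 1-sat payments...
sub_dust_accepted = sum(slots.try_add(1) for _ in range(1000))
# ...but an honest above-dust payment still finds a free slot.
honest_payment_ok = slots.try_add(10_000)
```

Here only the sub-dust bucket fills up, so the attacker has to switch to above-dust payments (and lock correspondingly more funds) to block above-dust routing.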
Peer restrictions imposed on the previous hop
Another solution is monitoring inbound payment traffic and limiting it based on the (direct) peer it’s coming from.
A simple implementation of this idea is circuitbreaker, a plugin for LND by Joost. It can indeed protect against cheap and unsophisticated attacks, but it probably can’t do much more: because payments are anonymous, it’s hard to tell who is the source of failing payments.
Consider the following example: Alice and Bob each have a channel to Carol, and Carol has a channel to Dave.
Let’s say Dave trusts Carol based on some good reputation, thus no extra limit is applied.
Carol wants to protect her ability to forward through Dave, so she imposes a limit of 241 slots (roughly half of 483) on each of her inbound channels from Alice and Bob. Now Alice can’t prevent payments coming from Bob’s channel from being forwarded through Dave, and vice versa.
The same logic propagates to Bob’s inbound channels, but now let’s say Bob does want to open many channels. To prevent those channels from limiting each other, and from limiting Bob in forwarding to Carol, they have to share the total limit of 241. Note that if Bob refused to apply those limits on his inbound channels, the Bob-Carol channel would be easily jammed by one of Bob’s counterparties. In that case, despite Carol’s willingness to forward payments from Bob, she would have made it even easier to DoS the Bob-Carol channel. Thus, this solution only makes sense if Bob applies it as well.
Now let’s assume that the limits are not terribly low and are sufficient for Bob’s clients. Attack-wise, this means that jamming the Bob-Carol link would require opening many channels to Bob to take up space from the total limit of 241, leaving honest clients fewer slots. This ends up being a new, potentially even more powerful DoS opportunity.
Bob, of course, can adjust the slot limits on the attacker’s channels based on their good behavior (e.g., paid fees). That doesn’t work, because now Bob’s counterparties can affect Bob’s dynamic reputation towards Carol, and Bob can’t really prevent them from doing so. Bob also still can’t exceed the total allowance of 241 for those channels.
In the end, it becomes apparent that these restrictions work only if applied by the whole network. At the same time, the responsibility propagation across many hops is suboptimal.
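Carol’s side of the example above can be sketched as a per-peer counter. This is a simplified illustration and not circuitbreaker’s actual interface: the `InboundSlotLimiter` class and its methods are hypothetical names, and only the slot count is modeled.

```python
class InboundSlotLimiter:
    """Caps the number of in-flight payments accepted from each direct peer."""

    def __init__(self, per_peer_limit: int):
        self.per_peer_limit = per_peer_limit
        self.in_flight = {}  # peer name -> number of unresolved payments

    def try_forward(self, peer: str) -> bool:
        used = self.in_flight.get(peer, 0)
        if used >= self.per_peer_limit:
            return False  # this peer has used up its share of slots
        self.in_flight[peer] = used + 1
        return True

    def resolve(self, peer: str) -> None:
        """Call when a payment from `peer` settles or fails, freeing a slot."""
        if self.in_flight.get(peer, 0) > 0:
            self.in_flight[peer] -= 1

carol = InboundSlotLimiter(per_peer_limit=241)
# Alice tries to fill all 483 slots towards Dave with hanging payments.
alice_accepted = sum(carol.try_forward("Alice") for _ in range(483))
# Bob's payments are unaffected by Alice's behavior.
bob_can_still_route = carol.try_forward("Bob")
```

The sketch also shows the limitation discussed above: Carol only sees her direct peer, so if the hanging payments reach her through Bob, it is Bob’s allowance that gets consumed, whoever originated them.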
The idea behind upfront payments is to charge for all payment attempts rather than just successful ones. This can be done via a routing protocol change, under which nodes would require payment senders to pay upfront in one way or another.
The attractiveness of this solution probably depends on the payment failure rate in the LN as a whole, because requiring users to pay for failures may be very counterintuitive and make the UX poor.
On one hand, it might be the case that for LN to succeed, the failure rate has to be low enough to attract mainstream users with smoothness and really low latency (because retries take time). Thus, it might make sense to design protocols around this assumption.
At the same time, it’s challenging to maintain a low failure rate, since channel balances are not public. This can be addressed with large reserves and active liquidity management, but that results in higher routing fees.
Thus, the market may give rise to cheaper routing options with a higher failure rate.
Additionally, for the entire path to be low-failure, every hop must be low-failure (the path is as weak as the weakest component in it), which might be impossible for a given channel graph.
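The effect of per-hop failures on a sender who pays upfront can be estimated with a short calculation. This is a rough sketch under simplifying assumptions that are not part of any proposal: hops fail independently, every retry has the same success probability, and a fixed upfront fee (a hypothetical parameter) is charged per attempt.

```python
import math

def path_success_probability(hop_success_rates):
    """A payment succeeds only if every hop on the path succeeds."""
    return math.prod(hop_success_rates)

def expected_upfront_cost(upfront_fee_sats, hop_success_rates):
    """Expected upfront fees paid until the first successful attempt.

    With success probability p per attempt, the expected number of
    attempts is 1 / p (geometric distribution).
    """
    p = path_success_probability(hop_success_rates)
    return upfront_fee_sats / p

reliable_path = [0.99, 0.99, 0.99, 0.99]
weak_link_path = [0.99, 0.99, 0.5, 0.99]   # one unreliable hop

print(path_success_probability(reliable_path))
print(path_success_probability(weak_link_path))
print(expected_upfront_cost(1, weak_link_path))
```

A single unreliable hop roughly halves the success probability of the whole path in this example, and so roughly doubles what the sender expects to pay in upfront fees.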
– low-failure, more expensive routing may not be realistic
– for cheap non-low-failure routing, it may be preferable to have a more silent way of paying than spending sats on failed attempts (especially if the payment didn’t even finalize in the end)
– may incentivize routing nodes to fail payments unnecessarily in order to collect fees (it is unclear how the end-game economics of routing would work), or break incentives in other ways depending on the exact protocol
Reputation systems for payment senders
Instead of propagating the responsibility of maintaining DoS-resistance to the previous hop, routing nodes could make decisions based on the payment sender instead. While this is trivial assuming the payment sender is known, the sender’s privacy is currently protected by onion routing, and sacrificing that is probably not acceptable.
Stake certificates (by @gleb and @ariard)
Stake certificates enable anonymous credentials based on the ownership of on-chain funds. In other words, using resources of a routing node would require proving that a sender indeed has locked a sufficient amount of sats in a channel (where sufficient is defined by the routing node, for example requiring 1 BTC locked to send 1000 payments an hour).
More specifically, a routing node might require a certificate every time and keep track of the activity of a particular certificate owner.
It’s true that an attacker already has to lock funds, otherwise they won’t be able to even initiate payments, but having stake certificates could also:
– allow routing nodes to associate payments from the same source to detect attacks (which also could cause some deanonymization and censorship)
– limit the payments from the same source based on how much value their certificate covers.
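The rate-limiting side of this idea can be sketched as follows. This is a minimal illustration that leaves out the cryptography entirely: a real stake certificate would involve proving ownership of on-chain funds, whereas here the stake is simply passed in as a number. The `StakeRateLimiter` class, the certificate identifier and the one-hour sliding window are assumptions; the rate of 1000 payments per hour per 1 BTC is the example figure from the text.

```python
from collections import defaultdict, deque

SATS_PER_BTC = 100_000_000
WINDOW_SECONDS = 3600

class StakeRateLimiter:
    """Allows each certificate a number of payments per hour proportional to its stake."""

    def __init__(self, payments_per_btc_hour: int = 1000):
        self.rate = payments_per_btc_hour
        self.history = defaultdict(deque)  # certificate id -> recent timestamps

    def allowance(self, stake_sats: int) -> int:
        return stake_sats * self.rate // SATS_PER_BTC

    def try_accept(self, cert_id: str, stake_sats: int, now: int) -> bool:
        window = self.history[cert_id]
        # Forget payments that are older than the one-hour window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        if len(window) >= self.allowance(stake_sats):
            return False
        window.append(now)
        return True

limiter = StakeRateLimiter()
stake = 1_000_000  # 0.01 BTC, which gives an allowance of 10 payments per hour
accepted = sum(limiter.try_accept("cert-1", stake, now=0) for _ in range(12))
accepted_after_an_hour = limiter.try_accept("cert-1", stake, now=3600)
```

Keying the history by certificate is also what makes payments from the same source linkable, which is the privacy cost noted above.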
Reputation tokens (by @roasbeef)
An alternative reputation approach is for individual routing nodes to increase the bandwidth of individual senders based on their performance.
For example, a routing node may issue special tokens to a payment sender, so that those tokens can be later used to “pay” for frequent payments over that routing node.
Tokens may then be invalidated by the issuing routing node if the sender makes too many failing payments, or issued if the sender makes many successful payments and pays fees.
Combining the two
One of the challenges with reputation tokens is that it’s unclear how to distribute them initially.
At the same time, the disadvantage of stake certificates is that payments from the same source can be correlated.
Combining the two ideas might solve both individual problems. The flow might be:
1) A payment sender shows a stake certificate to the routing node and receives 10 tokens
2) A payment sender uses a token to make a payment.
2a) If succeeded, the sender gets 1 more token
2b) If payment fails, the token is invalidated
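The flow above can be sketched with a routing node that tracks a set of valid tokens. This is a toy model under several assumptions: certificate verification is reduced to a boolean, tokens are plain random strings rather than blinded credentials (so the issuer could still link them here), and a spent token is interpreted as always consumed, with one fresh token issued on success. The `Router` class and its methods are hypothetical.

```python
import secrets

class Router:
    INITIAL_TOKENS = 10  # tokens granted per stake certificate, as in step 1

    def __init__(self):
        self.valid_tokens = set()

    def _issue(self, count: int) -> list:
        tokens = [secrets.token_hex(16) for _ in range(count)]
        self.valid_tokens.update(tokens)
        return tokens

    def redeem_certificate(self, certificate_ok: bool) -> list:
        """Step 1: exchange a (pre-verified) stake certificate for tokens."""
        return self._issue(self.INITIAL_TOKENS) if certificate_ok else []

    def forward(self, token: str, payment_succeeds: bool) -> list:
        """Step 2: spend a token; returns any newly issued tokens."""
        if token not in self.valid_tokens:
            raise ValueError("unknown or already spent token")
        self.valid_tokens.discard(token)
        # 2a) success earns a fresh token, 2b) failure earns nothing.
        return self._issue(1) if payment_succeeds else []

router = Router()
wallet = router.redeem_certificate(certificate_ok=True)

token = wallet.pop()
wallet.extend(router.forward(token, payment_succeeds=True))
tokens_after_success = len(wallet)

token = wallet.pop()
wallet.extend(router.forward(token, payment_succeeds=False))
tokens_after_failure = len(wallet)
```

In this model an honest sender keeps a steady balance of 10 tokens, while a sender whose payments keep failing runs out after 10 attempts and needs a new certificate.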
– implementation is sophisticated
– understanding incentive-related issues may require substantial research and real-world experiments (e.g. the emergence of secondary markets for certificates and tokens)
Upfront payments or Reputation for senders?
Upfront payments and reputation systems seem to be two ways to move forward.
Fortunately, we don’t have to choose one of the two. Instead, routing nodes could specify how they want to get compensated for routing in their channel announcements (via bit flags).
As for comparing them, they have a lot in common:
- sats locked in a channel while joining the network represent stake, similarly to how certificate-related tokens represent stake
- sats are spent while paying for payment attempts upfront, similarly to how certificate-linked tokens are spent for attempts
The main disadvantage of reputation tokens is that they require a more sophisticated protocol to bootstrap them, and may require hurting privacy to be effective. What we get for that cost is a potentially more incentive-compatible protocol and somewhat better UX, because certificates and tokens are less valuable (or at least less liquid) than sats.
Although fully mitigating channel jamming while preserving the permissionless nature of the LN is probably impossible, we have ideas on making the attack harder.
Incremental ideas might be integrated by LN implementations or even node operators independently today, but they are limited. Long-term fundamental solutions require additional research and broader discussion before moving ahead with them.
Hopefully, this article helps to move forward with these ideas in a more efficient way by bringing some clarity to the design space.
I’m grateful to Sergei Tikhomirov, Lisa Neigut, Antoine Riard, Joost Jager and Clara Shikhelman for reviewing this article, and to all LN protocol devs contributing their ideas to solving channel jamming.