2024-11-18 17:57:17 GMT

Key Management is a Blocker

This is a long form article, you can read it in https://habla.news/a/naddr1qvzqqqr4gupzp978pfzrv6n9xhq5tvenl9e74pklmskh4xw6vxxyp3j8qkke3cezqqxnzdenxyenvdesxvmrvwp4j6mqr0

So I have this cool new product, which for about two weeks has been ready to release, if I could just solve one thing. I have recently moved away from storing user keys in my apps due to the ease with which they could be (and have been) put at risk. In doing so, I've discovered that despite its downsides, pasting your nsec into an app is a pretty straightforward operation which even non-technical people can pull off. In contrast, pretty much no other key management solution is.

To state the obvious, and to kick off this survey of nostr key management options: asking users to paste their nsec into your app is a bad idea. However good your intentions, this opens your users up to all kinds of attack vectors, including clipboard hijacking attacks, exposing keys to insecure communication channels, exposing keys to many different apps, supply chain attacks, XSS attacks, and yes, bugs that cause your software to send keys to analytics or error reporting backends.

The era of nsec-pasting is over.

I've committed to embracing the pain and removing nsec login from Coracle, and I encourage other devs to do the same. The sooner we treat key management with the urgency it deserves, the sooner we can come up with a secure and convenient key management solution.

As an aside, ncryptsec is a great innovation for securely transporting keys, but it still doesn't protect against exposure to apps that need to use keys. It has its place though; in fact I'm of the opinion that nsec and seed words should be deprecated, and support for them should be removed. Giving friendly names and human-readable representations to data that is essentially private is a really bad idea (unless you're memorizing your key). But I digress.

Signer Comparisons

Let's go through a few existing options for key management, and compare their relative merits. I've tried to list them in the order they appeared on the scene, which also helps to clarify the logic of how signers have evolved. Throughout, I will be focusing on what kinds of user experience each approach unlocks for non-technical users, since my goal is to build products that work for regular people.

Extension Signers

The first signer application (that I know of) was nos2x, by fiatjaf. As I understand it, this was a proof-of-concept of how users might protect their keys without releasing custody of them. And it works really well! In fact, even though there have been many forks and imitators, I still use nos2x when using nostr on my desktop browser.

Extension signers offer a great user experience, along a narrow happy path. Setting up a browser extension is a relatively familiar process for normal users, and once it's done you don't really have to think about it. In theory, extensions can also include their own onboarding process and key backup strategies as well, allowing users to get started in a single place. Plus, there's very little latency involved in making calls to the signer extension.

This positive experience breaks down quickly though once a user wants to use a desktop or mobile application. When this happens, users have to start over essentially from scratch. Nothing they did to set up the extension helps them move to another signer application.

While it's technically possible to use extension signers on mobile via e.g. the Kiwi browser, this doesn't work for native apps or apps installed as PWAs. Instead, you either have to revert to pasting keys, or use some other solution.

One slight permutation of extension signers is browser signers, like Spring. Instead of adding a signer to your browser, Spring allows you to install a browser that holds your keys and allows you to use any nostr web application. But this has all the same basic limitations that extension signers do.

Hardware Signers

Hardware signers came around shortly after extension signers. I'm not going to spend much time talking about them here, because although they're about as far along the spectrum towards security as you can go, they're also not very convenient. Non-technical users aren't going to onboard by buying (or building) a device which they have to connect to their desktop via USB whenever they want to sign a message. Hardware signers have their place, but onboarding isn't it.

The only hardware signer I'm aware of (although I'm sure I've heard of others) is from LNBits, and is usually used via a browser extension like horse. This of course means that it has all the same limitations that browser extensions have, and then some (although mobile and desktop apps would likely be able to find a way to talk directly to the signer).

Hosted Signers

Remote signers (aka "bunkers") use the Nostr Connect protocol (also known as NIP 46) for remote signing.
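
To make that a bit more concrete: as I read NIP 46, a request is a small JSON-RPC-like message (an id, a method, and a list of string params) which the client encrypts and wraps in a kind 24133 event addressed to the signer, and the signer answers with the same id plus a result or an error. The Python sketch below only shows the plaintext payloads. Encryption, event signing, and relay transport are left out, and the helper names (build_request, match_response) are made up for illustration.

```python
import json
import secrets

NIP46_KIND = 24133  # event kind used for both requests and responses


def build_request(method: str, params: list) -> dict:
    """Build the plaintext payload a client would encrypt and send to a signer."""
    # The random id lets the client pair a later response with this request.
    return {"id": secrets.token_hex(8), "method": method, "params": params}


def match_response(request: dict, response: dict) -> str:
    """Return the result for `request`, or raise if the response is unusable."""
    if response.get("id") != request["id"]:
        raise ValueError("response id does not match request id")
    if response.get("error"):
        raise RuntimeError(response["error"])
    return response["result"]


# Params are strings, so the event to sign is passed as serialized JSON.
unsigned_event = json.dumps(
    {"kind": 1, "content": "hello", "tags": [], "created_at": 0}
)
request = build_request("sign_event", [unsigned_event])
```

Every one of these round trips goes through a relay, which is where the latency discussed later in this article comes from.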

Hosted signers in particular are one example of a NIP 46 remote signer, which lives on "somebody else's computer". Because they use a legacy web architecture, they can be built to be very familiar and convenient to users. It's trivial to build a hosted signer that offers email/password login along with 2FA, password resets, session revocation, the whole shebang. But they have one fatal flaw, which is that they are custodial. This means that not only do users have to relinquish exclusive control over their keys, but hosted signers also can become a target for hackers.

Desktop Signers

Several projects exist which allow users to run their own bunker, on their own hardware. These include nostr clients like Gossip, as well as command-line utilities like nak. This approach is mostly an improvement over extension signers, because it widens the scope of applications that can conveniently access the signer from those that run in the browser to those that run on the desktop computer the signer lives on. The downside is that they have to communicate via relays, which either introduces latency or requires an additional component to be running locally.

While it's technically possible to use desktop signers to log in on other computers or mobile apps, I don't think that's going to be very practical for most people. Mobile apps by definition are more portable than regular computers. Anyone who wants to access their nostr account on more than one device will have to either set up separate solutions, or go with another kind of remote signer. This isn't a huge obstacle for people highly invested in nostr, but it's a significant amount of friction for a new user.

Mobile Signers

Mobile signers solve the problem introduced by desktop signers of not always having access to your signer (or of your signer not having access to you, due to being powered down or disconnected from the internet). Mobile devices are generally more available than desktop devices, and also have better push notifications. This means that users can approve signer requests from any device as easily as tapping a notification.

Mobile signers on Android can also upgrade their UX by taking advantage of NIP 55 to avoid the round trip to relays, reducing latency and making it possible to sign things offline. Amber has been a pioneer in this area, and other projects like Nostrum and Alby's nostr-signer have been prototyped.

To date, there unfortunately haven't been any signer applications released for iOS, which leaves the mobile signer story incomplete. In my opinion, this is probably the most promising solution for end users, although it's currently under-developed.

Web Signers

One interesting alternative that combines the benefits of hosted, desktop, and mobile signers is nsec.app. This is a web application frontend which keeps keys in the browser, so that they are never shared with a third party. Using web push notifications and a healthy sprinkle of black magic, nsec.app is able to respond to signer requests by opening itself in a browser window.

This works generally pretty well for desktop web applications, less well on Android, still less well for Android PWAs, and (to my understanding) not at all on iOS. Artur from nostr.band is working on these problems using a variety of approaches, one of which is embedding nsec.app in an iframe and communicating using postMessage.

This approach also makes it possible to sync keys between your phone and desktop, simulating a hosted UX by making them accessible from either location by signing in to nsec.app. This is done by encrypting user keys and storing them on the nsec.app server. In theory this should be secure, but it's something to consider.

I'm cautiously optimistic about this approach. If successful, it would enable a single brand to exist on every platform, which is important to reduce unnecessary configuration and cognitive overhead for users.

Multisig Signers

Another experimental approach is multi-sig. Promenade is a project by fiatjaf exploring this possibility. This would allow users to split their keys across different custodians and require all (or some majority of them) to approve an event signature before it would be valid.

The downsides of this are an increase in complexity (more moving parts for users to deal with) and latency (more parties to coordinate with to sign events). I'm also not clear on whether encryption is possible using multi-signature keys. If not, that would preclude not only existing direct messages (which will hopefully end up on MLS eventually anyway), but also things like private lists, mutes, and application settings. I think multi-signature signers are promising, but are definitely a long-term project.

Self-Hosted Signers

Coming nearly full circle, self-hosted signers are a special case of hosted signers, but, you know, self-hosted. These signers might live on a home server like a Start9 and be accessible for signer request approvals via tor, or they might live on a server run by the user (or an Uncle Jim). This would be an extremely convenient approach for anyone willing to deal with the complexities of hosting the infrastructure.

A good candidate for NIP 46 support might be AlbyHub, which is already one of the easiest self-hosted wallets to set up and use. Adding signer support to AlbyHub would allow users to have their wallet and nostr keys stored in the same place, and accessible anywhere either via the web interface or via AlbyGo.

Omniplatform Signers

This leads me to, finally, "omniplatform" signers. This isn't really a new architecture, but a combination of several. User choice is great, but nostr has a very tight complexity budget when onboarding new users. If a brand can manage to get new users set up with a very simple but sub-optimal solution, then grow them into a more complete integration into the nostr ecosystem, that would be a huge win.

I think Alby has a great shot at doing this, if it's something they want to prioritize. Bitwarden would also be a great candidate, since they already have apps on every platform, as well as a self-hosted option (Vaultwarden). If users could start with a mobile app, and incrementally set up a browser extension, self-hosted vault, and hardware signer as needed, that I think would be an ideal path.

Nostr Connect: broken, but promising

If you can't tell from the above comparison, I'm partial to NIP 46 as the best, most flexible way to build high-quality user experiences. Remote key management means a reduction in moving keys, hosting keys, and software installation and administration. If we can get users to the point where their keys live in only two places (their password manager and their signer), we'll be doing good.

There are however many ways to implement NIP 46. Implementing all of them in a single client (or signer) would be burdensome for developers, and introduce a lot of UI complexity to users. Here's a quick survey of flows that currently exist.

Signer -> Client

The simplest way to connect a client and a bunker is for a user to explicitly authorize the connection by copying a bunker:// URL from their signer application to their client. This allows the bunker to generate and validate a secret embedded in the URL without the client having to do anything other than pass it along in the initial connect request.
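
For a sense of what the client is actually handed, here is a minimal Python sketch of parsing a bunker link. It assumes the shape described in NIP 46 as I understand it: the signer's hex pubkey in the host position, one or more relay query parameters, and an optional secret. The BunkerPointer type and parse_bunker_url function are hypothetical names, and rejecting links with no relay is my own choice rather than something the spec mandates.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit


@dataclass
class BunkerPointer:
    signer_pubkey: str      # 64-char hex pubkey of the remote signer
    relays: list            # relays the signer listens on
    secret: Optional[str]   # passed back unchanged in the connect request


def parse_bunker_url(url: str) -> BunkerPointer:
    parts = urlsplit(url.strip())
    if parts.scheme != "bunker":
        raise ValueError("not a bunker:// URL")

    pubkey = parts.netloc.lower()
    if len(pubkey) != 64 or any(c not in "0123456789abcdef" for c in pubkey):
        raise ValueError("signer pubkey must be 64 hex characters")

    query = parse_qs(parts.query)  # also percent-decodes the relay URLs
    relays = query.get("relay", [])
    if not relays:
        raise ValueError("bunker URL has no relay to connect on")

    secret = query.get("secret", [None])[0]
    return BunkerPointer(pubkey, relays, secret)
```

The client never has to understand the secret. It only forwards it, which is why this flow asks so little of client developers.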

This is a great UX for people who know what they're doing, but isn't at all friendly to newcomers. Someone signing in for the first time isn't going to know what a bunker link is, and even if they do they're immediately confronted with the problem of picking a signer, setting it up, and finding out where in that app they can obtain a bunker link. This can be marginally smoothed out using things like protocol handlers and QR codes, but these won't apply in all (or even most) cases.

Client -> Signer

The reverse flow is similar. This relies on the user to explicitly authorize the connection by copying a nostrconnect:// URL from the client into the signer app. In technical terms, this requires one fewer step, since in NIP 46 the connection is always initiated by the client. In this case, the pasting of the URL replaces the connect request. The client listens for a response containing a client-generated secret embedded in the nostrconnect:// URL. This secret check isn't currently supported by all signer apps, some of which return an ack instead. This can result in session hijacking, where an attacker can intercept signing requests (although they can't do anything that would require the user's key, like decrypting messages).
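
The client side of this can be sketched in a few lines of Python. The relay, secret, and name query parameters follow my reading of NIP 46 at the time of writing, and since the spec keeps changing they should be treated as assumptions. The function names are hypothetical.

```python
import hmac
import secrets
from urllib.parse import urlencode


def make_nostrconnect_url(client_pubkey: str, relays: list, name: str) -> tuple:
    """Return (url, secret). The client keeps the secret to check the reply."""
    secret = secrets.token_hex(16)
    query = [("relay", relay) for relay in relays]
    query += [("secret", secret), ("name", name)]
    return f"nostrconnect://{client_pubkey}?{urlencode(query)}", secret


def is_valid_connect_response(result: str, expected_secret: str) -> bool:
    """Accept the session only if the signer echoed our secret back.

    A bare "ack" proves nothing about who answered, so it is rejected here.
    """
    return hmac.compare_digest(result.encode(), expected_secret.encode())
```

Rejecting "ack" will break compatibility with signers that don't echo the secret yet, so a client has to decide whether to accept that tradeoff or warn the user instead.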

While at first glance nostrconnect seems functionally identical to bunker links, the UX has the potential to be much better. The reason for this has to do with how people use which devices, and where a client or signer application is most likely to be run. This requires making some assumptions, but in my mind the most common scenario is that a user will want to host their signer on their phone, since that is the device that is most universally available for authorizations (apart from an always-online hosted signer on the open internet). In other words, users generally have their phones with them when they're using their computer, but often don't have a desktop available when using their phone. This idea is validated by (for example) the prevalence of SMS-based 2FA, which assumes the presence of a phone.

Assuming the signer is on the user's phone, QR-scan flows for client authorization make a lot more sense if the client is the one generating the link, since they can simply scan a code generated on another device with their camera, or copy/paste or use a protocol handler for a client on the same device. In contrast, when using a bunker link users might find themselves in the awkward position of having to copy a link from their phone to their desktop. Whether this is done via QR code or by sending yourself a link via DM/text/email, it's an awkward flow most people aren't really prepared for.

Auto-Connect

Some enhancements have been made to the bunker flow which allow clients to send an initial connect request without asking the user to copy links between apps. These allow clients to do away with opaque magic strings entirely and provide the idealized "just one click" flow. However, after trying to make this flow work over the course of a couple weeks, I've come to the opinion that the additional complexity involved in automating the flow just isn't worth it.

There are a few variants of this "auto-connect" flow:

  • Signer NIP-05: Signers can register a NIP 05 address for a user's pubkey on their domain, allowing users to enter their address rather than their pubkey on login. Unfortunately, this address has no relation to their actual NIP 05 address, which can result in a lot of confusion.
  • User NIP-05: To solve this problem, fiatjaf has proposed a new version which allows users to enter their own NIP 05 at login instead of the one provided by the signer. The client would then look up the user's 10046 event and follow the signer pubkey listed there.
  • Nostrconnect handler: Signers may publish a NIP 89 handler which includes a handler URL that clients can send nostrconnect URLs to. This isn't currently specified anywhere, but it is supported by nsec.app. This bypasses the NIP 05 address requirement entirely, allowing users to simply pick a signer and click a button.
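
The first two variants both start from a NIP 05 lookup, which can be sketched as follows. This is a minimal Python example assuming the standard /.well-known/nostr.json endpoint and its "names" map. The network fetch is deliberately left out, and the function names are hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def nip05_lookup_url(address: str) -> tuple:
    """Split 'name@domain' into (well-known URL to fetch, local name)."""
    name, sep, domain = address.strip().lower().partition("@")
    if not sep or not name or not domain:
        raise ValueError("expected an address of the form name@domain")
    url = f"https://{domain}/.well-known/nostr.json?name={quote(name)}"
    return url, name


def pubkey_from_nostr_json(doc: dict, name: str) -> Optional[str]:
    """Pull the hex pubkey for `name` out of a parsed nostr.json document."""
    return doc.get("names", {}).get(name)
```

Under the User NIP-05 variant, the client would then query relays for that pubkey's 10046 event to find the signer, so a missing entry at either step means falling back to another login method.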

Each of these flows has its own strengths and weaknesses, but all of them share a dependency on some external source of truth for routing a user to the correct bunker.

In the first case, this is done by remembering the NIP 05 address assigned by the signer, which relies on DNS and on users to not forget which address they're using.

In the second case, this is done by relying on the user having done a significant amount of configuration (setting up a NIP 05, adding it to their kind 0, and having published a 10046 event) which may or may not exist. This forces clients to gracefully degrade to some alternative login method anyway, and adds UX friction since users have to choose which interface will work for them.

The final method bypasses the need for users to remember their NIP 05 address, but it does require either the client or the user to select a trusted signer. If poorly implemented, this could result in users choosing an untrustworthy signer on signup (risking their keys), or the wrong signer on login resulting in a broken session.

For all these reasons, I've opted to go with the vanilla bunker/nostrconnect flow, which allows me to display a simple interface to users. Presenting a QR code without comment assumes that users know what to do with it, but the benefit is that it makes explicit the signer selection step which the auto-connect flows try to paper over. This is actually a good thing, because instead of using heuristics like addresses or lists of signers presented by a client to make the decision, users can choose based on which app they actually have installed, which is a richer mnemonic device.

Making NIP 46 Work

The bottom line here is that while NIP 46 is the best baseline for signer support, it doesn't currently work very well at all. There are a variety of reasons for this:

  • The specification itself isn't clear, and is constantly changing. This leads to incompatibilities between apps and signers (or explosive complexity in trying to handle every case).
  • Extensions to the basic bunker flow (both in terms of signer implementation and signer discovery) are worth researching, but each one creates another dimension of possible incompatibility. Signers will be incentivized to support every possible login flow, creating complexity for users and increasing attack surface area. Clients will have to implement fallbacks to their preferred signup flows, again resulting in UX complexity.
  • Clients don't currently deal well with latency. In order for NIP 46 to work smoothly, clients will have to implement better loading, debouncing, optimistic updates, publish status, and "undo". There are downsides to this, but many of these features end up being built by mature software products anyway, so supporting these patterns may actually improve rather than degrade UX.
  • There's currently no easy and secure way for users to store keys in a single signer which they can access anywhere. This means that users have to set up multiple bunkers depending on where they're sitting, or resort to alternative login methods like NIP 07 or 55. These are great upgrades, since they reduce latency and bandwidth use, but shouldn't be required for new users to learn.
  • There's no unified experience across platforms. If a user signs up on their desktop, how do they safely transfer their keys to their Android signer app? If they're given seed words, how can they import them as an nsec? Consensus on best practices would be an improvement, but I think only a unified UX across platforms for a single signer can really solve this.
  • As nice as it might be to bypass app stores and built-in push notifications, shunning traditional platforms drastically increases the friction for users. To my knowledge, no signer app currently exists in traditional app stores, or supports built-in push notifications. If we want nostr to be accessible to non-technical folks, we can't ask them to start by downloading Obtainium or zap.store and a UnifiedPush distributor for their platform.
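
To illustrate the latency point in the list above, here is one way a client might combine optimistic updates, publish status, and "undo". It is a sketch under my own assumptions (a hypothetical Outbox class with a fixed undo window), not a description of how any existing client does it. The current time is passed in explicitly so the logic is easy to test.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # shown in the UI, not yet sent to the signer
    SIGNING = "signing"      # handed off, waiting on the remote signer
    CANCELLED = "cancelled"  # the user hit undo in time


@dataclass
class Draft:
    content: str
    created_at: float
    status: Status = Status.PENDING


class Outbox:
    def __init__(self, undo_window: float = 5.0):
        self.undo_window = undo_window  # seconds the user has to change their mind
        self.drafts = []

    def add(self, content: str, now: float) -> Draft:
        """Record a draft so the UI can render it immediately."""
        draft = Draft(content, now)
        self.drafts.append(draft)
        return draft

    def undo(self, draft: Draft) -> bool:
        """Cancel a draft. Only possible before it has gone to the signer."""
        if draft.status is Status.PENDING:
            draft.status = Status.CANCELLED
            return True
        return False

    def due(self, now: float) -> list:
        """Mark drafts whose undo window has passed as SIGNING and return them."""
        ready = [
            d for d in self.drafts
            if d.status is Status.PENDING and now - d.created_at >= self.undo_window
        ]
        for d in ready:
            d.status = Status.SIGNING
        return ready
```

The undo window hides part of the signer round trip behind a delay the user already expects, which is one reason these patterns can improve UX rather than degrade it.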

As I mentioned above, I don't think NIP 46 will ever be the only solution for signers. But I do think it's a great baseline on which to build a kind of "progressive enhancement" approach. For example, clients should support at least nostrconnect/bunker links, and encourage users once they've logged in to upgrade to NIP 55 or NIP 07 signers. Signers should exist in the mainstream app stores and use native push notifications, with an option to install elsewhere or opt in to UnifiedPush.
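
The "progressive enhancement" idea can be expressed as a small decision function. In this hypothetical Python sketch, the method labels and the preference order (NIP 55 ahead of NIP 07, both ahead of NIP 46) are assumptions of mine. How a client detects what is available on the device is out of scope.

```python
from typing import Optional

# Lower-latency, on-device methods first. NIP 46 is the baseline fallback.
PREFERENCE = ["nip55", "nip07", "nip46"]


def suggest_upgrade(current: str, available: set) -> Optional[str]:
    """Return a better signer method to offer the user, or None to stay put."""
    for method in PREFERENCE:
        if method == current:
            return None  # nothing ranked above the current method is available
        if method in available:
            return method
    return None
```

For example, a user logged in over a bunker link on a desktop with a signer extension installed would get suggest_upgrade("nip46", {"nip07"}) == "nip07", while a user with nothing else installed just stays on NIP 46.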

The goal here is to balance user experience and security. The number one rule for this is to reduce attack vectors for obtaining user keys. This points to (ideally) a single non-custodial signer, easily accessible to the user, and a simple protocol for using that signer from any app. Progressive enhancement is fine, but we should always be able to fall back to this baseline.