pleroma.debian.social

How Decentralized Is Bluesky Really? https://dustycloud.org/blog/how-decentralized-is-bluesky/

A technical deep-dive, since people have been asking me for my thoughts. I'll expand a bit on some of the key points here in a thread. 🧵

First of all, before I say anything else, my goal here is NOT to be mean to Bluesky's devs. I know there's a lot of fediverse-Bluesky rivalry, but I have enormous respect for Jay Graber and her team and I know they believe in their vision!

This started because I got some very kind encouragement from @bnewbold to write something. I'm trying to be technical in my analysis, not unkind. I hope that can be recognized, really and truly.

That said, let's get to the summary: Bluesky / ATProto are not decentralized or federated, according to my analysis.

However, the "credible exit" goal is worth pursuing, and does use decentralization techniques! But it is not decentralization/federation without moving the goalposts on those terms.

Furthermore, I think Bluesky is providing something valuable: a lot of people are trying to leave X-Twitter *right now* because it has become a completely toxic place.

The fact that Bluesky's team has managed to scale to receive such users is incredible, nearly feeling miraculous.

On the fediverse we also see a lot of accusations of Bluesky being owned by Jack Dorsey, and this isn't true. My understanding is that Jay performed an impressive amount of negotiation to allow Bluesky to receive funding independently.

These days Jack Dorsey is instead focusing on Nostr, which I can only describe as "a sequel to Secure Scuttlebutt with extremely bad vibes where bitcoin people talk about bitcoin"

I participated a bit in the process back when Bluesky was Jack Dorsey and Parag Agrawal's personal project. I also believe Jack and Parag were sincere about Bluesky as a decentralized social network protocol that Twitter would adopt, which is the directive that Bluesky was given as an organization.

When Jay Graber was awarded the position to lead Bluesky, I was not surprised. To me, Jay was the obvious choice to deliver what Bluesky was being directed to do, and I do think Jay is an excellent leader.

There is also something which Bluesky gets right which the fediverse does not. I mentioned that Bluesky uses decentralization *techniques*, and the most important of those is content-addressing. This allows content to exist even when a server goes down.
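To make the content-addressing idea concrete, here is a minimal Python sketch. It is only an illustration: the `Server` class, the `fetch` helper, and the plain SHA-256 address format are assumptions made up for this example, and ATProto's real record format is more involved than this.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive the address from the bytes themselves (plain SHA-256 here)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class Server:
    """Hypothetical host that stores blobs keyed by their content address."""

    def __init__(self, name: str):
        self.name = name
        self.online = True
        self.blobs = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self.blobs[address] = data
        return address

    def get(self, address: str):
        # An offline server answers nothing.
        return self.blobs.get(address) if self.online else None


def fetch(address: str, servers):
    """Return the first copy whose hash matches the requested address."""
    for server in servers:
        data = server.get(address)
        if data is not None and content_address(data) == address:
            return data
    return None


origin, mirror = Server("origin"), Server("mirror")
post = b"hello from a content-addressed post"
address = origin.put(post)
mirror.put(post)

origin.online = False  # the original host goes down
recovered = fetch(address, [origin, mirror])
```

Because the address is derived from the content, any host holding the same bytes can serve them, and the reader can check the hash rather than trusting the host.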

This is a great decision and I have advocated that the fediverse do so as well. In fact several years ago I wrote a demo in @spritely's early days showing off how one could build a content-addressed ActivityPub in a spec-compatible way.

So I have opened here with the things that Bluesky does well. As you may guess, we are about to move into critiques territory, and it's a lot of critiques from a *decentralization*/*federation* perspective. It doesn't erase the "credible exit" goals, which I think are good still.

Let's dive in...

A frequent way of describing Bluesky's decentralization, including by Bluesky's team, is "it's like a bunch of blogs (Personal Data Stores), and then the relay/appview/etc pieces are like search engines"

This is a reasonable starting point for thinking about things, so let's run with it.

In fact ATProto's own tutorial even says "Think of our app like a Google": https://atproto.com/guides/applications

And indeed this is a good way to think about things. But it doesn't seem so bad, because we have Personal Data Stores like blogs, so probably things are fine, right?

While most people would argue that blogs and websites are open, few would argue that *Google* is open. So this is a curious place to begin thinking, and yet structurally, it is actually quite apt.

PDS'es are like blogs, the rest is like Google. But relays/appviews/etc do a lot *more* than Google.

Relays, AppViews, etc don't just index information. Blogs and their interactions are generally slow-moving, but social media is direct and responsive. Notifications and fast interactions are key. So search engines, yes, but we should also think of these components as doing much more.

But let's stay on this blog/search engine analogy for a while before we unpack what it means on a *technical* level, which is interesting. Let's analyze for the moment from a power dynamics level.

Building a web search engine is actually pretty easy these days, you can do so with off-the-shelf tools. And yet there are only a couple of search engines *really*, Google and Bing (DDG mostly uses Bing). And yet the information is right there. *Anyone* could run their own engine. Why don't they?

Furthermore there is an interesting connection between blogs and social media: the death of blogs + feed aggregation directly aligns with the rise of social media.

How many of you were around for the birth and awkward death of blog engine feeds? Because I was! Oh, remember Google Reader?

Feed readers are also simple, and in fact they were even easy to self host, even on the desktop! But Google Reader came in and was such a good design that everyone used it.

When it went away, blogs were still *there*. But blogging as a *syndication medium* died. One big player left, and it's gone.

This was sad for me especially; my favorite medium on the internet ever was webcomics. Webcomics still exist, sort of, but the loss of independent publishing and aggregation meant that they had to change to survive.

The shape of webcomics started to get shaped to the shape of Twitter's image box.

This may seem like an enormous aside, but it isn't. The big sell currently is that "you don't need to run a relay because you can run your own PDS!" but as I have illustrated here, the distribution and syndication power dynamics matter a lot.

So. It isn't enough to self-host your own PDS. Whether or not people can run their own relays/appviews/etc actually matters *a lot* if we want this stuff to survive.

So, can we? How hard is it to run your own AppView/Relay/etc?

Today, there is only one real organization running a Relay that really matters or an AppView that people use for anything other than fun aggregation of statistics. Nothing that resembles meaningful decentralization of the network. It's all run by one company: Bluesky.

But could we change that?

People are trying; most notably alice has done some great work recently: https://alice.bsky.sh/post/3laega7icmi2q

So now someone *can* run their own Relay (not the AppView yet, but maybe soon), and we're getting a sense of the cost and scale. This is good news; we didn't know before.

In fact we also have an idea of the rate of growth. Approximately 4 months prior, @bnewbold.net posted an article detailing how to run a Bluesky relay: https://whtwnd.com/bnewbold.net/entries/Notes%20on%20Running%20a%20Full-Network%20atproto%20Relay%20(July%202024)

This is great. We need more people trying to do so to get a sense of how decentralized things can be.

Just focusing on storage, in July @bnewbold.net estimated the amount of storage expected to run a Bluesky relay is approx 1 terabyte. In just 4 months at start of this month (November), alice estimates nearly 5 terabytes.

This is a fast growth rate and this is *before* the big post-election influx.
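As a rough back-of-the-envelope check on that growth rate, here is a small Python sketch. The two storage figures come from the estimates above (with "nearly 5 terabytes" rounded to 5); the idea that growth compounds at a steady monthly factor is purely an assumption for illustration, not something either estimate claims.

```python
start_tb = 1.0   # July estimate
end_tb = 5.0     # November estimate, "nearly 5 terabytes" rounded up
months = 4

# Two simple readings of the same two data points.
linear_tb_per_month = (end_tb - start_tb) / months
monthly_factor = (end_tb / start_tb) ** (1 / months)


def projected_tb(months_ahead: int) -> float:
    """Projection IF the same compounding factor kept holding (an assumption)."""
    return end_tb * monthly_factor ** months_ahead


print(f"linear reading: +{linear_tb_per_month:.1f} TB per month")
print(f"compounding reading: x{monthly_factor:.2f} per month")
print(f"four more months at that factor: {projected_tb(4):.0f} TB")
```

Read linearly, that is about 1 TB added per month; read as compounding growth, it is roughly 1.5x per month, which would mean another fivefold increase over the next four months if it held.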

I tried estimating how much this would cost; as a lazy approximation I dumped a 5 terabyte machine into seeing what Linode would cost to self-host, and it was approximately 55k a year: https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q

That's a lazy estimate, but that's also what many people make in the US every year

However @bnewbold.net pointed out, correctly!, that there were cheaper options available. If we used even Linode's block storage, it would be cheaper (but still expensive) for the storage component, and this is true https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q

In fact @bnewbold and alice had gotten the server down to just close to $200/month in their estimate, much much cheaper than I had, by choosing a dedicated server plan. Much cheaper!

But there's a problem though; that's cheap because you've got a server that has a dedicated disk...

Even if we look at the dedicated hosting provider that @bnewbold provided in June and scale the cost to the pre-election storage requirements, we are adding on a massive amount of cost every month, over $400/month more.

4x 7.68 TB SSD is +$414.20/month on the original dedicated storage example

But worse, we have reached the limits of what is possible to do with a dedicated server. We *have to* move to abstracted storage from this point forward because we're starting to hit the limits of what's offered for cheap dedicated storage on one machine. And this number will only grow, and as said previously, is growing at an enormous rate.

I have spent a lot of time focusing on the cost of storage, but storage is only one cost required. These estimates have been done so far against servers that *nobody is actually using*. The cost of servers that people are using will be much higher, because more needs to happen than just store things.

And that is not even to mention the challenges with administrating, dealing with takedown requests, illegal content, etc, which are probably much more serious.

Let's take a break, the analysis of server costs is boring and I don't like doing it, and I'm sure people will throw numbers at me of the absolute race-to-the-bottom hosting numbers they can find to store and run all this stuff, but really that's not interesting to me.

Let's do a comparison.

Remember that the idea of "fully self-hosting" on Bluesky/ATProto at this point is primarily abstract; nobody is really doing it. But of course there's a place where tens of thousands of people are running their own servers for millions of users, and that's the fediverse/ActivityPub.

As said, tens of thousands of people are self-hosting *today*. Fediverse software doesn't just scale up, it scales *down*.

GoToSocial is cheap enough on resources that you can run it for family and friends on a Raspberry Pi or spare laptop you have sitting around.

Now you're hitting the point in this thread where some of you may be thinking "aha! this is where Christine is saying that the fediverse/activitypub are awesome and atproto is terrible!"

you have NO IDEA HOW MUCH I CRITICIZE THE FEDIVERSE ALL THE TIME, I do it all the time, and will later here

The fediverse has a lot of flaws. Oh trust me, we're gonna get to that.

But comparison-wise: what I mean to say is that architectural decisions matter, and scaling up isn't the only thing that's important, *scaling down matters too*.

If you care about decentralization, anyway.

Now look, we're about 1/3 of the way done here, there's a lot more to say, and a lot more said in my article, it's about 24 pages long if you print it out.

This is because in the age of TikTok I somehow have decided to model myself after David Foster Wallace, sorry

"Consider the Fediverse" I guess

But now, I will break for lunch. Enjoy your intermission because I will be back. We still have to get through the remaining 2/3 of the analysis, after all.

======= LUNCH BREAK HERE =======

Okay I am back from lunch, time to resume my analysis thread for "How decentralized is bluesky really?" https://dustycloud.org/blog/how-decentralized-is-bluesky/

I have been receiving a lot of notifications, I am not reading any of them until I finish with this so bear with me, BEAR WITH ME, we're gonna make it through

And before we make it any further can I say that I watched a nice medley of David Bowie and Cher singing, and it was so lovely https://www.youtube.com/watch?v=KPlN8RBP-Ws

@mlemweb said "of course it's very heteronormative despite having two queer coded icons on the stage" and ISN'T THAT THE WAY I guess

But where was I? Oh yes. We had talked about why PDS'es aren't enough (blog/google analogy), relative costs of hosting things on ATProto vs ActivityPub, etc etc

But we haven't gotten into the really interesting parts which are the structural analysis stuff, so let's move onto that

Now you may be saying, "Christine, this is really unfair, because you're looking at ActivityPub servers which are only dealing with a small amount of the network, what if it were an ActivityPub mega-node? What are the costs THEN huh?" and "What if we hosted just PART of ATProto?"

What then INDEED

ATProto is not designed for the Relay and AppViews to only hold part of the network, not *really*, and ActivityPub is. We'll get to this in a moment.

But Bluesky actually has good justification for this! I will defend it insofar as Bluesky was making a serious *design decision*

Remember the directive that Bluesky was given: develop a decentralized protocol which Twitter can adopt. That informs a lot of things, and has meant that Bluesky was really very ready for this moment!

If you're an ex-X-Twitter user then by god, you're going to be amazed! It's just like Twitter!

BY THE WAY if you are reading this, I am dual-posting here and on Bluesky simultaneously, dual-wielding social media "platforms" https://bsky.app/profile/dustyweb.bsky.social/post/3lbkkkj5mhs24

Why not follow the same thing in two places at once I guess

This informs some other things:
- Bluesky's gotta scale BIG and do so FAST (scaling down: not a priority at all)
- It has to be something Twitter can adopt (of course, not anymore, but initially)
- Everything on ATProto is public (yes, everything, including your blocks btw, we'll get to that)

But here's the other thing. People have trouble with the fediverse! All those decentralization decisions get in the way, my god, you've got to choose a server, search doesn't work well (actually it could but it's a cultural thing, different topic), and worst of all:

Sometimes you DON'T SEE REPLIES!

Actually all these critiques of the fediverse are TRUE, these are known challenges, and actually it's not really so bad, but it could be better, and at any rate, Bluesky made a major decision to simplify a lot for new users, and they have. Things seem to just work for people! Incredible!

The thing you often see thrown around is "it's amazing, I had no idea a decentralized protocol could just work like that! How on earth did they solve that in a decentralized system and so FAST too!"

It's simple: all those things "just work" because Bluesky is centralized.

Now yes, they are using decentralized techniques. Remember when I said content-addressed storage is a good idea and the fediverse should do it too? IT IS! (And as I also said, it's actually fully possible for the fediverse to do, more on that later.)

But the reality is, it's still *centralized*

In every meaningful way from a power dynamics perspective *EXCEPT* the category of "credible exit" (which I am saying and agreeing is a good idea!) Bluesky is centralized.

MAYBE another big corporation could come along and host all this stuff but that's adding a Bing to our Google

Yes, you can host your own PDS. You can also host your own blog. But try hosting your own PDS and NOT hosting a relay or AppView and you can't do much.

Blogs are decentralized, Google is not.
PDS'es are decentralized, Bluesky is not.

We're getting to the point where we get to why I'm so damn frustrated about this and have been biting my tongue until it nearly comes detached from my mouth: users THINK Bluesky is decentralized because they're TOLD Bluesky is decentralized

AUGH! *That's* what drives me nuts.

Here's an example of this problem in action

fry69: "The working search box was the second thing that impressed me on Bluesky, I thought that was not possible with a decentralized model"

Sorry fry sixty-nine I regret to inform you the reason search works so well is that it's centralized! THAT'S WHY

So hold on, let me set some terms for "decentralization" and "federation" that I think are reasonable.

> Decentralization: the result of a system that diffuses power throughout its structure, so that no node holds particular power at the center.

Pretty reasonable. Do you agree? I hope so!

Okay how about "federation" now because this is a *technical term* that the *fediverse has established* and I'm kinda PO'ed about the goalposts being moved on this one.

A lot of people coming to Bluesky have never heard of "federation" before in a social network so listen up this is important!

Here's my definition of federation:

> Federation: a technical approach to communication architecture which achieves decentralization by many independent nodes cooperating and communicating to be a unified whole, with no node holding more power than the responsibility or communication of its parts.

@cwebber Well...Baran had a graphic for that.

Now historically, federation has been achieved on the fediverse via "message passing". Actually, this is to the degree where I just always associated message passing with federation, but really, federation is about the distribution of power, creating an abstract whole in a sea of autonomy.

Maybe there is another way to achieve federation, but it's about the power dynamics. It's a technical immersion of power dynamics, the flow and interchange of cooperation between many parts.

So you may say, well, doesn't ATProto have that? After all, messages flow through the different parts!

ActivityPub, as it turns out, follows the actor model of computation. Okay, many people implementing the fediverse don't know about the actor model aspect of ActivityPub but I am here to tell YOU, dear reader, that it is an important thing, not a detail

I'll make one more note about federation, which is that the message passing mechanism of the fediverse is itself often called "federation". Theoretically another mechanism could exist, but I'm actually not so sure of that.

There's a reason the actor model and the lambda calculus are undying

Oh god Christine said "the lambda calculus" did you know she's into lisp and functional programming, what's she going to talk about next monads?!?!

I am not going to talk about monads. Not TODAY

But we do need to get a better architectural idea of how these systems work because it matters a lot!

@cwebber your spooky Halloween name next year should be Christine "Leibniz"-Webber

So let me introduce two models of communication which we can use to analyze these two systems. It's important!

- Fediverse/ActivityPub: "message passing"
- Bluesky/ATProto: "shared heap"

Okay, cool, terms established, let's talk about them and why they matter because they matter A LOT

"Message passing" is what ActivityPub uses. It's "like email", people say, and that's true.

Actually it's even a lot like physical mail. You write a letter, you say where it should go, it gets delivered to your house.

Message passing. The world runs on it.

Now I can use message passing to send a message to you *directly* and indeed, that's "like email". For one-to-one correspondence, that's enough.

But it's not enough for a followers/following type mechanism. But we can build it on top! Thank *you* computational abstractions!

On top of "message passing" we will build "publish-subscribe" as a second-layer abstraction

"Your ideas are interesting and I'd like to subscribe to your newsletter."

You send me a letter saying you'd like to hear the things I have to say, okay, you're part of the reader list. That's how it works.

On top of that we can build even more abstractions and the net result is that this is how federation works in pretty much every "federated" system I know.
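The layering described above (directed messages first, publish-subscribe built on top) can be sketched in a few lines of Python. This is a toy model, not ActivityPub itself: the `Network`, `Actor`, and `Message` names and the "follow"/"post" message kinds are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    recipient: str
    kind: str
    body: str = ""


class Network:
    """Hypothetical postal service: delivers each message to one address."""

    def __init__(self):
        self.actors = {}

    def register(self, actor):
        self.actors[actor.address] = actor
        actor.network = self

    def deliver(self, message):
        self.actors[message.recipient].receive(message)


class Actor:
    def __init__(self, address):
        self.address = address
        self.inbox = []
        self.followers = set()
        self.network = None

    def send(self, recipient, kind, body=""):
        # Layer 1: plain directed message passing.
        self.network.deliver(Message(self.address, recipient, kind, body))

    def receive(self, message):
        if message.kind == "follow":
            self.followers.add(message.sender)  # "add me to your reader list"
        else:
            self.inbox.append(message)

    def publish(self, body):
        # Layer 2: publish-subscribe is just one directed message per follower.
        for follower in sorted(self.followers):
            self.send(follower, "post", body)


network = Network()
alice, bob, carol = Actor("alice"), Actor("bob"), Actor("carol")
for actor in (alice, bob, carol):
    network.register(actor)

bob.send("alice", "follow")
alice.publish("hello, subscribers")
```

Note that carol, who never subscribed, receives and stores nothing. Each participant only handles the mail addressed to them.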

ActivityPub does some extra work to help you see replies on a thread, think "letters to the editor". This is a bit lossy sometimes though

It's true that sometimes users click over to a thread and see some replies but not all on their instance's UI. There's things that could be done to improve it, but it's sometimes mildly confusing, but not so bad, and you can click over typically to see whatever else is happening, and people learn to

I actually think this is improvable but I mostly don't care because this isn't as big a complaint as people tend to think it is on the fediverse, the other concerns like "what instance do I pick" tend to be bigger and "oh no my server went down"

That can be improved, we'll talk about that later

So okay, the federation is "message passing" and like email, or physical mail. You have an idea how it works.

Now we need to get to that other thing, a "shared heap" architecture. What on earth does that mean?

If "message passing" is like "mail comes to your house", a "shared heap" system works differently

In a "shared heap" system, all the mail gets dumped at the post office, and in the most naive version, you go over there and read through every single piece of mail to see which one is relevant to you

There is no "directed delivery" in a "shared heap" system, which means you are stuck with two things: either a "god's eye view" (Bluesky) or "even lossier about replies than ActivityPub" (Secure Scuttlebutt/Nostr)

The Bluesky approach to the "shared heap" is that *everything* goes into the big, centralized shared heap. Bluesky takes a "god's eye" view: it knows everything, and so knows what all your replies are, and can give you perfect search.

Secure Scuttlebutt / Nostr... well long story. Lossier, I'll say

You can imagine the physical world version of "message passing" already because you already live in this world. Messages come to your house or apartment building or whatever

For Bluesky's "shared heap" architecture, you'd have to build a whole addition to your house for everyone's mail

That's exactly why running a Relay or AppView is expensive: you're building an addition to your house for all the world's mail.

Eeep! That ain't cheap. That's why I'm saying: decentralization also means the ability to *scale down*.
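For contrast with message passing, here is a toy Python sketch of the naive "shared heap" described above. The `SharedHeap` class and its record layout are assumptions invented for the example; real Relays and AppViews use indexes rather than a linear scan, but they still have to ingest and store the whole heap to build those indexes.

```python
class SharedHeap:
    """Hypothetical pile of everyone's records, with no directed delivery."""

    def __init__(self):
        self.records = []

    def append(self, author, text, reply_to=None):
        record_id = len(self.records)
        self.records.append(
            {"id": record_id, "author": author, "text": text, "reply_to": reply_to}
        )
        return record_id

    def replies_to(self, post_id):
        """Naive lookup: read every record to find the relevant ones."""
        scanned = 0
        found = []
        for record in self.records:
            scanned += 1
            if record["reply_to"] == post_id:
                found.append(record)
        return found, scanned


heap = SharedHeap()
root = heap.append("alice", "first post")
for i in range(1000):
    heap.append(f"stranger{i}", "unrelated chatter")
heap.append("bob", "nice post!", reply_to=root)

replies, scanned = heap.replies_to(root)
print(f"found {len(replies)} reply after reading {scanned} records")
```

Finding one reply means holding and reading everyone's mail, which is the cost that grows with the size of the whole network rather than with the size of your own corner of it.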

@cwebber didn't Jack pick up his toys and leave after they insisted on putting in proper moderation tools? I thought he was fully divested now

@twipped yes I said that he left

Look, I know that I've been hitting this nail on the head for a while but: the web is open, blogs are open, but Google isn't open

But you could run your own Google, in theory. You could index the web. So why aren't you?

Ah yeah. Same thing here. That's what I mean, that's why it's centralized

Now as I have said, this is a *design decision*. And remember: most users of Bluesky really *don't care*. Decentralization is not their focus, they're trying to get the hell off the nazi hellscape that Musk's toxic reign of Twitter has become.

Bluesky's architecture, actually, is great for them.

If your *goal* is to get off Twitter, then Bluesky has solved it. They solved it by building another Twitter, and this time it's open source, which is cool! And it might have this "credible exit" thing.

But god damnit it's not decentralized and it's not federated stop TELLING people that

"Oh Christine you're being sensitive"

Maybe, but there are real consequences to this. What if Bluesky/ATProto fails? "Oh well we tried decentralization and that didn't work." If people think something is something that it isn't, then that's a real problem.

Users, clearly, think a lot more of Bluesky is decentralized than it is, and realize less of the consequences than they should. This really worries me. Blocks and DMs are both great examples of this.

Blocking first. Bluesky's decision to have *everything* public means that it is expected that every participating node knows *everything* about who's blocking *everyone*.

"This is consistent with how blocking works on Twitter/X" their paper says

But wait, I'm pretty sure that one's not true though

It is ONE thing to be able to block JK Rowling and for you to see that JK Rowling is blocking you.

It is an ENTIRELY DIFFERENT THING for ANYONE to see who is blocking JK Rowling and who JK Rowling is blocking

This one is shocking to me: this seems like a vector for abusive actors
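To spell out the difference, here is a tiny Python sketch of what public block records make possible. The record layout and the user names are hypothetical, not ATProto's actual schema; the point is only that once every block is a public record, both questions below can be answered by anyone, about anyone.

```python
# Hypothetical public block records: (blocker, blocked).
public_block_records = [
    ("alice", "mallory"),
    ("bob", "mallory"),
    ("mallory", "carol"),
]


def blocked_by(user, records):
    """Everyone this user is blocking."""
    return sorted(target for blocker, target in records if blocker == user)


def blockers_of(user, records):
    """Everyone who is blocking this user."""
    return sorted(blocker for blocker, target in records if target == user)


# Any third party holding the records can run both queries.
print(blockers_of("mallory", public_block_records))
print(blocked_by("mallory", public_block_records))
```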

@cwebber For sure.

Now to be completely fair this is something that Bluesky's devs are interested in potentially changing: there is an open issue to discuss the possibility of private blocks https://github.com/bluesky-social/atproto/discussions/1131

What I am saying is there are architectural consequences to fundamental design abstractions

Yes, I may sometimes seem silly over here, SICP-hugging fangirl, come on we're just trying to build things that *work* over here

Look I'm a lisp lady, I know the realities of "Worse Is Better" more than most, I know the right CS designs don't win

But Conway's Law flows in two directions!

You know what, we'll come back to "bidirectional Conway's Law", let's talk about Direct Messages for a minute because I think those are telling

Direct Messages in Bluesky, wait how do they work if ATProto is public?

Did you guess?

DMs are centralized! All DMs flow through Bluesky

Now to be completely fair Bluesky is clear about this *in their blogpost announcing DMs*, but just like this thread, I doubt nearly anyone has read that far (am I talking to the void? I don't know, if you actually have gotten to this message reply with "I found the easter egg" or something)

@cwebber reading and enjoying your thread

The thing that is telling to me about DMs is that we *have* federated direct message protocols like XMPP which have been around for ages; if Bluesky wanted to they could have tacked that on pretty quickly, E2EE or not. It still would have been decentralized at least

@cwebber 🐣

The point is that I have *seen in the wild* people saying "Oh yeah Bluesky added DMs to their decentralized protocol" and augh

I know they aren't claiming this but it's very clear to me that people are reading things as being completely different architecture than it is

But to Bluesky's credit, Twitter's DMs aren't decentralized either! And getting and shipping something that works, now for the influx of Twitter users, again... I am sympathetic

Bluesky's team is doing an INCREDIBLE JOB in that way of scaling to meet the incoming stream of Twitter refugees

On that note, again, I am not reading the replies right now because I am (a) afraid to and (b) I'm never gonna finish this and we are a bit over HALFWAY THROUGH the analysis but I have this fear that EVERYONE is mad at me, Bluesky fans, fediverse fans

I am trying to be analytical. I am trying!!!

I said we are about halfway through and criminy we're halfway through the afternoon, I need a break to get some tea

We have a few big topics left:

- Decentralized identity, how does it work (magnets too, yes)
- The Org is a Future Adversary
- Christine critiques the fediverse
- Wrap up

And so, it is TEA TIME

Go get yourself a hot beverage. Put honey or agave in it, if you like. Dairy, or perhaps, non-dairy, if you prefer.

=== BREAK TIME! Time for tea! ===

I can confirm, @cwebber is currently making us both tea :)

@cwebber it is a well known fact that activitypub and bluesky are both worse because they're not written in lisp 😎

@mlemweb Thank you for corroborating my story

Okay, I am back and I am back with tea! I made "black tea with ginger" and I put some whipped honey in it. I also made tea for my spouse

I am drinking out of an oversized mug from @baconandcoconut that says "I'm that person who likes to serve on open source program committees", which is not actually accurate but I do anyway

I am also sad about the US House of Representatives being shitty to trans people who work there and are just trying to make it through the day

I used to do data modeling contracting for the US HoR on our legal system, true story, which sends me back to a time when I did a lot of data modeling

@cwebber

The house should ditch men's and women's restrooms and switch to republican & democrat restrooms.

A lot of data modeling I did in that time was in the W3C Verifiable Credentials group that was working on Verifiable Credentials, zcap-ld (my spec), and, oh hey, Decentralized Identifiers (DIDs, the name is not my fault)

So actually I was pretty excited when I heard that Bluesky was gonna use DIDs!

Back in 2017 I wrote a whitepaper: "ActivityPub: from decentralized to distributed social networks" and it also suggested using DIDs https://github.com/WebOfTrustInfo/rwot5-boston/blob/master/final-documents/activitypub-decentralized-distributed.md

I no longer think DIDs are necessary to solve this, but then and now I think *decentralized identity is important*

In that sense, I am really glad Bluesky is taking on decentralized identity, as a concept! And DIDs, in a way, are a good signal.

But there are several problems, the first of which is: Bluesky supports two kinds of Decentralized Identifiers and they're both -- you guessed it -- centralized!

@cwebber fabulous! Stuff to 👀

You mention "Message passing" vs "shared heap" architectures and it occurred to me how fast this shift to pioneering in "decentralized/distributed solution design" space and entering entire new computing paradigms currently is.

Where once more tech is running way ahead of responsible use. Technically all is possible. Reality is we stumble ahead, no best-practices, impl on-the-fly.

We take as it were big bets on future direction, may overlook externalities.

1/..

@cwebber

Not only are the implications of the BS shared heap architecture easily overlooked and consequences come later, this has been the de-facto approach for any decentralized web technology thus far, including AP. Where hard-tech mindset and focus dominates.

And yes, the complexity warrants all that attention.

Yet there's less thought and attention paid to how DX, UX system / application / solution design should cope in the higher levels of the stack, and esp. in FOSS circles.

2/..

@cwebber

Making it extra hard to bridge the technology adoption chasm beyond early adopters, while the decentralized ecosystem suffers protocol decay.

Re:new computing paradigms.

> "local-first p2p social networking at scale"

.. someone said.

That buzzwordy sentence might see us enter a new exciting social web of adventure, if we don't squander the opportunity.

Technically all is once again possible. Martin Kleppmann inspires with generic local-sync protocols, universal back-ends, etc.

3/..

@cwebber

But thinking about exploring technical possibilities is way out of lock-step again, speeding ahead of how one would use this shiny technology to build useful things on top of in the best possible way.

I have difficulty wrapping my head around picturing a local-first social network at scale where CRDT's p2p synchronise application state and data of all actors - people, apps, services - in the social graph between 1,000's of peers. So many options, what approach is even feasible?

4/..

Before we get there, let's talk about what the DID spec was and what DIDs are. The core DID spec is an *abstract interface* for key management which provides a way of representing keys (and some other metadata) which can be created, retrieved, and updated/rotated.

So far so good...
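As a sketch of what "abstract interface" means here, the following Python shows the create / retrieve / rotate shape with a toy in-memory backend. The class and method names are made up for illustration and are not taken from the DID spec, and `did:example:` is only a placeholder prefix. A single in-memory dictionary is, of course, about as centralized as a backend can get; the interface itself says nothing about where the data lives.

```python
from abc import ABC, abstractmethod
import secrets


class KeyRegistry(ABC):
    """Hypothetical interface: create, retrieve, and rotate key material."""

    @abstractmethod
    def create(self, public_key: str) -> str: ...

    @abstractmethod
    def resolve(self, identifier: str) -> dict: ...

    @abstractmethod
    def rotate(self, identifier: str, new_public_key: str) -> None: ...


class InMemoryRegistry(KeyRegistry):
    """Toy backend; a real method would decide where this state lives."""

    def __init__(self):
        self._documents = {}

    def create(self, public_key):
        identifier = "did:example:" + secrets.token_hex(8)
        self._documents[identifier] = {"id": identifier, "keys": [public_key]}
        return identifier

    def resolve(self, identifier):
        document = self._documents[identifier]
        # Return a copy so callers cannot mutate registry state.
        return {"id": document["id"], "keys": list(document["keys"])}

    def rotate(self, identifier, new_public_key):
        # The identifier stays stable while the key behind it changes.
        self._documents[identifier]["keys"] = [new_public_key]


registry = InMemoryRegistry()
did = registry.create("public-key-1")
before = registry.resolve(did)
registry.rotate(did, "public-key-2")
after = registry.resolve(did)
```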

@cwebber

Meanwhile there are already a hundred or more local-first projects and vendors who are independently building "the right way", in other words fragmenting into individual explorations with little cross-pollination and co-creation.

Why isn't there already an IETF local-first working group, or something similar?

Well.. someone should step up to the plate to do that, that's the wait now. Lotta work for volunteers and no funding beyond hard-tech. So this is up to vendors then, I guess.

5/..

@cwebber

Unrelated to this thread it occurred to me how much time and energy we waste by endlessly sifting through untangled mess of complexity with different viewpoints and perspectives leading to Babylonian confusion and overlap all the time in discussions.

Bluesky had a big advantage, in that they could forge ahead, highly focused as a close-knit team exploring greenfield technology. They set sail, just tapping the chaotic information stream for collecting stakeholder feedback.

6/..

@cwebber

Now if we look at the AS/AP ecosystem, there is a problem: the storm of discussion on vNext of the protocol, or on choosing alternative directions, goes on unabated, and no one seems to be coming to any kind of real consensus.

It almost looks like we once again must leave that to the vendors to sort out, when they enter the 'fedi market' en masse.

Ideally we want to have multiple commons-controlled focused and productive working groups that elaborate various themes of the social web.

7/..

The other requirement you would expect, based on the name, is that Decentralized Identifiers are *actually decentralized*.

When I got involved in DID work, that was actually the expectation of everyone. Then it was loosened. What? Why on earth?!

@cwebber

Thus I had the idea to write a proposal to start what I call a fellowship that runs an open social web laboratory, and is able to distill the general discussion into focused input for working groups to quickly iterate on a theme, in a similar way to how BS operates now.

See for info: https://discuss.coding.social/t/proposal-start-a-fellowship-to-explore-the-social-web/571

The idea is a follow-up to the "Vision for fedi spec" feedback gathering that @helge initiated, as a means to cope with the broad subject area.

See: https://discuss.coding.social/t/wiki-vision-for-a-fedi-specification/563/24

/end

@cwebber i'm just very glad that you're dispelling the myth that bluesky is federated. i'm so tired of being told on bluesky that fedi isn't safer than bluesky because bluesky is federated so if they fuck it up people can take their stuff and go elsewhere, when it's like "how? go where?"

@cwebber tangential thought (I'm really tired and my adhd go burr today) the more I think about it the more weird "worse is better" feels to me because while I know it's meant to mean whether software is adopted or not is ultimately a function of the needs of real users, the surface implication of the slogan is that compatibility with user needs is not a valid primary metric for judging whether or not software is good

The reason actually stems from the first centralized DID method that Bluesky supports: did:web.

did:web is centralized, and kinda useless. It just works by a regex rewrite of the DID's name to an https URI and then it's retrieved. Anywhere you use did:web, you could have just used an https: URI
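To make the "regex rewrite" concrete, here is a minimal sketch of how a did:web name turns into an https URL, as I understand the did:web method. The function name `did_web_to_url` is made up for illustration, and the sketch does no validation beyond the prefix check.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")
    # Colons separate the domain from optional path segments.
    parts = did[len(prefix):].split(":")
    # A port is percent-encoded inside the domain part (example.com%3A8443).
    host = unquote(parts[0])
    path = [unquote(p) for p in parts[1:]]
    if path:
        return f"https://{host}/{'/'.join(path)}/did.json"
    # With no path, the document lives under /.well-known/.
    return f"https://{host}/.well-known/did.json"


print(did_web_to_url("did:web:example.com"))
print(did_web_to_url("did:web:example.com:user:alice"))
```

Everything after the rewrite is plain DNS plus TLS, which is where the centralization comes from.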

"Now wait Christine, didn't you say earlier that the web is decentralized and open? So therefore, did:web is decentralized and open"

Yeah but the naming system of the web is CENTRALIZED

We use DNS and ICANN (and then we add another centralization layer with TLS/SSL CAs)!

Everyone in the DID standards space KNEW that did:web was centralized, so why on earth was a centralized identifier permitted for something named "Decentralized Identifiers"?

The answer is easy. did:web is easy to implement, many DID methods were not.

did:web existed for test suites.

I was kind of exiting that particular area of standards when this happened but colleagues will tell you that I, and some others, were deeply upset and troubled by this

"Sure having a nearly no-op DID to pass the test suite is helpful but it shouldn't be labeled as a DID, people will get confused!"

Confusion, on its own, is one thing. But the problem is when confusion turns into decentralization-washing.

"This is going to turn into decentralization-washing!"

"It's just to pass the test suite!"

[... time passes ...]

"Actually we like did:web now, it's a DID method everyone can implement!"

@cwebber @helge

Tangential, but to add some more spice to this..

We need more fellowships like this, who explore yet other areas together.

Like for object capability social web at scale.

A couple of years ago, when you were still on Spritely Project, you sent out a toot in which you sighed that once spritely technology would be mature enough for widespread use, it would probably be already too late.

The institute to the rescue, I guess. Valid and prudent choice.

1/..

And of course once the door was open to did:web, the door was open to everything! Decentralization is now no longer a requirement for DIDs. You can make a centralized DID method and call it a "Decentralized Identifier" and you're right because it implements a spec named "Decentralized identifiers"

But it's ONLY EXPERTS IN DIDs WHO UNDERSTOOD THIS

Most users hear "Decentralized Identifiers" and they think they know what's being delivered, the distinction between the *spec* being called that and the *mechanism used* being centralized... you have to go digging to find that out

So did:web is not only useless, it misleads people about the problem domain entirely, but hey it's now the most broadly deployed DID method in the world, congrats everyone!

Speaking of centralized Decentralized Identifiers, did I mention that did:plc is centralized?

For that matter, where did the term did:plc come from? Early versions of "did:plc" documentation called it the "Placeholder" DID method, that's what it stands for, to motivate changing it later

Well the docs no longer say that, they now say "Public Ledger of Credentials"

Good backronym, but...

did:plc is centralized, and that bothers me because once again, users think something is more decentralized than it is, because they're being *told* it's decentralized

The particular way in which did:plc is centralized doesn't bug me too much but once again, few users have read into this

If you read the documentation of did:plc, they're actually quite upfront about did:plc's centralization being non-ideal. That's good, I appreciate that. Again, you gotta dig though, and the name misleads (which is, to be fair, the original sin of the DID Working Group)

(aside: wow my eyes are getting tired from staring at my monitor while I recap what was a 24 page blogpost, why do I do this to myself)

Aside from being irritated about the name misleading, I don't mind the centralization of did:plc too much (other things, I am more concerned about, we'll get there)

There's one organization that can be queried via their API that keeps a definitive list of certificates and their updates

@cwebber @helge

It is still hard to hook on to spritely unless you have deep technical expertise. That means most others (large group) are in wait-and-see necessarily.

Choice is perfectly valid, because it's the foundation team's own initiative.

Is it the best tech introduction strategy? Best technology adoption model to use?

Your community and ecosystem have to catch up, once you say "it's time for fun".

Randy's community pattern language might serve to unlock upper-stack stakeholders now.

@cwebber @bnewbold

Hey, Christine.

Did you consider that it's in Brian's and Bluesky's interest to position the difference between ActivityPub and AT Proto as one of technology and not of governance?

And to get the editor of AP to do it?

Also, did you think about getting your hands dirty with a proprietary protocol that has no patent or other licensing grants?

I intentionally have not done either of these things. I think Brian encouraged you to do this for his and Bluesky's own benefit.

In theory, once a DID is registered with Bluesky, it cannot be altered by Bluesky, because a cryptographic update from the original key is necessary; it's a certificate chain, a good design

Bluesky can refuse to share did:plc documents or their updates, but it can't manufacture updates

This is pretty good tbh, it lowers the stakes a lot to have certificate chains

I love certificate chains, certificate chains are great

Honestly, having a centralized registry for them, it's not the best but it's not the worst (aside from that damn naming thing)
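A toy sketch of the "chain" part, under loud assumptions: the field names `prev` and `payload` are hypothetical, JSON plus SHA-256 stands in for whatever encoding the real method uses, and signature checking is left out entirely because the Python standard library has no public-key signatures. It only shows why rewriting an earlier entry is detectable. Stopping the registry from appending a forged update is the job of the signature from an authorized key, which is not modeled here.

```python
import hashlib
import json


def op_hash(op: dict) -> str:
    """Hash an operation's canonical JSON form."""
    encoded = json.dumps(op, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_op(log: list, payload: dict) -> None:
    """Append an operation that points at the hash of the previous one."""
    prev = op_hash(log[-1]) if log else None
    log.append({"prev": prev, "payload": payload})


def verify_log(log: list) -> bool:
    """Check that every operation references the hash of its predecessor."""
    expected_prev = None
    for op in log:
        if op["prev"] != expected_prev:
            return False
        expected_prev = op_hash(op)
    return True


log = []
append_op(log, {"handle": "alice.example", "key": "key-1"})
append_op(log, {"handle": "alice.example", "key": "key-2"})
print(verify_log(log))  # True

# Rewriting history breaks the link from the next operation.
log[0]["payload"]["key"] = "attacker-key"
print(verify_log(log))  # False
```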

However...

@cwebber @helge

Because that is highly tangential from spritely core technology, fanning out into vast scope, you might offload that to a fellowship that can facilitate multiple independent initiatives at the same time, not just spritely but also see an ecosystem of convergence and increasing alignment, rather than fragmentation as per the norm.

There are some strange, strange things about did:plc that heighten the centralization concerns and, well

I'm not a cryptographer, but some of my good friends are cryptographers, etc etc. I got some... reactions to what is to follow

@cwebber @bnewbold I hope Bluesky Inc. made a big donation to the Spritely Institute for this huge amount of work you did.

The first strange thing to me is that did:plc uses sha256 and, AFAICT, not sha256d (which is really just running sha256 again over the hash). Unless I am missing something? Am I wrong?

Maybe it's not a concern because of doc parsing but it's best practice to protect against length extension attacks
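For anyone who hasn't seen it spelled out, the difference is tiny in code. This is only a sketch of the two constructions using Python's `hashlib`; it is not a claim about what did:plc should or does do internally.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Single SHA-256 digest."""
    return hashlib.sha256(data).digest()


def sha256d(data: bytes) -> bytes:
    """Double SHA-256: hash the 32-byte digest a second time."""
    return sha256(sha256(data))


message = b"example document"
print(sha256(message).hex())
print(sha256d(message).hex())
```

The outer hash only ever sees a fixed-length digest, so the published value no longer exposes the inner hash state that a length extension attack relies on.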

The next concerning thing is that did:plc truncates the hash to just *15 bytes* of entropy.

I'm... again I'm not a cryptographer, but why throw away all that delicious entropy? So the did fits in 32 characters? Weird choice, and it means collisions are cheaper

This is public information, I don't need to file a CVE to tell you about the truncation of entropy. I am, again, not a cryptographer. Maybe it's fine?

I do remember the Debian short IDs fiasco tho https://gwolf.org/2016/06/stop-it-with-those-short-pgp-key-ids.html

Why not hold onto all the entropy you can get?

DIDs weren't meant to be seen by the user; cryptographic identifiers in general *shouldn't be*, they should be encapsulated in the UI.

We'll get to UI stuff in a bit.

I just don't understand this decision though, it just seems weird to me but maybe a cryptographer will tell me it's fine, actually

At any rate, I continue to not understand it, maybe it's fine, but it did play a part in that "Hijacking Bluesky Identities with a Malleable Deputy" blogpost, which is fascinating and, unlike me, is written by a Real Cryptographer (TM) https://www.da.vidbuchanan.co.uk/blog/hacking-bluesky.html

Good post btw

@cwebber

Shouldn't this be 20 bytes? There are 32 characters, and each character is base32, or 5 bits. So 160 bits?

I don't *think* there's a huge concern over this, because while maybe you could do a birthday collision attack in 80 bits, this wouldn't really get you much and wouldn't let you take over someone else's account. For that you'd need a pre-image attack on the whole 160 bits.

*Also not a cryptographer!!*

@fontenot no because the 32 characters includes the "did:plc:"
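The arithmetic behind the two numbers in this exchange, as a small sketch. The input bytes are a placeholder; as I understand it the real method hashes an encoded, signed genesis operation, which is not reproduced here, and the helper name `truncated_id` is made up.

```python
import base64
import hashlib

PREFIX = "did:plc:"
TOTAL_CHARS = 32            # full length of the identifier string
BITS_PER_BASE32_CHAR = 5

id_chars = TOTAL_CHARS - len(PREFIX)        # 24 characters of hash
id_bits = id_chars * BITS_PER_BASE32_CHAR   # 120 bits
id_bytes = id_bits // 8                     # 15 bytes
# A generic birthday collision search costs roughly 2**(bits / 2) hashes.
collision_work_log2 = id_bits // 2          # 60


def truncated_id(data: bytes) -> str:
    """SHA-256 the data, base32-encode it, keep only the first 24 chars."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.b32encode(digest).decode("ascii").lower()
    return PREFIX + encoded[:id_chars]


did = truncated_id(b"placeholder genesis operation")
print(id_chars, id_bits, id_bytes, len(did))  # 24 120 15 32
```

So 20 bytes would be the figure if all 32 characters were hash output; with the 8-character prefix counted, it's 15 bytes, out of the 32 bytes a full SHA-256 digest provides.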

@cwebber @baconandcoconut vandalize it to say "I'm that person who open source program committees" :3

One way in which the truncation shows up in that blogpost which I thought was curious is that the attack involved generating a *longer* truncated hash

The fix ended up resulting in codifying the hash length: 24 characters, and no longer https://github.com/did-method-plc/did-method-plc/pull/31

There's another thing about that blogpost that caught my attention. I will just quote it:

> However, there's one other factor that raises this from "a curiosity" to "a big problem": bsky.social uses the same rotationKeys for every account.

> This is an eyebrow-raising decision on its own; apparently the cloud HSM product they use does billing per key, so it would be prohibitively expensive to give each user their own. (I hear they're planning on transitioning from "cloud" to on-premise hosting, so maybe they'll get the chance to give each user their own keypair then?)

Anyway that's the quote and presumably this must be changed. I haven't looked, but I can't imagine they're still doing this today (are they?) but the fact that only one key was ever used in production for expense purposes is a strange decision

@cwebber they've invented the google of email of twitter......

At any rate, that decision was used to create a kinda confused deputy-ish attack, which is why it came up in the blogpost, and anyway, hi, I'm not a cryptographer, momentary reminder that I am not a cryptographer, but I have designed cryptographic certificate chains and I was pretty shocked by that

At any rate, one way or another, you can presumably use did:plc to move yourself from one server to another so in the interest of "credible exit" this is a good choice

Though, one might take a moment to ask: who controls the keys if you *do* want to move?

Bluesky has identified, I'd say correctly even, that key management for users is an *incredibly* hard thing to do.

But the solution, once again, ends up pretty centralized: for all users on Bluesky's main servers at least, Bluesky generates and manages the keys for them.

I am, once again, kinda sympathetic and kinda unsettled simultaneously.

- Sympathetic: key management *is* hard and we just don't have the UX answers to solve that, and Bluesky is once again trying to deliver to Twitter refugees
- Unsettled: it's centralized, but... there's something *more* troubling

The big promise here, the "credible exit" side of things is that for most users, the vision they have is that if Bluesky gets bought by a big evil company, no problem, move somewhere else

But for those same users, Bluesky still *controls their keys* and thus *controls their destiny*

Regardless, Bluesky has this "your domain is your id!" thing, and that's pretty cool, the domain maps to your DID and your DID maps to your domain

Well, I'm not gonna get into this in detail here, I do in the blogpost if you wanna read it but, the cyclic dependency might be an actual cycle

tl;dr on that UX part:

- users only know domains, they don't know the DIDs
- turns out that's a phishing attack when those can change at any time
- if bsky.app ever goes down how do you actually know I *really* mapped to that name
- and a whole lot of "liveness" problems that enter there
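A rough sketch of the two-way mapping described above, with in-memory dicts standing in for the real lookups. All names and identifiers are made up: `handle_to_did` plays the role of asking the domain which DID it claims, and `did_to_handles` plays the role of the handles listed in the DID's document.

```python
def verify_handle(handle: str, handle_to_did: dict, did_to_handles: dict):
    """Return the DID only if handle -> DID and DID -> handle both agree."""
    did = handle_to_did.get(handle)
    if did is None:
        return None
    if handle not in did_to_handles.get(did, ()):
        return None
    return did


# Hypothetical data: mallory's domain claims alice's DID, but the DID
# document does not list mallory's domain back.
handle_to_did = {
    "alice.example": "did:example:alice",
    "mallory.example": "did:example:alice",
}
did_to_handles = {"did:example:alice": ["alice.example"]}

print(verify_handle("alice.example", handle_to_did, did_to_handles))
print(verify_handle("mallory.example", handle_to_did, did_to_handles))  # None
```

Note what this does not cover: the check only says what both sides claim right now. If either side changes later, a user who only ever saw the domain has nothing to compare against, which is the phishing and liveness worry in the list above.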

in addition to this long-ass thread there is a long-ass article and if you care about things like "zooko's triangle" maybe read that version, the rest of y'all can move on we've got other stuff to cover here

It is time for TEA BREAK 2: THE REHEATENING

I will also go to the bathroom

TMI? If you've read this far into this weird thread I am already giving you too much info

=== TEA BREAK 2 ===

@cwebber thank you for reminding me that it is TEA TIME!

Took me a minute because my water boiler needed refilling...

I have returned, with tea

I am still not reading notifications. Well, I have seen a few fly by on the fediverse which is blipping and blooping nonstop in the Mastodon UI so people are clearly reading it there

Bluesky says "30+". How big is the +?? I will resist temptation to look and assume "31"

"Where are we going with this Christine?"

Well you could have just read the blogpost but 3 more sections remain, we are approximately 2/3 there

I know, bear with me, what is left is:

- What should the fediverse do?
- Preparing for the organization as a future adversary
- Conclusions

Yes, I changed the order of the remaining sections, not from the blogpost but from the last time I said what was left on this thread

pray I do not reorder them again

Before we get into the next section, earlier I left an easter egg, which you could quote post and say "I found the easter egg" or something

Now you can put 2 eggs

I 2 was once an egg

(Look I specifically transitioned so I could never be accused of making dad jokes again so that does not qualify)

Alright you've heard enough critiques of Bluesky for a bit and I SAID I was gonna critique the fediverse and I am a WOMAN OF MY WORD

So let's get into it!

I have actually critiqued ActivityPub and the fediverse a lot! I have kind of never stopped critiquing it, ever since the spec was released. There's a lot that can be improved!

I have even gotten criticism from AT LEAST ONE ActivityPub spec author for critiquing AP-as-deployed but I do anyway

Actually something that is funny about ActivityPub is that there's "ActivityPub the spec", which I think is pretty solid for the most part, and "ActivityPub-as-deployed"

Many of the critiques I'm about to lay out concern holes we left in the spec, which I hoped would be filled with the right answers

One thing we have already discussed so, before I will say anything else, I will repeat: content addressing is really good, and I'd like to see it happen in ActivityPub, and it's *possible to do*, I even wrote a demo of it https://gitlab.com/spritely/golem/blob/master/README.org

Bluesky does the right thing here, AP should too

@cwebber this is a fantastic writeup that brings together a lot of technical criticisms and also things to learn in a really satisfying way. Thank you for taking the time to write it!

@cwebber <grabs popcorn> :)

No but seriously this thread is great, thank you so much for writing this! I'm learning a lot

Content addressing is important. It should not matter where content "lives". It should be able to live anywhere.

A server should be able to go down, and content should survive.

Go content addressing!
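A toy illustration of why content addressing lets content outlive a server. The `Server` class, the `sha256:` address format, and `fetch` are all invented for this sketch and are not from ActivityPub, ATProto, or the Golem demo.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Name the content by the hash of its bytes, not by where it lives."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class Server:
    """A toy server: an in-memory map from content address to bytes."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        addr = content_address(data)
        self.blobs[addr] = data
        return addr

    def get(self, addr: str):
        return self.blobs.get(addr)


def fetch(addr: str, servers: list):
    """Ask each server in turn; accept only bytes that match the address."""
    for server in servers:
        data = server.get(addr)
        if data is not None and content_address(data) == addr:
            return data
    return None


origin, mirror = Server(), Server()
addr = origin.put(b"hello fediverse")
mirror.put(b"hello fediverse")

origin.blobs.clear()  # the original server goes down
print(fetch(addr, [origin, mirror]))  # b'hello fediverse'
```

Because the reader re-hashes whatever comes back, it doesn't need to trust the mirror, only the address.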

Actually with this and several other things I am going to bring up, I actually made sure there was space to do things right: there was a push to make ActivityPub "https-only"

I pushed back on that, I didn't want that requirement, and it was exactly for this reason: enabling content addressing

This isn't the only time I left a critique of ActivityPub-as-Deployed as opposed to ActivityPub-as-it-could-be: see also OCapPub, which critiques the anti-abuse tools of AP as inadequate and leading to "the nation-state'ification of the fediverse" https://gitlab.com/spritely/ocappub/blob/master/README.org

Oh, and ocaps!!!

ActivityPub left giant holes in the spec around two things which sound the same but which are not the same: Authentication and Authorization

Trying to mix these two, you accidentally get ACLs, and then you get confused deputies and ambient authority, plagues of the security world

Anyway, if you know *anything* about me, you know I am a big fan of capability security (ocaps) and that's the foundation of our work over at @spritely
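For readers new to the idea, a very small sketch of "authority travels with the reference" as opposed to "check who is asking against a list". Plain Python closures stand in for capabilities here; Python does not actually enforce this isolation, so read it as an illustration of the shape only. All names are made up.

```python
def make_inbox():
    """Create an inbox and return two separate capabilities to it."""
    messages = []

    def deliver(text: str) -> None:
        # Holding this reference grants append-only authority.
        messages.append(text)

    def read_all() -> list:
        # Holding this reference grants read-only authority.
        return list(messages)

    return deliver, read_all


deliver, read_all = make_inbox()


def newsletter(send) -> None:
    # This code can only do what the reference it was handed allows.
    # It was never given read_all, so it has no way to read the inbox.
    send("weekly digest")


newsletter(deliver)
print(read_all())  # ['weekly digest']
```

There is no identity check and no access list anywhere: being handed `deliver` is the permission, and there is no ambient authority for a confused deputy to misuse.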

But we will come back to ocaps in a second because it turns out OCapPub is not the only time I proposed AP + ocaps!

@cwebber great thread also not gonna lie, this is starting to feel a bit like one of those nethack halls that is filled with an infinite line of C's

christine's avatar repeated several times with "Show more... 12"

The other time I wrote about ActivityPub + ocaps was in a proposal to, yes, Twitter's Bluesky process in 2020 with @jay.bsky.team titled... "ActivityPub + OCaps"! https://gitlab.com/-/snippets/2535398

I think that document laid out all the right ideas for *the fediverse* (not saying bsky, the fediverse)

Now I want to be clear here that I *don't* think that proposal was necessarily the right one for Bluesky, and I *do* think Jay Graber *was* the right person to lead Bluesky

What I wanted to do required a lot more research, and we have done that over at @spritely instead

@cwebber I've read your entire article, thanks so much for such a detailed & thoughtful writeup. It was very illuminating as I had not come across any discourse yet that really got into the weeds about DID implementations and so forth. I was also not aware of your proposals regarding Ocap. This is interesting stuff I want to try to understand further. I am glad that you are thinking about these pain points with current fedi implementation, we're all feeling them and I hope they can be addressed.

The reason I bring up the proposal here is that I think it has all the right analysis of *what the fediverse should do*, if it was going to rise to the challenge of fulfilling its true potential

So let me lay out what the things in that proposal were:

Here is your recipe for making the "Correct Fediverse IMO (TM)":

- Integrate ocaps, which is possible because actor model + ocaps compose
- Content addressed storage!
- Decentralized identity (notice the *y*, I did not say DIDs) on top of ~mutable CAS storage
- Petname system UX

(cotd...)

(cotd ...)

- Better anti-spam / anti-harassment using OCapPub ideas
- Improved privacy with E2EE ("encrypted p2p" even a better goal)

Whew! An improved fediverse?

"Uh, Christine, this sounds like a lot, do you think the fediverse can take this on?"

Spec-wise in ActivityPub, I think it's possible. The ecosystem, as deployed? I think the ecosystem can and will only do part of it, if we really get everyone excited, maybe the content addressed storage and decentralized identity parts, in which case the fediverse will also survive nodes going down

The ocap stuff, I tried getting fediverse implementers excited about this and tbh, it's pretty hard to design into a Ruby on Rails or Django style framework and mindset. Backporting the right designs to existing systems is a real challenge.

Especially ocaps need to go bottom-up.

For this reason, @spritely's tech looks like it's very focused on computer science'y low-level BS, but that's actually because it's *too hard to build the systems I want right now on top of current technology*, we need stronger foundations

But people have to build for today too

Let's leave the ocap stuff to the side for now, then. Let's focus on what Bluesky and the fediverse have to learn from each other.

- The fediverse should adopt content-addressed storage and decentralized identity
- Bluesky should adopt real, actual federation and decentralization

For this reason @blaine says of both ActivityPub done right and Bluesky done right, "they're the same picture" (The Office meme goes here, yes)

To a large degree, I think @blaine is right

Of course, adapting an existing system as deployed isn't easy.

I will say though that I think if Bluesky were to become *actually decentralized* it would look a lot like ActivityPub in terms of having directed messaging. This will also introduce similar challenges around eg replies, etc.

On the subject of the fediverse, perhaps I sound bitter, "they didn't adopt ActivityPub the way *I* saw it!"

The truth is that Mastodon didn't, but Mastodon also saved ActivityPub. It then painted a vision of the future that wasn't, at least, what Jessica Tallon and I expected of it. But it saved AP.

The fediverse and Bluesky, at great effort, could learn a lot from each other in the immediate term.

In the longer term, neither is implementing the ocap vision I think is critical for the big vision, and in a way, I think maybe neither can be easily rearchitected to achieve it. Well, not yet.

When I laid out the ideas of OCapPub to various fediverse developers, the response was "this sounds cool but I have *no idea* how to retrofit a Rails/Django app for this kind of actor-oriented design".

And they were right.

Remember when I said Conway's Law flows in both directions?

Conway's Law says that a technical architecture reflects the social structure under which it was built. But the reverse is also true. The social structures *we can have* are made possible by the affordances of the tools we have available.

"Tech problems/social problems": false dichotomy.

It's for that reason that @spritelyinst.bsky.social, while aiming for a *socially collaborative* revolution, is first focusing on a *technical* revolution.

It's too hard to build massively, securely collaborative tools right now. With Spritely's tools, p2p ocap secure tech is the *default output*.

Remember when I said that IMO @jay.bsky.team is the right person to lead Bluesky and that I am sympathetic with many design decisions of Bluesky (even if critical of them for being non-decentralized)?

Bluesky is building what they can for a big objective at scale. The tech flows from goals.

So too does the social structure flow from the tech. It does on Bluesky, and it does on the fediverse.

I won't elaborate further on this, I actually would like you to pause and think about it. In which ways are tech and social systems bidirectional, here and otherwise? It's important.

The vision laid out for the fediverse, both independently in my writings and even in Jay Graber's and my joint proposal... well, it's a big lift.

@spritely would like to see if we can retrofit our version onto ActivityPub. Time will tell if that's a separate thing.

And perhaps this is all my *massive* Cassandra complex speaking. I won't deny that I have one, for better or worse

Still, despite all I have said about both Bluesky and the fediverse technically, it is because I want a hopeful direction for all of us. Secure collaboration. More important than ever.

Let's take another tea break. (And another bathroom break. This teacup is massive.) We're getting close to done, I promise. Just two sections left, they're both much shorter.

Then I can finally brave reading my notifications.

Maybe.

== TEA BREAK THE THIRD: BEVERAGE TRIFORCE ==

@smallcircles

Your chart is ready, and can be found here:

https://www.solipsys.co.uk/Chartodon/113528765697411416.svg

Things may have changed since I started compiling that, and some things may have been inaccessible.

In particular, the very nature of the fediverse means some toots may never have made it to my instance, in which case I can't see them, and can't include them.

The chart will eventually be deleted, so if you'd like to keep it, make sure you download a copy.

Hello, I am back again. Did you miss me? I still am not reading notifications.

Help I started writing this summary at 11am and it is now 6pm here I have wasted a whole day of work

But I have tea, and I also flossed my teeth, and it is time to resume this thread. If you are here, you know why.

transphobia, uspol, returning to tech in a sec

Before we go any further, earlier I mentioned the US House of Representatives, and here I am giving a MASSIVE content warning for transphobia

But @evangreer is the coolest fucking person for standing up to Rep. Mace at the Project Liberty summit https://www.fightforthefuture.org/news/2024-11-21-transgender-digital-rights-activist-confronts-hate-monger-rep-nancy-mace-at-internet-summit/

What I am trying to say is I don't have many heroes but @evangreer is absolutely a heroine of mine

You should donate to @fight they are some of the only people doing sensible advocacy against terrible internet laws

Also fuck TERFs

But anyway

Also you have reached it: the third secret egg

You have now collected the egg triforce and can defeat Gender Ganon

If you want to

The power was in you all along

But let's continue.

It's time, we have reached the second to last section: "Preparing for the organization as a future adversary."

I love this one because I love that phrase, and the best part is that the Bluesky team came up with it, "the organization is a future adversary". It's genuinely good and self reflective

Occasionally an org creates a phrase like this, and back in the day Google had "Don't be evil"

And yeah, people criticize Google for never having been sincere but it gave an opportunity for people inside and outside the organization to critique Google on its own stated values. That was good.

It was *at least* good insofar as the moment Google retired the phrase as never really meaning anything anyway, as evil as Google may have been before, Google got *noticeably* worse.

To Bluesky people internally: keep that phrase going as long as you can, and use it reflectively.

As opposed to Google's "Don't be evil", a commandment for the everpresent, "the organization is a future adversary" acknowledges the realities of the future, that it is uncertain, and in fact, that power-dynamics-wise, there will be pressure to make things worse.

Making design decisions in the present which guard against the future is one of the most important things we can do. It is one of the most important reasons to choose FOSS licenses, for instance, which provide an exit plan and also counterbalance against temptation to enshittify a project.

To this end, Bluesky's goals of "credible exit" are actually very important. It creates a similar pressure for the organization itself to stay true as long as it can, even acknowledging the organization as a future adversary, and actually preparing for it.

I am pro-Bluesky-credible-exit.

And there *will* be a lot of pressure: Bluesky has taken VC money as investments; the pattern of such is that early on, things are very good and flexible, and after some time, the investors start placing pressure to enshittify.

I have seen good peoples' orgs clawed from their hands. It happens.

This happens despite the very best people with the very best intentions. Talk to early Twitter co-founders and they will tell you the org that it became was not the org that they envisioned.

A future adversary indeed. So we should plan for it today.

Before we continue further, I have done about every job imaginable in a FOSS project/organization. Fundraising, by far, is the worst, and the most stressful.

It's incredibly hard to raise anything to do anything. I think that's worth acknowledging.

The structure of an organization does matter. There's a reason that @spritely is a 501(c)(3) in the US. Any money we take in is a donation: we aren't "delivering on an investment" (though we must deliver on *results*)

Bluesky is a Public Benefit Corporation, also interesting

A Public Benefit Corporation has a mission for the public good, but can take investments in the way a nonprofit cannot. This also means it can move much faster. Given the influx of users to Bluesky, taking investments this way may have been the only route available to handle that load this fast.

Again, this is all tuned to "What is Bluesky trying to build?"

Bluesky might not be a good "decentralized Twitter replacement", but it is a good "Twitter replacement" with the possibility of "credible exit"

That Bluesky is meeting the needs of many users who are looking for refuge from a white supremacist site *today* is worth pausing to acknowledge, as is the difficulty and scope of doing so quickly and in the moment. I'm glad Bluesky is here at this stressful geopolitical moment in history.

There will be a lot of pressure soon from investors: run ads, make premium accounts that do not actually make sense in a decentralized way, so on and so on.

In this way, "credible exit" is the most important thing for Bluesky the organization and its community to push on *today*

What I will *not* accept is the goalposts being moved on decentralization and federation. Bluesky is neither decentralized nor federated.

If Bluesky wants to become so, it has an enormous amount of work to do, particularly in terms of architectural design.

Blogs are decentralized, Google is not.

Bluesky will face every pressure to be enshittified. Bluesky has even, correctly, acknowledged this. It is up to Bluesky and its community to rise to the challenge of "credible exit" knowing that this is a likely, perhaps inevitable, risk.

The org is indeed a future adversary. So what now?

And here it is. We have reached the final part.

I am not even going to take a tea break. I am not even going to go to the bathroom. I kinda have to, but we are powering through.

We have reached the conclusion of this megathread, and "summary" of an equally long article.

I laid out definitions of "decentralization" and "federation", and Bluesky meets neither, without major rearchitecting or moving the goalposts on those terms, which I cannot accept.

However, "credible exit" is a good goal for Bluesky. Bluesky created that term and it's a good and feasible goal.

I laid out a strong critique, but let me end on a call to empathy.

Bluesky is built by good people, and the fediverse is built by good people. Neither reflects the designs I would like to see today, but ultimately these are built by humans trying their absolute hardest.

The infrastructure we build reflects our social dynamics, and our social dynamics are made possible by our infrastructure.

This thread has been long, and I have said everything I have to say. Thanks for listening. I hope we can build a good future for each other. 💜

@cwebber

monads? more like GROANads

@cwebber omg, I skipped all the way to the end and OBVIOUSLY you look at this situation from every conceivable angle, including governance, because it wouldn't be a Christine Lemmer-Webber post without it.

I appreciate the depth of analysis. I do still think that Bluesky should make a donation to Spritely if @bnewbold asked you to make a 25-page report, though.

@cwebber I also don't share your optimism about cross-pollination. There's a reason that W3C specifications have to only have normative dependencies on specs from recognized standards bodies. Too many minefields unless you have a clear license.

I'm glad that @bnewbold is in the SocialCG and I hope we can find some opportunities to publish reports with some or all parts of the AT Proto stack.

@cwebber I was definitely surprised how journalists called it “decentralized” right when it started. Now I hear journalists call it “federated.” Bluesky has good PR, for sure.

@cwebber Bluesky calling itself decentralized is like Zoom calling itself end-to-end encrypted: making up your own definition of a common term to capitalize on what people think you mean.

@cwebber @bnewbold my fediverse server costs 5 EUR a month and runs a lot of other stuff as well

@cwebber I have been an XMPP fan for ages, and find it really frustrating that due to network effects (of the social kind) everyone non-technical is hooked on WhatsApp and how good and useful it is (and it is; I use it, but reluctantly), when it is the concept of instant messaging that is good and useful. I am very concerned about who pays for WhatsApp. People assume the WhatsApp fairy.

@evan I am glad you liked it after reading the whole thing :)

I absolutely would not turn down a donation from Bluesky to Spritely should they want to ;P but also @bnewbold welcomed and said he would be "honored" to see me write something, but absolutely did not ask me to write a 25 page document, that's just me lol

But there was too much to cover, and I felt I really could not do the issue justice without covering it from every important angle, so I did. Glad it was well received. <3

@cwebber @bnewbold I didn't read the whole thing. ☹️

Since I actively work on ActivityPub, I can't afford to introduce patented ideas into our specs or extensions, even accidentally or unconsciously.

So, I avoid reading any technical discussions of the BS protocol. I've asked Brian and Mike to offer a public patent license or to release their work through W3C or IETF which also uses a patent license. No luck so far.

Anyway, I'm glad you had fun.

@smallcircles @helge @cwebber i had a looksy at that and the webassembly part for one of the technologies was the only turn-off i could see at a glance.

i realize that the addon system for browsers is tivo-isation by #mozilla (terrible) and that addons aren't harnessing an efficient language/codebase and addons might not be able to do everything in a browser. but by the same token, i dont believe we ought to EXPECT everything to be able to be done in a browser.

@frogzone @helge @cwebber

I think it is a smart choice, as it unlocks spritely deliverables in Guile for polyglot development in all wasm-supported languages. And wasm is moving beyond just the browser to become a universal package delivery system for edge, cloud and browser.

"In July 2024, running a Relay on ATProto already required 1 terabyte of storage. But more alarmingly, just a four months later in November 2024, running a relay now requires approximately 5 terabytes of storage. That is a nearly 5x increase in just four months"

wtfh?!

Are they hiding a blockchain or some other idiotic data "structure" in there!? I know warezlords who had hidden directories for IRC DCC bots on compromised servers which weren't such disk hogs.

@teajaygrey @cwebber people post a lot of shit. I'm pretty sure my instance is using like 100 GiBs despite not keeping a copy of every image/video everyone posts. It adds up. The bsky model requires a copy of the entire universe, so it makes sense

@cwebber @bnewbold Anyway, I just set up my own personal monthly donation to Spritely Institute. It was a good reminder!

@cwebber

What exactly do you mean about content continuing to exist? Do you just mean to split into a content server and a regular server, where each had to be taken down separately?

@amici a link that's https://foo.example/cat.jpg can go down but a link that's a magnet link of the hash plus a suggested place to get it can be retrieved even if the suggested place goes down
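A minimal sketch of the idea in that reply, under stated assumptions: the link format (a `magnet:` URI with a `urn:sha256:` topic and an `as=` hint) is made up for illustration, the helper names `make_link` and `fetch` are hypothetical, and an in-memory dict stands in for the network. The point is only that the hash names the content, so any host that still has matching bytes will do.

```python
import hashlib
from typing import Dict, Optional
from urllib.parse import parse_qs, urlencode


def make_link(content: bytes, suggested_url: str) -> str:
    """Build a content-addressed link: the hash names the content, the URL is only a hint."""
    digest = hashlib.sha256(content).hexdigest()
    return "magnet:?" + urlencode({"xt": "urn:sha256:" + digest, "as": suggested_url})


def fetch(link: str, hosts: Dict[str, bytes]) -> Optional[bytes]:
    """Try the suggested host first, then any other; accept only bytes matching the hash."""
    params = parse_qs(link.split("?", 1)[1])
    expected = params["xt"][0].split(":")[-1]
    suggested = params["as"][0]
    candidates = [suggested] + [url for url in hosts if url != suggested]
    for url in candidates:
        data = hosts.get(url)  # None models a host that has gone down
        if data is not None and hashlib.sha256(data).hexdigest() == expected:
            return data
    return None


cat = b"pretend these are JPEG bytes"
link = make_link(cat, "https://foo.example/cat.jpg")
# foo.example is gone, but a mirror still holds the same bytes.
print(fetch(link, {"https://mirror.example/cat.jpg": cat}) == cat)
```

A plain `https://foo.example/cat.jpg` link has no equivalent fallback: once that one host is down, nothing in the link says what the content was.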

@cwebber @spritely

For the chronically curious: are there publicly available details about the AP demo you’re referencing here?

@cwebber @bnewbold i think stating upfront that you are trying to be kind and objective in your technical analysis, before your technical analysis, is important, because its so easy for readers to take things personally when you arent intending to do that.

it's also great for your mental health, where if someone does give u an earful, its kind of on them to realize that you put in an effort to try and be kind, and that you even considered it in the first place.

@cwebber @bnewbold sorry for the long ridiculous reply in retrospect--my meds are most effective in the morning and it's the small fraction of the day where i dont get too much brain noise and things are clear, so im able to actually express what i mean coherently lol

@m455 it was good!

@cwebber Don't fascism people also talk about fascism there?

@vivtek yes that too

@cwebber This is excellent, thank you for writing it up. It roughly matches my gut feeling from the little I knew about BlueSky's architecture, but of course with much more expertise and detail

@VamptVo 💜

@cwebber@social.coop I read the paper many times to be sure I didn't miss that point, but it seems you never consider the fact that a relay is not needed, and that's explicit in the AT Proto documentation here
https://atproto.com/guides/glossary#relay

A relay is only an optimisation that a huge AppView may want to use; a smaller AppView can query PDSes directly, avoiding the relay cost entirely.

@aeris if every PDS queries every PDS, that's quadratic cost on scaling
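A back-of-the-envelope sketch of that scaling argument. The link-counting model and the function names are assumptions made here for illustration, not anything taken from ATProto: "all to all" means every server polls every other server, and "via hub" means every server talks to a single shared intermediary.

```python
def all_to_all_links(n: int) -> int:
    """Every server polls every other server: n * (n - 1) directed links."""
    return n * (n - 1)


def via_hub_links(n: int) -> int:
    """Every server pushes to and reads from one shared hub: 2 * n links."""
    return 2 * n


for n in (10, 100, 1000):
    print(n, all_to_all_links(n), via_hub_links(n))
```

Under this model, multiplying the number of servers by ten multiplies the all-to-all link count by roughly a hundred, while the hub count only grows tenfold. The hub keeps the link count low but concentrates the load in one place.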

@cwebber FWIW, there are over 1000 independent PDSes right now, and the number seems to grow at about 400-500 a month.

The practical effect is different since the means to control that data beyond exporting without funky command line tools rests with the AppView, which is still extremely centralized.

edit: a list + count https://github.com/mary-ext/atproto-scraping

@Kye thanks, helpful info!

@cwebber I am looking forward to this thread. So much analysis starts with either doubting the intentions or from some level of misinformation or misunderstanding, and it's hard to even begin addressing that since they're usually not willing to listen.

A lot of stuff depends on them reaching at least some of the goals, especially an independent PLC directory (or equivalent if replaced), and informed critique is vital to keeping things on track. blobfoxlurk

edit: I just realized you linked my post! I had my reasons for subscribering it, mostly building a newsletter so the particulars of platforms I don't control matter less. Being stuck at 100 subscribers for a year was a little discouraging, and it's up about 20 now.

@cwebber Relevant discussion from earlier: https://bsky.app/profile/why.bsky.team/post/3lbjdux6ubc2f

Non-archival relays solve some problems, introduce others.

@Kye Oh wow actually somehow I hadn't made the connection that post was you!

Good to know re: why you subscriber-only'ed it

@cwebber I want to be able to speak intelligently enough on the subject, but I only learned enough about Bluesky/ATProto to know that I wasn't interested in using it. Do you think it's worth understanding to be able to explain to people? And/or is there a good brief explainer somewhere?

@elplatt their paper is a good explainer, whether it's worth reading is up to you https://arxiv.org/abs/2402.03239

@cwebber if content addressed storage is considered a new idea then I'm done. I quit

@ionizedgirl nobody said it was new

@cwebber

" Then along came Google Reader and... friends, if you are reading this and are of a certain age range, there is a strong chance you have feelings just seeing the phrase "Google Reader" mentioned."

You got me. Great read, thank you.

@cwebber do you mean something akin to tuplespaces?

@pry yes; usenet, and atproto were called "shared heap" in my document but you could think of them as "tuple spaces"; it's not likely something many readers would know the CS history of to grok though and I already got pretty nerdy

@cwebber Hey don't you model yourself after DFW too much! ❤️

@jandi oh the DFW analogy was in article length only. the similarities mostly end there I think

@cwebber@social.coop woah, easter eggs

@cwebber

"On top of 'message passing' we will build 'publish-subscribe' as a second layer abstraction."

That sounds like an ad for Smalltalk....

/cc @ckeen

@alexshendi @ckeen indeed smalltalk and the actor model have a shared history

@aeva activitypub the specification has lisp jokes all over it

@mlemweb @cwebber Excuse me while I slide in to say you two are the best and I hope you are both having a fine Friday. Also, thank you Christine for this fantastic analysis – extremely helpful for people who care but Don't Quite Get It like me.

@josh @mlemweb 💜

@cwebber@social.coop i mean you're on my timeline every 2 seconds, it's hard to miss

@aeva it is a phrase that was created to make its audience unsettled and uncomfortable tho https://www.dreamsongs.com/WorseIsBetter.html

feeling uncomfortable with it thus is extremely fitting and a good sign

@cwebber hopefully credible exit too? I guess I should just go google it.

@jonbro@friend.camp @cwebber@social.coop Bluesky's credible exit refers to the ability of another organization to host the network, including user data and posts, without needing any special access or OK from the Bluesky foundation (I think). This should be a laudable goal, but its feasibility remains to be seen.

@cwebber Hey Christine, big fan of your work, but holy yapfest. World record for longest fedi thread??

@abbyfluoroethane certainly if not the yappiest thread, one of the yappiest

@evan I am reading backwards in time but @bnewbold encouraged me to speak after I had expressed frustration about biting my tongue about things. I don't think this was for Bluesky's benefit at all and, I think you recognize this later but, tbh my article was *extremely* critical, even if polite

Bluesky folks have received it very thoughtfully but trust me I did *not* take that as a given and it could very much have not gone that way. I'm glad it did tho

@cwebber I'm so glad that you continue to enjoy that mug. And I'm also grateful that you do serve on open source program committees.

@baconandcoconut I love the mug and I use it all the time

ESPECIALLY when I have a mega amount of work to get done in which case I put in two teabags and power through

@cwebber (this is a great metaphor for me because YIKES the package room in our apartment sometimes)

@cwebber Real ActivityPub has never been tried

@cwebber did you already get yourself an agent to turn this into a book?

@dottorblaster yes the agent and the book are both called "my blog"

People complain about threading on Mastodon not working right, and @cwebber is just out there like

Screenshot of phanpy UI showing a post from Christine. It says that this post is 143 of x posts in the thread

@cwebber Would I be allowed to call them Mom jokes then?

@cwebber “i still am not reading notifications” 👑👑👑

@cwebber You have given me - and all of us - an excellent exploration of ActivityPub and Bluesky. For me, it’s the best one I’ve read on here, period.

So no, you haven’t “missed a day of work”. Quite to the contrary, you’ve done a good day’s work, and then some.

@evan @cwebber @bnewbold they said the big novelty check is in the mail

@cwebber @bnewbold it's like living in the movie Memento!

Anyway, good work.

@cwebber I'd like to hear more about how AP follows the (Hewitt) Actor Model of Computation, if that's the one you mean. Just having message passing and an inbox and a thing called an "Actor" doesn't make the thing a unit of computation. Given the stated importance to AP, I don't see Hewitt's actor model mentioned in the spec or in any of the WG transcripts, so I'm curious what I'm missing.

https://arxiv.org/abs/1008.1459

@steve https://en.wikipedia.org/wiki/Actor_model#Fundamental_concepts fundamental concepts section on wikipedia summarizes well

@cwebber
Speaking of RSS, any chance your blog site will have an RSS or Atom feed? Long threads here are certainly a choice and (I heard) it's possible to subscribe to Fediverse accounts as RSS, but a whole-thread-in-one blog post sounds better to me.

@cwebber
I misread that line as "Not TODAY, monads." Which is not quite the same thing.

@cwebber FYI your mastodon account link from your blog website is still to the octodon instance.

@cwebber Some notes:

(Also choosing sha256 over sha256d, there’s maybe the question of length extension attacks, but I suppose the parsing of the document means this is maybe not a problem, I’m not sure.)

So a fun thing about Merkle-Damgård hash functions is that they’re only subject to length extension attacks if used at full length. If truncated they’re not vulnerable. So SHA-256 and SHA-512 are vulnerable, but SHA-224 (which is SHA-256 with different initial constants and truncated to 224 bits) and SHA-384 (which is SHA-512 with different initial constants and truncated to 384 bits) are not. Back in 2012 NIST standardised SHA-512/224 and SHA-512/256, which are similarly truncated versions of SHA-512 with different initial constants and which also sidestep the length extension attack issue.

Anyway this is to say that because they truncated the hash in did:plc identifiers (to a level which feels unwise to me too!) they’re immune to length extension attacks.
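A rough illustration of why truncation matters, assuming the usual description of length extension: the attacker resumes hashing from the final internal state, which a full-length digest reveals completely. The names `STATE_BITS` and `hidden_bits` are made up for this sketch; the state sizes are those given for SHA-2 in FIPS 180-4, and nothing here models did:plc itself.

```python
import hashlib

# Internal chaining-state size in bits for each SHA-2 variant (FIPS 180-4).
STATE_BITS = {"sha224": 256, "sha256": 256, "sha384": 512, "sha512": 512}


def hidden_bits(name: str) -> int:
    """Bits of the final internal state that the published digest does not reveal."""
    return STATE_BITS[name] - hashlib.new(name).digest_size * 8


for name in STATE_BITS:
    print(name, hidden_bits(name))

# SHA-224 also starts from different initial constants, so its digest is not
# simply a prefix of the SHA-256 digest of the same message.
msg = b"example"
print(hashlib.sha224(msg).digest() != hashlib.sha256(msg).digest()[:28])
```

Full-length SHA-256 and SHA-512 hide nothing, so the digest is the state. The truncated variants withhold part of it, and the more bits are withheld, the more an attacker would have to guess before extending.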

@cwebber I don't get it. You are saying did:web is centralized because it relies upon DNS which is centralized. So Bluesky is centralized. Fair.

But fediverse instances are hosted on a domain and my AP handle depends on DNS too, so is ActivityPub centralized too?

@lutindiscret here I was talking about bluesky's decentralized identity stuff. the fediverse is also using centralized and non-portable identity. I am advocating that decentralized/portable identity is a good thing for the fediverse too. in the process I am analyzing how decentralized bluesky's identity currently is

@cwebber how mobile friendly is this article really?

@perina I proofread it on my phone?

@aaravchen it does have one but browsers no longer tell you it does

it's in the html headers tho https://dustycloud.org/blog/index.xml

@aaravchen fixed, thx!

@cwebber @baconandcoconut

Can confirm (since apparently my job on this thread is corroborating Christine's tea habits)

@smallcircles @cwebber all centralized tech has that.

@serapath @cwebber

Agreed. And it is a huge advantage. I have a hunch how foss grassroots movement, might be way more effective too, and maybe one day to make a true fist to big tech, who knows. Right now we are nowhere even close. But we have most fascinating opportunities.

@smallcircles @cwebber

you should try keet messenger.
it has thousands of peers in rooms.
you could look at autobase.

its more building material to make it easy to define and design your CRDTs and related mechanisms for your app 🙂

if you ever used nodejs, just use the pear runtime to get started.

`npx pear run pear://runtime`

and follow the tutorial 🙂

@serapath @cwebber

Thanks a lot for these resources. I will have a look! 👍

@smallcircles @cwebber

IETF and all big standard bodies are the old way of doing things. its the wrong place to look

@serapath @cwebber

I agree. Or rather something is missing.

Right now all the entities that are founded to serve the FOSS community are like arcane and distant temples and mystic shrines that we devs must make pilgrimage to and pray for the right support.

These temples need to come closer to people, come down to earth where they fly aloof, and build bridges too.

This bridgebuilding is part of 2 themes of social coding movement: and , Free Software Development lifecycle.

@smallcircles @cwebber

how about nostr?
how about the pear runtime?
how about dat ecosystem?

the runtime works now.
a p2p messenger like keet works now.
nostr works now.

to me that is way more inspiring than the more academic work of kleppmann.
it is also unlikely the next decentralized social media will come from academia

@serapath @cwebber

Kleppmann is, I feel, aiming for internet-scale open standards adoption. With DAT, Solid, AS/AP, many other approaches, I see apps with app-specific ecosystems.

They are nowhere near the ambition level. Unsafe bets for technology decision makers (also FOSS ones).

I interacted with DAT for a bit years ago, giving feedback on lack of attention to non-technical matters and how I felt this put the project at extreme risk, with little chance for success. Same with Solid, AS/AP.

@cwebber ah well see this is why activitypub is marginally better than butt sky unless that also is full of lisp jokes then they're equals

@cwebber *blue

@smallcircles @cwebber

yeah.
affiliation.
viral marketing.
we need to do that p2p too.

sadly too little knowledge and attention seem to be channeled into that yet and i hope this changes in the future.

@smallcircles @cwebber

might be that something can be learned here when looking at bitcoin 🙂

@serapath @cwebber

What I find interesting is the analogy to Big Industry™.

How is it that big industry can run the most intricate global just-in-time supply lines between ultra-complex factory complexes and their suppliers, and is able to profitably produce consistent output to consumers en masse.

And a collab between 2 foss projects, totaling 4 people, is most likely to end in a catfight drama playing out online. And maybe, if lucky, forks. 🙈

How are we supposed to topple hypercapitalism?

@smallcircles @cwebber

i'd prefer to burn down all those temples. fuck them tbh. we need to make it work grassroots.

the most recent impactful movement that was successfully torpedoed by microsoft was nodejs and npm growth.

the reason they were successful was money.

The nodejs ecosystem grew up and figured its not sustainable for them.

Every used open source repo must be part of supply chains automatically and receive funds to make it sustainable. Without, any movement will fail again imho

@serapath @smallcircles @cwebber
> Every used open source repo must be part of supply chains automatically and receive funds to make it sustainable.

Agreed. Better yet, or maybe this is part of what you meant, create the repo as part of an economic network that also provides for its own material and other needs.

@bhaugen @smallcircles @cwebber

I agree here as well.
Of course - i wanted to leave open how one might tackle the issue, but I do think that direction is the right direction.

The issue is probably that by starting it in this way, a lot more opinions are baked in, thus - what is the least opinionated way of approaching this? 🙂

That is a tough one

@serapath @smallcircles @cwebber

I can think of two ways to approach software that wants to be part of, and supported by, an economic network:
1. find an economic network and create some software that the network will like and use,
or,
2. create an economic network at (roughly) the same time as creating the software.

We're trying both approaches and we'll see which (or both) works for us.

@bhaugen @smallcircles @cwebber

Of course, but what are all the modalities you might opt into? What exactly does the support look like?

That's more what i meant - of course, the choices (1.) and (2.) you mention seem obvious. If (1.) exists and you like it, join it. If not, you can only choose (2.) or waiting longer for somebody else to choose (2.)

Every such network was at some point started using option (2.) ...but what modalities would you choose when setting it up? what are the options?

@serapath
> Every such network was at some point started using option (2.) ...but what modalities would you choose when setting it up? what are the options?

Don't know yet. With luck, I may find out.

@smallcircles @cwebber

@bhaugen @serapath @cwebber

This discussion is very interesting. Unfortunately it is lost in fleety, threadrotting in fedi timeline history tomorrow.

To the general vision, I'd say let's make it happen.. gradually and sustainably 😅

I have ideas..

https://discuss.coding.social/t/proposal-start-a-fellowship-to-explore-the-social-web/571

And there is action already:

https://discuss.coding.social/t/wiki-vision-for-a-fedi-specification/563

Hop on, and join the fun, you are invited. 😃

@smallcircles @cwebber

the last word hasnt been spoken.
dat still survives and everyone learns.

it is easy to make a standard body or to create a foundation for funding or marketing.

The centralized answers are well known, but they have the inherent risk of degenerating the novel solution back to the status quo they tried to escape from.

Finding new decentralized answers on the organizational layer of the stack as well is a lot of work - not just research into the unknown, but implementing

@serapath @cwebber

💯

@smallcircles @cwebber

It's not entirely true though.
Open source is everywhere and won already and it started with linux

How is it possible that linux as the biggest and most popular example is so stable and contains so many packages and contributors and maintainers?

How is it possible that the entire web runs on bundled npm packages and the deep node_modules folders behind that are again having so many contributors and maintainers?

Its obviously possible and happening, but no compensation

@serapath @cwebber

Open source is everywhere and won. Agreed. FOSS ate the world.

Open source maintainers however. Esp. the free software types.. poor folks. That includes me too, sadly.

If people earning decent sustainable income is a criterion, then FOSS has failed and is inherently unsustainable. Even more so because most of the fruits of its near-slave labour (not talking hobbyists) are harvested by fat smiling corporate farmers, plucked, low-hanging fruit.

Protected by a license sticker.

@alienghic @cwebber idk if you meant that literally, but that's a funny premise

@mr_breakfast @cwebber

It's inspired by a mix of this article

"Since the conversation, if you can call it that, about trans people always seems to come down to bathrooms, I am sure of one thing.

I would much rather share a ladies’ room or a locker room with Sarah McBride than with Nancy Mace."

https://www.theguardian.com/commentisfree/2024/nov/22/nancy-mace-sarah-mcbride-transphobia

And also i wanted to make clear who the threat is...men who think they should be dominant. There's just a never ending stream of evidence of republican men doing scummy scummy things to women and children.

@amd @cwebber @bnewbold I don't know.

I wish that Bluesky would publish an open, royalty-free patent license on AT so this wasn't a problem for anyone else in the space.

In the open standards world, we typically only consider work from other open standards, specifically for this reason. So, basing a W3C spec on an IETF RFC.

So my experience here is slim.

@amd @cwebber @bnewbold my understanding is that AT Proto is significantly different enough from ActivityPub in its architecture that it's unlikely we'll stumble across some technique it uses by mistake.

So I think it's a better idea to just steer clear and follow our own path.

@amd @cwebber @bnewbold Brian is in the SocialCG at W3C, so if he thinks some ideas from AT should be used by others, he has the opportunity to publish them as CG reports. These are an extremely lightweight way to give assurance to the community.

https://www.w3.org/community/reports/reqs/

@amd @cwebber @bnewbold I think we can assume that if he doesn't submit those techniques, the company is not interested in sharing them.