Blocking instances to clear the 500k queue

Today kbin.social is blocking a huge list of domains just to get federation working again.

The reason for this temporary block is not to defederate, but rather to get the large backlog of 500k messages in the messenger queue processed again. The upside is that this means kbin.social is federating again with other instances.

This is a temporary measure. Several users and developers are looking into how to better optimize the failed message queue as we speak. Hopefully Ernest will eventually have time to dive into solutions as well, instead of workarounds, once his instance is migrated to Kubernetes. See my previous thread: https://kbin.melroy.org/m/updates/t/4257/Kbin-federation-issues-and-infra-upgrade

List of the domains causing trouble:

lemmygrad.ml, eientei.org, vive.im, lemmy.ml, lemmynsfw.com, kbin.lol, lemmy.webgirand.eu, tuna.cat, posta.no, lemmy.atay.dev, sh.itjust.works, kbin.stuffie.club, kbin.dssc.io, bolha.social, dataterm.digital, kbindev.lerman-development.com, test.fedia.io, mer.thekittysays.icu, lemmy.stark-enterprise.net, kbin.rocks, kbin.cocopoops.com, kbin.lgbt, lemmy.deev.io, lemmy.lucaslower.com, lemmy.norbz.org, social.jrruethe.info, digitalgoblin.uk, pwzle.com, lemmy.friheter.com, federated.ninja, lemmy.shtuf.eu, u.fail, arathe.net, lemmy.click, thekittysays.icu, lemmy.ubergeek77.chat, lemmy.maatwo.com, faux.moe, eslemmy.es, seriously.iamincredibly.gay, test.dataharvest.social, programming.dev, kbin.knocknet.net, pawb.social, lucitt.social, longley.ws, kbin.dentora.social, atay.dev, lemmy.kozow.com, ck.altsoshl.com, pawoo.net, techy.news, lemmy.vergaberecht-kanzlei.de, lemmyonline.com, beehaw.org, pouet.chapril.org, kbin.pcft.eu, fl0w.cc, lemmy.sdf.org, lemmy.zip, feddit.dk, fedi.shadowtoot.world, lemmy.noogs.me, lemmy.kemomimi.fans, social.agnitum.co.uk, fediverse.boo, hive.atlanten.se, forkk.me, lemmy.ghostplanet.org, lemmy.mayes.io, lemmy.mats.ooo, lemmy.world, lemmy.sdfeu.org, lemmy.death916.xyz, geddit.social, masto.fediv.eu


Growing pains can only happen when you’re growing! I’ve been loving Kbin so far :)

That is great to hear! But your account seems to come from lemmy.world, lol?

You can have accounts on both.

That is true, the fediverse knows no limits.

Yes correct! I’m trying out mlem for fun, and I had this account on here!

Does this point to an inherent problem with the federated approach; i.e. that every instance has to be able to handle the load of the content on all other instances it federates with?

Pardon if I'm misunderstanding something. But it seems like a big barrier to entry for new instances if e.g. an instance with 100 users has to sync the contents from 100,000 other users to work properly. As the fediverse keeps growing and the requirements to host instances keep increasing, won't it end up where only a few instances have the money / resources to handle the load?

My understanding is that they don't have to handle all traffic from all instances, but rather all traffic that anyone in that instance interacts with. So if you made one for your own personal use, then requirements will only scale up with the number of instances you interact with.

It does seem like it's going to be a big issue specifically when interfacing with the really large instances though. We'll see how it goes.

Well, actually, big instances like kbin.social (and the same goes for other big instances or software) need to process not only the ActivityPub outbox but also the inbox. So this thread and its comments need to be sent to several instances, and each of those instances then needs to process them. The other way around is also true: I created this thread here, and it had to be sent to multiple instances (which, from their server's perspective, arrives in their inbox). Kbin.social mainly had issues with sending out messages (the outbox), because many instances were not responding or were blocking requests, causing retries and eventually a growing queue.
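To make the outbox side concrete, here is a minimal sketch (not kbin's actual code, which uses PHP and Symfony Messenger) of how one local activity fans out into many delivery jobs, and how an unreachable or blocking instance turns into retries that pile up in the queue. The class, function and queue names are hypothetical, and real deliveries also need HTTP signatures, which are omitted here.

```python
import json
import queue
import time
import urllib.request


class Delivery:
    """One activity that still has to be POSTed to one remote inbox."""

    def __init__(self, activity: dict, inbox_url: str, attempts: int = 0):
        self.activity = activity
        self.inbox_url = inbox_url
        self.attempts = attempts


MAX_ATTEMPTS = 5
outbox_queue: "queue.Queue[Delivery]" = queue.Queue()


def fan_out(activity: dict, follower_inboxes: list[str]) -> None:
    """One local comment becomes N delivery jobs, one per remote inbox."""
    for inbox in follower_inboxes:
        outbox_queue.put(Delivery(activity, inbox))


def deliver(job: Delivery) -> None:
    """POST the activity to the remote inbox; re-queue on failure."""
    req = urllib.request.Request(
        job.inbox_url,
        data=json.dumps(job.activity).encode(),
        headers={"Content-Type": "application/activity+json"},
        method="POST",
    )
    try:
        urllib.request.urlopen(req, timeout=10)
    except Exception:
        job.attempts += 1
        if job.attempts < MAX_ATTEMPTS:
            # Unresponsive or blocking instances end up here over and over,
            # which is how a backlog grows into the hundreds of thousands.
            time.sleep(2 ** job.attempts)  # crude exponential backoff
            outbox_queue.put(job)
```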

Right, but if you have a small instance for personal use, only stuff from communities/magazines you have interacted with is sent, no? The complete traffic from those communities shouldn't be that hard to handle, in my understanding. Unless you subscribe to 196, in which case good luck. ;)

(I just blocked them, my first block so far. Even though it doesn't show up in /sub it was too much on the "all" page.)

I thought so as well... But I see threads in my random magazine from users I do not follow, while I don't follow that magazine or domain either. For instance this thread: https://kbin.melroy.org/m/random/t/4864/A-cache-of-90s-internal-documents-at-Sega-has-suddenly

It seems like once a remote user is added in kbin (due to some post or thread in a magazine I do follow), I will get all the comments, threads and posts from that user as well, on top of the magazines and people I do follow...

But it does seem like for the largest instances the economics will favour centralisation. Decentralisation is typically expensive and inefficient as it often requires duplication of resources.

The ActivityPub protocol seems to encourage big instances (like the big and popular Mastodon, kbin and Lemmy instances). I hope that spreading content across the fediverse, and finding information across all of it, will improve over the coming years. Users should also understand the concept of decentralization much better; users moving to other (smaller) instances would bring the load down.

Bottom line, I hope AP will be further improved, and that users will understand the power of decentralization better (in what is currently a very, very centralized web).

Soon the kbin.social instance will be moved to new infrastructure (Docker on a Kubernetes cluster), which hopefully will fix the scalability issues we're currently experiencing.

So it’s two-fold: the underlying technology and the amount of data it can handle.

Expect growing pains as they (the instances) find tech that works.

The underlying technology (ActivityPub) indeed has quite some downsides in terms of scalability. At the same time, big fediverse instances need to process large amounts of data (not only local data but also external data from remote instances). Plus /kbin was still in an early development phase, not fully ready to scale, when the big and unexpected migration (due to Reddit ...) hit. All of those things are now coming together, all at once.

It's kind of awesome we are having these problems really.

Is there somewhere we can see the status of the queue? Like useless but fun info for users; it gives people some insight into the number of messages sent and received.

I can imagine wasting hours at work looking at that xD

I was thinking of starting my own instance just so I can look at the data.

Sorry for the other picture, that was for the lolz. But RabbitMQ is behind firewalls; you can't see any live status at the moment. I can send you an image, however, of what it looked like in RabbitMQ. See attachment.
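For anyone running their own instance: if the RabbitMQ management plugin is enabled and reachable, the queue depth is one HTTP call away. A minimal sketch, assuming the default management port, the default guest credentials and a queue called "messenger"; all three are assumptions about your own setup, not kbin.social's.

```python
import base64
import json
import urllib.request
from urllib.parse import quote

HOST = "http://localhost:15672"  # default port of the RabbitMQ management plugin
VHOST = quote("/", safe="")      # default vhost, URL-encoded as %2F
QUEUE = "messenger"              # hypothetical queue name


def queue_depth() -> int:
    """Return the number of messages currently sitting in the queue."""
    url = f"{HOST}/api/queues/{VHOST}/{QUEUE}"
    req = urllib.request.Request(url)
    # The management API uses HTTP basic auth; guest/guest is the default.
    token = base64.b64encode(b"guest:guest").decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)["messages"]


if __name__ == "__main__":
    print(f"messages in queue: {queue_depth()}")
```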

It's catching up now. For an example of something similar: https://zkillboard.com/ztop/

So what would that mean? We wouldn't be able to see the content from them for now? Cause I'm from lemmy.world lol

You are now viewing content from my instance actually, not kbin.social. But basically yes, though I think not for too long. I expect kbin.social to unblock several domains (incl. lemmy.world) again after the queue is fully processed.

If they've temporarily defederated already then they won't see your comment. Obviously this one isn't nearly as drama-filled, but here's the post about what being defederated means from when beehaw did it:

https://lemmy.world/post/149743

Ignoring the commentary parts, the technical explanation of what it means should still apply here.

Correct, hence you don't want to defederate for too long. There are also possibilities to backfill data again, meaning you are able to fetch older posts and comments on your instance again, restoring federation.
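To illustrate what backfilling can look like at the protocol level: every ActivityPub actor (including a Lemmy community or kbin magazine) exposes an outbox collection, so in principle you can walk that collection and re-process the activities you missed. This is a simplified sketch under that assumption, not kbin's actual implementation; the actor URL is hypothetical and real servers may require signed requests.

```python
import json
import urllib.request

ACCEPT = "application/activity+json"


def fetch_json(url: str) -> dict:
    """GET a URL with the ActivityPub media type and parse the JSON."""
    req = urllib.request.Request(url, headers={"Accept": ACCEPT})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def backfill(actor_url: str, limit: int = 50) -> list[dict]:
    """Walk an actor's outbox collection and gather its most recent activities."""
    actor = fetch_json(actor_url)
    outbox = fetch_json(actor["outbox"])
    page_url = outbox.get("first")  # an OrderedCollection points to its first page
    items: list[dict] = []
    while page_url and len(items) < limit:
        page = fetch_json(page_url if isinstance(page_url, str) else page_url["id"])
        items.extend(page.get("orderedItems", []))
        page_url = page.get("next")
    return items[:limit]


# Hypothetical community actor; an instance would then feed each Create/Announce
# activity through its normal inbox processing to restore the missing content.
# posts = backfill("https://lemmy.example.org/c/some_community")
```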

I'm surprised to see lemmy.sdf.org on the list of domains causing trouble when it only has 2.7k users. Growing pains are weird lol

Actually, in that case it's not the number of users, but the downtime of that server yesterday, which caused a spike in queued messages on kbin.social.

See errors: https://fediverse.observer/lemmy.sdf.org

Or... it could be that lemmy.sdf.org blocked kbin.social, which is also possible, but I can't verify this. I can only confirm that https://lemmy.ml is actively blocking all kbin instances based on user-agent strings.
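A quick way to check for user-agent based blocking yourself is to request the same lightweight endpoint twice with different User-Agent headers and compare the responses. The nodeinfo well-known path is standard; the kbin-style User-Agent string below is an assumption, not the exact string kbin sends.

```python
import urllib.error
import urllib.request

URL = "https://lemmy.ml/.well-known/nodeinfo"  # standard, lightweight endpoint


def status_for(user_agent: str) -> int:
    """Return the HTTP status code the server answers with for this User-Agent."""
    req = urllib.request.Request(URL, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code


# The first string mimics what a kbin server might send (an assumption);
# the second is a generic agent for comparison. Different status codes
# (e.g. 403 vs 200) would point at user-agent based blocking.
print("kbin-like UA:", status_for("kbinBot/1.0 (+https://kbin.example.org)"))
print("generic UA:  ", status_for("Mozilla/5.0 (compatible; fediverse-check)"))
```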

My main account is on SDF, and they've been upgrading their server hardware today so hopefully it's all worked out soon.

ETA: https://lemmy.sdf.org/post/502127

Ah interesting. I don't believe kbin is blocked so the downtime theory sounds like it could be it.

"Posted 107 minutes in the future"

And the number is actually going downwards :D