Is the daily down time of lemmy.world because of attacks, system instability, or some kind of resource issue?
It seems like most times I go to my .world account, I get the bad gateway error. Is there a fix for this?
It's really getting old. But if it's DDoS attacks, I'm not giving up. I might if it were just repeated resource issues, but I refuse to be pushed out by assholes.
There's no reason not to use a different home instance. It's generally a good idea for sharing the load.
As you can see, I posted this from a different instance, so I get that. But I've spent a lot of time subscribing to things from my .world account, and it would take a long time to duplicate that on an alt, so my preference would be to stay there as my default.
I haven't tried it myself but there seems to be something for this: https://github.com/CMahaff/lasim
I assume I'd need to do that on a desktop, correct? I've been exclusively mobile, but could give it a try.
It was me. There is a light switch in my garage … I didn’t know what it was for. Figured out today it launches attacks on Lemmy.world 😭
On that score, there's not much use trying to reason it out. The 4chan types seem to like kicking down other people's sandcastles. It doesn't buy them anything, they just seem to get off on increasing unhappiness.
I’d personally blame Slovenia, but that might be unfair as there is no evidence or motive. They may still be behind my late car payment, though. In all seriousness, no one knows but the admins have posted about it weekly.
Sync was released recently, and - like many apps before it - directed its users to lemmy.world as a 'default' instance, so they've had an influx of users to contend with.
Also, if you ask for the 'next page' of communities via their API, it'll just keep feeding you the same ones, over and over, even if you ask for Page 1 Billion, so there's probably some bots, crawlers, front-ends etc thrashing the hell out of it.
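A defensive client could at least stop hammering the server once it notices the repeats. This is a hypothetical sketch, not Lemmy's actual API client: `fetch_page` stands in for whatever call retrieves one page of communities, and the fake server below just simulates the "same ones, over and over" behavior described above.

```python
def fetch_until_stable(fetch_page, max_pages=50):
    """Paginate defensively: stop when a page comes back empty or
    repeats a page we've already seen, instead of thrashing the server."""
    seen_pages = set()
    items = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break
        fingerprint = tuple(batch)
        if fingerprint in seen_pages:
            break  # server is recycling results; give up here
        seen_pages.add(fingerprint)
        items.extend(batch)
    return items


# Simulated server with only 3 real pages that then repeats the last
# page forever -- roughly the failure mode described in the comment.
_pages = {1: ["a", "b"], 2: ["c", "d"], 3: ["e"]}

def fake_fetch(page):
    return _pages.get(page, _pages[3])

print(fetch_until_stable(fake_fetch))  # ['a', 'b', 'c', 'd', 'e']
```

A well-behaved bot or front-end doing something like this would bail out after one wasted request instead of a billion.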
I don't know about DDoS attacks, but while the problems are happening, "DDoS" seems to act as a catch-all to blame everything on.
It's not the new-user influx. If all the users magically disappeared, it would still be down. It's an issue with the database and the queries Lemmy generates.
It's a Lemmy bug in the backend. The database breaks down because of certain queries and stays stuck until a restart.
Does the database not come with features to analyze query times and log queries that exceed a given threshold? I thought this was a core feature of Postgres.
Or are you saying that it's a general issue, and everything is taking longer than expected?
I'm no Postgres expert, but I think log_min_duration_statement is one way to find queries that take an abnormally long time to execute.
Causes the duration of each completed statement to be logged if the statement ran for at least the specified amount of time. For example, if you set it to 250ms then all SQL statements that run 250ms or longer will be logged. Enabling this parameter can be helpful in tracking down unoptimized queries in your applications.
https://www.postgresql.org/docs/current/runtime-config-logging.html
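For example (the 250 ms threshold is just illustrative, and the change needs a config reload to take effect):

```
# postgresql.conf -- log any statement that runs for 250 ms or longer
log_min_duration_statement = 250ms

# or, equivalently, from a superuser session:
#   ALTER SYSTEM SET log_min_duration_statement = '250ms';
#   SELECT pg_reload_conf();
```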
I know scaling DBs can be tricky, but I also know there are bootstrap solutions that go pretty high up before needing custom work.
I don't know enough about the Lemmy infrastructure, but did they build some custom from-scratch framework, or did they start with something stable and tested?
The thing is, it's not the number of available database connections. Rather, one query takes extremely long and blocks everything else.
Lemmy uses the Diesel ORM on top of a large collection of Rust libraries, so I guess you could say they rolled their own framework. I've never encountered a framework that I believe could handle non-trivial high-traffic web applications out of the box. I worked on a project that used Django for years; by the time we were done, we had bypassed almost all of Django's functionality to get it to scale with our data and users.
I'm completely ok with this happening, it is to be expected with a project of this magnitude.
Be prepared, have a second account on another instance.
What I find strange is the absence of communication. This is the third day with major issues and I haven't seen an announcement. Or even just people talking about it? Nothing (until now). I might have missed it all, but I tried the search function with no luck.
I agree completely, and posted this question from an alternate account. The admins have been so transparent, the silence on this issue seems really weird. Even the posts on the instance status site are relatively cryptic. I posted this question because it's frustrating not knowing what's going on.
Might be related to the new admin call-out for volunteers from a few days ago. If so, I think someone just failed their first-day exam. The only way to deal with this is far, far, far more transparency about the ineptitude, and someone who learns extremely quickly.
It's been going on since well before that, and they've been pretty good at transparency generally, so that seems unlikely to me.