The impending database implosion???
BotB Academy Bug Reports and Feature Requests
 
 
75279
Level 6 Playa
Yuki
 
 
post #75279 :: 2016.12.21 5:57pm
  
  anewuser, MiDoRi, RazerBlue6, Jimmyoshi, sleeparrow, Dimeback and Modus Ponens liēkd this
I considered replying to the existing thread about this, but it has an ugly title which I didn't want to bump!

I noticed there have been intermittent database errors and other shenanigans, which look exactly like the result of corrupted data.

This is where I get worried, because data corruption propagates and spreads itself. I also see several irreproducible and untraceable errors pop up in the bug reports, and I wouldn't be surprised if they're all caused by the very same thing.

I saw puke said something about the database needing to be rebuilt with a new encoding. It probably does, but I doubt it will fix this specific issue. The most recent errors I've seen don't look like the result of database encoding mismatches.

A large part of these errors mention broken database tables, so the logical choice would be to assume that the database image itself is corrupted, except....

- I noticed a plain PHP error in the reports, unrelated to MySQL.
- If it actually were a broken database table, these errors would show up consistently on the same page, not intermittently.

I'm afraid this is actually a hardware problem with the server. :x I'm thinking along the lines of a failing hard disk, memory, motherboard and/or processor core(s).

If that were the case, and one were to repair all errors and corruption in the database and rebuild it, it'd quickly get corrupted again.

Now I'm not sure whether this site is still hosted by dreamhost, moved to a dedicated server or whether it's directly hosted on puke's estate.

My suggestion? If feasible, test the server hardware for integrity. A hard drive sector check and memtest should rule out any hardware issues. I guess that's impossible if this server is managed and held hostage by a hosting company. And even then, BotB would be down for however long that takes (several hours/days).

What I fear the most is that this site is hosted on a server that's gone bad, and the company in charge is reluctant to fix it or even check its integrity. Hopefully that's not the case!

Most of this is me speculating (based on what I've learned). I don't know what has been tried to fix/diagnose these problems. Maybe the powers that be can shine a light on this! It would be a big relief if it turns out my worst fears are impossible.
 
 
75284
Level 14 Mixist
johnfn
 
 
post #75284 :: 2016.12.22 12:19am :: edit 2016.12.22 12:19am
  
  sleeparrow and MiDoRi hæitd this
  
  anewuser liēkd this
> This is where I get worried, because data corruption propagates and spreads itself.

I'm a programmer and I can assure you that this does not actually happen. :) So hopefully that can ease some of your fears right off the bat.

One possibility for what is going on (I haven't witnessed the problem, so I can't really say for sure) is that some Unicode characters that the website can't handle got inserted into the database somehow. The corruption would "propagate" in the sense that if a comment had a bad character, then anywhere that comment shows up would also have the "corruption".

But in reality, nothing is broken! Simply going and editing the database to remove the bad character would fix the problem. No data is lost in the database.
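
To make the "bad character" idea concrete, here is a minimal sketch of how one might hunt for such characters. It assumes, purely for illustration, that the culprits are characters above U+FFFF, which MySQL's older 3-byte `utf8` charset cannot store; whether that is what is happening on BotB is a guess. The `comments` dict and both function names are hypothetical stand-ins for real database rows and tooling.

```python
def find_unsupported_chars(text, max_code_point=0xFFFF):
    """Return (index, character) pairs for code points above max_code_point."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > max_code_point]


def strip_unsupported_chars(text, max_code_point=0xFFFF, replacement="?"):
    """Replace every character above max_code_point with a placeholder."""
    return "".join(ch if ord(ch) <= max_code_point else replacement for ch in text)


# Hypothetical comment rows; in practice these would come from a database query.
comments = {
    1: "nice entry!",
    2: "great tune \U0001F3B5",  # musical note emoji, outside the BMP
}

for comment_id, body in comments.items():
    bad = find_unsupported_chars(body)
    if bad:
        # Report the offending row and show what a cleaned version would look like.
        print(comment_id, bad, "->", strip_unsupported_chars(body))
```

Only the rows that contain an unsupported character get reported, so the cleanup touches nothing else.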
 
 
75285
Level 23 Pixelist
MiDoRi
 
 
 
post #75285 :: 2016.12.22 4:50am :: edit 2016.12.22 4:51am
  
  sleeparrow, Yuki and anewuser liēkd this
If the data corruption were caused by failing hardware (HDDs), it most likely WILL propagate, since a hard disk's count of failures/unrecoverable sector errors tends to increase gradually.

Also, from a programmatic point of view, any new data derived from already-corrupted data (computed from it, or taking it as an argument) would be corrupted too, so it surely can spread further and wreak some havoc.
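
A toy sketch of that second point, assuming a single flipped bit in one stored value. The entry names, the scores, and the `flip_bit` helper are all made up for illustration and have nothing to do with BotB's actual schema.

```python
def flip_bit(value, bit):
    """Return value with one bit inverted, simulating a single-bit hardware error."""
    return value ^ (1 << bit)


# Hypothetical per-entry scores; the names and numbers are invented.
scores = {"entry_a": 40, "entry_b": 25, "entry_c": 35}
total_before = sum(scores.values())

# One bit flips in a single stored value.
scores["entry_b"] = flip_bit(scores["entry_b"], 6)

# Anything recomputed from the damaged value inherits the error.
total_after = sum(scores.values())
ranking = sorted(scores, key=scores.get, reverse=True)

print(total_before, total_after)
print(ranking)
```

Only one value was damaged, yet both the total and the ranking computed afterwards are wrong, and nothing in them hints at where the error came from.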
 
 
75289
Level 23 Renderist
anewuser
 
 
 
post #75289 :: 2016.12.22 6:58am
  
  Yuki, ViLXDRYAD and MiDoRi liēkd this
Yuki, johnfn, ty for your concerns. They are mine as well. I've had two HDDs fail on me slowly but surely. I wouldn't rule out a disk sector check, even if it takes a week. Also, I'd most definitely back up now. All content is important, but the compositions in particular, from music to visuals to oddballs like mariopaint or svg ohcs, are of the utmost importance. Easily 10k entries? I've lost count.

Unicode being the reason? Who knows. Back in 2005-2006 Unicode was still in diapers, and BotBers kept testing the boundaries with submissions and msgs to see what/if/when/how Unicode characters would break the site. puke7 often had to fix ohcs.

My thinking as well: my concern is hardware failure.
 
 
75290
Level 23 Pixelist
MiDoRi
 
 
 
post #75290 :: 2016.12.22 7:50am
  
  ViLXDRYAD liēkd this
Yeah, if BotB isn't backed up anywhere yet, Puke should do this ASAP. Losing so many precious entries is no fun :(
 
 
75295
Level 6 Playa
Yuki
 
 
post #75295 :: 2016.12.22 9:53am
  
  sleeparrow and MiDoRi liēkd this
I am very much against doing a file system or hard drive check before running memtest.

The danger here is: if random bits are being set/reset during read/write operations by a faulty processor or memory region, then fsck will detect errors in a perfectly healthy filesystem. If you let it try to fix those errors, you'd end up with a corrupted filesystem. :x

I also think a memory or processor core failure is much more likely than a hard drive sector failure, given the random and intermittent nature of the errors.
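
One cheap, read-only way to probe for this kind of intermittent fault is to hash the same unchanging file several times and see whether the digests agree. This is only a rough sketch and no substitute for memtest: repeated reads are often served from the OS page cache, so it mostly exercises RAM and CPU rather than the disk. The function names and the throwaway demo file are hypothetical.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reads_are_stable(path, attempts=5):
    """Hash the same file several times and report whether every digest matches."""
    digests = {file_digest(path) for _ in range(attempts)}
    return len(digests) == 1


# Demo on a throwaway file; on a real server you would point this at a large
# file that is known not to change between reads.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1 << 16))
    sample_path = tmp.name

print(reads_are_stable(sample_path))
os.remove(sample_path)
```

If the digests of an unchanged file ever disagree, the data on disk is probably fine and something in the read path is flipping bits, which is exactly the situation where letting fsck "repair" things would do damage.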

I'm sure the database is backed up regularly. Usually with this sort of thing, the entries are stored outside of the database. The database only contains a local path to them.

I'm not sure how often all the entries are backed up. Probably whenever the entire server image itself is backed up. The last backup might be years old even! I personally don't have much of a problem with this since I haven't made anything in the past year. :P
 
 
75297
Level 23 Pixelist
MiDoRi
 
 
 
post #75297 :: 2016.12.22 10:03am
Your own entries are of least concern, Yuki, since most probably you have local copies and sources of those :D
 
 
