Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Pending Move
Re: Pending Move
vorticity Wrote:Stay safe, BA.  "As long as people are smart" sounds like asking for a bit much out of people, though.

Around these parts, 12 inches of rain would be the annual total in a wet year.  I hope y'all can handle it in just a few days.
So far, so good.  We've been getting a lot less rain than I thought we would.

And no worries.  Flood control is absolutely nothing new here in San Antonio.

http://www.mysanantonio.com/150years/ar ... 192388.php

http://en.wikipedia.org/wiki/San_Anton ... lk#History

And a more recent effort:

http://www.expressnews.com/150years/maj ... 378378.php
Reply
Re: Pending Move
Glad to hear you're ok, BA.

Meanwhile, I've avoided work on the forums over the weekend basically out of feeling stupid over the whole thing.  However, on Saturday night Peggy and I came up with a fairly straightforward (and admittedly kinda obvious) way to address the remaining messages with HTML without having to start over from scratch, which, if I get the time tonight (friends are coming over right after work for what I hope is a short visit) I will try to at least get a start on it.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
And it worked.  Oh, and Brent?  There are about 5 or 10 messages in the converted DB that still have unconverted HTML in them.  Some of it is mysterious -- <br> tags still in the text -- and some of it is what I assume is essentially unique markup that you didn't anticipate.  Fortunately there were so few that I could easily fix them individually.

What this means is... we are live to start porting threads over to the new board as of right this freakin' second.  I'm going to see how much of the Marketplace I can in fact do in the next half hour or so, using it as my final QA run for my scripting.  However...

LAST CALL FOR NAME CHANGES!  If you've changed your handle between here and the new boards, and you want your old messages associated with the new handle, TELL ME NOW!

ETA:  Hm.  6 threads and two bugs to fix.  The thread record isn't getting the correct thread date -- not sure why that's happening yet, as it worked fine for the very first test thread.  And I just ran into my first old-board user who isn't registered -- gotta do a coalesce on the uid, no big. We're getting there, people.

ETA2:  Well.  You learn something every day.  Don't put sql keywords into message text in such a way that they actually look like programming language.  Crapatalk will give you a full-screen nastygram.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
And the Marketplace has been moved in its entirety.  Each thread was moved individually, but I do believe that in the process I've thoroughly debugged my scripts, and I can now batch-move threads without worries.  But that's for tomorrow night. -- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
Hmmm. I noticed the thread count isn't the same, but it looks like that's just because I can see spam topics that have been hidden.

Odd though that the most recent post when looking in the new forum is not the one shown for last post in the forum overview.

-Morgan.
Reply
Re: Pending Move
WOOHOO!!!

BYE BYE CRAPATALK!  Big Grin

EDIT:
This is wonderful news!

Bob, what would you think of doing Other People's Fanfiction next?  Crapatalk utterly mangled my Being You is Suffering story thread when it got moved over, and on top of that I'm gonna have to go in there and rework the timeline.  (Turns out it's supposed to be 1993 to 1994 for the window between OVA2 and OVA3.  That's... gonna change a few things.)
Reply
Re: Pending Move
Gyah.  Double post - sorry!
Reply
Re: Pending Move
Morganite Wrote:Hmmm. I noticed the thread count isn't the same, but it looks like that's just because I can see spam topics that have been hidden.
Odd though that the most recent post when looking in the new forum is not the one shown for last post in the forum overview.
It is in fact the case that three old spam threads were not brought over.  They weren't deleted, just hidden, back when that was part of the standard procedure for reporting spam to ezBoard, and I never cleaned them up.  Of course, Crapatalk still counts them when displaying the thread count for the forum.  It also appears that it counts the global sticky -- the "Pending Move" thread -- even though it doesn't live in the forum.

As for the most recent post, I'll look into that.  It's almost certainly the result of a small bit of logic or perhaps an unwarranted assumption about the data on my part.  If worst comes to worst, I can fix it manually after I'm done with the import.  Of necessity I've become a dab hand at manipulating the board data over the past couple weeks.  -- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
BA, give me a few days on that.  I'm writing my batch thread-transfer code now, and that's going to need testing and debugging before I'm comfortable setting it loose on a large and important forum.  Plus, as a forum with a lot of activity, there are many new posts since Brent did his scrape.  I need to coordinate with him on an incremental update to each forum as I'm ready to bring it over.   

Which reminds me... I need to establish procedure with you guys about moving a busy forum so that nothing gets lost or forgotten during the transition.  I'll probably give notice when I'm going to move one, and lock it, and ask folks to hold off on posting in the new forums until the transfer is done.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
Bob Schroeck Wrote:...
Which reminds me... I need to establish procedure with you guys about moving a busy forum so that nothing gets lost or forgotten during the transition.  I'll probably give notice when I'm going to move one, and lock it, and ask folks to hold off on posting in the new forums until the transfer is done.

Putting my "storage guy" hat on...

There are only two safe ways to do offsite replication.

One is to do it in real time - have changes on one side show up on the other immediately. That requires resources that we don't have and aren't likely to have; I strongly suspect Tapatalk won't run a continuous process that helps them lose customers.

The other - the one that we can actually do - is to lock both the source and the destination so that neither can be updated except by the migration script, then read the source and write it to the destination. In this case, "read the source" means "read the diff between what Brent has already captured and the live version", so we'll save some time there.

Anything else risks losing changes made to either the source or the destination while the destination is being populated.--
Rob Kelk
"Governments have no right to question the loyalty of those who oppose
them. Adversaries remain citizens of the same state, common subjects of
the same sovereign, servants of the same law."

- Michael Ignatieff, addressing Stanford University in 2012
Reply
Re: Pending Move
I've no problems leaving the destination unlocked, as long as no one tries to restart a thread that's going to be moved.

I'll drop Brent an email tonight about doing the diffs.  Technically, I suppose I could do them myself, as I have access to the scraper (he put it up on Github), but that would just add to the time it takes to get everything moved over; if he's willing to do them in parallel with me that might save some time.  Plus I need to ask him if his HTML conversion is part of the scraper or separate.  And if it doesn't have the ability to scrape only those messages after a given date/time, I think I'd rather ask him to make that change instead of puzzling through a program written in a language I've never used (Python) to try to find the right place to do it.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
Double-check what the new forum uses for unique message IDs before you decide to leave the destination unlocked during a move. Overwriting an existing new message with an imported message would at the least lose the new message, and might (depending on how the forum's database front-end is coded) break the message database altogether.--
Rob Kelk
"Governments have no right to question the loyalty of those who oppose
them. Adversaries remain citizens of the same state, common subjects of
the same sovereign, servants of the same law."

- Michael Ignatieff, addressing Stanford University in 2012
Reply
Re: Pending Move
That's not an issue.  Thread and message ids are autoincrementing identities, and I don't set them during the insertion of records into the new forum database tables.  In fact, one of my early challenges in writing the MySQL code my T-SQL script outputs was finding the right way of getting those IDs after those inserts, for use in records in other tables.

No, I'm actually more concerned about stuff like two different "Fic Update 3x" threads appearing in a forum, which can happen, as there's no uniqueness constraints on thread names.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
Can you create a table that keeps track of the new thread ID and the old thread ID?  That might make it easier to deduplicate.

To get updates, I'll wait until we're locked, then grab everything off of the recent changes back until the day I finished my crawl.  It should be easy enough to even limit it to new posts, so long as people aren't editing posts that are more than a day old it should be fine.

I'm actually kind of surprised I found hidden threads, but I guess that's possible.  They must not be so hidden to be displayed to logged-out users.

tid=37 has some translation failures on the first post.  Basically it looks like Tapatalk is running photobucket through an image proxy which might even be good for privacy.  Still, not good in the post data.  That's like an easy regex cleanup, though.

Altogether, so many failure modes to cover in conversion.  Four different forms of email protection, and obfuscation on top of that.  I know the trouble Bob went through because I had similar issues with translation.  Special cases for different products: very fun.-- ?×V
Reply
Re: Pending Move
Quote:Can you create a table that keeps track of the new thread ID and the old thread ID?  That might make it easier to deduplicate.

Um.  Not easily.  The MySQL script knows nothing about the T-SQL database, but I should be able to write a script that compares thread titles to retrieve the IDs.  But frankly, if you just give me the data in the same format as before, I can locate the MyBB threads based on the titles and recycle my most of existing scripting from there.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
Well.  Good news.  My thread batching code works beautifully.  I transferred every forum in the Drunkard's Walk section that didn't have posts since the scrape, and locked them here.  Bad news:  phpMyAdmin or MySQL is unhappy with unusually long posts -- like Pyeknu's story chapters and the Nanosteps thread in "Future Steps".  Something in them barfs when I try to insert large text blocks into the message field of a post record.  It also barfs if I try to add them through the board editor. 

Anyway, to get those threads up successfully, I'm going to have to chop the text chunks into smaller pieces and post them in individual messages.

I've also found and installed a spoiler markup plug-in, so all our spoilered stuff will stay that way.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
My guess is the post data is stored in a TEXT field, which stores 64kiB.  If you bump that up to a MEDIUMTEXT (16 MiB), you'll be fine.  Since everyone is approved before posting, we shouldn't face a spam problem from that.  I didn't notice this because Sqlite's default TEXT allows you to store 1TB, and it can be compiled to allow 2TiB of storage.-- ?×V
Reply
Re: Pending Move
If Vorticity's suggestion doesn't work, Bob, you can just go ahead and leave out the content of my story posts and I can just edit them back in on the new board.

I'm worried that the ones that got garbled will become even more garbled anyhow. :p
Reply
Re: Pending Move
Thanks, BA.  And yeah, the garbling on Pyeknu's stuff is about as bad as you can get without actually losing or corrupting the text.

Brent, you are right -- the message field is TEXT.  I was actually going to research this morning whether there would be any adverse effects from changing the field to a maximum-sized VARCHAR.  (And once again, I get a Crapatalk nastygram for putting that in the standard form.)  But if MEDIUMTEXT will do the trick, I'll save myself the effort.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
And that's what I just did.  I tried to rerun the script for the first of the three threads which crashed and found that the Inserts I generated were truncated for the longest posts.  This looks like a problem in my scripting, so I'll have to figure out why that's happening despite the variable that holds the insert as it's being assembled being a max-sized varchar.  The inserts are certainly smaller than 2 Gb .

I definitely want to solve this before I move Other People's Fanfiction.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
Okay, Brent, after looking at the scripts that failed on my last night and figuring out why they didn't work, I can safely say that we have two points of failure where things are getting truncated.  Something on my end in the thread script in truncating the insert statement when it gets too long.  But... the message data that you scraped is also truncated when it gets too large -- probably the TEXT vs MEDIUMTEXT size issue.  You may want to revise that in the package on GitHub.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
I think we have a problem.

At least from here, every thread but that one is giving the same error message (not just imported threads).

-Morgan.
Reply
Re: Pending Move
For you as well as Jonathan?  Is anyone else seeing it?  As I just posted in the new forums, I once saw that error, but it was on a thread I'd been working on in SQL at the time, and a reload of the page where link was cleared it up.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply
Re: Pending Move
http://imgur.com/ZTEJ8px

I get the same problem Jonathan reported___________________________
"I've always wanted to be somebody, but I should have been more specific." - George Carlin
Reply
Re: Pending Move
Yeah, I'm getting it now, too.  And I know why.  I appear to have accidentally blown away every post with a thread ID under 810.  Assume I am cursing vigorously.-- Bob
---------
Then the horns kicked in...
...and my shoes began to squeak.
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)