What witchcraft is this? System transfer problem

stasheck
Posts: 126
Joined: 15 Jan 2014, 13:13

What witchcraft is this? System transfer problem

Post by stasheck »

I'm in deep trouble. It's quite probable you're going to facepalm a few times when reading this, but please refrain from caustic comments - I'm already quite exhausted after spending my entire Sunday (which I had other plans for) on this issue.

TL;DR: I do not have a working system, and the B3 seems to be doing weird stuff to the partition tables of drives installed internally.

I wanted to move to a new harddrive - 3TB HGST proved to be waaay too loud, so I ordered 2TB WD Red for the B3. Also: I am running Arch, installed using Sakaki's scripts.

I had a pendrive with Arch and Sakaki's scripts lying around since I first moved to Arch, so it's probably v1.0 or 1.1 or something around that. I checked the script - as you know, it creates the partition table, copies system files, kernel etc., and then you're good to go.

I connected the WD Red over USB enclosure to be able to transfer the ~1.5 TB of data while server was still online - just stopped Samba and AFP, but otherwise transferring the system seemed simple - this seems the same thing that Sakaki's script was doing.

Thing to note: my /home is a separate partition.

The data got transferred over during Saturday and Saturday night, and by Sunday I was ready to move. (now, this might or might not be important: I did pacman -Syu prior to copy, and did not reboot system afterwards, I just forgot to do it).

I installed the WD Red in the B3, rebooted... only to be met with a purple blinkenlight. Weird. So I started the system from the aforementioned pendrive - imagine my surprise when I saw that the partition table was missing! (Or rather: there was something there, and the drive claimed to have an msdos partition table - I used GPT - but at that time it didn't seem important to write it down. Now I know this was the first occurrence of witchcraft.) Well, the partition table might be missing, but I had just spent over 20 hours copying stuff, and the data was there - so I created a new partition table, recreated all of the partitions, and lo and behold - I could access all the data again, no problem. So I rebooted again.

B3 went up, network connectivity went up, all is nice and dandy... hey, why does it not accept my SSH password?

Another reboot with the pendrive - weird stuff. Journald shows "failed password" for every login attempt. I even moved the password entry over to the pendrive's /etc/shadow to test it out, and it worked fine there. So, weird stuff. I enabled root login and rebooted. The root password did not work either.

So I connected the HGST via the USB enclosure to md5 the data and check for corruption. Imagine my surprise when the HGST did not have a partition table either! (witchcraft part II) Well, I had dealt with this on the first drive, so I tried doing the same: recreate the partition table, create the partitions... only this time it did not work. Maybe I forgot how I set them up in the first place; maybe there's something else at play. The data is still there, so I'll probably be able to recover it, but for now it was a no-go.
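For anyone wanting to do the same md5 comparison between an old and a new drive, here is a minimal sketch in Python. The function names are made up for illustration, and it assumes both filesystems are mounted and contain only regular files and directories (a dangling symlink would make it raise).

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files are not loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_files(src_root, dst_root):
    """Return relative paths under src_root that are missing or differ under dst_root."""
    bad = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            # A file counts as bad if the copy is absent or its checksum differs.
            if not os.path.isfile(dst) or md5_of_file(src) != md5_of_file(dst):
                bad.append(rel)
    return sorted(bad)
```

Calling differing_files() with the two mount points (old drive first) returns an empty list when every file on the old drive has an identical copy on the new one.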

Now come a few hours that are a blur, and it's hard for me to remember exactly what I did. Somewhere near the beginning, for no apparent reason, networking on the B3 stopped coming up. Neither LAN nor WAN came up. So any further troubleshooting was: change something, try to boot, wait, reconnect cables, reboot with the pendrive, change again and so on.

At some stage, I noticed that networking was not coming up because the script that gives eth0 and eth1 their MAC addresses was failing; it was failing because /dev/mtd0 was missing. Some other devices were missing as well. That seemed to be some kind of lead - I tried copying the device nodes from the pendrive (as in Sakaki's script) - no go; then I tried using old kernels from pacman's cache (running depmod afterwards in a chroot) - still no go. I tested quite a number of them.
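One way to tell whether the kernel registered the flash at all (as opposed to the device nodes merely being absent) is to look at /proc/mtd. The following is only a sketch: parse_proc_mtd is a made-up helper name, and the sample text in the comment is illustrative, not taken from a real B3.

```python
import re

# Matches lines such as:  mtd0: 000c0000 00010000 "uboot"
# (illustrative sample; sizes are hexadecimal)
_MTD_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def parse_proc_mtd(text):
    """Parse the contents of /proc/mtd into {device: {size, erasesize, label}}."""
    devices = {}
    for line in text.splitlines():
        match = _MTD_LINE.match(line.strip())
        if match:  # the header line "dev: size erasesize name" does not match
            name, size, erasesize, label = match.groups()
            devices[name] = {
                "size": int(size, 16),
                "erasesize": int(erasesize, 16),
                "label": label,
            }
    return devices
```

Feeding it the text read from /proc/mtd and checking whether "mtd0" is among the keys gives a quick answer; if it is not, that would suggest the running kernel never registered the flash partitions, in which case copying device nodes is unlikely to help.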

I can't even give you a log now, since despite copying fragments on pendrive, I left the pendrive at home (I'm writing from the office now).

There's clearly something I failed to do, but I don't know what it is.

And, above all: can someone explain what the B3 is doing with partition tables? (I also noticed it about half a year ago when disposing of an old HDD, but it didn't seem important back then - I had to reformat the drive anyway.)
Gordon
Posts: 1461
Joined: 10 Aug 2011, 03:18

Re: What witchcraft is this? System transfer problem

Post by Gordon »

I don't think the B3 does anything to your partition tables. However, if you use a tool that is not GPT-aware to read a GPT-partitioned disk, it will only see the protective MBR and show you a single msdos partition covering the disk. I guess that is what you were seeing?
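To check this without relying on any particular partitioning tool, one can look at the first two sectors directly. A rough sketch in Python follows; the function name is made up, and it assumes 512-byte logical sectors. A GPT disk carries a protective MBR entry of type 0xEE in sector 0 and the signature "EFI PART" at the start of sector 1.

```python
SECTOR = 512  # assumption: 512-byte logical sectors


def describe_partition_table(first_two_sectors: bytes) -> str:
    """Classify the partition scheme from the first 1024 bytes of a disk."""
    mbr = first_two_sectors[:SECTOR]
    lba1 = first_two_sectors[SECTOR:2 * SECTOR]
    # A valid MBR (protective or not) ends with the boot signature 0x55 0xAA.
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        return "no MBR signature"
    # Four 16-byte partition entries start at offset 446;
    # byte 4 of each entry holds the partition type.
    types = [mbr[446 + 16 * i + 4] for i in range(4)]
    has_gpt_header = lba1[:8] == b"EFI PART"
    if 0xEE in types and has_gpt_header:
        return "GPT (protective MBR present)"
    if 0xEE in types:
        return "protective MBR but no GPT header"
    if has_gpt_header:
        return "GPT header present but MBR is not protective"
    if any(types):
        return "msdos (MBR)"
    return "empty MBR"
```

To use it on a real drive, read the first 1024 bytes of the whole-disk device (for example /dev/sdX, a placeholder) as root and pass them in. A result of "protective MBR but no GPT header" would mean the GPT itself really is damaged rather than just misread.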

As for the missing mtd devices, I noticed those missing in one of the first B3 Gentoo kernels as well. It seemed to be a driver issue, because when I told the system it was another flash type supported by that same driver, it did load and created the three mtd devices - albeit with a warning that it had recognized the flash as being, in fact, the type from the original system description. The issue was automagically fixed in a later kernel (i.e. the kernel guys were not even aware the problem had existed).

Login issues can have several causes. It may be an incorrect setting in your /etc/ssh/sshd_config or /etc/pam.d/sshd. I even stumbled on an article mentioning random console messages having ended up in /etc/nologin.

I think the most straightforward method to fix this is to start fresh and use the latest release from the Live USB. Is that an option, or would that require you to reinvent everything you did after installing Arch the first time?
stasheck
Posts: 126
Joined: 15 Jan 2014, 13:13

Re: What witchcraft is this? System transfer problem

Post by stasheck »

I still think that it does something to partition tables - I was only using fdisk to check, and besides, even the kernel does not recognize them. Maybe I'll dig a bit more in some spare time.

Anyway, I gave up trying to restore the system - after all, the reason to set up /home on separate partition was to handle occasions just like this one. So I reinstalled all from scratch, which allowed me to cleanly get rid of shorewall :-) and back to plain iptables.

At least one of the problems seems to be that with the update to OpenSSH 7, you can no longer log in as root with a password by default, which bit me in the posterior when I updated the install stick :-) Live and learn.
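For reference, OpenSSH 7.0 changed the default of PermitRootLogin from "yes" to "prohibit-password", so a config that never set the option explicitly behaves differently after the upgrade. Below is a small sketch for checking what a given sshd_config ends up with. The function name is made up, and it is deliberately simplified: it ignores Include directives and stops at the first Match block.

```python
def effective_permit_root_login(config_text: str,
                                default: str = "prohibit-password") -> str:
    """Return the global PermitRootLogin value from sshd_config text.

    sshd uses the first value it finds for a keyword, and keywords are
    case-insensitive. The default argument reflects OpenSSH 7.0 and later.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        keyword = parts[0].lower()
        if keyword == "match":
            break  # everything after this is conditional, not global
        if keyword == "permitrootlogin" and len(parts) > 1:
            return parts[1]
    return default
```

Passing it the contents of /etc/ssh/sshd_config from the installed system shows whether root password logins were ever explicitly enabled; a commented-out "#PermitRootLogin yes" line does not count.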

Another 3 hours and the system was running the same way as before.