rsync: page allocation failure.

theWebalyst
Posts: 96
Joined: 27 May 2010, 14:53

rsync: page allocation failure.

Post by theWebalyst » 16 Jan 2012, 15:04

From a Windows 7 PC I am using rsync over ssh to back up files to the B3. I've been doing this for some time and it works pretty well.

Alongside this I use Paragon to back up the PC's hard drive to an external drive. Recently I added transfer of these very large backup archive files to the B3. Each is 4 GB in size. On the PC, rsync runs for quite a while (10 minutes or so) and then fails, and du on the B3 shows that data is being stored until rsync stops.

Looking in B3 syslog I see that rsync has hit a memory problem. I found this thread in which Tor suggests trying "echo 4096 > /proc/sys/vm/min_free_kbytes" in relation to a similar problem.

I have not tried this because "cat /proc/sys/vm/min_free_kbytes" shows it is already at 8192!

Should I try a larger setting or am I bonkers to be trying to rsync multi-Gigabyte files?!

For info here is the syslog with backtrace:


Jan 16 17:21:54 b3 kernel: [5378047.849540] rsync: page allocation failure. order:1, mode:0x4020
Jan 16 17:21:54 b3 kernel: [5378047.855692] Backtrace:
Jan 16 17:21:54 b3 kernel: [5378047.858322] [<c00326ec>] (dump_backtrace+0x0/0x10c) from [<c0366e8c>] (dump_stack+0x18/0x1c)
Jan 16 17:21:54 b3 kernel: [5378047.866897] r6:00000000 r5:00000000 r4:00000000
Jan 16 17:21:54 b3 kernel: [5378047.871697] [<c0366e74>] (dump_stack+0x0/0x1c) from [<c007d550>] (__alloc_pages_nodemask+0x448/0x660)
Jan 16 17:21:54 b3 kernel: [5378047.881055] [<c007d108>] (__alloc_pages_nodemask+0x0/0x660) from [<c00a2d5c>] (new_slab+0x240/0x278)
Jan 16 17:21:54 b3 kernel: [5378047.890330] [<c00a2b1c>] (new_slab+0x0/0x278) from [<c00a2ef8>] (__slab_alloc+0x164/0x25c)
Jan 16 17:21:54 b3 kernel: [5378047.898736] [<c00a2d94>] (__slab_alloc+0x0/0x25c) from [<c00a309c>] (__kmalloc_track_caller+0xac/0xd8)
Jan 16 17:21:54 b3 kernel: [5378047.908181] [<c00a2ff0>] (__kmalloc_track_caller+0x0/0xd8) from [<c02bb0f8>] (__alloc_skb+0x5c/0xf4)
Jan 16 17:21:54 b3 kernel: [5378047.917445] r8:bf000078 r7:00000020 r6:00000f60 r5:d9cc2c00 r4:df802180
Jan 16 17:21:54 b3 kernel: [5378047.924351] [<c02bb09c>] (__alloc_skb+0x0/0xf4) from [<bf000078>] (ath_rxbuf_alloc+0x30/0x98 [ath])
Jan 16 17:21:54 b3 kernel: [5378047.933568] [<bf000048>] (ath_rxbuf_alloc+0x0/0x98 [ath]) from [<bf0f0eb8>] (ath_rx_tasklet+0x7a0/0x1dbc [ath9k])
Jan 16 17:21:54 b3 kernel: [5378047.943952] r7:00000000 r6:dedd0e80 r5:dedee258 r4:67a1cb5c
Jan 16 17:21:54 b3 kernel: [5378047.949825] [<bf0f0718>] (ath_rx_tasklet+0x0/0x1dbc [ath9k]) from [<bf0efe00>] (ath9k_tasklet+0x70/0x148 [ath9k])
Jan 16 17:21:54 b3 kernel: [5378047.960234] [<bf0efd90>] (ath9k_tasklet+0x0/0x148 [ath9k]) from [<c0044c60>] (tasklet_action+0x7c/0xf4)
Jan 16 17:21:54 b3 kernel: [5378047.969760] r6:c047c2c0 r5:c045a6ec r4:00000000
Jan 16 17:21:54 b3 kernel: [5378047.974561] [<c0044be4>] (tasklet_action+0x0/0xf4) from [<c0045338>] (__do_softirq+0x90/0x14c)
Jan 16 17:21:54 b3 kernel: [5378047.983308] r7:c047c29c r6:00000006 r5:00000001 r4:00000018
Jan 16 17:21:54 b3 kernel: [5378047.989152] [<c00452a8>] (__do_softirq+0x0/0x14c) from [<c0045478>] (irq_exit+0x84/0x94)
Jan 16 17:21:54 b3 kernel: [5378047.997387] [<c00453f4>] (irq_exit+0x0/0x94) from [<c002e054>] (asm_do_IRQ+0x54/0xa4)
Jan 16 17:21:54 b3 kernel: [5378048.005361] [<c002e000>] (asm_do_IRQ+0x0/0xa4) from [<c002ee24>] (__irq_usr+0x44/0xa0)
Jan 16 17:21:54 b3 kernel: [5378048.013417] Exception stack(0xde58dfb0 to 0xde58dff8)
Jan 16 17:21:54 b3 kernel: [5378048.018621] dfa0: 000749a8 002c2d90 77f882ad 8429d2d2
Jan 16 17:21:54 b3 kernel: [5378048.026934] dfc0: 79d85d57 0000009b 76047693 d3fcc3b5 00003f72 00000030 0000003f 955d5689
Jan 16 17:21:54 b3 kernel: [5378048.035254] dfe0: 9f5bdbf6 bef69c50 00044898 00043fb4 20000010 ffffffff
Jan 16 17:21:54 b3 kernel: [5378048.042007] r6:00000016 r5:fed20200 r4:ffffffff
Jan 16 17:21:54 b3 kernel: [5378048.046799] Mem-info:
Jan 16 17:21:54 b3 kernel: [5378048.049227] Normal per-cpu:
Jan 16 17:21:54 b3 kernel: [5378048.052176] CPU 0: hi: 186, btch: 31 usd: 48
Jan 16 17:21:54 b3 kernel: [5378048.057127] active_anon:12081 inactive_anon:16388 isolated_anon:0
Jan 16 17:21:54 b3 kernel: [5378048.057133] active_file:44631 inactive_file:44789 isolated_file:0
Jan 16 17:21:54 b3 kernel: [5378048.057139] unevictable:0 dirty:2231 writeback:0 unstable:0
Jan 16 17:21:54 b3 kernel: [5378048.057145] free:2688 slab_reclaimable:2133 slab_unreclaimable:3672
Jan 16 17:21:54 b3 kernel: [5378048.057152] mapped:3583 shmem:1578 pagetables:919 bounce:0
Jan 16 17:21:54 b3 kernel: [5378048.087721] Normal free:10752kB min:8192kB low:10240kB high:12288kB active_anon:48324kB inactive_anon:65552kB active_file:178524kB inactive_file:179156kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:520192kB mlocked:0kB dirty:8924kB writeback:0kB mapped:14332kB shmem:6312kB slab_reclaimable:8532kB slab_unreclaimable:14688kB kernel_stack:1368kB pagetables:3676kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 16 17:21:54 b3 kernel: [5378048.128110] lowmem_reserve[]: 0 0
Jan 16 17:21:54 b3 kernel: [5378048.131601] Normal: 2402*4kB 1*8kB 3*16kB 4*32kB 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 10752kB
Jan 16 17:21:54 b3 kernel: [5378048.142341] 97105 total pagecache pages
Jan 16 17:21:54 b3 kernel: [5378048.146324] 6103 pages in swap cache
Jan 16 17:21:54 b3 kernel: [5378048.150050] Swap cache stats: add 47634, delete 41531, find 1426398/1429548
Jan 16 17:21:54 b3 kernel: [5378048.157150] Free swap = 956156kB
Jan 16 17:21:54 b3 kernel: [5378048.160614] Total swap = 1048572kB
Jan 16 17:21:54 b3 kernel: [5378048.178205] 131072 pages of RAM
Jan 16 17:21:54 b3 kernel: [5378048.181497] 3216 free pages
Jan 16 17:21:54 b3 kernel: [5378048.184443] 2261 reserved pages
Jan 16 17:21:54 b3 kernel: [5378048.187735] 3437 slab pages
Jan 16 17:21:54 b3 kernel: [5378048.190682] 50664 pages shared
Jan 16 17:21:54 b3 kernel: [5378048.193887] 6103 pages swap cached
Jan 16 17:21:54 b3 kernel: [5378048.197441] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
Jan 16 17:21:54 b3 kernel: [5378048.203680] cache: kmalloc-8192, object size: 8192, buffer size: 8192, default order: 3, min order: 1
Jan 16 17:21:54 b3 kernel: [5378048.213200] node 0: slabs: 257, objs: 1028, free: 0
Jan 16 17:21:54 b3 kernel: [5378048.218469] skbuff alloc of size 3872 failed
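
One way to read the "Normal:" line in the log above is to total the free memory by block size. The sketch below is only an illustration: the function names are made up for this example, and the 4 kB page size is assumed from the smallest block size in that line. It shows how much of the free memory sits in blocks large enough for the order:1 (two contiguous pages, 8 kB) request that failed.

```python
import re

# Buddy-allocator line copied from the syslog above.
BUDDY_LINE = ("Normal: 2402*4kB 1*8kB 3*16kB 4*32kB 1*64kB 1*128kB "
              "1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 10752kB")

PAGE_KB = 4  # assumed page size: the smallest block in the log line is 4 kB


def parse_buddy_line(line):
    """Return {block_size_kb: free_block_count} from a 'Normal: ...' line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def free_kb_at_or_above(blocks, order):
    """Free memory (kB) held in blocks big enough for an order-N request."""
    need_kb = PAGE_KB << order
    return sum(size * count for size, count in blocks.items() if size >= need_kb)


blocks = parse_buddy_line(BUDDY_LINE)
total_kb = free_kb_at_or_above(blocks, 0)
usable_kb = free_kb_at_or_above(blocks, 1)
print(f"total free: {total_kb} kB")
print(f"free in blocks usable for order:1 (8 kB): {usable_kb} kB")
print(f"share held as single 4 kB pages: {100 * (total_kb - usable_kb) / total_kb:.0f}%")
```

With the figures from this log, 9608 kB of the 10752 kB free is held as single 4 kB pages, so only 1144 kB is in blocks that could satisfy an 8 kB contiguous request. In other words, free memory was fragmented rather than exhausted.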

Ubi
Posts: 1547
Joined: 17 Jul 2007, 09:01

Re: rsync: page allocation failure.

Post by Ubi » 16 Jan 2012, 16:12

Syncing DVDs does not sound like a crazy operation. Rsync is, however, known for memory problems with large operations. There are a few options that may help:
1) turn off compression
2) turn off SSH and run the rsync via an rsync server process (enhanced speed is a bonus)
3) maybe try an alternative such as the no-cost but non-free allway-sync. It does seem to do partial files, but only over SMB or FTP.
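
Options 1 and 2 could look roughly like the sketch below, which only builds and prints the command lines without running anything. The source path, host name, destination path and rsync module name are hypothetical, since the thread does not show the command actually used. The flags themselves (`-z` for compression, `-e ssh` for the remote shell, `--partial` to keep partly transferred files, and the `rsync://` daemon URL form) are standard rsync. `--partial` is an addition here, not something suggested above.

```python
import shlex

# Hypothetical source path, host and rsync module name, for illustration only.
SRC = "/cygdrive/e/backups/"
HOST = "b3"
SSH_DEST = f"{HOST}:/home/storage/backups/"
DAEMON_DEST = f"rsync://{HOST}/backups/"


def rsync_cmd(dest, compress, over_ssh):
    """Build an rsync argument list without running it."""
    cmd = ["rsync", "-av", "--partial"]
    if compress:
        cmd.append("-z")        # -z is the short form of --compress
    if over_ssh:
        cmd += ["-e", "ssh"]    # use ssh as the remote shell
    return cmd + [SRC, dest]


variants = {
    "ssh, compressed": rsync_cmd(SSH_DEST, compress=True, over_ssh=True),
    "ssh, no compression (option 1)": rsync_cmd(SSH_DEST, compress=False, over_ssh=True),
    "rsync daemon, no ssh (option 2)": rsync_cmd(DAEMON_DEST, compress=False, over_ssh=False),
}
for label, cmd in variants.items():
    print(f"{label}: {shlex.join(cmd)}")
```

Option 2 also needs an rsync daemon configured on the B3 with a matching module, which is outside this sketch.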

Gordon
Posts: 1373
Joined: 10 Aug 2011, 03:18

Re: rsync: page allocation failure.

Post by Gordon » 16 Jan 2012, 17:11

I think you may be hitting some limits on account of 32-bit operation, which can be either 2 GB or 4 GB depending on whether the address space uses signed or unsigned numbers. A quick check seems to indicate that the B3 should be able to handle files up to 4 GB, but you're rsyncing with a Windows machine, which implies cygwin, and that has a known limit of 2 GB. Could it be that what you're seeing in the logs on the B3 is actually callback information from the cygwin module?

Ubi
Posts: 1547
Joined: 17 Jul 2007, 09:01

Re: rsync: page allocation failure.

Post by Ubi » 17 Jan 2012, 06:31

The B3 can handle files bigger than 4GB without problem. How did you come up with this limit? It's certainly not restricted by ext3... (only NFS2 has a known limit of 4GB). But still, there has been another thread here with someone complaining about files >4GB not transferring properly.

Gordon
Posts: 1373
Joined: 10 Aug 2011, 03:18

Re: rsync: page allocation failure.

Post by Gordon » 17 Jan 2012, 08:48

Ubi,

It's not about filesystem or operating system - the limits apply to rsync itself.

Ubi
Posts: 1547
Joined: 17 Jul 2007, 09:01

Re: rsync: page allocation failure.

Post by Ubi » 17 Jan 2012, 10:19

Yes, that statement was correct in 2006, and it is still in the FAQ somewhere, but they increased that limit some time ago. I think I've transferred files >12GB over cygwin and it worked fine.

Gordon
Posts: 1373
Joined: 10 Aug 2011, 03:18

Re: rsync: page allocation failure.

Post by Gordon » 17 Jan 2012, 13:22

I stand corrected - my intel appears to be outdated.

I found an entry on the old.nabble site, though (listed as Bug #477377), that seems very similar to the described problem. The suggested fix is to add or change a setting in sysctl.conf. Remarkably enough, the Excito team had already set this, but to a much lower value than the one reported there as a proven workaround.

/etc/sysctl.d/bubba_min_free_kbytes.conf


# This settings increase the allocable memory which is required for
# certain applications that needs to be able to allocate more than
# the default limit

#vm.min_free_kbytes = 8192  # original setting by Excito
vm.min_free_kbytes = 65536

Other suggested settings to investigate are:
  • vm.lower_zone_protection
  • vm.vfs_cache_pressure
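
To put the values in proportion, the sketch below expresses vm.min_free_kbytes as a share of the memory the B3 reported in the syslog in the first post (present:520192kB for the Normal zone). It is plain arithmetic; the function name is made up for this example.

```python
# Size of the "Normal" zone reported in the syslog in the first post (present:520192kB).
PRESENT_KB = 520192


def reserve_percent(min_free_kb, present_kb=PRESENT_KB):
    """Percentage of the zone that vm.min_free_kbytes asks the kernel to keep free."""
    return 100.0 * min_free_kb / present_kb


# 4096 is the value suggested in the older thread, 8192 the Excito setting,
# 65536 the value from the bug report.
for value in (4096, 8192, 65536):
    print(f"vm.min_free_kbytes = {value:>5} -> {reserve_percent(value):.1f}% of the Normal zone")
```

On this box, 65536 means asking the kernel to keep roughly an eighth of its 512 MB free at all times, compared with under 2% for the Excito default.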

theWebalyst
Posts: 96
Joined: 27 May 2010, 14:53

Re: rsync: page allocation failure.

Post by theWebalyst » 23 May 2012, 12:23

I have finally solved my rsync problems. Applying all the changes suggested here (and even recompiling the cygwin rsync to ensure it was using the correct protocol) DID NOT FIX THE ISSUE.

However, after updating the B3 to 2.4.2 a couple of days ago, I find my rsync runs for hours and hours and transfers many tens of GB without exiting, regardless of file size. Prior to this it was getting stuck on files of a few hundred MB simply because it didn't run for long enough to complete them. So I had to re-run the transfer many times, and it took many hours, even days, before it ran long enough to complete some files.

The Solution
The solution for transferring files to B3 using rsync, from Windows 7 x64 using 802.11n WiFi (and maybe other setups!) is:
- some WiFi fix in B3 software version 2.4.2

Possibly (not checked yet, but this is currently in place):
- setting a large value for /proc/sys/vm/min_free_kbytes (I currently have 65536)

I Did Not Need To:
- disable rsync compression
- fiddle with rsync timeouts
- use rsync in daemon mode
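
The backtrace in the first post appears consistent with a WiFi-related cause: the allocation that failed was requested from the ath9k wireless driver's receive path, in interrupt context, while rsync happened to be the running process. A small sketch that pulls the module names out of three of those frames (the regular expression and function name are made up for this example):

```python
import re

# Three frames copied from the backtrace in the first post (timestamps trimmed).
FRAMES = """\
[<c02bb09c>] (__alloc_skb+0x0/0xf4) from [<bf000078>] (ath_rxbuf_alloc+0x30/0x98 [ath])
[<bf000048>] (ath_rxbuf_alloc+0x0/0x98 [ath]) from [<bf0f0eb8>] (ath_rx_tasklet+0x7a0/0x1dbc [ath9k])
[<bf0f0718>] (ath_rx_tasklet+0x0/0x1dbc [ath9k]) from [<bf0efe00>] (ath9k_tasklet+0x70/0x148 [ath9k])
"""

# Matches "(symbol+0xOFF/0xLEN [module])"; core-kernel frames carry no [module] tag.
FRAME_RE = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")


def module_symbols(text):
    """Map each loadable module named in a backtrace to the symbols seen in it."""
    found = {}
    for symbol, module in FRAME_RE.findall(text):
        found.setdefault(module, set()).add(symbol)
    return found


for module, symbols in sorted(module_symbols(FRAMES).items()):
    print(module, sorted(symbols))
```

Only the ath and ath9k modules show up, which fits with the 2.4.2 update being what made the difference.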

This bug has been a real pain and wasted a lot of my time investigating and trying different solutions, as well as leaving me with a hotchpotch of backup solutions. Now I can tidy that up no end, fantastic.

Mark
