Migrating a Linux Server with 200+ Million Files
This article describes the steps to resolve issues encountered when migrating an Ubuntu 16 server with 200+ million files.
Note – This issue was seen by the customer in OCI. However, the issue lies with the ORIGIN/TARGET configuration rather than with any specific cloud provider.
Background:
When migrating a Linux server (Ubuntu 16 in this case) with 200+ million files, we ran into the issues below.
Issue #1
++++++
rsync fails with an out-of-memory error. This happens because rsync tries to buffer the entire list of 200+ million files before copying; building that list takes a long time and eventually exhausts memory. To resolve this issue, apply the RMM patch (the patch and detailed instructions on how to apply it are in the Before You Begin section) to instruct rsync to do an incremental copy instead of a bulk copy (see the sketch after the error message below).
May 12 01:35:10 m360-transfer-london rmm:00000000000159ca:00007f4999de7700:NOTICE:host_sync.cpp : 3351: | opt string details = sync failed: rsync error: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2] (22)
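For illustration only, here is how the same trade-off looks with a plain rsync invocation outside of RMM; the paths and hostname are placeholders, and the actual fix for this issue is the RMM patch, not these commands. rsync 3.0+ supports incremental recursion, which builds and transfers the file list in chunks instead of holding every entry in memory at once:

# Full file-list scan: rsync enumerates every file up front, which can
# exhaust memory on trees with hundreds of millions of entries.
rsync -a --no-inc-recursive /mnt/data1/ root@target:/mnt/data1/

# Incremental recursion: the file list is built and sent in chunks,
# keeping memory usage roughly flat regardless of tree size.
rsync -a --inc-recursive /mnt/data1/ root@target:/mnt/data1/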
Issue #2
+++++++
After applying the patch to RMM so that rsync performs an incremental copy, we still ran into the error below. After investigating, we found that the root cause (RC) of the issue is a limitation in the ext4 file system. Please find more details in the RC of the Issue section.
symlink "/mnt/rackware/tmp.pQP7aQ/journey_files/216468/proxy_symlinks/2020-04-12/ada858350c0945945655b7b30e9c6b03.css" -> "/mnt/data1/journey_files/216468/proxy_assets/2020-04-12/e98a43397ee3816953d36a209045045f.css" failed: No space left on device (28)
RC of the Issue:
The issue here is that the target is hitting internal limits within the ext4 file system, and its behavior is somewhat unpredictable in these extreme cases: specific file names randomly become impossible to create. XFS is a better file system type for this kind of application, which creates a very large number of files, especially when individual directories each contain a large number of files (which appears to be the case here).
The only solution we can offer here is to make the target file system XFS instead of ext4. The sync they are running is bound to fail with the same error due to this ext4 limitation. It is not a bug in the RMM software or in the kernel, but a limitation of the ext4 file system itself. A quick way to confirm the condition on the target is sketched below.
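A minimal diagnostic sketch, assuming the large file system is mounted at /mnt/data1 (the mount point from the error above; adjust to your layout). The point is that ext4 can return ENOSPC ("No space left on device") when the hash-tree index of a huge directory fills up, even while free blocks and inodes remain:

df -h /mnt/data1                          # free blocks still available...
df -i /mnt/data1                          # ...and free inodes too, yet creates keep failing
dmesg | grep -i "directory index full"    # ext4's htree index warning, if present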
Please also see the attached document for a detailed explanation of the exact issue and why we need to switch to the XFS file system.
Before you Begin:
Apply the attached patch (rt-21522-v7.4.0.583.patch) to the RMM; this resolves Issue #1 above.
Please find the instructions below; the patch is attached.

Steps to apply patch
+++++++++++++++++
1) Copy rt-21522-v7.4.0.583.patch to /opt/rackware on the RMM
2) cd /opt/rackware
3) patch -p1 < rt-21522-v7.4.0.583.patch

Steps to make the patch take effect only on servers with less memory
+++++++++++++++++++++++++++++++++++++++++++++++++++++
1) Create /opt/rackware/utils/common/incscan.txt
2) Put the target IP address in the above file
3) Multiple server syncs can be selected by putting one target IP address per line in incscan.txt

Note: The sync progress reported for the targeted server will be inaccurate.
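As an optional sanity check (this is standard patch(1) behavior, not an RMM-specific step), you can do a dry run first to confirm the patch applies cleanly before any files are modified:

cd /opt/rackware
patch -p1 --dry-run < rt-21522-v7.4.0.583.patch    # reports what would change, touches nothing
patch -p1 < rt-21522-v7.4.0.583.patch              # apply for real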
Use case/Applicable To:
Migrating a Linux server with 200+ million files on an ext4 file system
Preparation/Prerequisites:
3) After the AP has completed, configure the /opt/rackware/utils/common/incscan.txt file with the target IP (see the example after this list)
4) Run the sync without the no-transfer flag
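For example, a minimal sketch of populating incscan.txt; the IP addresses below are placeholders for your actual target IPs, one per line:

cat > /opt/rackware/utils/common/incscan.txt <<'EOF'
10.0.0.21
10.0.0.22
EOF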
Steps to convert from ext4 to xfs
+++++++++++++++++
1. ssh root@<target ip>
2. lsblk --> use this command to find the device holding the large file system with 200+ million files
3. blkid --> use this command to find the UUID of that device
4. wipefs -a /dev/sdb1 --> here /dev/sdb1 is the device holding the large file system with 200+ million files
5. mkfs.xfs -b size=4096 -i maxpct=3 /dev/sdb1
6. xfs_admin -U <UUID of drive> /dev/sdb1
7. Start the wave

A worked example of steps 1-6 follows.
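In this sketch, /dev/sdb1 and the /mnt/data1 mount point are assumed placeholder values; substitute whatever lsblk and blkid actually report on your target. Note that the device must be unmounted first, and that wipefs/mkfs.xfs destroy the existing file system (acceptable here, since the sync will repopulate it):

lsblk                                          # identify the large data device, e.g. sdb1
blkid /dev/sdb1                                # note the current UUID for step 6
umount /mnt/data1                              # the file system must not be mounted
wipefs -a /dev/sdb1                            # erase the old ext4 signatures (destructive!)
mkfs.xfs -b size=4096 -i maxpct=3 /dev/sdb1    # 4 KiB blocks; cap inode space at 3% of the volume
xfs_admin -U <UUID from blkid> /dev/sdb1       # reuse the old UUID so existing references (e.g. /etc/fstab) still match
mount /mnt/data1                               # remount before starting the wave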
Contact:
If you run into any issues or need assistance, please contact Support@RackwareInc.com