0

My intention is to keep two remote directories (say dir1 and dir2) in sync, so that whenever the content of dir1 changes (a file or directory is added or deleted, or the content of a file is modified), the change is propagated to dir2, and vice versa.

The naive way I can think of doing this is to run rsync periodically via cron on both machines. But this approach has problems:

  1. The previous rsync might not have finished when cron starts the next one, so two runs overlap.
  2. If a new file is added to dir1 and the rsync that mirrors dir2 onto dir1 runs before the one that copies dir1 to dir2, the new file may be deleted from dir1 because it does not yet exist in dir2.
  3. It is also not real time.
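
The first problem (overlapping runs) can be avoided on its own by serialising the job with a lock. Below is a minimal sketch, assuming the util-linux `flock` command is available. The lock path and the `sync_once` placeholder are hypothetical, and the real rsync call is only shown as a comment. It does nothing about problems 2 and 3.

```shell
#!/bin/sh
# Hypothetical lock file; any path writable by the cron user works.
LOCK="${LOCK:-/tmp/dir-sync.lock}"

sync_once() {
    # Placeholder for the real job, e.g.:
    #   rsync -a /path/to/dir1/ otherhost:/path/to/dir2/
    sleep "${SYNC_SECONDS:-0}"
    echo "sync finished"
}

run_locked() {
    (
        # -n: fail immediately instead of waiting if another run holds the lock
        if ! flock -n 9; then
            echo "previous sync still running, skipping"
            exit 1
        fi
        sync_once
    ) 9>"$LOCK"
}
```

Calling `run_locked` from the cron script skips a run while an earlier one is still in progress. For a one-line job, the same effect can be had directly in the crontab entry with `flock -n /tmp/dir-sync.lock rsync ...`.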

Can someone suggest a better way of doing this? I am looking for an open-source tool that is easy to set up and get started with.

3
  • DRBD and a cluster file system? Commented Sep 20, 2016 at 9:06
  • Can you provide a little more explanation? Merits and demerits of the two. Can you suggest some open-source tools which will provide me the cluster file system? Commented Sep 20, 2016 at 9:09
  • What kind of information is stored in that directory? Are we talking about Gigabytes or just bytes? What kind of application is accessing that directory? Commented Sep 20, 2016 at 9:47

2 Answers 2

1

The only way to provide a "hard real-time" guarantee without a race window is to make sure a write is acknowledged only after it has hit both sides. The usual way to achieve this is with a cluster file system (such as OCFS2 or GFS2) on a shared block device. Such a shared block device can easily and inexpensively be built using DRBD.

As with any sync mechanism, your intra-cluster network must be able to carry the change rate with acceptable latency.

The cheat sheet is along the lines of:

  • Reserve a block device (disk, partition, LV, ...) on both sides
  • Install and configure DRBD (apt-get install drbd-utils); use the excellent documentation on their website
  • Install a cluster stack of your choice: the full-fledged Red Hat stack (if you need more than just a shared file system) or the minimalist but very easy O2DLM (included in OCFS2)
  • Format the DRBD device with either GFS2 (Red Hat stack only) or OCFS2 (possible with both stacks) and mount it on both sides

You now do NOT have a pair of synced directories: you have a single directory that is available on both nodes. This is functionally the same, but without the race windows.
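
The cheat sheet above could look roughly like the following for the OCFS2 variant. This is a sketch under assumptions, not a tested recipe: the resource name `r0`, the device `/dev/drbd0`, the label `shared` and the mount point `/srv/shared` are made up for illustration, the DRBD resource is assumed to be already defined under `/etc/drbd.d/` with dual-primary mode allowed, and the O2CB cluster is assumed to be configured on both nodes. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# Assumed names (not from the answer): resource r0, device /dev/drbd0,
# mount point /srv/shared.
RES="${RES:-r0}"
DEV="${DEV:-/dev/drbd0}"
MNT="${MNT:-/srv/shared}"

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_node() {
    # $1 is "first" on the node that seeds the data, anything else on the peer
    run drbdadm create-md "$RES"    # initialise DRBD metadata on the backing device
    run drbdadm up "$RES"           # attach the disk and connect to the peer
    if [ "$1" = "first" ]; then
        run drbdadm primary --force "$RES"    # start the initial sync from this node
        run mkfs.ocfs2 -N 2 -L shared "$DEV"  # two node slots, one per machine
    else
        run drbdadm primary "$RES"  # second primary, once the initial sync is done
    fi
    run mkdir -p "$MNT"
    run mount -t ocfs2 "$DEV" "$MNT"
}
```

Run `setup_node first` on one machine and `setup_node second` on the other to review the command list, then set `DRY_RUN=0` only after checking each step against the DRBD and OCFS2 documentation for your distribution.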

0

If it is on the same machine, consider using a file system link. Either dir2 "points" to dir1 or vice versa, and this is transparent to applications. To do this, use ln, e.g. ln -s /path/to/dir1 /path/to/dir2 (dir2 must not already exist), or alternatively use relative paths. The -s makes it a symbolic link, so it refers to the path of dir1 and not to its inode.
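
A small demonstration of the link behaviour, run in a throwaway directory so that nothing real is touched. The `demo` directory and `file.txt` are placeholders for illustration only.

```shell
#!/bin/sh
# dir1 and dir2 stand in for the real paths; everything lives under a temp dir.
demo=$(mktemp -d)
mkdir "$demo/dir1"
ln -s "$demo/dir1" "$demo/dir2"        # dir2 is now a symbolic link to dir1

echo "hello" > "$demo/dir2/file.txt"   # write through the link
cat "$demo/dir1/file.txt"              # the same file is visible under dir1
readlink "$demo/dir2"                  # shows the path the link refers to
```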

Edit: Sorry, missed the "both machines" part.

Maybe consider using a network share; syncing two locations in (near) real time is always hard.

1
  • Directories are on different machines. I have updated the question. Commented Sep 20, 2016 at 9:45
