
Keeping two file servers in sync (separate location, not via the internet)

  • 06-04-2012 10:24am
    #1
    Registered Users, Registered Users 2 Posts: 7,318 ✭✭✭


    Hi Guys,

    I have a problem and I was wondering would anyone be able to help me out with it.

    I have two file servers, one in Athlone (Linux) and one in Galway (currently Windows, but I can change it).

    The Galway one is a branch (a copy) of the Athlone one from a few months ago. The problem is that I'm having massive issues keeping them in sync manually.

    The internet in Athlone is not good enough to simply rsync the two servers. I have a 320 GB portable HDD for a "sneakernet" connection (carrying it between both sites).

    I have a few ideas on how to approach this. I am willing to write an application to do it myself, but I would prefer it if there was something out there that would do this for me.

    My ideal scenario is that I can run an application or a command that checks the other server's files (via SSH or HTTP or something) and copies the differences to an external hard drive.

    Another thing that would work fine for me is if each machine mimicked its folder structure on the portable HDD with 0-byte files with the correct names, and then I used some tool on each server to add any files whose namesake it didn't find on the hard drive.
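
    A minimal sketch of the placeholder idea above, assuming a POSIX shell and find on both machines, and file names without newlines. The function names make_skeleton and copy_missing are made up for illustration, and all paths are expected to be absolute.

```shell
#!/bin/sh
# make_skeleton SRC SKEL: mirror SRC's directory tree under SKEL,
# creating an empty (0-byte) placeholder for every file.
make_skeleton() {
    ( cd "$1" && find . -type f ) |
    while IFS= read -r rel; do
        mkdir -p "$2/$(dirname "$rel")"
        : > "$2/$rel"
    done
}

# copy_missing SRC SKEL DEST: copy every file under SRC that has no
# namesake in SKEL into DEST, keeping its relative path.
copy_missing() {
    ( cd "$1" && find . -type f ) |
    while IFS= read -r rel; do
        if [ ! -e "$2/$rel" ]; then
            mkdir -p "$3/$(dirname "$rel")"
            cp -p "$1/$rel" "$3/$rel"
        fi
    done
}
```

    On one server you would run something like make_skeleton /srv/media /mnt/hdd/skeleton, carry the drive over, and on the other run copy_missing /srv/media /mnt/hdd/skeleton /mnt/hdd/payload. Those paths are hypothetical. Only names are compared, so a file that changed in place would not be picked up.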

    Another approach might be using version control to detect the differences. Again, I would need to use the portable HDD to do the transfer.

    The last approach I can think of is a seriously throttled rsync. As I mentioned, the Athlone internet is the weak point. We don't have much TV (old Sky box, no sub) in the house, so we use the RTE Player and the like to watch stuff a lot. So basically I would have to make it run only at night, or employ some device to do bandwidth management on the network to give the rsync very low priority (easier said than done, as the same machine is a web server too, so it can't be an IP address that marks it).
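
    A rough sketch of the throttled, night-only rsync idea, assuming rsync over SSH between the two servers and the GNU coreutils timeout command. The function names, the 01:00 to 07:00 window and the six-hour limit are placeholders for illustration. rsync's --bwlimit takes a rate in KB/s and throttles the rsync process itself, so no separate bandwidth-management device is needed for this one job.

```shell
#!/bin/sh
# in_night_window HOUR: succeed if HOUR (00-23) falls inside the quiet window.
# The 01:00-07:00 window is an assumption; adjust it to suit the household.
in_night_window() {
    hour=${1#0}          # normalise "08" to "8"
    hour=${hour:-0}      # a bare "0" is empty after the strip; restore it
    [ "$hour" -ge 1 ] && [ "$hour" -lt 7 ]
}

# throttled_sync SRC DEST KBPS: one-way rsync capped at KBPS kilobytes per second.
# --partial keeps half-transferred files so a large video can resume next night.
# timeout ends the run after six hours so it does not spill into the day.
throttled_sync() {
    timeout 6h rsync -a --partial --bwlimit="$3" "$1" "$2"
}

# Hypothetical entry point for an hourly cron job (host and paths are made up):
#   if in_night_window "$(date +%H)"; then
#       throttled_sync user@galway.example.com:/srv/media/ /srv/media/ 50
#   fi
```

    Because the cap is applied by rsync rather than by the router, it does not matter that the same machine is also a web server.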

    Any suggestions on how to achieve any of the above or any other way of doing this would be greatly appreciated.

    Thanks


Comments

  • Registered Users, Registered Users 2 Posts: 2,771 ✭✭✭niallb


    You could create a VPN tunnel with limited bandwidth and run rsync across that. That would require very little tweaking.

    Some numbers would be really useful:
    What speed link do you actually have at both ends?
    How much data is involved?
    How much changes on a daily basis?

    Which is more important - TV or server sync?
    Are you living at the faster or slower location?
    Bear in mind that the maximum download speed at one end is at best the maximum upload speed at the other end, so if you've ADSL you'll not interfere with incoming TV as much as you might think.


  • Registered Users, Registered Users 2 Posts: 391 ✭✭freelancerTax


    Have a look at duplicity.
    You can create diffs against a local backup to transfer later.

    ft


  • Registered Users, Registered Users 2 Posts: 7,318 ✭✭✭witnessmenow


    Hey guys, thanks for the replies.

    Ok the numbers:

    There is about 2 TB in each location, about 90% of which exists in both locations (at a guess I would say there is 100 GB or so that is in Galway that isn't in Athlone, and maybe 20 GB that's in Athlone that's not in Galway).

    Athlone is Eircom 7 Mb with the "unlimited" add-on. In the real world I get about half of this: I have never seen downloads go any faster than 450 KB/s, which translates to about 3.6 Mbit/s. According to Speedtest my upload is 0.15 Mbit/s.

    Galway is UPC 20/25 Mb (not sure which). I seem to get the majority of this. Not sure what the upload is, 1 Mbit/s maybe?
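
    For a sense of scale, here is a back-of-the-envelope calculation using the figures above. It assumes decimal units (1 GB = 8000 megabits), a link running flat out around the clock, and no protocol overhead. The 1 Mbit/s Galway upload is only the guess from above, and transfer_days is a made-up helper name.

```shell
#!/bin/sh
# transfer_days SIZE_GB SPEED_MBIT: days needed to move SIZE_GB gigabytes
# over a link that sustains SPEED_MBIT megabits per second.
transfer_days() {
    awk -v gb="$1" -v mbit="$2" 'BEGIN {
        seconds = gb * 8000 / mbit      # 1 GB = 8000 megabits
        printf "%.1f\n", seconds / 86400
    }'
}

transfer_days 100 1      # Galway to Athlone, limited by Galway's upload (a guess)
transfer_days 20 0.15    # Athlone to Galway, limited by Athlone's upload
```

    That works out at roughly 9.3 days for the 100 GB going to Athlone and 12.3 days for the 20 GB going to Galway, before any throttling or night-only scheduling stretches it further.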

    Monday to Friday I live in Galway, and I stay in Athlone at the weekends. Athlone would have active internet users all week. Galway would have less internet activity at the weekend than during the week, but still might have some.

    Most of the downloading would happen in Galway. So Athlone's upstream would be used less.

    The stuff might change daily in Galway, but it does not need to be synced daily (a lag of a few days/weeks would be acceptable).

    The sync is very much a background activity and should not interfere with anything else going on.

    Any more info needed just shout


    I'll have a look at duplicity thanks


  • Registered Users, Registered Users 2 Posts: 37,485 ✭✭✭✭Khannie


    What kind of files are you looking at? Are they mostly text type files? (e.g. source code) These tend to compress incredibly well so you'd only be looking at transferring a fraction of the total payload using a compressed rsync.


  • Registered Users, Registered Users 2 Posts: 7,318 ✭✭✭witnessmenow


    Videos. They are both media servers for various devices and computers throughout each house. The sync is not for backup purposes, just for convenience.


  • Registered Users, Registered Users 2 Posts: 78 ✭✭timbyr


    I came up with this as a possible solution, assuming bash, find and OpenSSH are available on both servers (possibly through Cygwin on Windows).

    It copies all files on your local server that aren't on the remote server to your portable HDD.
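    #!/bin/bash
    # Usage: ./script LOCAL_DIR USER@HOST REMOTE_DIR HDD_DIR (absolute paths)
    if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ]
    then
            exit 1
    fi
    # ssh is part of the condition so that its exit status is what gets tested;
    # a separate $? check would only see the result of the preceding [ -d "$4" ].
    if [ -d "$1" ] && [ -d "$4" ] && ssh "$2" test -d "$3"
    then
            # Append a trailing slash to $1 if it does not already end in one
            VAR1="$(echo "$1" | sed 's/\(.*[^\/]\)$/\1\//')"
            cd "$VAR1" || exit 1
            # Both lists print paths relative to their root (%P) so that they can match.
            # The remote command is double-quoted so that $3 is expanded before it is sent.
            # join and xargs split on whitespace, so names containing spaces are not handled.
            join --nocheck-order -v 1 <(find "$1" -type f -printf "%P\n" | sort) <(ssh "$2" "find $3 -type f -printf '%P\n'" | sort) | xargs tar -c | tar -xC "$4"
    fi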
    

    Usage is
    ./script /path/to/local/files user@server.example.com /path/to/files/on/server /path/to/portable/hdd
    

    Kind of hacky but maybe someone else can build on it.


  • Registered Users, Registered Users 2 Posts: 7,318 ✭✭✭witnessmenow


    Thanks Timbyr! I'll def give that a go!

    Could someone give me a quick breakdown on what it does?

    So it checks the param inputs

    Then logs into the server and moves to the correct folder

    checks if the HDD and local path are folders (not sure what $? -eq 0 does)

    Not sure what the regex is doing :o

    No idea what the last line is doing other than tarring up whatever it finds to the HDD

    Is it recursive?

    Thanks :)


  • Registered Users, Registered Users 2 Posts: 78 ✭✭timbyr


    Checks if all the parameters are there.
    if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ]
    then
            exit 1
    fi
    

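    Checks that the local directories exist and that the directory exists on the remote server. The $? -eq 0 part was meant to check the return value of ssh "$2" test -d "$3", but by the time it is evaluated $? holds the result of [ -d "$4" ], so the ssh test belongs in the condition itself:
    if [ -d "$1" ] && [ -d "$4" ] && ssh "$2" test -d "$3"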
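
    To see how $? behaves in a chain like this, here is a small self-contained demonstration. remote_check is a made-up stand-in for the ssh test that always fails, so no server is needed to try it.

```shell
#!/bin/sh
# remote_check stands in for: ssh "$2" test -d "$3"
remote_check() { return 1; }    # pretend the remote directory is missing

# Late check: $? is expanded only after [ -d / ] has run,
# so it reports that test's status rather than remote_check's.
late_check() {
    remote_check
    if [ -d / ] && [ $? -eq 0 ]; then echo "continue"; else echo "stop"; fi
}

# Direct check: the command itself is part of the condition.
direct_check() {
    if [ -d / ] && remote_check; then echo "continue"; else echo "stop"; fi
}

late_check      # prints "continue" even though the remote check failed
direct_check    # prints "stop"
```

    If the status is needed later on, it can also be saved straight away with something like status=$? on the line immediately after the command.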
    

    This bit is unnecessary. I left it in by mistake :P
            VAR1="$(echo $1 | sed 's/\(.*[^\/]\)$/\1\//')"
            cd $VAR1
    
    You can replace it with just the following if you want.
            cd "$1"
    

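    Lists all files in the local directory by their relative paths, in sorted order.
    (find "$1" -type f -printf "%P\n" | sort)
    

    Lists all files in the remote directory by their relative paths, in sorted order. The remote command is in double quotes so that $3 (the remote path) is expanded locally; inside single quotes the remote shell would see its own, unset, variable. The format is a plain %P so that the lines match the local list.
    (ssh "$2" "find $3 -type f -printf '%P\n'" | sort)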
    

    This takes two sorted lists and outputs the lines that are in list1 but not in list2, i.e. files that are on the local system but not the remote.
    join --nocheck-order -v 1 list1 list2
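
    One caveat with join here: it compares only the first whitespace-separated field of each line, so two files such as films/My Movie.avi and films/My Other.avi share the key films/My and can be mistaken for each other. A possible alternative, sketched below on the assumption that file names contain no newlines, is comm -23, which compares whole lines. The function name only_in_first is made up for this example.

```shell
#!/bin/sh
# only_in_first LIST1 LIST2: print the lines of LIST1 that do not appear in LIST2.
# Both lists are sorted in the C locale first, because comm needs its inputs
# sorted in the same order it uses for comparison.
only_in_first() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    LC_ALL=C sort "$1" > "$tmp1"
    LC_ALL=C sort "$2" > "$tmp2"
    # -2 drops lines unique to LIST2, -3 drops lines common to both
    LC_ALL=C comm -23 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}
```

    Given a local list holding films/My Movie.avi and films/My Other.avi and a remote list holding only films/My Movie.avi, this prints films/My Other.avi, whereas join -v 1 on the same input prints nothing.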
    


    All this does is take a file and its relative path and copy it, with its relative path, to the directory $4.
    I couldn't find any obvious tool to copy a file ./dir/file to /somewhere/dir/file, only ./dir/file to /somewhere/file.
    The effect here is that it tars up the file and its path and then extracts the file, creating the directory structure if it doesn't exist.
    xargs tar -c | tar -xC "$4"
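
    For what it's worth, GNU cp has a --parents option that does copy ./dir/file to /somewhere/dir/file. A minimal sketch, assuming GNU coreutils, file names without newlines, and a destination given as an absolute path that already exists. copy_with_paths is a made-up name.

```shell
#!/bin/sh
# copy_with_paths SRC DEST: read relative paths (one per line) on stdin and
# copy each from SRC into DEST, recreating its directories under DEST.
# Reading line by line keeps names with spaces intact, unlike plain xargs.
copy_with_paths() {
    ( cd "$1" || exit 1
      while IFS= read -r rel; do
          cp -p --parents -- "$rel" "$2"
      done )
}
```

    Any newline-separated list of relative paths can be piped in, for example printf 'dir/file\n' | copy_with_paths /path/to/local/files /path/to/portable/hdd.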
    

