Linux - monitor the written size of a file while it is being written

  • 28-10-2019 11:16pm
    #1
    Registered Users, Registered Users 2 Posts: 14,049 ✭✭✭✭


    I want to monitor, in real time, maybe every three seconds or so, the size of a large file as it is actually written to a slow device such as a USB stick.

    Anything I have tried so far reports completion long before the file is written ..... presumably because it has been copied to buffer/cache and thus the copy is complete while the writing continues.

    Can anyone give me some direction or links that might help achieve this?


Comments

  • Registered Users, Registered Users 2 Posts: 8,073 ✭✭✭10-10-20


    Sounds like a sparse file, ie, that the space has been pre-allocated on file creation.

    Here are two examples. file.img is a 512MB sparse file created using 'truncate' (truncate -s 512M file.img),
    while file.heavy was created using dd from /dev/urandom (dd if=/dev/urandom of=file.heavy bs=1M count=512).

    ronnie@Mint-E6420:~$ stat file.heavy
    File: file.heavy
    Size: 536870912 Blocks: 1048584 IO Block: 4096 regular file
    Device: 801h/2049d Inode: 536813 Links: 1
    ...
    ronnie@Mint-E6420:~$ stat file.img
    File: file.img
    Size: 536870912 Blocks: 0 IO Block: 4096 regular file
    Device: 801h/2049d Inode: 536805 Links: 1
    ...

    According to this article you can use du to test for sparse-ness.
    https://wiki.archlinux.org/index.php/Sparse_file
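
A rough sketch of the same check done with stat rather than du, assuming GNU coreutils stat (%s is the size in bytes, %b the number of allocated blocks, %B the bytes per block). The function names are made up for illustration. A file counts as sparse here when less space is allocated to it than its apparent size.

```shell
#!/bin/sh
# Compare a file's apparent size with the space actually allocated to it.
# Assumes GNU stat; the function names are hypothetical helpers.

# Bytes allocated on disk: %b blocks of %B bytes each.
allocated_bytes() {
    echo $(( $(stat -c '%b' "$1") * $(stat -c '%B' "$1") ))
}

# Apparent size in bytes, as ls -l would report it.
apparent_bytes() {
    stat -c '%s' "$1"
}

# Succeeds when fewer bytes are allocated than the apparent size.
is_sparse() {
    [ "$(allocated_bytes "$1")" -lt "$(apparent_bytes "$1")" ]
}
```

For example, `is_sparse file.img && echo sparse` should print "sparse" for the truncate-created file above, since stat shows it with 0 blocks.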


  • Registered Users, Registered Users 2 Posts: 14,049 ✭✭✭✭Johnboy1951


    It seems sparse does not apply here.

    I tested using rsync to copy an ISO to a USB stick, monitoring the target partition size with
    du -m -s <target>
    

    and once started its size increased quickly to nearly 50% of the 5GB file I was copying. Thereafter it slowed to an expected speed of writing, as near as I could judge, until the target size was equal to the source ... but the writing process did not stop there; it continued on for about the same time again until it stopped and it synced
    sync
    

    It seems to me that du is reporting the size copied by rsync, and not the size written to the target.

    Am I misinterpreting this?

    Is there some specific command that will reveal what is actually written and not a combination of 'written+buffer' as I interpret this to be?

    Thanks.
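
One thing that may help with the 'written+buffer' question: on Linux, /proc/meminfo has Dirty and Writeback lines (in kB) showing how much cached data is still waiting to reach storage. This is only a sketch, and the figure is system-wide, not per file or per device, so other disk activity will be mixed in. The function name and its optional file argument are assumptions for illustration.

```shell
#!/bin/sh
# Print the amount of page cache that is dirty or currently being
# written back, in kB. Assumes a Linux /proc/meminfo. The optional
# argument only exists so the function can read a saved copy instead.
pending_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}

# Possible use while a copy runs (every three seconds):
#   while sleep 3; do echo "still to flush: $(pending_kb) kB"; done
```

On a busy system the figure may never reach exactly zero, so treat it as an indication and still run sync before unmounting.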


  • Registered Users, Registered Users 2 Posts: 14,049 ✭✭✭✭Johnboy1951


    OSI wrote: »
    Did you try it with --apparent-size?

    In truth I now do not recall ...... I went down so many rabbit holes I am unsure.

    I do know that the way I read that option it would not be suitable ..... but on re-reading it now, it is less than clear what it measures ..... so I will definitely give it a try today, thanks.


    EDIT:

    Did a quick check and --apparent-size gives the size the file will have once it is fully written ..... the space reserved for it.
    At least, so it appears here.

    Is this not correct?

    .


  • Registered Users, Registered Users 2 Posts: 14,049 ✭✭✭✭Johnboy1951


    I checked using
    rsync-fadvise
    today, and it seems it did not impact the behaviour of the destination.

    It might have affected rsync's behaviour when reading from source ... maybe not to cache that data. I did not check.

    Essentially I now have a two stage feedback to the user ...

    1. informed when copying with rsync (and some writing to destination)
    2. informed when rsync has finished, but a lot of data remains to be written from buffers to the destination.
    rsync -a $source $dest

    The aim is to get both 1 & 2 above together in one information dialog, showing the percentage of the source actually written to the destination (so that it can be cleanly unmounted).

    Any further thoughts appreciated.

    I really am amazed that such a basic function does not seem to be present in Linux, or I have not thought of using whatever is available :)
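
One possible way to get a single percentage for 1 and 2 together, offered as a sketch only: the kernel keeps per-device I/O counters in /sys/block/<dev>/stat, and the 7th field is the number of sectors written, counted in 512-byte units. Taking a reading before the copy starts and comparing later readings against the source size gives an approximate "actually written" figure. The function names are invented here, and the counter also includes filesystem metadata and journal writes, so the result can overshoot slightly (hence the cap at 100).

```shell
#!/bin/sh
# Sectors written to a block device so far, read from its sysfs stat
# file (field 7, in 512-byte units). Arg: path to the stat file.
sectors_written() {
    awk '{ print $7 }' "$1"
}

# Percentage of total_bytes written since a baseline reading.
# Args: baseline sectors, current sectors, total bytes to copy.
percent_written() {
    awk -v b="$1" -v c="$2" -v t="$3" 'BEGIN {
        p = (t > 0) ? (c - b) * 512 * 100 / t : 0
        if (p > 100) p = 100   # metadata/journal writes can overshoot
        printf "%d\n", p
    }'
}

# Possible use (sdb is a placeholder for the USB stick's device name):
#   base=$(sectors_written /sys/block/sdb/stat)
#   rsync -a "$source" "$dest" &
#   while sleep 3; do
#       percent_written "$base" "$(sectors_written /sys/block/sdb/stat)" "$total"
#   done
```

Here $total would be the size of the source in bytes, for example from `du -sb "$source"` with GNU du.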


  • Registered Users, Registered Users 2 Posts: 8,073 ✭✭✭10-10-20


    I might be wrong, but you may not be able to complete #2 as the caching of filesystem objects is transparent to most bash applications. Meaning to say that I don't know of a way of working out the remaining blocks of a file in cache. It's supposed to be this way, as it's a write-back cache.

    For #1, what filesystems have you tested on the USB key? Could this be a FAT32 'feature' which we're not understanding? What happens with EXT2/3? What happens if you stat the file during copy, can you see the block count increasing? Because if it's a Linux FS and you don't, then the blocks (the 4k chunks of a disk) aren't being assigned dynamically.
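
The stat check suggested above could be scripted roughly like this, assuming GNU stat. It prints the file's size in bytes and its allocated block count at a fixed interval. The function name and its arguments are made up for the example. Note that both numbers can include data still sitting in the cache, so this shows whether space is being assigned progressively, not whether it has reached the stick.

```shell
#!/bin/sh
# Print "size_bytes allocated_blocks" for a file at a fixed interval.
# Assumes GNU stat (%s = size in bytes, %b = allocated blocks).
# Args: file, number of samples, seconds between samples.
watch_file() {
    i=0
    while [ "$i" -lt "$2" ]; do
        stat -c '%s %b' "$1"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$2" ]; then
            sleep "$3"
        fi
    done
}
```

For instance, `watch_file /path/to/target.iso 100 3` would take a reading every three seconds for about five minutes; the path is a placeholder.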


  • Registered Users, Registered Users 2 Posts: 14,049 ✭✭✭✭Johnboy1951


    10-10-20 wrote: »
    I might be wrong, but you may not be able to complete #2 as the caching of filesystem objects is transparent to most bash applications. Meaning to say that I don't know of a way of working out the remaining blocks of a file in cache. It's supposed to be this way, as it's a write-back cache.

    For #1, what filesystems have you tested on the USB key? Could this be a FAT32 'feature' which we're not understanding? What happens with EXT2/3? What happens if you stat the file during copy, can you see the block count increasing? Because if it's a Linux FS and you don't, then the blocks (the 4k chunks of a disk) aren't being assigned dynamically.

    Thanks for your feedback.

    I have gone for a mixture of commands.

    Rsync is fine for smaller files and directories with little content.
    For larger files, such as distro ISOs, I am now using 'dd oflag=nocache' and have, I think, achieved most of my aim.
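
For anyone wanting to try something similar, here is a minimal sketch of such a dd call wrapped in a function. Only oflag=nocache comes from the post above. The 4M block size, conv=fsync and status=progress are my assumptions, as is the function name, and status=progress needs GNU dd 8.24 or later. conv=fsync makes dd wait until the data has been physically written before it exits, so the command returning means the copy is really finished.

```shell
#!/bin/sh
# Copy a file with dd and do not return until it is flushed to the device.
# Args: source, destination, optional oflag value (default: nocache).
# conv=fsync and status=progress are assumptions, not the poster's exact
# command; status=progress needs GNU dd 8.24+.
copy_flushed() {
    dd if="$1" of="$2" bs=4M oflag="${3:-nocache}" conv=fsync status=progress
}
```

Passing `direct` as the third argument asks dd to bypass the page cache entirely, which should bring the progress figure closer to what has really been written, though not every filesystem accepts it.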

    I use only Linux and Linux filesystems - almost exclusively ext4.

    What was catching me was copying the files of an extracted ISO, where one large squash file, when using rsync, had a huge buffer and so screwed up the % calculations.
    By dividing the files up between the two commands I seem to have gotten as close as possible to 100% accurate.

    It was pointless to gauge the % based on file size alone as the write speeds of the various target devices vary hugely.

    Thanks to all who contributed.
    I hope I now have it locked down sufficiently well. ;)

    (still amazed why it has to be so difficult, and there is not an overall simple command to influence what follows and prevent the use of cache/buffers)

