
Reading the Nth character in a file?

  • 20-06-2011 6:29pm
    #1
    Registered Users, Registered Users 2 Posts: 9,034 ✭✭✭


    Suppose I have a file which contains just a string of 0s and 1s and nothing else.
    I don't want to open the file, and I don't want to print anything or change anything within the file - I just want to know whether the Nth character is a 1 or a 0.
    What would the most efficient way of doing this be?


Comments

  • Closed Accounts Posts: 5,082 ✭✭✭Pygmalion


    If you're writing a program to do it, you can seek straight to the position (see man fseek for the C standard library function, or man lseek for the underlying POSIX call).

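    For example, to get the 7652nd character you could do
    // Open the file etc. (error checks omitted).
    fseek(file, 7651, SEEK_SET);  // offsets are zero-based, so character 7652 is at offset 7651
    // Read one character from the file etc., e.g. with fgetc(file).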
    

    No idea if there's a way to do this on the command line with the usual tools.

    Worth noting that since data is generally read from the disk in blocks there might not actually be a real performance difference between reading 1 byte and reading 100 bytes, unless the bytes span more than one block on disk.


  • Registered Users, Registered Users 2 Posts: 3,721 ✭✭✭E39MSport


    Technically, if you don't open the file you can't access it afaik.

    Something like cut is simple.

    echo "this is a test string" | cut -c6 will yield 'i' (edit: but that's obviously line-based; perl may be better suited to an entire file)


  • Registered Users, Registered Users 2 Posts: 37,485 ✭✭✭✭Khannie


    head -c <nthchar> <filename> | tail -c 1


  • Closed Accounts Posts: 5,082 ✭✭✭Pygmalion


    Khannie wrote: »
    head -c <nthchar> <filename> | tail -c 1

    Still involves reading in the first 'n' characters of the file, so can still be quite inefficient for very large files.
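
    To make the difference concrete, here is a rough Python sketch. The function names are made up for illustration, and n is assumed to be 1-based and at least 1. The first function mimics what the head | tail pipeline does (it pulls the first n bytes through the process), while the second jumps straight to the byte of interest:

```python
def nth_by_reading_prefix(path, n):
    """Mimic `head -c n file | tail -c 1`: read n bytes, keep the last one."""
    with open(path, "rb") as f:
        prefix = f.read(n)           # all n bytes pass through this process
    return prefix[-1:]               # last byte of the prefix (b"" if file is empty)


def nth_by_seeking(path, n):
    """Jump to zero-based offset n - 1 and read a single byte."""
    with open(path, "rb") as f:
        f.seek(n - 1)                # no data before the offset is read by this code
        return f.read(1)             # b"" if the offset is past the end of the file
```

    Both return the same byte whenever the file has at least n bytes; the work done to get there is what differs.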


  • Registered Users, Registered Users 2 Posts: 9,034 ✭✭✭Ficheall


    Cheers for the replies. Aye, yon cut -c seems to do exactly what I want but for a string. The problem is there are 8 files of about 400M, so I'd rather not have to read them if avoidable.


  • Registered Users, Registered Users 2 Posts: 13,077 ✭✭✭✭bnt


    You could try this in Python, either scripted or in an interactive shell. Saying "open" only opens a handle to the file; it doesn't mean you read the whole file into memory. Something like this (adapted from the docs):
    f = open('/tmp/workfile', 'rb')  # read-only, so the file can't be changed by accident
    f.seek(7651)                     # zero-based offset of the 7652nd byte
    f.read(1)
    
    returns the 7652nd byte of the file only.
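
    Wrapped up as a reusable function, that might look something like the sketch below. The name nth_char, the 1-based n, and the choice of read-only binary mode are assumptions for illustration; it also assumes every character in the file is a single byte, which holds for a file of plain 0s and 1s.

```python
def nth_char(path, n):
    """Return the n-th character (1-based) of a file of single-byte characters."""
    if n < 1:
        raise ValueError("n is 1-based and must be at least 1")
    with open(path, "rb") as f:      # read-only binary: offsets are plain byte counts
        f.seek(n - 1)                # the n-th byte sits at zero-based offset n - 1
        byte = f.read(1)             # read one byte; b"" means we are past the end
    if not byte:
        raise IndexError("file has fewer than n bytes")
    return byte.decode("ascii")
```

    Calling nth_char('/tmp/workfile', 7652) would then give '0' or '1' as a one-character string, assuming the file is at least that long.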




  • Registered Users, Registered Users 2 Posts: 37,485 ✭✭✭✭Khannie


    Pygmalion wrote: »
    Still involves reading in the first 'n' characters of the file, so can still be quite inefficient for very large files.

    Agreed, but only an issue if both n and the file are very large. Seeking is better of course, but it's quick and easy to type and very easy to remember (half the battle :)). Even very large files should be done quickly though. I'd expect 80+MB/s.


  • Registered Users, Registered Users 2 Posts: 368 ✭✭backboiler


    I'm late as usual, but what about the following?
    # skip counts blocks from zero, so the Nth character is at skip=N-1
    dd bs=1 count=1 if="$inputfile" skip="$position"
    

    PC here gave me the 300,000,000th character in an uncached 450 MiB file in 24 microseconds.
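
    For comparison, roughly the same positional read can be sketched in Python. This assumes a Unix-like system (os.pread is not available on Windows), and the function names are invented for illustration. As with dd's skip, the offset is zero-based, so the Nth character lives at offset N - 1. The second helper applies the same lookup to several files in one go.

```python
import os


def byte_at_offset(path, offset):
    """Return the byte at zero-based `offset`, like dd bs=1 count=1 skip=offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, 1, offset)   # positional read of one byte at `offset`
    finally:
        os.close(fd)


def nth_char_in_each(paths, n):
    """Map each path to its n-th character (1-based); '' if the file is too short."""
    return {p: byte_at_offset(p, n - 1).decode("ascii") for p in paths}
```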


  • Registered Users, Registered Users 2 Posts: 37,485 ✭✭✭✭Khannie


    Nice. <3 dd.

