3.5. Text files

In UNIX-derived systems there is no difference between text files and binary files, but in DOS-derived systems like Microsoft Windows this is not true: the "newline" code is translated to the two-character sequence "carriage return" "line feed". Classic Mac OS (before Mac OS X) uses a single but different code, "carriage return", to represent the "newline" concept.

POSIX I/O does not address the issue because the API does not provide string-related functions: programs have to deal with "binary buffers", and this sort of issue is considered an "application-side problem".

C standard I/O tried to mask the issue by adopting this concept: the program does not know the internals of the operating system, and "newline" is transparently encoded/decoded by the standard I/O library. This approach is very elegant, but there is a subtle problem: when a text file is moved from a UNIX-style system to a DOS-style one, the file must be "translated". In the file transfer world this was not an issue, of course: data movers since the FTP age have performed the "newline" translation. In the data sharing age the solution is not so easy: imagine a GNU/Linux system serving UNIX systems through NFS and Windows systems through Samba. With a bit of imagination you can picture a multi-platform application running on both UNIX and Windows... what happens with "text" files? Which "standard" should be adopted?

The elegant solution turns out to be flawed when files are shared among UNIX, Windows, Mac, etc.
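The transparent encoding performed by C standard I/O can be observed with a small experiment. What follows is only a sketch and not part of libjf: the helper name stored_size_of_text_line and the scratch file passed to it are hypothetical. It writes a line through a text-mode stream and then counts the bytes really stored by reading the file back in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only.
 * Writes `line` to `path` through a text-mode stream, then reopens the
 * file in binary mode and counts the bytes actually stored.
 * The scratch file is removed before returning.
 * Returns the byte count, or -1 on error. */
long stored_size_of_text_line(const char *path, const char *line)
{
        FILE *f;
        long count = 0;

        assert(path != NULL && line != NULL);

        /* "w" opens a text stream: the library may translate '\n' */
        f = fopen(path, "w");
        if (f == NULL)
                return -1;
        if (fputs(line, f) == EOF) {
                fclose(f);
                remove(path);
                return -1;
        }
        if (fclose(f) != 0) {
                remove(path);
                return -1;
        }

        /* "rb" opens a binary stream: bytes are returned untranslated */
        f = fopen(path, "rb");
        if (f == NULL) {
                remove(path);
                return -1;
        }
        while (fgetc(f) != EOF)
                count++;
        fclose(f);
        remove(path);
        return count;
}
```

Called with the 13-character string "Hello world!\n", the function should return 13 on a UNIX-style system, where text and binary streams are the same, and 14 on a DOS-style system, where the library stores carriage return and line feed. The program source is identical in both cases; only the stored file differs, which is exactly why sharing that file between the two worlds becomes a problem.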

At the time of this writing libjf does not provide "transparent" handling of the newline dilemma: instead of opening a "text" journaled file, an application can choose to open a "DOS text journaled file" by appending a "D" to the "open mode".

Example 3-4. dos_text.c

#include <stdio.h>
#include <jf_file.h>

int main(void)
{
        int rc;
        jf_file_t jf;
        size_t write;

        /* open mode "wD": the trailing "D" asks for a DOS text journaled file */
        rc = jf_file_open(&jf, NULL, "jf_tut_foo", "wD", NULL);
        if (JF_RC_OK != rc)
                return 1;

        /* the "\n" below is stored as carriage return + line feed */
        rc = jf_file_printf(&jf, &write, "%s", "Hello world!\n");
        if (JF_RC_OK != rc)
                return 1;

        rc = jf_file_commit(&jf);
        if (JF_RC_OK != rc)
                return 1;

        rc = jf_file_close(&jf);
        if (JF_RC_OK != rc)
                return 1;

        printf("DOS text program is OK!\n");
        return 0;
}

The dos_text.c source code can be compiled with this command:

libtool --mode=link gcc -Wall -I/opt/libjf/include -L/opt/libjf/lib -ljf \
        -o dos_text dos_text.c
    
Execute it and verify the produced journaled file:
tiian@linux:~/tutorial> ./dos_text
DOS text program is OK!
tiian@linux:~/tutorial> od -cx jf_tut_foo
0000000   H   e   l   l   o       w   o   r   l   d   !  \r  \n
        6548 6c6c 206f 6f77 6c72 2164 0a0d
0000016
    
You can note that the produced journaled file is very similar to the one produced by the hello_world program, but the newline sequence is now encoded following the "DOS standard" (carriage return, line feed).
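The byte sequence shown by od can be reproduced with a plain buffer translation. The following is a minimal sketch of the "line feed" to "carriage return, line feed" encoding; it is not the code libjf uses internally, and the function name lf_to_crlf is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Copies `len` bytes from `src` to `dst`, storing every '\n' as the
 * two-byte sequence "\r\n". At most `dst_cap` bytes are written.
 * Returns the number of bytes the full translation needs, so the
 * caller can detect truncation (return value > dst_cap). */
size_t lf_to_crlf(const char *src, size_t len, char *dst, size_t dst_cap)
{
        size_t out = 0;
        size_t i;

        assert(src != NULL || len == 0);
        assert(dst != NULL || dst_cap == 0);

        for (i = 0; i < len; i++) {
                if (src[i] == '\n') {
                        /* emit the carriage return before the line feed */
                        if (out < dst_cap)
                                dst[out] = '\r';
                        out++;
                }
                if (out < dst_cap)
                        dst[out] = src[i];
                out++;
        }
        return out;
}
```

Translating the 13 bytes of "Hello world!\n" yields 14 bytes ending in carriage return and line feed, which matches the 14 bytes (octal offset 0000016) reported by od above. Note that this naive translation also expands a line feed that is already preceded by a carriage return, so it must be applied only to UNIX-style input.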

3.5.1. Conclusions

3.5.2. Future developments

3.5.2.1. What about MAC OS X?

Unfortunately I don't have it: this issue should be solved when the port is performed. I suppose a new "open mode" flag might be introduced, for example "M", to specify a "MAC OS text journaled file".

3.5.2.2. Will a "transparent flag" be provided in the future?

I don't think a transparent flag like "T" (text) is useful because it's a bit confusing: think of an application compiled both as a native Microsoft Windows program and under Cygwin emulation... When executed as native it should adopt the DOS standard, but when executed as a Cygwin application it should adopt the UNIX standard... Who is looking after the user's peace of mind? I know I cannot change the world, so if a lot of people ask for it, it will be developed.