Rule number one: every operating system handles data synchronization somewhat differently. libjf should be portable across many environments, and portable software can hardly take advantage of features that are specific to a single operating system.
Rule number two: the documentation in the standards is very weak; to get an idea of what "very weak" means, take a look at the documentation available in IEEE Std 1003.1-2001.
libjf supplies two types of synchronization: "fast" and "safe".
Fast synchronization prevents data loss in case of an application crash, but it gives no guarantee in case of a system crash.
Fast synchronization uses the fflush function to flush buffer content to the operating system: when an application crashes, the operating system closes all its open file descriptors and queues the pending data for writing. If the application crashes, the data it has already flushed is saved by the operating system.
Safe synchronization prevents data loss in case of a system crash.
Safe synchronization uses the fdatasync function (fsync when the former is not available) to sync device content.
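The difference between the two levels can be sketched in plain POSIX C. This is only an illustration, not libjf's actual code: the names sync_level, SYNC_FAST, SYNC_SAFE and sync_stream are hypothetical, and the sketch assumes that the _POSIX_SYNCHRONIZED_IO macro is an acceptable way to detect whether fdatasync is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical names, used only for this sketch; they are not libjf symbols. */
enum sync_level { SYNC_FAST, SYNC_SAFE };

/* Push the buffered data of `fp` downstream.
 * SYNC_FAST: fflush() hands the stdio buffer to the operating system.
 * SYNC_SAFE: additionally asks the kernel to write the data to the device.
 * Returns 0 on success, -1 on failure. */
int sync_stream(FILE *fp, enum sync_level level)
{
    if (fflush(fp) != 0)
        return -1;
    if (level == SYNC_SAFE) {
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
        /* fdatasync() may skip metadata that is not needed to read the data */
        if (fdatasync(fileno(fp)) != 0)
            return -1;
#else
        /* assumed fallback when fdatasync() is not available */
        if (fsync(fileno(fp)) != 0)
            return -1;
#endif
    }
    return 0;
}
```

Calling sync_stream(fp, SYNC_FAST) after every write corresponds to the fast behavior described above; SYNC_SAFE adds a round trip to the device, which is where the extra cost of safe synchronization comes from.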
An application may hard code the type of synchronization by specifying the flag JF_JOURNAL_PROP_SYNC_SAFE or JF_JOURNAL_PROP_SYNC_FAST at jf_journal_open time:

    jf_journal_t j;
    struct jf_journal_opts_s jopts;
    int rc;

    /* start from the default options, then force safe synchronization */
    jf_set_default_journal_opts(&jopts);
    jopts.flags |= JF_JOURNAL_PROP_SYNC_SAFE;
    rc = jf_journal_open(&j, "jf_tut_foo-journal", 2, &jopts);

This method has all the benefits and disadvantages of "hard wired" parameters. libjf also allows you to specify the type of synchronization at run time: this is the default behavior, but you may request it explicitly:

    jf_journal_t j;
    struct jf_journal_opts_s jopts;
    int rc;

    /* start from the default options, then let the environment decide */
    jf_set_default_journal_opts(&jopts);
    jopts.flags |= JF_JOURNAL_PROP_SYNC_ENV_VAR;
    rc = jf_journal_open(&j, "jf_tut_foo-journal", 2, &jopts);

An application that uses JF_JOURNAL_PROP_SYNC_ENV_VAR looks up the environment variable JF_JOURNAL_SYNC_TYPE to establish which type of synchronization must be used:
environment variable is defined and its value is "0": fast synchronization is adopted
environment variable is defined and its value is "1": safe synchronization is adopted
otherwise: JF_JOURNAL_PROP_SYNC_SUGGESTED synchronization is adopted (see the "API reference guide")
Showing the effects of the different synchronization types is a hard job, out of the scope of this tutorial, but an example that empirically verifies the performance gap is easy to build.
Example 4-1. many_hello_world.c

    #include <stdio.h>
    #include <jf_file.h>

    int main(void)
    {
        int rc, i;
        jf_file_t jf;
        size_t write;

        rc = jf_file_open(&jf, NULL, "jf_tut_foo", "w", NULL);
        if (JF_RC_OK != rc)
            return 1;
        /* one transaction per iteration: write a row, then commit it */
        for (i = 0; i < 10000; ++i) {
            rc = jf_file_printf(&jf, &write, "%s",
                                "Hello world!\n");
            if (JF_RC_OK != rc)
                return 1;
            rc = jf_file_commit(&jf);
            if (JF_RC_OK != rc)
                return 1;
        } /* for (i = 0; i < 10000; ++i) */
        rc = jf_file_close(&jf);
        if (JF_RC_OK != rc)
            return 1;
        printf("Many hello world program is OK!\n");
        return 0;
    }

many_hello_world.c is like hello_world.c, but it performs 10000 transactions instead of only 1. We do not specify JF_JOURNAL_PROP_SYNC_ENV_VAR because it is the default option.
To compile many_hello_world.c you can use this command:

    libtool --mode=link gcc -Wall -I/opt/libjf/include -L/opt/libjf/lib -ljf \
        -o many_hello_world many_hello_world.c

Then execute it:

    tiian@linux:~/tutorial> rm jf_tut_foo*
    tiian@linux:~/tutorial> export JF_JOURNAL_SYNC_TYPE=0
    tiian@linux:~/tutorial> time ./many_hello_world
    Many hello world program is OK!

    real    0m0.499s
    user    0m0.149s
    sys     0m0.345s
    tiian@linux:~/tutorial> rm jf_tut_foo*
    tiian@linux:~/tutorial> export JF_JOURNAL_SYNC_TYPE=1
    tiian@linux:~/tutorial> time ./many_hello_world
    Many hello world program is OK!

    real    0m3.478s
    user    0m0.130s
    sys     0m0.390s
    tiian@linux:~/tutorial> rm jf_tut_foo*
    tiian@linux:~/tutorial> unset JF_JOURNAL_SYNC_TYPE
    tiian@linux:~/tutorial> time ./many_hello_world
    Many hello world program is OK!

    real    0m0.507s
    user    0m0.173s
    sys     0m0.331s

The second execution takes about 7 times as long as the first (3.478s against 0.499s of real time); the third execution is very similar to the first: this means the current value of JF_JOURNAL_PROP_SYNC_SUGGESTED is JF_JOURNAL_PROP_SYNC_FAST, but it might change in the future. To check that the journaled file contains 10000 rows, issue this command:

    tiian@linux:~/tutorial> wc -l jf_tut_foo
    10000 jf_tut_foo

To check that the journal contains 10000 commits, issue this command:

    tiian@linux:~/tutorial> jf_report -j jf_tut_foo.jf | grep commit | wc -l
    10000

Please note that many_hello_world is not a benchmark program! To measure libjf performance, the utility program jf_bench is supplied, but that is another story.
To test synchronization, crashes must be reproduced. An application crash is easy to simulate: a division by zero exception, a segmentation fault, and so on. Simulating a system crash is a much more difficult task; a realistic simulation is probably impossible without hacking the operating system kernel. Despite this, some kind of testing must be performed against a "journaled files library".
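
An application crash of the kind mentioned above can be reproduced in a few lines of plain POSIX C. The following is a minimal sketch that does not use libjf and is not its crash simulation feature: the function name bytes_surviving_crash is hypothetical, and SIGKILL is chosen here (instead of a division by zero) only because it guarantees that the process gets no chance to flush its buffers on the way out.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `text` to `path`, optionally calls fflush(),
 * and then dies abruptly. Returns the number of bytes found in the file
 * afterwards, or -1 on error. Hypothetical helper, not a libjf function. */
long bytes_surviving_crash(const char *path, const char *text, int do_flush)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        FILE *out = fopen(path, "w");
        if (out == NULL)
            _exit(2);
        fputs(text, out);            /* data sits in the stdio buffer */
        if (do_flush)
            fflush(out);             /* hand the data to the operating system */
        kill(getpid(), SIGKILL);     /* simulated application crash */
        _exit(3);                    /* not reached */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSIGNALED(status))
        return -1;

    /* The child is gone: measure what the operating system kept. */
    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;
    if (fseek(in, 0L, SEEK_END) != 0) {
        fclose(in);
        return -1;
    }
    long size = ftell(in);
    fclose(in);
    return size;
}
```

With do_flush set, the text written before the crash is still in the file, which is the guarantee fast synchronization relies on; without it, a short text never leaves the stdio buffer and the file stays empty. This says nothing about a system crash, which this kind of test cannot reproduce.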
libjf implements a "crash simulation feature" used to stress the library with crashes at all the interesting code steps: this simulation should be sufficiently close to a real crash to declare that "libjf should be a safe journaling tool". Nothing is set in stone, and some things might change in the future.