3. System

3.1. How can I inspect the content of a journal file?
3.2. Can I change the type of synchronization for a journal file without changing the program?
3.3. Which type of synchronization should I choose?
3.4. Is there a list of environment vars recognized by libjf?

3.1. How can I inspect the content of a journal file?

Use the utility program jf_report to obtain an XML dump of a journal file's content. Options are available to increase or reduce verbosity.

3.2. Can I change the type of synchronization for a journal file without changing the program?

If the program opens the journal using flag JF_JOURNAL_PROP_SYNC_DEFAULT or JF_JOURNAL_PROP_SYNC_ENV_VAR, the synchronization type may be changed using environment var JF_JOURNAL_SYNC_TYPE:

export JF_JOURNAL_SYNC_TYPE=0

for fast synchronization (fflush)

export JF_JOURNAL_SYNC_TYPE=1

for safe synchronization (fdatasync)
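
As a minimal sketch, a wrapper script could report which mode the variable selects before launching the program. The function name describe_sync_type is hypothetical and not part of libjf; only the values 0 and 1 and their meanings come from the text above, and reporting any other value as "unknown" is an assumption.

```shell
# Hypothetical helper: describe the mode selected by a JF_JOURNAL_SYNC_TYPE value.
# Only 0 and 1 are documented above; anything else is reported as unknown.
describe_sync_type() {
    case "$1" in
        0) echo "fast (fflush)" ;;
        1) echo "safe (fdatasync)" ;;
        *) echo "unknown" ;;
    esac
}

export JF_JOURNAL_SYNC_TYPE=1
describe_sync_type "$JF_JOURNAL_SYNC_TYPE"
```

The variable only takes effect when the program opens the journal with one of the two flags named above.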

3.3. Which type of synchronization should I choose?

JF_JOURNAL_PROP_SYNC_FAST is fast and works fine if a system crash (hardware, operating system, power supply) is a rare event you do not need to worry about. JF_JOURNAL_PROP_SYNC_SAFE is slow and works fine when system crashes are frequent or you cannot deal with their consequences. Good I/O devices are necessary to achieve good performance.

3.4. Is there a list of environment vars recognized by libjf?

Yes, this is the list:

JF_TRACE_MASK

hexadecimal value containing the mask of traced modules (purpose: development & debugging)

JF_JOURNALED_FILE_CACHE_SIZE

number of bytes used to cache a journaled file;

Note

the value is used only if:

  • it's greater than hard wired constant JF_CACHE_FILE_DEFAULT_LIMIT

  • program does not specify a specific value

JF_CRASH_SIMUL_POINT

place where crash must happen; note: the value is used only if library has been built with --enable-crash-simul option (purpose: development & debugging)

JF_CRASH_SIMUL_COUNT

number of times the process must cross the crash simulation point before a crash can happen; note: the value is used only if library has been built with --enable-crash-simul option (purpose: development & debugging)

JF_JOURNAL_SYNC_TYPE

type of synchronization (fast or safe) that must be used; note: the value is used only if the program does not specify a specific value

JF_JOURNAL_VIRTMEM

virtual memory usage for internal buffering purposes; note: the value is used only if the program does not specify a specific value

JF_JOURNAL_SIZE

journal file size; note: the value is used at creation time only if the program does not specify a specific value

JF_JOURNAL_NUM

number of journal files that must be kept in the rotation process; note: the value is used at creation time only if the program does not specify a specific value

JF_JOURNAL_ROTATION_THRESHOLD

fill ratio that must be reached before a journal rotation can happen; note: the value is used at creation time only if the program does not specify a specific value.

3.1. Utilities

3.1.1. How can I create a journal without writing my own C program?
3.1.2. How can I add a standard system file to a journal file without writing my own C program?
3.1.3. How can I remove a journaled file from the control of a journal without writing my own C program?
3.1.4. Can I rename a journaled file using standard system command mv?
3.1.5. Can I move a journaled file to a different filesystem using utility jf_rename?

3.1.1. How can I create a journal without writing my own C program?

You can use utility jf_create to create a new journal file; journaled files may be added using utility jf_join.

3.1.2. How can I add a standard system file to a journal file without writing my own C program?

You can use utility jf_join where "join" means "a new standard file joins the group of files journaled using a specific journal file".

3.1.3. How can I remove a journaled file from the control of a journal without writing my own C program?

You can use utility jf_leave where "leave" means "a journaled file leaves its journal and becomes a standard file".

3.1.4. Can I rename a journaled file using standard system command mv?

No, renaming the file breaks the link between journal file and journaled files. To rename a journaled file, use utility jf_rename.

3.1.5. Can I move a journaled file to a different filesystem using utility jf_rename?

Yes, if your rename implementation (man 2 rename) supports moving files across filesystems; otherwise no. GNU/Linux 2.6.x and glibc 2.3.3 do not support this type of operation (an EXDEV error is returned).

3.2. Recovery

3.2.1. How can I recover journaled files after a crash?
3.2.2. How can I verify the "recovery pending" status of a journal?
3.2.3. How can I recover from a "recovery pending" status?
3.2.4. How can I establish if a journal is damaged?
3.2.5. How can I recover a damaged journal?
3.2.6. How can I preview the operations that would be performed by a recovery phase?
3.2.7. How can I see the operations performed in a recovery phase?

3.2.1. How can I recover journaled files after a crash?

You can use utility jf_recover, which is specifically designed for this purpose. This flexible utility can be used:

  • to check if a journal needs recovery

  • to create an XML report of the operations that would be performed by a recovery phase

  • to perform a recovery phase.

These functions apply to "recovery pending" journals and "damaged journals" too.

3.2.2. How can I verify the "recovery pending" status of a journal?

Use utility jf_recover with -t flag:

jf_recover -j <my journal name> -t
	  
and check exit code:

  • 0: journal needs recovery

  • 1: journal is damaged

  • 2: error

  • 3: journal is not in "recovery pending" status.
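
In a script, the exit codes listed above could be translated into messages with a small helper. This is only a sketch: the function name explain_recover_test_status is hypothetical, and the handling of any code outside 0-3 is an assumption.

```shell
# Hypothetical helper: translate the exit code of "jf_recover -j <journal> -t"
# into the meanings listed above.
explain_recover_test_status() {
    case "$1" in
        0) echo "journal needs recovery" ;;
        1) echo "journal is damaged" ;;
        2) echo "error" ;;
        3) echo "journal is not in recovery pending status" ;;
        *) echo "unexpected exit code: $1" ;;
    esac
}

# Typical use (not run here, it requires the libjf utilities):
#   jf_recover -j "$journal" -t
#   explain_recover_test_status $?
explain_recover_test_status 3
```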

3.2.3. How can I recover from a "recovery pending" status?

Use utility "jf_recover" and check exit code is 0. Example:

jf_recover -j <my journal name>
	  

3.2.4. How can I establish if a journal is damaged?

Run utility jf_recover first with the -t flag alone, then with both the -t and -f flags:

jf_recover -j <my journal name> -t
	  
and check exit code is 1, then
jf_recover -j <my journal name> -t -f
	  
and check exit code is 0
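
The two checks can be combined in a script. The sketch below uses only the flags shown above; the function name is_journal_damaged is hypothetical, and treating every other combination of exit codes as "not damaged" is an assumption.

```shell
# Hypothetical helper: succeeds (returns 0) only when the two checks described
# above give the expected exit codes: 1 for "-t" and 0 for "-t -f".
is_journal_damaged() {
    journal="$1"
    status=0
    jf_recover -j "$journal" -t || status=$?
    [ "$status" -eq 1 ] || return 1
    # The function's result is the exit code of this forced test.
    jf_recover -j "$journal" -t -f
}

# Typical use (requires the libjf utilities):
#   if is_journal_damaged "$journal"; then echo "journal is damaged"; fi
```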

3.2.5. How can I recover a damaged journal?

Use utility jf_recover specifying -f ("force") flag and check exit code is 0. Example:

jf_recover -j <my journal name> -f
	  

3.2.6. How can I preview the operations that would be performed by a recovery phase?

Use utility jf_recover with options -t ("test") and -d ("dump"). Example:

jf_recover -j <my journal name> -t -dh
	  
Type jf_recover -h to see the list of sub-options related to -d.

3.2.7. How can I see the operations performed in a recovery phase?

Use utility jf_recover with option -d ("dump"). Example:

jf_recover -j <my journal name> -dh
	  
Type jf_recover -h to see the list of sub-options related to -d.

3.3. Tuning

3.3.1. What is the maximum size of a journal?
3.3.2. How many backup journals will be kept?
3.3.3. At what time does "rotation process" happen?
3.3.4. How can rotation threshold be set?
3.3.5. How can the cache of a journaled file be tuned?
3.3.6. Is there an official libjf benchmark tool?
3.3.7. Why is libjf slower than stdio?
3.3.8. How fast/slow is an application developed with libjf in comparison with stdio?
3.3.9. Why should I use libjf if it is "so slow"?
3.3.10. Which factors do influence jf_bench results?
3.3.11. Can I guess how slow my application will become after migrating it from stdio to libjf?
3.3.12. Is there any difference between "append" and "update" from a performance point of view?
3.3.13. Can I obtain raw test data from jf_bench?
3.3.14. Can I use different block size, number of records, number of files, etc... with jf_bench?
3.3.15. Is libjf already optimized or should we expect better results for jf_bench in the future?
3.3.16. jf_bench utility program stops and shows this error:
jf_bench/bench_test_results_compute: error while computing 
benchmark results: -20/ERROR: journal file exceeds maximum 
desired size and operation can NOT be performed; a global 
sync/rollback is necessary to activate journal rotation
	  

3.3.1. What is the maximum size of a journal?

Journal maximum size is determined at creation time:

  • the program can specify it setting file_size field of struct jf_journal_file_opts_s

  • if the program does not specify it, the value of environment var JF_JOURNAL_SIZE is used

  • if environment var JF_JOURNAL_SIZE is not set, the default value (see hard wired constant JF_JOURNAL_DEFAULT_FILE_SIZE) is used.

The maximum size of a journal file can be retrieved using utility jf_report.
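
The order of precedence described above can be sketched in shell. This is an illustration, not libjf code: the function name effective_journal_size is hypothetical, ASSUMED_DEFAULT_FILE_SIZE is a placeholder because the real value of JF_JOURNAL_DEFAULT_FILE_SIZE is not stated here, and treating an empty JF_JOURNAL_SIZE as unset is an assumption.

```shell
# Placeholder for JF_JOURNAL_DEFAULT_FILE_SIZE; the real value is defined in
# the libjf sources and is not stated in this FAQ.
ASSUMED_DEFAULT_FILE_SIZE=1048576

# Prints the size that would be effective at creation time.
# $1: value set by the program (empty string if the program does not set it)
effective_journal_size() {
    if [ -n "$1" ]; then
        echo "$1"                          # the program's own value wins
    elif [ -n "${JF_JOURNAL_SIZE:-}" ]; then
        echo "$JF_JOURNAL_SIZE"            # otherwise the environment var
    else
        echo "$ASSUMED_DEFAULT_FILE_SIZE"  # otherwise the built-in default
    fi
}

unset JF_JOURNAL_SIZE
effective_journal_size ""        # falls back to the assumed default
JF_JOURNAL_SIZE=8388608
effective_journal_size ""        # taken from the environment
effective_journal_size 4194304   # specified by the program
```

The same order (program, environment var, hard wired constant) applies to the other creation time settings described in this section.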

3.3.2. How many backup journals will be kept?

By design, a journal file can NOT be expanded indefinitely. A full journal is archived by appending suffix "1" and a new journal file is created: this is called the "rotation process". The number of old journals that must be kept is determined at creation time:

  • the program can specify it setting file_num field of struct jf_journal_file_opts_s

  • if the program does not specify it, the value of environment var JF_JOURNAL_NUM is used

  • if environment var JF_JOURNAL_NUM is not set, the default value (see hard wired constant JF_JOURNAL_DEFAULT_FILE_NUM) is used.

The number of old journals kept by a journal file can be retrieved using utility jf_report.

3.3.3. At what time does "rotation process" happen?

A journal file rotates when the "rotation threshold" is reached and a global sync point is requested by the application. The "rotation threshold" is a number in the range (0, 1.0).

Example 1. rotation does not happen because journal file is not full

file_size = 4 Mbytes
rotation_threshold = 0.75
journal file current size = 2.3 Mbytes
application calls jf_journal_commit()
  

Example 2. rotation does happen because journal file is "quite" full

file_size = 4 Mbytes
rotation_threshold = 0.75
journal file current size = 3.2 Mbytes
application calls jf_journal_commit()
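
The arithmetic behind the two examples can be checked with a short script. This is a sketch only: the function name threshold_reached is hypothetical, awk is used for the floating point division, and reading "reached" as "greater than or equal to" is an assumption. The script covers only the size condition; rotation also needs the global sync point.

```shell
# Hypothetical helper: prints "yes" when current size / file size has reached
# the rotation threshold, "no" otherwise.
# $1: file_size, $2: rotation_threshold, $3: current size (same unit as $1)
threshold_reached() {
    awk -v max="$1" -v thr="$2" -v cur="$3" \
        'BEGIN { if (cur / max >= thr) print "yes"; else print "no" }'
}

threshold_reached 4 0.75 2.3   # Example 1: 2.3 / 4 = 0.575, below 0.75
threshold_reached 4 0.75 3.2   # Example 2: 3.2 / 4 = 0.8, above 0.75
```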
  

3.3.4. How can rotation threshold be set?

Journal rotation threshold is determined at creation time:

  • the program can specify it setting rotation_threshold field of struct jf_journal_file_opts_s

  • if the program does not specify it, the value of environment var JF_JOURNAL_ROTATION_THRESHOLD is used

  • if environment var JF_JOURNAL_ROTATION_THRESHOLD is not set, the default value (see hard wired constant JF_JOURNAL_DEFAULT_ROTATION_THRESHOLD) is used.

The rotation threshold of a journal file can be retrieved using utility jf_report.

3.3.5. How can the cache of a journaled file be tuned?

Cache associated to a journaled file is determined at open time:

  • the program can specify it setting field cache_size_limit of struct jf_journal_file_opts_s of struct jf_journal_opts_s of struct jf_file_open_opts_s

  • if the program does not specify it, the value of environment var JF_JOURNALED_FILE_CACHE_SIZE is used

  • if environment var JF_JOURNALED_FILE_CACHE_SIZE is not set, the default value (see hard wired constant JF_CACHE_FILE_DEFAULT_LIMIT) is used.

Cache size cannot be set to a value lower than JF_CACHE_FILE_MIN_LIMIT (it's a hard wired constant).

3.3.6. Is there an official libjf benchmark tool?

Yes, utility jf_bench is the official tool.

3.3.7. Why is libjf slower than stdio?

First, libjf is a recent experimental piece of code, while stdio is a well tested one. Second, libjf is built on top of stdio to avoid reinventing the wheel. Last but not least, libjf has richer semantics that allow transactionality, while stdio does not support transactions.

3.3.8. How fast/slow is an application developed with libjf in comparison with stdio?

Applications that use "safe" synchronization (disk synchronization) tend to take about twice as long when moved from stdio to libjf. Applications that use "fast" synchronization (buffer flush) tend to slow down considerably (4-5 times) when moved from stdio to libjf.

3.3.9. Why should I use libjf if it is "so slow"?

Because stdio does not provide a rollback function. Because stdio does not provide a synchronization function for more than one stream/file descriptor: your system may crash after synchronization of first stream/file descriptor and before synchronization of second stream/file descriptor.

3.3.10. Which factors do influence jf_bench results?

There are many:

  • filesystem type

  • storage type (EIDE, SATA, SCSI, etc...)

  • device type (native, RAID, DRBD, crypto, etc...)

  • RAM and cache size and speed

  • CPU type and speed.

3.3.11. Can I guess how slow my application will become after migrating it from stdio to libjf?

If your application is CPU intensive and only occasionally writes something to disk, you need not worry about the libjf slowdown. If your application writes a significant amount of data to disk, you can expect a slowdown between 5% and 20%. If your application is I/O intensive, the results of the jf_bench utility can help you estimate how much elapsed time and CPU time (user and system) will increase. jf_bench does nothing with the data it writes to and reads from disk, so it should represent the worst case your application might reach.

3.3.12. Is there any difference between "append" and "update" from a performance point of view?

Yes, the slow-down introduced by libjf is correlated to the type of write an application requires. jf_bench utility program can help you with average results and pattern specific results.

3.3.13. Can I obtain raw test data from jf_bench?

Yes, jf_bench prints on terminal average results, but you can ask it to supply all the raw data in CSV and/or XML format.

3.3.14. Can I use different block size, number of records, number of files, etc... with jf_bench?

Yes, you may specify different parameters by changing the constants defined in the source code and recompiling it. The official version of jf_bench might change these values in the future: no one knows "the right values"...

3.3.15. Is libjf already optimized or should we expect better results for jf_bench in the future?

At this time libjf is "alpha" software and no optimization work has been done.

3.3.16. jf_bench utility program stops and shows this error:

jf_bench/bench_test_results_compute: error while computing 
benchmark results: -20/ERROR: journal file exceeds maximum 
desired size and operation can NOT be performed; a global 
sync/rollback is necessary to activate journal rotation
	  

Sometimes the default journal size is not sufficient for benchmark execution and environment variable JF_JOURNAL_SIZE must be set to a value higher than the default; you may use this command before execution to set the journal size to 256 Mbytes:

export JF_JOURNAL_SIZE=268435456
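
The value above is 256 * 1024 * 1024 bytes. As a small sketch, the conversion can be left to the shell when a different size is needed; the function name mbytes_to_bytes is hypothetical.

```shell
# Convert a size in Mbytes (1 Mbyte = 1024 * 1024 bytes) to bytes.
mbytes_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

JF_JOURNAL_SIZE=$(mbytes_to_bytes 256)   # 268435456
export JF_JOURNAL_SIZE
echo "$JF_JOURNAL_SIZE"
```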