4.2. Journaling and caching

Designing an extremely safe journaling tool without considering performance is a useless academic exercise: no one would use a very slow "safe journaling tool" instead of the standard I/O libraries. libjf is not yet optimized, and future code review will probably improve its performance; from an architectural point of view, however, the library already adopts some strategies to limit performance degradation compared with the standard I/O libraries.

The most important feature is a high-level cache that can be explained in a few words: every time the application updates a journaled file, the change is not propagated to the underlying file, but simply kept in the cache managed by libjf. Data is copied to the file when the cache reaches its maximum size or when the application requests a commit. If the cache is large enough, no underlying file is touched until the commit point and, in case of rollback, no file is touched at all. Managing an additional level of cache is expensive in terms of CPU and virtual memory, but updating files before commit or rollback dramatically increases elapsed times, because every time a bit is touched its undo record must first be saved and synchronized in a safe place (calling it "journal", "log" or "rollback tablespace" does not alter the concept).
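
The caching idea described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the sk_ names, the fixed-size in-memory "file" and the byte-granular cache are all hypothetical and are not part of libjf's API or its actual implementation. The sketch shows that a rollback touches nothing when the cache is large enough, and that an early flush forces undo data to be saved first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_SIZE 32  /* size of the simulated underlying file, in bytes */

/* Hypothetical structure for illustration only; it is not part of libjf. */
struct sk_file {
    unsigned char disk[SK_SIZE];   /* stands in for the underlying file */
    unsigned char cache[SK_SIZE];  /* pending (uncommitted) byte values */
    unsigned char dirty[SK_SIZE];  /* 1 if cache[i] holds a pending update */
    unsigned char undo[SK_SIZE];   /* old values saved before an early flush */
    unsigned char saved[SK_SIZE];  /* 1 if undo[i] is valid */
    size_t cached;                 /* number of dirty bytes held in the cache */
    size_t limit;                  /* cache size limit, in bytes */
    unsigned disk_writes;          /* how many times the "file" was touched */
};

void sk_init(struct sk_file *f, size_t limit)
{
    assert(f != NULL && limit > 0);
    memset(f, 0, sizeof *f);
    f->limit = limit;
}

/* Copy cached bytes to the "file". When keep_undo is set (flush before
 * commit), the old values are saved first so that rollback can restore them.
 * A real journal would also synchronize those undo records to stable storage. */
void sk_flush(struct sk_file *f, int keep_undo)
{
    size_t i;
    if (f->cached == 0)
        return;
    for (i = 0; i < SK_SIZE; i++) {
        if (!f->dirty[i])
            continue;
        if (keep_undo && !f->saved[i]) {
            f->undo[i] = f->disk[i];
            f->saved[i] = 1;
        }
        f->disk[i] = f->cache[i];
        f->dirty[i] = 0;
    }
    f->cached = 0;
    f->disk_writes++;
}

/* Updates go to the cache only; the "file" is touched when the limit is hit. */
void sk_write(struct sk_file *f, size_t off, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;
    assert(off <= SK_SIZE && len <= SK_SIZE - off);
    for (i = 0; i < len; i++) {
        if (!f->dirty[off + i]) {
            if (f->cached == f->limit)
                sk_flush(f, 1);          /* cache full: early flush with undo */
            f->dirty[off + i] = 1;
            f->cached++;
        }
        f->cache[off + i] = p[i];
    }
}

/* Commit: propagate the cache and forget any undo data. */
void sk_commit(struct sk_file *f)
{
    sk_flush(f, 0);
    memset(f->saved, 0, sizeof f->saved);
}

/* Rollback: drop the cache; restore the "file" only if it was flushed early. */
void sk_rollback(struct sk_file *f)
{
    size_t i;
    int restored = 0;
    memset(f->dirty, 0, sizeof f->dirty);
    f->cached = 0;
    for (i = 0; i < SK_SIZE; i++) {
        if (f->saved[i]) {
            f->disk[i] = f->undo[i];
            f->saved[i] = 0;
            restored = 1;
        }
    }
    if (restored)
        f->disk_writes++;
}
```

With a limit of 8 bytes, writing 4 bytes and rolling back leaves disk_writes at 0; with a limit of 2 bytes, writing 3 bytes forces one early flush, and the following rollback has to touch the "file" a second time to undo it.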

If libjf were implemented in the kernel at filesystem level, its performance would be closer to that of native file access operations, but many significant issues would have to be solved first.

Such an efficient kernel-level implementation does not exist, and it is not the libjf discussed here.

The maximum size of the cache allocated for each journaled file can be specified by setting the field cache_size_limit of the struct jf_journal_opts_s embedded in struct jf_file_open_opts_s. Take a look at this sample program:

Example 4-2. cache_size.c

     1	#include <jf_file.h>
     2	int main()
     3	{
     4	        int rc;
     5	        jf_journal_t j;
     6	        jf_file_t jf1, jf2, jf3;
     7	        struct jf_journal_opts_s jopts;
     8	        struct jf_file_open_opts_s fopts;
     9	        jf_set_default_journal_opts(&jopts);
    10	        jopts.flags |= JF_JOURNAL_PROP_OPEN_O_CREAT |
    11	                JF_JOURNAL_PROP_OPEN_O_EXCL;
    12	        rc = jf_journal_open(&j, "jf_tut_foo-journal", 2, &jopts);
    13	        if (JF_RC_OK != rc) {
    14	                printf("%d/%s\n", rc, jf_strerror(rc));
    15	                return 1;
    16	        }
    17	        jf_set_default_file_open_opts(&fopts);
    18	        fopts.join_the_journal = TRUE;
    19	        fopts.journal_opts.journal_file_opts.cache_size_limit = 123400;
    20	        rc = jf_file_open(&jf1, &j, "jf_tut_foo-data1", "w", &fopts);
    21	        if (JF_RC_OK != rc) {
    22	                printf("%d/%s\n", rc, jf_strerror(rc));
    23	                return 1;
    24	        }
    25	        printf("Cache limit for first journaled file: "
    26	               JF_OFFSET_T_FORMAT "\n",
    27	               jf_file_get_cache_limit(&jf1));
    28	        fopts.journal_opts.journal_file_opts.cache_size_limit = -1;
    29	        rc = jf_file_open(&jf2, &j, "jf_tut_foo-data2", "w", &fopts);
    30	        if (JF_RC_OK != rc) {
    31	                printf("%d/%s\n", rc, jf_strerror(rc));
    32	                return 1;
    33	        }
    34	        printf("Cache limit for second journaled file: "
    35	               JF_OFFSET_T_FORMAT "\n",
    36	               jf_file_get_cache_limit(&jf2));
    37	        rc = jf_file_open(&jf3, NULL, "jf_tut_foo-data3", "w", NULL);
    38	        if (JF_RC_OK != rc) {
    39	                printf("%d/%s\n", rc, jf_strerror(rc));
    40	                return 1;
    41	        }
    42	        printf("Cache limit for third journaled file: "
    43	               JF_OFFSET_T_FORMAT "\n",
    44	               jf_file_get_cache_limit(&jf3));
    45	        rc = jf_file_close(&jf1);
    46	        if (JF_RC_OK != rc) {
    47	                printf("%d/%s\n", rc, jf_strerror(rc));
    48	                return 1;
    49	        }
    50	        rc = jf_file_close(&jf2);
    51	        if (JF_RC_OK != rc) {
    52	                printf("%d/%s\n", rc, jf_strerror(rc));
    53	                return 1;
    54	        }
    55	        rc = jf_file_close(&jf3);
    56	        if (JF_RC_OK != rc) {
    57	                printf("%d/%s\n", rc, jf_strerror(rc));
    58	                return 1;
    59	        }
    60	        rc = jf_journal_close(&j);
    61	        if (JF_RC_OK != rc) {
    62	                printf("%d/%s\n", rc, jf_strerror(rc));
    63	                return 1;
    64	        }
    65	        printf("two_files program ended OK!\n");
    66	        return 0;
    67	}

cache_size.c source code explanation

Rows 19-20

set the cache size limit to 123400 bytes for journaled file jf1

Row 27

retrieve the cache size limit associated with journaled file jf1

Rows 28-29

set the cache size limit to its default value for journaled file jf2 (cache_size_limit is set to -1)

Row 36

retrieve the cache size limit associated with journaled file jf2

Row 37

open journaled file jf3 with default values (no options structure is passed)

Row 44

retrieve the cache size limit associated with journaled file jf3

4.2.1. Compilation and execution

To compile the cache_size program, you can use this command:

libtool --mode=link gcc -Wall -I/opt/libjf/include -L/opt/libjf/lib -ljf \
        -o cache_size cache_size.c
Then execute it:
tiian@linux:~/src/tutorial> rm jf_tut_foo*
tiian@linux:~/src/tutorial> export JF_JOURNALED_FILE_CACHE_SIZE=765000
tiian@linux:~/src/tutorial> ./cache_size
Cache limit for first journaled file: 123400
Cache limit for second journaled file: 262144
Cache limit for third journaled file: 765000
two_files program ended OK!
tiian@linux:~/src/tutorial> rm jf_tut_foo-*
tiian@linux:~/src/tutorial> export JF_JOURNALED_FILE_CACHE_SIZE=437900
tiian@linux:~/src/tutorial> ./cache_size
Cache limit for first journaled file: 123400
Cache limit for second journaled file: 262144
Cache limit for third journaled file: 437900
two_files program ended OK!

4.2.2. How cache size limit can be tuned

After you have developed your application, you can try raising the cache size limit and measuring elapsed times: a larger cache is recommended only if performance improves significantly. For most applications, the default value should be fine.
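
The tuning rule above can be written down as a small helper. This is a minimal sketch, not something provided by libjf: the function name sk_pick_limit and the min_gain threshold are assumptions made for illustration. It expects elapsed times that you measured yourself (for example with the shell's time command), one per candidate cache size limit, ordered from the smallest limit to the largest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libjf.
 * elapsed[i] is the measured elapsed time for the i-th candidate limit,
 * with candidates ordered from the smallest limit to the largest.
 * A larger limit is accepted only if it is faster than the best one found
 * so far by at least the fraction min_gain (0.10 means 10%).
 * Returns the index of the chosen candidate. */
size_t sk_pick_limit(const double *elapsed, size_t n, double min_gain)
{
    size_t best = 0;
    size_t i;
    assert(elapsed != NULL && n > 0);
    for (i = 1; i < n; i++) {
        if (elapsed[i] < elapsed[best] * (1.0 - min_gain))
            best = i;
    }
    return best;
}
```

If no larger limit gives a gain of at least min_gain, the function returns 0, which corresponds to keeping the smallest candidate (typically the default).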


The parameter has the meaning of a "cache size limit": only the memory that is actually needed is allocated by the application.
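
The difference between a limit and a preallocation can be illustrated with a buffer that grows on demand. This is a sketch under assumptions: struct sk_buf and sk_reserve are hypothetical names, and libjf's real cache is not necessarily organized this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable buffer, not part of libjf. */
struct sk_buf {
    unsigned char *data;  /* NULL until some memory is actually needed */
    size_t allocated;     /* bytes currently allocated */
    size_t limit;         /* upper bound; nothing is reserved in advance */
};

/* Make room for 'needed' bytes.
 * Returns 0 on success, -1 if 'needed' exceeds the limit or if the
 * allocation fails; on failure the buffer is left unchanged. */
int sk_reserve(struct sk_buf *b, size_t needed)
{
    unsigned char *p;
    assert(b != NULL);
    if (needed > b->limit)
        return -1;              /* the caller would have to flush instead */
    if (needed <= b->allocated)
        return 0;               /* already large enough */
    p = realloc(b->data, needed);
    if (p == NULL)
        return -1;
    b->data = p;
    b->allocated = needed;
    return 0;
}
```

A buffer created with a limit of 1024 bytes starts with no memory allocated; after sk_reserve(&b, 100) it holds 100 bytes, and a request for 2000 bytes is refused without changing the allocation.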