Strings
All text strings in MuPDF use the UTF-8 encoding.
Unicode
The following functions encode and decode UTF-8 characters, and return the
number of bytes used by the UTF-8 character (at most FZ_UTFMAX).
int fz_chartorune(int *rune, const char *str);
int fz_runetochar(char *str, int rune);
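To make the byte-count contract concrete, here is a minimal standalone sketch of a UTF-8 encoder and decoder with the same shape as the two prototypes above. The names `sketch_runetochar`, `sketch_chartorune`, `SKETCH_UTFMAX` and `SKETCH_RUNE_ERROR` are hypothetical stand-ins, the limit of 4 bytes is an assumption about FZ_UTFMAX, and the handling of malformed input (emit U+FFFD, consume one byte) is an illustrative choice rather than a description of MuPDF's behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not MuPDF's implementation. */
enum { SKETCH_UTFMAX = 4, SKETCH_RUNE_ERROR = 0xFFFD };

/* Encode one code point into str and return the number of bytes written. */
static int sketch_runetochar(char *str, int rune)
{
	unsigned int c = (unsigned int)rune;
	assert(str != NULL);
	if (c < 0x80) {
		str[0] = (char)c;
		return 1;
	}
	if (c < 0x800) {
		str[0] = (char)(0xC0 | (c >> 6));
		str[1] = (char)(0x80 | (c & 0x3F));
		return 2;
	}
	/* Out-of-range values and surrogates become the replacement character. */
	if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
		c = SKETCH_RUNE_ERROR;
	if (c < 0x10000) {
		str[0] = (char)(0xE0 | (c >> 12));
		str[1] = (char)(0x80 | ((c >> 6) & 0x3F));
		str[2] = (char)(0x80 | (c & 0x3F));
		return 3;
	}
	str[0] = (char)(0xF0 | (c >> 18));
	str[1] = (char)(0x80 | ((c >> 12) & 0x3F));
	str[2] = (char)(0x80 | ((c >> 6) & 0x3F));
	str[3] = (char)(0x80 | (c & 0x3F));
	return 4;
}

/* Decode one code point from str and return the number of bytes consumed. */
static int sketch_chartorune(int *rune, const char *str)
{
	const unsigned char *s = (const unsigned char *)str;
	unsigned int c, min;
	int len, i;
	assert(rune != NULL && str != NULL);
	c = s[0];
	if (c < 0x80) {
		*rune = (int)c;
		return 1;
	}
	if ((c & 0xE0) == 0xC0) { len = 2; c &= 0x1F; min = 0x80; }
	else if ((c & 0xF0) == 0xE0) { len = 3; c &= 0x0F; min = 0x800; }
	else if ((c & 0xF8) == 0xF0) { len = 4; c &= 0x07; min = 0x10000; }
	else { *rune = SKETCH_RUNE_ERROR; return 1; }
	for (i = 1; i < len; i++) {
		/* A NUL terminator fails this check, so we never read past it. */
		if ((s[i] & 0xC0) != 0x80) { *rune = SKETCH_RUNE_ERROR; return 1; }
		c = (c << 6) | (s[i] & 0x3F);
	}
	/* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
	if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
		*rune = SKETCH_RUNE_ERROR;
		return 1;
	}
	*rune = (int)c;
	return len;
}
```

A caller would typically reserve `SKETCH_UTFMAX` bytes for the encoder's output and advance its read pointer by the decoder's return value on each iteration.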
Locale Independent
Since many of the C string functions are locale-dependent, we also provide our
own locale-independent versions of these functions. We also have a couple of
semi-standard functions like strsep and strlcpy that we can’t rely on the
system providing. These should be pretty self-explanatory:
char *fz_strdup(fz_context *ctx, const char *s);
float fz_strtof(const char *s, char **es);
char *fz_strsep(char **stringp, const char *delim);
size_t fz_strlcpy(char *dst, const char *src, size_t n);
size_t fz_strlcat(char *dst, const char *src, size_t n);
void *fz_memmem(const void *haystack, size_t haystacklen, const void *needle, size_t needlelen);
int fz_strcasecmp(const char *a, const char *b);
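For readers who have not met the semi-standard functions before, the sketch below shows the usual BSD `strlcpy` contract and a byte-wise, ASCII-only case-insensitive comparison. It assumes that `fz_strlcpy` follows the BSD semantics and that `fz_strcasecmp` folds only ASCII letters; neither is confirmed by the text above. The `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * BSD-style bounded copy: writes at most n-1 bytes plus a terminator and
 * returns strlen(src), so a result >= n signals truncation.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t n)
{
	size_t len;
	assert(src != NULL && (dst != NULL || n == 0));
	len = strlen(src);
	if (n > 0) {
		size_t copy = len < n - 1 ? len : n - 1;
		memcpy(dst, src, copy);
		dst[copy] = '\0';
	}
	return len;
}

/* Case-insensitive compare that folds A-Z only and ignores the locale. */
static int sketch_strcasecmp(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	for (;;) {
		int ca = (unsigned char)*a++;
		int cb = (unsigned char)*b++;
		if (ca >= 'A' && ca <= 'Z') ca += 'a' - 'A';
		if (cb >= 'A' && cb <= 'Z') cb += 'a' - 'A';
		if (ca != cb) return ca - cb;
		if (ca == 0) return 0;
	}
}
```

Comparing the return value of the copy against the buffer size is the idiomatic truncation check: `if (sketch_strlcpy(buf, s, sizeof buf) >= sizeof buf)` means the source did not fit.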
There are also a couple of functions to process filenames and URLs:
char *fz_cleanname(char *path);
    Rewrite path in-place to the shortest string that names the same path. Eliminates multiple and trailing slashes, and interprets “.” and “..”.
void fz_dirname(char *dir, const char *path, size_t dir_size);
    Extract the directory component from a path.
char *fz_urldecode(char *url);
    Decode URL escapes in-place.
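In-place decoding works because the decoded form is never longer than the encoded one, so a write pointer can trail the read pointer through the same buffer. The following is a minimal sketch of that idea; `sketch_urldecode` and `sketch_hexval` are hypothetical names, and details such as leaving malformed escapes untouched are assumptions rather than documented MuPDF behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int sketch_hexval(int c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Replace each %XX escape with the byte it encodes, reusing the buffer. */
static char *sketch_urldecode(char *url)
{
	char *s = url; /* read position */
	char *p = url; /* write position, never ahead of s */
	assert(url != NULL);
	while (*s) {
		int hi = -1, lo = -1;
		if (s[0] == '%') {
			hi = sketch_hexval((unsigned char)s[1]);
			/* Only look at s[2] if s[1] was a digit, so we never pass the NUL. */
			if (hi >= 0)
				lo = sketch_hexval((unsigned char)s[2]);
		}
		if (lo >= 0) {
			*p++ = (char)((hi << 4) | lo);
			s += 3;
		} else {
			/* Ordinary byte or malformed escape: copy it through unchanged. */
			*p++ = *s++;
		}
	}
	*p = '\0';
	return url;
}
```

Because the buffer is modified, the argument must be writable storage such as a `char` array, not a string literal.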
Formatting
Our printf family handles the common printf formatting characters, with a
few minor differences. We also support several non-standard formatting
characters. The same printf syntax is used in the printf functions in the
I/O module as well.
size_t fz_vsnprintf(char *buffer, size_t space, const char *fmt, va_list args);
size_t fz_snprintf(char *buffer, size_t space, const char *fmt, ...);
char *fz_asprintf(fz_context *ctx, const char *fmt, ...);
%%, %c, %e, %f, %p, %x, %d, %u, %s
    These behave as usual, but only take padding (+, 0, space), width, and precision arguments.
%g float
    Prints the float in the shortest possible format that won’t lose precision, except NaN to 0, +Inf to FLT_MAX, -Inf to -FLT_MAX.
%M fz_matrix*
    Prints all 6 coefficients in the matrix as %g separated by spaces.
%R fz_rect*
    Prints all x0, y0, x1, y1 in the rectangle as %g separated by spaces.
%P fz_point*
    Prints x, y in the point as %g separated by spaces.
%C int
    Formats character as UTF-8. Useful to print unicode text.
%q char*
    Formats string using double quotes and C escapes.
%( char*
    Formats string using parenthesis quotes and Postscript escapes.
%n char*
    Formats string using prefix / and PDF name hex-escapes.
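The special-value rule for %g and the space-separated layout of %R can be imitated with the standard library, which may help when checking output by eye. The sketch below is only an approximation: `sketch_rect`, `sketch_printable` and `sketch_format_rect` are hypothetical names, and the standard `%g` (six significant digits) stands in for the shortest lossless format described above, so output can differ for values that need more digits.

```c
#include <float.h>
#include <math.h>
#include <stdio.h>

/* Hypothetical rectangle type with the four fields named in the text. */
typedef struct { float x0, y0, x1, y1; } sketch_rect;

/* Map non-finite values as described for %g: NaN to 0, +/-Inf to +/-FLT_MAX. */
static float sketch_printable(float f)
{
	if (isnan(f))
		return 0.0f;
	if (isinf(f))
		return f > 0 ? FLT_MAX : -FLT_MAX;
	return f;
}

/*
 * Write "x0 y0 x1 y1" into buf, similar in layout to %R.
 * Returns what snprintf returns: the length the full output needs.
 */
static int sketch_format_rect(char *buf, size_t size, const sketch_rect *r)
{
	return snprintf(buf, size, "%g %g %g %g",
		sketch_printable(r->x0), sketch_printable(r->y0),
		sketch_printable(r->x1), sketch_printable(r->y1));
}
```

Clamping non-finite values before printing keeps the output parseable as plain numbers, which matters when the text ends up in a file format with no notation for NaN or infinity.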