Design considerations for an error logging/notification library in C
I'm working on several programs right now, and have become frustrated over some of the haphazard ways I'm debugging my programs and logging errors. As such, I've decided to take a couple days to write an error library that I can use across all of my programs. I do most of my development in Windows, making extensive use of the Windows API, but the key here is flexibility: this library ideally needs to remain flexible, offering the programmer notification options in console apps and GUI apps on Windows and Unix-like environments.
My initial idea is to use one library that uses preprocessor conditional inclusion of the headers for Windows and Unix based on the current environment. For example, in a Win32 app, console error message notification (while possible) isn't necessary; instead, a simple
MessageBox(hWndParent, TEXT("Some error message that makes sense in context"), TEXT("Application Name"), MB_ICONERROR | MB_OK);
would make the most sense. On the other hand, on Linux things are a little more complex:
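GtkWidget *w; /* gtk_message_dialog_new returns a GtkWidget*, not a GtkWindow* */
w = gtk_message_dialog_new(pOwner, GTK_DIALOG_MODAL, GTK_MESSAGE_ERROR, GTK_BUTTONS_OK, "%s", "Some error message");
gtk_window_set_title(GTK_WINDOW(w), "Application Name");
gtk_dialog_run(GTK_DIALOG(w)); /* GTK 2/3: blocks until the user dismisses the dialog */
gtk_widget_destroy(w);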
On either platform, simple file logging with more information about the error (function, file, line, etc.) would also be useful in pinpointing the source and tracing the flow.
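A minimal sketch of that kind of file logging, assuming C99 for __func__. The names log_write and LOG_ERROR are made up for illustration; the file is opened in append mode and closed again on every call, which is slow but keeps each record on disk as soon as the call returns.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: appends one record to the file at `path`.
   Returns 0 on success, -1 if the file could not be opened or closed. */
static int log_write(const char *path, const char *file, int line,
                     const char *func, const char *msg)
{
    FILE *fp;

    assert(path != NULL && msg != NULL);
    fp = fopen(path, "a");          /* append, never truncate */
    if (fp == NULL)
        return -1;
    fprintf(fp, "%s:%d (%s): %s\n", file, line, func, msg);
    return fclose(fp) == 0 ? 0 : -1; /* fclose flushes the stream */
}

/* The macro expands at the call site, so __FILE__, __LINE__ and __func__
   describe the caller rather than log_write itself. */
#define LOG_ERROR(path, msg) \
    log_write((path), __FILE__, __LINE__, __func__, (msg))
```

A call such as LOG_ERROR("app.log", "could not open config") then records where the error was reported without the caller typing any of it by hand.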
Moreover, trace logging should be possible: beyond reporting errors, it should be possible to record the sequence of function calls in the program, if that level of logging is activated.
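One way this could look, as a rough sketch: entry/exit macros that print an indented call trace to stderr and compile away to nothing when tracing is switched off. LOG_TRACE_ENABLED, TRACE_ENTER, TRACE_LEAVE and the two example functions are hypothetical names. The depth counter is a plain global here, so this version is not thread-safe, and the switch is compile-time only; a runtime level check would need an extra test inside trace_enter/trace_leave.

```c
#include <assert.h>
#include <stdio.h>

#ifndef LOG_TRACE_ENABLED
#define LOG_TRACE_ENABLED 1   /* build with -DLOG_TRACE_ENABLED=0 to remove tracing */
#endif

static int trace_depth = 0;   /* current nesting level; not thread-safe */

#if LOG_TRACE_ENABLED
static void trace_enter(const char *func)
{
    fprintf(stderr, "%*s-> %s\n", trace_depth * 2, "", func);
    trace_depth++;
}

static void trace_leave(const char *func)
{
    assert(trace_depth > 0);  /* every leave must match an enter */
    trace_depth--;
    fprintf(stderr, "%*s<- %s\n", trace_depth * 2, "", func);
}
#define TRACE_ENTER() trace_enter(__func__)
#define TRACE_LEAVE() trace_leave(__func__)
#else
#define TRACE_ENTER() ((void)0)
#define TRACE_LEAVE() ((void)0)
#endif

/* Example functions, purely illustrative. */
static int parse_header(void)
{
    TRACE_ENTER();
    /* ... real work would go here ... */
    TRACE_LEAVE();
    return 0;
}

static int load_file(void)
{
    int rc;

    TRACE_ENTER();
    rc = parse_header();
    TRACE_LEAVE();
    return rc;
}
```

The obvious weakness is that every early return needs its own TRACE_LEAVE(), so this only stays accurate in code with a single exit path per function.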
Thus, my initial considerations are to have a library that incorporates all of these features, with minimal overhead by dint of preprocessor conditionals. I think it would make the most sense to break this up into a couple structs:
- The "main" struct, which is passed to every function in the main program requiring logging, and contains
  - A bitmask containing the status codes for which file logging is necessary (e.g., NOERROR | WARNING | CRITICALERROR or CRITICALERROR | CRASH)
  - A pointer to an "output" struct
  - A linked list of "error information" nodes
- The "output" struct, which handles printing to file, displaying message boxes, and printing to the console
- The "error information" struct, which contains information about the error (function, file, line, message, type, etc.)
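Roughly, the three structs might be declared like this. All type, field and function names are placeholders, the status codes carry a LOG_ prefix to reduce the chance of clashing with platform headers, and the "output" struct is modelled as a table of function pointers, which is only one of several ways to do it.

```c
#include <assert.h>
#include <stddef.h>

/* Status codes as distinct bits so they can be OR-ed into a mask. */
enum log_status {
    LOG_NOERROR       = 1 << 0,
    LOG_WARNING       = 1 << 1,
    LOG_CRITICALERROR = 1 << 2,
    LOG_CRASH         = 1 << 3
};

/* "Error information" node. */
struct log_error {
    const char *function;
    const char *file;
    int line;
    const char *message;
    enum log_status type;
    struct log_error *next;      /* next node in the linked list */
};

/* "Output" struct: one callback per destination; NULL means "not used". */
struct log_output {
    void (*to_file)(const struct log_error *e);
    void (*to_dialog)(const struct log_error *e);
    void (*to_console)(const struct log_error *e);
};

/* "Main" struct passed to every function that needs logging. */
struct log_context {
    unsigned file_mask;          /* status codes that get written to file */
    struct log_output *output;
    struct log_error *errors;    /* head of the error list */
};

/* Returns nonzero if errors of status `s` should be written to file. */
static int log_should_write(const struct log_context *ctx, enum log_status s)
{
    assert(ctx != NULL);
    return (ctx->file_mask & (unsigned)s) != 0;
}
```

With file_mask set to LOG_CRITICALERROR | LOG_CRASH, log_should_write reports warnings as filtered out and crashes as loggable.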
These are just my initial thoughts about this library. Can any of you think of any more information that I should include? Another major issue for me is atomicity of error addition: it's likely that a different thread from the one logging the error might create it, so I need to make sure that creating and adding an error node is actually an atomic operation. Thus, mutexes would likely be the way I'd go about synchronization.
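A sketch of what the mutex approach could look like, assuming POSIX threads (on Windows a CRITICAL_SECTION would play the same role). The names err_node, err_list, err_list_init and err_list_add are invented for the example. The node is allocated and filled outside the lock, so the critical section only covers the pointer updates.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct err_node {
    char message[128];
    struct err_node *next;
};

struct err_list {
    pthread_mutex_t lock;
    struct err_node *head;
    size_t count;
};

static void err_list_init(struct err_list *l)
{
    assert(l != NULL);
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
    l->count = 0;
}

/* Creates a node and pushes it onto the front of the list.
   Returns 0 on success, -1 if allocation fails. */
static int err_list_add(struct err_list *l, const char *msg)
{
    struct err_node *n;

    assert(l != NULL && msg != NULL);
    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    strncpy(n->message, msg, sizeof n->message - 1);
    n->message[sizeof n->message - 1] = '\0';   /* strncpy may not terminate */

    pthread_mutex_lock(&l->lock);               /* only the linking is locked */
    n->next = l->head;
    l->head = n;
    l->count++;
    pthread_mutex_unlock(&l->lock);
    return 0;
}
```

Anything that walks or frees the list has to take the same lock, otherwise the atomic insert does not buy much.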
Thanks for the help!
When the status is not CRITICALERROR | CRASH, so that the app is expected to continue after the logging call, it would be better to queue each log struct (on a thread-safe producer-consumer queue) to a logging thread that performs the requested action(s). The logging thread would normally free the structs after handling them. Some advantages:
1) The action taken for each logging request is reduced to mallocing the struct, loading it and pushing it onto the queue. If the disk is temporarily busy, or has high latency because it's on a network, or becomes actually unavailable, the calling thread/s will continue to run almost normally. With this, the set of apps that will fail just because the logging has been turned on is reduced. A user with some intermittent problem that you cannot reproduce can be instructed to turn on the logger with little chance that the logging will affect normal operations or, worse, introduce delays that cover up the bug.
2) For the same reasons as (1), adding/changing logger functionality, even at runtime, is much easier. For example, maybe you want to restrict the size of log files or start a new date/timestamped log file every day. A 'normal' call would introduce a long delay into the calling thread while the old file was closed and the new one opened. If the logging is queued off, all you get is a temporary increase in the number of queued log structs.
3) Controlling the logging is easier. In a GUI app, the logger could have its own form where the logging options can be modified. You could have a 'New log file now' button which, when clicked, queues a 'LOGCONTROL' struct to the logging thread, along with all the other logging messages. When the thread gets it, it opens a new log file.
4) Forwarding the log messages is fairly easy. Maybe you want to watch the logged messages as well as write them to disk: queue up a 'LOGCONTROL' struct that instructs the thread to save a function pointer passed in the struct and henceforth call this function with subsequent logging messages after writing them to disk. The function passed could queue up the messages to your GUI for display in a 'terminal' type window (PostMessage on Windows; Qt etc. have similar functionality to allow data to be passed to the GUI). Sure, on *nix, you could open a console window and 'tail -f' the log file, but this will not appear particularly elegant to a GUI user, is more difficult for users to manage and is anyway not available as standard on Windows (how many users know how to copy and paste from a console window and email you the error message?).
Another possibility is that the logging thread might be instructed to stream the log text to a remote server - another 'LOGCONTROL' struct could pass the hostname/port to the logger thread. The temporary delays of opening the network connection to the server would not matter because of the queued communications.
5) 'Lazy writing' and other such performance enhancements become easier, but:
Disadvantages:
1) The main one is that when the log call returns to the requestor, the log operation has probably not yet happened. This is very bad news in the case of CRITICALERROR | CRASH, and can be unacceptable in some cases even with 'ordinary' logging of progress messages etc. There should be an option to bypass the queue in these cases and make a direct disk write/flush: fopen/CreateFile a separate 'Direct.log', append, write, flush, close. Slow, but secure, just in case the app explodes after the log call returns.
2) More complex, so more development, more conditionals, and a larger API and header to include.
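To make the queued approach concrete, here is a rough sketch assuming POSIX threads; a Windows build would swap in the equivalent Win32 primitives. Every name in it (log_item, logger, logger_post and so on) is hypothetical, and the 'LOGCONTROL' idea is reduced to two control kinds: open a new file, and stop. The caller only mallocs, copies and pushes; all file I/O happens on the logging thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum log_kind { LOG_MESSAGE, LOG_CONTROL_NEWFILE, LOG_CONTROL_STOP };

struct log_item {
    enum log_kind kind;
    char text[128];              /* message, or file name for NEWFILE */
    struct log_item *next;
};

struct logger {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    struct log_item *head, *tail;
    FILE *fp;                    /* used only by the logging thread once started */
    pthread_t thread;
};

/* Producer side: allocate, fill, push. No disk I/O happens here. */
static int logger_post(struct logger *lg, enum log_kind kind, const char *text)
{
    struct log_item *it;

    assert(lg != NULL && text != NULL);
    it = malloc(sizeof *it);
    if (it == NULL)
        return -1;
    it->kind = kind;
    strncpy(it->text, text, sizeof it->text - 1);
    it->text[sizeof it->text - 1] = '\0';
    it->next = NULL;

    pthread_mutex_lock(&lg->lock);
    if (lg->tail != NULL) lg->tail->next = it; else lg->head = it;
    lg->tail = it;
    pthread_cond_signal(&lg->nonempty);
    pthread_mutex_unlock(&lg->lock);
    return 0;
}

/* Consumer side: pops items in FIFO order, acts on them, frees them. */
static void *logger_main(void *arg)
{
    struct logger *lg = arg;

    for (;;) {
        struct log_item *it;
        enum log_kind kind;

        pthread_mutex_lock(&lg->lock);
        while (lg->head == NULL)
            pthread_cond_wait(&lg->nonempty, &lg->lock);
        it = lg->head;
        lg->head = it->next;
        if (lg->head == NULL) lg->tail = NULL;
        pthread_mutex_unlock(&lg->lock);

        kind = it->kind;
        if (kind == LOG_MESSAGE && lg->fp != NULL) {
            fprintf(lg->fp, "%s\n", it->text);
        } else if (kind == LOG_CONTROL_NEWFILE) {
            if (lg->fp != NULL) fclose(lg->fp);
            lg->fp = fopen(it->text, "a");
        }
        free(it);
        if (kind == LOG_CONTROL_STOP)
            break;
    }
    if (lg->fp != NULL) { fclose(lg->fp); lg->fp = NULL; }
    return NULL;
}

static int logger_start(struct logger *lg, const char *path)
{
    assert(lg != NULL && path != NULL);
    pthread_mutex_init(&lg->lock, NULL);
    pthread_cond_init(&lg->nonempty, NULL);
    lg->head = lg->tail = NULL;
    lg->fp = fopen(path, "a");
    if (lg->fp == NULL)
        return -1;
    if (pthread_create(&lg->thread, NULL, logger_main, lg) != 0) {
        fclose(lg->fp);
        lg->fp = NULL;
        return -1;
    }
    return 0;
}

/* Queues a stop request behind all pending items, then waits for the thread. */
static int logger_stop(struct logger *lg)
{
    if (logger_post(lg, LOG_CONTROL_STOP, "") != 0)
        return -1;
    return pthread_join(lg->thread, NULL) == 0 ? 0 : -1;
}
```

Because the stop request travels through the same queue, everything posted before it is written out before the thread exits. The direct-write bypass for CRITICALERROR | CRASH is deliberately left out of this sketch.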
Rgds, Martin
Hi, I use this for another language, but you could research it and follow its design: http://www.gurock.com/smartinspect/
regards