Anyone who's programmed long-running processes, like servers or daemons, knows the importance of logging. Assert statements may be passé, but log statements proliferate in these code bases.
Log calls are so prevalent that they get a pass from one of the most important conventions in OO programming: static variables should be avoided. Most logging frameworks allow users to create static log variables, one per process or one per code file, to ease usage. While this might make logging simple to begin with, I'm starting to believe it's a bad idea that should be abandoned in favor of more laborious log creation.
Here's a fake example of how logging is typically done:
from superframework.logging import create_log

LOG = create_log(__name__)

class ThingThatDoesNotWork:
    def do_something(self):
        LOG.write_line("I am totally doing something right now!")
To log, we first create a single static reference that is used everywhere else in the code file. Information about how this object is created, and what formatting and customization applies to the logs, lives in some central location such as a properties file. Typical log formats show things like the time and place the log statement was made next to the message. It looks a bit like this:
2012-09-21 12:30:00 [things.py, 7] I am totally doing something right now!
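For comparison, the made-up superframework maps directly onto real frameworks. Here's a minimal sketch of the same pattern using Python's standard logging module, with a format string that approximates the output above:

import logging

# Central, global configuration -- the "properties file" equivalent.
logging.basicConfig(
    format="%(asctime)s [%(filename)s, %(lineno)d] %(message)s",
    level=logging.INFO,
)

# The static reference, one per code file.
LOG = logging.getLogger(__name__)

class ThingThatDoesNotWork:
    def do_something(self):
        LOG.info("I am totally doing something right now!")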
Other than these properties, which are usually global, the only way to affect the information sent to the logs is the argument to write_line. But what if you wanted some piece of information to appear often enough that putting it into every write_line argument was a hassle, yet you didn't want it to show up for every call to write_line in the entire application?
For example, what if 10 or more “ThingThatDoesNotWork” objects are created? It would be nice to give them an ID and show this next to each log message to keep things clear.
from superframework.logging import create_log
from myframework.util import generate_id

LOG = create_log(__name__)

class ThingThatDoesNotWork:
    def __init__(self):
        self.id = generate_id()

    def do_something(self):
        LOG.write_line("%s I am totally doing something right now!" % self.id)
The only problem is that this is now a convention, which means programmers are free to ignore it. If you create a log method on ThingThatDoesNotWork that handles all of its logging, you can have it always spit out the ID, which is a bit better:
...

class ThingThatDoesNotWork:
    ...

    def do_something(self):
        self.log("I am totally doing something right now!")
        if not complex_function():
            LOG.write_line("Complex function failure!")

    def log(self, message):
        LOG.write_line("%s %s" % (self.id, message))
The problem with this, aside from having to write a new logging method on each suitable class, is that the log lacks context from outside the ThingThatDoesNotWork class. For example, what created the ThingThatDoesNotWork? You could log when a new instance is created, and from where, by logging the “id” field of each new ThingThatDoesNotWork instance. But then you'd have to find the ID in a line logged by ThingThatDoesNotWork and reverse-search the logs to find where that same ID was generated. On top of taking a while, all of this would still be by convention, meaning someday someone would forget to log the ID of a new instance when they created it, hiding the source of trouble.
2012-09-21 12:00:00 [server.py, 10] Starting server...
... stuff ...
2012-09-21 12:28:00 [dispatch.py, 15] Received a new request: POST something
2012-09-21 12:28:57 [dispatcher.py, 75] Created a ThingThatDoesNotWork, id=1
2012-09-21 12:28:59 [file_you_do_not_care_about.py, 32] TODO: Is this code even executing?
2012-09-21 12:29:09 [things.py, 7] 1 I am totally doing something right now!
2012-09-21 12:29:10 [file_you_do_not_care_about.py, 53] Blah blah blah blah
...
2012-09-21 12:29:43 [file_you_do_not_care_about.py, 83] Value of a local variable in function that works perfectly="abcdef"
2012-09-21 12:29:45 [dispatcher.py, 75] Created a ThingThatDoesNotWork, id=57
2012-09-21 12:29:59 [file_you_do_not_care_about.py, 33] Returning my VALUE of <Person, age=4, name=Pedro>
...
2012-09-21 12:29:59 [file_you_do_not_care_about.py, 65] Swallowing exception because I'm pretty sure this function always works. Message: Memory buffer explosion, please terminate app.
2012-09-21 12:30:00 [things.py, 7] 57 I am totally doing something right now!
2012-09-21 12:30:01 [things.py, 7] Complex function failure!
As seen here, figuring out where instance 57 came from demands you scroll up in the log past mountains of garbage. Additionally, when “complex function” fails, it's accidentally logged with the static LOG variable, so you miss the chance to know the ID anyway.
Most “best practice” logging frameworks provide quick and dirty techniques to get up and running, which use singleton patterns of some kind or another. This approach works and can be jammed into any program quickly, but despite the fact that everyone does it, I don't believe it's the best possible fit for every program. Like all aspects of your code, it's worthwhile to consider creating an interface that makes the most sense for the given problem domain.
Once I wrote an app that had to log long-running jobs. A job might call a method 99 times in a loop without incident before something went hideously wrong on the 100th call. I didn't care about the other 99 calls, but when that 100th call really screwed things up, I wanted a way to quickly see any related information while shutting out the unrelated noise. This meant I wanted to log absolutely everything but somehow hide the irrelevant stuff after the fact.
My idea was that each log object needed to store the relationship to its parent. This way, I could have a nested log, one I could use to create a tree view and drill down into errors. I made an implementation that produced a flat log file as well, but I really preferred that tree view whenever I could have it and came up with an interface like this:
class Log:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def write_line(self, message):
        """Write a message to the log for this task."""
        ...

    @property
    def factory(self):
        def create(name):
            return Log(name, self)
        return create
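To see why the parent reference matters, here's a hypothetical helper (not part of the interface above) that walks the chain to recover the full context of any log; a tree view is essentially this, rendered:

def full_context(log):
    """Walk the parent chain to build a path describing where a log came from."""
    parts = []
    while log is not None:
        parts.append(log.name)
        log = log.parent
    return " > ".join(reversed(parts))

# e.g. full_context(thing_log) might yield "server > request-42 > thing-57"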
The way it works is that at the start of a job, a new Log instance is created. As the job calls methods, it either passes its Log instance in or makes a new child via the factory.
from myframework.util import generate_id

class ThingThatDoesNotWork:
    def __init__(self, log_creator):
        self.id = generate_id()
        self.log = log_creator(self.id)

    def do_something(self):
        self.log.write_line("I am totally doing something right now!")
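Wiring it together might look like this (the job name here is invented for illustration):

# A hypothetical job entry point using the interface above.
job_log = Log("job-42")
job_log.write_line("Starting the job...")

# The object receives a factory, not a global; its own log records its
# id and remembers it was created under job-42's log.
thing = ThingThatDoesNotWork(job_log.factory)
thing.do_something()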
In a server application, this might mean each request gets a new log, with an ID all its own, and as it calls long-running methods, they too have identification information they set in their log objects.
The important thing is that the chain of who created what, and where, is preserved by the parent reference in each Log instance. Actually, you don't have to store the parent reference at all; what's important is that the creation function returned by the factory property has some knowledge of where the log came from embedded in it.
In fact, using this interface I once created an implementation that didn't store the parent at all. Instead it just kept a number, which started at zero for the first Log instance and was incremented by one for each child instance. This was used to indent the logs to make the stack-like nature clear.
In this example, let's say we store such a count, as well as a string used to identify the log. The result looks like this:
...

class Log:
    def __init__(self, name, indent=0):
        self.name = name
        self.indent = indent

    def write_line(self, message):
        """Write a message to the log for this task."""
        # A real implementation would add timestamps and write to a
        # file; print stands in for the underlying writer here.
        print("%s [%s] %s"
              % ("|---" * self.indent, self.name, message))

    @property
    def factory(self):
        def create(name):
            return Log(name, self.indent + 1)
        return create

    def close(self):
        """Close this log somehow..."""
        ...
LOG = Log("server")

def server_loop():
    LOG.write_line("Starting server...")
    while True:
        request = next_request()
        method_dispatcher(request, LOG.factory)
        ...

def method_dispatcher(request, create_log):
    log = create_log("Request %s" % request.id)
    log.write_line("Received a new request: %s %s" % (request.method, request.path))
    if request.method == "POST":
        some_other_method(log, request.path)
        ...
...
def some_other_method(log, path):
    if path == "something":
        thing = ThingThatDoesNotWork(log.factory)
        thing.do_something()
The log file of this trite example would look like this:
2012-09-21 12:00:00 Starting server... [server.py, 10]
2012-09-21 12:28:00 |--- [Request 01b7d394-eecc-4306-a628-0401676d85c1] Received a new request: POST something [dispatch.py, 15]
2012-09-21 12:30:00 |---|--- [Request 01b7d394-eecc-4306-a628-0401676d85c1][Thing] I am totally doing something right now! [things.py, 7]
Forcing code to set up a log instance at important points means it naturally registers important events in a hierarchical fashion, which is something you don't get by allowing code to get by with a lazy call to a global object or function. This opens the door to amazing possibilities later.
For example, what if you decide to log each request in its own file (what decadence! What luxury!)? With this interface, writing such an implementation is child's play.
The Log implementation used by server_loop would itself log to a normal file, but the creation function returned by its factory property would start a new log file, so every request would get its own file. You could then drop the request-identifying prefixes from each line, since the file name itself identifies the request, making it easier to read.
Let's say the name of the file was “Request_01b7d394-eecc-4306-a628-0401676d85c1”:
2012-09-21 12:28:00 Received a new request: POST something [dispatch.py, 15]
2012-09-21 12:29:50 Thing:57:
2012-09-21 12:30:00 |--- I am totally doing something right now! [things.py, 7]
Notice how the implementation was changed: when ThingThatDoesNotWork calls create_log, the argument given to create_log is itself logged (the “Thing:57:” line above), and the new log object writes to the same file while indenting all log statements from that point on. Unlike the earlier example, where the ID was on every applicable line, here it appears just once, before the indented log statements. This makes sense because, again, the log file is all from a single thread of execution.
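Here's a minimal sketch of one way that could work; RequestFileLog and its file-naming scheme are invented for illustration, not the original implementation:

class RequestFileLog:
    def __init__(self, name, out=None, indent=0):
        self.name = name
        self.indent = indent
        # The root log for a request opens its own file, named after
        # the request; child logs share the same file handle.
        self.out = out if out is not None else open("%s.log" % name, "a")

    def write_line(self, message):
        self.out.write("%s%s\n" % ("|---" * self.indent, message))

    @property
    def factory(self):
        def create(name):
            # Record the argument given to create (the "Thing:57:" line),
            # then indent everything the child writes from that point on.
            self.write_line("%s:" % name)
            return RequestFileLog(name, out=self.out, indent=self.indent + 1)
        return create

The server's root log would keep writing to the main file as before; only its factory would return a RequestFileLog, so each request starts a fresh file.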
I don't think this is the perfect logging interface. In many cases, creating a new log object for everything puts too much of a burden on code clarity or execution time. However, there are definite benefits, and the trade-offs are worth considering.
However you accomplish logging, claim ownership of the interfaces and patterns so that you can tailor them to your app.