Monday, May 28, 2018

What do I know about FS?

In an effort to keep up with technological change and security, I host a small website and maintain it myself. What that means depends on many factors, and explaining them all to everyone on the Internet is probably not to my advantage. Still, it seems that every year or so I run into a file system (FS) problem. Has this ever happened to you? For me it is always irritating, but it does force me to learn at least a little more about how file systems have developed over the years.

The first issue I had was with ext2. I wrote over a partition with fdisk that I didn't intend to touch. It happened after a conversation one night in which a person had particularly annoyed me with some kind of character assassination. I don't remember what it was about, but I learned that night never to operate on my file systems when annoyed or irritated, because if you make a mistake you will be even more irritated.

The second issue, with ext3, was likely caused by a power failure, and it was resolved easily enough with an 'fsck -y' session. Then of course there were the interesting issues of working with read-only NTFS file systems on Linux, and of running the dd command to back up a bunch of OS disks which I didn't intend to keep on platter.

A third issue involved a time I actually dropped a running hard drive and had to take it into a lab to have some important media files recovered. It was about this time I decided having multiple copies of certain files was more important than I ever thought possible, so I bought a larger disk array and started sync'ing disks every once in a while.

A fourth experience may have been an fsck issue with ext4... but when I ended up needing to use xfs_repair on an XFS partition one day, I was dumbfounded as to how I had even installed XFS in the first place. I didn't remember choosing it explicitly. But you know, you're sitting there in front of a machine that won't boot and you say to yourself, I guess I am going to learn a new thing today... 'cause this one's new.

Then there are all those people you know whose computers die in a virus fire, or from a disk issue that no one ever seems able to help them with (on their budget)... so you step in. The last one I did this for turned out great, since I had just migrated everything I had to a virtual Linux environment. It turned out this person was using a Mac, and no Mac professional ever does anything for free, right? I was able not only to use ddrescue to recover the partition from a faulty hard drive (that had likely been banged around too much...), but also to virtualize the disk and have Mac OS running on my Linux VM hosting cluster hardware. Sometimes helping others helps you.

Thinking about it now, with CIFS and NFS and all the potential disk/network issues over the years, there has been a sordid history there too. I actually wonder why I haven't made more use of network file systems and even cloud file systems, except that I don't seem to trust the network a whole lot. Anyway, that leads to today's issue with ZFS... Yep, somehow I skipped Y and went straight to Z (Jinx!). In today's ZFS issue I had to introduce some delay into booting the kernel, so that a ZFS cache could be read properly... Oh wow, and I am not even an FS geek; this is just what happens when you try to make sure your skills have some experience attached to them. Wow, perhaps it's just better not knowing... ?

Friday, February 1, 2013

Deterministic Error Handling Win

I love this article about the different Java Exception types:

  http://johnpwood.net/2008/04/21/java-checked-exceptions-vs-runtime-exceptions/
 
Something that I realized some time ago (in afterthought) is that if you provide enough application functionality to be considered an API, using a RuntimeException master class can really help in debugging what occurs in each API call. You don't need all the messy throws statements cluttering up your methods if their whole purpose is to return an error state. In an API it helps to know just what happened in a call deep in the hierarchy, and to be able to trace the error back. An ErrorCode-type RuntimeException can pass back additional information about the error, and even the unexpected exception itself for cross analysis, at an API exit point. Logging all this information at a single API exit point also gives you one specific location where you can hand the error off to the caller and provide a debugging snippet that describes just what occurred.

In a high-performance application where errors are not an option, knowing what happened when an error does occur is very important. Instant visibility into unexpected cases matters: pre-determined error states are simple to log and record, while unexpected error states should come with copious debugging information so you can determine exactly what happened. One piece of advice I have for everyone is: do not underestimate the importance of debugging and logging. If your application needs to know each error state, then even the unexplained errors can be logged well enough to determine application state, performance, failure, and so on.
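Here is a minimal sketch of that idea in Java, just to make it concrete. Everything in it is hypothetical and made up for illustration: the names ApiErrorDemo, ErrorCode, ApiException, parseQuantity and placeOrder, and the in-memory LOG list that stands in for a real logger. The deep method carries no throws clause, and the single exit point turns both the expected and the unexpected failure into an error code plus a log line.

```java
import java.util.ArrayList;
import java.util.List;

class ApiErrorDemo {
    // Hypothetical error states the API can hand back to a caller.
    enum ErrorCode { OK, INVALID_INPUT, UNEXPECTED }

    // Unchecked "master" exception: carries an error code and, optionally,
    // the original unexpected exception as its cause.
    static class ApiException extends RuntimeException {
        private final ErrorCode code;

        ApiException(ErrorCode code, String message, Throwable cause) {
            super(message, cause);
            this.code = code;
        }

        ErrorCode getCode() {
            return code;
        }
    }

    // Stand-in for a real logger (assumption: a real API would use one).
    static final List<String> LOG = new ArrayList<>();

    // A call deep in the hierarchy: no throws clause needed.
    static int parseQuantity(String raw) {
        if (raw == null || raw.isEmpty()) {
            // Pre-determined error state.
            throw new ApiException(ErrorCode.INVALID_INPUT, "quantity is missing", null);
        }
        // May throw NumberFormatException, which nothing here anticipates.
        return Integer.parseInt(raw);
    }

    // The single API exit point: every failure is logged and converted here.
    static ErrorCode placeOrder(String rawQuantity) {
        try {
            int quantity = parseQuantity(rawQuantity);
            LOG.add("OK quantity=" + quantity);
            return ErrorCode.OK;
        } catch (ApiException e) {
            // Expected error state: simple to log and record.
            LOG.add(e.getCode() + ": " + e.getMessage());
            return e.getCode();
        } catch (RuntimeException e) {
            // Unexpected error state: wrap it and keep the original for analysis.
            ApiException wrapped =
                new ApiException(ErrorCode.UNEXPECTED, "unexpected failure in placeOrder", e);
            LOG.add(wrapped.getCode() + ": " + wrapped.getMessage() + " caused by " + e);
            return wrapped.getCode();
        }
    }

    public static void main(String[] args) {
        System.out.println(placeOrder("3"));
        System.out.println(placeOrder(""));
        System.out.println(placeOrder("abc"));
        LOG.forEach(System.out::println);
    }
}
```

In a real API the exit point would probably also log the full stack trace of the cause; the sketch keeps only the exception's string form to stay short.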

Saturday, November 1, 2008

A First Entry

You know, there are so many great blogs out there that I am content to read everyone else's and just be the eternal learner. I am happy so many people have so much to say that is not just reiterated BS (at least to the observant eye).