This article was originally written on July 19th, 2010, but never published.
Documentation is another topic where there appears to be disagreement in the sysadmin world. When to document, what to document, who to document for, and where to store that documentation always seem to be subjects of contention. Everyone likes documentation, but no one has the time to document, and the rules for documentation often feel arbitrary. I’d like to open this up for discussion and figure out some baselines.
Should I Document?
If you have to ask then probably; but it’s much more complex than that. Documentation is time-consuming and rarely of value at first, so few want to invest the effort into it unless it’s needed. There are several questions here that need to be answered:
- Why should I Document? What is the purpose of the documentation? Are you documenting a one-off process that you’ll have to do 10 months from now? Are you providing instructions for non-technical users? Perhaps you’re defining procedures for your team to follow. Whatever the reason, focus on it, and state it up front. There are few things worse than reading pages of documentation only to find out that it’s useless. Documentation for the sake of documentation is a waste of time.
- What should I Document? It’s very easy to ramble when writing documentation (as many of my articles prove). Step back and review what you’ve written, then remove any unneeded content. Find your focus and document only what needs to be explained; leave the rest for footnotes and hyperlinks.
- When should I Document? As soon as possible. Ideally you’d document as you worked, creating a perfect step-by-step record. Realistically, pressure to move quickly causes procrastination, but the truth of the matter is that the longer you wait, the less detail you’ll remember. Write down copious notes as you go, and massage them into a coherent plan after the fact.
- Who should I Document for? Write for your audience- a non-technical customer requires a much lighter touch than a seasoned techie. The boss may need things simplified that a coworker would instinctively understand. Pick your target audience and stick to it. Anything that falls outside of the audience’s interests should be flagged as “[Group B] should take note that…” Also remember that the person who requests the documentation may not be the target audience.
- Where should I Document? Where you keep documentation is often more important than the quality of your document. You can write the most compelling documentation in the company, but if it’s stored in a PowerPoint slide on a shared drive, it’s of no use to someone searching a corporate wiki. Whatever your documentation repository may be (Alfresco, SharePoint, Confluence, or even MediaWiki), everyone has to agree on a definitive source. The format should be searchable, track revisions, prevent unwanted access, and be inter-linkable.
Now that we’ve set some boundaries, let’s delve a little bit deeper into the types of documentation.
Types of Documentation
Documentation can take many forms. Over the course of any given day, you’ll see proposals, overviews, tutorials, standards, even in-depth topical arguments.
Each type of documentation has its own rules and conventions- what’s required for one set may not be needed for another. That said, here are a few general rules to follow.
- Be Concise –
- NO: thoughtfully contemplate the reduction of flowery adjectives and adverbs for clarification;
- YES: remove unneeded words. Over-explaining will confuse the reader.
- Be Clear – Make sure your subject is obvious in each sentence. Ambiguity will destroy reader comprehension.
- Be Accurate – Incorrect documentation is worse than no documentation.
- Keep it Bite-sized – Large chunks of data are hard to process, so keep the content in small, digestible chunks that can be processed one at a time.
- Stay Focused – Keep a TODO list. Whenever you think of an improvement, make a note of it and move on.
- Refactor – The original structure may not make sense after a few revisions, so don’t be afraid to reorganize.
- Edit for Content – Make sure your topics are factually correct and the content flows properly.
- Edit for Grammar – Make sure your punctuation is correct and your structure is technically sound.
- Edit for Language – Make sure the text is actually interesting to read.
- Link to Further Information – If someone else has explained it well, link to it rather than rewrite it.
- Get Feedback – Feedback finds mistakes and adds value. The more trusted sources, the better off you are.
Proposals can be immensely rewarding (or mind-numbingly frustrating), depending on whether they’re accepted. That’s not to say you shouldn’t write them; even a failed proposal has value. The point of a proposal is to communicate an idea, a way to tell your team or supervisor “this is what I think we should do.” If you’re successful, the idea will be implemented. If you’re unsuccessful, you may find out a better way to do it. The overall goal should be to improve team performance. Here’s what a proposal should include:
- The Problem – What problem are you trying to solve? Why is it a problem?
- The Solution – A simple overview of the solution.
- The Benefits – What benefits will it provide?
- The Implementation – How to implement it.
- The Results – Explain the intended results.
- The Flaws – What issues are expected, and whether a solution currently exists.
- The Timeframe – When should this project be started and completed? How long and how much effort will it take?
Let’s presume you write a knockout proposal. Everything is perfect, and with 2 days of effort you’ll reduce a 2-hour daily task to a 15-minute weekly task. Regardless of the benefits, the response will be one of these:
- Complete Apathy – the worst response, because it shows how little you are valued. No response, approval, or denial. If this happens, run your idea past an uninvested third party. Perhaps a critical set of eyes may reveal the problem.
- Denied – Perhaps the benefit isn’t worth the cost, the risk is too high, there aren’t enough resources, or there’s some other issue you didn’t address. Try to get specific reasoning as to why it won’t work, and rework your proposal taking that into account.
- Feigned Interest, no Support – Be it plausible deniability or lack of interest, the response is weak. Push for a yes or no answer, and ask what the concerns are.
- Delay – It’s a good idea, but not right now. There might be hesitance due to a minor issue. Find a way to calm their fears, then push for an implementation date and create a checklist of conditions that need to be met.
- Conditional Agreement – It is a good idea, but conditions must be met first. Create a checklist and verify that it’s complete.
- Full Agreement – This should be your end goal. Full agreement means support from the boss and the team on implementation. Without support, your efforts may be wasted.
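To put numbers on the knockout proposal above, here is a rough back-of-the-envelope sketch in Python. The 8-hour work day and 5-day work week are my assumptions (the proposal only says “2 days of effort” and “daily”), and the function name is made up for illustration.

```python
def payback_weeks(effort_hours, old_hours_per_week, new_hours_per_week):
    """Weeks until the time invested is repaid by the weekly time saved."""
    saved_per_week = old_hours_per_week - new_hours_per_week
    if saved_per_week <= 0:
        raise ValueError("the proposal does not save any time")
    return effort_hours / saved_per_week


# Assumptions (not stated in the article): 8-hour days, 5-day weeks.
HOURS_PER_DAY = 8
DAYS_PER_WEEK = 5

effort = 2 * HOURS_PER_DAY      # 2 days of effort
old_cost = 2 * DAYS_PER_WEEK    # a 2-hour task, every work day
new_cost = 0.25                 # a 15-minute task, once a week

weeks = payback_weeks(effort, old_cost, new_cost)
print(f"Saves {old_cost - new_cost} hours a week; pays for itself in {weeks:.1f} weeks")
```

Under those assumptions the work pays for itself in under two weeks, which is the kind of figure worth putting in the Benefits and Timeframe sections.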
You can’t account for everything in your proposal, so make sure not to paint yourself into a corner. A method for dealing with problems is more valuable than individual solutions. It doesn’t need to be perfect, but does need to be flexible.
The most important thing a proposal needs is buy-in. If your team and management aren’t behind an idea, implementation will be a struggle. The final thing to keep in mind is that not all proposals are good. If there is universal apathy for your idea, it might just be bad and you’re oblivious to it.
Introductions and Overviews
Introductions are the first exposure someone may have to whatever you’ve been working on, be it a JBoss implementation, Apache configuration, or new software package. A clear understanding of what “it” is can help with acceptance. A bad introduction can taint the experience and prevent adoption. So, how can you ensure a good introduction to a technology?
- Explain the Purpose – Why is the user reading this introduction? A new Authentication system? Messaging system? Explain why the reader should care.
- Define your Terms – Include a glossary of any new terms that the user needs to understand. Remember, this may be their first exposure to the topic. Don’t overwhelm them, but at the same time don’t leave them in the dark.
- Don’t Drown in Detail – An introduction should not cover everything in perfect detail, but it should give the reader references to follow up on.
The tone should be conversational- you need to draw the reader in, befriend them, and convince them that this new thing is not scary. This can be a tough task if the subject is replacing something that the reader if
Document a Process (Installation, Upgrade, Tutorials, How-to, Walk Through)
Documenting a process serves three purposes- it trains new employees in proper technique, ensures consistency, and covers your rear should something go wrong. That last point may sound a bit cynical, but you never know when you’ll need it. The process itself should be clear enough that any qualified user can follow it. Process documentation should have the following traits:
- Steps – Well defined tasks that need to be performed.
- Subtasks – Any moderately complex task should be divided up.
- Document Common Problems – Surprises can derail a new user. Acknowledgement and fixes for issues can help ease new users into the process.
Dry runs are essential in documenting a process- test the process yourself and have others test it as well. Continual runs will expose flaws and allow you to address deficiencies. Keep testing and refining the process until a sample user can follow it without issue.
Topical guide (Feature-based)
Topical guides are both the most useful and the hardest documentation to write. They need to be thorough, fully covering the material without burying the user in frivolous details. So what should you cover in a topical guide?
- Be specific on the topic – Document a feature and all related material. If it’s not related, don’t include it.
- Cover Relevant Tangents –
- Be comprehensive – Cover everything a user needs to know, but remember it’s not intended to be a reference book.
Document a Standard (How Something Should be Done)
Inconsistency is the bane of system administration, and consistency can only be had when everyone is in agreement on how things should be done. There must be agreement not only on theory, but also in practice. As such, standards should be documented. What should a standard entail?
- Dynamic – Not the first word when you think of standards, but something you have to face; your standard will become out of date quickly. Document it and give it a revision number. Soon enough you’ll realize
- Audit – It’s not enough to document a standard, you also need to enforce it. Periodic verification can spot issues before they become problems. If configuration files are identical, md5sums can be used to find inconsistencies.
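The checksum audit mentioned above could be sketched roughly like this in Python. This is only an illustration, assuming you have already collected copies of the same config file from each server into local paths; the function names are mine, and the paths in the usage note are hypothetical.

```python
import hashlib
from collections import defaultdict


def file_md5(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_checksum(paths):
    """Map each checksum to the list of files that share it."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_md5(path)].append(str(path))
    return dict(groups)


def find_outliers(paths):
    """Return the files whose checksum differs from the most common one."""
    groups = group_by_checksum(paths)
    if len(groups) <= 1:
        return []  # everything matches (or there was nothing to compare)
    majority = max(groups, key=lambda checksum: len(groups[checksum]))
    return sorted(
        name
        for checksum, names in groups.items()
        if checksum != majority
        for name in names
    )
```

Calling `find_outliers(["web1/httpd.conf", "web2/httpd.conf", "web3/httpd.conf"])` would return whichever copy disagrees with the majority. It only tells you that a file differs, not which version is right; that part still needs a human and the documented standard.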
Annotation (Config Commenting)
One of the most common types of documentation is never published, yet often the most crucial in day-to-day operations. Comments within configuration files can explain what steps were taken and why.
- Explain Why – When you make changes, explain why you made the change.
- Keep it Simple – Comments should not overshadow the configuration. Leave over-documentation to sample configs.
- Consider Versioning – The best configuration documentation is a history of changes. Configurations that are both critical and fluid (for example, Bind zone files) are perfect candidates for versioning.
- Sign and Date Changes – When you make a change, leave your name and a datestamp. While versioning comments may be more permanent, inline comments provide instant context. This is important when the change is revisited and no one remembers making it.
We recently ran across a problem in production that we could not replicate in lower environments. Since this is not only a high-use application, but an exceptionally “chatty” app, searching the logs was an exercise in futility (*one* of yesterday’s production logs was 6,975,291 lines long, with multiple logfiles per app, multiple apps and multiple servers).
So how do you find a needle in the haystack? Get a smaller haystack. In the quickest window possible, perform the following three steps:
- tail -f log1 log2 log3 log4 > combined.log
- reproduce error
- ctrl-c tail process as quickly as possible
Doing so reduced our 11,480,799 lines (with 780,527 errors) to 1200 lines.
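If you’d rather script it, the same “smaller haystack” idea can be sketched in Python. Note this is a variation rather than the tail approach itself: it records how big each log is before you reproduce the error, then copies only what was appended afterwards. The function names are made up, and it assumes the logs aren’t rotated during the window.

```python
import os


def mark(log_paths):
    """Record the current size of each log so we know where 'now' is."""
    return {path: os.path.getsize(path) for path in log_paths}


def collect_new_lines(marks, output_path):
    """Copy everything appended since mark() into one combined file.

    Returns the number of lines written.
    """
    count = 0
    with open(output_path, "wb") as combined:
        for path, offset in marks.items():
            with open(path, "rb") as log:
                log.seek(offset)  # skip everything that was already there
                for line in log:
                    combined.write(line)
                    count += 1
    return count
```

Call `marks = mark(["log1", "log2", "log3", "log4"])`, reproduce the error, then call `collect_new_lines(marks, "combined.log")`. Unlike tail, the output is grouped by file rather than interleaved as lines arrive, so rely on the timestamps in the log lines if ordering matters.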
My employer is currently looking for a sysadmin. If you’re interested, contact me for details.
SR SYSTEMS ENGINEER ROLE IN FARMINGTON HILLS, MI
We are looking for someone who will administer web hosting Linux systems infrastructure, including server hardware, operating system, enabling software, and application software/data for Internet-facing application systems. Direct other departments’ work on dependent systems such as network, firewall, load balancer, and external storage systems. Provide consultative expertise for our businesses to provide technical guidance, standards, knowledge and understanding of business and technology processes, and integration of technologies to deliver Internet-facing learning products and services.
The position will be responsible for systems configuration, implementation, administration, maintenance, and support, along with application integration, and troubleshooting for our eLearning systems.
The role encompasses daily operational systems support in development, QA, and production tiers. It also encompasses project work with business units, developers, test labs, end users and other groups involved in the planning, development, integration, testing, and problem solving for applications, content, and data.
- Ensure maximum uptime of hosted environments, including production, staging, testing, authoring, and development environments. This includes, but is not limited to ensuring the HW is configured properly; is secure; is networked properly; is backed up per company standards; is monitored accordingly; is tested to ensure operability; and is built to company standards.
- Act as a consultative resource for our businesses to provide technical guidance, standards, knowledge and understanding of business and technology processes, and integration of technologies for content management and delivery.
- Assist with integration efforts, including planning and coding where necessary in Apache, Tomcat Java, and MySQL database technologies, and scripting languages.
- Assume lead role in complex problem solving in hosted environments, offering meaningful solutions and implementation strategies. Engage other departments and direct their work on supporting systems such as network, firewall, load balancers, and external storage. Engage application teams with analysis from logs and data on the servers, and provide recommendations for problem resolution.
- Be part of an on-call rotation schedule that includes carrying a pager/email device 24/7. Respond to all alerts immediately and inform management of issues and work being performed to remedy the problem. Direct escalation to engage additional resources if required to troubleshoot and resolve a problem.
- Monitor, analyze, and report performance statistics for web hosting environments. Troubleshoot hosting environment failures and manage / assist in the development of solutions to these problems. This includes not only overall environment / platform problems, but also includes problems affecting individual client accounts (i.e. data integrity, reporting, security, etc.).
- Analyze web hosting environment averages and peak workloads / throughput compared to existing capacities and plan required accommodations to address environmental growth. Take necessary corrective actions (both scheduled and unscheduled) to proactively address potential problems before they become operational / environmental problems. Notify Manager of projected needs and actions taken.
- Ensure security of systems, including standard server build and lock-down procedures, and monitoring security access to systems.
- Review system logs regularly, report and research warnings and errors. Review system logs for backup completions and report any discrepancies.
- Execute implementation/migration of new software and application versions across the development and staging and production environments and prepare back out plans on all platforms to be updated. Ensure adherence to established Change Management and QA procedures. Verify results with appropriate parties.
- Work with peers and other departments to analyze ongoing processes and procedures. Where relevant, propose / design improvements to operational processes.
- Keep up to date with developments in the e-Learning / web-based information technology field through educational and other information resources and make management aware of possible applications for new technologies.
- For new web hosting infrastructure projects, act as technical lead for planning and implementation. Mentor and train junior team members in all areas of IT expertise.
- Bachelor’s degree in Information Systems, Computer Science, Business or Engineering or equivalent job related experience.
- Must have an excellent command of:
- Red Hat Linux Operating System
- Apache Web Servers
- Tomcat application environment running Java
- MySQL Database Server
- MarkLogic Content Management Systems
- Must possess experience designing, building, maintaining, migrating, tuning, administering, and supporting three-tiered web/application/database server environments
- Experience with Internet access and security for servers residing within a DMZ
- Must have excellent written and oral communications, including technical documents, and process documents.
- Must possess excellent problem-solving and analytical skills and be able to translate business requirements into information systems solutions.
- Able to translate business requirements into technical recommendations for information systems solutions.
- Must possess excellent problem-solving and analytical skills; ability to assist with network, system, and application troubleshooting required.
- This position demands a well-organized, action-oriented team player with the ability to prioritize daily work, change directions quickly, coordinate geographically dispersed team members and work on multiple projects simultaneously.
- Comprehensive knowledge of problem analysis, structured analysis and design, and programming techniques.
- Coding and scripting skills for a RedHat/Apache/Tomcat/MySQL environment, clustering and other high-availability architectures, TCP/IP, along with various server management and administrative tools.
- Ability to work with minimal supervision, engaging peers and other departments to accomplish assigned goals and effectively manage projects in a cross-functional environment.
Administer web hosting infrastructure, including server hardware, operating system, enabling software, and application software/data for content management systems. Direct other departments’ work on dependent systems such as network, firewall, load balancer, and external storage systems. Provide consultative expertise for our businesses to provide technical guidance, standards, knowledge and understanding of business and technology processes, and integration of technologies to deliver Internet-facing learning products and services.
The position will be responsible for systems configuration, implementation, management and support, along with application integration, and troubleshooting for our MarkLogic-based Content Management Systems. The role includes installation, configuration, administration and maintenance of the content management environment and integrating new systems and products into the platform.
The role encompasses daily operational support of the content management systems and application environment in development, QA, and production tiers. It also encompasses project work with business units, developers, test labs, end users and other groups involved in the planning, development, and testing of products, content, and workflows in the content management systems.
New job, new laptop. Many utilities here are Windows-only, so it requires a bit of… effort… to get myself up and running efficiently. The solution to the Windows problem is VirtualBox. I had set this up on my last laptop with little trouble, but this time around it took a bit more work. Hopefully the instructions below will help others get up and running quickly.
Disclaimer– your laptop may catch on fire and explode (or worse) if you attempt this… or something.
We’ll be presuming that you’ve already resized your Windows partition and have both a working Windows and Linux partition.
Log into XP, grab MergeIDE.zip from VirtualBox’s site, extract and run it. It should be a quick flash and be done. (Note: I am not 100% sure this step is needed)
Create a new hardware profile and name it virtualbox. Make sure to set it as a choice during boot. Try rebooting into native Windows once to ensure that it does offer you profile options.
You’ll need the following packages installed (may differ for non-Ubuntu systems):
mbr, virtualbox-ose, virtualbox-ose-qt
Create a stand-alone mbr file to use for booting (yes, you need the force flag):
install-mbr ~/.VirtualBox/WindowsXP.mbr --force
We’re presuming that your Windows partition is /dev/sda1. In the command below, we are defining:
- a vmdk file (WindowsXP.vmdk)
- which raw disk to read (/dev/sda)
- which partition (1)
- the new MBR file we just created
VBoxManage internalcommands createrawvmdk -filename ~/.VirtualBox/WindowsXP.vmdk -rawdisk /dev/sda -partitions 1 -mbr ~/.VirtualBox/WindowsXP.mbr -relative -register
Note that you’ll need read/write access to that drive as your user, so you may want to figure out a cleaner, more secure way to implement this rather than adding your user to the disk group (which is very dumb and insecure). I would, but it’s working and I have more important things to do at the moment.
Another issue- apparently ThinkPads report the drive heads and cylinders oddly (T410 for me and T60p in article), so we have to add some vmdk settings before VirtualBox creates them incorrectly. Open ~/.VirtualBox/WindowsXP.vmdk and add the following at the bottom:
The biosHeads appears to be the magic value- it seems to work if it’s set to 240, but the default is 255 (which fails).
Once you add those, start up VirtualBox and check the virtual media manager; your new vmdk should be listed there. Once it’s confirmed, create a new virtual machine. Rather than creating a disk, select your vmdk as an existing disk.
After you finish, go to the VM Settings -> System and make sure the Motherboard tab has IO-APIC enabled (I also had PAE/NX enabled under Processor and VT-x enabled under Acceleration).
Start the VM
There are several errors that could pop up. I’m sure there are plenty more that I stumbled across, but these were the two big ones:
- a disk read error occurred, press ctrl+alt+del to restart – Caused by incorrect biosHeads- check and make sure it’s set to 240 (this was the fix for me, results may vary).
- Complaint about kvm/vmx – VirtualBox does not like kvm. Uninstall qemu-kvm.
If things go well, it should flicker mbr in the corner, then go to the hardware profile selection screen. Select the virtualbox profile, and continue, then log in.
What follows is a half-hour of installing generic drivers and dealing with hardware specific auto start apps complaining that they won’t work on this installation. Windows will warn that the new drivers are not blessed, so be forewarned.
Once completed, at the top of the VM window select Devices -> Install Guest Additions. This will download and mount an ISO, and Windows will pop open a folder with the addition executables. Select the one best for you and run the installer. It will prompt you for video and mouse drivers (and trust me, you want them).
The final step is to shut down the Windows VM, then reboot into the native Windows partition to make sure it still works. I did receive a few blue-screens before logging in at the beginning, but they appeared random and haven’t happened since.
And that’s all there is to it- simple, eh? Your Windows partition should now run in native mode and VM mode.
So, I’m home sick again. Fourth time this year I’ve been sick. Cold, flu, cold, Bronchitis. Awesome. Will I get to rest today? No, of course not.
My server (Unicron) has been up and running for 2 years now- I got the parts right after Ian was born. I set up a nice software raid array at the time that’s served me well. I’d never set up a raid array like this before, so I wasn’t really sure how to monitor it. The raid array has been running fine for 2 years, so I just sorta let it slide.
Over this past weekend, I did some work resizing the lvm partitions and had to poke around with the raid stuff. I found not one, but two ways to monitor it: one was to set up a monitor with the mdadm tools and have it email me if there was a problem, and the other led me to create a simple Nagios monitor. I set both up Sunday night.
Flash forward to this morning- Jackie wakes me up and asks me if I’m going to work (I’d come home sick the day before and was still out of it). My intention was to wake up long enough to IM my manager and supervisor and let them know I was gonna be out sick. I do so, then minimize the IM stuff. Staring me in the face was the following:
Wait, wait, wait- my script must be crappy, there’s no way the raid array choked right after setting up the monitoring. I sorta go into denial and check my email:
This is an automatically generated mail message from mdadm
running on unicron
A Fail event had been detected on md device /dev/md2.
It could be related to component device /dev/sdc2.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid5 sda3 sdd2 sdc2(F) sdb3
1461633600 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
md3 : active raid1 sdc1 sdd1
979840 blocks [2/2] [UU]
md1 : active raid1 sda2 sdb2
979840 blocks [2/2] [UU]
md0 : active raid1 sda1 sdb1
192640 blocks [2/2] [UU]
So here I am, praying it was a hiccup while I reboot and rebuild my raid array. It looks like sdc2 went nutty about 45 minutes after I went to bed. I restarted the server and sdc2 reappeared, and I’m rebuilding now to see what happens:
md2 : active raid5 sdc2 sda3 sdd2 sdb3
1461633600 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
[=================>…] recovery = 85.0% (414436248/487211200) finish=26.8min speed=45182K/sec
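For anyone wanting a similar Nagios-style check, here is a minimal sketch in Python of how the mdstat output above could be parsed. This is not the script I actually set up; the function names are invented, and it only handles the “active” array layout shown in this post. The exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import re

# "md2 : active raid5 sda3 sdd2 sdc2(F) sdb3"
ARRAY_LINE = re.compile(r"^(md\d+)\s*:\s*(.*)$")
# "... [4/3] [UU_U]" means 4 members expected, 3 active, third one missing
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return {array: [devices flagged (F)]} for arrays missing a member."""
    problems = {}
    current, failed = None, []
    for raw in mdstat_text.splitlines():
        line = raw.strip()
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            # Strip the "(F)" flag and any "[n]" slot index from the name.
            failed = [
                re.sub(r"\[\d+\]|\(F\)", "", token)
                for token in header.group(2).split()
                if token.endswith("(F)")
            ]
            continue
        status = STATUS.search(line)
        if current and status:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                problems[current] = failed
            current = None
    return problems


def nagios_check(mdstat_text):
    """Return (exit_code, message) in the style of a Nagios plugin."""
    problems = degraded_arrays(mdstat_text)
    if not problems:
        return 0, "OK - all md arrays have their full set of members"
    details = ", ".join(
        f"{array} (failed: {' '.join(devs) or 'none flagged'})"
        for array, devs in sorted(problems.items())
    )
    return 2, f"CRITICAL - degraded: {details}"
```

To use it as a plugin, read `/proc/mdstat`, pass the text to `nagios_check`, print the message and exit with the returned code. An array that is still rebuilding (like md2 above) is reported as degraded until the recovery finishes, which is probably what you want.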
One thing is for sure, I need to get an auxiliary drive in case this one goes kaput for real. I said I’d buy one in June… 2007. Suppose I’d better get on that, huh?
My apologies if this was nonsensical, I’m really tired.
So I’m testing a nifty new wordpress plugin… check this out:
I wonder if it’ll work?
update: no, no it will not.
So I’ve been pretty quiet since I hit 100k words- what’s been going on?
- Round of layoffs at work
- Friend diagnosed with cancer
- Another round of layoffs at work.
- Jackie became a Pampered Chef consultant
- Finances have been wiped out from Christmas and getting her PC stuff off the ground.
- 10% paycut at work
- Guitar lessons are now done because no one can afford them.
- Have been reading Manuscript Makeover for ways to improve my book
- Decided to do an initial cleanup of the first draft of my script, then rewrite the outline before starting draft #2
- Started yet another open source project- this time it’s a collection of Nagios Plugins.
So I’ve been pretty busy. I’ve finished the cleanup of the first two chapters of book 1; hopefully I’ll finish the rest shortly, but it’s very slow going. We’ll see where things head in the next few months- I expect more crappiness.
Anyone know of any good Jabber clients for the BlackBerry? I’ve tried a couple with little luck, and most of them cost more than I can afford for this test. Features required:
- Must run on BlackBerry 8703e v4.1.0x
- Connection server can be configured differently than jid address (i.e. firstname.lastname@example.org for jid, jabber.morgajel.net for connection server.) This rules out Mobber as far as I can tell
- Requires SSL/TLS
- Non-strict cert checking
Let me know if you have any suggestions.