Transcription workflow notes

So, it’s been a while since I’ve written a blog post, but I’ve not been inactive. And since I have the day off today, I thought I’d catch you up. Over the next couple of days, I’ll be putting up two chapters from the 1946 Parish Practice in Universalist Churches as text; I’ve previously posted it as a scanned PDF.

I want to discuss my workflow. I can do the odd report, but I’d like to see more Universalist and other documents transcribed, and to have typographic errors discovered and corrected. I shouldn’t be the bottleneck.

In the past, going back twenty years or so, I would photocopy a book, carefully crop it into a single column, rephotocopy the cropped pages onto letter-size paper and take them to a central computer center, where they would be processed by Optical Character Recognition (OCR). I’d get a file back, and then edit it. Later, I would use a flatbed scanner and OCR software at home, but some documents still required editing the images down to one column. These processes were very time-consuming. Sometimes, transcribing by keyboard was more efficient!

Image capture and OCR software have improved markedly. Today, instead of scanning, I take a picture with my phone, and use a graphical front-end to powerful OCR software to process the text. It’s not always clean — a second snap and process is sometimes necessary — but the improvement over twenty years ago is striking.

In particular, on my Ubuntu Linux (14.04 LTS) machine, I use YAGF (“Yet Another Graphical Front-end for cuneiform and tesseract OCR engines”) with the tesseract engine.
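
For anyone who would rather skip the graphical front-end, here is a minimal sketch of the same step done by calling tesseract directly from Python. It is not what YAGF does internally. It assumes the tesseract command is installed and on your PATH, and the function names and the file name page-001.jpg are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, language="eng"):
    """Return the argument list for: tesseract IMAGE OUTPUT_BASE -l LANG.

    tesseract writes its result to OUTPUT_BASE plus a ".txt" extension.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", language]


def ocr_page(image_path, language="eng"):
    """Run tesseract on one photographed page and return the recognized text.

    Hypothetical helper: assumes tesseract and the requested language
    data are installed on this machine.
    """
    image_path = Path(image_path)
    # Drop the image extension to get the output base name.
    output_base = image_path.with_suffix("")
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    command = build_tesseract_command(image_path, output_base, language)
    subprocess.run(command, check=True)
    # Append ".txt" by hand so names containing extra dots survive intact.
    text_file = Path(str(output_base) + ".txt")
    return text_file.read_text(encoding="utf-8")


# Example (needs a real image file and tesseract installed):
# text = ocr_page("page-001.jpg")
```

The text that comes back still needs proofreading by eye; this only automates the capture-to-text step.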

Another UU released today

The October 2014 (14.10) release of the Ubuntu Linux operating system came out today. The releases have a double initial codename: an adjective, bordering on the outlandish, and an animal. And we’ve come to the Us. So let me introduce, particularly to the Unitarian Universalists, the new release: Utopic Unicorn.

Release notes

Ubuntu Linux for Ministry: a feature for orders of service

So, this hasn’t been a weekly Thursday feature as I intended. Nor is this, properly speaking, an Ubuntu Linux-only feature, as it uses LibreOffice Writer, and that’s available for Windows and Mac OS X, too. (It is free and open-source software, or FOSS, and you can get it here.)

A small thing — making it easy to put the information in an order of service (or a theater or music program) flush left and flush right respectively. Years ago, I would tab, tab, tab the biblical citation, or hymn name or the anthem composer over. Then I’d shim in extra spaces until the right margin wrapped to a new line…then I’d remove a space to pull the line back. It’s hacky, and never quite even. Here’s the right way.

Let’s start with a 5½ by 8½ inch page, as that’s letter paper folded in half and a common size for orders of service. And, for the sake of argument, half-inch margins. (Click the images to see them full-sized.)

To set the page size, use these menus. Format > Page > Page tab

Page style

Now, the idea of using tabs to set the left-hand information flush left and the right-hand information flush right isn’t entirely wrong. But the correct tab will be a “right tab” setting on the right margin. 5½ inch width, less a ½ inch margin on each side, and that means the “right tab” needs to be set at 4½ inches.
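
If your page or margins differ from mine, the same arithmetic applies. Here is a small sketch in Python; the function name is my own invention, and it assumes the tab position is measured from the left margin, as in the example above.

```python
from fractions import Fraction


def right_tab_position(page_width, left_margin, right_margin):
    """Return where a right tab must sit to land on the right margin.

    All values are in inches; the result is measured from the left margin.
    """
    position = page_width - left_margin - right_margin
    if position <= 0:
        raise ValueError("margins leave no room for text")
    return position


# Half of a letter sheet: 5 1/2 inches wide, with 1/2 inch margins.
half_letter_width = Fraction(11, 2)
margin = Fraction(1, 2)

tab = right_tab_position(half_letter_width, margin, margin)
print(float(tab))  # prints 4.5
```

Fractions keep the halves and quarters exact, which is handy when working in inches.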

To add a tab, use these menus. Format > Paragraph > Tabs tab

Tabs tab

As you see, you can use a “fill character” — like dots — to guide the eye. But that seems a little old-fashioned, so I didn’t; you may feel otherwise.

Which means in this example, you can type in “Opening hymn” and tab once to give its name.

Worked example

And here is that file. Something to build on.

Is there something you’d like to see, to improve your church publications?

Ubuntu Linux for Ministry: a new feature, hopefully helpful

With all the talk about student debt, low salaries, missing employment, unwanted bivocationality and plain-old poverty in the ministry, it makes some sense to address ways of saving money as a way of making do, because structural change (and success is not guaranteed) takes time.

That’s a good reason to put free-of-charge Ubuntu Linux on an old “obsolete” computer, to give it modern utility.

With concerns about online privacy invasion, copyright overreach and vendor lock-in, it makes sense to use an operating system that is backed by a community that takes your concerns seriously.

That’s a good reason to use free-to-use Ubuntu Linux, which has a community that takes these concerns seriously.

With brand-consciousness trumping utility, and the work of the ministry still being an under-served market, it makes sense to seek out an operating system that is easy (or easier) to build upon and responsive to active, if unprofitable, groups that create tools for their own use.

That’s a good reason to use free-to-adapt Ubuntu Linux, which has deep communities that address very specific needs, including those of congregations and ministers.

But Ubuntu, like all Linux versions, has a reputation (no longer fair) of being difficult or esoteric to install, maintain or use.

If you used a Linux version before, I recommend you try one again, as a group of more user-friendly versions has developed and improved in recent years.

And that’s a good reason for me to start a weekly feature, each Thursday, demonstrating a feature or tool on the current long-term support version of Ubuntu Linux, probably the most widely used and most generally useful member of the desktop/laptop Linux family.


Linux, Microsoft users: protect yourself against repetitive stress

I had a harrowing day today at the emergency room. All is well (better safe than sorry), but at the very least, let it be said that I should guard against eye and neck strain.

Coming home, I re-installed a piece of software I once used: Workrave. It forces you to take short pauses and coffee breaks, and leads you through stretching your arms and shoulders, and refocusing your eyes. You can set the length between breaks and how many times you can defer them, say if you’re on deadline or showing someone something on your computer.

For users of the newest (Oneiric) version of Ubuntu Linux, install the backports repository (Edit > Software sources > Updates tab in the Ubuntu Software Center) and install it there or any standard way.

Linux users who compile from source and Microsoft users can get their software here.

Two PDF tools for Linux users

Like many people in an office setting, I deal with PDFs. But I’ve long given up any notion that they’re inviolable; indeed, marking on them, deleting some pages and not others and then rotating the whole bunch 90 degrees is one way the format can be useful. Sometimes I do this on the command line, but here are two graphical interface Linux tools — one I’ve been using a while; another I just discovered yesterday — that made today’s office work possible.

The new find was Xournal. Promoted as a hand-writing tool — which I’m unlikely ever to use — it serves admirably to “highlight” on a PDF, and does a nice job typing in extra text. Say, to modify a form for an office or congregation so everyone who signs up for a workshop — assuming there’s not an online sign-in! — doesn’t have to write out the same info each time, like name and address of a congregation.

And there’s PDF-Shuffler, which allows you to combine (concatenate) files, delete and reorder pages and pivot their orientation. Very handy.

Ubuntu Linux users can get both from the Ubuntu Software Center. Indeed, look there for details rather than the rather plain software project sites.

The code of conduct

With the prospect of a new church and one with a conspicuous online element, a clear upfront set of participant (much less member) expectations will have to come together almost immediately. But why draft one from scratch when others have paved the way? That is a benefit of free culture and liberal licensing, another intended value.

I’m thinking of the Code of Conduct of the Ubuntu operating system community. Not a perfect match, and it says nothing about contribution expectations. (Perhaps it shouldn’t. Haven’t got my head around that.) But it gets much of the way there and — thank God — lacks much of the ponderous, overwrought language that deeply theological people cannot escape.

I’ll keep my eye out for others.

New Ubuntu version out today

Ubuntu, probably the world’s most popular version of desktop Linux, has a new version out today: 11.04, codenamed Natty Narwhal.

I’m downloading/uploading the disk image (iso) of Natty via torrent (there are legal uses for BitTorrent), but I confess that Ubuntu’s move to include more and more proprietary software, its greater hardware demands and the changes to how it manages windows (in the next version) make me question whether I’m going to continue with it.

But first today’s download and upgrade — I have more than one Ubuntu computer so it makes more sense to download a version to share than to upgrade each one from a remote server — and we shall see.

Typing in Esperanto with Ubuntu Linux

And while I’m talking about Ubuntu Linux, I recently discovered a feature for Esperantistoj, courtesy of Mikeo of the Junularo Esperantista Brita (British Esperantists Young-persons’ Group). Dankon! See the article for full details and other options.

For those unfamiliar, there are six letters found in Esperanto not found in other languages. This can complicate typing.

In short, System > Preferences > Keyboard > Layouts tab > Options button. Choose Adding Esperanto circumflexes.

Now, to get to the point: just type the corresponding Latin letter while pressing the Alt key to the right of the space bar.
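
To make “the corresponding Latin letter” concrete, here is a small Python sketch of the mapping. It only illustrates which key gives which letter and is not how the keyboard option is implemented. The function name is my own, and the handling of capitals is an assumption on my part.

```python
# The six special Esperanto letters and the plain Latin keys they pair with.
# Five take a circumflex; the u takes a breve.
ESPERANTO_LETTERS = {
    "c": "ĉ",
    "g": "ĝ",
    "h": "ĥ",
    "j": "ĵ",
    "s": "ŝ",
    "u": "ŭ",
}


def with_right_alt(letter):
    """Return the letter you would expect from holding the right Alt key.

    Letters with no Esperanto counterpart come back unchanged.
    Capital handling is assumed, not taken from the keyboard settings.
    """
    accented = ESPERANTO_LETTERS.get(letter.lower())
    if accented is None:
        return letter
    return accented.upper() if letter.isupper() else accented


print("".join(with_right_alt(ch) for ch in "cghjsu"))  # prints ĉĝĥĵŝŭ
```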