Computing


About five years ago I decided to pen a missive to my future teenage daughter – the reasons for which, deserving a post of their own, will come later. I had in mind that the content would potentially benefit others, so I decided I would write it as a book.

The first draft of the book was coming quickly and I had to decide which tools to use to publish the book. I chose self-publishing because I wanted full control of any future changes / corrections / edits that I might wish to apply – the content deals with some sensitive subjects and I might well wish to improve the clarity of some parts at a later date. What seems clear to the author when written may be ambiguous or misleading to some readers. I was very keen that the presentation of the book be to a high standard, and experience with the usual word processors has taught me that they are a poor choice for those wanting detailed control over the final printed page. I had previously used LaTeX, a free typesetting package suitable for the more technically oriented, and I was confident that LaTeX (the X is the upper case of the Greek letter χ (Chi), roughly pronounced as a K, so LaTeK) would allow me to have the control of the final printed results that I desired.

The book was also to contain a small amount of Arabic content and using Linux I found that the installation of the software together with its many accompanying packages, including Arabic support, was a snap. On its own LaTeX is a command line tool so I needed to select a GUI based program to actually type out my book into. I chose Kile, a LaTeX editor based on Qt and from the KDE family of applications. I’m a long time user of Linux and KDE, from the very early days in fact, and have generally had a good experience of it.

So, a couple of chapters written, it was time to have a peek at what the final presentation quality would be. In just a few seconds and using the QuickBuild option I had a pdf document to look at. There were lots of very strange warnings in the program log, but the final results looked great – this was going to be easy, I thought to myself! Yes, dear reader, if this was TV or radio and not a piece of italicised text then right now you would hear some deep laughter echoing around the distant hills.

Before continuing this particular lamentation, allow my superego to interject. I was ultimately able to produce a very nicely presented book and the pdf that LaTeX created for me was just right for printing the paperback book via the self-publishing route, so to be fair – LaTeX did a great job. Now, continuing where I left off…

I completed the main body of the book in a couple of months. I really did write it for my daughter, and I needed time to consider if I would actually publish it for others to read. Over the course of the next few years I would occasionally amend or add to the text, and gave some preliminary printed copies (using Inky Little Fingers, a UK based printer who I can recommend) to friends to look over. I received some great feedback from them, and they encouraged me to go ahead and release the book.

During this extended period of mullation (OK, that’s not a real word) I designed the front cover, using Inkscape. Inkscape works with vectors (lines, rather than pixels) so the designs that you create can be easily scaled to any size. I then saved the graphic design to pdf and fired up Scribus, which is desktop publishing software, to complete the cover (back page etc.).

This is where my natural lethargy kicked in. Tarek, you are right, I never finish anything. Look, I’m not a completer-finisher. I’m an ideas man. However, this time was going to be different. I would finish it, this was important to me.

So, the time came. This was it. I was going to publish. I had printed a few dozen more copies to give to friends etc., and to send in the post to anyone who wanted to buy it. However, how would anyone know about it? Also, I realised very late in the day, if anyone outside of the UK were interested in buying the book then the P&P would in all likelihood put them off any purchase. I needed an international distributor – Amazon suddenly started looking like my only realistic option. This meant going through CreateSpace, an Amazon company that specialises in self-publishers. I submitted my pdf and cover design, and before long (through a fairly simple process in fact, kudos to the developers of their web site) I had a copy for review in my hands – just after Christmas 2015 in fact. The front cover image was a little offset to the left, but apart from that it was good.

Now, what about the ebook version? I felt at this late stage that the price I would have to charge for the paperback would be off-putting to too many potential readers. I needed an ebook version. It was here that the pain, unexpectedly, really kicked in. I’m a software developer, by profession and for the love of it. I would write a converter program that would take my source document (in LaTeX) and convert it to the HTML needed to create an EPUB format file. How hard could it be? Now, when developing IT systems, and software in general, you should never assume anything. It’s always in the assumptions that disaster falls. In this case, I assumed that the LaTeX document format was logical, consistent, even straightforward. After all, I hadn’t noticed anything untoward when using LaTeX, and I’m a highly experienced software developer. How wrong we can be! I started writing a parser for LaTeX sources, to analyse the overall structure of it. This is when it really hit me – LaTeX might look fairly sensibly structured at first glance, but if you are using various extra packages then the structure very quickly becomes, well, kind of unstructured :-). Each developer of each package can invent his own weird syntax for controlling the extra features. Sure enough, weird syntax abounds. My parser program and the algorithms to process the parsed document quickly became a horrendous mess.
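To give a flavour of the problem, here is a minimal sketch in Python (not my actual converter, which I am not reproducing here). It assumes only a handful of simple one-argument commands; the table `SIMPLE_COMMANDS` and the function `latex_to_html` are names invented for this illustration. Anything outside the table is left untouched and reported, which is roughly where package-specific syntax starts to hurt.

```python
import re

# Hypothetical table of the few commands this sketch understands.
SIMPLE_COMMANDS = {
    "chapter": "<h1>{}</h1>",
    "section": "<h2>{}</h2>",
    "emph": "<em>{}</em>",
    "textbf": "<strong>{}</strong>",
}

# Matches \name{argument} where the argument contains no nested braces,
# so the innermost commands are always converted first.
COMMAND = re.compile(r"\\([A-Za-z]+)\{([^{}]*)\}")


def latex_to_html(source):
    """Convert known one-argument commands; return (html, unknown_command_names)."""
    unknown = set()

    def replace(match):
        name, argument = match.group(1), match.group(2)
        template = SIMPLE_COMMANDS.get(name)
        if template is None:
            unknown.add(name)      # remember what we could not handle
            return match.group(0)  # leave the LaTeX as it was
        return template.format(argument)

    previous = None
    while previous != source:      # repeat until a pass changes nothing
        previous = source
        source = COMMAND.sub(replace, source)
    return source, unknown


html, unknown = latex_to_html(r"\chapter{Intro} Some \emph{very \textbf{bold}} text")
```

Environments, optional arguments in square brackets, and commands that take several arguments all fall straight through this kind of approach, and each extra package adds more of them.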

The difficulty was further compounded by the fact that the ebook format is really designed for novels, where there are no complex layouts or odd languages. Technical text books are way out of reach for the format, which is why they tend to look so bad – it’s not entirely the publisher’s fault. In my case I worked around it by converting the hard parts (mainly decorative Arabic for Qur’anic verses) to images, automated using my program.

Now, if this was a process that I was going to repeat then I would have stopped about half-way and re-written my parser from scratch, using the knowledge I had gained so far through the process. But this was a one-off job, I was nearly finished, I reasoned. Just the footnotes and bibliography left to do, I thought. But no, each last part of the LaTeX that I converted to HTML just made my program worse and worse. Still, I finally got an EPUB file that was a fair reproduction of the real printed book, Arabic content, pictures, footnotes, bibliography and all.

Using an Amazon provided utility I converted the EPUB file to a MOBI document (which is the ebook format that Amazon requires) and could then email it to my Amazon account, so that I could view it in an Amazon reader. Again, it gets funky. It turns out there are a dozen different Amazon devices and applications that readers could be using – it’s not just plain old Kindles. Each device can render things differently. If you want to keep your readers happy, you need to test your MOBI file on every one of them. Fortunately Amazon provide a device emulator, however it doesn’t work too accurately (overlines and dots under letters are just not rendered as they are on a real device) and recent versions don’t emulate the Kindle! Yes, in the end, despite Amazon providing a device emulator, I had to buy a Kindle just to test how the book would finally look on it. Even now I don’t know how the book will look on older Kindle devices, which run older versions of the software.

Finally, one last task – the ISBN number. I invented a new publisher, being me, which I named Westbury Hill Press. In the UK ISBN numbers can only be bought in batches of 10, and I only needed one (the ebook does not need an ISBN number). It is a simple process if a little pricey (around $150). Using some more free software I created a bar code that I added to the back cover, together with a suggested RRP and a publishing categorisation to make life easier for bookstore owners and librarians (as if they will ever see one! – I wish).

Finally then, I can push the Publish button. The softback is available on Amazon via CreateSpace, and the ebook via Kindle Direct Publishing. I don’t expect many sales, just a handful maybe. But that is not why I wrote the book and I don’t mind, I feel that the responsibility is fulfilled and my job is done.  الحمد لله, Thank God!

Oh, a sample of the book? Head over to Amazon and download the sample, you can view it in the Amazon Kindle app on your tablet.

Learn more about the book at Westbury Hill Press – A Message For Tuqa.

P.S I’m now ready for further punishment – now to write my book about Madhhabs, Maps and Metaphors – I’ll be back in 5 years :-).


I enjoyed reading this article about the home office setups of journalists and editors working for Ars Technica. I thought I’d share mine, largely as a kind of personal history:

(With all photos, click to see full detail)

I’m using Ubuntu Linux. I’ve used Linux as my primary OS for a very long time, and am using Ubuntu through laziness on my part. The prayer time widget in the top left hand corner of the left screen is one I developed myself (kprayertime). Below it is a moon phase widget; you can see it was just after the middle of the lunar month when I wrote this (in the blessed month of Ramadan). You can also see a weather widget that tells me it’s raining a lot in Bristol – well, I already knew that. Various items belonging to my daughter and wife are scattered on the desk; every few days I have to ‘send’ things back to their owners, usually by throwing them onto the respective bed. The desk lamp was given to me by my friend Zubair. There’s an uber mouse mat with the map of the world, that is now upside down. An IKEA clock and Aldi watch can be seen, together with some Kumon homework.

Here you can see my wife’s books, and others about science, the philosophy and history of science and other miscellanea. Notable books for me in terms of my personal development are “Gödel, Escher, Bach” and “Complexity, Entropy and the Physics of Information”.

Here we have a distilled collection of IT books. Such is the ephemeral nature of IT books that recently I gave/threw away two thousand pounds worth (well, spent) of IT text books due to lack of space in the flat. Below those are the books I studied during my physics degree, and some travel books below them. In the early 1990s I bought the book “The Science of Fractal Images” at Foyles in London. I spent many Saturday afternoons up and down Charing Cross Rd (also eating far too many pizzas at the Pizza Hut there) and bought my first Islamic book at The Middle-Eastern Bookshop at the south end of the road. I think it was ‘The Book of Knowledge’, the first book in Imam alGhazali’s Ihyaa’. I also bought various books by Muhammad Asad including his translation of the Qur’aan. I think it’s fair to say that these books still have a great influence on me. Subhaan Allaah when I look back at my life there are definitely times when I was guided like an arrow to the straight path, in terms of knowledge and people.

This bookcase largely contains miscellanea, including various books about language in general, which is an interest of mine. The maps are there because when I travel I like to keep the maps that I used, as a memento. A number of the linguistics books are my wife’s, whose PhD thesis (in Arabic-English linguistics) can also be seen.

Here we have various Islamic books and below them some Arabic language learning resources. I now teach Arabic to beginners and intermediate level, while continually trying to improve my own level of Arabic too.

We’re getting on to pride of place now, these books and those you can see below are on either side of me when I am doing my prayers, reading the Qur’aan or meditating on life, the universe and everything :-). There are numerous translations of Imam alGhazali, particularly from his Ihyaa’. A notable book is Searching For Solace, a biography of Yusuf Ali. A sad tale, but one which somehow sums up the state of the Muslim world at the moment. Various books by Charles Le Gai Eaton, who I think writes beautifully about Islam and explains it so well to the Western mind. Because he has some heterodox views he is not promoted by the Muslims, which is a shame.

We’ve reached some truly great books now. Bottom left in blue and white are some grammar and morphology books by Antoine Dahdah. I love those books. We then have various less technical books about iman and other important Islamic topics. Above them we have Sharh wa Tahleel of alHikam al`Ataa`iyya by Dr Ramadan Buti (they’re upside down too – how did they get like that!) which is an explanation of the great book by Ibn `Ataa’illaahi alIskandariyya. My wife chose many of these books when we came back from Syria. There are the five volumes of Sufficient For Seekers of the Path of Truth which is a fine translation of the great book AlGhunya.

Pride of place, of course, goes to the masaahif (copies of the Qur’aan) and tafaaseer (explanations of the Qur’aan). For those who can’t read Arabic the red and gold volumes are Tafseer utTabari, an explanation of the Qur’aan largely based on hadith narrated about the meaning of each ayat. The black and gold volumes are Lisaan al`Arab, a wonderful (and huge) dictionary of Arabic. There are hundreds of thousands of words, often accompanied by Arabic poetry (or hadith or ayats) giving an example of the word in use. At the top, physically and metaphorically, are the masaahif, copies of the noble Qur’aan.

Writing the above has made me realise why I cart some of these older books around, even though in a sense they represent a skin that has been shed. They are a part of my history and remind me of the times when I was reading them. They serve as a type of authentication that my Islam is not based on an ignorance of the pinnacle of Western knowledge, but as an ascent from it to higher goals.

If you enjoyed this post then please do something similar yourself and let me know!

P.S. my wife is complaining about the mess in the photos, but I wanted it to be ‘as-it-is’ 😉

I love my job of architecting, designing and implementing (programming) IT systems – metaphorically we build elaborate castles in the sky. There is a joy in solving difficult problems and then, after a long period of reflection, analysis and craft, we gratifyingly see the solutions appear before our eyes.

The development of IT systems remains a poorly understood business – it’s unlike other disciplines that seem comparable, such as building bridges or tower blocks. The IT system often takes much longer to develop than expected, and can even fail to deliver anything of use at all. There are many different approaches and there is still great debate as to which one is best. As an individual I feel I’ve learnt a lot about the field, but it’s a type of experience (I’m loath to call it wisdom) that I find hard to explain, regretfully, to new practitioners of the ‘art’. I have established one rule of thumb however, which is that keeping things simple is, truly, the hard part. Unfortunately for me I’ve had to maintain one too many bits of code where the author thought that being ‘clever’, and writing algorithms that were complex and hard to understand, was something to be proud of. On the contrary, it’s crafting simpler-to-understand algorithms which solve the same problem that is an achievement worthy of note.

So, I struggle to express the lessons I’ve learnt over the years. That’s why I’m so delighted having just watched Rich Hickey’s lecture “Simple Made Easy”. He articulates many of the important lessons that I’ve learnt over the years, and a lot more. Usually I don’t watch videos as the information content per hour is so low that it just can’t compete with reading a book. However, Rich has managed to beat the information density of most books in his great one hour talk (link below). He elaborates on the contradistinction between simple/complex vs easy/hard. He moves on from a really entertaining philosophical talk in the first 20 minutes into a brilliant analysis of the pros and cons of different approaches in programming languages. Rich, by the way, fairly recently invented one of the best new languages on the block, Clojure.

If you’re a hardened disciple of XP/Agile (I’m just a humble practitioner myself) then fasten your seatbelt. I think this lecture actually kicks off the next debate that IT professionals should be having. Rich formulates a number of philosophical principles and then gives a detailed view on how they apply to the job of programming. You may not agree with everything he says, but you’ll be entertained and ready to discuss the issues in the upcoming round of serious IT debate.

Anyway, on to that lecture:
Simple Made Easy – Rich Hickey

I myself didn’t understand why he thinks switch statements are so bad – if you think you got it then please explain in the comments!

I’d like to share these beautiful Islamic aperiodic patterns with you, created using software researched and developed by two Argentinian brothers (Luis Fernando and Julian Eduardo Molina Abaca) with software / scientific backgrounds.

See more of their research here and here.

You can read more about these patterns and some ongoing scientific research (relating to quasicrystals) at physicsworld.com (requires free registration).

IBM develop ‘most realistic’ computerised voice

The voice is made even more convincing because it has been programmed to include verbal tics such as “ums”, “ers” and sighs….

So while IBM struggle to make the computer seem more human, humans in call centres are instructed to follow scripted conversations as closely as possible.

I’m wondering, will man race more quickly to be robot, or the robot to be the man?

We need to pay more attention to our heart and souls, the hyper-rational mind is favoured too much.

Here’s a neat explanation of why there can never be a single algorithm that can compress all files, even if only by one bit each.

  1. Take the set of all files of 10 bits length. There are 1024 files. Call this Set 1024.
  2. Now take the set of all files of 9 bits length. It is one half the size of the previous set, with 512 files. Call this Set 512.
  3. Let’s take a compression algorithm and compress all the files in Set 1024. Each output file must be found in Set 512, as it is by definition compressed and of 9 bits or less.
  4. But note, that’s a mapping of a set of 1024 files to a set of 512 files. There must be at least one output file in Set 512 which is mapped from two or more input files in Set 1024. Call this File A. So, when we decompress File A from Set 512, which of the files in Set 1024 that it is mapped from should it return?

Edit:

There’s a wrinkle here that I didn’t appreciate when I first wrote this. I should have used the word string rather than file. The problem with using files is that in many cases the same compressed output can be stored differently, by using different length files. E.g. the output 000100101 could be stored as a 9 bit file, or 8 bit file, or 7 or 6 bit file! So the output set, when using files, has a total space of 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 = 1022, rather than 512. However, the basic explanation still holds, because 1022 < 1024, so some output files must map back to more than one input file.
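The counting argument can be checked with a few lines of Python. This is only a sketch: `toy_compress` is a made-up stand-in (it simply drops the last bit) rather than a real compression algorithm, and the helper names are invented for the illustration. Any function from 10-bit strings to shorter strings would hit the same wall.

```python
from collections import defaultdict
from itertools import product


def all_bit_strings(length):
    """Every bit string of exactly `length` bits, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=length)]


def find_collision(compress, inputs):
    """Return (output, [inputs that produced it]) for the first shared output, or None."""
    sources_by_output = defaultdict(list)
    for s in inputs:
        sources_by_output[compress(s)].append(s)
    for output, sources in sources_by_output.items():
        if len(sources) > 1:
            return output, sources
    return None


def toy_compress(s):
    # Hypothetical "compressor": drop the final bit, giving 9 bits out
    return s[:-1]


inputs = all_bit_strings(10)                          # the 1024 strings of Set 1024
shorter_outputs = sum(2 ** n for n in range(1, 10))   # strings of 1 to 9 bits
collision = find_collision(toy_compress, inputs)
```

Here `len(inputs)` is 1024 while `shorter_outputs` is 1022, so whatever `toy_compress` is replaced with, `find_collision` must find at least one output that two different inputs map to, and that output cannot be decompressed unambiguously.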

For my web site arabicreader.net I need to dynamically create pdf files that contain the user’s Arabic vocabulary that they wish to revise.

This was done using Python and some libraries that I created some time ago to output Unicode Arabic to PostScript. I dynamically create the PostScript file (PostScript is a language of its own and it’s well worth spending a few hours reading the official developers documentation from Adobe) and then convert it to pdf using ps2pdf. I instruct ps2pdf to embed the font (an old free Type 1 Arabic font) so that the pdf is readable by anyone even if they do not have that font available.
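As a rough sketch of that pipeline (not the code from my libraries), the Python below builds a tiny one-page PostScript program and the ps2pdf command line to convert it. The function names are invented for this example, Helvetica stands in for the Arabic font, and `-dEmbedAllFonts=true` is a Ghostscript distiller parameter that ps2pdf passes through; whether it matches the exact options I used is an assumption.

```python
import shutil
import subprocess


def ps_string(text):
    """Escape text for use as a PostScript string literal: (like this)."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "(" + escaped + ")"


def minimal_page(text, x=72, y=720, font="Helvetica", size=12):
    """Return a one-page PostScript program that shows `text` at (x, y) in points."""
    return "\n".join([
        "%!PS",
        f"/{font} findfont {size} scalefont setfont",
        f"{x} {y} moveto",
        f"{ps_string(text)} show",
        "showpage",
        "",
    ])


def ps2pdf_command(ps_path, pdf_path):
    # Assumption: ask Ghostscript to embed every font used by the document
    return ["ps2pdf", "-dEmbedAllFonts=true", ps_path, pdf_path]


def convert(ps_path, pdf_path):
    """Run ps2pdf, failing clearly if Ghostscript is not installed."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf (part of Ghostscript) was not found on PATH")
    subprocess.run(ps2pdf_command(ps_path, pdf_path), check=True)
```

Writing the result of `minimal_page("Hello")` to a file and passing it to `convert` should be enough to try the idea out on any machine with Ghostscript installed.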

The code to convert Unicode Arabic to PostScript involves shaping the Arabic (getting the correct glyph for a character according to the characters before and after it) and creating an encoding that maps the glyphs in the font to the character codes that I output. It also calculates the length of the resulting shaped text so that it can be placed, right-to-left and aligned to the right, at the correct x,y coordinate.
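Here is a simplified sketch of the shaping idea in Python. It is not my library: instead of a custom font encoding it swaps in the Unicode Arabic Presentation Forms, looked up by their official character names, and it ignores ligatures such as lam-alef. The function names and the `advance_widths` table are invented for the illustration; the division by 1000 assumes the usual Type 1 convention of 1000 glyph units per em.

```python
import unicodedata


def _form(ch, form):
    """Return the presentation form of `ch` ('ISOLATED', 'FINAL', ...) or None."""
    name = unicodedata.name(ch, "")
    try:
        return unicodedata.lookup(f"{name} {form} FORM")
    except KeyError:
        return None


def can_join_next(ch):
    # Only letters that have an initial form connect to the following letter
    return _form(ch, "INITIAL") is not None


def can_join_prev(ch):
    # Only letters that have a final form connect to the preceding letter
    return _form(ch, "FINAL") is not None


def shape(word):
    """Pick a contextual form for each letter of a word without harakaat."""
    shaped = []
    for i, ch in enumerate(word):
        joins_prev = i > 0 and can_join_next(word[i - 1]) and can_join_prev(ch)
        joins_next = (i + 1 < len(word) and can_join_next(ch)
                      and can_join_prev(word[i + 1]))
        if joins_prev and joins_next:
            form = "MEDIAL"
        elif joins_prev:
            form = "FINAL"
        elif joins_next:
            form = "INITIAL"
        else:
            form = "ISOLATED"
        shaped.append(_form(ch, form) or ch)  # leave anything unknown as it is
    return "".join(shaped)


def right_aligned_x(glyphs, advance_widths, right_edge, font_size):
    """Start x so the text ends at `right_edge` (widths in 1/1000 em units)."""
    total = sum(advance_widths[g] for g in glyphs) * font_size / 1000.0
    return right_edge - total
```

For example, `shape("كتب")` picks the initial kaf, medial teh and final beh, while in `shape("باب")` the last beh stays isolated because alef never connects to the letter after it.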

Also, because learners of Arabic need to learn the tashkeel, the algorithm also places the harakaat nicely above and below the glyphs according to the size of the glyph. Without this the harakaat tend to overlay the glyph and are unreadable. I do this by stripping the harakaat out of the word before outputting it, and then adding each harakat in a second sweep, taking into account the dimensions of the glyph it is over.
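The two sweeps might look something like the Python sketch below. Again this is an illustration rather than my released code: `glyph_boxes` is a hypothetical list of (bottom, top) extents for each base letter, `gap` is an arbitrary spacing in the same units, and stacking two marks on one letter (a shadda with a fatha, say) is not handled.

```python
# Fathatan (U+064B) through sukun (U+0652)
HARAKAAT = {chr(code) for code in range(0x064B, 0x0653)}
# Kasratan and kasra are drawn below the letter; the rest go above
BELOW = {"\u064D", "\u0650"}


def split_harakaat(word):
    """First sweep: return (base_letters, [(base_index, mark), ...])."""
    base = []
    marks = []
    for ch in word:
        if ch in HARAKAAT:
            if base:  # ignore a stray mark with no letter before it
                marks.append((len(base) - 1, ch))
        else:
            base.append(ch)
    return "".join(base), marks


def place_marks(marks, glyph_boxes, gap=50):
    """Second sweep: choose a y position for each mark from its letter's extents."""
    placed = []
    for index, mark in marks:
        bottom, top = glyph_boxes[index]
        y = bottom - gap if mark in BELOW else top + gap
        placed.append((index, mark, y))
    return placed


# kaf + fatha, teh + kasra, beh
base, marks = split_harakaat("\u0643\u064E\u062A\u0650\u0628")
placed = place_marks(marks, [(0, 700), (0, 400), (-100, 300)], gap=50)
```

The base letters can then be shaped and drawn as usual, with each mark drawn afterwards at the x position of its letter and the y position computed here.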

I will be releasing the code as open-source but it’s not yet published. Contact me if you need it now and I’ll email it to you.
