For some years I ran a blog on the MySpace platform. MySpace was an iffy proposition but it had other features that recommended themselves to me.

Why blog?
The thing that propelled me was the onset of the debt crisis. The narrative is now well known (although, surprisingly, the financial authorities have been reluctant to clarify the key movers behind it – anyway that’s a litigious subject – the winners of that debacle have very deep pockets etc etc).
I thought that keeping a record of my thoughts online was a way to get to know blogging better. I’m a die-hard tech-resister. I’m the type of guy who thinks the bells and whistles of current software don’t have the good of the customer at heart, e.g. Steam and independent PC gaming / gamers.

Enough of the wool-gathering.

When MySpace started to shrink, its interface became increasingly difficult to navigate. In addition, I found my old computer struggled with the increasingly ‘rich content’ (for rich content, read CPU and bandwidth demand due to bloat).
By 2010 I’d reached decision time: did I struggle on with MySpace, or migrate to another platform? I decided to give My Telegraph (the blogging platform of the Daily Telegraph) a whirl. My Telegraph uses a stripped-down version of WordPress, except for comments, which are driven by Disqus. Its attraction: it had a small writing group.

I left MySpace with the intention of later migrating my blog posts to My Telegraph.
I didn’t.
Why? It’s a long story. I’d signed up to My Telegraph because of its writing group. Superficially this group was busy. When I looked deeper it was plain that it had lost direction. The focus of its activity was a monthly competition, a competition that no-one was willing to run. It needed structure. Basically the writing group was moribund. In February 2011 I set up a ‘vote for the best entry’ competition instead of a critique-based one (critiques came later). It steadied the ship but took a lot of effort.
At the time I was rewriting / restructuring my first novel – hey, I had a real reason for going to My Telegraph! All this took time. Migrating my blog from MySpace was going to take a lot of my time. It went on a back-burner. Gradually the urgency decayed. I could still access the posts, albeit with increasing difficulty – MySpace was imploding but its legacy blogs were still there…
Or so I thought.
Wind forward a couple of years.
My Telegraph is okay but it is essentially WordPress lite. It was also prone to keyboard-warrior comment-fests. Some of this wasn’t pleasant. The facility is provided free of charge by the Daily Telegraph, who are entitled to run the place in line with their budgetary constraints; yet at times many of the denizens would whinge like little kids denied their favourite entitlement. I realise that this thought will win me few friends, but the truth is the Daily Telegraph could pull the plug just to alleviate the nuisance. I looked for another blog spot.
Moons ago I’d set up a WordPress account – I needed somewhere to post a short story and I could access my account on My Telegraph but the My Telegraph platform was going through one of its periodic difficult phases. At the time I was still also running the monthly writing competition, so that was a tricky and frustrating time.

Things moved on. In my short time at the helm of that competition, I could see myself becoming a fixture. The problem was I wasn’t there to be a fixture; I was there to fix my writing. The writing group expanded. It became and still is a viable forum. I wanted to back out. Organising groups is fine; I’d proved the method and suggested other things they could do. Early in 2012, work commitments gave me the excuse to back away.
Yet I needed a place to blog that didn’t have visibility. The web robots tied to my presence on My Telegraph went silly whenever I blogged there, so I decided to divide my efforts; I revived my WordPress account and now the place is a home from home.

When I first tried out WordPress, I posted notes I made while reading Plato’s The Republic. (People might think of me as a philosophical dude – ‘be in the world but not of it’ comes to mind. If a philosopher has something decent to say, it can spark off a train of thought. Plato had a lot to convey and so I made notes.)

I noticed that those old postings were getting a lot of interest… from Morocco. Now notes are – well, merely an indicator. Not precise. My original posts had been made to my MySpace blog. There were more. I decided to roust them out – I knew how to do that – or did I?
I didn’t. All access went through a new (and blog-unfriendly) process. I had direct links – carefully hoarded on a draft blog post in My Telegraph (remember them?). No go.
My images could be resurrected but not my blog.
I didn’t know what I’d lost – at least 100 posts, I reckoned. ‘Still,’ I thought, ‘many other people must be in the same boat. Why not see what happened?’
I looked. Lo and behold the blogs could be recovered.

From this point it gets technical, folks.

This post by Jim Younkin tells you how to grab your MySpace blog. I followed it up. What happened in my case? I was sent a zipped file which contained my posts in HTML. It was time to investigate further and this is what I did:

0) unzip
I chose a folder that was easy to navigate to

1) investigate
Is there a common structure? Yes.
Is it worth conversion? Possibly.
Did I fancy converting by hand? Nope. Time for a bit of Excel know-how – fortunately I’ve been using Excel for the past 20 years.

2) The structure of the HTML files when opened in Excel was
A1:A4 contain standard labels: Subject, DateCreated, Posted Date, Body
B1 Title of blog post
B2 Date the post was created (NB date and time format mm/dd/yyyy hh:mm:ss AM/PM)*
B3 Date it was posted. Same date and time format as above.* As the date was editable, I occasionally backdated posts that were there as hobby placeholders.
B4 Contents of blog post
B5 Contents – 2nd paragraph
B6 More contents etc to end of blog post.


* My Excel defaults to dd/mm/yyyy so I had to watch out for mm/dd/yyyy.
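
For anyone scripting this rather than eyeballing it in Excel, the month-first trap can be sidestepped by parsing the timestamp with an explicit format. A minimal Python sketch, assuming the dates look exactly like the mm/dd/yyyy hh:mm:ss AM/PM pattern described above; the function name and the sample date are made up for illustration and are not from a real post:

```python
from datetime import datetime

# Format assumed from the description above: mm/dd/yyyy hh:mm:ss AM/PM
MYSPACE_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def parse_myspace_date(text):
    """Parse a month-first timestamp so that 03/04 is read as 4 March, not 3 April."""
    return datetime.strptime(text.strip(), MYSPACE_FORMAT)

# Hypothetical sample value, purely for illustration
stamp = parse_myspace_date("03/04/2009 10:15:00 PM")
print(stamp.strftime("%d/%m/%Y %H:%M"))  # re-printed day-first, 24-hour clock
```

Spelling the format out means the result no longer depends on whatever regional default the spreadsheet happens to use.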

3) List as text
I decided to go for it. The next stage was to get a list of files. There are probably any number of ways to do this. What I did was use an old DOS trick to redirect a directory listing into a text file. First I opened a command prompt, navigated to the appropriate folder and typed in the following command:
dir *.html > html.txt [press the return key]
This generated a list of files matching the template *.html, but instead of listing them on screen, it redirected the list into a text file.
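
The same listing can be produced without a command prompt. A rough Python equivalent, offered as a sketch rather than what was actually done here: the function name is hypothetical, and unlike the dir command it writes only the bare file names, with none of the date and size columns.

```python
from pathlib import Path

def list_html_files(folder, listing_name="html.txt"):
    """Write the names of all .html files in `folder` to a text file, one per line."""
    folder = Path(folder)
    names = sorted(p.name for p in folder.glob("*.html"))
    (folder / listing_name).write_text("\n".join(names) + "\n", encoding="utf-8")
    return names
```

Calling list_html_files on the unzipped folder leaves an html.txt next to the posts, ready for the next step.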


4) text to data
I opened the file I’d created, html.txt, in Excel. Using Text to Columns, I reformatted the list, stripping out everything but the file name (even the extension went). This became my base file, which I resaved as HTML_catalog.xlsx.
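
Stripping the extension is a one-liner outside Excel too. A small Python sketch, assuming a listing with one bare file name per line (not the raw dir output, which also carries dates and sizes); the function name is made up, and the two sample names are the ones quoted in the formulae notes further down:

```python
from pathlib import Path

def file_stems(listing_lines):
    """Reduce each listed file name to its bare stem, skipping blank lines."""
    return [Path(line.strip()).stem for line in listing_lines if line.strip()]

print(file_stems(["506634748.html", "506646849.html"]))
```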

5) formulae to evaluate
After that I practised a little to get the right formulae. My plan was to construct something that used the function =Indirect(). I needed nothing from column A. These were labels; handy but just in the way for me. Column B was what I wanted. My plan was to create a whole bunch of formulae; open up all the files; allow the formulae to evaluate; convert to values; save HTML_catalog. Job done.
It wasn’t that simple; these things never are. I thought at first that all I needed was to pick up B1, B2, B3 for the title and dates of the post; and maybe a couple of paragraphs – say B4, B5 and B6, just to be on the safe side. However, some of my posts have the odd blank line in them. Some of them were plainly tables and I needed to capture enough info to know what was in the tables – add 10 cells – to B16. Then I thought – how cool would it be to grab the whole post? I had the spreadsheet, I had the formulae, I wasn’t likely to repeat the process – why not go whole hog? I went down as far as B70. Even then some posts were truncated.
And because I’m an awkward so-and-so, I constructed the formulae to transpose the info (columns to rows…). This made the data structure manageable.
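
The transpose-and-truncate idea can be sketched in a few lines of Python. This is only an illustration of the shape of the data, not the spreadsheet itself: the function name and the sample post are hypothetical, and the default width of 70 simply echoes the B70 limit mentioned above.

```python
def column_to_row(cells, width=70):
    """Lay one post's column B values out as a fixed-width row.

    Returns the row plus a flag saying whether the post was longer than
    `width` cells and so lost its tail.
    """
    row = list(cells[:width])
    row += [""] * (width - len(row))  # pad short posts with blanks
    return row, len(cells) > width

# Hypothetical post: title, two dates, then two paragraphs
post = ["My title", "01/02/2009 09:00:00 AM", "01/02/2009 09:05:00 AM",
        "First paragraph", "Second paragraph"]
row, truncated = column_to_row(post, width=8)
```

Returning the truncation flag alongside the row makes it easy to spot which posts would need a wider grid.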

5) load the files
Finally I was satisfied with my formulae. It was time to load the HTML files into Excel. I did this in stages. First I loaded 5 files – to check how it handled, and how the data came through; then I did 20 files; then 100 – then the balance. There were 600 in all (euchh). It took 20 minutes to get them all open at the same time.

6) formulae notes
A11:A610 – filenames as text:
e.g. in A11 I had 506634748
in A12: 506646849
You’ll notice my HTML files had numbers for names.

C10:BY10 – text entries for indirect formula – cell reference
in C10: B1
in D10: B2
in E10: B3
all the way to BY10: B75

B11:B610 – indirect formula – file reference
in B11 ="'"&$A11&".html'!"
in B12 ="'"&$A12&".html'!"


C11:BY610 – generate indirect references
in C11 =$B11&C$10
in C12 =$B12&C$10

in D11 =$B11&D$10

CC11:EY610 – evaluate indirect references
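
To make the grid easier to picture, here is a small Python sketch that builds the same reference strings the formulae in columns B and C onwards produce. It only assembles the text; it does not evaluate anything the way INDIRECT does against open workbooks, and the function name is made up for illustration. The real spreadsheet was a little more involved than this.

```python
def build_references(file_stems, last_row=75):
    """Mimic the grid: one row per file, one reference string per cell B1..B<last_row>."""
    cells = [f"B{n}" for n in range(1, last_row + 1)]   # row 10: B1, B2, ...
    grid = []
    for stem in file_stems:
        prefix = f"'{stem}.html'!"                       # column B: file reference
        grid.append([prefix + cell for cell in cells])   # columns C onwards
    return grid

# The two file names quoted in the notes above, cut down to three cells each
grid = build_references(["506634748", "506646849"], last_row=3)
print(grid[0])
```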

7) It was a little more complicated than that, but I think that’s sufficient flavour.

So, going back to those old blog posts, I thought it worthwhile to dig out those on Plato’s Republic.
The Republic
Philosopher Kings and Forms (to the Corruption of Philosophy)
The Prejudice Against Philosophy
The Good as Ultimate Object of Knowledge
The Simile of the Sun
Moving onto the Cave

Maybe sometime, I’ll dig more out. Perhaps those relating to when the credit crisis exploded in everyone’s faces, during Gordon Brown’s tenure.

Right now, I need to get back to writing.


