Introduction
Previously: I started looking at The CerebralRift due to a couple of things that I needed to accomplish. I knew I needed to change my backup system, and I needed to make certain I had removed all links to The CerebralMix. Most pressing was changing my backups to (a) address hosting service changes, and (b) cut the expense of the plug-in I’d implemented 3-4 years ago. Less pressing, but still important, was looking at and cleaning up the aftereffects of migrating CerebralAudio to its new home. However, I found a bomb had gone KBOOM on the site, and an audit pointed to an immediate need to do a lot more:
- Clean up and remove plug-ins that either: (a) were unused, (b) would be made redundant during the upgrade, or (c) were having a performance impact on the site. And there was the matter of a migration between plug-ins I’d forgotten.
- There was a need to reorganize the appearance of the website. I needed to address structural issues, I needed to remove irrelevant references, and I needed to re-focus the interface on the content.
- Overall the site seemed a bit sluggish. Not horrible, but not optimal.
- And then there were content issues. A massive pile of them:
- 1300-1500 broken links.
- Articles that were no longer relevant to the site.
- A cleanup needed on the majority of the content I had started over a year ago, but never completed.
- A massive pile of images that were (a) no longer relevant, (b) extra copies of images in sizes that weren’t needed, (c) images that weren’t optimized correctly, and (d) images that had broken for unknown reasons.
All of this played into another need: addressing the future of the CerebralRift and CerebralMix. Both have been on my mind since the beginning of the year. Both had been set aside due to personal life-changing events, my focus on CerebralAudio, and my desire to start some other creative endeavors (under my new moniker ‘Unattributed’). Recently, I had started to design a new plan. It was somewhat vague at this point, but the process of refreshing the CerebralRift has brought some new clarity.
But before I could move forward with new plans, I needed to focus on the task at hand.
Content Cleanup, Round One
The focus of the CerebralRift has always been the social and artistic commons. Most of this has involved Creative Commons licensed works. And the vast majority of the content has been music reviews, along with some interviews, a few editorials and feature stories. The first thing I tackled was locating the content I felt would no longer fit the site. These articles were in the news and editorial sections. I was a little stressed over this: some of the articles mixed content that I wanted to keep with content that didn’t fit. In this case, I kept any article with content that fit my renewed focus for the site. (Eventually I may pull those pieces out and write new content to replace the original articles.) That reduced the article count by approximately 20 articles.
Next was the big one: The reviews go all the way back to 2010, and there are close to 500 of them. Over the eight years I’ve been writing, my standards have changed. There wasn’t time to rewrite all the posts to fit the current standard, so I came up with a set of guidelines for refreshing them:
- Make the heading format the same: list the release title, artist, date and license. Other information would be moved.
- The ratings and release links needed to move to the new format at the bottom of the review.
- Album / release artwork needed to move to the featured image.
- Ratings, pros, cons, and links moved to the bottom of the article. This is the plug-in conversion previously mentioned.
- Remove additional media, attachments and embeds (with a few exceptions).
- Validate the links and availability of the release.
The idea was to avoid major edits to the articles while making them more consistent. Of course, there were occasional typographic errors that I caught along the way, but I didn’t go out of my way to scan the articles for all such errors and mistakes. In some cases edits were made to remove references to embeds and attachments.
Removing the additional media, attachments and embeds deserves a bit of explanation as it was something of a painful choice. When I originally started The CerebralRift, it wasn’t about reviewing Creative Commons music releases. It was a personal website; in fact, it was based on a wiki format. When I moved to a CMS-based system and decided to start writing about some of the music I was listening to, I still didn’t have plans for it to be a Creative Commons Journal.
I started attaching copies of some tracks, thinking people would be interested in hearing the music while reading the articles. Embedding players wasn’t well supported at this point; many sites didn’t even have embeddable players. However, I quickly realized attaching tracks was chewing up a lot of storage. I couldn’t afford to buy extra storage, so I stopped uploading attachments. When embeds became widely available I started using them. My thinking was that readers would like to sample a release before going to the actual website to check it out. I also thought this would benefit me, since people would stay on the page longer (which, it turns out, isn’t true).
The Painful Reality
When I performed an audit of the site several things became apparent:
- In a few cases the attached media had broken, which served no purpose.
- The old reviews weren’t getting much traction.
- There were a lot of embeds that had broken.
- Broken embeds were affecting page loads negatively. Very negatively, sometimes taking up to 10 times longer to load the page completely. (Note: in most cases the article would display, but the page wouldn’t finish loading.)
It was clear something needed to be done about this situation as part of the refresh. The easiest option was to remove the embeds, but I didn’t like that idea. It seemed like a lazy way out: surely I could use my broken links checker to fix broken embeds, right? Then I started thinking about the function of a review from my perspective, from the reader’s perspective, and from the artist’s and/or label’s perspective. I had prided The CerebralRift on providing as much of an “all-inclusive” service as possible. The reader could:
- Read the review.
- See the artwork.
- Listen to the release.
- Get the release.
What I hadn’t considered is this meant that The CerebralRift was functioning as a filter. In this case, the filter doesn’t serve an important part of the community: the artist and label. Yes, they might see a few extra plays if a reader listens on the embed. But that isn’t guaranteed (as I mentioned above, I found no additional time spent on reviews with embeds or attached media). But more importantly: the reader doesn’t interact with the artist or label directly in many cases. By including the embeds, I had removed the incentive to go to a label or artist’s website, explore their content, and possibly obtain something they would like.
That is when I decided that the embeds were toast on the site. And it was that thinking, along with the restructuring of the posts, that led me to the revised structure in three primary sections:
- Block information at top: artwork, discography info, synopsis / teaser.
- Main body of the review.
- Summary at the bottom: pros, cons, pricing info and links.
And with that idea in mind I set about editing the reviews, one by one. Moving pieces around, putting them in the new structure. Making minor changes to the text to remove references to embeds. And, of course, removing the embeds themselves. It didn’t take too long (I was probably 20 to 30 articles into this process) before I made a discovery that pissed me off, and inspired the rant that follows.
WTF Is Wrong With Embed Developers?
There is a principle in software development, and especially in web development: if a piece of software is going to fail, it needs to fail gracefully. Obviously many developers working on applications with embedded elements don’t know or understand this principle.
Why do I say this? Very simple: after updating 20-30 reviews, I noticed that my broken link count was down, as expected. But I didn’t expect that it would be down by nearly 200 links. Yes, 200 broken links disappeared from my site just by removing the embeds from 20-30 articles.
That would mean that each embed accounted for somewhere between 6 and 10 broken links. But wait, it gets better (or worse depending on your perspective): not all of those articles had embeds in them… When I was writing about The Faust Cycle by Ergo Phizmiz I couldn’t attach the release as it was a massive 14-hour-long work… I also couldn’t embed a copy of it because it wasn’t available to embed.
Nearly one third of the articles I had edited didn’t contain embeds. So, it was only 12-22 articles that I edited to see the broken link count go down by approximately 200 links. That’s more like 9 to 17 broken links per article. I couldn’t believe my f****ing eyes when I saw that count.
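For anyone who wants to check my math, here is a quick back-of-the-envelope sketch in Python. It only uses the rough figures quoted above (about 200 links, 20-30 edited articles, 12-22 of them with embeds); the function name is made up for illustration. The results land in the same ballpark as my estimates, nothing more precise than that.

```python
def links_per_article(removed_links, min_articles, max_articles):
    """Return the (low, high) range of broken links per article.

    The low end assumes the links were spread over the most articles,
    the high end assumes they were spread over the fewest.
    """
    return removed_links / max_articles, removed_links / min_articles


# Rough figures from the text; all of them are estimates, not exact counts.
REMOVED_LINKS = 200

all_low, all_high = links_per_article(REMOVED_LINKS, 20, 30)
embed_low, embed_high = links_per_article(REMOVED_LINKS, 12, 22)

print(f"All edited articles:  {all_low:.1f} to {all_high:.1f} links each")
print(f"Articles with embeds: {embed_low:.1f} to {embed_high:.1f} links each")
```

Either way you slice it, that is roughly ten or more broken links for every article that carried an embed.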
So I started verifying: I found more articles with broken embeds and looked at the web pages that were generated. Sure enough several of them hadn’t failed gracefully. They had generated code for the embed, but the links weren’t valid in that code. There was no attempt made during the API call to verify that the links that were being inserted into the embed were valid.
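To make "fail gracefully" concrete, here is a minimal Python sketch of the kind of check I would have expected. It is not any provider's actual code or API: the function names (`link_is_alive`, `render_embed`) and the parameters are hypothetical, and a real service would presumably do this on its own servers. The idea is simply to verify every link an embed depends on before generating the player, and to fall back to a plain link when something is missing.

```python
from html import escape
from http.client import HTTPException
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def link_is_alive(url, timeout=5.0):
    """Return True if a HEAD request to an http(s) URL succeeds."""
    if urlparse(url).scheme not in ("http", "https"):
        return False
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, HTTPException):
        # OSError covers URLError/HTTPError, timeouts and connection failures.
        return False


def render_embed(player_url, track_urls, fallback_url, is_alive=link_is_alive):
    """Hypothetical renderer: emit a player only if all of its links resolve.

    If anything is dead, degrade to a single plain link instead of
    shipping markup full of broken URLs.
    """
    if all(is_alive(url) for url in [player_url, *track_urls]):
        return f'<iframe src="{escape(player_url, quote=True)}"></iframe>'
    return f'<a href="{escape(fallback_url, quote=True)}">Listen on the release page</a>'
```

The `is_alive` parameter is there so the check can be swapped out (for a cached lookup, say) without touching the rendering logic. None of the broken embeds I looked at did anything like this check.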
This makes me angry for several reasons. The first is the obvious: their broken embed code was affecting the performance of my website in multiple ways. In fact, the negative impact of their code was having a magnified effect on the ranking of my website with all of the indexing sites: Google, Yahoo, Bing, Yandex, etc.
But second, and this is more important, I don’t know if they were introducing a potential security risk to my site.
I admit to being a little paranoid, but the majority of security issues happen because of broken code. The majority of security breaches happen because hackers are able to inject bad data into a site and exploit something that is broken or designed incorrectly. The fact that these embeds were failing badly leaves me thinking that there was potential for exploitation. Fortunately, since this site hasn’t had a lot of traffic for a while, no one thought to try to exploit them.
(While I am being a bit paranoid here, security is always worth being paranoid about in my opinion. Even with a low traffic site, I have caught hackers attempting to break into my site. I have had to take actions to prevent security breaches.)
Honestly, I am now happy that I removed the embeds from the site. If the developers of the world cannot make certain that their embeds fail gracefully, don’t impact site performance, don’t cause major negative effects with website ranking, and don’t raise potential risks to site security, I don’t want to infect my site with their crappy code. (Note: I didn’t name which embeds I found the problem(s) with. It wasn’t just one site. It was several of them… So, this wasn’t just a situation where one provider has a bad developer or two. It is a situation where it appears there is a bad standard set for web development.)
The Long Grind
It was a long grind to go through every article on the site and restructure it. Along the way I made a couple of other discoveries that I want to rant about in part four of this series.