Sifting and ranking millions of websites into a top 50 or so is difficult, obviously. (I say 50 because people probably rarely dig further than a few pages.)
Most of the time, Google's algorithm works amazingly well considering the scale of the challenge, especially considering that spammers are constantly trying to take advantage of it.
But I think there might be room for improvement.
Google definitely seems to rely at least partly on the popularity of a site (as measured in click-throughs) to rank it. This is clearly open to all sorts of abuse, hence I think ranking should be more about the merit of the text than the number of clicks.
Popularity is all well and good, but just look at some of the things that are popular. I'm not being snobby about this - I can enjoy I'm a Celeb as much as the next person, but it would be a bit annoying if I had to craft a fiendishly complex search term or scroll down 1,000 results before I could watch Newsnight.
A choice of algorithms would help to fix this problem. One could search by popularity, another by subject or level of language used, and so on. I think it would be a good idea if there was a learning algorithm that was manually adjusted by actual people moving things up and down the rankings - experts in their field, perhaps, though obviously there'd be issues of staffing and bias to address. A textual meritocracy approach might also enable blogs to be ranked more fairly alongside traditional websites - at the moment, although they do appear in normal search results, they seem to be somewhat ghetto-ised into a specific blog search option.
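To make the 'choice of algorithms' idea a bit more concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the page titles, the click counts, and especially the `reading_level` score, which stands in for whatever measure of 'level of language' a real search engine might use. It is not how Google works, just a picture of the same results being ordered by different rules.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    clicks: int           # popularity signal (made-up numbers)
    reading_level: float  # hypothetical 'level of language' score, higher = more advanced


# Invented sample results for a single search
PAGES = [
    Page("I'm a Celeb highlights", clicks=90_000, reading_level=3.0),
    Page("Newsnight: full episode", clicks=4_000, reading_level=9.0),
    Page("Haggis recipe forum thread", clicks=700, reading_level=6.0),
]

# Each ranking 'algorithm' is just a different sort key
RANKERS = {
    "popularity": lambda page: page.clicks,
    "language": lambda page: page.reading_level,
}


def search(pages, ranker):
    """Return page titles ordered by the chosen ranking rule, best first."""
    return [p.title for p in sorted(pages, key=RANKERS[ranker], reverse=True)]


if __name__ == "__main__":
    for name in RANKERS:
        print(name, "->", search(PAGES, name))
```

The point of keeping the rankers in a dictionary is that the searcher, not the search engine, picks the rule.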
Facebook and Twitter are currently moving vast swathes of text, information and human interaction out of the reach of search engines. Of course, I'm not a Zuckerberg-style anti-privacy zealot, far from it. People should always be able to communicate conveniently and privately with their friends via computers. That's what email and instant messaging are for.
But where would we be if every little question asked and answered vanished into the depths of Zuckerberg's private network? What if, one day, you wanted to look up that strange error message your computer was giving you (or, to take a more life-and-death example, imagine you needed to know how to cook a haggis) and the only results that came back in Google were those spam sites that are increasingly clogging up the internet - because no-one was talking on forums or contributing to public comments any more?
No doubt there's also a lot of text stored in Facebook that is of no value whatsoever, and the internet would be a poorer place for having it, but that's the job of search engines - sorting the wheat from the chaff.
Not only is text vanishing from the internet, but so is information about that text. If hyperlinks to interesting content are increasingly shared via non-searchable social networks, then search engines will be increasingly starved of information used to determine what is popular or useful. Facebook delivers a double blow to the internet - not only is it syphoning off content, it may also be making search engines less able to sort the information they already have.
It's not just Facebook. Many websites are increasingly making use of facilities that can't be seen or used without enabling JavaScript. This is annoying for people like myself who value their online security and therefore use NoScript, but it creates a much greater problem. For example, look at the comment facility on this very website, provided by IntenseDebate. It's convenient for me to use because I didn't have to write the comments system myself, and all the storage and processing is handled by IntenseDebate's servers, elsewhere. But because it uses JavaScript, it can't easily be indexed by Google, or by any other search engine for that matter. The comments left on this website do not appear in any search engine.
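Here is a rough sketch of why script-loaded comments go missing, written in Python using only the standard library. The page markup and the `loadComments` call are made up for illustration (they are not IntenseDebate's real code), and the 'crawler' is a deliberately simple stand-in that reads the HTML but never runs any scripts.

```python
from html.parser import HTMLParser

# A made-up page: the article text is in the HTML itself, but the comments
# would only appear after a browser runs the script.
PAGE = """
<html><body>
<h1>My article</h1>
<p>Article text that is part of the page itself.</p>
<div id="comments"></div>
<script>
  // Hypothetical call: a browser would fetch comments from a third-party
  // server here and insert them into the div above.
  loadComments("comments");
</script>
</body></html>
"""


class TextOnlyCrawler(HTMLParser):
    """Collects visible text and skips script bodies without running them."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Keep non-empty text that is not inside a <script> element
        if not self.in_script and data.strip():
            self.text.append(data.strip())


def crawl(html):
    """Return the list of text fragments a script-blind crawler would see."""
    crawler = TextOnlyCrawler()
    crawler.feed(html)
    crawler.close()
    return crawler.text


if __name__ == "__main__":
    print(crawl(PAGE))
```

The article's own text is found, but the comments div comes back empty, because nothing ever ran the script that would have filled it.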
This article may also be useful for other versions of Windows, though it's difficult to say - Windows XP is the only version of Windows that I run. Why? Because it's a good, solid version that still has a few years left in it (support for Windows XP ends April 2014) and it has quite modest demands in terms of how modern and fast a computer it requires. The less your computer is being overrun by Windows' demands, the more of its resources are available to the actual programs you want to run. That's why, if you look at the minimum hardware requirements for computer games, software developers tend to specify more memory and processor power for Vista and Windows 7 than they do for Windows XP. In many ways, it pays to stay with the oldest version of Windows that Microsoft still supports that also meets your requirements.
Follow this advice at your own risk, and remember that with today's delightfully complex operating systems, there is no such thing as a completely hack-proof, internet-connected computer. No, not even a Linux or Apple Mac PC. I've tried to make this article as understandable as possible, though you will need some familiarity with various IT concepts. In an attempt to offset this, I'll try to provide helpful links to explain things where I can. Be sure to read and understand the entire article before attempting to follow the steps.
Cars are increasingly being fitted with GPS screens, trip computers wired into the engine management computers and on-board entertainment systems that can display TV, play MP3s, and even connect to the internet.
All great stuff, but computer security can be really complicated, so I hope car manufacturers' software engineering is more rigorous than that often found in the home computing sphere.
It's fine if all these systems are separate from each other, but convenience, interoperability and efficiency suggest that all of these systems will get knitted together into one in-car IT system. Once this happens, a vulnerability in one system could open the doors to exploitation of all the other systems. For example, a music CD burned with some encoded malware might be able to instruct the engine's fuel injection system to shut down, disabling the engine. In my opinion, engineering and diagnostic functions should never be accessible via a method that could be potentially invisible to the car owner. The possibilities for a car equipped with mobile internet connectivity could be horrifyingly endless.
I like in-car gadgets, but I hope the car industry has learned from the mistakes of other sectors and is keeping its various systems completely separate, or at least keeping the security and engine management systems apart from everything but immobilisers and diagnostics.
I'll be surprised and impressed if we don't hear of cars being hacked within the next few years.
From the standpoint of someone who makes some effort to keep up with the very fast-moving world of IT (sometimes needlessly fast, in my opinion), car manufacturers often seem very slow to innovate. Maybe this is a result of caution about new technologies, and of the extensive testing and engineering involved.
Progress is generally a good thing. Few people today would want a car that had to be started with a crank handle because it lacked a starter motor. Similarly, most people would not want to be without an engine management computer, as the computer can generally adjust the fuel/air mixture better than a human operating a manual choke, and manage a number of other engine parameters better than the old mechanical methods that were used.
Since my last article on the amazing Raspberry Pi miniature computer, the Raspberry Pi team have been extremely busy.
The specifications as reported in my original article have now changed, and there are now two different models planned.
Both models are equipped with a 700MHz ARM11 processor. I did some rough calculations around the time of the Raspberry's announcement and decided the processing power would probably be roughly equivalent to a Pentium III running at about 600MHz. Delvings into the Raspberry Pi website suggest it might be closer to a PII or PIII@250MHz. However, it should be noted that it's very difficult to compare the performance of different processors fairly in an easy way. The proof of the Raspberry Pi will, of course, be in the real-world feel of how it performs - something we might discover by the end of November, which is the earliest mentioned launch for their initial 10,000 unit production run.
Both models are based on a Broadcom BCM2835 chip, with a surprisingly powerful on-board OpenGL ES 2.0-capable VideoCore IV graphics processor (my rough guess is that it's somewhat better than an old Nvidia GeForce 2).
The Raspberry Pi running Quake 3.
This means that the RasPi will probably be capable of running all my favourite Linux games.
The graphics card will share the memory available to the system, and the amount of RAM it reserves (and removes from the memory available to the system) will be variable and can be customised depending on need.
All versions of the Pi will come with a full-sized SD memory card slot. This will provide a solid-state hard drive and swap memory facility. It could also be used like a video game cartridge slot, since the graphics chip will boot whatever Operating System (OS) it finds on the card.
The 140-character limit encourages the use of URL-shortening services such as Bitly.
The problem with such services is firstly, that you can't see what website you're being redirected to - it could be the BBC, or Goatse, or a site infested with malware and viruses. There's no way of knowing - hovering over the link tells you nothing, unlike normal links.
Secondly, the way the URL-shortening service works means that anyone who clicks a shortened link is briefly passed through the shortening service's webservers, meaning they can track everything you click. At best, they'll keep that data to themselves. A slightly worse scenario might be that they make that tracking data public. Just like Bitly does.
The worst case scenario is they use a combination of tracking data, cookies and information gathered from other sources, such as advertising, to build a profile of the people you know or follow on Twitter and what things interest you, and then sell that data to the highest bidder.
It would have taken 9 tweets to explain this simple point. That's another thing I don't like about Twitter.
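For the curious, the two problems with shortened links described above can be sketched in a few lines of Python. This is a toy model, not Bitly's actual system: the `short.example` domain, the class and the visitor names are all invented, and a real service would issue an HTTP redirect rather than return a string.

```python
from collections import Counter


class Shortener:
    """Toy URL shortener that logs every click it redirects."""

    def __init__(self):
        self.targets = {}    # short code -> real destination
        self.click_log = []  # (visitor, short code) pairs, one per click

    def shorten(self, code, url):
        """Register a destination and return the short link people will share."""
        self.targets[code] = url
        return f"https://short.example/{code}"  # hypothetical domain

    def resolve(self, code, visitor):
        """What happens on a click: record who clicked, then hand back the target."""
        self.click_log.append((visitor, code))
        return self.targets[code]

    def profile(self, visitor):
        """Count the destinations one visitor has been sent to."""
        return Counter(
            self.targets[code] for who, code in self.click_log if who == visitor
        )


if __name__ == "__main__":
    service = Shortener()
    link = service.shorten("a1", "https://www.bbc.co.uk/news")
    print(link)  # nothing in the short link reveals the destination
    service.resolve("a1", visitor="alice")
    service.resolve("a1", visitor="alice")
    print(service.profile("alice"))
```

The short link gives away nothing about where it leads, and the service in the middle ends up with a tidy log of who clicked what.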
David Braben, the programmer famous for the classic Elite and Elite II: Frontier games, is heading up a charitable foundation that aims to provide a £15 Linux PC the size of a memory stick, whilst providing today's schoolchildren with access to the sort of flexible computing experience that was more common in the past.
The Raspberry Pi is (according to the provisional specification) based on the ARM11 700MHz processor. It's to be loaded with 128MB of memory, a USB2.0 connector and a composite/HDMI video output. This means it can use a TV as a display, while the USB connector allows cheap and standard accessories to be plugged in, including mice and keyboards (and much more). It will also have an SD/MMC/SDIO memory slot and a 'general purpose' interface. The operating system will be Ubuntu, and software will include Iceweasel (a re-branded version of the popular Firefox web browser), KOffice (similar to Microsoft Office) and the Python programming language. Braben says he hopes the Raspberry will be ready within the next 12 months.
128MB might sound like a very small amount of memory for a modern operating system, so after reading about Raspberry Pi I tried installing the latest version of Debian Linux (Ubuntu is based on Debian) on an old 128MB machine*. After removing some of the unneeded software components, it was actually perfectly usable, if a bit slow at times. These are only the provisional specifications, though, and I expect Raspberry Pi will strip the software down more carefully than I did. Also, I was using KDE4 as my desktop manager (which is roughly equivalent to Vista/Windows 7 in terms of fancy graphics and polish), whilst Ubuntu's default desktop environment is probably a little lighter on the processor and memory.
In addition to the usual tricks used in Search Engine Optimisation (SEO), there's an aspect of the way Google ranks pages that seems to get little attention, yet if it's true then it opens a potentially serious weakness in Google's rankings, and possibly in their entire business. I suspect that this flaw, if not corrected, might forever place them at the mercy of social networks of one sort or another.
Google is the world's favourite search engine. In theory that means getting a website displayed on or near the front page could make a huge difference to the amount of traffic it receives, and that makes page ranking worth money. The internet, as they say, is serious business.
Naturally, that leads to people studying the way Google works and then adjusting their websites to fit the pattern. Thus, the SEO industry was born.
I've learned a fair bit about SEO over the years, though I don't deliberately optimise this site for rankings. I just designed it in such a way that it would have a fair chance, nothing more.
SEO is complicated, but the important parts of it are pretty well known - have good content and acquire incoming links. There's all sorts of technical bits and pieces too, but this isn't meant to be a technical article so I won't go into all that.
Supposedly, one of the ways that Google judges the quality of a webpage is by how many people do a search and then click on that website.
That seems sensible, but it opens Google's rankings to being gamed. It would be easy to make a page look more popular than it really is. Friends of a website owner could repeatedly do searches and click the link, making it rise up the rankings. Even worse, companies could recruit hundreds of people across the country or the world and have them click links for their clients. It would be very difficult for Google to figure out which clicks were genuine. Worse still, a well-designed botnet could do the same thing, with the owners of the compromised machines having no idea that their machine was quietly doing SEO on behalf of a hacker.
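As a back-of-an-envelope illustration of how little effort that gaming might take, here is a small Python sketch. It assumes, purely for the sake of argument, that rank is decided by click count alone, which is certainly a simplification of whatever Google really does. The site names and numbers are invented.

```python
def rank_by_clicks(clicks):
    """Order sites by click count, most clicked first (a deliberately naive ranking)."""
    return sorted(clicks, key=clicks.get, reverse=True)


# Invented click counts for one search term
genuine = {
    "useful-guide.example": 500,
    "average-site.example": 300,
    "spam-site.example": 20,
}

# The same counts after 100 recruited clickers each click the spam link 5 times
rigged = dict(genuine)
rigged["spam-site.example"] += 100 * 5

if __name__ == "__main__":
    print("Before:", rank_by_clicks(genuine))
    print("After: ", rank_by_clicks(rigged))
```

In this toy model, five hundred fake clicks are enough to lift the spam site from last place to first, and nothing in the raw counts distinguishes them from real ones.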
The interesting thing about this scandal is that it very much looks like a case of the stupid being caught and the more deviously clever getting away, as is so often the case.
From what I've read, it appears that (most of?) the phone 'hacks' have consisted of celebs having their mobile phone voice mail accessed by the press because they didn't bother to change the default PIN on their voice mail service.
Mobile phone providers generally provide a service whereby if you ring a mobile number and get put through to voice mail (for example by not answering the phone), then you can not only leave a message but also listen to recorded messages if you know the PIN. Not changing the default PIN is rather like leaving your front door unlocked. Yes, the mobile phone companies should do more to make customers change the default number. Yes, it's unethical and perhaps downright criminal for anyone to take advantage of someone who doesn't know what they're doing. Yes, it's foolish to not read the manual and secure your voice mail.
My reason for highlighting the stupidity of the celebs for not changing their PINs is not to ridicule them (who hasn't made a mistake of this sort, at some point?) but to point out that the simplicity of this so-called hack means it is easy to do, and also easy to track down and catch. The unsaid reverse of this is that there are probably much more sophisticated hacks currently undetected and unreported.
On a 'social engineering' level, it would be somewhat surprising if there wasn't some bribery or blackmail going on within the low-paid workers of major communications companies, such as Virgin or BT. As communication hubs for telephone and internet, they'd be obvious and valuable targets, and the people who work there who have access to the recordings, logs and traffic probably aren't paid enough for all of them to resist bribery, nor sufficiently vetted to resist blackmail. If that sounds far-fetched, then perhaps you didn't read the news stories quite recently about phone banking call centre staff giving up information about their clients for money in their lunch breaks, as reported in the press. I'll add a link if I can find it again.
The website is now three years old. To mark the occasion, I have finished working the Intense Debate software into my Content Management System and public comments are now enabled across the site.
Fair Pay launched in November 2007 as the official website of the Fair Pay Action Group. It was a response to the pay cuts brought in by a Labour Council in the name of the Single Status Agreement. The Single Status Agreement was finalised by the unions in 1997 and was designed to implement the 1970 Equal Pay Act in 2007, and also work as a defensive measure against expensive tribunals brought against councils in the name of the Act. This site, and the group's members, worked to oppose the cuts, publicise the workers' plight and lobby councillors and politicians.
Today Fair Pay is fully independent and no longer associated with a particular council, nor is it part of the Fair Pay Action Group.
One of the most popular suggestions made to the Government's Spending Challenge website is for the public sector to start using more Open Source software.
This would mean replacing things like Microsoft Word with something like OpenOffice Writer.
For almost every piece of software in common use in the public sector, there is a free alternative. Imagine how much money could be saved by replacing paid-for software with free Open Source alternatives. For every PC in the public sector, about £200 could be saved by replacing Windows with Linux and Office with OpenOffice. Considering how many public sector workers have a PC, and that there are 6 million workers in the public sector, there are some serious savings to be made.
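To put a rough number on it, here is the sum worked through in Python. The £200 and the 6 million are the figures quoted above; the share of workers who actually have a PC is not something I have a figure for, so the three shares below are pure assumptions chosen to show the range.

```python
SAVING_PER_PC = 200                # pounds per PC, the figure quoted above
PUBLIC_SECTOR_WORKERS = 6_000_000  # the figure quoted above


def estimated_saving(share_with_pc):
    """Estimated saving in pounds if this share of workers each has one PC."""
    return round(SAVING_PER_PC * PUBLIC_SECTOR_WORKERS * share_with_pc)


if __name__ == "__main__":
    # The shares are assumptions, not data
    for share in (0.25, 0.5, 1.0):
        print(f"{share:.0%} of workers with a PC: £{estimated_saving(share):,}")
```

Even on the most cautious of those guesses the total runs to hundreds of millions of pounds, though this ignores the costs of retraining and migration.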
Free, Open Source software is widely used - last time I checked the figures, it was estimated that around half the world's websites run on OSS - including this one. If I used paid-for commercial alternatives, I would be bankrupt by now.
So I guess you'd expect me to be an enthusiastic supporter of Open Source software in the public sector, right? Yes - but with a few small reservations.
First, the 'yes' part of the equation.
I have little doubt that widespread use of Open Source software would bring some massive savings. Additionally, Open Source software often performs better than commercial offerings, and as a result it will usually run much faster on the same, or older hardware. This means that IT departments could keep the same old computers for much longer, reducing upgrade costs for years to come. Local authorities regularly throw away machines far more powerful than the Linux machine I use as a development PC, file server, database server, and all-round workhorse. (It's an 800MHz Pentium / 384MB RAM, for the interested).
COINS records government expenditure and categorises it by various headings, such as government department, project, account or month. The categories are fiendishly difficult to translate into meaningful, real-life things. If you'd like to know exactly how much the previous government spent on ID cards, for example, you'll be disappointed. According to The Guardian, they've cunningly hidden the figures in a general 'identity and passport' category.
To anyone familiar with local government expenditure, the released data may resemble council budget codes and expenditure records.
The data will likely provide some good information on where taxpayers' money has been spent, but it's probably vague, obfuscated and in some cases perhaps misleading.
For example, money might be spent purchasing assets and services in one category, whilst a different department begins monthly payments to the first department against a different spending category to the same department, and a few months later the entire spend against services is mysteriously refunded, and 2 years later the assets are amortised... (see the Olympic funding contribution category for an example of this sort of confusion)
It would take serious time and effort to uncover useful expenditure information from amongst the inter-departmental cost-code juggling and accountancy-speak.
Hopefully future expenditure data releases will be more straightforward - all central government expenditure over £25,000 and all local government expenditure over £500 is to be published online from November.
The newly published database is difficult to work with and requires access to a high performance database system such as Oracle, MySQL, PostgreSQL, or Microsoft SQL Server - there are over three million records in the files, which is more than Microsoft Excel can load and is probably impractical for Microsoft Access.
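As a rough idea of what 'working with it in a database' looks like, here is a minimal Python sketch. It uses SQLite (which comes built into Python) as a small stand-in for the database systems named above, and it assumes the data has already been converted to a CSV file with `department`, `category` and `amount` columns. Those column names and the sample rows are my own invention, not the real COINS layout.

```python
import csv
import io
import sqlite3


def load_spending(conn, csv_file):
    """Stream CSV rows into a table one at a time, so the file never sits in memory."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spending "
        "(department TEXT, category TEXT, amount REAL)"
    )
    reader = csv.DictReader(csv_file)
    conn.executemany(
        "INSERT INTO spending VALUES (?, ?, ?)",
        ((r["department"], r["category"], float(r["amount"])) for r in reader),
    )
    conn.commit()


def totals_by_department(conn):
    """Return a dictionary of total spend per department."""
    query = "SELECT department, SUM(amount) FROM spending GROUP BY department"
    return dict(conn.execute(query))


# Invented sample rows in the assumed layout
SAMPLE = (
    "department,category,amount\n"
    "Home Office,identity and passport,120.5\n"
    "Home Office,policing,79.5\n"
    "DCMS,Olympic funding contribution,300\n"
)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_spending(conn, io.StringIO(SAMPLE))
    print(totals_by_department(conn))
```

With a real file you would pass `open("some_file.csv", newline="")` instead of the sample text and connect to a database on disk rather than `:memory:`. Whether SQLite would be comfortable with the full three million records is something I haven't tested.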