Why We Love Google Custom Search Engines for People-finding

When we talk to people about the kinds of tools and techniques that we apply during searches for specific people or profiles, many are surprised when we say that one of the most effective tools in the identification phase of a search is Google.

Often, we find that people are so engrossed in working out which sources of candidates to use (CV databases, social networks and so on) that they forget two things: the vast majority of the web is open, free and indexed, and an increasing number of people like to keep information about themselves on this free, open and indexed web.

Indeed, in the right hands, Google is a fantastic hunting tool.

Or at least that’s the theory. Because as anyone who has tried to use Google (or Bing, or any other search engine for that matter) will tell you, the minute you try to move beyond the relatively simplistic to the relatively complex, you’ll soon find yourself somewhat frustrated.

Let’s start with an example. It’s relatively simple to search for, say, people who mention Java on some sort of personal site or page. We just use something like:

java AND (intitle:“about me” OR intitle:profile OR intitle:CV OR intitle:resume)

This returns a decent number of results:

Now obviously this set of results is pretty useless to us in its current form (unless we genuinely have a use for such a wide and geographically diverse set of results, as well as the ability to handle such a volume of data). But it does prove something of a point: we have eight and a half million pages containing information about people who might be Java candidates.

But let’s see what happens when we try to refine our set of results. Now you could do this by location (by including as many search terms as you need to identify which of these results might correspond to people in certain locales). In this case we’re going to do it by mention of specific universities. Why? Because, as it happens, we encountered this very problem when we were recently engaged on a project to look for Java people who had studied a particular subject at university and who spoke Japanese.

Amongst the many different techniques we used, it occurred to us that we could identify which Japanese universities taught the particular subject in question (which we’ll keep secret, for no particular reason), and run a Google search for profiles of, or pages about, people who went to one of those universities.

Sounds simple enough.  Here’s our search string:

Java AND (“Waseda University” OR “University of Waseda” OR “University of Tokyo” OR “Tokyo University” OR “University of Tsukuba” OR “Tsukuba University” OR “NAIST” OR “Nara Institute of Science and Technology” OR “Nagoya University” OR “University of Nagoya” OR “JAIST” OR “Japan Advanced Institute of Science and Technology” OR “Keio University” OR “University of Tokushima” OR “Tokushima University” OR “Kyoto University” OR “Hokkaido University”) AND (intitle:“About Me” OR intitle:profile OR intitle:CV OR intitle:resume)
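A string this long is tedious to type and easy to get wrong by hand. As a minimal sketch (Python is our choice here, not something the search requires), it could be assembled from plain lists instead. The function names `or_group` and `build_query` are made up for illustration, and the lists are shortened versions of the ones in our string above.

```python
def or_group(terms, prefix=""):
    """Join terms into a parenthesised OR group, quoting multi-word phrases."""
    parts = []
    for term in terms:
        quoted = f'"{term}"' if " " in term else term
        parts.append(f"{prefix}{quoted}")
    return "(" + " OR ".join(parts) + ")"


def build_query(keyword, universities, title_terms):
    """Combine a keyword, a university OR group and an intitle: OR group."""
    return " AND ".join([
        keyword,
        or_group(universities),
        or_group(title_terms, prefix="intitle:"),
    ])


# Shortened lists, taken from the search string in this article.
universities = ["Waseda University", "University of Tokyo", "NAIST"]
title_terms = ["About Me", "profile", "CV", "resume"]

query = build_query("Java", universities, title_terms)
print(query)
```

Keeping the universities in a list means that adding or removing one is a one-line change, and the brackets and ORs can't drift out of step.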

So we’ve moved from the world of the relatively simple to the relatively complex, and as you’ll see below, Google doesn’t like it much (the key thing to note is the little disclaimer just under the astonishing number of results we get):

151 million results is all well and good, but none of the first 20 represent information about potential candidates (and nor will the rest of them), and that’s because our query has been trimmed, and only the first 32 words used.

Limiting queries to 32 words might not seem like much of a problem, but it’s not just something that happens in our admittedly convoluted example. In fact, by the time you’ve chucked a few location or job title-related keywords into pretty much any search, you’ll find that the 32-word cap is enough to seriously limit the usefulness of your foray into Google search.
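To see why the trimming hurts so much, here is a rough sketch in Python that cuts a query at 32 words and shows what falls off the end. It assumes a "word" is anything separated by spaces and that operators such as AND and OR count towards the limit; we don't know exactly how Google counts, so treat the numbers as approximate. The helper name `split_at_limit` is ours, and the query is a shortened version of the one above.

```python
WORD_LIMIT = 32  # the figure quoted in this article


def split_at_limit(query, limit=WORD_LIMIT):
    """Return (kept, dropped): the first `limit` words and whatever is left.

    Assumption: words are whitespace-separated tokens. Google's real
    counting rules may differ.
    """
    words = query.split()
    return " ".join(words[:limit]), " ".join(words[limit:])


query = (
    'Java AND ("Waseda University" OR "University of Waseda" OR '
    '"University of Tokyo" OR "Tokyo University" OR "University of Tsukuba" OR '
    '"Tsukuba University" OR "NAIST" OR "Nagoya University" OR '
    '"Keio University" OR "Kyoto University") '
    'AND (intitle:"About Me" OR intitle:profile OR intitle:CV OR intitle:resume)'
)

kept, dropped = split_at_limit(query)
print("Words in query:", len(query.split()))
print("Kept:   ", kept)
print("Dropped:", dropped)
```

Under these assumptions the whole intitle: group lands in the dropped part. That is the part that restricts results to profile-style pages, which would explain why a trimmed search returns so little of use.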

There is however a solution.  In fact, the observant amongst you will have probably figured out, by careful reading of this article’s title, that the solution lies with Google Custom Search.

Google Custom Search was probably not designed with people search in mind.  In fact, it was really designed to allow a webmaster to add search functionality to their website.  But it turns out that we can twist it slightly to solve even the most complex of Google sourcing problems.

What follows is not a full guide: if you want any specific help setting something similar up, then please drop us a note. What it does cover, though, should be enough for you to fully arm your Google search efforts.

You’ll find Custom Search at www.google.co.uk/cse.  Go and have a poke around.  When you’re ready, follow the basic instructions for setting up a search engine.  Go through the process, leaving every set-up field blank except for “Search engine name” and “search engine description”.  When given the option, make sure you check the option to “Search the entire web emphasising the sites I select” as opposed to “search only the sites I select”.

Once your search engine is created, go back to your search engine dashboard, click “manage my search engines” and then click “control panel” next to the engine you just created.  This is where the magic happens.

CSE gives you a surprising amount of functionality.  We have used the “sites” functionality to create LinkedIn Power X-Ray searches, where the search itself is saved using CSE and the results then scraped into an RSS feed to give us live updates of new LinkedIn members whose profiles meet certain criteria.

But that’s a story for another day.  Here, we’re interested in the “Refinements” tab.  A refinement is basically a piece of functionality that allows you to annotate a user’s search string with a pre-defined string that you create in the engine.  In our case, we’re going to stick the “engine” part – the part that specifies that list of Universities we’re interested in and the part that limits results to profiles or “About Me” sections – of our Java/Japan search in there:

Now the really interesting thing about refinements is that there doesn’t appear to be a limit to how long your string is. If there is, we have yet to find it.
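If it helps to picture it, here is a toy model in Python of what a refinement does as we’ve described it: the user types a short term, and the stored string is tacked on the end. This is not Google’s code, and exactly how CSE combines the two strings behind the scenes is an assumption on our part; `apply_refinement` and `REFINEMENT` are made-up names, and the stored string is a shortened version of ours.

```python
def apply_refinement(user_terms, refinement):
    """Append the stored refinement string to whatever the user typed."""
    return f"{user_terms.strip()} {refinement.strip()}"


# The "engine" part: universities plus profile-style page titles (shortened).
REFINEMENT = (
    'AND ("Waseda University" OR "University of Tokyo" OR "Keio University" '
    'OR "Kyoto University" OR "Hokkaido University") '
    'AND (intitle:"About Me" OR intitle:profile OR intitle:CV OR intitle:resume)'
)

full_query = apply_refinement("Java", REFINEMENT)
print(full_query)
```

The point is that the person searching only ever types the short part, so the long part can grow as much as you like without anyone having to retype it.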

You can use the finished version of our engine below.  Just type in your basic search terms (in our case Java), hit search and then click the little button with the name of your refinement just above the search bar:

In this particular case, we get 238 results, and the length and complexity of our string was not even remotely an issue.

We don’t have the exact figures, but we ended up approaching around 20 people we identified through this method…more than we identified from any other single source.

Feel free to have a play with the CSE we created (below) in this example.  If you think you’d benefit from some specific help on this, do get in touch.  We love talking about this stuff…and talk is free :)
