Posted by: Paraic | August 19, 2008

Engineering in Crisis?

I’ve been reading yesterday’s Irish Independent editorial on the ‘Crisis in Engineering’ in Ireland and the thousands of high-skilled jobs for which applicants are scarce, given the low numbers of graduating engineers.  The articles use numbers from the School of Computing at Dublin City University, where I’m based.

“In Dublin City University in 2005, 224 students graduated with a BSc in computer applications. In 2006, only 92 qualified. Last year, the figure was 78. This year, it has gone down to 70.”

Coincidentally, I’m writing this post from our stand at the COLING conference in Manchester, where CNGL is a bronze sponsor.  We’re here as part of a major recruitment drive as we ramp up our research.  The posters we have here are advertising seven (7) positions for Post-Doctoral Researchers and eighteen (18!) openings for funded PhD Studentships.  (See for more details).

[There used to be a B.Sc. degree in Applied Computational Linguistics offered jointly by the School of Computing and the School of Applied Languages and Intercultural Studies at DCU, but that fell victim to low levels of interest.   The last group of students graduated in 2006.]

Our CSET is keenly focused on ‘training’ and graduating PhD students.  It’s one of our metrics, given the government’s goal to double the number of PhDs in Ireland under the National Development Plan.  Chapter 9 of the Plan deals with the Science, Technology and Innovation (STI) Programme which aligns with the Strategy for Science Technology and Innovation (SSTI).  This strategy calls for:

“Significant increases in the number of new doctoral awards in Science, Technology and Engineering and in Humanities and Social Sciences, from a total of 730 in 2005 to an estimated 1,312 in 2013. This increase will be matched by a sustained improvement in the quality of research output as measured by publications and citations. Each research organisation will also have specific targets in relation to research commercialisation and will be benchmarked against leading international institutions.”

But where are the candidates for Doctoral awards in Science, Technology and Engineering going to come from if we’re not graduating engineers?

In fact, things may be even worse than that… if we look further back in the pipeline.  News last week focused on the results of the Leaving Certificate examinations (which determine, through a points system administered by the Central Applications Office, young people’s entrance into third-level courses).  Concern over poor maths results and the more than 5,000 pupils who failed maths caught the headlines…

“But the results overall highlight the shallow pool of high achievers in maths. While over 55,000 sat the Leaving Cert, only 6,600 gained an “honour” (Grade C3 or better) in higher-level maths. An honour in maths is the minimum requirement for third-level courses in science and engineering. In practice, this means that tens of thousands of students have already disqualified themselves from third-level courses in science and engineering, despite the priority given by the Government to the knowledge economy.”

If we examine the funnel to consider how many of this year’s Leaving Cert students might ultimately be candidates for Doctoral awards, we immediately cut out 88% of the field: only 6,600 of the 55,000 or so who sat the exams (about 12%) qualify for third-level courses in science and engineering.  (There are certainly other ways to get to the same end point, but it’s worth looking at the standard case of a straight four-year Bachelors degree in science/engineering).
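
As a quick sanity check on that 88%, here is the arithmetic as a small Python sketch.  It uses only the two figures quoted above; treating “over 55,000” as exactly 55,000 is my simplification, and the function name is just for illustration.

```python
def excluded_share(sat_exam: int, maths_honours: int) -> float:
    """Fraction of candidates without an honour in higher-level maths."""
    return 1 - maths_honours / sat_exam


# Figures from the quoted article; "over 55,000" is taken as 55,000 (assumption).
SAT_LEAVING_CERT = 55_000
HIGHER_MATHS_HONOURS = 6_600

share = excluded_share(SAT_LEAVING_CERT, HIGHER_MATHS_HONOURS)
print(f"Cut from the funnel: {share:.0%}")
```

Since the real number who sat the exams was somewhat above 55,000, the true share cut out would be, if anything, slightly higher.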

I’m going to try to do some more digging for numbers to fill out this funnel a little more… like how many graduates from degree courses each year typically consider a post-graduate qualification rather than taking up employment.  I’ll look at how many post-graduate positions we’ve already filled and how many of the successful candidates are new immigrants to Ireland.

Meanwhile (and I’ve been interrupted several times in writing this post) I’ll continue to hand out our recruitment flyers here at COLING and see if we can’t get another 18 candidates into the pipeline for Doctoral awards…

Posted by: Paraic | August 12, 2008

Google Translate for iPhone

Just saw over at SiliconRepublic that Google has developed a version of Google Translate for iPhone.  I’m following this space because of our work at CNGL.

Posted by: Paraic | August 11, 2008

Vive la France

We went car shopping this weekend (another thing on the long list of things to do).  I’ve been commuting by bus (Matthews coach) from Dundalk to DCU since I moved in February, but now that the family is here too and we’re looking at living more ‘in the country’ a car (more likely two) is on the agenda.

We ended up looking at Peugeots.  My wife liked the 308 SW (diesel engine) because it seats 7 (assuming little kids in the very back) but is nothing like her old hulking Ford Windstar that (maybe) got 18 or 19 miles per gallon.

It struck me though.  There are no French cars in the US market.  Basically the common cars (excluding the high end luxury cars) are all US, Japanese or German (Mercedes, BMW, Audi, Volkswagen), plus some Korean (Hyundai/Kia).  [I had thought Saab was another exception but just learned it’s owned by GM.  I knew Volvo was owned by Ford].  No Renault, no Peugeot, no Citroën.  Those are all popular brands here.

I remember reading some time ago a story of Peugeot in the US market.  It was a story of innovation gone wrong because of a lack of appreciation for the customer.  It was about Peugeot innovating in fuel injection technology.  The problem was that people were used to pumping the gas pedal in their cars to prime the engine before starting, and this caused the new Peugeot engines to flood and not start.  Peugeot’s response was that people would simply have to change their behaviour – rather than trying to address the problem in some way.  That was the wrong approach. (The new technology did eventually take hold, because it was better, but Peugeot lost the advantage).

I couldn’t find a reference to this story online.  I did find a story in a book on Consumer Boycotts about a boycott of Peugeot [page 85] in the US market related to passenger airbags that may or may not have contributed to Peugeot’s decision in 1991 to pull out of the US market.  I also found a number of reports suggesting that there has been some recent interest on Peugeot’s part in returning to the US market.

I hope Peugeot is showing better customer appreciation now.  I do like their cars.

Posted by: Paraic | August 8, 2008

Catching a Falling Knife

I just got word today that our house sale in Syracuse has closed!  Yay! [Hope too that the dollar keeps strengthening to help my conversion to Euros!]

In the meantime, we’ve turned our attention to house-hunting here in Ireland.  Given the state of the market here, I’ve been describing our decision process related to buying a house here as like ‘catching a falling knife’.  In the end though, I think the important thing is that I don’t view our home as an ‘investment’ as such but as a home – a place for our family, for some of our happiest times (and long term!).  It’s also the case that we’re looking in the ‘countryside’ to the north/north-west of Dublin, rather than in the city itself – I guess the rent versus buy decision would be different if we were considering living in Dublin.

In moving back to Ireland, the cost of housing was always a concern.  The Permanent-TSB/ESRI National House Price Index, a widely quoted source of information on Irish house prices, shows that house prices in the ‘Border region’ rose 226% from 1996 through the end of 2007 (in Dublin, price growth was 370% in that period).  Our house in Syracuse appreciated exactly 56% between 1998 and 2008 based on our actual purchase price and selling price (and yes, I know Syracuse is ‘different’ from much of the US in terms of property prices).
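
To put those totals on a comparable footing, here is a rough Python sketch that converts a total percentage rise into an implied compound annual rate.  The year counts are my own assumptions (10 years for Syracuse, 12 for “1996 through the end of 2007”), not figures from the index, so treat the output as indicative only.

```python
def annualised_growth(total_rise_pct: float, years: float) -> float:
    """Compound annual growth rate implied by a total percentage rise."""
    growth_factor = 1 + total_rise_pct / 100
    return growth_factor ** (1 / years) - 1


# (label, total rise in %, assumed number of years)
# The year counts are assumptions for illustration, not figures from the index.
cases = [
    ("Syracuse house, 1998-2008", 56, 10),
    ("Border region, 1996-2007", 226, 12),
    ("Dublin, 1996-2007", 370, 12),
]

for label, rise, years in cases:
    print(f"{label}: {annualised_growth(rise, years):.1%} per year")
```

On those assumptions, our 56% works out at about 4.5% a year, against roughly 10% a year for the Border region and roughly 14% for Dublin.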

So I was quite happy then to see house prices starting to fall back down here since I moved back in February.  According to the Permanent-TSB/ESRI Index, prices nationally fell by 7.3% in 2007 and have fallen another 4.4% in the first five months of 2008.  Of course with real estate everything is local, so the situation on the ground varies greatly in different areas and for different kinds of houses.  But how much further will prices fall?  And when will they stabilize?

Professor Morgan Kelly at UCD has been referred to in the Irish Times as “the economist who predicted the property slump” based on his 2006 paper, Irish House Prices: Gliding into the Abyss?  In that paper, he set out the possibility of “large and prolonged falls in real house prices of the order of 40-50 percent, and a collapse of house building activity”. We haven’t gone that far yet…

…but wouldn’t you be crazy to think about buying in the middle of this freefall?  I don’t know… we’re thinking (hard) about it!

Posted by: Paraic | August 6, 2008

Google as Spell-Checker

I just noticed something that hadn’t really jumped out at me before, but now it really does (I realise this is not the first time I’ve done this).  In writing my last blog post, I included the word “conscious”, which I had incorrectly spelled “concious”.  It didn’t seem right.  It wasn’t underlined as a spelling mistake, but it still didn’t seem right.  So I did what I now realise I do somewhat often – I typed the word as I spelled it into the search box in Firefox.

Now my Firefox search box is set to search SearchMe.  And it turns out that there are plenty of pages found for a search on the word ‘concious’. But I still wasn’t satisfied.  It still didn’t seem right.  So I jumped over to Google to do the same search and saw the familiar “Did you mean?” with ‘conscious’ highlighted for me to click on.

It turns out, now that I realise it, that I use the search box for spell-checking reasonably often – I recognise the pattern of behaviour now that it has changed because of using SearchMe rather than Google.

In writing this, and going back to double-check, I now notice that SearchMe does in fact have the question, “Did you mean ‘conscious’?” to the left of the search results; it just isn’t very prominent and I missed it.

But something to note; if you want to have a good search engine, you should make sure you have a good spell-checker to alert people to mis-spelled queries… people rely on that (well at least I do)!

[And for completeness, I’ve just run the same query on Cuil.  As I was typing ‘concious’ it gave me several suggestions for queries, all incorporating the mis-spelling (e.g. ‘conciousness’).  It also came up with 1,314,346 results for the mis-spelled query, without alerting me to the mis-spelling.]
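
For the curious, the “Did you mean?” behaviour described above can be approximated in a few lines of Python using the standard library’s difflib.  This is only a sketch under my own assumptions: the tiny vocabulary, the 0.8 similarity cutoff and the function name are all hypothetical, and I have no idea how Google, SearchMe or Cuil actually implement their suggestions.

```python
import difflib
from typing import Optional

# Tiny illustrative vocabulary (hypothetical); a real engine would use far more data.
VOCABULARY = ["conscious", "conscience", "concise", "cautious"]


def did_you_mean(query: str, vocabulary: list[str]) -> Optional[str]:
    """Return the closest known word, or None if the query is already
    known or nothing in the vocabulary is similar enough."""
    if query in vocabulary:
        return None
    # cutoff=0.8 is an arbitrary similarity threshold chosen for this sketch.
    matches = difflib.get_close_matches(query, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


suggestion = did_you_mean("concious", VOCABULARY)
if suggestion:
    print(f"Did you mean '{suggestion}'?")
```

The point of the sketch is the behaviour rather than the method: a query that is close to a known word, but not in the vocabulary itself, triggers a prompt instead of silently returning results for the mis-spelling.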

Posted by: Paraic | August 6, 2008

On Tags and Categories

I’ve been blogging for just about a month now.  One of the things I’ve been conscious of recently is the choice I’m faced with (using WordPress) at the end of each post, before I publish; the choice of adding Tags and/or choosing Categories.

So far I have not used Tags at all, but I’ve built up a list of 17 categories I think (and I’ll probably create a new one for this post).  I haven’t introduced any ‘structure’ into the categories yet, like broader categories having narrower ‘children’ categories, and I’m not even sure I can do that here.  So I’ve been wondering whether I should be using Tags rather than Categories or using both.

Then I noticed in the July Wrap-Up post on the WordPress Blog that they reported on Tags and Categories created in the month:

You added more tags and categories than ever before: more than 8 thousand tags and 19 thousand categories.

Should I read anything into the fact that there were more than twice as many new Categories as Tags?  Are lots of other people using only Categories and no Tags like me?  Or is there more overlap in Tags than Categories (are those even unique numbers they’re reporting) – I doubt that.  There is a posting on the difference between Tags and Categories in the WordPress FAQ, but I think it boils down to the fact that you can use them as you like.

I’m aware of the distinction between Taxonomy and Folksonomy of course, and I would tend to think of Tags as relating to Folksonomy and Categories as relating to Taxonomy, but in the case of WordPress, it seems that this distinction is not there.  My blog Categories are free-form words and phrases assigned by me and they are an independent list with no particular structure.

So it’s all about how I use them I guess.  I’m going to stick with my Categories rather than Tags and see how it goes.  I’ve added a Category list widget in the right side-bar (the Category cloud seemed cluttered but I may try it again later); maybe that will be useful?  And I’m going to create a new ‘Categorisation’ Category – I think it’s something I’m going to come back to…

Posted by: Paraic | August 5, 2008

Bank (Public) Holidays

Just returned from a long weekend due to the ‘August Bank Holiday’.

Now, in the US, I always knew why I had a day off work.  We had as holidays New Year’s Day, Memorial Day (last Monday in May; remembering soldiers who’ve given their lives), Independence Day (July 4th; commemorating signing the Declaration of Independence), Labor Day (first Monday in September), Thanksgiving (fourth Thursday in November; probably my favorite US holiday) and Christmas.  There are other US Federal Holidays, but these are the main ones and the ones that were observed where I worked.

So I wondered where the ‘Bank Holiday’ originated, since it seemed to be a holiday for no particular reason and there seem to be a few of them sprinkled throughout the year (I guess I took them for granted growing up).  It turns out they can be traced back to the Bank Holidays Act of 1871.  Interesting to note the original motivation related to giving Bank employees the opportunity to participate in and attend cricket games!

There are now 9 public holidays in the Republic of Ireland according to the chart (10 in Northern Ireland) [although Good Friday is also a holiday in the Republic, just not counted under bank holidays since that tradition pre-dates the Bank Holidays Act].  All but three of these now are associated with some commemoration or observance, but the June Bank Holiday, Summer Bank Holiday (August), and October Bank Holiday remain just to give us a day off for no particular reason (other than to participate in Cricket).

Posted by: Paraic | August 1, 2008

Search is not a solved problem

Search is not a solved problem!  You’ll hear that from the Google folks too, say from Eric Schmidt in his comments at the Google analyst day.

I bring this up because I’ve continued to follow the initial response to the launch of Cuil, most of it negative.  Much of it though seems to miss the point.  Yeah, I too initially had a thing about the name, but I got over that – it’s not going to matter.  Neither is it going to matter longer-term how well they did on their first day, when they admittedly had some problems.  They’ve posted a founder’s note on their site and I’ve also been following the robust defense being offered on Sarah Carey’s blog.  Cuil is offering something different: the size of their index, suggestions by category, matching based on content not popularity, and a magazine-style results layout.  Now some of those things I’m describing based directly on their own claims; I don’t know yet what those mean for me as a searcher – am I going to have a better experience overall?  Only time will tell – well at least more than 1 day.

Cuil is now the third new/different search engine I’ve come across this year.  PowerSet definitely had something new/different to offer (I say had because they’re now part of Microsoft).  I’m really enjoying using SearchMe and I’ve now made it my default engine in Firefox.  They definitely have something new/different to offer.

Yes, it’s certainly a very difficult challenge to gain any traction with search with such dominant players in the market (particularly with such a strong market leader in Google, who is also constantly innovating).  And that’s just with respect to gaining users, saying nothing of the challenge of then monetizing the search traffic.

But these smaller engines definitely have something new/different to offer that is worth trying.  Could Google experiment with a completely different style of results presentation like SearchMe (visual) or Cuil (magazine-style) given such a huge base of users for whom the ranked list of 10 results per page is ingrained in their use of search?

Of course new/different is no good if it’s not better in some way.  I am finding in fact that I’m often having a better experience with SearchMe.  The visual presentation of results pages, with query terms highlighted, is proving a very efficient way to scour through results in certain cases.  I’m still trying to get a handle on how often I find SearchMe better (I’ll often run the same search on Google, and now also on Cuil to test) versus those times when I have to give up on SearchMe and go back to Google.  I’ll come back to this with more detail as I get a better grasp of this.

But in the meantime, these little engines are worth a try, and not just once, but are worth coming back to.  “Search is not a solved problem”, after all.

Posted by: Paraic | July 30, 2008

The mortgage situation up close

Glad to see (sort of) on CNN Money that while foreclosure filings are up 120% in the US, Syracuse is one of the handful of places that has remained relatively unscathed:

“On the other hand, there were a handful of metro areas that remained relatively unscathed. Honolulu, at one filing for every 1,331 households had the lowest rate of all, followed by Allentown, Pa. (one for every 972) and Syracuse, N.Y. (one for every 880).”

Of course, Syracuse NY has always had a low cost of housing (in terms of the cost of a house – don’t get me started on the property taxes).  Our 2,000 sq. ft. 4-bed home went on the market March 19th at $175,000.  It was slow going for a couple of months, followed by a couple of decisions along the way to bring our price down a little given the state of the market.  We had one buyer make an unreasonable offer and determined that they really couldn’t afford the house, so we let that one pass.  Then in mid-July we found our buyers who came in with a good offer and wanted to close quickly (no later than July 25th).  At that point, we were priced at $160,000 and accepted $159,000 [people here can’t figure out now how you can buy a 2,000 sq. ft. 4 bed house anywhere for roughly €100,000].

But then the mortgage situation in the US kicks in.  The contract stipulated July 9th as the date our buyers would have their formal letter of commitment from their bank… and July 9th came and went.  In fact July 25th, the original must-close-by date, came and went without even a commitment letter.  In the meantime, we were furiously trying to pack up all our stuff (not insignificant given 10 years in the house and 3 kids, who were all born there – and did I mention a huge attic in the house, so things tended to get ‘saved’) and organize shipping to Ireland, so we were happy for a while to go along with things.

Unfortunately, we have no insight into exactly what’s going on with the delay except that it’s ‘the bank’.  Apparently there have been multiple requests back to our buyers for more information or updates on information already provided… but it had gotten to the point yesterday that we decided to give it until this Friday for the letter of commitment to show up – otherwise we were going to have to start thinking about stepping back from this contract and showing our house again.

It’s a rollercoaster ride – having your house on the market in this environment, having now moved our family and ‘stuff’ (in transit) out of the country, not knowing absolutely for sure that this contract is going to close, and maybe we’ll be back to square one…

…but that was last night.  This morning, I got word that our buyers had gotten their formal commitment from the bank… now we just need to satisfy all the closing conditions and get this wrapped up.

You’ll hear my sigh when I get the cheque! (I just wrote cheque! – it used to be check!).

Posted by: Paraic | July 29, 2008

Is that Cuil or what?

There’s already been a lot of coverage of the new search engine Cuil, especially given the backgrounds of its management team.  I wasn’t planning originally to post on it since I didn’t think I’d have anything to add to what has already been written, but from a combination of surprise (not in a good way) and frustration, here are some observations:

The first major claim of differentiation for Cuil is the 120-billion-page index they’re using.  The claim of course is that with all these pages indexed you’re going to be able to find more/better results, especially for less frequent query terms.  But my first impression is that all those extra pages they’ve indexed are not really adding to the value of the results – if anything, after a certain point it seems that it’s just noise/junk being added to the index.  Think about how you get all these extra pages into a search index.  These aren’t the easy ‘static’ pages you can just crawl for content; everybody has those.  To get to this number you try to generate as many variations of dynamically-generated pages as possible.  If you over-do it though, you end up with junk pages in your index.

Take my sample search for CNGL.  One of the results at Cuil comes from a site that helps you find UK company names.  If you browse their directory, you can find companies starting with ‘C’, then those with ‘CN’, then drill down to ‘CNG’, at which point you’ll see your options are now limited to companies that start with ‘CNGA’, ‘CNGT’, or ‘CNG-(space)’.  The Cuil results list though has a page where they’ve attempted to find companies with ‘CNGL’, generating the page with content that says, “Sorry, no joy looking for that company, try again though…”  A similar example is a set of results from a ‘TypoTrap’ site, where Cuil results include “Did you say… CNBL?” and “Oops… VNGL”.  So in cases like this, all those extra pages in the index are not really helping me – and the CNGL home page was not in the result set at all.

My second observation is on the result presentation.  I’m all for exploring alternatives to the standard ranked list of 10 results with snippets (I’m continuing to use SearchMe and still liking it so far), but I found the Cuil results presentation just really disorienting.  It’s not at all clear what the ordering of results is, or even whether there is any ordering of the results presented.  I know that the assessment of which result suits me best is up to me, but I find that I’m now “trained” by my use of search to expect a rank ordering of results, where the obvious strategy is to start at the first/best and work my way down the list.  The Cuil presentation just breaks that model and it’s disorienting in a big way.  (On the other hand it’s interesting to discover how deeply that model is ingrained – we really rely on the search engine to guide us in navigating the result set).

It will be interesting to see how Cuil adapts and evolves.  I’ve seen some people saying that they’ve already seen changes/updates in Cuil’s behaviour in terms of search performance after just the first day.  I’m sure it will evolve quickly.  Worth watching.
