Wednesday, November 09, 2005

One of my pet programming projects at one point was a screen scraper for monster.com and other job websites, so I could "download" all the current job postings and do things like:

1) Rank/Prioritize the job listings by personal appeal
2) Detect job listing duplicates between monster/dice/etc.
3) Keep track of which jobs I applied for and when
4) Automate running multiple queries with multiple terms in job site specific ways so I could detect new entries in my areas of interest (something an RSS feed would normally be used for)
5) Categorize jobs by factors that were important to me - Salary, Location, Consulting vs. Full-time, How qualified I thought I was, etc.

Everyone puts that much effort into their job search, right?  ;)
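
Item 2 in the list is probably the easiest to illustrate. The following is only a rough Python sketch of one way duplicate detection could work, not what my project actually did. The JobListing fields and the idea of matching on a normalized title/company/location are assumptions made up for the example; real listings are messier than this.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class JobListing:
    # Hypothetical fields; real job sites expose different data.
    site: str
    title: str
    company: str
    location: str


def fingerprint(job):
    """Build a comparison key from the fields likely to match across sites."""
    def norm(text):
        # Lowercase, then collapse punctuation and whitespace runs to one space.
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (norm(job.title), norm(job.company), norm(job.location))


def find_duplicates(jobs):
    """Group listings by fingerprint and return only groups with 2+ entries."""
    groups = {}
    for job in jobs:
        groups.setdefault(fingerprint(job), []).append(job)
    return [group for group in groups.values() if len(group) > 1]
```

Given the same posting scraped from two sites with slightly different punctuation and capitalization, find_duplicates returns one group containing both. An exact-match key like this is deliberately crude: it will miss reworded titles, so a real tool would likely need fuzzier matching.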

I did get a partially automated process working for the worst bits. It still required tons of manual intervention, but it let me do all of the above and it worked wonderfully.  It was never anywhere close to automated enough to share with friends, though.  Screen scraping different websites is not fun, rewarding work, however, and I eventually shelved the project. (I now find most of my contract work through recruiters I've talked to in the past, and I no longer apply to job listings on websites.)

I still think this is very cool though:

http://www.indeed.com/jsp/apiinfo.jsp

Indeed is yet another job search engine, with one major difference (it may not be the only job search engine with this feature, but it's the only one I know of).  Indeed publishes a Web Services API that you can query directly, and it seems to return job listings from Monster, Dice, etc.  This would have made my pet project above a piece of cake to implement.  This is a very good sign of things to come in the web services world!

Now if someone would just publish a web service for stock price quotes (you know it's going to happen eventually).

11/9/2005 8:13:21 AM (Central Standard Time, UTC-06:00)  |  Comments [2]

11/14/2005 9:00:42 AM (Central Standard Time, UTC-06:00)
Hi Michael - I'm glad you found our Web Services API to be so useful. Let me know if you have any questions or want to delve into it further.

Thanks for the good word!

david parmet
indeed.com

12/23/2005 9:06:15 AM (Central Standard Time, UTC-06:00)
I remember building a similar tool a while ago, though it scraped completely different information. I built it with C++ on top of libwww[1] and HTMLTidy[2]; the rest was simply running XPath queries on the resulting XML documents. What could be simpler than that? You don't need libwww, but you'll need HTMLTidy if you want to go my way, and I'm not sure it has a .NET wrapper. If you're familiar with C++ (and C++/CLI), you can build one yourself, though.

[1] http://www.w3.org/Library/
[2] http://tidy.sourceforge.net/

Cheers,
Stoyan
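
For readers curious what the "tidy, then XPath" approach in the comment above looks like, here is a minimal sketch in Python rather than C++. It assumes the page has already been cleaned into well-formed XML (the job HTMLTidy does), and the sample markup, class names, and extract_links helper are all invented for illustration. It uses the limited XPath subset in Python's standard xml.etree.ElementTree module, not the libraries named in the comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical page content, already tidied into well-formed XML.
CLEANED_PAGE = """
<html>
  <body>
    <div class="job"><a href="/jobs/1">C++ Developer</a></div>
    <div class="job"><a href="/jobs/2">Build Engineer</a></div>
    <div class="ad"><a href="/ads/9">Sponsored</a></div>
  </body>
</html>
"""


def extract_links(xml_text, class_name="job"):
    """Return (link text, href) pairs for links inside divs of the given class."""
    root = ET.fromstring(xml_text.strip())
    # ElementTree supports a small XPath subset, including attribute predicates.
    links = root.findall(f".//div[@class='{class_name}']/a")
    return [(a.text, a.get("href")) for a in links]


if __name__ == "__main__":
    for text, href in extract_links(CLEANED_PAGE):
        print(text, href)
```

The fragile part of any scraper like this is the path expression: it breaks whenever the site changes its markup, which is why a published API is so much more appealing.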