Parse the Craigslist stream.

by James on September 14, 2010

So I am trying to find a house to rent.  Every day I am checking Craigslist, but this gets old fast!  So finally I am like… I can write a script for that.  I fire up a bash prompt and get to work.

Here is what I came up with. It is a simple script that curls an RSS feed, strips out all the HTML mumbo jumbo, looks for a list of specific neighborhoods that I am willing to move to, saves the results to a file, and then finally emails me the list. The next time the script runs, it checks the saved list of old results so that it does not email me the same listings twice. I am using mutt to send the email from the bash script.



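#!/bin/bash

# Remove any leftover update file from a previous run
if [ -f update.txt ]; then
   rm -f update.txt
fi

# The rss feed for my house search
rss=""    # set this to the RSS URL of your Craigslist search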
curl -s "$rss" | grep -E '<item rdf:about=|<title>' | sed s'/<item rdf:about="//'g \
        | sed s'/<title><!\[CDATA\[/ /'g | sed s'/">//'g  \
		| sed s'/]]><\/title>//'g \
		| sed s'/<title>craigslist san diego | apts\/housing for rent search &#x22;house&#x22;<\/title>/ /' \
		| sed -n -e ":a" -e "$ s/html\n/html /gp;N;b a" \
		| grep -i 'ocean beach\|point loma\|pacific beach\|southpark\|south park\|oceanbeach' > tmp.txt

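# Make sure the old list exists on the first run
touch oldlist.txt

# 1st loop for checking the new listings against the old
while read -r newline
do
    axold=0

    # 2nd loop loads the list of old matches to check
    while read -r line
    do
        if [[ "$newline" == "$line" ]]; then
            axold=1
            break
        fi
    done < oldlist.txt

    # Only listings that were not seen before get saved and emailed
    if [ "$axold" != 1 ]; then
        echo "$newline" >> update.txt
        echo "$newline" >> oldlist.txt
    fi
done < tmp.txt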

rm -rf tmp.txt

# Send email if update.txt exists (the recipient address goes inside the quotes)
if [ -f update.txt ]; then
   mutt -n -s "New Craigslist Post" -- "" < update.txt
fi

rm -f update.txt
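
The two nested while loops compare every new listing against every old one. As a rough sketch of an alternative (not what the script above does), grep can do the same comparison in one pass. The function name new_listings is made up for this example, and it assumes each listing sits on a single line, as it does in tmp.txt.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): print the lines of
# the new-results file that are not already in the old-results file.
new_listings() {
    local new_file=$1
    local old_file=$2
    touch "$old_file"    # make sure the old list exists on the first run
    # -F fixed strings, -x whole-line match, -v invert the match,
    # -f read the patterns (the old listings) from a file
    grep -Fxv -f "$old_file" "$new_file"
}

# Possible use in place of the two loops:
#   new_listings tmp.txt oldlist.txt > update.txt
#   cat update.txt >> oldlist.txt
```

Note that grep exits with a non-zero status when nothing new is found, and the redirect creates an empty update.txt in that case, so the email check would need to test for a non-empty file (`-s`) instead of an existing one (`-f`).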

Now let's make it check Craigslist every 15 minutes.

[zlabx]$ crontab -e
  */15 * * * * /path/to/my/script/
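
One thing to watch for: the script uses relative file names (tmp.txt, oldlist.txt, update.txt), and cron usually starts jobs in the user's home directory rather than in the script's directory. A minimal sketch of one way to handle that, assuming you want the files to live next to the script; the helper name script_dir is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the absolute directory that a script path lives in.
# The subshell keeps the caller's working directory unchanged.
script_dir() {
    ( cd "$(dirname "$1")" && pwd )
}

# Near the top of the script one might then write:
#   cd "$(script_dir "$0")" || exit 1
```

With that line in place, oldlist.txt is read from and written to the same directory on every run, whether the script is started by hand or by cron.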

One comment


This was super useful. I made a few modifications and posted it to github:


by Ohm Architect on June 28, 2014 at 8:27 am.
