Saturday 29 September 2012

Downloading Photobucket albums with Matlab and wget

Here's a script to take the pain out of downloading an album of photos from Photobucket.

It uses matlab to search a particular url for photo links (actually it looks for the thumbnail links and modifies the string to give you the full res photo link), then uses a system command call to wget to download the photos. All in 8 lines of code!

Ok, to start you need the full url for your album. The '?start=all' bit at the end is important because this shows all the thumbnails on a single page. Then simply use regexp to find the instances of thumbnail urls. Then loop through each link, strip the unimportant bits out, remove the 'th_' prefix (which marks the thumbnail version of the photo) to get the full-resolution url, and call wget to download the photo. Easy-peasy!
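Something along these lines should do the trick (the album url and the thumbnail regexp below are illustrative guesses at photobucket's markup, so adjust them to whatever your album page actually contains):

% album url -- '?start=all' puts every thumbnail on one page
url  = 'http://smg.photobucket.com/albums/myuser/myalbum/?start=all';   % hypothetical album
page = urlread(url);

% find the thumbnail image links (the pattern is a guess at the markup)
thumbs = unique(regexp(page, 'http://[^"]*/th_[^"]*\.jpg', 'match'));

for k = 1:numel(thumbs)
    % dropping 'th_' from the thumbnail url gives the full-resolution photo
    photo = strrep(thumbs{k}, 'th_', '');
    system(['wget "' photo '"']);   % let wget do the downloading
end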

Saturday 15 September 2012

Alternative to bwconvhull

Matlab's bwconvhull is a fantastic new addition to the image processing library. It is a function which will return either the convex hull of the entire suite of objects in a binary image or, alternatively, the convex hull of each individual object. I use it in the latter capacity for filling multiple large holes (holes large enough that imfill has no effect) in binary images containing segmented objects.

For those who do not have a newer Matlab version with bwconvhull included, I have written this short function designed to be an alternative to the usage

 P = bwconvhull(BW,'objects');

and here it is:



It uses the Image Processing Toolbox function regionprops, which has been around for some time, to find the convex hull images of the objects present, then inscribes them onto the output image using the bounding box coordinates. Simples!
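In sketch form it boils down to something like this (a minimal sketch, assuming the 'ConvexImage' and 'BoundingBox' properties returned by regionprops):

function P = bwconvhull_alt(BW)
% per-object convex hulls -- a stand-in for P = bwconvhull(BW,'objects')
L = bwlabel(BW);                                      % label the objects
stats = regionprops(L, 'ConvexImage', 'BoundingBox');
P = false(size(BW));

for k = 1:numel(stats)
    bb   = round(stats(k).BoundingBox);               % [x y width height]
    rows = bb(2):bb(2)+bb(4)-1;
    cols = bb(1):bb(1)+bb(3)-1;
    % inscribe each object's convex hull image at its bounding box location
    P(rows, cols) = P(rows, cols) | stats(k).ConvexImage;
end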

Sunday 9 September 2012

Craigslist Part 2: using matlab to plot your search results in Google Maps

In 'Craigslist Part 1' I demonstrated a way to use matlab to automatically search craigslist for properties with certain attributes. In this post I show how to use matlab and python to create a kml file for plotting the results in Google Earth.

1) Download the matlab googleearth toolbox from here and add it to your path.
2) Import the search results data and get the map links (location strings).
3) Loop through each map location, strip the address from the url, then geocode it to obtain coordinates. The function 'geocode' writes and executes a python script which geocodes the addresses (turns the address strings into longitude and latitude coordinates). The python module may be downloaded here.
4) Once we have the coordinates, get rid of NaNs and outliers (badly converted coordinates due to unreadable address strings).
5) Use the google earth toolbox to build the kml file.
6) Finally, run google earth and open the kml file using a system command.

The above is a simple scatter plot which only shows the locations of the properties and not any information about them. A more complicated example plots the points with labels (the asking price) and text details (the google map links) in pop-up boxes: first each coordinate pair is packaged with the map and name tags, then the strings for each coordinate are concatenated into a kml file, and again google earth is run and the kml file opened with a system command. A sketch of the labelled version is given below.
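Something along these lines (the csv filename, the 'geocode' signature and the exact ge_point parameter names may differ depending on your Part 1 output and your version of the googleearth toolbox, so treat this as a sketch):

% read the Part 1 results: price, advert url, google maps link (filename assumed)
fid = fopen('craigslist_results.csv');
C = textscan(fid, '%f %s %s', 'Delimiter', ',');
fclose(fid);
prices = C{1};  maplinks = C{3};

% geocode each address by stripping it out of the maps url
lat = nan(numel(maplinks),1);  lon = lat;
for k = 1:numel(maplinks)
    addr = regexp(maplinks{k}, '(?<=q=)[^&]+', 'match', 'once');  % pattern is a guess
    [lat(k), lon(k)] = geocode(addr);     % the python-calling wrapper described above
end

% drop NaNs and obvious outliers (badly geocoded addresses)
good = ~isnan(lat) & ~isnan(lon) & abs(lat - median(lat(~isnan(lat)))) < 1;
lat = lat(good);  lon = lon(good);  prices = prices(good);  maplinks = maplinks(good);

% one labelled point per property: asking price as the name, maps link in the pop-up
kml = '';
for k = 1:numel(lat)
    kml = [kml, ge_point(lon(k), lat(k), 0, ...
                 'name', ['$' num2str(prices(k))], ...
                 'description', maplinks{k})];
end
ge_output('craigslist.kml', kml);

% finally, fire up google earth with the new file
system('google-earth craigslist.kml &');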

Saturday 8 September 2012

Broadcom wireless on Ubuntu 12.04

I just upgraded to Ubuntu 12.04 and my Broadcom wireless card was not being picked up. It took me some time to figure out how to fix this, and the forums give a lot of tips which didn't work for me, so I thought I'd give it a quick mention.




Then disconnect your wired connection and reboot. I found that disconnecting any wired connections is crucial.

 Upon reboot, go to System Settings - Hardware - Additional Drivers

 It should then pick up the Broadcom proprietary driver. Install it, then reboot, and you should be back up and running!

Friday 7 September 2012

Craigslist Part 1: Using matlab to search according to your criteria and retrieve posting urls

I have a certain fondness for using matlab for obtaining data from websites. It simply involves understanding the way in which urls are constructed and a healthy dose of regexp. In this example, I've written a matlab script to search craigslist for rental properties in Flagstaff with the following criteria: 1) 2 bedrooms, 2) has pictures, 3) within the price range 500 to 1250 dollars a month, 4) allows pets, and 5) is not part of a 'community' housing development. The script gets the urls, prices, and google maps links, and writes the results to a csv file.

First, fetch the search page and find the indices of the hyperlinks only, then keep only the urls which are properties. Next, keep only the urls which have a price in the advert description; find the prices; strip the advert urls from the string; sort the prices lowest to highest; and get rid of urls which mention a community housing development. Finally, get the links to google maps in the adverts if there are any, and write the filtered urls, maps, and prices to a csv file (a rough sketch of the whole script is given at the end of this post). Save and execute in matlab, or perhaps add this line to your crontab for a daily update!

/usr/local/MATLAB/R2011b/bin/matlab -nodesktop -nosplash -r "craigslist_search;quit;"
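
For reference, here's a rough sketch of what craigslist_search.m might look like; the search url parameters and regexp patterns are guesses at craigslist's markup rather than a faithful copy, so tweak them against the actual page source:

% craigslist_search.m -- 2-bed rentals in flagstaff, $500-1250/month, pictures, pets ok
url  = ['http://flagstaff.craigslist.org/search/apa?bedrooms=2&hasPic=1' ...
        '&minAsk=500&maxAsk=1250&pets_cat=1'];        % query parameters are a guess
page = urlread(url);

% advert urls on the results page
ads = unique(regexp(page, 'http://flagstaff\.craigslist\.org/apa/\d+\.html', 'match'));

urls = {};  prices = [];  maps = {};
for k = 1:numel(ads)
    advert = urlread(ads{k});

    % skip adverts that mention a 'community' housing development
    if ~isempty(regexpi(advert, 'community', 'once')), continue; end

    % asking price from the advert text; skip adverts with no price
    p = regexp(advert, '\$(\d+)', 'tokens', 'once');
    if isempty(p), continue; end

    urls{end+1}   = ads{k};
    prices(end+1) = str2double(p{1});

    % google maps link, if the advert has one
    maps{end+1} = regexp(advert, 'http://maps\.google\.[^" ]+', 'match', 'once');
end

% cheapest first, then write everything out to csv
[prices, idx] = sort(prices);
urls = urls(idx);  maps = maps(idx);

fid = fopen('craigslist_results.csv', 'w');
for k = 1:numel(urls)
    fprintf(fid, '%d,%s,%s\n', prices(k), urls{k}, maps{k});
end
fclose(fid);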