Rayleigh has only been tested with Python 2.7, on OS X 10.8 and Ubuntu 12.04.
First, install FLANN from source, making sure to compile with Python support. Test that you can import pyflann from a Python console.
Now go into the Rayleigh directory that you cloned from the GitHub repository and run:
pip install -r requirements.txt
To get Rayleigh running locally, we must (a) populate the database with image information and (b) load a SearchableImageCollection object.
The best way to get started is to download a .zip file containing all you need for a demo here. Unzip it in the repo dir, so that it populates a subfolder called data/.
Let’s get the mongodb server running:
cd rayleigh_repo_dir
mongod --config mongo.conf
Now load the data from the zip file you downloaded:
mongorestore --port 27666 data/flickr_100k
We should now be all ready to run the server. In another shell tab:
python rayleigh/client/app.py
The website should now be up at http://127.0.0.1:5000/
You can download more pickled SearchableImageCollections from https://s3.amazonaws.com/rayleigh/.
To construct your own dataset from scratch, run
nosetests test/collection.py:TestFlickrCollection -s
This uses the file data/flickr_1M.json.gz, which lists a million images from Flickr fetched by the “interestingness” API query over the last few years.
Running this will download and process 100K images (or fewer, or more, if you modify the code). Data is stored in the MongoDB database. It helps to have multiple cores working, so in a separate tab, run
ipcluster start 8
This relies on the IPython parallel framework.
If you want, you can regenerate the list of Flickr URLs:
python rayleigh/assemble_flickr_dataset.py
Have fun!
Rayleigh is an open-source system for quickly searching medium-sized image collections by multiple colors given as a palette or derived from a query image.
Assemble a list of URLs to Flickr images fetched by repeated calls to the API method flickr.interestingness.getList.
To use, you must place your [API key](http://www.flickr.com/services/apps/72157632345167838/key/) into a file named ‘flickr_api.key’ located in the same directory as this file.
There is a limit of 500 images per day, so to obtain more images than that, we iterate backwards from the current date until the desired number of images is obtained.
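The backwards iteration over dates can be sketched as follows. This is a minimal illustration, not the module's actual code: `collect_image_urls`, `fetch_day`, and `fake_fetch_day` are hypothetical names, and the stand-in fetcher simply pretends that each day yields 500 URLs instead of calling the Flickr API.

```python
import datetime


def collect_image_urls(fetch_day, num_images, start_date, max_days=3650):
    """Walk backwards one day at a time until num_images URLs are collected.

    fetch_day is a hypothetical callable: date -> list of URL strings.
    """
    urls = []
    day = start_date
    for _ in range(max_days):
        if len(urls) >= num_images:
            break
        urls.extend(fetch_day(day))
        day -= datetime.timedelta(days=1)
    return urls[:num_images]


def fake_fetch_day(day):
    # Stand-in for a real API call: pretend each day yields 500 URLs.
    return ["http://example.com/%s/%d.jpg" % (day.isoformat(), i)
            for i in range(500)]


urls = collect_image_urls(fake_fetch_day, 1200, datetime.date(2013, 1, 1))
```

With 500 URLs per day, requesting 1200 images touches three consecutive days and truncates the last one.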
Assemble dataset containing the specified number of images using Flickr ‘interestingness’ API calls. Returns nothing; writes data to file.
Parameters : | api_filename : string
data_filename : string
num_images_to_load : int |
---|
Load the data in given filename and return the first num_images urls (or all of them if num_images exceeds the total number).
Parameters : | data_filename : string
num_images : int |
---|---|
Returns : | ids : list
urls : list
|
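A loader with this behavior might look like the sketch below. The on-disk layout is an assumption made for illustration (a gzip-compressed JSON list of records with `id` and `url` keys); the real data file may be organized differently, and `load_ids_and_urls` is a hypothetical name.

```python
import gzip
import json
import os
import tempfile


def load_ids_and_urls(data_filename, num_images):
    """Return (ids, urls) for the first num_images records in the file."""
    with gzip.open(data_filename, "rt") as f:
        records = json.load(f)
    subset = records[:num_images]  # slicing past the end returns everything
    ids = [r["id"] for r in subset]
    urls = [r["url"] for r in subset]
    return ids, urls


# Write a tiny file in the assumed format to exercise the function.
records = [{"id": str(i), "url": "http://example.com/%d.jpg" % i}
           for i in range(5)]
path = os.path.join(tempfile.mkdtemp(), "demo.json.gz")
with gzip.open(path, "wt") as f:
    json.dump(records, f)

ids, urls = load_ids_and_urls(path, 3)
all_ids, all_urls = load_ids_and_urls(path, 100)
```

Relying on list slicing covers the "or all of them" case without an explicit length check.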
ImageCollection stores color information about images and exposes a method to add images to it, with support for parallel processing. The datastore is MongoDB, so a server must be running (launch with the settings in mongo.conf).
Bases: object
Initialize an empty ImageCollection with a color palette that will be used to extract color information from images.
Parameters : | palette : Palette
|
---|
Methods
Add all images in a list of URLs. If ipcluster is running, load images in parallel.
Parameters : | image_urls : list
image_ids : list, optional
|
---|
Return histograms of all images as a single numpy array.
Returns : | hists : (N,K) ndarray
|
---|
Return information about the image at id, or None if it doesn’t exist.
Parameters : | image_id : string
no_hist : boolean
|
---|---|
Returns : | image : dict, or None
|
Bases: object
Read the image at the URL in RGB format, downsample if needed, and convert to Lab colorspace. Store original dimensions, resize_factor, and the filename of the image.
Width and height are resized independently so that neither exceeds the maximum allowed dimension, MAX_DIMENSION.
Parameters : | url : string
id : string, optional
|
---|
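One way to resize each dimension independently is to keep every n-th row and every m-th column, with the two strides chosen separately. The sketch below illustrates that idea only; whether Rayleigh uses strides or interpolation is not stated here, and the value of `MAX_DIMENSION` and both function names are assumptions.

```python
import math

MAX_DIMENSION = 240  # assumed value, for illustration only


def downsample_strides(height, width, max_dim=MAX_DIMENSION):
    """Return per-axis integer strides so that neither axis exceeds max_dim."""
    h_stride = int(math.ceil(float(height) / max_dim))
    w_stride = int(math.ceil(float(width) / max_dim))
    return h_stride, w_stride


def downsampled_shape(height, width, max_dim=MAX_DIMENSION):
    """Shape obtained by keeping every h_stride-th row and w_stride-th column."""
    h_stride, w_stride = downsample_strides(height, width, max_dim)
    new_h = int(math.ceil(float(height) / h_stride))
    new_w = int(math.ceil(float(width) / w_stride))
    return new_h, new_w


strides = downsample_strides(1000, 300)
shape = downsampled_shape(1000, 300)
```

Because the strides differ per axis, the aspect ratio of the downsampled image can differ from the original, which is why the original dimensions are worth storing.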
Methods
Bases: object
Extract an L*a*b color array from a dict representation of a palette query. The array can then be used to histogram colors, output a palette image, etc.
Parameters : | palette_query : dict
|
---|
Encapsulate the list of hex colors and array of Lab values representations of a palette (codebook) of colors.
Provide methods to work with color conversion and the Palette class.
Provide a parametrized method to generate a palette that covers the range of colors.
Bases: object
Create a color palette (codebook) in the form of a 2D grid of colors, as described in the parameters list below. Further, the rightmost column has num_hues gradations from black to white.
Parameters : | num_hues : int
sat_range : int
light_range : int
|
---|---|
Returns : | palette : rayleigh.Palette |
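To make the grid layout concrete, here is a simplified sketch that builds a grid of hex colors with one row per hue and a rightmost gray column running from black to white. It does not reproduce the exact roles of sat_range and light_range; the `num_lightness` parameter and both function names are assumptions made for illustration.

```python
import colorsys


def rgb_to_hex(r, g, b):
    """Convert RGB floats in [0, 1] to a '#rrggbb' string."""
    return "#%02x%02x%02x" % tuple(int(round(255 * c)) for c in (r, g, b))


def make_palette_grid(num_hues=8, num_lightness=3):
    """Return a list of rows; each row holds the colors of one hue plus one gray."""
    grid = []
    for i in range(num_hues):
        hue = float(i) / num_hues
        row = []
        for j in range(num_lightness):
            # Lightness values spread evenly, strictly inside (0, 1).
            lightness = float(j + 1) / (num_lightness + 1)
            row.append(rgb_to_hex(*colorsys.hls_to_rgb(hue, lightness, 1.0)))
        # Rightmost column: gray gradation from black (first row) to white (last row).
        gray = float(i) / (num_hues - 1) if num_hues > 1 else 0.0
        row.append(rgb_to_hex(gray, gray, gray))
        grid.append(row)
    return grid


grid = make_palette_grid(num_hues=4, num_lightness=3)
```

Flattening such a grid row by row gives the list of hex colors that a codebook needs, and each color can then be converted to Lab for distance computations.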
Methods
Methods to search an ImageCollection with brute force, exhaustive search.
Bases: object
Initialize with a rayleigh.ImageCollection, a distance metric, and the number of dimensions to reduce the histograms to.
Parameters : | image_collection : rayleigh.ImageCollection
dist_metric : string
sigma : nonnegative float
num_dimensions : int
|
---|
Methods
Return the smoothed image histogram of the image with the given id.
Parameters : | img_id : string |
---|---|
Returns : | color_hist : ndarray |
Return the num nearest neighbors (potentially approximate) of the query color_hist, along with the distances to them.
Override this search method in extending classes.
Parameters : | color_hist : (K,) ndarray
num : int
|
---|---|
Returns : | nn_ind : (num,) ndarray
nn_dists : (num,) ndarray
|
Search images in database by color similarity to the given histogram.
Parameters : | color_hist : (K,) ndarray
num : int, optional
reduced : boolean, optional
|
---|---|
Returns : | query_img : dict
results : list
|
Search images in database by color similarity to image.
See search_by_color_hist().
Search images in database for similarity to the image with img_id in the database.
See search_by_color_hist() for implementation.
Parameters : | img_id : string
num : int, optional |
---|---|
Returns : | query_img_data : dict
results : list
|
Bases: rayleigh.searchable_collection.SearchableImageCollection
Use the cKDTree data structure from scipy.spatial for the index.
over to brute-force.
k-th returned value is guaranteed to be no further than (1 + eps) times the distance to the real k-th nearest neighbor.
NOTE: These parameters have not been tuned.
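The two fragments above appear to describe the leafsize and eps arguments of scipy's cKDTree. The sketch below shows how they are passed; the random histograms and the chosen values are arbitrary illustrations, not Rayleigh's settings.

```python
import numpy as np
from scipy.spatial import cKDTree

# Random stand-in for (N, K) color histograms, each row summing to 1.
rng = np.random.RandomState(0)
hists = rng.rand(200, 8)
hists /= hists.sum(axis=1, keepdims=True)

# leafsize: number of points at which the tree falls back to brute force.
tree = cKDTree(hists, leafsize=10)

query = hists[17]
# Exact 5 nearest neighbors.
exact_dists, exact_ind = tree.query(query, k=5)
# Approximate search: each returned distance is at most (1 + eps) times
# the distance to the true neighbor of the same rank.
approx_dists, approx_ind = tree.query(query, k=5, eps=0.5)
```

Larger eps values let the search prune more of the tree at the cost of accuracy.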
Methods
Bases: rayleigh.searchable_collection.SearchableImageCollection
Search the image collection exhaustively (mainly through np.dot).
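As an illustration of how np.dot can carry an exhaustive search, the sketch below computes squared Euclidean distances from a query to every histogram with a single matrix-vector product. The choice of metric and the function name are assumptions; the class may support other distance metrics.

```python
import numpy as np


def nearest_neighbors_exhaustive(hists, query, num):
    """Return indices and squared Euclidean distances of the num closest rows."""
    # ||h - q||^2 = ||h||^2 - 2 h.q + ||q||^2, so one np.dot call suffices.
    sq_dists = ((hists ** 2).sum(axis=1)
                - 2 * np.dot(hists, query)
                + np.dot(query, query))
    sq_dists = np.maximum(sq_dists, 0)  # clip tiny negative values from rounding
    nn_ind = np.argsort(sq_dists)[:num]
    return nn_ind, sq_dists[nn_ind]


hists = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.5, 0.5, 0.0],
                  [0.0, 0.0, 1.0]])
query = np.array([0.6, 0.4, 0.0])
nn_ind, nn_dists = nearest_neighbors_exhaustive(hists, query, 2)
```

The row norms do not depend on the query, so they could be precomputed once when many queries are run against the same collection.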
Methods
Bases: rayleigh.searchable_collection.SearchableImageCollection
Search the image collection using the FLANN library for aNN indexing.
The FLANN index is built with automatic tuning of the search algorithm, which can take a while (~90s on 25K images).
Methods
MATLAB-like tic/toc.
Methods
Print <msg> every <interval> seconds, running the timer for <label>.
label (string): [optional] label for the timer
msg (string): [optional] message to print
interval (int): [optional] print every <interval> seconds
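A minimal tic/toc timer keyed by label could look like the sketch below. It is an illustration only: the class and method names are assumptions, and the periodic printing controlled by msg and interval is left out.

```python
import time


class TicToc(object):
    """Minimal MATLAB-like timer keyed by label (illustrative sketch)."""

    def __init__(self):
        self._starts = {}

    def tic(self, label=None):
        """Start (or restart) the timer for label."""
        self._starts[label] = time.time()
        return self

    def toc(self, label=None):
        """Return seconds elapsed since tic(label) was called."""
        return time.time() - self._starts[label]


timer = TicToc()
timer.tic("sleep")
time.sleep(0.05)
elapsed = timer.toc("sleep")
```

Keeping one start time per label lets several timers run at once, for example one per processing stage.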
Output the main colors in the histogram to a “palette image.”
Parameters : | color_hist : (K,) ndarray
palette : rayleigh.Palette
percentile : int, optional
filename : string, optional
|
---|---|
Returns : | rgb_image : ndarray |
Returns a palette histogram of colors in the image, smoothed with a Gaussian. Can smooth directly per-pixel, or after computing a strict histogram.
Parameters : | lab_array : (N,3) ndarray
palette : rayleigh.Palette
sigma : float
direct : bool, optional
|
---|---|
Returns : | color_hist : (K,) ndarray |
Return a palette histogram of colors in the image.
Parameters : | lab_array : (N,3) ndarray
palette : rayleigh.Palette
plot_filename : string, optional
|
---|---|
Returns : | color_hist : (K,) ndarray |
Assign colors in the image to nearby colors in the palette, weighted by distance in Lab color space.
Parameters : | lab_array : (N,3) ndarray
palette : rayleigh.Palette
sigma : float
>>> from pylab import *
>>> ds = linspace(0,5000) # squared distance
>>> sigma=10; plot(ds, exp(-ds/(2*sigma**2)), label='$sigma=%.1f$'%sigma)
>>> sigma=20; plot(ds, exp(-ds/(2*sigma**2)), label='$sigma=%.1f$'%sigma)
>>> sigma=40; plot(ds, exp(-ds/(2*sigma**2)), label='$sigma=%.1f$'%sigma)
>>> ylim([0,1]); legend();
>>> xlabel('Squared distance'); ylabel('Weight');
>>> title('Exponential smoothing')
>>> #plt.savefig('exponential_smoothing.png', dpi=300)
|
---|---|
Returns : | color_hist : (K,) ndarray |
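The distance-weighted assignment can be sketched as below, using the weight exp(-d^2 / (2 sigma^2)) for squared Lab distance d^2. This is a minimal illustration: the palette is passed as a plain (K,3) array rather than a rayleigh.Palette, the final normalization is an assumption, and `smoothed_histogram` is a hypothetical name.

```python
import numpy as np


def smoothed_histogram(lab_array, palette_lab, sigma):
    """Soft-assign each pixel to palette colors with weight exp(-d^2 / (2 sigma^2))."""
    # Squared distances between every pixel (N,3) and palette color (K,3): shape (N,K).
    diffs = lab_array[:, np.newaxis, :] - palette_lab[np.newaxis, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    color_hist = weights.sum(axis=0)
    return color_hist / color_hist.sum()  # normalize so the histogram sums to 1


palette_lab = np.array([[0.0, 0.0, 0.0],
                        [50.0, 0.0, 0.0],
                        [100.0, 0.0, 0.0]])
lab_array = np.array([[48.0, 1.0, -1.0],
                      [52.0, 0.0, 0.0],
                      [95.0, 0.0, 0.0]])
color_hist = smoothed_histogram(lab_array, palette_lab, sigma=10.0)
```

A small sigma approaches a strict nearest-color histogram, while a large sigma spreads each pixel's weight over many palette colors.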
Return base64-encoded image containing the color palette histogram.
Return an object suitable to be sent as an image by Flask, containing the color palette histogram.
Convert a list of hex colors and their values to an RGB image of given width and height.
a dictionary of hex colors to unnormalized values, e.g. {‘#ffffff’: 20, ‘#33cc00’: 0.4}.
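One possible rendering gives each color a vertical band whose width is proportional to its value. The sketch below illustrates that layout under assumptions: the band ordering, the default sizes, and the function names are not taken from Rayleigh.

```python
import numpy as np


def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints in 0..255."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def palette_image(hex_values, width=100, height=10):
    """Draw one vertical band per color, with width proportional to its value."""
    total = float(sum(hex_values.values()))
    image = np.zeros((height, width, 3), dtype=np.uint8)
    # Largest values first, so dominant colors appear on the left.
    items = sorted(hex_values.items(), key=lambda kv: -kv[1])
    start, cumulative = 0, 0.0
    for hex_color, value in items:
        cumulative += value
        end = int(round(width * cumulative / total))
        image[:, start:end] = hex_to_rgb(hex_color)
        start = end
    return image


image = palette_image({"#ff0000": 3, "#0000ff": 1}, width=8, height=2)
```

Dividing by the total inside the function is what allows the input values to be unnormalized.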
Return Figure containing the color palette histogram.
color_hist : (K,) ndarray
palette : Palette
Save histogram to this file, if given.