Author Archives: power

email habits over time

I was curious whether my sleeping/waking habits had really changed over the years. I definitely don’t feel like I work as late now as I did when I was 22, but it’s hard to tell. To test this, I went through the timestamps of all the mail I’ve sent in the past few years and tried to make a pretty graph.

I’m not sure how meaningful it is, but thanks to ggplot, it is pretty, at least.
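A minimal sketch of the counting step behind a graph like this, assuming the sent mail is available as a list of RFC 2822 `Date` header strings. The script actually used for the post is not shown here; `hour_histogram` is a hypothetical name, and the plotting step (ggplot in the post) is left out.

```python
from collections import Counter
from email.utils import parsedate_to_datetime


def hour_histogram(date_headers):
    """Count messages per (year, hour) from RFC 2822 Date header strings.

    The hour is taken as written in the header, i.e. in the sender's
    own UTC offset, which is what matters for "how late was I up".
    """
    counts = Counter()
    for header in date_headers:
        try:
            sent = parsedate_to_datetime(header)
        except (TypeError, ValueError):
            # Malformed Date headers are skipped rather than guessed at.
            continue
        counts[(sent.year, sent.hour)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample headers, for illustration only.
    sample = [
        "Wed, 02 Mar 2011 23:15:00 -0500",
        "Mon, 05 Jan 2009 02:05:00 -0500",
    ]
    for (year, hour), n in sorted(hour_histogram(sample).items()):
        print(year, hour, n)
```

With a real mailbox, the headers could come from the standard library's `mailbox.mbox`, reading `msg["Date"]` for each message; the resulting counts can then be handed to whatever plotting tool you like.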

Posted in Uncategorized | Leave a comment

multi-processor cache simulator

Does anyone know of a simple cache simulator for multiprocessors? I’m curious what the performance of various applications would look like if you had hardware-level DSM (distributed shared memory) across a low-latency interconnect like InfiniBand.

Would handling cache-line coherence in hardware make a big difference compared to a software DSM?
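To make the question concrete, here is a toy sketch of what the core of such a simulator might look like. It is an assumption-heavy illustration, not an existing tool: it models a simplified MSI write-invalidate protocol with unbounded per-CPU caches, an assumed 64-byte line size, and no timing at all. The class name `CoherenceSim` and the counter names are hypothetical.

```python
from collections import Counter

LINE_SIZE = 64  # bytes per cache line (an assumption for this sketch)


class CoherenceSim:
    """Toy MSI write-invalidate model with unbounded per-CPU caches."""

    def __init__(self, num_cpus):
        # One dict per CPU: line number -> "M" (modified) or "S" (shared).
        # A missing entry means the line is invalid in that cache.
        self.caches = [dict() for _ in range(num_cpus)]
        self.stats = Counter()

    def access(self, cpu, addr, is_write):
        line = addr // LINE_SIZE
        state = self.caches[cpu].get(line)
        if is_write:
            if state == "M":
                self.stats["hits"] += 1
                return
            # Upgrade or write miss: invalidate every other copy.
            for other, cache in enumerate(self.caches):
                if other != cpu and cache.pop(line, None) is not None:
                    self.stats["invalidations"] += 1
            self.stats["upgrades" if state == "S" else "misses"] += 1
            self.caches[cpu][line] = "M"
        else:
            if state is not None:
                self.stats["hits"] += 1
                return
            # Read miss: a remote modified copy is downgraded to shared.
            for other, cache in enumerate(self.caches):
                if other != cpu and cache.get(line) == "M":
                    cache[line] = "S"
                    self.stats["downgrades"] += 1
            self.stats["misses"] += 1
            self.caches[cpu][line] = "S"
```

Feeding a memory trace through `access` and then weighting each counter by an assumed per-event cost (one set of costs for hardware coherence, another for a software DSM) would give a rough first-order comparison, though nothing here models capacity, contention, or page-granularity effects.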


mycloud

Do you have a few computers lying around the house or lab that you want to make use of for your Python computing? Too lazy to set up a real MapReduce system like Hadoop? Well, this could be your solution! (It probably won’t be, but, hey, you never know.)

mycloud lets you make use of your spare cycles, all from within Python. All you need is the SSH connection that you’re already using to manage your systems. Here’s an example:

Starting your cluster:

import mycloud

# list each machine and the number of cores to use
cluster = mycloud.Cluster([('machine1', 4),
                           ('machine2', 4)],
                          fs_prefix='/path/to/store/results')

Invoke a function over a list of inputs:

result = cluster.map(my_expensive_function, range(1000))

Use the MapReduce interface to easily handle processing of larger datasets:

from mycloud.resource import CSV
input_desc = [CSV('my_input_%d.csv' % i) for i in range(100)]
output_desc = [CSV('my_output_file.csv')]

def map_identity(k, v):
  yield (k, int(v[0]))

def reduce_sum(k, values):
  yield (k, sum(values))

mr = mycloud.mapreduce.MapReduce(cluster,
                                 map_identity,
                                 reduce_sum,
                                 input_desc,
                                 output_desc)

result = mr.run()

for k, v in result[0].reader():
  print k, v
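To show what the mapper and reducer above compute, here is a small in-process sketch that runs the same two functions without a cluster. It is not how mycloud executes a job; `run_local` is a hypothetical helper, and the sketch assumes each input record is a `(key, row)` pair where `row` is a list of strings, as a CSV reader would produce.

```python
from collections import defaultdict


def map_identity(k, v):
    # Emit the key with the first column of the row parsed as an int.
    yield (k, int(v[0]))


def reduce_sum(k, values):
    # Sum every value that was emitted for this key.
    yield (k, sum(values))


def run_local(records, mapper, reducer):
    """Run mapper and reducer in-process over (key, value) records."""
    grouped = defaultdict(list)
    for k, v in records:
        for out_k, out_v in mapper(k, v):
            grouped[out_k].append(out_v)
    results = []
    for k in sorted(grouped):
        results.extend(reducer(k, grouped[k]))
    return results


if __name__ == "__main__":
    # Made-up records standing in for rows read from the input CSVs.
    records = [("a", ["1"]), ("b", ["2"]), ("a", ["3"])]
    for k, v in run_local(records, map_identity, reduce_sum):
        print(k, v)
```

The grouping step in the middle is the part a MapReduce system distributes: every value emitted for a key has to end up at the same reducer, whichever machine the mapper ran on.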

The code is mostly undocumented, but please contribute documentation and/or complaints as you see fit! :)

You can access the source code here: https://bitbucket.org/rjpower/mycloud

Or install it from PyPI with pip:

pip install mycloud

 


autocite

I’ve had this idea kicking around in my head for a long time but had never gotten around to actually implementing it. This weekend, I finally did.

Finding citations for a paper is a useful but onerous process. You have to track down all of the past work that touched on ideas similar to your own, which amounts to spending hours searching through old journals for examples where your work has been done before. (In computer science, there are no new papers, just recyclings of ideas from the ’60s and ’70s.)

In any case, this problem is no more! With autocite, you can let the computer find the citations for you, or at least the most promising set of them. Here’s an example list of papers found by feeding in the abstract of my paper, Piccolo:

  • http://cs.mst.edu/documents/technical_reports/91-14.pdf
  • http://www-i2.informatik.rwth-aachen.de/Research/MCS/SDL/safire_compatibility_table.pdf
  • http://cs.mst.edu/documents/technical_reports/92-02.pdf
  • http://cs.mst.edu/documents/technical_reports/92-24.pdf
  • http://staffweb.psdschools.org/wleggett/unitplan2/weblessonplan.pdf
  • http://cr.yp.to/bib/1991/ecma-48.pdf
  • http://www.lfcs.inf.ed.ac.uk/research/database/publications/vldb04_ranking.pdf
  • http://sziami.cs.bme.hu/%7Eildi/seminar/2010_01/kapolnai_abs.pdf
  • http://www.hpdc.org/2009/PDF/hursey.pdf
  • http://www.cs.arizona.edu/people/dkl/students.pdf

Okay, maybe it won’t solve all of your citation problems, but it’s somewhat entertaining to play with anyway. Try it out here:

Auto-citation generator

If you’re curious, the code is available here, though it’s a bit of a mess.

autocite source
